-
Notifications
You must be signed in to change notification settings - Fork 1.7k
Add support for Arrow Dictionary type in Substrait #16608
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
a28e926 to
82327f4
Compare
82327f4 to
dfac1e0
Compare
|
@gabotechs FYI |
| /// Used for the arrow type [`DataType::Map`]. | ||
| /// | ||
| /// [`DataType::Map`]: datafusion::arrow::datatypes::DataType::Map | ||
| pub const DICTIONARY_CONTAINER_TYPE_VARIATION_REF: u32 = 3; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤔 I see the other *_CONTAINER_TYPE_VARIATION_REF are referring to the kind of variations you'd get for the different binary/string/list variants (e.g. Utf8, LargeUtf8 and Utf8View) which seem unrelated to Maps or Dictionaries.
I think the fact that the r#type::Map type chose to use the DEFAULT_CONTAINER_TYPE_VARIATION_REF for their current implementation is an unfortunate choice with no consequences, as the value is 0 anyways.
As the DataType::Dictionary or DataType::Map do not have the DataType::LargeDictionary, DataType::DictionaryView,DataType::LargeMap or DataType::MapView, I don't think they should be using the *_CONTAINER_TYPE_VARIATION_REF variations. What I'd do instead is to have a different set of variation refs for them, something like:
pub const DEFAULT_MAP_TYPE_VARIATION_REF: u32 = 0;
pub const DICTIONARY_MAP_TYPE_VARIATION_REF: u32 = 1;what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I 100% agree, I've updated it use the consts you've suggested.
This commit adds support for the Arrow Dictionary type in Substrait plans. Resolves apache#16273
dfac1e0 to
35ff2ed
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
awesome! this one LGTM, cc @alamb
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @jkosh44 and @gabotechs
* Add support for Arrow Dictionary type in Substrait This commit adds support for the Arrow Dictionary type in Substrait plans. Resolves apache#16273 * Add more specific type variation consts (cherry picked from commit d359d64)
* Add support for Arrow Dictionary type in Substrait (apache#16608) * Add support for Arrow Dictionary type in Substrait This commit adds support for the Arrow Dictionary type in Substrait plans. Resolves apache#16273 * Add more specific type variation consts (cherry picked from commit d359d64) * Remove DataDog Specific Workaround --------- Co-authored-by: Joseph Koshakow <[email protected]>
Which issue does this PR close?
Rationale for this change
This commit adds support for the Arrow Dictionary type in Substrait plans.
What changes are included in this PR?
Are these changes tested?
Yes
Are there any user-facing changes?
Add support for Arrow Dictionary type in Substrait plans