-
Notifications
You must be signed in to change notification settings - Fork 1.1k
[Variant]: Implement DataType::Union support for cast_to_variant kernel
#8196
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| } | ||
|
|
||
| /// Convert dictionary encoded arrays | ||
| fn convert_dictionary_encoded( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change just moves the code from the Dictionary branch into this helper function. Since all the impls use slightly different coding styles, we could do a follow-up PR to make them consistent once all the variant cast implementations are complete.
f34bcd2 to
6dda6a2
Compare
|
|
||
| // Convert each child array to variant arrays | ||
| let mut child_variant_arrays = HashMap::new(); | ||
| for (type_id, _) in fields.iter() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to merge the two passes into one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand what you are suggesting
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean using one loop instead of two? That way we will compute the child array of each type_id on demand.
But IIUC, we will use all the child arrays anyway if it's a valid union, and I think precomute all the child arrays may benefit the lookup in the second loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, yes, we'll use the same child_variant_array many times. The current would have a better performance
alamb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @liamzwbao -- this looks great. The only thing I think it is missing is a test for nulls in the UnionArray.
Thank you @klion26 for the review
| let value = child_variant_array.value(value_offset); | ||
| builder.append_variant(value); | ||
| } else { | ||
| // This should not happen in a valid union, but handle gracefully |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
|
|
||
| // Convert each child array to variant arrays | ||
| let mut child_variant_arrays = HashMap::new(); | ||
| for (type_id, _) in fields.iter() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand what you are suggesting
| } | ||
|
|
||
| #[test] | ||
| fn test_cast_to_variant_union_sparse() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could we please add a test for a UnionArray where the child element is null? So that the output VariantArray has a null as well?
alamb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @liamzwbao
I also merged up from main to resolve a merge conflict with this PR
|
🚀 |
Which issue does this PR close?
DataType::Unionsupport forcast_to_variantkernel #8195.Rationale for this change
What changes are included in this PR?
Implement
DataType::Unionforcast_to_variantAre these changes tested?
Yes
Are there any user-facing changes?
New cast type supported