-
Notifications
You must be signed in to change notification settings - Fork 1.7k
Support Time Parquet Data Page Statistics #11187
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| [<$stat_type_prefix Int32DataPageStatsIterator>]::new($iterator).flatten(), | ||
| )), | ||
| _ => { | ||
| let len = $iterator.count(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have referenced get_statistics macro to decide on returing a null array, my other idea was to conclued with unimplemented!().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You beat me to it ;-).
I think this can just be:
_ => new_empty_array(&DataType::Time32(unit.clone())),And then you need to add new_empty_array into the use statement. Time64 gets a similar treatment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Much cleaner. Thanks for the idea @efredine
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks right to me!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| )), | ||
| _ => { | ||
| // don't know how to extract statistics, so return an empty array | ||
| new_empty_array(&DataType::Time32(unit.clone())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Technically speaking I think Time32 must be either Seconds or Milliseconds otherwise it is not a valid type.
https://docs.rs/arrow/latest/arrow/datatypes/enum.DataType.html#variant.Time32
However, the arrow-rs library doesn't encode this in the types so I think this is a reasonable behavior
* add parquet page stats for time * return empty array instead of null * fix typos
Which issue does this PR close?
Closes #11114
Rationale for this change
What changes are included in this PR?
Are these changes tested?
Covered by existing test cases
Are there any user-facing changes?
No