pipe column orderings into pruning predicate creation #15821
base: main
let pruning_predicate = PruningPredicate::try_new(
    Arc::clone(predicate),
    self.schema(),
    vec![ColumnOrdering::Unknown; self.schema().fields().len()],
)?;
We could add a new signature to avoid API churn, but I wanted to make the change explicit for now so that all of the call sites are visible.
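For illustration, the "new signature" alternative mentioned above could look roughly like the following. This is only a sketch under stated assumptions, not the PR's code: the `PruningPredicate` stub, the `try_new_with_orderings` name, the `TypeDefined` variant, and the use of a plain field count in place of the predicate and schema are all hypothetical. Only `ColumnOrdering::Unknown` and `PruningPredicate::try_new` appear in the diff.

```rust
// Hypothetical stand-in: only the `Unknown` variant appears in the PR diff.
#[derive(Clone, Debug, PartialEq)]
enum ColumnOrdering {
    Unknown,
    TypeDefined, // assumed variant, for illustration only
}

// Simplified stub: the real type also holds the predicate and schema.
struct PruningPredicate {
    orderings: Vec<ColumnOrdering>,
}

impl PruningPredicate {
    // Hypothetical new entry point that takes one ordering per schema field.
    fn try_new_with_orderings(
        num_fields: usize,
        orderings: Vec<ColumnOrdering>,
    ) -> Result<Self, String> {
        if orderings.len() != num_fields {
            return Err(format!(
                "expected {} column orderings, got {}",
                num_fields,
                orderings.len()
            ));
        }
        Ok(Self { orderings })
    }

    // Existing-style entry point: keeps old call sites compiling by
    // defaulting every column to `Unknown`.
    fn try_new(num_fields: usize) -> Result<Self, String> {
        Self::try_new_with_orderings(num_fields, vec![ColumnOrdering::Unknown; num_fields])
    }
}
```

The trade-off is the one described in the comment: a defaulting wrapper avoids churn, but it also hides the call sites that silently fall back to `Unknown`.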
3e15afe to 8c2ceb1
@adriangb please check out pydantic#28
    return expr;
}

// Special handling for floats. Because current Parquet statistics do not allow NaN, and
This block is why we need to get the column ordering info passed down. Here we know which column is being pruned and with which operation. For floats we can disallow pruning because the Parquet stats are incomplete. We can also skip pruning if the stats are not valid because an ordering is not defined for the type.
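As a rough sketch of the decision described in this comment (not the code in the PR), the check could be written as a small function that refuses to prune when the ordering is unknown or the column is a float. The `ColumnType` enum, the `TypeDefined` variant, and the `can_prune` helper are hypothetical names introduced only for illustration. The operator is left out for brevity, although the comment notes that it is also known at this point.

```rust
// Hypothetical stand-in: only the `Unknown` variant appears in the PR diff.
#[derive(Clone, Copy, Debug, PartialEq)]
enum ColumnOrdering {
    // No ordering is defined for the column's min/max statistics.
    Unknown,
    // Assumed variant: statistics follow the ordering defined for the type.
    TypeDefined,
}

// Hypothetical, heavily reduced set of column types.
#[derive(Clone, Copy, Debug, PartialEq)]
enum ColumnType {
    Int64,
    Float32,
    Float64,
    Utf8,
}

// Returns true only when min/max statistics can safely be used for pruning.
fn can_prune(ordering: ColumnOrdering, ty: ColumnType) -> bool {
    // Without a defined ordering the statistics are not valid for comparisons.
    if ordering == ColumnOrdering::Unknown {
        return false;
    }
    // Float statistics are incomplete with respect to NaN, so stay conservative.
    !matches!(ty, ColumnType::Float32 | ColumnType::Float64)
}
```

Returning `false` here means "do not prune", which is always safe: the row group or page is read and the predicate is evaluated on the actual data.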
@etseidl I'm sorry we haven't made any progress here. I see there are a lot of merge conflicts but we do now finally have all of the building blocks in place since we evaluate predicates against the physical file schema. Should we revive this?
I'll be able to revisit this early in the week. It's been a back burner issue for me lately because of the slow progress on parquet format changes. It seems the sort order keeps popping up, though, so I think we'll want this change eventually.
marking as draft as this doesn't appear to be waiting on review |
NaN is present #15812