
Optimize FilterExec::statistics / don't ignore errors #7553

@alamb


Is your feature request related to a problem or challenge?

This came up while looking at the backtrace in #7522, where we saw a significant amount of planning time going to creating DataFusionError values that were then thrown away.

I tracked this down to creating the error (which is ignored) here:

https://github.com/apache/arrow-datafusion/blob/c47ae0d13dbc9474897609f3dea4f80a93cf4de0/datafusion/physical-plan/src/filter.rs#L206-L209

Describe the solution you'd like

I would like less time to be spent creating errors that are thrown away. I can see two potential ways to do this:

  1. Stop ignoring the error and instead fix whatever the underlying issue is
  2. Cache the calculation of the statistics so it doesn't get called multiple times

Both might be appropriate
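
To make option 2 concrete, here is a minimal sketch of memoizing a statistics calculation so repeated calls during planning do the work only once. It is not DataFusion code: `Stats`, `CachedFilter`, `selectivity`, and the `computations` counter are hypothetical names made up for illustration, and the real `FilterExec::statistics` has a different signature and return type.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Hypothetical stand-in for plan statistics (not DataFusion's `Statistics`).
#[derive(Debug, Clone, PartialEq)]
pub struct Stats {
    pub num_rows: Option<usize>,
}

/// Hypothetical filter node that caches its statistics after the first call.
pub struct CachedFilter {
    input_rows: usize,
    selectivity: f64,
    /// Filled in lazily on the first `statistics()` call.
    cache: OnceLock<Stats>,
    /// Counts how often the expensive path ran (for demonstration only).
    computations: AtomicUsize,
}

impl CachedFilter {
    pub fn new(input_rows: usize, selectivity: f64) -> Self {
        Self {
            input_rows,
            selectivity,
            cache: OnceLock::new(),
            computations: AtomicUsize::new(0),
        }
    }

    /// The (assumed) expensive calculation we want to run at most once.
    fn compute(&self) -> Stats {
        self.computations.fetch_add(1, Ordering::Relaxed);
        let estimated = (self.input_rows as f64 * self.selectivity) as usize;
        Stats { num_rows: Some(estimated) }
    }

    /// Returns the cached statistics, computing them on first use.
    pub fn statistics(&self) -> &Stats {
        self.cache.get_or_init(|| self.compute())
    }

    /// Number of times `compute` actually ran.
    pub fn computations(&self) -> usize {
        self.computations.load(Ordering::Relaxed)
    }
}
```

Calling `statistics()` several times on the same `CachedFilter` runs `compute` only once. Caching alone would hide the cost of building an ignored error rather than remove it, which is why option 1 may still be worth doing.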

Describe alternatives you've considered

No response

Additional context

cc @berkaysynnada as I believe this came in #6982

I don't think this is a critical fix, but I wanted to file it given I had tracked it down


Labels: enhancement
