Implement physical execution of uncorrelated scalar subqueries #3781

@andygrove

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

We currently support uncorrelated scalar subqueries by translating them into a cross join. It would likely be more efficient to execute the subquery first and replace it in the original plan with the resulting scalar value.

We also need this so that we can raise an error when the subquery returns more than one row, which cannot be expressed in the logical plan.
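To illustrate the cardinality check described above, here is a minimal sketch of reducing a subquery's result rows to a single value. The `Scalar` enum and the `rows_to_scalar` function are hypothetical stand-ins invented for this example and are not DataFusion types. The sketch assumes standard SQL behaviour, where a scalar subquery that returns no rows evaluates to NULL.

```rust
/// Hypothetical stand-in for a scalar value; NOT DataFusion's `ScalarValue`.
#[derive(Debug, Clone, PartialEq)]
pub enum Scalar {
    Null,
    Int64(i64),
}

/// Reduce the rows produced by a scalar subquery to a single value.
/// Each inner `Vec` is one row; a scalar subquery must have exactly one column.
pub fn rows_to_scalar(rows: &[Vec<Scalar>]) -> Result<Scalar, String> {
    match rows {
        // Zero rows: standard SQL evaluates the subquery to NULL.
        [] => Ok(Scalar::Null),
        // Exactly one row: it must contain exactly one column.
        [row] => match row.as_slice() {
            [value] => Ok(value.clone()),
            _ => Err(format!(
                "scalar subquery must return one column, got {}",
                row.len()
            )),
        },
        // More than one row: this is the error the cross-join rewrite cannot raise.
        many => Err(format!(
            "scalar subquery returned {} rows, expected at most 1",
            many.len()
        )),
    }
}
```

A check like this can only run once the subquery has actually produced rows, which is why it belongs to execution rather than to a logical plan rewrite.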

Describe the solution you'd like

The optimizer will need some kind of trait that execution engines (DataFusion, Ballista, Dask SQL, etc.) can implement to resolve scalar subqueries into scalar values.

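trait OptimizerContext {
  /// Whether this engine can execute scalar subqueries during optimization.
  fn supports_scalar_subquery_execution(&self) -> bool;
  /// Execute an uncorrelated scalar subquery and return its single value.
  fn execute_scalar_subquery(&self, plan: &LogicalPlan) -> Result<ScalarValue>;
}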
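As a rough sketch of how an optimizer rule might use such a trait, the example below replaces a scalar subquery expression with a literal. Everything except the trait's name and method names is an assumption made for illustration. `LogicalPlan`, `Expr` and `ScalarValue` are toy enums rather than the DataFusion types, `ToyEngine` and `resolve_scalar_subqueries` are hypothetical, the methods take `&self` so the trait can be used as `&dyn OptimizerContext`, and errors are plain `String`s.

```rust
// Toy stand-ins for illustration only; NOT the DataFusion types.
#[derive(Debug, Clone, PartialEq)]
pub enum ScalarValue {
    Null,
    Int64(i64),
}

#[derive(Debug, Clone, PartialEq)]
pub enum LogicalPlan {
    /// A plan that produces one integer column with the given rows.
    Values(Vec<i64>),
}

#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Column(String),
    Literal(ScalarValue),
    ScalarSubquery(Box<LogicalPlan>),
    Gt(Box<Expr>, Box<Expr>),
}

/// The trait from the proposal, with `&self` receivers (an assumption).
pub trait OptimizerContext {
    fn supports_scalar_subquery_execution(&self) -> bool;
    fn execute_scalar_subquery(&self, plan: &LogicalPlan) -> Result<ScalarValue, String>;
}

/// Hypothetical engine that "executes" the toy plan in memory.
pub struct ToyEngine;

impl OptimizerContext for ToyEngine {
    fn supports_scalar_subquery_execution(&self) -> bool {
        true
    }

    fn execute_scalar_subquery(&self, plan: &LogicalPlan) -> Result<ScalarValue, String> {
        match plan {
            LogicalPlan::Values(rows) => match rows.as_slice() {
                // No rows: the scalar subquery evaluates to NULL.
                [] => Ok(ScalarValue::Null),
                [v] => Ok(ScalarValue::Int64(*v)),
                // More than one row is an error.
                many => Err(format!(
                    "scalar subquery returned {} rows, expected at most 1",
                    many.len()
                )),
            },
        }
    }
}

/// Recursively replace scalar subqueries with literals when the engine supports it.
/// Unsupported engines leave the expression unchanged.
pub fn resolve_scalar_subqueries(
    expr: Expr,
    ctx: &dyn OptimizerContext,
) -> Result<Expr, String> {
    match expr {
        Expr::ScalarSubquery(plan) if ctx.supports_scalar_subquery_execution() => {
            Ok(Expr::Literal(ctx.execute_scalar_subquery(&plan)?))
        }
        Expr::Gt(left, right) => Ok(Expr::Gt(
            Box::new(resolve_scalar_subqueries(*left, ctx)?),
            Box::new(resolve_scalar_subqueries(*right, ctx)?),
        )),
        other => Ok(other),
    }
}
```

With these toy types, an expression such as `a > (subquery)` would be rewritten to `a > 42` when the subquery plan yields the single row `42`, while a subquery plan with two rows would surface an error from the rewrite.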

Describe alternatives you've considered

Additional context


Labels: enhancement (New feature or request), performance (Make DataFusion faster)
