Allow custom planning behavior for selecting wildcard expression #11639

@goldmedal

Is your feature request related to a problem or challenge?

Currently, DataFusion always expands the wildcard expression into the full list of columns when planning a SELECT statement. This is a problem for users who need to inspect the wildcard expression at the logical plan layer.

For example, given a table "A" with three columns: c1, c2, and c3, it is difficult to distinguish between SELECT * FROM A and SELECT c1, c2, c3 FROM A in the logical plan layer.
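To make the problem concrete, here is a minimal, self-contained sketch. The `Expr` enum and `expand_wildcards` function are hypothetical stand-ins written for this illustration, not DataFusion's actual types. It shows that once the wildcard is eagerly expanded, both queries produce the same projection.

```rust
/// Hypothetical, simplified stand-in for a logical expression
/// (not DataFusion's real `Expr`).
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Wildcard,
}

/// Eagerly replace every wildcard with the table's columns,
/// leaving all other expressions untouched.
fn expand_wildcards(projection: &[Expr], schema: &[&str]) -> Vec<Expr> {
    projection
        .iter()
        .flat_map(|expr| match expr {
            Expr::Wildcard => schema
                .iter()
                .map(|c| Expr::Column(c.to_string()))
                .collect::<Vec<_>>(),
            other => vec![other.clone()],
        })
        .collect()
}
```

After expansion, the projection for `SELECT * FROM A` compares equal to the one for `SELECT c1, c2, c3 FROM A`, so the information that the user wrote a wildcard is lost.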

If we provide a way for users to customize the behavior of wildcard expression planning, it will be easier to distinguish between these two cases.

Describe the solution you'd like

I plan to implement this feature based on ExprPlanner. We could add a method called plan_select_wildcard and put the default wildcard expansion there, then invoke this method in SqlToRel::sql_select_to_rex.
https://github.com/apache/datafusion/blob/main/datafusion/sql/src/select.rs#L594
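A rough sketch of the shape I have in mind is below. It is self-contained and does not use the real DataFusion types: the `WildcardPlanner` trait, the `Expr` enum, and `plan_projection` are assumptions made up for illustration, and the real method would live on ExprPlanner with whatever signature fits there.

```rust
/// Hypothetical, simplified stand-in for a logical expression.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Wildcard,
}

/// Hypothetical planner hook. The default implementation keeps the
/// current behavior: expand the wildcard into one column per field.
trait WildcardPlanner {
    fn plan_select_wildcard(&self, schema: &[&str]) -> Vec<Expr> {
        schema.iter().map(|c| Expr::Column(c.to_string())).collect()
    }
}

/// Uses the default expansion.
struct DefaultPlanner;
impl WildcardPlanner for DefaultPlanner {}

/// Custom planner that keeps the wildcard in the plan unexpanded.
struct KeepWildcardPlanner;
impl WildcardPlanner for KeepWildcardPlanner {
    fn plan_select_wildcard(&self, _schema: &[&str]) -> Vec<Expr> {
        vec![Expr::Wildcard]
    }
}

/// Stand-in for the SELECT planning step: delegate wildcards to the
/// planner and pass every other expression through unchanged.
fn plan_projection(
    planner: &dyn WildcardPlanner,
    projection: &[Expr],
    schema: &[&str],
) -> Vec<Expr> {
    let mut out = Vec::new();
    for expr in projection {
        match expr {
            Expr::Wildcard => out.extend(planner.plan_select_wildcard(schema)),
            other => out.push(other.clone()),
        }
    }
    out
}
```

With `DefaultPlanner` the result matches today's expansion. With `KeepWildcardPlanner` the wildcard survives into the plan, which is what makes the two queries distinguishable later.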

Describe alternatives you've considered

No response

Additional context

My current project, Wren engine, is a semantic engine based on the DataFusion Logical Plan. We have a type of column called a "calculated field" that can be selected explicitly but is not included when querying with the wildcard expression. Since a calculated field can trigger joins against its parent table, we want to ensure that it is only used when the user explicitly selects it.
This feature would help us align with the native behavior of DataFusion.
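As a simplified illustration of the behavior we want (not Wren engine's actual code; `Field`, `SelectItem`, `resolve_projection`, and the column name `total_orders` are all hypothetical), a wildcard would skip calculated fields while an explicit selection still returns them:

```rust
/// Hypothetical schema field; `calculated` marks a calculated field.
#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    calculated: bool,
}

/// Hypothetical item in a SELECT list.
#[derive(Debug, Clone, PartialEq)]
enum SelectItem {
    Wildcard,
    Column(String),
}

/// Resolve a SELECT list to column names. A wildcard expands only to
/// non-calculated fields; explicitly named columns are always kept.
fn resolve_projection(items: &[SelectItem], schema: &[Field]) -> Vec<String> {
    let mut out = Vec::new();
    for item in items {
        match item {
            SelectItem::Wildcard => out.extend(
                schema
                    .iter()
                    .filter(|f| !f.calculated)
                    .map(|f| f.name.clone()),
            ),
            SelectItem::Column(name) => out.push(name.clone()),
        }
    }
    out
}
```

Here `SELECT *` would resolve to the plain columns only, while `SELECT total_orders` would still reach the calculated field and whatever joins it requires.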

Labels: enhancement (New feature or request)
