Remove struct UDF, and use named_struct everywhere #9839

@alamb

Is your feature request related to a problem or challenge?

This is a follow-on to #9743, where @gstvg added a great named_struct function to construct StructArrays ❤️

As part of that PR, @yyy1000 noted that the existing code in the struct udf is now never called: #9743 (comment)

Describe the solution you'd like

  1. Make the invoke() function return a not yet implemented error https://github.com/apache/arrow-datafusion/blob/ce3d446be5f6a11664e100fc47940e6ecb5418d3/datafusion/functions/src/core/struct.rs#L90-L92

  2. Implement the simplify API to rewrite calls to struct() to a call to named_struct

https://github.com/apache/arrow-datafusion/blob/ce3d446be5f6a11664e100fc47940e6ecb5418d3/datafusion/expr/src/udf.rs#L372-L378

  3. Update the SQL planner to call struct rather than building up the c0, c1, etc. names and calling named_struct
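
To make step 2 concrete, here is a rough, self-contained sketch of the rewrite itself. It uses a stand-in Expr enum and a hypothetical simplify_struct helper, not DataFusion's real Expr type or the actual simplify signature, and it assumes the generated field names follow the c0, c1, ... pattern mentioned above.

```rust
// Stand-in expression type for illustration only; DataFusion's real `Expr`
// is much richer and this enum is NOT part of its API.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Literal(String),
    ScalarFunction { name: String, args: Vec<Expr> },
}

/// Hypothetical helper: rewrite `struct(a, b, ...)` into
/// `named_struct('c0', a, 'c1', b, ...)`.
/// Any other expression is returned unchanged.
fn simplify_struct(expr: Expr) -> Expr {
    match expr {
        Expr::ScalarFunction { name, args } if name == "struct" => {
            // Each positional argument becomes a (name, value) pair.
            let mut named_args = Vec::with_capacity(args.len() * 2);
            for (i, arg) in args.into_iter().enumerate() {
                named_args.push(Expr::Literal(format!("c{}", i)));
                named_args.push(arg);
            }
            Expr::ScalarFunction {
                name: "named_struct".to_string(),
                args: named_args,
            }
        }
        other => other,
    }
}
```

With a rewrite along these lines, the planner could emit a plain struct(a, b) call and leave the field naming to the simplifier.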

Describe alternatives you've considered

We could also just remove the struct udf entirely, though in that case it is important to keep the struct expr_fn function for backwards compatibility

https://github.com/apache/arrow-datafusion/blob/ce3d446be5f6a11664e100fc47940e6ecb5418d3/datafusion/functions/src/core/mod.rs#L44

I think it could be implemented as its own function like
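
One possible shape for such a function, as a sketch only: the Expr enum and the named_struct helper below are simplified stand-ins rather than DataFusion's actual types, and the c0, c1, ... field names are assumed from the planner behaviour described above. Because struct is a Rust keyword, the function has to be spelled with the raw identifier r#struct.

```rust
// Simplified stand-in for DataFusion's `Expr`; for illustration only.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Literal(String),
    ScalarFunction { name: String, args: Vec<Expr> },
}

// Hypothetical stand-in for the `named_struct` expr_fn.
fn named_struct(args: Vec<Expr>) -> Expr {
    Expr::ScalarFunction {
        name: "named_struct".to_string(),
        args,
    }
}

/// Backwards-compatible `struct` expr_fn that delegates to `named_struct`,
/// naming the fields `c0`, `c1`, ... by position.
fn r#struct(args: Vec<Expr>) -> Expr {
    let named: Vec<Expr> = args
        .into_iter()
        .enumerate()
        .flat_map(|(i, arg)| [Expr::Literal(format!("c{}", i)), arg])
        .collect();
    named_struct(named)
}
```

Callers would keep writing r#struct(vec![...]) as before, while the struct UDF itself could go away.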

Additional context

No response
