Skip to content

Conversation

@Veeupup
Copy link
Contributor

@Veeupup Veeupup commented Nov 7, 2023

Which issue does this PR close?

Closes #6978

Rationale for this change

Using RowConverter to implement array_intersect, no duplicate code for different data types,

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

@github-actions github-actions bot added logical-expr Logical plan and expressions physical-expr Changes to the physical-expr crates core Core DataFusion crate labels Nov 7, 2023
@Veeupup
Copy link
Contributor Author

Veeupup commented Nov 7, 2023

hi @edmondop @jayzhan211 , I have a draft implementation for array_intersect, I use a hashmap to store the first array values and find values that appear in the hashmap. And it works just fine and looks like a general approach?

But I encounter a problem for Float32 and Float64, we can not just use HashMap<f32/f64> because std::hash::Hash is not implemented for float types, so I want to fix it by introducing ordered-float to make it hashable, is it too heavy for this case ?
Or do you have any better ideas to make it? Very happy to see your advice !

@tustvold
Copy link
Contributor

tustvold commented Nov 7, 2023

Perhaps you could use https://docs.rs/arrow-row/latest/arrow_row/? This would have the added benefit of supporting more complex types, e.g. lists of lists, and not adding additional codegen (#7988). The notes I wrote on #6981 (comment) might be helpful for this

@Veeupup Veeupup changed the title Initial Implementation of array_intersect Implementation of array_intersect Nov 8, 2023
@Veeupup Veeupup marked this pull request as ready for review November 8, 2023 16:38
@github-actions github-actions bot added the sqllogictest SQL Logic Tests (.slt) label Nov 9, 2023
@Veeupup Veeupup force-pushed the array_interact branch 2 times, most recently from 6b7198f to f12525c Compare November 9, 2023 12:42
Copy link
Contributor

@jayzhan211 jayzhan211 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Roughly LGTM

@Veeupup
Copy link
Contributor Author

Veeupup commented Nov 9, 2023

@alamb PTAL : )

@Veeupup Veeupup requested a review from Dandandan November 9, 2023 14:28
@xudong963
Copy link
Member

@Veeupup Triggered the ci, please fix it.

Signed-off-by: veeupup <[email protected]>

x
@Veeupup
Copy link
Contributor Author

Veeupup commented Nov 10, 2023

@xudong963 hi try fixing ci locally, maybe you can help me trigger it again or review it? : )

@Veeupup Veeupup requested a review from tustvold November 10, 2023 16:01
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a nice implementation to me -- thank you @Veeupup @Dandandan @tustvold @xudong963 and @Dandandan -- what a team effort ❤️

Copy link
Member

@xudong963 xudong963 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, welcome to arrow-datafusion

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core Core DataFusion crate logical-expr Logical plan and expressions physical-expr Changes to the physical-expr crates sqllogictest SQL Logic Tests (.slt)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Implement array_intersect function

6 participants