Skip to content

Conversation

@jayzhan211
Copy link
Contributor

@jayzhan211 jayzhan211 commented Dec 6, 2023

Which issue does this PR close?

Rationale for this change

I found the reason why we need level 3 vec in convert_array_agg_to_orderings https://github.com/apache/arrow-datafusion/pull/7629/files#r1357637090. For multi ordering cases, we have len > 1.

Add multi ordering test to improve the test coverage, also simply the code.

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

@github-actions github-actions bot added physical-expr Changes to the physical-expr crates core Core DataFusion crate labels Dec 6, 2023
7,30,6
8,30,7
9,30,8
10,10,9 No newline at end of file
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason why we need csv test is that only csv file trigger merge_batch, the normal table go to update_batch only.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Normal table

logical_plan
Aggregate: groupBy=[[]], aggr=[[ARRAY_AGG(arrays.column1) ORDER BY [arrays.column1 ASC NULLS LAST]]]
--TableScan: arrays projection=[column1]
physical_plan
AggregateExec: mode=Single, gby=[], aggr=[ARRAY_AGG(arrays.column1)]
--SortExec: expr=[column1@0 ASC NULLS LAST]
----MemoryExec: partitions=1, partition_sizes=[1]

AggregateExec: mode is Single, so run update_batch.

partition_ordering_values.push(v);
// Get the inner struct array
let ordering_array = agg_orderings.value(0);
let orderings = ScalarValue::convert_array_to_scalar_vec(&ordering_array)?
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we always have ListArray(StructArray) for agg_orderings, so we can safely get orderings with convert_array_to_scalar_vec only.

@github-actions github-actions bot added the sqllogictest SQL Logic Tests (.slt) label Dec 6, 2023
Aggregate: groupBy=[[]], aggr=[[ARRAY_AGG(agg_order.c1) ORDER BY [agg_order.c2 DESC NULLS FIRST, agg_order.c3 ASC NULLS LAST]]]
--TableScan: agg_order projection=[c1, c2, c3]
physical_plan
AggregateExec: mode=Final, gby=[], aggr=[ARRAY_AGG(agg_order.c1)]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AggregateExec mode is final, so run merge_batch

// TODO: Since ScalarValue::Struct is not yet includes StructArray,
// so we need to convert it to ListArray for ScalarValue::List
let arr = ScalarValue::new_list(&orderings, &struct_type);
Ok(ScalarValue::List(arr))

This comment was marked as outdated.

@jayzhan211 jayzhan211 force-pushed the array_agg_order_improve_doc_and_test branch from 22f3288 to 984551e Compare December 7, 2023 01:23
@jayzhan211 jayzhan211 marked this pull request as ready for review December 7, 2023 01:24
for v in other_ordering_values.into_iter() {
partition_ordering_values.push(v);

for partition_ordering_rows in orderings.into_iter() {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think convert_array_agg_to_orderings is doing quite specific thing, so remove the function and do things here.

@jayzhan211 jayzhan211 changed the title Add multi ordering test for array agg order Minor: Add multi ordering test for array agg order Dec 7, 2023
@alamb alamb requested a review from mustafasrepo December 8, 2023 17:09
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code looks reasonable to me. Thank you @jayzhan211

I wonder if @mustafasrepo could give this a look as I think he contributed the original implementation of this aggregate

Copy link
Contributor

@mustafasrepo mustafasrepo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is LGTM!. Thanks @jayzhan211 for this PR.

@alamb alamb merged commit ff65dee into apache:main Dec 11, 2023
@alamb
Copy link
Contributor

alamb commented Dec 11, 2023

Thanks again

appletreeisyellow pushed a commit to appletreeisyellow/datafusion that referenced this pull request Dec 15, 2023
appletreeisyellow pushed a commit to appletreeisyellow/datafusion that referenced this pull request Jan 3, 2024
appletreeisyellow pushed a commit to appletreeisyellow/datafusion that referenced this pull request Jan 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core Core DataFusion crate physical-expr Changes to the physical-expr crates sqllogictest SQL Logic Tests (.slt)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants