Commit 11b3b23

Hisoka-X authored and cloud-fan committed
[SPARK-43838][SQL][FOLLOWUP] Add missing aggregate in renewDuplicatedRelations
### What changes were proposed in this pull request?
This is a follow-up PR for #41347. It adds the missing `Aggregate` case to `renewDuplicatedRelations`.

### Why are the changes needed?
The `Aggregate` case was missing from `renewDuplicatedRelations`.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Existing tests.

Closes #42160 from Hisoka-X/SPARK-43838_subquery_aggregate_follow_up.

Authored-by: Jia Fan <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
1 parent 6d0fed9 commit 11b3b23

10 files changed, +734 -726 lines changed
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/DeduplicateRelations.scala

Lines changed: 8 additions & 0 deletions
@@ -115,6 +115,14 @@ object DeduplicateRelations extends Rule[LogicalPlan] {
             newProject => findAliases(newProject.projectList).map(_.exprId.id).toSeq,
             newProject => newProject.copy(newAliases(newProject.projectList)))

+        case a: Aggregate =>
+          deduplicateAndRenew[Aggregate](
+            existingRelations,
+            a,
+            newAggregate => findAliases(newAggregate.aggregateExpressions).map(_.exprId.id).toSeq,
+            newAggregate => newAggregate.copy(aggregateExpressions =
+              newAliases(newAggregate.aggregateExpressions)))
+
         case s: SerializeFromObject =>
           deduplicateAndRenew[SerializeFromObject](
             existingRelations,
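
The idea behind the new `Aggregate` case can be sketched outside of Spark. The following is a minimal, self-contained toy model, not Catalyst code: `ToyAggregate`, `Alias`, `findAliasIds`, `renewAliases`, and `dedupAggregate` are hypothetical names that only mimic the shape of the hunk above. It assumes that "renewing" means giving every top-level alias in the aggregate expressions a fresh expression ID when one of its IDs has already been seen. The real rule is also expected to rewrite attribute references in parent plans, which this sketch omits.

```scala
import java.util.concurrent.atomic.AtomicLong

// Toy model only: these types are hypothetical stand-ins, not Spark's Catalyst classes.
object RenewAliasesSketch {
  private val nextId = new AtomicLong(0L)

  // Hands out a new, never-before-used expression ID.
  def freshId(): Long = nextId.getAndIncrement()

  sealed trait Expr
  final case class Attr(name: String) extends Expr
  final case class Sum(child: Expr) extends Expr
  final case class Alias(child: Expr, name: String, exprId: Long) extends Expr

  final case class ToyAggregate(groupBy: Seq[Expr], aggregateExpressions: Seq[Expr])

  // IDs produced by the top-level aliases of an expression list.
  def findAliasIds(exprs: Seq[Expr]): Set[Long] =
    exprs.collect { case a: Alias => a.exprId }.toSet

  // Copy every top-level alias with a fresh ID; leave other expressions untouched.
  def renewAliases(exprs: Seq[Expr]): Seq[Expr] =
    exprs.map {
      case a: Alias => a.copy(exprId = freshId())
      case other => other
    }

  // If the aggregate produces an ID that was already seen, renew its aliases.
  // Returns the (possibly renewed) aggregate and the updated set of seen IDs.
  def dedupAggregate(seen: Set[Long], agg: ToyAggregate): (ToyAggregate, Set[Long]) = {
    val ids = findAliasIds(agg.aggregateExpressions)
    if (ids.exists(seen.contains)) {
      val renewed = agg.copy(aggregateExpressions = renewAliases(agg.aggregateExpressions))
      (renewed, seen ++ findAliasIds(renewed.aggregateExpressions))
    } else {
      (agg, seen ++ ids)
    }
  }

  def main(args: Array[String]): Unit = {
    val sales = Alias(Sum(Attr("sales")), "sales", freshId())
    val agg = ToyAggregate(Seq(Attr("channel")), Seq(Attr("channel"), sales))

    // The same aggregate appears twice in a plan, e.g. on two sides of a union.
    val (first, seenAfterFirst) = dedupAggregate(Set.empty, agg)
    val (second, seenAfterSecond) = dedupAggregate(seenAfterFirst, agg)

    println(s"first occurrence:  $first")
    println(s"second occurrence: $second")
    println(s"seen ids: $seenAfterSecond")
  }
}
```

In this toy model the first occurrence is kept as-is and the second one receives a new alias ID. The approved-plan changes in this commit show a similar kind of effect: attribute IDs that were previously shared between repeated aggregates are now distinct.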

sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v2_7/q14a.sf100/explain.txt

Lines changed: 122 additions & 122 deletions
Large diffs are not rendered by default.

sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v2_7/q14a/explain.txt

Lines changed: 122 additions & 122 deletions
Large diffs are not rendered by default.

sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v2_7/q5a.sf100/explain.txt

Lines changed: 34 additions & 34 deletions
@@ -444,60 +444,60 @@ Aggregate Attributes [3]: [sum(sales#39)#139, sum(returns#40)#140, sum(profit#41
 Results [5]: [channel#37, id#38, cast(sum(sales#39)#139 as decimal(37,2)) AS sales#142, cast(sum(returns#40)#140 as decimal(37,2)) AS returns#143, cast(sum(profit#41)#141 as decimal(38,2)) AS profit#144]

 (76) ReusedExchange [Reuses operator id: 74]
-Output [8]: [channel#145, id#146, sum#133, isEmpty#134, sum#135, isEmpty#136, sum#147, isEmpty#148]
+Output [8]: [channel#145, id#146, sum#147, isEmpty#148, sum#149, isEmpty#150, sum#151, isEmpty#152]

 (77) HashAggregate [codegen id : 48]
-Input [8]: [channel#145, id#146, sum#133, isEmpty#134, sum#135, isEmpty#136, sum#147, isEmpty#148]
+Input [8]: [channel#145, id#146, sum#147, isEmpty#148, sum#149, isEmpty#150, sum#151, isEmpty#152]
 Keys [2]: [channel#145, id#146]
-Functions [3]: [sum(sales#39), sum(returns#40), sum(profit#149)]
-Aggregate Attributes [3]: [sum(sales#39)#139, sum(returns#40)#140, sum(profit#149)#141]
-Results [4]: [channel#145, sum(sales#39)#139 AS sales#150, sum(returns#40)#140 AS returns#151, sum(profit#149)#141 AS profit#152]
+Functions [3]: [sum(sales#153), sum(returns#154), sum(profit#155)]
+Aggregate Attributes [3]: [sum(sales#153)#139, sum(returns#154)#140, sum(profit#155)#141]
+Results [4]: [channel#145, sum(sales#153)#139 AS sales#156, sum(returns#154)#140 AS returns#157, sum(profit#155)#141 AS profit#158]

 (78) HashAggregate [codegen id : 48]
-Input [4]: [channel#145, sales#150, returns#151, profit#152]
+Input [4]: [channel#145, sales#156, returns#157, profit#158]
 Keys [1]: [channel#145]
-Functions [3]: [partial_sum(sales#150), partial_sum(returns#151), partial_sum(profit#152)]
-Aggregate Attributes [6]: [sum#153, isEmpty#154, sum#155, isEmpty#156, sum#157, isEmpty#158]
-Results [7]: [channel#145, sum#159, isEmpty#160, sum#161, isEmpty#162, sum#163, isEmpty#164]
+Functions [3]: [partial_sum(sales#156), partial_sum(returns#157), partial_sum(profit#158)]
+Aggregate Attributes [6]: [sum#159, isEmpty#160, sum#161, isEmpty#162, sum#163, isEmpty#164]
+Results [7]: [channel#145, sum#165, isEmpty#166, sum#167, isEmpty#168, sum#169, isEmpty#170]

 (79) Exchange
-Input [7]: [channel#145, sum#159, isEmpty#160, sum#161, isEmpty#162, sum#163, isEmpty#164]
+Input [7]: [channel#145, sum#165, isEmpty#166, sum#167, isEmpty#168, sum#169, isEmpty#170]
 Arguments: hashpartitioning(channel#145, 5), ENSURE_REQUIREMENTS, [plan_id=10]

 (80) HashAggregate [codegen id : 49]
-Input [7]: [channel#145, sum#159, isEmpty#160, sum#161, isEmpty#162, sum#163, isEmpty#164]
+Input [7]: [channel#145, sum#165, isEmpty#166, sum#167, isEmpty#168, sum#169, isEmpty#170]
 Keys [1]: [channel#145]
-Functions [3]: [sum(sales#150), sum(returns#151), sum(profit#152)]
-Aggregate Attributes [3]: [sum(sales#150)#165, sum(returns#151)#166, sum(profit#152)#167]
-Results [5]: [channel#145, null AS id#168, sum(sales#150)#165 AS sum(sales)#169, sum(returns#151)#166 AS sum(returns)#170, sum(profit#152)#167 AS sum(profit)#171]
+Functions [3]: [sum(sales#156), sum(returns#157), sum(profit#158)]
+Aggregate Attributes [3]: [sum(sales#156)#171, sum(returns#157)#172, sum(profit#158)#173]
+Results [5]: [channel#145, null AS id#174, sum(sales#156)#171 AS sum(sales)#175, sum(returns#157)#172 AS sum(returns)#176, sum(profit#158)#173 AS sum(profit)#177]

 (81) ReusedExchange [Reuses operator id: 74]
-Output [8]: [channel#172, id#173, sum#133, isEmpty#134, sum#135, isEmpty#136, sum#174, isEmpty#175]
+Output [8]: [channel#178, id#179, sum#180, isEmpty#181, sum#182, isEmpty#183, sum#184, isEmpty#185]

 (82) HashAggregate [codegen id : 73]
-Input [8]: [channel#172, id#173, sum#133, isEmpty#134, sum#135, isEmpty#136, sum#174, isEmpty#175]
-Keys [2]: [channel#172, id#173]
-Functions [3]: [sum(sales#39), sum(returns#40), sum(profit#176)]
-Aggregate Attributes [3]: [sum(sales#39)#139, sum(returns#40)#140, sum(profit#176)#141]
-Results [3]: [sum(sales#39)#139 AS sales#177, sum(returns#40)#140 AS returns#178, sum(profit#176)#141 AS profit#179]
+Input [8]: [channel#178, id#179, sum#180, isEmpty#181, sum#182, isEmpty#183, sum#184, isEmpty#185]
+Keys [2]: [channel#178, id#179]
+Functions [3]: [sum(sales#186), sum(returns#187), sum(profit#188)]
+Aggregate Attributes [3]: [sum(sales#186)#139, sum(returns#187)#140, sum(profit#188)#141]
+Results [3]: [sum(sales#186)#139 AS sales#189, sum(returns#187)#140 AS returns#190, sum(profit#188)#141 AS profit#191]

 (83) HashAggregate [codegen id : 73]
-Input [3]: [sales#177, returns#178, profit#179]
+Input [3]: [sales#189, returns#190, profit#191]
 Keys: []
-Functions [3]: [partial_sum(sales#177), partial_sum(returns#178), partial_sum(profit#179)]
-Aggregate Attributes [6]: [sum#180, isEmpty#181, sum#182, isEmpty#183, sum#184, isEmpty#185]
-Results [6]: [sum#186, isEmpty#187, sum#188, isEmpty#189, sum#190, isEmpty#191]
+Functions [3]: [partial_sum(sales#189), partial_sum(returns#190), partial_sum(profit#191)]
+Aggregate Attributes [6]: [sum#192, isEmpty#193, sum#194, isEmpty#195, sum#196, isEmpty#197]
+Results [6]: [sum#198, isEmpty#199, sum#200, isEmpty#201, sum#202, isEmpty#203]

 (84) Exchange
-Input [6]: [sum#186, isEmpty#187, sum#188, isEmpty#189, sum#190, isEmpty#191]
+Input [6]: [sum#198, isEmpty#199, sum#200, isEmpty#201, sum#202, isEmpty#203]
 Arguments: SinglePartition, ENSURE_REQUIREMENTS, [plan_id=11]

 (85) HashAggregate [codegen id : 74]
-Input [6]: [sum#186, isEmpty#187, sum#188, isEmpty#189, sum#190, isEmpty#191]
+Input [6]: [sum#198, isEmpty#199, sum#200, isEmpty#201, sum#202, isEmpty#203]
 Keys: []
-Functions [3]: [sum(sales#177), sum(returns#178), sum(profit#179)]
-Aggregate Attributes [3]: [sum(sales#177)#192, sum(returns#178)#193, sum(profit#179)#194]
-Results [5]: [null AS channel#195, null AS id#196, sum(sales#177)#192 AS sum(sales)#197, sum(returns#178)#193 AS sum(returns)#198, sum(profit#179)#194 AS sum(profit)#199]
+Functions [3]: [sum(sales#189), sum(returns#190), sum(profit#191)]
+Aggregate Attributes [3]: [sum(sales#189)#204, sum(returns#190)#205, sum(profit#191)#206]
+Results [5]: [null AS channel#207, null AS id#208, sum(sales#189)#204 AS sum(sales)#209, sum(returns#190)#205 AS sum(returns)#210, sum(profit#191)#206 AS sum(profit)#211]

 (86) Union

@@ -534,22 +534,22 @@ BroadcastExchange (95)


 (91) Scan parquet spark_catalog.default.date_dim
-Output [2]: [d_date_sk#24, d_date#200]
+Output [2]: [d_date_sk#24, d_date#212]
 Batched: true
 Location [not included in comparison]/{warehouse_dir}/date_dim]
 PushedFilters: [IsNotNull(d_date), GreaterThanOrEqual(d_date,1998-08-04), LessThanOrEqual(d_date,1998-08-18), IsNotNull(d_date_sk)]
 ReadSchema: struct<d_date_sk:int,d_date:date>

 (92) ColumnarToRow [codegen id : 1]
-Input [2]: [d_date_sk#24, d_date#200]
+Input [2]: [d_date_sk#24, d_date#212]

 (93) Filter [codegen id : 1]
-Input [2]: [d_date_sk#24, d_date#200]
-Condition : (((isnotnull(d_date#200) AND (d_date#200 >= 1998-08-04)) AND (d_date#200 <= 1998-08-18)) AND isnotnull(d_date_sk#24))
+Input [2]: [d_date_sk#24, d_date#212]
+Condition : (((isnotnull(d_date#212) AND (d_date#212 >= 1998-08-04)) AND (d_date#212 <= 1998-08-18)) AND isnotnull(d_date_sk#24))

 (94) Project [codegen id : 1]
 Output [1]: [d_date_sk#24]
-Input [2]: [d_date_sk#24, d_date#200]
+Input [2]: [d_date_sk#24, d_date#212]

 (95) BroadcastExchange
 Input [1]: [d_date_sk#24]

sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v2_7/q5a/explain.txt

Lines changed: 34 additions & 34 deletions
@@ -429,60 +429,60 @@ Aggregate Attributes [3]: [sum(sales#39)#139, sum(returns#40)#140, sum(profit#41
 Results [5]: [channel#37, id#38, cast(sum(sales#39)#139 as decimal(37,2)) AS sales#142, cast(sum(returns#40)#140 as decimal(37,2)) AS returns#143, cast(sum(profit#41)#141 as decimal(38,2)) AS profit#144]

 (73) ReusedExchange [Reuses operator id: 71]
-Output [8]: [channel#145, id#146, sum#133, isEmpty#134, sum#135, isEmpty#136, sum#147, isEmpty#148]
+Output [8]: [channel#145, id#146, sum#147, isEmpty#148, sum#149, isEmpty#150, sum#151, isEmpty#152]

 (74) HashAggregate [codegen id : 42]
-Input [8]: [channel#145, id#146, sum#133, isEmpty#134, sum#135, isEmpty#136, sum#147, isEmpty#148]
+Input [8]: [channel#145, id#146, sum#147, isEmpty#148, sum#149, isEmpty#150, sum#151, isEmpty#152]
 Keys [2]: [channel#145, id#146]
-Functions [3]: [sum(sales#39), sum(returns#40), sum(profit#149)]
-Aggregate Attributes [3]: [sum(sales#39)#139, sum(returns#40)#140, sum(profit#149)#141]
-Results [4]: [channel#145, sum(sales#39)#139 AS sales#150, sum(returns#40)#140 AS returns#151, sum(profit#149)#141 AS profit#152]
+Functions [3]: [sum(sales#153), sum(returns#154), sum(profit#155)]
+Aggregate Attributes [3]: [sum(sales#153)#139, sum(returns#154)#140, sum(profit#155)#141]
+Results [4]: [channel#145, sum(sales#153)#139 AS sales#156, sum(returns#154)#140 AS returns#157, sum(profit#155)#141 AS profit#158]

 (75) HashAggregate [codegen id : 42]
-Input [4]: [channel#145, sales#150, returns#151, profit#152]
+Input [4]: [channel#145, sales#156, returns#157, profit#158]
 Keys [1]: [channel#145]
-Functions [3]: [partial_sum(sales#150), partial_sum(returns#151), partial_sum(profit#152)]
-Aggregate Attributes [6]: [sum#153, isEmpty#154, sum#155, isEmpty#156, sum#157, isEmpty#158]
-Results [7]: [channel#145, sum#159, isEmpty#160, sum#161, isEmpty#162, sum#163, isEmpty#164]
+Functions [3]: [partial_sum(sales#156), partial_sum(returns#157), partial_sum(profit#158)]
+Aggregate Attributes [6]: [sum#159, isEmpty#160, sum#161, isEmpty#162, sum#163, isEmpty#164]
+Results [7]: [channel#145, sum#165, isEmpty#166, sum#167, isEmpty#168, sum#169, isEmpty#170]

 (76) Exchange
-Input [7]: [channel#145, sum#159, isEmpty#160, sum#161, isEmpty#162, sum#163, isEmpty#164]
+Input [7]: [channel#145, sum#165, isEmpty#166, sum#167, isEmpty#168, sum#169, isEmpty#170]
 Arguments: hashpartitioning(channel#145, 5), ENSURE_REQUIREMENTS, [plan_id=9]

 (77) HashAggregate [codegen id : 43]
-Input [7]: [channel#145, sum#159, isEmpty#160, sum#161, isEmpty#162, sum#163, isEmpty#164]
+Input [7]: [channel#145, sum#165, isEmpty#166, sum#167, isEmpty#168, sum#169, isEmpty#170]
 Keys [1]: [channel#145]
-Functions [3]: [sum(sales#150), sum(returns#151), sum(profit#152)]
-Aggregate Attributes [3]: [sum(sales#150)#165, sum(returns#151)#166, sum(profit#152)#167]
-Results [5]: [channel#145, null AS id#168, sum(sales#150)#165 AS sum(sales)#169, sum(returns#151)#166 AS sum(returns)#170, sum(profit#152)#167 AS sum(profit)#171]
+Functions [3]: [sum(sales#156), sum(returns#157), sum(profit#158)]
+Aggregate Attributes [3]: [sum(sales#156)#171, sum(returns#157)#172, sum(profit#158)#173]
+Results [5]: [channel#145, null AS id#174, sum(sales#156)#171 AS sum(sales)#175, sum(returns#157)#172 AS sum(returns)#176, sum(profit#158)#173 AS sum(profit)#177]

 (78) ReusedExchange [Reuses operator id: 71]
-Output [8]: [channel#172, id#173, sum#133, isEmpty#134, sum#135, isEmpty#136, sum#174, isEmpty#175]
+Output [8]: [channel#178, id#179, sum#180, isEmpty#181, sum#182, isEmpty#183, sum#184, isEmpty#185]

 (79) HashAggregate [codegen id : 64]
-Input [8]: [channel#172, id#173, sum#133, isEmpty#134, sum#135, isEmpty#136, sum#174, isEmpty#175]
-Keys [2]: [channel#172, id#173]
-Functions [3]: [sum(sales#39), sum(returns#40), sum(profit#176)]
-Aggregate Attributes [3]: [sum(sales#39)#139, sum(returns#40)#140, sum(profit#176)#141]
-Results [3]: [sum(sales#39)#139 AS sales#177, sum(returns#40)#140 AS returns#178, sum(profit#176)#141 AS profit#179]
+Input [8]: [channel#178, id#179, sum#180, isEmpty#181, sum#182, isEmpty#183, sum#184, isEmpty#185]
+Keys [2]: [channel#178, id#179]
+Functions [3]: [sum(sales#186), sum(returns#187), sum(profit#188)]
+Aggregate Attributes [3]: [sum(sales#186)#139, sum(returns#187)#140, sum(profit#188)#141]
+Results [3]: [sum(sales#186)#139 AS sales#189, sum(returns#187)#140 AS returns#190, sum(profit#188)#141 AS profit#191]

 (80) HashAggregate [codegen id : 64]
-Input [3]: [sales#177, returns#178, profit#179]
+Input [3]: [sales#189, returns#190, profit#191]
 Keys: []
-Functions [3]: [partial_sum(sales#177), partial_sum(returns#178), partial_sum(profit#179)]
-Aggregate Attributes [6]: [sum#180, isEmpty#181, sum#182, isEmpty#183, sum#184, isEmpty#185]
-Results [6]: [sum#186, isEmpty#187, sum#188, isEmpty#189, sum#190, isEmpty#191]
+Functions [3]: [partial_sum(sales#189), partial_sum(returns#190), partial_sum(profit#191)]
+Aggregate Attributes [6]: [sum#192, isEmpty#193, sum#194, isEmpty#195, sum#196, isEmpty#197]
+Results [6]: [sum#198, isEmpty#199, sum#200, isEmpty#201, sum#202, isEmpty#203]

 (81) Exchange
-Input [6]: [sum#186, isEmpty#187, sum#188, isEmpty#189, sum#190, isEmpty#191]
+Input [6]: [sum#198, isEmpty#199, sum#200, isEmpty#201, sum#202, isEmpty#203]
 Arguments: SinglePartition, ENSURE_REQUIREMENTS, [plan_id=10]

 (82) HashAggregate [codegen id : 65]
-Input [6]: [sum#186, isEmpty#187, sum#188, isEmpty#189, sum#190, isEmpty#191]
+Input [6]: [sum#198, isEmpty#199, sum#200, isEmpty#201, sum#202, isEmpty#203]
 Keys: []
-Functions [3]: [sum(sales#177), sum(returns#178), sum(profit#179)]
-Aggregate Attributes [3]: [sum(sales#177)#192, sum(returns#178)#193, sum(profit#179)#194]
-Results [5]: [null AS channel#195, null AS id#196, sum(sales#177)#192 AS sum(sales)#197, sum(returns#178)#193 AS sum(returns)#198, sum(profit#179)#194 AS sum(profit)#199]
+Functions [3]: [sum(sales#189), sum(returns#190), sum(profit#191)]
+Aggregate Attributes [3]: [sum(sales#189)#204, sum(returns#190)#205, sum(profit#191)#206]
+Results [5]: [null AS channel#207, null AS id#208, sum(sales#189)#204 AS sum(sales)#209, sum(returns#190)#205 AS sum(returns)#210, sum(profit#191)#206 AS sum(profit)#211]

 (83) Union

@@ -519,22 +519,22 @@ BroadcastExchange (92)


 (88) Scan parquet spark_catalog.default.date_dim
-Output [2]: [d_date_sk#22, d_date#200]
+Output [2]: [d_date_sk#22, d_date#212]
 Batched: true
 Location [not included in comparison]/{warehouse_dir}/date_dim]
 PushedFilters: [IsNotNull(d_date), GreaterThanOrEqual(d_date,1998-08-04), LessThanOrEqual(d_date,1998-08-18), IsNotNull(d_date_sk)]
 ReadSchema: struct<d_date_sk:int,d_date:date>

 (89) ColumnarToRow [codegen id : 1]
-Input [2]: [d_date_sk#22, d_date#200]
+Input [2]: [d_date_sk#22, d_date#212]

 (90) Filter [codegen id : 1]
-Input [2]: [d_date_sk#22, d_date#200]
-Condition : (((isnotnull(d_date#200) AND (d_date#200 >= 1998-08-04)) AND (d_date#200 <= 1998-08-18)) AND isnotnull(d_date_sk#22))
+Input [2]: [d_date_sk#22, d_date#212]
+Condition : (((isnotnull(d_date#212) AND (d_date#212 >= 1998-08-04)) AND (d_date#212 <= 1998-08-18)) AND isnotnull(d_date_sk#22))

 (91) Project [codegen id : 1]
 Output [1]: [d_date_sk#22]
-Input [2]: [d_date_sk#22, d_date#200]
+Input [2]: [d_date_sk#22, d_date#212]

 (92) BroadcastExchange
 Input [1]: [d_date_sk#22]
