[SPARK-41914][SQL] FileFormatWriter materializes AQE plan before accessing outputOrdering #39431
Conversation
@cloud-fan here is the fix for SPARK-40588 migrated to Spark 3.4. This finally includes unit tests for the actual plan written to files (that has never been tested before).
        |""".stripMargin)
      executeAndCheckOrdering(
-       hasLogicalSort = true, orderingMatched = enabled, hasEmpty2Null = enabled) {
+       hasLogicalSort = true, orderingMatched = true, hasEmpty2Null = enabled) {
do we still need the orderingMatched parameter if it's always true?
Not sure what you mean, as in needed in this test?
I mean, shall we remove orderingMatched from the method executeAndCheckOrdering?
There is no default value for `orderingMatched`, and two other unit tests still use `orderingMatched = enabled`.
      // SPARK-40885: this bug removes the in-partition sort, which manifests here
      case (true, SortExec(Seq(
        SortOrder(AttributeReference("value", StringType, _, _), Ascending, NullsFirst, _)
do you know why the sorting key is different when planned write is enabled?
That is correctness bug SPARK-40885, discussed in #38356.
@ulysses-you can you take a look at this bug?
All tests green: https://github.com/G-Research/spark/actions/runs/3855300306
Can one of the admins verify this patch?
    // Use the output ordering from the original plan before adding the empty2null projection.
-   val actualOrdering = writeFilesOpt.map(_.child).getOrElse(plan).outputOrdering.map(_.child)
+   val actualOrdering = writeFilesOpt.map(_.child)
+     .getOrElse(materializeAdaptiveSparkPlan(plan))
Shall we put all code changes inside the branch where `writeFilesOpt` is empty? If `writeFilesOpt` is defined, that means the write has been planned, which does not have this issue.
`.getOrElse` already does what you said, doesn't it?
Yes, `materializeAdaptiveSparkPlan` is applied to `plan` only if `writeFilesOpt` is undefined.
thanks, merging to master!
Thanks! |
What changes were proposed in this pull request?
The `FileFormatWriter` materializes an `AdaptiveQueryPlan` before accessing the plan's `outputOrdering`. This is required when planned writing is disabled (`spark.sql.optimizer.plannedWrite.enabled` is `true` by default). With planned writing enabled, `FileFormatWriter` gets the final plan already.

Why are the changes needed?
`FileFormatWriter` enforces an ordering if the written plan does not provide that ordering. An `AdaptiveQueryPlan` does not know its final ordering, in which case `FileFormatWriter` enforces the ordering (e.g. by column `"a"`) even if the plan provides a compatible ordering (e.g. by columns `"a", "b"`). In case of spilling, that order (e.g. by columns `"a", "b"`) gets broken (see SPARK-40588).

Does this PR introduce any user-facing change?
This fixes SPARK-40588 for 3.4, which was introduced in 3.0. This restores behaviour from Spark 2.4.
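For context, the ordering-compatibility issue described above can be illustrated with a small standalone sketch. This is a hypothetical model, not Spark's actual `SortOrder` classes: a required ordering is satisfied when it is a prefix of the plan's actual output ordering, so sorting by `("a", "b")` already satisfies a requirement of `("a")` and no extra sort should be enforced.

```scala
// Hypothetical standalone model (not Spark classes) of ordering satisfaction:
// a required ordering is satisfied when it is a prefix of the actual one.
case class SortCol(name: String, ascending: Boolean = true)

def orderingSatisfies(actual: Seq[SortCol], required: Seq[SortCol]): Boolean =
  required.length <= actual.length &&
    required.zip(actual).forall { case (r, a) => r == a }

object OrderingDemo extends App {
  val actual   = Seq(SortCol("a"), SortCol("b")) // plan's real output ordering
  val required = Seq(SortCol("a"))               // ordering the writer needs

  // No extra sort is needed: the plan's ordering already covers the requirement.
  assert(orderingSatisfies(actual, required))

  // A wrapper that reports no ordering forces a redundant sort by "a",
  // which on spill destroys the finer ("a", "b") order.
  assert(!orderingSatisfies(Seq.empty, required))
}
```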
How was this patch tested?
The final plan that is written to files is now stored in
FileFormatWriter.executedPlan(similar to existingFileFormatWriter.outputOrderingMatched). Unit tests assert the outermost sort order written to files.The actual plan written into the files changed from (taken from
"SPARK-41914: v1 write with AQE and in-partition sorted - non-string partition column"):where
FileFormatWriterenforces order withSort [input[2, int, false] ASC NULLS FIRST], false, 0, towhere the sort given by the user is the outermost sort now.