Skip to content

Conversation

@cloud-fan
Copy link
Contributor

What changes were proposed in this pull request?

This PR updates ShuffleExchangeExec to carry more information about how much we can change the partitioning. For repartition(col), we should preserve the user-specified partitioning and don't apply the AQE local shuffle reader.

Why are the changes needed?

Similar to repartition(number, col), we should respect the user-specified partitioning.

Does this PR introduce any user-facing change?

No

How was this patch tested?

a new test

@cloud-fan
Copy link
Contributor Author

cloud-fan commented Nov 19, 2020

@github-actions github-actions bot added the SQL label Nov 19, 2020
@SparkQA
Copy link

SparkQA commented Nov 19, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35973/

@SparkQA
Copy link

SparkQA commented Nov 19, 2020

Test build #131369 has finished for PR 30432 at commit 7b556cf.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Nov 19, 2020

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35973/

@JkSelf
Copy link
Contributor

JkSelf commented Nov 20, 2020

@cloud-fan
If we want to preserve the user-specified partitioning, whether need to disable the CoalesceShufflePartitions rule for the repartition(col) or not? Because the outputPartitioning will be changed to UnknownPartitioning after coalesced.

@maryannxue
Copy link
Contributor

maryannxue commented Nov 20, 2020

I believe this is more of a refactoring than a fix, @JkSelf .
Can we change the title of this PR, @cloud-fan ?

* Returns whether the shuffle partition number can be changed.
*/
def canChangeNumPartitions: Boolean
def canChangeNumPartitions: Boolean = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like these two methods are redundant if we have partitioningFlexibilty already.

I know that removing these methods would cause more code changes on the caller side, but for the integrity and simplicity of the interface, it's probably worth it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is more like a util method as implementations don't need to implement it.

@cloud-fan
Copy link
Contributor Author

whether need to disable the CoalesceShufflePartitions rule for the repartition(col) or not?

We already did. We don't coalesce shuffles if canChangeNumPartitions return false.

@cloud-fan
Copy link
Contributor Author

I believe this is more of a refactoring than a fix

It does fix an issue, see the added test.

@JkSelf
Copy link
Contributor

JkSelf commented Nov 23, 2020

We already did. We don't coalesce shuffles if canChangeNumPartitions return false.

The canChangeNumPartitions return true when call the repartition(col). So aqe will coalesce shuffles and change the user-specified partitioning to UnknownPartitioning .

@cloud-fan
Copy link
Contributor Author

cloud-fan commented Nov 23, 2020

@JkSelf code-wise, you are right, as it becomes UnknownPartitioning. However, here I'm talking about the real partitioning. After coalesce, the data is still "partitioned by col" for repartition(col), so we don't break user expectations. Does it make sense to you?

@JkSelf
Copy link
Contributor

JkSelf commented Nov 23, 2020

I see. It makes sense to me. Thanks for your detailed explanations.

@SparkQA
Copy link

SparkQA commented Nov 23, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/36148/

@SparkQA
Copy link

SparkQA commented Nov 23, 2020

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/36148/

@SparkQA
Copy link

SparkQA commented Nov 23, 2020

Test build #131545 has finished for PR 30432 at commit 40a0096.

  • This patch fails SparkR unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@wangyum
Copy link
Member

wangyum commented Nov 23, 2020

retest this please.

@SparkQA
Copy link

SparkQA commented Nov 23, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/36152/

@SparkQA
Copy link

SparkQA commented Nov 23, 2020

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/36152/

@SparkQA
Copy link

SparkQA commented Nov 23, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/36154/

@SparkQA
Copy link

SparkQA commented Nov 23, 2020

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/36154/

@SparkQA
Copy link

SparkQA commented Nov 23, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/36158/

@SparkQA
Copy link

SparkQA commented Nov 23, 2020

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/36158/

@SparkQA
Copy link

SparkQA commented Nov 23, 2020

Test build #131556 has finished for PR 30432 at commit 5f8f4ed.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Nov 23, 2020

Test build #131551 has finished for PR 30432 at commit 5383910.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Nov 23, 2020

Test build #131549 has finished for PR 30432 at commit 40a0096.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Nov 24, 2020

Test build #131595 has finished for PR 30432 at commit 5f8f4ed.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

}

// Describes where the shuffle operator comes from.
object ShuffleOrigin extends Enumeration {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use seal trait/abstract class, so we type won't have a trailing .Value?

@SparkQA
Copy link

SparkQA commented Nov 24, 2020

Test build #131627 has finished for PR 30432 at commit 1ed28f2.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Copy link
Contributor Author

retest this please

@SparkQA
Copy link

SparkQA commented Nov 24, 2020

Test build #131636 has finished for PR 30432 at commit 1ed28f2.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Contributor

@maryannxue maryannxue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@gatorsmile
Copy link
Member

LGTM

Thanks! Merged to master

@cloud-fan cloud-fan closed this in d1b4f06 Nov 25, 2020
jdcasale pushed a commit to palantir/spark that referenced this pull request May 12, 2021
This PR updates `ShuffleExchangeExec` to carry more information about how much we can change the partitioning. For `repartition(col)`, we should preserve the user-specified partitioning and don't apply the AQE local shuffle reader.

Similar to `repartition(number, col)`, we should respect the user-specified partitioning.

No

a new test

Closes apache#30432 from cloud-fan/aqe.

Authored-by: Wenchen Fan <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
MGHawes pushed a commit to palantir/spark that referenced this pull request May 16, 2021
This PR updates `ShuffleExchangeExec` to carry more information about how much we can change the partitioning. For `repartition(col)`, we should preserve the user-specified partitioning and don't apply the AQE local shuffle reader.

Similar to `repartition(number, col)`, we should respect the user-specified partitioning.

No

a new test

Closes apache#30432 from cloud-fan/aqe.

Authored-by: Wenchen Fan <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
LorenzoMartini pushed a commit to palantir/spark that referenced this pull request May 18, 2021
This PR updates `ShuffleExchangeExec` to carry more information about how much we can change the partitioning. For `repartition(col)`, we should preserve the user-specified partitioning and don't apply the AQE local shuffle reader.

Similar to `repartition(number, col)`, we should respect the user-specified partitioning.

No

a new test

Closes apache#30432 from cloud-fan/aqe.

Authored-by: Wenchen Fan <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
LorenzoMartini pushed a commit to palantir/spark that referenced this pull request May 18, 2021
This PR updates `ShuffleExchangeExec` to carry more information about how much we can change the partitioning. For `repartition(col)`, we should preserve the user-specified partitioning and don't apply the AQE local shuffle reader.

Similar to `repartition(number, col)`, we should respect the user-specified partitioning.

No

a new test

Closes apache#30432 from cloud-fan/aqe.

Authored-by: Wenchen Fan <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
LorenzoMartini pushed a commit to palantir/spark that referenced this pull request May 19, 2021
This PR updates `ShuffleExchangeExec` to carry more information about how much we can change the partitioning. For `repartition(col)`, we should preserve the user-specified partitioning and don't apply the AQE local shuffle reader.

Similar to `repartition(number, col)`, we should respect the user-specified partitioning.

No

a new test

Closes apache#30432 from cloud-fan/aqe.

Authored-by: Wenchen Fan <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
16pierre pushed a commit to 16pierre/spark that referenced this pull request May 24, 2021
This PR updates `ShuffleExchangeExec` to carry more information about how much we can change the partitioning. For `repartition(col)`, we should preserve the user-specified partitioning and don't apply the AQE local shuffle reader.

Similar to `repartition(number, col)`, we should respect the user-specified partitioning.

No

a new test

Closes apache#30432 from cloud-fan/aqe.

Authored-by: Wenchen Fan <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants