Skip to content

Conversation

@Ngone51
Copy link
Member

@Ngone51 Ngone51 commented Aug 27, 2020

What changes were proposed in this pull request?

This PR proposes to add a specific AQEOptimizer for the AdaptiveSparkPlanExec instead of implementing an anonymous RuleExecutor. At the same time, this PR also adds the configuration spark.sql.adaptive.optimizer.excludedRules, which follows the same pattern of Optimizer, to make the AQEOptimizer more flexible for users and developers.

Why are the changes needed?

Currently, AdaptiveSparkPlanExec has implemented an anonymous RuleExecutor to apply the AQE optimize rules on the plan. However, the anonymous class usually could be inconvenient to maintain and extend for the long term.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

It's a pure refactor so pass existing tests should be ok.

@Ngone51
Copy link
Member Author

Ngone51 commented Aug 27, 2020

@cloud-fan @maryannxue Please take a look, thanks!

@SparkQA
Copy link

SparkQA commented Aug 27, 2020

Test build #127955 has finished for PR 29559 at commit 291db45.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Aug 27, 2020

Test build #127956 has finished for PR 29559 at commit 9f4d3b5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Comment on lines 525 to 532
val ADAPTIVE_OPTIMIZER_EXCLUDED_RULES =
buildConf("spark.sql.adaptive.optimizer.excludedRules")
.doc("Configures a list of rules to be disabled in the adaptive optimizer, in which the " +
"rules are specified by their rule names and separated by comma. The optimizer will log " +
"the rules that have indeed been excluded.")
.version("3.1.0")
.stringConf
.createOptional
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wondering why we need to introduce a config for user to disable rule in AQE? Is there a use case we are considering now? I am seeing spark.sql.optimizer.excludedRules was introduced to work around https://issues.apache.org/jira/browse/SPARK-24624 (#21764 (comment)), but not sure what's the use case here for AQE.

@Ngone51
Copy link
Member Author

Ngone51 commented Aug 28, 2020

@c21 @maropu Thank you for your comments. spark.sql.adaptive.optimizer.excludedRule doesn't have a real use case yet, but I think it's quite intuitive to follow the normal optimizer given the same reason(more flexible for users). I also think it can be very useful for developers to write certain tests in AQE. For example, If they want to avoid applying the rule DemoteBroadcastHashJoin. Previously, they have to tweak the size of the test data and the partition num, which can be troublesome. With this configuration, they can just exclude the rule easily.

@maropu
Copy link
Member

maropu commented Aug 28, 2020

yea, adding it itself looks okay to me.

@c21
Copy link
Contributor

c21 commented Aug 28, 2020

@Ngone51 - thanks, the rationale for developer makes sense to me.

@Ngone51
Copy link
Member Author

Ngone51 commented Aug 28, 2020

Thanks, I've added a unit test and updated the PR description for the rule excluding configuration.

@SparkQA
Copy link

SparkQA commented Aug 28, 2020

Test build #127991 has finished for PR 29559 at commit cfd348e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Copy link
Member

Merged to master.

@Ngone51
Copy link
Member Author

Ngone51 commented Aug 28, 2020

thanks all!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants