-
Notifications
You must be signed in to change notification settings - Fork 28.9k
[SPARK-27604][SQL] Enhance constant propagation #24553
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-27604][SQL] Enhance constant propagation #24553
Conversation
This PR is extracted from #24495 as @dongjoon-hyun suggested to make review easier. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@peter-toth may you please also add some UTs? eg. the example you added in the scaladoc may be also formalized with a UT... thanks.
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala
Show resolved
Hide resolved
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala
Show resolved
Hide resolved
@mgaido91, I've moved the relevant UTs from the other PR here. |
ok to test |
1 similar comment
ok to test |
Test build #105274 has finished for PR 24553 at commit
|
retest this please |
...core/src/test/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategySuite.scala
Show resolved
Hide resolved
Test build #105275 has finished for PR 24553 at commit
|
Test build #105278 has finished for PR 24553 at commit
|
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala
Outdated
Show resolved
Hide resolved
Test build #105283 has finished for PR 24553 at commit
|
Any comments or suggestions @dongjoon-hyun or @gatorsmile ? |
@dongjoon-hyun @gatorsmile @mgaido91 @viirya @HyukjinKwon any feedback or suggestion is very welcome. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the change makes sense in general, just some questions on the tests. Thanks @peter-toth
...core/src/test/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategySuite.scala
Show resolved
Hide resolved
...talyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/ConstantPropagationSuite.scala
Outdated
Show resolved
Hide resolved
...talyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/ConstantPropagationSuite.scala
Outdated
Show resolved
Hide resolved
Thanks so much @mgaido91 for the review. Please let me know if you have any more questions. |
Test build #105769 has finished for PR 24553 at commit
|
...talyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/ConstantPropagationSuite.scala
Outdated
Show resolved
Hide resolved
...core/src/test/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategySuite.scala
Show resolved
Hide resolved
Test build #105780 has finished for PR 24553 at commit
|
LGTM, thanks! |
@dongjoon-hyun @gatorsmile @cloud-fan @viirya @HyukjinKwon could you please help me to review this PR? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From a cursory look, seems making sense but I guess @hvanhovell should take a look as well.
retest this please |
Test build #106493 has finished for PR 24553 at commit
|
827cfe5
to
054624e
Compare
Test build #106794 has finished for PR 24553 at commit
|
retest this please |
retest this please |
Test build #116717 has finished for PR 24553 at commit
|
Test build #116721 has finished for PR 24553 at commit
|
39c563d
to
85d5430
Compare
Test build #116792 has finished for PR 24553 at commit
|
Test build #116861 has finished for PR 24553 at commit
|
Test build #116866 has finished for PR 24553 at commit
|
Test build #116980 has finished for PR 24553 at commit
|
4f0a89a
to
5d67ffd
Compare
Test build #117354 has finished for PR 24553 at commit
|
5d67ffd
to
767233e
Compare
Test build #117514 has finished for PR 24553 at commit
|
767233e
to
6c33c71
Compare
Test build #117661 has finished for PR 24553 at commit
|
6c33c71
to
63ea687
Compare
Test build #117663 has finished for PR 24553 at commit
|
@cloud-fan, @dongjoon-hyun, @maropu I improved |
Test build #118355 has finished for PR 24553 at commit
|
We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable. |
What changes were proposed in this pull request?
This PR improves
ConstantPropagation
rule, the new implementation:e.g.
a = 1 AND a = 2
=>a = 1 AND 1 = 2
Before this PR this unsatisfiable filter was not reduced:
e.g.
abs(a) = 5 AND b = abs(a) + 3
=>abs(a) = 5 AND b = 8
EqualTo
andEqualNullSafe
expressions:e.g.
a = 5 AND b < a + 3
=>a = 5 AND b < 8
Project
and other nodes (except forJoin
):e.g.
SELECT a = 5 AND b = a + 3
=>SELECT a = 5 AND b = 8
Before this PR only
Filter
was supported.e.g.
... WHERE IF(..., a = 5 AND b = a + 3, ...)
=>... WHERE IF(..., a = 5 AND b = 8, ...)
Before this PR only top level
And
/Or
/Not
nodes were supported.... WHERE a = c AND f(a)
orIF(a = c AND f(a), ..., ...)
wherea
is a nullable expression andc
is a constant thenull
result ofa = c AND f(a)
means the same as if it resultedfalse
therefore constant propagation can be safely applied (a = c AND f(a)
=>a = c AND f(c)
).SELECT a = c AND f(a)
thenull
result really meansnull
. In this context constant propagation can't be applied safely.Not
in which the context flips. E.g. constant propagation can't be applied onWHERE NOT(a = c AND f(a))
but can be again onWHERE NOT(IF(..., NOT(a = c AND f(a)), ...)
.Why are the changes needed?
Improve performance.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
New UTs.