-
Notifications
You must be signed in to change notification settings - Fork 28.9k
[SPARK-26057][SQL] Transform also analyzed plans when dedup references #23035
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Test build #98829 has finished for PR 23035 at commit
|
| } | ||
|
|
||
| test("SPARK-26057: attribute deduplication on already analyzed plans") { | ||
| withTempView("cc", "p", "c") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we don't care about naming, how about a, b, c instead of cc, p, c?
| | WHERE c.id = cc.id AND c.layout = cc.layout AND c.ts > p.ts) | ||
| |GROUP BY cc.id, cc.layout | ||
| """.stripMargin).createOrReplaceTempView("pcc") | ||
| val res = spark.sql( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch on the problem! Do you think it's possible to simplify the test? I think we just need a temp view with subquery, and use it in a join.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, I simplified as much as I was able to. I hope now it is fine. Thanks.
|
Test build #98861 has finished for PR 23035 at commit
|
|
thanks, merging to master/2.4! |
## What changes were proposed in this pull request? In SPARK-24865 `AnalysisBarrier` was removed and in order to improve resolution speed, the `analyzed` flag was (re-)introduced in order to process only plans which are not yet analyzed. This should not be the case when performing attribute deduplication as in that case we need to transform also the plans which were already analyzed, otherwise we can miss to rewrite some attributes leading to invalid plans. ## How was this patch tested? added UT Please review http://spark.apache.org/contributing.html before opening a pull request. Closes #23035 from mgaido91/SPARK-26057. Authored-by: Marco Gaido <[email protected]> Signed-off-by: Wenchen Fan <[email protected]> (cherry picked from commit b46f75a) Signed-off-by: Wenchen Fan <[email protected]>
## What changes were proposed in this pull request? In SPARK-24865 `AnalysisBarrier` was removed and in order to improve resolution speed, the `analyzed` flag was (re-)introduced in order to process only plans which are not yet analyzed. This should not be the case when performing attribute deduplication as in that case we need to transform also the plans which were already analyzed, otherwise we can miss to rewrite some attributes leading to invalid plans. ## How was this patch tested? added UT Please review http://spark.apache.org/contributing.html before opening a pull request. Closes apache#23035 from mgaido91/SPARK-26057. Authored-by: Marco Gaido <[email protected]> Signed-off-by: Wenchen Fan <[email protected]>
In SPARK-24865 `AnalysisBarrier` was removed and in order to improve resolution speed, the `analyzed` flag was (re-)introduced in order to process only plans which are not yet analyzed. This should not be the case when performing attribute deduplication as in that case we need to transform also the plans which were already analyzed, otherwise we can miss to rewrite some attributes leading to invalid plans. added UT Please review http://spark.apache.org/contributing.html before opening a pull request. Closes apache#23035 from mgaido91/SPARK-26057. Authored-by: Marco Gaido <[email protected]> Signed-off-by: Wenchen Fan <[email protected]> (cherry picked from commit b46f75a) Signed-off-by: Wenchen Fan <[email protected]> Conflicts: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala
## What changes were proposed in this pull request? In SPARK-24865 `AnalysisBarrier` was removed and in order to improve resolution speed, the `analyzed` flag was (re-)introduced in order to process only plans which are not yet analyzed. This should not be the case when performing attribute deduplication as in that case we need to transform also the plans which were already analyzed, otherwise we can miss to rewrite some attributes leading to invalid plans. ## How was this patch tested? added UT Please review http://spark.apache.org/contributing.html before opening a pull request. Closes apache#23035 from mgaido91/SPARK-26057. Authored-by: Marco Gaido <[email protected]> Signed-off-by: Wenchen Fan <[email protected]> (cherry picked from commit b46f75a) Signed-off-by: Wenchen Fan <[email protected]>
What changes were proposed in this pull request?
In SPARK-24865
AnalysisBarrierwas removed and in order to improve resolution speed, theanalyzedflag was (re-)introduced in order to process only plans which are not yet analyzed. This should not be the case when performing attribute deduplication as in that case we need to transform also the plans which were already analyzed, otherwise we can miss to rewrite some attributes leading to invalid plans.How was this patch tested?
added UT
Please review http://spark.apache.org/contributing.html before opening a pull request.