Conversation

@mgaido91
Contributor

What changes were proposed in this pull request?

In SPARK-24865 `AnalysisBarrier` was removed and, to improve resolution speed, the `analyzed` flag was (re-)introduced so that only plans which are not yet analyzed get processed. This should not happen when performing attribute deduplication: in that case we also need to transform plans which were already analyzed, otherwise we may fail to rewrite some attributes and end up with invalid plans.
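For illustration, the problematic pattern is a temp view whose analyzed plan contains a subquery, joined back against one of its base tables: the join triggers attribute deduplication on a plan that was already analyzed. A minimal sketch of such a reproduction (the table, view, and column names here are illustrative, not the PR's exact test):

```scala
import org.apache.spark.sql.SparkSession

object Spark26057Sketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("SPARK-26057-sketch")
      .getOrCreate()
    import spark.implicits._

    Seq(("1-1", 6)).toDF("id", "n").createOrReplaceTempView("a")
    Seq("1-1").toDF("id").createOrReplaceTempView("b")

    // The view's plan contains a correlated subquery and is analyzed here.
    spark.sql(
      """SELECT a.id, n AS m
        |FROM a
        |WHERE EXISTS (SELECT 1 FROM b WHERE b.id = a.id)
        |""".stripMargin).createOrReplaceTempView("v")

    // Joining the view back to `a` makes both sides carry attributes with the
    // same ids, so the analyzer must deduplicate them. Before this fix, the
    // already-analyzed view plan was skipped during that rewrite, which could
    // leave conflicting attribute references and an invalid plan.
    spark.sql(
      """SELECT a.id, n, m
        |FROM a JOIN v ON v.id = a.id
        |""".stripMargin).show()

    spark.stop()
  }
}
```

With the fix, the join resolves and the query runs; without it, analysis can fail with a resolution error on the duplicated attribute references.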

How was this patch tested?

added UT

Please review http://spark.apache.org/contributing.html before opening a pull request.

@mgaido91
Contributor Author

cc @cloud-fan @gatorsmile @rxin

@mgaido91 mgaido91 changed the title [SPARK-26054][SQL] Transform also analyzed plans when dedup references [SPARK-26057][SQL] Transform also analyzed plans when dedup references Nov 14, 2018
@SparkQA

SparkQA commented Nov 14, 2018

Test build #98829 has finished for PR 23035 at commit 62a895f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

}

test("SPARK-26057: attribute deduplication on already analyzed plans") {
withTempView("cc", "p", "c") {
Contributor

If we don't care about naming, how about `a`, `b`, `c` instead of `cc`, `p`, `c`?

| WHERE c.id = cc.id AND c.layout = cc.layout AND c.ts > p.ts)
|GROUP BY cc.id, cc.layout
""".stripMargin).createOrReplaceTempView("pcc")
val res = spark.sql(
Contributor

Good catch on the problem! Do you think it's possible to simplify the test? I think we just need a temp view with a subquery, and then use it in a join.

Contributor Author

Yes, I simplified it as much as I could; I hope it is fine now. Thanks.

@SparkQA

SparkQA commented Nov 15, 2018

Test build #98861 has finished for PR 23035 at commit 98d91a3.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Contributor

thanks, merging to master/2.4!

asfgit pushed a commit that referenced this pull request Nov 15, 2018

Closes #23035 from mgaido91/SPARK-26057.

Authored-by: Marco Gaido <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
(cherry picked from commit b46f75a)
Signed-off-by: Wenchen Fan <[email protected]>
@asfgit asfgit closed this in b46f75a Nov 15, 2018
jackylee-ch pushed a commit to jackylee-ch/spark that referenced this pull request Feb 18, 2019

Closes apache#23035 from mgaido91/SPARK-26057.

Authored-by: Marco Gaido <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
kai-chi pushed a commit to kai-chi/spark that referenced this pull request Jul 23, 2019

Closes apache#23035 from mgaido91/SPARK-26057.

Authored-by: Marco Gaido <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
(cherry picked from commit b46f75a)
Signed-off-by: Wenchen Fan <[email protected]>

Conflicts:
	sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala
kai-chi pushed a commit to kai-chi/spark that referenced this pull request Aug 1, 2019

Closes apache#23035 from mgaido91/SPARK-26057.

Authored-by: Marco Gaido <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
(cherry picked from commit b46f75a)
Signed-off-by: Wenchen Fan <[email protected]>