

@ueshin ueshin commented May 23, 2014

`CountFunction` should increment its count only when the child expression evaluates to a non-null value.

Currently it traverses and evaluates all child expressions, so it increments the count whenever any of the children evaluates to non-null, even when the child being counted is null.
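The difference can be illustrated with a minimal standalone sketch (hypothetical names `CountSketch`, `buggyCount`, `fixedCount` — not the actual Catalyst `CountFunction` code):

```scala
// Hedged sketch, not Catalyst code: rows are modeled as Seq[Any] with
// nulls, and childIndex picks the column being counted.
object CountSketch {
  // Buggy behaviour: the count goes up whenever ANY value in the row is
  // non-null, because evaluation traverses all child expressions.
  def buggyCount(rows: Seq[Seq[Any]], childIndex: Int): Long =
    rows.count(row => row.exists(_ != null)).toLong

  // Fixed behaviour: the count goes up only when the counted child's own
  // value is non-null.
  def fixedCount(rows: Seq[Seq[Any]], childIndex: Int): Long =
    rows.count(row => row(childIndex) != null).toLong
}
```

For rows `(1, null)`, `(null, 2)`, `(null, null)` counting column 1, the buggy variant returns 2 (each of the first two rows has some non-null value) while the fixed variant returns 1 (only the second row has a non-null in column 1).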

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@AmplabJenkins

Merged build finished.

@AmplabJenkins

Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/15164/

Conflicts:
	sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregates.scala
@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@ueshin
Member Author

ueshin commented May 26, 2014

Merged master to fix conflicts.

@AmplabJenkins

Merged build finished. All automated tests passed.

@AmplabJenkins

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/15200/

@liancheng
Contributor

LGTM. Added steps to reproduce this bug in sbt hive/console in the JIRA issue description.

This should be a blocking issue for Spark 1.0 release. @rxin @marmbrus

@liancheng
Contributor

And, good catch @ueshin, thanks very much!

@rxin
Contributor

rxin commented May 26, 2014

Thanks. I've merged this in master & branch-1.0.

@asfgit asfgit closed this in d6395d8 May 26, 2014
asfgit pushed a commit that referenced this pull request May 26, 2014
… all child expressions.

`CountFunction` should count up only if the child's evaluated value is not null.

Currently it traverses and evaluates all child expressions, so it increments the count whenever any of the children evaluates to non-null, even when the child being counted is null.

Author: Takuya UESHIN <[email protected]>

Closes #861 from ueshin/issues/SPARK-1914 and squashes the following commits:

3b37315 [Takuya UESHIN] Merge branch 'master' into issues/SPARK-1914
2afa238 [Takuya UESHIN] Simplify CountFunction not to traverse to evaluate all child expressions.

(cherry picked from commit d6395d8)
Signed-off-by: Reynold Xin <[email protected]>
pdeyhim pushed a commit to pdeyhim/spark-1 that referenced this pull request Jun 25, 2014
xiliu82 pushed a commit to xiliu82/spark that referenced this pull request Sep 4, 2014
agirish pushed a commit to HPEEzmeral/apache-spark that referenced this pull request May 5, 2022
wangyum added a commit that referenced this pull request May 26, 2023
…861)

* [SPARK-36183][SQL][FOLLOWUP] Fix push down limit 1 through Aggregate

### What changes were proposed in this pull request?

Use `Aggregate.aggregateExpressions` instead of `Aggregate.output` when pushing down limit 1 through Aggregate.

For example:

```scala
spark.range(10).selectExpr("id % 5 AS a", "id % 5 AS b").write.saveAsTable("t1")
spark.sql("SELECT a, b, a AS alias FROM t1 GROUP BY a, b LIMIT 1").explain(true)
```
Before this pr:
```
== Optimized Logical Plan ==
GlobalLimit 1
+- LocalLimit 1
   +- !Project [a#227L, b#228L, alias#226L]
      +- LocalLimit 1
         +- Relation default.t1[a#227L,b#228L] parquet
```
After this pr:
```
== Optimized Logical Plan ==
GlobalLimit 1
+- LocalLimit 1
   +- Project [a#227L, b#228L, a#227L AS alias#226L]
      +- LocalLimit 1
         +- Relation default.t1[a#227L,b#228L] parquet
```
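The `!Project` in the plan before the fix marks an unresolved node. A hedged toy model of why that happens (hypothetical `Expr`/`Attr`/`Alias` types, not Catalyst's actual classes): the aggregate's `output` exposes only bare attribute references, so the alias's defining expression is lost and the resulting reference cannot be resolved against the child relation, while `aggregateExpressions` still carries it.

```scala
// Toy model, not Catalyst: named expressions vs. bare attribute output.
object LimitPushdownSketch {
  sealed trait Expr { def name: String }
  final case class Attr(name: String) extends Expr
  final case class Alias(child: Expr, name: String) extends Expr

  // Aggregate over grouping keys a and b, plus the extra `a AS alias`.
  val aggregateExpressions: Seq[Expr] =
    Seq(Attr("a"), Attr("b"), Alias(Attr("a"), "alias"))

  // `output` keeps only attribute references: the definition of `alias`
  // (namely `a`) is gone, leaving a column nothing below can produce.
  val output: Seq[Expr] = aggregateExpressions.map(e => Attr(e.name))

  // A project list resolves iff every leaf attribute it references is a
  // column the child relation actually provides.
  def resolvable(projectList: Seq[Expr], childColumns: Set[String]): Boolean =
    projectList.forall {
      case Attr(n)     => childColumns.contains(n)
      case Alias(c, _) => resolvable(Seq(c), childColumns)
    }
}
```

Against a child providing only columns `a` and `b`, a Project built from `output` references the unknown column `alias` and fails to resolve, whereas one built from `aggregateExpressions` resolves because the alias still points back to `a`.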

### Why are the changes needed?

Fix bug.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Unit test.

Closes #35286 from wangyum/SPARK-36183-2.

Authored-by: Yuming Wang <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>

(cherry picked from commit 9b12571)
udaynpusa pushed a commit to mapr/spark that referenced this pull request Jan 30, 2024
mapr-devops pushed a commit to mapr/spark that referenced this pull request May 8, 2025