
Commit 0b17e13
Author: aokolnychyi
Parent: 87a68bd

[SPARK-16046][DOCS] Aggregations in the Spark SQL programming guide. Improved consistency

File tree: 3 files changed (+12, -6 lines)


docs/sql-programming-guide.md

Lines changed: 2 additions & 2 deletions
Lines changed: 2 additions & 2 deletions

@@ -384,8 +384,8 @@ For example:
 
 ## Aggregations
 
-The [built-in DataFrames functions](api/scala/index.html#org.apache.spark.sql.functions$) mentioned
-before provide such common aggregations as `count()`, `countDistinct()`, `avg()`, `max()`, `min()`, etc.
+The [built-in DataFrames functions](api/scala/index.html#org.apache.spark.sql.functions$) provide common
+aggregations such as `count()`, `countDistinct()`, `avg()`, `max()`, `min()`, etc.
 While those functions are designed for DataFrames, Spark SQL also has type-safe versions for some of them in
 [Scala](api/scala/index.html#org.apache.spark.sql.expressions.scalalang.typed$) and
 [Java](api/java/org/apache/spark/sql/expressions/javalang/typed.html) to work with strongly typed Datasets.

examples/src/main/java/org/apache/spark/examples/sql/JavaUserDefinedTypedAggregation.java

Lines changed: 5 additions & 3 deletions
Lines changed: 5 additions & 3 deletions

@@ -102,9 +102,11 @@ public Average reduce(Average buffer, Employee employee) {
   }
   // Merge two intermediate values
   public Average merge(Average b1, Average b2) {
-    long newSum = b1.getSum() + b2.getSum();
-    long newCount = b1.getCount() + b2.getCount();
-    return new Average(newSum, newCount);
+    long mergedSum = b1.getSum() + b2.getSum();
+    long mergedCount = b1.getCount() + b2.getCount();
+    b1.setSum(mergedSum);
+    b1.setCount(mergedCount);
+    return b1;
   }
   // Transform the output of the reduction
   public Double finish(Average reduction) {
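The change above switches `merge` from allocating a fresh `Average` to mutating and returning `b1`, the buffer Spark hands back to the aggregator. A minimal standalone sketch of that in-place merge, with `Average` reproduced as a plain bean after the example file's getters/setters (no Spark dependency; the field values below are illustrative, not from the source):

```java
// Intermediate buffer type, mirroring the Average bean used by the
// JavaUserDefinedTypedAggregation example (sum of salaries, row count).
class Average {
  private long sum;
  private long count;

  Average(long sum, long count) {
    this.sum = sum;
    this.count = count;
  }

  long getSum() { return sum; }
  long getCount() { return count; }
  void setSum(long sum) { this.sum = sum; }
  void setCount(long count) { this.count = count; }
}

class MergeSketch {
  // Fold b2's partial result into b1 in place and return the reused
  // buffer, instead of allocating a new Average per merge call.
  static Average merge(Average b1, Average b2) {
    b1.setSum(b1.getSum() + b2.getSum());
    b1.setCount(b1.getCount() + b2.getCount());
    return b1;
  }

  // Transform the final buffer into the aggregate value.
  static double finish(Average reduction) {
    return (double) reduction.getSum() / reduction.getCount();
  }

  public static void main(String[] args) {
    Average left = new Average(6000L, 2L);   // partial result, one partition
    Average right = new Average(4000L, 2L);  // partial result, another partition
    Average merged = merge(left, right);
    System.out.println(finish(merged));      // prints 2500.0
  }
}
```

Reusing `b1` is safe here because Spark owns the intermediate buffers during aggregation and does not read `b1` again after `merge` returns; the payoff is one fewer allocation per merge.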

examples/src/main/scala/org/apache/spark/examples/sql/UserDefinedTypedAggregation.scala

Lines changed: 5 additions & 1 deletion
@@ -40,7 +40,11 @@ object UserDefinedTypedAggregation {
     buffer
   }
   // Merge two intermediate values
-  def merge(b1: Average, b2: Average): Average = Average(b1.sum + b2.sum, b1.count + b2.count)
+  def merge(b1: Average, b2: Average): Average = {
+    b1.sum += b2.sum
+    b1.count += b2.count
+    b1
+  }
   // Transform the output of the reduction
   def finish(reduction: Average): Double = reduction.sum.toDouble / reduction.count
   // Specifies the Encoder for the intermediate value type

0 commit comments