Commit ce6a71e

cloud-fan authored and gatorsmile committed
[SPARK-22076][SQL] Expand.projections should not be a Stream
## What changes were proposed in this pull request?

Spark with Scala 2.10 fails with a group by cube:

```
spark.range(1).select($"id" as "a", $"id" as "b").write.partitionBy("a").mode("overwrite").saveAsTable("rollup_bug")
spark.sql("select 1 from rollup_bug group by rollup ()").show
```

It can be traced back to #15484, which made `Expand.projections` a lazy `Stream` for group by cube. In Scala 2.10, `Stream` captures a lot of stuff, and in this case it captures the entire query plan, which has some un-serializable parts.

This change is also good for master branch, to reduce the serialized size of `Expand.projections`.

## How was this patch tested?

Manually verified with Spark with Scala 2.10.

Author: Wenchen Fan <[email protected]>

Closes #19289 from cloud-fan/bug.
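To make the laziness mentioned in the commit message concrete, here is a minimal standalone sketch in plain Scala, not Spark code. `StreamLaziness` and `demo` are hypothetical names introduced for illustration. It counts how many elements of a `Stream` have been computed before and after calling `toIndexedSeq`, the call the patch uses to materialize the result. `Stream` is deprecated in favor of `LazyList` from Scala 2.13 on, so expect a deprecation warning there.

```scala
object StreamLaziness {
  /**
   * Builds a lazy Stream of n elements, then materializes it.
   * Returns (elements computed before materializing,
   *          elements computed after materializing,
   *          the materialized, strict sequence).
   */
  def demo(n: Int): (Int, Int, IndexedSeq[Int]) = {
    var evaluated = 0
    // Mapping over a Stream defers the tail: the function body runs only
    // when an element is actually demanded.
    val lazySeq: Seq[Int] = Stream.range(0, n).map { i => evaluated += 1; i }
    val before = evaluated
    // toIndexedSeq walks the whole Stream and copies it into a strict collection.
    val strict = lazySeq.toIndexedSeq
    (before, evaluated, strict)
  }

  def main(args: Array[String]): Unit = {
    val (before, after, strict) = demo(3)
    println(s"computed before toIndexedSeq: $before")
    println(s"computed after toIndexedSeq:  $after")
    println(s"materialized result:          $strict")
  }
}
```

After `toIndexedSeq`, the result is a strict collection holding only its elements, so no deferred tail computation remains to be carried along when the value is serialized.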
1 parent e17901d commit ce6a71e

1 file changed: +8 -2 lines changed


sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala

Lines changed: 8 additions & 2 deletions
@@ -279,9 +279,15 @@ class Analyzer(
      * We need to get all of its subsets for a given GROUPBY expression, the subsets are
      * represented as sequence of expressions.
      */
-    def cubeExprs(exprs: Seq[Expression]): Seq[Seq[Expression]] = exprs.toList match {
+    def cubeExprs(exprs: Seq[Expression]): Seq[Seq[Expression]] = {
+      // `cubeExprs0` is recursive and returns a lazy Stream. Here we call `toIndexedSeq` to
+      // materialize it and avoid serialization problems later on.
+      cubeExprs0(exprs).toIndexedSeq
+    }
+
+    def cubeExprs0(exprs: Seq[Expression]): Seq[Seq[Expression]] = exprs.toList match {
       case x :: xs =>
-        val initial = cubeExprs(xs)
+        val initial = cubeExprs0(xs)
         initial.map(x +: _) ++ initial
       case Nil =>
         Seq(Seq.empty)
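The recursion in the diff can be tried outside Spark with a generic element type. The sketch below is a simplified stand-in, not the Analyzer code: `CubeSketch`, `cube`, and `cube0` are hypothetical names, and plain strings replace Catalyst `Expression`s. It mirrors the shape of the patch: a recursive helper that builds all subsets, wrapped by a public method that calls `toIndexedSeq` on the result.

```scala
object CubeSketch {
  /** Public entry point: all subsets of `exprs`, copied into a strict IndexedSeq. */
  def cube[A](exprs: Seq[A]): Seq[Seq[A]] = cube0(exprs).toIndexedSeq

  /** Recursive helper: subsets that contain the head, followed by subsets that do not. */
  private def cube0[A](exprs: Seq[A]): Seq[Seq[A]] = exprs.toList match {
    case x :: xs =>
      val initial = cube0(xs)
      initial.map(x +: _) ++ initial
    case Nil =>
      // The only subset of the empty sequence is the empty sequence itself.
      Seq(Seq.empty[A])
  }

  def main(args: Array[String]): Unit = {
    cube(Seq("a", "b")).foreach(subset => println(subset.mkString("[", ", ", "]")))
  }
}
```

For the input `Seq("a", "b")` this yields four subsets in the order `[a, b]`, `[a]`, `[b]`, `[]`. In general, an input of n elements produces 2^n grouping sets, so the wrapper's single copy into an indexed sequence is small next to the work of building them.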
