cloud-fan

What changes were proposed in this pull request?

Spark with Scala 2.10 fails with a group by cube:

spark.range(1).select($"id" as "a", $"id" as "b").write.partitionBy("a").mode("overwrite").saveAsTable("rollup_bug")
spark.sql("select 1 from rollup_bug group by rollup ()").show

It can be traced back to #15484, which made Expand.projections a lazy Stream for group by cube.

In Scala 2.10, Stream captures a lot of surrounding state; in this case it captures the entire query plan, which has some unserializable parts.

This change is also good for the master branch, since it reduces the serialized size of Expand.projections.

How was this patch tested?

Manually verified with Spark built with Scala 2.10.
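
The laziness described above can be illustrated with a minimal sketch. It is not the patch itself: `LazyProjectionsSketch`, `lazyProjections`, and the `evaluated` counter are hypothetical names, and integers stand in for Spark expressions. It only shows that a `Stream` defers computing its tail, and that `toIndexedSeq` materializes every element into a strict collection that no longer carries deferred computation. (`Stream` is deprecated in Scala 2.13 but still available.)

```scala
// Hypothetical stand-in for a helper that builds projections lazily.
object LazyProjectionsSketch {
  // Counts how many elements have actually been computed so far.
  var evaluated = 0

  // Returns a lazy Stream typed as Seq; the tail is computed only on demand.
  def lazyProjections(n: Int): Seq[Int] =
    Stream.range(0, n).map { i => evaluated += 1; i * 2 }

  def main(args: Array[String]): Unit = {
    val projections = lazyProjections(3)
    println(s"computed after construction: $evaluated")

    // toIndexedSeq walks the whole Stream and copies it into a strict collection.
    val strict = projections.toIndexedSeq
    println(s"computed after toIndexedSeq: $evaluated")
    println(strict)
  }
}
```

Materializing once at the point of construction keeps later consumers, such as serialization, from dealing with the deferred tail and whatever it references.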

SparkQA commented Sep 20, 2017

Test build #81974 has finished for PR 19289 at commit 20ea0c4.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

Member

cubeExprs -> cubeExprs0 ?

@gatorsmile

retest this please

SparkQA commented Sep 20, 2017

Test build #81977 has finished for PR 19289 at commit 20ea0c4.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Sep 20, 2017

Test build #81979 has finished for PR 19289 at commit 518fe49.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

asfgit pushed a commit that referenced this pull request Sep 20, 2017
Author: Wenchen Fan <[email protected]>

Closes #19289 from cloud-fan/bug.

(cherry picked from commit ce6a71e)
Signed-off-by: gatorsmile <[email protected]>
gatorsmile commented Sep 20, 2017

LGTM

Thanks! Merged to master/2.2.

asfgit closed this in ce6a71e Sep 20, 2017
ghost pushed a commit to dbtsai/spark that referenced this pull request Sep 21, 2017
## What changes were proposed in this pull request?

This is a follow-up of apache#19289; we missed another place: `rollup`. `Seq.inits.toSeq` also returns a `Stream`, so we should fix it too.

## How was this patch tested?

manually

Author: Wenchen Fan <[email protected]>

Closes apache#19298 from cloud-fan/bug.
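
As a rough illustration of the rollup case in the follow-up, the sketch below builds rollup-style grouping sets from the prefixes of a sequence. `RollupInitsSketch` and `rollupSets` are hypothetical names, strings stand in for Spark expressions, and using `toIndexedSeq` is an assumption about how such a result could be made strict, not a quotation of the actual patch. `inits` returns an iterator, so the collection type of the result depends on which conversion is called on it.

```scala
// Hypothetical stand-in for computing rollup grouping sets over plain strings.
object RollupInitsSketch {
  // inits yields every prefix, longest first, ending with the empty prefix.
  // toIndexedSeq drains the iterator into a strict, indexed collection.
  def rollupSets(exprs: Seq[String]): Seq[Seq[String]] =
    exprs.inits.toIndexedSeq

  def main(args: Array[String]): Unit =
    rollupSets(Seq("a", "b")).foreach(println)
}
```

For the columns `a` and `b`, this gives the grouping sets `(a, b)`, `(a)`, and `()`, all computed eagerly.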
MatthewRBruce pushed a commit to Shopify/spark that referenced this pull request Jul 31, 2018