[SPARK-37535][CORE] Update default spark.io.compression.codec to zstd #34798

wangyum · 2021-12-03T10:03:24Z

What changes were proposed in this pull request?

This PR updates the default spark.io.compression.codec to zstd.

Why are the changes needed?

To work around the "Stream is corrupted" issue:

org.apache.spark.shuffle.FetchFailedException: Stream is corrupted
	at org.apache.spark.storage.ShuffleBlockFetcherIterator.throwFetchFailedException(ShuffleBlockFetcherIterator.scala:830)
	at org.apache.spark.storage.BufferReleasingInputStream.read(ShuffleBlockFetcherIterator.scala:926)
	at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
	at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
	at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
	at java.io.DataInputStream.read(DataInputStream.java:149)
	at org.sparkproject.guava.io.ByteStreams.read(ByteStreams.java:899)
	at org.sparkproject.guava.io.ByteStreams.readFully(ByteStreams.java:733)
	at org.apache.spark.sql.execution.UnsafeRowSerializerInstance$$anon$2$$anon$3.next(UnsafeRowSerializer.scala:127)
	at org.apache.spark.sql.execution.UnsafeRowSerializerInstance$$anon$2$$anon$3.next(UnsafeRowSerializer.scala:110)
	at scala.collection.Iterator$$anon$11.next(Iterator.scala:494)
	at scala.collection.Iterator$$anon$10.next(Iterator.scala:459)
	at org.apache.spark.util.CompletionIterator.next(CompletionIterator.scala:29)
	at org.apache.spark.InterruptibleIterator.next(InterruptibleIterator.scala:40)
	at scala.collection.Iterator$$anon$10.next(Iterator.scala:459)
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.sort_addToSorter_0$(Unknown Source)
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.processNext(Unknown Source)
	at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:50)
	at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:730)
	at org.apache.spark.sql.execution.UnsafeExternalRowSorter.sort(UnsafeExternalRowSorter.java:255)
	at org.apache.spark.sql.execution.SortExecBase.$anonfun$doExecute$1(SortExec.scala:266)
	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:913)
	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:913)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:388)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:315)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
	at org.apache.spark.scheduler.Task.run(Task.scala:129)
	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:486)
	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1379)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:489)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Stream is corrupted
	at net.jpountz.lz4.LZ4BlockInputStream.refill(LZ4BlockInputStream.java:259)
	at net.jpountz.lz4.LZ4BlockInputStream.read(LZ4BlockInputStream.java:157)
	at org.apache.spark.storage.BufferReleasingInputStream.read(ShuffleBlockFetcherIterator.scala:922)
	... 32 more

Please see SPARK-18105 for more details.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Existing unit tests.

HyukjinKwon · 2021-12-03T10:04:38Z

cc @dongjoon-hyun FYI

wangyum · 2021-12-03T10:11:06Z

SparkQA · 2021-12-03T11:29:43Z

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50373/

SparkQA · 2021-12-03T12:27:59Z

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50373/

SparkQA · 2021-12-03T12:44:34Z

Test build #145898 has finished for PR 34798 at commit ce477f4.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

srowen · 2021-12-03T17:17:59Z

Just for my context - is this default change to match other default changes we already made?
Is the idea that lz4 support has a bug, so default to zstd?
or both?

dongjoon-hyun

Could you keep the original JIRA ID instead of filing a new one, @wangyum ?

#32286 [SPARK-35181][CORE] Use zstd for spark.io.compression.codec by default

Also, we need a test case if this claims to be a bug fix.

wangyum · 2021-12-05T11:39:48Z

Thank you all. The root cause may be a hardware issue.

wangyum · 2021-12-07T10:29:38Z

It turns out that "Stream is corrupted" is a hardware problem. Zstd has similar issues.
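
To illustrate why switching codecs would not help, here is a minimal sketch, not Spark code. It uses Python's standard-library zlib as a stand-in for LZ4 or zstd (an assumption made only so the example is self-contained), and the helper names `flip_bit` and `survives_round_trip` are hypothetical. It inverts a single bit in a compressed block, roughly what a faulty disk, NIC, or memory module might do, and shows that the damage surfaces when the block is read back, whichever codec wrote it.

```python
import zlib


def flip_bit(payload: bytes, byte_index: int, bit: int = 0) -> bytes:
    """Return a copy of payload with one bit inverted (imitates a hardware fault)."""
    corrupted = bytearray(payload)
    corrupted[byte_index] ^= 1 << bit
    return bytes(corrupted)


def survives_round_trip(compressed: bytes, original: bytes) -> bool:
    """True only if the block decompresses cleanly back to the original bytes."""
    try:
        return zlib.decompress(compressed) == original
    except zlib.error:
        # The decoder or its integrity check rejected the damaged stream.
        return False


# Hypothetical stand-in for a shuffle block; real blocks hold serialized rows.
original = b"".join(b"row-%d;" % i for i in range(1000))
block = zlib.compress(original)

# Damage one bit in the middle of the compressed bytes.
damaged = flip_bit(block, len(block) // 2)

print("intact block ok: ", survives_round_trip(block, original))
print("damaged block ok:", survives_round_trip(damaged, original))
```

The codec is only where the corruption becomes visible on the read path. The bytes were already damaged before decompression began, so the failure would be expected to follow the hardware rather than the codec setting.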

dongjoon-hyun · 2021-12-07T21:25:03Z

Thank you for the info, @wangyum .

sleep1661 · 2022-03-18T09:57:54Z

@wangyum @dongjoon-hyun Why might the root cause be a hardware issue? Could you provide more information about this? And how can it be fixed?

pan3793 · 2022-03-31T04:19:43Z

@sleep1661 You can find more information at #32385 (comment), and #32385 (comment) may point toward a fix.

wangyum and others added 2 commits December 3, 2021 17:55

Update package.scala

b92bfd3

Update configuration.md

ce477f4

github-actions bot added CORE DOCS labels Dec 3, 2021

dongjoon-hyun requested changes Dec 3, 2021

wangyum closed this Dec 7, 2021

wangyum deleted the SPARK-37535 branch December 7, 2021 10:29

7 participants
