Skip to content

Conversation

@zsxwing
Copy link
Member

@zsxwing zsxwing commented May 15, 2015

We should not call DirectTaskResult.value when holding the TaskSchedulerImpl lock. It may cost dozens of seconds to deserialize a large object.

@AmplabJenkins
Copy link

Merged build triggered.

@AmplabJenkins
Copy link

Merged build started.

@SparkQA
Copy link

SparkQA commented May 15, 2015

Test build #32842 has started for PR 6195 at commit 15010b5.

@zsxwing
Copy link
Member Author

zsxwing commented May 15, 2015

@rxin @yhuai @marmbrus

@zsxwing zsxwing changed the title [SPARK-7655][Core] Deserialize value should not hold the TaskSchedulerImpl lock [SPARK-7655][Core] Deserializing value should not hold the TaskSchedulerImpl lock May 15, 2015
@SparkQA
Copy link

SparkQA commented May 15, 2015

Test build #32842 has finished for PR 6195 at commit 15010b5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Merged build finished. Test PASSed.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/32842/
Test PASSed.

@rxin
Copy link
Contributor

rxin commented May 17, 2015

As discussed offline, let's update the pull request with more comments on where the lock is. That is not obvious just looking at the diff.

@AmplabJenkins
Copy link

Merged build triggered.

@AmplabJenkins
Copy link

Merged build started.

@rxin
Copy link
Contributor

rxin commented May 17, 2015

lgtm

@SparkQA
Copy link

SparkQA commented May 17, 2015

Test build #32919 has started for PR 6195 at commit e25fa88.

@AmplabJenkins
Copy link

Merged build triggered.

@AmplabJenkins
Copy link

Merged build started.

@SparkQA
Copy link

SparkQA commented May 17, 2015

Test build #32920 has started for PR 6195 at commit 21f502e.

@SparkQA
Copy link

SparkQA commented May 17, 2015

Test build #32919 has finished for PR 6195 at commit e25fa88.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Merged build finished. Test PASSed.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/32919/
Test PASSed.

asfgit pushed a commit that referenced this pull request May 17, 2015
…lerImpl lock

We should not call `DirectTaskResult.value` when holding the `TaskSchedulerImpl` lock. It may cost dozens of seconds to deserialize a large object.

Author: zsxwing <[email protected]>

Closes #6195 from zsxwing/SPARK-7655 and squashes the following commits:

21f502e [zsxwing] Add more comments
e25fa88 [zsxwing] Add comments
15010b5 [zsxwing] Deserialize value should not hold the TaskSchedulerImpl lock

(cherry picked from commit 3b6ef2c)
Signed-off-by: Reynold Xin <[email protected]>
@asfgit asfgit closed this in 3b6ef2c May 17, 2015
@SparkQA
Copy link

SparkQA commented May 17, 2015

Test build #32920 has finished for PR 6195 at commit 21f502e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Merged build finished. Test PASSed.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/32920/
Test PASSed.

@zsxwing zsxwing deleted the SPARK-7655 branch May 17, 2015 04:15
@shivaram
Copy link
Contributor

@rxin did you benchmark what / how much this improves things ? I've been meaning to debug these really slow collect operations and am wondering if this was the reason

@rxin
Copy link
Contributor

rxin commented May 17, 2015

I didn't do anything here.

@shivaram
Copy link
Contributor

You reviewed the PR :). Anyways @zsxwing -- Did you benchmark any improvements due to this change with some workload ?

@zsxwing
Copy link
Member Author

zsxwing commented May 17, 2015

You reviewed the PR :). Anyways @zsxwing -- Did you benchmark any improvements due to this change with some workload ?

Depends on your case. If you have only one job task, or the values to be collected are very small, I think this PR improves nothing. But if you have multiple jobs tasks running at the same time, deserializing multiple values is in parallel now. It may improve the performance.

markhamstra pushed a commit to markhamstra/spark that referenced this pull request May 22, 2015
…lerImpl lock

We should not call `DirectTaskResult.value` when holding the `TaskSchedulerImpl` lock. It may cost dozens of seconds to deserialize a large object.

Author: zsxwing <[email protected]>

Closes apache#6195 from zsxwing/SPARK-7655 and squashes the following commits:

21f502e [zsxwing] Add more comments
e25fa88 [zsxwing] Add comments
15010b5 [zsxwing] Deserialize value should not hold the TaskSchedulerImpl lock
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be @volatile?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. The two places calling value() are in the same thread.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, got it. Yes, you're correct. Thanks.

jeanlyn pushed a commit to jeanlyn/spark that referenced this pull request May 28, 2015
…lerImpl lock

We should not call `DirectTaskResult.value` when holding the `TaskSchedulerImpl` lock. It may cost dozens of seconds to deserialize a large object.

Author: zsxwing <[email protected]>

Closes apache#6195 from zsxwing/SPARK-7655 and squashes the following commits:

21f502e [zsxwing] Add more comments
e25fa88 [zsxwing] Add comments
15010b5 [zsxwing] Deserialize value should not hold the TaskSchedulerImpl lock
jeanlyn pushed a commit to jeanlyn/spark that referenced this pull request Jun 12, 2015
…lerImpl lock

We should not call `DirectTaskResult.value` when holding the `TaskSchedulerImpl` lock. It may cost dozens of seconds to deserialize a large object.

Author: zsxwing <[email protected]>

Closes apache#6195 from zsxwing/SPARK-7655 and squashes the following commits:

21f502e [zsxwing] Add more comments
e25fa88 [zsxwing] Add comments
15010b5 [zsxwing] Deserialize value should not hold the TaskSchedulerImpl lock
nemccarthy pushed a commit to nemccarthy/spark that referenced this pull request Jun 19, 2015
…lerImpl lock

We should not call `DirectTaskResult.value` when holding the `TaskSchedulerImpl` lock. It may cost dozens of seconds to deserialize a large object.

Author: zsxwing <[email protected]>

Closes apache#6195 from zsxwing/SPARK-7655 and squashes the following commits:

21f502e [zsxwing] Add more comments
e25fa88 [zsxwing] Add comments
15010b5 [zsxwing] Deserialize value should not hold the TaskSchedulerImpl lock
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants