-
Notifications
You must be signed in to change notification settings - Fork 28.9k
[SPARK-5759][Yarn]ExecutorRunnable should catch YarnException while NMClient start contain... #4554
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Test build #27329 has started for PR 4554 at commit
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same exception shouldn't be logged twice. If we're going to rethrow it, we should wrap it in another exception and include the additional info in that exception's message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, a couple nits: we should use String interpolation instead of .format, and "start" should be "starting".
|
Thanks for posting this @lianhuiwang. Left a couple comments. |
|
Test build #27339 has started for PR 4554 at commit
|
|
@sryza thanks. I think i can use SparkException to warp Exception. Could you review this again? |
|
Test build #27329 has finished for PR 4554 at commit
|
|
Test PASSed. |
|
Test build #27339 has finished for PR 4554 at commit
|
|
Test PASSed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: imports should be ordered alphabetically, so SparkException should come after SparkConf here
|
LGTM I'm going to merge this in master 1.3 after fixing @sryza's comments myself. Thanks @lianhuiwang |
|
For branch-1.2 there's going to be significant merge conflicts, so I will merge it there once you open a PR against that branch. |
…MClient start contain... some time since some reasons, it lead to some exception while NMClient start some containers.example:we do not config spark_shuffle on some machines, so it will throw a exception: java.lang.Error: org.apache.hadoop.yarn.exceptions.InvalidAuxServiceException: The auxService:spark_shuffle does not exist. because YarnAllocator use ThreadPoolExecutor to start Container, so we can not find which container or hostname throw exception. I think we should catch YarnException in ExecutorRunnable when start container. if there are some exceptions, we can know the container id or hostname of failed container. Author: lianhuiwang <[email protected]> Closes #4554 from lianhuiwang/SPARK-5759 and squashes the following commits: caf5a99 [lianhuiwang] use SparkException to warp exception c02140f [lianhuiwang] ExecutorRunnable should catch YarnException while NMClient start container (cherry picked from commit 947b8bd) Signed-off-by: Andrew Or <[email protected]>
some time since some reasons, it lead to some exception while NMClient start some containers.example:we do not config spark_shuffle on some machines, so it will throw a exception:
java.lang.Error: org.apache.hadoop.yarn.exceptions.InvalidAuxServiceException: The auxService:spark_shuffle does not exist.
because YarnAllocator use ThreadPoolExecutor to start Container, so we can not find which container or hostname throw exception. I think we should catch YarnException in ExecutorRunnable when start container. if there are some exceptions, we can know the container id or hostname of failed container.