-
Notifications
You must be signed in to change notification settings - Fork 28.9k
[SPARK-19831][CORE] Reuse the existing cleanupThreadExecutor to clean up the directories of finished applications to avoid the block #17189
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…plication to avoid the block
|
I think you need to call |
| logInfo(s"Cleaning up local directories for application $id") | ||
| dirList.foreach { dir => | ||
| Utils.deleteRecursively(new File(dir)) | ||
| cleanupApplicationThreadExecutor.submit(new Runnable { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer to just reuse the existing cleanupThreadExecutor like this:
appDirectories.remove(id).foreach { dirList =>
concurrent.Future {
logInfo(s"Cleaning up local directories for application $id")
dirList.foreach { dir =>
Utils.deleteRecursively(new File(dir))
}
}(cleanupThreadExecutor).onFailure {
case e: Throwable =>
logError(s"Clean up app dir $dirList failed: ${e.getMessage}", e)
}(cleanupThreadExecutor)
}
shuffleService.applicationRemoved(id)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with you, but I am worried that cleaning up the workDir and application all cost mush time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That won't become an issue. The worst case is there will be some pending tasks in the queue of cleanupThreadExecutor. Considering the number of applications is not huge, it won't be an issue.
|
ok to test |
|
Test build #74220 has finished for PR 17189 at commit
|
|
Test build #74266 has finished for PR 17189 at commit
|
|
|
||
| // A separated thread to clean up the workDir. Used to provide the implicit parameter of `Future` | ||
| // methods. | ||
| // A separated thread to clean up the workDir and the finished application. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: the directories of finished applications.
|
@hustfxj Looks pretty good. Could you update the PR title and description to reflect the latest changes? |
|
@zsxwing Of course, Thank you for your reminding |
|
Test build #74397 has finished for PR 17189 at commit
|
|
LGTM. Merging to master. Thanks! |
Cleaning the application may cost much time at worker, then it will block that the worker send heartbeats master because the worker is extend ThreadSafeRpcEndpoint. If the heartbeat from a worker is blocked by the message ApplicationFinished, master will think the worker is dead. If the worker has a driver, the driver will be scheduled by master again.
It had better reuse the existing cleanupThreadExecutor to clean up the directories of finished applications to avoid the block.