-
Notifications
You must be signed in to change notification settings - Fork 28.9k
SPARK-10878 Fix race condition when multiple clients resolves artifacts at the same time #18801
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…cts at the same time.
|
Update the title: |
| // Include timestamp in module name, so multiple clients resolving maven coordinates at the | ||
| // same time do not modify the same resolution file concurrently. | ||
| ModuleRevisionId.newInstance("org.apache.spark", | ||
| "spark-submit-parent-" + System.currentTimeMillis().toString, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: "spark-submit-parent-" + System.currentTimeMillis().toString -> s"spark-submit-parent-${System.currentTimeMillis}"
| } | ||
| } | ||
|
|
||
| test("test resolution files cleaned after resolving artifact") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test case doesn't test the senario we describe in the PR, that is, having multiple clients trying to resolve artifacts at the same time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually triggering the original race in a test could be pretty hard, so I'm not sure that we necessarily should block this fix on having an end-to-end integration test which could reproduce the original race. I think that the use of the UUID should be sufficient and therefore the only important thing to test is that we're still cleaning up the files properly (as is being done here).
Therefore I would welcome a re-submitted and cleaned-up version of this PR which addresses the other review comments (which I hope to merge soon if it's in good shape).
| /** A nice function to use in tests as well. Values are dummy strings. */ | ||
| def getModuleDescriptor: DefaultModuleDescriptor = DefaultModuleDescriptor.newDefaultInstance( | ||
| ModuleRevisionId.newInstance("org.apache.spark", "spark-submit-parent", "1.0")) | ||
| // Include timestamp in module name, so multiple clients resolving maven coordinates at the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: coordinates -> coordinate ?
|
ok to test |
|
Test build #80324 has finished for PR 18801 at commit
|
|
retest this please |
|
Test build #80351 has finished for PR 18801 at commit
|
|
Your test case don't reflect the change you made to support multiple clients resolves artifacts at the same time, could you add a new test case or manually test that? Thanks! |
|
@jiangxb1987 @kiszk |
|
@Victsm kindly ping |
|
gentle ping @Victsm |
1 similar comment
|
gentle ping @Victsm |
| // Include timestamp in module name, so multiple clients resolving maven coordinates at the | ||
| // same time do not modify the same resolution file concurrently. | ||
| ModuleRevisionId.newInstance("org.apache.spark", | ||
| "spark-submit-parent-" + System.currentTimeMillis().toString, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
System.currentTimeMillis() is not very unique. If you want uniqueness, use a UUID, or better yet, java.nio.file.Files.createTempFile.
EDIT: this is not a file, so go with UUID.
| s"resolved-${mdId.getOrganisation}-${mdId.getName}-${mdId.getRevision}.properties") | ||
| ) | ||
| currentResolutionFiles.foreach{ file => | ||
| if (file.exists) file.delete |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.exists() is not needed before .delete(). Also, add parentheses.
| new File(ivySettings.getDefaultCache, | ||
| s"resolved-${mdId.getOrganisation}-${mdId.getName}-${mdId.getRevision}.properties") | ||
| ) | ||
| currentResolutionFiles.foreach{ file => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
space before {
|
+1 on using UUID and having a better test case |
|
+1 for UUID |
|
gentle ping @Victsm |
|
Can one of the admins verify this patch? |
|
Anyone can take over this PR and submit it again? |
|
I can take over this PR. |
|
@kiszk Thank you! |
… artifacts at the same time ## What changes were proposed in this pull request? When multiple clients attempt to resolve artifacts via the `--packages` parameter, they could run into race condition when they each attempt to modify the dummy `org.apache.spark-spark-submit-parent-default.xml` file created in the default ivy cache dir. This PR changes the behavior to encode UUID in the dummy module descriptor so each client will operate on a different resolution file in the ivy cache dir. In addition, this patch changes the behavior of when and which resolution files are cleaned to prevent accumulation of resolution files in the default ivy cache dir. Since this PR is a successor of #18801, close #18801. Many codes were ported from #18801. **Many efforts were put here. I think this PR should credit to Victsm .** ## How was this patch tested? added UT into `SparkSubmitUtilsSuite` Author: Kazuaki Ishizaki <[email protected]> Closes #21251 from kiszk/SPARK-10878. (cherry picked from commit d3c426a) Signed-off-by: Marcelo Vanzin <[email protected]>
|
@Victsm you can close this PR now. |
… artifacts at the same time ## What changes were proposed in this pull request? When multiple clients attempt to resolve artifacts via the `--packages` parameter, they could run into race condition when they each attempt to modify the dummy `org.apache.spark-spark-submit-parent-default.xml` file created in the default ivy cache dir. This PR changes the behavior to encode UUID in the dummy module descriptor so each client will operate on a different resolution file in the ivy cache dir. In addition, this patch changes the behavior of when and which resolution files are cleaned to prevent accumulation of resolution files in the default ivy cache dir. Since this PR is a successor of apache#18801, close apache#18801. Many codes were ported from apache#18801. **Many efforts were put here. I think this PR should credit to Victsm .** ## How was this patch tested? added UT into `SparkSubmitUtilsSuite` Author: Kazuaki Ishizaki <[email protected]> Closes apache#21251 from kiszk/SPARK-10878.
… artifacts at the same time When multiple clients attempt to resolve artifacts via the `--packages` parameter, they could run into race condition when they each attempt to modify the dummy `org.apache.spark-spark-submit-parent-default.xml` file created in the default ivy cache dir. This PR changes the behavior to encode UUID in the dummy module descriptor so each client will operate on a different resolution file in the ivy cache dir. In addition, this patch changes the behavior of when and which resolution files are cleaned to prevent accumulation of resolution files in the default ivy cache dir. Since this PR is a successor of apache#18801, close apache#18801. Many codes were ported from apache#18801. **Many efforts were put here. I think this PR should credit to Victsm .** added UT into `SparkSubmitUtilsSuite` Author: Kazuaki Ishizaki <[email protected]> Closes apache#21251 from kiszk/SPARK-10878. (cherry picked from commit d3c426a) RB=1313943 G=superfriends-reviewers R=fli,mshen,yezhou,edlu A=fli
What changes were proposed in this pull request?
When multiple clients attempt to resolve artifacts via the "--packages" parameter, they could run into race condition when they each attempt to modify the dummy "org.apache.spark-spark-submit-parent-default.xml" file created in the default ivy cache dir.
This patch changes the behavior to encode timestamp in the dummy module descriptor so each client will operate on a different resolution file in the ivy cache dir. In addition, this patch changes the behavior of when and which resolution files are cleaned to prevent accumulation of resolution files in the default ivy cache dir.
How was this patch tested?
Unit test added in
SparkSubmitUtilsSuite