HADOOP-18002. abfs rename idempotency broken -remove recovery #3641
Cut modtime-based rename recovery as it doesn't work. Applications will have to use the etag API of HADOOP-17979 and implement recovery themselves.

Why not do the HEAD and etag recovery in the ABFS client? It cuts the IO capacity in half, which kills job commit performance.

The manifest committer of MAPREDUCE-7341 will do this recovery and act as the reference implementation of the algorithm.

Change-Id: I071fb31967ae63e0247e2f328d9cfd0e2423b2bf
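
For readers who want a picture of what application-side recovery might look like, here is a minimal sketch. It is not the manifest committer's code and does not use the ABFS client API: the `Store` interface, `FlakyStore`, and `renameWithRecovery` are hypothetical names invented for illustration. It assumes the caller already knows the source etag (for example from an earlier listing) and that the store preserves an object's etag across rename, so the extra probes are only issued on the failure path.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class EtagRenameRecovery {

    /** Hypothetical minimal store API for this sketch; not the ABFS client interface. */
    interface Store {
        /** HEAD-style probe: the etag of the object at the path, if it exists. */
        Optional<String> etagOf(String path) throws IOException;

        /** Rename; may throw even though the rename took effect on the server. */
        void rename(String source, String dest) throws IOException;
    }

    /**
     * Rename source to dest. If the rename call fails, probe both paths and treat
     * the failure as a success when the source is gone and the destination carries
     * the etag the caller recorded for the source.
     *
     * @return true if the failure was recovered from, false if the rename succeeded directly
     */
    static boolean renameWithRecovery(Store store, String source, String dest,
            String sourceEtag) throws IOException {
        try {
            store.rename(source, dest);
            return false;
        } catch (IOException e) {
            // Only now pay for the extra probes.
            boolean sourceGone = !store.etagOf(source).isPresent();
            boolean destMatches = store.etagOf(dest).map(sourceEtag::equals).orElse(false);
            if (sourceGone && destMatches) {
                return true;
            }
            throw e; // cannot prove the rename happened: surface the original failure
        }
    }

    /** In-memory store that can report a failure after the rename has been applied. */
    static class FlakyStore implements Store {
        final Map<String, String> etags = new HashMap<>();
        boolean failAfterNextRename;

        @Override
        public Optional<String> etagOf(String path) {
            return Optional.ofNullable(etags.get(path));
        }

        @Override
        public void rename(String source, String dest) throws IOException {
            String etag = etags.remove(source);
            if (etag == null) {
                throw new IOException("source not found: " + source);
            }
            etags.put(dest, etag); // the etag travels with the object (assumption)
            if (failAfterNextRename) {
                failAfterNextRename = false;
                throw new IOException("simulated failure after the rename was applied");
            }
        }
    }

    public static void main(String[] args) throws IOException {
        FlakyStore store = new FlakyStore();
        store.etags.put("/job/_temporary/part-0000", "etag-1");
        store.failAfterNextRename = true;
        boolean recovered = renameWithRecovery(
                store, "/job/_temporary/part-0000", "/job/part-0000", "etag-1");
        System.out.println("recovered=" + recovered);
    }
}
```

Passing the etag in, rather than fetching it with a HEAD before every rename, is the point of the design: the happy path stays at one request per file.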

@snvijaya @joshelser @mehakmeet @mukund-thakur modified tests all happy; running the full suite against Azure Cardiff

full test run ok, overloaded system triggered

🎊 +1 overall
This message was automatically generated.

Why can't we just revert the commit rather than deleting it? Also, there are delete idempotency checks as well; not sure if they also use the same LMT (last-modified time) check.

Good idea, let me see. Delete idempotency is easier: if the file isn't there, the delete worked.
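
To illustrate the "if the file isn't there, delete worked" rule, here is a small sketch. It is not the ABFS implementation: `Store`, `FlakyStore`, and `deleteWithRetry` are hypothetical names, and the sketch assumes no other client is deleting the same path, since a missing file on retry cannot otherwise be attributed to our own earlier attempt.

```java
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

public class IdempotentDelete {

    /** Hypothetical store API for this sketch: delete returns false if the path was absent. */
    interface Store {
        boolean delete(String path) throws IOException;
    }

    /**
     * Delete with retries. If an earlier attempt failed and a later attempt finds
     * the path missing, assume the earlier attempt took effect and report success.
     *
     * @return true if the path existed and is now believed deleted by this call
     */
    static boolean deleteWithRetry(Store store, String path, int maxAttempts)
            throws IOException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        IOException lastFailure = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                boolean existed = store.delete(path);
                // Missing on a retry: the failed attempt presumably did the work.
                return existed || lastFailure != null;
            } catch (IOException e) {
                lastFailure = e;
            }
        }
        throw lastFailure;
    }

    /** In-memory store that can report a failure after the delete has been applied. */
    static class FlakyStore implements Store {
        final Set<String> paths = new HashSet<>();
        boolean failAfterNextDelete;

        @Override
        public boolean delete(String path) throws IOException {
            boolean existed = paths.remove(path);
            if (failAfterNextDelete) {
                failAfterNextDelete = false;
                throw new IOException("simulated failure after the delete was applied");
            }
            return existed;
        }
    }
}
```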

@mukund-thakur way too many changes since the original change went in, and that did both rename and delete

This patch is complete. I have nothing further to do in it; all it needs is review.

mukund-thakur
left a comment
LGTM +1

Cut modtime-based rename recovery, as the object modification time is not updated during a rename operation. Applications will have to use the etag API of HADOOP-17979 and implement recovery themselves.

Why not do the HEAD and etag recovery in the ABFS client? It cuts the IO capacity in half, which kills job commit performance.

The manifest committer of MAPREDUCE-7341 will do this recovery and act as the reference implementation of the algorithm.

Contributed by: Steve Loughran

Change-Id: I810054c9fd05041dac552f13d31fb15d7524721b