Single Database instance -- Resume request when client fails fixes #468 #497
Conversation
…s SUBMITTED or COMPLETE
Codecov Report: All modified and coverable lines are covered by tests ✅

Additional details and impacted files:

@@            Coverage Diff             @@
##           master     #497      +/-   ##
==========================================
+ Coverage   83.27%   83.65%   +0.37%
==========================================
  Files          26       26
  Lines        1429     1462      +33
==========================================
+ Hits         1190     1223      +33
  Misses        239      239

Flags with carried forward coverage won't be shown. ☔ View full report in Codecov by Sentry.
@BenGalewsky, @ponyisi, @gordonwatts ready for review!
Just had a chat with @ketan96-m: we need to check how the download behaves when a request is resumed (or rerun).
General comments following discussion:
- only add to cache when a request ID is obtained (see the sketch after this list)
- do not worry about the edge case where two requests with the same hash are made in the same Spec (we could instead raise an error if this happens, but it is almost certainly a user mistake)
- ensure that if the transform fails, we wipe the in-progress cache record
- things are a bit copy-and-paste in a few places; reduce this
Otherwise the overall structure looks fine to me.
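To make these points concrete, here is a minimal sketch of the intended cache behaviour, assuming a TinyDB-backed cache like the one in servicex/query_cache.py; the class, method, and field names are hypothetical and this is not the actual implementation:

from tinydb import TinyDB, Query

class InProgressCache:
    # Hypothetical sketch of the caching rules discussed above.
    def __init__(self, path: str = "cache.json"):
        self.db = TinyDB(path)

    def record_submission(self, hash_value: str, request_id: str) -> None:
        # Only add a cache record once a request ID has been obtained
        # from the backend.
        self.db.insert({"hash": hash_value,
                        "request_id": request_id,
                        "status": "SUBMITTED"})

    def wipe_failed(self, hash_value: str) -> None:
        # If the transform fails, remove the in-progress record so a later
        # submission of the same spec is not matched against a dead request.
        record = Query()
        self.db.remove(record.hash == hash_value)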
Hi @kyungeonchoi, if the user changes the spec of the request and therefore the target file area, I think we should just go ahead and deliver there. I believe with this PR we will simply redownload all files from the incomplete requests (regardless of what has already been downloaded), which is a bit brute force but can be improved in the future. I don't think we need to include "avoid re-downloading" yet, since that will really require an integrity check on the files we already have.
This looks great. I think my questions are just because I don't understand something deeper going on. Please take a look and then decide!
It seems like the
I made a few comments. In principle we could go ahead with this but I think for long-term maintenance a few tweaks would improve things.
servicex/query_cache.py
Outdated
transform = Query()
with self.lock:
    records = self.db.search((transform.hash == hash_value)
                             & (transform.status.exists()))
Mild preference to do this check after the query, i.e. look up the request first (because having the request missing should be an error!) and then check if the field is there.
It's also true that you basically never use this function except to check if the result is SUBMITTED. Perhaps better to make the function get_transform_request_submitted and return a bool?
I have changed the function to return a bool. However, a missing request shouldn't raise an error: in that case the request simply has not been submitted yet, so the function should return False rather than raise an error.
Right, I see. Somewhat unfortunate naming of the function I suppose. It would be good to document this contract (it returns False if the transformation is not in the cache at all) in the docstring, since you rely on this behavior.
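A sketch of what the agreed-upon contract could look like (illustrative only; it assumes the same TinyDB-backed class shown above, with self.db, self.lock, and tinydb's Query, and the final code in servicex/query_cache.py may differ):

def get_transform_request_submitted(self, hash_value: str) -> bool:
    """Check whether a cached request with this hash has status SUBMITTED.

    Returns False if the request is not in the cache at all, i.e. it has
    not been submitted yet.
    """
    transform = Query()
    with self.lock:
        records = self.db.search(transform.hash == hash_value)
    if not records:
        return False
    return records[0].get("status") == "SUBMITTED"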
See one line to update; otherwise I approve.
tests/test_dataset.py
Outdated
python_dataset.download_files = AsyncMock()
python_dataset.download_files.return_value = []
python_dataset.cache.update_transform_request_id = Mock()
# python_dataset.cache.update_transform_status = Mock()
Remove commented code?
Done!
Resolves #468
Problem
When the client submits a request but then suffers a client-side failure, there is no way to resume the transforms from the last point. The client is forced to rerun the request, which triggers the backend again.
Fix
Single Database instance -- Added flag
This PR fixes the broken-client problem: the client will be able to resume the request from the last point, provided it submits the same request again (important, since the request hash is calculated and compared with any pending requests).
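A hypothetical sketch of the resulting resume decision (the function and method names here are illustrative, not the actual servicex client API):

def submit_or_resume(spec_hash, cache, client):
    # If a request with the same hash is already cached as SUBMITTED,
    # re-attach to it instead of triggering a new transform on the backend.
    if cache.get_transform_request_submitted(spec_hash):
        request_id = cache.get_request_id(spec_hash)   # hypothetical helper
        return client.download_results(request_id)     # hypothetical call
    # Otherwise submit a new request and cache its ID once it is known.
    request_id = client.submit(spec_hash)              # hypothetical call
    cache.record_submission(spec_hash, request_id)
    return client.download_results(request_id)

Re-running the same spec after a crash therefore takes the first branch and simply downloads the results of the already-running request.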