-
-
Notifications
You must be signed in to change notification settings - Fork 76
Fix keyerror for eager api #281
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report
@@ Coverage Diff @@
## master #281 +/- ##
=======================================
Coverage 0.00% 0.00%
=======================================
Files 34 35 +1
Lines 2754 2913 +159
=======================================
- Misses 2754 2913 +159
Continue to review full report at Codecov.
|
|
I checked for hangs and overall it doesn't seem to hang now, but the scheduler really slows down at some point when ~100k thunks were processed, so it feels like a hang sometimes. That will need another issue for tracking |
|
I don't understand how this can fix the hangs, since the RNG is thread-safe. Is there some insight that I'm missing? |
|
I'm not so sure myself and I wouldn't consider the hangs issue 100% gone yet, but I definitely noticed that they happen less often now and usually I'm just getting super slowdowns instead. Whenever i crl-c'd out of the hangs the stack trace would usually point at a wait on a future. My initial theory is that if sometimes the EAGER_ID_MAP was pointing at the wrong tid the wait would then be called on a future that wouldn't notify the waiting task for some reason?(maybe the task related to that future didn't start yet and there's a deadlock) Not sure though, so I'd have to investigate more But as I said, my synthetic hang scenarios would usually lock up after 150 iterations or so, with this i can easily go above 1k and usually get a slowdown after 2k iterations instead of a hang |
|
Hey, so do we merge this at some point or is there some reason not to? |
|
Ok, I'll merge this now, although I still don't really understand why it fixes it. |
|
Thanks again! |
|
It fixes the key error, which still should have been rare because that's a rand over uint64, but it wasn't rare, which is weird to me. |
So since the UID was a rand() very rarely weird things like keyerrors would happen, especially in case when you're rerunning tests or using
@btimeNoticed by running these usecases and by looking at EAGER_ID_MAP and thunk_dict.
I'm 50% sure it fixes the occasional hangs as well, but I'll try to confirm it next week
fixes #267