-
Notifications
You must be signed in to change notification settings - Fork 3.7k
[AutoScheduler] Use PopenPool instead of multiprocessing.pool #8492
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
What is the purpose of switching from ThreadPool to PopenPoolExecutor? |
|
PopenPool is intended to replace multiprocessing (not multithreading), so just curious, do we have specific reasons here to replace the ThreadPool here? |
|
We can keep the thread pool here, and only replace |
|
By far those are all uses of |
|
please fix ci error |
|
auto_scheduler need to be fixed: |
|
Exception ignored in: <function XGBoostCostModel.del at 0x7f93082e7b00> |
Discussed with @shingjan , we should avoid importing non-package file in the test directory when writing tests, these common code should be put in |
|
|
|
would be great if @comaniac @jcf94 @merrymercy could also take a look |
|
Thanks for all your effort in this refactoring 🙏 |
tkonolige
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks pretty good to me. I've listed a couple of little things to change. Thanks!
junrushao
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really exciting work! I don't have other comments :-)
|
@tqchen @shingjan @tkonolige @junrushao1994 @vinx13 this pr make |
|
@Wanger-SJTU I also met this problem. The problem is in new mechanism of creating processes and in serializing global object such as |
|
We should send the callable |
Are you sure that cloud pickle will be able to serialize such callable object? It was my first idea, but I faced the problem that I wasn't able to serialize python callable object and then call it from C++ code. I got an error like something like that: "Not able to serialize callable object". Don't remember the whole test, unfortunately. But maybe I missed something and did something wrong. |
|
Python function is pickled by it's qualified name (that means the module of the function should be importable). It might be different in our case I'll double check |
|
we can serialize callable functions(through cloud pickle), but the requirement is that the callable functions can only call functions in the tvm namespace and should be available in https://github.com/apache/tvm/blob/main/python/tvm/exec/popen_worker.py It does mean that if we call functions that are not by default imported, we might need to do def custom_func():
# import here so things are available in the callee side
import dep1
dep1.xyz |
|
Additionally, because some of the custom function might need to use functions from |
|
@tqchen, @vinx13 I tried to do that (passing callable function as an argument) and specially I commited these changes on my local branch: https://github.com/echuraev/tvm/tree/echuraev/fix_measure_build_func But it doesn't work. I got the following error: One interesting thing that I tried to dump each of these parameters in tuple Another idea how we can quick fix this problem: move |
|
The finding looks correct. Only python function can be sent. PackedFunc cannot be pickled. But they can be looked up by name they are also registered in the worker |
|
@Wanger-SJTU @echuraev can you check if this fix works? vinx13@23c5ffc |
I checked, and yes, it works! Thank you very much! |
This PR intends to replace the use of
multiprocessing.Pooltopopen_pool.PopenPoolExecutorinauto_scheduler.