Description
Bug report
Bug description:
When switching a Python project from the default multiprocessing fork mode to forkserver, I noticed that the processes in the pool became significantly slower to start. Specifying preload modules didn't help.
Debugging the internals of multiprocessing showed that most of the time (50 ms per process in my test) is spent in _fixup_main_from_path(data['init_main_from_path']):
cpython/Lib/multiprocessing/spawn.py, line 246 in eae9d7d:
    _fixup_main_from_path(data['init_main_from_path'])
This was surprising because __main__ was listed in the preload parameter. However, I noticed that forkserver.py tries to populate the main_path parameter from the result of spawn.get_preparation_data():
cpython/Lib/multiprocessing/forkserver.py, lines 149 to 151 in eae9d7d:
    desired_keys = {'main_path', 'sys_path'}
    data = spawn.get_preparation_data('ignore')
    main_kws = {x: y for x, y in data.items() if x in desired_keys}
However, the latter only writes the path to the init_main_from_path parameter:
cpython/Lib/multiprocessing/spawn.py, line 202 in eae9d7d:
    d['init_main_from_path'] = os.path.normpath(main_path)
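The mismatch can be checked directly by applying the same filter to the real preparation data. This is a small sketch; the exact set of keys printed depends on the Python version and on how the interpreter was started, so no output is shown.

```python
from multiprocessing import spawn

# Same call and filter as in the forkserver.py lines quoted above.
data = spawn.get_preparation_data("ignore")
desired_keys = {"main_path", "sys_path"}
main_kws = {k: v for k, v in data.items() if k in desired_keys}

# All keys produced by get_preparation_data(); when a script is run by path
# this includes 'init_main_from_path', not 'main_path'.
print(sorted(data))

# Keys that survive the filter and would be passed on for preloading.
print(sorted(main_kws))
```

If `main_path` is missing from the second list, the fork server has no path from which to preload `__main__`.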
The end effect is that the __main__ module wasn't actually preloaded, and every child process had to re-run the main script. Unless I'm missing something, the logic in forkserver.py needs to take main_path from the value of init_main_from_path?
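A minimal sketch of the key mismatch and of one possible remapping. The `data` dict is a stand-in with made-up paths, and `build_main_kws` is a hypothetical helper, not the actual CPython code or a proposed patch.

```python
# Stand-in for the result of spawn.get_preparation_data() when the main
# script is run by path (hypothetical values; only the key names matter).
data = {
    "sys_path": ["/project"],
    "init_main_from_path": "/project/main.py",
}

desired_keys = {"main_path", "sys_path"}

# Filter as in forkserver.py: 'main_path' is never a key, so it is dropped.
main_kws = {k: v for k, v in data.items() if k in desired_keys}


def build_main_kws(data):
    """Hypothetical fix: expose init_main_from_path under the main_path key."""
    kws = {}
    if "sys_path" in data:
        kws["sys_path"] = data["sys_path"]
    if "init_main_from_path" in data:
        kws["main_path"] = data["init_main_from_path"]
    return kws


fixed_kws = build_main_kws(data)
```

With the original filter only `sys_path` survives. With the remapping, `main_path` carries the script path, which is what the suggestion above amounts to.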
CPython versions tested on:
3.13
Operating systems tested on:
Linux