Description
What happened
The current multiprocessing implementation fails when running with multiple workers on macOS and Windows, because these platforms use the spawn start method by default. You may see exceptions such as:
TypeError: cannot pickle 'generator' object
Root Cause
The issue is not just about specific unpicklable objects, but fundamentally that the spawn method (unlike fork) creates a fresh interpreter for the child process. This requires that the entire class instance and its state be serialized (pickled) and passed to the new process.
Currently, our data classes are not designed to be passed between processes in this way. When spawn attempts to transfer a class instance to the worker process, it fails because the instance's state (including generators, iterators like itertools.cycle, or streaming datasets) cannot be serialized.
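The failure can be reproduced without starting any worker, because spawn relies on the same pickle machinery. The following is a minimal sketch; `CyclingDataset` and `try_pickle` are hypothetical names used for illustration and are not part of the project's code.

```python
import pickle


class CyclingDataset:
    """Hypothetical stand-in for a data class that holds a live generator."""

    def __init__(self, items):
        self.items = list(items)
        # A generator object is live interpreter state and cannot be pickled.
        self._stream = (item for item in self.items)


def try_pickle(obj):
    """Return None if obj pickles cleanly, otherwise the error message."""
    try:
        pickle.dumps(obj)
    except TypeError as exc:
        return str(exc)
    return None


# Pickling the instance fails because its state contains a generator.
error = try_pickle(CyclingDataset([1, 2, 3]))
print(error)
```

The printed message should be of the same kind as the TypeError quoted above, since pickling the instance means pickling every attribute it holds.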
Future Compatibility (Python 3.14+):
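This issue is critical for long-term support because Python is moving away from fork as the default start method. Since Python 3.12, using fork in a multi-threaded process emits a DeprecationWarning, and in Python 3.14 the default on Linux (and other POSIX platforms except macOS) changes from fork to forkserver. Like spawn, forkserver pickles the objects passed to the worker process, so this breakage will affect Linux users as well.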
ref: https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods
Proposed Solution (Best Practice)
We should support cross-platform multiprocessing by explicitly passing the required class instances to worker processes, as recommended in the Python documentation.
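One way this could look is sketched below, assuming the per-worker context can be reduced to plain data. `WorkerConfig`, `process_shard`, and `run_with_spawn` are hypothetical names chosen for illustration; only `multiprocessing.get_context` and `Pool.map` are standard library calls.

```python
import multiprocessing as mp
from dataclasses import dataclass


@dataclass
class WorkerConfig:
    """Hypothetical picklable context: plain data only, no live iterators."""
    shard_index: int
    num_shards: int
    items: tuple


def process_shard(config):
    """Worker entry point. It is a module-level function so that a spawned
    child can import it by name."""
    # Each worker handles every num_shards-th item, starting at its own index.
    return sum(config.items[config.shard_index::config.num_shards])


def run_with_spawn(items, num_workers=2):
    """Run process_shard in spawned workers, passing each config explicitly."""
    ctx = mp.get_context("spawn")  # same start method on every platform
    configs = [WorkerConfig(i, num_workers, tuple(items)) for i in range(num_workers)]
    with ctx.Pool(num_workers) as pool:
        return pool.map(process_shard, configs)
```

`run_with_spawn` should be called from under an `if __name__ == '__main__':` guard, because a spawned child re-imports the main module before running the worker function.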
Action
- Refactor the code to ensure all data classes and context objects are fully pickleable.
- Explicitly pass context objects to worker processes rather than relying on fork's shared memory.
- Explicitly ensure all platforms use spawn.
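For the first action item, one common pattern is to drop the live iterator when pickling and rebuild it lazily in the receiving process. This is a sketch under that assumption, not the project's implementation; `CyclingLoader` is a hypothetical class, and the iteration position is deliberately not preserved across the transfer.

```python
import itertools
import pickle


class CyclingLoader:
    """Hypothetical data class that rebuilds its iterator after unpickling."""

    def __init__(self, items):
        self.items = list(items)
        self._cycle = None  # created lazily on first use

    def _iterator(self):
        if self._cycle is None:
            self._cycle = itertools.cycle(self.items)
        return self._cycle

    def next_item(self):
        return next(self._iterator())

    def __getstate__(self):
        # Copy the state and drop the live iterator before pickling.
        state = self.__dict__.copy()
        state["_cycle"] = None
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)


loader = CyclingLoader(["a", "b", "c"])
loader.next_item()  # advance the original so it holds a live iterator
clone = pickle.loads(pickle.dumps(loader))
print(clone.next_item())  # prints "a": iteration restarts in the copy
```

If workers need to resume from a specific position, that position would have to be stored as plain data (for example an integer offset) and replayed in `__setstate__`.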
Mitigation Plan
Until proper cross-platform support is implemented:
- macOS / Linux (Temporary): If breakage occurs on newer Python versions, the mitigation is to roll back to Python 3.12 or explicitly enforce the fork start method, e.g.:

      import multiprocessing as mp

      if __name__ == '__main__':
          # Set the start method once, before any process or pool is created.
          mp.set_start_method('fork', force=True)

- Windows: There is currently no mitigation for Windows (Win32), as it does not support fork.
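Because fork does not exist on Windows, a shared entry point could guard the temporary mitigation with a platform check. The helper below is a hypothetical sketch; `enforce_fork_if_available` is not an existing function in the project.

```python
import multiprocessing as mp


def enforce_fork_if_available():
    """Force the fork start method where the platform offers it.

    Returns True if fork was selected, False otherwise (for example on Windows).
    """
    if "fork" not in mp.get_all_start_methods():
        return False
    mp.set_start_method("fork", force=True)
    return True


if __name__ == "__main__":
    if not enforce_fork_if_available():
        print("fork is unavailable on this platform; no mitigation applied")
```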