Multiprocessing should always accelerate `batch()` for large movies. But for the small movie in the walkthrough notebook, multiprocessing takes 10x longer than a single process on my M3 Mac (~4 s vs. 0.5 s)! This is presumably because of the large overhead of spawning fresh Python interpreters for the workers. If I add the following at the top of the notebook:
```python
import multiprocessing

# Reuse the parent interpreter's state instead of starting each worker from scratch.
multiprocessing.set_start_method('fork')
```
so that this startup overhead becomes negligible, multiprocessing is indeed faster than a single process (0.3 s). But `fork` is not the default on macOS because it can cause crashes, and it has never been available on Windows.
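For anyone who wants to reproduce the comparison locally, here is a minimal sketch of opting into `fork` only where the interpreter actually offers it (plain standard-library calls; a benchmarking workaround, not a recommendation for the notebook):

```python
import multiprocessing

# 'fork' is offered on Linux and macOS CPython, but not on Windows.
# Only switch when it is available, and tolerate the case where a
# start method has already been fixed earlier in the session.
if 'fork' in multiprocessing.get_all_start_methods():
    try:
        multiprocessing.set_start_method('fork')
    except RuntimeError:
        pass  # start method was already set; leave it alone
```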
Questions for the community:
- Is there a similar slowdown for short movies on Windows? What about macOS on Intel? (Or is it just my computer?!)
- Can we propose an "elegant" and "smart" way to predict whether multiprocessing will be an advantage? Or, at least, can we have `batch()` print a recommendation for next time after the job finishes? For example, if the `spawn` method was used for multiprocessing and the total time was under 10 s (see the sketch after this list).
- Should we just go back to making `processes=1` the default?
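To make the second idea concrete, here is one rough sketch of what such a post-run hint could look like. The helper name, the 10 s threshold, and the wording are all hypothetical, not existing trackpy API; it only assumes the `processes` keyword described above.

```python
import multiprocessing
import time

import trackpy as tp


def batch_with_hint(frames, diameter, slow_job_threshold=10.0, **kwargs):
    """Run tp.batch() and print a hint if multiprocessing looked wasteful.

    Hypothetical helper: the threshold and the message are placeholders.
    """
    start = time.perf_counter()
    features = tp.batch(frames, diameter, **kwargs)
    elapsed = time.perf_counter() - start

    method = multiprocessing.get_start_method()  # 'fork', 'spawn', or 'forkserver'
    used_multiprocessing = kwargs.get('processes', 'auto') != 1
    if used_multiprocessing and method == 'spawn' and elapsed < slow_job_threshold:
        print(f"batch() took {elapsed:.1f} s with the 'spawn' start method; "
              "for a job this short, processes=1 may well be faster.")
    return features
```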
For now (v0.7 release), I'd like to use `processes=1` in the main part of the walkthrough, with a brief explanation. This at least lets us showcase the (excellent) baseline performance of trackpy on all platforms, and would help some users optimize their own workflows. We can keep the `%%timeit` comparisons at the bottom of the notebook.
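For reference, the timing comparison at the bottom could stay in roughly this form, as two separate cells; the exact call (diameter, any frame slice or `minmass`) here is just a placeholder, not necessarily what the notebook uses:

```python
%%timeit
tp.batch(frames, 11, processes=1)
```

```python
%%timeit
tp.batch(frames, 11, processes='auto')
```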
Please reply with your performance reports and thoughts!
(More context: when we made `processes='auto'` the default for `batch()`, macOS still used `fork` as its default start method. Since Python 3.8, however, the macOS default has been the slower but safer `spawn`. And starting with Python 3.14, `fork` is no longer the default anywhere: Linux and other POSIX platforms switch to `forkserver`.)
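When posting a timing report, it may help to include the environment details that matter here; something like this prints the relevant bits:

```python
import multiprocessing
import platform
import sys

# Details worth pasting alongside a timing report.
print(platform.platform())                  # OS and architecture
print(sys.version.split()[0])               # Python version
print(multiprocessing.get_start_method())   # 'fork', 'spawn', or 'forkserver'
```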