Should batch() always try to use multiprocessing? #788

@nkeim

Multiprocessing should always accelerate batch() for large movies. But for the small movie in the walkthrough notebook, multiprocessing takes 10x longer than a single process on my M3 Mac (~4 s vs. 0.5 s)! This is presumably because of the large overhead of spawning Python interpreters from scratch. If I add at the top of the notebook

import multiprocessing
multiprocessing.set_start_method('fork')

so that this overhead is negligible, multiprocessing is now faster than single-process (0.3 s). But fork is not the default on Mac because it can cause crashes, and it was never available on Windows.
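To get a rough feel for that startup cost on a given machine, here is a minimal sketch that times launching a fresh interpreter with `subprocess`. It is only a proxy for what a spawn worker pays, not a measurement of `multiprocessing` itself; the helper name `interpreter_startup_time` and the stand-in module list are assumptions for illustration.

```python
import subprocess
import sys
import time


def interpreter_startup_time(modules=(), repeats=3):
    """Best-of-`repeats` wall time (s) to launch a fresh interpreter and import `modules`.

    Hypothetical helper: a rough proxy for the per-worker cost of the spawn method.
    """
    # Build a tiny program that only performs the requested imports.
    code = "".join(f"import {name}\n" for name in modules) or "pass"
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run([sys.executable, "-c", code], check=True)
        timings.append(time.perf_counter() - start)
    return min(timings)


if __name__ == "__main__":
    print(f"bare interpreter: {interpreter_startup_time():.2f} s")
    # Stdlib stand-ins; substitute the packages your workers actually import.
    print(f"with imports: {interpreter_startup_time(('json', 'decimal')):.2f} s")
```

Multiplying the second number by the worker count gives an order-of-magnitude estimate of what spawn adds before any frame is processed.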

Questions for the community:

  1. Is there a similar slowdown for short movies on Windows? What about macOS on Intel? (Or is it just my computer?!)
  2. Can we propose an "elegant" and "smart" way to predict if multiprocessing will be an advantage? Or at least, to have batch() print a recommendation for next time, after the job finishes? For example, it could print one whenever the spawn method was used for multiprocessing and the total time was under 10 s.
  3. Should we just go back to making processes=1 the default?
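As a starting point for question 2, the after-the-fact recommendation could be sketched roughly as follows. This is not existing trackpy code: `recommend_processes`, its parameters, and the message wording are all hypothetical, and the 10 s threshold is just the example figure from the question.

```python
import multiprocessing


def recommend_processes(start_method, elapsed_s, processes,
                        threshold_s=10.0, slow_methods=("spawn",)):
    """Return a hint string if a short job probably lost time to process startup, else None.

    Hypothetical helper, not part of trackpy.
    """
    if processes == 1:
        return None  # already single-process: nothing to recommend
    if start_method not in slow_methods:
        return None  # cheap start method (e.g. fork): overhead is likely negligible
    if elapsed_s >= threshold_s:
        return None  # long job: startup cost is probably amortized
    return (
        f"This job took {elapsed_s:.1f} s using the '{start_method}' start method. "
        "For jobs this short, processes=1 may be faster."
    )


if __name__ == "__main__":
    # Example: a 4 s job run with 8 workers under the platform's default method.
    print(recommend_processes(multiprocessing.get_start_method(), 4.0, processes=8))
```

Keeping the decision in a pure function like this would make the threshold easy to tune once people report timings from other platforms.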

For now (v0.7 release), I'd like to use processes=1 in the main part of the walkthrough, with a brief explanation. This at least lets us showcase the (excellent) baseline performance of trackpy on all platforms, and would help some users optimize their own workflows. We can keep the %%timeit comparisons at the bottom of the notebook.

Please reply with your performance reports and thoughts!

(More context: When we made processes='auto' the default for batch(), macOS still used fork as the default start method. Since Python 3.8, however, the macOS default is spawn. Starting with Python 3.14, fork is no longer the default on any platform: Linux and other POSIX systems move to forkserver, while macOS and Windows stay on the slower but safer spawn.)
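Since the default depends on both the platform and the Python version, it would help if performance reports included which start method was actually in effect. A minimal sketch using only the standard library (the function name `describe_start_method` is made up for this example):

```python
import multiprocessing
import sys


def describe_start_method():
    """Return a one-line summary of the Python version, platform, and start method."""
    # Note: get_start_method() fixes the default context, so a later
    # set_start_method() call needs force=True or it raises RuntimeError.
    method = multiprocessing.get_start_method()
    version = f"{sys.version_info.major}.{sys.version_info.minor}"
    return f"Python {version} on {sys.platform}: start method is {method!r}"


if __name__ == "__main__":
    print(describe_start_method())
    print("available:", multiprocessing.get_all_start_methods())
```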
