Question about concurrency (parallelism vs. context switching) #818

paulzakin · 2023-08-31T18:05:42Z

Hello!

When you enable concurrency, say procrastinate worker --concurrency=4, what happens exactly? If you have a computer with a four-core CPU, are all four cores saturated (i.e. parallelism)? Or is this flag just opening an event loop on one of the cores for asynchronous processing and (artificially) limiting how many jobs it keeps on that thread at one time (i.e. context switching)? From some experimentation, I'm reasonably sure it is the latter, but I just want to check.

And if it is the latter, does true parallelism require running multiple instances of the worker itself? For example, running python -m procrastinate --app=server.worker.app worker in two separate processes on that same four-core machine.

Thanks!

PS: Love the library and our team has been experimenting with it to build a better mental model of how it works. Would you be open to a pull request from us for some improvements on documentation? Think it might make things a little easier for users like us that are new to this library and task queues!

ewjoachim · 2023-08-31T22:03:00Z

Maintainer

I think you got it right.

Even with an event loop, I/O is still happening in parallel, so if your tasks are I/O-bound, they're parallel. If your tasks are CPU-bound, the mechanisms within Procrastinate aren't going to be very helpful but, as you said, launching multiple processes will get you to use all the CPU.
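To illustrate the difference, here is a minimal sketch using plain asyncio (not Procrastinate's own code). It assumes four simulated tasks of 0.2 s each; the names io_bound_task, blocking_task, run_concurrently and measure are made up for the example. time.sleep is used as a stand-in for CPU-bound work, since it holds the event loop thread in the same way a long computation would.

```python
import asyncio
import time

DELAY = 0.2  # seconds each simulated task takes (arbitrary value for the demo)


async def io_bound_task():
    # Awaiting hands control back to the event loop, so the other
    # tasks can make progress while this one waits.
    await asyncio.sleep(DELAY)


async def blocking_task():
    # time.sleep never yields to the event loop. It stands in for
    # CPU-bound work, which also keeps the single thread busy.
    time.sleep(DELAY)


async def run_concurrently(task_factory, n=4):
    # Schedule n tasks on the same event loop and time the whole batch.
    start = time.perf_counter()
    await asyncio.gather(*(task_factory() for _ in range(n)))
    return time.perf_counter() - start


def measure():
    io_elapsed = asyncio.run(run_concurrently(io_bound_task))
    blocking_elapsed = asyncio.run(run_concurrently(blocking_task))
    return io_elapsed, blocking_elapsed


if __name__ == "__main__":
    io_elapsed, blocking_elapsed = measure()
    print(f"4 awaiting tasks: {io_elapsed:.2f}s")
    print(f"4 blocking tasks: {blocking_elapsed:.2f}s")
```

The awaiting tasks overlap and the batch finishes in roughly the time of one task, while the blocking tasks run one after the other on the single thread.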

The reason behind this choice is that multiprocessing is hard, and is often the place where a program crosses boundaries that make it more complex to deploy and scale. I felt the mental model of Procrastinate was much simpler if it stayed a (mostly) single-threaded, single-process program.

Note that those points are mentioned here, and here but if it was still confusing to you, then by all means, feel free to do a PR, and I'll do my best to review & merge it.

We're conceptually in the same situation as uvicorn, which lets you deploy a Python program that runs in an event loop. They have multiple modes of deployment:

- a basic --workers n argument that doesn't do process management
- an integration in Gunicorn, where Gunicorn gets to manage subprocesses, and uvicorn only needs to manage a single process
- an invitation to use the process manager of your stack to manage the different processes (e.g. systemd, or k8s, or ...)

I think the idea is that (especially if you're managing an event loop already), it makes sense not to also manage child processes. People are going to be using a process manager of their choice, and trying to do anything from the lib itself will only get in their way and bring unwanted complexity.
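For quick local experiments, a rough sketch of the "several worker processes" approach could look like the following. It only reuses the command line quoted in the question (the app path server.worker.app is the asker's, and the helper names build_worker_commands and launch_workers are hypothetical). In a real deployment, a process manager would normally take this role instead of a script.

```python
import subprocess
import sys


def build_worker_commands(n_processes, concurrency, app="server.worker.app"):
    # One identical command line per worker process. The app path is the
    # one from the question; replace it with your own.
    command = [
        sys.executable, "-m", "procrastinate",
        f"--app={app}",
        "worker",
        f"--concurrency={concurrency}",
    ]
    return [list(command) for _ in range(n_processes)]


def launch_workers(n_processes, concurrency):
    # Start every worker as a separate OS process so that each one can be
    # scheduled on its own core, then wait for all of them to exit.
    processes = [
        subprocess.Popen(cmd)
        for cmd in build_worker_commands(n_processes, concurrency)
    ]
    for process in processes:
        process.wait()


if __name__ == "__main__":
    # Example: 4 processes, each running up to 4 jobs on its event loop.
    launch_workers(n_processes=4, concurrency=4)
```

Each process runs its own event loop, so CPU-bound jobs in different processes can run in parallel, while --concurrency still controls how many jobs share a single process.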