Until root-project/root#11442 is fixed, it's important to set the environment variable EXTRA_CLING_ARGS='-O2' (or -O3) to get good performance. For distributed execution, the env var needs to be set on every worker.
On my laptop, running on only 1 file per process on 8 physical cores, the event loop runtime drops from 9.6 seconds to 4.8 seconds when the variable is set to -O2 (runtimes with -O3 are comparable).
For local multi-thread execution the instructions are simple: run env EXTRA_CLING_ARGS='-O2' python analysis.py instead of just python analysis.py.
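If typing the env prefix every time gets tedious, the same thing can be done from a small Python launcher. This is only a sketch: the helper names cling_env and run_with_cling_opt are made up for illustration, and all it does is start the script in a child process with the variable already set, which is equivalent to the env command above.

```python
import os
import subprocess
import sys


def cling_env(opt_level="-O2", base_env=None):
    """Return a copy of the environment with EXTRA_CLING_ARGS set.

    Hypothetical helper: base_env defaults to the current process environment.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["EXTRA_CLING_ARGS"] = opt_level
    return env


def run_with_cling_opt(script, opt_level="-O2"):
    """Run `script` in a child Python process with EXTRA_CLING_ARGS set.

    Same effect as: env EXTRA_CLING_ARGS='-O2' python script
    """
    return subprocess.run(
        [sys.executable, script],
        env=cling_env(opt_level),
        check=True,  # raise CalledProcessError if the script exits non-zero
    )


# Example usage (not executed here):
# run_with_cling_opt("analysis.py")
```

Using a child process means the variable is in place before the script starts, so nothing in analysis.py itself has to change.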
For distributed execution I defer to @vepadulano 's expertise 😬