Description
Multithreading support in zstd is a great performance booster on modern multi-core CPUs. We've been getting favorable results by setting ZSTD_c_nbWorkers equal to the number of threads the system can execute simultaneously.
However, on hardware like Apple Silicon Macs, where not all cores are built for performance, it can be tricky to decide how many threads to specify in ZSTD_c_nbWorkers. Specify too many, and the threads that get assigned to efficiency cores will lag behind the others. Specify too few, and you might be leaving potential gains on the table. Even if you manage to find the number of performance cores and create only that many threads, there is still no control over which thread gets executed on which core.
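For reference, here is a rough sketch of the "find the number of performance cores" step described above. The helper name `suggested_worker_count` is hypothetical and not part of zstd. It assumes macOS 12 or later, where the `hw.perflevel0.physicalcpu` sysctl reports the core count of the highest performance level, and it falls back to the number of online logical CPUs everywhere else.

```c
#include <stddef.h>
#include <unistd.h>
#if defined(__APPLE__)
#include <sys/types.h>
#include <sys/sysctl.h>
#endif

/* Hypothetical helper (not a zstd API): suggest a value for ZSTD_c_nbWorkers. */
int suggested_worker_count(void)
{
#if defined(__APPLE__)
    /* Assumption: macOS 12+ exposes the performance-core count under
     * "hw.perflevel0.physicalcpu". On older systems the call fails and
     * we fall through to the generic path below. */
    int perf_cores = 0;
    size_t len = sizeof(perf_cores);
    if (sysctlbyname("hw.perflevel0.physicalcpu", &perf_cores, &len, NULL, 0) == 0
        && perf_cores > 0) {
        return perf_cores;
    }
#endif
    /* Generic POSIX-style fallback: number of logical CPUs currently online. */
    long online = sysconf(_SC_NPROCESSORS_ONLN);
    return online > 0 ? (int)online : 1;
}
```

The result could then be passed along as `ZSTD_CCtx_setParameter(cctx, ZSTD_c_nbWorkers, suggested_worker_count())`. This only limits the thread count; it does not pin the workers to performance cores, so the scheduling concern above still applies.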
Apple recommends using GCD on Apple Silicon (https://developer.apple.com/news/?id=vk3m204o), but that would mean moving away from the generic pthread implementation, which works on all POSIX systems.
Has there been any exploration of how to optimize zstd performance for such asymmetric multiprocessing systems?