Open
Description
Timing measurements with microsecond accuracy are needed to analyse the performance of OpenMP parallel constructs in the core code of fst. We need to determine the speedup due to parallel processing, the effect of locks in the code (e.g. while writing blocks of data) and the startup and tear-down costs of a parallel section. These costs are likely platform dependent.
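As a rough starting point, the measurements could be sketched as below. This is not existing fst code: `min_elapsed_us`, `parallel_overhead_us` and `locked_sum` are hypothetical names, and `std::chrono::steady_clock` is assumed to have microsecond (or better) resolution on the target platform, which should be verified per platform. The pragmas are ignored when OpenMP is disabled, so the same code also gives a single-threaded baseline.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: run `body` `repeats` times and return the fastest
// run in microseconds. Taking the minimum reduces scheduling noise.
template <typename F>
double min_elapsed_us(F&& body, int repeats) {
  assert(repeats > 0);
  using clock = std::chrono::steady_clock;
  double best = -1.0;
  for (int r = 0; r < repeats; ++r) {
    const auto start = clock::now();
    body();
    const auto stop = clock::now();
    const double us =
        std::chrono::duration<double, std::micro>(stop - start).count();
    if (best < 0.0 || us < best) best = us;
  }
  return best;
}

// An empty parallel region approximates the startup plus tear-down cost
// of a parallel section for a given number of threads.
double parallel_overhead_us(int nr_of_threads, int repeats) {
  return min_elapsed_us([nr_of_threads] {
    (void)nr_of_threads;  // unused when compiled without OpenMP
    #pragma omp parallel num_threads(nr_of_threads)
    { }
  }, repeats);
}

// A loop where every iteration takes a lock. Timing this against an
// unlocked variant gives a first estimate of the cost of locking.
long long locked_sum(int n, int nr_of_threads) {
  (void)nr_of_threads;  // unused when compiled without OpenMP
  long long total = 0;
  #pragma omp parallel for num_threads(nr_of_threads)
  for (int i = 0; i < n; ++i) {
    #pragma omp critical
    total += i;
  }
  return total;
}
```

For example, `min_elapsed_us([] { locked_sum(1000000, 4); }, 10)` would time the locked loop on four threads. The first call to a parallel region usually includes thread pool creation, so it may be worth reporting the first run separately from the minimum.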
Compression algorithms in fst are fast on a single thread. That means that performance will be dominated by IO throughput when using only a few cores in parallel. Once IO is the bottleneck, using more cores will degrade performance due to threading overhead. The ideal (maximum) number of threads to use should be determined per column type.
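One way to determine that number would be a simple sweep over thread counts, repeated for each column type. The sketch below is illustrative only: `sweep_thread_counts` and `fastest_thread_count` are hypothetical names, and `work` stands in for whatever routine writes a column of a given type with the requested number of threads.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper: call `work(threads)` for threads = 1..max_threads
// and return the fastest of `repeats` runs per thread count, in
// microseconds. Element i of the result belongs to i + 1 threads.
template <typename F>
std::vector<double> sweep_thread_counts(F&& work, int max_threads, int repeats) {
  assert(max_threads > 0 && repeats > 0);
  using clock = std::chrono::steady_clock;
  std::vector<double> best_us;
  for (int threads = 1; threads <= max_threads; ++threads) {
    double best = -1.0;
    for (int r = 0; r < repeats; ++r) {
      const auto start = clock::now();
      work(threads);
      const auto stop = clock::now();
      const double us =
          std::chrono::duration<double, std::micro>(stop - start).count();
      if (best < 0.0 || us < best) best = us;
    }
    best_us.push_back(best);
  }
  return best_us;
}

// Return the thread count (1-based) with the lowest measured time.
// Ties resolve to the smaller thread count.
int fastest_thread_count(const std::vector<double>& elapsed_us) {
  assert(!elapsed_us.empty());
  std::size_t best = 0;
  for (std::size_t i = 1; i < elapsed_us.size(); ++i) {
    if (elapsed_us[i] < elapsed_us[best]) best = i;
  }
  return static_cast<int>(best) + 1;
}
```

Running the sweep once per column type (integer, double, character, and so on) and per platform would yield a small table of thread caps. Since the result depends on disk speed as well as CPU, the measured values should be treated as platform-specific rather than as universal defaults.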