Effect of fully inlining try_read_unlock

Laurynas Biveinis edited this page Oct 14, 2021 · 7 revisions

In the cold paths of the OLC algorithms, GCC 11 emits a call to try_read_unlock instead of inlining it, even though the function should compile down to a single instruction, in which case inlining would also save code size. This experiment forces inlining with __attribute__((always_inline, flatten)).
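As a rough illustration of where the attribute goes, here is a minimal sketch. The class optimistic_lock_sketch, its members, and the memory orderings are assumptions made up for this example and are not the project's actual lock implementation; only the name try_read_unlock and the attribute list come from the text above.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical, heavily simplified optimistic lock used only to show
// attribute placement. It is not the project's real class.
class optimistic_lock_sketch {
 public:
  using version_type = std::uint64_t;

  // Begin an optimistic read by remembering the current version.
  version_type read_lock() const noexcept {
    return version.load(std::memory_order_acquire);
  }

  // End an optimistic read: succeeds only if no writer bumped the version.
  // always_inline asks GCC to inline this at every call site, including
  // ones it considers cold; flatten asks it to also inline the calls made
  // inside this function's body. Being defined in-class, the function is
  // implicitly inline, which always_inline expects.
  __attribute__((always_inline, flatten)) bool try_read_unlock(
      version_type locked_version) const noexcept {
    std::atomic_thread_fence(std::memory_order_acquire);
    return version.load(std::memory_order_relaxed) == locked_version;
  }

  // Stand-in for a writer: any modification bumps the version.
  void write() noexcept { version.fetch_add(1, std::memory_order_release); }

 private:
  std::atomic<version_type> version{0};
};
```

Whether the call actually disappears from a given cold path is best confirmed by inspecting the generated assembly with and without the attribute.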

baseline commit, patch

  • micro_benchmark_key_prefix: 2% slowdown (unpredictable_cut_key_prefix) to 2% speedup (unpredictable_get_shared_length)
  • micro_benchmark_n4: 2% slowdown (n4_full_scan/100) to 2% speedup (full_n4_to_minimal_sequential_delete/512)