Skip to content

Effect of removing std::uncaught_exceptions calls on OL obsoleting critical section enter and exit

Laurynas Biveinis edited this page Apr 22, 2021 · 5 revisions

The initial OLC ART implementation introduced unique_write_lock_obsoleting_guard class that took a write-locked lock in constructor and in destructor decided whether to write unlock and obsolete on normal path or just write unlock on error path by comparing std::uncaught_exceptions results in constructor and destructor. This did not work out due to guard lifetime extending beyond their variable scope due to copy elision (?) and I had to introduce explicit commit and abort calls, removing the need for decision in destructor, thus making std::uncaught_exceptions calls redundant.

Removing them has beneficial performance effect. Node4 microbenchmarks: a ~2% slowdown (large Node4 tree random gets) to a ~7% (sequential insert to full Node4) speedup. Node16: a ~5% slowdown (shrinking a large Node48 tree to Node16 tree randomly) to a ~4% speed up (adding to a Node16 tree sequentially). Node48: a ~0% (growing a small Node16 tree to Node48 randomly) to a 5% speed up (shrinking a Node256 tree to Node48). Node256: 3% slowdown (growing a small Node48 tree to Node256 sequentially) to a 9% speed up (full Node256 tree random deletes).

perf stat baseline on Node256 deletes:

2021-04-22T05:37:48+02:00
Running ../../../unodb/build.release/benchmark/micro_benchmark_node256
Run on (8 X 3800 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x4)
  L1 Instruction 32 KiB (x4)
  L2 Unified 256 KiB (x4)
  L3 Unified 8192 KiB (x1)
Load Average: 0.28, 0.60, 0.40
--------------------------------------------------------------------------------------------------------------------
Benchmark                                                          Time             CPU   Iterations UserCounters...
--------------------------------------------------------------------------------------------------------------------
full_node256_tree_sequential_delete<unodb::olc_db>/192          22.6 us         22.6 us        31015 items_per_second=6.37658M/s size=26.1484k
full_node256_tree_sequential_delete<unodb::olc_db>/512          66.3 us         66.2 us        10580 items_per_second=6.24942M/s size=68.2344k
full_node256_tree_sequential_delete<unodb::olc_db>/4096          522 us          522 us         1342 items_per_second=6.34678M/s size=545.156k
full_node256_tree_sequential_delete<unodb::olc_db>/32768        1602 us         1601 us          437 items_per_second=6.33357M/s size=4.25503M
full_node256_tree_sequential_delete<unodb::olc_db>/196608       5187 us         5185 us          135 items_per_second=5.86828M/s size=25.5237M
full_node256_tree_random_delete<unodb::olc_db>/192              21.6 us         21.6 us        32464 items_per_second=6.67622M/s size=26.1484k
full_node256_tree_random_delete<unodb::olc_db>/512              64.4 us         64.3 us        10876 items_per_second=6.43436M/s size=68.2344k
full_node256_tree_random_delete<unodb::olc_db>/4096              538 us          538 us         1300 items_per_second=6.1522M/s size=545.156k
full_node256_tree_random_delete<unodb::olc_db>/32768            1782 us         1782 us          393 items_per_second=5.69098M/s size=4.25503M
full_node256_tree_random_delete<unodb::olc_db>/196608           8338 us         8336 us           85 items_per_second=3.65034M/s size=25.5237M

 Performance counter stats for '../../../unodb/build.release/benchmark/micro_benchmark_node256 --benchmark_filter=delete<unodb::olc_db>':

         32,194.83 msec task-clock                #    0.999 CPUs utilized          
                39      context-switches          #    0.001 K/sec                  
                 0      cpu-migrations            #    0.000 K/sec                  
         1,601,915      page-faults               #    0.050 M/sec                  
   122,408,024,779      cycles                    #    3.802 GHz                      (83.33%)
    43,148,309,518      stalled-cycles-frontend   #   35.25% frontend cycles idle     (83.33%)
    20,235,238,905      stalled-cycles-backend    #   16.53% backend cycles idle      (66.68%)
   213,171,307,574      instructions              #    1.74  insn per cycle         
                                                  #    0.20  stalled cycles per insn  (83.34%)
    41,002,130,533      branches                  # 1273.562 M/sec                    (83.34%)
        68,024,882      branch-misses             #    0.17% of all branches          (83.33%)

      32.213041437 seconds time elapsed

      30.499034000 seconds user
       1.696168000 seconds sys

With the patch:

2021-04-22T05:37:48+02:00
Running ../../../unodb/build.release/benchmark/micro_benchmark_node256
Run on (8 X 3800 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x4)
  L1 Instruction 32 KiB (x4)
  L2 Unified 256 KiB (x4)
  L3 Unified 8192 KiB (x1)
Load Average: 0.28, 0.60, 0.40
--------------------------------------------------------------------------------------------------------------------
Benchmark                                                          Time             CPU   Iterations UserCounters...
--------------------------------------------------------------------------------------------------------------------
full_node256_tree_sequential_delete<unodb::olc_db>/192          22.6 us         22.6 us        31015 items_per_second=6.37658M/s size=26.1484k
full_node256_tree_sequential_delete<unodb::olc_db>/512          66.3 us         66.2 us        10580 items_per_second=6.24942M/s size=68.2344k
full_node256_tree_sequential_delete<unodb::olc_db>/4096          522 us          522 us         1342 items_per_second=6.34678M/s size=545.156k
full_node256_tree_sequential_delete<unodb::olc_db>/32768        1602 us         1601 us          437 items_per_second=6.33357M/s size=4.25503M
full_node256_tree_sequential_delete<unodb::olc_db>/196608       5187 us         5185 us          135 items_per_second=5.86828M/s size=25.5237M
full_node256_tree_random_delete<unodb::olc_db>/192              21.6 us         21.6 us        32464 items_per_second=6.67622M/s size=26.1484k
full_node256_tree_random_delete<unodb::olc_db>/512              64.4 us         64.3 us        10876 items_per_second=6.43436M/s size=68.2344k
full_node256_tree_random_delete<unodb::olc_db>/4096              538 us          538 us         1300 items_per_second=6.1522M/s size=545.156k
full_node256_tree_random_delete<unodb::olc_db>/32768            1782 us         1782 us          393 items_per_second=5.69098M/s size=4.25503M
full_node256_tree_random_delete<unodb::olc_db>/196608           8338 us         8336 us           85 items_per_second=3.65034M/s size=25.5237M

 Performance counter stats for '../../../unodb/build.release/benchmark/micro_benchmark_node256 --benchmark_filter=delete<unodb::olc_db>':

         32,194.83 msec task-clock                #    0.999 CPUs utilized          
                39      context-switches          #    0.001 K/sec                  
                 0      cpu-migrations            #    0.000 K/sec                  
         1,601,915      page-faults               #    0.050 M/sec                  
   122,408,024,779      cycles                    #    3.802 GHz                      (83.33%)
    43,148,309,518      stalled-cycles-frontend   #   35.25% frontend cycles idle     (83.33%)
    20,235,238,905      stalled-cycles-backend    #   16.53% backend cycles idle      (66.68%)
   213,171,307,574      instructions              #    1.74  insn per cycle         
                                                  #    0.20  stalled cycles per insn  (83.34%)
    41,002,130,533      branches                  # 1273.562 M/sec                    (83.34%)
        68,024,882      branch-misses             #    0.17% of all branches          (83.33%)

      32.213041437 seconds time elapsed

      30.499034000 seconds user
       1.696168000 seconds sys

Clone this wiki locally