
Performance Benchmarks

mrambacher edited this page Jul 2, 2021 · 75 revisions

These benchmarks measure RocksDB performance when data resides on flash storage.

Setup

All of the benchmarks are run on the same AWS instance. Here are the details of the test setup:

  • Instance type: m5d.2xlarge (8 vCPU, 32 GB memory, 1 x 300 GB NVMe SSD)
  • Kernel version: Linux 4.14.177-139.253.amzn2.x86_64
  • File System: XFS with discard enabled

To understand the performance of the SSD, we ran an fio test and observed 117K IOPS of 4KB random reads (see the fio test results in the Appendix for the full output).

All tests were executed by running benchmark.sh with the following parameters (unless otherwise specified): NUM_KEYS=900000000 and CACHE_SIZE=6442450944. Long-running tests additionally set a duration of 5400 seconds (DURATION=5400).

DIO tests were executed with the options `--use_direct_io_for_flush_and_compaction --use_direct_reads`. All other parameters used their default values unless explicitly mentioned here. Tests were executed sequentially against the same database instance. The db_bench tool was built via `make release`.
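Concretely, a single run looks roughly like the following. This is a minimal sketch, not a turnkey script: it assumes a RocksDB checkout containing `tools/benchmark.sh`, the `/data` paths are illustrative, and how extra db_bench flags (such as the DIO options) are forwarded depends on your benchmark.sh version.

```shell
# Build the release db_bench binary in a RocksDB checkout.
make release

# Parameters common to every run on this page.
export NUM_KEYS=900000000
export CACHE_SIZE=6442450944
export DB_DIR=/data/db      # assumed flash-backed XFS mount (illustrative)
export WAL_DIR=/data/wal

# Initial load, then a time-bounded read benchmark against the same database.
tools/benchmark.sh bulkload
DURATION=5400 tools/benchmark.sh randomread
```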

The following test sequence was executed:

Test 1. Bulk Load of keys in Random Order (benchmark.sh bulkload)

```
NUM_KEYS=900000000 CACHE_SIZE=6442450944 benchmark.sh bulkload
```

Measure the performance of loading 900 million keys into the database. The keys are inserted in random order. The database is empty at the beginning of the run and gradually fills up; no data is read while the load is in progress.

| Version | Opts | Time | ops/sec | mb/sec | usec/op | p50 | p75 | p99 | p99.9 | p99.99 | Stall-time | Stall% | du -s -k |
|---------|------|------|---------|--------|---------|-----|-----|-----|-------|--------|------------|--------|----------|
| 6.10.4 | None | 4002 | 1062310 | 425.5 | 0.9 | 0.5 | 0.8 | 1 | 3 | 1013 | 00:02:24.652 | 17.2 | 101402936 |
| 6.15.5 | None | 4068 | 1045195 | 418.6 | 1.0 | 0.5 | 0.8 | 1 | 3 | 980 | 00:02:08.223 | 15.3 | 101401808 |
| 6.21.2 | None | 4141 | 1017259 | 407.5 | 0.9 | 0.5 | 0.8 | 2 | 3 | 556 | 00:01:32.279 | 11.0 | 101406320 |
| 6.22.1 | None | 4143 | 1013002 | 405.8 | 1.0 | 0.5 | 0.8 | 2 | 3 | 224 | 00:01:26.032 | 10.2 | 101405804 |
| 6.22.1 | DIO | 4058 | 1031703 | 413.2 | 1.0 | 0.5 | 0.8 | 2 | 3 | 125 | 00:01:23.019 | 9.9 | 101403424 |

Test 2. Random Read (benchmark.sh randomread)

```
NUM_KEYS=900000000 CACHE_SIZE=6442450944 DURATION=5400 benchmark.sh randomread
```

Measure performance to randomly read existing keys. The database after bulkload was used as the starting point.

| Version | Opts | ops/sec | mb/sec | usec/op | p50 | p75 | p99 | p99.9 | p99.99 |
|---------|------|---------|--------|---------|-----|-----|-----|-------|--------|
| 6.10.4 | None | 138496 | 35.1 | 462.1 | 617.1 | 764.3 | 1240 | 1295 | 2484 |
| 6.15.5 | None | 138513 | 35.1 | 462.0 | 616.9 | 764.2 | 1240 | 1295 | 3083 |
| 6.21.2 | None | 138633 | 35.1 | 461.6 | 616.8 | 764.1 | 1240 | 1295 | 2461 |
| 6.22.1 | None | 138623 | 35.1 | 461.7 | 616.8 | 763.9 | 1239 | 1295 | 2663 |
| 6.22.1 | DIO | 189237 | 47.9 | 338.2 | 430.0 | 540.7 | 854 | 960 | 1291 |

Test 3. Multi-Random Read (benchmark.sh multireadrandom)

```
NUM_KEYS=900000000 CACHE_SIZE=6442450944 DURATION=5400 benchmark.sh multireadrandom --multiread_batched
```

Measure performance to randomly multi-get existing keys. The database after bulkload was used as the starting point.

| Version | Opts | ops/sec | p50 | p75 | p99 | p99.9 | p99.99 |
|---------|------|---------|-----|-----|-----|-------|--------|
| 6.10.4 | None | 138476 | 462.1 | 4583.0 | 5710.9 | 9298 | 9864 |
| 6.15.5 | None | 138507 | 462.0 | 4582.3 | 5710.2 | 9299 | 9867 |
| 6.21.2 | None | 138623 | 461.7 | 4577.4 | 5707.2 | 9294 | 9866 |
| 6.22.1 | None | 138660 | 461.6 | 4576.1 | 5706.3 | 9294 | 9867 |
| 6.22.1 | DIO | 189235 | 338.2 | 3410.7 | 4047.7 | 6406 | 6583 |

Test 4. Range Scan (benchmark.sh fwdrange)

```
NUM_KEYS=900000000 CACHE_SIZE=6442450944 DURATION=5400 benchmark.sh fwdrange
```

Measure performance to randomly iterate over keys. The database after bulkload was used as the starting point.

| Version | Opts | ops/sec | mb/sec | usec/op | p50 | p75 | p99 | p99.9 | p99.99 |
|---------|------|---------|--------|---------|-----|-----|-----|-------|--------|
| 6.10.4 | None | 70973 | 284.3 | 901.7 | 787.6 | 1438.0 | 1882 | 1899 | 9945 |
| 6.15.5 | None | 70978 | 284.3 | 901.7 | 787.7 | 1437.5 | 1882 | 1899 | 9890 |
| 6.21.2 | None | 70967 | 284.3 | 901.8 | 787.8 | 1437.3 | 1882 | 1899 | 10253 |
| 6.22.1 | None | 70971 | 284.3 | 901.7 | 787.8 | 1437.2 | 1882 | 1899 | 10274 |
| 6.22.1 | DIO | 78829 | 315.7 | 811.9 | 870.3 | 1090.5 | 1411 | 1859 | 2618 |

Test 5. Overwrite (benchmark.sh overwrite)

```
NUM_KEYS=900000000 CACHE_SIZE=6442450944 DURATION=5400 benchmark.sh overwrite
```

Measure the performance of randomly overwriting existing keys. The database produced by the previous benchmark was used as the starting point.

| Version | Opts | ops/sec | mb/sec | W-Amp | W-MB/s | usec/op | p50 | p75 | p99 | p99.9 | p99.99 | Stall-time | Stall% | du -s -k |
|---------|------|---------|--------|-------|--------|---------|-----|-----|-----|-------|--------|------------|--------|----------|
| 6.10.4 | None | 94539 | 37.9 | 4.7 | 161.9 | 676.9 | 328.4 | 700.4 | 10022 | 29843 | 56548 | 00:07:11.226 | 8.0 | 162965216 |
| 6.15.5 | None | 92911 | 37.2 | 4.7 | 158.4 | 688.8 | 351.9 | 697.7 | 10031 | 29894 | 58333 | 00:08:43.334 | 9.7 | 161156844 |
| 6.21.2 | None | 91776 | 36.8 | 4.7 | 155.6 | 697.3 | 379.9 | 708.7 | 10055 | 29782 | 55942 | 00:08:24.882 | 9.4 | 162082088 |
| 6.22.1 | None | 91586 | 36.7 | 4.7 | 155.0 | 698.8 | 380.5 | 709.4 | 10140 | 29887 | 58244 | 00:08:29.153 | 9.5 | 161321740 |
| 6.22.1 | DIO | 90310 | 36.2 | 4.8 | 154.7 | 708.7 | 419.0 | 730.1 | 9227 | 28816 | 55513 | 00:06:22.790 | 7.1 | 160400436 |

Test 6. Multi-threaded read and single-threaded write (benchmark.sh readwhilewriting)

```
NUM_KEYS=900000000 CACHE_SIZE=6442450944 DURATION=5400 MB_WRITE_PER_SEC=2 benchmark.sh readwhilewriting
```

Measure performance with one writer and multiple reader threads. The writes are rate limited.

| Version | Opts | ops/sec | mb/sec | W-Amp | W-MB/s | usec/op | p50 | p75 | p99 | p99.9 | p99.99 | du -s -k |
|---------|------|---------|--------|-------|--------|---------|-----|-----|-----|-------|--------|----------|
| 6.10.4 | None | 95649 | 30.5 | 30.2 | 19.7 | 669.1 | 608.1 | 834.5 | 3999 | 6597 | 17861 | 141636760 |
| 6.15.5 | None | 96223 | 30.6 | 28.0 | 18.7 | 665.1 | 609.4 | 835.1 | 3965 | 6475 | 15613 | 141339712 |
| 6.21.2 | None | 96891 | 30.7 | 29.1 | 18.9 | 660.5 | 607.1 | 833.4 | 3961 | 6465 | 13669 | 141208940 |
| 6.22.1 | None | 96812 | 30.7 | 28.4 | 18.5 | 661.1 | 606.5 | 832.8 | 3958 | 6500 | 15777 | 140972560 |
| 6.22.1 | DIO | 140635 | 44.5 | 32.5 | 14.9 | 455.1 | 400.4 | 543.4 | 2804 | 5051 | 9348 | 140465744 |

Test 7. Multi-threaded scan and single-threaded write (benchmark.sh fwdrangewhilewriting)

```
NUM_KEYS=900000000 CACHE_SIZE=6442450944 DURATION=5400 MB_WRITE_PER_SEC=2 benchmark.sh fwdrangewhilewriting
```

Measure performance with one writer and multiple iterator threads. The writes are rate limited.

| Version | Opts | ops/sec | mb/sec | W-Amp | W-MB/s | usec/op | p50 | p75 | p99 | p99.9 | p99.99 | du -s -k |
|---------|------|---------|--------|-------|--------|---------|-----|-----|-----|-------|--------|----------|
| 6.10.4 | None | 41827 | 167.5 | 17.6 | 8.0 | 1530.0 | 1329.2 | 1826.2 | 6212 | 13129 | 23244 | 142497356 |
| 6.15.5 | None | 42197 | 169.0 | 17.5 | 8.0 | 1516.6 | 1315.5 | 1817.8 | 6185 | 13018 | 23081 | 142157224 |
| 6.21.2 | None | 41360 | 165.7 | 18.2 | 8.0 | 1547.2 | 1354.1 | 1840.5 | 6257 | 13434 | 25280 | 141820100 |
| 6.22.1 | None | 41244 | 165.2 | 16.7 | 7.6 | 1551.6 | 1362.3 | 1845.8 | 6234 | 13346 | 24988 | 141716340 |
| 6.22.1 | DIO | 35962 | 144.0 | 17.0 | 7.4 | 1779.5 | 1538.8 | 2150.6 | 6455 | 9661 | 13264 | 141008104 |
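Pulling the 6.22.1 rows out of the tables above makes the direct-I/O trade-off explicit: the read and scan workloads speed up substantially, while overwrite and scan-while-writing give a little back. A quick check of the percentage deltas (ops/sec figures copied verbatim from the tables; the dict layout is just for illustration):

```python
# ops/sec for RocksDB 6.22.1, (buffered, direct I/O), copied from the
# tables above.
results = {
    "randomread":           (138623, 189237),
    "fwdrange":             (70971, 78829),
    "overwrite":            (91586, 90310),
    "readwhilewriting":     (96812, 140635),
    "fwdrangewhilewriting": (41244, 35962),
}

for test, (buffered, direct) in results.items():
    change = (direct / buffered - 1) * 100
    print(f"{test:22s} {change:+6.1f}%")
```

This works out to roughly +37% for randomread, +45% for readwhilewriting, +11% for fwdrange, with small regressions for overwrite and fwdrangewhilewriting.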

Appendix

fio test results

```
$ fio --randrepeat=1 --ioengine=sync --direct=1 --gtod_reduce=1 --name=test --filename=/data/test_file --bs=4k --iodepth=64 --size=4G --readwrite=randread --numjobs=32 --group_reporting
test: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=sync, iodepth=64
...
fio-2.14
Starting 32 processes
Jobs: 3 (f=3): [_(3),r(1),_(1),E(1),_(10),r(1),_(13),r(1),E(1)] [100.0% done] [445.3MB/0KB/0KB /s] [114K/0/0 iops] [eta 00m:00s]
test: (groupid=0, jobs=32): err= 0: pid=28042: Fri Jul 24 01:36:19 2020
  read : io=131072MB, bw=469326KB/s, iops=117331, runt=285980msec
  cpu          : usr=1.29%, sys=3.26%, ctx=33585114, majf=0, minf=297
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=33554432/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
   READ: io=131072MB, aggrb=469325KB/s, minb=469325KB/s, maxb=469325KB/s, mint=285980msec, maxt=285980msec

Disk stats (read/write):
  nvme1n1: ios=33654742/61713, merge=0/40, ticks=8723764/89064, in_queue=8788592, util=100.00%

$ fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test --filename=/data/test_file --bs=4k --iodepth=64 --size=4G --readwrite=randread
test: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=64
fio-2.14
Starting 1 process
Jobs: 1 (f=1): [r(1)] [100.0% done] [456.3MB/0KB/0KB /s] [117K/0/0 iops] [eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=28385: Fri Jul 24 01:36:56 2020
  read : io=4096.0MB, bw=547416KB/s, iops=136854, runt=  7662msec
  cpu          : usr=22.20%, sys=48.81%, ctx=144112, majf=0, minf=73
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued    : total=r=1048576/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
   READ: io=4096.0MB, aggrb=547416KB/s, minb=547416KB/s, maxb=547416KB/s, mint=7662msec, maxt=7662msec

Disk stats (read/write):
  nvme1n1: ios=1050868/1904, merge=0/1, ticks=374836/2900, in_queue=370532, util=98.70%
```
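As a consistency check on the fio output above: both runs use a 4 KB block size, so the reported bandwidth in KB/s should be roughly IOPS times 4 (the small difference in the first run is rounding in fio's reporting).

```python
# (IOPS, reported bandwidth in KB/s), copied from the two fio runs above.
runs = {
    "sync, 32 jobs":      (117331, 469326),
    "libaio, iodepth=64": (136854, 547416),
}

for name, (iops, bw_kb) in runs.items():
    # 4 KB blocks: bandwidth should be ~ IOPS * 4, within rounding slack.
    assert abs(iops * 4 - bw_kb) <= 4, name
    print(f"{name}: {iops} IOPS ~ {iops * 4} KB/s (reported {bw_kb})")
```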
