
Performance Benchmarks

mrambacher edited this page Aug 24, 2021 · 75 revisions

These benchmarks measure RocksDB performance when data resides on flash storage.

Setup

All of the benchmarks are run on the same AWS instance. Here are the details of the test setup:

  • Instance type: m5d.2xlarge (8 vCPUs, 32 GB memory, 1 x 300 GB NVMe SSD)
  • Kernel version: Linux 4.14.177-139.253.amzn2.x86_64
  • File System: XFS with discard enabled

To characterize the SSD, we ran an fio test and observed 117K IOPS for 4 KB random reads (see "fio test results" in the Appendix for the full output).
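As a quick sanity check on the fio figure, IOPS and bandwidth should agree at a fixed block size. The sketch below uses the values reported by the first fio run in the Appendix (iops=117331, bs=4k, bw=469326KB/s); the function name is illustrative only.

```python
def bandwidth_kb_per_sec(iops: int, block_size_kb: int) -> int:
    """Bandwidth implied by an IOPS figure at a fixed block size."""
    return iops * block_size_kb

# Values copied from the first fio run in the Appendix.
implied = bandwidth_kb_per_sec(117331, 4)
reported = 469326
relative_gap = abs(implied - reported) / reported

print(f"implied {implied} KB/s vs reported {reported} KB/s "
      f"(relative gap {relative_gap:.6%})")
```

The two figures differ only by rounding in the reported IOPS value.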

All tests were run by executing benchmark.sh with the following parameters (unless otherwise specified): NUM_KEYS=900000000 CACHE_SIZE=6442450944. Long-running tests were run for 5400 seconds (DURATION=5400).

Unless explicitly specified, the remaining test parameters used their default values. DIO tests were run with the options "--use_direct_io_for_flush_and_compaction --use_direct_reads". Tests were run sequentially against the same database instance. The db_bench tool was built with "make release".
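The raw parameter values are easier to read once converted: CACHE_SIZE is 6 GiB and DURATION is 90 minutes. Below is a minimal sketch of how one might assemble that environment before invoking benchmark.sh. The helper name and the script path in the commented-out call are assumptions, not part of the original setup.

```python
GIB = 1024 ** 3

def benchmark_env(long_running: bool) -> dict:
    """Environment variables for benchmark.sh, as listed in the setup above."""
    env = {
        "NUM_KEYS": str(900_000_000),
        "CACHE_SIZE": str(6 * GIB),      # 6442450944 bytes
    }
    if long_running:
        env["DURATION"] = str(90 * 60)   # 5400 seconds
    return env

env = benchmark_env(long_running=True)
print(env)

# Hypothetical invocation; the script location depends on your checkout.
# import os, subprocess
# subprocess.run(["./benchmark.sh", "readrandom"],
#                env={**os.environ, **env}, check=True)
```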

The following test sequence was executed:

Test 1. Bulk Load of keys in Random Order (benchmark.sh bulkload)

NUM_KEYS=900000000 CACHE_SIZE=6442450944 benchmark.sh bulkload

Measure the performance of loading 900 million keys into the database. The keys are inserted in random order. The database is empty at the start of the run and gradually fills up. No data is read while the load is in progress.

Version Opts Time ops/sec mb/sec usec/op p50 p75 p99 p99.9 p99.99 Stall-time Stall% du -s -k
6.10.4 None 4002 1062310 425.5 0.9 0.5 0.8 1 3 1013 00:02:24.652 17.2 101402936
6.15.5 None 4068 1045195 418.6 1.0 0.5 0.8 1 3 980 00:02:08.223 15.3 101401808
6.21.2 None 4141 1017259 407.5 0.9 0.5 0.8 2 3 556 00:01:32.279 11.0 101406320
6.22.1 None 4143 1013002 405.8 1.0 0.5 0.8 2 3 224 00:01:26.032 10.2 101405804
6.22.1 DIO 4058 1031703 413.2 1.0 0.5 0.8 2 3 125 00:01:23.019 9.9 101403424
6.23.0 None 4175 1015722 406.8 1.0 0.5 0.8 2 3 69 00:01:17.541 9.2 101405292
6.23.0 DIO 3885 1055232 422.7 0.9 0.5 0.8 2 3 21 00:00:52.360 6.2 101402116
6.24.0 None 3983 1025562 410.8 1.0 0.5 0.8 2 3 32 00:01:12.296 8.6 101406524
6.24.0 DIO 3880 1052049 421.4 1.0 0.5 0.8 2 3 22 00:01:05.862 7.8 101405064
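The ops/sec and mb/sec columns together imply an average payload per operation. The sketch below works this out for the 6.10.4 row under two interpretations of "MB". Which one db_bench uses is treated here as an assumption, and the function name is illustrative.

```python
def bytes_per_op(ops_per_sec: float, mb_per_sec: float, bytes_per_mb: int) -> float:
    """Average bytes moved per operation, given throughput in ops and MB."""
    return mb_per_sec * bytes_per_mb / ops_per_sec

# 6.10.4 bulkload row: 1062310 ops/sec at 425.5 mb/sec.
binary = bytes_per_op(1062310, 425.5, 2 ** 20)    # MB taken as 2^20 bytes
decimal = bytes_per_op(1062310, 425.5, 10 ** 6)   # MB taken as 10^6 bytes

print(f"MB = 2^20 bytes: {binary:.1f} bytes/op")
print(f"MB = 10^6 bytes: {decimal:.1f} bytes/op")
```

Under the binary interpretation the result is very close to 420 bytes per operation. The key and value sizes that would produce this figure are not stated in the text above.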

Test 2. Random Read (benchmark.sh readrandom)

NUM_KEYS=900000000 CACHE_SIZE=6442450944 DURATION=5400 benchmark.sh readrandom

Measure the performance of random reads of existing keys. The database after bulkload was used as the starting point.

Version Opts ops/sec mb/sec usec/op p50 p75 p99 p99.9 p99.99
6.10.4 None 138496 35.1 462.1 617.1 764.3 1240 1295 2484
6.15.5 None 138513 35.1 462.0 616.9 764.2 1240 1295 3083
6.21.2 None 138633 35.1 461.6 616.8 764.1 1240 1295 2461
6.22.1 None 138623 35.1 461.7 616.8 763.9 1239 1295 2663
6.22.1 DIO 189237 47.9 338.2 430.0 540.7 854 960 1291
6.23.0 None 137664 34.9 464.9 618.4 766.5 1244 1295 2557
6.23.0 DIO 189252 47.9 338.2 430.0 540.7 854 926 1296
6.24.0 None 130852 33.1 489.1 632.6 791.4 1266 1297 2152
6.24.0 DIO 189240 47.9 338.2 430.0 540.7 854 930 1292
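The ops/sec and usec/op columns are linked by Little's law: throughput multiplied by per-operation latency gives the average number of operations in flight. The sketch below applies this to two rows of the table. The resulting thread count is an inference from the data and is not stated in the text.

```python
def implied_concurrency(ops_per_sec: float, usec_per_op: float) -> float:
    """Little's law: operations in flight = throughput * latency."""
    return ops_per_sec * usec_per_op * 1e-6

# (ops/sec, usec/op) pairs copied from the readrandom table.
rows = {
    "6.10.4 None": (138496, 462.1),
    "6.22.1 DIO": (189237, 338.2),
}

for label, (ops, usec) in rows.items():
    print(f"{label}: about {implied_concurrency(ops, usec):.1f} operations in flight")
```

Both rows come out at roughly 64, which would be consistent with 64 reader threads in the buffered and direct I/O runs alike.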

Test 3. Multi-Random Read (benchmark.sh multireadrandom)

NUM_KEYS=900000000 CACHE_SIZE=6442450944 DURATION=5400 benchmark.sh multireadrandom --multiread_batched

Measure the performance of random multi-get reads of existing keys. The database after bulkload was used as the starting point.

Version Opts ops/sec p50 p75 p99 p99.9 p99.99
6.10.4 None 138476 462.1 4583.0 5710.9 9298 9864
6.15.5 None 138507 462.0 4582.3 5710.2 9299 9867
6.21.2 None 138623 461.7 4577.4 5707.2 9294 9866
6.22.1 None 138660 461.6 4576.1 5706.3 9294 9867
6.22.1 DIO 189235 338.2 3410.7 4047.7 6406 6583
6.23.0 None 137638 4607.1 5736.4 9360 9869 17172
6.23.0 DIO 189237 3409.0 4047.0 6406 6584 8125
6.24.0 None 130859 4836.9 5995.1 9630 9880 14169
6.24.0 DIO 189234 3409.0 4047.2 6406 6584 7996

Test 4. Range Scan (benchmark.sh fwdrange)

NUM_KEYS=900000000 CACHE_SIZE=6442450944 DURATION=5400 benchmark.sh fwdrange

Measure the performance of iterating over keys from random starting points. The database after bulkload was used as the starting point.

Version Opts ops/sec mb/sec usec/op p50 p75 p99 p99.9 p99.99
6.10.4 None 70973 284.3 901.7 787.6 1438.0 1882 1899 9945
6.15.5 None 70978 284.3 901.7 787.7 1437.5 1882 1899 9890
6.21.2 None 70967 284.3 901.8 787.8 1437.3 1882 1899 10253
6.22.1 None 70971 284.3 901.7 787.8 1437.2 1882 1899 10274
6.22.1 DIO 78829 315.7 811.9 870.3 1090.5 1411 1859 2618
6.23.0 None 70459 282.2 908.3 789.6 1443.0 1883 1899 10273
6.23.0 DIO 78829 315.7 811.9 870.5 1090.1 1356 1855 2596
6.24.0 None 67004 268.4 955.1 802.8 1480.1 1884 1899 7216
6.24.0 DIO 78832 315.8 811.8 870.4 1090.2 1376 1856 2582
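To compare the buffered (None) and direct I/O (DIO) rows for the same version, a relative change is often easier to read than two raw numbers. The sketch below does this for the 6.22.1 rows of the table above; the function name is illustrative.

```python
def relative_change_pct(baseline: float, candidate: float) -> float:
    """Percentage change of candidate relative to baseline."""
    return (candidate - baseline) / baseline * 100.0

# 6.22.1 rows from the fwdrange table: None is the baseline, DIO the candidate.
throughput_change = relative_change_pct(70971, 78829)   # ops/sec
tail_change = relative_change_pct(10274, 2618)          # p99.99

print(f"ops/sec change with DIO: {throughput_change:+.1f}%")
print(f"p99.99 change with DIO:  {tail_change:+.1f}%")
```

For this row pair, throughput rises by about 11% with DIO while the p99.99 latency falls by roughly three quarters.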

Test 5. Overwrite (benchmark.sh overwrite)

NUM_KEYS=900000000 CACHE_SIZE=6442450944 DURATION=5400 benchmark.sh overwrite

Measure the performance of randomly overwriting existing keys in the database. The database created by the preceding tests was used as the starting point.

Version Opts ops/sec mb/sec W-Amp W-MB/s usec/op p50 p75 p99 p99.9 p99.99 Stall-time Stall% du -s -k
6.10.4 None 94539 37.9 4.7 161.9 676.9 328.4 700.4 10022 29843 56548 00:07:11.226 8.0 162965216
6.15.5 None 92911 37.2 4.7 158.4 688.8 351.9 697.7 10031 29894 58333 00:08:43.334 9.7 161156844
6.21.2 None 91776 36.8 4.7 155.6 697.3 379.9 708.7 10055 29782 55942 00:08:24.882 9.4 162082088
6.22.1 None 91586 36.7 4.7 155.0 698.8 380.5 709.4 10140 29887 58244 00:08:29.153 9.5 161321740
6.22.1 DIO 90310 36.2 4.8 154.7 708.7 419.0 730.1 9227 28816 55513 00:06:22.790 7.1 160400436
6.23.0 None 89052 35.7 4.7 151.3 718.7 442.5 758.8 10263 29763 53874 00:04:40.429 5.2 160633196
6.23.0 DIO 88624 35.5 4.9 152.4 722.1 441.5 749.0 9319 28887 53792 00:04:53.783 5.5 158994508
6.24.0 None 90829 36.4 4.7 155.2 704.6 397.1 726.4 10359 29968 58160 00:07:01.849 7.9 160757048
6.24.0 DIO 90105 36.1 4.8 153.7 710.3 421.8 736.9 9344 28869 52676 00:05:22.128 6.0 160833572
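For this test, the Stall% column matches the Stall-time value divided by the 5400-second run duration. The sketch below reproduces the percentage from the HH:MM:SS.mmm string for three rows. The parsing helper is illustrative and not part of benchmark.sh.

```python
def stall_seconds(stall_time: str) -> float:
    """Convert an HH:MM:SS.mmm string into seconds."""
    hours, minutes, seconds = stall_time.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)

def stall_pct(stall_time: str, duration_sec: float) -> float:
    """Share of the run spent stalled, as a percentage."""
    return stall_seconds(stall_time) / duration_sec * 100.0

DURATION = 5400  # seconds, from DURATION=5400

# (Stall-time, reported Stall%) pairs copied from the overwrite table.
for stall_time, reported in [("00:07:11.226", 8.0),
                             ("00:06:22.790", 7.1),
                             ("00:04:40.429", 5.2)]:
    computed = stall_pct(stall_time, DURATION)
    print(f"{stall_time}: computed {computed:.1f}%, reported {reported}%")
```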

Test 6. Multi-threaded read and single-threaded write (benchmark.sh readwhilewriting)

NUM_KEYS=900000000 CACHE_SIZE=6442450944 DURATION=5400 MB_WRITE_PER_SEC=2 benchmark.sh readwhilewriting

Measure performance with one writer and multiple reader threads. The writes are rate limited.

Version Opts ops/sec mb/sec W-Amp W-MB/s usec/op p50 p75 p99 p99.9 p99.99 du -s -k
6.10.4 None 95649 30.5 30.2 19.7 669.1 608.1 834.5 3999 6597 17861 141636760
6.15.5 None 96223 30.6 28.0 18.7 665.1 609.4 835.1 3965 6475 15613 141339712
6.21.2 None 96891 30.7 29.1 18.9 660.5 607.1 833.4 3961 6465 13669 141208940
6.22.1 None 96812 30.7 28.4 18.5 661.1 606.5 832.8 3958 6500 15777 140972560
6.22.1 DIO 140635 44.5 32.5 14.9 455.1 400.4 543.4 2804 5051 9348 140465744
6.23.0 None 96811 30.6 29.1 18.9 661.1 607.3 835.3 3966 6433 13513 140384360
6.23.0 DIO 142410 44.9 29.6 13.5 449.4 394.0 539.3 2789 4598 8989 139961824
6.24.0 None 92974 29.5 27.6 19.0 688.3 624.8 869.7 4010 6436 14558 140384360
6.24.0 DIO 141491 44.7 32.7 15.0 452.3 395.8 540.8 2802 4867 9311 140255568

Test 7. Multi-threaded scan and single-threaded write (benchmark.sh fwdrangewhilewriting)

NUM_KEYS=900000000 CACHE_SIZE=6442450944 DURATION=5400 MB_WRITE_PER_SEC=2 benchmark.sh fwdrangewhilewriting

Measure performance with one writer and multiple iterator threads. The writes are rate limited.

Version Opts ops/sec mb/sec W-Amp W-MB/s usec/op p50 p75 p99 p99.9 p99.99 du -s -k
6.10.4 None 41827 167.5 17.6 8.0 1530.0 1329.2 1826.2 6212 13129 23244 142497356
6.15.5 None 42197 169.0 17.5 8.0 1516.6 1315.5 1817.8 6185 13018 23081 142157224
6.21.2 None 41360 165.7 18.2 8.0 1547.2 1354.1 1840.5 6257 13434 25280 141820100
6.22.1 None 41244 165.2 16.7 7.6 1551.6 1362.3 1845.8 6234 13346 24988 141716340
6.22.1 DIO 35962 144.0 17.0 7.4 1779.5 1538.8 2150.6 6455 9661 13264 141008104
6.23.0 None 41322 165.5 16.7 7.6 1584.7 1359.9 1838.4 6225 13285 24338 141101776
6.23.0 DIO 35968 144.1 17.0 7.4 1779.2 1536.8 2167.6 6446 9648 13218 140731428
6.24.0 None 38940 156.0 17.7 8.1 1643.4 1445.8 1970.8 6411 13480 21757 141724476
6.24.0 DIO 35955 144.0 17.1 7.5 1779.8 1534.8 2173.0 6427 9615 13093 140989196

Appendix

fio test results

$ fio --randrepeat=1 --ioengine=sync --direct=1 --gtod_reduce=1 --name=test --filename=/data/test_file --bs=4k --iodepth=64 --size=4G --readwrite=randread --numjobs=32 --group_reporting
test: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=sync, iodepth=64
...
fio-2.14
Starting 32 processes
Jobs: 3 (f=3): [_(3),r(1),_(1),E(1),_(10),r(1),_(13),r(1),E(1)] [100.0% done] [445.3MB/0KB/0KB /s] [114K/0/0 iops] [eta 00m:00s]
test: (groupid=0, jobs=32): err= 0: pid=28042: Fri Jul 24 01:36:19 2020
  read : io=131072MB, bw=469326KB/s, iops=117331, runt=285980msec
  cpu          : usr=1.29%, sys=3.26%, ctx=33585114, majf=0, minf=297
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=33554432/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
   READ: io=131072MB, aggrb=469325KB/s, minb=469325KB/s, maxb=469325KB/s, mint=285980msec, maxt=285980msec

Disk stats (read/write):
  nvme1n1: ios=33654742/61713, merge=0/40, ticks=8723764/89064, in_queue=8788592, util=100.00%
$ fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test --filename=/data/test_file --bs=4k --iodepth=64 --size=4G --readwrite=randread
test: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=64
fio-2.14
Starting 1 process
Jobs: 1 (f=1): [r(1)] [100.0% done] [456.3MB/0KB/0KB /s] [117K/0/0 iops] [eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=28385: Fri Jul 24 01:36:56 2020
  read : io=4096.0MB, bw=547416KB/s, iops=136854, runt=  7662msec
  cpu          : usr=22.20%, sys=48.81%, ctx=144112, majf=0, minf=73
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued    : total=r=1048576/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
   READ: io=4096.0MB, aggrb=547416KB/s, minb=547416KB/s, maxb=547416KB/s, mint=7662msec, maxt=7662msec

Disk stats (read/write):
  nvme1n1: ios=1050868/1904, merge=0/1, ticks=374836/2900, in_queue=370532, util=98.70%
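
The issued-read counts and runtimes in the two fio outputs above follow from the command-line parameters and the reported bandwidth. The sketch below re-derives them as a consistency check; the function names are illustrative.

```python
def issued_reads(size_bytes: int, block_bytes: int, num_jobs: int) -> int:
    """Total reads issued when each job reads size_bytes in block_bytes units."""
    return size_bytes // block_bytes * num_jobs

def runtime_ms(io_mb: float, bw_kb_per_sec: float) -> float:
    """Runtime implied by total I/O (MB) and bandwidth (KB/s)."""
    return io_mb * 1024 / bw_kb_per_sec * 1000

SIZE = 4 * 1024 ** 3   # --size=4G
BLOCK = 4 * 1024       # --bs=4k

# First run: ioengine=sync, numjobs=32. Second run: ioengine=libaio, one job.
print(issued_reads(SIZE, BLOCK, 32), round(runtime_ms(131072, 469326)))
print(issued_reads(SIZE, BLOCK, 1), round(runtime_ms(4096, 547416)))
```

The results line up with the "issued" and "runt" fields in each output, to within rounding of the reported bandwidth.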

Previous Results
