Performance Benchmarks
These benchmarks measure RocksDB performance when data resides on flash storage.
All of the benchmarks are run on the same AWS instance. Here are the details of the test setup:
- Instance type: m5d.2xlarge (8 vCPUs, 32 GB memory, 1 x 300 GB NVMe SSD)
- Kernel version: Linux 4.14.177-139.253.amzn2.x86_64
- File System: XFS with discard enabled
To establish the raw performance of the SSD, we ran an fio test and observed 117K IOPS for 4 KB random reads (see the fio output near the end of this page for details).
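As a quick sanity check on the fio figure, IOPS and bandwidth are tied together by the block size. The sketch below converts the reported 117331 IOPS at a 4 KB block size into KB/s and compares it with the `bw=469326KB/s` line from the fio output. The helper name is illustrative only, and the code assumes that fio's "KB" here means 1024 bytes.

```python
def iops_to_kb_per_sec(iops: int, block_size_kb: int) -> int:
    """Bandwidth in KB/s implied by an IOPS figure at a fixed block size."""
    return iops * block_size_kb


# Figures taken from the 32-job fio run: iops=117331, bs=4k.
implied_kb_per_sec = iops_to_kb_per_sec(117331, 4)
reported_kb_per_sec = 469326  # "bw=" value printed by fio

print(implied_kb_per_sec)                        # 469324
print(reported_kb_per_sec - implied_kb_per_sec)  # 2 (rounding in the IOPS figure)
```

The two values agree to within rounding, so the 117K IOPS headline and the roughly 458 MB/s bandwidth describe the same measurement.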
All tests were run by executing benchmark.sh with the following parameters (unless otherwise specified): NUM_KEYS=900000000 CACHE_SIZE=6442450944. Long-running tests were run for 5400 seconds (DURATION=5400).
Unless explicitly specified, all other test parameters used their default values. DIO tests were run with the options "--use_direct_io_for_flush_and_compaction --use_direct_reads". Tests were executed sequentially against the same database instance. The db_bench tool was built via "make release".
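The raw parameter values are easier to reason about in human units. The following minimal sketch only restates the three values quoted above. It assumes CACHE_SIZE is given in bytes and DURATION in seconds, and the dictionary is just an illustration, not part of benchmark.sh.

```python
# Parameter values as passed to benchmark.sh in these runs.
params = {
    "NUM_KEYS": 900_000_000,
    "CACHE_SIZE": 6_442_450_944,  # assumed to be bytes
    "DURATION": 5400,             # assumed to be seconds
}

cache_gib = params["CACHE_SIZE"] / 1024**3
duration_min = params["DURATION"] / 60

print(f"keys: {params['NUM_KEYS'] / 1e6:.0f} million")  # keys: 900 million
print(f"block cache: {cache_gib:.1f} GiB")              # block cache: 6.0 GiB
print(f"duration: {duration_min:.0f} minutes")          # duration: 90 minutes
```

With 32 GB of memory on the instance, a 6 GiB block cache leaves most of the RAM to the OS page cache, which matters when comparing buffered and DIO runs.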
The following test sequence was executed:
NUM_KEYS=900000000 CACHE_SIZE=6442450944 benchmark.sh bulkload
Measure performance to load 900 million keys into the database. The keys are inserted in random order. The database is empty at the beginning of this benchmark run and gradually fills up. No data is being read when the data load is in progress.
Version | Opts | Time | ops/sec | mb/sec | usec/op | p50 | p75 | p99 | p99.9 | p99.99 | Stall-time | Stall% | du -s -k |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
6.10.4 | None | 4002 | 1062310 | 425.5 | 0.9 | 0.5 | 0.8 | 1 | 3 | 1013 | 00:02:24.652 | 17.2 | 101402936 |
6.15.5 | None | 4068 | 1045195 | 418.6 | 1.0 | 0.5 | 0.8 | 1 | 3 | 980 | 00:02:08.223 | 15.3 | 101401808 |
6.21.2 | None | 4141 | 1017259 | 407.5 | 0.9 | 0.5 | 0.8 | 2 | 3 | 556 | 00:01:32.279 | 11.0 | 101406320 |
6.22.1 | None | 4143 | 1013002 | 405.8 | 1.0 | 0.5 | 0.8 | 2 | 3 | 224 | 00:01:26.032 | 10.2 | 101405804 |
6.22.1 | DIO | 4058 | 1031703 | 413.2 | 1.0 | 0.5 | 0.8 | 2 | 3 | 125 | 00:01:23.019 | 9.9 | 101403424 |
6.23.0 | None | 4175 | 1015722 | 406.8 | 1.0 | 0.5 | 0.8 | 2 | 3 | 69 | 00:01:17.541 | 9.2 | 101405292 |
6.23.0 | DIO | 3885 | 1055232 | 422.7 | 0.9 | 0.5 | 0.8 | 2 | 3 | 21 | 00:00:52.360 | 6.2 | 101402116 |
6.24.0 | None | 3983 | 1025562 | 410.8 | 1.0 | 0.5 | 0.8 | 2 | 3 | 32 | 00:01:12.296 | 8.6 | 101406524 |
6.24.0 | DIO | 3880 | 1052049 | 421.4 | 1.0 | 0.5 | 0.8 | 2 | 3 | 22 | 00:01:05.862 | 7.8 | 101405064 |
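The `du -s -k` column can be turned into a database size and an average on-disk footprint per key. This is a rough sketch: it assumes `du -k` reports 1024-byte units and that all 900 million keys are present after the load. The function names are hypothetical.

```python
NUM_KEYS = 900_000_000


def du_kb_to_gib(kb: int) -> float:
    """Convert a `du -s -k` value (1024-byte units) to GiB."""
    return kb / 1024**2


def bytes_per_key(kb: int, num_keys: int) -> float:
    """Average on-disk bytes per key, including all file overhead."""
    return kb * 1024 / num_keys


du_kb = 101402936  # 6.10.4, Opts=None row of the bulkload table

print(round(du_kb_to_gib(du_kb), 1))             # 96.7
print(round(bytes_per_key(du_kb, NUM_KEYS), 1))  # 115.4
```

The size is nearly identical across versions and options in the bulkload table, so this figure applies to every row to within a fraction of a percent.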
NUM_KEYS=900000000 CACHE_SIZE=6442450944 DURATION=5400 benchmark.sh readrandom
Measure performance to randomly read existing keys. The database after bulkload was used as the starting point.
Version | Opts | ops/sec | mb/sec | usec/op | p50 | p75 | p99 | p99.9 | p99.99 |
---|---|---|---|---|---|---|---|---|---|
6.10.4 | None | 138496 | 35.1 | 462.1 | 617.1 | 764.3 | 1240 | 1295 | 2484 |
6.15.5 | None | 138513 | 35.1 | 462.0 | 616.9 | 764.2 | 1240 | 1295 | 3083 |
6.21.2 | None | 138633 | 35.1 | 461.6 | 616.8 | 764.1 | 1240 | 1295 | 2461 |
6.22.1 | None | 138623 | 35.1 | 461.7 | 616.8 | 763.9 | 1239 | 1295 | 2663 |
6.22.1 | DIO | 189237 | 47.9 | 338.2 | 430.0 | 540.7 | 854 | 960 | 1291 |
6.23.0 | None | 137664 | 34.9 | 464.9 | 618.4 | 766.5 | 1244 | 1295 | 2557 |
6.23.0 | DIO | 189252 | 47.9 | 338.2 | 430.0 | 540.7 | 854 | 926 | 1296 |
6.24.0 | None | 130852 | 33.1 | 489.1 | 632.6 | 791.4 | 1266 | 1297 | 2152 |
6.24.0 | DIO | 189240 | 47.9 | 338.2 | 430.0 | 540.7 | 854 | 930 | 1292 |
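The ops/sec and usec/op columns are not independent. If usec/op is read as the mean latency seen by each client thread, then by Little's law their product estimates the number of operations in flight. The sketch below applies that to two rows of the table. The interpretation of usec/op and the function name are assumptions, since the thread count is not stated on this page.

```python
def implied_concurrency(ops_per_sec: float, usec_per_op: float) -> float:
    """Little's law: in-flight operations = throughput * mean latency."""
    return ops_per_sec * usec_per_op / 1_000_000


rows = {
    "6.10.4 None": (138496, 462.1),
    "6.22.1 DIO": (189237, 338.2),
}

for label, (ops_per_sec, usec_per_op) in rows.items():
    print(label, round(implied_concurrency(ops_per_sec, usec_per_op), 1))
# 6.10.4 None 64.0
# 6.22.1 DIO 64.0
```

Both rows come out at 64, which would be consistent with a fixed pool of 64 reader threads. Under that reading, the DIO throughput gain comes entirely from lower per-operation latency.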
NUM_KEYS=900000000 CACHE_SIZE=6442450944 DURATION=5400 benchmark.sh multireadrandom --multiread_batched
Measure performance to randomly multi-get existing keys. The database after bulkload was used as the starting point.
Version | Opts | ops/sec | usec/op | p50 | p75 | p99 | p99.9 | p99.99 |
---|---|---|---|---|---|---|---|---|
6.10.4 | None | 138476 | 462.1 | 4583.0 | 5710.9 | 9298 | 9864 | n/a |
6.15.5 | None | 138507 | 462.0 | 4582.3 | 5710.2 | 9299 | 9867 | n/a |
6.21.2 | None | 138623 | 461.7 | 4577.4 | 5707.2 | 9294 | 9866 | n/a |
6.22.1 | None | 138660 | 461.6 | 4576.1 | 5706.3 | 9294 | 9867 | n/a |
6.22.1 | DIO | 189235 | 338.2 | 3410.7 | 4047.7 | 6406 | 6583 | n/a |
6.23.0 | None | 137638 | n/a | 4607.1 | 5736.4 | 9360 | 9869 | 17172 |
6.23.0 | DIO | 189237 | n/a | 3409.0 | 4047.0 | 6406 | 6584 | 8125 |
6.24.0 | None | 130859 | n/a | 4836.9 | 5995.1 | 9630 | 9880 | 14169 |
6.24.0 | DIO | 189234 | n/a | 3409.0 | 4047.2 | 6406 | 6584 | 7996 |
NUM_KEYS=900000000 CACHE_SIZE=6442450944 DURATION=5400 benchmark.sh fwdrange
Measure performance to randomly iterate over keys. The database after bulkload was used as the starting point.
Version | Opts | ops/sec | mb/sec | usec/op | p50 | p75 | p99 | p99.9 | p99.99 |
---|---|---|---|---|---|---|---|---|---|
6.10.4 | None | 70973 | 284.3 | 901.7 | 787.6 | 1438.0 | 1882 | 1899 | 9945 |
6.15.5 | None | 70978 | 284.3 | 901.7 | 787.7 | 1437.5 | 1882 | 1899 | 9890 |
6.21.2 | None | 70967 | 284.3 | 901.8 | 787.8 | 1437.3 | 1882 | 1899 | 10253 |
6.22.1 | None | 70971 | 284.3 | 901.7 | 787.8 | 1437.2 | 1882 | 1899 | 10274 |
6.22.1 | DIO | 78829 | 315.7 | 811.9 | 870.3 | 1090.5 | 1411 | 1859 | 2618 |
6.23.0 | None | 70459 | 282.2 | 908.3 | 789.6 | 1443.0 | 1883 | 1899 | 10273 |
6.23.0 | DIO | 78829 | 315.7 | 811.9 | 870.5 | 1090.1 | 1356 | 1855 | 2596 |
6.24.0 | None | 67004 | 268.4 | 955.1 | 802.8 | 1480.1 | 1884 | 1899 | 7216 |
6.24.0 | DIO | 78832 | 315.8 | 811.8 | 870.4 | 1090.2 | 1376 | 1856 | 2582 |
NUM_KEYS=900000000 CACHE_SIZE=6442450944 DURATION=5400 benchmark.sh overwrite
Measure performance when randomly overwriting keys in the database. The database left by the previous benchmarks was used as the starting point.
Version | Opts | ops/sec | mb/sec | W-Amp | W-MB/s | usec/op | p50 | p75 | p99 | p99.9 | p99.99 | Stall-time | Stall% | du -s -k |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
6.10.4 | None | 94539 | 37.9 | 4.7 | 161.9 | 676.9 | 328.4 | 700.4 | 10022 | 29843 | 56548 | 00:07:11.226 | 8.0 | 162965216 |
6.15.5 | None | 92911 | 37.2 | 4.7 | 158.4 | 688.8 | 351.9 | 697.7 | 10031 | 29894 | 58333 | 00:08:43.334 | 9.7 | 161156844 |
6.21.2 | None | 91776 | 36.8 | 4.7 | 155.6 | 697.3 | 379.9 | 708.7 | 10055 | 29782 | 55942 | 00:08:24.882 | 9.4 | 162082088 |
6.22.1 | None | 91586 | 36.7 | 4.7 | 155.0 | 698.8 | 380.5 | 709.4 | 10140 | 29887 | 58244 | 00:08:29.153 | 9.5 | 161321740 |
6.22.1 | DIO | 90310 | 36.2 | 4.8 | 154.7 | 708.7 | 419.0 | 730.1 | 9227 | 28816 | 55513 | 00:06:22.790 | 7.1 | 160400436 |
6.23.0 | None | 89052 | 35.7 | 4.7 | 151.3 | 718.7 | 442.5 | 758.8 | 10263 | 29763 | 53874 | 00:04:40.429 | 5.2 | 160633196 |
6.23.0 | DIO | 88624 | 35.5 | 4.9 | 152.4 | 722.1 | 441.5 | 749.0 | 9319 | 28887 | 53792 | 00:04:53.783 | 5.5 | 158994508 |
6.24.0 | None | 90829 | 36.4 | 4.7 | 155.2 | 704.6 | 397.1 | 726.4 | 10359 | 29968 | 58160 | 00:07:01.849 | 7.9 | 160757048 |
6.24.0 | DIO | 90105 | 36.1 | 4.8 | 153.7 | 710.3 | 421.8 | 736.9 | 9344 | 28869 | 52676 | 00:05:22.128 | 6.0 | 160833572 |
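For this time-bounded test, the Stall% column matches the stall time divided by the run duration. The sketch below reproduces two rows from the Stall-time column and DURATION=5400. The helper names are illustrative. The relationship is inferred from the numbers in the table rather than taken from the benchmark scripts.

```python
def hms_to_seconds(text: str) -> float:
    """Parse an HH:MM:SS.mmm string into seconds."""
    hours, minutes, seconds = text.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def stall_percent(stall_time: str, duration_sec: float) -> float:
    """Share of the run spent stalled, as a percentage."""
    return 100 * hms_to_seconds(stall_time) / duration_sec


DURATION = 5400  # seconds, as passed to benchmark.sh

print(round(stall_percent("00:07:11.226", DURATION), 1))  # 8.0 (6.10.4, None)
print(round(stall_percent("00:04:40.429", DURATION), 1))  # 5.2 (6.23.0, None)
```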
NUM_KEYS=900000000 CACHE_SIZE=6442450944 DURATION=5400 MB_WRITE_PER_SEC=2 benchmark.sh readwhilewriting
Measure performance with one writer and multiple reader threads. The writes are rate limited.
Version | Opts | ops/sec | mb/sec | W-Amp | W-MB/s | usec/op | p50 | p75 | p99 | p99.9 | p99.99 | du -s -k |
---|---|---|---|---|---|---|---|---|---|---|---|---|
6.10.4 | None | 95649 | 30.5 | 30.2 | 19.7 | 669.1 | 608.1 | 834.5 | 3999 | 6597 | 17861 | 141636760 |
6.15.5 | None | 96223 | 30.6 | 28.0 | 18.7 | 665.1 | 609.4 | 835.1 | 3965 | 6475 | 15613 | 141339712 |
6.21.2 | None | 96891 | 30.7 | 29.1 | 18.9 | 660.5 | 607.1 | 833.4 | 3961 | 6465 | 13669 | 141208940 |
6.22.1 | None | 96812 | 30.7 | 28.4 | 18.5 | 661.1 | 606.5 | 832.8 | 3958 | 6500 | 15777 | 140972560 |
6.22.1 | DIO | 140635 | 44.5 | 32.5 | 14.9 | 455.1 | 400.4 | 543.4 | 2804 | 5051 | 9348 | 140465744 |
6.23.0 | None | 96811 | 30.6 | 29.1 | 18.9 | 661.1 | 607.3 | 835.3 | 3966 | 6433 | 13513 | 140384360 |
6.23.0 | DIO | 142410 | 44.9 | 29.6 | 13.5 | 449.4 | 394.0 | 539.3 | 2789 | 4598 | 8989 | 139961824 |
6.24.0 | None | 92974 | 29.5 | 27.6 | 19.0 | 688.3 | 624.8 | 869.7 | 4010 | 6436 | 14558 | 140384360 |
6.24.0 | DIO | 141491 | 44.7 | 32.7 | 15.0 | 452.3 | 395.8 | 540.8 | 2802 | 4867 | 9311 | 140255568 |
NUM_KEYS=900000000 CACHE_SIZE=6442450944 DURATION=5400 MB_WRITE_PER_SEC=2 benchmark.sh fwdrangewhilewriting
Measure performance with one writer and multiple iterator threads. The writes are rate limited.
Version | Opts | ops/sec | mb/sec | W-Amp | W-MB/s | usec/op | p50 | p75 | p99 | p99.9 | p99.99 | du -s -k |
---|---|---|---|---|---|---|---|---|---|---|---|---|
6.10.4 | None | 41827 | 167.5 | 17.6 | 8.0 | 1530.0 | 1329.2 | 1826.2 | 6212 | 13129 | 23244 | 142497356 |
6.15.5 | None | 42197 | 169.0 | 17.5 | 8.0 | 1516.6 | 1315.5 | 1817.8 | 6185 | 13018 | 23081 | 142157224 |
6.21.2 | None | 41360 | 165.7 | 18.2 | 8.0 | 1547.2 | 1354.1 | 1840.5 | 6257 | 13434 | 25280 | 141820100 |
6.22.1 | None | 41244 | 165.2 | 16.7 | 7.6 | 1551.6 | 1362.3 | 1845.8 | 6234 | 13346 | 24988 | 141716340 |
6.22.1 | DIO | 35962 | 144.0 | 17.0 | 7.4 | 1779.5 | 1538.8 | 2150.6 | 6455 | 9661 | 13264 | 141008104 |
6.23.0 | None | 41322 | 165.5 | 16.7 | 7.6 | 1584.7 | 1359.9 | 1838.4 | 6225 | 13285 | 24338 | 141101776 |
6.23.0 | DIO | 35968 | 144.1 | 17.0 | 7.4 | 1779.2 | 1536.8 | 2167.6 | 6446 | 9648 | 13218 | 140731428 |
6.24.0 | None | 38940 | 156.0 | 17.7 | 8.1 | 1643.4 | 1445.8 | 1970.8 | 6411 | 13480 | 21757 | 141724476 |
6.24.0 | DIO | 35955 | 144.0 | 17.1 | 7.5 | 1779.8 | 1534.8 | 2173.0 | 6427 | 9615 | 13093 | 140989196 |
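In this test DIO trades throughput for tail latency, which is easier to see as relative changes. The following sketch compares the two 6.22.1 rows of the table above. The function and dictionary names are illustrative only.

```python
def percent_change(baseline: float, candidate: float) -> float:
    """Relative change of candidate versus baseline, in percent."""
    return 100 * (candidate - baseline) / baseline


# 6.22.1 rows from the fwdrangewhilewriting table.
buffered = {"ops/sec": 41244, "p99.99": 24988}
direct_io = {"ops/sec": 35962, "p99.99": 13264}

for metric in buffered:
    change = percent_change(buffered[metric], direct_io[metric])
    print(f"{metric}: {change:+.1f}%")
# ops/sec: -12.8%
# p99.99: -46.9%
```

The same helper can be applied to any pair of rows, for example to compare 6.24.0 against 6.22.1 with the same options.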
fio test results
$ fio --randrepeat=1 --ioengine=sync --direct=1 --gtod_reduce=1 --name=test --filename=/data/test_file --bs=4k --iodepth=64 --size=4G --readwrite=randread --numjobs=32 --group_reporting
test: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=sync, iodepth=64
...
fio-2.14
Starting 32 processes
Jobs: 3 (f=3): [_(3),r(1),_(1),E(1),_(10),r(1),_(13),r(1),E(1)] [100.0% done] [445.3MB/0KB/0KB /s] [114K/0/0 iops] [eta 00m:00s]
test: (groupid=0, jobs=32): err= 0: pid=28042: Fri Jul 24 01:36:19 2020
read : io=131072MB, bw=469326KB/s, iops=117331, runt=285980msec
cpu : usr=1.29%, sys=3.26%, ctx=33585114, majf=0, minf=297
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued : total=r=33554432/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
latency : target=0, window=0, percentile=100.00%, depth=64
Run status group 0 (all jobs):
READ: io=131072MB, aggrb=469325KB/s, minb=469325KB/s, maxb=469325KB/s, mint=285980msec, maxt=285980msec
Disk stats (read/write):
nvme1n1: ios=33654742/61713, merge=0/40, ticks=8723764/89064, in_queue=8788592, util=100.00%
$ fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test --filename=/data/test_file --bs=4k --iodepth=64 --size=4G --readwrite=randread
test: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=64
fio-2.14
Starting 1 process
Jobs: 1 (f=1): [r(1)] [100.0% done] [456.3MB/0KB/0KB /s] [117K/0/0 iops] [eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=28385: Fri Jul 24 01:36:56 2020
read : io=4096.0MB, bw=547416KB/s, iops=136854, runt= 7662msec
cpu : usr=22.20%, sys=48.81%, ctx=144112, majf=0, minf=73
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
issued : total=r=1048576/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
latency : target=0, window=0, percentile=100.00%, depth=64
Run status group 0 (all jobs):
READ: io=4096.0MB, aggrb=547416KB/s, minb=547416KB/s, maxb=547416KB/s, mint=7662msec, maxt=7662msec
Disk stats (read/write):
nvme1n1: ios=1050868/1904, merge=0/1, ticks=374836/2900, in_queue=370532, util=98.70%
- June 2020: RocksDB 6.10.0
- July 2018: Performance Benchmark 201807
- 2014: Performance Benchmark 2014