-
Notifications
You must be signed in to change notification settings - Fork 6.4k
Block cache analysis and simulation tools
RocksDB configures a certain amount of main memory as a block cache to accelerate data access. Understanding the efficiency of block cache is very important. The block cache analysis and simulation tools help a user to collect block cache access traces, analyze its access pattern, and evaluate alternative caching policies.
Quick Start
Tracing block cache accesses
Trace Format
Cache Simulations
Analyzing Block Cache Traces
db_bench supports tracing block cache accesses. This section demonstrates how to trace accesses when running db_bench. It also shows how to analyze and evaluate caching policies using the generated trace file.
Create a database:
./db_bench --benchmarks="fillseq" --key_size=20 --prefix_size=20 --keys_per_prefix=0 --value_size=100 --cache_index_and_filter_blocks --cache_size=1048576 --disable_auto_compactions=1 --disable_wal=1 --compression_type=none --min_level_to_compress=-1 --compression_ratio=1 --num=10000000
To trace block cache accesses when running readrandom
benchmark:
./db_bench --benchmarks="readrandom" --use_existing_db --duration=60 --key_size=20 --prefix_size=20 --keys_per_prefix=0 --value_size=100 --cache_index_and_filter_blocks --cache_size=1048576 --disable_auto_compactions=1 --disable_wal=1 --compression_type=none --min_level_to_compress=-1 --compression_ratio=1 --num=10000000 --threads=16 -block_cache_trace_file="/tmp/trace" -block_cache_trace_max_trace_file_size_in_bytes=1073741824 -block_cache_trace_sampling_frequency=1
Convert the trace file to human readable format:
./block_cache_trace_analyzer -block_cache_trace_path=/tmp/block_trace_test_example -human_readable_trace_file_path=/tmp/human_readable_block_trace_test_example
Evaluate alternative caching policies:
bash block_cache_pysim.sh /tmp/human_readable_block_trace_test_example /tmp/sim_results/bench 1 0 30
Plot graphs:
python block_cache_trace_analyzer_plot.py /tmp/sim_results /tmp/sim_results_graphs
RocksDB supports block cache tracing APIs StartBlockCacheTrace
and EndBlockCacheTrace
. When tracing starts, RocksDB logs detailed information of block cache accesses into a trace file. A user must specify a trace option and trace file path when start tracing block cache accesses.
A trace option contains max_trace_file_size
and sampling_frequency
.
-
max_trace_file_size
specifies the maximum size of the trace file. The tracing stops when the trace file size exceeds the specifiedmax_trace_file_size
. -
sampling_frequency
determines how frequent should RocksDB trace an access. RocksDB uses spatial downsampling such that it traces all accesses to sampled blocks. Asampling_frequency
of 1 means tracing all block cache accesses. Asampling_frequency
of 100 means tracing all accesses on ~1% blocks.
An example to start tracing block cache accesses:
Env* env = rocksdb::Env::Default();
EnvOptions env_options;
std::string trace_path = "/tmp/block_trace_test_example"
std::unique_ptr<TraceWriter> trace_writer;
DB* db = nullptr;
std::string db_name = "/tmp/rocksdb"
/*Create the trace file writer*/
NewFileTraceWriter(env, env_options, trace_path, &trace_writer);
DB::Open(options, dbname, &db);
/*Start tracing*/
db->StartBlockCacheTrace(trace_opt, std::move(trace_writer));
/* Your call of RocksDB APIs */
/*End tracing*/
db->EndBlockCacheTrace()
Supported simulators.
- RocksDB cache simulators.
- Python cache simulators.
Must first convert the binary trace file into human readable trace file.
Provides insights into how to improve a caching policy.
Contents
- RocksDB Wiki
- Overview
- RocksDB FAQ
- Terminology
- Requirements
- Contributors' Guide
- Release Methodology
- RocksDB Users and Use Cases
- RocksDB Public Communication and Information Channels
-
Basic Operations
- Iterator
- Prefix seek
- SeekForPrev
- Tailing Iterator
- Compaction Filter
- Multi Column Family Iterator
- Read-Modify-Write (Merge) Operator
- Column Families
- Creating and Ingesting SST files
- Single Delete
- Low Priority Write
- Time to Live (TTL) Support
- Transactions
- Snapshot
- DeleteRange
- Atomic flush
- Read-only and Secondary instances
- Approximate Size
- User-defined Timestamp
- Wide Columns
- BlobDB
- Online Verification
- Options
- MemTable
- Journal
- Cache
- Write Buffer Manager
- Compaction
- SST File Formats
- IO
- Compression
- Full File Checksum and Checksum Handoff
- Background Error Handling
- Huge Page TLB Support
- Tiered Storage (Experimental)
- Logging and Monitoring
- Known Issues
- Troubleshooting Guide
- Tests
- Tools / Utilities
-
Implementation Details
- Delete Stale Files
- Partitioned Index/Filters
- WritePrepared-Transactions
- WriteUnprepared-Transactions
- How we keep track of live SST files
- How we index SST
- Merge Operator Implementation
- RocksDB Repairer
- Write Batch With Index
- Two Phase Commit
- Iterator's Implementation
- Simulation Cache
- [To Be Deprecated] Persistent Read Cache
- DeleteRange Implementation
- unordered_write
- Extending RocksDB
- RocksJava
- Lua
- Performance
- Projects Being Developed
- Misc