
Conversation

@biryukovmaxim biryukovmaxim commented Dec 17, 2025

This PR introduces a parallel implementation of multi_get that uses io_uring for concurrent I/O, which significantly reduces latency when keys are scattered across many SST files and the reads are large and I/O-bound (e.g., cold page cache or slow storage).
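
For context, here is a minimal, self-contained sketch of the io_uring pattern the batched read path relies on, written against the `io-uring` crate. This is an illustration only, not the PR's actual code: queue one read per file, submit them all with a single syscall, and let the kernel service them concurrently.

```rust
// Standalone illustration (not the PR's implementation) of concurrent reads
// via io_uring, using the `io-uring` crate.
use std::{fs::File, os::unix::io::AsRawFd};

use io_uring::{opcode, types, IoUring};

fn read_many(paths: &[&str], len: usize) -> std::io::Result<Vec<Vec<u8>>> {
    let files: Vec<File> = paths.iter().map(File::open).collect::<Result<_, _>>()?;
    let mut bufs = vec![vec![0u8; len]; files.len()];
    let mut ring = IoUring::new(files.len() as u32)?;

    // Queue a read for every file; nothing blocks yet.
    for (i, (file, buf)) in files.iter().zip(bufs.iter_mut()).enumerate() {
        let read = opcode::Read::new(types::Fd(file.as_raw_fd()), buf.as_mut_ptr(), len as u32)
            .offset(0)
            .build()
            .user_data(i as u64);
        unsafe { ring.submission().push(&read).expect("submission queue full") };
    }

    // One submit for all queued reads; wait until every completion arrives.
    ring.submit_and_wait(files.len())?;
    for cqe in ring.completion() {
        assert!(cqe.result() >= 0, "read failed: {}", cqe.result());
    }

    Ok(bufs)
}
```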

Benchmark Results

A new ignored benchmark test (multi_get_scattered_large_values_outperforms_single_gets) validates this:

Setup: 64 keys scattered across 64 SSTs, 1 MiB values, compression/filters/block cache disabled, page cache dropped before each run.
Results (on my machine, NVMe SSD; see the measurement sketch after this list):
Average sequential gets time: 67.76 ms
Average multi_get time: 27.39 ms
Speedup: ~2.5x
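
Conceptually, the measurement is a pair of timed passes over the same key set, each starting from a cold page cache. A rough, hypothetical outline is shown below; `Store` and `drop_caches` are stand-ins for the test's actual types and helpers, not its real code.

```rust
// Hypothetical outline of the comparison; `Store` stands in for the tree under
// test and `drop_caches` for the sudo helper described under Notes.
use std::time::{Duration, Instant};

trait Store {
    fn get(&self, key: &[u8]) -> Option<Vec<u8>>;
    fn multi_get(&self, keys: &[Vec<u8>]) -> Vec<Option<Vec<u8>>>;
}

fn compare<S: Store>(store: &S, keys: &[Vec<u8>], drop_caches: impl Fn()) -> (Duration, Duration) {
    drop_caches(); // start from a cold page cache
    let start = Instant::now();
    for key in keys {
        let _ = store.get(key); // one serialized read path per key
    }
    let sequential = start.elapsed();

    drop_caches(); // cold cache again for a fair comparison
    let start = Instant::now();
    let _ = store.multi_get(keys); // all reads submitted together
    let batched = start.elapsed();

    (sequential, batched)
}
```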

How to Run the Benchmark

This test requires root privileges (it drops caches via sudo) and is ignored by default. Run it in release mode for accurate performance numbers:
cargo test --color=always --package lsm-tree --test tree_multi_get sudo_required::multi_get_scattered_large_values_outperforms_single_gets --profile release -- --nocapture --ignored
Prefix the command with sudo if needed for cache dropping. Adjust num_ssts or value_size in the test for experimentation.
Notes

The test uses sudo in drop_caches() to ensure cold I/O; without it, the page cache masks the benefit.
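
For reference, dropping the Linux page cache is typically done by writing 3 to /proc/sys/vm/drop_caches. A plausible sketch of such a helper follows; the test's actual drop_caches() may differ.

```rust
// Plausible sketch of a cache-dropping helper; the test's real drop_caches()
// may differ. Requires root or passwordless sudo.
use std::process::Command;

fn drop_caches() {
    // `sync` flushes dirty pages first; writing 3 to drop_caches evicts the
    // page cache plus reclaimable dentries and inodes.
    let status = Command::new("sudo")
        .args(["sh", "-c", "sync; echo 3 > /proc/sys/vm/drop_caches"])
        .status()
        .expect("failed to run sudo");
    assert!(status.success(), "dropping caches failed");
}
```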

@marvin-j97 marvin-j97 marked this pull request as ready for review December 17, 2025 19:01
@marvin-j97 marvin-j97 marked this pull request as draft December 17, 2025 19:01
@biryukovmaxim biryukovmaxim marked this pull request as ready for review December 22, 2025 16:00