@thomasjungblut thomasjungblut commented May 10, 2025

This PR adds a SIMD-based magic number search using AVX2. It reduces the sstable disk index search time from:

BenchmarkSSTableRandomReadByIndexTypes/disk-20 44634 26629 ns/op

to

BenchmarkSSTableRandomReadByIndexTypes/disk-20 56840 20364 ns/op

Not an insane improvement, but I think we can use this to scan the recordio file very quickly and determine the index offsets, which in turn allows a binary search or map lookup without additional seeking.

The SSE4.2 version can scan at 10-12 GB/s to find the magic numbers (up from 3 GB/s with a simple loop), so we can compute the lookup table very quickly.
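
For reference, here is a minimal scalar sketch of the idea, not the SIMD code in this PR: scan a buffer for every occurrence of a magic byte sequence and collect the offsets that a lookup table could be built from. The marker bytes and the names `hypotheticalMagic` and `findMagicOffsets` are assumptions made for illustration and are not taken from the recordio format or this repository.

```go
package main

import (
	"bytes"
	"fmt"
)

// hypotheticalMagic is a placeholder marker for illustration only;
// it is NOT the real recordio magic number.
var hypotheticalMagic = []byte{0x91, 0x8D, 0x4C}

// findMagicOffsets returns the byte offset of every non-overlapping
// occurrence of magic in buf, in ascending order.
func findMagicOffsets(buf, magic []byte) []int {
	var offsets []int
	if len(magic) == 0 {
		return offsets
	}
	for pos := 0; pos <= len(buf)-len(magic); {
		i := bytes.Index(buf[pos:], magic)
		if i < 0 {
			break
		}
		offsets = append(offsets, pos+i)
		// Continue searching right after the marker we just found.
		pos += i + len(magic)
	}
	return offsets
}

func main() {
	buf := []byte{0x91, 0x8D, 0x4C, 'a', 'b', 0x91, 0x8D, 0x4C, 'c'}
	fmt.Println(findMagicOffsets(buf, hypotheticalMagic)) // prints [0 5]
}
```

Note that a marker sequence can also appear inside a record payload by chance, so a real scanner would presumably have to validate each candidate offset (for example by checking the record header) before adding it to the table.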

@thomasjungblut thomasjungblut force-pushed the simd branch 5 times, most recently from 0da2e2d to c1fc63c on December 23, 2025 12:22
This implements SIMD based vectorized magic number search with SSE4.2,
AVX2 and AVX512.

Signed-off-by: Thomas Jungblut <[email protected]>
@thomasjungblut thomasjungblut marked this pull request as ready for review December 23, 2025 19:46