Conversation
Warning: This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
This stack of pull requests is managed by Graphite. Learn more about stacking.
Reviewer Checklist
Please leverage this checklist to ensure your code review is thorough before approving:
- Testing, Bugs, Errors, Logs, Documentation
- System Compatibility
- Quality
This PR significantly expands the benchmark example at rust/index/examples/sparse_vector_benchmark.rs. It also scales up the dataset loader in wikipedia_splade.rs. This summary was automatically generated by @propel-code-bot.
Force-pushed e874e94 to 460583d
One important logic issue was found that can crash benchmark execution on empty datasets and should be guarded before release.
Status: Changes Suggested | Risk: Medium
Issues Identified & Suggestions
- Prevent divide-by-zero panic on empty documents; add an early error guard in rust/index/examples/sparse_vector_benchmark.rs
Review Details
📁 2 files reviewed | 💬 1 comment
```rust
} else {
    // Dataset has fewer documents than requested — recycle with unique IDs.
    let base_len = raw_documents.len();
    let repeats = args.num_documents.div_ceil(base_len);
```
[Logic] This can panic when no documents are returned: base_len can be 0, then args.num_documents.div_ceil(base_len) divides by zero. That makes the new document-recycling path crash on empty/failed datasets.
Add an explicit guard for raw_documents.is_empty() and return an error with context before computing repeats.
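A minimal sketch of the suggested guard, extracted into a standalone function. It assumes the values from the snippet above (`raw_documents.len()` and `args.num_documents`); the real benchmark would return an `anyhow::Error`, while plain `String` keeps this sketch std-only:

```rust
// Hypothetical standalone version of the document-recycling computation,
// with the suggested empty-dataset guard applied before `div_ceil`.
// In the benchmark this would use `anyhow::bail!` instead of a String error.
fn compute_repeats(raw_len: usize, num_documents: usize) -> Result<usize, String> {
    if raw_len == 0 {
        return Err(format!(
            "dataset loader returned no documents; cannot recycle up to {num_documents}"
        ));
    }
    // Safe now: `div_ceil` panics on a zero divisor, which the guard rules out.
    Ok(num_documents.div_ceil(raw_len))
}
```

Returning an error here surfaces a failed or empty dataset load with context instead of crashing the recycling path.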
File: rust/index/examples/sparse_vector_benchmark.rs
Line: 101
Force-pushed 4d47e07c to 6920d76
Force-pushed 460583d to 1888806
Force-pushed 1888806 to 5bc0451
Force-pushed 6920d76 to 0e9eaa3
Important reliability and logic issues were found that can cause runtime panics in new benchmark paths and should be fixed before merge.
Status: Changes Suggested | Risk: Medium
Issues Identified & Suggestions
- Validate batch size > 0 to prevent divide/chunk panics: rust/index/examples/sparse_vector_benchmark.rs
- Replace task join unwraps with contextual error propagation: rust/index/examples/sparse_vector_benchmark.rs
Review Details
📁 2 files reviewed | 💬 2 comments
```diff
  // Process documents in batches with write-commit-flush loop
- let batch_size = 65536;
+ let batch_size = batch_size_override;
```
[Logic] batch_size_override is used directly in div_ceil and chunks. If --batch-size 0 is passed, this will panic at runtime (division by zero / invalid chunk size).
Add CLI validation (value_parser with range 1..) or an explicit guard returning an error when batch_size_override == 0.
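A sketch of the suggested runtime check, as a std-only function (`batch_size_override` is assumed from the surrounding code). In the real binary, clap's `value_parser!(u64).range(1..)` could reject zero at parse time instead:

```rust
// Hypothetical guard for the `--batch-size` override: both `div_ceil` and
// `chunks(batch_size_override)` panic when given zero, so reject it up front
// and return how many batches the indexing loop will run.
fn num_batches(total_docs: usize, batch_size_override: usize) -> Result<usize, String> {
    if batch_size_override == 0 {
        return Err("--batch-size must be at least 1".to_string());
    }
    Ok(total_docs.div_ceil(batch_size_override))
}
```

Parse-time validation via clap is preferable since the user sees the error before any dataset loading starts; the explicit guard is the fallback if the CLI layer is left as-is.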
File: rust/index/examples/sparse_vector_benchmark.rs
Line: 308

```rust
})
.collect::<Vec<_>>();
for handle in handles {
    handle.await.unwrap();
```
[Reliability] handle.await.unwrap() will panic if any spawned indexing task panics/cancels, aborting the entire benchmark without context. This is now on the newly added MaxScore path.
Propagate join failures as anyhow errors with context instead of unwrapping, so failures are diagnosable and don't crash abruptly.
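A sketch of the suggested fix using std threads, since the shape is the same: map the join failure into a contextual error instead of calling `unwrap()`. The benchmark itself awaits tokio `JoinHandle`s, whose `JoinError` can be wrapped the same way (e.g. with anyhow's `.context(...)`); `String` stands in for `anyhow::Error` here:

```rust
use std::thread;

// Hypothetical join loop: propagate a panicked/failed worker as a contextual
// error identifying which task failed, rather than aborting the whole
// benchmark with an uninformative panic.
fn join_all(handles: Vec<thread::JoinHandle<()>>) -> Result<(), String> {
    for (i, handle) in handles.into_iter().enumerate() {
        handle
            .join()
            .map_err(|_| format!("indexing task {i} panicked; benchmark run aborted"))?;
    }
    Ok(())
}
```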
File: rust/index/examples/sparse_vector_benchmark.rs
Line: 669
Force-pushed 0e9eaa3 to ac08b0c
Force-pushed 5bc0451 to 946139b
One important logic issue was identified: an edge-case panic should be replaced with recoverable error handling before approval.
Status: Changes Suggested | Risk: Medium
Issues Identified & Suggestions
- Replace the empty-documents panic with a recoverable anyhow error path: rust/index/examples/sparse_vector_benchmark.rs
Review Details
📁 2 files reviewed | 💬 1 comment
```rust
Ok((
    temp_dir,
    provider,
    posting_writer_id.expect("Should have created at least one writer"),
```
[Logic] posting_writer_id.expect("Should have created at least one writer") will panic when documents is empty (no chunks processed). That's a realistic edge case for small/failed dataset loads.
Return a proper anyhow error when no writer is created instead of panicking, so callers get a recoverable failure path.
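A sketch of the suggested conversion, std-only (`posting_writer_id` as in the snippet above, its `u64` type an assumption; `String` stands in for the `anyhow::anyhow!` error the benchmark would produce):

```rust
// Hypothetical replacement for the `.expect(...)`: turn a missing writer id
// into a recoverable error so callers can report an empty/failed dataset load
// instead of panicking mid-setup.
fn require_writer_id(posting_writer_id: Option<u64>) -> Result<u64, String> {
    posting_writer_id.ok_or_else(|| {
        "no posting writer was created: the document set was empty, so no chunks were processed"
            .to_string()
    })
}
```

With `?` at the call site, the `Ok((temp_dir, provider, ...))` tuple construction stays otherwise unchanged.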
File: rust/index/examples/sparse_vector_benchmark.rs
Line: 711
Force-pushed 946139b to c800ddb
Force-pushed ac08b0c to 014e4ed

Description of changes
This is PR #5 of the BlockMaxMaxScore series, stacked on hammad/maxscore_simd. It extends the existing Wikipedia SPLADE sparse vector benchmark to support BlockMaxMaxScore as an alternative to the existing Block-Max WAND algorithm.
- `--block-maxscore` flag: builds and searches using the BlockMaxMaxScore index instead of WAND.
- `--sweep-terms` mode: builds both WAND and MaxScore indices, then runs queries at 5, 10, 15, ..., 40 max terms, printing a side-by-side latency comparison table with speedup ratios.
- `--max-terms <N>`: truncates each query to its top-N highest-weight terms before searching. Useful for studying the relationship between query complexity and latency.
- `--batch-size <N>`: configurable commit/flush batch size during indexing (default 65536).
- `--block-size` default changed from 128 to 256 entries per posting block. Shared between the WAND and MaxScore paths.
- `build_block_maxscore_index`: parallel document ingestion into the BlockMaxMaxScore index with an incremental fork-commit-flush loop, progress bar, and storage size measurement.
- `search_with_block_maxscore`: query loop with per-query timing, iteration support, and progress bar.
- When `--num-documents` exceeds the dataset size, documents are recycled with unique IDs to reach the requested count.
- `run_brute_force` helper: extracted from duplicated inline brute-force code for reuse by both algorithm paths.
- `wikipedia_splade.rs`: downloads all 7 train shards (~1M documents) instead of just the first shard (~142K), enabling benchmarks at larger scale.
Test plan
This is a benchmark binary, not a library — no unit tests. Verified manually:
- `--block-maxscore -n 65536 -m 256 -k 128`: 100% recall, ~20x speedup over brute force
- `--sweep-terms` prints the comparison table
- `--max-terms 20` truncates queries correctly
- `--wand-only` and `--block-maxscore --wand-only` profiling modes work
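The `--max-terms` truncation exercised above amounts to keeping a query's N highest-weight terms. A std-only sketch, where the `(term_id, weight)` pair representation is an assumption about the sparse query format, not the benchmark's actual types:

```rust
// Keep only the `max_terms` highest-weight entries of a sparse query,
// here represented as (term_id, weight) pairs: sort descending by weight,
// then truncate to the requested term count.
fn truncate_query(mut query: Vec<(u32, f32)>, max_terms: usize) -> Vec<(u32, f32)> {
    query.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap_or(std::cmp::Ordering::Equal));
    query.truncate(max_terms);
    query
}
```

Fewer query terms means fewer posting lists to traverse, which is what the `--sweep-terms` latency table measures as max terms grows from 5 to 40.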
No migration needed. This only changes a benchmark example binary and a benchmark dataset loader.
Observability plan
No instrumentation changes. The benchmark itself prints detailed timing and recall metrics.
Documentation Changes
No user-facing API changes. CLI usage is documented in the file's module-level doc comment.