[ENH] Benchmark maxscore#6866

Open
Sicheng-Pan wants to merge 2 commits intohammad/maxscore_simdfrom
hammad/maxscore_benchmark
Conversation

Contributor

@Sicheng-Pan Sicheng-Pan commented Apr 10, 2026

Description of changes

This is PR #5 of the BlockMaxMaxScore series, stacked on hammad/maxscore_simd. It extends the existing Wikipedia SPLADE sparse vector benchmark to support BlockMaxMaxScore as an alternative to the existing Block-Max WAND algorithm.

  • New functionality
    • --block-maxscore flag: Builds and searches using the BlockMaxMaxScore index instead of WAND.
    • --sweep-terms mode: Builds both WAND and MaxScore indices, then runs queries at 5, 10, 15, ..., 40 max terms, printing a side-by-side latency comparison table with speedup ratios.
    • --max-terms <N>: Truncates each query to its top-N highest-weight terms before searching. Useful for studying the relationship between query complexity and latency.
    • --batch-size <N>: Configurable commit/flush batch size during indexing (default 65536).
    • --block-size default changed from 128 to 256 entries per posting block. Shared between WAND and MaxScore paths.
    • build_block_maxscore_index: Parallel document ingestion into BlockMaxMaxScore index with incremental fork-commit-flush loop, progress bar, and storage size measurement.
    • search_with_block_maxscore: Query loop with per-query timing, iteration support, and progress bar.
    • Dataset recycling: When --num-documents exceeds the dataset size, documents are recycled with unique IDs to reach the requested count.
    • Query term statistics: Prints min/median/avg/max query term counts on startup.
    • Storage size reporting: Both WAND and MaxScore index builds report on-disk storage size.
    • run_brute_force helper: Extracted from duplicated inline brute-force code for reuse by both algorithm paths.
  • Dataset change
    • wikipedia_splade.rs: Downloads all 7 train shards (~1M documents) instead of just the first shard (~142K), enabling benchmarks at larger scale.
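The --max-terms truncation described above can be sketched as follows. This is an illustrative assumption, not code from the PR: `truncate_query` and the `(term_id, weight)` pair representation are hypothetical names.

```rust
/// Hypothetical sketch of --max-terms: keep only the N highest-weight
/// terms of a sparse query before searching.
fn truncate_query(mut terms: Vec<(u32, f32)>, max_terms: usize) -> Vec<(u32, f32)> {
    // Sort by weight descending; ties broken arbitrarily.
    terms.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap_or(std::cmp::Ordering::Equal));
    terms.truncate(max_terms);
    terms
}
```

With this shape, the --sweep-terms mode amounts to calling the search path repeatedly with `max_terms` set to 5, 10, ..., 40 and comparing per-step latencies.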

Test plan

This is a benchmark binary, not a library — no unit tests. Verified manually:

  • --block-maxscore -n 65536 -m 256 -k 128: 100% recall, ~20x speedup over brute force
  • Default WAND mode still works unchanged
  • --sweep-terms prints comparison table
  • --max-terms 20 truncates queries correctly
  • --wand-only and --block-maxscore --wand-only profiling modes work

Migration plan

No migration needed. This only changes a benchmark example binary and a benchmark dataset loader.

Observability plan

No instrumentation changes. The benchmark itself prints detailed timing and recall metrics.

Documentation changes

No user-facing API changes. CLI usage is documented in the file's module-level doc comment.

Contributor Author

Sicheng-Pan commented Apr 10, 2026

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.

This stack of pull requests is managed by Graphite.

@github-actions

Reviewer Checklist

Please leverage this checklist to ensure your code review is thorough before approving

Testing, Bugs, Errors, Logs, Documentation

  • Can you think of any use case in which the code does not behave as intended? Have they been tested?
  • Can you think of any inputs or external events that could break the code? Is user input validated and safe? Have they been tested?
  • If appropriate, are there adequate property based tests?
  • If appropriate, are there adequate unit tests?
  • Should any logging, debugging, tracing information be added or removed?
  • Are error messages user-friendly?
  • Have all documentation changes needed been made?
  • Have all non-obvious changes been commented?

System Compatibility

  • Are there any potential impacts on other parts of the system or backward compatibility?
  • Does this change intersect with any items on our roadmap, and if so, is there a plan for fitting them together?

Quality

  • Is this code of unexpectedly high quality (readability, modularity, intuitiveness)?

@Sicheng-Pan Sicheng-Pan marked this pull request as ready for review April 10, 2026 03:12
@Sicheng-Pan Sicheng-Pan changed the title Add BlockMaxMaxScore mode to sparse vector benchmark [ENH] Benchmark maxscore Apr 10, 2026
@propel-code-bot
Contributor

propel-code-bot bot commented Apr 10, 2026

Add BlockMaxMaxScore benchmarking mode and term-sweep analysis to sparse benchmark

This PR significantly expands the benchmark example at rust/index/examples/sparse_vector_benchmark.rs to support comparing two sparse retrieval algorithms: the existing Block-Max WAND and the new BlockMaxMaxScore. It introduces new CLI options for algorithm selection, query truncation (--max-terms), cross-algorithm term sweep (--sweep-terms), and configurable indexing batch size (--batch-size), while preserving the existing WAND workflow.

It also scales the dataset loader in rust/benchmark/src/datasets/wikipedia_splade.rs from a single train shard to all 7 train shards, enabling larger benchmark runs. Additional benchmark ergonomics include storage size reporting, query-term statistics, reusable brute-force baseline logic, and dataset recycling when requested document count exceeds available data.

This summary was automatically generated by @propel-code-bot

@Sicheng-Pan Sicheng-Pan force-pushed the hammad/maxscore_benchmark branch from e874e94 to 460583d Compare April 10, 2026 03:13

@propel-code-bot propel-code-bot bot left a comment

One important logic issue was found that can crash benchmark execution on empty datasets and should be guarded before release.

Status: Changes Suggested | Risk: Medium

Issues Identified & Suggestions
  • Prevent divide-by-zero panic on empty documents; add early error guard: rust/index/examples/sparse_vector_benchmark.rs
Review Details

📁 2 files reviewed | 💬 1 comment

} else {
// Dataset has fewer documents than requested — recycle with unique IDs.
let base_len = raw_documents.len();
let repeats = args.num_documents.div_ceil(base_len);
Important

[Logic] This can panic when no documents are returned: base_len can be 0, then args.num_documents.div_ceil(base_len) divides by zero. That makes the new document-recycling path crash on empty/failed datasets.

Add an explicit guard for raw_documents.is_empty() and return an error with context before computing repeats.

File: rust/index/examples/sparse_vector_benchmark.rs, line 1014
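A minimal sketch of the suggested guard, with the error type simplified to String for illustration (the codebase would presumably use anyhow; `compute_repeats` is a hypothetical helper name):

```rust
/// Hypothetical helper: compute how many times the dataset must be
/// repeated to reach the requested document count, guarding against
/// base_len == 0 before div_ceil divides by it.
fn compute_repeats(base_len: usize, num_documents: usize) -> Result<usize, String> {
    if base_len == 0 {
        return Err("dataset returned no documents; cannot recycle to reach requested count".to_string());
    }
    Ok(num_documents.div_ceil(base_len))
}
```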

@Sicheng-Pan Sicheng-Pan force-pushed the hammad/maxscore_simd branch from d47e07c to 6920d76 Compare April 10, 2026 03:26
@Sicheng-Pan Sicheng-Pan force-pushed the hammad/maxscore_benchmark branch from 460583d to 1888806 Compare April 10, 2026 03:26
@Sicheng-Pan Sicheng-Pan force-pushed the hammad/maxscore_benchmark branch from 1888806 to 5bc0451 Compare April 10, 2026 17:18
@Sicheng-Pan Sicheng-Pan force-pushed the hammad/maxscore_simd branch from 6920d76 to 0e9eaa3 Compare April 10, 2026 17:18
@propel-code-bot propel-code-bot bot left a comment
Important reliability and logic issues were found that can cause runtime panics in new benchmark paths and should be fixed before merge.

Status: Changes Suggested | Risk: Medium

Issues Identified & Suggestions
  • Validate batch size >0 to prevent divide/chunk panics: rust/index/examples/sparse_vector_benchmark.rs
  • Replace task join unwraps with contextual error propagation: rust/index/examples/sparse_vector_benchmark.rs
Review Details

📁 2 files reviewed | 💬 2 comments



// Process documents in batches with write-commit-flush loop
let batch_size = 65536;
let batch_size = batch_size_override;
Important

[Logic] batch_size_override is used directly in div_ceil and chunks. If --batch-size 0 is passed, this will panic at runtime (division by zero / invalid chunk size).

Add CLI validation (value_parser with range 1..) or an explicit guard returning an error when batch_size_override == 0.

File: rust/index/examples/sparse_vector_benchmark.rs, line 308
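One possible guard, sketched with a plain String error for illustration (the clap `value_parser` range suggested above would achieve the same thing at parse time; `validate_batch_size` is a hypothetical name):

```rust
/// Hypothetical validation for the --batch-size flag: reject 0 before it
/// reaches div_ceil / chunks, both of which panic on a zero chunk size.
fn validate_batch_size(batch_size: usize) -> Result<usize, String> {
    if batch_size == 0 {
        return Err("--batch-size must be at least 1".to_string());
    }
    Ok(batch_size)
}
```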

})
.collect::<Vec<_>>();
for handle in handles {
handle.await.unwrap();
Important

[Reliability] handle.await.unwrap() will panic if any spawned indexing task panics/cancels, aborting the entire benchmark without context. This is now on the newly added MaxScore path.

Propagate join failures as anyhow errors with context instead of unwrapping, so failures are diagnosable and don't crash abruptly.

File: rust/index/examples/sparse_vector_benchmark.rs, line 669
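The idea can be sketched with std::thread as a stand-in. The PR actually uses tokio tasks, whose JoinHandle likewise yields a Result that can be propagated instead of unwrapped; the String error here stands in for anyhow, and `join_all` is a hypothetical name.

```rust
use std::thread;

/// Hypothetical replacement for `handle.await.unwrap()`: propagate join
/// failures as errors with context instead of panicking.
fn join_all(handles: Vec<thread::JoinHandle<()>>) -> Result<(), String> {
    for (i, handle) in handles.into_iter().enumerate() {
        handle
            .join()
            .map_err(|_| format!("indexing task {i} panicked or was cancelled"))?;
    }
    Ok(())
}
```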


@propel-code-bot propel-code-bot bot left a comment

One important logic issue was identified: an edge-case panic should be replaced with recoverable error handling before approval.

Status: Changes Suggested | Risk: Medium

Issues Identified & Suggestions
  • Replace empty-documents panic with anyhow recoverable error path: rust/index/examples/sparse_vector_benchmark.rs
Review Details

📁 2 files reviewed | 💬 1 comment

Ok((
temp_dir,
provider,
posting_writer_id.expect("Should have created at least one writer"),
Important

[Logic] posting_writer_id.expect("Should have created at least one writer") will panic when documents is empty (no chunks processed). That's a realistic edge case for small/failed dataset loads.

Return a proper anyhow error when no writer is created instead of panicking, so callers get a recoverable failure path.

File: rust/index/examples/sparse_vector_benchmark.rs, line 711
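A sketch of the recoverable path (String error for illustration; `require_writer_id` is a hypothetical name for wherever the Option is unwrapped):

```rust
/// Hypothetical replacement for the expect(): turn the missing-writer case
/// into a recoverable error instead of a panic.
fn require_writer_id(posting_writer_id: Option<u64>) -> Result<u64, String> {
    posting_writer_id.ok_or_else(|| {
        "no posting writer was created; the document set may be empty".to_string()
    })
}
```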

@Sicheng-Pan Sicheng-Pan force-pushed the hammad/maxscore_benchmark branch from 946139b to c800ddb Compare April 10, 2026 20:36
@Sicheng-Pan Sicheng-Pan force-pushed the hammad/maxscore_simd branch from ac08b0c to 014e4ed Compare April 10, 2026 20:36

@propel-code-bot propel-code-bot bot left a comment

Review found no issues; changes appear well-scoped and benchmark-focused with clear manual validation.

Status: No Issues Found | Risk: Low

Review Details

📁 2 files reviewed | 💬 0 comments
