Skip to content

Refactor of diskann-benchmarks to make it potentially easier to spin up new backends#1142

Open
JordanMaples wants to merge 5 commits into
mainfrom
jordanmaples/benchmarks-refactor-v2
Open

Refactor of diskann-benchmarks to make it potentially easier to spin up new backends#1142
JordanMaples wants to merge 5 commits into
mainfrom
jordanmaples/benchmarks-refactor-v2

Conversation

@JordanMaples

Copy link
Copy Markdown
Contributor

Refactors the diskann-benchmark infrastructure to improve modularity, reduce duplication, and privatize input struct fields behind accessor methods.

  • Introduced StreamRunner trait and shared run_streaming helper to unify full-precision and bf-tree streaming benchmark execution, eliminating duplicatedrunbook-driving logic.
  • Extracted build_streamer helper for constructing the streaming stack (index + strategy + maintainer).
  • Renamed Dynamic → Streaming throughout (e.g., BfTreeDynamicRun → BfTreeStreamingRun, DynamicRunbookParams → StreamingRunbookParams) for clarity.

Input field privatization:

  • Made input struct fields private with purpose-specific accessor methods, forcing backends to explicitly handle each field and reject invalidconfigurations.

@codecov-commenter

codecov-commenter commented Jun 9, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 92.50936% with 20 lines in your changes missing coverage. Please review.
✅ Project coverage is 89.44%. Comparing base (9b0ae29) to head (4d1b44e).

Files with missing lines Patch % Lines
diskann-benchmark/src/index/benchmarks.rs 86.00% 7 Missing ⚠️
diskann-benchmark/src/index/streaming/mod.rs 90.54% 7 Missing ⚠️
diskann-benchmark/src/index/streaming/runner.rs 94.93% 4 Missing ⚠️
diskann-benchmark/src/index/inmem/streaming.rs 97.43% 1 Missing ⚠️
diskann-benchmark/src/inputs/graph_index.rs 94.73% 1 Missing ⚠️
Additional details and impacted files

Impacted file tree graph

@@           Coverage Diff           @@
##             main    #1142   +/-   ##
=======================================
  Coverage   89.43%   89.44%           
=======================================
  Files         484      486    +2     
  Lines       91495    91565   +70     
=======================================
+ Hits        81829    81900   +71     
+ Misses       9666     9665    -1     
Flag Coverage Δ
miri 89.44% <92.50%> (+<0.01%) ⬆️
unittests 89.09% <92.50%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
diskann-benchmark/src/backend/mod.rs 100.00% <ø> (ø)
diskann-benchmark/src/index/build.rs 85.92% <ø> (ø)
diskann-benchmark/src/index/inmem/product.rs 100.00% <100.00%> (ø)
diskann-benchmark/src/index/inmem/scalar.rs 100.00% <100.00%> (ø)
diskann-benchmark/src/index/inmem/spherical.rs 100.00% <100.00%> (ø)
diskann-benchmark/src/index/mod.rs 100.00% <ø> (ø)
diskann-benchmark/src/index/result.rs 49.03% <ø> (ø)
diskann-benchmark/src/index/search/knn.rs 73.61% <ø> (ø)
diskann-benchmark/src/index/search/plugins.rs 66.00% <ø> (ø)
diskann-benchmark/src/index/search/range.rs 0.00% <ø> (ø)
... and 9 more
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

JordanMaples and others added 2 commits June 9, 2026 12:46
Merge BfTreeFullPrecisionBuild + BfTreeSphericalBuild into BfTreeBuild,
and BfTreeStreamingRun + BfTreeSphericalStreamingRun into a single
BfTreeStreamingRun. Both structs now carry a QuantConfig enum that
discriminates between full-precision (None) and spherical quantization.

- Add QuantConfig enum with tagged serde (kind: none | spherical)
- Consolidate quantizer training into quantizer_util::build_quantizer
- Update benchmark callers to check QuantConfig variant in try_match
- Update example JSON files with new unified tags and quantization field
- Add Clone to exhaustive::TransformKind

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@JordanMaples JordanMaples force-pushed the jordanmaples/benchmarks-refactor-v2 branch from 1875da1 to 28d08fa Compare June 9, 2026 21:41
@JordanMaples JordanMaples marked this pull request as ready for review June 9, 2026 21:47
@JordanMaples JordanMaples requested review from a team and Copilot June 9, 2026 21:47

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR refactors the diskann-benchmark “graph index” benchmark stack to improve modularity and reduce duplication, primarily by introducing a shared streaming runner/managed layer and by renaming “Dynamic” concepts to “Streaming”. It also restructures benchmark registration so index benchmarks live under a new crate::index module rather than crate::backend::index.

Changes:

  • Added a reusable streaming execution path (StreamRunner, Managed, build_streamer, run_streaming) shared across in-mem and bf-tree streaming benchmarks.
  • Renamed Dynamic → Streaming for runbook params and streaming runs, updating example inputs and integration tests accordingly.
  • Consolidated bf-tree build/stream inputs to support either “none” or “spherical” quantization via a single QuantConfig.

Reviewed changes

Copilot reviewed 36 out of 39 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
diskann-benchmark/src/main.rs Registers new index benchmark module; updates integration tests and example input filenames.
diskann-benchmark/src/inputs/graph_index.rs Renames Dynamic→Streaming types; privatizes some input fields and adds accessors; updates tags.
diskann-benchmark/src/inputs/exhaustive.rs Derives Clone for TransformKind for reuse in new config structures.
diskann-benchmark/src/inputs/bftree.rs Unifies bf-tree inputs with QuantConfig for build + streaming modes; updates validation/display.
diskann-benchmark/src/index/streaming/stats.rs Rehomes imports under crate::index::* after refactor.
diskann-benchmark/src/index/streaming/runner.rs Introduces StreamRunner + Maintainer abstraction used by streaming benchmarks.
diskann-benchmark/src/index/streaming/mod.rs Adds shared build_streamer and run_streaming helpers for runbook-driven streaming.
diskann-benchmark/src/index/streaming/managed.rs Adds the Managed ID→slot layer and SlotReclaim policy for streaming backends.
diskann-benchmark/src/index/search/range.rs Updates imports to new crate::index module structure.
diskann-benchmark/src/index/search/plugins.rs Updates imports to new crate::index module structure.
diskann-benchmark/src/index/search/mod.rs New module wrapper for search submodules.
diskann-benchmark/src/index/search/knn.rs Updates imports to new crate::index module structure.
diskann-benchmark/src/index/result.rs Updates imports to new crate::index module structure.
diskann-benchmark/src/index/mod.rs New top-level index module with benchmark registration entrypoint.
diskann-benchmark/src/index/inmem/streaming.rs Adds in-mem streaming maintenance implementation (DropDeleted + Release).
diskann-benchmark/src/index/inmem/spherical.rs Adjusts to new module layout and new accessor-based input APIs.
diskann-benchmark/src/index/inmem/scalar.rs Adjusts to new module layout and new accessor-based input APIs.
diskann-benchmark/src/index/inmem/product.rs Adjusts to new module layout and new accessor-based input APIs.
diskann-benchmark/src/index/inmem/mod.rs New inmem module wrapper; exports InmemMaintainer.
diskann-benchmark/src/index/build.rs New shared build helpers (insert variants, start points, progress meter, save/load).
diskann-benchmark/src/index/bftree/spherical.rs Refactors bf-tree spherical build benchmark to use QuantConfig and shared quantizer helper.
diskann-benchmark/src/index/bftree/spherical_streaming.rs Adds bf-tree spherical streaming benchmark using shared streaming helpers.
diskann-benchmark/src/index/bftree/quantizer_util.rs New shared spherical quantizer construction helper for bf-tree benchmarks.
diskann-benchmark/src/index/bftree/mod.rs Updates bf-tree benchmark registration and module structure.
diskann-benchmark/src/index/bftree/full_precision.rs Refactors bf-tree full-precision build benchmark to use unified BfTreeBuild input.
diskann-benchmark/src/index/bftree/full_precision_streaming.rs Adds bf-tree full-precision streaming benchmark using shared streaming helpers.
diskann-benchmark/src/index/benchmarks.rs Refactors graph-index benchmark registration and streaming path to use shared helpers.
diskann-benchmark/src/backend/mod.rs Stops registering the old backend::index benchmarks (now registered via crate::index).
diskann-benchmark/src/backend/index/streaming/mod.rs Removed (streaming moved under crate::index::streaming).
diskann-benchmark/src/backend/index/streaming/full_precision.rs Removed (replaced by StreamRunner + inmem maintainer).
diskann-benchmark/src/backend/index/bftree/streaming_utils.rs Removed (replaced by crate::index::streaming::run_streaming).
diskann-benchmark/src/backend/index/bftree/spherical_streaming.rs Removed (replaced by crate::index::bftree::spherical_streaming).
diskann-benchmark/src/backend/index/bftree/full_precision_streaming.rs Removed (replaced by crate::index::bftree::full_precision_streaming).
diskann-benchmark/README.md Updates example command and job type/tag for streaming runs.
diskann-benchmark/example/graph-index-stream.json Updates job type to the new streaming tag.
diskann-benchmark/example/graph-index-bftree.json Updates job type to unified bf-tree tag and adds quantization.kind.
diskann-benchmark/example/graph-index-bftree-stream.json Updates job type to unified bf-tree streaming tag and adds quantization.kind.
diskann-benchmark/example/graph-index-bftree-stream-spherical.json New example demonstrating bf-tree streaming with spherical quantization.
diskann-benchmark/example/graph-index-bftree-spherical.json Updates example to use unified bf-tree tag and nested quantization config.
Comments suppressed due to low confidence (1)

diskann-benchmark/src/inputs/graph_index.rs:1204

  • PR description says input fields are privatized behind accessor methods, but StreamingRunbookParams still exposes most fields as pub(crate) and downstream code accesses them directly. Either update the PR description/scope, or make these fields private and add accessors to keep the encapsulation goal consistent.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread diskann-benchmark/src/inputs/bftree.rs
Comment thread diskann-benchmark/src/inputs/bftree.rs
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants