Refactor of diskann-benchmarks to make it potentially easier to spin up new backends#1142
Refactor of diskann-benchmarks to make it potentially easier to spin up new backends#1142JordanMaples wants to merge 5 commits into
Conversation
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #1142 +/- ##
=======================================
Coverage 89.43% 89.44%
=======================================
Files 484 486 +2
Lines 91495 91565 +70
=======================================
+ Hits 81829 81900 +71
+ Misses 9666 9665 -1
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
Merge BfTreeFullPrecisionBuild + BfTreeSphericalBuild into BfTreeBuild, and BfTreeStreamingRun + BfTreeSphericalStreamingRun into a single BfTreeStreamingRun. Both structs now carry a QuantConfig enum that discriminates between full-precision (None) and spherical quantization. - Add QuantConfig enum with tagged serde (kind: none | spherical) - Consolidate quantizer training into quantizer_util::build_quantizer - Update benchmark callers to check QuantConfig variant in try_match - Update example JSON files with new unified tags and quantization field - Add Clone to exhaustive::TransformKind Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
1875da1 to
28d08fa
Compare
There was a problem hiding this comment.
Pull request overview
This PR refactors the diskann-benchmark “graph index” benchmark stack to improve modularity and reduce duplication, primarily by introducing a shared streaming runner/managed layer and by renaming “Dynamic” concepts to “Streaming”. It also restructures benchmark registration so index benchmarks live under a new crate::index module rather than crate::backend::index.
Changes:
- Added a reusable streaming execution path (
StreamRunner,Managed,build_streamer,run_streaming) shared across in-mem and bf-tree streaming benchmarks. - Renamed Dynamic → Streaming for runbook params and streaming runs, updating example inputs and integration tests accordingly.
- Consolidated bf-tree build/stream inputs to support either “none” or “spherical” quantization via a single
QuantConfig.
Reviewed changes
Copilot reviewed 36 out of 39 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| diskann-benchmark/src/main.rs | Registers new index benchmark module; updates integration tests and example input filenames. |
| diskann-benchmark/src/inputs/graph_index.rs | Renames Dynamic→Streaming types; privatizes some input fields and adds accessors; updates tags. |
| diskann-benchmark/src/inputs/exhaustive.rs | Derives Clone for TransformKind for reuse in new config structures. |
| diskann-benchmark/src/inputs/bftree.rs | Unifies bf-tree inputs with QuantConfig for build + streaming modes; updates validation/display. |
| diskann-benchmark/src/index/streaming/stats.rs | Rehomes imports under crate::index::* after refactor. |
| diskann-benchmark/src/index/streaming/runner.rs | Introduces StreamRunner + Maintainer abstraction used by streaming benchmarks. |
| diskann-benchmark/src/index/streaming/mod.rs | Adds shared build_streamer and run_streaming helpers for runbook-driven streaming. |
| diskann-benchmark/src/index/streaming/managed.rs | Adds the Managed ID→slot layer and SlotReclaim policy for streaming backends. |
| diskann-benchmark/src/index/search/range.rs | Updates imports to new crate::index module structure. |
| diskann-benchmark/src/index/search/plugins.rs | Updates imports to new crate::index module structure. |
| diskann-benchmark/src/index/search/mod.rs | New module wrapper for search submodules. |
| diskann-benchmark/src/index/search/knn.rs | Updates imports to new crate::index module structure. |
| diskann-benchmark/src/index/result.rs | Updates imports to new crate::index module structure. |
| diskann-benchmark/src/index/mod.rs | New top-level index module with benchmark registration entrypoint. |
| diskann-benchmark/src/index/inmem/streaming.rs | Adds in-mem streaming maintenance implementation (DropDeleted + Release). |
| diskann-benchmark/src/index/inmem/spherical.rs | Adjusts to new module layout and new accessor-based input APIs. |
| diskann-benchmark/src/index/inmem/scalar.rs | Adjusts to new module layout and new accessor-based input APIs. |
| diskann-benchmark/src/index/inmem/product.rs | Adjusts to new module layout and new accessor-based input APIs. |
| diskann-benchmark/src/index/inmem/mod.rs | New inmem module wrapper; exports InmemMaintainer. |
| diskann-benchmark/src/index/build.rs | New shared build helpers (insert variants, start points, progress meter, save/load). |
| diskann-benchmark/src/index/bftree/spherical.rs | Refactors bf-tree spherical build benchmark to use QuantConfig and shared quantizer helper. |
| diskann-benchmark/src/index/bftree/spherical_streaming.rs | Adds bf-tree spherical streaming benchmark using shared streaming helpers. |
| diskann-benchmark/src/index/bftree/quantizer_util.rs | New shared spherical quantizer construction helper for bf-tree benchmarks. |
| diskann-benchmark/src/index/bftree/mod.rs | Updates bf-tree benchmark registration and module structure. |
| diskann-benchmark/src/index/bftree/full_precision.rs | Refactors bf-tree full-precision build benchmark to use unified BfTreeBuild input. |
| diskann-benchmark/src/index/bftree/full_precision_streaming.rs | Adds bf-tree full-precision streaming benchmark using shared streaming helpers. |
| diskann-benchmark/src/index/benchmarks.rs | Refactors graph-index benchmark registration and streaming path to use shared helpers. |
| diskann-benchmark/src/backend/mod.rs | Stops registering the old backend::index benchmarks (now registered via crate::index). |
| diskann-benchmark/src/backend/index/streaming/mod.rs | Removed (streaming moved under crate::index::streaming). |
| diskann-benchmark/src/backend/index/streaming/full_precision.rs | Removed (replaced by StreamRunner + inmem maintainer). |
| diskann-benchmark/src/backend/index/bftree/streaming_utils.rs | Removed (replaced by crate::index::streaming::run_streaming). |
| diskann-benchmark/src/backend/index/bftree/spherical_streaming.rs | Removed (replaced by crate::index::bftree::spherical_streaming). |
| diskann-benchmark/src/backend/index/bftree/full_precision_streaming.rs | Removed (replaced by crate::index::bftree::full_precision_streaming). |
| diskann-benchmark/README.md | Updates example command and job type/tag for streaming runs. |
| diskann-benchmark/example/graph-index-stream.json | Updates job type to the new streaming tag. |
| diskann-benchmark/example/graph-index-bftree.json | Updates job type to unified bf-tree tag and adds quantization.kind. |
| diskann-benchmark/example/graph-index-bftree-stream.json | Updates job type to unified bf-tree streaming tag and adds quantization.kind. |
| diskann-benchmark/example/graph-index-bftree-stream-spherical.json | New example demonstrating bf-tree streaming with spherical quantization. |
| diskann-benchmark/example/graph-index-bftree-spherical.json | Updates example to use unified bf-tree tag and nested quantization config. |
Comments suppressed due to low confidence (1)
diskann-benchmark/src/inputs/graph_index.rs:1204
- PR description says input fields are privatized behind accessor methods, but
StreamingRunbookParamsstill exposes most fields aspub(crate)and downstream code accesses them directly. Either update the PR description/scope, or make these fields private and add accessors to keep the encapsulation goal consistent.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Refactors the diskann-benchmark infrastructure to improve modularity, reduce duplication, and privatize input struct fields behind accessor methods.
Input field privatization: