
Consolidate insert into PruneAccessor #1138

Open
hildebrandmw wants to merge 9 commits into main from mhildebr/prune

hildebrandmw commented Jun 6, 2026

The insert followup to #1067, consolidating the following traits:

  • diskann::provider::{DelegateNeighbor, AsNeighbor, AsNeighborMut}: Neighbor accessor delegation in this manner is no longer needed.
  • diskann::provider::{HasElementRef, BuildDistanceComputer}: Consolidated into PruneAccessor.
  • diskann::graph::workingset::{Fill, AsWorkingSet}: Consolidated into PruneAccessor.

Into a single PruneAccessor trait:

pub trait PruneAccessor: HasId + Send + Sync {
    type Neighbors<'a>: provider::NeighborAccessorMut<Id = Self::Id>
    where
        Self: 'a;

    type ElementRef<'a>;

    type View<'a>: for<'x> workingset::View<Self::Id, ElementRef<'x> = Self::ElementRef<'x>>
        + Send
        + Sync
    where
        Self: 'a;

    type Distance<'a>: for<'x, 'y> DistanceFunction<Self::ElementRef<'x>, Self::ElementRef<'y>, f32>
        + Send
        + Sync
    where
        Self: 'a;

    /// Replaces neighbor delegation
    fn neighbors(&mut self) -> Self::Neighbors<'_>;

    /// Replaces `workingset::Fill`.
    fn fill<Itr>(
        &mut self,
        itr: Itr,
    ) -> impl SendFuture<ANNResult<(Self::View<'_>, Self::Distance<'_>)>>
    where
        Itr: ExactSizeIterator<Item = Self::Id> + Clone + Send + Sync;
}

For insert/prune, the accessor has two jobs:

  1. Neighbor access for manipulating datasets. The new trait uses a neighbors method for delegation mainly because it's a pattern we use quite heavily already: implement one NeighborAccessor and have all the other accessors use that.
  2. Gathering items for pruning and enabling distance computation. This is done via fill, which now returns both a View and the distance computer. I think this should remain a method that returns a view because it scopes the duration for which pruning is being performed, allowing implementations to acquire locks or otherwise enter some critical region for a duration shorter than the lifetime of the whole accessor.
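
The two jobs can be sketched with a much smaller, synchronous trait. This is not the trait from the PR: it drops HasId, the Send/Sync bounds, and the SendFuture return type, and every name in it (PruneAccessorSketch, InMemAccessor, NeighborHandle, SliceView, SquaredL2, prune_closest) is hypothetical. The "prune" simply keeps the closest candidates, which stands in for whatever rule the index actually applies. What it does show is the borrow structure: the view returned by fill borrows the accessor, so it has to be dropped before neighbors() can be called again.

```rust
use std::collections::HashMap;

/// Simplified, synchronous stand-in for the trait above. All names are hypothetical.
pub trait PruneAccessorSketch {
    type Id: Copy;
    type Neighbors<'a> where Self: 'a;
    type View<'a> where Self: 'a;
    type Distance<'a> where Self: 'a;

    /// Job 1: hand out a handle for writing adjacency lists.
    fn neighbors(&mut self) -> Self::Neighbors<'_>;

    /// Job 2: gather the requested items and pair them with a distance computer.
    fn fill<I>(&mut self, ids: I) -> Result<(Self::View<'_>, Self::Distance<'_>), String>
    where
        I: ExactSizeIterator<Item = Self::Id> + Clone;
}

pub struct NeighborHandle<'a> {
    graph: &'a mut HashMap<u32, Vec<u32>>,
}

impl NeighborHandle<'_> {
    pub fn set(&mut self, id: u32, neighbors: Vec<u32>) {
        self.graph.insert(id, neighbors);
    }
}

pub struct SliceView<'a> {
    items: &'a [(u32, Vec<f32>)],
}

impl SliceView<'_> {
    pub fn get(&self, id: u32) -> Option<&[f32]> {
        self.items.iter().find(|(i, _)| *i == id).map(|(_, v)| v.as_slice())
    }
}

pub struct SquaredL2;

impl SquaredL2 {
    pub fn eval(&self, a: &[f32], b: &[f32]) -> f32 {
        a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
    }
}

pub struct InMemAccessor<'p> {
    pub vectors: &'p HashMap<u32, Vec<f32>>,
    pub graph: &'p mut HashMap<u32, Vec<u32>>,
    pub scratch: Vec<(u32, Vec<f32>)>,
}

impl<'p> PruneAccessorSketch for InMemAccessor<'p> {
    type Id = u32;
    type Neighbors<'a> = NeighborHandle<'a> where Self: 'a;
    type View<'a> = SliceView<'a> where Self: 'a;
    type Distance<'a> = SquaredL2 where Self: 'a;

    fn neighbors(&mut self) -> Self::Neighbors<'_> {
        NeighborHandle { graph: &mut *self.graph }
    }

    fn fill<I>(&mut self, ids: I) -> Result<(Self::View<'_>, Self::Distance<'_>), String>
    where
        I: ExactSizeIterator<Item = u32> + Clone,
    {
        self.scratch.clear();
        self.scratch.reserve(ids.len());
        for id in ids {
            let v = self.vectors.get(&id).ok_or_else(|| format!("unknown id {id}"))?;
            self.scratch.push((id, v.clone()));
        }
        Ok((SliceView { items: &self.scratch }, SquaredL2))
    }
}

/// Keeps the `max_degree` candidates closest to `node` and stores them as its neighbors.
pub fn prune_closest(
    acc: &mut InMemAccessor<'_>,
    node: u32,
    candidates: &[u32],
    max_degree: usize,
) -> Result<Vec<u32>, String> {
    let mut ids = vec![node];
    ids.extend_from_slice(candidates);
    let kept: Vec<u32> = {
        // The view borrows `acc` only for the duration of this block.
        let (view, dist) = acc.fill(ids.iter().copied())?;
        let origin = view.get(node).ok_or_else(|| "node missing from view".to_string())?;
        let mut scored: Vec<(f32, u32)> = candidates
            .iter()
            .filter_map(|&c| view.get(c).map(|v| (dist.eval(origin, v), c)))
            .collect();
        scored.sort_by(|a, b| a.0.total_cmp(&b.0));
        scored.into_iter().take(max_degree).map(|(_, c)| c).collect()
    };
    // The view is gone, so the accessor can be borrowed again for neighbor access.
    acc.neighbors().set(node, kept.clone());
    Ok(kept)
}
```

Moving the `acc.neighbors()` call inside the inner block would be rejected by the borrow checker, which is the scoping property the description relies on.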

Interaction with the Working Set

The working set is used to cache entries across prunes to reduce trips to the backing provider. With this change, the working set and distance computers become part of the accessor, which adds a little bloat to the accessors, but simplifies how these objects get along.
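
As a rough illustration of why an accessor-owned working set reduces provider round trips, the sketch below keeps a cache inside the accessor and only fetches ids it has not seen in an earlier fill. CountingProvider and CachingAccessor are hypothetical names, the cache is an unbounded HashMap, and nothing here reflects the actual working-set types or eviction behavior in the crate.

```rust
use std::collections::HashMap;

/// Hypothetical backing store that counts how often it is asked for a vector.
pub struct CountingProvider {
    vectors: HashMap<u32, Vec<f32>>,
    fetches: usize,
}

impl CountingProvider {
    pub fn new(vectors: HashMap<u32, Vec<f32>>) -> Self {
        Self { vectors, fetches: 0 }
    }

    pub fn fetches(&self) -> usize {
        self.fetches
    }

    fn fetch(&mut self, id: u32) -> Option<Vec<f32>> {
        self.fetches += 1;
        self.vectors.get(&id).cloned()
    }
}

/// Hypothetical accessor that owns its working set across calls to `fill`.
pub struct CachingAccessor<'p> {
    provider: &'p mut CountingProvider,
    working_set: HashMap<u32, Vec<f32>>,
}

impl<'p> CachingAccessor<'p> {
    pub fn new(provider: &'p mut CountingProvider) -> Self {
        Self { provider, working_set: HashMap::new() }
    }

    /// Ensures every requested id is cached, fetching only the misses.
    pub fn fill(&mut self, ids: &[u32]) -> Result<&HashMap<u32, Vec<f32>>, String> {
        for &id in ids {
            if !self.working_set.contains_key(&id) {
                let v = self
                    .provider
                    .fetch(id)
                    .ok_or_else(|| format!("unknown id {id}"))?;
                self.working_set.insert(id, v);
            }
        }
        Ok(&self.working_set)
    }
}
```

Two consecutive prunes over overlapping candidate sets, say ids 0, 1, 2 and then 1, 2, 3, hit the provider four times here rather than six, because the second fill only fetches id 3.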

Without an external working set, PruneStrategy becomes much simpler (it is now like SearchStrategy, with just two associated types and a single method). MultiInsertStrategy changes slightly: finish still returns an opaque Seed, but the PruneAccessor is now constructed via seeded_prune_accessor, which replaces the AsWorkingSet transformation.
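
One way to picture the seed flow is sketched below. Only the names finish, Seed, and seeded_prune_accessor come from the description; their signatures here, and the BatchStrategy, SeededAccessor, and run_tasks items, are assumptions made for illustration. The seed is modeled as a shared, read-only map of the batch's vectors, and each task builds its own accessor from it.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Opaque to callers: a shared, read-only snapshot of the batch (assumed shape).
#[derive(Clone)]
pub struct Seed(Arc<HashMap<u32, Vec<f32>>>);

/// Hypothetical strategy type; the real trait has more structure than this.
pub struct BatchStrategy;

impl BatchStrategy {
    /// Assumed signature: turn the inserted batch into a seed.
    pub fn finish(&self, batch: Vec<(u32, Vec<f32>)>) -> Seed {
        Seed(Arc::new(batch.into_iter().collect()))
    }

    /// Assumed signature: build one accessor per task from the seed.
    pub fn seeded_prune_accessor(&self, seed: &Seed) -> SeededAccessor {
        SeededAccessor { seeded: Arc::clone(&seed.0), local: HashMap::new() }
    }
}

pub struct SeededAccessor {
    seeded: Arc<HashMap<u32, Vec<f32>>>,
    local: HashMap<u32, Vec<f32>>,
}

impl SeededAccessor {
    /// Looks in the shared seed first, then in this task's private cache.
    pub fn lookup(&self, id: u32) -> Option<&[f32]> {
        self.seeded
            .get(&id)
            .or_else(|| self.local.get(&id))
            .map(Vec::as_slice)
    }

    /// Adds an entry to this task's private cache only.
    pub fn cache(&mut self, id: u32, vector: Vec<f32>) {
        self.local.insert(id, vector);
    }
}

/// Spawns one task per id; each task gets its own accessor built from the seed.
pub fn run_tasks(strategy: &BatchStrategy, seed: &Seed, ids: &[u32]) -> Vec<bool> {
    std::thread::scope(|s| {
        let handles: Vec<_> = ids
            .iter()
            .map(|&id| {
                let acc = strategy.seeded_prune_accessor(seed);
                s.spawn(move || acc.lookup(id).is_some())
            })
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    })
}
```

Because each accessor owns its private cache and only shares the immutable seed, tasks do not need to coordinate with one another in this sketch.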

Suggested Review Order

  • diskann/src/graph/glue.rs: The new PruneAccessor trait
  • diskann/src/provider.rs: Simplification to provider-related traits.
  • diskann/src/graph/workingset/: Simplification to workingset API.
  • diskann/src/graph/index.rs: Updated indexing algorithm to take a PruneAccessor rather than a strategy + provider + context.
  • diskann-providers/: Mostly mechanical clean-up.
  • diskann-bftree/: Mostly mechanical clean-up.
  • diskann-garnet/: Mostly mechanical, with one simplification: since the working set is now part of the accessor, we can no longer run into situations where the working set contains full-precision vectors but the accessor wants quantized vectors.

codecov-commenter commented Jun 6, 2026

Codecov Report

❌ Patch coverage is 88.10916% with 61 lines in your changes missing coverage. Please review.
✅ Project coverage is 89.41%. Comparing base (a5c745b) to head (ccec01a).

Files with missing lines                                  Patch %   Lines
...rs/src/model/graph/provider/async_/inmem/scalar.rs     56.66%    13 Missing ⚠️
...s/src/model/graph/provider/async_/inmem/product.rs     78.00%    11 Missing ⚠️
diskann-bftree/src/provider.rs                            85.91%    10 Missing ⚠️
...src/model/graph/provider/async_/inmem/spherical.rs     50.00%    10 Missing ⚠️
diskann-garnet/src/provider.rs                            87.69%    8 Missing ⚠️
diskann-providers/src/index/wrapped_async.rs              0.00%     6 Missing ⚠️
diskann/src/graph/index.rs                                98.58%    2 Missing ⚠️
diskann/src/graph/test/provider.rs                        98.18%    1 Missing ⚠️

@@            Coverage Diff             @@
##             main    #1138      +/-   ##
==========================================
- Coverage   89.43%   89.41%   -0.02%     
==========================================
  Files         484      484              
  Lines       91495    91419      -76     
==========================================
- Hits        81829    81744      -85     
- Misses       9666     9675       +9     
Flag        Coverage Δ
miri        89.41% <88.10%> (-0.02%) ⬇️
unittests   89.06% <88.10%> (-0.02%) ⬇️


Files with missing lines                                  Coverage Δ
diskann-benchmark-core/src/build/graph/multi.rs           98.92% <100.00%> (ø)
diskann-benchmark-core/src/build/graph/single.rs          100.00% <100.00%> (ø)
...benchmark-core/src/streaming/graph/drop_deleted.rs     98.03% <100.00%> (ø)
diskann-bftree/src/neighbors.rs                           93.55% <100.00%> (-0.07%) ⬇️
diskann-providers/src/index/diskann_async.rs              96.20% <100.00%> (ø)
...roviders/src/model/graph/provider/async_/common.rs     90.05% <ø> (+1.75%) ⬆️
...iders/src/model/graph/provider/async_/distances.rs     60.86% <ø> (+21.43%) ⬆️
...odel/graph/provider/async_/inmem/full_precision.rs     98.52% <100.00%> (-0.02%) ⬇️
.../src/model/graph/provider/async_/inmem/provider.rs     92.51% <100.00%> (ø)
diskann-vector/src/traits.rs                              100.00% <100.00%> (ø)
... and 13 more

hildebrandmw marked this pull request as ready for review June 8, 2026 17:01
hildebrandmw requested review from a team and Copilot June 8, 2026 17:01

Copilot AI left a comment

Pull request overview

This PR completes the “insert-side” consolidation started in #1067 by collapsing multiple provider/workingset traits used during pruning/index construction into a single glue::PruneAccessor abstraction, and then plumbing that change through the index build path and all in-tree providers/strategies.

Changes:

  • Introduces glue::PruneAccessor (neighbors delegation + fill returning (View, Distance)), and simplifies PruneStrategy / MultiInsertStrategy around accessor construction (including seeded_prune_accessor).
  • Removes/rewires now-redundant provider/workingset traits (e.g., BuildDistanceComputer, HasElementRef, Fill, AsWorkingSet, neighbor delegation helpers) and updates call sites accordingly.
  • Updates core index build/prune logic and all in-tree providers (async inmem, Garnet, BfTree, benchmarks/tests) to the new accessor model; adds DistanceFunction impl for &T to ease borrowing.
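
The last bullet mentions a DistanceFunction impl for &T. A blanket impl of that kind might look like the sketch below. The trait here is a local stand-in with three type parameters to mirror the bound shown in the PR description; the method name evaluate, the SquaredL2 type, and the nearest helper are assumptions and may not match diskann-vector.

```rust
/// Local stand-in for a distance trait; the method name is an assumption.
pub trait DistanceFunction<A, B, T> {
    fn evaluate(&self, a: A, b: B) -> T;
}

/// Forwarding impl: a shared reference to a distance function is itself one.
impl<F, A, B, T> DistanceFunction<A, B, T> for &F
where
    F: DistanceFunction<A, B, T> + ?Sized,
{
    fn evaluate(&self, a: A, b: B) -> T {
        (**self).evaluate(a, b)
    }
}

/// Hypothetical concrete metric used only for this example.
pub struct SquaredL2;

impl<'x, 'y> DistanceFunction<&'x [f32], &'y [f32], f32> for SquaredL2 {
    fn evaluate(&self, a: &'x [f32], b: &'y [f32]) -> f32 {
        a.iter().zip(b).map(|(p, q)| (p - q) * (p - q)).sum()
    }
}

/// Takes the distance function by value, so callers may pass `metric` or `&metric`.
pub fn nearest<D>(dist: D, query: &[f32], points: &[Vec<f32>]) -> Option<usize>
where
    D: for<'x, 'y> DistanceFunction<&'x [f32], &'y [f32], f32>,
{
    points
        .iter()
        .enumerate()
        .map(|(i, p)| (i, dist.evaluate(query, p.as_slice())))
        .min_by(|a, b| a.1.total_cmp(&b.1))
        .map(|(i, _)| i)
}
```

With the forwarding impl in place, an accessor that caches its distance computer can lend it out as `&computer` to code that is generic over `D`, without cloning or moving it.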

Reviewed changes

Copilot reviewed 24 out of 24 changed files in this pull request and generated 8 comments.

Summary per file:

  • diskann/src/provider.rs: Simplifies neighbor accessor traits (now &mut self + ANNResult<()>) and introduces provider::Neighbors adaptor.
  • diskann/src/graph/workingset/mod.rs: Removes Fill/AsWorkingSet API surface; keeps View as the pruning read interface (now intended to pair with PruneAccessor).
  • diskann/src/graph/workingset/map.rs: Updates Map docs/seed story to align with seeded_prune_accessor; removes AsWorkingSet impl + related test.
  • diskann/src/graph/test/provider.rs: Updates test provider + strategies to implement glue::PruneAccessor and new neighbor accessor method signatures.
  • diskann/src/graph/test/cases/inplace_delete.rs: Adjusts tests to pass mutable neighbor accessors under new NeighborAccessor signature.
  • diskann/src/graph/test/cases/index.rs: Adjusts prune-range test neighbor access to the new mutable accessor pattern.
  • diskann/src/graph/index.rs: Reworks prune/build flows to use PruneAccessor directly (no external working set); updates multi-insert to build per-task accessors from seeds.
  • diskann/src/graph/glue.rs: Adds PruneAccessor, simplifies PruneStrategy/MultiInsertStrategy contracts and docs to match the new model.
  • diskann-vector/src/traits.rs: Adds DistanceFunction implementation for &T to allow passing borrowed computers/functors.
  • diskann-providers/src/model/graph/provider/async_/inmem/spherical.rs: Migrates inmem spherical pruning to glue::PruneAccessor and constructs distance computer eagerly.
  • diskann-providers/src/model/graph/provider/async_/inmem/scalar.rs: Migrates inmem scalar pruning to glue::PruneAccessor, moving distance computer creation into accessor construction.
  • diskann-providers/src/model/graph/provider/async_/inmem/provider.rs: Updates inmem neighbor provider impls to new NeighborAccessor{,Mut} method signatures.
  • diskann-providers/src/model/graph/provider/async_/inmem/product.rs: Migrates PQ/product prune accessors (including hybrid behavior) to glue::PruneAccessor and internalizes “working set” state.
  • diskann-providers/src/model/graph/provider/async_/inmem/mod.rs: Removes the old PassThrough working-set ZST now that working-set traits are gone.
  • diskann-providers/src/model/graph/provider/async_/inmem/full_precision.rs: Migrates full-precision inmem prune accessor to glue::PruneAccessor and caches the distance computer in the accessor.
  • diskann-providers/src/model/graph/provider/async_/distances.rs: Removes hybrid working-set wrapper/overlay glue tied to the old workingset traits.
  • diskann-providers/src/model/graph/provider/async_/common.rs: Removes the old Unseeded seed type since seeding is now handled differently.
  • diskann-providers/src/index/wrapped_async.rs: Updates wrapped async index helpers to use NeighborAccessor{,Mut} bounds instead of AsNeighbor{,Mut}.
  • diskann-providers/src/index/diskann_async.rs: Updates async index tests/helpers to use NeighborAccessor{,Mut} bounds.
  • diskann-garnet/src/provider.rs: Migrates Garnet prune accessor to glue::PruneAccessor, internalizes caching in the accessor, and adjusts neighbor delegation.
  • diskann-bftree/src/provider.rs: Migrates BfTree prune accessors to glue::PruneAccessor, internalizing map state + distance computers and updating multi-insert seeding.
  • diskann-benchmark-core/src/streaming/graph/drop_deleted.rs: Updates benchmark build stage bounds/call sites to new neighbor accessor traits.
  • diskann-benchmark-core/src/build/graph/single.rs: Updates benchmark test neighbor traversal to use mutable accessors.
  • diskann-benchmark-core/src/build/graph/multi.rs: Updates benchmark test neighbor traversal to use mutable accessors.

