Track tensor and index lifetimes in the Rust DLPack bindings by yan-zaretskiy · Pull Request #2245 · rapidsai/cuvs

yan-zaretskiy · 2026-06-15T20:34:33Z

The previous `ManagedTensor` type was a non-owning wrapper over the C FFI `DLManagedTensor` type with no lifetime attached, so nothing tied the tensor data and shape/stride metadata it referenced to their owners. As a result, indexes that keep a non-owning view of their dataset (CAGRA, brute-force) could outlive that data. This PR replaces it with the lifetime-parameterized `DLTensorView` and `DLTensorViewMut` views, which are produced by the public `IntoDlTensor` and `IntoDlTensorMut` traits. Users are now expected to implement these traits on their tensor types so that our API can accept them as input/output arguments.

copy-pr-bot · 2026-06-15T20:34:37Z

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

yan-zaretskiy · 2026-06-15T21:47:38Z

Since this is a fairly large change, I would like to explain my thought process to justify the proposed design. First of all, let's identify what we're trying to fix.

The original cuvs type for tensor data is simply pub struct ManagedTensor(ffi::DLManagedTensor). It is a thin wrapper around the C DLManagedTensor - the struct that holds the raw data, shape, and strides pointers. When it's built from a borrowed tensor (e.g. via From<&ndarray>) it holds those pointers with no lifetime attached, so nothing ties the ManagedTensor to whatever owns the data, which can be freed out from under it. Therefore, using this type is effectively unsafe, though it is not marked as such.

The second problem is in the index types. CAGRA and brute-force don't copy the dataset at build time, because the underlying C++ implementation keeps a non-owning view and reads the original vectors during search, so the dataset has to outlive the index. The old API doesn't model that.

The purpose of this PR is therefore to enable tracking two things: the lifetime of the tensor data, and the lifetime of the indexes that borrow it.

To fix the first issue, we add lifetimes to the tensor view types. Moreover, we make a distinction between mutable and immutable views: DLTensorView<'a> for read-only inputs like datasets and queries, and DLTensorViewMut<'a> for outputs the C API writes, like neighbors and distances.

Users convert their own tensor types into these views by implementing a pair of new public traits, IntoDlTensor<'a> and IntoDlTensorMut<'a>. Every entry point (build, search, etc) takes impl IntoDlTensor<'a> or impl IntoDlTensorMut<'a> rather than a concrete type, so cuvs ships no tensor type of its own and stays agnostic about where your data lives.
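To make the trait-based conversion concrete, here is a minimal sketch of a user-owned host matrix converted into read-only and mutable views. Everything in it is a stand-in: the trait method names (`into_dl_tensor`, `into_dl_tensor_mut`), the reduced view structs, `HostMatrix`, and `shapes_of` are assumptions for illustration, and the real signatures in the crate may differ (for example, they may return a `Result`).

```rust
use std::ffi::c_void;
use std::marker::PhantomData;

/// Reduced stand-in for the read-only view (the real one also carries
/// device, dtype and strides).
pub struct DLTensorView<'a> {
    data: *const c_void,
    shape: Vec<i64>,
    _marker: PhantomData<&'a ()>,
}

/// Reduced stand-in for the mutable view used for outputs.
pub struct DLTensorViewMut<'a> {
    data: *mut c_void,
    shape: Vec<i64>,
    _marker: PhantomData<&'a mut ()>,
}

/// Hypothetical trait shapes; method names are assumptions.
pub trait IntoDlTensor<'a> {
    fn into_dl_tensor(self) -> DLTensorView<'a>;
}
pub trait IntoDlTensorMut<'a> {
    fn into_dl_tensor_mut(self) -> DLTensorViewMut<'a>;
}

/// A user-owned, row-major host matrix (hypothetical user type).
pub struct HostMatrix {
    data: Vec<f32>,
    rows: usize,
    cols: usize,
}

impl HostMatrix {
    pub fn zeros(rows: usize, cols: usize) -> Self {
        HostMatrix { data: vec![0.0; rows * cols], rows, cols }
    }
}

// A shared borrow yields a read-only view tied to the borrow's lifetime.
impl<'a> IntoDlTensor<'a> for &'a HostMatrix {
    fn into_dl_tensor(self) -> DLTensorView<'a> {
        DLTensorView {
            data: self.data.as_ptr() as *const c_void,
            shape: vec![self.rows as i64, self.cols as i64],
            _marker: PhantomData,
        }
    }
}

// An exclusive borrow yields a writable view.
impl<'a> IntoDlTensorMut<'a> for &'a mut HostMatrix {
    fn into_dl_tensor_mut(self) -> DLTensorViewMut<'a> {
        DLTensorViewMut {
            data: self.data.as_mut_ptr() as *mut c_void,
            shape: vec![self.rows as i64, self.cols as i64],
            _marker: PhantomData,
        }
    }
}

/// Stand-in for an entry point: it accepts any type implementing the traits.
pub fn shapes_of<'i, 'o>(
    input: impl IntoDlTensor<'i>,
    output: impl IntoDlTensorMut<'o>,
) -> (Vec<i64>, Vec<i64>) {
    let (input, output) = (input.into_dl_tensor(), output.into_dl_tensor_mut());
    debug_assert!(!input.data.is_null() && !output.data.is_null());
    (input.shape, output.shape)
}

/// Usage: queries are borrowed immutably, neighbors mutably.
pub fn demo_shapes() -> (Vec<i64>, Vec<i64>) {
    let queries = HostMatrix::zeros(4, 8);
    let mut neighbors = HostMatrix::zeros(4, 2);
    shapes_of(&queries, &mut neighbors)
}
```

Implementing the traits on `&'a T` and `&'a mut T` rather than on `T` itself is what lets the borrow checker carry the lifetime of the user's tensor into the view.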

The path from DLTensorView<'a> into C requires some clarification. Here's how we define this type:

pub struct DLTensorView<'a> {
    data: *mut std::ffi::c_void,
    device: ffi::DLDevice,
    dtype: ffi::DLDataType,
    shape: TensorDims,
    strides: Option<TensorDims>,
    _marker: PhantomData<&'a ()>,
}

where TensorDims is a vector with small-buffer optimization (SBO) that in practice is always stack-allocated. This avoids heap allocations when making FFI calls from Rust. DLTensorView<'a> owns shape and strides because the C type eventually reads them as *mut i64, so something must own the data that this *mut i64 points to. That something cannot be the original tensor type, because there's no guarantee it uses i64 for shape/stride values, so we store the arrays inside DLTensorView<'a>. The second design decision is how we hand a DLTensorView<'a> to the C function, which ultimately wants a *mut DLManagedTensor whose shape/strides are *mut i64 pointing at those arrays.

The obvious choice is to store a DLManagedTensor right inside the view, with its shape/strides pointing at our own shape/strides fields, and hand out a pointer to it. But that makes DLTensorView a self-referential struct, since the embedded DLManagedTensor would point back into the same value. And because TensorDims keeps its data inline, moving the view moves that inline array, while the DLManagedTensor's *mut i64 keeps pointing at the old location. Hence this choice is unsound.

There is a known way to make the embedding sound: store shape/strides as Box<[i64]> instead. Then the arrays live on the heap, and moving the view only moves the Box handle. But this forces a heap allocation per view.
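The difference between the two storage choices can be observed directly. The sketch below uses two hypothetical structs, `InlineDims` and `BoxedDims`, as simplified stand-ins for a view with inline dims and one with heap-allocated dims, and records the address of the dims array before and after the value is moved.

```rust
/// Inline storage, like a small-buffer vector that has not spilled to the heap.
pub struct InlineDims {
    dims: [i64; 2],
}

/// Heap storage: the struct only holds the Box handle.
pub struct BoxedDims {
    dims: Box<[i64]>,
}

/// Returns the address of the inline array before and after a move.
pub fn inline_addresses() -> (usize, usize) {
    let view = InlineDims { dims: [4, 8] };
    let before = view.dims.as_ptr() as usize;
    // Moving the value (here: from the stack into a heap allocation)
    // relocates the inline array with it.
    let moved = Box::new(view);
    let after = moved.dims.as_ptr() as usize;
    (before, after)
}

/// Returns the address of the boxed array before and after a move.
pub fn boxed_addresses() -> (usize, usize) {
    let view = BoxedDims { dims: vec![4, 8].into_boxed_slice() };
    let before = view.dims.as_ptr() as usize;
    // Only the Box handle moves; the array itself stays where it is.
    let moved = Box::new(view);
    let after = moved.dims.as_ptr() as usize;
    (before, after)
}
```

A pointer captured from `InlineDims` before the move would dangle afterwards, which is exactly the failure mode of embedding the C struct in the view. The boxed variant keeps the address stable at the cost of one allocation per view.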

So rather than embed the C struct, we build it only at the point where we're about to call into C:

pub(crate) struct ManagedTensorRef<'a> {
    inner: ffi::DLManagedTensor,
    _borrow: PhantomData<&'a ()>,
}
impl<'a> DLTensorView<'a> {
    // Fills `inner` with `shape: self.shape.as_ptr() as *mut _`, etc.
    pub(crate) fn to_c(&self) -> ManagedTensorRef<'_> { /* ... */ }
}

Here to_c takes &self, and the returned ManagedTensorRef<'_> borrows it, so its lifetime is tied to the view. The compiler now guarantees the view, with the inline shape/strides its pointers point into, cannot move or be dropped while the ManagedTensorRef is alive. Finally, we take the raw *mut DLManagedTensor out of the ManagedTensorRef for the FFI call.
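A self-contained sketch of this borrow pattern follows. `RawTensor`, `View`, `RawTensorRef`, and `c_num_elements` are simplified stand-ins invented for illustration (the real `DLManagedTensor` has more fields, and the real callee is a C function), so treat it as a model of the idea rather than the crate's code.

```rust
use std::marker::PhantomData;

/// Simplified stand-in for the C struct: only a shape pointer and rank.
#[repr(C)]
pub struct RawTensor {
    shape: *mut i64,
    ndim: i32,
}

/// Simplified view that owns its shape inline.
pub struct View {
    shape: [i64; 2],
}

/// Short-lived C-facing struct; the lifetime ties it to the borrowed view.
pub struct RawTensorRef<'a> {
    inner: RawTensor,
    _borrow: PhantomData<&'a View>,
}

impl View {
    /// Builds the C struct on demand, pointing into `self.shape`.
    pub fn to_c(&self) -> RawTensorRef<'_> {
        RawTensorRef {
            inner: RawTensor {
                shape: self.shape.as_ptr() as *mut i64,
                ndim: self.shape.len() as i32,
            },
            _borrow: PhantomData,
        }
    }
}

impl RawTensorRef<'_> {
    /// Raw pointer to hand to the FFI call.
    pub fn as_ptr(&mut self) -> *mut RawTensor {
        &mut self.inner
    }
}

/// Stand-in for a C function that reads the shape through the raw pointer.
unsafe fn c_num_elements(t: *const RawTensor) -> i64 {
    unsafe {
        let t = &*t;
        std::slice::from_raw_parts(t.shape, t.ndim as usize)
            .iter()
            .product()
    }
}

pub fn num_elements(view: &View) -> i64 {
    let mut raw = view.to_c();
    // SAFETY: `raw` borrows `view`, so the shape pointer stays valid
    // for the duration of the call.
    unsafe { c_num_elements(raw.as_ptr()) }
}

pub fn demo_num_elements() -> i64 {
    num_elements(&View { shape: [4, 8] })
}
```

Because `raw` holds a borrow of `view`, attempting to move or drop `view` while `raw` is still in use is rejected at compile time.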

As for the second fix, the index types that borrow their dataset now also get a lifetime. CAGRA and brute-force now return Index<'d>, where 'd is the dataset's lifetime, so the compiler forbids the index from outliving the data it reads. IVF-Flat, IVF-PQ and Vamana copy the dataset at build time, so they carry no lifetime.
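As an illustration of what the `'d` parameter buys, here is a reduced model of a borrowing index. The struct layout and the `build`/`first` methods are assumptions made for this sketch; the real `Index<'d>` wraps an opaque C handle rather than a raw slice pointer.

```rust
use std::marker::PhantomData;

/// Reduced model: stores a raw pointer into the dataset, as the C++ side
/// would, plus a marker that records the dataset borrow.
pub struct Index<'d> {
    data: *const f32,
    len: usize,
    _dataset: PhantomData<&'d [f32]>,
}

impl<'d> Index<'d> {
    /// The returned index cannot outlive `dataset`.
    pub fn build(dataset: &'d [f32]) -> Self {
        Index {
            data: dataset.as_ptr(),
            len: dataset.len(),
            _dataset: PhantomData,
        }
    }

    /// Reads through the raw pointer, as a search would.
    pub fn first(&self) -> Option<f32> {
        if self.len == 0 {
            None
        } else {
            // SAFETY: `'d` guarantees the dataset is still alive.
            Some(unsafe { *self.data })
        }
    }
}

pub fn demo_first() -> Option<f32> {
    let dataset = vec![1.5_f32, 2.5];
    let index = Index::build(&dataset);
    // Uncommenting the next line fails to compile with E0505
    // (cannot move out of `dataset` because it is borrowed):
    // drop(dataset);
    index.first()
}
```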

One special case here is indexes loaded from disk. An index produced by deserialize doesn't borrow any caller-supplied dataset, hence there is nothing for 'd to tie to. We model that by returning Index<'static>. The 'static says the index borrows nothing, so it isn't constrained by any dataset's lifetime and can live as long as you like.
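The sketch below models the `'static` case with `std::borrow::Cow`, purely for illustration: a built index borrows the caller's data, while a deserialized one owns its data and therefore borrows nothing. The `Cow` field, the byte format, and the method names are assumptions of this sketch, not how the crate stores indexes.

```rust
use std::borrow::Cow;

/// Reduced model: either borrows the caller's dataset or owns its vectors.
pub struct Index<'d> {
    vectors: Cow<'d, [f32]>,
}

impl<'d> Index<'d> {
    /// Borrows the dataset, so the result is bounded by `'d`.
    pub fn build(dataset: &'d [f32]) -> Index<'d> {
        Index { vectors: Cow::Borrowed(dataset) }
    }

    pub fn num_values(&self) -> usize {
        self.vectors.len()
    }
}

impl Index<'static> {
    /// Decodes little-endian f32 values; the result borrows nothing from `bytes`.
    pub fn deserialize(bytes: &[u8]) -> Index<'static> {
        let values: Vec<f32> = bytes
            .chunks_exact(4)
            .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
            .collect();
        Index { vectors: Cow::Owned(values) }
    }
}

/// The byte buffer is dropped when this function returns,
/// yet the index remains valid.
pub fn load() -> Index<'static> {
    let bytes: Vec<u8> = [1.0_f32, 2.0]
        .iter()
        .flat_map(|x| x.to_le_bytes())
        .collect();
    Index::deserialize(&bytes)
}

pub fn demo_loaded_len() -> usize {
    load().num_values()
}
```

An `Index<'static>` can be stored in a long-lived struct or sent to another scope without any dataset having to stay alive alongside it.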

coderabbitai · 2026-06-15T21:58:51Z

📝 Walkthrough

Summary by CodeRabbit

New Features
- Added a public, trait-based DLPack tensor conversion API (IntoDlTensor / IntoDlTensorMut) for more flexible input/output handling.
Refactor
- Updated k-NN and ANN index APIs (CAGRA, Brute Force, IVF-Flat, IVF-PQ, Vamana) to accept the new DLPack conversion traits for building and searching.
- Updated the DLPack module exports accordingly.
Bug Fixes
- Improved error reporting for tensor conversion failures.
Documentation
- Refreshed module docs and rewrote the CAGRA example to match the new tensor workflow.
Chores
- Updated dependency versions.

Walkthrough

Replaces the lifetime-less ManagedTensor abstraction across the entire Rust cuVS crate with a new non-owning, lifetime-parameterized DLPack view API. New public traits IntoDlTensor/IntoDlTensorMut and view types DLTensorView/DLTensorViewMut replace ManagedTensor in all index build/search and algorithm (k-means, distance) signatures. A test-only DeviceTensor adapter and ndarray implementations are added, all index and algorithm APIs are updated uniformly, and the CAGRA example is rewritten to demonstrate user-defined GPU tensor integration.

Changes

DLPack View API Refactor and Consumer Migration

Layer / File(s)	Summary
DLPack core: traits, views, error, crate exports, deps `rust/cuvs/src/dlpack.rs`, `rust/cuvs/src/error.rs`, `rust/cuvs/src/lib.rs`, `rust/cuvs/Cargo.toml`	Rewrites `dlpack.rs` introducing `IntoDlTensor<'a>`, `IntoDlTensorMut<'a>`, `DType`, `DLTensorView<'a>`, `DLTensorViewMut<'a>`, `DLPackError`, and internal `ManagedTensorRef`; removes `ManagedTensor` entirely. Adds `DLPack` variant to `Error` with `From` impl. Promotes `dlpack` to `pub mod`, swaps `ManagedTensor` re-export for the new view/trait exports, adds `test_utils` under `#[cfg(test)]`, and adds `thiserror`/`tinyvec` dependencies.
Test-only DeviceTensor adapter and ndarray IntoDlTensor impls `rust/cuvs/src/test_utils.rs`	Adds `DeviceTensor<T>` backed by RMM device memory with `zeros`, `from_host`, `copy_to_host`, and `Drop`. Implements `IntoDlTensor`/`IntoDlTensorMut` for device references and for `&ndarray::ArrayRef` / `&mut ndarray::ArrayRef` (host CPU views), including stride elision for standard contiguous layouts.
CAGRA Index lifetime parameterization and API migration `rust/cuvs/src/cagra/index.rs`, `rust/cuvs/src/cagra/mod.rs`	Refactors `Index` to `Index<'d>` with `handle` and `PhantomData`; updates `build`, `search`, `search_with_filter`, `serialize`, `deserialize` (returns `Index<'static>`), and `Drop`. Replaces all test tensor construction with `DeviceTensor`. Updates module docs.
Brute-force Index lifetime parameterization and API migration `rust/cuvs/src/brute_force.rs`	Converts `Index` to `Index<'d>` with `PhantomData`; updates `build` to `impl IntoDlTensor<'d>` and `search` to generic `IntoDlTensor`/`IntoDlTensorMut`; updates test to use `DeviceTensor`.
IVF-Flat, IVF-PQ, and Vamana Index API migration `rust/cuvs/src/ivf_flat/`, `rust/cuvs/src/ivf_pq/`, `rust/cuvs/src/vamana/`	Migrates `build` and `search` signatures from `ManagedTensor`/`Into<ManagedTensor>` to `impl IntoDlTensor`/`impl IntoDlTensorMut` across all three index types; updates all tests to use `DeviceTensor`; condenses module-level docs.
K-means and pairwise distance API migration `rust/cuvs/src/cluster/kmeans/mod.rs`, `rust/cuvs/src/distance/mod.rs`	Migrates `fit`, `predict`, `cluster_cost`, and `pairwise_distance` to generic `IntoDlTensor`/`IntoDlTensorMut` parameters; updates optional `sample_weight` to `Option<impl IntoDlTensor>`; updates tests to use `DeviceTensor`.
CAGRA example rewrite with user-defined CudaTensor `rust/cuvs/examples/cagra.rs`, `rust/cuvs/src/resources.rs`	Rewrites the CAGRA example to define `CudaTensor<T>` using raw CUDA FFI, implement `IntoDlTensor`/`IntoDlTensorMut` on it, and demonstrate a full build/search workflow using only user-owned tensors. Adds `# Safety` docs to `Resources::set_cuda_stream`.

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Suggested labels

non-breaking, Build

Suggested reviewers

AyodeAwe
robertmaynard

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title 'Track tensor and index lifetimes in the Rust DLPack bindings' accurately and specifically describes the main change: introducing lifetime parameters to tensor views and indexes to prevent data use-after-free.
Description check	✅ Passed	The description accurately explains the core problem (ManagedTensor lacking lifetime constraints) and the solution (lifetime-parameterized DLTensorView/DLTensorViewMut with IntoDlTensor traits).
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.


coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (2)

rust/cuvs/src/cagra/index.rs (1)

22-23: ⚡ Quick win

Fix unresolved rustdoc link to Index::merge.

Line 22 links to [Index::merge], but this type currently does not expose a merge method in this module. That can produce a broken intra-doc link warning in docs builds.

Suggested doc tweak

-/// [`Index::merge`], the data is self-contained and the lifetime is
+/// a merged index, the data is self-contained and the lifetime is

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@rust/cuvs/src/cagra/index.rs` around lines 22 - 23, The rustdoc link in the
documentation comment references `Index::merge`, but the Index type does not
expose this method, creating a broken intra-doc link warning. Remove the
intra-doc link syntax around `Index::merge` in the doc comment (change
`[`Index::merge`]` to just `Index::merge` or remove the reference entirely if
it's not needed for clarity), or replace it with a reference to an actual method
that exists on the Index type.

rust/cuvs/examples/cagra.rs (1)

117-137: Add size validation guard to to_host before copying device data to host array.

The to_host method copies self.bytes to the destination host array without verifying that the caller-provided buffer has matching capacity. This safe wrapper could silently perform an out-of-bounds write if the destination array has a different element count than the source.

Add a check to ensure the destination buffer byte size matches:

Suggested guard

 fn to_host<D>(&self, res: &Resources, host: &mut ndarray::ArrayRef<T, D>) -> ExampleResult<()>
 where
     D: ndarray::Dimension,
 {
     if !host.is_standard_layout() {
         return Err("host array must be contiguous (row-major)".into());
     }
+    let host_bytes = host
+        .len()
+        .checked_mul(std::mem::size_of::<T>())
+        .ok_or("host array size overflow")?;
+    if host_bytes != self.bytes {
+        return Err(format!(
+            "host buffer size mismatch: expected {} bytes, got {} bytes",
+            self.bytes, host_bytes
+        )
+        .into());
+    }

     let stream = res.get_cuda_stream()?;
     check_cuda(unsafe {
         cudaMemcpyAsync(
             host.as_mut_ptr() as *mut c_void,

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@rust/cuvs/examples/cagra.rs` around lines 117 - 137, The to_host method
performs an unsafe memory copy via cudaMemcpyAsync without validating that the
destination host array has sufficient capacity to receive self.bytes. Add a size
validation check after verifying the host array is contiguous but before calling
cudaMemcpyAsync to ensure the host array's byte capacity matches self.bytes,
returning an appropriate error if the sizes do not match. This prevents
potential out-of-bounds writes to the destination buffer.
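The size check suggested above can be isolated into a small helper that is testable without CUDA. `check_host_size` is a hypothetical name and is not part of the example file; it only mirrors the overflow-checked byte-count comparison from the suggested guard.

```rust
/// Verifies that a host buffer of `host_len` elements of type `T`
/// occupies exactly `device_bytes` bytes.
pub fn check_host_size<T>(host_len: usize, device_bytes: usize) -> Result<(), String> {
    // checked_mul guards against overflow when computing the byte count.
    let host_bytes = host_len
        .checked_mul(std::mem::size_of::<T>())
        .ok_or_else(|| "host array size overflow".to_string())?;
    if host_bytes != device_bytes {
        return Err(format!(
            "host buffer size mismatch: expected {device_bytes} bytes, got {host_bytes} bytes"
        ));
    }
    Ok(())
}
```

Calling it before the copy, for instance `check_host_size::<f32>(host.len(), self.bytes)?`, would reject mismatched buffers before any unsafe code runs.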

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@rust/cuvs/Cargo.toml`:
- Line 18: The tinyvec dependency in rust/cuvs/Cargo.toml includes the
latest_stable_rust feature without an explicit rust-version declaration in the
workspace package metadata, which can cause silent breakage when new Rust stable
versions are released. Fix this by either adding a rust-version field to the
[workspace.package] section to define an explicit minimum supported Rust version
that aligns with your project requirements, or remove the latest_stable_rust
feature from the tinyvec dependency configuration and replace it with a stable
feature set that is compatible with your MSRV.

In `@rust/cuvs/src/test_utils.rs`:
- Around line 21-27: The DeviceTensor struct stores a raw ffi::cuvsResources_t
handle without tying its lifetime to the underlying Resources object, allowing
DeviceTensor to outlive Resources and cause a use-after-free when Drop calls
cuvsRMMFree with an invalid handle. Add a lifetime parameter to the DeviceTensor
struct definition (around line 21) and bind the resources field to that lifetime
by storing a reference instead of a raw pointer. Update the Drop implementation
(around lines 30-43) to work with the borrowed reference instead of the raw
handle. Ensure all construction sites of DeviceTensor (around lines 116-120)
properly pass a reference with the correct lifetime to guarantee the resource
stays valid for the entire lifetime of the DeviceTensor instance.
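One way the suggested lifetime binding could look is sketched below. `Resources` here is a mock that counts live allocations instead of calling RMM, and the reduced `DeviceTensor<'r>` has no data pointer, so this models only the borrow relationship and the `Drop` ordering, not the actual test utility.

```rust
use std::cell::Cell;

/// Mock of the resources handle; counts live allocations.
pub struct Resources {
    live: Cell<usize>,
}

/// The tensor borrows the resources it was allocated from.
pub struct DeviceTensor<'r> {
    res: &'r Resources,
    len: usize,
}

impl<'r> DeviceTensor<'r> {
    pub fn zeros(res: &'r Resources, len: usize) -> Self {
        // Stand-in for the device allocation.
        res.live.set(res.live.get() + 1);
        DeviceTensor { res, len }
    }

    pub fn len(&self) -> usize {
        self.len
    }
}

impl Drop for DeviceTensor<'_> {
    fn drop(&mut self) {
        // The borrow guarantees `res` is still alive when we free.
        self.res.live.set(self.res.live.get() - 1);
    }
}

/// Returns the live-allocation count while a tensor exists and after it drops.
pub fn live_counts() -> (usize, usize) {
    let res = Resources { live: Cell::new(0) };
    let during = {
        let tensor = DeviceTensor::zeros(&res, 16);
        debug_assert_eq!(tensor.len(), 16);
        res.live.get()
    };
    (during, res.live.get())
}
```

With the reference in place, dropping `res` while a `DeviceTensor` is still alive becomes a compile error instead of a use-after-free in `Drop`.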

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 1e292a2b-8b54-4ea0-84e8-da899f2498fb

📥 Commits

Reviewing files that changed from the base of the PR and between 6672103 and 1bc3436.

📒 Files selected for processing (18)

rust/cuvs/Cargo.toml
rust/cuvs/examples/cagra.rs
rust/cuvs/src/brute_force.rs
rust/cuvs/src/cagra/index.rs
rust/cuvs/src/cagra/mod.rs
rust/cuvs/src/cluster/kmeans/mod.rs
rust/cuvs/src/distance/mod.rs
rust/cuvs/src/dlpack.rs
rust/cuvs/src/error.rs
rust/cuvs/src/ivf_flat/index.rs
rust/cuvs/src/ivf_flat/mod.rs
rust/cuvs/src/ivf_pq/index.rs
rust/cuvs/src/ivf_pq/mod.rs
rust/cuvs/src/lib.rs
rust/cuvs/src/resources.rs
rust/cuvs/src/test_utils.rs
rust/cuvs/src/vamana/index.rs
rust/cuvs/src/vamana/mod.rs

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

rust/cuvs/src/brute_force.rs (1)
147-148: ⚠️ Potential issue | 🟠 Major

Change neighbors type from i64 to u32 to match FFI contract.

The FFI documentation for cuvsBruteForceSearch explicitly requires neighbors to have type kDLUInt with 32 bits (i.e., u32). Using i64 violates this contract and risks silent data corruption if the C implementation writes 32-bit values to a 64-bit buffer. CAGRA tests correctly use u32.

Update both allocations:
Suggested changes
-        let mut neighbors_host = ndarray::Array::<i64, _>::zeros((n_queries, k));
-        let mut neighbors = DeviceTensor::<i64>::zeros(&res, &[n_queries, k]).unwrap();
+        let mut neighbors_host = ndarray::Array::<u32, _>::zeros((n_queries, k));
+        let mut neighbors = DeviceTensor::<u32>::zeros(&res, &[n_queries, k]).unwrap();
Also update any assertions that compare or cast neighbors values accordingly.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@rust/cuvs/src/brute_force.rs` around lines 147 - 148, The `neighbors` and
`neighbors_host` tensor declarations are using `i64` type, but the FFI contract
for `cuvsBruteForceSearch` requires `u32`. Update the type parameter from `i64`
to `u32` in both the `neighbors_host` ndarray allocation and the `neighbors`
DeviceTensor allocation. Additionally, search the file for any assertions,
comparisons, or casts involving `neighbors` values and update them to work
correctly with the new `u32` type instead of `i64`.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: e9417b5c-73db-45d8-99a4-ac58de5132e3

📥 Commits

Reviewing files that changed from the base of the PR and between 1bc3436 and 07d3e5c.

📒 Files selected for processing (6)

rust/cuvs/Cargo.toml
rust/cuvs/src/brute_force.rs
rust/cuvs/src/cagra/index.rs
rust/cuvs/src/cluster/kmeans/mod.rs
rust/cuvs/src/distance/mod.rs
rust/cuvs/src/test_utils.rs

🚧 Files skipped from review as they are similar to previous changes (3)

rust/cuvs/src/distance/mod.rs
rust/cuvs/src/test_utils.rs
rust/cuvs/src/cluster/kmeans/mod.rs

github-project-automation Bot added this to Unstructured Data Processing Jun 15, 2026

yan-zaretskiy self-assigned this Jun 15, 2026

yan-zaretskiy added the breaking (Introduces a breaking change), Rust, and improvement (Improves an existing functionality) labels Jun 15, 2026

yan-zaretskiy marked this pull request as ready for review June 15, 2026 21:47

yan-zaretskiy requested a review from a team as a code owner June 15, 2026 21:47

coderabbitai Bot reviewed Jun 15, 2026

Comment thread rust/cuvs/Cargo.toml Outdated

Comment thread rust/cuvs/src/test_utils.rs Outdated

PR feedback

07d3e5c

coderabbitai Bot reviewed Jun 15, 2026

yan-zaretskiy wants to merge 2 commits into rapidsai:main from yan-zaretskiy:rust-dlpack
