Skip to content

Conversation

@jeremywgleeson
Copy link

Summary

This PR adds filter support to the IVF-PQ C API (which previously had no filtering capability) and adds BITMAP filter support to the CAGRA and IVF-Flat C API (which only supported BITSET filters). Additionally, it adds strong filter support to the Rust crate, which was previously not exposed.

Changes

  1. IVF-PQ C API - Add Filter Support

    • Add cuvsFilter filter parameter to cuvsIvfPqSearch
    • Implement BITMAP (per-query) and BITSET (global) filter support
    • IVF-PQ previously had no filtering capability in the C API
  2. IVF-Flat C API - Add BITMAP Filter Support

    • Add BITMAP filter support to cuvsIvfFlatSearch
    • IVF-Flat previously only supported BITSET filters, now supports both
    • Matches full filtering functionality available in the C++ API
  3. CAGRA C API - Add BITMAP Filter Support

    • Add BITMAP filter support to cuvsCagraSearch
    • CAGRA previously only supported BITSET filters, now supports both
    • Matches full filtering functionality available in the C++ API
  4. Comprehensive Test Coverage

    • Add filtered search tests for IVF-PQ (ann_ivf_pq_c.cu)
    • Add filtered search tests for IVF-Flat (ann_ivf_flat_c.cu)
    • Add filtered search tests for CAGRA (ann_cagra_c.cu)
    • Each test suite includes both BITSET and BITMAP filter validation
    • Tests verify that filtered results respect the exclusion criteria
  5. Rust Language Bindings - Complete Filter Support

    • Add new filters.rs module with comprehensive filter utilities
    • BITSET Helpers:
      • bitset_from_excluded_indices() - Create global filter from excluded indices
      • bitset_from_included_indices() - Create global filter from included indices
    • BITMAP Helpers:
      • bitmap_from_excluded_indices() - Create per-query filters from excluded indices
      • bitmap_from_included_indices() - Create per-query filters from included indices
    • All functions follow idiomatic Rust patterns with proper error handling
    • Memory-safe wrappers around DLPack tensors for filter data
    • Comprehensive documentation and examples for each function
  6. Other Language Bindings

    • Update Python bindings to accept optional filter parameter
    • Update Go bindings to pass NO_FILTER by default

Backward Compatibility

C API - Breaking Changes:

  • ⚠️ BREAKING: cuvsIvfPqSearch now requires an additional trailing cuvsFilter filter parameter
  • Existing C code calling IVF-PQ search must be updated to pass a filter parameter (use {.type = NO_FILTER, .addr = (uintptr_t)NULL} for no filtering)
  • Note: IVF-Flat and CAGRA already had the filter parameter, so no breaking changes for those

(Based on #664 , it seems that this type of change is not considered breaking?)

Rust API - Breaking Changes:

  • ⚠️ BREAKING: Search function signatures updated to include filter parameter
  • Existing Rust code must be updated to pass a filter (use appropriate helper functions from new filters module or None)

Python API - Non-Breaking:

  • Filter parameter is optional with default value filter=None
  • Filter placed after resources parameter to maintain backward compatibility for positional arguments
  • Existing Python code continues to work unchanged

Go API - Non-Breaking:

  • Filter automatically initialized to NO_FILTER internally
  • Existing Go code continues to work unchanged

Testing

All new functionality is covered by unit tests that:

  • Create random datasets and queries
  • Apply filters to exclude even-indexed vectors (pattern: 0xAAAAAAAA)
  • Verify all returned neighbors are odd-indexed
  • Test both BITMAP and BITSET filter modes for each index type

Closes #1464

jeremywgleeson and others added 5 commits October 29, 2025 15:40
Add cuvsFilter parameter to IVF_PQ, IVF_Flat, CAGRA, and TieredIndex search functions.
- IVF_PQ: Add BITMAP and BITSET filter support
- IVF_Flat: Add BITMAP and BITSET filter support
- CAGRA: Add BITMAP and BITSET filter support
- TieredIndex: Add BITSET filter support (BITMAP not supported)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
- C API: Add cuvsFilter parameter to IVF_PQ, IVF_Flat, CAGRA, TieredIndex search
- Rust: Add Filter trait with NoFilter, BitmapFilter, BitsetFilter implementations
- Python: Add filter parameter to search functions
- Go: Add filter creation and passing to C API
- Tests: Add comprehensive filter tests for all index types (ann_tiered_index_c.cu)
- Update function signatures and documentation examples

Based on upstream/main to avoid formatting changes
This commit adds comprehensive test coverage for bitmap and bitset
filtering functionality across multiple index types:

- IVF-PQ: Added BuildSearchBitsetFiltered and BuildSearchBitmapFiltered tests
- IVF-Flat: Added BuildSearchBitsetFiltered and BuildSearchBitmapFiltered tests
- CAGRA: Replaced BuildSearchFiltered with BuildSearchBitsetFiltered and added BuildSearchBitmapFiltered
- Added TIERED_INDEX_C_TEST configuration to CMakeLists.txt

All tests verify correct filter behavior by creating filters that remove
even-indexed vectors and asserting all returned neighbors are odd-indexed.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Move filter parameter to end of search() signature to match CAGRA and
IVF-Flat, preventing breakage of existing code using positional args.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Remove TieredIndex BITMAP/BITSET filter changes and tests as they are
unrelated to the IVF-PQ, IVF-Flat, and CAGRA filter additions.

Preserved in branch: feat/tiered-index-filters

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
@copy-pr-bot
Copy link

copy-pr-bot bot commented Oct 29, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@cjnolet cjnolet added improvement Improves an existing functionality non-breaking Introduces a non-breaking change labels Oct 30, 2025
@cjnolet
Copy link
Member

cjnolet commented Oct 30, 2025

/ok to test 6ee63bf

@cjnolet
Copy link
Member

cjnolet commented Oct 30, 2025

Thanks so much for the contribution @jeremywgleeson! This is a few important features that we've had on our roadmap and we really appreciate your help here.

Remove test configuration for ann_tiered_index_c.cu which was deleted
in the TieredIndex revert commit.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
@cjnolet
Copy link
Member

cjnolet commented Oct 30, 2025

/ok to test 577d5de

jeremywgleeson and others added 3 commits October 30, 2025 10:10
- Update copyright header to SPDX format
- Apply cargo fmt to function signatures

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Add #include <cuvs/neighbors/common.h> to provide cuvsFilter type
definition required by cuvsIvfPqSearch function signature.

Fixes compilation errors:
- error: unknown type name 'cuvsFilter'
- error: 'NO_FILTER' undeclared
- error: 'cuvsFilter' has not been declared

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
@benfred
Copy link
Member

benfred commented Oct 30, 2025

/ok to test 760b3d1

jeremywgleeson and others added 2 commits October 30, 2025 12:59
Update copyright headers for files modified in this PR to include 2025.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Add NO_FILTER initialization to cuvsIvfPqSearch call in example to
match updated API signature.

Fixes compilation error:
error: too few arguments to function 'cuvsIvfPqSearch'

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
@jeremywgleeson jeremywgleeson requested a review from a team as a code owner October 30, 2025 22:22
Copy link
Member

@benfred benfred left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The C++ cagra and ivf-* search code doesn't currently support using a bitmap filter - and passing one to it should throw an exception saying Unsupported sample filter type right now. Do the tests that you've added for the bitmap filters for cagra and ivf-pq/ivf-flat work for you?

We'd love to add this bitmap support to the cagra and ivf indices, but the challenge right now is to do this without increasing the binary size of the libcuvs.so (which has the filter type as a template parameter to the search methods).

I'm wondering if we should break up this PR into multiple changes - and just have the bitset changes for ivf-pq and the rust api in this PR?

@benfred
Copy link
Member

benfred commented Oct 31, 2025

/ok to test e6feca5

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

improvement Improves an existing functionality non-breaking Introduces a non-breaking change

Projects

Development

Successfully merging this pull request may close these issues.

[FEA] support filtering in c/downstream APIs

3 participants