Commit bc490b5

nanlongyu authored and meta-codesync[bot] committed
Make IVFPQSearchCagraConfig dtype fields settable from Python (#5191)

Summary:
`IVFPQSearchCagraConfig.lut_dtype` and `.internal_distance_dtype` are typed as `cudaDataType_t` (a C enum from CUDA `<library_types.h>`). Without an int typemap SWIG generates accessors that read/write a `cudaDataType_t*` `SwigPyObject`, which Python users cannot construct, so the fields are effectively read-only from Python:

```python
>>> import faiss
>>> c = faiss.IVFPQSearchCagraConfig()
>>> c.lut_dtype
<Swig Object of type 'cudaDataType_t *' at 0x...>
>>> c.lut_dtype = 2  # CUDA_R_16F
TypeError: in method 'IVFPQSearchCagraConfig_lut_dtype_set', argument 2 of type 'cudaDataType_t'
>>> faiss.CUDA_R_16F  # also not exported
AttributeError: module 'faiss' has no attribute 'CUDA_R_16F'
```

The doc comment on `IVFPQSearchCagraConfig.lut_dtype` (`faiss/gpu/GpuIndexCagra.h`) recommends low-precision LUT for large-dimension datasets (*"fast shared memory kernels can be used even for datasets with large dimensionality"*), but the Python binding does not let users act on that recommendation.

After this PR:

```python
c = faiss.IVFPQSearchCagraConfig()
c.lut_dtype = faiss.CUDA_R_16F  # or CUDA_R_8U / CUDA_R_32F
c.internal_distance_dtype = faiss.CUDA_R_16F
```

The C++ ABI is unchanged; only the SWIG wrapper for these existing fields changes. The typemap is added inside the `FAISS_ENABLE_CUVS` block because that is where `cudaDataType_t` is referenced.

## Why it matters

`lut_dtype` controls whether cuVS' IVF-PQ `compute_similarity_kernel` keeps the LUT in shared memory or falls back to a global-memory variant.

A quick before/after on an A10 (`MaxSharedMemoryPerBlockOptin = 99 KiB`) running the CAGRA build path on a 768D / 1M-vector dataset (`pq_dim=192`, `pq_bits=8`):

| `lut_dtype`            | LutT in kernel template | EnableSMemLut | Per-kernel duration |
|------------------------|-------------------------|---------------|---------------------|
| `CUDA_R_32F` (default) | `float`                 | 0             | 187 ms              |
| `CUDA_R_16F`           | `__half`                | 0             | 166 ms              |
| `CUDA_R_8U`            | `fp_8bit<5, 1>`         | **1**         | **119 ms**          |

End-to-end CAGRA `gpu_build` time on the same dataset goes from ~45 s (default) to ~26 s (`CUDA_R_8U`). Recall@100 delta on the converted HNSW index, normalized input, ef_search=100, 1000 queries: −0.25 pp for fp8, +0.09 pp for fp16, both within run-to-run noise.

This is the lever the doc comment is pointing at; before this PR it just was not reachable from Python.

Pull Request resolved: #5191

Test Plan:
- New test: `faiss/gpu/test/test_lut_dtype_binding.py` (gated on `"CUVS" in faiss.get_compile_options()`). 5 cases: constants exported with the right values, default is `CUDA_R_32F`, set via `faiss.CUDA_R_*`, set via raw int, and field independence. All pass on a local `FAISS_ENABLE_CUVS=ON` build.
- Existing `faiss/gpu/test/test_cagra.py` (18) and `faiss/gpu/test/test_binary_cagra.py` (4) unchanged, all pass.
- Sanity sweep on the patched build of `tests/test_factory.py + test_index.py + test_index_accuracy.py + test_io.py + test_product_quantizer.py + test_fast_scan.py + test_fast_scan_ivf.py`: 407 pass, 1 unrelated skip, 0 fail.

## Out of scope

This PR does **not** change the C++ default of `lut_dtype`. cuVS' own CAGRA wrapper (`cpp/include/cuvs/neighbors/graph_build_types.hpp`, `cagra::graph_build_params::ivf_pq_params` constructor) defaults to `CUDA_R_16F`, but aligning faiss with that default is a behaviour change that deserves more cross-GPU and cross-dataset validation than this PR carries. Best handled as a follow-up.

## Related

There is no prior issue or PR tracking this; the repository issues, PRs, and the web were searched. Happy to file a tracker issue if reviewers prefer that pattern.

Reviewed By: mnorris11

Differential Revision: D104891046

Pulled By: bshethmeta

fbshipit-source-id: 96b550957fd01a8f2c249228b835ba7c42ef3abd
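
As a rough cross-check on the benchmark table in the commit message, the LUT footprint for that configuration can be estimated by hand. The sketch below assumes the LUT holds one entry per (PQ subspace, code value) pair, i.e. `pq_dim * 2**pq_bits` entries, with element sizes of 4, 2, and 1 bytes for fp32, fp16, and the 8-bit format. The helper name `lut_bytes` is hypothetical, and this is not cuVS's actual kernel-selection logic.

```python
# Back-of-the-envelope LUT footprint for the benchmark configuration.
# Assumption (not taken from cuVS source): the LUT stores one entry per
# (PQ subspace, code value) pair, i.e. pq_dim * 2**pq_bits entries.

KIB = 1024

# Element sizes in bytes for the three dtypes in the table.
ELEM_BYTES = {"CUDA_R_32F": 4, "CUDA_R_16F": 2, "CUDA_R_8U": 1}


def lut_bytes(pq_dim: int, pq_bits: int, elem_bytes: int) -> int:
    """Hypothetical helper: bytes needed for one lookup table."""
    return pq_dim * (1 << pq_bits) * elem_bytes


if __name__ == "__main__":
    smem_limit = 99 * KIB  # A10 MaxSharedMemoryPerBlockOptin, from the commit message
    for name, size in ELEM_BYTES.items():
        need = lut_bytes(192, 8, size)
        print(f"{name}: {need // KIB} KiB (limit {smem_limit // KIB} KiB)")
```

Under this estimate fp32 needs 192 KiB, fp16 needs 96 KiB, and the 8-bit format needs 48 KiB. The fp16 table alone would fit under 99 KiB, yet the table reports `EnableSMemLut = 0` for it, which suggests the kernel needs shared memory for more than the LUT. That reading is an inference, not something the commit states.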
1 parent 4b5a735 commit bc490b5

2 files changed

Lines changed: 79 additions & 0 deletions

faiss/gpu/test/test_lut_dtype_binding.py

Lines changed: 60 additions & 0 deletions
@@ -0,0 +1,60 @@
+# Copyright (c) Meta Platforms, Inc. and affiliates.
+#
+# This source code is licensed under the MIT license found in the
+# LICENSE file in the root directory of this source tree.
+
+import unittest
+
+import faiss
+
+
+@unittest.skipIf(
+    "CUVS" not in faiss.get_compile_options(),
+    "only if cuVS is compiled in")
+class TestIVFPQSearchCagraConfigDtypes(unittest.TestCase):
+    """Regression tests for the SWIG int typemap on cudaDataType_t.
+
+    Before the typemap was added, IVFPQSearchCagraConfig.lut_dtype and
+    .internal_distance_dtype were exposed as SwigPyObject pointers that
+    Python users could neither construct nor assign to, and the
+    CUDA_R_* enum values were not exported. These tests pin the
+    behavior that they are now plain ints settable from Python.
+    """
+
+    def test_constants_exported(self):
+        self.assertEqual(faiss.CUDA_R_32F, 0)
+        self.assertEqual(faiss.CUDA_R_64F, 1)
+        self.assertEqual(faiss.CUDA_R_16F, 2)
+        self.assertEqual(faiss.CUDA_R_8I, 3)
+        self.assertEqual(faiss.CUDA_R_8U, 8)
+
+    def test_default_lut_dtype_is_fp32(self):
+        c = faiss.IVFPQSearchCagraConfig()
+        self.assertEqual(c.lut_dtype, faiss.CUDA_R_32F)
+        self.assertEqual(c.internal_distance_dtype, faiss.CUDA_R_32F)
+
+    def test_set_lut_dtype_via_constants(self):
+        c = faiss.IVFPQSearchCagraConfig()
+        for value in (
+            faiss.CUDA_R_16F,
+            faiss.CUDA_R_8U,
+            faiss.CUDA_R_32F,
+        ):
+            c.lut_dtype = value
+            self.assertEqual(c.lut_dtype, value)
+
+    def test_set_lut_dtype_via_raw_int(self):
+        c = faiss.IVFPQSearchCagraConfig()
+        c.lut_dtype = 2  # CUDA_R_16F
+        self.assertEqual(c.lut_dtype, 2)
+
+    def test_dtype_fields_are_independent(self):
+        c = faiss.IVFPQSearchCagraConfig()
+        c.lut_dtype = faiss.CUDA_R_8U
+        c.internal_distance_dtype = faiss.CUDA_R_16F
+        self.assertEqual(c.lut_dtype, faiss.CUDA_R_8U)
+        self.assertEqual(c.internal_distance_dtype, faiss.CUDA_R_16F)
+
+
+if __name__ == "__main__":
+    unittest.main()

faiss/python/swigfaiss.swig

Lines changed: 19 additions & 0 deletions
@@ -727,6 +727,25 @@ void gpu_sync_all_devices()
 %include <faiss/gpu/GpuClonerOptions.h>
 %include <faiss/gpu/GpuIndex.h>
 #ifdef FAISS_ENABLE_CUVS
+// IVFPQSearchCagraConfig.lut_dtype and .internal_distance_dtype are typed as
+// cudaDataType_t (a C enum from <library_types.h>). Without an int typemap
+// SWIG generates accessors that read/write a cudaDataType_t* SwigPyObject,
+// which Python users cannot construct, so the fields are effectively
+// read-only from Python and the documented values (CUDA_R_16F, CUDA_R_8U)
+// are not exported. Treat cudaDataType_t as int for SWIG and expose the
+// relevant enum values so:
+//
+//     c = faiss.IVFPQSearchCagraConfig()
+//     c.lut_dtype = faiss.CUDA_R_16F
+//
+// works as expected. Values match cudaDataType_t in <library_types.h>.
+typedef int cudaDataType_t;
+%apply int { cudaDataType_t };
+%constant int CUDA_R_32F = 0;
+%constant int CUDA_R_64F = 1;
+%constant int CUDA_R_16F = 2;
+%constant int CUDA_R_8I = 3;
+%constant int CUDA_R_8U = 8;
 %include <faiss/gpu/GpuIndexCagra.h>
 %include <faiss/gpu/GpuIndexBinaryCagra.h>
 #endif
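
Because the typemap exposes these fields as plain ints, callers who prefer named values could keep a small enum on the Python side. The following is a minimal sketch that uses only the five values exported in the hunk above; the `CudaDataType` class and the `parse_dtype` helper are hypothetical and not part of faiss.

```python
from enum import IntEnum


class CudaDataType(IntEnum):
    """Hypothetical mirror of the five cudaDataType_t values exported above."""
    CUDA_R_32F = 0
    CUDA_R_64F = 1
    CUDA_R_16F = 2
    CUDA_R_8I = 3
    CUDA_R_8U = 8


def parse_dtype(value) -> int:
    """Accept a name ("CUDA_R_16F") or an int and return a plain int.

    Raises KeyError for an unknown name and ValueError for an int that is
    not one of the exported values.
    """
    if isinstance(value, str):
        return int(CudaDataType[value])
    return int(CudaDataType(value))


if __name__ == "__main__":
    print(parse_dtype("CUDA_R_8U"))
    print(parse_dtype(2))
```

On a cuVS-enabled build the result could then be assigned directly, for example `c.lut_dtype = parse_dtype("CUDA_R_8U")`, which mirrors the raw-int case covered by the new test.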
