Make IVFPQSearchCagraConfig dtype fields settable from Python (#5191)

nanlongyu · meta-codesync[bot] · commit bc490b56b089 · 2026-05-12T17:27:28.000-07:00
Summary: `IVFPQSearchCagraConfig.lut_dtype` and `.internal_distance_dtype` are typed as `cudaDataType_t` (a C enum from CUDA `<library_types.h>`). Without an int typemap SWIG generates accessors that read/write a `cudaDataType_t*` `SwigPyObject`, which Python users cannot construct, so the fields are effectively read-only from Python: ```python >>> import faiss >>> c = faiss.IVFPQSearchCagraConfig() >>> c.lut_dtype <Swig Object of type 'cudaDataType_t *' at 0x...> >>> c.lut_dtype = 2 # CUDA_R_16F TypeError: in method 'IVFPQSearchCagraConfig_lut_dtype_set', argument 2 of type 'cudaDataType_t' >>> faiss.CUDA_R_16F # also not exported AttributeError: module 'faiss' has no attribute 'CUDA_R_16F' ``` The doc comment on `IVFPQSearchCagraConfig.lut_dtype` (`faiss/gpu/GpuIndexCagra.h`) recommends low-precision LUT for large-dimension datasets — *"fast shared memory kernels can be used even for datasets with large dimensionality"* — but the Python binding does not let users act on that recommendation. After this PR: ```python c = faiss.IVFPQSearchCagraConfig() c.lut_dtype = faiss.CUDA_R_16F # or CUDA_R_8U / CUDA_R_32F c.internal_distance_dtype = faiss.CUDA_R_16F ``` The C++ ABI is unchanged; only the SWIG wrapper for these existing fields changes. The typemap is added inside the `FAISS_ENABLE_CUVS` block because that is where `cudaDataType_t` is referenced. ## Why it matters `lut_dtype` controls whether cuVS' IVF-PQ `compute_similarity_kernel` keeps the LUT in shared memory or falls back to a global-memory variant. A quick before/after on an A10 (`MaxSharedMemoryPerBlockOptin = 99 KiB`) running the CAGRA build path on a 768D / 1M-vector dataset (`pq_dim=192`, `pq_bits=8`): | `lut_dtype` | LutT in kernel template | EnableSMemLut | Per-kernel duration | |------------------------|-------------------------|---------------|---------------------| | `CUDA_R_32F` (default) | `float` | 0 | 187 ms | | `CUDA_R_16F` | `__half` | 0 | 166 ms | | `CUDA_R_8U` | `fp_8bit<5, 1>` | **1** | **119 ms** | End-to-end CAGRA `gpu_build` time on the same dataset goes from ~45 s (default) to ~26 s (`CUDA_R_8U`). Recall@100 delta on the converted HNSW index, normalized input, ef_search=100, 1000 queries: −0.25 pp for fp8, +0.09 pp for fp16, both within run-to-run noise. This is the lever the doc comment is pointing at; before this PR it just was not reachable from Python. Pull Request resolved: #5191 Test Plan: - New test: `faiss/gpu/test/test_lut_dtype_binding.py` (gated on `"CUVS" in faiss.get_compile_options()`). 5 cases — constants exported with the right values, default is `CUDA_R_32F`, set via `faiss.CUDA_R_*`, set via raw int, and field independence. All pass on a local `FAISS_ENABLE_CUVS=ON` build. - Existing `faiss/gpu/test/test_cagra.py` (18) and `faiss/gpu/test/test_binary_cagra.py` (4) unchanged, all pass. - Sanity sweep on the patched build of `tests/test_factory.py + test_index.py + test_index_accuracy.py + test_io.py + test_product_quantizer.py + test_fast_scan.py + test_fast_scan_ivf.py`: 407 pass, 1 unrelated skip, 0 fail. ## Out of scope This PR does **not** change the C++ default of `lut_dtype`. cuVS' own CAGRA wrapper (`cpp/include/cuvs/neighbors/graph_build_types.hpp`, `cagra::graph_build_params::ivf_pq_params` constructor) defaults to `CUDA_R_16F`, but aligning faiss with that default is a behaviour change that deserves more cross-GPU and cross-dataset validation than this PR carries. Best handled as a follow-up. ## Related There is no prior issue or PR tracking this — searched the repository issues, PRs, and the web. Happy to file a tracker issue if reviewers prefer that pattern. Reviewed By: mnorris11 Differential Revision: D104891046 Pulled By: bshethmeta fbshipit-source-id: 96b550957fd01a8f2c249228b835ba7c42ef3abd
diff --git a/faiss/gpu/test/test_lut_dtype_binding.py b/faiss/gpu/test/test_lut_dtype_binding.py
@@ -0,0 +1,60 @@
+# Copyright (c) Meta Platforms, Inc. and affiliates.
+#
+# This source code is licensed under the MIT license found in the
+# LICENSE file in the root directory of this source tree.
+
+import unittest
+
+import faiss
+
+
+@unittest.skipIf(
+    "CUVS" not in faiss.get_compile_options(),
+    "only if cuVS is compiled in")
+class TestIVFPQSearchCagraConfigDtypes(unittest.TestCase):
+    """Regression tests for the SWIG int typemap on cudaDataType_t.
+
+    Before the typemap was added, IVFPQSearchCagraConfig.lut_dtype and
+    .internal_distance_dtype were exposed as SwigPyObject pointers that
+    Python users could neither construct nor assign to, and the
+    CUDA_R_* enum values were not exported. These tests pin the
+    behavior that they are now plain ints settable from Python.
+    """
+
+    def test_constants_exported(self):
+        self.assertEqual(faiss.CUDA_R_32F, 0)
+        self.assertEqual(faiss.CUDA_R_64F, 1)
+        self.assertEqual(faiss.CUDA_R_16F, 2)
+        self.assertEqual(faiss.CUDA_R_8I, 3)
+        self.assertEqual(faiss.CUDA_R_8U, 8)
+
+    def test_default_lut_dtype_is_fp32(self):
+        c = faiss.IVFPQSearchCagraConfig()
+        self.assertEqual(c.lut_dtype, faiss.CUDA_R_32F)
+        self.assertEqual(c.internal_distance_dtype, faiss.CUDA_R_32F)
+
+    def test_set_lut_dtype_via_constants(self):
+        c = faiss.IVFPQSearchCagraConfig()
+        for value in (
+            faiss.CUDA_R_16F,
+            faiss.CUDA_R_8U,
+            faiss.CUDA_R_32F,
+        ):
+            c.lut_dtype = value
+            self.assertEqual(c.lut_dtype, value)
+
+    def test_set_lut_dtype_via_raw_int(self):
+        c = faiss.IVFPQSearchCagraConfig()
+        c.lut_dtype = 2  # CUDA_R_16F
+        self.assertEqual(c.lut_dtype, 2)
+
+    def test_dtype_fields_are_independent(self):
+        c = faiss.IVFPQSearchCagraConfig()
+        c.lut_dtype = faiss.CUDA_R_8U
+        c.internal_distance_dtype = faiss.CUDA_R_16F
+        self.assertEqual(c.lut_dtype, faiss.CUDA_R_8U)
+        self.assertEqual(c.internal_distance_dtype, faiss.CUDA_R_16F)
+
+
+if __name__ == "__main__":
+    unittest.main()
diff --git a/faiss/python/swigfaiss.swig b/faiss/python/swigfaiss.swig
@@ -727,6 +727,25 @@ void gpu_sync_all_devices()
 %include  <faiss/gpu/GpuClonerOptions.h>
 %include  <faiss/gpu/GpuIndex.h>
 #ifdef FAISS_ENABLE_CUVS
+// IVFPQSearchCagraConfig.lut_dtype and .internal_distance_dtype are typed as
+// cudaDataType_t (a C enum from <library_types.h>). Without an int typemap
+// SWIG generates accessors that read/write a cudaDataType_t* SwigPyObject,
+// which Python users cannot construct, so the fields are effectively
+// read-only from Python and the documented values (CUDA_R_16F, CUDA_R_8U)
+// are not exported. Treat cudaDataType_t as int for SWIG and expose the
+// relevant enum values so:
+//
+//     c = faiss.IVFPQSearchCagraConfig()
+//     c.lut_dtype = faiss.CUDA_R_16F
+//
+// works as expected. Values match cudaDataType_t in <library_types.h>.
+typedef int cudaDataType_t;
+%apply int { cudaDataType_t };
+%constant int CUDA_R_32F = 0;
+%constant int CUDA_R_64F = 1;
+%constant int CUDA_R_16F = 2;
+%constant int CUDA_R_8I  = 3;
+%constant int CUDA_R_8U  = 8;
 %include  <faiss/gpu/GpuIndexCagra.h>
 %include  <faiss/gpu/GpuIndexBinaryCagra.h>
 #endif