Commit bc490b5
Make IVFPQSearchCagraConfig dtype fields settable from Python (#5191)
Summary:
`IVFPQSearchCagraConfig.lut_dtype` and `.internal_distance_dtype` are
typed as `cudaDataType_t` (a C enum from CUDA `<library_types.h>`).
Without an int typemap SWIG generates accessors that read/write a
`cudaDataType_t*` `SwigPyObject`, which Python users cannot construct,
so the fields are effectively read-only from Python:
```python
>>> import faiss
>>> c = faiss.IVFPQSearchCagraConfig()
>>> c.lut_dtype
<Swig Object of type 'cudaDataType_t *' at 0x...>
>>> c.lut_dtype = 2 # CUDA_R_16F
TypeError: in method 'IVFPQSearchCagraConfig_lut_dtype_set',
argument 2 of type 'cudaDataType_t'
>>> faiss.CUDA_R_16F # also not exported
AttributeError: module 'faiss' has no attribute 'CUDA_R_16F'
```
The doc comment on `IVFPQSearchCagraConfig.lut_dtype`
(`faiss/gpu/GpuIndexCagra.h`) recommends low-precision LUT for
large-dimension datasets — *"fast shared memory kernels can be used
even for datasets with large dimensionality"* — but the Python binding
does not let users act on that recommendation.
After this PR:
```python
c = faiss.IVFPQSearchCagraConfig()
c.lut_dtype = faiss.CUDA_R_16F # or CUDA_R_8U / CUDA_R_32F
c.internal_distance_dtype = faiss.CUDA_R_16F
```
The C++ ABI is unchanged; only the SWIG wrapper for these existing
fields changes. The typemap is added inside the `FAISS_ENABLE_CUVS`
block because that is where `cudaDataType_t` is referenced.
## Why it matters
`lut_dtype` controls whether cuVS' IVF-PQ `compute_similarity_kernel`
keeps the LUT in shared memory or falls back to a global-memory
variant. A quick before/after on an A10
(`MaxSharedMemoryPerBlockOptin = 99 KiB`) running the CAGRA build path
on a 768D / 1M-vector dataset (`pq_dim=192`, `pq_bits=8`):
| `lut_dtype` | LutT in kernel template | EnableSMemLut | Per-kernel duration |
|------------------------|-------------------------|---------------|---------------------|
| `CUDA_R_32F` (default) | `float` | 0 | 187 ms |
| `CUDA_R_16F` | `__half` | 0 | 166 ms |
| `CUDA_R_8U` | `fp_8bit<5, 1>` | **1** | **119 ms** |
End-to-end CAGRA `gpu_build` time on the same dataset goes from ~45 s
(default) to ~26 s (`CUDA_R_8U`). Recall@100 delta on the converted
HNSW index, normalized input, ef_search=100, 1000 queries:
−0.25 pp for fp8, +0.09 pp for fp16, both within run-to-run noise.
This is the lever the doc comment is pointing at; before this PR it
just was not reachable from Python.
Pull Request resolved: #5191
Test Plan:
- New test: `faiss/gpu/test/test_lut_dtype_binding.py` (gated on
`"CUVS" in faiss.get_compile_options()`). 5 cases — constants
exported with the right values, default is `CUDA_R_32F`, set via
`faiss.CUDA_R_*`, set via raw int, and field independence. All pass
on a local `FAISS_ENABLE_CUVS=ON` build.
- Existing `faiss/gpu/test/test_cagra.py` (18) and
`faiss/gpu/test/test_binary_cagra.py` (4) unchanged, all pass.
- Sanity sweep on the patched build of `tests/test_factory.py +
test_index.py + test_index_accuracy.py + test_io.py +
test_product_quantizer.py + test_fast_scan.py +
test_fast_scan_ivf.py`: 407 pass, 1 unrelated skip, 0 fail.
## Out of scope
This PR does **not** change the C++ default of `lut_dtype`. cuVS' own
CAGRA wrapper (`cpp/include/cuvs/neighbors/graph_build_types.hpp`,
`cagra::graph_build_params::ivf_pq_params` constructor) defaults to
`CUDA_R_16F`, but aligning faiss with that default is a behaviour
change that deserves more cross-GPU and cross-dataset validation than
this PR carries. Best handled as a follow-up.
## Related
There is no prior issue or PR tracking this — searched the repository
issues, PRs, and the web. Happy to file a tracker issue if reviewers
prefer that pattern.
Reviewed By: mnorris11
Differential Revision: D104891046
Pulled By: bshethmeta
fbshipit-source-id: 96b550957fd01a8f2c249228b835ba7c42ef3abd1 parent 4b5a735 commit bc490b5
2 files changed
Lines changed: 79 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
| 1 | + | |
| 2 | + | |
| 3 | + | |
| 4 | + | |
| 5 | + | |
| 6 | + | |
| 7 | + | |
| 8 | + | |
| 9 | + | |
| 10 | + | |
| 11 | + | |
| 12 | + | |
| 13 | + | |
| 14 | + | |
| 15 | + | |
| 16 | + | |
| 17 | + | |
| 18 | + | |
| 19 | + | |
| 20 | + | |
| 21 | + | |
| 22 | + | |
| 23 | + | |
| 24 | + | |
| 25 | + | |
| 26 | + | |
| 27 | + | |
| 28 | + | |
| 29 | + | |
| 30 | + | |
| 31 | + | |
| 32 | + | |
| 33 | + | |
| 34 | + | |
| 35 | + | |
| 36 | + | |
| 37 | + | |
| 38 | + | |
| 39 | + | |
| 40 | + | |
| 41 | + | |
| 42 | + | |
| 43 | + | |
| 44 | + | |
| 45 | + | |
| 46 | + | |
| 47 | + | |
| 48 | + | |
| 49 | + | |
| 50 | + | |
| 51 | + | |
| 52 | + | |
| 53 | + | |
| 54 | + | |
| 55 | + | |
| 56 | + | |
| 57 | + | |
| 58 | + | |
| 59 | + | |
| 60 | + | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
727 | 727 | | |
728 | 728 | | |
729 | 729 | | |
| 730 | + | |
| 731 | + | |
| 732 | + | |
| 733 | + | |
| 734 | + | |
| 735 | + | |
| 736 | + | |
| 737 | + | |
| 738 | + | |
| 739 | + | |
| 740 | + | |
| 741 | + | |
| 742 | + | |
| 743 | + | |
| 744 | + | |
| 745 | + | |
| 746 | + | |
| 747 | + | |
| 748 | + | |
730 | 749 | | |
731 | 750 | | |
732 | 751 | | |
| |||
0 commit comments