Commit ccc934f

algoriddle authored and meta-codesync[bot] committed
ScalarQuantizer: split SIMD specializations into per-SIMD TUs + DD dispatch (#4839)
Summary:
Pull Request resolved: #4839

Split the SIMD-gated template specializations out of ScalarQuantizer.cpp and the shared headers into per-SIMD compilation units and wire up the Dynamic Dispatch (DD) infrastructure (COMPILE_SIMD_*, with_simd_level).

**What moved where**
- SIMD specializations removed from `codecs.h`, `quantizers.h`, `similarities.h`, `distance_computers.h`: these now contain only primary templates and scalar (`SIMDLevel::NONE`) specializations. (Most use empty primary templates; `quantizers.h` uses an inheriting fallback pattern for `QuantizerFP16`, `QuantizerBF16`, etc.)
- SIMD specializations moved into `sq-avx2.cpp` / `sq-avx512.cpp` / `sq-neon.cpp`, each guarded by `COMPILE_SIMD_*`.
- `sq-generic.cpp` deleted; the `NONE` level is now instantiated directly in `ScalarQuantizer.cpp` via `sq-dispatch.h`.
- `sq-inl.h` renamed to `scanners.h`.

**Dispatch mechanism**
- `sq-dispatch.h` is an X-macro-style header: each per-SIMD `.cpp` file `#define`s `THE_LEVEL_TO_DISPATCH` and `#include`s it to stamp out explicit template specializations of the selection functions (`sq_select_quantizer`, `sq_select_distance_computer`, `sq_select_InvertedListScanner`).
- `ScalarQuantizer.cpp` uses `with_simd_level` for runtime dispatch and instantiates the `NONE` level via the same `sq-dispatch.h`.
- Each per-SIMD selection function returns `nullptr` when the dimension doesn't align, and the caller falls back to `NONE`.
- `sq-neon.cpp` handles both `ARM_NEON` and `ARM_SVE` (SVE forwards to NEON; there is no dedicated SVE SQ implementation yet).

**Build**
- `xplat.bzl`, `CMakeLists.txt`: register new SIMD source files and headers.
- Within the SQ module, `COMPILE_SIMD_*` macros gate all SIMD code paths. (Compiler-defined macros like `__AVX2__` are still used in lower-level shared headers like `simdlib.h` and `fp16.h`.)

Reviewed By: mdouze

Differential Revision: D94375408

fbshipit-source-id: a07c31540242defcc605dd74e07bd25b8c163f43
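The dispatch mechanism described in the summary can be sketched in a single file. This is a minimal illustration under stated assumptions, not the Faiss code: `Quantizer`, the three `*Impl` classes, `select_quantizer`, `select_with_fallback`, the `STAMP_SELECT` macro, and the alignment widths 8 and 16 are all hypothetical. The macro stands in for including `sq-dispatch.h` after defining `THE_LEVEL_TO_DISPATCH`, and the `switch` stands in for `with_simd_level`, whose real signature is not shown on this page.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical stand-in for the SIMD level enum.
enum class SIMDLevel { NONE, AVX2, AVX512 };

// Hypothetical interface; name() only reports which level was selected.
struct Quantizer {
    virtual ~Quantizer() = default;
    virtual std::string name() const = 0;
};
struct ScalarImpl : Quantizer {
    std::string name() const override { return "NONE"; }
};
struct Avx2Impl : Quantizer {
    std::string name() const override { return "AVX2"; }
};
struct Avx512Impl : Quantizer {
    std::string name() const override { return "AVX512"; }
};

// Primary template is declared only; every level supplies an explicit
// specialization, so an unsupported level fails at link time.
template <SIMDLevel L>
std::unique_ptr<Quantizer> select_quantizer(std::size_t d);

// Stamps out one explicit specialization per level. Each one returns
// nullptr when the dimension d is not a multiple of WIDTH (assumed widths).
#define STAMP_SELECT(LEVEL, IMPL, WIDTH)                           \
    template <>                                                    \
    std::unique_ptr<Quantizer> select_quantizer<SIMDLevel::LEVEL>( \
            std::size_t d) {                                       \
        if (d % (WIDTH) != 0) {                                    \
            return nullptr;                                        \
        }                                                          \
        return std::make_unique<IMPL>();                           \
    }

STAMP_SELECT(NONE, ScalarImpl, 1)
STAMP_SELECT(AVX2, Avx2Impl, 8)
STAMP_SELECT(AVX512, Avx512Impl, 16)

// Runtime dispatch: try the requested level, then fall back to NONE
// (not to an intermediate level) when the selector returns nullptr.
std::unique_ptr<Quantizer> select_with_fallback(SIMDLevel level,
                                                std::size_t d) {
    std::unique_ptr<Quantizer> q;
    switch (level) {
        case SIMDLevel::AVX512:
            q = select_quantizer<SIMDLevel::AVX512>(d);
            break;
        case SIMDLevel::AVX2:
            q = select_quantizer<SIMDLevel::AVX2>(d);
            break;
        case SIMDLevel::NONE:
            break;
    }
    if (!q) {
        q = select_quantizer<SIMDLevel::NONE>(d);
    }
    assert(q && "the NONE level accepts every dimension");
    return q;
}
```

With these assumed widths, `select_with_fallback(SIMDLevel::AVX512, 24)` yields the scalar implementation, because 24 is not a multiple of 16 and the fallback goes straight to `NONE`. In the real layout each stamped specialization would live in its own translation unit so that it can be compiled with that level's flags.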
1 parent db9ba35 commit ccc934f

14 files changed

Lines changed: 1980 additions & 1699 deletions

faiss/CMakeLists.txt

Lines changed: 6 additions & 1 deletion
@@ -10,13 +10,16 @@
 # =============================================================================
 set(FAISS_SIMD_AVX2_SRC
   impl/pq_code_distance/pq_code_distance-avx2.cpp
+  impl/scalar_quantizer/sq-avx2.cpp
   utils/simd_impl/distances_avx2.cpp
 )
 set(FAISS_SIMD_AVX512_SRC
   impl/pq_code_distance/pq_code_distance-avx512.cpp
+  impl/scalar_quantizer/sq-avx512.cpp
   utils/simd_impl/distances_avx512.cpp
 )
 set(FAISS_SIMD_NEON_SRC
+  impl/scalar_quantizer/sq-neon.cpp
   utils/simd_impl/distances_aarch64.cpp
 )
 set(FAISS_SIMD_SVE_SRC
@@ -246,7 +249,9 @@ set(FAISS_HEADERS
   impl/scalar_quantizer/codecs.h
   impl/scalar_quantizer/distance_computers.h
   impl/scalar_quantizer/quantizers.h
+  impl/scalar_quantizer/scanners.h
   impl/scalar_quantizer/similarities.h
+  impl/scalar_quantizer/sq-dispatch.h
   impl/scalar_quantizer/training.h
   impl/ThreadedIndex-inl.h
   impl/ThreadedIndex.h
@@ -440,7 +445,7 @@ if(FAISS_OPT_LEVEL STREQUAL "dd")
     set_source_files_properties(${FAISS_SIMD_AVX512_SRC}
       TARGET_DIRECTORY faiss
       PROPERTIES COMPILE_OPTIONS
-        "-mavx512f;-mavx512cd;-mavx512vl;-mavx512dq;-mavx512bw;-mpopcnt"
+        "-mavx512f;-mavx512cd;-mavx512vl;-mavx512dq;-mavx512bw;-mfma;-mf16c;-mpopcnt"
     )
   elseif(CMAKE_SYSTEM_PROCESSOR MATCHES "(aarch64|arm64|ARM64)")
     # ARM NEON is always available on aarch64, no special compiler flags needed
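
The commit does not say why `-mfma` and `-mf16c` were added to the AVX512 sources. One plausible reading is that the new `sq-avx512.cpp` reaches FMA and half-precision conversion code paths, which GCC and Clang only enable when the translation unit is built with those flags. The sketch below shows how a translation unit can observe its own flags through the compiler-predefined macros `__AVX512F__`, `__FMA__`, `__F16C__`, and `__ARM_NEON`. The `kHas*` constants and `compiled_isa_summary` are hypothetical names for illustration.

```cpp
#include <string>

// Compile-time view of the flags this translation unit was built with.
// GCC and Clang predefine these macros when the matching -m flag is set
// (or, for __ARM_NEON, when targeting aarch64).
#if defined(__AVX512F__)
constexpr bool kHasAvx512f = true;
#else
constexpr bool kHasAvx512f = false;
#endif

#if defined(__FMA__)
constexpr bool kHasFma = true;
#else
constexpr bool kHasFma = false;
#endif

#if defined(__F16C__)
constexpr bool kHasF16c = true;
#else
constexpr bool kHasF16c = false;
#endif

#if defined(__ARM_NEON)
constexpr bool kHasNeon = true;
#else
constexpr bool kHasNeon = false;
#endif

// Hypothetical helper: lists the instruction sets enabled for this file,
// each name followed by a single space.
std::string compiled_isa_summary() {
    std::string s;
    if (kHasAvx512f) s += "avx512f ";
    if (kHasFma) s += "fma ";
    if (kHasF16c) s += "f16c ";
    if (kHasNeon) s += "neon ";
    return s;
}
```

Because the flags are set per source file, the same helper compiled into two different translation units can report different results, which is why the SIMD code has to live in the files listed in `FAISS_SIMD_AVX512_SRC` rather than in shared headers.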
