Commit ccd3761

algoriddle authored and meta-codesync[bot] committed
Add FastScanCodeScanner dispatch boundary with per-SIMD TUs
Summary: Add `FastScanCodeScanner`, a virtual base that bundles handler + kernel behind the SIMD dispatch boundary.

In DD mode, `SINGLE_SIMD_LEVEL = NONE`, so the existing fast scan code path uses emulated SIMD types. The new scanner provides per-SIMD translation units (AVX2, AVX512, ARM_NEON) compiled with the correct ISA flags, and a factory function (`make_fast_scan_knn_scanner`) that uses `DISPATCH_SIMDLevel` to select the right TU at runtime.

This follows the proven `THE_LEVEL_TO_DISPATCH` pattern from the scalar quantizer per-SIMD TUs (`sq-dispatch.h`).

Each per-SIMD TU includes `dispatching.h`, which provides:
- `ScannerMixIn<Handler>`: wraps a concrete handler and calls accumulation kernels (both search_1 multi-BB and QBS paths)
- Factory specialization `make_fast_scan_scanner_impl<SL>()` with combinatorial dispatch over `is_max × with_id_map × handler_type` (SingleResultHandler for k=1, HeapHandler for k≤20, ReservoirHandler for k>20)

New files:
- `impl/fast_scan/dispatching.h`: dispatch template header
- `impl/fast_scan/impl-avx2.cpp`: AVX2 per-SIMD TU
- `impl/fast_scan/impl-avx512.cpp`: AVX512 per-SIMD TU
- `impl/fast_scan/impl-neon.cpp`: ARM NEON TU (with ARM_SVE forwarding)

Modified files:
- `impl/fast_scan/pq4_fast_scan.h`: FastScanCodeScanner base + factory decl
- `impl/fast_scan/pq4_fast_scan.cpp`: NONE specialization + dispatch wrapper
- `xplat.bzl` / `CMakeLists.txt`: register SIMD files and header

Note: RaBitQ handler is not wired through FastScanCodeScanner in this diff. That comes in later diffs when callers are switched.

Differential Revision: D95950483
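The dispatch boundary described in the summary can be sketched roughly as follows. This is an illustration only, not the code from this commit: `SIMDLevel`, `Scanner`, `ScannerImpl`, `make_scanner_impl`, `make_scanner`, `pick_handler`, and `HandlerKind` are hypothetical stand-ins for the real types and factories, the `is_max` and `with_id_map` dimensions are left out, and all instantiations sit in one file instead of separate per-ISA translation units. Only the k thresholds (k=1, k≤20, k>20) are taken from the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical stand-in for the library's SIMD level enum.
enum class SIMDLevel { NONE, AVX2, AVX512, ARM_NEON };

// Handler families named in the commit message.
enum class HandlerKind { SingleResult, Heap, Reservoir };

// Thresholds from the commit message: k=1, k<=20, k>20.
inline HandlerKind pick_handler(std::size_t k) {
    assert(k >= 1); // a k-NN search needs at least one result
    if (k == 1) return HandlerKind::SingleResult;
    if (k <= 20) return HandlerKind::Heap;
    return HandlerKind::Reservoir;
}

// Virtual base: callers hold this and never see SIMD-specific types.
struct Scanner {
    virtual ~Scanner() = default;
    virtual std::string describe() const = 0;
};

inline const char* level_name(SIMDLevel sl) {
    switch (sl) {
        case SIMDLevel::AVX2: return "AVX2";
        case SIMDLevel::AVX512: return "AVX512";
        case SIMDLevel::ARM_NEON: return "ARM_NEON";
        case SIMDLevel::NONE: break;
    }
    return "NONE";
}

inline const char* handler_name(HandlerKind h) {
    switch (h) {
        case HandlerKind::SingleResult: return "single";
        case HandlerKind::Heap: return "heap";
        case HandlerKind::Reservoir: break;
    }
    return "reservoir";
}

// One instantiation per SIMD level. In a per-SIMD TU layout, each
// instantiation would live in its own .cpp built with that ISA's flags;
// here they share one file so the sketch compiles anywhere.
template <SIMDLevel SL>
struct ScannerImpl : Scanner {
    HandlerKind handler;
    explicit ScannerImpl(HandlerKind h) : handler(h) {}
    std::string describe() const override {
        return std::string(level_name(SL)) + "/" + handler_name(handler);
    }
};

// Per-level factory (assumed shape): picks the handler from k.
template <SIMDLevel SL>
std::unique_ptr<Scanner> make_scanner_impl(std::size_t k) {
    return std::make_unique<ScannerImpl<SL>>(pick_handler(k));
}

// Runtime dispatch: map the detected level to the matching instantiation.
inline std::unique_ptr<Scanner> make_scanner(SIMDLevel level, std::size_t k) {
    switch (level) {
        case SIMDLevel::AVX2: return make_scanner_impl<SIMDLevel::AVX2>(k);
        case SIMDLevel::AVX512: return make_scanner_impl<SIMDLevel::AVX512>(k);
        case SIMDLevel::ARM_NEON: return make_scanner_impl<SIMDLevel::ARM_NEON>(k);
        case SIMDLevel::NONE: break;
    }
    return make_scanner_impl<SIMDLevel::NONE>(k); // emulated fallback
}
```

In this sketch the caller only touches `make_scanner` and the `Scanner` interface, so code compiled without ISA flags never instantiates SIMD-specific templates. That separation is the point of placing a virtual base at the dispatch boundary.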
1 parent 86c3618 commit ccd3761

7 files changed

Lines changed: 581 additions & 0 deletions

faiss/CMakeLists.txt

Lines changed: 4 additions & 0 deletions
@@ -9,16 +9,19 @@
 # Architecture-specific: only include files for the current build architecture
 # =============================================================================
 set(FAISS_SIMD_AVX2_SRC
+  impl/fast_scan/impl-avx2.cpp
   impl/pq_code_distance/pq_code_distance-avx2.cpp
   impl/scalar_quantizer/sq-avx2.cpp
   utils/simd_impl/distances_avx2.cpp
 )
 set(FAISS_SIMD_AVX512_SRC
+  impl/fast_scan/impl-avx512.cpp
   impl/pq_code_distance/pq_code_distance-avx512.cpp
   impl/scalar_quantizer/sq-avx512.cpp
   utils/simd_impl/distances_avx512.cpp
 )
 set(FAISS_SIMD_NEON_SRC
+  impl/fast_scan/impl-neon.cpp
   impl/scalar_quantizer/sq-neon.cpp
   utils/simd_impl/distances_aarch64.cpp
 )
@@ -262,6 +265,7 @@ set(FAISS_HEADERS
   impl/kmeans1d.h
   impl/lattice_Zn.h
   impl/platform_macros.h
+  impl/fast_scan/dispatching.h
   impl/fast_scan/pq4_fast_scan.h
   impl/fast_scan/decompose_qbs.h
   impl/fast_scan/kernels_simd256.h
