Commit 28b2b66
Inline PQ code distance kernels into scanner TUs (#5159)
Summary:
Pull Request resolved: #5159
After the SIMD dispatch refactoring, the PQ code distance implementations
(pq_code_distance_single_impl<SL>, pq_code_distance_four_impl<SL>) lived in
separate translation units from the scanner loops (IVFPQScannerT::scan_list_with_table,
PQDistanceComputer::distance_to_code). The compiler could not inline the SIMD
gather/accumulate code into the hot inner loops.
This diff converts the per-SIMD pq_code_distance .cpp files to .h headers and
includes them in the corresponding scanner TUs before the scanner _impl.h
includes. This puts the kernel definitions in the same TU, enabling the compiler
to inline the AVX2/AVX512 vgatherdps code directly into scan_list_with_table and
scan_list_polysemous_hc.
Changes:
- pq_code_distance-avx2.cpp → pq_code_distance-avx2.h (header, #pragma once)
- pq_code_distance-avx512.cpp → pq_code_distance-avx512.h (same)
- New pq_code_distance-generic.h with inline NONE/ARM_NEON specializations
- Scanner TUs (avx2.cpp, avx512.cpp, neon.cpp) include the PQ distance headers
- PQCodeDistance gains static constexpr simd_level member
- scan_list_polysemous uses PQCodeDist::simd_level for Hamming computer dispatch
  (was hardcoded to SIMDLevel::NONE)
- Scanner TUs include per-ISA hamming_computer headers for SIMD Hamming dispatch
- Build files (xplat.bzl, CMakeLists.txt) updated
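
The `simd_level` change in the list above can be illustrated with a small sketch. All type and function names here (`PQCodeDistanceSketch`, `HammingComputerSketch`, `count_within_sketch`) are hypothetical, and the Hamming computer is a scalar placeholder rather than a real SIMD implementation. It only shows how a `static constexpr` member lets the scanner pick the Hamming computer that matches the PQ kernel's level instead of hardcoding `SIMDLevel::NONE`:

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the library's SIMD level enum.
enum class SIMDLevel { NONE, AVX2, AVX512 };

// Hypothetical Hamming computer, tagged with the level it was built for.
template <SIMDLevel SL>
struct HammingComputerSketch {
    static constexpr SIMDLevel level = SL;
    uint64_t a;
    explicit HammingComputerSketch(uint64_t q) : a(q) {}
    int hamming(uint64_t b) const {
        uint64_t x = a ^ b;
        int n = 0;
        while (x) {  // portable popcount: clear the lowest set bit each step
            x &= x - 1;
            ++n;
        }
        return n;
    }
};

// The PQ distance type exposes its level as a compile-time constant.
template <SIMDLevel SL>
struct PQCodeDistanceSketch {
    static constexpr SIMDLevel simd_level = SL;
};

// The scanner derives the Hamming computer type from the PQ distance type.
template <class PQCodeDist>
int count_within_sketch(
        uint64_t query,
        const uint64_t* codes,
        size_t n,
        int ht) {
    HammingComputerSketch<PQCodeDist::simd_level> hc(query);
    int kept = 0;
    for (size_t i = 0; i < n; ++i) {
        if (hc.hamming(codes[i]) < ht) {
            ++kept;
        }
    }
    return kept;
}
```

Because the level is a template argument, the choice is resolved at compile time and adds no runtime branch to the scan loop.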
Verified via objdump that scan_list_with_table and scan_list_polysemous_hc contain
zero calls to pq_code_distance_*_impl — all AVX2/AVX512 gather code is fully inlined.
Reviewed By: algoriddle
Differential Revision: D102942787
fbshipit-source-id: d2979f9cd629652ac1de4886afd3c335fbb4fac8
1 parent 417c53e commit 28b2b66
19 files changed
Lines changed: 1954 additions & 1037 deletions
File tree
- faiss
- impl/pq_code_distance
- python
- utils/simd_impl
- tests