Commit 0bd934e
Add Sapphire Rapids optimizations for ScalarQuantizer (L2, IP)
Adds an AVX512_SPR specialization path for ScalarQuantizer that uses
Sapphire Rapids-specific instructions for byte-code distance computation
on QT_8bit_direct and QT_8bit_direct_signed.
Inner product (8-bit codes):
Replaces the AVX512 path that processes 16 bytes per iteration via
cvtepu8_epi32 + mullo_epi32 with a VNNI loop that processes 64 bytes
per iteration using _mm512_dpbusd_epi32. VNNI computes unsigned*signed
dot products, so the standard bias trick is used to bridge
unsigned*unsigned: subtract 128 from code2, run dpbusd, then add the
128 * sum(code1) correction. A scalar tail handles d % 64.
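
The bias trick above can be checked with a scalar model. This is only a sketch: the function names are hypothetical and it does not reproduce the actual AVX512_SPR kernel. It mimics the unsigned*signed multiply-accumulate that dpbusd performs, then applies the 128 * sum(code1) correction.

```cpp
#include <cstddef>
#include <cstdint>

// Scalar model of the bias trick (hypothetical helper, not the faiss kernel).
// The multiply below is unsigned byte * signed byte, as in dpbusd.
inline int32_t ip_u8_via_bias(
        const uint8_t* code1,
        const uint8_t* code2,
        size_t d) {
    int32_t acc = 0;  // plays the role of the dpbusd accumulator
    int32_t sum1 = 0; // sum(code1), needed for the correction term
    for (size_t i = 0; i < d; i++) {
        // code2 - 128 lies in [-128, 127], so it fits in a signed byte
        int8_t biased = static_cast<int8_t>(static_cast<int>(code2[i]) - 128);
        acc += static_cast<int32_t>(code1[i]) * static_cast<int32_t>(biased);
        sum1 += code1[i];
    }
    // a * b == a * (b - 128) + 128 * a, summed over all elements
    return acc + 128 * sum1;
}

// Plain unsigned*unsigned reference for comparison.
inline int32_t ip_u8_reference(
        const uint8_t* code1,
        const uint8_t* code2,
        size_t d) {
    int32_t acc = 0;
    for (size_t i = 0; i < d; i++) {
        acc += static_cast<int32_t>(code1[i]) * static_cast<int32_t>(code2[i]);
    }
    return acc;
}
```

Both functions should agree for any pair of byte arrays, including the extremes 0 and 255.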
For QT_8bit_direct_signed (storage = value + 128), the same VNNI loop
runs and an additional closed-form correction is applied per element:
(a-128) * (b-128) = a*b - 128*(a+b) + 16384
Summed over d elements, the constant term contributes 16384 * d.
sum(a) and sum(b) are accumulated cheaply via _mm512_sad_epu8 (one
PSADBW per 64-byte iteration).
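
A scalar sketch of the signed correction, assuming the stored bytes hold value + 128 as described above. The function names are hypothetical. In the real kernel the three sums would come from the VNNI loop and the PSADBW accumulators. Here they are plain loops so the identity can be verified directly.

```cpp
#include <cstddef>
#include <cstdint>

// Inner product of the decoded signed values, computed only from
// unsigned quantities (hypothetical helper, not the faiss kernel).
inline int64_t ip_signed_from_unsigned_sums(
        const uint8_t* a,
        const uint8_t* b,
        size_t d) {
    int64_t dot = 0;   // sum(a * b) over the stored (unsigned) bytes
    int64_t sum_a = 0; // sum(a)
    int64_t sum_b = 0; // sum(b)
    for (size_t i = 0; i < d; i++) {
        dot += static_cast<int64_t>(a[i]) * b[i];
        sum_a += a[i];
        sum_b += b[i];
    }
    // sum((a-128)*(b-128)) = sum(a*b) - 128*(sum(a)+sum(b)) + 16384*d
    return dot - 128 * (sum_a + sum_b) + 16384 * static_cast<int64_t>(d);
}

// Reference: decode each byte to its signed value first, then multiply.
inline int64_t ip_signed_reference(
        const uint8_t* a,
        const uint8_t* b,
        size_t d) {
    int64_t acc = 0;
    for (size_t i = 0; i < d; i++) {
        int64_t va = static_cast<int64_t>(a[i]) - 128;
        int64_t vb = static_cast<int64_t>(b[i]) - 128;
        acc += va * vb;
    }
    return acc;
}
```

The constant term scales with d because 16384 is added once per element.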
L2 (8-bit codes):
Replaces the 16-bytes-per-iter cvtepu8_epi32 + sub + mullo_epi32 path
with a 16-bit pipeline: load 64 bytes, zero-extend to 16-bit lanes via
_mm512_cvtepu8_epi16, subtract in 16-bit, square-and-accumulate to
32-bit with _mm512_madd_epi16. Differences of two uint8_t values lie
in [-255, 255] and fit in signed 16-bit lanes, and each pairwise sum of
squares (max 2 * 255^2 = 130050) fits in 32 bits, so the widened
representation is safe. Falls through to a 32-byte step and a scalar
tail for arbitrary d. The same kernel is bit-exact for the signed
variant: (a - 128) - (b - 128) == a - b, so no correction is needed.
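
The 16-bit pipeline can be modelled in scalar code as follows. This is a sketch with hypothetical function names, not the committed kernel. Each loop iteration stands in for one 32-bit output lane of madd_epi16: two adjacent 16-bit differences are squared and summed. The second function decodes the signed variant explicitly to show that it yields the same distance.

```cpp
#include <cstddef>
#include <cstdint>

// Scalar model of the widen / subtract / madd pipeline
// (hypothetical helper, not the faiss kernel).
inline int32_t l2_u8_madd_model(
        const uint8_t* a,
        const uint8_t* b,
        size_t d) {
    int32_t acc = 0;
    size_t i = 0;
    for (; i + 2 <= d; i += 2) {
        // zero-extend to 16 bits, then subtract: range is [-255, 255]
        int16_t d0 = static_cast<int16_t>(int(a[i]) - int(b[i]));
        int16_t d1 = static_cast<int16_t>(int(a[i + 1]) - int(b[i + 1]));
        // one madd lane: d0*d0 + d1*d1, produced as a 32-bit value
        acc += static_cast<int32_t>(d0) * d0 + static_cast<int32_t>(d1) * d1;
    }
    for (; i < d; i++) { // scalar tail for odd d
        int32_t diff = int32_t(a[i]) - int32_t(b[i]);
        acc += diff * diff;
    }
    return acc;
}

// Signed variant: decode value = stored - 128 before taking the difference.
inline int32_t l2_s8_reference(
        const uint8_t* a,
        const uint8_t* b,
        size_t d) {
    int32_t acc = 0;
    for (size_t i = 0; i < d; i++) {
        int32_t diff = (int32_t(a[i]) - 128) - (int32_t(b[i]) - 128);
        acc += diff * diff;
    }
    return acc;
}
```

Because the 128 offsets cancel inside the difference, both functions should return identical results for the same stored bytes.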
Signed-off-by: Mulugeta Mammo <[email protected]>
1 parent 9d5491a commit 0bd934e
7 files changed
Lines changed: 446 additions & 8 deletions
File tree
- faiss
- impl
- scalar_quantizer