44
55Single Instruction, Multiple Data (SIMD) is used heavily in Faiss to speed up
66many types of operations. This includes AVX2, AVX512 (various flavors) for
7- x86_64 CPUs and NEON, SVE for ARM CPUs. SIMD code that is run on a machine
8- that does not support it will crash with SIGILL (illegal instruction signal),
9- therefore it is important to select the right implementation for the current
10- machine.
7+ x86_64 CPUs, NEON and SVE for ARM CPUs, and RVV for RISC-V CPUs. SIMD code
8+ that is run on a machine that does not support it will crash with SIGILL
9+ (illegal instruction signal), therefore it is important to select the right
10+ implementation for the current machine.
1111
1212Faiss is transitioning from a **monolithic SIMD** model to a **dynamic
1313dispatch** model. New code should be written with dynamic dispatch in mind.
@@ -211,12 +211,13 @@ FlatCodesDistanceComputer* get_distance_computer() {
211211```
212212
213213**Dispatch masks:** `with_simd_level` assumes NONE + AVX2 + AVX512 +
214- ARM_NEON implementations exist. If your function has another subset of available
215- implementations, it can be passed with
214+ ARM_NEON + RISCV_RVV implementations exist. If your function has another subset
215+ of available implementations, it can be passed with
216216`with_selected_simd_levels<mask>` with a bitmask of available levels. Missing
217217levels in the mask cause the dispatch to **fall through** to the next lower
218218level in the same architecture family (x86: AVX512_SPR → AVX512 → AVX2 →
219- NONE; ARM: ARM_SVE → ARM_NEON → NONE — x86 and ARM chains are independent):
219+ NONE; ARM: ARM_SVE → ARM_NEON → NONE; RISC-V: RISCV_RVV → NONE —
220+ architecture chains are independent):
220221
221222``` cpp
222223// Only NONE, AVX2, and ARM_SVE implementations exist.
@@ -237,7 +238,7 @@ your own with `(1 << int(SIMDLevel::X)) | ...`):
237238|------|--------|---------|
238239| `AVAILABLE_SIMD_LEVELS_NONE` | NONE only | Scalar-only functions |
239240| `AVAILABLE_SIMD_LEVELS_AVX2_NEON` | NONE, AVX2, ARM_NEON | 256-bit `simdlib` ops (`with_simd_level_256bit`) |
240- | `AVAILABLE_SIMD_LEVELS_A0` | NONE, AVX2, AVX512, ARM_NEON | Default (`with_simd_level`) |
241+ | `AVAILABLE_SIMD_LEVELS_A0` | NONE, AVX2, AVX512, ARM_NEON, RISCV_RVV | Default (`with_simd_level`) |
241242| `AVAILABLE_SIMD_LEVELS_A1` | A0 + ARM_SVE | Functions with dedicated SVE implementations |
242243| `AVAILABLE_SIMD_LEVELS_ALL` | All levels | Identity / diagnostic functions |
243244
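The fall-through rule described above can be sketched in a few lines. This is a minimal illustration, not the Faiss implementation: `Level`, `bit`, `next_lower`, and `resolve` are hypothetical names introduced here, the real `SIMDLevel` enum lives in `utils/simd_levels.h` and its numeric values may differ, and the sketch assumes a NONE implementation always exists.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for Faiss's SIMDLevel (values are illustrative only).
enum class Level : int {
    NONE = 0, AVX2, AVX512, AVX512_SPR, ARM_NEON, ARM_SVE, RISCV_RVV
};

// One bit per level, mirroring the `(1 << int(SIMDLevel::X))` idiom.
constexpr std::uint32_t bit(Level l) {
    return 1u << static_cast<int>(l);
}

// Next lower level within the same architecture family:
// x86: AVX512_SPR -> AVX512 -> AVX2 -> NONE
// ARM: ARM_SVE -> ARM_NEON -> NONE
// RISC-V: RISCV_RVV -> NONE
inline Level next_lower(Level l) {
    switch (l) {
        case Level::AVX512_SPR: return Level::AVX512;
        case Level::AVX512:     return Level::AVX2;
        case Level::ARM_SVE:    return Level::ARM_NEON;
        default:                return Level::NONE;
    }
}

// Walk down the chain until a level present in the mask is found.
// NONE is returned when nothing else matches (assumed always implemented).
inline Level resolve(Level detected, std::uint32_t mask) {
    assert((mask & bit(Level::NONE)) != 0); // scalar fallback must exist
    while (detected != Level::NONE && (mask & bit(detected)) == 0) {
        detected = next_lower(detected);
    }
    return detected;
}
```

With a mask of NONE, AVX2, and ARM_SVE, a machine detected as AVX512 resolves to AVX2, while machines detected as ARM_NEON or RISCV_RVV resolve to NONE because their chains contain no other implemented level.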
@@ -265,6 +266,10 @@ set(FAISS_SIMD_SVE_SRC
265266 # ... existing entries ...
266267 path/to/functions_sve.cpp # <-- add (if SVE implementation exists)
267268)
269+ set(FAISS_SIMD_RVV_SRC
270+ # ... existing entries ...
271+ path/to/functions_rvv.cpp # <-- add (if RVV implementation exists)
272+ )
268273# Also add any new headers to FAISS_HEADERS
269274```
270275
@@ -289,6 +294,7 @@ SIMD_FILES = {
289294 "path/to/functions_avx2.cpp": (X86_64, AVX2),
290295 "path/to/functions_avx512.cpp": (X86_64, AVX512),
291296 "path/to/functions_neon.cpp": (AARCH64, ARM_NEON),
297+ "path/to/functions_rvv.cpp": (RISCV64, RISCV_RVV),
292298}
293299# Also add headers to header_files()
294300```
@@ -377,9 +383,9 @@ The `simdlib` wrappers (`simd8float32_tpl`,
377383`simd8uint32_tpl`) provide portable 256-bit and 512-bit operations
378384across AVX2, AVX512 and NEON (two 128 bit NEON registers are clumped
379385together in 256 bits)
380- There is **no simdlib for SVE** (`simdlib_sve.h` does not exist).
381- Use raw intrinsics when you need SVE
382- (variable-length vectors via `svcntw()`).
386+ There is **no simdlib for SVE or RVV** (`simdlib_sve.h` and `simdlib_rvv.h`
387+ do not exist). Use raw intrinsics when you need SVE (variable-length vectors
388+ via `svcntw()`) or RVV (variable-length vectors via `__riscv_vsetvl*`).
383389An example of usage is with `-inl.h` files
384390
385391**The include order matters** —
@@ -423,11 +429,17 @@ void my_kernel(...) {
423429 factory/constructor boundary. The constructed object carries its
424430 `SIMDLevel` as a compile-time template parameter.
425431
426- 5. **Private dispatch machinery.** `simd_dispatch.h` is internal — do not
432+ 5. **Variable-width SIMD is not fixed-width simdlib.** SVE and RVV are
433+ variable-width architectures. Do not route them through fixed-width helpers
434+ such as `with_simd_level_256bit`, `with_simd_level_512bit`, or
435+ `simd8float32_tpl` unless an explicit selector maps them to a supported
436+ fixed-width fallback.
437+
438+ 6. **Private dispatch machinery.** `simd_dispatch.h` is internal — do not
427439 include in public headers. The public API is `SIMDConfig` and `SIMDLevel`
428440 in `utils/simd_levels.h`.
429441
430- 6. **Build system parity.** Every change must be reflected in both
442+ 7. **Build system parity.** Every change must be reflected in both
431443 CMakeLists.txt and Buck's xplat.bzl.
432444
433445## Conversion approach
@@ -470,12 +482,17 @@ cd build_dd && ctest --output-on-failure
470482# Verify dispatch at different levels (DD mode only)
471483FAISS_SIMD_LEVEL=NONE ctest --output-on-failure
472484FAISS_SIMD_LEVEL=AVX2 ctest --output-on-failure
485+ FAISS_SIMD_LEVEL=RISCV_RVV ctest --output-on-failure
473486
474487# Also build/test static modes for comparison
475488cmake -B build_avx2 -DFAISS_OPT_LEVEL=avx2 -DBUILD_TESTING=ON .
476489cmake --build build_avx2 -j$(nproc) && cd build_avx2 && ctest --output-on-failure
477490```
478491
492+ For RVV, build on `riscv64` or cross-build with RISC-V flags and run the
493+ resulting tests on hardware or under QEMU with vector support enabled, for
494+ example `QEMU_CPU=rv64,v=true`.
495+
479496### Buck (internal)
480497
481498``` bash
@@ -503,3 +520,6 @@ buck2 test -c faiss.dynamic_dispatch=true fbcode//faiss/tests:test_your_module
503520- Building with CMake's default `FAISS_OPT_LEVEL=generic` and thinking DD is
504521 enabled — generic mode has no SIMD and no dispatch. Use
505522 `FAISS_OPT_LEVEL=dd` explicitly.
523+ - Treating SVE or RVV as fixed-width `simdlib` backends — they are
524+ variable-width ISAs and need raw-intrinsic implementations or explicit scalar
525+ fallbacks for fixed-width helper paths.
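
The variable-width point in the last pitfall can be made concrete with a portable sketch of the strip-mining loop that variable-length kernels commonly follow. This is an illustration under stated assumptions, not Faiss code: `emulated_vl` is a hypothetical scalar stand-in for a vector-length query in the spirit of `__riscv_vsetvl*`, `max_lanes` plays the role of the hardware lane count (what `svcntw()` reports for 32-bit lanes on SVE), and `inner_product_strip_mined` is a name made up for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a vsetvl-style query: how many elements the next
// iteration may process, given what remains and the hardware lane count.
inline std::size_t emulated_vl(std::size_t remaining, std::size_t max_lanes) {
    return std::min(remaining, max_lanes);
}

// Scalar emulation of a strip-mined inner product. The loop never assumes a
// fixed width such as 8 floats; the chunk size is re-queried every iteration,
// so the tail needs no separate code path.
inline float inner_product_strip_mined(
        const float* x,
        const float* y,
        std::size_t n,
        std::size_t max_lanes) {
    assert(max_lanes > 0); // a zero lane count would never make progress
    float acc = 0.0f;
    for (std::size_t i = 0; i < n;) {
        const std::size_t vl = emulated_vl(n - i, max_lanes);
        // A real kernel would issue vector load / multiply-accumulate
        // intrinsics over `vl` lanes here instead of this inner loop.
        for (std::size_t j = 0; j < vl; ++j) {
            acc += x[i + j] * y[i + j];
        }
        i += vl;
    }
    return acc;
}
```

The same source produces the same result whether `max_lanes` is 1, 4, or 16, which is the property that fixed-width helpers such as `simd8float32_tpl` cannot express, since they bake the width into the type.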