- handles `float64`, `float32`, `float16`, and `bfloat16` real & complex vectors.
- handles `int8` integral, `int4` sub-byte, and `b8` binary vectors.
- handles sparse `uint32` and `uint16` sets, and weighted sparse vectors.
- is a zero-dependency [header-only C 99](#using-simsimd-in-c) library.
- has [Python](#using-simsimd-in-python), [Rust](#using-simsimd-in-rust), [JS](#using-simsimd-in-javascript), and [Swift](#using-simsimd-in-swift) bindings.
- has Arm backends for NEON, Scalable Vector Extensions (SVE), and SVE2.
For reference, we use 1536-dimensional vectors, like the embeddings produced by the OpenAI Ada API.
Comparing the serial code throughput produced by GCC 12 to hand-optimized kernels in SimSIMD, we see the following single-core improvements for the two most common vector-vector similarity metrics - the Cosine similarity and the Euclidean distance:
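For reference, the cosine distance (one minus the cosine similarity) and the squared Euclidean distance, which the `l2sq` kernels compute, are conventionally defined as:

$$
d_{\cos}(a, b) = 1 - \frac{a \cdot b}{\lVert a \rVert \, \lVert b \rVert},
\qquad
d_{L2}^2(a, b) = \sum_{i=1}^{n} (a_i - b_i)^2
$$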
| Type | Apple M2 Pro | Intel Sapphire Rapids | AWS Graviton 4 |
| :--- | :----------- | :-------------------- | :------------- |

Similar speedups are often observed even when compared to BLAS and LAPACK libraries underlying most numerical computing libraries, including NumPy and SciPy in Python.
Broader benchmarking results:
The package is intended to replace the usage of `numpy.inner`, `numpy.dot`, and `scipy.spatial.distance`.
Aside from drastic performance improvements, SimSIMD significantly improves accuracy in mixed precision setups.
NumPy and SciPy, processing `int8`, `uint8` or `float16` vectors, will use the same types for accumulators, while SimSIMD can combine `int8` enumeration, `int16` multiplication, and `int32` accumulation to avoid overflows entirely.
The same applies to processing `float16` and `bfloat16` values with `float32` precision.
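As a minimal sketch of the overflow problem (plain NumPy, not SimSIMD's actual kernels; the vector values are made up for illustration):

```python
import numpy as np

# Two 1536-dimensional `int8` vectors, as in the benchmarks above
a = np.full(1536, 100, dtype=np.int8)
b = np.full(1536, 100, dtype=np.int8)

# Each product is 10_000, which already exceeds the int8 range, so an
# int8 accumulator would wrap around. Widening before multiplying,
# as SimSIMD does internally, keeps the result exact:
safe = int(np.dot(a.astype(np.int32), b.astype(np.int32)))
assert safe == 15_360_000  # far beyond the int8 (127) and int16 (32_767) limits
```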
Unlike SciPy, SimSIMD allows explicitly stating the precision of the input vectors, which is especially useful for mixed-precision setups.
The `dtype` argument can be passed both by name and as a positional argument:
```py
dist = simsimd.cosine(vec1, vec2, "int8")
dist = simsimd.cosine(vec1, vec2, "float16")
dist = simsimd.cosine(vec1, vec2, "float32")
dist = simsimd.cosine(vec1, vec2, "float64")
dist = simsimd.hamming(vec1, vec2, "bit8")
dist = simsimd.jaccard(vec1, vec2, "bit8")
```
With other frameworks, like PyTorch, one can get a richer type-system than NumPy, but the lack of good CPython interoperability makes it hard to pass data without copies.
```py
import numpy as np
buf1 = np.empty(8, dtype=np.uint16)
buf2 = np.empty(8, dtype=np.uint16)
# View the same memory region with PyTorch and randomize it
distances_array: np.ndarray = np.array(distances, copy=True) # now managed by NumPy
```
### Elementwise Kernels
SimSIMD also provides mixed-precision elementwise kernels, where the input vectors and the output have the same numeric type, but the intermediate accumulators are of a higher precision.
```py
    metric="hamming",   # Unlike SciPy, SimSIMD doesn't divide by the number of dimensions
    out_dtype="uint8",  # so we can use `uint8` instead of `float64` to save memory.
    threads=0,          # Use all CPU cores with OpenMP.
    dtype="bin8",       # Override input argument type to `bin8` eight-bit words.
)
```
By default, the output distances will be stored in double-precision `float64` floating-point numbers.
That behavior may not be space-efficient, especially if you are computing the Hamming distance between short binary vectors, which will generally fit into 8x smaller `uint8` or `uint16` types.
To override this behavior, use the `dtype` argument.
### Helper Functions
```py
    assert A.dtype == B.dtype and A.dtype == C.dtype, "Input types must match and affect the output style"
    return (Alpha * A * B + Beta * C).astype(A.dtype)
```
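For context, a self-contained version of that reference snippet might look as follows; the wrapper name `fma_reference` and the sample values are illustrative assumptions, not part of the SimSIMD API:

```python
import numpy as np

def fma_reference(A, B, C, Alpha=1.0, Beta=1.0):
    # Hypothetical reference helper: inputs share one dtype, which also
    # defines the output dtype; arithmetic happens in float64 before casting back.
    assert A.dtype == B.dtype and A.dtype == C.dtype, "Input types must match and affect the output style"
    return (Alpha * A * B + Beta * C).astype(A.dtype)

a = np.array([10, 20, 30], dtype=np.int8)
b = np.array([2, 2, 2], dtype=np.int8)
c = np.array([1, 1, 1], dtype=np.int8)
result = fma_reference(a, b, c)  # → array([21, 41, 61], dtype=int8)
```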
All of the function names follow the same pattern: `simsimd_{function}_{type}_{backend}`.
- The type can be `f64`, `f32`, `f16`, `bf16`, `f64c`, `f32c`, `f16c`, `bf16c`, `i8`, or `b8`.
- The function can be `dot`, `vdot`, `cos`, `l2sq`, `hamming`, `jaccard`, `kl`, `js`, or `intersect`.
To avoid hard-coding the backend, you can use the `simsimd_kernel_punned_t` to pun the function pointer and the `simsimd_capabilities` function to get the available backends at runtime.
To match all the function names, consider a RegEx:
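One illustrative sketch (this exact pattern is an assumption, not necessarily the library's own RegEx) combines the function and type lists above; backend suffixes like `neon` are assumed examples, matched loosely as any word:

```python
import re

pattern = re.compile(
    r"simsimd_"
    r"(?:dot|vdot|cos|l2sq|hamming|jaccard|kl|js|intersect)_"  # function
    r"(?:f64c?|f32c?|f16c?|bf16c?|i8|b8)_"                     # type
    r"\w+"                                                     # backend (assumed, e.g. `neon`)
)

assert pattern.fullmatch("simsimd_cos_f32_neon")
assert pattern.fullmatch("simsimd_dot_bf16c_serial")
```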