## Benchmarks
For reference, we use 1536-dimensional vectors, like the embeddings produced by the OpenAI Ada API.
Comparing the serial code throughput produced by GCC 12 to hand-optimized kernels in SimSIMD, we see the following single-core improvements for the two most common vector-vector similarity metrics, cosine similarity and Euclidean distance:
| Type | Apple M2 Pro | AMD Genoa | AWS Graviton 4 |
| :--- | :----------- | :-------- | :------------- |
Similar speedups are often observed even when compared to BLAS and LAPACK libraries underlying most numerical computing libraries, including NumPy and SciPy in Python.
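For reference, the two benchmarked metrics can be sketched in plain NumPy. This is a slow scalar baseline for what the SIMD kernels compute; the function names are illustrative, not SimSIMD's API:

```python
import numpy as np

def cosine_distance(a: np.ndarray, b: np.ndarray) -> float:
    # Cosine distance: 1 - (a . b) / (|a| * |b|)
    return 1.0 - float(np.dot(a, b)) / float(np.linalg.norm(a) * np.linalg.norm(b))

def euclidean_distance(a: np.ndarray, b: np.ndarray) -> float:
    # Euclidean (L2) distance between the two vectors
    return float(np.linalg.norm(a - b))

# Identical vectors: zero distance under both metrics
v = np.array([3.0, 4.0], dtype=np.float32)
assert cosine_distance(v, v) < 1e-6
assert euclidean_distance(v, v) == 0.0
```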
Broader benchmarking results:
The package is intended to replace the usage of `numpy.inner`, `numpy.dot`, and `scipy.spatial.distance`.
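A minimal sketch of the drop-in substitution, assuming SimSIMD's Python bindings expose `simsimd.cosine` for this metric (an assumption here), with a NumPy fallback when the package is absent:

```python
import numpy as np

a = np.random.rand(1536).astype(np.float32)  # Ada-sized embedding
b = np.random.rand(1536).astype(np.float32)

def numpy_cosine_distance(x, y):
    # Portable reference: what scipy.spatial.distance.cosine computes
    return 1.0 - float(np.dot(x, y)) / float(np.linalg.norm(x) * np.linalg.norm(y))

try:
    import simsimd  # hand-optimized SIMD kernels, if installed
    d = float(simsimd.cosine(a, b))
except ImportError:
    d = numpy_cosine_distance(a, b)  # fallback path

# Random non-negative vectors: cosine distance lies in [0, 1]
assert 0.0 <= d <= 1.0 + 1e-3
```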
Aside from drastic performance improvements, SimSIMD significantly improves accuracy in mixed precision setups.
NumPy and SciPy, processing `i8`, `u8`, or `f16` vectors, will use the same types for accumulators, while SimSIMD can combine `i8` enumeration, `i16` multiplication, and `i32` accumulation to avoid overflows entirely.
The same applies to processing `f16` and `bf16` values with `f32` precision.
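The overflow issue is easy to reproduce in NumPy, where an `i8` dot product accumulates in `i8` and silently wraps around; widening before accumulation gives the exact result. This is a plain NumPy illustration of the problem, not SimSIMD code:

```python
import numpy as np

a = np.full(100, 100, dtype=np.int8)
b = np.full(100, 100, dtype=np.int8)

# NumPy keeps the input dtype for the accumulator: each product
# 100 * 100 = 10000 wraps around in i8 arithmetic, so the result
# of the dot product is meaningless.
wrapped = int(np.dot(a, b))

# Widening to i32 before multiplying and accumulating (SimSIMD does
# this internally with i16 products and i32 sums) is exact.
exact = int(np.dot(a.astype(np.int32), b.astype(np.int32)))

assert exact == 100 * 100 * 100  # one million
assert wrapped != exact          # the i8 accumulator overflowed
```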