v7.5.0
v7.5: Parallelism & Portability
Built-in OpenMP bundling for JS & Python
Intel Granite Rapids F16 → F32 GEMMs
Faster bit-vector population counts for Arm NEON
SME compatibility with non-Apple Clang on Apple machines
Hardening against MSan SVE false-positives, thanks to @alexey-milovidov
Hardening against GCC 13 Arm NEON code-gen bugs, thanks to @swasik
_into & _parallel GEMM Rust APIs: reusing memory & ForkUnion pools
De-vectorize serial kernels with compiler flags
Compress source & binary distributions for Windows
Pre-build & share FreeBSD, PowerPC, RISC-V, & LoongArch libs
Minor
Add: NEON popcount kernel for nk_reduce_moments_u1 (2181e0c)
Add: Tensor constructors, sealed trait family, div_ceil cleanup (2792279)
Add: Span-based matrix _into APIs, parallel Hammings/Jaccards, full-crate docs (99289df)
Add: OpenMP for Python & JavaScript (499ecc9)
Add: Granite Rapids AMX for F16 & F32 (28036ea)
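For readers unfamiliar with bit-vector population counts, the idea behind the popcount entry above can be sketched in portable C. This is only an illustration under stated assumptions: the names `sketch_popcount_u64` and `sketch_popcount_bits` are hypothetical, and the code is a scalar SWAR fallback, not the library's NEON kernel or the `nk_reduce_moments_u1` API.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not part of the library: SWAR popcount of one 64-bit word. */
static uint64_t sketch_popcount_u64(uint64_t x) {
    x = x - ((x >> 1) & 0x5555555555555555ull);                           /* 2-bit partial sums */
    x = (x & 0x3333333333333333ull) + ((x >> 2) & 0x3333333333333333ull); /* 4-bit partial sums */
    x = (x + (x >> 4)) & 0x0F0F0F0F0F0F0F0Full;                           /* 8-bit partial sums */
    return (x * 0x0101010101010101ull) >> 56;                             /* add the eight bytes */
}

/* Hypothetical helper: counts the set bits in a packed bit-vector of `n_bytes` bytes. */
static uint64_t sketch_popcount_bits(uint8_t const *bytes, size_t n_bytes) {
    uint64_t total = 0;
    size_t i = 0;
    /* Main loop: eight bytes at a time; memcpy avoids unaligned-access issues. */
    for (; i + 8 <= n_bytes; i += 8) {
        uint64_t word;
        memcpy(&word, bytes + i, sizeof(word));
        total += sketch_popcount_u64(word);
    }
    /* Tail: remaining bytes one at a time. */
    for (; i < n_bytes; ++i) total += sketch_popcount_u64(bytes[i]);
    return total;
}
```

A SIMD kernel would typically replace the main loop with per-byte vector counts followed by a horizontal add, while keeping the same tail handling.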
Patch
Fix: Native ISA probe on Apple Clang + compile/runtime glyph (bc13e02)
Make: Detect illegal instructions in macOS CI (289cdaf)
Fix: Drop -march= on macOS setup.py builds (28aac74)
Fix: Exclude std::signal from WASM builds (14814c5)
Improve: Drop GNU statement-expression macros in SVE reduce helpers (b8b4ca0)
Make: Drop +nosimd from AArch64 baseline (23f5195)
Make: Forbid auto-vectorization in portable baseline builds (43e8324)
Make: Pin TU baseline to per-arch ABI floor across build systems (453ed5f)
Fix: Mitigate GCC 13 wrong BF16 splat in Arm NEON (#346) (fc3d8ec)
Improve: Log faulting capability detection (a401f8a)
Improve: Log faulting kernel on fatal signals in nk_test (22c7c79)
Make: Normalize Python test dependencies across CI and docs (8a0f3d4)
Make: Baseline-only ISA for shared-library test, harden Windows CI (1907685)
Fix: Wrong compiler probes for SMEBF16 & SMEBI32 (8b19ddb)
Make: Log host CPU capabilities in macOS and Windows CI jobs (988eeb2)
Fix: Pre-declare OpenMP loop counter, universal libomp for macOS (493a021)
Fix: Use int for OpenMP loop counters, absolute libomp install name (ccc0118)
Fix: GCC requires +sme prefix in target attribute for __arm_sc_* stubs (291dc0a)
Fix: Signed OpenMP iterators, source-built libomp, JS KMP guard (dc1ae75)
Fix: OpenMP wheel builds on macOS and Windows (f569121)
Fix: Add target("sme") to __arm_sc_* stubs for GCC compatibility (ad2add0)
Fix: Unpoison SVE scalar reductions for MemorySanitizer (#342) (b42eda7)
Improve: Move SME runtime stubs to types.h as weak inline definitions (64ca934)
Improve: Manual SME streaming control, single enter/exit per API call (6432837)
Fix: Update cdist edge-case test for re-added threads= kwarg (50681af)
Make: Allow force-enabling ISA targets via environment variables (0e58702)
Improve: Abandon F32→F64 via Ozaki on Granite Rapids (94a5f19)
Make: FreeBSD, PPC64le, LoongArch, RISC-V releases & compress Windows (a9a0d83)
Make: Standardize CI compilers and add Windows test job (9a22ea4)
Make: Shrink serial fallbacks with scoped size optimization (83154a8)
Make: Compress Windows builds (e30ad3d)
Fix: Streaming-compatible stubs for LLVM SME builds (0be7b2f)
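Several of the OpenMP fixes in this list concern the loop counter: it is pre-declared and uses a signed `int`. The sketch below shows that loop shape in C. It is an illustration only: `sketch_row_sums` is a hypothetical function, not library code, and the motivation is an assumption, namely that some OpenMP 2.0-era compilers (MSVC in its default OpenMP mode, for example) reject unsigned loop indices.

```c
#include <stddef.h>

/* Hypothetical example, not library code: sums each row of a row-major matrix. */
static void sketch_row_sums(float const *matrix, int rows, int cols, float *sums) {
    int row; /* Pre-declared, signed counter; OpenMP makes it private to each thread. */
#if defined(_OPENMP)
#pragma omp parallel for schedule(static)
#endif
    for (row = 0; row < rows; ++row) {
        float sum = 0.0f; /* Declared inside the loop body, so private per iteration. */
        int col;
        for (col = 0; col < cols; ++col)
            sum += matrix[(size_t)row * (size_t)cols + (size_t)col];
        sums[row] = sum; /* Each iteration writes a distinct element, so there is no race. */
    }
}
```

Without `-fopenmp` (or `/openmp` on MSVC) the pragma is skipped and the loop runs serially with the same result, so one source file can serve both builds.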