All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog.
- ferray-strings: StringArray is completely separate from Array — no trait interop (#514)
- ferray-linalg: no complex matrix support — LinalgFloat is sealed to f32/f64 (#404)
- ferray-ufunc: no vectorized math library for transcendentals (sleef/svml equivalent) (#393)
- ferray-ufunc: no ufunc object — functions are standalone, not composable abstractions (#376)
- ferray-numpy-interop: only supports 1D arrays for Arrow and Polars (#549)
- ferray-ufunc: no mixed-type operations (type promotion at ufunc level) (#380)
- ferray-autodiff: no integration with ferray-core — DualNumber can't be stored in Array (#534)
- ferray-ufunc: no reduce/accumulate/outer/at/reduceat as generic ufunc methods (#381)
- ferray-ufunc: every operation allocates a new Vec — no out= parameter (#378)
- ferray-linalg: no batched solve/inv/det — only cholesky has batched variant (#412)
- ferray-random: standard_normal uses Box-Muller instead of Ziggurat — slower (#442)
- ferray-random: only f64 output — no f32 random generation (#441)
- ferray-fft: complex FFTs require Complex input — no auto-promotion from real arrays (#427)
- ferray-fft: complex FFTs only support Complex — no f32 FFT (#426)
- ferray-fft: rfft doesn't use rustfft's real-to-complex FFT — promotes to complex first (#432)
- ferray-ma: MaskedArray is not interoperable with ferray Array functions (#505)
- ferray-ma: no masked array broadcasting in arithmetic ops (#504)
- ferray-ma: no fill_value concept — masked elements get T::zero() in operations (#501)
- ferray-ma: reductions are whole-array only — no axis parameter (#500)
- ferray-polynomial: no domain/window mapping — polynomials operate on raw x values (#474)
- Add oracle test suite that validates all crates against NumPy fixture outputs (#45)
- ferray-random: all distributions return 1D arrays only — no shape parameter for ND output (#440)
- ferray-stats: no multi-axis reductions — axis only accepts single usize (#458)
- ferray-stats: no keepdims= parameter on any reduction (#457)
- ferray-core: no concatenate/stack/split in core (#362)
- ferray-core: no safe casting functions (can_cast, astype equivalent) (#361)
- ferray-core: no reduction methods (sum, prod, min, max, mean) on Array (#368)
- ferray-core: ops.rs operators don't broadcast — diverges from NumPy behavior (#346)
- Add operator overloading (+, -, *, /, %, unary -) for Array types (#43)
- Yank all ferrum crates from crates.io for rename to ferray (#38)
- Add exp_fast() with Even/Odd Remez v2 algorithm (#37)
- ferray-core: no nditer equivalent — missing general-purpose strided iterator (#340)
- ferray-linalg: faer_bridge always copies — no zero-copy path for C-contiguous arrays (#405)
- ferray-core: no np.where equivalent (ternary selection) (#372)
- ferray-core: no scalar-array arithmetic (array + scalar) (#347)
- ferray-ma: no std::ops overloads — must use masked_add/masked_sub functions (#502)
- ferray-stats: no dtype= parameter on sum/prod — integer overflow risk (#460)
- ferray-autodiff: no jacobian() function for vector-valued functions (#543)
- Add Index/IndexMut trait impls for Array with fixed-size array indices (#339)
- Add Serialize/Deserialize impls for Array<T, D> behind serde feature (#338)
- ferray-core:
NdArray<T, D>with full ownership model (owned, view, mutable view, ArcArray, CowArray) - ferray-core: Broadcasting (NumPy rules), basic/advanced/extended indexing,
s![]macro - ferray-core: Array creation (
zeros,ones,arange,linspace,eye,meshgrid, etc.) - ferray-core: Shape manipulation (
reshape,transpose,concatenate,stack,split,pad,tile, etc.) - ferray-core:
Elementtrait for 17 dtypes (f16, f32, f64, Complex, i8-i128, u8-u128, bool) - ferray-core:
DTyperuntime enum,finfo/iinfo, type promotion rules - ferray-core:
FerrayErrorhierarchy with diagnostic context, zero panics - ferray-core-macros:
#[derive(FerrayRecord)]proc macro,promoted_type!macro - ferray-ufunc: 40+ SIMD-accelerated universal functions via
pulp(sin, cos, exp, log, sqrt, etc.) - ferray-ufunc: CORE-MATH correctly-rounded transcendentals (< 0.5 ULP from mathematical truth)
- ferray-ufunc: Binary ops (add, sub, mul, div, pow) with broadcasting
- ferray-stats: Reductions (sum, mean, var, std, min, max, argmin, argmax) with SIMD pairwise summation
- ferray-stats: Sorting (sort, argsort, partition), histograms, set operations (unique, intersect, union)
- ferray-io: NumPy
.npy/.npzfile I/O with memory mapping, structured dtype support - ferray: Re-export crate with prelude module
- ferray-linalg: Matrix products (dot, matmul, einsum, tensordot, kron, multi_dot)
- ferray-linalg: Decompositions (Cholesky, QR, SVD, LU, eig, eigh) via
faer - ferray-linalg: Solvers (solve, lstsq, inv, pinv, matrix_power, tensorsolve)
- ferray-linalg: Norms and measures (norm, cond, det, slogdet, matrix_rank, trace)
- ferray-linalg: Batched operations for 3D+ stacked arrays with Rayon parallelism
- ferray-fft: 1D/2D/ND FFT and IFFT via
rustfftwith plan caching - ferray-fft: Real FFTs (rfft, irfft, rfft2, rfftn), Hermitian FFTs
- ferray-fft: Frequency utilities (fftfreq, rfftfreq, fftshift, ifftshift)
- ferray-fft:
FftNorm(Backward, Forward, Ortho) matching NumPy - ferray-random: Generator API with PCG64, Philox, SFC64, MT19937 BitGenerators
- ferray-random: 30+ distributions (Normal, Uniform, Poisson, Binomial, etc.)
- ferray-random: Permutations (shuffle, permutation, choice)
- ferray-polynomial: 6 basis classes (Power, Chebyshev, Legendre, Laguerre, Hermite, HermiteE)
- ferray-polynomial: Poly trait with fitting, root-finding, companion matrix eigenvalues
- ferray-window: Window functions (hann, hamming, blackman, kaiser, etc.)
- ferray-strings:
StringArraywith vectorized operations (case, strip, search, split, regex) - ferray-ma:
MaskedArraywith mask propagation, masked reductions and sorting - ferray-stride-tricks:
sliding_window_view,as_strided, overlap checking - ferray-numpy-interop: PyO3 zero-copy NumPy conversion, Arrow and Polars interop
- f16 support: Half-precision floats as first-class element type across all crates
- no_std core:
ferray-coreandferray-ufunccompile with# - Const generic shapes:
Shape1<N>throughShape6with compile-time dimension checking - ferray-autodiff: Forward-mode automatic differentiation via
DualNumber<T>
- NumPy oracle fixture generation for cross-validation
- Property-based tests with
proptest(256 cases per property) - Fuzz targets for public function families
- SIMD vs scalar verification (all tests pass with
FERRAY_FORCE_SCALAR=1) - Kani formal verification harnesses for ferray-core
- Statistical equivalence benchmarks (47/47 accuracy tests pass)
- ferray-gpu design doc: Phase 6 GPU acceleration architecture
- CubeCL for cross-platform GPU kernels (CUDA/ROCm/Vulkan/Metal/WebGPU)
- cudarc for NVIDIA vendor libraries (cuBLAS, cuFFT, cuSOLVER)
GpuArray<T, D>with stream-ordered async execution- Pinned memory transfers, 6 fused kernels, auto-dispatch
- ferray-autodiff: god file — 1866 lines in single lib.rs (#535)
- ferray-fft: plan cache is f64-only; no f32 FFT support (#109)
- ferray-fft: rfft computes full FFT then discards half (#108)
- ferray-ufunc oracle.rs: replace closure-suppression with macro type pinning (#561)
- ferray-ufunc: binary ops require same shape — broadcast variants are separate functions (#379)
- Prepare all crates for crates.io publishing with GitHub URLs (#35)
- ferray-core: NdIter — restore and complete BinaryBroadcastIter Iterator impl and binary_map_to (#560)
- SIMD pairwise summation: 4 accumulators to saturate FPU throughput, base case 256 elements
- Fused SIMD variance:
simd_sum_sq_diff_f64()with FMA, no intermediate allocation (5x speedup) - FFT 1D fast path: Skip lane extraction for 1D arrays (fft/64: 17x faster than NumPy)
- FFT thread-local scratch:
par_iter().map_init()reuses buffers per Rayon thread - 4-wide SIMD sqrt unroll: Hides 12-cycle sqrt latency with instruction-level parallelism
- Uninit output buffers:
Vec::with_capacity+set_lenskips zeroing for ufunc outputs - SIMD square/reciprocal: Real pulp SIMD kernels for common operations
- matmul 3-tier dispatch: Naive ikj (<=64), faer::Seq (65-255), faer::Rayon (>=256)
- Rayon threshold: 1M elements minimum to avoid thread pool overhead on small arrays
- Batch benchmark mode: Single-process benchmarking eliminates subprocess cold-cache bias
- LTO + codegen-units=1: 10-20% speedup across the board in release builds
- ferray wins 23/55 benchmarks, NumPy wins 32/55
- Dominates: FFT all sizes (1.6-17x), var/std all sizes (2.1-20.8x)
- Beats: mean/sum small (1.1-8.7x), arctan 10K+ (1.1-1.5x), sqrt/1K (1.1x), tanh/1K (1.3x)
- Slower: transcendentals at scale 1.4-2.1x (CORE-MATH accuracy tradeoff), matmul medium 4x (faer vs BLAS)
- ferray-ufunc: SIMD dispatch via pulp doesn't actually use SIMD for transcendentals (#377)
- ferray-strings: operations return flat StringArray — don't preserve higher-D shape correctly (#520)
- ferray-core-macros: no compile-fail tests for macro error paths (#210)
- ferray-numpy-interop: IntoNumPy creates flat PyArray1 then reshapes — extra allocation (#550)
- ferray-numpy-interop: AsFerray always copies — doc claims zero-copy borrow but implementation copies (#548)
- ferray-numpy-interop: to_arrow always copies — claims zero-copy but never achieves it (#547)
- ferray-test-oracle: fixtures exist without corresponding oracle tests (#205)
- ferray-test-oracle: no self-tests for the oracle framework (#204)
- ferray-test-oracle: cross-sign ULP distance calculation incorrect (#199)
- ferray-numpy-interop: zero-copy claims are false — every conversion copies (#195)
- ferray-numpy-interop: NpElement not implemented for bool (#196)
- ferray-numpy-interop: no NaN roundtrip test for Polars (#198)
- ferray-test-oracle: parse_complex_data panics on NaN/Inf string values (#202)
- ferray-ufunc: rayon parallelism is defined but never used in ops (#382)
- ferray-linalg: matmul/dot use naive O(n^3) loops — not using faer/BLAS for matrix multiply (#413)
- ferray-io: no f16/bf16 support in npy I/O even with feature enabled (#118)
- ferray-fft: nd.rs lane extraction copies each lane individually — O(n) allocations for ND FFT (#433)
- Fix bugs found by oracle test suite (#46)
- Fix exp_fast SIMD dispatch so it auto-vectorizes without target-cpu=native (#42)
- Fix Windows compilation: add signgam compat header for core-math lgamma.c (#41)
- ferray-linalg: tensorsolve and tensorinv have zero tests (#105)
- ferray-linalg: matmul_batched (ND x ND) has zero tests (#106)
- ferray-linalg: lstsq 2D b code path untested (#104)
- ferray-core-macros: no tests for negative indices in s! macro (#212)
- ferray-fft: all FFT functions flatten input via iter().copied().collect() before computing (#428)
- ferray-polynomial: oracle/property tests only cover power basis (#132)
- ferray-fft: no property tests for rfft/irfft family (#114)
- ferray-random: README references nonexistent types (SFC64, MT19937, wrong API) (#147)
- ferray-polynomial: no high-degree root finding tests (degree >= 10) (#130)
- ferray-random: no reproducibility tests against known reference values (#145)
- ferray-random: no statistical goodness-of-fit tests (KS, chi-squared) (#134)
- ferray-io: no tests for loading big-endian .npy files (#115)
- ferray-fft: no tests for fft2/fftn s (shape/padding) parameter (#113)
- ferray-fft: no tests for rfft/irfft along non-last axes (#112)
- ferray-fft: no tests for normalization modes on N-D FFTs (#111)
- ferray-strings: ljust and rjust lack fillchar parameter (#165)
- ferray-stride-tricks: no tests for non-contiguous (Fortran-ordered) source arrays (#176)
- ferray-io: no tests for malformed .npy files (#117)
- ferray-strings: no Unicode tests for multi-byte characters (#158)
- ferray-ma: no tests for 2D or higher-dimensional masked arrays (#151)
- ferray-linalg: no tests for solve with singular matrices (#103)
- ferray-linalg: no tests for empty (0x0) or 1x1 matrices (#107)
- ferray-io: no tests for Fortran-order arrays (#116)
- ferray-random: standard_normal_parallel is sequential — parallelism is not implemented (#443)
- ferray-random: binomial uses normal approximation for large n*p instead of BTPE algorithm (#450)
- ferray-fft: irfft output is Complex instead of f64 — should return real array (#437)
- ferray-core: broadcast_to rejects negative strides — NumPy handles them (#359)
- ferray-autodiff: no numerical gradient comparison tests (finite-difference) (#188)
- ferray-window: apply_along_axis returns wrong shape for 1-D input (#180)
- ferray-autodiff: powf with both variable base and exponent untested (#190)
- ferray-linalg: dot/matmul/outer all flatten via iter().copied().collect() before computing (#425)
- ferray-autodiff: zero tests for Rem derivative (#192)
- ferray-linalg: solve detects singularity via NaN check instead of pivot analysis (#414)
- ferray-autodiff: higher-order derivatives (nested DualNumber) untested (#544)
- ferray-core: ArcArray only supports C-contiguous layout (#360)
- ferray-linalg: lstsq computes full SVD just for rank/singular values (#419)
- ferray-stats: mean requires Float — can't compute mean of integer arrays (#461)
- ferray-stats: sort/argsort axis defaults to None (flatten) — NumPy defaults to -1 (last axis) (#465)
- ferray-stride-tricks: sliding_window_view rejects negative strides — can't window reversed arrays (#527)
- ferray-io: savetxt fmt is a simple string replace — not NumPy's printf-style format (#495)
- ferray-io: savez/savez_compressed buffer each .npy in memory before writing to zip (#493)
- ferray-stats: nan_aware allocates per-lane filter Vec — could use in-place NaN compaction (#471)
- ferray-linalg: norm P(p) for matrices silently falls back to Frobenius (#420)
- ferray-window: vectorize only works on Ix1 — not generic over dimension (#533)
- ferray-io: load reads element-by-element — no bulk read for native endianness (#488)
- ferray-io: save writes element-by-element — no bulk write_all on contiguous data (#487)
- ferray-ufunc: add_reduce collects all data into flat Vec before reducing — O(2n) memory (#391)
- ferray-linalg: pinv ignores rcond parameter (#415)
- ferray-io: save fails on non-contiguous arrays — should auto-contiguify (#489)
- ferray-ufunc: no clip for integer types — requires Float bound (#402)
- ferray-ufunc: gcd/lcm use Float trait — should work on actual integers (#390)
- ferray-ufunc: multiply_outer copies data twice (into Ix2, then into IxDyn) (#392)
- ferray-ufunc: divmod computes floor_divide and remainder separately — double work (#399)
- ferray-ma: set_mask_flat uses iter_mut().nth(flat_idx) — O(n) per call (#511)
- ferray-core: flat_index uses .iter().nth() — O(n) for non-contiguous arrays (#350)
- ferray-autodiff: no tests for mathematical singularities (sqrt(0), ln(0), tan(pi/2), asin(1)) (#540)
- ferray-autodiff: gradient() allocates O(n^2) — should reuse buffer (#539)
- ferray-autodiff: no tests for ReLU pattern — max(x, 0) derivative is critical for deep learning (#538)
- ferray-autodiff: powf has division by zero and undefined ln at x=0 and x<0 (#537)
- ferray-autodiff: Rem has dead computation — T::zero() * q is always 0 (#536)
- ferray-core-macros: promoted_type! does not support f16/bf16 (#208)
- ferray-core-macros: s! macro rfind(';') splits incorrectly for complex step expressions (#207)
- ferray-random: default_rng() entropy source is weak (#143)
- ferray-io: TOCTOU race in memmap read_npy_header_with_offset (#120)
- ferray-ufunc: force_scalar() LazyLock caching makes SIMD/scalar tests ineffective (#86)
- ferray-ufunc: frexp uses iterative loops instead of bit extraction — O(1074) for subnormals (#83)
- ferray-core: display formatting uses Relaxed atomic ordering (#79)
- ferray-core: frombuffer creates arbitrary T from bytes without validity check (#78)
- ferray-core: to_bytes() is unsound for types with padding bytes (#77)
- ferray-core: 13+ unwrap/expect in non-test code violates zero-panic guarantee (#67)
- ferray-core: concatenate uses iter().nth(src_flat) — O(n^2) and can panic (#57)
- ferray-random: binomial powi truncates n to i32 for large n (#140)
- ferray-random: Poisson large-lambda squeeze test is dead code (#138)
- ferray-random: Laplace sampling produces Inf when u.abs() == 0.5 (#136)
- ferray-ma: masked_binary_op missing shape validation — silently truncates (#70)
- ferray-io: unbounded header allocation from untrusted header_len (DoS) (#69)
- ferray-io: no bounds check on shape product overflow from untrusted .npy files (DoS) (#68)
- ferray-stats: sort(axis=None) returns wrong shape (#91)
- ferray-stats: histogram density parameter is silently ignored (#63)
- ferray-ufunc: unwrap, gcd, lcm can infinite-loop on NaN or large inputs (#62)
- ferray-ufunc: logaddexp/logaddexp2 silently drop NaN inputs (#61)
- ferray-ufunc: spacing does not return true ULP (#60)
- ferray-ufunc: nextafter uses float approximation instead of bit manipulation (#59)
- ferray-ma: min/max silently ignore NaN in unmasked elements (#149)
- ferray-window: kaiser rejects negative beta, but NumPy allows it (#179)
- ferray-stride-tricks: sliding_window_view has no bounds check on resulting view (#172)
- ferray-stride-tricks: sliding_window_view casts isize strides to usize without negative check (#169)
- ferray-polynomial: degree() uses different tolerance than trim() (#124)
- ferray-strings: search/boolean ops flatten output to Ix1 for multidim input (#71)
- ferray-stats: unsafe pointer casts in SIMD type-dispatch (#92)
- ferray-linalg: pinv ignores rcond parameter (#98)
- ferray-linalg: cond unnecessarily requires square matrices (#97)
- ferray-stats: min/max inconsistent NaN propagation (#90)
- ferray-fft: unnecessary unsafe impl Send/Sync for FftPlan (#110)
- ferray-linalg: solve never returns SingularMatrix error despite documenting it (#99)
- ferray-ufunc: cumsum/cumprod with axis=None returns original shape, not flattened (#84)
- ferray-linalg: slogdet computes raw determinant first, defeating numerical purpose (#64)
- ferray-linalg: inv uses unreliable determinant-based singularity check (#65)
- ferray-polynomial: quadratic formula numerically unstable (catastrophic cancellation) (#66)
- ferray-core: view_cast does not check alignment in Vec::from_raw_parts (#58)
- ferray umbrella: indexing module (~18 functions) not re-exported (#75)
- ferray umbrella: config.rs .expect() can panic in library code (#74)
- ferray umbrella: bf16 feature not exposed (#73)
- ferray umbrella: f16 feature not forwarded to ferray-ufunc (#72)
- ferray-core: iinfo::().max overflows i128, wraps to -1 (#56)
- ferray umbrella: no_std feature violates Cargo additivity, --all-features fails (#52)
- ferray-io: Fortran-order silently ignored for complex32/complex64 dynamic loading (#51)
- ferray-linalg: array2_to_faer O((m*n)^2) allocation for non-contiguous arrays (#50)
- ferray-core: broadcast_to casts negative isize strides to usize — UB for transposed views (#49)
- Patched all design docs for adversarial review findings
- Fixed .gitignore: added target/, pycache/, *.pyc, .worktrees/, fuzz artifacts
- Purged 3,591 accidentally committed build artifacts from git history (163MB -> 1.3MB)