Releases · ml-explore/mlx
v0.0.10
Highlights:
- Faster matmul: up to 2.5x faster for certain sizes (see benchmarks)
- Fused matmul + addition (for faster linear layers)
Core
- Quantization supports sizes other than multiples of 32
- Faster GEMM (matmul)
- addmm primitive (fused addition and matmul); see the sketch after this list
- mx.isnan, mx.isinf, mx.isposinf, mx.isneginf
- mx.tile
- VJPs for scatter_min and scatter_max
- Multi output split primitive
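A minimal sketch of the fused addition and matmul, assuming the primitive is exposed in Python as mx.addmm(c, a, b):

```python
import mlx.core as mx

# Hedged sketch: assumes the fused primitive is exposed as mx.addmm(c, a, b),
# computing a @ b + c in a single kernel (useful for linear layers).
x = mx.random.normal((8, 16))   # batch of inputs
w = mx.random.normal((16, 4))   # weight matrix
b = mx.zeros((4,))              # bias

y_fused = mx.addmm(b, x, w)     # fused: x @ w + b
y_ref = x @ w + b               # unfused reference

print(mx.allclose(y_fused, y_ref))
```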
NN
- Losses: Gaussian negative log-likelihood
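A short usage sketch, assuming the loss is exposed as nn.losses.gaussian_nll_loss(inputs, targets, vars):

```python
import mlx.core as mx
import mlx.nn as nn

# Assumed API: nn.losses.gaussian_nll_loss(inputs, targets, vars, reduction=...)
mean = mx.array([0.0, 1.0, 2.0])    # predicted means
target = mx.array([0.1, 0.9, 2.5])  # observed values
var = mx.array([0.5, 0.5, 1.0])     # predicted variances (must be positive)

print(nn.losses.gaussian_nll_loss(mean, target, var, reduction="mean"))
```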
Misc
- Performance enhancements for graph evaluation with lots of outputs
- Default PRNG seed is based on the current time instead of 0
- Primitive VJPs take outputs as inputs, which avoids redundant work without needing graph simplification
- Booleans are printed Python-style (True/False) when using the Python API
Bugfixes
- Fix precision and integer overflow issues in scatter for types narrower than 32 bits
- Fix overflow in mx.eye
- Report Metal out-of-memory errors instead of failing silently
- Change mx.round to follow NumPy, which rounds half to even
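For reference, round-half-to-even means .5 values round toward the nearest even integer, matching NumPy:

```python
import mlx.core as mx

# Banker's rounding: .5 cases go to the nearest even integer, as in NumPy.
x = mx.array([0.5, 1.5, 2.5, 3.5])
print(mx.round(x))  # [0, 2, 2, 4]
```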
v0.0.9
Highlights:
- Initial (and experimental) GGUF support
- Support for the Python buffer protocol (easy interoperability with NumPy, JAX, TensorFlow, PyTorch, etc.); see the example after this list
- at[] syntax for scatter-style operations: x.at[idx].add(y) (also min, max, prod, etc.)
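A quick sketch of the two Python-facing highlights, buffer-protocol interoperability and the at[] scatter syntax (NumPy used for illustration):

```python
import numpy as np
import mlx.core as mx

# Buffer protocol: NumPy can read mx.arrays directly, and mx.array accepts
# NumPy arrays (the same applies to other frameworks that speak the protocol).
a = mx.arange(6).reshape(2, 3)
b = np.array(a)
c = mx.array(np.ones((2, 3), dtype=np.float32))

# at[] syntax: x.at[idx].add(y) accumulates duplicate indices,
# unlike plain indexed assignment.
x = mx.zeros((4,))
x = x.at[mx.array([0, 0, 2])].add(mx.array([1.0, 1.0, 3.0]))
print(b.sum(), c.sum(), x)  # x -> [2, 0, 3, 0]
```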
Core
- Array creation from other mx.arrays (mx.array([x, y]))
- Complete support for the Python buffer protocol
- mx.inner, mx.outer
- mx.logical_and, mx.logical_or, and operator overloads
- Array at syntax for scatter ops
- Better support for in-place operations (+=, *=, -=, ...)
- VJP for scatter and scatter add
- Constants (mx.pi, mx.inf, mx.newaxis, …); see the example after this list
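A brief tour of a few of the new ops, overloads, and constants (values in the comments are for this toy input):

```python
import mlx.core as mx

a = mx.array([0.0, 1.0, 2.0])
b = mx.array([1.0, 0.0, -1.0])

print(mx.inner(a, b))             # dot product: -2.0
print(mx.outer(a, b).shape)       # (3, 3)
print(mx.logical_and(a > 0, b > 0))

a += 1                            # in-place operators are now supported
print(a)                          # [1, 2, 3]

print(mx.pi, mx.inf)              # new constants
print(a[mx.newaxis, :].shape)     # (1, 3) -- mx.newaxis adds a dimension
```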
NN
- GLU activation
- cosine_similarity loss (example below)
- Cache for RoPE and ALiBi
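A sketch of the new loss, assuming it is exposed as nn.losses.cosine_similarity_loss(x1, x2, axis=...):

```python
import mlx.core as mx
import mlx.nn as nn

# Assumed API: nn.losses.cosine_similarity_loss(x1, x2, axis, reduction)
x1 = mx.random.normal((4, 16))
x2 = mx.random.normal((4, 16))
print(nn.losses.cosine_similarity_loss(x1, x2, axis=1, reduction="mean"))
```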
Bugfixes / Misc
- Fix data type with tri
- Fix saving non-contiguous arrays
- Fix graph retention for in-place state, and remove retain_graph
- Multi-output primitives
- Better support for loading devices
v0.0.7
Core
- Support for loading and saving Hugging Face's safetensors format (example after this list)
- Transposed quantization matmul kernels
- mlx.core.linalg sub-package with mx.linalg.norm (Frobenius, infinity, p-norms)
- tensordot and repeat
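A sketch of the new I/O and linalg entry points, assuming the Python bindings expose mx.save_safetensors and that mx.load infers the format from the file extension:

```python
import mlx.core as mx

# Assumed API: mx.save_safetensors(path, dict_of_arrays) and mx.load(path).
weights = {"w": mx.random.normal((16, 4)), "b": mx.zeros((4,))}
mx.save_safetensors("weights.safetensors", weights)

loaded = mx.load("weights.safetensors")
print(loaded["w"].shape)

# Frobenius norm from the new linalg sub-package.
print(mx.linalg.norm(loaded["w"]))
```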
NN
- Layers: Bilinear, Identity, InstanceNorm, Dropout2D, Dropout3D
- More customizable Transformer (pre/post norm, dropout)
- More activations: SoftSign, Softmax, HardSwish, LogSoftmax
- Configurable scale in RoPE positional encodings
- Losses: hinge, huber, log_cosh (example below)
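A sketch of the new regression losses, assuming they live under nn.losses as huber_loss and log_cosh_loss:

```python
import mlx.core as mx
import mlx.nn as nn

# Assumed API: nn.losses.huber_loss / log_cosh_loss(inputs, targets, reduction)
pred = mx.array([1.5, 0.2, -3.0])
target = mx.array([1.0, 0.0, -2.0])

print(nn.losses.huber_loss(pred, target, delta=1.0, reduction="mean"))
print(nn.losses.log_cosh_loss(pred, target, reduction="mean"))
```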
Misc
- Faster GPU reductions for certain cases
- Change to memory allocation to allow swapping
v0.0.6
Core
- quantize, dequantize, quantized_matmul
- moveaxis, swapaxes, flatten
- stack
- floor, ceil, clip
- tril, triu, tri
- linspace
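A sketch of the quantization flow, assuming mx.quantize returns packed weights with per-group scales and biases that mx.quantized_matmul and mx.dequantize consume:

```python
import mlx.core as mx

# Quantize a weight matrix to 4 bits with 64 values per scale/bias group.
x = mx.random.normal((2, 512))
w = mx.random.normal((512, 512))   # (out_features, in_features)

w_q, scales, biases = mx.quantize(w, group_size=64, bits=4)

# transpose=True computes x @ w.T, matching the Linear-layer convention.
y_q = mx.quantized_matmul(x, w_q, scales, biases, transpose=True,
                          group_size=64, bits=4)
y_ref = x @ mx.dequantize(w_q, scales, biases, group_size=64, bits=4).T
print(mx.allclose(y_q, y_ref, atol=1e-3))  # agrees up to accumulation order
```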
Optimizers
- RMSProp, Adamax, Adadelta, Lion
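A minimal training-step sketch with one of the new optimizers (Lion shown; the same pattern applies to the others), using mlx.nn and mlx.optimizers helpers that may postdate this release:

```python
import mlx.core as mx
import mlx.nn as nn
import mlx.optimizers as optim

model = nn.Linear(4, 2)
optimizer = optim.Lion(learning_rate=1e-3)

def loss_fn(model, x, y):
    return nn.losses.mse_loss(model(x), y, reduction="mean")

x = mx.random.normal((8, 4))
y = mx.random.normal((8, 2))

# Compute loss and gradients w.r.t. the model parameters, then update.
loss, grads = nn.value_and_grad(model, loss_fn)(model, x, y)
optimizer.update(model, grads)
mx.eval(model.parameters(), optimizer.state)
```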
NN
- Layers: QuantizedLinear, ALiBi positional encodings
- Losses: Label smoothing, Smooth L1 loss, Triplet loss (example below)
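A sketch of the new loss options, assuming label smoothing is a keyword of nn.losses.cross_entropy and that smooth_l1_loss lives in nn.losses:

```python
import mlx.core as mx
import mlx.nn as nn

logits = mx.random.normal((4, 10))
targets = mx.array([1, 3, 5, 7])

# Assumed keyword: label_smoothing on nn.losses.cross_entropy.
ce = nn.losses.cross_entropy(logits, targets,
                             label_smoothing=0.1, reduction="mean")
sl1 = nn.losses.smooth_l1_loss(mx.array([1.2, -0.5]),
                               mx.array([1.0, 0.0]), reduction="mean")
print(ce, sl1)
```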
Misc
- Bug fixes