Release v0.30.3 · ml-explore/mlx

Highlights

Support nvfp4 and mxfp8 quantized ops on Metal
Support nvfp4 and mxfp8 quantized-quantized matrix-matrix multiplication on CUDA
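
As background for the nvfp4 highlight, here is a minimal sketch of block-scaled 4-bit float quantization in plain Python. It is not MLX's implementation and does not use the MLX API. The block size of 16, the E2M1 value table, and the names `quantize_block`, `dequantize_block`, and `quantize` are assumptions made for illustration, and the per-block scale is kept in full precision rather than stored in an 8-bit format.

```python
# Magnitudes representable by a 4-bit E2M1 float (the sign is a separate bit).
FP4_E2M1_VALUES = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)
BLOCK_SIZE = 16  # assumed group size for this sketch


def quantize_block(block):
    """Return (scale, codes) for one block; codes are signed E2M1 values."""
    amax = max(abs(x) for x in block)
    # Map the largest magnitude in the block onto the largest E2M1 value.
    scale = amax / FP4_E2M1_VALUES[-1] if amax > 0 else 1.0
    codes = []
    for x in block:
        target = abs(x) / scale
        nearest = min(FP4_E2M1_VALUES, key=lambda v: abs(v - target))
        codes.append(nearest if x >= 0 else -nearest)
    return scale, codes


def dequantize_block(scale, codes):
    """Reconstruct approximate values from a scale and its E2M1 codes."""
    return [scale * c for c in codes]


def quantize(values):
    """Quantize a flat list block by block (hypothetical helper)."""
    return [
        quantize_block(values[i:i + BLOCK_SIZE])
        for i in range(0, len(values), BLOCK_SIZE)
    ]
```

Because each block carries its own scale, an outlier only degrades the precision of the values that share its block.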

What's Changed

Bump the patch version by @angeloskath in #2922
Faster copy for col contig to row contig by @awni in #2917
Fix cuda release by @awni in #2925
Metal logging by @CC-Yeh in #2904
fix cuda release part 2 by @awni in #2926
new[CI]: add linux sanitizer tests by @incertum in #2860
patch bump by @awni in #2927
Fix CUDA pypi release by @awni in #2929
Move allocate_workspace to cuda/utils.h by @zcbenz in #2923
Allow dry run for PyPI release workflow by @zcbenz in #2928
Set rpath with cmake for CUDA build by @zcbenz in #2932
Fix nightly build by @zcbenz in #2933
Set install rpath of python bindings with cmake by @zcbenz in #2934
Fix pid in local launch by @angeloskath in #2936
Make CUDA CI run faster by @zcbenz in #2939
refactor: use perf_counter for accurate benchmarking by @Satyam12singh in #2940
Fix for non row-contig scales by @awni in #2941
Fix stubgen by @zcbenz in #2942
ci: add macOS 26 target by @madrob in #2937
Fix float64 size in data_types.rst by @pdevine in #2948
Fixes in mlx.distributed_config by @angeloskath in #2947
Metal/CPU nvfp4 and mxfp8 by @awni in #2946
[CUDA] Implement gather_mm_rhs by @zcbenz in #2902
Fetch nanobind with cmake by @zcbenz in #2949
refactor: use time.perf_counter for consistent and accurate benchmarking by @Satyam12singh in #2943
BUG FIX - Addition of missing parameter in random::uniform by @hwiesmann in #2963
Fix doc issues in mlx.nn.init.he_normal and mlx.nn.hard_tanh by @Redempt1onzzZZ in #2968
fix numpy dtype bug by @awni in #2960
QQ linear by @nastya236 in #2931
fix array allocator with user buffer and deleter by @andresy in #2971
Swizzle scales by @nastya236 in #2979
Fix grid_dim_x calculations by @CC-Yeh in #2980
Add asarray to array_namespace by @Anri-Lombard in #2966
fix doc by @CC-Yeh in #2988
replace MLX_IBV_COORDINATOR with MLX_JACCL_COORDINATOR by @Evanev7 in #2986
Fix RandomBits::is_equivalent to include width by @MillaFleurs in #2978
Don't try to use NAX at run-time if kernels aren't there by @awni in #2982
Expose to/from fp8 in Python and don't auto-convert fp8 when loading from safetensors by @awni in #2985
Allow some non 2D inputs in qqmm by @awni in #2981
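
The two benchmarking refactors above (#2940 and #2943) switch the timing code to `time.perf_counter`. A minimal sketch of that pattern follows; the `benchmark` helper and its `warmup` and `iters` parameters are hypothetical names for illustration, not code taken from the MLX repository.

```python
import time


def benchmark(fn, *args, warmup=3, iters=20):
    """Return the mean wall-clock seconds per call of fn(*args)."""
    # Warm-up runs absorb one-time costs such as caching or compilation.
    for _ in range(warmup):
        fn(*args)
    # perf_counter is a monotonic, high-resolution clock intended for
    # measuring short durations, unlike time.time().
    start = time.perf_counter()
    for _ in range(iters):
        fn(*args)
    return (time.perf_counter() - start) / iters


mean_seconds = benchmark(sum, range(100_000))
print(f"sum over 100k ints: {mean_seconds * 1e6:.1f} us per call")
```

When timing MLX code, the function being timed would also need to force evaluation (for example with `mx.eval`), since MLX computes arrays lazily.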

New Contributors

@pdevine made their first contribution in #2948
@hwiesmann made their first contribution in #2963
@Anri-Lombard made their first contribution in #2966
@Evanev7 made their first contribution in #2986
@MillaFleurs made their first contribution in #2978

Full Changelog: v0.30.1...v0.30.3
