Releases: ml-explore/mlx-swift-lm
3.31.3
First 3.x release.
See the upgrade documentation for information on what changed and how to upgrade.
What's Changed
- Decouple from tokenizer and downloader packages by @DePasqualeOrg in #118
- Batched LLM inference part 1 - consolidating RoPE calls by @ronaldmannak in #178
- Add speculative decoding by @petrukha-ivan in #173
- Fix doc comments and verify in CI by @DePasqualeOrg in #176
- Add more documentation for integrations to readme by @DePasqualeOrg in #201
- Fix tool calling for Llama 3 by @aleroot in #145
- Add IntegrationTesting Xcode project and additional integration test models by @atdrendel in #142
- Fix links in readme by @DePasqualeOrg in #204
- Fix Swift 6 Sendable error in Llama3ToolCallParser by @Lakr233 in #203
- Add gemma 4 model (text, vision, MoE) by @adrgrondin in #180
- Add Gemma 4 text model support (E2B and E4B) by @stefan-geens in #185
- small v3 api fixes by @davidkoski in #190
- v3 api embedder fixes by @davidkoski in #202
- Fix prompt-cache round-trip support for ArraysCache, MambaCache, and CacheList by @ronaldmannak in #155
- Prepare Gemma4Text for batched RoPE offsets by @ronaldmannak in #212
- Fix Gemma 4 system message and modality order by @adrgrondin in #211
- Add qwen3_next to tool call format inference by @alankessler in #166
- add upgrade docs, how to use, how to develop. by @davidkoski in #206
New Contributors
- @Lakr233 made their first contribution in #203
- @stefan-geens made their first contribution in #185
Full Changelog: 2.31.3...3.31.3
2.31.3
In addition to the many changes and improvements in mlx-swift-lm, this release also:
- uses mlx-swift 0.31.3
- switches Package.swift to use the 6.1 swift-tools-version -- this will help keep the code concurrency safe
Important
This will be the last tag for the 2.x releases. We will continue with some breaking API changes on main with 3.x
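For projects that want to stay on the 2.x line while the 3.x API changes land, a version requirement that stops before 3.0.0 is one option. The manifest below is a minimal sketch, not taken from the project's documentation: the package name `MyApp`, the platform minimums, and the choice of the `MLXLLM` product are assumptions for illustration. Because this release moves its own Package.swift to swift-tools-version 6.1, a Swift 6.1 or newer toolchain is needed to resolve it.

```swift
// swift-tools-version: 6.1
import PackageDescription

let package = Package(
    name: "MyApp",  // hypothetical consumer package
    // Assumed platform minimums; check the dependency's own manifest.
    platforms: [.macOS(.v14), .iOS(.v17)],
    dependencies: [
        // `from:` allows 2.31.3 and later 2.x tags but excludes 3.0.0 and above,
        // so the breaking 3.x API changes are not picked up automatically.
        .package(url: "https://github.com/ml-explore/mlx-swift-lm", from: "2.31.3")
    ],
    targets: [
        .executableTarget(
            name: "MyApp",
            dependencies: [
                // Assumed product name; substitute the library you actually use.
                .product(name: "MLXLLM", package: "mlx-swift-lm")
            ]
        )
    ]
)
```

Switching the requirement to `from: "3.31.3"` would opt in to the 3.x series once the code has been migrated.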
What's Changed
- Enforce structured concurrency for MLXEmbedders by @CodebyCR in #111
- Add wiredMemoryTicket to GenerateTokens by @ronaldmannak in #117
- Adding Support for Qwen3.5 & Qwen3.5 MoE (Text-only) by @johnmai-dev in #97
- Allow reading LFM2 models nested rope params by @adrgrondin in #122
- Fix KVCache serialization by @petrukha-ivan in #121
- Adding Support for Qwen3.5 & Qwen3.5 MoE (Vision) by @johnmai-dev in #120
- Add JSON5 support by @ronaldmannak in #125
- Pick up swift-transformers 1.1.9 by @davidkoski in #126
- ensure models have links back to where they were ported from by @davidkoski in #105
- add optional toolCall dispatch and tool output injection by @davidkoski in #114
- Pass additionalContext to Qwen3VL by @adrgrondin in #127
- Qwen3.5 performance optimization by @johnmai-dev in #129
- Fix XMLFunctionParser regex to match newlines by @pixelsoccupied in #131
- audit RoPE use across models by @davidkoski in #115
- fix Sendable issues, unused code, deprecation warnings by @davidkoski in #113
- Add qwen3_5_text model type support by @adhney in #135
- Fix LFM2.5 VL tools by @viktike in #139
- Fix tool calling for Mistral 3 by @atdrendel in #132
- Fixed tool calling for qwen3.5 by @tpae in #133
- Adding support for GLM-OCR model by @smdesai in #144
- Add topK, minP and penalty parameters to GenerateParameters by @adrgrondin in #141
- Add gemma 3 embedding model by @CodebyCR in #136
- fix: @ModuleInfo for pooler + attention mask dtype in Bert/NomicBert by @jowharshamshiri in #153
- Fix LFM2 tool calling with nested parentheses in arguments by @tpae in #152
- fix unreliable tests by @davidkoski in #128
- add missing context/toolcall parameters by @davidkoski in #140
- perf: eliminate CPU-GPU sync in penalty processors, optimize TopPSampler by @spokvulcan in #147
- Add copy() to KVCache protocol and all implementations by @alankessler in #158
- Add KV cache initializers and cache access to ChatSession by @alankessler in #151
- Add model-defined pooling fallback for embedding models by @sxy-trans-n in #156
- Handle multiple tool calls in ChatSession by @alankessler in #162
- switch to swift 6 -- prevent concurrency issues, fix concurrency issues by @davidkoski in #165
New Contributors
- @johnmai-dev made their first contribution in #97
- @pixelsoccupied made their first contribution in #131
- @adhney made their first contribution in #135
- @viktike made their first contribution in #139
- @atdrendel made their first contribution in #132
- @jowharshamshiri made their first contribution in #153
- @spokvulcan made their first contribution in #147
- @alankessler made their first contribution in #158
- @sxy-trans-n made their first contribution in #156
Full Changelog: 2.30.6...2.31.3
2.30.6
Switch to mlx-swift 0.30.6: https://github.com/ml-explore/mlx-swift/releases/tag/0.30.6
Important:
- ml-explore/mlx-swift-examples#462 -- incorrect detection of NAX hardware on iPhone 16 Pro
What's Changed
- Fix AfMoE and DeepSeek V3 by @DePasqualeOrg in #30
- Add GLM 4.7 Flash models by @ronaldmannak in #68
- Added minicpm support by @robertmsale in #71
- Support for NemotronH by @fmarmori in #75
- Added Qwen3-Next-80b by @robertmsale in #70
- Feature/vlm external video frames support [Draft] by @vade in #64
- Add configurable tool call parsing with support for multiple model formats by @tpae in #78
- Add skill.md by @ronaldmannak in #92
- LFM2.5 outputs [func(arg='value')] Pythonic syntax, not JSON by @tpae in #91
- Update code snippet in README. by @zaneenders in #96
- trim vocabulary padding in weight sanitization in Gemma3Text by @aleroot in #99
- Rename MLXEmbedders sub library to match target name by @CodebyCR in #102
- Added wired memory control by @robertmsale in #72
- Fix MLXEmbedders README.md example by @CodebyCR in #103
- Add documentation for specified embedding model by @CodebyCR in #104
- Add raw token streaming (TokenGeneration) and generateTokens convenience APIs by @ronaldmannak in #88
- Fix latest mistral 3.2 model loading by @aleroot in #108
- Add tools parameter to ChatSession for function calling support by @louis-jan in #107
- Add MiniMax and MiMo v2 Flash models by @ronaldmannak in #50
- Update glm4_moe_lite To Store KV Latent in Cache by @ronaldmannak in #73
- switch to current mlx-swift by @davidkoski in #100
New Contributors
- @robertmsale made their first contribution in #71
- @fmarmori made their first contribution in #75
- @vade made their first contribution in #64
- @tpae made their first contribution in #78
- @zaneenders made their first contribution in #96
- @louis-jan made their first contribution in #107
Full Changelog: 2.30.3...2.30.6
2.30.3
Picks up mlx-swift 0.30.3.
What's Changed
- Add GLM 4.7 model by @ronaldmannak in #48
- Fix #44 Add support for chat re-hydration by @aleroot in #45
- fix(gemma3n): support per-layer intermediate_size array by @swernerx in #46
- Optimize model loading performance by @DePasqualeOrg in #34
- SwissAI Apertus Model Implementation by @BlackSamorez in #37
- split lint github action by @davidkoski in #13
- adopt mlx-swift 0.30.2 by @davidkoski in #52
- fix gemma3 + attention mask by @davidkoski in #53
- GPT-oss performance optimizations by @ronaldmannak in #51
- Add LFM2 VL model by @adrgrondin in #58
- Add MLX embedders code documentation by @CodebyCR in #65
- Use EOS tokens from config files by @ronaldmannak in #69
- fix trailing comma by @davidkoski in #74
- restore Sendable to ModelAdaptor by @davidkoski in #56
- fix thread safety issues by @davidkoski in #55
New Contributors
- @ronaldmannak made their first contribution in #48
- @aleroot made their first contribution in #45
- @swernerx made their first contribution in #46
- @BlackSamorez made their first contribution in #37
- @CodebyCR made their first contribution in #65
Full Changelog: 2.29.3...2.30.3
2.29.3
Many bug fixes and new models. Checkpoint commit before picking up new mlx-swift.
What's Changed
- Fix broken link in README.md by @madrob in #3
- Fix prompt time metric by @petrukha-ivan in #11
- Align cache implementation with Python mlx-lm by @DePasqualeOrg in #10
- feat: Add Jamba model by @sairamanareddy in #8
- Add Olmo 3 by @DePasqualeOrg in #9
- Add Arcee-AI's AfMoE by @smdesai in #12
- Add Mistral 3, fix SuScaledRoPE by @DePasqualeOrg in #16
- Fix many compiler warnings by @DePasqualeOrg in #14
- Fix GPTOSS Fatal error in getSlidingWindowMask by @jeanmatthieu in #39
- Add Ministral 3 with vision (Pixtral) by @adrgrondin in #18
- Update swift-transformers version requirement by @DePasqualeOrg in #28
- Expose inner model for all models by @DePasqualeOrg in #32
- Fix Mistral3TextConfiguration parsing by @jeanmatthieu in #43
New Contributors (to mlx-swift-lm anyway)
- @madrob made their first contribution in #3
- @petrukha-ivan made their first contribution in #11
- @DePasqualeOrg made their first contribution in #10
- @sairamanareddy made their first contribution in #8
- @smdesai made their first contribution in #12
- @jeanmatthieu made their first contribution in #39
- @adrgrondin made their first contribution in #18
Full Changelog: 2.29.2...2.29.3
2.29.2
2.29.1
mlx-swift-lm for mlx-swift 0.29.1
Please read the release notes for mlx-swift 0.29.1.
What's Changed
- Port of nanochat by @smdesai in ml-explore/mlx-swift-examples#415
- Add Qwen 3 VL Support (Dense) by @rudrankriyam in ml-explore/mlx-swift-examples#414
- prep for mlx-swift 0.29.1 by @davidkoski in ml-explore/mlx-swift-examples#411
Full Changelog: 2.25.9...2.29.1