Releases · ml-explore/mlx-swift-lm

15 Apr 14:33

davidkoski

3.31.3

1c05248

3.31.3 Latest

First 3.x release.

See the upgrade documentation for information on what changed and how to upgrade.

What's Changed

Decouple from tokenizer and downloader packages by @DePasqualeOrg in #118
Batched LLM inference part 1 - consolidating RoPE calls by @ronaldmannak in #178
Add speculative decoding by @petrukha-ivan in #173
Fix doc comments and verify in CI by @DePasqualeOrg in #176
Add more documentation for integrations to readme by @DePasqualeOrg in #201
Fix tool calling for Llama 3 by @aleroot in #145
Add IntegrationTesting Xcode project and additional integration test models by @atdrendel in #142
Fix links in readme by @DePasqualeOrg in #204
Fix Swift 6 Sendable error in Llama3ToolCallParser by @Lakr233 in #203
Add gemma 4 model (text, vision, MoE) by @adrgrondin in #180
Add Gemma 4 text model support (E2B and E4B) by @stefan-geens in #185
small v3 api fixes by @davidkoski in #190
v3 api embedder fixes by @davidkoski in #202
Fix prompt-cache round-trip support for ArraysCache, MambaCache, and CacheList by @ronaldmannak in #155
Prepare Gemma4Text for batched RoPE offsets by @ronaldmannak in #212
Fix Gemma 4 system message and modality order by @adrgrondin in #211
Add qwen3_next to tool call format inference by @alankessler in #166
add upgrade docs, how to use, how to develop. by @davidkoski in #206

New Contributors

@Lakr233 made their first contribution in #203
@stefan-geens made their first contribution in #185

Full Changelog: 2.31.3...3.31.3

Contributors

atdrendel, ronaldmannak, and 8 other contributors

01 Apr 20:50

davidkoski

2.31.3

25b00d4

In addition to the many changes and improvements in mlx-swift-lm, this release also:

uses mlx-swift 0.31.3
switches Package.swift to use the 6.1 swift-tools-version -- this will help keep the code concurrency safe
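
As an illustration of what the tools-version change can mean for a downstream project, here is a minimal sketch of a consumer manifest pinned to this tag. It is not taken from the release notes: the package name `MyApp`, the platform minimums, and the choice of the `MLXLLM` product are assumptions, so check them against the repository's own Package.swift. Because the library's manifest declares swift-tools-version 6.1, resolving it should require a Swift 6.1 or newer toolchain.

```swift
// swift-tools-version: 6.1
// Hypothetical consumer manifest; names and platform minimums are assumptions.
import PackageDescription

let package = Package(
    name: "MyApp",  // hypothetical package name
    platforms: [
        // Assumed minimums; confirm against the library's manifest.
        .macOS(.v14),
        .iOS(.v17),
    ],
    dependencies: [
        // Pin to the last 2.x tag described in these notes.
        .package(url: "https://github.com/ml-explore/mlx-swift-lm", exact: "2.31.3")
    ],
    targets: [
        .executableTarget(
            name: "MyApp",
            dependencies: [
                // Assumed product name; other products may be needed for VLMs or embedders.
                .product(name: "MLXLLM", package: "mlx-swift-lm")
            ]
        )
    ]
)
```

Pinning with `exact:` keeps a project on the 2.x API until it is ready for the breaking changes planned for 3.x. A version range would be the usual choice once the upgrade has been made.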

Important

This will be the last tag for the 2.x releases. Development will continue on main as 3.x, which includes some breaking API changes.

What's Changed

Enforce structured concurrency for MLXEmbedders by @CodebyCR in #111
Add wiredMemoryTicket to GenerateTokens by @ronaldmannak in #117
Adding Support for Qwen3.5 & Qwen3.5 MoE (Text-only) by @johnmai-dev in #97
Allow reading LFM2 models nested rope params by @adrgrondin in #122
Fix KVCache serialization by @petrukha-ivan in #121
Adding Support for Qwen3.5 & Qwen3.5 MoE (Vision) by @johnmai-dev in #120
Add JSON5 support by @ronaldmannak in #125
Pick up swift-transformers 1.1.9 by @davidkoski in #126
ensure models have links back to where they were ported from by @davidkoski in #105
add optional toolCall dispatch and tool output injection by @davidkoski in #114
Pass additionalContext to Qwen3VL by @adrgrondin in #127
Qwen3.5 performance optimization by @johnmai-dev in #129
Fix XMLFunctionParser regex to match newlines by @pixelsoccupied in #131
audit RoPE use across models by @davidkoski in #115
fix Sendable issues, unused code, deprecation warnings by @davidkoski in #113
Add qwen3_5_text model type support by @adhney in #135
Fix LFM2.5 VL tools by @viktike in #139
Fix tool calling for Mistral 3 by @atdrendel in #132
Fixed tool calling for qwen3.5 by @tpae in #133
Adding support for GLM-OCR model by @smdesai in #144
Add topK, minP and penalty parameters to GenerateParameters by @adrgrondin in #141
Add gemma 3 embedding model by @CodebyCR in #136
fix: @ModuleInfo for pooler + attention mask dtype in Bert/NomicBert by @jowharshamshiri in #153
Fix LFM2 tool calling with nested parentheses in arguments by @tpae in #152
fix unreliable tests by @davidkoski in #128
add missing context/toolcall parameters by @davidkoski in #140
perf: eliminate CPU←GPU sync in penalty processors, optimize TopPSampler by @spokvulcan in #147
Add copy() to KVCache protocol and all implementations by @alankessler in #158
Add KV cache initializers and cache access to ChatSession by @alankessler in #151
Add model-defined pooling fallback for embedding models by @sxy-trans-n in #156
Handle multiple tool calls in ChatSession by @alankessler in #162
switch to swift 6 -- prevent concurrency issues, fix concurrency issues by @davidkoski in #165

New Contributors

@johnmai-dev made their first contribution in #97
@pixelsoccupied made their first contribution in #131
@adhney made their first contribution in #135
@viktike made their first contribution in #139
@atdrendel made their first contribution in #132
@jowharshamshiri made their first contribution in #153
@spokvulcan made their first contribution in #147
@alankessler made their first contribution in #158
@sxy-trans-n made their first contribution in #156

Full Changelog: 2.30.6...2.31.3

Contributors

tpae, atdrendel, and 14 other contributors

18 Feb 18:29

davidkoski

2.30.6

7e19e09

Switch to mlx-swift 0.30.6: https://github.com/ml-explore/mlx-swift/releases/tag/0.30.6

Important:

ml-explore/mlx-swift-examples#462 -- incorrect detection of NAX hardware on iPhone 16 Pro
- via ml-explore/mlx#3083

What's Changed

Fix AfMoE and DeepSeek V3 by @DePasqualeOrg in #30
Add GLM 4.7 Flash models by @ronaldmannak in #68
Added minicpm support by @robertmsale in #71
Support for NemotronH by @fmarmori in #75
Added Qwen3-Next-80b by @robertmsale in #70
Feature/vlm external video frames support [Draft] by @vade in #64
Add configurable tool call parsing with support for multiple model formats by @tpae in #78
Add skill.md by @ronaldmannak in #92
LFM2.5 outputs [func(arg='value')] Pythonic syntax, not JSON by @tpae in #91
Update code snippet in README. by @zaneenders in #96
trim vocabulary padding in weight sanitization in Gemma3Text by @aleroot in #99
Rename MLXEmbedders sub library to match target name by @CodebyCR in #102
Added wired memory control by @robertmsale in #72
Fix MLXEmbedders README.md example by @CodebyCR in #103
Add documentation for specified embedding model by @CodebyCR in #104
Add raw token streaming (TokenGeneration) and generateTokens convenience APIs by @ronaldmannak in #88
Fix latest mistral 3.2 model loading by @aleroot in #108
Add tools parameter to ChatSession for function calling support by @louis-jan in #107
Add MiniMax and MiMo v2 Flash models by @ronaldmannak in #50
Update glm4_moe_lite To Store KV Latent in Cache by @ronaldmannak in #73
switch to current mlx-swift by @davidkoski in #100