add a minimal LLM chat example + switch to mlx-swift 0.30.2 #454
base: main
Conversation
- LLMEval is more of a showcase of features and runtime statistics; this provides the minimum required to load a model and interact with it
- also cleans up the xcodeproj (see #451)
- removes VLMEval (redundant and wasn't maintained)
            self.task = nil
        }
    }
}
This and the next file (ContentView) are the full minimal chat app.
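As a hedged sketch, the shape of such a minimal chat view model might look like the following. `ChatSession` and `streamResponse(to:)` come from MLXLMCommon; everything else (`ChatViewModel`, `output`, `send`, `cancel`) is an assumed, illustrative name, not necessarily what the PR uses:

```swift
// Illustrative sketch only: a minimal observable chat view model.
import MLXLMCommon

@Observable @MainActor
final class ChatViewModel {
    var output = ""
    private var task: Task<Void, Never>?

    func send(_ prompt: String, session: ChatSession) {
        task = Task {
            do {
                // stream chunks into the UI as they are produced
                for try await chunk in session.streamResponse(to: prompt) {
                    output += chunk
                }
            } catch {
                output += "\nError: \(error)"
            }
            self.task = nil
        }
    }

    func cancel() {
        task?.cancel()
        task = nil
    }
}
```

Keeping the `Task` handle around, as in the diff above, is what allows the UI to cancel a generation in flight.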
### Troubleshooting

If the program crashes with a very deep stack trace, you may need to build
in Release configuration. This seems to depend on the size of the model.
This advice was obsolete.
@@ -1,6 +1,5 @@
// Copyright © 2025 Apple Inc.

import AsyncAlgorithms
Not used
.product(name: "MLXNN", package: "mlx-swift"),
.product(name: "MLXOptimizers", package: "mlx-swift"),
.product(name: "MLXRandom", package: "mlx-swift"),
.product(name: "Transformers", package: "swift-transformers"),
Not used
- ml-explore/mlx-swift-examples#454
- fixes #27
- move ChatSession integration tests into new test target so we can more easily control when it runs
- make a ChatSession _unit_ (more or less) test
- fix Sendable / thread safety issues uncovered by LLMBasic
- collect TestTokenizer and friends in its own file; fix warnings in tests
- UserInputProcessors -> structs

- see #27
- a port of ml-explore/mlx-lm#463 (happened after the initial port to swift)
- in support of ml-explore/mlx-swift-examples#454

- support for ml-explore/mlx-swift-examples#454
- ModelContainer appeared to provide thread-safe access to the KVCache and model, but in fact did not: async token generation could use the KVCache concurrently
- if you broke the async stream early, the previous call could still be running
- swift-format
    self.tokensPerSecond = Double(self.totalTokens) / elapsed
    self.totalTime = elapsed
}
let lmInput = try await modelContainer.prepare(input: userInput)
This is a little easier with the updated API on ModelContainer
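A rough sketch of the call shape, based only on the diff above; the prompt text is illustrative:

```swift
// Based on the diff above: prepare the input via the container itself,
// rather than inside a perform { } closure.
let userInput = UserInput(prompt: "Why is the sky blue?")
let lmInput = try await modelContainer.prepare(input: userInput)
```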
- Active Memory: \(FormatUtilities.formatMemory(memoryUsed))/\(FormatUtilities.formatMemory(GPU.memoryLimit))
- Cache Memory: \(FormatUtilities.formatMemory(cacheMemory))/\(FormatUtilities.formatMemory(GPU.cacheLimit))
+ Active Memory: \(FormatUtilities.formatMemory(memoryUsed))/\(FormatUtilities.formatMemory(Memory.memoryLimit))
+ Cache Memory: \(FormatUtilities.formatMemory(cacheMemory))/\(FormatUtilities.formatMemory(Memory.cacheLimit))
This was changed from GPU to Memory to match the Python side (we aren't always running on a GPU).
These are deprecation warnings, not build breaks.
private func startInner() async throws {
    // setup
-   GPU.set(cacheLimit: 32 * 1024 * 1024)
+   Memory.cacheLimit = 32 * 1024 * 1024
Exposing this as a property is more Swifty; the new Memory API provides it that way.
func run() async throws {
-   Device.setDefault(device: Device(device))
+   try await Device.withDefaultDevice(Device(device)) {
This is now Task-scoped rather than global, which is a better fit for the Swift concurrency model. setDefault is deprecated.
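A minimal sketch of the scoped form, assuming the mlx-swift `Device` API from the diff above; the array math is just a placeholder workload:

```swift
// Sketch: the default device applies only within this closure's Task,
// instead of being set globally via the deprecated setDefault.
import MLX

try await Device.withDefaultDevice(.gpu) {
    let x = MLXArray([1.0, 2.0, 3.0]) * 2
    eval(x)  // evaluated with the scoped default device
}
```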
}
if let chunk = item.chunk {
    print(chunk, terminator: "")
}
Another move to the updated API. Passing the UserInput (which is not Sendable) was an issue in the above code under Swift 6.
I wonder if this is worth moving to ChatSession? That would make it even simpler.
Updated!
var cache: [KVCache]

var printStats = false
}
Replace all of this with ChatSession -- much simpler.
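A sketch of the ChatSession-based replacement, assuming the MLXLMCommon API; the model id is illustrative:

```swift
// Sketch: ChatSession owns the model, KV cache, and chat history, so
// multi-turn conversation needs no manual cache management.
import MLXLMCommon

let model = try await loadModel(id: "mlx-community/Qwen3-4B-4bit")
let session = ChatSession(model)

print(try await session.respond(to: "What are two things to see in San Francisco?"))
print(try await session.respond(to: "How about a third?"))
```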
Proposed changes
I updated the documentation to indicate which examples are more full-featured and which are minimal starting points. Both have uses.
@DePasqualeOrg FYI
Checklist

Put an `x` in the boxes that apply.

- [ ] I have read the CONTRIBUTING document
- [ ] I have run `pre-commit run --all-files` to format my code / installed pre-commit prior to committing changes
- [ ] I have added tests that prove my fix is effective or that my feature works
- [ ] I have updated the necessary documentation (if needed)
update build dependency on mlx-swift-lm when tag is ready