@davidkoski
Collaborator

@davidkoski davidkoski commented Dec 17, 2025

Proposed changes

  • LLMEval has more of a showcase of features and runtime statistics
  • this provides the minimum required to load a model and interact with it
  • also cleans up the xcodeproj (see Fix Xcode project #451)
  • removes VLMEval (redundant and wasn't maintained)
  • fixes all warnings from moving to mlx-swift 0.30.2 (and equivalent mlx-swift-lm, tag not cut yet)

I updated the documentation to indicate which examples are more full-featured and which are minimal starting points. Both have uses.

@DePasqualeOrg FYI

Checklist

Put an x in the boxes that apply.

  • I have read the CONTRIBUTING document

  • I have run pre-commit run --all-files to format my code / installed pre-commit prior to committing changes

  • I have added tests that prove my fix is effective or that my feature works

  • I have updated the necessary documentation (if needed)

  • update build dependency on mlx-swift-lm when tag is ready

@davidkoski davidkoski requested a review from awni December 17, 2025 22:00
self.task = nil
}
}
}
Collaborator Author

This and the next file (ContentView) are the full minimal chat app.

### Troubleshooting

If the program crashes with a very deep stack trace, you may need to build
in Release configuration. This seems to depend on the size of the model.
Collaborator Author

This advice was obsolete

@@ -1,6 +1,5 @@
// Copyright © 2025 Apple Inc.

import AsyncAlgorithms
Collaborator Author

Not used

.product(name: "MLXNN", package: "mlx-swift"),
.product(name: "MLXOptimizers", package: "mlx-swift"),
.product(name: "MLXRandom", package: "mlx-swift"),
.product(name: "Transformers", package: "swift-transformers"),
Collaborator Author

Not used

davidkoski added a commit to ml-explore/mlx-swift-lm that referenced this pull request Dec 18, 2025
- ml-explore/mlx-swift-examples#454

- fixes #27
- move ChatSession integration tests into new test target so we can more easily control when it runs
- make a ChatSession _unit_ (more or less) test
- fix Sendable / thread safety issues uncovered by LLMBasic
davidkoski added a commit to ml-explore/mlx-swift-lm that referenced this pull request Jan 6, 2026
- ml-explore/mlx-swift-examples#454

- fixes #27
- move ChatSession integration tests into new test target so we can more easily control when it runs
- make a ChatSession _unit_ (more or less) test
- fix Sendable / thread safety issues uncovered by LLMBasic

- collect TestTokenizer and friends in its own file.  fix warnings in tests
davidkoski added a commit to ml-explore/mlx-swift-lm that referenced this pull request Jan 8, 2026
- ml-explore/mlx-swift-examples#454

- fixes #27
- move ChatSession integration tests into new test target so we can more easily control when it runs
- make a ChatSession _unit_ (more or less) test
- fix Sendable / thread safety issues uncovered by LLMBasic

- collect TestTokenizer and friends in its own file.  fix warnings in tests
- UserInputProcessors -> structs
davidkoski added a commit to ml-explore/mlx-swift-lm that referenced this pull request Jan 9, 2026
- see #27
- a port of ml-explore/mlx-lm#463 (happened after the initial port to swift)

- in support of ml-explore/mlx-swift-examples#454
davidkoski added a commit to ml-explore/mlx-swift-lm that referenced this pull request Jan 9, 2026
- support for ml-explore/mlx-swift-examples#454
- ModelContainer appeared to provide thread safe access to the KVCache and model
    - but in fact was not -- async token generation could use the KVCache concurrently
    - if you were to break the async stream early, the previous call could still be running
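
The hazard described above can be sketched in plain Swift. This is only an illustration under stated assumptions: `SharedCache`, `generate(into:)` and `stopEarlyThenSettle()` are hypothetical names, the actor stands in for state such as a KVCache, and none of this is the fix that went into mlx-swift-lm. In this sketch nothing stops the producer when the consumer leaves the loop, so the caller keeps the producing `Task`, cancels it, and awaits it before touching the shared state again.

```swift
/// Hypothetical stand-in for state that two generations must not touch at once.
actor SharedCache {
    private(set) var writes = 0
    func append() { writes += 1 }
}

/// Starts a producer that writes to the cache and yields token indices
/// until it is cancelled. Returns both the stream and the producing task.
func generate(into cache: SharedCache) -> (stream: AsyncStream<Int>, producer: Task<Void, Never>) {
    let (stream, continuation) = AsyncStream.makeStream(
        of: Int.self, bufferingPolicy: .bufferingNewest(16))
    let producer = Task {
        var token = 0
        while !Task.isCancelled {
            await cache.append()          // touches the shared state
            continuation.yield(token)
            token += 1
            await Task.yield()            // give the consumer a chance to run
        }
        continuation.finish()
    }
    return (stream, producer)
}

/// Consumes three elements, leaves early, then waits for the producer to stop.
/// Returns true if the cache is no longer being written afterwards.
func stopEarlyThenSettle() async -> Bool {
    let cache = SharedCache()
    let (stream, producer) = generate(into: cache)
    var received = 0
    for await _ in stream {
        received += 1
        if received == 3 { break }        // consumer leaves early
    }
    // The producer is still running here, so cancel it and wait for it.
    producer.cancel()
    await producer.value
    let first = await cache.writes
    await Task.yield()
    let second = await cache.writes
    return first == second
}
```

Awaiting `producer.value` is the important step: only after it returns is it safe to start another generation against the same state.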
davidkoski added a commit to ml-explore/mlx-swift-lm that referenced this pull request Jan 9, 2026
- support for ml-explore/mlx-swift-examples#454
- ModelContainer appeared to provide thread safe access to the KVCache and model
    - but in fact was not -- async token generation could use the KVCache concurrently
    - if you were to break the async stream early, the previous call could still be running

swift-format
@davidkoski davidkoski changed the title add a minimal LLM chat example add a minimal LLM chat example + switch to mlx-swift 0.30.2 Jan 9, 2026
self.tokensPerSecond = Double(self.totalTokens) / elapsed
self.totalTime = elapsed
}
let lmInput = try await modelContainer.prepare(input: userInput)
Collaborator Author

This is a little easier with the updated API on ModelContainer

-Active Memory: \(FormatUtilities.formatMemory(memoryUsed))/\(FormatUtilities.formatMemory(GPU.memoryLimit))
-Cache Memory: \(FormatUtilities.formatMemory(cacheMemory))/\(FormatUtilities.formatMemory(GPU.cacheLimit))
+Active Memory: \(FormatUtilities.formatMemory(memoryUsed))/\(FormatUtilities.formatMemory(Memory.memoryLimit))
+Cache Memory: \(FormatUtilities.formatMemory(cacheMemory))/\(FormatUtilities.formatMemory(Memory.cacheLimit))
Collaborator Author

This was changed from GPU -> Memory to match the python side (we aren't always running on a GPU).

Collaborator Author

These are deprecation warnings, not build breaks.

private func startInner() async throws {
// setup
-GPU.set(cacheLimit: 32 * 1024 * 1024)
+Memory.cacheLimit = 32 * 1024 * 1024
Collaborator Author

Exposing this as a property is more Swifty -- the new Memory API does it that way.
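
For readers unfamiliar with how a setter-to-property migration can produce warnings rather than build breaks, here is a minimal sketch. `MemorySettings` and its members are hypothetical and only mirror the shape of the change; this is not the mlx-swift source, and the real `Memory` API may be organized differently.

```swift
/// Hypothetical settings type, used only to illustrate the migration pattern.
struct MemorySettings {
    /// Cache limit in bytes, exposed as a plain property.
    var cacheLimit: Int = 0

    /// Old-style setter kept for source compatibility.
    /// Callers still compile but see a deprecation warning.
    @available(*, deprecated, message: "assign to cacheLimit instead")
    mutating func set(cacheLimit: Int) {
        self.cacheLimit = cacheLimit
    }
}

/// Configures the limit using the property form.
func configure() -> MemorySettings {
    var settings = MemorySettings()
    // settings.set(cacheLimit: 32 * 1024 * 1024) would still build, with a warning.
    settings.cacheLimit = 32 * 1024 * 1024
    return settings
}
```

Keeping the deprecated method as a thin forwarder is what lets existing call sites keep building while they migrate.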


func run() async throws {
-Device.setDefault(device: Device(device))
+try await Device.withDefaultDevice(Device(device)) {
Collaborator Author

This is now Task scoped rather than global -- this is a better fit for the swift model. The setDefault is deprecated.
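
As a rough illustration of what a Task-scoped default looks like in Swift, the sketch below uses the standard library's `@TaskLocal`. The types `DeviceKind` and `Defaults` are hypothetical stand-ins, and this makes no claim about how `Device.withDefaultDevice` is implemented in mlx-swift; it only shows why a scoped binding cannot leak into unrelated code the way a global setter can.

```swift
/// Hypothetical device selector, not the MLX `Device` type.
enum DeviceKind: String, Sendable {
    case cpu, gpu
}

enum Defaults {
    /// A task-local value: the binding is visible only inside the scope
    /// that sets it (and in child tasks started within that scope).
    @TaskLocal static var device: DeviceKind = .gpu
}

/// Reads whatever default is bound for the current scope.
func currentDeviceName() -> String {
    Defaults.device.rawValue
}

/// Reads the default before, inside, and after a scoped override.
func scopedDeviceNames() -> [String] {
    let before = currentDeviceName()
    let inside = Defaults.$device.withValue(.cpu) {
        currentDeviceName()               // sees the scoped override
    }
    let after = currentDeviceName()       // the override ended with the closure
    return [before, inside, after]
}
```

Because the override ends when the closure returns, there is no global state to restore afterwards.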

}
if let chunk = item.chunk {
print(chunk, terminator: "")
}
Collaborator Author

Another move to the updated API. Passing the UserInput (not Sendable) was an issue in the code above under Swift 6.

Collaborator Author

I wonder if this is worth moving to ChatSession? That would make it even simpler.

Collaborator Author

Updated!

var cache: [KVCache]

var printStats = false
}
Collaborator Author

Replace all of this with ChatSession -- much simpler.
