- Porting and implementing models
- MLXLMCommon -- common API for LLM and VLM
- MLXLLM -- large language model example implementations
- MLXVLM -- vision language model example implementations
This is a port of several models, using the Hugging Face swift-transformers package to provide tokenization.
`LLMModelFactory.swift` provides minor overrides and customization -- if you require overrides for the tokenizer or prompt customization, they can be added there.
This is set up to load models from Hugging Face, e.g. https://huggingface.co/mlx-community
The following models have been tried:
- mlx-community/CodeLlama-13b-Instruct-hf-4bit-MLX
- mlx-community/Llama-3.2-1B-Instruct-4bit
- mlx-community/Llama-3.2-3B-Instruct-4bit
- mlx-community/Meta-Llama-3-8B-Instruct-4bit
- mlx-community/Meta-Llama-3.1-8B-Instruct-4bit
- mlx-community/Mistral-7B-Instruct-v0.3-4bit
- mlx-community/Mistral-Nemo-Instruct-2407-4bit
- mlx-community/OpenELM-270M-Instruct
- mlx-community/Phi-3.5-MoE-instruct-4bit
- mlx-community/Phi-3.5-mini-instruct-4bit
- mlx-community/Qwen1.5-0.5B-Chat-4bit
- mlx-community/SmolLM-135M-Instruct-4bit
- mlx-community/gemma-2-2b-it-4bit
- mlx-community/gemma-2-9b-it-4bit
- mlx-community/phi-2-hf-4bit-mlx
- mlx-community/quantized-gemma-2b-it
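For example, one of these can be loaded by id. A minimal sketch, assuming the `loadContainer(configuration:)` entry point on the shared factory (the weights are downloaded from Hugging Face on first use):

```swift
import MLXLLM
import MLXLMCommon

// Minimal sketch: load a model by Hugging Face id via the shared factory.
// Assumes LLMModelFactory.shared.loadContainer(configuration:).
func loadModel() async throws -> ModelContainer {
    try await LLMModelFactory.shared.loadContainer(
        configuration: ModelConfiguration(id: "mlx-community/Llama-3.2-1B-Instruct-4bit"))
}
```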
Currently supported model types are:
- Cohere
- Gemma
- Gemma2
- InternLM2
- Llama / Mistral
- OpenELM
- Phi
- Phi3
- PhiMoE
- Qwen2
- Starcoder2
See `llm-tool` for an example of loading and evaluating these models from the command line.
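In code, a single-turn generation looks roughly like this -- a hedged sketch that assumes the `UserInput`, `GenerateParameters`, and `generate` shapes in MLXLMCommon:

```swift
import MLXLLM
import MLXLMCommon

// Hedged sketch of one generation pass, assuming the MLXLMCommon API:
// prepare the prompt, generate until the model stops, decode the tokens.
func respond(_ container: ModelContainer, to prompt: String) async throws -> String {
    try await container.perform { context in
        let input = try await context.processor.prepare(input: UserInput(prompt: prompt))
        let result = try MLXLMCommon.generate(
            input: input, parameters: GenerateParameters(temperature: 0.6), context: context
        ) { _ in .more }
        return context.tokenizer.decode(tokens: result.tokens)
    }
}
```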
If the model follows the typical LLM pattern:

- `config.json`, `tokenizer.json`, and `tokenizer_config.json`
- `*.safetensors`

You can follow the pattern of the models in the `Models` directory and create a `.swift` file for your new model:
Create a configuration struct to match the `config.json` (any parameters needed):
```swift
public struct YourModelConfiguration: Codable, Sendable {
    public let hiddenSize: Int

    // use this pattern for values that need defaults
    public let _layerNormEps: Float?
    public var layerNormEps: Float { _layerNormEps ?? 1e-6 }

    enum CodingKeys: String, CodingKey {
        case hiddenSize = "hidden_size"
        case _layerNormEps = "layer_norm_eps"
    }
}
```
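Since the struct is plain `Codable`, it can be exercised directly against a sample `config.json` to check the key mapping and defaults (standard Foundation, nothing model-specific):

```swift
import Foundation

// Decode a sample config: hidden_size is present; layer_norm_eps is absent,
// so the 1e-6 default kicks in via the computed property.
let json = #"{ "hidden_size": 4096 }"#.data(using: .utf8)!
let config = try! JSONDecoder().decode(YourModelConfiguration.self, from: json)
print(config.hiddenSize)   // 4096
print(config.layerNormEps) // 1e-06
```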
Create the model class. The top-level public class should have a structure something like this:
```swift
public class YourModel: Module, LLMModel, KVCacheDimensionProvider, LoRAModel {
    public let kvHeads: [Int]

    @ModuleInfo var model: YourModelInner

    public func loraLinearLayers() -> LoRALinearLayers {
        // TODO: modify as needed
        model.layers.map { ($0.attention, ["q_proj", "v_proj"]) }
    }

    public init(_ args: YourModelConfiguration) {
        self.kvHeads = Array(repeating: args.kvHeads, count: args.hiddenLayers)
        self.model = YourModelInner(args)
    }

    public func callAsFunction(_ inputs: MLXArray, cache: [KVCache]?) -> MLXArray {
        // TODO: modify as needed
        let out = model(inputs, cache: cache)
        return model.embedTokens.asLinear(out)
    }
}
```
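`YourModelInner` holds the actual transformer stack. A heavily abbreviated, hypothetical sketch is below: it assumes extra configuration fields (`vocabularySize`, `hiddenLayers`) beyond the ones shown above, and elides the attention internals that real models in the `Models` directory implement (and that `loraLinearLayers()` refers to):

```swift
import MLX
import MLXLMCommon
import MLXNN

// Hypothetical placeholder block: real models implement attention + MLP here,
// threading the attention mask and KV cache through each layer.
class YourBlock: Module {
    @ModuleInfo(key: "up_proj") var up: Linear
    @ModuleInfo(key: "down_proj") var down: Linear

    init(_ args: YourModelConfiguration) {
        self._up.wrappedValue = Linear(args.hiddenSize, 4 * args.hiddenSize, bias: false)
        self._down.wrappedValue = Linear(4 * args.hiddenSize, args.hiddenSize, bias: false)
    }

    func callAsFunction(_ x: MLXArray) -> MLXArray {
        x + down(silu(up(x)))
    }
}

// Sketch of the inner module: token embedding, a stack of blocks, final norm.
class YourModelInner: Module {
    @ModuleInfo(key: "embed_tokens") var embedTokens: Embedding
    let layers: [YourBlock]
    let norm: RMSNorm

    init(_ args: YourModelConfiguration) {
        // assumes vocabularySize / hiddenLayers fields on the configuration
        self._embedTokens.wrappedValue = Embedding(
            embeddingCount: args.vocabularySize, dimensions: args.hiddenSize)
        self.layers = (0 ..< args.hiddenLayers).map { _ in YourBlock(args) }
        self.norm = RMSNorm(dimensions: args.hiddenSize, eps: args.layerNormEps)
    }

    func callAsFunction(_ inputs: MLXArray, cache: [KVCache]?) -> MLXArray {
        var h = embedTokens(inputs)
        for layer in layers {
            h = layer(h)  // real models also pass a mask and the per-layer cache
        }
        return norm(h)
    }
}
```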
In `LLMModelFactory.swift` register the model type itself (this is independent of the model id):
```swift
public class ModelTypeRegistry: @unchecked Sendable {
    ...
    private var creators: [String: @Sendable (URL) throws -> any LanguageModel] = [
        "yourModel": create(YourModelConfiguration.self, YourModel.init),
```
Add a constant for the model in the `ModelRegistry` (not strictly required, but useful for callers to refer to it in code):
```swift
public class ModelRegistry: @unchecked Sendable {
    ...
    static public let yourModel_4bit = ModelConfiguration(
        id: "mlx-community/YourModel-4bit",
        defaultPrompt: "What is the gravity on Mars and the moon?"
    )
```
and finally add it to the `all()` list -- this will let users find the model configuration by id:
```swift
private static func all() -> [ModelConfiguration] {
    [
        codeLlama13b4bit,
        ...
        yourModel_4bit,
```
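Callers can then load the model through the registered constant. A minimal sketch, reusing the hypothetical `yourModel_4bit` from above and the same `loadContainer` entry point:

```swift
import MLXLLM
import MLXLMCommon

// Load via the registered constant (hypothetical model id from above).
func loadYourModel() async throws -> ModelContainer {
    try await LLMModelFactory.shared.loadContainer(
        configuration: ModelRegistry.yourModel_4bit)
}
```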
`Lora.swift` contains an implementation of LoRA, based on the upstream Python example.
See `llm-tool/LoraCommands.swift` for an example of a driver, and `llm-tool` for examples of how to run it.
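As a rough illustration of how the `loraLinearLayers()` hook is consumed, a hedged sketch (assuming a `LoRATrain.convert`-style entry point as used by the LoRA code; the exact names may differ):

```swift
import MLXLLM

// Hedged sketch: wrap the q/v projections reported by loraLinearLayers()
// in LoRA adapters before fine-tuning. Assumes the LoRATrain API.
func applyLoRA(to model: YourModel) {
    LoRATrain.convert(model: model, layers: model.loraLinearLayers())
}
```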