- macOS on Apple Silicon (M1/M2/M3/M4)
- Python 3.10+
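A quick way to confirm both requirements from a terminal (assumes `python3` is on your PATH):

```shell
# Verify the two requirements above
uname -m            # should print "arm64" on Apple Silicon
python3 --version   # should report Python 3.10 or newer
```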
Using uv:

```bash
git clone https://github.com/waybarrios/vllm-mlx.git
cd vllm-mlx
uv pip install -e .
```

Or using pip:

```bash
git clone https://github.com/waybarrios/vllm-mlx.git
cd vllm-mlx
pip install -e .
```

For video processing with transformers:

```bash
pip install -e ".[vision]"
```

For Speech-to-Text and Text-to-Speech (optional):

```bash
pip install mlx-audio
```

For text embeddings (optional):

```bash
pip install mlx-embeddings
```

Dependencies:

- `mlx`, `mlx-lm`, `mlx-vlm` - MLX framework and model libraries
- `transformers`, `tokenizers` - HuggingFace libraries
- `opencv-python` - Video processing
- `gradio` - Chat UI
- `psutil` - Resource monitoring
- `mlx-audio` (optional) - Speech-to-Text and Text-to-Speech
- `mlx-embeddings` (optional) - Text embeddings
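To see which of the optional features are actually importable in your environment, a small probe like the following can help. The Python module names are inferred from the package names above (e.g. `mlx_audio` for `mlx-audio`, `cv2` for `opencv-python`), so treat them as assumptions:

```shell
# Probe optional features; each line prints "ok" or "missing".
# Module names are inferred from the package names and may differ.
python3 -c "import mlx, mlx_lm"    >/dev/null 2>&1 && echo "core MLX: ok"   || echo "core MLX: missing"
python3 -c "import mlx_vlm, cv2"   >/dev/null 2>&1 && echo "vision: ok"     || echo "vision: missing"
python3 -c "import mlx_audio"      >/dev/null 2>&1 && echo "audio: ok"      || echo "audio: missing"
python3 -c "import mlx_embeddings" >/dev/null 2>&1 && echo "embeddings: ok" || echo "embeddings: missing"
```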
```bash
# Check CLI commands
vllm-mlx --help
vllm-mlx-bench --help
vllm-mlx-chat --help

# Test with a small model
vllm-mlx-bench --model mlx-community/Llama-3.2-1B-Instruct-4bit --prompts 1
```

Ensure you're on Apple Silicon:
```bash
uname -m  # Should output "arm64"
```

Check your internet connection and HuggingFace access. Some models require authentication:
```bash
huggingface-cli login
```

Use a smaller quantized model:
```bash
vllm-mlx serve mlx-community/Llama-3.2-1B-Instruct-4bit
```
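Once the server is running, you can send it a request. This sketch assumes vllm-mlx exposes an OpenAI-compatible `/v1/chat/completions` endpoint on port 8000 (vLLM's usual default); adjust the URL if your setup differs:

```shell
# Send a chat completion request to the local server
# (endpoint path and port are assumptions based on vLLM's defaults)
curl -s http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "mlx-community/Llama-3.2-1B-Instruct-4bit",
        "messages": [{"role": "user", "content": "Hello!"}]
      }'
```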