vLLM Metal Plugin

High-performance LLM inference on Apple Silicon using MLX and vLLM

vLLM Metal is a plugin that enables vLLM to run on Apple Silicon Macs using MLX as the primary compute backend. It unifies MLX and PyTorch under a single lowering path.

Documentation: https://docs.vllm.ai/projects/vllm-metal/en/latest/

Latest News 🔥

[2026/04] We released v0.2.0! The unified paged varlen Metal kernel is now the default attention backend, giving 83x faster TTFT (time to first token) and 3.6x higher throughput compared to v0.1.0.

Requirements

macOS on Apple Silicon
Native arm64 Python 3.12. Rosetta/x86_64 Python is not supported.
Xcode Command Line Tools (xcode-select --install). vLLM core is compiled from source via clang++. The Metal kernels ship prebuilt, so no Metal compiler or toolchain is needed to run them.
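
The first two requirements can be checked before running the installer. The snippet below is a minimal sketch, not part of vllm-metal: the helper name environment_problems is hypothetical, and it assumes that Python 3.12 is required exactly, since the requirements list names no other version.

```python
import platform
import sys


def environment_problems(system, machine, version):
    """Return a list of human-readable problems; an empty list means the basics look fine."""
    problems = []
    if system != "Darwin":
        problems.append(f"expected macOS (Darwin), found {system}")
    if machine != "arm64":
        # An x86_64 Python running under Rosetta reports "x86_64" here.
        problems.append(f"expected native arm64 Python, found {machine}")
    if tuple(version[:2]) != (3, 12):
        # Assumption: only 3.12 is accepted, as stated in the requirements list.
        problems.append(f"expected Python 3.12, found {version[0]}.{version[1]}")
    return problems


if __name__ == "__main__":
    found = environment_problems(
        platform.system(), platform.machine(), sys.version_info
    )
    for problem in found:
        print("problem:", problem)
    print("ok" if not found else "not ok")
```

Run it with the interpreter you plan to install with. It does not check for the Xcode Command Line Tools.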

Supported Models

vllm-metal supports a growing set of models on Apple Silicon. See the full matrix in docs/supported_models.md.

Installation

curl -fsSL https://raw.githubusercontent.com/vllm-project/vllm-metal/main/install.sh | bash

The install script above installs the following under ~/.venv-vllm-metal (the default directory):

vllm-metal plugin
vllm core
Related libraries

After you run source ~/.venv-vllm-metal/bin/activate, the vllm CLI is available and you can use vLLM right away.

For vllm CLI usage, see the official vLLM guide: https://docs.vllm.ai/en/latest/cli/
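
To confirm the install from a script rather than an interactive shell, the CLI can be called through its path inside the virtual environment. This is a minimal sketch: the helper name vllm_cli is hypothetical, and it assumes the default ~/.venv-vllm-metal location and the usual bin/ layout of a Python virtual environment.

```python
from pathlib import Path
import subprocess


def vllm_cli(venv_dir="~/.venv-vllm-metal"):
    """Return the expected path of the vllm entry point inside the virtual environment."""
    return Path(venv_dir).expanduser() / "bin" / "vllm"


if __name__ == "__main__":
    cli = vllm_cli()
    if cli.exists():
        # Entry points in a venv run with that venv's interpreter,
        # so no prior "source .../activate" is needed here.
        subprocess.run([str(cli), "--help"], check=False)
    else:
        print(f"{cli} not found; run the install script first")
```

If you installed to a different directory, pass it as venv_dir.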

Optional: Rust frontend (experimental)

Pass --with-vllm-rs to also install vllm-rs, the experimental Rust frontend vendored in the bundled vLLM release. Requires the Rust toolchain (https://rustup.rs):

./install.sh --with-vllm-rs

See docs/rust_frontend.md for usage and architecture.
