A high-performance API server that exposes OpenAI-compatible endpoints for MLX models. Built in Python on the FastAPI framework, it offers an efficient, scalable, and user-friendly way to run MLX-based vision and language models locally behind an OpenAI-compatible interface.
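Because the endpoints mirror OpenAI's API, any OpenAI-compatible client can target the local server simply by overriding the base URL. A minimal sketch of such a request; the host, port, and model name below are illustrative assumptions, not values fixed by the project:

```python
import json

# Hypothetical local endpoint; host and port depend on how the server is launched.
BASE_URL = "http://localhost:8000/v1"

# A standard OpenAI chat-completions request body; the model name is an example.
payload = {
    "model": "mlx-community/Llama-3.2-3B-Instruct-4bit",
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 64,
}

# An OpenAI-compatible client would POST this body to f"{BASE_URL}/chat/completions".
print(json.dumps(payload, indent=2))
```

Since the request shape is unchanged, existing OpenAI SDKs work as-is once their base URL points at the local server.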
Wraps local models into an OpenAI-style API (Chat + TTS) on macOS, so any OpenAI-compatible client or SDK can connect to your application directly.
Run large Mixture-of-Experts LLMs that exceed system RAM on Apple Silicon by loading only router-selected experts from SSD with MLX. Includes OpenAI/Anthropic-compatible serving for local agentic coding.