Skip to content

Latest commit

 

History

History
17 lines (12 loc) · 546 Bytes

File metadata and controls

17 lines (12 loc) · 546 Bytes

MAX inference server

MAX is a high-performance inference server that provides an OpenAI-compatible endpoint for large language models (LLMs) locally or in the cloud.

To start your own endpoint with just a few commands, check out our quickstart guide.

License

Users must adhere to the terms of usage for MAX and Mojo. Modular Community License.