OpenClaw is an open coding agent that targets local models via the openai-completions API mode.
afm prints a paste-ready config for you:
```
afm mlx -m mlx-community/Qwen3-Coder-Next-4bit --openclaw-config
```

This emits a JSON block with `baseUrl`, `api: "openai-completions"`, model metadata (vision / reasoning detection, context window, max tokens), and zero-cost pricing fields. Copy it into your OpenClaw provider config.
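For reference, the emitted block looks roughly like this. This is a sketch only: the key names beyond `baseUrl` and `api` (which the text above documents), and all numeric values, are illustrative placeholders, not the exact output of any particular afm version.

```json
{
  "baseUrl": "http://localhost:9999/v1",
  "api": "openai-completions",
  "models": [
    {
      "id": "mlx-community/Qwen3-Coder-Next-4bit",
      "reasoning": true,
      "vision": false,
      "contextWindow": 131072,
      "maxTokens": 8192,
      "cost": { "input": 0, "output": 0 }
    }
  ]
}
```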
```
afm mlx -m mlx-community/Qwen3-Coder-Next-4bit \
  --port 9999 --enable-prefix-caching
```

(`--openclaw-config` defaults to port 9999 unless you pass `-p`.)
OpenClaw will issue requests against `http://localhost:9999/v1`. Tool calling, streaming, and reasoning extraction all work out of the box.
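To make the wire format concrete, here is a minimal sketch of the kind of request body an OpenAI-compatible client sends to that endpoint. It assumes the standard chat-completions request shape; the tool name `read_file` and the message content are hypothetical, not part of OpenClaw's actual tool set.

```python
import json

# Base URL as documented above; path follows the OpenAI-compatible convention.
base_url = "http://localhost:9999/v1"
endpoint = f"{base_url}/chat/completions"

# Illustrative request body: streaming plus one hypothetical tool definition.
payload = {
    "model": "mlx-community/Qwen3-Coder-Next-4bit",
    "messages": [{"role": "user", "content": "Summarize src/main.py"}],
    "stream": True,  # streaming works out of the box
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "read_file",  # hypothetical tool name
                "parameters": {
                    "type": "object",
                    "properties": {"path": {"type": "string"}},
                },
            },
        }
    ],
}

print(endpoint)
print(json.dumps(payload)[:40])
```

Any OpenAI-compatible SDK pointed at the `baseUrl` produces an equivalent request, which is why no OpenClaw-specific client code is needed.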
- The generated config detects `<think>` reasoning support automatically.
- Vision capability is reported as `true` for VLM-class model IDs.
- For multi-session use, add `--concurrent N` (each OpenClaw conversation can hold its own slot).
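Putting the pieces together, a full launch for multi-session use might look like the following sketch, combining only the flags documented above (the value `4` is an arbitrary example):

```
afm mlx -m mlx-community/Qwen3-Coder-Next-4bit \
  --port 9999 --enable-prefix-caching \
  --concurrent 4
```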