Aider is a git-aware terminal pair-programmer.
afm mlx -m mlx-community/Qwen3-Coder-Next-4bit \
--port 9999 --enable-prefix-cachingexport OPENAI_API_BASE=http://localhost:9999/v1
export OPENAI_API_KEY=x
aider --model openai/mlx-community/Qwen3-Coder-Next-4bitAider commits/edits/reverts as usual. All inference is local.
- Aider's repo-map prompt grows with codebase size;
--enable-prefix-cachingkeeps subsequent turns fast. - Aider doesn't stream tool-call deltas — it waits for the final assistant message — so streaming overhead is minimal either way.
- For very long sessions, add
--kv-bits 8to halve the KV-cache memory footprint.