Releases: scouzi1966/maclocal-api
afm-next (20260303 · 410d7e5)
Nightly build from main branch.
- Commit: 410d7e5
- Date: 20260303
- Version: 0.9.5-next.410d7e5.20260303
This is an unstable development build. For the latest stable release, use
brew install scouzi1966/afm/afm.
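The nightly version string packs the base version, commit SHA, and build date into dot-separated fields; if you need the pieces in a script, something like this works (a sketch, not an official interface):

```shell
# Split 0.9.5-next.<sha>.<yyyymmdd> into its parts (format per the notes above).
version='0.9.5-next.410d7e5.20260303'
commit=$(printf '%s' "$version" | awk -F. '{print $(NF-1)}')   # second-to-last field
built=$(printf '%s' "$version" | awk -F. '{print $NF}')        # last field
echo "commit=$commit built=$built"
```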
Changes since last build (cd2941e)
- Auto-clone homebrew-afm tap repo if missing during nightly publish (410d7e5)
- Add ownership guard to build-afm-nightly-publish skill (1b71d95)
- Add build-afm-nightly-publish skill (bc777d6)
- Fix Jinja crash on nullable tool schemas (closes #32) (f4c80cc)
- Simplify build-afm Step 3 to report only binary path and version (d96c786)
- Add fork-first instruction to vibe coding callout (7f58591)
- Add vibe coding callout to README for non-Swift developers (75af8fc)
- Reorder build-afm prerequisites by dependency chain (2d498f1)
- Add prerequisite validation to build-afm skill (1c27148)
- Add skills, test reports, qwen3_5 registry alias, and bench tooling (50ed40f)
- Add script paths and commands to report index page (038ba82)
- Show invocation command in report header for reproducibility (c7cdfd3)
- Fix index.html links to use htmlpreview for HTML reports (7d8187c)
- Remove GitHub Actions Pages workflow (Actions disabled on repo) (7b4fff1)
- Add GitHub Actions workflow for Pages deployment (f63791b)
- Add GitHub Pages index for test reports (2e3f7bf)
- Add MLX patch comparison report (3-ref with upstream-only detection) (7d38bf9)
- Add YAML frontmatter to CLI help for AI agent discovery (9f755d8)
- Add structured help with YAML frontmatter for AI agent discovery (a31ccd5)
- Add repeatable MLX patch comparison report generator (8bbec6f)
- Add cached_tokens to usage response, assertion test suite, and test-macafm skill (037c657)
- Reorder test report: description above link (f9cff88)
- Move test report link below title, add description with judge methodology (b8f7728)
- Add afm-next nightly test report link for Qwen3.5-35B-A3B (90dfc8d)
- Test harness: template mode, smart scoring fixes, report improvements (aefe39a)
- Merge pull request #28 from scouzi1966/feature/optimise-metal (4d58e3f)
- Fix review: QKV mode check, QuantizedKVCache mode, dead perf code (2ca6495)
- Fix GLM-5 OOM + gemma-3 crash + MoE argPartition optimization (5b8e88c)
- Auto-detect VLM models + fix model discovery for HF cache dirs (02a7008)
- Perf: Metal kernel fusions + graph optimizations (107.6→130 tok/s, +21%) (9e9fcf3)
- Perf: beat Python mlx-lm throughput (95.7→107.6 tok/s, +12%) (b11fa4a)
- Add assumption for previous afm installation (afb62d6)
- Update model reference in README.md (eb17f6c)
- Update README with new features in nightly build (3f4e859)
- Update README with new features for nightly build (d9c804b)
- Update README with new API features and parameters (1da2ad3)
Install / Upgrade via Homebrew
Fresh install (first time):
brew tap scouzi1966/afm
brew install scouzi1966/afm/afm-next
Upgrade (already installed):
brew upgrade afm-next
If you have stable afm installed, unlink it first:
brew unlink afm
brew install scouzi1966/afm/afm-next
Switch back to stable:
brew unlink afm-next
brew link afm
Force reinstall (same version, new build):
brew reinstall afm-next
afm-next (20260226 · cd2941e)
Nightly build from main branch.
- Commit: cd2941e
- Date: 20260226
- Version: 0.9.5-next.cd2941e.20260226
This is an unstable development build. For the latest stable release, use
brew install scouzi1966/afm/afm.
Changes since last build (0cfba17)
- Fix Metal kernel fallback and temp media file cleanup (d84ab02)
- Add --media flag for VLM single-prompt mode + base64 data URL support (a834638)
- Qwen3.5 perf: add VLM Metal kernel + default to LLM loading (42→95 tok/s) (ec88798)
- Add test scripts, reports, and benchmark strict=False fix (cdceabf)
- Fix afm args quoting with shlex.split, add benchmark script (7905be4)
- Fix afm: args quoting in test harness (read -ra → eval) (0ddb545)
- Clean up README: consolidate install sections, add stable/nightly table (c78e5d5)
- Exclude test report HTML/JSONL from GitHub language stats (a4c9c7d)
- Fix CLI --stop not passed to Server in MlxCommand (f3607af)
- Fix stop sequences in thinking models, add CLI --stop flag, fix JSON schema injection (050e836)
- Fix claude nested session issue and regenerate report with both AI analyses (dd419d8)
- Add Qwen3.5-35B-A3B-4bit test suite and report (129/132 passed) (b93db37)
- Move stable install instructions below latest release link (747a67c)
- Update afm-next heading wording (385e738)
- Change afm-next heading to 'Available NOW' (697abf0)
- Update README with Qwen3.5-35B-A3B support and afm-next install instructions (50fcd13)
Install / Upgrade via Homebrew
Fresh install (first time):
brew tap scouzi1966/afm
brew install scouzi1966/afm/afm-next
Upgrade (already installed):
brew upgrade afm-next
If you have stable afm installed, unlink it first:
brew unlink afm
brew install scouzi1966/afm/afm-next
Switch back to stable:
brew unlink afm-next
brew link afm
Force reinstall (same version, new build):
brew reinstall afm-next
afm-next (20260225 · 0cfba17)
Nightly build from main branch.
- Commit: 0cfba17
- Date: 20260225
- Version: 0.9.5-next.0cfba17.20260225
This is an unstable development build. For the latest stable release, use
brew install scouzi1966/afm/afm.
RUN Qwen3.5-35B-A3B!
afm mlx -m mlx-community/Qwen3.5-35B-A3B-4bit -w
Changes since last build (199cded)
- Download *.jinja files and fix missing chat_template fallback (0cfba17)
- Gate verbose colored logs behind --very-verbose flag (7514638)
Install / Upgrade via Homebrew
Fresh install (first time):
brew tap scouzi1966/afm
brew install scouzi1966/afm/afm-next
Upgrade (already installed):
brew upgrade afm-next
If you have stable afm installed, unlink it first:
brew unlink afm
brew install scouzi1966/afm/afm-next
Switch back to stable:
brew unlink afm-next
brew link afm
Force reinstall (same version, new build):
brew reinstall afm-next
afm-next (20260225 · 199cded)
Nightly build from main branch.
- Commit: 199cded
- Date: 20260225
- Version: 0.9.5-next.199cded.20260225
This is an unstable development build. For the latest stable release, use
brew install scouzi1966/afm/afm.
Changes since last build (04f8b52)
- Merge pull request #26 from alantmiller/fix/vision-async-dispatch (199cded)
- Use 'Changes since last build (SHA)' format in release notes (7f4b7e7)
- Show 'changes since' commit SHA in nightly release notes (62a2a1f)
- Add --since flag to publish-next.sh for changelog control (5a44659)
- Expand install/upgrade instructions in nightly release notes (2913362)
- Include commit SHA in brew version string (969ca10)
- Add commit field and fullVersion to BuildInfo.swift (586bbea)
- Preserve nightly release history with unique tags (745e5cd)
- Add git commit SHA to --version output (e.g. v0.9.5-abc1234) (0e5a8b6)
- Add changelog and install instructions to local publish script (493c94f)
- Fix MXFP4 quantization crash, token counting, gemma3n routing, and test harness (d79a1d3)
- Update README with nightly build installation instructions (f1d2813)
- fix: vision subcommand dispatches async run() correctly (04566f0)
- Fix bare JSON tool call detection and add ToolCallFormat.swift patch (f6efa24)
- Change nightly build to manual trigger only (af238ab)
- Add nightly build workflow for afm-next (8892329)
- Merge origin/main into feature/mlx-prompt-caching (146030f)
- Add tool call parser test results (26/26 pass) (694abbb)
- Fix review findings: zero-arg JSON, prefix caching default, fallback tag detection (1ee1ede)
- Add hermes, llama3_json, gemma, and mistral tool call parsers (341f35d)
- Merge pull request #24 from scouzi1966/feature/structured-outputs (a7b3b38)
- Address PR review: nullable types, null rejection, guided streaming deltas (e25d3ef)
- Add structured outputs, --guided-json CLI flag, and comprehensive test suite (3b30fa0)
- Add incremental streaming tool call arguments and fix parameter name mapping (9b27cac)
- Update README for v0.9.5 features (519d35f)
- Update README with new features and MLX support (fe7bd74)
- Add token-level streaming tool call detection and update CLAUDE.md (ec46e88)
- Add tool calling, stop sequences, response_format, and real token counts (71e2c68)
- Save test reports to test-reports/ with JSONL data and add Kimi brief prompt (bee6a72)
- Add logprobs support, --max-logprobs switch, and dynamic system_fingerprint (b37fdff)
- Bump version to v0.9.5 and add sampling params test report (4abbdee)
- Add top_k, min_p, presence_penalty, and seed sampling parameters (ce69ba0)
- Checkpoint: OpenClaw config, verbose logging, max_completion_tokens, and streaming improvements (b69b9a5)
- Revise README for clarity on v0.9.4 features (c66764e)
- Add Qwen3.5-MoE VLM support, reasoning extraction, --raw flag, and stream cancellation (6cf821c)
- Checkpoint: pre Qwen3.5-397B-A17B-4bit reclassify (8f35675)
- Update OpenCode usage instructions in README (c72a023)
- Revise installation methods in README (73c1640)
- Swap OpenCode setup steps: configure first, then start afm (694f456)
- Add detailed OpenCode /connect instructions to README (b15801b)
- Add OpenCode integration guide to README (90af069)
- Wire all MLX CLI params to server mode, enhance generation logging (1bc007f)
- Revise installation command formatting in README (38c25d2)
- Update README with model repo environment variable (51ef4b4)
- Update README.md (5bc112f)
- Revise README for v0.9.4 feature announcement (3d6c296)
- Update README with feature listing and API access (d7b3bfa)
- Revise README for MLX model support and commands (c89f191)
- Add MLX excitement and quick install to README hero section (4f9b374)
- Add MLX models screenshot to README (0d1bd07)
- Add files via upload (f979ab4)
- Update README with MLX local model support and new v0.9.4 features (9d86386)
- Add regression test report: 61/61 passed (680c465)
- Add MLX model test report: 27/28 passed, Kimi-K2.5 interrupted (fa84cf7)
- Fix MLX metallib resolution for relocated binaries (5059f1b)
Install / Upgrade via Homebrew
Fresh install (first time):
brew tap scouzi1966/afm
brew install scouzi1966/afm/afm-next
Upgrade (already installed):
brew upgrade afm-next
If you have stable afm installed, unlink it first:
brew unlink afm
brew install scouzi1966/afm/afm-next
Switch back to stable:
brew unlink afm-next
brew link afm
Force reinstall (same version, new build):
brew reinstall afm-next
afm-next (20260224 · 7f4b7e7)
Nightly build from main branch.
- Commit: 7f4b7e7
- Date: 20260224
- Version: 0.9.5-next.7f4b7e7.20260224
This is an unstable development build. For the latest stable release, use
brew install scouzi1966/afm/afm.
Changes since last build (04f8b52)
- Use 'Changes since last build (SHA)' format in release notes (7f4b7e7)
- Show 'changes since' commit SHA in nightly release notes (62a2a1f)
- Add --since flag to publish-next.sh for changelog control (5a44659)
- Expand install/upgrade instructions in nightly release notes (2913362)
- Include commit SHA in brew version string (969ca10)
- Add commit field and fullVersion to BuildInfo.swift (586bbea)
- Preserve nightly release history with unique tags (745e5cd)
- Add git commit SHA to --version output (e.g. v0.9.5-abc1234) (0e5a8b6)
- Add changelog and install instructions to local publish script (493c94f)
- Fix MXFP4 quantization crash, token counting, gemma3n routing, and test harness (d79a1d3)
- Update README with nightly build installation instructions (f1d2813)
- Fix bare JSON tool call detection and add ToolCallFormat.swift patch (f6efa24)
- Change nightly build to manual trigger only (af238ab)
- Add nightly build workflow for afm-next (8892329)
- Merge origin/main into feature/mlx-prompt-caching (146030f)
- Add tool call parser test results (26/26 pass) (694abbb)
- Fix review findings: zero-arg JSON, prefix caching default, fallback tag detection (1ee1ede)
- Add hermes, llama3_json, gemma, and mistral tool call parsers (341f35d)
- Merge pull request #24 from scouzi1966/feature/structured-outputs (a7b3b38)
- Address PR review: nullable types, null rejection, guided streaming deltas (e25d3ef)
- Add structured outputs, --guided-json CLI flag, and comprehensive test suite (3b30fa0)
- Add incremental streaming tool call arguments and fix parameter name mapping (9b27cac)
- Update README for v0.9.5 features (519d35f)
- Update README with new features and MLX support (fe7bd74)
- Add token-level streaming tool call detection and update CLAUDE.md (ec46e88)
- Add tool calling, stop sequences, response_format, and real token counts (71e2c68)
- Save test reports to test-reports/ with JSONL data and add Kimi brief prompt (bee6a72)
- Add logprobs support, --max-logprobs switch, and dynamic system_fingerprint (b37fdff)
- Bump version to v0.9.5 and add sampling params test report (4abbdee)
- Add top_k, min_p, presence_penalty, and seed sampling parameters (ce69ba0)
- Checkpoint: OpenClaw config, verbose logging, max_completion_tokens, and streaming improvements (b69b9a5)
- Revise README for clarity on v0.9.4 features (c66764e)
- Add Qwen3.5-MoE VLM support, reasoning extraction, --raw flag, and stream cancellation (6cf821c)
- Checkpoint: pre Qwen3.5-397B-A17B-4bit reclassify (8f35675)
- Update OpenCode usage instructions in README (c72a023)
- Revise installation methods in README (73c1640)
- Swap OpenCode setup steps: configure first, then start afm (694f456)
- Add detailed OpenCode /connect instructions to README (b15801b)
- Add OpenCode integration guide to README (90af069)
- Wire all MLX CLI params to server mode, enhance generation logging (1bc007f)
- Revise installation command formatting in README (38c25d2)
- Update README with model repo environment variable (51ef4b4)
- Update README.md (5bc112f)
- Revise README for v0.9.4 feature announcement (3d6c296)
- Update README with feature listing and API access (d7b3bfa)
- Revise README for MLX model support and commands (c89f191)
- Add MLX excitement and quick install to README hero section (4f9b374)
- Add MLX models screenshot to README (0d1bd07)
- Add files via upload (f979ab4)
- Update README with MLX local model support and new v0.9.4 features (9d86386)
- Add regression test report: 61/61 passed (680c465)
- Add MLX model test report: 27/28 passed, Kimi-K2.5 interrupted (fa84cf7)
- Fix MLX metallib resolution for relocated binaries (5059f1b)
Install / Upgrade via Homebrew
Fresh install (first time):
brew tap scouzi1966/afm
brew install scouzi1966/afm/afm-next
Upgrade (already installed):
brew upgrade afm-next
If you have stable afm installed, unlink it first:
brew unlink afm
brew install scouzi1966/afm/afm-next
Switch back to stable:
brew unlink afm-next
brew link afm
Force reinstall (same version, new build):
brew reinstall afm-next
v0.9.4 — MLX Local Models, All Swift, No Python. WebUI Supported.
What's New in v0.9.4
MLX Local Model Support
- Run any Hugging Face MLX model locally — all Swift, no Python dependency
- Full WebUI support (afm mlx -m <model> -w)
- Interactive model picker for downloaded models
- 28 models tested and verified (Qwen3, Gemma 3/3n, GLM-4/5, DeepSeek V3, LFM2, SmolLM3, Llama 3.2, MiniMax M2.5, Nemotron, Kimi-K2.5, and more)
- Single prompt and pipe mode (afm mlx -m <model> -s "prompt")
API Gateway Mode
- afm -w -g auto-discovers and proxies Ollama, LM Studio, Jan, and other local backends
- Single URL, single WebUI, all your models
Also New
- Vision OCR subcommand (afm vision)
- Reasoning model support (Qwen, DeepSeek, gpt-oss)
- WebUI auto-selects the right model on startup
- Increase max_tokens to 5000
- Increase model load timeout to 360s for large models
Install
pip install macafm
# or
brew install scouzi1966/afm/afm
v0.9.3
What's New in v0.9.3
API Gateway Mode (afm -w -g)
- Auto-discovers and proxies to Ollama, LM Studio, Jan, llama.cpp, and other local LLM backends
- Unified model selector — all backend models appear in a single dropdown
- Model info strip — shows backend name, capabilities (Vision, Tools), and context window size
- LM Studio loaded state detection — correctly identifies loaded vs unloaded models
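In gateway mode the unified dropdown is backed by a single aggregated model list. The sketch below shows the kind of /v1/models-style payload involved; the ids, owned_by values, and endpoint path are assumptions for illustration, not confirmed afm output:

```shell
# Hypothetical aggregated model list: entries from several discovered backends
# appear side by side (all values here are invented for the example).
models='{"data":[{"id":"llama3.2:3b","owned_by":"ollama"},{"id":"qwen3-4b-instruct","owned_by":"lmstudio"},{"id":"foundation","owned_by":"afm"}]}'
printf '%s' "$models" | python3 -c 'import json,sys; [print(m["owned_by"], m["id"]) for m in json.load(sys.stdin)["data"]]'
# With a server running (afm -w -g), the real list would come from the API, e.g.:
#   curl -s http://localhost:9999/v1/models
```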
Reasoning Model Support
- GPT-OSS / DeepSeek / Qwen reasoning — normalizes the reasoning field to reasoning_content for WebUI compatibility
- <think> tag extraction — extracts reasoning from <think>...</think> blocks in streaming responses
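Conceptually, the <think> extraction splits the tagged block from the visible answer. A minimal stand-alone sketch (the sed pattern and sample text are illustrative, not the server's actual implementation):

```shell
# Separate reasoning from the visible answer in a <think>-tagged response.
response='<think>Check the units first.</think>The answer is 42.'
reasoning=$(printf '%s' "$response" | sed -n 's/.*<think>\(.*\)<\/think>.*/\1/p')
content=$(printf '%s' "$response" | sed 's/<think>.*<\/think>//')
echo "reasoning: $reasoning"
echo "content: $content"
```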
WebUI Enhancements
- Apple Intelligence branding in single-model mode with SF Symbol icon
- Startup flash fix — page hidden until branding applied, no more llama.cpp → AFM transition flicker
- Single-model mode fix — model selector dropdown no longer pops up when only Foundation model is available
- llama.cpp webui subtitle — shows "llama.cpp webui" under AFM heading
Streaming & Stats
- stream_options.include_usage sent to all backends (LM Studio, Ollama, Jan) for real token counts
- Estimated token counting fallback for backends without usage data
- Native streaming for Apple Foundation Model
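When stream_options.include_usage is honored, token counts arrive on a final streaming chunk. The shape below follows the OpenAI streaming convention; the numbers are made up and afm's exact chunk may differ:

```shell
# Pull real token counts out of the final usage-bearing chunk.
final_chunk='{"choices":[],"usage":{"prompt_tokens":12,"completion_tokens":34,"total_tokens":46}}'
printf '%s' "$final_chunk" | python3 -c 'import json,sys; u=json.load(sys.stdin)["usage"]; print(u["prompt_tokens"], u["completion_tokens"], u["total_tokens"])'
```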
Other
- pip install — pip install macafm now available as an alternative to Homebrew
- Pre-warm — model pre-warmed on server startup for faster first response
- Multimodal OCR — vision support for Foundation Model
- Fix: WebUI bundled in pip package — afm -w now correctly opens the browser when installed via pip
Installation
# Homebrew
brew tap scouzi1966/afm
brew install afm
brew update
brew upgrade afm   # if previously installed with brew
# pip
pip install macafm
pip install --upgrade macafm   # if installed with pip earlier
Quick Start
afm # API server only
afm -w # API server + WebUI
afm -w -g # WebUI + gateway (auto-discovers all local backends)
AFM v0.9.1 with WebUI!
What's New
Bug Fixes
- Improved error handling for context-window-exceeded errors: when the conversation exceeds Apple Foundation Models' 4096-token limit, the WebUI now displays a clear, formatted error message in the chat instead of silently stopping
- Errors are shown as visible assistant messages with emoji formatting
- Proper stream termination ensures the WebUI exits its waiting state
Changes
- Added contextWindowExceeded error type with token count extraction
- Error messages include specific token counts when available
- OpenAI-compatible context_length_exceeded error type for API responses
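API clients can detect this condition by the error's type field. A sketch of the body (the type value comes from the notes above; the envelope and message text are assumed to follow OpenAI's error format):

```shell
# Detect a context-length error by its OpenAI-compatible "type" field.
err='{"error":{"message":"Conversation exceeds the 4096-token context window.","type":"context_length_exceeded","code":"context_length_exceeded"}}'
printf '%s' "$err" | python3 -c 'import json,sys; print(json.load(sys.stdin)["error"]["type"])'
```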
Installation
Homebrew (recommended)
brew tap scouzi1966/tap
brew install afm
Manual
tar -xzf afm-v0.9.1-arm64.tar.gz
cd afm-v0.9.1-arm64
./install.sh
Requirements
- macOS 26+ with Apple Intelligence enabled
- Apple Silicon Mac (M1/M2/M3/M4 series)
v0.9.0 - llama.cpp WebUI Integration
What's New
llama.cpp WebUI Integration
- Add -w / --webui flag to enable the llama.cpp webui and open the browser automatically
- WebUI provides a chat interface at http://localhost:9999 when enabled
- Rebranded as "Apple Foundation Models" in the UI
API Improvements
- Add /props endpoint for webui compatibility
- Make model field optional in chat completion requests
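Since the model field is optional, a chat request can be just a messages array. A sketch of the minimal body (the endpoint path assumes the OpenAI-compatible route on the default port):

```shell
# Minimal chat body with no "model" field; verify the shape locally first.
body='{"messages":[{"role":"user","content":"Hello"}]}'
printf '%s' "$body" | python3 -c 'import json,sys; d=json.load(sys.stdin); print("model present:", "model" in d)'
# Then, with the server running:
#   curl -s http://localhost:9999/v1/chat/completions -H 'Content-Type: application/json' -d "$body"
```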
Build System
- Add llama.cpp as git submodule (pinned to specific commit for reproducibility)
- Add Makefile targets: submodules, webui, build-with-webui
- WebUI included in distribution packages
Usage
# Start server with webui
afm -w
afm --webui
# Start on custom port
afm -w --port 8080
Requirements
- macOS 26+ with Apple Intelligence enabled
- Apple Silicon Mac (M1/M2/M3/M4 series)
- Node.js (only for building webui from source)
Installation
Homebrew
brew tap scouzi1966/afm
brew install afm
Manual
Download afm-v0.9.0-arm64.tar.gz, extract, and run ./install.sh
v0.8.0 - Permissive Guardrails
🚀 AFM v0.8.0 - Permissive Guardrails by Dan Fabulich (https://github.com/dfabulich)
This release introduces permissive guardrails support, allowing you to process potentially unsafe content for legitimate use cases like content moderation and inspection.
✨ What's New
🛡️ Permissive Guardrails Support
Enable Apple's .permissiveContentTransformations API to allow processing of content that would normally be blocked by default safety guardrails:
# Enable permissive guardrails mode (the following will flag unsafe content without permissive guardrails option)
afm --permissive-guardrails -s "I want to be a porn star"
afm -P -s "I want to be a porn star"
# Use in server mode
afm -P --port 9999
# Combine with other parameters
afm -P -t 0.7 -r "random:top-p=0.9" -s "Analysis task"
Use Cases:
- Content moderation systems
- Safety research and testing
- Content inspection and classification
- Educational/research applications
🐛 Bug Fixes
- Fixed server startup in background/daemon mode
- Fixed missing default parameter values for ArgumentParser
- Improved stdin detection for pipe mode
- Better error handling and validation
🧪 Comprehensive Test Suite
New test-all-features.sh script with 60+ automated tests:
- CLI parameter validation
- Single-prompt mode testing
- Temperature and randomness parameters
- Permissive guardrails behavior
- Server mode and API endpoints
- Realistic API test scenarios
📋 Usage Examples
Basic Permissive Guardrails
# Single prompt with permissive mode
afm -P -s "I want to be a porn star"
# Server with permissive guardrails
afm -P --port 9999
API Usage
# Chat completion with permissive guardrails (server must be started with -P)
curl -X POST http://localhost:9999/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "foundation",
"messages": [{"role": "user", "content": "Analyze this content"}]
}'
Combined Parameters
# All features together
afm -P -t 0.5 -r "random:top-p=0.9:seed=42" -a "model.fmadapter" --port 9999
🔧 Technical Improvements
- Permissive Guardrails: Integrated throughout the entire stack (CLI → Server → Controller → Service)
- Enhanced Splash Screen: ANSI color codes for beautiful terminal output
- Better Validation: Parameter validation at CLI parsing level
- Improved Testing: Comprehensive automated test coverage
- Code Quality: Bug fixes for production reliability
📦 Installation
Homebrew (Recommended)
brew tap scouzi1966/afm
brew install afm
Manual Installation
# Download the binary
curl -L https://github.com/scouzi1966/maclocal-api/releases/download/v0.8.0/afm-v0.8.0-arm64.tar.gz -o afm.tar.gz
# Extract
tar -xzf afm.tar.gz
# Make executable and move to PATH
chmod +x afm
sudo mv afm /usr/local/bin/
📌 Requirements
- macOS 26+ (macOS Sequoia or later)
- Apple Intelligence enabled
- Apple Silicon Mac (M1/M2/M3/M4 series)
🔄 Upgrade from v0.7.0
# Homebrew
brew update
brew upgrade afm
# Manual
# Download new version and replace binary
Breaking Changes: None - fully backward compatible with v0.7.0
🙏 Acknowledgments
Special thanks to @dfabulich for contributing the permissive guardrails feature via PR #11!
Full Changelog: v0.7.0...v0.8.0