Release afm 0.9.5 · scouzi1966/maclocal-api

afm 0.9.5

Apple Foundation Models + MLX local models — OpenAI-compatible API, WebUI, all Swift.

Changes since v0.9.4

Auto-clone homebrew-afm tap repo if missing during nightly publish (410d7e5)
Add ownership guard to build-afm-nightly-publish skill (1b71d95)
Add build-afm-nightly-publish skill (bc777d6)
Fix Jinja crash on nullable tool schemas (closes #32) (f4c80cc)
Simplify build-afm Step 3 to report only binary path and version (d96c786)
Add fork-first instruction to vibe coding callout (7f58591)
Add vibe coding callout to README for non-Swift developers (75af8fc)
Reorder build-afm prerequisites by dependency chain (2d498f1)
Add prerequisite validation to build-afm skill (1c27148)
Add skills, test reports, qwen3_5 registry alias, and bench tooling (50ed40f)
Add script paths and commands to report index page (038ba82)
Show invocation command in report header for reproducibility (c7cdfd3)
Fix index.html links to use htmlpreview for HTML reports (7d8187c)
Remove GitHub Actions Pages workflow (Actions disabled on repo) (7b4fff1)
Add GitHub Actions workflow for Pages deployment (f63791b)
Add GitHub Pages index for test reports (2e3f7bf)
Add MLX patch comparison report (3-ref with upstream-only detection) (7d38bf9)
Add YAML frontmatter to CLI help for AI agent discovery (9f755d8)
Add structured help with YAML frontmatter for AI agent discovery (a31ccd5)
Add repeatable MLX patch comparison report generator (8bbec6f)
Add cached_tokens to usage response, assertion test suite, and test-macafm skill (037c657)
Reorder test report: description above link (f9cff88)
Move test report link below title, add description with judge methodology (b8f7728)
Add afm-next nightly test report link for Qwen3.5-35B-A3B (90dfc8d)
Test harness: template mode, smart scoring fixes, report improvements (aefe39a)
Merge pull request #28 from scouzi1966/feature/optimise-metal (4d58e3f)
Fix review: QKV mode check, QuantizedKVCache mode, dead perf code (2ca6495)
Fix GLM-5 OOM + gemma-3 crash + MoE argPartition optimization (5b8e88c)
Auto-detect VLM models + fix model discovery for HF cache dirs (02a7008)
Perf: Metal kernel fusions + graph optimizations (107.6→130 tok/s, +21%) (9e9fcf3)
Perf: beat Python mlx-lm throughput (95.7→107.6 tok/s, +12%) (b11fa4a)
Add assumption for previous afm installation (afb62d6)
Update model reference in README.md (eb17f6c)
Update README with new features in nightly build (3f4e859)
Update README with new features for nightly build (d9c804b)
Update README with new API features and parameters (1da2ad3)
Fix Metal kernel fallback and temp media file cleanup (d84ab02)
Add --media flag for VLM single-prompt mode + base64 data URL support (a834638)
Qwen3.5 perf: add VLM Metal kernel + default to LLM loading (42→95 tok/s) (ec88798)
Add test scripts, reports, and benchmark strict=False fix (cdceabf)
Fix afm args quoting with shlex.split, add benchmark script (7905be4)
Fix afm: args quoting in test harness (read -ra → eval) (0ddb545)
Clean up README: consolidate install sections, add stable/nightly table (c78e5d5)
Exclude test report HTML/JSONL from GitHub language stats (a4c9c7d)
Fix CLI --stop not passed to Server in MlxCommand (f3607af)
Fix stop sequences in thinking models, add CLI --stop flag, fix JSON schema injection (050e836)
Fix claude nested session issue and regenerate report with both AI analyses (dd419d8)
Add Qwen3.5-35B-A3B-4bit test suite and report (129/132 passed) (b93db37)
Move stable install instructions below latest release link (747a67c)
Update afm-next heading wording (385e738)
Change afm-next heading to 'Available NOW' (697abf0)
Update README with Qwen3.5-35B-A3B support and afm-next install instructions (50fcd13)
Download *.jinja files and fix missing chat_template fallback (0cfba17)
Gate verbose colored logs behind --very-verbose flag (7514638)
Merge pull request #26 from alantmiller/fix/vision-async-dispatch (199cded)
Use 'Changes since last build (SHA)' format in release notes (7f4b7e7)
Show 'changes since' commit SHA in nightly release notes (62a2a1f)
Add --since flag to publish-next.sh for changelog control (5a44659)
Expand install/upgrade instructions in nightly release notes (2913362)
Include commit SHA in brew version string (969ca10)
Add commit field and fullVersion to BuildInfo.swift (586bbea)
Preserve nightly release history with unique tags (745e5cd)
Add git commit SHA to --version output (e.g. v0.9.5-abc1234) (0e5a8b6)
Add changelog and install instructions to local publish script (493c94f)
Fix MXFP4 quantization crash, token counting, gemma3n routing, and test harness (d79a1d3)
Update README with nightly build installation instructions (f1d2813)
fix: vision subcommand dispatches async run() correctly (04566f0)
Fix bare JSON tool call detection and add ToolCallFormat.swift patch (f6efa24)
Change nightly build to manual trigger only (af238ab)
Add nightly build workflow for afm-next (8892329)
Merge origin/main into feature/mlx-prompt-caching (146030f)
Add tool call parser test results (26/26 pass) (694abbb)
Fix review findings: zero-arg JSON, prefix caching default, fallback tag detection (1ee1ede)
Add hermes, llama3_json, gemma, and mistral tool call parsers (341f35d)
Merge pull request #24 from scouzi1966/feature/structured-outputs (a7b3b38)
Address PR review: nullable types, null rejection, guided streaming deltas (e25d3ef)
Add structured outputs, --guided-json CLI flag, and comprehensive test suite (3b30fa0)
Add incremental streaming tool call arguments and fix parameter name mapping (9b27cac)
Update README for v0.9.5 features (519d35f)
Update README with new features and MLX support (fe7bd74)
Add token-level streaming tool call detection and update CLAUDE.md (ec46e88)
Add tool calling, stop sequences, response_format, and real token counts (71e2c68)
Save test reports to test-reports/ with JSONL data and add Kimi brief prompt (bee6a72)
Add logprobs support, --max-logprobs switch, and dynamic system_fingerprint (b37fdff)
Bump version to v0.9.5 and add sampling params test report (4abbdee)
Add top_k, min_p, presence_penalty, and seed sampling parameters (ce69ba0)
Checkpoint: OpenClaw config, verbose logging, max_completion_tokens, and streaming improvements (b69b9a5)
Revise README for clarity on v0.9.4 features (c66764e)
Add Qwen3.5-MoE VLM support, reasoning extraction, --raw flag, and stream cancellation (6cf821c)
Checkpoint: pre Qwen3.5-397B-A17B-4bit reclassify (8f35675)
Update OpenCode usage instructions in README (c72a023)
Revise installation methods in README (73c1640)
Swap OpenCode setup steps: configure first, then start afm (694f456)
Add detailed OpenCode /connect instructions to README (b15801b)
Add OpenCode integration guide to README (90af069)
Wire all MLX CLI params to server mode, enhance generation logging (1bc007f)
Revise installation command formatting in README (38c25d2)
Update README with model repo environment variable (51ef4b4)
Update README.md (5bc112f)
Revise README for v0.9.4 feature announcement (3d6c296)
Update README with feature listing and API access (d7b3bfa)
Revise README for MLX model support and commands (c89f191)
Add MLX excitement and quick install to README hero section (4f9b374)
Add MLX models screenshot to README (0d1bd07)
Add files via upload (f979ab4)
Update README with MLX local model support and new v0.9.4 features (9d86386)
Add regression test report: 61/61 passed (680c465)
Add MLX model test report: 27/28 passed, Kimi-K2.5 interrupted (fa84cf7)
Fix MLX metallib resolution for relocated binaries (5059f1b)

Install / Upgrade via Homebrew

Fresh install:

brew tap scouzi1966/afm
brew install scouzi1966/afm/afm

Upgrade:

brew upgrade afm

Install via PyPI

pip install macafm==0.9.5

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

afm 0.9.5

Choose a tag to compare

Sorry, something went wrong.

Sorry, something went wrong.

Uh oh!

No results found

afm 0.9.5

Changes since v0.9.4

Install / Upgrade via Homebrew

Install via PyPI

Uh oh!