
chore(main): release 0.9.7 (#440)

Merged
AlexsJones merged 1 commit into main from release-please--branches--main
Apr 13, 2026

Conversation

@github-actions
Contributor

🤖 I have created a release beep boop

0.9.7 (2026-04-13)

Features

  • add --force-runtime flag to override automatic runtime selection (c31099f)
  • add --memory flag to override GPU VRAM autodetection (9a35a76)
  • add --memory flag to override GPU VRAM autodetection (b6e323d)
  • add "Sort by Release Date" to TUI (f371cd8)
  • add "Sort by Release Date" to TUI (d55670a)
  • add 15 popular models from HuggingFace (eb886d6)
  • Add 15 popular models from HuggingFace (33→48 models) (e0c0b52)
  • add 18 new models to database (Feb 2026) (8612c0e)
  • add 18 new models to database (Feb 2026) (1c040f2)
  • add 18 new models to database (Feb 2026) — conflict-resolved (0970845)
  • add AWQ/GPTQ support with vLLM inference runtime (e398490)
  • add AWQ/GPTQ support with vLLM inference runtime (d7f2f11)
  • add Catppuccin color themes to TUI (f05d299)
  • add Catppuccin color themes to TUI (104b281)
  • add CLI/TUI model diff compare workflow (3cab068)
  • add context-length cap for memory estimation (--max-context) (0d9a0ef)
  • Add Docker containerization support (42ff7d3)
  • Add Docker containerization support (fbb483c)
  • add Docker Model Runner as a runtime provider (2a84138)
  • add fuzzy search with space-separated terms (89bbe3d)
  • add fuzzy search with space-separated terms (3d967f4), closes #65
  • add GGUF download source enrichment for models (967253b)
  • add Google Gemma 4 models and fix Gemma 3 capabilities (2cb21c3)
  • add homebrew tap support and update release workflow (67b6fcf)
  • add InferenceRuntime enum and MLX quantization support (186b95e)
  • add license filter for models (5bb4b1e)
  • add license filter for models (#186) (33bd708)
  • add Liquid AI LFM2/LFM2.5 models (83d038a)
  • add max-context cap for memory estimation (afba24a)
  • add MiniMax-M2.7 to curated model database (a5cda9b)
  • add MiniMax-M2.7 to curated model database (7c16b9d)
  • add MlxProvider for MLX model detection and downloads (7f0ee24)
  • add model comparison workflow (llmfit diff) and TUI compare mode (76a8a9c)
  • add Ollama mappings for new models (5c97e75)
  • add Qwen 3.5 series (397B/122B/35B/27B) to model database (0d550f3)
  • add Qwen3-Coder-Next (80B MoE) and Qwen 3.5 Ollama mappings (5f8a640)
  • add Qwen3.5 small model series (0.8B, 2B, 4B, 9B) (21cc8ac)
  • add Qwen3.5 small model series (0.8B, 2B, 4B, 9B) (226c5c2)
  • add runtime model database updates from HuggingFace (d465fb9)
  • add runtime model database updates from HuggingFace (1e20ec3)
  • add tensor parallelism awareness for multi-GPU model fitting (e8a6001)
  • add tensor parallelism awareness for multi-GPU model fitting (5d99ec2)
  • add theme switcher with 6 color schemes for TUI (317c44a)
  • add theme switcher with 6 color schemes for TUI (89bd63f)
  • add tok/s column sorting in TUI (41b7358)
  • add tok/s sort option to TUI sort cycle and table header highlighting (82bb526)
  • add use-case popup filter and use-case search matching to TUI (4b1aeab)
  • add use-case popup filter to TUI (4b0f452)
  • added arc support (0d5e991)
  • added in vim like bindings (a3df62f)
  • added LM studio (70e60cc)
  • added logo (86dd737)
  • added moe (4723782)
  • added Rest API (2adb593)
  • adding ollama as supported provider (e198c17)
  • adding release please (c68914a)
  • adding serve capabilities (503229e)
  • api: add capabilities, license, supports_tp to model response and new endpoints (c02e189)
  • append (WSL) to RAM label in tui when running under WSL (aa30f57)
  • availability filter [a] in TUI (All / GGUF Avail / Installed) (9356150)
  • bandwidth-based tok/s estimation for known GPUs (0c552b4)
  • bandwidth-based tok/s estimation for known GPUs (1c34600)
  • cargo fmt (cde77e3)
  • caught some unavailable models on ollama (1d68bba)
  • caught some unavailable models on ollama (6aaba9f)
  • cli: add --sort for fit output (8114446)
  • cli: add --sort option for fit output (3a06877)
  • crate version yank skip rebuild (b1b0628)
  • dashboard: ship embedded web UI and auto-start from CLI (96bd53c)
  • dashboard: ship embedded web UI and auto-start it from CLI (37dd0db)
  • detect installed Ollama models and support pulling from TUI (dc05d5a)
  • expand mlx-community model mappings (7a9e917)
  • expand mlx-community model mappings and improve fallback heuristic (5e8a676), closes #260
  • fix for qwen3_5moe (455709f)
  • fixed regression in LM studio model list (cbeccfa)
  • fixed regression in LM studio model list (734f556)
  • fixed up skill (38c8350)
  • fixed up skill (368fa47)
  • fixing vram on apple bug (bd59767)
  • fixing vram on apple bug (db666b6)
  • fixing vram on apple bug (c26d13b)
  • fixing vram on apple bug (085cd25)
  • GGUF download source enrichment (unsloth & bartowski) (6e3c828)
  • hardware simulation mode and CLI hardware overrides (002eeff), closes #322
  • hardware: detect NVIDIA Tegra/Grace Blackwell unified memory via ATS (rebased #93) (b454697)
  • hardware: detect NVIDIA Tegra/Grace Blackwell unified memory via ATS addressing mode (dd1caa6)
  • improvements based on #12 (7997525)
  • increased model count (5be634f)
  • increment version (d88bd3f)
  • installed status for lm studio (a1cad30)
  • mlx disablement on non-apple hw (c26bb49)
  • MLX-native inference support for Apple Silicon (993599e)
  • model detail modal + Ollama download in desktop app (c92053f)
  • model detail modal + Ollama download in desktop app (dacae77)
  • overhaul to the scoring system (e689f95)
  • overhaul to the scoring system (d411a8d)
  • overhaul to the scoring system (5ca2467)
  • persist filter state across sessions (#430) (c3962e4)
  • persist theme selection to ~/.config/llmfit/theme (765c87b)
  • plan: KV cache fidelity + KV quant flag + TurboQuant gating (1f1b6fc)
  • plumbing 2 (209f9d4)
  • plumbing 2 (ef18920)
  • pull functionality (e193d56)
  • rebased (ac6639d)
  • release plumbing (648a8c1)
  • release plumbing (8215ba8)
  • removed exact model count as it increases so often (1a19c42)
  • reworked available models for download (6d56b03)
  • show approximate disk space usage (#134) (#304) (32c835b)
  • split-pane TUI detail view for GGUF downloads (de5c53b)
  • support for windows vulkan (5a76a8f)
  • supporting 94 models (c68d7aa)
  • surface runtime info in TUI, CLI, and JSON output (4bdb4c4)
  • This PR is to add the ability for capabilities to be described (d09b8fe)
  • This PR is to add the ability for capabilities to be described (05768af)
  • TUI GGUF downloads section, default enrichment, caching (e7dc0b2)
  • tui: add runtime/backend filter and help popup (v0.9.2) (ad28138)
  • update key bindings and add Catppuccin themes to README files (c41567b)
  • updated build actions (59a8af4)
  • updated demo (6c32985)
  • updated demo (42fbeeb)
  • updated images (20205bd)
  • updated models (dfb9afb)
  • updated sorting and new models (744465d)
  • updated tui to support multiple providers better and also multiple GPU support (df338e8)
  • updated urls (ce14772)
  • updated version (4d1542b)
  • updating models (d4f0a0e)
  • version bump llama.cpp added (1643214)
  • version bump llama.cpp added (319fb90)
  • web: add 10 color themes from TUI theme.rs (ab14a06)
  • web: add side-by-side model comparison view (ccb56e9)
  • web: replace freeform limit input with preset options (96981ec)
  • web: split App.jsx into Header, SystemPanel, FilterBar, ModelTable, DetailPanel (950e620)
  • working on v0.8.1 (696107d)
  • workspace restructure + Tauri desktop app (83ded27)
  • workspace restructure + Tauri desktop app (3fb10d0)

Bug Fixes

  • accept OLLAMA_HOST values without URL scheme (#166) (c647b13)
  • add --local flag to install script and improve fallback logic (1e5cda1)
  • add --local flag to install script and improve fallback logic (23b500e), closes #56
  • add AMD RX 9060 series to VRAM estimation database (1c02da8)
  • add AMD RX 9060 series to VRAM estimation database (9305810), closes #55
  • add download confirmation to prevent accidental pulls (9e4e881)
  • add download confirmation to prevent accidental pulls (#95) (5b56d74)
  • address AlexsJones review comments (587252b)
  • address Copilot review comments (74384c9)
  • AMD Ryzen AI unified memory detection and --memory override (174fc10)
  • AMD unified memory detection and --memory override (c763717), closes #89 #91
  • bump Dockerfile Rust version to 1.88 for dependency compatibility (c25c02e)
  • bump Dockerfile Rust version to 1.88 for dependency compatibility (6304ba0)
  • cap default estimation context at 8192 tokens (#311) (4699ecb)
  • ci: switch release-please to simple type for workspace version inheritance (48be41a)
  • cli: default auto dashboard host to 0.0.0.0 (4553fbf)
  • cli: make dashboard auto-start controllable and self-cleaning (0c51ad6)
  • correct crates.io metadata and prepare for publishing (f8b08a0)
  • correct crates.io metadata and prepare for publishing (2ce5858), closes #58
  • correct PCI ID reading on Linux and improve AMD GPU identification (dcfde57)
  • correctly estimate VRAM for APU integrated GPUs (2df3298)
  • correctly estimate VRAM for APU integrated GPUs (Radeon Graphics) (5797f10), closes #25
  • Default theme uses terminal colors for light/dark compat (72a6d8d), closes #67
  • desktop: prevent XSS via inline onclick handler in modal (#323) (d20dbee)
  • detect installed Ollama tag variants with suffixes (#165) (7092fa6)
  • detect NVIDIA GB10/GB20 as unified memory GPUs (a77cf36)
  • detect NVIDIA GB10/GB20 as unified memory GPUs (#83, #17) (c0a584a)
  • discover GGUF files in HuggingFace subdirectories (#291) (4e8a596)
  • docker action version (7a38a25)
  • docker action version pin (da96503)
  • download: fetch all shards of multi-part GGUF models (b7494d9)
  • filter AWQ/GPTQ models by GPU compute capability (e720a0e), closes #257
  • find Tauri bundle in correct target directory (19c0ac3)
  • fmt: align providers formatting and enrich TUI compare metrics (a65e4c5)
  • hide MLX models on non-Apple-Silicon hardware (#113) (1e3cf8a)
  • hide MLX models on non-Apple-Silicon hardware (#113) (1052819)
  • hide MLX-only models on non-Apple Silicon systems (ab54fa9)
  • ignore GGUF tests that require network access (#306) (8fd8f4d)
  • iGPU inflating GPU count and force-runtime being ignored (#271) (f584d7e)
  • improve Android CPU and Vulkan GPU detection (0ebb92a)
  • improve download error messages for incompatible model formats (0e2f9cc), closes #123 #198
  • improve MoE tok/s estimate and clarify baseline speed labels (81f88c4)
  • incorrect installed model count in TUI status bar (307c74f)
  • invoke hf instead of huggingface-cli (752680f)
  • invoke hf instead of huggingface-cli (78f8268)
  • pr review fixes (25952da)
  • prefer discrete GPU over integrated on Windows (#303) (3312a40)
  • prefer exact matches in info selection (68272d4)
  • prefer exact matches in info selection (883ea1f)
  • query local providers in recommend CLI command (2477d1d)
  • regression in json only mode (36e8503)
  • remove unused function to eliminate build-time warning (4406582)
  • remove unused function to eliminate build-time warning (289c8b8)
  • Runtimes are installed but no downloadable artifact exists (9424204)
  • support owner-scoped MLX pulls and robust tag normalization (7a9e476)
  • support owner-scoped MLX repos and normalize MLX pull tags (f0e516c)
  • support sudo in piped install script (9fa1960)
  • support sudo in piped install script (cf61557), closes #158
  • surface MoE offloaded RAM in JSON output (96b73ff)
  • surface MoE offloaded RAM in JSON output (929eaad), closes #230
  • tui: avoid borrow conflict when marking compare model (ae60aa5)
  • tui: write dashboard pid file under ~/.llmfit, not /tmp (#324) (00967f1)
  • typo in CHANGELOG.md (suppor -> support) (1e26ec2)
  • typo in CHANGELOG.md (suppor -> support) (aa6cd00)
  • update (2b6767d)
  • update OpenClaw skill to match actual CLI output (df585a7)
  • update OpenClaw skill to match actual CLI output (d53c192)
  • use accurate model counts instead of len()/2 for installed display (7a273cc), closes #189
  • use aggregated VRAM for multi-GPU fit scoring (dc3db39)
  • use aggregated VRAM for multi-GPU fit scoring (f55882d), closes #68
  • use MoE active parameters for tok/s estimation and clarify baseline speed labels (47e70a1)
  • use per-card VRAM instead of summed for multi-GPU systems (2f1f9f4)
  • use per-card VRAM instead of summed for multi-GPU systems (f80b4b0)
  • use quantization in path selection for both dense and MoE models (#49) (8703d43)
  • web: align dashboard hero with project tagline (d4e77c1)

Performance Improvements

  • api: optimize embedded asset serving and cache policy (4d8c0cc)

This PR was generated with Release Please. See documentation.

@github-actions github-actions bot force-pushed the release-please--branches--main branch from 9d94be9 to da6a5f1 on April 13, 2026 at 09:33
@AlexsJones AlexsJones merged commit c9830fc into main Apr 13, 2026
@github-actions
Contributor Author

🤖 Created releases:

🌻
