README.md: 19 additions & 6 deletions
@@ -6,7 +6,20 @@ GPU safety layer for [MLX](https://github.com/ml-explore/mlx) on Apple Silicon.
Prevents kernel panics and OOM crashes caused by Metal driver bugs when running MLX inference — especially multi-model pipelines, long-running servers, and agent frameworks with heavy tool calling.
-**Current version: v0.10.0**; see [CHANGELOG.md](CHANGELOG.md) for release history and per-feature rationale.
+**Current version: v0.11.0**; see [CHANGELOG.md](CHANGELOG.md) for release history and per-feature rationale.
+
+### What's in v0.11
+
+Built on the 2026-04-27 community sweep (mlx-lm#1185 / #1206 / mlx-vlm#1064 / omlx#578 / #862 / #902):
+
+- **`error_classifier`**: central regex table covering **6 distinct error severities**: `kernel_panic` / `process_abort` / `command_buffer_oom` / `gpu_hang` / `gpu_page_fault` / `descriptor_leak`. `SubprocessCrashError` now exposes `.error_class` and `.recovery_hint` so callers can route on the failure type.
+- **L10b (process-abort scanner)**: `scan_recent_aborts(24h)` is a sibling to `scan_recent_panics(72h)`. Aborts are non-rebooting failures, so they are counted separately and don't trip the kernel-panic lockout. `CooldownVerdict.abort_count_24h` is exposed for dashboards.
+- **L13b (Apple GPU family detection)**: `apple_gpu_family()` reads `mx.device_info()` and maps `applegpu_g13`/`g14`/`g15`/`g16`/`g17` → `M1`/`M2`/`M3`/`M4`/`M5`. It also surfaces `resource_limit` (the mlx-lm#1185 descriptor cap, 499000 on M1 Ultra).
+- **L14 (descriptor-leak heuristic)**: `ResourceTracker(cold_restart_after=4000)` tracks inferences since the last cold restart so callers can pre-emptively `shutdown()` and spawn a new subprocess before hitting the descriptor limit. `mx.clear_cache()` doesn't release descriptor handles; only a subprocess respawn does.
+- **`breadcrumb_with_meta(tag, payload, **meta)`**: structured breadcrumb format `[ts] TAG: payload | k=v k=v` for richer postmortem forensics. The L11 orphan parser now uses a lazy regex and stays backward-compatible with legacy `breadcrumb()`.
+- **`KNOWN_PANIC_MODELS` schema upgrade**: adds `tier` (panic / abort / degradation), `error_classes[]` (multiple modes per model plus per-GPU-family confirmation), and `verified_safe_alternative`. New helpers: `check_known_panic_model_for_gpu(model, gpu_family="M5")` / `models_by_tier()` / `models_affecting_gpu_family()`.
+- **4 new registry entries** covering the Qwen3.5/Qwen3.6/Qwen3-VL family across M4 / M5 hardware.
+- **Hotfix**: resolves a PEP 639 license-classifier conflict in `pyproject.toml` that blocked every `pip install` since v0.9.0.
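
To make the `error_classifier` entry concrete, here is a minimal sketch of a regex table that maps crash output to an error class and a recovery hint. The six class names come from the list above; the `ErrorRule` type, the `classify` function, every pattern, and every hint are assumptions made for illustration and are not taken from the `metal_guard` source.

```python
import re
from typing import NamedTuple, Optional


class ErrorRule(NamedTuple):
    error_class: str     # one of the six class names listed in the README
    pattern: re.Pattern  # hypothetical signature to search for in crash output
    recovery_hint: str   # hypothetical advice handed back to the caller


# Placeholder patterns and hints (assumptions, not the project's real table).
# Order matters: the first rule whose pattern matches wins.
RULES = [
    ErrorRule("kernel_panic", re.compile(r"kernel panic", re.I),
              "wait out the cooldown before loading another model"),
    ErrorRule("command_buffer_oom", re.compile(r"insufficient memory|out of memory", re.I),
              "shrink the workload or free memory, then retry"),
    ErrorRule("gpu_hang", re.compile(r"gpu hang", re.I),
              "restart the worker and retry with a smaller batch"),
    ErrorRule("gpu_page_fault", re.compile(r"page fault", re.I),
              "restart the worker and avoid the failing model"),
    ErrorRule("descriptor_leak", re.compile(r"resource limit", re.I),
              "respawn the subprocess to release descriptor handles"),
    ErrorRule("process_abort", re.compile(r"sigabrt|\babort", re.I),
              "restart the worker process"),
]


def classify(crash_text: str) -> Optional[ErrorRule]:
    """Return the first matching rule, or None when nothing matches."""
    for rule in RULES:
        if rule.pattern.search(crash_text):
            return rule
    return None


rule = classify("Command buffer execution failed: Insufficient Memory")
if rule is not None:
    print(rule.error_class, "->", rule.recovery_hint)
```

Keeping the rules in one ordered table means a caller only needs the returned class and hint to decide whether to retry, respawn, or back off.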
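
The descriptor-leak heuristic (L14) amounts to counting inferences since the last cold restart and signalling when a threshold is reached. The sketch below illustrates that idea only; `RestartCounter` and its methods are hypothetical names and do not reproduce the real `ResourceTracker` interface. Only the `cold_restart_after=4000` default is taken from the README.

```python
class RestartCounter:
    """Hypothetical stand-in for a cold-restart counter (not the real ResourceTracker)."""

    def __init__(self, cold_restart_after: int = 4000) -> None:
        if cold_restart_after <= 0:
            raise ValueError("cold_restart_after must be positive")
        self.cold_restart_after = cold_restart_after
        self.inferences_since_restart = 0

    def record_inference(self) -> bool:
        """Count one inference; return True once a cold restart is due."""
        self.inferences_since_restart += 1
        return self.inferences_since_restart >= self.cold_restart_after

    def mark_restarted(self) -> None:
        """Reset the count after the caller has respawned the subprocess."""
        self.inferences_since_restart = 0


# Small threshold so the behaviour is easy to follow.
counter = RestartCounter(cold_restart_after=3)
due = [counter.record_inference() for _ in range(3)]  # [False, False, True]
if due[-1]:
    # A real caller would shut down and respawn its worker subprocess here.
    counter.mark_restarted()
```

The counter is a proxy: it does not measure descriptor handles directly, so the threshold has to sit comfortably below the point where the limit is actually hit.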
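
The structured breadcrumb format `[ts] TAG: payload | k=v k=v` can be written and parsed with one lazy regex that also accepts legacy lines without a metadata suffix. This is a sketch under assumptions: `format_breadcrumb`, `parse_breadcrumb`, and the regex are illustrative and are not the project's implementation, and metadata values are assumed to contain no spaces.

```python
import re
from datetime import datetime, timezone
from typing import Optional

# Lazy payload match so an optional " | k=v ..." suffix is split off when present.
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] (?P<tag>[^:]+): (?P<payload>.*?)(?: \| (?P<meta>.*))?$"
)


def format_breadcrumb(tag: str, payload: str, ts: Optional[str] = None, **meta) -> str:
    """Render one breadcrumb line; metadata is appended as space-separated k=v pairs."""
    ts = ts or datetime.now(timezone.utc).isoformat()
    line = f"[{ts}] {tag}: {payload}"
    if meta:
        line += " | " + " ".join(f"{k}={v}" for k, v in meta.items())
    return line


def parse_breadcrumb(line: str) -> Optional[dict]:
    """Parse a structured or legacy breadcrumb line; return None if it does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    pairs = (m.group("meta") or "").split()
    meta = dict(p.split("=", 1) for p in pairs if "=" in p)
    return {"ts": m.group("ts"), "tag": m.group("tag"),
            "payload": m.group("payload"), "meta": meta}


line = format_breadcrumb("LOAD", "model-a", ts="2026-04-27T12:00:00", gpu="M4", n=3)
parsed = parse_breadcrumb(line)                                  # structured line
legacy = parse_breadcrumb("[2026-04-27T12:00:00] LOAD: model-a")  # legacy line, empty meta
```

Because the metadata group is optional, older breadcrumb lines parse to the same dictionary shape with an empty `meta`, so a postmortem tool can read both formats the same way.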
### What's in v0.10
@@ -127,13 +140,13 @@ This affects any workflow that loads and unloads multiple MLX models in sequence
Installs from a tagged release — gives you the `metal-guard` and `mlx-safe-python` console scripts plus the `metal_guard` Python module: