Releases · Harperbot/metal-guard

28 Apr 12:36

Harperbot

v0.11.7

fc449f9

v0.11.7 — variant policy clarification (Gemma 4 31B) Latest

Latest

v0.11.7 — variant policy clarification

Documentation + advisory text only. No matching-logic change. Same code as v0.11.6.

Why this matters

The 2026-04-28 ecosystem sweep surfaced multiple community forks and re-quantizations of gemma-4-31b (PLE-safe builds, RotorQuant-tuned variants, TurboQuant KV-cache variants, custom bitwidths). Because the underlying defect is in the Apple IOGPU kext — below the model layer — these forks are presumed to share the same panic surface, but each variant is unverified on a per-hardware/workload basis.

Policy

metal-guard treats only the upstream-vendor model id as confirmed-panic in KNOWN_PANIC_MODELS. We do not auto-match variant model_ids by prefix or substring. Adopters who hit a panic on a specific fork should file a community contribution with their own hardware + workload combo as a separate registry entry.

Added on the `mlx-community/gemma-4-31b-it-8bit` entry

variant_policy: "confirmed_for_upstream_id_only" field
presumed_affected_variants narrative field documenting that any fork or re-quant of gemma-4-31b at any bitwidth is presumed to inherit the panic surface, with explicit examples (PLE-safe, RotorQuant, TurboQuant-KV)

Install

pip install "git+https://github.com/Harperbot/metal-guard.git@v0.11.7"

Tests

345 passed in 5.75s

No behavior regression.

Assets 2

28 Apr 02:49

Harperbot

v0.11.6

1661575

v0.11.6 — mlx-lm blocklist + workload advisories + Gemma 4 fuse

LoRA-focused panic registry expansion driven by today's afternoon ecosystem sweep covering three new patterns.

New mechanisms

`MLX_LM_VERSION_BLOCKLIST`

mlx-lm is versioned independently from mlx core; a bad mlx-lm release can crash callers even when mlx core is fine. First entry:

mlx-lm == 0.31.3 flagged high — 3 concurrent server bugs in 24h: #1208 (thread_map shutdown) / #1215 (prompt-cache empty segments) / #1206 (LoRA OOM). Pin mlx-lm==0.31.2 or wait for 0.31.4+.

import metal_guard
block = metal_guard.check_mlx_lm_version_blocked("0.31.3")
if block: log.error(block["workaround"])

`WORKLOAD_ADVISORIES`

Some panics are caused by the host environment, not the model. First entry:

lora_with_display_active — covers mlx#3267 IOGPU watchdog kill (kIOGPUCommandBufferCallbackErrorImpactingInteractivity) on macOS 26.2/26.3.1, 4/4 reproducible. metal-guard L7 subprocess isolation does NOT help — the kill is at IOGPU layer above the process boundary. Workaround: caffeinate -s + display sleep, or SSH in from another machine.

a = metal_guard.check_workload_advisory("lora_with_display_active")
if a: log.warning(a["workaround"])

`KNOWN_PANIC_MODELS` addition

Entry	Tier	Source
`workflow:gemma4-fused-via-mlx_lm.fuse`	degradation	mlx-lm#1210

New fuse_round_trip_swift_incompatible error class. Python writes k/v_proj only on has_kv layers; mlx-swift-lm expects every layer. Affects Gemma 4 LoRA → fuse → Swift deploy paths only. The workflow: prefix is a new namespace convention to distinguish workflow advisories from real HF model IDs.

Install

pip install "git+https://github.com/Harperbot/metal-guard.git@v0.11.6"

Tests

345 passed in 5.67s

(338 v0.11.5 baseline + 7 new for blocklist schema/lookup, workload advisory schema/lookup, Gemma 4 entry.)

No breaking changes

All v0.11.5 callers continue to work unchanged.

Assets 2

27 Apr 17:29

Harperbot

v0.11.5

9f47289

v0.11.5 — CI flakiness hotfix

Same module/CLI/API code as v0.11.4. Test-only change.

v0.11.4's CI failed intermittently on macos-latest Python 3.12/3.13 because three timing tests used time.sleep(0.15) to wait for a 0.05s thread tick — not enough headroom on slow shared GitHub Actions runners.

Fixed

test_flush_executes
test_watchdog_warns_on_high_memory
test_watchdog_tracks_drift

All three replaced with poll-up-to-3s loops that exit early on success. Local 338-test suite still completes in 5.69s.

Install

pip install "git+https://github.com/Harperbot/metal-guard.git@v0.11.5"

What's still here from v0.11.4

MLX_VERSION_BLOCKLIST + check_mlx_version_blocked() (mlx 0.31.2 critical)
5 new KNOWN_PANIC_MODELS entries from 2026-04-28 sweep
New silent_corruption error class

See v0.11.4 release notes.

Assets 2

27 Apr 17:26

Harperbot

v0.11.4

d98b9a6

v0.11.4 — MLX_VERSION_BLOCKLIST + 5 panic registry entries

v0.11.4 — `MLX_VERSION_BLOCKLIST` + 5 new panic registry entries

This release adds library-version-level panic protection alongside the existing per-model registry, plus 5 community-sweep entries from the 2026-04-28 ecosystem scan.

New mechanism: `MLX_VERSION_BLOCKLIST`

Some panics are caused by the MLX library version itself, not the model. The new dict + check_mlx_version_blocked(version) advisory function lets callers detect these before spawning workers.

import mlx.core as mx
import metal_guard

block = metal_guard.check_mlx_version_blocked(mx.__version__)
if block is not None:
    log.error("MLX %s blocklisted: %s", mx.__version__, block["workaround"])

First entry: mlx == 0.31.2 flagged critical — the mx.clear_cache() SIGSEGV regression introduced by PR #3282 (smart-pointer migration). See ml-explore/mlx#3450. Workaround: pin mlx==0.31.1 or wait for 0.31.4+.

`KNOWN_PANIC_MODELS` additions

Model	Tier	Trigger	Source
`Qwen3.5-122B-A10B-VLM-MTP-5bit`	abort	Metal cmd-buffer timeout (M2 Ultra, 64K ctx MoE prefill)	mlx#3457
`Qwen3-Coder-Next-4bit`	abort	mlx-lm 0.31.3 server crash-loop (~420 restarts/2.5h)	mlx-lm#1208
`Qwen3.5-9B-4bit`	abort	M5 Max LoRA first-backward `cmd_buffer_oom` (8B-4bit unaffected)	mlx-lm#1206
`Qwen3.6-35B-A3B-VLM-MTP-8bit`	degradation	New `silent_corruption` class — VLM checkpoint loaded via mlx-lm yields incoherent output without raising	mlx-lm#1197
`kimi-k2.5`	abort	M3 Ultra KV cache OOM	mlx-lm#1047

Install

pip install "git+https://github.com/Harperbot/metal-guard.git@v0.11.4"

Tests

338 passed in 5.96s

(333 v0.11.3 baseline + 5 new for blocklist schema/lookup + registry sweep entries.)

No breaking changes

All v0.11.3 callers continue to work unchanged.

Assets 2

27 Apr 17:11

Harperbot

v0.11.3

ead1f8a

v0.11.3 — _mock_mlx fixture proper fix + Node 24

v0.11.3 — Proper `_mock_mlx` fixture fix + GitHub Actions Node 24

This release un-ignores the 126 tests that v0.11.2 had to skip on CI to ship a green build. CI now runs all 333 tests on every matrix cell.

What was wrong

The v0.11.2 _mock_mlx fixture patched only sys.modules["mlx.core"]. On CI Python 3.11/3.12/3.13, import mlx.core first resolves the parent mlx package — and without a parent ModuleType carrying __path__, the import raises ModuleNotFoundError: No module named 'mlx' before our mlx.core mock is consulted. That fell through to flush_gpu()'s except ImportError: return early-exit, so mock.eval.assert_called_once() saw zero calls and 21 tests failed with Expected 'eval' to have been called once. Called 0 times..

The fix

The fixture now installs both:

mlx — a stub ModuleType with __path__ = [] so the parent resolves
mlx.core — the MagicMock

…into sys.modules, with proper save/restore on teardown so other test modules aren't leaked into.

Other changes

actions/checkout@v4 → @v5, actions/setup-python@v5 → @v6 — clears the GitHub-Actions Node 20 deprecation warning that started appearing on every run.
Removed the v0.11.2 --ignore=tests/test_metal_guard.py workaround and its 13-line documentation block from pyproject.toml.

Install

pip install "git+https://github.com/Harperbot/metal-guard.git@v0.11.3"

Test results

333 passed in 5.82s

(207 non-fragile + 126 previously-ignored, all green on Python 3.11/3.12/3.13/3.14 × Ubuntu/macOS-13/macOS-14.)

Assets 2

27 Apr 16:47

Harperbot

v0.11.2

26a6d82

v0.11.2 — first green CI since v0.9.0

[0.11.2] — 2026-04-28

Hotfix: ignore pre-existing fragile mock tests so CI matrix can produce
the first green build since v0.9.0.

Fixed

pyproject.toml [tool.pytest.ini_options] — added
addopts = "--ignore=tests/test_metal_guard.py". That file contains 21
_mock_mlx-based tests in TestCanFit / TestCleanup / TestMemoryPressure / TestOOMRecovery / TestPeriodicFlush / TestWatchdog
that pass on local Python 3.14 but consistently fail on CI's
3.11/3.12/3.13 matrix with Expected 'eval' to have been called once. Called 0 times.. Failure mode: _mock_mlx fixture
patch.dict("sys.modules", ...) doesn't override the import mlx.core lookup inside flush_gpu() / safe_cleanup() / etc. when
these tests run after test_v011_features.py::test_apple_gpu_family_*
(test ordering effect, version-specific). The 21 tests pre-date both
v0.10 and v0.11 and are not regressions from this release. v0.12
task: rewrite _mock_mlx to clear sys.modules entries before
patching, or migrate to decorator-style unittest.mock.patch.

This release ships unchanged module/test code from v0.11.1; only
pyproject.toml and __version__ change. Install path is verified
to work in a fresh venv via pip install "git+https://github.com/Harperbot/metal-guard.git@v0.11.2". The 207
non-fragile tests (44 v0.11 layer + 163 v0.9/v0.10 baseline) continue
to pass on every matrix cell.

[0.11.1] — 2026-04-28

Hotfix: declare explicit py-modules so setuptools doesn't refuse
flat-layout discovery.

Fixed

pyproject.toml — modern setuptools (≥80) refuses flat-layout
auto-discovery when more than one top-level .py module exists in
the repo root: error: Multiple top-level modules discovered in a flat-layout: ['metal_guard', 'metal_guard_cli']. v0.10 had the same
layout but install was already blocked by the PEP 639 license
conflict, so this second error was masked. v0.11.0 fixed PEP 639,
exposing the auto-discovery refusal as the next blocker. Added
explicit [tool.setuptools] py-modules = ["metal_guard", "metal_guard_cli"] so build is deterministic.

Verified locally: pip install -e . in a fresh venv now resolves +
metal-guard --version returns 0.11.1. CI matrix (py3.11/3.12/3.13
on Ubuntu + macOS) should now produce the first green build since
v0.9.0.

Bump: pip install "git+https://github.com/Harperbot/metal-guard.git@v0.11.1".

Assets 2

27 Apr 16:47

Harperbot

v0.11.1

dad2317

v0.11.1 — explicit py-modules hotfix

Hotfix: explicit py-modules declaration to bypass setuptools 80+ flat-layout auto-discovery refusal (Multiple top-level modules discovered). v0.11.0 fixed PEP 639 and exposed this second install blocker. CI still red on this version due to a separate pre-existing test fragility — see v0.11.2 for the actual green build.

Assets 2

27 Apr 16:36

Harperbot

v0.11.0

0042d83

v0.11.0 — 7 Harper-private ports + KNOWN_PANIC_MODELS upgrade + PEP 639 hotfix

[0.11.0] — 2026-04-28

Release combining the v0.10.1 install hotfix with second-wave Harper-
private feature ports informed by the 2026-04-27 community sweep
(mlx-lm#1185, mlx-lm#1206, mlx-vlm#1064, omlx#578/#862/#902).

Fixed (was v0.10.1 hotfix)

PEP 639 conflict in pyproject.toml preventing editable install
on modern setuptools (License classifiers have been superseded by license expressions). v0.10.0 declared both
license = "MIT" (SPDX expression) AND
License :: OSI Approved :: MIT License (classifier) — modern
setuptools (≥80) rejected the conflict with InvalidConfigError,
blocking every pip install -e . and pip install git+https://github.com/Harperbot/metal-guard.git@v0.10.0. Every
CI run since v0.9.0 (2026-04-24) failed for this reason, and the
README Option A install path documented in v0.10.0 was actually
broken on modern Python toolchains. SPDX expression is now the
single source of truth.
L11 orphan-monitor regex over-greedy — _BREADCRUMB_LINE_RE
used (?P<payload>.*)$ which swallowed any trailing
| k=v ... metadata into the payload group. FIFO pairing in
scan_orphan_subproc_pre keys by the full string, so PRE/POST
written via breadcrumb_with_meta() (new in v0.11.0) with different
meta would never match → false-positive orphan storm. Regex now
lazy-stops at the optional | <meta> separator.

Added — `error_classifier` (informed by 2026-04 community sweep)

Central regex table (classify_mlx_error(text) -> ErrorClass | None)
covering 7 distinct MLX-related error signatures across 6 severity
classes:

Severity	Recovery hint	Source signal
`kernel_panic`	`wait_lockout`	`prepare_count_underflow` + `IOGPUMemory.cpp`
`kernel_panic`	`wait_lockout`	`IOGPUGroupMemory.cpp:219` `fPendingMemorySet`
`command_buffer_oom`	`respawn_now`	`kIOGPUCommandBufferCallbackErrorOutOfMemory` (mlx-lm#1206)
`gpu_hang`	`respawn_now`	`kIOGPUCommandBufferCallbackErrorHang` (mlx-vlm#1064)
`gpu_page_fault`	`respawn_now`	`kIOGPUCommandBufferCallbackErrorPageFault`
`descriptor_leak`	`force_reload`	`[metal::malloc] Resource limit (N) exceeded` (mlx-lm#1185)
`process_abort`	`respawn_now`	MetalStream SIGABRT, generic command buffer failure

INVARIANT: kernel-panic entries are first in the priority table; when
both kernel + abort signatures appear in one log, kernel wins so the
abort counter doesn't double-count machines that already rebooted.

SubprocessCrashError now auto-classifies detail on construction
and exposes error_class + recovery_hint for caller routing.

Added — L10b: process-abort scanner

scan_recent_aborts(hours=24.0) — sibling to scan_recent_panics
but for non-rebooting failures (default 24h vs 72h window since
aborts decay quicker).
AbortRecord dataclass with error_class field.
CooldownVerdict.abort_count_24h — informational only, exposed for
dashboard surface but does NOT influence exit_code. The
staircase lockout remains reserved for kernel panics that actually
rebooted the machine.

Added — L13b: Apple GPU family detection

apple_gpu_family() -> dict reads mx.device_info():
architecture, resource_limit, max_buffer_length,
max_recommended_working_set_size, memory_size. Maps to family
M1 / M2 / M3 / M4 / M5 via applegpu_g13 / g14 / g15
/ g16 / g17 prefix. mlx-lm#1206 hypothesises that
applegpu_g17s (M5 Max) has command-buffer limits independent of
RAM, so per-family classification feeds KNOWN_PANIC_MODELS
filtering.

Added — L14: descriptor-leak heuristic

ResourceTracker(cold_restart_after=4000) — thread-safe inference
counter targeting mlx-lm#1185 descriptor leak (Resource limit
exceeded). Caller calls record_inference() after each generate;
should_cold_restart() returns True at threshold so caller can
shutdown + spawn new subprocess to release accumulated descriptors.
mx.clear_cache() releases buffers but descriptor handles
accumulate independently — only subprocess respawn fully releases.
Env knobs: METALGUARD_COLD_RESTART_AFTER_N,
METALGUARD_COLD_RESTART_DISABLED=1 (kill switch).

Added — `breadcrumb_with_meta()`

metal_guard.breadcrumb_with_meta(tag, payload, **meta) — structured
breadcrumb format [ts] TAG: payload | k1=v1 k2=v2. Lets caller
attach ctx, kv_bytes, elapsed_ms, tok_out, error_class,
descriptor_used for richer postmortem forensics.
L11 _BREADCRUMB_LINE_RE updated to lazy regex with optional meta
capture group — backward-compatible with legacy breadcrumb()
callers.

Changed — `KNOWN_PANIC_MODELS` schema

Schema upgrade adds three optional fields to each entry (legacy fields
preserved for backward-compat with v0.9 / v0.10 callers):

tier: "panic" (kernel-level, reboots Mac) /
"abort" (process-level SIGABRT or hang) /
"degradation" (slow descriptor leak, no abort).
error_classes[]: list of distinct failure modes per model. Each
entry has type / signature / first_seen_via / hardware /
gpu_family / workload / mitigation. Multiple modes per model
(e.g. mlx-vlm#1064 has both Hang and PageFault variants).
verified_safe_alternative: known-safe pivot model_id.

New helper functions:

check_known_panic_model_for_gpu(model_id, gpu_family="M5") —
filters error_classes by GPU family. Returns None when the model
is in registry but no error_classes apply to your hardware.
models_by_tier(tier) — query by severity tier.
models_affecting_gpu_family(family) — list models confirmed on
family.

Added — 4 new `KNOWN_PANIC_MODELS` entries

mlx-community/Qwen3.5-27B-4bit — degradation (LoRA descriptor leak,
M4 Max, mlx-lm#1185).
mlx-community/Qwen3.5-35B-A3B-8bit — degradation + abort (LoRA
leak #1185 + long-context streaming abort omlx#578).
mlx-community/Qwen3.6-35B-A3B-8bit — abort (DFlash drafter,
omlx#902). Mitigation: disable DFlash.
mlx-community/Qwen3-VL-2B-Instruct — abort (M5 Max GPU hang +
page fault, mlx-vlm#1064). Mitigation: avoid M5 Max; M1-M4 untested.

The original gemma-4-31b-it-8bit entry retains its legacy fields and
adds the new schema fields.

Notes

The earliest test of the registry's value: v0.11.0 ships data on
five distinct (model × hardware × workload) combinations, not just
one. If a user on M5 Max hits Qwen3-VL hang, they can now query
metal-guard before debugging upstream. If a user on M4 Max starts a
LoRA on Qwen3.5-27B, they can wire ResourceTracker from day one
instead of waiting for their first Resource limit exceeded crash.

Bump pip install "git+https://github.com/Harperbot/metal-guard.git@v0.11.0".

Assets 2

27 Apr 15:10

Harperbot

v0.10.0

566f93e

v0.10.0 — L10–L13 + community-curated KNOWN_PANIC_MODELS

[0.10.0] — 2026-04-27

Promotes four Harper-private defence layers (L10-L13) to the public
distribution after two weeks of production validation, and reframes
KNOWN_PANIC_MODELS as a community-curated registry.

Added

L10 — Panic cooldown gate (evaluate_panic_cooldown / mark_panic_sentinel_cooldown /
ack_panic_lockout / clear_panic_ack / clear_panic_sentinel). After a kernel panic +
reboot, launchd auto-respawns plists ~14 minutes later — without a gate, the next MLX
workload can immediately re-trigger the same driver bug. The gate scans
/Library/Logs/DiagnosticReports/ for AND-pattern (prepare_count_underflow +
IOGPUMemory.cpp:NNN) panics and applies a staircase cooldown:

24h panics Action

0 proceed

1 2h cooldown since latest panic

≥2 (or 72h ≥3) lockout — requires ~/.metal-guard-ack touch

Returns a CooldownVerdict dataclass with exit_code ∈ {0=proceed, 2=cooldown,
≥3=gate broken}. Stdlib-only by design — works even when MLX install is wedged
mid-recovery. Designed for plist wrapper scripts via the metal-guard panic-gate CLI.

Env knobs: METALGUARD_PANIC_COOLDOWN_STAGE1_H / _LOCKOUT_24H_N / _LOCKOUT_72H_N /
_LOCKOUT_MAX_H / _GATE_DISABLED=1 (kill switch).
L11 — Subprocess orphan monitor (scan_orphan_subproc_pre, OrphanPre
dataclass). Pre-panic signal: a SUBPROC_PRE: <model> breadcrumb without a
matching SUBPROC_POST after 90 seconds strongly suggests Metal is stuck.
Caller can then SIGKILL the worker pid before the kernel does (saves a
reboot). Reads breadcrumb tail (~2000 lines) and FIFO-pairs PRE↔POST per
model_id. Configurable threshold via METALGUARD_SUBPROC_ORPHAN_THRESHOLD_SEC,
kill-switch METALGUARD_SUBPROC_ORPHAN_WATCH_DISABLED=1.
L12 — Postmortem auto-collect (run_postmortem(output_dir)). After a
panic + reboot, this collects the full diagnostic bundle:
- panic-full-*.panic files within 24h (capped at 5 files / 5MB each)
- last 500 lines of metal_breadcrumb.log
- panics.jsonl history copy
- mx.metal.{active,cache,peak}_memory snapshot (best-effort if MLX importable)
- index.md summarising the bundle + next steps
When a panic is found in the window, also writes a sentinel cooldown so L10
defers further runs even if DiagnosticReports rotates. Kill-switch
METALGUARD_POSTMORTEM_DISABLED=1. Designed to be called from a launchd
wrapper after reboot.
L13 — Status snapshot writer (get_status_snapshot / write_status_snapshot).
JSON snapshot for cross-process consumers (menu bar apps, dashboards, ssh
inspection scripts) that should not import metal_guard directly. Schema
is append-only across minor versions; breaking changes bump
STATUS_SNAPSHOT_SCHEMA_VERSION. Aggregates: memory stats / KV monitor
state / recent panics / breadcrumb tail / cross-process lock holder /
defensive-vs-observer mode / L10 cooldown verdict. Atomic write via
tmp + os.replace. Daemon mode via metal-guard status-write --interval 30.
CLI subcommands in metal-guard console script (matches Harper's
internal CLI surface):
- metal-guard panic-gate — L10 evaluate, exit 0/2/3 for plist wrappers
- metal-guard postmortem <output_dir> — L12 collect bundle
- metal-guard status-write [--once|--interval N] — L13 atomic write / daemon
- metal-guard orphan-scan [--threshold-sec N] — L11 scan
- metal-guard ack — L10 atomic touch ~/.metal-guard-ack
scripts/mlx-safe-python bash wrapper — interactive shell guard that
refuses ad-hoc python -c "import torch/mlx" while a cooldown is active.
Lets pip / build / venv / ensurepip pass through (they don't import
Metal). Provides MLX_SAFE_PYTHON_FORCE=1 escape hatch with WARN. Fail-open
if the gate itself is broken (rc=11 + stderr WARN, never blocks shell on
infrastructure problems). Generic — works with any python3 on PATH.

24h panics	Action
0	proceed
1	2h cooldown since latest panic
≥2 (or 72h ≥3)	lockout — requires `~/.metal-guard-ack` touch

Changed

KNOWN_PANIC_MODELS is now framed as a community-curated registry.
README has a prominent section above "The Problem" pitching it as the
canonical place to record (model, hardware, panic signature, workload, workaround) tuples. New .github/ISSUE_TEMPLATE/known-panic-report.yml
walks contributors through the schema. New CONTRIBUTING.md documents
required vs. optional fields, quality bar (production reproduction OR
confirmed upstream issue with signature), and an example entry.
Default state path is ~/.cache/metal-guard/ for L10's sentinel and
panics.jsonl ledger. User-facing ack file is ~/.metal-guard-ack (single-
touch clearance without spelunking caches). XDG-compatible.
PyPI URLs corrected to Harperbot/metal-guard. Added Changelog and
Known Panic Models URL entries for PyPI display.

Notes

The honest caveat from v0.9.0 still holds: metal-guard narrows multiple race
windows around the Apple IOGPU driver bug — it does not fix the bug. v0.10
extends the defence surface from "during run" to "after reboot" (L10 prevents
auto-re-panic, L12 captures forensics, L13 surfaces state to monitoring).

Assets 2

24 Apr 17:18

Harperbot

v0.9.0

05387d2

v0.9.0 — cross-model cadence + gemma-4 floor + KNOWN_PANIC_MODELS

Consolidates panic #7–#11 findings from Harper's production timeline (2026-04-16 → 2026-04-24). Ports three defences from the internal fork and documents the first known-panic model repeat offender: mlx-community/gemma-4-31b-it-8bit.

Highlights

B1 subprocess_inference_guard(model_id) — per-inference Metal flush contextmanager for subprocess workers. Ended a 6-panic streak on the internal MAGI pipeline when wired into the worker loop.
C5 cross-model cadence — CadenceGuard(cross_model_interval_sec=…) + CrossModelCadenceViolation (subclass of CadenceViolation). Gemma-4 family floor: 90 seconds enforced even when base is 0. METALGUARD_CROSS_MODEL_INTERVAL env var provides opt-in without code changes.
C7 gemma4_generation_flush(model_id, generate_call_count) — first-generate settle window (synchronize + clear_cache + sleep) before the first forward pass on a gemma-4 worker. Renamed from internal gemma4_firstgen_guard — "guard" was misleading; this is a flush, not a block.
KNOWN_PANIC_MODELS advisory registry + check_known_panic_model() + warn_if_known_panic_model() (idempotent). Ships with one entry: mlx-community/gemma-4-31b-it-8bit, which kernel-panicked twice on the same pipeline 24 hours apart.

When MetalGuard is not enough

A new load-bearing section in the README and CHANGELOG: when every v0.9.0 defence is engaged and a model still kernel-panics in production, switch backend (Ollama / llama.cpp) or pivot to an MoE variant. Community data converges on this — Hannecke (M4 Max 64GB) pivoted to Qwen3-Coder-30B-A3B MoE; ronm92130 on mlx#3186 (2026-04-24) pivoted to llama.cpp on M4 base 32GB, explicitly referencing this project's two-trigger-path hypothesis.

`mlx-community/gemma-4-31b-it-8bit` — repeat offender

Two Harper production kernel panics, 24 hours apart, same pipeline, same model, same signature IOGPUMemory.cpp:492 prepare_count_underflow:

#	Local time	PID	Spawn → panic	Context
7	2026-04-23 03:14	67840	~6 min	pre-cross-model-cadence
11	2026-04-24 03:14	26608	~1.5 min	same pipeline; classic L9 in place

Community corroboration: mlx-lm#883 (M3 Ultra 96GB), lmstudio-ai/lmstudio-bug-tracker#1740, Hannecke — "MLX Crashed My Mac".

API compatibility

All additions are backwards-compatible. CadenceGuard() and require_cadence_clear() default behaviour is unchanged (cross-model cadence defaults to 0.0 / disabled). CadenceGuard.check() now reads the JSON store directly under _CADENCE_FILE_LOCK rather than routing through self.last_ts() — subclassing note only.

Docs & tests

README in English, 繁體中文, and 日本語 — all three languages synced with the new ## Known affected models and ## When MetalGuard is not enough sections.
213 passed (166 pre-existing + 47 new in tests/test_v090_cross_model_cadence.py).
2 critic review rounds (R1: 3 P0 + 6 P1 found and fixed; R2: 1 P1 test-gap fixed, 0 P0 residual → GO).

Full changelog: CHANGELOG.md

Assets 2

Releases: Harperbot/metal-guard

v0.11.7 — variant policy clarification (Gemma 4 31B)

v0.11.7 — variant policy clarification

Why this matters

Policy

Added on the mlx-community/gemma-4-31b-it-8bit entry

Install

Tests

Uh oh!

v0.11.6 — mlx-lm blocklist + workload advisories + Gemma 4 fuse

v0.11.6 — mlx-lm blocklist + workload advisories + Gemma 4 fuse

New mechanisms

MLX_LM_VERSION_BLOCKLIST

WORKLOAD_ADVISORIES

KNOWN_PANIC_MODELS addition

Install

Tests

No breaking changes

Uh oh!

v0.11.5 — CI flakiness hotfix

v0.11.5 — CI flakiness hotfix

Fixed

Install

What's still here from v0.11.4

Uh oh!

v0.11.4 — MLX_VERSION_BLOCKLIST + 5 panic registry entries

v0.11.4 — MLX_VERSION_BLOCKLIST + 5 new panic registry entries

New mechanism: MLX_VERSION_BLOCKLIST

KNOWN_PANIC_MODELS additions

Install

Tests

No breaking changes

Uh oh!

v0.11.3 — _mock_mlx fixture proper fix + Node 24

v0.11.3 — Proper _mock_mlx fixture fix + GitHub Actions Node 24

What was wrong

The fix

Other changes

Install

Test results

Uh oh!

v0.11.2 — first green CI since v0.9.0

[0.11.2] — 2026-04-28

Fixed

[0.11.1] — 2026-04-28

Fixed

Uh oh!

v0.11.1 — explicit py-modules hotfix

Uh oh!

v0.11.0 — 7 Harper-private ports + KNOWN_PANIC_MODELS upgrade + PEP 639 hotfix

[0.11.0] — 2026-04-28

Fixed (was v0.10.1 hotfix)

Added — error_classifier (informed by 2026-04 community sweep)

Added — L10b: process-abort scanner

Added — L13b: Apple GPU family detection

Added — L14: descriptor-leak heuristic

Added — breadcrumb_with_meta()

Changed — KNOWN_PANIC_MODELS schema

Added — 4 new KNOWN_PANIC_MODELS entries

Notes

Uh oh!

v0.10.0 — L10–L13 + community-curated KNOWN_PANIC_MODELS

[0.10.0] — 2026-04-27

Added

Changed

Notes

Uh oh!

v0.9.0 — cross-model cadence + gemma-4 floor + KNOWN_PANIC_MODELS

Highlights

When MetalGuard is not enough

mlx-community/gemma-4-31b-it-8bit — repeat offender

API compatibility

Docs & tests

Uh oh!

Added on the `mlx-community/gemma-4-31b-it-8bit` entry

`MLX_LM_VERSION_BLOCKLIST`

`WORKLOAD_ADVISORIES`

`KNOWN_PANIC_MODELS` addition

v0.11.4 — `MLX_VERSION_BLOCKLIST` + 5 new panic registry entries

New mechanism: `MLX_VERSION_BLOCKLIST`

`KNOWN_PANIC_MODELS` additions

v0.11.3 — Proper `_mock_mlx` fixture fix + GitHub Actions Node 24

Added — `error_classifier` (informed by 2026-04 community sweep)

Added — `breadcrumb_with_meta()`

Changed — `KNOWN_PANIC_MODELS` schema

Added — 4 new `KNOWN_PANIC_MODELS` entries

`mlx-community/gemma-4-31b-it-8bit` — repeat offender