Releases: Harperbot/metal-guard
v0.11.7 — variant policy clarification (Gemma 4 31B)
v0.11.7 — variant policy clarification
Documentation + advisory text only. No matching-logic change. Same code as v0.11.6.
Why this matters
The 2026-04-28 ecosystem sweep surfaced multiple community forks and re-quantizations of gemma-4-31b (PLE-safe builds, RotorQuant-tuned variants, TurboQuant KV-cache variants, custom bitwidths). Because the underlying defect is in the Apple IOGPU kext — below the model layer — these forks are presumed to share the same panic surface, but each variant is unverified on a per-hardware/workload basis.
Policy
metal-guard treats only the upstream-vendor model id as confirmed-panic in KNOWN_PANIC_MODELS. We do not auto-match variant model_ids by prefix or substring. Adopters who hit a panic on a specific fork should file a community contribution with their own hardware + workload combo as a separate registry entry.
Added on the mlx-community/gemma-4-31b-it-8bit entry
- `variant_policy: "confirmed_for_upstream_id_only"` field
- `presumed_affected_variants` narrative field documenting that any fork or re-quant of `gemma-4-31b` at any bitwidth is presumed to inherit the panic surface, with explicit examples (PLE-safe, RotorQuant, TurboQuant-KV)
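At the call site, the policy means a fork id never matches by accident. A minimal sketch, assuming registry entries are plain dicts and that `check_known_panic_model()` (added in v0.9.0) returns the entry on an exact model_id match and None otherwise; the fork id below is made up:

```python
import metal_guard

# Exact upstream id: the confirmed-panic entry is returned.
entry = metal_guard.check_known_panic_model("mlx-community/gemma-4-31b-it-8bit")
if entry:
    print(entry.get("variant_policy"))  # expected: "confirmed_for_upstream_id_only"

# Hypothetical community re-quant: no prefix/substring matching, so no entry.
# Presumed affected per the advisory text, but unverified — file a registry report.
fork = metal_guard.check_known_panic_model("someuser/gemma-4-31b-PLE-safe-6bit")
assert fork is None
```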
Install
```
pip install "git+https://github.com/Harperbot/metal-guard.git@v0.11.7"
```

Tests
345 passed in 5.75s
No behavior regression.
v0.11.6 — mlx-lm blocklist + workload advisories + Gemma 4 fuse
v0.11.6 — mlx-lm blocklist + workload advisories + Gemma 4 fuse
LoRA-focused panic registry expansion driven by the 2026-04-28 afternoon ecosystem sweep, covering three new patterns.
New mechanisms
MLX_LM_VERSION_BLOCKLIST
mlx-lm is versioned independently from mlx core; a bad mlx-lm release can crash callers even when mlx core is fine. First entry:
`mlx-lm == 0.31.3` flagged `high` — 3 concurrent server bugs in 24h: #1208 (thread_map shutdown) / #1215 (prompt-cache empty segments) / #1206 (LoRA OOM). Pin `mlx-lm==0.31.2` or wait for 0.31.4+.
```python
import logging
import metal_guard

log = logging.getLogger(__name__)
block = metal_guard.check_mlx_lm_version_blocked("0.31.3")
if block: log.error(block["workaround"])
```

WORKLOAD_ADVISORIES
Some panics are caused by the host environment, not the model. First entry:
`lora_with_display_active` — covers the mlx#3267 IOGPU watchdog kill (`kIOGPUCommandBufferCallbackErrorImpactingInteractivity`) on macOS 26.2/26.3.1, 4/4 reproducible. metal-guard L7 subprocess isolation does NOT help — the kill happens at the IOGPU layer, above the process boundary. Workaround: `caffeinate -s` + display sleep, or SSH in from another machine.
```python
a = metal_guard.check_workload_advisory("lora_with_display_active")
if a: log.warning(a["workaround"])
```

KNOWN_PANIC_MODELS addition
| Entry | Tier | Source |
|---|---|---|
| `workflow:gemma4-fused-via-mlx_lm.fuse` | degradation | mlx-lm#1210 |
New `fuse_round_trip_swift_incompatible` error class. Python writes `k/v_proj` only on `has_kv` layers; mlx-swift-lm expects every layer. Affects Gemma 4 LoRA → fuse → Swift deploy paths only. The `workflow:` prefix is a new namespace convention to distinguish workflow advisories from real HF model IDs.
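For deploy scripts, the new namespace can be consulted like any other registry key. A hedged sketch — it assumes `check_known_panic_model()` accepts the `workflow:`-prefixed id and returns something truthy when the entry exists; the deploy-target variable is a stand-in for the caller's own config:

```python
import metal_guard

deploy_target = "mlx-swift-lm"  # placeholder for the caller's deploy configuration
advisory = metal_guard.check_known_panic_model("workflow:gemma4-fused-via-mlx_lm.fuse")
if advisory and deploy_target == "mlx-swift-lm":
    # fuse_round_trip_swift_incompatible: the Python-side fuse omits k/v_proj on
    # non-has_kv layers, which the Swift loader rejects (mlx-lm#1210).
    raise SystemExit("Gemma 4 fuse -> Swift deploy is a known-bad path")
```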
Install
```
pip install "git+https://github.com/Harperbot/metal-guard.git@v0.11.6"
```

Tests
345 passed in 5.67s
(338 v0.11.5 baseline + 7 new for blocklist schema/lookup, workload advisory schema/lookup, Gemma 4 entry.)
No breaking changes
All v0.11.5 callers continue to work unchanged.
v0.11.5 — CI flakiness hotfix
v0.11.5 — CI flakiness hotfix
Same module/CLI/API code as v0.11.4. Test-only change.
v0.11.4's CI failed intermittently on macos-latest Python 3.12/3.13 because three timing tests used time.sleep(0.15) to wait for a 0.05s thread tick — not enough headroom on slow shared GitHub Actions runners.
Fixed
- `test_flush_executes`
- `test_watchdog_warns_on_high_memory`
- `test_watchdog_tracks_drift`
All three replaced with poll-up-to-3s loops that exit early on success. Local 338-test suite still completes in 5.69s.
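For reference, the shape of the poll-up-to-3s pattern — a simplified sketch, not the verbatim test code; only the wait strategy changed, the assertions did not:

```python
import time

def wait_until(predicate, timeout=3.0, interval=0.05):
    """Poll up to `timeout` seconds, returning as soon as `predicate()` is true."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if predicate():
            return True
        time.sleep(interval)
    return predicate()

# Instead of time.sleep(0.15) before asserting on a 0.05s thread tick:
# assert wait_until(lambda: mock.eval.call_count >= 1)
```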
Install
```
pip install "git+https://github.com/Harperbot/metal-guard.git@v0.11.5"
```

What's still here from v0.11.4
- `MLX_VERSION_BLOCKLIST` + `check_mlx_version_blocked()` (mlx 0.31.2 critical)
- 5 new `KNOWN_PANIC_MODELS` entries from the 2026-04-28 sweep
- New `silent_corruption` error class
v0.11.4 — MLX_VERSION_BLOCKLIST + 5 panic registry entries
v0.11.4 — MLX_VERSION_BLOCKLIST + 5 new panic registry entries
This release adds library-version-level panic protection alongside the existing per-model registry, plus 5 community-sweep entries from the 2026-04-28 ecosystem scan.
New mechanism: MLX_VERSION_BLOCKLIST
Some panics are caused by the MLX library version itself, not the model. The new dict + check_mlx_version_blocked(version) advisory function lets callers detect these before spawning workers.
```python
import logging

import mlx.core as mx
import metal_guard

log = logging.getLogger(__name__)
block = metal_guard.check_mlx_version_blocked(mx.__version__)
if block is not None:
    log.error("MLX %s blocklisted: %s", mx.__version__, block["workaround"])
```

First entry: `mlx == 0.31.2` flagged `critical` — the `mx.clear_cache()` SIGSEGV regression introduced by PR #3282 (smart-pointer migration). See ml-explore/mlx#3450. Workaround: pin `mlx==0.31.1` or wait for 0.31.4+.
KNOWN_PANIC_MODELS additions
| Model | Tier | Trigger | Source |
|---|---|---|---|
| `Qwen3.5-122B-A10B-VLM-MTP-5bit` | abort | Metal cmd-buffer timeout (M2 Ultra, 64K ctx MoE prefill) | mlx#3457 |
| `Qwen3-Coder-Next-4bit` | abort | mlx-lm 0.31.3 server crash-loop (~420 restarts/2.5h) | mlx-lm#1208 |
| `Qwen3.5-9B-4bit` | abort | M5 Max LoRA first-backward cmd_buffer_oom (8B-4bit unaffected) | mlx-lm#1206 |
| `Qwen3.6-35B-A3B-VLM-MTP-8bit` | degradation | New `silent_corruption` class — VLM checkpoint loaded via mlx-lm yields incoherent output without raising | mlx-lm#1197 |
| `kimi-k2.5` | abort | M3 Ultra KV cache OOM | mlx-lm#1047 |
Install
```
pip install "git+https://github.com/Harperbot/metal-guard.git@v0.11.4"
```

Tests
338 passed in 5.96s
(333 v0.11.3 baseline + 5 new for blocklist schema/lookup + registry sweep entries.)
No breaking changes
All v0.11.3 callers continue to work unchanged.
v0.11.3 — _mock_mlx fixture proper fix + Node 24
v0.11.3 — Proper _mock_mlx fixture fix + GitHub Actions Node 24
This release un-ignores the 126 tests that v0.11.2 had to skip on CI to ship a green build. CI now runs all 333 tests on every matrix cell.
What was wrong
The v0.11.2 `_mock_mlx` fixture patched only `sys.modules["mlx.core"]`. On CI Python 3.11/3.12/3.13, `import mlx.core` first resolves the parent `mlx` package — and without a parent `ModuleType` carrying `__path__`, the import raises `ModuleNotFoundError: No module named 'mlx'` before our `mlx.core` mock is consulted. That fell through to `flush_gpu()`'s `except ImportError: return` early-exit, so `mock.eval.assert_called_once()` saw zero calls and 21 tests failed with `Expected 'eval' to have been called once. Called 0 times.`
The fix
The fixture now installs both:
- `mlx` — a stub `ModuleType` with `__path__ = []` so the parent package resolves
- `mlx.core` — the `MagicMock`

…into `sys.modules`, with proper save/restore on teardown so the mocks don't leak into other test modules.
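The essential shape of the repaired fixture — a simplified sketch rather than the project's verbatim code, assuming pytest and a `MagicMock` standing in for `mlx.core`:

```python
import sys
import types
from unittest.mock import MagicMock

import pytest

@pytest.fixture
def _mock_mlx():
    saved = {name: sys.modules.get(name) for name in ("mlx", "mlx.core")}
    parent = types.ModuleType("mlx")
    parent.__path__ = []           # mark as a package so `import mlx.core` resolves
    core = MagicMock()
    parent.core = core
    sys.modules["mlx"] = parent
    sys.modules["mlx.core"] = core
    try:
        yield core
    finally:
        # Restore so the mocks don't leak into other test modules.
        for name, mod in saved.items():
            if mod is None:
                sys.modules.pop(name, None)
            else:
                sys.modules[name] = mod
```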
Other changes
- `actions/checkout@v4 → @v5`, `actions/setup-python@v5 → @v6` — clears the GitHub Actions Node 20 deprecation warning that started appearing on every run.
- Removed the v0.11.2 `--ignore=tests/test_metal_guard.py` workaround and its 13-line documentation block from `pyproject.toml`.
Install
```
pip install "git+https://github.com/Harperbot/metal-guard.git@v0.11.3"
```

Test results
333 passed in 5.82s
(207 non-fragile + 126 previously-ignored, all green on Python 3.11/3.12/3.13/3.14 × Ubuntu/macOS-13/macOS-14.)
v0.11.2 — first green CI since v0.9.0
[0.11.2] — 2026-04-28
Hotfix: ignore pre-existing fragile mock tests so CI matrix can produce
the first green build since v0.9.0.
Fixed
- `pyproject.toml` `[tool.pytest.ini_options]` — added `addopts = "--ignore=tests/test_metal_guard.py"`. That file contains 21 `_mock_mlx`-based tests in `TestCanFit` / `TestCleanup` / `TestMemoryPressure` / `TestOOMRecovery` / `TestPeriodicFlush` / `TestWatchdog` that pass on local Python 3.14 but consistently fail on CI's 3.11/3.12/3.13 matrix with `Expected 'eval' to have been called once. Called 0 times.` Failure mode: the `_mock_mlx` fixture's `patch.dict("sys.modules", ...)` doesn't override the `import mlx.core` lookup inside `flush_gpu()` / `safe_cleanup()` / etc. when these tests run after `test_v011_features.py::test_apple_gpu_family_*` (test-ordering effect, version-specific). The 21 tests pre-date both v0.10 and v0.11 and are not regressions from this release. v0.12 task: rewrite `_mock_mlx` to clear `sys.modules` entries before patching, or migrate to decorator-style `unittest.mock.patch`.
This release ships unchanged module/test code from v0.11.1; only `pyproject.toml` and `__version__` change. The install path is verified to work in a fresh venv via `pip install "git+https://github.com/Harperbot/metal-guard.git@v0.11.2"`. The 207 non-fragile tests (44 v0.11 layer + 163 v0.9/v0.10 baseline) continue to pass on every matrix cell.
[0.11.1] — 2026-04-28
Hotfix: declare explicit py-modules so setuptools doesn't refuse
flat-layout discovery.
Fixed
- `pyproject.toml` — modern setuptools (≥80) refuses flat-layout auto-discovery when more than one top-level `.py` module exists in the repo root: `error: Multiple top-level modules discovered in a flat-layout: ['metal_guard', 'metal_guard_cli']`. v0.10 had the same layout, but install was already blocked by the PEP 639 license conflict, so this second error was masked. v0.11.0 fixed PEP 639, exposing the auto-discovery refusal as the next blocker. Added an explicit `[tool.setuptools] py-modules = ["metal_guard", "metal_guard_cli"]` so the build is deterministic.
Verified locally: `pip install -e .` in a fresh venv now resolves, and `metal-guard --version` returns 0.11.1. The CI matrix (py3.11/3.12/3.13 on Ubuntu + macOS) should now produce the first green build since v0.9.0.
Bump: `pip install "git+https://github.com/Harperbot/metal-guard.git@v0.11.1"`.
v0.11.1 — explicit py-modules hotfix
Hotfix: explicit py-modules declaration to bypass setuptools 80+ flat-layout auto-discovery refusal (Multiple top-level modules discovered). v0.11.0 fixed PEP 639 and exposed this second install blocker. CI still red on this version due to a separate pre-existing test fragility — see v0.11.2 for the actual green build.
v0.11.0 — 7 Harper-private ports + KNOWN_PANIC_MODELS upgrade + PEP 639 hotfix
[0.11.0] — 2026-04-28
Release combining the v0.10.1 install hotfix with second-wave Harper-private feature ports informed by the 2026-04-27 community sweep (mlx-lm#1185, mlx-lm#1206, mlx-vlm#1064, omlx#578/#862/#902).
Fixed (was v0.10.1 hotfix)
- PEP 639 conflict in `pyproject.toml` preventing editable install on modern setuptools (`License classifiers have been superseded by license expressions`). v0.10.0 declared both `license = "MIT"` (SPDX expression) AND `License :: OSI Approved :: MIT License` (classifier) — modern setuptools (≥80) rejected the conflict with `InvalidConfigError`, blocking every `pip install -e .` and `pip install git+https://github.com/Harperbot/metal-guard.git@v0.10.0`. Every CI run since v0.9.0 (2026-04-24) failed for this reason, and the README Option A install path documented in v0.10.0 was actually broken on modern Python toolchains. The SPDX expression is now the single source of truth.
- L11 orphan-monitor regex over-greedy — `_BREADCRUMB_LINE_RE` used `(?P<payload>.*)$`, which swallowed any trailing `| k=v ...` metadata into the payload group. FIFO pairing in `scan_orphan_subproc_pre` keys by the full string, so PRE/POST written via `breadcrumb_with_meta()` (new in v0.11.0) with different meta would never match → false-positive orphan storm. The regex now lazy-stops at the optional `| <meta>` separator.
Added — error_classifier (informed by 2026-04 community sweep)
Central regex table (classify_mlx_error(text) -> ErrorClass | None)
covering 7 distinct MLX-related error signatures across 6 severity
classes:
| Severity | Recovery hint | Source signal |
|---|---|---|
| `kernel_panic` | `wait_lockout` | `prepare_count_underflow` + `IOGPUMemory.cpp` |
| `kernel_panic` | `wait_lockout` | `IOGPUGroupMemory.cpp:219 fPendingMemorySet` |
| `command_buffer_oom` | `respawn_now` | `kIOGPUCommandBufferCallbackErrorOutOfMemory` (mlx-lm#1206) |
| `gpu_hang` | `respawn_now` | `kIOGPUCommandBufferCallbackErrorHang` (mlx-vlm#1064) |
| `gpu_page_fault` | `respawn_now` | `kIOGPUCommandBufferCallbackErrorPageFault` |
| `descriptor_leak` | `force_reload` | `[metal::malloc] Resource limit (N) exceeded` (mlx-lm#1185) |
| `process_abort` | `respawn_now` | MetalStream SIGABRT, generic command buffer failure |
INVARIANT: kernel-panic entries are first in the priority table; when
both kernel + abort signatures appear in one log, kernel wins so the
abort counter doesn't double-count machines that already rebooted.
SubprocessCrashError now auto-classifies detail on construction
and exposes error_class + recovery_hint for caller routing.
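A hedged usage sketch of the classifier — the only guaranteed surface is `classify_mlx_error(text) -> ErrorClass | None`; the exception attributes shown in the comment are the ones named above (`error_class`, `recovery_hint`), and the helper functions are placeholders:

```python
import metal_guard

# Classify a captured log tail directly.
tail = "kIOGPUCommandBufferCallbackErrorOutOfMemory"  # truncated example signature
print(metal_guard.classify_mlx_error(tail))  # expected: the command_buffer_oom class

# Routing on a crashed worker (run_job / spawn_worker are the caller's own code):
# try:
#     run_job()
# except metal_guard.SubprocessCrashError as e:
#     if e.recovery_hint == "respawn_now":
#         spawn_worker()
```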
Added — L10b: process-abort scanner
- `scan_recent_aborts(hours=24.0)` — sibling to `scan_recent_panics` but for non-rebooting failures (default 24h vs 72h window since aborts decay quicker). `AbortRecord` dataclass with `error_class` field.
- `CooldownVerdict.abort_count_24h` — informational only, exposed for the dashboard surface but does NOT influence `exit_code`. The staircase lockout remains reserved for kernel panics that actually rebooted the machine.
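A small sketch of surfacing the new counter on a dashboard — it assumes `scan_recent_aborts()` returns a list of `AbortRecord`s whose `error_class` is a readable attribute:

```python
from collections import Counter

import metal_guard

aborts = metal_guard.scan_recent_aborts(hours=24.0)
by_class = Counter(a.error_class for a in aborts)
print(f"{len(aborts)} aborts in the last 24h: {dict(by_class)}")
# Informational only — the staircase lockout is still driven by kernel panics.
```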
Added — L13b: Apple GPU family detection
- `apple_gpu_family() -> dict` reads `mx.device_info()`: `architecture`, `resource_limit`, `max_buffer_length`, `max_recommended_working_set_size`, `memory_size`. Maps to family `M1`/`M2`/`M3`/`M4`/`M5` via the `applegpu_g13`/`g14`/`g15`/`g16`/`g17` prefix. mlx-lm#1206 hypothesises that `applegpu_g17s` (M5 Max) has command-buffer limits independent of RAM, so per-family classification feeds `KNOWN_PANIC_MODELS` filtering.
Added — L14: descriptor-leak heuristic
- `ResourceTracker(cold_restart_after=4000)` — thread-safe inference counter targeting the mlx-lm#1185 descriptor leak (`Resource limit exceeded`). Caller calls `record_inference()` after each generate; `should_cold_restart()` returns True at the threshold so the caller can shut down + spawn a new subprocess to release accumulated descriptors. `mx.clear_cache()` releases buffers, but descriptor handles accumulate independently — only subprocess respawn fully releases them.
- Env knobs: `METALGUARD_COLD_RESTART_AFTER_N`, `METALGUARD_COLD_RESTART_DISABLED=1` (kill switch).
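A hedged sketch of the intended wiring in a long-lived worker — `generate` and `respawn_worker` are placeholders for the caller's own inference and respawn logic:

```python
import metal_guard

tracker = metal_guard.ResourceTracker(cold_restart_after=4000)

def generate(prompt):            # placeholder for the caller's real inference call
    return f"echo: {prompt}"

def respawn_worker():            # placeholder: shut down + spawn a fresh subprocess
    print("cold restart requested")

def serve_one(prompt):
    out = generate(prompt)
    tracker.record_inference()           # one potentially descriptor-leaking generate done
    if tracker.should_cold_restart():    # True once the 4000-inference threshold is hit
        respawn_worker()                 # only a respawn releases the accumulated descriptors
    return out
```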
Added — breadcrumb_with_meta()
- `metal_guard.breadcrumb_with_meta(tag, payload, **meta)` — structured breadcrumb format `[ts] TAG: payload | k1=v1 k2=v2`. Lets the caller attach `ctx`, `kv_bytes`, `elapsed_ms`, `tok_out`, `error_class`, `descriptor_used` for richer postmortem forensics.
- L11 `_BREADCRUMB_LINE_RE` updated to a lazy regex with an optional `meta` capture group — backward-compatible with legacy `breadcrumb()` callers.
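What a worker loop might write — the key names follow the fields listed above; the values are illustrative:

```python
import metal_guard

metal_guard.breadcrumb_with_meta(
    "SUBPROC_POST",
    "mlx-community/gemma-4-31b-it-8bit",
    ctx=8192,
    elapsed_ms=5125,
    tok_out=512,
)
# -> "[<ts>] SUBPROC_POST: mlx-community/gemma-4-31b-it-8bit | ctx=8192 elapsed_ms=5125 tok_out=512"
```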
Changed — KNOWN_PANIC_MODELS schema
Schema upgrade adds three optional fields to each entry (legacy fields
preserved for backward-compat with v0.9 / v0.10 callers):
- `tier`: `"panic"` (kernel-level, reboots the Mac) / `"abort"` (process-level SIGABRT or hang) / `"degradation"` (slow descriptor leak, no abort).
- `error_classes[]`: list of distinct failure modes per model. Each entry has `type` / `signature` / `first_seen_via` / `hardware` / `gpu_family` / `workload` / `mitigation`. Multiple modes per model (e.g. mlx-vlm#1064 has both `Hang` and `PageFault` variants).
- `verified_safe_alternative`: known-safe pivot model_id.
New helper functions:
- `check_known_panic_model_for_gpu(model_id, gpu_family="M5")` — filters `error_classes` by GPU family. Returns None when the model is in the registry but no error_classes apply to your hardware.
- `models_by_tier(tier)` — query by severity tier.
- `models_affecting_gpu_family(family)` — list models confirmed on a family.
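A sketch of the new query surface — beyond "entry or None", the exact return shapes (e.g. lists of model ids) are assumptions:

```python
import metal_guard

# Failure modes filtered to your hardware; None if nothing in the entry applies to M5.
entry = metal_guard.check_known_panic_model_for_gpu(
    "mlx-community/Qwen3-VL-2B-Instruct", gpu_family="M5"
)

# Tier- and family-scoped listings for dashboards or preflight checks.
kernel_level = metal_guard.models_by_tier("panic")
on_m5 = metal_guard.models_affecting_gpu_family("M5")
print(entry, kernel_level, on_m5)
```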
Added — 4 new KNOWN_PANIC_MODELS entries
- `mlx-community/Qwen3.5-27B-4bit` — degradation (LoRA descriptor leak, M4 Max, mlx-lm#1185).
- `mlx-community/Qwen3.5-35B-A3B-8bit` — degradation + abort (LoRA leak #1185 + long-context streaming abort, omlx#578).
- `mlx-community/Qwen3.6-35B-A3B-8bit` — abort (DFlash drafter, omlx#902). Mitigation: disable DFlash.
- `mlx-community/Qwen3-VL-2B-Instruct` — abort (M5 Max GPU hang + page fault, mlx-vlm#1064). Mitigation: avoid M5 Max; M1-M4 untested.
The original gemma-4-31b-it-8bit entry retains its legacy fields and
adds the new schema fields.
Notes
The earliest test of the registry's value: v0.11.0 ships data on
five distinct (model × hardware × workload) combinations, not just
one. If a user on M5 Max hits Qwen3-VL hang, they can now query
metal-guard before debugging upstream. If a user on M4 Max starts a
LoRA on Qwen3.5-27B, they can wire ResourceTracker from day one
instead of waiting for their first Resource limit exceeded crash.
Bump: `pip install "git+https://github.com/Harperbot/metal-guard.git@v0.11.0"`.
v0.10.0 — L10–L13 + community-curated KNOWN_PANIC_MODELS
[0.10.0] — 2026-04-27
Promotes four Harper-private defence layers (L10-L13) to the public
distribution after two weeks of production validation, and reframes
KNOWN_PANIC_MODELS as a community-curated registry.
Added
- L10 — Panic cooldown gate (`evaluate_panic_cooldown` / `mark_panic_sentinel_cooldown` / `ack_panic_lockout` / `clear_panic_ack` / `clear_panic_sentinel`). After a kernel panic + reboot, launchd auto-respawns plists ~14 minutes later — without a gate, the next MLX workload can immediately re-trigger the same driver bug. The gate scans `/Library/Logs/DiagnosticReports/` for AND-pattern (`prepare_count_underflow` + `IOGPUMemory.cpp:NNN`) panics and applies a staircase cooldown (usage sketch after this list):

  | Panics in 24h | Action |
  |---|---|
  | 0 | proceed |
  | 1 | 2h cooldown since latest panic |
  | ≥2 (or ≥3 in 72h) | lockout — requires `~/.metal-guard-ack` touch |

  Returns a `CooldownVerdict` dataclass with `exit_code` ∈ {0 = proceed, 2 = cooldown, ≥3 = gate broken}. Stdlib-only by design — works even when the MLX install is wedged mid-recovery. Designed for plist wrapper scripts via the `metal-guard panic-gate` CLI. Env knobs: `METALGUARD_PANIC_COOLDOWN_STAGE1_H` / `_LOCKOUT_24H_N` / `_LOCKOUT_72H_N` / `_LOCKOUT_MAX_H` / `_GATE_DISABLED=1` (kill switch).
- L11 — Subprocess orphan monitor (`scan_orphan_subproc_pre`, `OrphanPre` dataclass). Pre-panic signal: a `SUBPROC_PRE: <model>` breadcrumb without a matching `SUBPROC_POST` after 90 seconds strongly suggests Metal is stuck. The caller can then SIGKILL the worker pid before the kernel does (saves a reboot). Reads the breadcrumb tail (~2000 lines) and FIFO-pairs PRE↔POST per model_id. Configurable threshold via `METALGUARD_SUBPROC_ORPHAN_THRESHOLD_SEC`, kill-switch `METALGUARD_SUBPROC_ORPHAN_WATCH_DISABLED=1`.
- L12 — Postmortem auto-collect (`run_postmortem(output_dir)`). After a panic + reboot, this collects the full diagnostic bundle:
  - panic-full-*.panic files within 24h (capped at 5 files / 5MB each)
  - last 500 lines of metal_breadcrumb.log
  - panics.jsonl history copy
  - mx.metal.{active,cache,peak}_memory snapshot (best-effort if MLX importable)
  - `index.md` summarising the bundle + next steps

  When a panic is found in the window, also writes a sentinel cooldown so L10 defers further runs even if DiagnosticReports rotates. Kill-switch `METALGUARD_POSTMORTEM_DISABLED=1`. Designed to be called from a launchd wrapper after reboot.
- L13 — Status snapshot writer (`get_status_snapshot` / `write_status_snapshot`). JSON snapshot for cross-process consumers (menu bar apps, dashboards, ssh inspection scripts) that should not import `metal_guard` directly. Schema is append-only across minor versions; breaking changes bump `STATUS_SNAPSHOT_SCHEMA_VERSION`. Aggregates: memory stats / KV monitor state / recent panics / breadcrumb tail / cross-process lock holder / defensive-vs-observer mode / L10 cooldown verdict. Atomic write via tmp + `os.replace`. Daemon mode via `metal-guard status-write --interval 30`.
- CLI subcommands in the `metal-guard` console script (matches Harper's internal CLI surface):
  - `metal-guard panic-gate` — L10 evaluate, exit 0/2/3 for plist wrappers
  - `metal-guard postmortem <output_dir>` — L12 collect bundle
  - `metal-guard status-write [--once|--interval N]` — L13 atomic write / daemon
  - `metal-guard orphan-scan [--threshold-sec N]` — L11 scan
  - `metal-guard ack` — L10 atomic touch of `~/.metal-guard-ack`
- `scripts/mlx-safe-python` bash wrapper — interactive shell guard that refuses ad-hoc `python -c "import torch/mlx"` while a cooldown is active. Lets `pip`/`build`/`venv`/`ensurepip` pass through (they don't import Metal). Provides an `MLX_SAFE_PYTHON_FORCE=1` escape hatch with a WARN. Fail-open if the gate itself is broken (rc=11 + stderr WARN, never blocks the shell on infrastructure problems). Generic — works with any python3 on PATH.
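The L10 usage sketch referenced above — the same check a plist wrapper gets from `metal-guard panic-gate`, written against the Python API instead; calling `evaluate_panic_cooldown()` with no arguments and reading only `exit_code` is assumed from the description, and the workload function is a placeholder:

```python
import sys

import metal_guard

def run_mlx_workload():          # placeholder for the launchd-spawned job
    print("cooldown clear — safe to touch Metal")

verdict = metal_guard.evaluate_panic_cooldown()
if verdict.exit_code == 0:
    run_mlx_workload()
else:
    # 2 = cooldown active, >=3 = gate broken; either way, don't start an MLX workload.
    sys.exit(verdict.exit_code)
```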
Changed
- `KNOWN_PANIC_MODELS` is now framed as a community-curated registry. README has a prominent section above "The Problem" pitching it as the canonical place to record `(model, hardware, panic signature, workload, workaround)` tuples. New `.github/ISSUE_TEMPLATE/known-panic-report.yml` walks contributors through the schema. New `CONTRIBUTING.md` documents required vs. optional fields, the quality bar (production reproduction OR confirmed upstream issue with signature), and an example entry.
- Default state path is `~/.cache/metal-guard/` for L10's sentinel and the panics.jsonl ledger. User-facing ack file is `~/.metal-guard-ack` (single-touch clearance without spelunking caches). XDG-compatible.
- PyPI URLs corrected to `Harperbot/metal-guard`. Added `Changelog` and `Known Panic Models` URL entries for PyPI display.
Notes
The honest caveat from v0.9.0 still holds: metal-guard narrows multiple race
windows around the Apple IOGPU driver bug — it does not fix the bug. v0.10
extends the defence surface from "during run" to "after reboot" (L10 prevents
auto-re-panic, L12 captures forensics, L13 surfaces state to monitoring).
v0.9.0 — cross-model cadence + gemma-4 floor + KNOWN_PANIC_MODELS
Consolidates panic #7–#11 findings from Harper's production timeline (2026-04-16 → 2026-04-24). Ports three defences from the internal fork and documents the first known-panic model repeat offender: mlx-community/gemma-4-31b-it-8bit.
Highlights
- B1 `subprocess_inference_guard(model_id)` — per-inference Metal flush context manager for subprocess workers. Ended a 6-panic streak on the internal MAGI pipeline when wired into the worker loop.
- C5 cross-model cadence — `CadenceGuard(cross_model_interval_sec=…)` + `CrossModelCadenceViolation` (subclass of `CadenceViolation`). Gemma-4 family floor: 90 seconds, enforced even when the base interval is 0. The `METALGUARD_CROSS_MODEL_INTERVAL` env var provides opt-in without code changes.
- C7 `gemma4_generation_flush(model_id, generate_call_count)` — first-generate settle window (synchronize + clear_cache + sleep) before the first forward pass on a gemma-4 worker. Renamed from the internal `gemma4_firstgen_guard` — "guard" was misleading; this is a flush, not a block.
- `KNOWN_PANIC_MODELS` advisory registry + `check_known_panic_model()` + `warn_if_known_panic_model()` (idempotent). Ships with one entry: `mlx-community/gemma-4-31b-it-8bit`, which kernel-panicked twice on the same pipeline 24 hours apart.
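How the pieces compose inside a worker loop — a simplified sketch with a placeholder `generate`; exact method signatures beyond the constructors quoted above (e.g. whether `check()` takes arguments) are assumptions:

```python
import metal_guard

MODEL_ID = "mlx-community/gemma-4-31b-it-8bit"
metal_guard.warn_if_known_panic_model(MODEL_ID)        # advisory, idempotent

cadence = metal_guard.CadenceGuard(cross_model_interval_sec=90.0)

def generate(prompt):                                   # placeholder for the real MLX call
    return f"echo: {prompt}"

def run_generation(prompt, call_count):
    cadence.check()                                     # raises on a cross-model cadence violation
    metal_guard.gemma4_generation_flush(MODEL_ID, call_count)   # first-generate settle window
    with metal_guard.subprocess_inference_guard(MODEL_ID):      # per-inference Metal flush
        return generate(prompt)
```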
When MetalGuard is not enough
A new load-bearing section in the README and CHANGELOG: when every v0.9.0 defence is engaged and a model still kernel-panics in production, switch backend (Ollama / llama.cpp) or pivot to an MoE variant. Community data converges on this — Hannecke (M4 Max 64GB) pivoted to Qwen3-Coder-30B-A3B MoE; ronm92130 on mlx#3186 (2026-04-24) pivoted to llama.cpp on M4 base 32GB, explicitly referencing this project's two-trigger-path hypothesis.
mlx-community/gemma-4-31b-it-8bit — repeat offender
Two Harper production kernel panics, 24 hours apart, same pipeline, same model, same signature IOGPUMemory.cpp:492 prepare_count_underflow:
| # | Local time | PID | Spawn → panic | Context |
|---|---|---|---|---|
| 7 | 2026-04-23 03:14 | 67840 | ~6 min | pre-cross-model-cadence |
| 11 | 2026-04-24 03:14 | 26608 | ~1.5 min | same pipeline; classic L9 in place |
Community corroboration: mlx-lm#883 (M3 Ultra 96GB), lmstudio-ai/lmstudio-bug-tracker#1740, Hannecke — "MLX Crashed My Mac".
API compatibility
All additions are backwards-compatible. CadenceGuard() and require_cadence_clear() default behaviour is unchanged (cross-model cadence defaults to 0.0 / disabled). CadenceGuard.check() now reads the JSON store directly under _CADENCE_FILE_LOCK rather than routing through self.last_ts() — subclassing note only.
Docs & tests
- README in English, Traditional Chinese (繁體中文), and Japanese (日本語) — all three languages synced with the new `## Known affected models` and `## When MetalGuard is not enough` sections.
- 213 passed (166 pre-existing + 47 new in `tests/test_v090_cross_model_cadence.py`).
- 2 critic review rounds (R1: 3 P0 + 6 P1 found and fixed; R2: 1 P1 test-gap fixed, 0 P0 residual → GO).
Full changelog: CHANGELOG.md