- `scripts/install.sh --virtio` and `--virtio-force`: gated VirtIO-transport install path for orchestrators that hardcode VirtIO at the libvirt-channel level (kubevirt today; any orchestrator with the same constraint qualifies). Full operator-facing contract in `docs/NO_ISA_OVERRIDE.md`. ISA serial remains the only supported transport; this is an opt-in escape hatch for the small population (macOS 11+, SIP disabled, orchestrator forces VirtIO) where ISA is not an option on the host. Auto-detect behavior unchanged; no transport-mode flips; no philosophy change.
  - `--virtio` (documented, safety-gated):
    - Prerequisite checks: macOS >= 11, SIP disabled (`csrutil status`), `AppleQEMUGuestAgent` LaunchDaemon present at `/System/Library/LaunchDaemons/`, VirtIO guest-agent device present at `/dev/cu.org.qemu.guest_agent.0`. Every check refuses with a specific actionable message; no install actions run if any check fails.
    - Interactive warning block + `yes`/`no` prompt read from `/dev/tty` (NOT stdin), so `yes | install.sh --virtio` cannot bypass the gate.
    - On confirmation: `launchctl unload -w` Apple's LaunchDaemon, then verify the unload actually landed via two probes: (a) `launchctl list` no longer shows the daemon label, (b) `lsof` on the VirtIO device shows no holder. If either probe fails, abort and attempt to reload Apple's daemon to restore the prior state.
    - Standard agent install + write `/etc/qemu/qemu-ga.conf` with `path = /dev/cu.org.qemu.guest_agent.0` + drop marker at `/var/db/mac-guest-agent/.virtio-mode` (content: `mode=full`).
    - Functional verification: agent process PID is non-`-` in `launchctl list`, and `/var/log/mac-guest-agent.log` shows `Opened device: /dev/cu.org.qemu.guest_agent.0` within 5 seconds. If either fails, roll back (remove our agent, remove config, remove marker, reload Apple's daemon).
  - `--virtio-force` (undocumented; visible only in `--help`):
    - Bypasses every prerequisite check. No SIP probe. No Apple-agent unload. No `/dev/tty` prompt.
    - Installs the agent + writes the override config + drops marker with `mode=force`.
    - For experts who have already configured the host manually (Apple's daemon unloaded by hand, SIP off by hand, non-standard device path, etc.) and want a one-line install without re-running the same checks the gated path imposes.
    - Not in `README.md`, not in `docs/NO_ISA_OVERRIDE.md`. Hostile-named on purpose.
  - `scripts/install.sh --uninstall` (new): detects the marker, removes the override config, and:
    - `mode=full` (gated install): reloads Apple's `AppleQEMUGuestAgent` LaunchDaemon to restore prior state.
    - `mode=force` (force install): does not touch Apple's daemon (we didn't unload it; the operator did).
    - SIP is NOT re-enabled by `--uninstall` in either mode; that's an operator action via Recovery + `csrutil enable`.
  - Argument-parsing rejections (hard errors before any side effect): `--virtio` + `--virtio-force` cannot combine; `--uninstall` cannot combine with `--virtio` / `--virtio-force` / `--dry-run`.
  - `--dry-run` support: `--dry-run --virtio` and `--dry-run --virtio-force` print the would-do plan for the override paths (including the Apple-unload + verify steps for `--virtio`, and the no-checks notice for `--virtio-force`), no side effects, no root required. Same UX as the existing standard-install dry-run.
- `docs/NO_ISA_OVERRIDE.md`: full operator contract for `--virtio`. macOS 11+ scope stated up front; SIP-off rationale explained structurally (Apple's LaunchDaemon plist lives in `/System/Library/`, every Apple-supported override path is SIP-protected, no engineering workaround on our side); risk block; rollback instructions; explicit statement that this configuration is NOT covered by release-to-release stability promises; explicit non-decision on shipping a DriverKit System Extension to avoid the SIP-off requirement (months of engineering + paid Developer Program + notarization + user approval + ongoing IOKit-match arbitration against Apple per release, for a population this page already describes as small; explicitly not the right ROI for the project).
- `tests/test_install_flags.sh`: 34-assertion test suite covering argument parsing, mutually-exclusive flag rejection, `--help` mentions of new flags, and dry-run plan output for `--virtio` / `--virtio-force` / default. Wired into `make test` via the new `test-install-flags` target. Does not exercise live `csrutil` / `lsof` / `launchctl` probes (those need PATH-stubbed system commands and are not worth the test-infrastructure cost for an unsupported feature); manual verification on the El Cap and a Big Sur+ VM covers the live-probe paths.
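
The mutually-exclusive flag rejections that this suite covers could look roughly like the sketch below. It is an illustration only, not the code in `scripts/install.sh`: the flag names come from this changelog, while the function name `check_flag_combo`, the exit status, and the error wording are hypothetical.

```shell
# Hypothetical sketch: reject illegal flag combinations before any side effect.
# Written in POSIX sh; variables are global because POSIX has no `local`.
check_flag_combo() {
    virtio=0; force=0; uninstall=0; dry_run=0
    for arg in "$@"; do
        case "$arg" in
            --virtio)       virtio=1 ;;
            --virtio-force) force=1 ;;
            --uninstall)    uninstall=1 ;;
            --dry-run)      dry_run=1 ;;
        esac
    done
    # --virtio and --virtio-force are mutually exclusive.
    if [ "$virtio" -eq 1 ] && [ "$force" -eq 1 ]; then
        echo "error: --virtio and --virtio-force cannot be combined" >&2
        return 2
    fi
    # --uninstall stands alone: no override flags and no dry-run.
    if [ "$uninstall" -eq 1 ] && [ $(($virtio + $force + $dry_run)) -gt 0 ]; then
        echo "error: --uninstall cannot be combined with --virtio, --virtio-force, or --dry-run" >&2
        return 2
    fi
    return 0
}
```

Running the check before any other work is what makes the rejection a hard error with no side effects: `check_flag_combo "$@" || exit $?` as the first statement after parsing would be one way to wire it in.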
- `src/channel.c` `log_virtio_diagnostic_if_present()`: the "no ISA device found, but a VirtIO device is present" diagnostic message now closes with a one-sentence pointer at `docs/NO_ISA_OVERRIDE.md` for operators whose orchestrator hardcodes VirtIO and cannot expose ISA. Same diagnostic surface, no change in detection logic. Surfaces the escape hatch to anyone hitting the error without re-adding any auto-detect path.
- ISA serial remains the only supported transport. Auto-detect (`src/channel.c` `known_devices[]`) is unchanged; `--virtio` does not modify it. Default installs are not affected by anything in this release.
- No `transport = virtio` config key, no first-class VirtIO transport, no auto-detect of VirtIO devices, no DriverKit System Extension, no kext. Every option-space entry that would erode the v2.5.0 ISA-only decision was considered and rejected; see commit messages and the discussion behind issue #7 (mav2287/mac-guest-agent) for the rationale.
- `guest-exec` process-table slot leak (DoS after 64 short execs even with correct status polling). `src/cmd-exec.c::process_table` caps at `MAX_PROCESSES = 64`. Slots were reclaimed only by a 30-minute wall-time cleanup, so a caller polling `guest-exec-status` until `exited:true` (the correct usage pattern) would still see the slot held for half an hour. After 64 short execs the 65th returned `GenericError: Too many running processes` until the cleanup window passed, affecting backup tools, monitoring loops, and the project's own `scripts/verify.sh`, which uses `guest-exec` for the in-VM `--self-test-json` and `--safe-test-json` calls. Reproducer (now a regression test in `tests/run_tests.sh`): launched `/bin/echo` + poll-until-exited 64 times → 65th failed.

  Fix: `handle_exec_status()` now calls `release_process(proc)` after the terminal `exited:true` response is fully built (the response JSON owns its own malloc'd b64 strings, so freeing `proc->out_buf` / `proc->err_buf` is safe). The 30-minute cleanup in `alloc_process()` stays as the safety net for callers who launched and never polled. New regression test runs 100 short execs with poll-until-exited and asserts every single one succeeds; sabotage-verified the test catches the bug at iteration 64 without the fix.

  Behavioral change: a `guest-exec-status` call against a PID that already received its terminal status now returns `InvalidParameter`. The QGA spec does not guarantee idempotent terminal polling and we never documented it; the common pattern is "poll until exited, then move on."
- `docs/TESTING_HARNESS.md` unblocked for contributors. The Profile B verify.sh download URL pointed at the deleted `universal-upgrade-v2.4.4` branch (404'd as soon as PR #6 merged). Updated to `main`. Stale `agent_version = 2.5.0` expectations replaced with "the version you installed." Same applies to evidence-expectation wording further down.
- `docs/COMPATIBILITY.md` introduces Tier 1† (Production-ready, current-artifact retest pending) for 10.4 Tiger and 10.5 Leopard. Runtime evidence on those rows is still v2.4.3 (vit9696's PR #5); the i386 slice's build recipe is unchanged from v2.4.3 → v2.5.x and `scripts/verify-legacy-slices.sh` confirms structural equivalence on every CI build, so the rows remain Tier 1 in spirit but the dagger flags the pending current-release runtime drop. Promotes back to plain Tier 1 once a v2.5.x evidence drop lands.
- Cleanup sweep: tracked source / scripts / workflows no longer reference the deleted `audit.md` and `universal_upgrade.md` files. `docs/design/AGENT_BEHAVIOUR_SPEC.md` and `docs/research/UPSTREAM_NOTES.md` now marked as "historical reference" rather than "in progress"; the work they describe shipped in v2.4.3.
The published release asset is renamed from mac-guest-agent-darwin-universal (v2.5.0) to simply mac-guest-agent (v2.5.1+). Same tri-fat universal binary, shorter name. The shorter name matches what /usr/local/bin/mac-guest-agent will contain post-install — the manual install flow becomes:

    curl -fLO https://github.com/mav2287/mac-guest-agent/releases/latest/download/mac-guest-agent
    sudo mv mac-guest-agent /usr/local/bin/
    sudo /usr/local/bin/mac-guest-agent --install

instead of requiring a per-step rename. Requested by @vit9696 (#4).
Anyone with v2.5.0 URLs (one-day window between v2.5.0 and v2.5.1) must update:
| Was (v2.5.0) | Now (v2.5.1+) |
|---|---|
| …/releases/latest/download/mac-guest-agent-darwin-universal | …/releases/latest/download/mac-guest-agent |
scripts/install.sh --local accepts both names during the transition — the v2.5.0 *-darwin-universal filename is kept in the search list as a recovery fallback so users who downloaded yesterday's release don't need to rename. service.c --update text and all install snippets updated to the new name.
- `producer | short-circuit-consumer` SIGPIPE race class eliminated across all `set -o pipefail` shell scripts. Under pipefail, a pipeline whose right-hand side exits early (`grep -q`, `head -1`, `awk '...{print; exit}'`, `sed 'Nq'`, etc.) SIGPIPEs the producer's next write, which dies with status 141 and propagates non-zero through the pipeline, making the surrounding `if` or `$()` see a "failure" that didn't logically happen. Hit once in the wild as the "PVE: VMID redacted in human output" flake on CI run 26532052157 (commit `d0bde24`, macos-14, 2026-05-27); other instances of the same pattern existed in the codebase and would have been timing-bombs.

  Sites fixed:
  - `tests/test_verify_transports.sh` `assert_contains` / `assert_not_contains` rewritten to use bash `case "$haystack" in *"$needle"*)` pattern matching: pure bash, no subprocess.
  - `scripts/verify-legacy-slices.sh` `slice_min_macosx` / `slice_build_version_minos` awk extracts: removed `; exit` from the awk action and added `flag=0` to clear after first match. awk reads to EOF; otool finishes writing cleanly.
  - `scripts/verify-legacy-slices.sh` gate 3h (`host_statistics64` weak-import check): rewritten in pure bash with a `while IFS= read … <<<"$nm_full"` loop plus `case` matching.
  - `tests/run_tests.sh` all 11 `awk 'NR==1{...; print; exit}'` invocations: `; exit` removed (awk reads to EOF).
  - `tests/run_tests.sh` two `| sed 's/^QMP> //' | head -1` chains: collapsed to `| awk 'NR==1{sub(/^QMP> /,""); print}'` (single awk that reads to EOF).

  Each affected file also gained a banner comment near its `set -o pipefail` line documenting the convention (no `producer | short-circuit-consumer` under pipefail) with a reference to this CI incident, so future contributors don't re-introduce the pattern. Verified locally with 5x consecutive runs of `make test`, `./scripts/verify-legacy-slices.sh`, `./tests/run_tests.sh`, and `./tests/test_legacy_slice_gate.sh`; every run clean.
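
The pipe-free matching described above can be sketched as follows. This is an illustration under stated assumptions, not the code from `tests/test_verify_transports.sh`: only the helper names and the `case "$haystack" in *"$needle"*)` pattern come from this entry; the argument order and return-code convention are hypothetical.

```shell
# Racy shape under `set -o pipefail` (shown as a comment, not executed):
#   if some_producer | grep -q "needle"; then ...
# grep -q exits on the first match; the producer's next write can then get
# SIGPIPE (status 141), and pipefail reports the whole pipeline as failed.

# Pipe-free alternative: match inside the shell. No subprocess, no pipe,
# so there is nothing that can be SIGPIPEd.
assert_contains() {
    haystack=$1
    needle=$2
    case "$haystack" in
        *"$needle"*) return 0 ;;   # quoted needle is matched literally
        *)           return 1 ;;
    esac
}

assert_not_contains() {
    if assert_contains "$1" "$2"; then
        return 1
    fi
    return 0
}

# Usage: capture the producer's complete output first, then match in-process.
output=$(printf 'line one\nVMID redacted\nline three\n')
if assert_contains "$output" "VMID redacted"; then
    echo "ok: found"
fi
```

Quoting `"$needle"` inside the pattern matters: it keeps characters such as `*` or `?` in the needle from being treated as glob operators, so the helper performs a plain substring test.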
- `--dry-run` flag for `--install` / `--uninstall` / `--update`. Plumbed through `src/service.c` so the three handlers gate every side-effect (filesystem writes, file copies, `unlink`, `rename`, `launchctl` calls) on the flag and print "DRY RUN: would ..." lines instead. Root check is also skipped in dry-run because no privileged operations execute. Non-destructive validation (binary path existence, executable bit) still runs, so a `--update /no/such/file --dry-run` invocation fails fast with the right error, exactly as the real `--update` would. Pairs with `scripts/install.sh --dry-run` (added in v2.5.0): the script side covers download / path resolution / cp + chmod planning, the binary side covers the LaunchDaemon plist write, log rotation config, and launchctl load/start. Together they give end-to-end smoke-testability of the install flow without root or a clean VM. Help text + manpage + `docs/CLI.md` updated.
- `cfg.method` config field and `-m` / `--method` CLI flag. The field was already vestigial in v2.5.0: VirtIO transport was removed, leaving `auto` and `isa-serial` as functionally identical synonyms with no behavior to gate (channel selection in `src/channel.c` `known_devices[]` is ISA-only regardless). v2.5.1 removes the field from `struct config`, removes `DEFAULT_METHOD`, drops the `-m` / `--method` flag (getopt returns "unknown option"), drops the `method =` line from `--dump-conf` output, and updates help text + `configs/qemu-ga.conf` accordingly. Use `-p PATH` / `path = /dev/cu.serial1` (which already exists) for explicit device-path override.

  Migration: existing `/etc/qemu/qemu-ga.conf` files that still contain `method = auto`, `method = isa-serial`, or `method = virtio-serial` lines will continue to parse: the parser accepts the key and emits a one-time notice on stderr ("the `method` config key was removed in v2.5.1 and is ignored …") pointing the user at removing the line. No exit, no error. The v2.5.0 hard-rejection of `method = virtio-serial` softens to the same deprecation notice because the field no longer has any behavior to misconfigure.

  Why now: the ISA-only transport decision in v2.5.0 collapsed the field's value space from three distinguishable options (`auto` / `isa-serial` / `virtio-serial`) to one (any value → ignored), making the surface honest about there being no choice. Keeping the field cost ~15 lines of code spread across `src/main.c` plus a doc paragraph explaining why it existed but did nothing.
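
For operators who want to silence the deprecation notice, removing the stale line is enough. The helper below is a minimal sketch, assuming the one-`key = value`-per-line layout shown in the examples above; the function name `strip_method_key` and the `.bak` backup suffix are hypothetical and not part of the project.

```shell
# Hypothetical helper: drop obsolete "method = ..." lines from a
# qemu-ga.conf-style file, keeping a backup next to it.
strip_method_key() {
    conf=$1
    if [ ! -f "$conf" ]; then
        echo "no such file: $conf" >&2
        return 1
    fi
    cp -p "$conf" "$conf.bak" || return 1
    # Read from the backup and rewrite the original in place, which keeps
    # the original file's ownership and mode.
    grep -v '^[[:space:]]*method[[:space:]]*=' "$conf.bak" > "$conf"
    status=$?
    # grep exits 1 when every line was filtered out; only >1 is a real error.
    if [ "$status" -gt 1 ]; then
        cp -p "$conf.bak" "$conf"
        return 1
    fi
    return 0
}
```

Called as `strip_method_key /etc/qemu/qemu-ga.conf` with sufficient privileges, it leaves every other key untouched; the pattern requires `=` right after `method` (ignoring whitespace), so longer key names that merely start with `method` are not removed.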
The release ships a single binary: mac-guest-agent-darwin-universal (i386 + x86_64 + arm64 in one tri-fat Mach-O; dyld picks the right slice at load time). The previous per-architecture assets are gone:
- `mac-guest-agent-darwin-amd64` → removed
- `mac-guest-agent-darwin-arm64` → removed
- `mac-guest-agent-darwin-i386` → removed (was Makefile-only, never officially published)
Anything pinning the old URL — install scripts, Ansible/Salt/Chef recipes, CI jobs, IaC, package manifests — must update to:
https://github.com/mav2287/mac-guest-agent/releases/latest/download/mac-guest-agent-darwin-universal
One download URL now covers macOS 10.4 Tiger through 26 Tahoe. The version bump to 2.5.0 (rather than a 2.4.4 patch) reflects that this is a backward-incompatible release-shape change, not a drop-in patch.
The agent now supports ISA serial only; the VirtIO transport fallback that v2.4.x carried in src/channel.c known_devices[] has been removed. The new contract:
- `known_devices[]` is ISA-only (`/dev/cu.serial1`, `/dev/cu.serial2`, `/dev/cu.serial` and their `/dev/tty.*` counterparts).
- A VirtIO-only VM presents no usable channel: the agent logs a clear error message (`"Found VirtIO serial device (...) but VirtIO transport was removed in v2.5.0 — this agent now requires ISA serial. Reconfigure your hypervisor..."`) and exits.
- `method = virtio-serial` is rejected at config-parse time with the same explanation; `method = auto` (default) and `method = isa-serial` continue to work.
- CLI: `-m virtio-serial` is rejected the same way.
Why: VirtIO was always a footgun on Apple Virtualization.framework hosts (UTM Virtualize mode, vz_run, anything VZVirtualMachine-backed) where Apple's own 18-command AppleQEMUGuestAgent claims the channel and silently intercepts traffic; v2.4.x kept it as a fallback for the narrow case of "plain QEMU configs without ISA UART," which produced the surprising behavior that the same install behaved differently depending on host class. Restricting to ISA closes that ambiguity and makes a disk image moving between QEMU and VZ-backed hosts keep working without reinstall.
Migration from v2.4.x:
- PVE: `qm set <vmid> --agent enabled=1,type=isa` (already the documented setup).
- libvirt: add an `isa-serial` device to the domain XML; remove any `virtio-serial` agent channel.
- UTM: in VM settings, Devices → Serial → set Interface to QemuGuestAgent (ISA-backed). Remove any VirtIO Serial Interface.
- Raw QEMU: add `-device isa-serial` to the command line; remove `-device virtio-serial-pci` if it was the agent channel.
- UTM Virtualize backend on Apple Silicon: no ISA option exists. Switch the VM to UTM's Emulate (QEMU) backend, or accept Apple's built-in 18-command agent (no freeze) on the VirtIO channel.
After reconfiguring the hypervisor, fully stop and restart the VM (QEMU device changes need a full restart, not a guest reboot).
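
One quick way to confirm the migration from inside the guest is to look for any of the ISA device nodes named in the contract above. The sketch below is illustrative only: the function name `find_isa_device`, the probe order, and the optional `dev_root` argument (present so the check can be tried against a scratch directory) are assumptions, not the agent's `known_devices[]` logic.

```shell
# Hypothetical post-migration check: print the first ISA serial node found.
# Device names are the six ISA paths listed in the v2.5.0 contract; the
# order in which they are probed here is an assumption.
find_isa_device() {
    dev_root=${1:-/dev}
    for name in cu.serial1 cu.serial2 cu.serial tty.serial1 tty.serial2 tty.serial; do
        if [ -e "$dev_root/$name" ]; then
            echo "$dev_root/$name"
            return 0
        fi
    done
    echo "no ISA serial device under $dev_root; recheck the hypervisor config" >&2
    return 1
}
```

Running `find_isa_device` with no argument probes `/dev`. A non-zero exit after a full VM stop and start suggests the hypervisor is still not exposing an ISA UART to the guest.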
- Fixed (compatibility): `mac-guest-agent-darwin-amd64` v2.4.3 crashed at startup on Mac OS X 10.6 Snow Leopard and 10.7 Lion with `dyld: unknown required load command 0x80000028` (SIGTRAP). The amd64 binary advertised `LC_VERSION_MIN_MACOSX 10.6` but its entry-point load command was `LC_MAIN` (introduced 10.8). The v2.4.3 release pipeline was running on a GitHub Actions runner image carrying Xcode 15.5, which silently clamped the Makefile's `MACOSX_DEPLOYMENT_TARGET=10.6` env var and emitted `LC_MAIN` regardless. Reported by @vit9696 in #4. Fixed by building both legacy slices (i386 + x86_64) against the phracker `MacOSX10.13.sdk` with explicit `-mmacosx-version-min` flags (10.4 for i386, 10.6 for x86_64) AND `-Wl,-ld_classic` to invoke Apple's older linker (Xcode 15-16's new `ld-prime` hardcodes `LC_MAIN` for x86_64 regardless of the min flag; `ld-classic` honors the min flag for entry-point selection). The combination emits `LC_UNIXTHREAD`, which 10.6/10.7 dyld understands. Also added `scripts/verify-legacy-slices.sh`, invoked by both build and release CI workflows, which fails the build on any disallowed load command, off-spec deployment target, unexpected dylib dependency, weak-import attribute regression on `host_statistics64`, or undefined-symbol drift outside the checked-in per-slice baselines. The gate makes the invariant explicit in CI; current implementation depends on `macos-14` runner + `ld-classic` (deprecated) and will need revisiting if Apple removes `ld-classic` or GitHub retires the runner image (canary build on `macos-latest` watches for both).
- Removed: VirtIO entries from `src/channel.c` `known_devices[]`. The auto-detect list went from 14 entries (6 ISA + 8 VirtIO across the various `/dev/cu.virtio*`, `/dev/cu.org.qemu.guest_agent.0`, `/dev/cu.qemu-guest-agent` aliases UTM/QEMU/libvirt expose) down to 6 (ISA only). Added a separate `log_virtio_diagnostic_if_present()` that scans for the removed VirtIO paths only when ISA detect fails, so an upgrading user with a leftover VirtIO setup gets an explanatory error pointing at the migration steps rather than a generic "no serial device found."
- Removed: `method = virtio-serial` (config file) and `-m virtio-serial` (CLI) are now rejected at parse time with a message pointing at the v2.5.0 BREAKING entry. `auto` (default) and `isa-serial` continue to work.
- Changed (release): v2.5.0 publishes a single binary: `mac-guest-agent-darwin-universal`, a tri-fat Mach-O containing `i386 + x86_64 + arm64` slices. dyld picks the appropriate slice at load time: Tiger and Leopard pick i386 (those OSes lack x86_64 user-space support, or in 10.5's case prefer i386); Snow Leopard picks x86_64 when booted with a 64-bit kernel (Xserve / Mac Pro default) or i386 when booted with the 32-bit kernel default on most consumer hardware; Lion through Catalina pick x86_64 (with the `LC_UNIXTHREAD` fix above); Big Sur and later pick arm64 on Apple Silicon (x86_64 on Intel). The thin per-arch binaries (`-i386`, `-amd64`, `-arm64`) are no longer published. One download URL covers all supported macOS versions and architectures. If the universal doesn't start on a specific host, open an issue at https://github.com/mav2287/mac-guest-agent/issues/new; we work each report as a bug.
- Changed: Install URL changed from `mac-guest-agent-darwin-amd64` to `mac-guest-agent-darwin-universal`. Scripts pinning the old URL must update.
- Added:
`scripts/verify-legacy-slices.sh`: CI-callable script that audits per-slice invariants (LC commands, deployment targets, dylib deps, undefined symbols) of the produced universal. Replaces the previous inline `clock_gettime` check; now runs against all three slices and covers more failure modes. Hard-fails the build on any disallowed `LC_REQ_DYLD` command, unknown numeric load command, missing per-slice symbol baseline, or symbol drift outside `tests/legacy_slice_symbols_<arch>.txt`.
- Added: New `surrogate-32bit` CI job builds the portable subset (`protocol.c` + `cJSON.c`) under `gcc -m32` on `ubuntu-latest` via a standalone `tests/surrogate_32bit_main.c` driver and runs portable unit tests under 32-bit code. `selftest.c` is excluded because it drags macOS-specific dependencies (`compat_*`, `run_command_capture`); `log.c` is excluded because it loads `os_log` via `dlfcn` (a macOS-runtime feature with no Linux glibc equivalent); `util.c` is excluded because it `#include`s `compat.h` and uses POSIX surface that needs `_POSIX_C_SOURCE=200809L` on Linux glibc. Catches int-width / struct-layout / endianness regressions in JSON marshaling without depending on access to old Intel Mac hardware.
- Added: `#include <stdint.h>` to `src/util.c` (one line, no behavior change). `SIZE_MAX` was previously visible only via transitive Apple SDK includes; explicit include eliminates that fragility.
- Changed: `Makefile` `build-x86_64` now uses explicit `-mmacosx-version-min=10.6 -isysroot $(LEGACY_SDK)` instead of relying on `MACOSX_DEPLOYMENT_TARGET=10.6` env var (which is toolchain-version-dependent and was the underlying mechanism of the v2.4.3 bug). `build-i386` similarly gets explicit `-mmacosx-version-min=10.4`. `LEGACY_SDK` defaults to `/tmp/MacOSX10.13.sdk` (phracker tarball, SHA256 `1d2984ac…23a5a` pinned in CI); `I386_SDK` aliases it for backward compatibility.
- Changed: `Makefile` `build-universal` now produces a tri-fat binary (i386 + x86_64 + arm64; previously x86_64 + arm64 only).
- Changed: `Makefile` `dist` / `pkg` / `sign` / `dsym` / `help` targets all updated for universal-only distribution. `dist` clears `$(DIST_DIR)` before populating so stale per-arch artifacts from previous builds can't leak into the checksums.
- Changed:
`src/service.c` `--update` flag's instruction text now references `mac-guest-agent-darwin-universal`.
- Changed: `scripts/install.sh` fetches the universal binary; `detect_arch()` removed (no per-arch asset to pick) but architecture validation preserved as `validate_arch()` so unsupported hosts (e.g., PowerPC) fail early with a clear message.
- Changed: `scripts/install.sh --local` now finds the published release asset by its real filename. The pre-v2.5.0 search list only checked `build/mac-guest-agent`, `./mac-guest-agent`, `/tmp/mac-guest-agent-x86_64`, `/tmp/mac-guest-agent`, none of which match what `service.c --update` or the install docs tell users to download (`mac-guest-agent-darwin-universal`). The script now searches `./mac-guest-agent-darwin-universal` and `/tmp/mac-guest-agent-darwin-universal` first, then `build/mac-guest-agent-universal`, then the legacy generic names for recovery flows. Added explicit `--local /path/to/binary` form so the installer never has to guess: `sudo ./install.sh --local /Users/me/Downloads/mac-guest-agent-darwin-universal`. The error message on "no binary found" now lists every searched path and points at the explicit-path form. `--help` updated.
- Updated: Workflow comments in `.github/workflows/build.yml` and `.github/workflows/release.yml` rewritten to name `-Wl,-ld_classic` as the load-bearing dependency (the ld-classic linker is what honors `-mmacosx-version-min` and emits `LC_UNIXTHREAD` on the legacy slices). Previous wording framed the `macos-14` runner pin as the primary mechanism; the pin is just toolchain stability: without `ld_classic` the slices break on any Xcode 15+ regardless of runner image. Documented fallback chain for when Apple removes `ld_classic`: verify canary status, try Homebrew cctools `ld`, or build legacy slices in a container with a frozen older Xcode CLT.
- Changed: `scripts/build-pkg.sh` default arch is now `universal`; per-arch invocation kept for internal testing.
- Changed: `scripts/verify-installer.sh` recommendation collapsed from per-arch (if/elif/elif on macOS version) to single universal-binary line.
- Added: `--self-test-json` `system_info` block now includes a `selected_arch` field reporting which slice of the universal binary dyld actually picked. Useful for verify.sh evidence drops and post-incident forensics.
- Updated: `README.md`, `docs/PVE.md`, `docs/UTM.md`, `docs/COMPATIBILITY.md`, `docs/RELEASE_TEMPLATE.md` install snippets all reference the universal binary as the single download. README has a new "If the agent doesn't start" section that asks users to open a GitHub issue with diagnostic outputs (loader-safe `sw_vers` / `file` / `lipo -info` first; `--self-test-json` / `--version` / log tail only if the binary actually starts). Modern-machine TLS caveat preserved: Tiger / Leopard / older Snow Leopard guests usually need to download on a modern machine and transfer the file.
- Unified host-side verifier (`scripts/verify.sh`). Replaces the PVE-only `scripts/pve-verify.sh` with a single auto-detecting verifier that covers Proxmox VE, libvirt, UTM, and any raw-QEMU host with a QGA Unix socket. Each transport reaches host-driven QGA commands and `guest-exec` polling through the same five-primitive plugin interface, so the check pipeline (Configuration → VM State → Agent Communication → Memory → Host Environment → Multi-cycle Freeze/Thaw → In-VM Diagnostics) is identical regardless of hypervisor. PVE auto-detected via `qm` + `/etc/pve/qemu-server/<id>.conf`; libvirt via `virsh dominfo`; UTM via `utmctl status` (with QGA-serial socket discovered from the `.utm` bundle plist); raw QEMU via `--qga-socket PATH`. Per-transport preflights: PVE (root, cluster locality, backup-lock); libvirt (libvirtd reachability); UTM (refuses root, requires QGA serial configured in UTM GUI); qga-socket (path is a real Unix socket). Auto-thaw safety trap on EXIT/INT/TERM. `freeze_dispatch` static-contract check against the agent binary catches drift between `fs_dispatch_class()` and `docs/design/FREEZE_SEMANTICS.md`. Multi-cycle freeze (default 3) catches state-leak bugs the prior single-cycle check missed. Mount-dispatch cross-check compares the frozen count to the captured mount table. JSON appendix schema bumped to 2.0 with `host_environment` (sw_vers / hardware / kexts / ioreg-serial-nodes / parsed mount table / launchd / log-file stat), `freeze_cycles_log` (per-cycle structured records), `mount_dispatch_crosscheck`. PII (IPv4 / MAC / supplied identifier) redacted by default. 57-assertion shell-shim test suite (`make test-verify-transports`) covers all four transports without requiring a real hypervisor. Docs swept: `COMPATIBILITY.md` Step 2, `UTM.md`, `LIBVIRT.md`, `PVE.md`, `evidence/README.md` (with full schema 2.0 field table). `pve-verify.sh` deleted; no shim. Tracked as Phase 4 in `docs/PLAN.md`.
- Added: `docs/mac-guest-agent.8` is now generated from `docs/mac-guest-agent.8.in` at build time, with `@VERSION@` substituted from the Makefile and `@DATE@` from the build's month/year. New Makefile rule (`docs/mac-guest-agent.8: docs/mac-guest-agent.8.in Makefile`) regenerates whenever either input changes. The `build` target depends on it so a plain `make build` also keeps the manpage fresh. New CI step in `.github/workflows/build.yml` runs the regeneration and `git diff --quiet`s the result; it fails CI with a clear "manpage is stale" error if a `VERSION` bump landed without regenerating. Prevents the audit-finding-4-style drift (manpage at 2.2.0 while Makefile at 2.4.2) from ever recurring. The generated `.8` is still tracked in git so `raw.githubusercontent.com` fetches and OS package builds that don't run `make` first still get a usable manpage. Build-side only; no runtime change; no Tiger concern (the build runs on a developer's modern macOS, not on Tiger).
- Added: Test-mode `MGA_HOOK_DIR_OVERRIDE` env var + `tests/run_tests.sh` integration test that locks the freeze-hook abort contract from audit finding 5. `src/cmd-fs.c` `HOOK_DIR` was a hardcoded `#define` to `/etc/qemu/fsfreeze-hook.d`; replaced by a `hook_dir()` getter that honors `MGA_HOOK_DIR_OVERRIDE` ONLY when `test_mode` is enabled (set exclusively by `--test` flag at startup, never attacker-controlled). The script-ownership validation in `run_hooks()` (script must be uid 0, parent dir must be uid 0) is similarly bypassed in test mode, because test fixtures live in `/tmp` owned by the test runner. World-writable + executable checks stay enforced in both modes (correctness, not security). Two new test cases: (1) a freeze hook that exits non-zero must produce a `GenericError` with description `Freeze hook script failed`; (2) a freeze hook that exits 0 must NOT abort the freeze (proves the abort is gated on the non-zero exit specifically, not on the mere presence of a hook). Locks finding 5 against future drift. Tiger-compat: `getenv()` is POSIX-ancient, no new APIs. i386/10.4 cross-build clean.
- Fixed: `.github/workflows/build.yml` ASAN-integration step previously had `./tests/run_tests.sh ./build/mac-guest-agent-asan || true`, so the job passed even if every integration test failed under ASAN, silently masking any sanitizer-detected bug. Removed the `|| true` so ASAN integration failures fail CI. Verified locally that the ASAN binary (`-fsanitize=address,undefined`) passes the full 75-test integration suite on macOS 26 with the audit-finding fixes in place. Addresses audit.md finding 7.
- Added: `.github/workflows/build.yml`: `make test-verify-transports` now runs in the test-matrix job (macOS 14 / 15 / latest) so the shell-shim integration tests catch regressions in CI.
- Added: `tests/test_verify_transports.sh`: shell-shim integration test suite for `scripts/verify.sh`. Mocks `qm` + `pvesh` (PVE), runs a real Perl-driven QGA Unix-socket listener (qga-socket / UTM socket I/O), and exercises CLI surface, transport-plugin wiring, JSON appendix schema 2.0, all six optional flags (`--no-redact`, `--no-appendix`, `--no-in-vm`, `--no-env-capture`, `--no-freeze`, `--freeze-cycles N`), redaction (with both raw-present-when-disabled and raw-absent-when-enabled assertions), mount-dispatch cross-check arithmetic (expected vs actual frozen count), and multi-cycle freeze recording (verifies `freeze_cycles_log` length matches `--freeze-cycles`). 57 assertions, ~0.5s runtime, no real hypervisor required. Wired into `make test` via the new `test-verify-transports` Makefile target.
- Added:
scripts/verify.shmulti-cycle freeze test + mount-dispatch cross-check +--no-freezeopt-out. The Freeze/Thaw section now runs--freeze-cycles N(default 3) consecutive freeze/thaw cycles instead of one — catches state-leak bugs between cycles, which the single-cycle check missed by construction. Each cycle records its own structured entry (cycle number, frozen count, thawed count, fsfreeze-status outcome, behavioural-check outcome, post-thaw outcome, the matchingFilesystem frozen:log line) in a newfreeze_cycles_logarray in the appendix. The pre-cycle freeze auto-thaw safety trap re-arms and disarms per cycle so a kill between cycles still thaws cleanly. After the last cycle, amount_dispatch_crosscheckruns: from the captured mount table (Host Environment section), it counts mounts whosefstypeis NOT in{smbfs, afpfs, nfs, webdav, ftp, devfs, autofs, fdesc, volfs, synthfs, lifs}and compares to the last cycle's frozen count. PASS if the count is 1..2× the expected (loose because APFS containers can produce more snapshot rows than mount rows and ZFS datasets pad too); FAIL if 0 or grossly over; INFO if the expected count isn't derivable (env-capture off, malformed mount table). New--no-freezeflag skips the section entirely for contributors who don't want to freeze a production-ish VM — gives ~80% of the evidence. New--freeze-cycles Nflag is validated (must be a positive integer) at parse time. - Added:
scripts/verify.shHost Environment capture section + JSON appendix schema bumped to 2.0. New section runs before Freeze/Thaw (so the captured mount table reflects pre-freeze state and isn't blocked by the freeze command allowlist) and probes the guest viatransport_guest_exec_jsonfor:sw_vers -productName -productVersion -buildVersion;sysctl -n hw.model hw.ncpu hw.memsize machdep.cpu.brand_string(machdep.cpu.brand_string is populated on both Intel and Apple-silicon hosts);kextstatfiltered toApple16X50Serial/AppleVirtIO/IOSerialFamilyfamilies;ioreg -l -w 0filtered to serial/virtio nodes (capped at 8 KB);mountparsed into[{device, mount_point, fstype, options}];launchctl list com.macos.guest-agent;stat -f "size=%z mtime=%Sm name=%N"on the agent log file. All captured pieces assembled into a singlehost_environmentobject embedded in the appendix. Schema bumped to 2.0 (additive change — every 1.0 field is preserved; downstream consumers ignore the new fields are still compatible). New--no-env-captureflag opts out for cases where guest-exec is slow or only host-driven checks are wanted. The script'sgx_capturehelper is a thin wrapper aroundtransport_guest_exec_jsonthat extracts theout-datatext, used here and by the existing in-VM diagnostics section. - Added:
scripts/verify.sh UTM transport via plist-based socket discovery, plus a generic qga-socket transport for raw QEMU / custom installs. UTM ships utmctl but no arbitrary-QGA subcommand, so the transport talks to the QGA Unix socket directly (the same socket utmctl exec uses). Discovery reads ~/Library/Containers/com.utmapp.UTM/Data/Documents/<name>.utm/config.plist via plutil -convert json -o -, finds the Serial entry with Interface == "QemuGuestAgent", and uses its Path. If discovery fails (no QGA serial configured), it errors with the exact UTM GUI steps to add one; the .utm bundle is never mutated. --qga-socket PATH overrides discovery entirely. Socket I/O uses Perl IO::Socket::UNIX + JSON::PP (core macOS modules) to avoid BSD nc/socat version quirks; one helper pair (_qga_socket_cmd + _qga_socket_guest_exec_json) is shared by both transports. Preflight refuses to run as root because the UTM socket is owned by the desktop user. The qga-socket transport requires --qga-socket PATH, validates that the path is an actual Unix socket, and probes via guest-ping for VM-state detection (no hypervisor metadata to inspect). Both transports return PVE-shape-compatible envelopes from guest_exec_json (base64-decoded out-data/err-data) so the in-VM diagnostics section is transport-agnostic. Auto-detection extended: utmctl status <id> exits 0 → UTM; --qga-socket PATH set → qga-socket.
- Added:
scripts/verify.sh libvirt transport. Driven by virsh qemu-agent-command; reaches guest-exec + guest-exec-status via the same channel and base64-decodes out-data/err-data into the same envelope shape PVE's qm guest exec --output-format json produces, so the check pipeline above the transport layer is identical between PVE and libvirt. QGA responses are unwrapped from libvirt's {return: ...} envelope so downstream json_query calls work with the same $d->{field} shape PVE provides; error envelopes pass through unchanged so the content-based behavioural-freeze check still sees $d->{error}->{desc}. Auto-detection: virsh on PATH and virsh dominfo <id> exits 0. Preflight: virsh + perl + base64 present, libvirtd socket reachable (root or libvirt-group membership; honours LIBVIRT_DEFAULT_URI), domain exists. Config check looks for the documented org.qemu.guest_agent.0 virtio-serial channel in virsh dumpxml; without it the in-guest agent has nothing to talk to and the verifier flags it before any real test runs. Discard/SSD-emulation hints are best-effort-grepped from the disk XML.
- Added:
scripts/verify.sh— unified, multi-transport host-side verifier replacingscripts/pve-verify.sh. Auto-detects PVE (libvirt and UTM land in subsequent commits) from the host environment, or accepts--transport pve|libvirt|utm|qga-socketexplicitly. Transport plugin architecture: every transport implements five primitives (transport_describe,transport_vm_state,transport_config_summary,transport_qga_cmd,transport_guest_exec_json); the check pipeline is transport-agnostic above that layer. PVE transport ships in this commit and is a direct port of the priorpve-verify.shflow (config / VM-state / agent comms / memory / freeze-thaw with content-based behavioural check / in-VM--self-test-json+--safe-test-json/ freeze-log fetch / JSON appendix), plus three new safety preflights — root check, PVE cluster locality check (refuses to run when the VM lives on a different node), and PVE backup-lock check (refuses to run whenvzdumpis in progress). New auto-thaw safety trap fires onEXIT/INT/TERMand issuesfsfreeze-thawif the script is killed between freeze and thaw — the agent has its own 10-minute auto-thaw safety net, this just makes recovery immediate. PII redaction (IPv4 / MAC / supplied identifier) reimplemented in Perl rather thansed -Ebecause BSD sed on macOS doesn't support\bword boundaries; one redaction implementation works on Linux and macOS hosts (relevant for the upcoming UTM transport). - Removed:
scripts/pve-verify.sh deleted (no compatibility shim). The single previous user is on the same branch as this commit; superseded entirely by scripts/verify.sh --transport pve.
- Added:
scripts/pve-verify.shPhase 3 one-shot rewrite. A single host-side invocation (./pve-verify.sh <vmid>) now produces the full Tier-2 → Tier-1 evidence: the existing host-side checks (config, VM state, agent ping/get-osinfo/network/info, agent-sourced memory, freeze/thaw round-trip), plus in-VMmac-guest-agent --self-test-jsonand--safe-test-jsondriven viaqm guest exec --output-format json(no need to SSH into the guest or run anything manually inside the VM), plus a tail of the agent log for the per-eventFilesystem frozen: ...INFO line that summarises the per-treatment breakdown (Phase 2 Q3). Output is a human-readable text report followed by a structured JSON appendix that contributors paste straight intodocs/evidence/<version>/pve-verify.json— appendix embeds the in-VM JSON outputs as parsed objects (in_vm_selftest,in_vm_safetest) plus host-side check records (host_checks) and the freeze-event log line (freeze_log_tail). PII (IPv4 addresses, MAC addresses, supplied VM ID) redacted by default; new flags--no-redact,--no-appendix,--no-in-vm,--agent-path,--log-path,--exec-timeout,--help. Static-contract check on the in-VMfreeze_dispatchblock: verifies the agent advertisesper_fstypename.apfs = "tmutil_snapshot+f_fullfsync"andcpustats_discriminator = "linux"(Phase 2 Q3/Q4) so contract drift between the binary anddocs/design/FREEZE_SEMANTICS.mdbecomes a visible verifier FAIL. Implements all ofdocs/PLAN.mdPhase 3. - Fixed:
scripts/pve-verify.sh frozen-state behavioural check inspected the qm agent get-osinfo exit code, which is structurally unreliable. Per docs/research/UPSTREAM_NOTES.md Target 4, PVE's register_command dispatcher (used by qm agent <cmd>) wraps QGA errors as {result:{error:{...}}} and the CLI exits 0 regardless of whether the agent answered or refused. The check now inspects response content: presence of "pretty-name" → FAIL (agent answered while frozen); presence of "Command not allowed while filesystem is frozen" or "error" → PASS (genuinely gated); anything else → INFO with the truncated raw response. The same content-inspection rule is applied to the post-thaw "agent responds normally again" check (looks for "pretty-name"). Robust regardless of whether future PVE versions change the wrapper.
- Updated:
docs/COMPATIBILITY.md "Step 2: Runtime Validation" and docs/evidence/README.md per-version layout updated for the one-shot flow. Step 2 is now install-in-VM (one-time) + pve-verify.sh on the host (everything else); the previous "run two commands in the VM" step is gone. Evidence layout prefers pve-verify.txt + pve-verify.json (split at the JSON Appendix header in the script output); the legacy three-file layout (selftest.json, safetest.json, pve-verify.txt) is still accepted, with no rewrite of existing per-version directories.
- Added:
--self-test-jsonnow emits afreeze_dispatchJSON sibling ofsystem_info. Surfaces the per-f_fstypenamedispatch policy table (apfs →tmutil_snapshot+f_fullfsync, hfs →f_fullfsync, FAT/exFAT/UDF/NTFS →f_fullfsync_with_enotsup_tolerated, ZFS →zfs_snapshot_if_cli_else_f_fullfsync, network →skip_network, special →skip_special), the default log path and the per-event INFO-line prefix thatscripts/pve-verify.shgreps for, azfs_cli_availableboolean (resolved via the samefind_zfs_cli()cache as the freeze path), the three documented divergences from upstream QGA (idempotent_re_freeze, nopersistent_frozen_state_marker, nologging_disabled_during_freeze), and thecpustats_discriminator("linux") so a verifier can statically check that the wire shape ofguest-get-cpustatsmatches what the agent advertises. Lets contributors and PVE-side tooling introspect the agent's freeze policy without having to run a real freeze. Implementsdocs/design/AGENT_BEHAVIOUR_SPEC.mdQ3. Backed by two newtests/run_tests.shcontract checks: (1)freeze_dispatchblock shape + dispatch-table values, (2)guest-get-cpustatsdiscriminator round-trip against the advertised value. - Fixed:
scripts/pve-verify.sh memory check reported PASS memory reporting: 0GB / 0GB. It read PVE's host-side QMP/balloon counters (blank for macOS guests, because macOS ships no virtio-balloon stats driver) by scraping the pvesh text table, and printed PASS without validating the parsed values. Rewritten: memory now comes from the guest agent itself (get-memory-block-info + get-memory-blocks), with real used/total derived from its data; agent JSON is parsed with Perl JSON::PP instead of grep-on-text; the result model is fail-closed, so no check prints PASS on data it could not parse; added a qm/perl preflight and a VM-running check.
- Fixed:
scripts/pve-verify.sh: the type=isa config check required enabled=1 to appear before type=isa on the config line, producing a false FAIL when Proxmox wrote the agent options in the other order; the two options are now matched independently. The freeze/thaw checks passed on any digit in the output, including a zero-filesystem freeze; they now require a parsed count of at least 1.
- Added:
scripts/pve-verify.sh freeze check now verifies the frozen state behaviourally: while frozen, the agent must reject a non-freeze command (get-osinfo), and must resume normal operation after thaw. It no longer trusts fsfreeze-status, which only echoes the agent's internal frozen flag. macOS has no FIFREEZE, so the rejection behaviour is the observable proof that the freeze took effect.
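The content-based rule used by the frozen-state check in the pve-verify.sh entries above can be sketched in a few lines. The verifier itself is a shell script, so this C version is only an illustration of the decision rule; `verdict_t` and `classify_frozen_response` are hypothetical names, and matching the quoted `"error"` key rather than the bare word is an assumption.

```c
#include <string.h>

/* Hypothetical verdicts mirroring the verifier's PASS / FAIL / INFO outcomes. */
typedef enum { VERDICT_PASS, VERDICT_FAIL, VERDICT_INFO } verdict_t;

/*
 * Classify the raw response to get-osinfo issued while the guest is frozen.
 * The CLI exit code is ignored on purpose: only response content is inspected.
 */
verdict_t classify_frozen_response(const char *resp)
{
    if (resp == NULL)
        return VERDICT_INFO;
    /* The agent answered normally while frozen: gating did not take effect. */
    if (strstr(resp, "\"pretty-name\"") != NULL)
        return VERDICT_FAIL;
    /* The agent refused the command: the freeze is genuinely gated. */
    if (strstr(resp, "Command not allowed while filesystem is frozen") != NULL ||
        strstr(resp, "\"error\"") != NULL)
        return VERDICT_PASS;
    /* Anything else is surfaced as INFO together with the raw response. */
    return VERDICT_INFO;
}
```

Checking for `"pretty-name"` first means a response that somehow contains both a normal answer and an error key is still reported as FAIL, which is the conservative reading of the rule.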
- Added:
docs/evidence/10.11.6/: first real-world v2.4.3 evidence drop. Apple Xserve3,1 (real bare metal, not emulated), Mac OS X 10.11.6, HFS+ on /dev/disk0s2, PVE host-side verify.sh run reporting 38 passed / 0 failed. Captured host_environment (sw_vers / Xserve3,1 hardware / Intel Xeon W5590 / 8 GiB / IOSerialFamily v11 + Apple16X50Serial v3.2 + Apple16X50ACPI v3.2 / mount table / launchd state / log-file stat), three freeze cycles with per-cycle log line (sync + F_FULLFSYNC on the HFS+ root, 4 special FS categorically skipped, matching docs/design/FREEZE_SEMANTICS.md), mount-dispatch cross-check passed, freeze_dispatch JSON contract validated against the binary, --self-test-json 20/0/0, --safe-test-json 21/21. COMPATIBILITY.md row for 10.11 El Capitan refreshed with the v2.4.3 evidence reference.
- Fixed:
scripts/verify.sh: two real-world fixes from the El Cap evidence run. (1) qm guest exec --output-format json is not supported on some PVE versions in the wild (returns 400 unable to parse option); the default output is already JSON, so the flag was dropped. The silent failure mode previously reported "guest-exec failed (binary missing or guest-exec disabled?)" even though guest-exec worked, which was misleading. (2) --safe-test-json's real wire shape is top-level {passes, failures, status, agent_version, test} with no nested summary and no total; the parser was looking for summary.passed / summary.failed. The parser now synthesises total = passes + failures and adds the status field to the human message. The shell-shim test fixture was updated in lockstep to emit the real shape (it had been emitting a fabricated shape, so the integration suite was passing for the wrong reason).
- Fixed: Docs and code disagreed about freeze-hook failure handling:
SECURITY.mdandconfigs/hooks/README.mdboth stated "a hook script failure does not abort the freeze", butsrc/cmd-fs.c run_hooks()has aborted the freeze on freeze-hook non-zero exit since at least v2.4.0 (if (strcmp(action, "freeze") == 0) failed = 1). The code is the right default — a hook's purpose is typically to flush in-flight writes for backup consistency (FLUSH TABLES WITH READ LOCK,CHECKPOINT,BGSAVE), so a failed flush means the snapshot is inconsistent for that workload and surfacing it as a freeze failure is the safer choice (matches the fail-secure / strict-default pattern established in Phase 2 / findings 1–4 here). Aligned both docs to the code. Also documented the asymmetric thaw-hook behaviour (thaw hooks log on non-zero but always proceed — refusing to thaw would leave the VM filesystem indefinitely frozen) and the validation-failure path (wrong owner / world-writable / not-executable scripts are skipped before they run; that's a configuration error, distinct from a runtime non-zero exit). Addresses audit.md finding 5. The audit's "add an integration test with a failing hook" suggestion is deferred — requiresHOOK_DIRto be runtime-overridable (currently a hardcoded#defineto/etc/qemu/fsfreeze-hook.d) AND test-mode bypass of the root-ownership validation; both are real changes that haven't been requested for the next release. - Added:
docs/design/FREEZE_SEMANTICS.md— single source of truth defining whatguest-fsfreeze-freezeandguest-fsfreeze-freeze-listactually do perf_fstypenameon macOS, what each treatment guarantees, the five documented divergences from upstream QEMU Guest Agent (idempotent re-freeze, no persistent frozen-state marker, logging-during-freeze,guest-sync-idextension, foreign-FSF_FULLFSYNCfailure treatment), the freeze-time command allowlist contract, and the must-surface vs by-design failure-mode classification. The dispatch table is the same one expressed infs_dispatch_class()and surfaced in--self-test-json'sfreeze_dispatchblock — one source of truth (the code), one verbatim copy in the doc, one verbatim copy in the JSON envelope. Linked fromREADME.mdDocumentation index and fromdocs/BACKUP.md"What 'freeze' means per filesystem". Implementsdocs/design/AGENT_BEHAVIOUR_SPEC.mdQ7c. - Fixed:
docs/PVE.md"Accurate Memory Reporting Without Balloon Driver" section misleadingly implied that installing the agent makes PVE's web UI memory gauge accurate on macOS guests. It doesn't —pvestatdand the PVE web UI source per-VM memory from the virtio-balloon device's stats vq (when populated) or from the cgroup RSS of the QEMU process scope (always), and they never call the guest agent for memory. macOS ships no virtio-balloon driver on any version, so the balloon path is empty and the gauge falls back to cgroup RSS — and installing this agent doesn't change that. Section retitled "Memory reporting on macOS guests" and rewritten to honestly describe: (a) what the gauge actually reads (cgroup RSS, structural balloon-driver limitation), (b) what this agent does provide (guest-side memory view viaguest-get-memory-blocks/guest-get-memory-block-info, consumable directly viaqm agent <vmid> get-memory-blocksor rendered byscripts/pve-verify.sh), (c) why reclamation is impossible regardless (no balloon driver to inflate). Cross-referenced fromdocs/research/UPSTREAM_NOTES.mdTargets 5 and 7. Implementsdocs/design/AGENT_BEHAVIOUR_SPEC.mdQ7a. - Fixed:
README.md(two callouts),docs/COMPATIBILITY.md(new "ISA Serial Transport — Why" section + architectural-transitions row), andsrc/channel.c(known_devices[]ISA-block comment) — the "ISA because Apple claims VirtIO" rationale was right for Apple Virtualization.framework hosts (UTM/vz_run, where Apple'sAppleQEMUGuestAgentis IOKit-launched on the VirtIO console channel via theAppleVirtIOAgentDevicematch set byapplevirtio.console) but oversimplified for plain QEMU/KVM hosts (Proxmox/libvirt/raw QEMU, typically OpenCore-booted, whereapplevirtio.consoledoesn't load and Apple's agent never launches — leaving the VirtIO console channel actually free). All four sites updated to articulate both host classes and explain why we still default to ISA universally: one transport across both host classes (no IOKit introspection at startup, identical launchd plist and channel-detection list everywhere), and no conflict if a disk image is moved between QEMU and VZ. Backed bydocs/research/UPSTREAM_NOTES.mdTarget 6 (local Mach-O symbol survey of/usr/libexec/AppleQEMUGuestAgenton macOS 26.5). Implementsdocs/design/AGENT_BEHAVIOUR_SPEC.mdQ7b. - Updated:
docs/BACKUP.md— "How Freeze Works" and "Freeze Methods by macOS Version" sections reflect the v2.4.3 per-FS dispatch (replacing the prior "10.4–10.12: sync+F_FULLFSYNC / 10.13+: sync+F_FULLFSYNC+APFS snapshot" two-row table that hid the foreign-FS, ZFS, network-mount, and special-FS treatments). The stale "Note onguest-fsfreeze-freeze-list: This command accepts a mountpoint list parameter but currently freezes all filesystems regardless — the mountpoint filter is not yet implemented" is replaced with a description of the new subset-freeze handler (including the deliberate skip of the container-level APFS snapshot for subset requests). The freeze-time command allowlist is now spelled out (9 commands; the upstream 6 plus three documented divergences) instead of the vague "ping, sync, info, freeze/thaw allowed". All elaboration links back todocs/design/FREEZE_SEMANTICS.mdas the canonical reference. - Updated:
docs/COMPATIBILITY.md— promoted 10.4 Tiger to Tier 1 after @vit9696's v2.4.2 confirmation (issue #2): agent serves PVE end-to-end on 10.4.11 (ping, get-osinfo, network, memory, reboot/shutdown). Matches the convention 15.7 Sequoia already set (Tier 1 with freeze untested). - Updated:
docs/COMPATIBILITY.md "Step 2: Runtime Validation" sequence now points at scripts/pve-verify.sh (one-shot host-side validation with agent-sourced memory + behavioural freeze check) and the modern --self-test-json + --safe-test-json in-VM diagnostics, replacing the older tests/safe_test.sh reference. Added a note on how external contributors submit results (issue comment or PR under docs/evidence/<version>/).
- Fixed:
docs/CLI.md Device Auto-Detection section listed the probe order as VirtIO → UTM → ISA. The code in src/channel.c has been ISA-first since v2.1.0, deliberately, because Apple's built-in VirtIO guest agent on Big Sur+ claims the VirtIO channel and ISA is the only one it leaves alone. Reordered the doc to match the code and the v2.1.0 rationale.
- Added:
docs/evidence/ directory with a README defining the per-version layout (selftest.json, safetest.json, pve-verify.txt, optional NOTES.md) and the submission flow referenced from the reply to issue #2, so contributors land on a real path with format guidance instead of an empty directory.
- Added:
docs/PLAN.md: phased roadmap (research → configuration matrix and intent design → one-shot validator) covering the deeper freeze/gating/foreign-FS gaps surfaced by @vit9696's Tier-2 submission on 10.4.11. Scaffolded docs/research/UPSTREAM_NOTES.md to capture Phase 1 evidence (QGA spec, Linux reference impl, PVE wrapper behaviour, etc.) before any code change.
-
Improved:
ssh_safe_write_file()now uses an atomic temp-file-plus-rename pattern instead of open-truncate-then-write. The prior implementation refused to follow a symlink at the target (good — closed the audit finding 6 privesc) but left two operational weaknesses: (a) if the agent crashed mid-write, the user'sauthorized_keyswas permanently corrupted and SSH access was lost until manually restored; (b) any reader (sshd checking the file for the next authentication, a backup utility, anything) racing the write would see a partial file. The new pattern writes to<dir>/.<basename>.tmp.<pid>viaO_WRONLY|O_CREAT|O_EXCL|O_NOFOLLOW,fchown/fchmodby fd, write loop, thenrename(2)over the target —rename()is POSIX-atomic for same-directory same-filesystem operations, so readers never see a partial file, and a mid-write crash leaves the target's prior content intact (only a stray dot-file in.ssh/that the operator can clean up). As a side benefit, a pre-positioned symlink at the target is now atomically replaced with a regular file rather than the write being refused — keeps the user's SSH access working instead of leaving the broken symlink in place. Same Tiger-compat constraints as the rest of audit finding 6's hardening: every primitive (O_EXCL,rename,unlinketc.) is POSIX.1-2001, i386/10.4 cross-build clean.tests/test_proactive.cupdated to assert the new atomic-replacement semantics (symlink replaced rather than refused, victim file still untouched, no leftover temp file after success). Error messages incmd-ssh.cupdated for the new failure mode ("temp create or rename failed; .ssh directory may have wrong permissions"). -
Fixed (security): SSH key management followed symlinks as root — the audit's finding 6 demonstrated a real privilege-escalation surface.
guest-ssh-add-authorized-keysandguest-ssh-remove-authorized-keysboth wrote<home>/.ssh/authorized_keysvia plainopen(O_WRONLY|O_CREAT|O_TRUNC)and thenchown()d the path by name;guest-ssh-get-authorized-keysread it viafopen(). None of those calls usedO_NOFOLLOW. A user with write access to their own home directory could replace~/.ssh/authorized_keyswith a symlink to any root-owned file (/etc/shadow,/var/db/dslocal/nodes/Default/users/root.plist, etc.) and trick the root-running agent into truncating, chowning to that user, or exposing the content of the linked-to file via the QGA response. Hardened all three handlers insrc/cmd-ssh.cvia three new file-local helpers (exposed for unit testing):ssh_safe_read_fileopens withO_RDONLY|O_NOFOLLOWandfstat-rejects non-regular files before reading;ssh_safe_write_fileopens withO_WRONLY|O_CREAT|O_TRUNC|O_NOFOLLOWand callsfchown/fchmodBY FD (not by path), so any racing symlink swap between open and metadata change still only mutates our held fd's inode;ssh_safe_ssh_dircreates~/.sshviamkdir(0700)(which itself doesn't follow symlinks on the final component) and verifies the result vialstat+S_ISDIRbeforelchowning (instead of the priorchownwhich would have followed). Tiger-compatible — every primitive (O_NOFOLLOW,fchown,fchmod,lchown,lstat) is POSIX.1-2001 and shipped on macOS since 10.0; i386/10.4 cross-build clean. Doesn't useopenat/renameat(POSIX.1-2008, macOS 10.5+) — atomic rename is a future improvement when we drop 10.4. Newtests/test_proactive.cregression: 10 assertions covering (a) write to symlinked target →-1with victim file untouched, (b) write to non-existent target → succeeds with0600regular file containing the input, (c) write to existing regular file → truncate-and-rewrite succeeds, (d) read of symlinked target →NULL(no content exposure), (e) read of regular file → returns the data. Test wiring:cmd-ssh.cadded to thetest-proactiveMakefile link line;command_registerstub already in place from earlier. 
Addresses audit.md finding 6. -
Fixed: Version metadata was inconsistent across files:
Makefile said 2.4.2, scripts/build-pkg.sh hardcoded 2.4.0, docs/mac-guest-agent.8 said 2.2.0, and docs/BACKUP.md referred to behaviour "since v2.4.3" (the unreleased target). The release workflow extracted a tag-derived $VERSION into the env but didn't pass it through to make, so a tagged release could publish binaries whose embedded version came from the Makefile rather than the git tag. Addresses audit.md finding 4. Fixes: Makefile bumped from 2.4.2 → 2.4.3 (matches the docs and the work shipped in this Unreleased section); scripts/build-pkg.sh now reads VERSION from the Makefile via awk '/^VERSION[[:space:]]*:=/{print $2}' with a VERSION env override (used by the release workflow to stamp the tag version); docs/mac-guest-agent.8 bumped to 2.4.3 with a .\" comment noting the sync requirement and pointing at the future improvement (stamping it from $(VERSION) at build time); .github/workflows/release.yml now invokes make VERSION="$VERSION" build-all, make VERSION="$VERSION" build-i386, and make VERSION="$VERSION" build so tagged-release binaries always carry the git tag's version. Single source of truth is the Makefile; the .pkg script and the release workflow both honour VERSION env overrides for explicit ad-hoc bumps. Binary --version confirmed to report mac-guest-agent 2.4.3 after the bump.
-
Fixed:
base64_decode()accepted any input whose length was a multiple of 4, including bytes outside the base64 alphabet — characters not in[A-Za-z0-9+/=]mapped to0in the lookup table (which means literalA), so unvalidated input like"!!!!"silently decoded to three zero bytes. Affectedguest-file-write(would write zero bytes for any non-base64 input the caller sent, silently corrupting the file) andguest-set-user-passwordwithcrypted=false(would either silently use zero bytes as the password OR — separately — fall through to use the raw literal string as the password if decoding "succeeded" in the prior loose sense). Tightenedsrc/util.c base64_decode(): every non-padding character is now validated against the[A-Za-z0-9+/]alphabet, and=is allowed only in the last 1 or 2 positions of the input (RFC 4648 §3.2). Invalid alphabet, embedded whitespace, high-bit bytes, URL-safe substitutions (-/_), three-or-more=, and=anywhere but the tail all now returnNULL(which both existing callsites already treat as "decode failed → return GenericError").src/cmd-user.c handle_set_user_passwordadditionally now returns anInvalidParametererror whencrypted=falseand base64 decoding fails — the prior silent-fallthrough-to-raw-literal would have set the user's password to whatever literal bytes the caller happened to send. 28 new unit-test cases intests/test_proactive.ccover round-trip of known inputs, every category of invalid alphabet, every category of bad padding, every category of bad length, and the NULL safety guard. Addresses audit.md finding 3. -
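The stricter input validation described in this entry can be sketched as a standalone pre-check. This is a hedged illustration, not the code in `src/util.c`: `base64_is_valid_sketch` is a hypothetical name, and accepting the empty string as valid is an assumption.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if c is in the standard base64 alphabet [A-Za-z0-9+/]. */
static int b64_alphabet_char(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
           (c >= '0' && c <= '9') || c == '+' || c == '/';
}

/*
 * Hypothetical pre-check: returns 1 if `in` is well-formed padded base64,
 * 0 otherwise. Empty input is treated as valid (assumption).
 */
int base64_is_valid_sketch(const char *in)
{
    size_t len, i, pad = 0;

    if (in == NULL)
        return 0;
    len = strlen(in);
    if (len % 4 != 0)
        return 0;
    /* Count '=' at the tail; stop at three so over-padding is detectable. */
    while (pad < len && pad < 3 && in[len - 1 - pad] == '=')
        pad++;
    if (pad > 2)
        return 0;
    /* Everything before the padding must be in the alphabet, so an '='
     * anywhere but the tail is rejected here as well. */
    for (i = 0; i < len - pad; i++)
        if (!b64_alphabet_char((unsigned char)in[i]))
            return 0;
    return 1;
}
```

With this rule an input such as `"!!!!"` is rejected up front instead of decoding to three zero bytes, and URL-safe characters (`-`, `_`) or embedded whitespace fail the alphabet test.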
Fixed:
guest-get-diskstatsemitted iostat-style fields at the top level (name,kb-per-transfer,transfers-per-second,mb-per-second) — the QGAGuestDiskStatsInfoschema wants{name, major, minor, stats: {15 Linux-block-stats fields}}. Strict QGA consumers (virsh / PVE plugins) reject the prior shape. Rewritten insrc/cmd-disk.c handle_get_diskstats()to source real cumulative per-disk counters from IOKit'sIOBlockStorageDriverStatisticsproperty dict instead of parsingiostatrate snapshots — 6 of the 15 spec fields map cleanly (read-sectors←Bytes (Read)/ 512,read-ios←Operations (Read),write-sectors←Bytes (Write)/ 512,write-ios←Operations (Write),read-ticks←Total Time (Read)ns → ms,write-ticks←Total Time (Write)ns → ms). The remaining 9 Linux-block-layer-specific fields (read-merges,write-merges,discard-sectors,discard-ios,discard-merges,discard-ticks,in-flight,io-ticks,time-in-queue) emit0— same honest-zero precedent as cpustatsnice: 0and routemetric: 0/irtt: 0.major/minoralso0(macOS has no stable Linux-style block-device major/minor numbers). BSD device name discovered by recursively walking the IOBlockStorageDriver's children for theBSD Nameproperty on the child IOMedia node (viaIORegistryEntrySearchCFProperty(... kIORegistryIterateRecursively)). New helpercfdict_u64()reads a uint64 out of a CFDictionary by C-string key. New includes:<stdint.h>,<CoreFoundation/CoreFoundation.h>,<IOKit/IOKitLib.h>,<IOKit/IOBSD.h>,<IOKit/storage/IOBlockStorageDriver.h>,<IOKit/storage/IOMedia.h>—IOKitandCoreFoundationframeworks were already linked. Newtests/run_tests.shshape contract validates the full 4-top + 15-stats field set per entry.docs/COMMAND_STATUS.mdrow promoted from "caveated/partial — Returns raw iostat output" to "stable/partial — IOKit IOBlockStorageDriver Statistics". Addresses audit.md finding 2c. Breaking for any caller that hard-coded the prior iostat-style field names. -
Fixed:
guest-network-get-routeroute objects didn't match the QGAGuestNetworkRouteschema — emitteddestination/nexthop/source/interface/version/prefix, missing the spec'siface/gateway/mask/metric/irtt/desprefixlen. Strict virsh / qm-agent / PVE-plugin consumers reject responses with the prior field names. Rewritten insrc/cmd-network.c handle_network_get_route()to emit the spec shape:iface,destination(stripped of /CIDR; "default" normalised to0.0.0.0/::),gateway,nexthop(alias forgateway, which the QGA schema also defines),mask(computed from the prefix length — IPv4 dotted-quad like255.255.255.0or IPv6 colon-hex likeffff:ffff:ffff:ffff:0000:...),metric(constant0— macOSnetstat -rndoesn't expose a metric column),irtt(constant0— Linux-only concept),version,desprefixlen. The0defaults formetricandirttfollow the same precedent asguest-get-cpustats'snice: 0on macOS (Q4 / audit finding 2a pattern: spec-conformant with honest zeros for fields the host can't supply). Two new helpers:ipv4_prefix_to_mask()andipv6_prefix_to_mask().<stdint.h>added foruint32_t.tests/run_tests.shshape contract updated to assert the full set of spec fields (iface,destination,gateway,nexthop,mask,metric,irtt,version,desprefixlen). Addresses audit.md finding 2b. Breaking for any caller that hard-coded the prior field names. -
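The mask field is computed from the prefix length; for IPv4 that computation can be sketched as below. The entry names a helper `ipv4_prefix_to_mask()` but does not give its signature, so the function name, parameters, and return convention here are assumptions made for illustration.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical sketch: format the dotted-quad netmask for an IPv4 prefix
 * length (0..32) into `out`. Returns 0 on success, -1 on bad input or if
 * the buffer is too small.
 */
int ipv4_prefix_to_mask_sketch(int prefix, char *out, size_t out_len)
{
    uint32_t mask;
    int n;

    if (prefix < 0 || prefix > 32 || out == NULL)
        return -1;
    /* Shifting a 32-bit value by 32 is undefined, so /0 is handled separately. */
    mask = (prefix == 0) ? 0u : (uint32_t)(0xFFFFFFFFu << (32 - prefix));
    n = snprintf(out, out_len, "%u.%u.%u.%u",
                 (unsigned)((mask >> 24) & 0xFFu),
                 (unsigned)((mask >> 16) & 0xFFu),
                 (unsigned)((mask >> 8) & 0xFFu),
                 (unsigned)(mask & 0xFFu));
    return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}
```

For example, a prefix length of 24 yields `255.255.255.0`, matching the dotted-quad form mentioned in the entry.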
Fixed:
guest-get-load returned load1/load5/load15; the QGA GuestLoadStats schema requires load1m/load5m/load15m (the m suffix marks "minutes"). Strict QGA parsers reject the prior field names. Renamed in src/cmd-system.c to match the spec. tests/run_tests.sh shape contract + tests/safe_test.sh field probe + print statement updated. Addresses audit.md finding 2a. Breaking for any caller that hard-coded the prior field names; none of the in-tree consumers did (the safe-test paths above are the only references), and the rename is a textual change only (same three doubles, same semantics).
-
Fixed:
guest-execwas synchronous and could deadlock on stderr-heavy children — addressed audit.md finding 1. The oldhandle_exec()drained stdout to EOF, then stderr to EOF, thenwaitpid()-blocked the entire agent main loop until the child exited. Two real failures: (1) a child writing more than the ~64 KB pipe buffer to stderr while stdout stayed small would deadlock — child blocked on stderr write, parent blocked on stdout read, both waiting forever; (2) every QGA command from the host (ping, freeze, network checks, status polls from other callers) stalled for the child's entire lifetime. Neither matches the QGA spec, which hasguest-execreturn{pid: N}immediately andguest-exec-statuspoll for completion. Rewritten to the spec contract:handle_exec()forks, sets the parent's pipe read ends non-blocking viafcntl(F_SETFL, O_NONBLOCK), stores the fds + per-stream accumulating buffers + truncation flags in the process table, returns{pid: N}immediately. A newdrain_one_fd()helper does nonblockingread()chunks untilEAGAIN/EOF/error;cmd_exec_drain_all()(called from the agent main poll loop on every wake-up tick) keeps in-flight children's pipes from backing up while the caller is between status polls;handle_exec_status()opportunistically drains the named pid, reaps viawaitpid(WNOHANG), returns the current state. Output is base64-encoded only at status-return time. SameMAX_CAPTURE_SIZE = 16 MBper stream, sameout-truncated/err-truncatedflags as upstream Linux qemu-ga and Windows qemu-ga (matches their async model — Linux uses GLib I/O callbacks, Windows uses one reader thread per pipe, ours uses event-driven nonblocking drain). Zero compatibility risk: only POSIX-classic syscalls (fork/pipe/fcntl F_SETFL O_NONBLOCK/waitpid WNOHANG), all available on Mac OS X 10.0+ — i386/10.4 cross-build clean. 
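
The non-blocking drain at the heart of this rewrite can be sketched as follows. The entry names a `drain_one_fd()` helper but not its signature or data structures, so `struct capture_buf`, `set_nonblocking`, and `drain_fd_sketch` below are hypothetical stand-ins that only illustrate the technique.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical per-stream accumulator; the agent's real structure may differ. */
struct capture_buf {
    char  *data;
    size_t len;
    size_t cap;       /* maximum number of bytes to keep */
    int    truncated; /* set once output beyond cap was discarded */
    int    eof;       /* set once the writer closed its end */
};

/* Put a pipe read end into non-blocking mode. Returns 0 on success. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/*
 * Read whatever is available right now and return without blocking.
 * Returns 0 when the fd is drained for now or at EOF, -1 on a real error.
 */
int drain_fd_sketch(int fd, struct capture_buf *cb)
{
    char chunk[4096];

    for (;;) {
        ssize_t n = read(fd, chunk, sizeof chunk);
        if (n > 0) {
            size_t room = cb->cap - cb->len;
            size_t keep = (size_t)n < room ? (size_t)n : room;
            if (keep > 0) {
                char *p = realloc(cb->data, cb->len + keep);
                if (p == NULL)
                    return -1;
                cb->data = p;
                memcpy(cb->data + cb->len, chunk, keep);
                cb->len += keep;
            }
            if (keep < (size_t)n)
                cb->truncated = 1; /* keep reading so the child never blocks */
            continue;
        }
        if (n == 0) {
            cb->eof = 1;
            return 0;
        }
        if (errno == EINTR)
            continue;
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            return 0;              /* nothing more for now; retry on the next tick */
        return -1;
    }
}
```

Calling a drain like this for both stdout and stderr on every loop wake-up is what removes the deadlock: neither pipe can fill up while the parent waits on the other one.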
Two new tests/run_tests.sh regression tests: (a) sleep 2 with capture-output=true returns from guest-exec within 250 ms (previously the agent blocked for 2 seconds); (b) dd if=/dev/zero bs=4096 count=64 1>&2 (256 KB to stderr, 4× the pipe buffer) completes end-to-end without deadlock and the captured err-data matches. The existing exec tests were also rewritten to poll guest-exec-status instead of assuming sync semantics from the immediate response.
-
Fixed: CI static-analyzer false positive
cmd-fs.c:253:14: The 1st argument to 'open' is NULL but should not be NULL [unix.StdCLibraryFunctions] introduced when Phase 2 added try_fullfsync(). mnt->f_mntonname is a fixed char[MAXPATHLEN] array inside struct statfs and can never be NULL, but the analyzer can't prove that without an explicit non-NULL constraint on mnt itself. Added __attribute__((nonnull)) to try_fullfsync's declaration: it documents the precondition (always satisfied, since callers pass &mntbuf[i] from sync_all_volumes) and silences the warning. Verified locally with clang --analyze on every src/*.c (no warnings).
-
Added:
zfs snapshot support for OpenZFS-on-macOS mounts during freeze. Prefers zfs snapshot <pool>/<dataset>@mac-guest-agent-<timestamp> over F_FULLFSYNC for ZFS-typed mounts: ZFS snapshots are atomic and are the real consistency primitive for ZFS, whereas F_FULLFSYNC isn't documented to be implemented on it. The zfs CLI is detected lazily at common installation paths (/usr/local/sbin/zfs, /usr/local/bin/zfs, /opt/local/bin/zfs, /opt/homebrew/bin/zfs); if absent, the dispatch falls through to F_FULLFSYNC as defence in depth. Snapshot names are tracked so do_thaw can zfs destroy them via the matching cleanup path (mirrors how APFS snapshots are tracked + cleaned). Also fixed: fs_dispatch_class previously ran the /dev/ defensive-backing check BEFORE the type check, which meant ZFS mounts (whose f_mntfromname is pool/dataset, not /dev/...) were wrongly classified as SKIP_SPECIAL. The dispatch now matches known writable types (apfs/zfs/hfs) first; the /dev/ check applies only to unknown types. New unit-test cases lock the corrected dispatch order. Implements docs/design/AGENT_BEHAVIOUR_SPEC.md Q1 (ZFS).
-
Fixed:
`guest-get-cpustats` returned a flat aggregate object `{user, system, idle, nice}` summed across vCPUs. The QGA spec defines the response as `['GuestCpuStats']`: an array of per-CPU discriminated-union records with a required `type` field, a `cpu` index, and the per-CPU `user`/`nice`/`system`/`idle` tick counters. Our shape was structurally invalid against the schema at the array level. Rewritten using `host_processor_info(PROCESSOR_CPU_LOAD_INFO)` to produce one entry per vCPU. Each entry is tagged `type:"linux"` (the only currently-defined value in upstream `GuestCpuStatsType`; emitting an unknown enum value or omitting the field would be rejected by strict QAPI parsers, see `docs/design/AGENT_BEHAVIOUR_SPEC.md` Q4 for the full reasoning). User/system/idle/nice tick semantics translate cleanly between macOS's `processor_cpu_load_info_t` and Linux's `GuestLinuxCpuStats`. New `tests/run_tests.sh` shape contract verifies array structure, per-entry fields, and the `type:"linux"` discriminator. `src/selftest.c`'s safe-test expectation for the command updated from `expect_array=0` to `expect_array=1`.
- Fixed:
`guest-fsfreeze-freeze-list` silently ignored its optional `mountpoints` argument because it shared the global-freeze handler. A caller asking to freeze only `["/Volumes/data"]` got a global freeze instead of the requested subset. The command now has a distinct handler that parses `args.mountpoints`, validates each entry is a string, and restricts the per-FS dispatch loop to mounts whose `f_mntonname` matches one of the listed paths. With no argument or an empty array it delegates to global freeze (the spec's default). Subset freezes deliberately skip the container-level APFS `tmutil localsnapshot` (snapshotting a whole container for a per-mount request would capture state the caller didn't ask us to capture); per-mount `F_FULLFSYNC` is the consistency mechanism for subset freezes. Implements `docs/design/AGENT_BEHAVIOUR_SPEC.md` Q2. Four integration tests in `tests/run_tests.sh` cover no-args delegation, empty-array delegation, two-mountpoint filter plumbed through (test-mode return matches `n_mountpoints`), and non-string-entry spec-shaped error.
- Fixed: Filesystem freeze treated
`F_FULLFSYNC` returning `ENOTSUP`/`EOPNOTSUPP` on foreign filesystems (FAT32, exFAT, older MS-DOS drivers, third-party FUSE) as a failure, logging `WARN` and not counting the volume. Per Apple's `fcntl(2)` documentation `F_FULLFSYNC` is only implemented on HFS, MS-DOS (FAT), UDF, and APFS, so the failure is by-design on filesystems that don't implement it; the QEMU Linux QGA reference handles the analogous `EOPNOTSUPP` from `FIFREEZE` by skipping silently. `sync_all_volumes` now dispatches per `f_fstypename`: APFS gets the `tmutil` snapshot + `F_FULLFSYNC`, HFS+ gets `F_FULLFSYNC`, foreign FS tries `F_FULLFSYNC` and counts `ENOTSUP` as `flushed_only` (the global `sync()` at the top already flushed dirty buffers), network mounts (`smbfs`/`afpfs`/`nfs`/`webdav`) and special FS (`devfs`/`autofs`/`fdesc`/`synthfs`/`volfs`/`lifs`) are skipped categorically. The handler emits a single INFO log line summarising the per-treatment breakdown (snapshotted / zfs_snapshotted / fullfsynced / flushed_only / skipped). The wire response remains a spec-conformant int (sum of "did-something" counters). Reported by @vit9696 in #2; implements `docs/design/AGENT_BEHAVIOUR_SPEC.md` Q1 + Q3 (log). 21-case unit test for `fs_dispatch_class` in `tests/test_proactive.c`.
- Fixed:
`guest-get-memory-blocks` fabricated a memory-usage figure when `vm_stat` could not be read. `handle_get_memory_blocks` initialised `used` to `total / 2` and only overwrote it on success, so a `get_vm_stat()` failure silently returned a block list implying exactly 50% RAM used, indistinguishable from a real reading. It now returns a QGA error (`Failed to read memory statistics`) instead of a fabricated value. `guest-get-memory-block-info` (block size) is unaffected; it does not depend on `vm_stat`.
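
The memory-statistics fix above comes down to not pre-seeding an output with a plausible default. A minimal sketch of that control flow follows; it is not the agent's code. `read_vm_stat` and `memory_used_bytes` are hypothetical stand-ins for `get_vm_stat()` and `handle_get_memory_blocks`, and the `fail` flag exists only to simulate an unreadable `vm_stat`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for get_vm_stat(): `fail` simulates a read failure
 * so the error path can be exercised. */
static bool read_vm_stat(bool fail, uint64_t *used_out)
{
    if (fail) {
        return false;
    }
    *used_out = 3ULL * 1024 * 1024 * 1024; /* pretend 3 GiB are in use */
    return true;
}

/* Returns 0 and fills *used_out on success. Returns -1 and leaves *used_out
 * untouched on failure, so the caller can emit a QGA error instead of a
 * made-up figure such as total / 2. */
static int memory_used_bytes(bool fail, uint64_t *used_out)
{
    uint64_t used;

    if (!read_vm_stat(fail, &used)) {
        return -1; /* caller reports "Failed to read memory statistics" */
    }
    *used_out = used;
    return 0;
}
```

Writing the output only after the read succeeds means a failure can never be mistaken for a real measurement.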
- Refactored: `fsfreeze_command_allowed` split into a pure-function variant `fsfreeze_is_allowlisted` (checks only the allowlist; ignores `freeze_status`) plus the existing public function that wraps it with the state check. This lets tests exercise the allowlist contract in isolation without needing to manipulate the frozen state. The expanded comment on `fsfreeze_command_allowed` states the principled-restrictive rule from `docs/design/AGENT_BEHAVIOUR_SPEC.md` Q5 (allow during freeze iff the handler is read-only, doesn't exec, and doesn't change agent state) and notes the deliberate divergences from upstream (`guest-sync-id` extension + idempotent re-freeze). No behaviour change: the same 9 commands are allowed during freeze as before. 30 new unit tests in `tests/test_proactive.c` lock the contract: 9 allowed + 20 representative blocked (writes, exec, suspends, read-only commands deliberately NOT on the list) + NULL guard.
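
The pure/stateful split described above can be sketched as follows. This is an illustration under assumptions, not the shipped code: the allowlist here is a guessed five-entry subset (the real list has 9 entries), and the wrapper takes the freeze state as a `frozen` parameter where the agent consults `freeze_status`.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Illustrative subset only; the real allowlist has 9 entries and lives in
 * the agent source. */
static const char *const freeze_allowlist[] = {
    "guest-ping",
    "guest-sync-id",
    "guest-fsfreeze-status",
    "guest-fsfreeze-freeze",
    "guest-fsfreeze-thaw",
};

/* Pure check: consults only the allowlist, never the freeze state. */
static bool fsfreeze_is_allowlisted(const char *command)
{
    size_t n = sizeof freeze_allowlist / sizeof freeze_allowlist[0];

    if (command == NULL) {
        return false;
    }
    for (size_t i = 0; i < n; i++) {
        if (strcmp(command, freeze_allowlist[i]) == 0) {
            return true;
        }
    }
    return false;
}

/* Stateful wrapper: any non-NULL command passes while thawed; during a
 * freeze only allowlisted commands pass. */
static bool fsfreeze_command_allowed(bool frozen, const char *command)
{
    if (command == NULL) {
        return false;
    }
    return frozen ? fsfreeze_is_allowlisted(command) : true;
}
```

Because `fsfreeze_is_allowlisted` has no hidden inputs, a test can probe every command name without first putting the agent into a frozen state.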
- Fixed: Agent never connected on Mac OS X 10.4 Tiger. The serial transport used `poll()`, which returns `POLLNVAL` (0x20) for the serial device on Tiger. macOS `poll()` is implemented on top of kqueue, and Tiger's serial BSD client does not support the kqueue readiness path, so `poll()` reported a valid, open `/dev/cu.serial1` as invalid. The agent treated that as a fatal device error and reconnect-looped every 5 s without ever reading a command, so the host saw `QEMU guest agent is not running`. The serial read and write paths in `channel.c` now use `select()`, which uses the legacy `selrecord` path the driver implements and works on every macOS version. Reported by @vit9696 in #2.
- Added: `tests/test_proactive.c`: channel read over a real PTY, covering the `select()`-based read path (framed-message read and idle timeout). The transport read path previously had no behavioral test coverage.
- Fixed: `--safe-test` crash on Mac OS X 10.4 Tiger (dyld lazy-bind failure on `_host_statistics64`). Weak-import the symbol so the existing `vm_stat` text fallback in `get_vm_stat()` actually runs on 10.4 instead of the process aborting before the runtime check fires. Reported by @vit9696 in #2.
- Fixed: `docs/COMPATIBILITY.md`: `host_statistics64` was incorrectly listed as present on Tiger. It was introduced in 10.6 Snow Leopard. Symbol list and Tiger row corrected; Tiger now noted as relying on the `vm_stat` text fallback for memory stats.
- Fixed: `scripts/verify-installer.sh`: `host_statistics64` moved from required to optional in the symbol audit, matching what the binary now actually needs.
- Added: Tiger / Leopard PATH note in README: `/usr/local/bin` is not in the default PATH on 10.4–10.5; users should invoke via absolute path or `export PATH=/usr/local/bin:$PATH`.
- `--safe-test` / `--safe-test-json`: built-in read-only command validation. 21 tests, no external script or Python needed. Run `sudo mac-guest-agent --safe-test` to verify all read-only commands work correctly.
- `scripts/pve-verify.sh`: host-side verification script. Run from the PVE host against a VM ID to check config, ping, OS info, network, command count, memory reporting, and freeze round-trip.
- Fixed: Stop deleting ALL Time Machine snapshots on freeze — now only deletes the snapshot we created
- Fixed: Shutdown returns error when fork fails (was silently returning success)
- Fixed: SSH key removal returns error when write fails (was silently returning success)
- Fixed: Save/restore hibernatemode around suspend (was permanently altered)
- Fixed: NULL dereference before null check in channel_create_test
- Fixed: Unchecked realloc in SSH key operations (crash on OOM)
- Fixed: Memory leak in freeze hook cleanup (empty loop body)
- Fixed: Output capture capped at 16MB (matches Linux qemu-ga)
- Fixed: tmutil snapshot deletion uses run_command_v (no shell injection)
- Fixed: selftest tool_available uses access() instead of system()
- Fixed: Signal handler uses volatile sig_atomic_t flag
- Fixed: Password zeroed in all code paths with compiler-safe secure_zero
- Fixed: setenv() instead of putenv() after fork in guest-exec
- Fixed: base64_encode overflow guard for 32-bit
- Fixed: json_escape handles control characters
- Fixed: Unsupported commands (set-vcpus, set-memory-blocks) registered as disabled
- Fixed: LOG_FATAL no longer calls exit() — caller handles cleanup
- Fixed: guest-get-diskstats returns structured per-disk stats (was raw text)
- Fixed: commands_init guard prevents double-registration
- Fixed: macos-26 replaced with macos-latest (valid runner)
- Version now single-sourced from Makefile (agent.h uses -DVERSION)
- Fixed: Malformed JSON input now returns a proper error response per QMP spec instead of being silently discarded. Found by pgcudahy (PR #1).
- Fixed: Device detection error message now says "No serial device found" with setup instructions instead of the misleading "No virtio device found."
- `type=isa` is required on ALL macOS versions. macOS Big Sur+ ships Apple's own built-in VirtIO guest agent (~18 commands), which claims the default VirtIO serial channel. Using `agent: enabled=1` (default) connects to Apple's agent, not ours, losing freeze, memory reporting, and 27 other commands. ISA serial is the only channel Apple's agent doesn't claim.
- Full comparison of Apple's agent (18 commands) vs ours (45 commands) added to docs/PLATFORMS.md.
- ISA serial now checked first in device detection order (was last)
- `run_tests.sh`: malformed JSON and missing execute tests un-skipped (65 tests, up from 63)
- PVE.md: "existing VM" troubleshooting for users adding agent to klabsdev-style setups
- LIBVIRT.md: VirtIO channel examples replaced with ISA serial (required)
- COMPATIBILITY.md: Sequoia 15.7.5 promoted to Tier 1 (first external user confirmation)
- COMPATIBILITY.md: PPC status and path to support documented
- `guest-network-get-route`: IPv4 and IPv6 routing table via `netstat -rn`. Achieves 100% Linux qemu-ga command parity (45 commands; only `guest-get-devices` unimplemented, which is Windows-only).
- `--self-test` and `--self-test-json`: environment diagnostics with backup readiness check. Reports freeze method, kext version, APFS/VirtIO capabilities, hook validation, and overall backup readiness verdict.
- Backup readiness section in self-test: freeze method (APFS snapshot / sync / sync-only), root capability, hook count, overall verdict.
- i386 binary — cross-compiled via MacOSX10.13.sdk for Tiger (10.4) and Leopard (10.5) support.
- Baud rate set to 115200 — explicit max baud rate on serial port. QEMU ignores baud rate on virtual serial, but macOS kext may use it for internal pacing.
- UTM: auto-detects `/dev/cu.virtio` (Apple Virtualization.framework)
- libvirt/virt-manager: domain XML for ISA serial and VirtIO channels, virsh command examples, quiesced snapshots
- VirtIO prioritized over ISA serial on Big Sur+ (native driver preferred when available)
- Device detection order: VirtIO (QEMU/PVE/libvirt) → UTM → ISA serial (fallback)
- Restructured README — quick-start focused, detailed content moved to docs/
- docs/PVE.md — complete Proxmox VE operational guide with troubleshooting
- docs/LIBVIRT.md — full libvirt/virt-manager deployment guide with domain XML examples
- docs/UTM.md — UTM guide with utmctl comparison, CI/CD workflows, headless automation
- docs/BACKUP.md — freeze mechanics, hook scripts, TRIM guide
- docs/CLI.md — all flags, config file, device auto-detection
- docs/PLATFORMS.md — platform index with transport priority
- configs/hooks/ — ready-to-use freeze hooks for MySQL, PostgreSQL, Redis, launchd services
- configs/pve/ — anchor VM configurations for Tiger, High Sierra, Big Sur, Sequoia
- 18 macOS versions researched (10.4 Tiger through 26.3 Tahoe)
- Apple16X50Serial.kext verified present on every version with identical PCI class match
- Kext version timeline: v1.6 (Tiger base) → v1.7 (Tiger Intel 10.4.5) → v1.9 (Tiger 10.4.11 combo / Leopard) → v3.0 (Snow Leopard / Lion) → v3.1 (Mountain Lion) → v3.2 (Mavericks through Tahoe)
- Installer-verified: 10.4 through 11.6 (12 versions, deep verification: kext + symbols + frameworks + PCI class)
- Runtime-tested: 10.11.6 El Capitan (PVE), 26.3 Tahoe (native)
- Multi-version test matrix: macos-14, macos-15, macos-26
- i386 build via legacy MacOSX10.13.sdk download in CI
- Self-test validation (text + JSON) in CI pipeline
- ASAN smoke tests expanded to 15 commands
- 48 unit + 31 proactive + 210k fuzz + 63 integration tests
- LaunchDaemon plist: `--daemon` changed to `--daemonize` (primary flag name)
- Command count corrected to 45 across all docs
- Test count corrected to 63 across all docs
- Evidence terminology standardized: runtime-tested, PVE-integrated, installer-verified, best-effort
- All version claims made consistent (10.4+ not 10.7+)
- Real filesystem freeze: replaces the fake no-op with an actual freeze:
  - APFS (10.13+): atomic COW snapshot via `tmutil localsnapshot`
  - All versions: `sync()` + `F_FULLFSYNC` flushes data to physical media
  - Continuous `sync()` every 100ms during freeze window
  - Auto-thaw safety timeout (10 minutes)
  - Command filtering: only freeze-safe commands allowed during freeze
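
As a rough illustration of the `F_FULLFSYNC` step listed above, the sketch below flushes a single open descriptor and falls back to `fsync()` where a full flush is not available. `flush_to_media` is a hypothetical helper and the fallback policy is an assumption of this sketch; it is not the agent's `sync_all_volumes`.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Flushes one open descriptor as far as the platform allows.
 * Returns 1 if F_FULLFSYNC succeeded (data pushed to the physical media),
 * 0 if only an ordinary fsync() was possible, and -1 on error. */
static int flush_to_media(int fd)
{
#ifdef F_FULLFSYNC
    if (fcntl(fd, F_FULLFSYNC) == 0) {
        return 1;
    }
    /* Filesystems that do not implement F_FULLFSYNC report ENOTSUP or
     * EOPNOTSUPP; fall back to fsync() for those. Any other errno is a
     * genuine failure. */
    if (errno != ENOTSUP && errno != EOPNOTSUPP) {
        return -1;
    }
#endif
    return fsync(fd) == 0 ? 0 : -1;
}
```

The `#ifdef` keeps the sketch compilable on platforms without `F_FULLFSYNC`, where it degrades to a plain `fsync()`.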
- Freeze hook scripts: `/etc/qemu/fsfreeze-hook.d/` (same model as Linux qemu-ga)
  - Scripts called with "freeze"/"thaw" argument
  - 30-second per-script timeout
  - Strict ownership validation (root-owned, not world-writable)
- Fixed password memory exposure (zero on all exit paths)
- Fixed command injection in diskutil calls (use execv, not shell)
- Fixed command injection in service update (use execv, not shell)
- Fixed unchecked `pipe()` in guest-exec (could use uninitialized fds)
- Fixed unchecked `fork()` in shutdown handler
- Fixed `WIFSIGNALED` called on extracted exit code instead of raw wait status
- Check all `mkdir()`, `chown()`, `tcsetattr()` return values
- Replace all `strtok()` with thread-safe `strtok_r()`
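
The `WIFSIGNALED` and unchecked-`fork()` fixes above share one pattern: check every return value, and hand the raw `waitpid()` status to the `W*` macros. A minimal sketch, with hypothetical names (`child_result`, `decode_wait_status`, `run_child`) that do not come from the agent source:

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Decoded view of how a child process ended. */
struct child_result {
    int exited;      /* 1 if the child exited normally */
    int exit_code;   /* valid when exited == 1 */
    int signaled;    /* 1 if the child was killed by a signal */
    int term_signal; /* valid when signaled == 1 */
};

/* The W* macros must see the raw status: applying WIFSIGNALED to the value
 * returned by WEXITSTATUS() inspects the wrong bits. */
static struct child_result decode_wait_status(int raw_status)
{
    struct child_result r = {0, 0, 0, 0};

    if (WIFEXITED(raw_status)) {
        r.exited = 1;
        r.exit_code = WEXITSTATUS(raw_status);
    } else if (WIFSIGNALED(raw_status)) {
        r.signaled = 1;
        r.term_signal = WTERMSIG(raw_status);
    }
    return r;
}

/* Forks a child that raises `sig` (if sig > 0) or exits with `code`, then
 * decodes its status. Returns -1 if fork() or waitpid() fails. */
static int run_child(int code, int sig, struct child_result *out)
{
    int status = 0;
    pid_t pid = fork();

    if (pid < 0) {
        return -1; /* fork failure is reported, never treated as success */
    }
    if (pid == 0) {
        if (sig > 0) {
            raise(sig);
        }
        _exit(code);
    }
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    *out = decode_wait_status(status);
    return 0;
}
```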
- 48 unit tests + 31 proactive tests + 210,000 fuzz rounds + 62 integration tests
- Code coverage: 55.74% line, 80.27% function (remaining is untestable-in-CI code)
- Proactive tests: channel API, SSH key operations, hook validation, injection prevention
- Backup consistency section in README (freeze behavior, hook scripts, limitations)
- Thin disk provisioning guide (ssd=1, trimforce, TRIM, zero-fill reclaim)
- SECURITY.md updated with freeze hook security model
- ISA serial transport: uses Apple's built-in `Apple16X50Serial.kext` instead of VirtIO serial. No custom kernel extensions, no SIP issues, no code signing required. Works on macOS versions with the built-in Apple16X50Serial.kext driver.
- PVE setup: `qm set <vmid> --agent enabled=1,type=isa`
- Password changes via `dscl` now pipe the password through stdin instead of command line arguments (no longer visible in `ps aux`)
- Passwords are zeroed in memory after use
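
Zeroing a password buffer only helps if the compiler does not optimise the stores away. One common way to prevent that is to write through a `volatile` pointer, sketched below. The body is an assumption of this sketch, not a copy of the agent's helper.

```c
#include <stddef.h>

/* Overwrites `len` bytes at `buf` with zeros. Writing through a volatile
 * pointer stops the compiler from discarding the stores as dead writes,
 * which can happen to a plain memset() placed just before free() or return. */
static void secure_zero(void *buf, size_t len)
{
    volatile unsigned char *p = (volatile unsigned char *)buf;

    while (len-- > 0) {
        *p++ = 0;
    }
}
```

A caller would invoke `secure_zero(password, sizeof password)` on every exit path, including error returns.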
- SECURITY.md documenting trust model and hardening options
- Serial port raw mode (no ICANON, no OPOST, no ECHO) for reliable bidirectional communication
- Buffer-check-before-poll: immediately process queued commands when PVE sends sync + command in one write
- Silently discard malformed messages to prevent stale data corruption in the serial buffer
- Removed O_NONBLOCK from serial port open (caused writes to not flush on macOS)
- `block-rpcs` and `allow-rpcs` fully implemented (were previously parsed but not enforced)
- Log rotation via newsyslog (5 files, 1MB max each)
- ISA serial device auto-detection (`/dev/cu.serial1`)
- Big Sur+ also works with default `type=virtio` via Apple's native VirtIO driver
- VirtIO serial kernel extension (unnecessary with ISA serial)
- Native C implementation of the QEMU Guest Agent protocol
- 44 registered QGA commands (34 stable, 5 caveated, 1 no-op, 2 error, 2 aliases)
- Zero external dependencies (cJSON embedded)
- CLI flags compatible with Linux `qemu-ga`
- Configuration file compatible with `/etc/qemu/qemu-ga.conf`
- LaunchDaemon service with `--install` / `--uninstall`
- Binaries: i386 (10.4+), x86_64 (10.6+), arm64 (11.0+), universal
- Tested on macOS Tahoe 26.3 and Mac OS X El Capitan 10.11.6