
fix(agent): harden macos es/ne scaffolding and supervision #180

Open
bb-connor wants to merge 11 commits into main from fix/macos-es-ne-hardening

@bb-connor (Collaborator) commented Mar 8, 2026

Summary

  • harden macOS ES/NE scaffolding so status, receipts, and release packaging fail closed instead of reporting false healthy or supervised success
  • add macOS host status collection plus native ES/NE status helpers, fixtures, and tests under the agent app system-extension scaffolding
  • replace repo-driven swarm shell execution with validated bootstrap presets and safer lane/worktree/branch handling

Testing

  • cargo test -p clawdstrike sandbox:: -- --nocapture
  • cargo test -p hush-cli --test supervisor_tests -- --nocapture
  • cargo test -p hush-cli hush_run::tests -- --nocapture
  • cargo test --manifest-path apps/agent/src-tauri/Cargo.toml macos:: -- --nocapture
  • cargo test --manifest-path apps/agent/src-tauri/Cargo.toml agent_health_route_reports_pending_host_state -- --nocapture
  • cargo clippy -p clawdstrike --tests -- -D warnings
  • cargo clippy -p hush-cli --tests -- -D warnings
  • cargo clippy --manifest-path apps/agent/src-tauri/Cargo.toml --tests -- -D warnings
  • swift test --package-path apps/agent/src-tauri/macos/system-extension/endpoint-security
  • swift test --package-path apps/agent/src-tauri/macos/system-extension/network-extension
  • swift run --package-path apps/agent/src-tauri/macos/system-extension/endpoint-security endpoint-security-status-tool live
  • swift run --package-path apps/agent/src-tauri/macos/system-extension/network-extension network-extension-status-tool live
  • CLAWDSTRIKE_VALIDATE_MACOS_PACKAGING=1 CLAWDSTRIKE_REQUIRE_CONCRETE_MACOS_PACKAGING=1 cargo check --manifest-path apps/agent/src-tauri/Cargo.toml
  • bash -n scripts/notarize-agent-macos.sh scripts/codex-swarm/common.sh scripts/codex-swarm/setup-worktrees.sh
  • git diff --check -- .codex/swarm/lanes.tsv .codex/swarm/waves.tsv .github/workflows/ci.yml .github/workflows/release.yml apps/agent/src-tauri/build.rs apps/agent/src-tauri/src/api_server.rs apps/agent/src-tauri/src/main.rs apps/agent/src-tauri/src/macos/collector.rs apps/agent/src-tauri/src/macos/host.rs apps/agent/src-tauri/src/macos/mod.rs apps/agent/src-tauri/src/macos/status.rs apps/agent/src-tauri/tauri.conf.json crates/libs/clawdstrike/src/sandbox/attestation.rs crates/libs/clawdstrike/src/sandbox/capability_builder.rs crates/services/hush-cli/src/hush_run.rs crates/services/hush-cli/src/supervised_exec.rs crates/services/hush-cli/tests/supervisor_tests.rs docs/plans/multi-agent/codex-swarm-playbook.md docs/plans/threat-intel/overview.md scripts/codex-swarm/common.sh scripts/codex-swarm/setup-worktrees.sh scripts/notarize-agent-macos.sh

Note

High Risk
Touches security-adjacent enforcement/attestation reporting and release notarization/packaging gates; mistakes could misreport protection state or break macOS release builds.

Overview
Adds macOS EndpointSecurity/NetworkExtension scaffolding that fails closed: new Swift packages provide endpoint-security-status-tool/network-extension-status-tool JSON status + fixtures/tests, and the agent now polls these helpers into a shared MacosHostService, exposes it via /api/v1/agent/health, and includes it in local heartbeat payloads.

Hardens macOS packaging/release gates: tauri.conf.json now bundles macos/system-extension/**/*, bumps macOS minimum to 13.0, sets app entitlements, and build.rs/CI/release preflight validate required packaging assets (and optionally block placeholders/scaffold_only); the release workflow switches DMG build to a notarization script and uploads notarization evidence.

Expands sandbox attestation and supervised execution truthfulness: attestation now carries provider state/availability, degraded reasons, counters (deadline misses/dropped events), network backend hints, and a recomputed EnforcementLevel::Degraded; hush-cli switches to embedding the typed attestation in receipts, marks supervised runs as degraded when contracts are unavailable, and conditions supervisor stats logging.

Also updates Codex swarm lane/wave configuration for the macOS ES/NE implementation wave.

Written by Cursor Bugbot for commit f77db92. This will update automatically on new commits.

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c52c469b75

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@bb-connor
Collaborator Author

Addressed the review feedback in 1698f87. Changes:

  • keep non-macOS /api/v1/agent/health on the previous ok fallback instead of forcing macOS host state
  • fail closed when ES/NE install-state samples disagree, so partial deployment no longer collapses to installed
  • stop placeholder macOS provider states from forcing every supervised attestation to degraded, while still requiring explicit provider-state attachment when we actually know the ES/NE state
  • make attestation mechanism emission deterministic without relying on adjacent dedup()
  • fix the all-features clippy/import regression and the platform-specific attestation test that broke coverage/offline CI
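On the deterministic-emission point: `Vec::dedup()` only removes consecutive duplicates, so its result depends on the order in which mechanisms were pushed. A minimal sketch of an order-independent alternative, assuming mechanisms can be represented as plain strings (the helper name `unique_mechanisms` is hypothetical and not taken from the PR):

```rust
use std::collections::BTreeSet;

/// Returns each mechanism exactly once, in sorted order, regardless of input order.
/// Hypothetical helper; the real attestation code may use typed mechanisms.
fn unique_mechanisms(raw: &[&str]) -> Vec<String> {
    // A BTreeSet removes duplicates and iterates in a stable, sorted order.
    let set: BTreeSet<&str> = raw.iter().copied().collect();
    set.into_iter().map(str::to_string).collect()
}
```

Sorting changes the emitted order, which may or may not be acceptable for existing receipts; an insertion-ordered variant with a "seen" set would be the other obvious option.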

Local validation rerun:

  • cargo test -p clawdstrike sandbox::attestation::tests -- --nocapture
  • cargo test --manifest-path apps/agent/src-tauri/Cargo.toml macos::collector::tests -- --nocapture
  • cargo test --manifest-path apps/agent/src-tauri/Cargo.toml agent_health_route_reports_pending_host_state -- --nocapture
  • cargo clippy --all-targets --all-features -- -D warnings

@bb-connor
Collaborator Author

@codex

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e2406e139a

@bb-connor
Collaborator Author

Addressed the remaining PR feedback in 20d1e1349.

Changes:

  • make combined macOS install-state aggregation fail closed for installed + unknown as well as partial-install disagreement
  • make combined approval aggregation require consistent proof instead of promoting a single approved sample
  • fix the hush_run degraded-supervision test to assert the platform-specific reason so Linux/offline CI no longer fails spuriously

Local validation:

  • cargo test -p hush-cli --bin hush finalize_sandbox_contract_status_marks_degraded_supervised_runs -- --nocapture
  • cargo test --manifest-path apps/agent/src-tauri/Cargo.toml macos::collector::tests -- --nocapture
  • cargo test --manifest-path apps/agent/src-tauri/Cargo.toml agent_health_route_reports_pending_host_state -- --nocapture
  • CARGO_NET_OFFLINE=true scripts/cargo-offline.sh test -p hush-cli --bin hush
  • cargo clippy -p hush-cli --tests -- -D warnings
  • cargo clippy --manifest-path apps/agent/src-tauri/Cargo.toml --tests -- -D warnings

The two remaining attestation bot threads are false positives: the comment bodies explicitly walk themselves back, and the current effective_enforcement_level behavior is already covered by the existing attestation tests.

@bb-connor
Collaborator Author

Addressed the last open review comment in b24cc48b6.

Change:

  • only advertise the resource-package swift run fallback when the Swift toolchain is actually present, matching the existing source-package fallback and preventing repeated runtime poll failures on end-user Macs without Swift installed
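A rough sketch of that gating, assuming the check is a simple `PATH` lookup for `swift`. The function names and the fallback labels `bundled-binary` and `swift-run` are illustrative only and do not come from the collector code:

```rust
use std::env;
use std::ffi::OsStr;
use std::path::PathBuf;

/// Looks for a regular file named `tool` in the directories listed in `path_var`.
fn find_on_path(tool: &str, path_var: &OsStr) -> Option<PathBuf> {
    env::split_paths(path_var)
        .map(|dir| dir.join(tool))
        .find(|candidate| candidate.is_file())
}

/// Hypothetical fallback list: `swift run` is only offered when `swift` resolves.
fn helper_fallbacks(path_var: &OsStr) -> Vec<&'static str> {
    let mut fallbacks = vec!["bundled-binary"];
    if find_on_path("swift", path_var).is_some() {
        fallbacks.push("swift-run");
    }
    fallbacks
}
```

In real use the caller would pass `std::env::var_os("PATH")`; taking the value as a parameter keeps the check testable without touching process state.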

Local validation:

  • cargo test --manifest-path apps/agent/src-tauri/Cargo.toml macos::collector::tests -- --nocapture
  • cargo clippy --manifest-path apps/agent/src-tauri/Cargo.toml --tests -- -D warnings

@bb-connor
Collaborator Author

@codex

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b24cc48b6b

@bb-connor
Collaborator Author

Addressed the new aggregation/health comments in 29c7f5805.

Changes:

  • stop poisoning combined install/approval state with the default unknown snapshot by only aggregating those fields when both ES and NE helpers reported
  • preserve fail-closed behavior for missing helper output by keeping aggregate install/approval at unknown unless both helpers agree
  • classify install_state: not_installed as degraded host health so missing protection does not look transient
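The aggregation rule described above can be sketched roughly as follows. This is a simplified illustration under the assumption that each helper contributes an optional sample; the enum and function names are hypothetical and the real collector types may differ:

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum InstallState {
    Installed,
    NotInstalled,
    Unknown,
}

/// Combines ES and NE samples. `None` means the helper produced no output.
fn combined_install_state(es: Option<InstallState>, ne: Option<InstallState>) -> InstallState {
    match (es, ne) {
        // Only agreeing reports from both helpers yield a definite state.
        (Some(a), Some(b)) if a == b => a,
        // Missing output or any disagreement (installed + unknown,
        // partial install) fails closed to unknown.
        _ => InstallState::Unknown,
    }
}
```

The same shape would apply to approval state: a single approved sample is never promoted on its own.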

Local validation:

  • cargo test --manifest-path apps/agent/src-tauri/Cargo.toml macos::collector::tests -- --nocapture
  • cargo test --manifest-path apps/agent/src-tauri/Cargo.toml macos_host_health_status -- --nocapture
  • cargo clippy --manifest-path apps/agent/src-tauri/Cargo.toml --tests -- -D warnings

@bb-connor
Collaborator Author

@codex

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 29c7f58058

@bb-connor
Collaborator Author

Pushed 5a5d90d28 to close the remaining sandbox review items.

Changes in this pass:

  • attestation now fails closed when supervision was requested but is inactive, even if a caller constructed the runtime state without degraded reasons
  • top-level provider_states duplication is removed; runtime.provider_states is now the single serialized source of truth
  • dropped_event_count is written back into the sandbox attestation before receipt finalization so overflowed runs degrade truthfully
  • the supervised receipt integration test now asserts the effective outer contract the runtime actually produced instead of assuming Linux is always fully active
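For readers following the fail-closed logic, here is a simplified sketch of how an effective enforcement level could be recomputed from runtime state. `EnforcementLevel::Kernel`, `EnforcementLevel::Degraded`, and `dropped_event_count` are named in this thread; the struct layout, the other field names, and the function signature are assumptions made for illustration:

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum EnforcementLevel {
    Kernel,
    Degraded,
}

/// Hypothetical runtime snapshot; the real attestation type carries more fields.
struct RuntimeState {
    supervision_requested: bool,
    supervision_active: bool,
    degraded_reasons: Vec<String>,
    dropped_event_count: u64,
}

/// Downgrades `base` whenever the runtime state shows a gap in enforcement.
fn effective_enforcement_level(base: EnforcementLevel, rt: &RuntimeState) -> EnforcementLevel {
    // Requested-but-inactive supervision degrades even with no recorded reasons.
    let supervision_gap = rt.supervision_requested && !rt.supervision_active;
    if supervision_gap || !rt.degraded_reasons.is_empty() || rt.dropped_event_count > 0 {
        EnforcementLevel::Degraded
    } else {
        base
    }
}
```

The point of the first condition is that a caller cannot accidentally report full enforcement just by forgetting to attach a degraded reason.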

Local validation:

  • CARGO_NET_OFFLINE=true scripts/cargo-offline.sh test -p hush-cli --test supervisor_tests
  • cargo test -p hush-cli --bin hush hush_run::tests -- --nocapture
  • cargo test -p clawdstrike sandbox::attestation::tests -- --nocapture
  • cargo clippy -p hush-cli --tests -- -D warnings
  • cargo clippy -p clawdstrike --tests -- -D warnings

@bb-connor
Collaborator Author

Addressed the latest review and CI issues in b21b417f1.

Changes in this pass:

  • added #[serde(default)] coverage for the new SupervisorStats counters plus explicit legacy compatibility tests in attestation.rs
  • tightened the macOS-only egress warning test so changed-file coverage stays platform-correct
  • fixed the hush integration test binary fallback so coverage and vendored runs use the real sibling target/.../hush binary, never the deps/hush-<hash> Rust test harness
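The binary fallback idea can be sketched like this, assuming the usual Cargo layout where test harnesses live in `target/<profile>/deps/` and the real binary sits one directory up. The helper name is hypothetical and the actual test code may resolve the path differently:

```rust
use std::path::{Path, PathBuf};

/// Given a test harness path such as `target/debug/deps/supervisor_tests-<hash>`,
/// returns the path of the sibling `hush` binary in the profile directory.
fn sibling_hush_binary(test_exe: &Path) -> Option<PathBuf> {
    let mut dir = test_exe.parent()?;
    // Test harnesses live in `deps/`; the real binary sits one level up.
    if dir.file_name().map_or(false, |name| name == "deps") {
        dir = dir.parent()?;
    }
    Some(dir.join("hush"))
}
```

A caller would typically feed this `std::env::current_exe()` and then confirm the returned path exists before spawning it.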

Local validation:

  • cargo test -p clawdstrike sandbox::attestation::tests -- --nocapture
  • cargo llvm-cov -p hush-cli --test supervisor_tests --no-report --all-features
  • CARGO_NET_OFFLINE=true scripts/cargo-offline.sh test -p hush-cli --test supervisor_tests
  • changed-file coverage gate: 80.17% via tools/scripts/check-changed-rust-coverage.py
  • cargo clippy -p hush-cli --tests -- -D warnings
  • git diff --check

@bb-connor
Collaborator Author

@codex

@bb-connor
Collaborator Author

Addressed the latest PR comments in f4bb7a84b.

This pass does two things:

  • macos_host_health_status() now treats provider inactive as pending, which keeps startup/activation states distinct from genuinely degraded states
  • added an attestation regression test that locks in static_mode(true, None) staying at EnforcementLevel::Kernel with the default macOS seatbelt provider state
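As a rough illustration of the pending versus degraded split, here is a hypothetical classifier. The real `macos_host_health_status()` signature is not shown in this thread, so the inputs, the enum, and in particular the handling of any state other than `installed` and `not_installed` are assumptions:

```rust
#[derive(Debug, PartialEq, Eq)]
enum HostHealth {
    Ok,
    Pending,
    Degraded,
}

/// Hypothetical classifier, not the actual `macos_host_health_status()` implementation.
fn classify_host_health(install_state: &str, provider_active: bool) -> HostHealth {
    match (install_state, provider_active) {
        // Missing protection is a real problem, not a transient startup state.
        ("not_installed", _) => HostHealth::Degraded,
        ("installed", true) => HostHealth::Ok,
        // Installed but not yet active: startup or activation is still in progress.
        ("installed", false) => HostHealth::Pending,
        // Assumption: anything else (for example "unknown") is treated as pending here.
        _ => HostHealth::Pending,
    }
}
```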

Local validation:

  • cargo test -p clawdstrike sandbox::attestation::tests -- --nocapture
  • cargo test --manifest-path apps/agent/src-tauri/Cargo.toml macos_host_health_status -- --nocapture
  • cargo clippy -p clawdstrike --tests -- -D warnings
  • cargo clippy --manifest-path apps/agent/src-tauri/Cargo.toml --tests -- -D warnings
  • git diff --check

The remaining two Bugbot threads were false positives/withdrawn; I resolved them after adding the attestation regression coverage and rechecking the current behavior.

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b21b417f1e

@bb-connor
Collaborator Author

Addressed the new review batch in f77db925d:

  • stopped policy event queue backpressure from degrading sandbox attestation and receipts
  • validated and enforced safe swarm namespaces before using them in orchestration paths
  • deduplicated NE degraded reasons on the policy_not_synced path
  • aligned build-script placeholder detection with the release workflow
  • removed the unused ProviderState constructors
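For the namespace item, the actual check presumably lives in the swarm shell scripts; the following is only a Rust sketch of the same idea, and the allowed character set and length limit are assumptions rather than the rules the PR enforces:

```rust
/// Accepts only short names made of ASCII letters, digits, `-`, and `_`.
/// Rejecting `/` and `.` outright rules out traversal values such as `../x`.
fn is_safe_namespace(ns: &str) -> bool {
    !ns.is_empty()
        && ns.len() <= 64
        && ns
            .bytes()
            .all(|b| b.is_ascii_alphanumeric() || b == b'-' || b == b'_')
}
```

An allowlist like this is usually easier to reason about than trying to blocklist `..` and absolute paths after the fact.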

Local validation:

  • cargo test -p clawdstrike sandbox::attestation::tests -- --nocapture
  • cargo test -p hush-cli --test supervisor_tests -- --nocapture
  • cargo test -p hush-cli --bin hush hush_run::tests -- --nocapture
  • swift test --package-path apps/agent/src-tauri/macos/system-extension/network-extension
  • cargo check --manifest-path apps/agent/src-tauri/Cargo.toml
  • cargo clippy -p clawdstrike --tests -- -D warnings
  • cargo clippy -p hush-cli --tests -- -D warnings
  • bash -n scripts/codex-swarm/common.sh
  • namespace smoke check for valid and traversal values
  • git diff --check

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

degraded_reasons: vec![
    "macos_authorization_contract_unavailable".to_string(),
    "supervised_launch_refused_without_live_authorization_provider".to_string(),
],

Supervised preflight uses macOS-specific reason on all platforms

Medium Severity

supervised_preflight_refused unconditionally pushes "macos_authorization_contract_unavailable" as a degraded reason on all platforms, unlike supervised_mode which uses cfg!(target_os = "macos") to select the platform-appropriate reason string. On Linux, this produces a misleading macOS-specific degraded reason instead of something like "supervised_interception_inactive".
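One possible shape for the fix, sketched under the assumption that the two reason strings quoted in the finding are the only platform variants. The function names here are hypothetical and not taken from `hush-cli`:

```rust
/// Picks the platform-appropriate "supervision unavailable" reason.
fn supervision_unavailable_reason(is_macos: bool) -> &'static str {
    if is_macos {
        "macos_authorization_contract_unavailable"
    } else {
        "supervised_interception_inactive"
    }
}

/// Hypothetical replacement for the hard-coded vector in the preflight path.
fn preflight_refused_reasons() -> Vec<String> {
    vec![
        // `cfg!` is evaluated at compile time for the build target.
        supervision_unavailable_reason(cfg!(target_os = "macos")).to_string(),
        "supervised_launch_refused_without_live_authorization_provider".to_string(),
    ]
}
```

Splitting the selection into a function that takes a plain `bool` lets both branches be unit-tested on any CI platform.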


@bb-connor
Collaborator Author

@codex
