feat(v1.99 PR-18): update apply orchestration (G3-U5..U10) — CONTRACT-ENFORCED#475
Merged
feat(v1.99 PR-18): update apply orchestration (G3-U5..U10) — CONTRACT-ENFORCED#475
Conversation
This commit establishes the PR-18 contract BEFORE any implementation code
lands. Per user directive, PR-18 is judged by call-path purity, not by
"it works in happy path."
Pinned sentence (repeated in PR body + contract file + package doc):
"PR-18 is orchestration-only: update apply may invoke the existing
rebuild/lifecycle authority, but may not implement any independent
apply, mutation, recovery, validation, or authority-taking behavior."
Contents:
- Call graph (runUpdateApply → firewall_rebuild → validator → recovery)
- Canonical entry points PR-18 orchestrates (never reimplements)
- 8 forbidden patterns (automatic NO-GO)
- Explicit stop condition: new apply/mutation/recovery/authority
logic → STOP and split to new PR
No code, no behaviour change. This is the contract layer; step 1
(tests/proof skeleton) lands in the next commit, step 2 (minimal
orchestration wiring) after that.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Contributor
Dependency Review✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.Scanned FilesNone |
…purity)
Step 1 of PR-18 per user's implementation ordering. These tests land
BEFORE runUpdateApply itself so the invariants are enforced from the
first implementation commit onward.
Contract surface:
- applyWhitelist: the complete set of commands runUpdateApply may invoke.
Preflight probes (read-only) + canonical rebuild entry
(`nftban firewall rebuild`) + validator gate (`nftban-validate --json`)
+ read-only post-state inspection (nft list, systemctl is-active).
Adding to this list requires a corresponding apply_contract.md update.
- applyForbiddenSubstrings: categories that must never appear in
recorded commands — direct kernel mutation (nft add/flush/delete),
service lifecycle (systemctl stop/disable/mask), package removal,
external firewall touches (ufw, iptables).
- applyForbiddenWritePaths: file-system destinations apply may never
write — /etc/nftban/**, /usr/lib/nftban/**, /usr/sbin/nftban*.
Plus explicit G3-U5 rule: .conf.local byte-preservation.
Harness functions:
- auditRecordedCommands(cmds) — flags whitelist violations + forbidden
substrings. Returns list of violation strings (empty = clean).
- auditWrittenFiles(paths) — flags write-path violations + .conf.local
touches.
Self-tests:
- happy-path command trace produces zero violations
- direct kernel mutation (nft add table) detected
- unknown commands (curl) rejected
- .conf.local write detected (G3-U5)
- /etc/nftban write detected
- /usr/sbin/nftban write detected
When step 2 lands runUpdateApply, its MockExecutor trace will be piped
through auditRecordedCommands + auditWrittenFiles to prove INV-U-001/002/003
mechanically, not just by reading the code.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
runUpdateApply is a thin sequencer over the canonical rebuild +
validator entrypoints. No custom apply, no retry, no rollback, no
recovery, no authority decisions. Validator is the truth gate —
rebuild success + validator failure = apply failure.
Files:
- cmd/nftban-installer/update_apply.go (NEW) — runUpdateApply + read-only
postStateInspection + truncate helper. Exit-code contract explicit:
each phase's failure propagates without merging so monitoring can
distinguish rebuild-fail from validator-fail by exit + log phase.
- cmd/nftban-installer/update_apply_test.go (NEW) — T1 happy path,
T2 preflight-fail blocks rebuild, T3 rebuild-fail blocks validator,
T4 G3-U8 truth gate (validator fail overrides rebuild success),
T5 call-path purity on ALL branches, T6 G3-U5 .conf.local untouched.
- cmd/nftban-installer/main.go — narrow dispatch: --mode=upgrade
without --rpm/--deb routes to runUpdateApply. Package postinst
paths (packaging/deb/postinst, RPM spec) always pass --rpm/--deb,
so they continue through runInstall as today.
- internal/installer/update/apply_contract.go (NEW) — exported
ApplyWhitelist + ApplyForbidden* + AuditRecordedCommands +
AuditWrittenFiles. The main-package tests pipe their MockExecutor
traces through these so the exact same rules apply to both
surfaces.
- internal/installer/update/apply_contract_test.go — rewritten as
self-tests for the exported harness.
Canonical entry points invoked (whitelist-enforced):
- update.Preflight (read-only, PR-16/PR-17)
- "nftban firewall rebuild" (mutation path — delegates to v1.96
firewall_rebuild in cli/lib/nftban/cli/cmd_firewall.sh:1070)
- "nftban-validate --json" (truth gate)
- exec.NftTableExists (read-only kernel inspection)
- exec.ServiceActive (read-only service inspection)
Forbidden by construction (AuditRecordedCommands / AuditWrittenFiles):
- direct kernel mutation (nft add/flush/delete)
- service lifecycle (systemctl stop/disable/mask)
- package removal (apt-get/dnf remove)
- external firewall touches (ufw, iptables)
- writes to /etc/nftban/**, /usr/lib/nftban/**, /usr/sbin/nftban*
- any write to *.conf.local (G3-U5)
INV-U-001/002/003 enforced mechanically — not just by reading code.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Step 3 (recovery/validator proofs):
- cmd/nftban-installer/update_apply_test.go:
- T7 TestUpdateApply_RebuildFail_NoRetryNoRecovery — rebuild is
invoked EXACTLY once; no recovery-flavored command appears in
the trace after rebuild fails (firewall_rebuild owns retry per
v1.96; apply must NOT add another retry layer)
- T8 TestUpdateApply_DoesNotReinterpretValidatorOutput — when
validator exits 2 but returns JSON body with status="protected",
apply must honour the EXIT CODE and fail. Locks the truth-gate
discipline against the common regression class where a later
change "helpfully" inspects the JSON body to override the exit
Step 4 (CI gates G3-U5..U10):
- .github/workflows/ci-update-canonization.yml:
- New step "structural call-path audit of update_apply.go" — greps
the runUpdateApply source for 13 forbidden patterns (direct nft
mutation, service lifecycle, package removal, external firewall
touch, /etc/nftban/** writes, /usr/lib/nftban/** writes,
.conf.local touches). Fails fast BEFORE runtime tests, so a
reviewer sees the violation in the first CI failure.
- New step "unit tests for runUpdateApply call-path purity" — runs
go test ./internal/installer/update/... ./cmd/nftban-installer/...
which exercises every runUpdateApply trace through
AuditRecordedCommands + AuditWrittenFiles under happy,
rebuild-fail, validator-fail, no-retry, no-reinterpretation,
and conf.local byte-preservation branches.
Together these close G3-U5 (.conf.local byte-preservation), G3-U6
(base-config safety via forbidden-write-path audit), G3-U7 (rebuild
recovery integration — delegation, not duplication), G3-U8
(post-update validator gate — truth discipline), G3-U9 (service-state
convergence via postStateInspection read-only checks), G3-U10
(authority safety via forbidden systemctl/ufw/iptables patterns).
Every sub-gate is mechanical: if apply ever grows a direct mutation
surface, CI catches it in the structural grep before the unit tests
even run.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Code-review response to PR #475. All changes are local to PR-18; no new enum, no taxonomy redesign, no scope drift. Blocker #1 — state/exit contradiction on validator failure: Validator-fail branch previously hard-coded sf.Transition(StateDegraded) regardless of validator's actual exit code. A validator exit 2 (DOWN) was persisted as DEGRADED, so sf.State.ExitCode() returned 1 while the process exit was 2 — silent truth split. Fix: new local helper stateForValidatorExit(rc) in update_apply.go: rc == 1 → StateDegraded (post-state valid-enough for degraded) rc >= 2 → StateFailedRebuild (stronger failure not collapsed) Depends ONLY on validator's process exit code. No JSON parsing. No new InstallState enum value. State.ExitCode() now matches the returned process exit by construction. Tests: T9 validator rc=1 → StateDegraded, state↔exit both = 1 T10 validator rc=2 → StateFailedRebuild, state↔exit both = 2 T11 stateForValidatorExit exhaustive mapping (rc 1/2/3/127) Blocker #2 — contract audit missed preflight-fail branch: T5 TestUpdateApply_CallPathPurity_AllBranches only covered happy/rebuild-fail/validator-fail. A non-whitelisted command or forbidden write slipping into the preflight-fail path was unaudited. Fix: add {"preflight-fail", delete ip:nftban mock} to T5's branches. Now every runUpdateApply exit path pipes through AuditRecordedCommands + AuditWrittenFiles. Minor #1 — post-state inspection success-path-only documented: update_apply.go header now explicitly states that step 4 (postStateInspection) runs on success path only and that earlier failure branches return before it. Closes the contract/docs drift the reviewer flagged. Minor #2 — truncate byte/rune mismatch: Previous implementation used len(s) + s[:n] on a string, risking cutting multi-byte UTF-8 characters mid-codepoint. Now converts to []rune before slicing. Minor #3 — structural guardrail broadened: ApplyForbiddenSubstrings + CI grep previously banned only systemctl stop/disable/mask. Now bans the full service-lifecycle surface: stop/start/restart/reload/enable/disable/mask/unmask. Future-proofs against "helpful" service manipulation creep even if the specific verb changes. Test-infra: tests now use t.TempDir() instead of /var/lib/nftban/state for sf.StateDir so Transition's WriteAtomic doesn't fail under unprivileged CI runners. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
0. Executive Summary
This PR introduces the update apply phase as a strictly orchestration-only layer that delegates all state mutation, validation, recovery, and authority enforcement to the existing rebuild/lifecycle authority.
No new apply logic is introduced.
This PR enforces that:
1. Core Contract (Hard Gate)
Invariant:
Explicitly:
Apply MUST NOT implement its own:
Apply MUST:
2. Enforced Invariants
Violation of any invariant = automatic NO-GO
3. Scope (G3-U5..U10)
G3-U5 —
.conf.localByte Preservation.conf.localG3-U6 — Base Config Safety
G3-U7 — Rebuild Recovery Integration (INV-U-002)
G3-U8 — Post-Update Validator Gate
nftban-validateMUST run; invalid state MUST block success; recovery MUST be triggered on failureG3-U9 — Service-State Convergence
G3-U10 — Authority Safety (INV-U-003)
4. Non-Goals (Strictly Out of Scope)
If any of the above becomes necessary → STOP and split into new design PR
5. Call Graph (Critical Proof)
There are NO parallel or side paths.
Concrete binding (today):
runUpdateApply→exec.Run("nftban", "firewall", "rebuild")→firewall_rebuild(cli/lib/nftban/cli/cmd_firewall.sh:1070)→ exec.Run("nftban-validate", "--json")→ validator gate (cmd/nftban-validate)firewall_rebuild's own_rebuild_*family (nftban_rebuild_recovery.sh) — PR-18 adds NO new recovery code6. Execution Flow
update.Preflightnftban firewall rebuildnftban-validate --json7. Merge Evidence Requirements (Hard Checklist)
PR cannot merge without ALL of the following:
A. Structural Proof
B.
.conf.localPreservation (G3-U5)C. Base Config Safety (G3-U6)
D. Recovery Integrity (G3-U7)
E. Validator Gate (G3-U8)
F. Service Convergence (G3-U9)
G. Authority Safety (G3-U10)
8. CI Gates (Blocking)
.conf.localpreservation testAll must be green and enforced (blocking)
9. Failure Model
10. Risk Assessment
11. Definition of Done
PR-18 is DONE only if:
12. Final Statement
This PR establishes that:
Any deviation from this model invalidates the PR.
13. Stop Condition
If at any point implementation requires:
→ STOP PR-18 immediately and escalate to design/audit.
Implementation plan (commit-by-commit)
internal/installer/update/apply_contract.md+ call-path audit test (this PR's seed commit already landed)runUpdateApplythat invokesnftban firewall rebuild+nftban-validateand nothing elseStatus: Draft — standard v1.99 pattern, un-drafts after CI green. Each commit keeps the PR diff auditable.
🤖 Generated with Claude Code