Commit 9f148bb

itcmsgr and claude authored
feat(v1.100 PR-26): code-D post-restore evidence record (§39.3 / §48.6 lock) (#517)
* feat(v1.100 PR-26-code-D): post-restore evidence record (§39.3 / §48.6 lock)

PR-26-code-D: restore verification / evidence hardening, slice D.

Adds the structured post-restore evidence-record writer per §39.3 +
§48.6 operator lock. Recording-only: does NOT re-run PR-24 decisions,
rebuild TargetAuthority, or add validator/module-health probes
(operator design call).

Authority:
- PR #512 / contract.md Part IV §§37-50
- PR #513 / §51 lock record
- PR #514 / code-A merge 4e98ff5
- PR #515 / code-B merge 45fc63e
- PR #516 / code-C merge 6d8386d
- §39 Q1 BLOCKING evidence rows
- §39.3 evidence-record file requirement
- §46 CI gate requirements
- §48.6 (operator-locked at this commit's open):
  - path: /var/lib/nftban/state/restore-evidence/
  - filename: restore-evidence-<UTC-RFC3339-basic>-<short-random>.json
  - schema: 1.0.0
  - writer helper: writeRestoreEvidenceRecord(ctx, exec, record)
  - path constant: restoreEvidenceDir
- §51.5-A2 (read-only typed introspection outside mutation cap)

Files added (2):

cmd/nftban-installer/restore_evidence.go
- Constants:
    restoreEvidenceSchemaVersion  = "1.0.0"
    restoreEvidenceDir            = "/var/lib/nftban/state/restore-evidence"
    restoreEvidenceFilenamePrefix = "restore-evidence-"
    restoreEvidenceMode           = 0o640
    restoreEvidenceDirMode        = 0o750
- Schema types: RestoreEvidenceRecord (schema_version, timestamp_utc,
  mode, phase, target, result, verification, history_gate, warnings)
  + the 4 nested structs.
- Sentinels: ErrEvidenceWriteFailed, ErrEvidenceNilExecutor,
  ErrEvidenceNilRecord.
- writeRestoreEvidenceRecord: the SINGLE helper. MkdirAll, marshal,
  WriteFileAtomic. Filename: prefix + UTC RFC3339-basic stamp + "-" +
  8-hex random suffix + ".json".
- buildRestoreEvidenceRecord: recording-only assembler. Sources:
  target.Kind/FirewallType/Panel, execRes.Terminal/Stage/VerifyResult,
  exec.NftTableExists for emergency + nftban tables,
  detect.SSHPortWithSource. No re-derivation; no Probe / Decide /
  DetectPanel calls.
- evidenceShortRandom: crypto/rand-backed 8-hex suffix to avoid
  same-second filename collisions.

cmd/nftban-installer/restore_evidence_test.go
- 10 tests:
   1. WriteRestoreEvidence_HappyPath: filename pattern + single write
   2. WriteRestoreEvidence_RoundTripsJSON: schema_version + mode +
      phase + history_gate flags
   3. WriteRestoreEvidence_NilExecutor: defensive guard
   4. WriteRestoreEvidence_NilRecord: defensive guard
   5. WriteRestoreEvidence_OnlyHelperWritesUnderEvidenceDir_FileScan:
      single-WriteFileAtomic invariant
   6. WriteRestoreEvidence_NoForbiddenSurfaces_FileScan:
      recording-only invariant pin
   7. BuildRestoreEvidenceRecord_RecordedPriorHappy: full happy path
      with ss-listener SSH port resolution
   8. BuildRestoreEvidenceRecord_NftbanTablesPresent_Recorded:
      post-mutation kernel observation
   9. BuildRestoreEvidenceRecord_AuthorityClassDivergenceWarning:
      ObservedAuthority diverging from AuthorityExternal surfaces in
      warnings
  10. RestoreEvidenceConstants_LockPin: §48.6 path/version/prefix
      pinned exactly

Files modified (4):

internal/installer/detect/ssh.go
- Added detect.SSHPortWithSource (read-only). Same 4-source priority
  chain as detect.SSHPort but also returns the source name (ss /
  sshd_config / state / config), required by the §48.6 schema's
  ssh_port_source enum. Per §51.5-A2 outside the mutation cap.

cmd/nftban-installer/restore_decide.go
- runRestoreExecutionFromProceed gains a Step D (between Execute and
  Transition):
    1. buildRestoreEvidenceRecord(exec, log, target, execRes)
    2. writeRestoreEvidenceRecord(ctx, exec, rec, log)
- §48.6 downgrade rule: if evidence-write fails AFTER a successful
  StateRestoreExecuted, downgrade to StateRestoreDegraded
  (state/machine.go:152 already supports this terminal). The state
  model supports the downgrade; no contract amendment needed.
- Operator-facing log line on Degraded now includes the
  evidence-write failure reason.
- No state-machine / exit-code / history-gate change. main.go:132
  mode-gate untouched.

cmd/nftban-installer/restore_decide_test.go
- TestRunRestoreExecutionFromProceed_FakeDeps_HappyPath_PersistsExecuted
  + 4 other dispatcher tests updated: pass executor.NewMockExecutor()
  instead of nil so the new evidence-write step succeeds and the
  terminal stays at StateRestoreExecuted (fake happy path). The 3
  tests that pass nil exec via _ = runRestoreExecutionFromProceed do
  not assert on sf.State so they still pass under the downgrade.

.github/workflows/ci-restore-canonization.yml
- New gate G4-RESTORE-EVIDENCE-RECORD (§46). Structural: pins the
  named-constant + single-helper invariant:
  * restore_evidence.go declares restoreEvidenceDir,
    restoreEvidenceSchemaVersion, restoreEvidenceFilenamePrefix
    verbatim + locked values
  * restore_evidence.go declares writeRestoreEvidenceRecord +
    buildRestoreEvidenceRecord + RestoreEvidenceRecord struct
  * exactly ONE WriteFileAtomic call in restore_evidence.go (the
    single-helper invariant, locked by §48.6)
  * forbidden-symbol scan: restore.Decide / restore.PlanFromDecision /
    uninstall.Probe / detect.DetectPanel / writeHistory /
    update-history.json / mutation primitives / direct OS bypass
    (recording-only invariant)
  * dispatcher (restore_decide.go) calls BOTH
    writeRestoreEvidenceRecord AND buildRestoreEvidenceRecord (proves
    evidence is consumed, not just imported)
- §46.1 line-skipping discipline applied (production-code-only,
  comment-stripped).

Recording-only invariant (operator design call) honored:
- No restore.Decide / restore.PlanFromDecision calls
- No uninstall.Probe call
- No detect.DetectPanel call (only detect.SSHPortWithSource,
  read-only typed introspection)
- No validator full-sweep / module-health probe
- No update-history.json write (§19.2 layer 4 / main.go:132 retained)
- No new mutation primitive

Constraints honored (per operator scope):

IN:
- evidence record type + schema ✓ (§48.6 lock)
- evidence writer helper ✓ (single helper writeRestoreEvidenceRecord)
- production write after restore execution path ✓ (dispatcher Step D)
- structural CI gate G4-RESTORE-EVIDENCE-RECORD ✓
- tests proving all writes stay under restoreEvidenceDir ✓
- tests proving update-history is untouched ✓ (HistoryGate flags + no
  writeHistory references in evidence module)

OUT:
- destructive soak (PR-26-code-E)
- A.4 cron changes (already shipped in code-C)
- executor new mutation methods (Stat is read-only, shipped in code-C)
- iptables introspection (Option B lock)
- main.go history gate changes (untouched)
- state/exit-code changes: only the existing StateRestoreDegraded is
  consumed, no new state added
- repo hygiene / UX / GOTH / metrics / module cleanup

Verified on lab2 (Ubuntu 24.04, go1.22.2):
- go build ./... clean
- go test ./... PASS (full repo, 64 packages)
- go test -race -count=1 cmd + restore + state + switchop + detect PASS
- go vet ./... clean
- go mod tidy no-op
- 10 new TestWriteRestoreEvidence_* / TestBuildRestoreEvidenceRecord_* /
  TestRestoreEvidenceConstants_LockPin tests all PASS
- existing 5 dispatcher fake-deps tests updated + still PASS
- All 3 G4 gates (NO-OUT-OF-TARGET / CRON-MANIFEST-INTEGRITY /
  EVIDENCE-RECORD) local replay: FAIL=0

Awaiting auditor pass before push.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(v1.100 PR-26-code-D): add 5 dispatcher-level evidence-failure semantics tests (auditor checkpoint)

Auditor focused-audit on 849b372 flagged that PR-26-code-D's Step D
introduces a real operator-visible terminal transition:

    StateRestoreExecuted + evidence write failure → StateRestoreDegraded

The 10 unit tests already covered the writer + builder + recording
invariants but did NOT pin the dispatcher-level downgrade semantics.
This commit adds 5 dispatcher-level tests to close that gap.

Tests added:

cmd/nftban-installer/restore_decide_test.go

1. PR26D_ExecutedPlusEvidenceFail_DowngradesToDegraded
   fake deps return StateRestoreExecuted; writeFailExec wrapper forces
   evidence WriteFileAtomic to fail. Asserts:
   - sf.State == StateRestoreDegraded (downgrade fires)
   - exit code == StateRestoreDegraded.ExitCode()
   - sf.State != StateRestoreExecuted (no false claim)
   Note: sf.FailureReason stays empty by design (Transition only
   populates FailureReason on .IsFailed() states; Degraded is
   success-with-warnings). The downgrade reason surfaces via
   log.Result, which is the authoritative operator channel for
   Degraded outcomes.

2. PR26D_FailedExecutionPlusEvidenceFail_TerminalPreserved
   fake.mutateErr forces FailedExecution; writeFailExec forces
   evidence-write failure. Asserts:
   - sf.State == StateRestoreFailedExecution (terminal preserved)
   - exit == StateRestoreFailedExecution.ExitCode()
   Evidence failure is warning-only on non-Executed terminals.

3. PR26D_FailedVerificationPlusEvidenceFail_TerminalPreserved
   fake.activeRet=false forces inline-verify SafeToRemove=false →
   FailedVerification; writeFailExec forces evidence-write fail.
   Asserts terminal + exit code unchanged from FailedVerification.

4. PR26D_ExecutedPlusEvidenceOk_PreservesExecuted
   Plain MockExecutor (writes succeed). Asserts:
   - sf.State == StateRestoreExecuted (no downgrade on clean write)
   - exit == StateRestoreExecuted.ExitCode()
   - exactly one file written under restoreEvidenceDir
   - no writes outside restoreEvidenceDir

5. PR26D_NoUpdateHistoryWrite_FileScan
   File-scan against restore_decide.go. Strips line-leading // per
   §46.1; asserts no production-code reference to writeHistory( or
   update-history.json. Pins the §19.2 layer-4 invariant stays
   untouched after PR-26-code-D adds Step D.

writeFailExec wrapper (test-only):
Wraps *executor.MockExecutor and overrides only WriteFileAtomic to
fail. Avoids changing the production MockExecutor; uses the same
composition pattern as flakyCSFActiveExec (introduced in PR-25
4B-3-csf for analogous test purposes).

Verified on lab2 (Ubuntu 24.04, go1.22.2):
- go build ./... clean
- go test ./cmd/nftban-installer/... PASS
- 5 new TestRunRestoreExecutionFromProceed_PR26D_* /
  TestDispatcher_PR26D_* tests all PASS
- go test -race -count=1 cmd + restore + state PASS
- existing PR-25 + PR-26-code-A/B/C tests still PASS

No production code change. No CI workflow change. No contract
amendment needed. Restore semantics from §48.6 lock + §19.2 layer-4
invariant are both now structurally pinned by tests.

Awaiting auditor sign-off + push signal.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 6d8386d commit 9f148bb

6 files changed

Lines changed: 1101 additions & 11 deletions

File tree

.github/workflows/ci-restore-canonization.yml

Lines changed: 124 additions & 0 deletions
@@ -549,6 +549,130 @@ jobs:
 
           echo "G4-RESTORE-CRON-MANIFEST-INTEGRITY PASS — writer + reader structurally consume the shared sha256 + manifest API"
 
+      # ------------------------------------------------------------------
+      # G4-RESTORE-EVIDENCE-RECORD (PR-26-code-D / §46) — structural
+      # pin on the post-restore evidence-record writer.
+      #
+      # The §39.3 + §48.6 lock requires that ALL evidence-record file
+      # writes route through a SINGLE helper using a NAMED CONSTANT
+      # for the destination directory. This gate structurally
+      # enforces:
+      #
+      #   - cmd/nftban-installer/restore_evidence.go declares the
+      #     restoreEvidenceDir constant verbatim and sets its value
+      #     to /var/lib/nftban/state/restore-evidence
+      #   - the file declares the writeRestoreEvidenceRecord helper
+      #     and the schema_version constant set to "1.0.0"
+      #   - the file contains exactly ONE WriteFileAtomic call (the
+      #     call inside the single helper)
+      #   - the file does NOT reference update-history.json,
+      #     uninstall.Probe, restore.Decide, restore.PlanFromDecision,
+      #     detect.DetectPanel, or any mutation primitive
+      #     (recording-only invariant per §39.3)
+      #   - the dispatcher (cmd/nftban-installer/restore_decide.go)
+      #     calls writeRestoreEvidenceRecord — i.e. evidence is
+      #     actually consumed, not just imported
+      #
+      # §46.1 discipline: production-code-only, comment-stripped.
+      # ------------------------------------------------------------------
+      - name: G4-RESTORE-EVIDENCE-RECORD — single-helper structural pin
+        shell: bash
+        run: |
+          set -Eeuo pipefail
+
+          ev=cmd/nftban-installer/restore_evidence.go
+          dispatcher=cmd/nftban-installer/restore_decide.go
+
+          for f in "$ev" "$dispatcher"; do
+            if [[ ! -f "$f" ]]; then
+              echo "::error::G4-RESTORE-EVIDENCE-RECORD: $f not found"
+              exit 1
+            fi
+          done
+
+          ev_src=$(grep -vE '^[[:space:]]*//' "$ev" || true)
+          disp_src=$(grep -vE '^[[:space:]]*//' "$dispatcher" || true)
+
+          fail=0
+
+          # ---- Required symbols in the evidence module ---------------
+          ev_required=(
+            'restoreEvidenceDir[[:space:]]+=[[:space:]]+"/var/lib/nftban/state/restore-evidence"'
+            'restoreEvidenceSchemaVersion[[:space:]]+=[[:space:]]+"1\.0\.0"'
+            'restoreEvidenceFilenamePrefix[[:space:]]+=[[:space:]]+"restore-evidence-"'
+            'func writeRestoreEvidenceRecord\('
+            'func buildRestoreEvidenceRecord\('
+            'type RestoreEvidenceRecord struct'
+          )
+          for pat in "${ev_required[@]}"; do
+            if ! echo "$ev_src" | grep -qE "$pat"; then
+              echo "::error::G4-RESTORE-EVIDENCE-RECORD: $ev missing required symbol matching '$pat'"
+              fail=1
+            fi
+          done
+
+          # ---- Single-helper invariant -------------------------------
+          # Exactly one WriteFileAtomic call in the evidence module —
+          # the one inside writeRestoreEvidenceRecord. Anything else
+          # is a violation of the §46 single-helper lock.
+          wfa_count=$(echo "$ev_src" | grep -cE '\bWriteFileAtomic\(' || true)
+          if [[ "$wfa_count" -ne 1 ]]; then
+            echo "::error::G4-RESTORE-EVIDENCE-RECORD: $ev contains $wfa_count WriteFileAtomic calls; want exactly 1 (single-helper invariant)"
+            fail=1
+          fi
+
+          # ---- Forbidden recording-only-violation symbols -----------
+          ev_forbidden=(
+            'restore\.Decide\('
+            'restore\.PlanFromDecision\('
+            'uninstall\.Probe\('
+            'detect\.DetectPanel\('
+            'writeHistory\('
+            'update-history\.json'
+            '\bexec\.ServiceStart\('
+            '\bexec\.ServiceStop\('
+            '\bexec\.ServiceEnable\('
+            '\bexec\.ServiceDisable\('
+            '\bexec\.ServiceMask\('
+            '\bexec\.ServiceUnmask\('
+            '\bexec\.NftDeleteTable\('
+            '\bexec\.NftAddElement\('
+            '\bexec\.DaemonReload\('
+            '\bexec\.Rename\('
+            '"os/exec"'
+            '\bexec\.Command\('
+            '\bos\.Rename\('
+            '\bos\.Remove\('
+            '\bos\.WriteFile\('
+            '\bos\.Create\('
+            '\bsyscall\.'
+          )
+          for pat in "${ev_forbidden[@]}"; do
+            if echo "$ev_src" | grep -qE "$pat"; then
+              echo "::error::G4-RESTORE-EVIDENCE-RECORD: $ev contains forbidden pattern '$pat' (recording-only invariant)"
+              fail=1
+            fi
+          done
+
+          # ---- Dispatcher consumption pin ----------------------------
+          # The dispatcher MUST call writeRestoreEvidenceRecord — proves
+          # evidence is consumed, not just imported.
+          if ! echo "$disp_src" | grep -qE '\bwriteRestoreEvidenceRecord\('; then
+            echo "::error::G4-RESTORE-EVIDENCE-RECORD: $dispatcher does not call writeRestoreEvidenceRecord"
+            fail=1
+          fi
+          if ! echo "$disp_src" | grep -qE '\bbuildRestoreEvidenceRecord\('; then
+            echo "::error::G4-RESTORE-EVIDENCE-RECORD: $dispatcher does not call buildRestoreEvidenceRecord"
+            fail=1
+          fi
+
+          if [[ "$fail" -ne 0 ]]; then
+            echo "::error::§39.3 / §46 / §48.6 violation — restore evidence-record writer not structurally enforced."
+            exit 1
+          fi
+
+          echo "G4-RESTORE-EVIDENCE-RECORD PASS — writer + dispatcher structurally consume the single evidence helper"
+
   restore-canonization-summary:
     name: Restore Canonization summary
     runs-on: ubuntu-24.04

cmd/nftban-installer/restore_decide.go

Lines changed: 31 additions & 6 deletions
@@ -300,19 +300,44 @@ func runRestoreExecutionFromProceed(
 	if execRes.Err != nil {
 		reason = execRes.Err.Error()
 	}
-	_ = sf.Transition(execRes.Terminal, state.PhaseDetect, reason)
 
-	// Step D — Operator-facing output reflects the executed terminal.
-	switch execRes.Terminal {
+	// Step D — PR-26-code-D evidence record. Recording-only; no
+	// re-derivation of TargetAuthority, no PR-24 re-decision, no
+	// validator/module-health probe, no update-history write
+	// (§19.2 layer 4 / main.go:132 mode-gate retained).
+	//
+	// The §48.6 lock requires that, if evidence-write fails AFTER a
+	// successful StateRestoreExecuted, we MUST NOT claim "executed"
+	// without recording the evidence. The state model already
+	// supports StateRestoreDegraded (state/machine.go:152) so we
+	// downgrade the terminal in that case.
+	finalTerminal := execRes.Terminal
+	finalReason := reason
+	rec := buildRestoreEvidenceRecord(exec, log, target, execRes)
+	evidenceErr := writeRestoreEvidenceRecord(ctx, exec, rec, log)
+	if evidenceErr != nil {
+		log.Warn("restore evidence: write failed: %v", evidenceErr)
+		if execRes.Terminal == state.StateRestoreExecuted {
+			finalTerminal = state.StateRestoreDegraded
+			finalReason = "restore executed but evidence-write failed: " + evidenceErr.Error()
+			log.Warn("restore evidence: downgrading StateRestoreExecuted -> StateRestoreDegraded — successful restore cannot claim executed without recorded evidence")
+		}
+	}
+
+	_ = sf.Transition(finalTerminal, state.PhaseDetect, finalReason)
+
+	// Step E — Operator-facing output reflects the (possibly
+	// downgraded) terminal.
+	switch finalTerminal {
 	case state.StateRestoreExecuted:
 		log.Result("[NFTBan] restore execution: COMPLETED — authorized restore is in effect")
 	case state.StateRestoreDegraded:
-		log.Result("[NFTBan] restore execution: COMPLETED with warnings — review inline-verify result")
+		log.Result("[NFTBan] restore execution: COMPLETED with warnings — %s", finalReason)
 	case state.StateRestoreFailedExecution:
-		log.Result("[NFTBan] restore execution: FAILED at %s — %s", execRes.Stage, reason)
+		log.Result("[NFTBan] restore execution: FAILED at %s — %s", execRes.Stage, finalReason)
 	case state.StateRestoreFailedVerification:
 		log.Result("[NFTBan] restore execution: FAILED VERIFICATION at %s — safety net retained — %s",
-			execRes.Stage, reason)
+			execRes.Stage, finalReason)
 	}
 
 	return sf.State.ExitCode()
