Commit 6b2d859

itcmsgr and claude committed
feat(v1.100 PR-P2-3): kernel + service snapshot CI gate (G3-KS-SNAPSHOT)
Pre-PR-23 assurance blocker #3 of 5 remaining. Adds system-level before/after truth checks to every dry-run CI path so a future regression that mutates nftables tables or firewall service states without touching tracked files is caught at gate time.

## Scope (locked per authorization 2026-04-20)

- Before/after `nft list tables` sorted diff: hard-assert equal
- Before/after `systemctl is-active` for the 6 lifecycle-relevant units (nftband + 5 external firewalls): hard-assert equal
- No `|| true` on the diff checks
- Covered paths: install dry-run refusal, update dry-run, uninstall dry-run (explicit + implicit)

## Implementation

- NEW: scripts/ci-snapshot-kernel-service.sh, a reusable helper that emits a stable, sorted snapshot. Degrades gracefully (both sides return the same placeholder) when nft or systemctl aren't available (e.g. almalinux-9 container without systemd). Contract is:
  * purely read-only probes
  * never invokes nft/systemctl with mutation verbs
  * never writes to the filesystem
  * exit 0 always; caller decides whether differences fail
- EXTENDED: all 3 canonization workflows
  * ci-install-canonization.yml / G3-IN-REFUSE-DRY-RUN
  * ci-update-canonization.yml / G3-U3
  * ci-uninstall-canonization.yml / G3-UN-PLAN-RENDERS

  Each takes a snapshot before the dry-run invocation and hard-asserts byte-identical equality after.

## Monitored units (must match extfw.Detect's signal set)

    nftband.service
    ufw.service
    firewalld.service
    csf.service
    lfd.service
    iptables.service

Kept in lockstep with internal/installer/extfw/detect.go so CI and production code agree on "what counts as a firewall service."

## Non-goals (scope-lock)

- NO code-path redesign
- NO strace/exec tracing yet (deferred to PR-P2-4)
- NO mutation behavior changes
- NO new firewall-unit additions to the signal set

## Also: tracking update

Marks blocker #2 (external-firewall detection unification, PR #486 / 49d98fc) as LANDED in the contract blocker table. Remaining: 4 Phase 2 PRs before PR-23.

Refs: internal/installer/uninstall/contract.md §"Pre-PR-23 blockers"
Authorization: locked Phase 2 sequencing (2026-04-20)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
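
The gate pattern the message describes (snapshot, run the dry-run, snapshot again, hard-fail on any difference) can be sketched in isolation. This is only an illustration under stated assumptions: `STATE_FILE`, `snapshot`, `clean_dry_run`, `leaky_dry_run`, and `gate` are hypothetical stand-ins, not names from the repository, and the scratch file replaces the real kernel and service probes so the sketch runs anywhere.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical stand-in for the state the real helper probes
# (nft tables, unit states): a scratch file we can mutate on purpose.
STATE_FILE=$(mktemp)
printf '%s\n' 'table inet nftban' 'table ip filter' > "$STATE_FILE"

# Stand-in for scripts/ci-snapshot-kernel-service.sh: sorted, read-only.
snapshot() { sort "$STATE_FILE"; }

# Two hypothetical dry-run commands: one honest, one that leaks a mutation.
clean_dry_run() { :; }
leaky_dry_run() { echo 'table ip leaked' >> "$STATE_FILE"; }

# gate CMD...: run CMD between two snapshots; return 1 if they differ.
gate() {
    before=$(snapshot)
    "$@"
    after=$(snapshot)
    if [ "$before" != "$after" ]; then
        echo "FAIL: state changed during: $*" >&2
        # Diagnostic only; the comparison above is the hard assertion.
        printf 'before:\n%s\nafter:\n%s\n' "$before" "$after" >&2
        return 1
    fi
    echo "PASS: state unchanged during: $*"
}

gate clean_dry_run
gate leaky_dry_run || echo "gate caught the leaked mutation"
```

The string comparison is what fails the gate; anything printed afterwards is there only to help whoever reads the CI log.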
1 parent 49d98fc commit 6b2d859

5 files changed

Lines changed: 128 additions & 3 deletions

.github/workflows/ci-install-canonization.yml

Lines changed: 16 additions & 1 deletion
@@ -106,6 +106,11 @@ jobs:
 echo '{"schema_version":"1.0","entries":[]}' | sudo tee /var/lib/nftban/update-history.json >/dev/null
 before_hist=$(sudo sha256sum /var/lib/nftban/update-history.json | awk '{print $1}')

+# PR-P2-3 (G3-KS-SNAPSHOT): capture kernel nft tables + firewall-
+# adjacent service states BEFORE the refuse-dry-run invocation
+# so we can hard-assert they are byte-identical afterward.
+before_ks=$(bash scripts/ci-snapshot-kernel-service.sh)
+
 set +e
 out=$(sudo ./bin/nftban-installer --mode=install --dry-run \
   --state-dir=/var/lib/nftban/state 2>&1)
@@ -129,7 +134,17 @@ jobs:
   echo "::error::G3-IN-REFUSE-DRY-RUN FAIL: history file changed on a refused run"
   exit 1
 fi
-echo "G3-IN-REFUSE-DRY-RUN PASS — install dry-run refused cleanly, no pollution"
+
+# PR-P2-3 (G3-KS-SNAPSHOT): kernel + service state must be
+# byte-identical after a refused run. Hard assertion — no
+# `|| true` on the diff check.
+after_ks=$(bash scripts/ci-snapshot-kernel-service.sh)
+if [[ "$before_ks" != "$after_ks" ]]; then
+  echo "::error::G3-KS-SNAPSHOT FAIL: kernel or service state changed during install --dry-run refusal"
+  diff <(echo "$before_ks") <(echo "$after_ks") || true
+  exit 1
+fi
+echo "G3-IN-REFUSE-DRY-RUN PASS — install dry-run refused cleanly, no pollution, kernel+service unchanged"

 # ------------------------------------------------------------------
 # G3-IN-FLAG-COMBOS — reject operator-error flag combinations.

.github/workflows/ci-uninstall-canonization.yml

Lines changed: 15 additions & 1 deletion
@@ -207,6 +207,13 @@ jobs:
 # Stop Condition violation.
 before_varlib=$(find /var/lib/nftban -type f 2>/dev/null | sort | xargs -r sha256sum 2>/dev/null | sort || true)
 before_etcnftban=$(find /etc/nftban -type f 2>/dev/null | sort | xargs -r sha256sum 2>/dev/null | sort || true)
+# PR-P2-3 (G3-KS-SNAPSHOT): capture kernel nft tables +
+# firewall-adjacent service states BEFORE the dry-run so we
+# can hard-assert they remain byte-identical afterward.
+# Filesystem snapshot alone (above) misses a regression that
+# mutates kernel tables or service states without touching
+# tracked files — exactly the class of drift this gate closes.
+before_ks=$(bash scripts/ci-snapshot-kernel-service.sh)
 set +e
 sudo ./bin/nftban-installer --mode=uninstall --dry-run \
   --state-dir=/var/lib/nftban/state 2>&1 | tee /tmp/uninstall-dryrun.out
@@ -219,6 +226,7 @@ jobs:
 fi
 after_varlib=$(find /var/lib/nftban -type f 2>/dev/null | sort | xargs -r sha256sum 2>/dev/null | sort || true)
 after_etcnftban=$(find /etc/nftban -type f 2>/dev/null | sort | xargs -r sha256sum 2>/dev/null | sort || true)
+after_ks=$(bash scripts/ci-snapshot-kernel-service.sh)
 if [[ "$before_varlib" != "$after_varlib" ]]; then
   echo "::error::PR-22 Stop Condition violated — uninstall dry-run modified /var/lib/nftban/"
   diff <(echo "$before_varlib") <(echo "$after_varlib") || true
@@ -229,7 +237,13 @@ jobs:
   diff <(echo "$before_etcnftban") <(echo "$after_etcnftban") || true
   exit 1
 fi
-echo "G3-UN-NO-WRITES PASS — /var/lib/nftban and /etc/nftban unchanged after dry-run"
+# PR-P2-3 hard assertion — no `|| true` on the diff check.
+if [[ "$before_ks" != "$after_ks" ]]; then
+  echo "::error::G3-KS-SNAPSHOT FAIL: kernel or service state changed during uninstall --dry-run"
+  diff <(echo "$before_ks") <(echo "$after_ks") || true
+  exit 1
+fi
+echo "G3-UN-NO-WRITES PASS — /var/lib/nftban, /etc/nftban, kernel, and service state all unchanged after dry-run"
 # Mandatory contract-language elements — each one corresponds
 # to a row PR-22 promised to render.
 for needle in \

.github/workflows/ci-update-canonization.yml

Lines changed: 16 additions & 0 deletions
@@ -166,6 +166,10 @@ jobs:
 before=$(find /var/lib/nftban /etc/nftban -type f -printf '%p %s\n' 2>/dev/null | sort)
 before_hist=$(sudo sha256sum /var/lib/nftban/update-history.json | awk '{print $1}')
 before_state=$(sudo test -f /var/lib/nftban/state/install_state && sudo sha256sum /var/lib/nftban/state/install_state | awk '{print $1}' || echo "missing")
+# PR-P2-3 (G3-KS-SNAPSHOT): capture kernel nft tables + firewall-
+# adjacent service states for a hard before/after assertion
+# around the update dry-run.
+before_ks=$(bash scripts/ci-snapshot-kernel-service.sh)

 set +e
 sudo ./bin/nftban-installer --mode=upgrade --dry-run \
@@ -212,6 +216,18 @@ jobs:
   exit 1
 fi

+# PR-P2-3 (G3-KS-SNAPSHOT): kernel + service state must be
+# byte-identical. The pre-PR-P2-3 gate snapshot covered only
+# filesystem truth; a future regression that mutates nftables
+# tables or service states without touching tracked files
+# would have slipped past.
+after_ks=$(bash scripts/ci-snapshot-kernel-service.sh)
+if [[ "$before_ks" != "$after_ks" ]]; then
+  echo "::error::G3-KS-SNAPSHOT FAIL: kernel or service state changed during update --dry-run"
+  diff <(echo "$before_ks") <(echo "$after_ks") || true
+  exit 1
+fi
+
 # Exit code sanity: 0 (committed) if preflight passes, 1 (degraded)
 # if preflight fails. Never 2/3/4 for a well-formed run on a
 # host that doesn't have a real daemon.

internal/installer/uninstall/contract.md

Lines changed: 1 addition & 1 deletion
@@ -281,12 +281,12 @@ discipline.
 | # | PR | Merge commit | Purpose |
 |---|---|---|---|
 | 1 | Prior-authority record hardening | PR #484 / `3b834033` | Added `recorded_at`, `installer_version`, explicit `active_at_install=false` handling to `prior.go`; 5-state classification |
+| 2 | External-firewall detection unification | PR #486 / `49d98fc1` | `internal/installer/extfw` canonical detector; Option A CSF config-file signal shared across install/update/uninstall; multi-active → `Ambiguous` (no silent collapse); cross-caller consistency test locked as regression guard |

 ### Behavioral / semantic blockers (code contract changes)

 | # | PR | Scope | Blocking because |
 |---|---|---|---|
-| 2 | External-firewall detection unification | One shared `DetectExternalAuthority` function + one precedence order (ufw → firewalld → iptables → csf) used by install-side `authority/classify.go`, uninstall-side `uninstall/authority.go`, and any future consumer | Detection drift between modules will cause install/uninstall/restore to disagree about what external authority exists |
 | 6 | Payload integrity minimum checks | Minimum-size + required-header/token check for `/etc/nftban/nftban.conf` and `/etc/nftban/nftables.conf`; wire into existing `payload.VerifyInventory` | Presence-only validation lets a truncated-or-empty critical config pass |

 ### Assurance / gate blockers (CI and scope-lock enforcement)

scripts/ci-snapshot-kernel-service.sh

Lines changed: 80 additions & 0 deletions
@@ -0,0 +1,80 @@
+#!/usr/bin/env bash
+# =============================================================================
+# NFTBan v1.100 PR-P2-3 — CI kernel/service snapshot helper
+# =============================================================================
+# SPDX-License-Identifier: MPL-2.0
+# meta:name="ci-snapshot-kernel-service"
+# meta:type="script"
+# meta:version="1.100.0"
+# meta:owner="Antonios Voulvoulis <contact@nftban.com>"
+# meta:created_date="2026-04-20"
+# meta:description="Emit stable before/after snapshot of nft tables + firewall-adjacent service states"
+# meta:inventory.files="scripts/ci-snapshot-kernel-service.sh"
+# meta:inventory.binaries=""
+# meta:inventory.env_vars=""
+# meta:inventory.config_files=""
+# meta:inventory.systemd_units="nftband.service, ufw.service, firewalld.service, csf.service, lfd.service, iptables.service"
+# meta:inventory.network=""
+# meta:inventory.privileges="root"
+# =============================================================================
+#
+# Prints a deterministic, line-oriented snapshot of:
+#
+# 1. Kernel nftables tables (`nft list tables`, sorted)
+# 2. Firewall-adjacent systemd unit states (nftband + every external
+#    firewall unit the lifecycle may interact with)
+#
+# Used by CI gates to assert that dry-run paths leave kernel and
+# service state unchanged. The caller captures the output twice
+# (before + after the dry-run) and fails CI if the two snapshots
+# differ.
+#
+# Degrades gracefully on container environments that lack nft or
+# systemctl — both sides of the comparison emit the same placeholder,
+# so diff remains empty for environments that cannot probe.
+#
+# Contract (PR-P2-3, frozen 2026-04-20):
+# - Output is stable (sorted) and purely from read-only probes.
+# - Never invokes nft / systemctl with mutation verbs.
+# - Never writes to the filesystem.
+# - Exit code 0 always; the CALLER decides whether differences fail.
+#
+# =============================================================================
+set -Eeuo pipefail
+
+# PR-P2-3 monitored-units: every unit that is either owned by nftban or
+# represents an external firewall the lifecycle code touches. Kept in
+# lockstep with internal/installer/extfw/detect.go so the CI gate and
+# the production detector agree on "what counts as a firewall service."
+UNITS=(
+    nftband.service
+    ufw.service
+    firewalld.service
+    csf.service
+    lfd.service
+    iptables.service
+)
+
+echo "## kernel-nft-tables"
+if command -v nft >/dev/null 2>&1; then
+    # Redirect stderr so a missing kernel module doesn't pollute the
+    # snapshot with different messages across before/after invocations.
+    sudo nft list tables 2>/dev/null | sort || echo "nft:exec_failed"
+else
+    echo "nft:not_installed"
+fi
+
+echo "## service-states"
+if command -v systemctl >/dev/null 2>&1 && systemctl --version >/dev/null 2>&1; then
+    for u in "${UNITS[@]}"; do
+        # Always emit "unit=state" for every monitored unit — even
+        # inactive/missing — so both sides of the before/after diff
+        # produce the same lines unless state actually changes.
+        # `is-active` exits non-zero for inactive; we capture the
+        # string and swallow the exit code intentionally.
+        state=$(systemctl is-active "$u" 2>&1 || true)
+        echo "$u=$state"
+    done
+else
+    echo "systemctl:not_available"
+fi
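
The helper's comment about swallowing the `is-active` exit code is easier to see with the probe replaced by something that runs without systemd. The sketch below is an illustration only: `fake_is_active` and `snapshot_units` are hypothetical names, and the fake probe merely imitates the documented behaviour of printing a state string and exiting non-zero for anything that is not active.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical stand-in for `systemctl is-active UNIT`: prints the state
# and returns non-zero unless the unit is active.
fake_is_active() {
    case "$1" in
        nftband.service) echo active; return 0 ;;
        *) echo inactive; return 3 ;;
    esac
}

# Emit one "unit=state" line per unit, whatever the probe's exit code.
snapshot_units() {
    for u in "$@"; do
        # Without `|| true`, the non-zero exit inside the command
        # substitution would abort the whole script under `set -e`.
        state=$(fake_is_active "$u" 2>&1 || true)
        echo "$u=$state"
    done
}

snapshot_units nftband.service ufw.service
# prints:
#   nftband.service=active
#   ufw.service=inactive
```

Because every monitored unit always produces a line, the before and after snapshots have the same shape and differ only when a state string actually changes.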
