|
| 1 | +# authgate-kernel — Implementation TODO |
| 2 | + |
| 3 | +> Identity: "A formally constrained capability-security kernel for agent tool execution." |
| 4 | +> Analogy: seccomp/SELinux for agentic AI. Narrow. Deployable. Operationally real. |
| 5 | +> |
| 6 | +> MASTER_PLAN.md success criteria (required to call this "foundational infrastructure"): |
| 7 | +> [ ] 1. TLC has run — at least one finite instance model-checked to exhaustion |
| 8 | +> [x] 2. engine.rs has one mechanically verified property — Kani, 19 harnesses |
| 9 | +> [ ] 3. SEMANTICS.md has no known gaps |
| 10 | +> [ ] 4. Sub-microsecond Rust benchmark + Python latency baseline documented |
| 11 | +> [ ] 5. One real integration — non-trivial AI system, documented |
| 12 | +> |
| 13 | +> Current honest label: "well-architected research prototype with production-grade aspirations." |
| 14 | +> Target label: "foundational infrastructure for capability-security in agentic AI." |
| 15 | +
|
| 16 | +--- |
| 17 | + |
| 18 | +## Status key |
| 19 | +- [x] done and committed |
| 20 | +- [~] partially done / exists but incomplete |
| 21 | +- [ ] not started |
| 22 | + |
| 23 | +--- |
| 24 | + |
| 25 | +## Phase 0 — Architecture (COMPLETE) |
| 26 | + |
| 27 | +- [x] engine.rs — pure Rust verification core, zero I/O |
| 28 | +- [x] JSON wire format + C ABI (authgate_kernel_verify, authgate_kernel_pubkey) |
| 29 | +- [x] ed25519 attestation — signed verification results |
| 30 | +- [x] Python fallback — identical API, zero config |
| 31 | +- [x] Attenuation enforced in Rust dag.rs and Python layer |
| 32 | +- [x] Framework adapters — OpenAI, Anthropic, LangChain (src/authgate/adapters/) |
| 33 | +- [x] TLA+ spec — invariants stated (formal/authgate_v3.tla) |
| 34 | +- [x] SEMANTICS.md — honest scope documentation |
| 35 | + |
| 36 | +--- |
| 37 | + |
| 38 | +## Phase 1 — Mechanical Verification |
| 39 | + |
| 40 | +### 1.1 TLC model check ← BLOCKED (needs Java + tla2tools.jar) |
| 41 | +- [ ] Install Java and download tla2tools.jar |
| 42 | +- [ ] Run: `java -jar tla2tools.jar -tool MC_AuthGateV3` |
| 43 | +- [ ] Verify all 9 invariants + PermitSoundness (or document counterexamples) |
| 44 | +- [ ] Add CI job on spec-core branch: GitHub Actions with Java runner |
| 45 | +- Branch: spec-core |
| 46 | + |
| 47 | +### 1.2 Delegation lattice ✅ DONE |
| 48 | +- [x] Transitivity: attenuation enforced in dag.rs (rights ⊆ parent.rights) |
| 49 | +- [x] Resource propagation: child cap cannot redirect root resource (dag.rs) |
| 50 | +- [x] Chain depth bound: MAX_CHAIN_DEPTH = 16 |
| 51 | +- [x] AT-5.1: SHA-256(pubkey) = subject_id binding at every delegation node |
| 52 | +- [x] Tests: hardening_tests.rs (31 tests including 6 proptest properties) |
| 53 | + |
| 54 | +### 1.3 Resource scope formal rule + tests |
| 55 | +- [ ] Write formal prefix-matching rule in SEMANTICS.md: |
| 56 | + "resource R1 contains R2 iff R2.scope starts with R1.scope" |
| 57 | +- [ ] Add scope containment tests in tests/test_scope.py (20+ cases) |
| 58 | + - "/data/alice/" contains "/data/alice/file.csv" → True |
| 59 | + - "/data/" does NOT contain "/etc/passwd" → False |
| 60 | + - Exact match: "/data/alice/" contains "/data/alice/" → True |
| 61 | + - Trailing slash normalization |
| 62 | + - Path traversal rejection: "/data/../etc" → False |
| 63 | +- Branch: spec-core (spec) + adversarial-lab (tests) |
| 64 | + |
| 65 | +--- |
| 66 | + |
| 67 | +## Phase 2 — Information Flow |
| 68 | + |
| 69 | +### 2.1 IFC labels [~] exists but incomplete |
| 70 | +- [~] src/authgate/extensions/ifc.py exists |
| 71 | +- [ ] Read current IFC state and document gaps |
| 72 | +- [ ] Wire IFC labels into Resource (optional field, backward-compatible) |
| 73 | +- [ ] NonInterferenceChecker.check_plan() — full Bell-LaPadula enforcement |
| 74 | +- [ ] 20+ tests covering flow violations (SECRET→PUBLIC denied) |
| 75 | +- Branch: integration |
| 76 | + |
| 77 | +--- |
| 78 | + |
| 79 | +## Phase 3 — Rust Formal Verification ✅ DONE |
| 80 | + |
| 81 | +- [x] Kani: 19 harnesses, all proved |
| 82 | + - All 10 sovereignty flags: always blocked |
| 83 | + - Ownerless machine: always blocked |
| 84 | + - Machine governs human: always blocked |
| 85 | + - Public resource read: always permitted |
| 86 | + - Write without claim: always denied |
| 87 | + - Attenuation two-node: all bitmask combinations |
| 88 | + - Epoch gate: total relation proved |
| 89 | + - Forged revocation: never flips Permit→Deny |
| 90 | +- [x] Lean 4: 7 theorems (forbidden flags, attenuation, epoch gate, |
| 91 | + subject mismatch, determinism, stale epoch, rights sufficiency) |
| 92 | + |
| 93 | +--- |
| 94 | + |
| 95 | +## Phase 4 — Plan Semantics ✅ DONE |
| 96 | + |
| 97 | +- [x] formal/plan_semantics.md — tractable vs intractable boundary documented |
| 98 | +- [x] verify_plan() in verifier.py — structural authority check per action |
| 99 | +- [x] Sovereignty propagation: flag in action[i] → cancel actions[i+1:] |
| 100 | + |
| 101 | +--- |
| 102 | + |
| 103 | +## Phase 5 — Production Hardening ← CURRENT FOCUS |
| 104 | + |
| 105 | +### 5.1 Thread safety audit + tests |
| 106 | +- [~] OwnershipRegistry uses threading.RLock — exists but untested concurrently |
| 107 | +- [ ] tests/test_thread_safety.py — concurrent reader/writer tests: |
| 108 | + - 50 readers, 5 writers simultaneously — no deadlock, no data race |
| 109 | + - freeze() + read from snapshot while original mutates — no corruption |
| 110 | + - verify() concurrent calls return consistent results |
| 111 | + - AuditLog concurrent append — chain integrity preserved under concurrent writes |
| 112 | +- Branch: integration |
| 113 | + |
| 114 | +### 5.2 Frozen registry ✅ DONE |
| 115 | +- [x] OwnershipRegistry.freeze() → immutable snapshot (registry.py) |
| 116 | +- [x] Frozen registry raises RuntimeError on any mutation attempt |
| 117 | +- [x] Snapshot is independent copy — original mutations don't affect snapshot |
| 118 | +- [ ] Wire freeze() into FreedomVerifier: verify() uses frozen snapshot by default |
| 119 | + Currently: verifier holds registry directly (live, can mutate under verify) |
| 120 | + Fix: in FreedomVerifier.__init__, call registry.freeze() and store snapshot |
| 121 | + |
| 122 | +### 5.3 Audit log [~] exists but needs hardening |
| 123 | +- [x] AuditLog with hash-chain integrity (audit.py) |
| 124 | +- [x] Wired into FreedomVerifier.verify() |
| 125 | +- [ ] AuditLog.load_from_file(path) — reconstruct from persisted .jsonl |
| 126 | +- [ ] AuditLog.verify_chain() test under concurrent writes (50 concurrent verifies) |
| 127 | +- [ ] AuditLog.replay(entry_idx) — reconstruct decision context for forensics |
| 128 | +- [ ] Test: tamper one entry → verify_chain() returns False |
| 129 | +- [ ] Test: delete one entry → verify_chain() returns False |
| 130 | +- Branch: integration |
| 131 | + |
| 132 | +### 5.4 Benchmarks ← MISSING — blocks "foundational infrastructure" label |
| 133 | +- [ ] benchmarks/python_verify_bench.py — FreedomVerifier.verify() microbenchmark |
| 134 | + - Single verify() call: target <100µs (Python layer overhead is acceptable) |
| 135 | + - 1000-claim registry lookup: document actual p50/p95 |
| 136 | + - verify_plan() on 10-action plan: document throughput |
| 137 | +- [ ] Document Rust benchmark results (cargo bench --bench verify_bench) |
| 138 | + - Target <1µs per verify() call in Rust |
| 139 | +- [ ] Add benchmark numbers to README in Benchmarks section |
| 140 | +- Branch: integration |
| 141 | + |
| 142 | +--- |
| 143 | + |
| 144 | +## Phase B — Wire Boundary Hardening (adversarial-lab work) |
| 145 | + |
| 146 | +- [x] wire_attacks.py (27 tests) — JSON deserialization boundary |
| 147 | +- [x] differential_tests.py (20 tests) — Python model semantics |
| 148 | +- [ ] B3: Rust strict wire parser — mirror wire_attacks.py rejection classes in serde |
| 149 | + File: freedom-kernel/src/wire.rs |
| 150 | + Add: strict field validation, reject unknown fields, reject float-in-u64 |
| 151 | +- [ ] B4: --rust flag for differential_tests.py — wire format alignment needed first |
| 152 | +- [ ] B5: WA-1 duplicate key — test Rust serde_json behavior explicitly, |
| 153 | + document last-wins or add explicit rejection |
| 154 | +- Branch: adversarial-lab (tests) + tcb-core (Rust impl) |
| 155 | + |
| 156 | +--- |
| 157 | + |
| 158 | +## Phase C — Production Reality |
| 159 | + |
| 160 | +### C1: Wire audit log into FreedomVerifier default (see 5.3 above) |
| 161 | +### C2: Typed error hierarchy |
| 162 | +- [ ] src/authgate/errors.py: |
| 163 | + AuthgateError → CapabilityError, RightsError, IntegrityError, WireError |
| 164 | + Each with structured fields (actor_id, resource_hash, failed_check, value) |
| 165 | +- [ ] Replace bare string violations in verifier.py with typed exceptions |
| 166 | + (keep backwards-compat: VerificationResult.violations still contains strings) |
| 167 | +- Branch: integration |
| 168 | + |
| 169 | +### C3: Key rotation protocol |
| 170 | +- [ ] docs/KEY_ROTATION.md — operational procedure: |
| 171 | + root key rotation via epoch advancement, grace period, emergency rotation |
| 172 | +- [ ] src/authgate/key_rotation.py — KeyRotationManager |
| 173 | + rotate(old_sk, new_sk, new_epoch) → signed rotation certificate |
| 174 | + verify_rotation(cert, old_vk) → bool |
| 175 | +- Branch: integration |
| 176 | + |
| 177 | +### C4: CLI tool |
| 178 | +- [ ] authgate-cli verify --registry reg.json --action action.json |
| 179 | +- [ ] authgate-cli audit verify <log.jsonl> |
| 180 | +- [ ] authgate-cli key rotate --old-key old.pem --new-key new.pem --epoch N |
| 181 | +- Branch: integration |
| 182 | + |
| 183 | +--- |
| 184 | + |
| 185 | +## Phase D — Success Criteria Completion |
| 186 | + |
| 187 | +### D1: SEMANTICS.md gap closure (criteria #3) |
| 188 | +- [ ] Read SEMANTICS.md, list all informal claims |
| 189 | +- [ ] For each claim: add formal statement, or label explicitly as "not yet verified" |
| 190 | +- [ ] No claim left ambiguous |
| 191 | +- Branch: spec-core |
| 192 | + |
| 193 | +### D2: One real integration (criteria #5) |
| 194 | +- [ ] End-to-end test: LangChain tool → FreedomVerifier.verify() → SandboxedExecutor |
| 195 | +- [ ] Audit log entry per action |
| 196 | +- [ ] Document in examples/langchain_integration/ |
| 197 | +- Branch: integration |
| 198 | + |
| 199 | +--- |
| 200 | + |
| 201 | +## Build order (this session) |
| 202 | + |
| 203 | +1. [x] Phase A sandbox — DONE |
| 204 | +2. [x] Phase B wire_attacks.py, differential_tests.py — DONE |
| 205 | +3. [ ] **5.4 Benchmarks** — implement now (benchmarks/python_verify_bench.py) |
| 206 | +4. [ ] **5.1 Thread safety tests** — implement now (tests/test_thread_safety.py) |
| 207 | +5. [ ] **5.3 Audit log: load_from_file + tamper tests** |
| 208 | +6. [ ] **5.2 Wire freeze() into verifier** |
| 209 | +7. [ ] **1.3 Resource scope tests** |
| 210 | +8. [ ] **C3 Key rotation** |
| 211 | +9. [ ] **D2 Real integration test** |
| 212 | + |
| 213 | +--- |
| 214 | + |
| 215 | +## Branch assignment |
| 216 | + |
| 217 | +| Work | Branch | |
| 218 | +|------|---------| |
| 219 | +| 5.1, 5.2, 5.3, 5.4, C2, C3, D2 | `integration` | |
| 220 | +| B3, B4, B5, 1.3 tests | `adversarial-lab` | |
| 221 | +| 1.1 TLC, 1.3 spec, D1 | `spec-core` | |
| 222 | +| B3 Rust | `tcb-core` | |
| 223 | +| README, TODO | `adversarial-lab` then propagate | |
0 commit comments