bglusman · May 12, 2026 · May 11, 2026
diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md
@@ -10,14 +10,16 @@ Self-hosted security gateway between AI agents and the rest of the world. Multi-
 
 1. **Correctness** — does the code do what its name/comments/tests claim?
 2. **Security** — secret leakage in logs, missing auth/identity check, substitution-boundary regression, exfil paths
-3. **Resource & error handling** — see Rust path-scoped instructions
-4. **Performance** — only when measurable in a hot path; otherwise skip
-5. **Style** — skip entirely (pre-commit handles it)
+3. **Architecture drift** — new source of truth, boundary bypass, god-object growth, stringly core logic, background state without ownership
+4. **Resource & error handling** — see Rust path-scoped instructions
+5. **Performance** — only when measurable in a hot path; otherwise skip
+6. **Style** — skip entirely (pre-commit handles it)
 
 ## Tools that already gate — don't re-flag what they catch
 
 - `cargo fmt --all -- --check` — formatting (pre-commit blocks commit)
 - `cargo clippy --all-targets -- -D warnings` — lints (pre-commit + CI)
+- `ruby scripts/check-architecture-ratchets.rb` — large-module line budgets (CI fails when a pinned file grows) and watched pattern counts (printed for visibility, not enforced). Do not repeat the raw line-budget failure as a review comment unless the diff shows the architectural reason it grew or an obvious split point.
 - `gitleaks protect --staged` — secret scanning (pre-commit) + the same scan in CI. `.gitleaks.toml` allowlists by **path** (`tests/**/fixtures/`, `docs/rfcs/*.md`, lockfiles, `crates/paste-server/src/lib.rs`, `*.example.*`) and by **regex** (loopback IPs, RFC 5737 doc-ranges, a few specific inherited-from-main values). If a finding is inside an allowlisted path or matches an allowlisted regex, do NOT re-flag it — it's intentional. Findings outside the allowlist are real.
 
 ## Project context — NOT bugs despite looking like them
@@ -28,6 +30,19 @@ Self-hosted security gateway between AI agents and the rest of the world. Multi-
 - References to `zeroclaw_*` (no trailing `ed`) are the upstream third-party tool we wrap, NOT pre-rename leftover of this project.
 - Mixed Rust edition (2021 + 2024) is known and tracked. Do NOT suggest the bump unless the PR is explicitly about edition migration.
 
+## Architecture drift worth flagging
+
+Only flag these when the diff gives concrete evidence and a local fix or narrower design question is available:
+
+1. **Growing an existing oversized module** — especially `commands.rs`, channel modules, installer executor, proxy handlers, config, or doctor — when the added behavior could live behind a typed helper/module boundary.
+2. **New duplicate source of truth** — adapter kinds, model identifiers, channel capabilities, secret policy, gateway routing, install paths, or lifecycle state copied into a second registry/table without a synchronization plan.
+3. **Stringly core decisions** — security, routing, model selection, adapter lifecycle, approval, or persistence logic operating directly on raw `String`, `Vec<String>`, positional args, or `HashMap<String, String>` after the external boundary has already been crossed.
+4. **Background work without ownership** — spawned tasks that mutate shared state, swallow errors, or outlive the request/channel/session without a cancellation and reporting path.
+5. **Gateway/proxy bypasses** — new provider, fetch, exec, browser, or agent network paths that avoid Calciforge's configured model gateway/security proxy without being explicitly documented as opt-out or unenforceable.
+6. **Config/docs/test drift** — new channel, adapter, model gateway, or security config fields without matching docs and compile/smoke coverage.
+
+Do not ask for a broad rewrite. Prefer comments like: "This adds another command sub-flow to `commands.rs`; can this live in `commands/<domain>.rs` with a typed request enum so the ratchet budget does not keep rising?"
+
 ## Self-discipline
 
 - Do NOT repeat a comment already made on a parent or sibling PR in the same stack. If the same observation was raised on PR #N and merged/addressed, do not re-raise on PR #N+1. Past noisy patterns: the dead-doc-reference comment was posted four times across PRs #20/#23/#25; the env-mutex/`serial_test` comment was posted eight+ times across #19/#22/#23.
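
The "Stringly core decisions" item and the suggested `commands/<domain>.rs` review comment can be made concrete with a small sketch. The names `AgentRequest`, `ParseError`, `parse_agent_request`, and `handle` are hypothetical and do not come from the Calciforge codebase. Treating a bare `!agent` as a list request is also an assumption made for illustration. The point is only that raw text is parsed once at the boundary, and everything after that matches on a typed value.

```rust
// Hypothetical sketch: `AgentRequest`, `ParseError`, `parse_agent_request`, and
// `handle` are illustrative names, not items from the Calciforge codebase.

/// Typed view of an `!agent ...` command after it has crossed the boundary.
#[derive(Debug, PartialEq, Eq)]
enum AgentRequest {
    List,
    Details { name: String },
}

/// Reasons the raw text could not be turned into an `AgentRequest`.
#[derive(Debug, PartialEq, Eq)]
enum ParseError {
    NotAgentCommand,
    MissingArgument(&'static str),
    UnknownSubcommand(String),
}

/// The only place that looks at raw whitespace-separated tokens.
fn parse_agent_request(text: &str) -> Result<AgentRequest, ParseError> {
    let mut parts = text.split_whitespace();
    if parts.next() != Some("!agent") {
        return Err(ParseError::NotAgentCommand);
    }
    match parts.next() {
        // Assumption: a bare `!agent` behaves like `!agent list`.
        None | Some("list") => Ok(AgentRequest::List),
        Some("details") => match parts.next() {
            Some(name) => Ok(AgentRequest::Details { name: name.to_string() }),
            None => Err(ParseError::MissingArgument("agent name")),
        },
        Some(other) => Err(ParseError::UnknownSubcommand(other.to_string())),
    }
}

/// Core logic matches on the enum, so a new variant forces a compile error
/// here instead of silently falling through a string comparison.
fn handle(request: &AgentRequest) -> String {
    match request {
        AgentRequest::List => "listing agents".to_string(),
        AgentRequest::Details { name } => format!("showing details for {name}"),
    }
}

fn main() {
    for input in ["!agent", "!agent details librarian", "!agent details", "!agent frobnicate"] {
        match parse_agent_request(input) {
            Ok(request) => println!("{input:?} -> {}", handle(&request)),
            Err(err) => println!("{input:?} -> rejected: {err:?}"),
        }
    }
}
```

With this shape, positional lookups such as `nth(1)` and `nth(2)` stay inside the parser, and a missing argument becomes a named error rather than an `Option` that each caller has to interpret.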

diff --git a/.github/instructions/rust.instructions.md b/.github/instructions/rust.instructions.md
@@ -25,6 +25,12 @@ These extend `.github/copilot-instructions.md`. Same review philosophy: **if unc
 - `tokio::select!` branches must be cancellation-safe. If a branch holds partial state across `.await` (e.g., a half-written buffer), losing the race silently corrupts state. Worth flagging if non-obvious.
 - `tokio::process::Command` without `.kill_on_drop(true)` leaks the child if the parent task is dropped mid-await. Flag for long-running children; skip for one-shot commands that are awaited to completion.
 - Spawned `JoinHandle`s that are dropped silently swallow panics. Worth flagging for long-running tasks; not for fire-and-forget helpers.
+- New `tokio::spawn` or thread-spawned work that mutates shared state should have an obvious owner, cancellation/error reporting path, and ordering story. Flag detached background state changes that make request/session/channel lifecycle implicit.
+
+## Boundary hygiene
+
+- Raw config/protocol/CLI values should be converted into typed structs or enums before security, routing, lifecycle, persistence, or model-selection decisions. Flag new core logic that keeps branching on arbitrary strings or positional `Vec<String>` indexes when a local typed request/decision type would make invalid states unrepresentable.
+- New registries for adapter kinds, channel capabilities, model identifiers, or secret policy should reuse the existing source of truth. Flag duplicated tables unless the diff includes a synchronization comment/test.
 
 ## Lints / attributes
 

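The "reuse the existing source of truth" rule under Boundary hygiene might look like the following in practice. This is a sketch under assumptions: `AdapterKind`, its two variants, and the wire names `codex-cli` and `openclaw-channel` are hypothetical, loosely modelled on file names in the diff rather than on real Calciforge types. Parsing and help text are both derived from one table, so there is no second registry to keep in sync.

```rust
// Hypothetical sketch: `AdapterKind`, its variants, and the wire names below
// are illustrative assumptions, not the real Calciforge adapter registry.

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AdapterKind {
    CodexCli,
    OpenclawChannel,
}

impl AdapterKind {
    /// The single table every other lookup is derived from.
    const ALL: &'static [AdapterKind] = &[AdapterKind::CodexCli, AdapterKind::OpenclawChannel];

    /// Exhaustive match: adding a variant without a name fails to compile.
    fn as_str(self) -> &'static str {
        match self {
            AdapterKind::CodexCli => "codex-cli",
            AdapterKind::OpenclawChannel => "openclaw-channel",
        }
    }

    /// Boundary conversion from raw config text, derived from `ALL`.
    fn parse(raw: &str) -> Option<AdapterKind> {
        Self::ALL.iter().copied().find(|kind| kind.as_str() == raw)
    }
}

/// Help text is generated from the same table instead of a copied list.
fn supported_kinds_help() -> String {
    AdapterKind::ALL
        .iter()
        .map(|kind| kind.as_str())
        .collect::<Vec<_>>()
        .join(", ")
}

fn main() {
    println!("supported adapter kinds: {}", supported_kinds_help());
    println!("parse(\"codex-cli\") = {:?}", AdapterKind::parse("codex-cli"));
    println!("parse(\"unknown\") = {:?}", AdapterKind::parse("unknown"));
}
```

A round-trip check over `AdapterKind::ALL` (each entry parses back to itself) is one cheap form of the "synchronization test" the instruction asks for when a duplicate table cannot be avoided.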
diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
@@ -33,6 +33,9 @@ jobs:
       - name: Check installer config helpers
         run: python3 scripts/test-upsert-calciforge-agent.py
 
+      - name: Check architecture ratchets
+        run: ruby scripts/check-architecture-ratchets.rb
+
   # ─────────────────────────────────────────────────────────────────────────────
   # Check formatting and linting
   # ─────────────────────────────────────────────────────────────────────────────

diff --git a/AGENTS.md b/AGENTS.md
@@ -50,6 +50,9 @@ User-facing tour: `README.md` → [calciforge.org](https://calciforge.org/).
 10. **Product-contract check before design changes.** Before changing installer behavior, gateway routing, secret handling, agent adapters, channel UX, or model selection, stop and ask: does this API or behavior actually fulfill Calciforge's design intent as a self-hosted security gateway, or merely make the local code pass? Preserve central promises such as one operator-owned secret store, agents never receiving plaintext secrets by default, model traffic flowing through the configured gateway unless explicitly opted out, and channel commands behaving consistently across supported transports.
 11. **Cross-node assumptions must be explicit.** Do not assume a helper binary, config file, fnox vault, MCP server, or environment variable exists on an agent host just because it exists on the Calciforge host. Multi-node features need an explicit propagation model, a runtime smoke test from the agent host, and docs that name whether state is central or local.
 12. **Avoid accidental architecture drift.** If a quick fix creates a second source of truth, bypasses the gateway/proxy, weakens a security boundary, or contradicts a documented roadmap/ADR, treat that as a design bug. Either implement the coherent version or leave a clearly documented follow-up with the user-visible limitation.
+13. **Large files are debt with budgets, not precedent.** `scripts/check-architecture-ratchets.rb` pins current oversized Rust modules to explicit line budgets and fails CI if they grow. New Rust modules should stay under the default budget unless the PR explains the boundary being created and adds a budget consciously.
+14. **Stringly data stays at the boundary.** It is acceptable for config, JSON, CLI args, and protocol payloads to enter as `String`, `Vec<String>`, or `HashMap<String, String>`, but core logic should convert them into typed structs/enums before making security, routing, lifecycle, or persistence decisions.
+15. **Detached work needs an owner.** New `tokio::spawn` or thread-spawned work must have an explicit lifecycle owner, cancellation/error path, and state handoff. Do not update shared mutable state from background tasks unless the owning module documents the ordering and failure behavior.
 
 ## Build / test
 

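Rule 15 ("Detached work needs an owner") can be illustrated with a minimal sketch. It uses `std::thread` and `std::sync::mpsc` rather than `tokio::spawn` so that it compiles without dependencies. The same shape (an owner holding the handle, a cancellation path, and an error that reaches the caller) should carry over to async tasks, though that is an assumption and not something the diff shows. `OwnedWorker` and its methods are hypothetical names.

```rust
use std::sync::mpsc::{self, Sender};
use std::thread::{self, JoinHandle};

/// Hypothetical owner for one background job. It holds both the sending side
/// of the work queue and the join handle, so the job cannot outlive it unseen.
struct OwnedWorker {
    jobs: Sender<u64>,
    handle: JoinHandle<Result<u64, String>>,
}

impl OwnedWorker {
    fn start() -> Self {
        let (jobs, inbox) = mpsc::channel::<u64>();
        let handle = thread::spawn(move || -> Result<u64, String> {
            let mut total = 0u64;
            // The loop ends once every Sender is dropped: that is the
            // cancellation path, and it is driven by the owner.
            for value in inbox {
                total = total
                    .checked_add(value)
                    .ok_or_else(|| "sum overflowed".to_string())?;
            }
            Ok(total)
        });
        Self { jobs, handle }
    }

    /// State handoff is explicit: values go through the channel in order.
    fn submit(&self, value: u64) -> Result<(), String> {
        self.jobs
            .send(value)
            .map_err(|_| "worker already stopped".to_string())
    }

    /// Cancels the worker and surfaces its result, error, or panic.
    fn shutdown(self) -> Result<u64, String> {
        let OwnedWorker { jobs, handle } = self;
        drop(jobs); // closes the channel so the worker loop finishes
        handle
            .join()
            .map_err(|_| "worker panicked".to_string())?
    }
}

fn main() {
    let worker = OwnedWorker::start();
    for value in [1, 2, 3] {
        worker.submit(value).expect("worker should still be running");
    }
    println!("worker finished with {:?}", worker.shutdown());
}
```

The worker never touches shared mutable state. Its only output is the value returned through `shutdown`, which keeps ordering and failure behavior in one documented place.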
diff --git a/crates/calciforge/src/commands.rs b/crates/calciforge/src/commands.rs
@@ -34,6 +34,10 @@ use crate::messages::{ChoiceControl, ChoiceOption, Match, OutboundMessage};
 use crate::model_names::configured_first_class_model_ids;
 use crate::providers::alloy::AlloyManager;
 
+mod parser;
+
+use parser::{command_suggestion, command_token, first_arg, second_arg};
+
 const PENDING_CHOICE_TTL: Duration = Duration::from_secs(10 * 60);
 
 /// Default state directory: `~/.config/calciforge/state/`.
@@ -94,18 +98,6 @@ fn session_runtime_readiness_error(agent_cfg: &crate::config::AgentConfig) -> Op
     }
 }
 
-fn first_arg(text: &str) -> Option<&str> {
-    text.split_whitespace().nth(1)
-}
-
-fn second_arg(text: &str) -> Option<&str> {
-    text.split_whitespace().nth(2)
-}
-
-fn command_token(text: &str) -> &str {
-    text.split_whitespace().next().unwrap_or("")
-}
-
 fn gateway_model_selector_ids(config: &CalciforgeConfig) -> HashSet<String> {
     configured_first_class_model_ids(config)
         .into_iter()
@@ -160,65 +152,6 @@ impl fmt::Display for AgentChoiceError {
     }
 }
 
-fn command_suggestion(cmd: &str) -> Option<&'static str> {
-    const MAX_FUZZY_COMMAND_CHARS: usize = 64;
-    const COMMANDS: &[&str] = &[
-        "!help",
-        "!status",
-        "!agents",
-        "!agent",
-        "!sessions",
-        "!session",
-        "!new",
-        "!btw",
-        "!gateway",
-        "!metrics",
-        "!ping",
-        "!switch",
-        "!default",
-        "!model",
-        "!secure",
-        "!secret",
-        "!approve",
-        "!deny",
-    ];
-
-    let lower = cmd.to_lowercase();
-    let without_bang = lower.trim_start_matches('!');
-    if without_bang.chars().count() > MAX_FUZZY_COMMAND_CHARS {
-        return None;
-    }
-
-    COMMANDS
-        .iter()
-        .copied()
-        .find(|candidate| candidate.trim_start_matches('!') == without_bang)
-        .or_else(|| {
-            COMMANDS.iter().copied().find(|candidate| {
-                levenshtein_distance(without_bang, candidate.trim_start_matches('!')) <= 2
-            })
-        })
-}
-
-fn levenshtein_distance(a: &str, b: &str) -> usize {
-    let b_len = b.chars().count();
-    let mut costs: Vec<usize> = (0..=b_len).collect();
-
-    for (i, ca) in a.chars().enumerate() {
-        let mut previous = costs[0];
-        costs[0] = i + 1;
-        for (j, cb) in b.chars().enumerate() {
-            let insertion = costs[j + 1] + 1;
-            let deletion = costs[j] + 1;
-            let substitution = previous + usize::from(ca != cb);
-            previous = costs[j + 1];
-            costs[j + 1] = insertion.min(deletion).min(substitution);
-        }
-    }
-
-    costs[b_len]
-}
-
 /// Load persisted active-agent selections from a given state directory.
 /// Returns an empty map if the file doesn't exist or can't be parsed.
 fn load_active_agents_from(state_dir: &Path) -> HashMap<String, String> {

diff --git a/crates/calciforge/src/commands/parser.rs b/crates/calciforge/src/commands/parser.rs
@@ -0,0 +1,105 @@
+const MAX_FUZZY_COMMAND_CHARS: usize = 64;
+
+const COMMANDS: &[&str] = &[
+    "!help",
+    "!status",
+    "!agents",
+    "!agent",
+    "!sessions",
+    "!session",
+    "!new",
+    "!btw",
+    "!gateway",
+    "!metrics",
+    "!ping",
+    "!switch",
+    "!default",
+    "!model",
+    "!secure",
+    "!secret",
+    "!approve",
+    "!deny",
+];
+
+pub(super) fn first_arg(text: &str) -> Option<&str> {
+    text.split_whitespace().nth(1)
+}
+
+pub(super) fn second_arg(text: &str) -> Option<&str> {
+    text.split_whitespace().nth(2)
+}
+
+pub(super) fn command_token(text: &str) -> &str {
+    text.split_whitespace().next().unwrap_or("")
+}
+
+pub(super) fn command_suggestion(cmd: &str) -> Option<&'static str> {
+    let raw_without_bang = cmd.trim_start_matches('!');
+    if raw_without_bang.chars().count() > MAX_FUZZY_COMMAND_CHARS {
+        return None;
+    }
+
+    let lower = raw_without_bang.to_lowercase();
+
+    COMMANDS
+        .iter()
+        .copied()
+        .find(|candidate| candidate.trim_start_matches('!') == lower)
+        .or_else(|| {
+            COMMANDS.iter().copied().find(|candidate| {
+                levenshtein_distance(&lower, candidate.trim_start_matches('!')) <= 2
+            })
+        })
+}
+
+fn levenshtein_distance(a: &str, b: &str) -> usize {
+    let b_len = b.chars().count();
+    let mut costs: Vec<usize> = (0..=b_len).collect();
+
+    for (i, ca) in a.chars().enumerate() {
+        let mut previous = costs[0];
+        costs[0] = i + 1;
+        for (j, cb) in b.chars().enumerate() {
+            let insertion = costs[j + 1] + 1;
+            let deletion = costs[j] + 1;
+            let substitution = previous + usize::from(ca != cb);
+            previous = costs[j + 1];
+            costs[j + 1] = insertion.min(deletion).min(substitution);
+        }
+    }
+
+    costs[b_len]
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+
+    #[test]
+    fn command_token_ignores_surrounding_whitespace() {
+        assert_eq!(command_token("  !agent list  "), "!agent");
+        assert_eq!(command_token(""), "");
+        assert_eq!(command_token("   "), "");
+    }
+
+    #[test]
+    fn positional_args_are_whitespace_based() {
+        assert_eq!(first_arg("!agent details librarian"), Some("details"));
+        assert_eq!(second_arg("!agent details librarian"), Some("librarian"));
+        assert_eq!(first_arg("!agent"), None);
+        assert_eq!(second_arg("!agent details"), None);
+    }
+
+    #[test]
+    fn command_suggestions_accept_missing_bang_and_small_typos() {
+        assert_eq!(command_suggestion("agents"), Some("!agents"));
+        assert_eq!(command_suggestion("!stats"), Some("!status"));
+        assert_eq!(command_suggestion("!defualt"), Some("!default"));
+    }
+
+    #[test]
+    fn command_suggestions_ignore_unbounded_inputs() {
+        let long = format!("!{}", "x".repeat(MAX_FUZZY_COMMAND_CHARS + 1));
+        assert_eq!(command_suggestion(&long), None);
+    }
+}
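
A short worked example may help explain why the fuzzy threshold in `command_suggestion` is `<= 2` rather than `<= 1`. Plain Levenshtein distance has no transposition operation, so swapping two adjacent letters (as in `defualt`) costs two edits. The function below is copied from the diff so the sketch compiles on its own; `main` and the chosen inputs are illustrative.

```rust
/// Single-row Levenshtein distance over `char`s, as in `commands/parser.rs`.
fn levenshtein_distance(a: &str, b: &str) -> usize {
    let b_len = b.chars().count();
    // costs[j] holds the distance between the processed prefix of `a`
    // and the first `j` characters of `b`.
    let mut costs: Vec<usize> = (0..=b_len).collect();

    for (i, ca) in a.chars().enumerate() {
        // `previous` tracks the diagonal cell from the prior row.
        let mut previous = costs[0];
        costs[0] = i + 1;
        for (j, cb) in b.chars().enumerate() {
            let insertion = costs[j + 1] + 1;
            let deletion = costs[j] + 1;
            let substitution = previous + usize::from(ca != cb);
            previous = costs[j + 1];
            costs[j + 1] = insertion.min(deletion).min(substitution);
        }
    }

    costs[b_len]
}

fn main() {
    // One missing letter: a single edit.
    println!("stats   -> status : {}", levenshtein_distance("stats", "status"));
    // Two adjacent letters swapped: two edits, since transposition is not
    // a primitive operation in plain Levenshtein distance.
    println!("defualt -> default: {}", levenshtein_distance("defualt", "default"));
    // Classic textbook pair, included as a sanity check.
    println!("kitten  -> sitting: {}", levenshtein_distance("kitten", "sitting"));
}
```

With a threshold of 1, the `!defualt` case in the tests above would no longer produce a suggestion.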
diff --git a/scripts/check-architecture-ratchets.rb b/scripts/check-architecture-ratchets.rb
@@ -0,0 +1,106 @@
+#!/usr/bin/env ruby
+# frozen_string_literal: true
+
+require "find"
+require "pathname"
+
+ROOT = Pathname.new(__dir__).join("..").expand_path
+MAX_NEW_RUST_LINES = 700
+
+# Existing large files are technical debt, not precedent. Their budgets are
+# pinned to the line counts from the first architecture-ratchet pass; growing
+# one of these files should be an explicit decision, preferably paired with a
+# split or a tighter follow-up budget.
+RUST_LINE_BUDGETS = {
+  "crates/adversary-detector/src/proxy.rs" => 730,
+  "crates/adversary-detector/src/scanner.rs" => 1387,
+  "crates/calciforge/src/adapters/codex_cli.rs" => 752,
+  "crates/calciforge/src/adapters/mod.rs" => 1374,
+  "crates/calciforge/src/adapters/openclaw_channel.rs" => 1768,
+  "crates/calciforge/src/channels/matrix.rs" => 1815,
+  "crates/calciforge/src/channels/mock.rs" => 804,
+  "crates/calciforge/src/channels/signal.rs" => 1066,
+  "crates/calciforge/src/channels/sms.rs" => 958,
+  "crates/calciforge/src/channels/telegram.rs" => 2088,
+  "crates/calciforge/src/channels/whatsapp.rs" => 1291,
+  "crates/calciforge/src/commands.rs" => 4890,
+  "crates/calciforge/src/config.rs" => 2365,
+  "crates/calciforge/src/config/validator.rs" => 2419,
+  "crates/calciforge/src/doctor.rs" => 3580,
+  "crates/calciforge/src/install/cli.rs" => 1071,
+  "crates/calciforge/src/install/executor.rs" => 3681,
+  "crates/calciforge/src/install/linux_hardening.rs" => 751,
+  "crates/calciforge/src/install/model.rs" => 819,
+  "crates/calciforge/src/install/ssh.rs" => 1124,
+  "crates/calciforge/src/install/wizard.rs" => 701,
+  "crates/calciforge/src/providers/alloy.rs" => 1126,
+  "crates/calciforge/src/proxy/gateway.rs" => 1001,
+  "crates/calciforge/src/proxy/handlers.rs" => 2386,
+  "crates/host-agent/src/main.rs" => 1288,
+  "crates/paste-server/src/lib.rs" => 2623,
+  "crates/security-proxy/src/mitm.rs" => 1573,
+  "crates/security-proxy/src/proxy.rs" => 2296,
+  "crates/security-proxy/src/substitution.rs" => 912,
+  "crates/secrets-client/src/fnox_client.rs" => 787
+}.freeze
+
+WATCH_PATTERNS = {
+  "Arc<Mutex" => /Arc<Mutex/,
+  "Arc<RwLock" => /Arc<RwLock/,
+  "HashMap<String, String>" => /HashMap\s*<\s*String\s*,\s*String\s*>/,
+  "Vec<String>" => /Vec\s*<\s*String\s*>/,
+  "tokio::spawn" => /tokio::spawn\s*\(/,
+  "positional indexing" => /\[[0-9]+\]/,
+  "unsafe block" => /unsafe\s*\{/
+}.freeze
+
+def rust_files
+  files = []
+  Find.find(ROOT.join("crates").to_s) do |path|
+    next unless path.end_with?(".rs")
+    next if path.include?("/target/")
+
+    files << Pathname.new(path)
+  end
+  files.sort
+end
+
+def relative(path)
+  path.relative_path_from(ROOT).to_s
+end
+
+failed = false
+pattern_counts = Hash.new(0)
+
+rust_files.each do |file|
+  rel = relative(file)
+  text = file.read
+  line_count = text.lines.count
+  budget = RUST_LINE_BUDGETS.fetch(rel, MAX_NEW_RUST_LINES)
+
+  if line_count > budget
+    warn "#{rel}: #{line_count} lines exceeds architecture budget #{budget}"
+    failed = true
+  end
+
+  WATCH_PATTERNS.each do |name, regex|
+    pattern_counts[name] += text.scan(regex).count
+  end
+end
+
+missing_budget_files = RUST_LINE_BUDGETS.keys.reject { |path| ROOT.join(path).file? }
+unless missing_budget_files.empty?
+  warn "architecture budget references missing files:"
+  missing_budget_files.each { |path| warn "  #{path}" }
+  failed = true
+end
+
+puts "Architecture ratchets:"
+puts "  max lines for new Rust modules: #{MAX_NEW_RUST_LINES}"
+puts "  pinned large-module budgets: #{RUST_LINE_BUDGETS.length}"
+puts "  watched pattern counts:"
+pattern_counts.sort.each do |name, count|
+  puts "    #{name}: #{count}"
+end
+
+abort("architecture ratchets failed") if failed
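
The budget lookup at the heart of the script (`RUST_LINE_BUDGETS.fetch(rel, MAX_NEW_RUST_LINES)` followed by `line_count > budget`) is restated below in Rust, purely to keep the sketches in this review in one language. The script itself is Ruby and this is not a proposed replacement. `over_budget` is a hypothetical helper name; the numbers come from the table above.

```rust
use std::collections::HashMap;

/// Default ceiling for files that have no pinned budget (700 in the script).
const MAX_NEW_RUST_LINES: usize = 700;

/// Hypothetical helper: returns the violated budget when `line_count` exceeds
/// it, or `None` when the file is within budget.
fn over_budget(budgets: &HashMap<&str, usize>, path: &str, line_count: usize) -> Option<usize> {
    // Pinned files use their own budget; everything else gets the default.
    let budget = budgets.get(path).copied().unwrap_or(MAX_NEW_RUST_LINES);
    (line_count > budget).then_some(budget)
}

fn main() {
    let mut budgets = HashMap::new();
    budgets.insert("crates/calciforge/src/commands.rs", 4890);

    // (path, line count) pairs chosen to sit on either side of each limit.
    let samples = [
        ("crates/calciforge/src/commands.rs", 4890),
        ("crates/calciforge/src/commands.rs", 4891),
        ("crates/calciforge/src/commands/parser.rs", 105),
        ("crates/calciforge/src/commands/parser.rs", 701),
    ];
    for (path, lines) in samples {
        match over_budget(&budgets, path, lines) {
            Some(budget) => println!("{path}: {lines} lines exceeds architecture budget {budget}"),
            None => println!("{path}: {lines} lines is within budget"),
        }
    }
}
```

The comparison is strict, so a pinned file may stay at exactly its budget and only fails once it grows past it.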