21 changes: 18 additions & 3 deletions .github/copilot-instructions.md
@@ -10,14 +10,16 @@ Self-hosted security gateway between AI agents and the rest of the world. Multi-

1. **Correctness** — does the code do what its name/comments/tests claim?
2. **Security** — secret leakage in logs, missing auth/identity check, substitution-boundary regression, exfil paths
-3. **Resource & error handling** — see Rust path-scoped instructions
-4. **Performance** — only when measurable in a hot path; otherwise skip
-5. **Style** — skip entirely (pre-commit handles it)
+3. **Architecture drift** — new source of truth, boundary bypass, god-object growth, stringly core logic, background state without ownership
+4. **Resource & error handling** — see Rust path-scoped instructions
+5. **Performance** — only when measurable in a hot path; otherwise skip
+6. **Style** — skip entirely (pre-commit handles it)

## Tools that already gate — don't re-flag what they catch

- `cargo fmt --all -- --check` — formatting (pre-commit blocks commit)
- `cargo clippy --all-targets -- -D warnings` — lints (pre-commit + CI)
- `ruby scripts/check-architecture-ratchets.rb` — large-module budgets and watched pattern counts. Do not repeat the raw line-budget failure as a review comment unless the diff shows the architectural reason it grew or an obvious split point.
- `gitleaks protect --staged` — secret scanning (pre-commit) + the same scan in CI. `.gitleaks.toml` allowlists by **path** (`tests/**/fixtures/`, `docs/rfcs/*.md`, lockfiles, `crates/paste-server/src/lib.rs`, `*.example.*`) and by **regex** (loopback IPs, RFC 5737 doc-ranges, a few specific inherited-from-main values). If a finding is inside an allowlisted path or matches an allowlisted regex, do NOT re-flag it — it's intentional. Findings outside the allowlist are real.

## Project context — NOT bugs despite looking like them
@@ -28,6 +30,19 @@ Self-hosted security gateway between AI agents and the rest of the world. Multi-
- References to `zeroclaw_*` (no trailing `ed`) are the upstream third-party tool we wrap, NOT pre-rename leftover of this project.
- Mixed Rust edition (2021 + 2024) is known and tracked. Do NOT suggest the bump unless the PR is explicitly about edition migration.

## Architecture drift worth flagging

Only flag these when the diff gives concrete evidence and a local fix or narrower design question is available:

1. **Growing an existing oversized module** — especially `commands.rs`, channel modules, installer executor, proxy handlers, config, or doctor — when the added behavior could live behind a typed helper/module boundary.
2. **New duplicate source of truth** — adapter kinds, model identifiers, channel capabilities, secret policy, gateway routing, install paths, or lifecycle state copied into a second registry/table without a synchronization plan.
3. **Stringly core decisions** — security, routing, model selection, adapter lifecycle, approval, or persistence logic operating directly on raw `String`, `Vec<String>`, positional args, or `HashMap<String, String>` after the external boundary has already been crossed.
4. **Background work without ownership** — spawned tasks that mutate shared state, swallow errors, or outlive the request/channel/session without a cancellation and reporting path.
5. **Gateway/proxy bypasses** — new provider, fetch, exec, browser, or agent network paths that avoid Calciforge's configured model gateway/security proxy without being explicitly documented as opt-out or unenforceable.
6. **Config/docs/test drift** — new channel, adapter, model gateway, or security config fields without matching docs and compile/smoke coverage.

Do not ask for a broad rewrite. Prefer comments like: "This adds another command sub-flow to `commands.rs`; can this live in `commands/<domain>.rs` with a typed request enum so the ratchet budget does not keep rising?"
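
As an illustration of the "typed request enum" that the suggested review comment asks for, here is a minimal sketch. `AgentRequest`, `ParseError`, `parse_agent_request`, and the `list`/`details` sub-commands are hypothetical names chosen for the example; the real `!agent` grammar in `commands.rs` may differ.

```rust
/// Hypothetical typed request for the `!agent` command family.
/// The variants are assumptions for illustration, not the project's actual grammar.
#[derive(Debug, PartialEq, Eq)]
enum AgentRequest {
    /// `!agent` with no arguments.
    ShowCurrent,
    /// `!agent list`
    List,
    /// `!agent details <name>`
    Details { name: String },
}

/// Hypothetical parse failures, reported at the boundary instead of deep in core logic.
#[derive(Debug, PartialEq, Eq)]
enum ParseError {
    NotAgentCommand,
    MissingName,
    UnknownSubcommand(String),
}

/// Convert raw chat text into a typed request exactly once, at the boundary.
fn parse_agent_request(text: &str) -> Result<AgentRequest, ParseError> {
    let mut parts = text.split_whitespace();
    if parts.next() != Some("!agent") {
        return Err(ParseError::NotAgentCommand);
    }
    match (parts.next(), parts.next()) {
        (None, _) => Ok(AgentRequest::ShowCurrent),
        (Some("list"), _) => Ok(AgentRequest::List),
        (Some("details"), Some(name)) => Ok(AgentRequest::Details {
            name: name.to_string(),
        }),
        (Some("details"), None) => Err(ParseError::MissingName),
        (Some(other), _) => Err(ParseError::UnknownSubcommand(other.to_string())),
    }
}

fn main() {
    // Downstream code matches on variants and never indexes into split strings.
    println!("{:?}", parse_agent_request("!agent details librarian"));
    println!("{:?}", parse_agent_request("!agent details"));
}
```

With this shape, a missing argument is a `ParseError` at the edge rather than a `None` that each handler has to remember to check.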

## Self-discipline

- Do NOT repeat a comment already made on a parent or sibling PR in the same stack. If the same observation was raised on PR #N and merged/addressed, do not re-raise on PR #N+1. Past noisy patterns: the dead-doc-reference comment was posted four times across PRs #20/#23/#25; the env-mutex/`serial_test` comment was posted eight+ times across #19/#22/#23.
6 changes: 6 additions & 0 deletions .github/instructions/rust.instructions.md
@@ -25,6 +25,12 @@ These extend `.github/copilot-instructions.md`. Same review philosophy: **if unc
- `tokio::select!` branches must be cancellation-safe. If a branch holds partial state across `.await` (e.g., a half-written buffer), losing the race silently corrupts state. Worth flagging if non-obvious.
- `tokio::process::Command` without `.kill_on_drop(true)` leaks the child if the parent task is dropped mid-await. Flag for long-running children; skip for one-shot commands that are awaited to completion.
- Spawned `JoinHandle`s that are dropped silently swallow panics. Worth flagging for long-running tasks; not for fire-and-forget helpers.
- New `tokio::spawn` or thread-spawned work that mutates shared state should have an obvious owner, cancellation/error reporting path, and ordering story. Flag detached background state changes that make request/session/channel lifecycle implicit.
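
A minimal sketch of what "an obvious owner, cancellation/error reporting path" can look like. It uses `std::thread` so that it runs without extra crates; a `tokio::spawn` version would follow the same shape but is not shown here. `OwnedWorker`, `finish`, and `cancel_and_join` are hypothetical names, and treating a job value of `0` as an error is an arbitrary assumption made only to exercise the error path.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::mpsc::{self, Receiver, RecvTimeoutError};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Hypothetical owner of one background worker: it holds the cancellation
/// flag and the join handle, so the worker cannot silently outlive it.
struct OwnedWorker {
    cancel: Arc<AtomicBool>,
    handle: JoinHandle<Result<u64, String>>,
}

impl OwnedWorker {
    /// Spawn a worker that sums incoming jobs. The result is handed back
    /// through the join handle instead of being written into shared state.
    fn spawn(jobs: Receiver<u64>) -> Self {
        let cancel = Arc::new(AtomicBool::new(false));
        let flag = Arc::clone(&cancel);
        let handle = thread::spawn(move || {
            let mut total = 0u64;
            while !flag.load(Ordering::Acquire) {
                match jobs.recv_timeout(Duration::from_millis(5)) {
                    // Assumption for the example: 0 is an invalid job.
                    Ok(0) => return Err("rejected job value 0".to_string()),
                    Ok(n) => total += n,
                    Err(RecvTimeoutError::Timeout) => continue,
                    Err(RecvTimeoutError::Disconnected) => break,
                }
            }
            Ok(total)
        });
        OwnedWorker { cancel, handle }
    }

    /// Wait for the worker to drain its channel; panics surface as errors.
    fn finish(self) -> Result<u64, String> {
        self.handle
            .join()
            .unwrap_or_else(|_| Err("worker panicked".to_string()))
    }

    /// Ask the worker to stop, then wait for it and report its outcome.
    fn cancel_and_join(self) -> Result<u64, String> {
        self.cancel.store(true, Ordering::Release);
        self.finish()
    }
}

fn main() {
    let (tx, rx) = mpsc::channel();
    let worker = OwnedWorker::spawn(rx);
    tx.send(2).unwrap();
    tx.send(3).unwrap();
    drop(tx); // closing the channel lets the worker drain and exit
    println!("{:?}", worker.finish());
}
```

The design choice being illustrated is that the only way to obtain the worker's result is through its owner, which makes ordering and failure behavior explicit at the call site.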

## Boundary hygiene

- Raw config/protocol/CLI values should be converted into typed structs or enums before security, routing, lifecycle, persistence, or model-selection decisions. Flag new core logic that keeps branching on arbitrary strings or positional `Vec<String>` indexes when a local typed request/decision type would make invalid states unrepresentable.
- New registries for adapter kinds, channel capabilities, model identifiers, or secret policy should reuse the existing source of truth. Flag duplicated tables unless the diff includes a synchronization comment/test.
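
One way to keep a registry from forking into a second table is to derive every list from a single enum. The sketch below is illustrative only: `AdapterKind`, its two variants, and `known_adapter_names` are hypothetical, with the variant names loosely borrowed from the adapter file names in this repository rather than from any confirmed registry.

```rust
use std::str::FromStr;

/// Hypothetical single source of truth for adapter kinds.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AdapterKind {
    CodexCli,
    OpenclawChannel,
}

impl AdapterKind {
    /// Every variant, listed once. Other tables derive from this array.
    const ALL: [AdapterKind; 2] = [AdapterKind::CodexCli, AdapterKind::OpenclawChannel];

    /// The exhaustive match forces an update here when a variant is added.
    fn as_str(self) -> &'static str {
        match self {
            AdapterKind::CodexCli => "codex_cli",
            AdapterKind::OpenclawChannel => "openclaw_channel",
        }
    }
}

impl FromStr for AdapterKind {
    type Err = String;

    /// Parsing reuses `ALL` and `as_str` instead of a second string table.
    fn from_str(raw: &str) -> Result<Self, Self::Err> {
        AdapterKind::ALL
            .iter()
            .copied()
            .find(|kind| kind.as_str() == raw)
            .ok_or_else(|| format!("unknown adapter kind: {}", raw))
    }
}

/// Help text, validation, and docs can all derive their list from `ALL`.
fn known_adapter_names() -> Vec<&'static str> {
    AdapterKind::ALL.iter().map(|kind| kind.as_str()).collect()
}

fn main() {
    println!("{:?}", "codex_cli".parse::<AdapterKind>());
    println!("{:?}", known_adapter_names());
}
```

A round-trip test over `ALL` (parse every `as_str` value back into its variant) is the kind of synchronization test the bullet above asks for.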

## Lints / attributes

3 changes: 3 additions & 0 deletions .github/workflows/ci.yml
@@ -33,6 +33,9 @@ jobs:
- name: Check installer config helpers
  run: python3 scripts/test-upsert-calciforge-agent.py

- name: Check architecture ratchets
  run: ruby scripts/check-architecture-ratchets.rb

# ─────────────────────────────────────────────────────────────────────────────
# Check formatting and linting
# ─────────────────────────────────────────────────────────────────────────────
3 changes: 3 additions & 0 deletions AGENTS.md
@@ -50,6 +50,9 @@ User-facing tour: `README.md` → [calciforge.org](https://calciforge.org/).
10. **Product-contract check before design changes.** Before changing installer behavior, gateway routing, secret handling, agent adapters, channel UX, or model selection, stop and ask: does this API or behavior actually fulfill Calciforge's design intent as a self-hosted security gateway, or merely make the local code pass? Preserve central promises such as one operator-owned secret store, agents never receiving plaintext secrets by default, model traffic flowing through the configured gateway unless explicitly opted out, and channel commands behaving consistently across supported transports.
11. **Cross-node assumptions must be explicit.** Do not assume a helper binary, config file, fnox vault, MCP server, or environment variable exists on an agent host just because it exists on the Calciforge host. Multi-node features need an explicit propagation model, a runtime smoke test from the agent host, and docs that name whether state is central or local.
12. **Avoid accidental architecture drift.** If a quick fix creates a second source of truth, bypasses the gateway/proxy, weakens a security boundary, or contradicts a documented roadmap/ADR, treat that as a design bug. Either implement the coherent version or leave a clearly documented follow-up with the user-visible limitation.
13. **Large files are debt with budgets, not precedent.** `scripts/check-architecture-ratchets.rb` pins current oversized Rust modules to explicit line budgets and fails CI if they grow. New Rust modules should stay under the default budget unless the PR explains the boundary being created and adds a budget consciously.
14. **Stringly data stays at the boundary.** It is acceptable for config, JSON, CLI args, and protocol payloads to enter as `String`, `Vec<String>`, or `HashMap<String, String>`, but core logic should convert them into typed structs/enums before making security, routing, lifecycle, or persistence decisions.
15. **Detached work needs an owner.** New `tokio::spawn` or thread-spawned work must have an explicit lifecycle owner, cancellation/error path, and state handoff. Do not update shared mutable state from background tasks unless the owning module documents the ordering and failure behavior.

## Build / test

75 changes: 4 additions & 71 deletions crates/calciforge/src/commands.rs
@@ -34,6 +34,10 @@ use crate::messages::{ChoiceControl, ChoiceOption, Match, OutboundMessage};
use crate::model_names::configured_first_class_model_ids;
use crate::providers::alloy::AlloyManager;

+mod parser;
+
+use parser::{command_suggestion, command_token, first_arg, second_arg};
+
const PENDING_CHOICE_TTL: Duration = Duration::from_secs(10 * 60);

/// Default state directory: `~/.config/calciforge/state/`.
@@ -94,18 +98,6 @@ fn session_runtime_readiness_error(agent_cfg: &crate::config::AgentConfig) -> Op
}
}

-fn first_arg(text: &str) -> Option<&str> {
-    text.split_whitespace().nth(1)
-}
-
-fn second_arg(text: &str) -> Option<&str> {
-    text.split_whitespace().nth(2)
-}
-
-fn command_token(text: &str) -> &str {
-    text.split_whitespace().next().unwrap_or("")
-}
-
fn gateway_model_selector_ids(config: &CalciforgeConfig) -> HashSet<String> {
configured_first_class_model_ids(config)
.into_iter()
@@ -160,65 +152,6 @@ impl fmt::Display for AgentChoiceError {
}
}

-fn command_suggestion(cmd: &str) -> Option<&'static str> {
-    const MAX_FUZZY_COMMAND_CHARS: usize = 64;
-    const COMMANDS: &[&str] = &[
-        "!help",
-        "!status",
-        "!agents",
-        "!agent",
-        "!sessions",
-        "!session",
-        "!new",
-        "!btw",
-        "!gateway",
-        "!metrics",
-        "!ping",
-        "!switch",
-        "!default",
-        "!model",
-        "!secure",
-        "!secret",
-        "!approve",
-        "!deny",
-    ];
-
-    let lower = cmd.to_lowercase();
-    let without_bang = lower.trim_start_matches('!');
-    if without_bang.chars().count() > MAX_FUZZY_COMMAND_CHARS {
-        return None;
-    }
-
-    COMMANDS
-        .iter()
-        .copied()
-        .find(|candidate| candidate.trim_start_matches('!') == without_bang)
-        .or_else(|| {
-            COMMANDS.iter().copied().find(|candidate| {
-                levenshtein_distance(without_bang, candidate.trim_start_matches('!')) <= 2
-            })
-        })
-}
-
-fn levenshtein_distance(a: &str, b: &str) -> usize {
-    let b_len = b.chars().count();
-    let mut costs: Vec<usize> = (0..=b_len).collect();
-
-    for (i, ca) in a.chars().enumerate() {
-        let mut previous = costs[0];
-        costs[0] = i + 1;
-        for (j, cb) in b.chars().enumerate() {
-            let insertion = costs[j + 1] + 1;
-            let deletion = costs[j] + 1;
-            let substitution = previous + usize::from(ca != cb);
-            previous = costs[j + 1];
-            costs[j + 1] = insertion.min(deletion).min(substitution);
-        }
-    }
-
-    costs[b_len]
-}
-
/// Load persisted active-agent selections from a given state directory.
/// Returns an empty map if the file doesn't exist or can't be parsed.
fn load_active_agents_from(state_dir: &Path) -> HashMap<String, String> {
105 changes: 105 additions & 0 deletions crates/calciforge/src/commands/parser.rs
@@ -0,0 +1,105 @@
const MAX_FUZZY_COMMAND_CHARS: usize = 64;

const COMMANDS: &[&str] = &[
    "!help",
    "!status",
    "!agents",
    "!agent",
    "!sessions",
    "!session",
    "!new",
    "!btw",
    "!gateway",
    "!metrics",
    "!ping",
    "!switch",
    "!default",
    "!model",
    "!secure",
    "!secret",
    "!approve",
    "!deny",
];

pub(super) fn first_arg(text: &str) -> Option<&str> {
    text.split_whitespace().nth(1)
}

pub(super) fn second_arg(text: &str) -> Option<&str> {
    text.split_whitespace().nth(2)
}

pub(super) fn command_token(text: &str) -> &str {
    text.split_whitespace().next().unwrap_or("")
}

pub(super) fn command_suggestion(cmd: &str) -> Option<&'static str> {
    let raw_without_bang = cmd.trim_start_matches('!');
    if raw_without_bang.chars().count() > MAX_FUZZY_COMMAND_CHARS {
        return None;
    }

    let lower = raw_without_bang.to_lowercase();

    COMMANDS
        .iter()
        .copied()
        .find(|candidate| candidate.trim_start_matches('!') == lower)
        .or_else(|| {
            COMMANDS.iter().copied().find(|candidate| {
                levenshtein_distance(&lower, candidate.trim_start_matches('!')) <= 2
            })
        })
}

fn levenshtein_distance(a: &str, b: &str) -> usize {
    let b_len = b.chars().count();
    let mut costs: Vec<usize> = (0..=b_len).collect();

    for (i, ca) in a.chars().enumerate() {
        let mut previous = costs[0];
        costs[0] = i + 1;
        for (j, cb) in b.chars().enumerate() {
            let insertion = costs[j + 1] + 1;
            let deletion = costs[j] + 1;
            let substitution = previous + usize::from(ca != cb);
            previous = costs[j + 1];
            costs[j + 1] = insertion.min(deletion).min(substitution);
        }
    }

    costs[b_len]
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn command_token_ignores_surrounding_whitespace() {
        assert_eq!(command_token(" !agent list "), "!agent");
        assert_eq!(command_token(""), "");
        assert_eq!(command_token(" "), "");
    }

    #[test]
    fn positional_args_are_whitespace_based() {
        assert_eq!(first_arg("!agent details librarian"), Some("details"));
        assert_eq!(second_arg("!agent details librarian"), Some("librarian"));
        assert_eq!(first_arg("!agent"), None);
        assert_eq!(second_arg("!agent details"), None);
    }

    #[test]
    fn command_suggestions_accept_missing_bang_and_small_typos() {
        assert_eq!(command_suggestion("agents"), Some("!agents"));
        assert_eq!(command_suggestion("!stats"), Some("!status"));
        assert_eq!(command_suggestion("!defualt"), Some("!default"));
    }

    #[test]
    fn command_suggestions_ignore_unbounded_inputs() {
        let long = format!("!{}", "x".repeat(MAX_FUZZY_COMMAND_CHARS + 1));
        assert_eq!(command_suggestion(&long), None);
    }
}
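
For readers checking the `<= 2` threshold, the following standalone sketch reuses `levenshtein_distance` as it appears in this file and adds a hypothetical helper, `first_within_two`, that mirrors the fuzzy `find` over a shortened candidate list. The shortened list and the helper name are assumptions for illustration.

```rust
/// Single-row Levenshtein distance, copied from `parser.rs` in this diff.
fn levenshtein_distance(a: &str, b: &str) -> usize {
    let b_len = b.chars().count();
    let mut costs: Vec<usize> = (0..=b_len).collect();

    for (i, ca) in a.chars().enumerate() {
        let mut previous = costs[0];
        costs[0] = i + 1;
        for (j, cb) in b.chars().enumerate() {
            let insertion = costs[j + 1] + 1;
            let deletion = costs[j] + 1;
            let substitution = previous + usize::from(ca != cb);
            previous = costs[j + 1];
            costs[j + 1] = insertion.min(deletion).min(substitution);
        }
    }

    costs[b_len]
}

/// Hypothetical helper with the same lookup shape as the fuzzy branch of
/// `command_suggestion`: the first candidate within distance 2 wins.
fn first_within_two<'a>(input: &str, candidates: &[&'a str]) -> Option<&'a str> {
    candidates
        .iter()
        .copied()
        .find(|candidate| levenshtein_distance(input, candidate) <= 2)
}

fn main() {
    let pairs = [
        ("stats", "status"),
        ("defualt", "default"),
        ("agnt", "agents"),
        ("agnt", "agent"),
    ];
    for (typed, command) in pairs {
        println!("{} -> {}: {}", typed, command, levenshtein_distance(typed, command));
    }
    println!("{:?}", first_within_two("agnt", &["help", "status", "agents", "agent"]));
}
```

A swapped pair of adjacent letters such as `defualt` costs 2 under plain Levenshtein, which is why a threshold of 1 would miss it. Because `find` stops at the first candidate within the threshold, `agnt` resolves to `agents` (distance 2) even though `agent` (distance 1) appears later in the list; whether that ordering matters for the real command list is a question for the author.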
106 changes: 106 additions & 0 deletions scripts/check-architecture-ratchets.rb
@@ -0,0 +1,106 @@
#!/usr/bin/env ruby
# frozen_string_literal: true

require "find"
require "pathname"

ROOT = Pathname.new(__dir__).join("..").expand_path
MAX_NEW_RUST_LINES = 700

# Existing large files are technical debt, not precedent. Their budgets are
# pinned to the line counts from the first architecture-ratchet pass; growing
# one of these files should be an explicit decision, preferably paired with a
# split or a tighter follow-up budget.
RUST_LINE_BUDGETS = {
  "crates/adversary-detector/src/proxy.rs" => 730,
  "crates/adversary-detector/src/scanner.rs" => 1387,
  "crates/calciforge/src/adapters/codex_cli.rs" => 752,
  "crates/calciforge/src/adapters/mod.rs" => 1374,
  "crates/calciforge/src/adapters/openclaw_channel.rs" => 1768,
  "crates/calciforge/src/channels/matrix.rs" => 1815,
  "crates/calciforge/src/channels/mock.rs" => 804,
  "crates/calciforge/src/channels/signal.rs" => 1066,
  "crates/calciforge/src/channels/sms.rs" => 958,
  "crates/calciforge/src/channels/telegram.rs" => 2088,
  "crates/calciforge/src/channels/whatsapp.rs" => 1291,
  "crates/calciforge/src/commands.rs" => 4890,
  "crates/calciforge/src/config.rs" => 2365,
  "crates/calciforge/src/config/validator.rs" => 2419,
  "crates/calciforge/src/doctor.rs" => 3580,
  "crates/calciforge/src/install/cli.rs" => 1071,
  "crates/calciforge/src/install/executor.rs" => 3681,
  "crates/calciforge/src/install/linux_hardening.rs" => 751,
  "crates/calciforge/src/install/model.rs" => 819,
  "crates/calciforge/src/install/ssh.rs" => 1124,
  "crates/calciforge/src/install/wizard.rs" => 701,
  "crates/calciforge/src/providers/alloy.rs" => 1126,
  "crates/calciforge/src/proxy/gateway.rs" => 1001,
  "crates/calciforge/src/proxy/handlers.rs" => 2386,
  "crates/host-agent/src/main.rs" => 1288,
  "crates/paste-server/src/lib.rs" => 2623,
  "crates/security-proxy/src/mitm.rs" => 1573,
  "crates/security-proxy/src/proxy.rs" => 2296,
  "crates/security-proxy/src/substitution.rs" => 912,
  "crates/secrets-client/src/fnox_client.rs" => 787
}.freeze

WATCH_PATTERNS = {
  "Arc<Mutex" => /Arc<Mutex/,
  "Arc<RwLock" => /Arc<RwLock/,
  "HashMap<String, String>" => /HashMap\s*<\s*String\s*,\s*String\s*>/,
  "Vec<String>" => /Vec\s*<\s*String\s*>/,
  "tokio::spawn" => /tokio::spawn\s*\(/,
  "positional indexing" => /\[[0-9]+\]/,
  "unsafe block" => /unsafe\s*\{/
}.freeze

def rust_files
  files = []
  Find.find(ROOT.join("crates").to_s) do |path|
    next unless path.end_with?(".rs")
    next if path.include?("/target/")

    files << Pathname.new(path)
  end
  files.sort
end

def relative(path)
  path.relative_path_from(ROOT).to_s
end

failed = false
pattern_counts = Hash.new(0)

rust_files.each do |file|
  rel = relative(file)
  text = file.read
  line_count = text.lines.count
  budget = RUST_LINE_BUDGETS.fetch(rel, MAX_NEW_RUST_LINES)

  if line_count > budget
    warn "#{rel}: #{line_count} lines exceeds architecture budget #{budget}"
    failed = true
  end

  WATCH_PATTERNS.each do |name, regex|
    pattern_counts[name] += text.scan(regex).count
  end
end

missing_budget_files = RUST_LINE_BUDGETS.keys.reject { |path| ROOT.join(path).file? }
unless missing_budget_files.empty?
  warn "architecture budget references missing files:"
  missing_budget_files.each { |path| warn " #{path}" }
  failed = true
end

puts "Architecture ratchets:"
puts " max lines for new Rust modules: #{MAX_NEW_RUST_LINES}"
puts " pinned large-module budgets: #{RUST_LINE_BUDGETS.length}"
puts " watched pattern counts:"
pattern_counts.sort.each do |name, count|
  puts " #{name}: #{count}"
end

abort("architecture ratchets failed") if failed