feat: add agent lifecycle hooks (Claude Code-compatible config)#7411
feat: add agent lifecycle hooks (Claude Code-compatible config)#7411tlongwell-block wants to merge 13 commits intomainfrom
Conversation
78bd7d1 to
b9ec8bd
Compare
Hooks fire at key agent lifecycle events and execute user-configured actions via MCP tools. Supports blocking (prompt submit, tool use) and context injection (PostCompact). - 8 of 16 events wired: SessionStart, UserPromptSubmit, PreToolUse, PostToolUse, PostToolUseFailure, PreCompact, PostCompact, Stop - Two action types: command (via developer__shell) and mcp_tool - Config reads .goose/settings.json, .claude/settings.json, or ~/.config/goose/hooks.json with forward-compatible parsing - Fail-open: hook errors never crash the agent
b9ec8bd to
8a2bbf7
Compare
Addresses issues identified in multi-round crossfire review:
- Remove Option<Hooks> wrapping; Hooks::load() returns empty default on
error, eliminating 12 if-let-Some guards across agent integration
- Extract run_post_compact_hook() helper to deduplicate 3 identical
PostCompact context-injection blocks
- Fix SessionStart to fire once per session (first user message) instead
of on every reply
- Thread parent CancellationToken through all hooks.run() calls so
session cancellation properly cancels running hooks
- Standardize blocked-tool errors on Err(ErrorData) — Ok(CallToolResult
{is_error:true}) was silently treated as success by LLM formatters
- Tighten tool name matching from contains() to exact equality
- Replace hand-rolled trailing-* glob with glob crate for full pattern
support
- Change manual_compact from Option<bool> to bool (always set for
compact events, defaults false otherwise)
- Delete unused Hooks::fire() method (all call sites use run())
0976d80 to
e4b822b
Compare
Claude Code supports "prompt" and "agent" hook action types that goose doesn't implement yet. Previously, a .claude/settings.json containing these types would fail serde deserialization entirely, silently dropping all hooks in the file — including valid "command" hooks. Adds a custom deserializer for HookEventConfig.hooks that parses each action individually: - Known types (command, mcp_tool): deserialize normally, warn on malformed config with the actual error - Unknown types (prompt, agent, etc.): warn and skip - Missing type field: warn and skip This matches the existing pattern in HookSettingsFile where unknown event names are warned and skipped rather than causing parse failures.
…lUse, PostToolUseFailure - Add inject_hook_context helper, refactor run_post_compact_hook to use it - Wire context injection for 4 additional events (was PostCompact-only) - mcp_tool parse_result: non-HookResult output becomes additional_context - 32KB truncation on mcp_tool output matching command path - Context injected as user message (hidden from user, visible to agent)
|
Hi @tlongwell-block — we've been building a similar hooks system from the Claude Code side and wanted to share some findings that might be useful for this PR. Our implementation: https://github.com/sovren-software/goose/tree/feat/lifecycle-hooks Architecture comparisonWe took a subprocess-first approach (direct process spawning, matching how Claude Code itself executes hooks) vs. your MCP-first approach. A few design differences worth discussing: Config format: We used the Claude Code hooks:
session_start:
- command: "/path/to/hook.sh"
timeout: 15
pre_tool_use:
- command: "/path/to/guard.sh"
tool_name: "developer__shell" # regex filterSubprocess-first trade-off: Direct spawning means hooks work without an active MCP extension. The downside is no Bugs we found and fixed (may apply here too)1. SessionStart race condition (critical)If // Wrong: spawn races with first message
tokio::spawn(async move {
if let Some(ctx) = run_session_start_hooks(&session_id).await { ... }
});
Ok(Json(session))
// Correct: await before response
if let Some(ctx) = run_session_start_hooks(&session_id).await { ... }
Ok(Json(session))2. session_id not in PreToolUse payloadsThe 3. Empty tool_name on parse failureIf a TestsWe have 12 tests (6 unit in Happy to contribute the test file directly to this PR if useful — our test setup (script-writing helpers, async test fixtures) should port cleanly to your architecture. One question on your security model: is |
|
I'm definitely switching to Goose as soon as this feature comes out. 🙌🏻 |
…patible * origin/main: (70 commits) feat: allow goose askai bot to search goose codebase (#7508) Revert "Reapply "fix: prevent crashes in long-running Electron sessions"" Reapply "fix: prevent crashes in long-running Electron sessions" Revert "fix: prevent crashes in long-running Electron sessions" fix: replace unwrap() with graceful error in scheduler execute_job (#7436) fix: Dictation API error message shows incorrect limit (#7423) fix(acp): Use ACP schema types for session/list (#7409) fix(desktop): make bundle and updater asset naming configurable (#7337) fix(openai): preserve order in Responses API history (#7500) Use the correct Goose emoji 🪿 instead of Swan in README.md (#7485) feat(ui): implement fullscreen and pip display modes for MCP Apps (#7312) fix: prevent crashes in long-running Electron sessions Disable tool pair summarization (#7481) fix: New Recipe Warning does not close on cancel (#7524) The client is not the source of truth (#7438) feat: support Anthropic adaptive thinking (#7356) copilot instructions: reword no prerelease docs (#7101) fix(acp): don't fail session creation when model listing is unavailable (#7484) feat: simplify developer extension (#7466) feat: add goose-powered release notes generator workflow (#7503) ... # Conflicts: # Cargo.lock
Replace MCP dispatch + GOOSE_HOOK_EXIT workaround with direct subprocess spawning for hook commands. Leverages PR #7466's new platform developer shell (build_shell_command, user_login_path) for native exit codes, stdin piping, and separate stdout/stderr. New hooks/shell.rs: - Deadlock-safe I/O: stdout + stderr drain concurrently before stdin write - Process group isolation (unix) so Ctrl+C doesn't kill hooks - Timeout + cancellation support with secondary drain timeouts - 30s stdin write timeout prevents hanging on uncooperative children Rewritten hooks/mod.rs: - Split execute_action into execute_command (direct subprocess) and execute_mcp_tool (MCP dispatch, unchanged behavior) - New parse_command_output with native exit code handling: exit 0 = JSON or plain text context, exit 2 = block with stderr reason, other = fail-open - Removed GOOSE_HOOK_EXIT marker parsing entirely - Zero-timeout guard (0 -> 600s) on both Command and McpTool paths HooksOutcome.reason: - Added reason field to propagate hook block reasons - All 3 call sites (agent.rs x2, tool_execution.rs x1) now surface the hook's stderr message instead of hardcoded strings Tests (11 total): - 9 pure-logic tests for parse_command_output covering every exit code path, JSON/non-JSON parsing, truncation, blockable/non-blockable - 2 real-subprocess tests for run_hook_command: stdin piping via jq, exit-2 stderr capture (Claude Code block protocol) Crossfire reviewed: R1 6/10 -> R2 8/10 -> R3 9/10 APPROVE (Opus+Codex)
Regex matchers (Claude Code compat):
- Replace glob/exact-match with regex for tool name matching
- Edit|Write, mcp__memory__.*, Notebook.* patterns now work
- Bash/Bash(pattern) special case preserved, also accepts 'shell'
- Notification matcher also upgraded to regex
- Removed glob dependency from Cargo.toml
ContextFill hook (goose extension):
- New HookEventKind::ContextFill fires when context reaches a
configured percentage threshold (e.g., 70%)
- Matcher is the threshold: {"matcher": "70", ...}
- Fires once per threshold crossing, not every turn while above
- Hook receives current_tokens, context_limit, fill_percentage
- Checked after each LLM response using provider-reported token count
- Output injected as context via inject_hook_context
SessionEnd hook:
- Fires after Stop hook at end of reply_internal
- Side-effect only (cleanup, logging) — cannot block
Tests (15 total):
- 9 exit-code contract tests (unchanged)
- 2 real-subprocess tests (unchanged)
- 4 new regex matcher tests: alternation, wildcard, exact, invalid
Resolved conflicts in: - Cargo.lock (accepted main's version) - agent.rs (kept main's refactored stream pattern + all hooks integration) - tool_execution.rs (kept main's security logging + hooks PreToolUse) - shell.rs (kept main's LazyLock + added pub(crate) accessor for hooks) Also fixed hooks/mod.rs to use new ToolCallContext API and CallToolRequestParams builder pattern from rmcp 1.2.0.
Critical: - ContextFill threshold state now persists across turns via Arc on Agent (previously reset every reply() call due to Hooks::load creating fresh instance) Major: - break→continue in approval loop when PreToolUse hook blocks, so remaining tools still get responses (was leaving empty Message::user() objects) - Shell drain tasks: bounded reads (32KB stdout, 4KB stderr) prevent OOM - Kill entire process group on Unix timeout/cancel (not just direct child) - Cap JSON-parsed HookResult fields (additionalContext, reason) same as plain-text path — prevents oversized context injection via valid JSON Minor: - Regex anchoring: || → && so partially-anchored patterns still get wrapped - Stop/SubagentStop removed from can_block() since block result is never checked - HookResult fields documented as Claude Code compat (unused by goose) - Added libc dependency for process group kill on Unix
- Apply same 32KB/4KB caps to MCP hook JSON-parsed fields (parse_mcp_result was missing caps on the JSON success path) - Fix misleading comment about drain task abort — tasks are safe to detach because reads are bounded and process group is killed
Formatting only — no logic changes.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 76cec30c81
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| if let Some(last_user_msg) = conversation | ||
| .messages() | ||
| .iter() | ||
| .rev() | ||
| .find(|m| m.role == rmcp::model::Role::User) |
There was a problem hiding this comment.
Use original user prompt for UserPromptSubmit hooks
UserPromptSubmit currently derives its prompt by taking the last Role::User message from conversation, but SessionStart hooks run just before this and may inject hidden user-context messages via inject_hook_context. In that case, the hook receives the injected context instead of the actual user input, so prompt-validation/blocking hooks can make decisions on the wrong text during first-turn flows where SessionStart adds context.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Fixed in e8025bd: the user prompt is now captured before SessionStart hooks fire, so UserPromptSubmit always sees the actual user input.
| let invocation = crate::hooks::HookInvocation::pre_compact( | ||
| session_config.id.clone(), | ||
| conversation_to_compact.messages().len(), | ||
| true, // manual = true for explicit compaction |
There was a problem hiding this comment.
Mark auto-compaction hook invocations as automatic
In the auto-compaction path (Exceeded auto-compact threshold...), PreCompact/PostCompact invocations are tagged with manual = true. Hook matching in hooks/mod.rs uses this flag to route manual vs auto, so this mislabels automatic compactions as manual, causing matcher: "auto" hooks to never fire and matcher: "manual" hooks to run unexpectedly.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Fixed in e8025bd: the auto-compaction path now correctly passes manual=false to PreCompact/PostCompact hooks.
|
/goose |
Signed-off-by: Douwe Osinga <douwe@squareup.com>
- Fix P2: auto-compaction path now correctly passes manual=false to PreCompact/PostCompact hooks (was incorrectly passing true) - Fix P1: capture user prompt text before SessionStart hooks fire, preventing UserPromptSubmit from receiving injected context instead of the actual user input - Remove comments that restate what the code does Signed-off-by: Douwe Osinga <douwe@squareup.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e8025bd850
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| session_config.id.clone(), | ||
| conversation.clone(), | ||
| tool_call_cut_off, | ||
| 0, |
There was a problem hiding this comment.
Preserve current-turn tool pairs from summarization
maybe_summarize_tool_pairs is now always called with protect_last_n = 0, which means no tool calls from the active turn are protected from summarization. In multi-step tool turns near the cutoff, this can summarize and hide intermediate tool request/response pairs before the model finishes the turn, removing context it still needs for follow-up tool decisions or the final answer. The context_mgmt::tool_ids_to_summarize contract explicitly treats protect_last_n as “current turn” protection, so hard-coding 0 regresses correctness.
Useful? React with 👍 / 👎.
| if fill_pct >= threshold && !fired.contains(&threshold) { | ||
| fired.insert(threshold); | ||
| new_thresholds.push(threshold); |
There was a problem hiding this comment.
Re-arm ContextFill thresholds after usage drops
check_context_fill only inserts thresholds into fired_context_thresholds and never removes them when context usage later drops (for example after compaction). As a result, each threshold fires at most once for the entire session, so later threshold crossings do not trigger hooks again. This breaks the documented “once per threshold crossing” behavior and prevents repeated context-refresh hooks in long sessions with multiple compact/rebuild cycles.
Useful? React with 👍 / 👎.
| return Ok(Box::pin(async_stream::try_stream! { | ||
| yield AgentEvent::Message(Message::assistant().with_text(block_msg)); | ||
| })); |
There was a problem hiding this comment.
Run terminal lifecycle hooks on prompt-block path
When UserPromptSubmit returns a block, reply_internal returns early with a synthetic assistant message. That early return skips the Stop and SessionEnd hook invocations that are executed only at the normal function tail, so cleanup/audit hooks never run for blocked prompts. Any workflows relying on those lifecycle hooks will silently miss blocked-turn exits.
Useful? React with 👍 / 👎.
|
This is a solid architecture and we want hooks in goose, but 1900+ lines in a single PR is a lot to review and merge confidently — can we start simpler and iterate? Here's where the bulk comes from and how to slim it down:
Agent integration is copy-paste (~130 lines across 10 call sites). Every site repeats the same 12-15 line pattern: build invocation → Duplicated parsing. 8 of 18 event kinds are never fired. Suggested phasing:
That gets the same functionality in ~800 lines for the first two PRs, and each follow-up is a clean ~200 line addition. (I also pushed a commit merging main and fixing the two codex findings — auto-compaction was passing |
162 hook timing measurements across 157 sessions: - SessionStart: 17.1ms p50, UserPromptSubmit: 2.8ms, PreToolUse: 3.0ms - Per-session overhead: ~39.6ms (0.76% of session time) 50 agent quality runs (goose-vanilla vs goose-engram, k=5): - Hooks are quality-transparent: 34.0 vs 30.0 mean exec_score (within noise) - Fail-open design verified: hooks fire but don't degrade tasks Architecture comparison vs upstream PR aaif-goose#7411: - 1,045 vs 1,972 LOC (47% reduction) - enum (8 variants) vs struct (15 optional fields) - 0 vs 8 unwired events - 0 vs 1+ rmcp imports - 19 tests Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Response addresses each of Douwe Osinga's concerns with benchmark data: - God struct replaced with tagged enum (242 vs 334 LOC) - Copy-paste wiring replaced with HookRuntime.emit() (13 sites, ~8 LOC each) - Duplicated parsing unified to single serde_json::from_str - 8 unwired events removed (all 8 defined events fire in production) - Head-to-head benchmark: identical performance, 47% less code Offers to rework PR in suggested 2-phase structure using fork architecture. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
Thanks for the detailed review, @DOsinga. You're right that 1,900+ lines in a single PR is too much. I agree with every structural concern you raised, and I've already solved them in my fork. After your feedback, I ran benchmarks and refactored. Here's what I found and what I'd like to propose. Your concerns, my solutions1. "HookInvocation is a god struct (334 lines, 15 optional fields, 18 constructors)" I replaced this with a tagged enum: #[derive(Debug, Clone, Serialize)]
#[serde(tag = "hook_event_name")]
pub enum HookEvent {
SessionStart { session_id: String, cwd: String },
UserPromptSubmit { session_id: String, user_prompt: String, cwd: String },
PreToolUse { session_id: String, tool_name: String, tool_input: Value, cwd: String },
PostToolUse { session_id: String, tool_name: String, tool_input: Value, tool_output: String, cwd: String },
// ... 4 more variants, each carrying only its own fields
}242 lines total. Zero optional fields. Zero constructors (enum variants ARE constructors). Each variant carries exactly the data it needs, no "optional soup." 2. "Agent integration is copy-paste (~130 lines across 10 call sites)" I created if let Some(ref hooks) = hook_runtime {
let outcome = hooks.emit(
HookEvent::PreToolUse {
session_id: session_id.clone(),
tool_name: name.clone(),
tool_input: serde_json::to_value(&args).unwrap_or_default(),
cwd: working_dir.display().to_string(),
},
&working_dir,
cancel_token.clone(),
).await;
if outcome.blocked { /* handle block */ }
}13 emit sites across 3 files. No repeated patterns beyond the 3. "Duplicated parsing (parse_command_output = parse_mcp_result)" I have one parse path: 4. "8 of 18 event kinds are never fired" I define exactly 8 events, all fired in production: Adding a new event is a one-line enum variant + one emit call. Architecture: direct subprocess vs MCP routingThe original PR routes hooks through I benchmarked both paths head-to-head (same command, same machine, 100 iterations):
Performance is identical. The subprocess fork+exec cost dominates both paths. The MCP routing and output processing overhead (truncation, markdown rendering, CallToolResult wrapping) is negligible. But the architectural difference is significant:
Same performance, 47% less code, zero coupling to the extension system. ProposalI'd like to rework this PR using the architecture above, split into your suggested phases: Phase 1: Infrastructure + core events (~500 lines)
Phase 2: Agent wiring (~200 lines)
This has been running in production on my fleet for 5 weeks with zero hook-related failures and one upstream sync (v1.29.0, 126 commits) absorbed with zero hook-related conflicts. Full benchmark data: https://github.com/sovren-software/goose/blob/main/docs/hook-benchmark-results-2026-03-30.md Happy to start on Phase 1 whenever you're ready. |
|
Hi Douwe,
Thanks for the thorough review. You're absolutely right about the
structural issues.
The god struct, copy-paste wiring, and unwired events all need to go.
I've already rebuilt the hook system in my fork addressing every point you
raised:
a tagged enum instead of the god struct, a HookRuntime with a single
`emit()` method instead of copy-paste, and only the 8 events that are
actually wired. The result is 1,045 LOC vs the original 1,972. Same
functionality, same performance (I benchmarked head-to-head).
I posted a detailed response on the PR with the architecture comparison and
benchmark data.
At 1,045 lines total, this is already smaller than the ~800 lines you
suggested for Phase 1 alone,
so it could land as a single PR. Happy to split it if you prefer.
Regards,
Cesar Cruz
*Founder @ https://sovren.software/ <https://sovren.software/>*
https://x.com/TheCesarCross
…On Thu, Mar 26, 2026 at 1:43 PM Douwe Osinga ***@***.***> wrote:
*DOsinga* left a comment (aaif-goose/goose#7411)
<#7411>
This is a solid architecture and we want hooks in goose, but 1900+ lines
in a single PR is a lot to review and merge confidently — can we start
simpler and iterate?
Here's where the bulk comes from and how to slim it down:
*HookInvocation is a god struct (334 lines in types.rs).* 15 optional
fields, 18 constructors that each set 2-4 of them. Make it an enum with #[serde(tag
= "hook_event_name")] — each variant carries only its own fields, no
optional soup.
*Agent integration is copy-paste (~130 lines across 10 call sites).*
Every site repeats the same 12-15 line pattern: build invocation → hooks.run(invocation,
&self.extension_manager, &working_dir, cancel_token...) → check blocked →
check context. Create a HookRunner that captures the shared state once,
so each call site becomes a one-liner like hook_runner.fire_pre_tool_use(tool_name,
tool_input).await.
*Duplicated parsing.* parse_command_output and parse_mcp_result do the
same thing for two backends — unify them.
*8 of 18 event kinds are never fired.* SubagentStart, SubagentStop,
Notification, PermissionRequest, TeammateIdle, TaskCompleted, ConfigChange
are defined but not wired up. Drop them and add when needed.
*Suggested phasing:*
1. *Infrastructure + core events (~500 lines):* config loading, shell
execution, the enum-based invocation type, and HookRunner. Only the
events that are actually wired up.
2. *Agent wiring (~300 lines):* PreToolUse, PostToolUse,
UserPromptSubmit, SessionStart/End, Stop, PreCompact/PostCompact. With
HookRunner these are one-liners.
3. *Follow-up PRs:* ContextFill (most complex — state tracking across
turns), MCP tool execution backend, additional event kinds as they get
wired up.
That gets the same functionality in ~800 lines for the first two PRs, and
each follow-up is a clean ~200 line addition.
(I also pushed a commit merging main and fixing the two codex findings —
auto-compaction was passing manual=true to hooks, and UserPromptSubmit
was picking up injected context instead of the actual user prompt.)
—
Reply to this email directly, view it on GitHub
<#7411>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/APURS2UIUQMQFTOZIPI357T4SVT37AVCNFSM6AAAAACV3OZVNCVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHM2DCMZWHEZDKMBZGA>
.
You are receiving this because you commented.Message ID:
***@***.***>
|
Description
Adds a hooks system for agent lifecycle events that executes user-configured shell
commands or MCP tools at key points in the agent loop. Hooks can block actions, inject
context into the conversation, and run arbitrary scripts with the full invocation
context piped to stdin as JSON.
The motivating use case: PostCompact hooks that re-inject project context after
auto-compaction, so long sessions don't lose critical knowledge when history is
summarized. Also: ContextFill hooks that inject context at configurable fill
percentages before compaction is needed.
Architecture
tokio::process::Commandwith native exit codes, stdin piping, and separatestdout/stderr capture. No MCP intermediary.
ExtensionManagerforcalling any registered MCP tool with static arguments.
both starting before stdin write. Secondary 5s timeout on drain collection prevents
grandchild FD inheritance hangs.
(not just at parse time). Prevents OOM from malicious/buggy hooks. JSON-parsed
HookResultfields are also capped post-parse on both command and MCP paths.Ctrl+C doesn't kill them. On timeout/cancellation, the entire process group
is killed via
libc::kill(-pgid, SIGKILL), preventing orphaned grandchildren.and continue. A misconfigured hook never crashes a session.
Configuration
Hooks are configured under a top-level
hookskey in:~/.config/goose/hooks.json.goose/settings.json(preferred) or.claude/settings.jsonProject hooks require
"allow_project_hooks": truein the global config (security).{ "hooks": { "PostCompact": [ { "hooks": [{"type": "command", "command": "cat PROJECT_CONTEXT.md"}] } ], "ContextFill": [ { "matcher": "70", "hooks": [{"type": "command", "command": "cat ~/.goose/context-refresh.md"}] } ], "PreToolUse": [ { "matcher": "Edit|Write", "hooks": [{"type": "command", "command": "./lint-check.sh"}] } ] } }Exit code contract (Claude Code compatible)
HookResult, or treated as plain textadditionalContext(capped at 32KB, UTF-8 safe).PreToolUse,UserPromptSubmit). Stderris fed back as the block reason (capped at 4KB). Stdout is ignored.
Matchers (regex, Claude Code compatible)
BashorBash(regex)— Claude Code-style; maps todeveloper__shell/shellwith optional regex matching against the command string.
Edit|Write,mcp__memory__.*,Notebook.*— full regexanchored with
^(?:pattern)$. Patterns already fully anchored (^...$) are used as-is."auto"/"manual"— for PreCompact/PostCompact, matches the compaction trigger."70"fires at 70% context fill.Events wired (10 of 17)
Remaining events defined but not fired: SubagentStart, SubagentStop, Notification,
PermissionRequest, TeammateIdle, TaskCompleted, ConfigChange. These depend on goose
features that don't have natural call sites yet.
Tests
15 unit tests:
parse_command_output— every exit code path, JSON/non-JSON,truncation, blockable/non-blockable interaction
18 live integration tests (run against real Anthropic API):
fire + block (rm -rf), PostToolUse fire, regex matcher, Stop, SessionEnd, execution
order, fail-open on broken hook
graceful degradation, stdin JSON field validation, tool_name/tool_input presence,
Claude Code Bash matcher
Files changed
hooks/mod.rshooks/types.rshooks/shell.rshooks/config.rsagent.rsexecute_commands.rstool_execution.rsdeveloper/shell.rspub(crate)forbuild_shell_commandanduser_login_pathCargo.tomlregex,libcdependencieslib.rspub mod hooksReview notes
Crossfire reviewed across 2 rounds with 4 models (GPT-5, Claude Sonnet, Codex, Gemini).
Round 1 found 10 issues (1 critical, 5 major, 4 minor). All fixed:
Arc<Mutex<HashSet>>on Agentbreak→continuein approval loop, bounded shell reads, process group kill,JSON-parsed field caps on both command and MCP paths
||→&&),can_block()trimmed, HookResult field docsRound 2 (verification): Codex 10/10 APPROVE (all 8 verified), GPT-5 and Sonnet
found 2 remaining issues (MCP path caps, drain comment) — both fixed.
Final state: all 15 unit tests pass, all 18 live integration tests pass.