fix(hooks): emit permissionDecision "ask" for T2 findings by elijahr · Pull Request #285 · axiomantic/spellbook

elijahr · 2026-05-07T21:49:09Z

What does this PR do?

Fixes a PreToolUse hook bug where T2 (TIER-ASK) findings collapsed onto the same sys.exit(2) deny path as T3 (TIER-DENY). The intended UX for T2 commands like git push and gh pr merge is Claude Code's yellow permissionDecision: "ask" prompt; instead the operator got a silent hard block with no in-session way to approve.

check_tool_input now returns an explicit verdict: "allow" | "ask" | "deny" alongside the existing safe: bool (non-breaking superset; the 7 existing safe-only callers are unchanged). _gate_bash, _gate_spawn, and _gate_state_sanitize short-circuit on verdict == "ask" to emit hookSpecificOutput.permissionDecision = "ask" and exit 0. Mixed TIER-ASK + non-ask findings still resolve to deny (deny-wins invariant preserved).

Bug source brief: ~/Development/spellbook-fix-t2-as-ask-prompt.md.

Related issue

Checklist

Tests pass locally (12 new tests in tests/test_security/test_check.py and tests/test_security/test_hooks.py cover pure-T2, pure-T3, mixed, safe, and the deny-wins invariant; full targeted suite green; ruff clean)
Documentation updated (if applicable) — N/A; behavior is restoring documented intent in _tier_findings docstring, no doc drift

The PreToolUse hook collapsed T2 (TIER-ASK) findings into the same exit-2 deny path as T3 (TIER-DENY), so commands like git push and gh pr merge got silently blocked instead of surfacing Claude Code's yellow permission prompt. check_tool_input now also returns a verdict field with values allow, ask, or deny. This is a non-breaking superset: the existing safe boolean is unchanged for the 7 callers that read it. _gate_bash, _gate_spawn, and _gate_state_sanitize short-circuit on verdict == ask to emit hookSpecificOutput.permissionDecision = ask and exit 0, letting the harness render its native permission prompt. Mixed TIER-ASK plus non-ask findings still resolve to deny (deny-wins invariant). 12 new tests cover pure-T2, pure-T3, mixed, and safe input cases plus the deny-wins invariant.

axiomantic-momus · 2026-05-07T21:49:54Z

✅ Momus review posted — verdict APPROVE, 0 findings

████████████████████ 100%

run log

gemini-code-assist

Code Review

This pull request introduces a new 'ask' verdict for security gates, enabling Claude Code to surface permission prompts for specific tool uses like git push. It adds the _emit_ask_and_exit helper to handle these prompts and updates the security check logic to compute verdicts ('allow', 'ask', 'deny'). Feedback focuses on improving code conciseness by removing redundant str() calls and unnecessary conditional checks.

gemini-code-assist · 2026-05-07T21:51:02Z

+    reason = "; ".join(
+        f.get("message", "")
+        for f in findings
+        if str(f.get("rule_id", "")).startswith("TIER-ASK")
+    )


The str() call on the rule_id is redundant here, as the findings generated by the security gates use string literals for their IDs. Removing it improves readability and adheres to the general maintainability standards mentioned in the style guide.

Suggested change

reason = "; ".join(

f.get("message", "")

for f in findings

if str(f.get("rule_id", "")).startswith("TIER-ASK")

)

reason = "; ".join(

f.get("message", "")

for f in findings

if f.get("rule_id", "").startswith("TIER-ASK")

)

References

Python code should follow PEP 8 and maintain clarity and conciseness. ^(link)

gemini-code-assist · 2026-05-07T21:51:02Z

+    non_low = [f for f in findings if f.get("severity") != "LOW"]
+    if non_low and all(
+        str(f.get("rule_id", "")).startswith("TIER-ASK") for f in non_low
+    ):
+        return "ask"


The if non_low check is redundant because the safe check at the beginning of the function already ensures that non_low will contain at least one element if the code reaches this point. Additionally, the str() call on rule_id is unnecessary as the rule IDs are already strings.

non_low = [f for f in findings if f.get("severity") != "LOW"] if all(f.get("rule_id", "").startswith("TIER-ASK") for f in non_low): return "ask"

References

Python code should follow PEP 8 and avoid redundant logic for better maintainability. ^(link)

axiomantic-momus

This PR fixes a PreToolUse hook bug where T2 (TIER-ASK) findings like git push and gh pr merge were silently blocked with exit 2 instead of surfacing Claude Code's permissionDecision: "ask" prompt. The fix adds a verdict field (allow | ask | deny) to check_tool_input's return alongside the existing safe bool, and wires the three hook gate functions to emit the permissionDecision JSON and exit 0 on verdict == "ask". The change is a non-breaking superset — all existing safe-only callers remain unchanged. Twelve new tests cover the pure-T2, pure-T3, mixed, safe, and deny-wins scenarios. One minor documentation gap noted.

No findings.

Noteworthy

Clean deny-wins invariant: _compute_verdict correctly collapses mixed TIER-ASK + non-ask findings to deny, tested via both synthetic unit test and end-to-end hook test (git push && echo done).
Backward-compatible design: adding verdict alongside existing safe keeps all 5+ non-hook callers unchanged.

Verdict: APPROVE.

Commands

Comment /ai-review or mention @axiomantic-momus[bot] to request a re-review of the latest changes.
Reply to a finding with won't fix, by design, or not a bug to decline it.
Reply with instead, ... to propose an alternative fix.

Cost: $0.11 - 211,208 in / 16,402 out tokens - deepseek/deepseek-v4-pro

Powered by Momus running deepseek/deepseek-v4-pro via openrouter.ai.

Remove redundant str() wraps on rule_id values; rule_ids are always string literals constructed inside the gate code. Aligns with the existing pattern elsewhere in the codebase (e.g. tests use f["rule_id"].startswith(...) directly). - hooks/spellbook_hook.py:363 (_emit_ask_and_exit TIER-ASK filter) - spellbook/gates/check.py:250 (_compute_verdict TIER-ASK check) Skipped the suggestion to remove the `if non_low` guard in _compute_verdict: the guard preserves correct behavior when the function is called directly with safe=False but an empty findings list (e.g. defensive callers, future tests). The redundancy gemini identified is real but the defensive check is cheap and intentional.

* fix(hooks): emit permissionDecision "ask" for T2 findings The PreToolUse hook collapsed T2 (TIER-ASK) findings into the same exit-2 deny path as T3 (TIER-DENY), so commands like git push and gh pr merge got silently blocked instead of surfacing Claude Code's yellow permission prompt. check_tool_input now also returns a verdict field with values allow, ask, or deny. This is a non-breaking superset: the existing safe boolean is unchanged for the 7 callers that read it. _gate_bash, _gate_spawn, and _gate_state_sanitize short-circuit on verdict == ask to emit hookSpecificOutput.permissionDecision = ask and exit 0, letting the harness render its native permission prompt. Mixed TIER-ASK plus non-ask findings still resolve to deny (deny-wins invariant). 12 new tests cover pure-T2, pure-T3, mixed, and safe input cases plus the deny-wins invariant. * feat(gates): opt-in to re-enable BASH-PARSER-COMPOUND deny The 0.63.2 compound-allow change deliberately widened the bash gate's allowlist so the L4 parser stopped emitting a CRITICAL deny on every pipe, double-ampersand, double-pipe, semicolon, and control-flow construct. Operators with stricter threat models can now restore the pre-0.63.2 policy via either of two opt-ins: - Environment variable SPELLBOOK_BASH_DENY_COMPOUND=1 (truthy values: 1, true, yes, case-insensitive, whitespace-tolerant). - Passing security_mode="paranoid" to spellbook.gates.bash_parser.parse_and_check (call-site control). Either path re-emits BASH-PARSER-COMPOUND for list/pipeline nodes AND for if/for/while/until/case/function control-flow constructs. With neither opt-in active, default behavior is unchanged from 0.63.2. Implementation: - New _compound_deny_enabled(security_mode) helper centralizes the truthy-env-var parse and the paranoid-mode check. - security_mode is threaded through private _walk, _classify_node, and _classify_compound. Public parse_and_check signature unchanged. - function added to the control-flow kind set in _classify_node for symmetry with the other control-flow kinds. - CHANGELOG Unreleased section documents the opt-in; the 0.63.2 entry gets a back-reference framing the prior change as a deliberate policy relaxation. Tests: - 7 new tests in TestCompoundDenyOptIn cover env-var truthy values, case/whitespace tolerance, paranoid mode, control-flow node coverage, and default-off behavior. - New autouse fixture _scrub_compound_deny_env prevents env-var pollution from breaking unrelated tests. - Full suite: 132 passed (with and without the env var set in the parent process). * chore(gates): apply gemini-code-assist suggestions from PR #285 Remove redundant str() wraps on rule_id values; rule_ids are always string literals constructed inside the gate code. Aligns with the existing pattern elsewhere in the codebase (e.g. tests use f["rule_id"].startswith(...) directly). - hooks/spellbook_hook.py:363 (_emit_ask_and_exit TIER-ASK filter) - spellbook/gates/check.py:250 (_compute_verdict TIER-ASK check) Skipped the suggestion to remove the `if non_low` guard in _compute_verdict: the guard preserves correct behavior when the function is called directly with safe=False but an empty findings list (e.g. defensive callers, future tests). The redundancy gemini identified is real but the defensive check is cheap and intentional. * fix(tests): route stint-hook gate-error read to stderr; fix stale docstring - tests/unit/test_stint_hooks.py: read proc.stderr instead of proc.stdout for gate-error JSON (matches the stderr routing introduced in 324cab5 but missed by that commit) - hooks/spellbook_hook.py: docstring at _gate_bash referenced stdout where the implementation already routes to stderr * fix(tests): reconcile fix-t2-ask-prompt verdict tests with compound-allow default The tests imported by the fix-t2-ask-prompt merge assumed BASH-PARSER-COMPOUND fires as CRITICAL on `git push ... && ...` style compounds — true before 0.63.2, but compound deny is now opt-in. The tests' real intent is to exercise the verdict computation with a CRITICAL finding alongside a T2 ask finding, regardless of which layer produces the CRITICAL. The merge also added a `verdict` field to `check_tool_input()` results without updating strict-equality assertions in `test_hooks_windows.py`, breaking 5 tests there. Updated: - test_check.py::test_critical_bashlex_finding_is_deny: opt into `SPELLBOOK_BASH_DENY_COMPOUND=1` via monkeypatch so the bashlex layer can produce a CRITICAL alongside the T2 TIER-ASK. - test_hooks.py::test_pure_t3_still_exits_2: switch error-JSON read from `proc.stdout` to `proc.stderr` (post-324cab5b stderr routing). - test_hooks.py::test_mixed_t2_and_critical_exits_2: same stderr switch, plus opt into compound deny via the `_run_bash_gate` `env_overrides` parameter. - test_hooks_windows.py: extend 5 strict-equality assertions (test_safe_bash_command_is_allowed, test_safe_spawn_prompt_is_allowed, test_injection_prompt_is_blocked, test_safe_workflow_state_is_allowed, test_injected_workflow_state_is_blocked) to include the new `verdict` field with the expected `allow` / `deny` value. * fix(gates): address gemini-code-assist review of PR #288 - HIGH: change parse_and_check default security_mode from "paranoid" to "standard" so direct callers get the same default behavior as the public check_tool_input API (compound allowed by default). Update docstring to reflect the new mode semantics. - MEDIUM: fix _classify_compound operator extraction to also inspect pipeline `pipe` nodes (previously pipelines always fell back to the literal "|" default) and deduplicate duplicate operators in the finding message. - MEDIUM: extract _handle_check_result helper from _gate_bash / _gate_spawn / _gate_state_sanitize to reduce duplication of the ask/deny result handling. * docs(agents): add Bash Gate navigation guide to AGENTS.spellbook.md Three real failure modes hit this session prompted explicit guidance: T2 ask gate that subagents cannot relay to the operator, bashlex parse errors on inline heredocs, and high-entropy detection on command-line payloads. Documents the layered pipeline, common block messages with mapped responses, the subagent escape hatch (surface to orchestrator, do not retry), and the heredoc-to-file workaround pattern. The content lands in AGENTS.spellbook.md so it propagates into every user's global CLAUDE.md, making the guidance authoritative for all spellbook-augmented harnesses (Claude Code, OpenCode, Codex, Gemini CLI, ForgeCode). * fix(hooks): filter LOW-severity findings from gate error reasons The _handle_check_result helper built the reasons string from all findings, but a block is only triggered by non-LOW severities. LOW findings (e.g., advisory rule_ids that should never block) added noise to the user-facing error message. Filter them out. Addresses gemini-code-assist review on PR #288. --------- Co-authored-by: elijahr <153711+elijahr@users.noreply.github.com>

gemini-code-assist Bot reviewed May 7, 2026

View reviewed changes

axiomantic-momus Bot approved these changes May 7, 2026

View reviewed changes

elijahr mentioned this pull request May 8, 2026

feat(gates): opt-in to re-enable BASH-PARSER-COMPOUND deny #288

Merged

2 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(hooks): emit permissionDecision "ask" for T2 findings#285

fix(hooks): emit permissionDecision "ask" for T2 findings#285
elijahr wants to merge 1 commit into
mainfrom
fix-t2-ask-prompt

elijahr commented May 7, 2026

Uh oh!

axiomantic-momus Bot commented May 7, 2026 •

edited

Loading

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

gemini-code-assist Bot May 7, 2026

Uh oh!

gemini-code-assist Bot May 7, 2026

Uh oh!

axiomantic-momus Bot left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

elijahr commented May 7, 2026

What does this PR do?

Related issue

Checklist

Uh oh!

axiomantic-momus Bot commented May 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot May 7, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 7, 2026

Choose a reason for hiding this comment

Uh oh!

axiomantic-momus Bot left a comment

Choose a reason for hiding this comment

Noteworthy

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

axiomantic-momus Bot commented May 7, 2026 •

edited

Loading