fix(chat): terminal sweep — resolve pending approvals on session end#837
fix(chat): terminal sweep — resolve pending approvals on session end#837swaroopvarma1 wants to merge 1 commit into
Conversation
Chat's idle-cleanup task ends a session but never resolved its still-PENDING tool_approval rows — they were left dangling until lazy expiry (a tool_use a late reload/audit would see unanswered, and a pending_approvals query would report as live on an ended session). Voice already does the equivalent eagerly via ApprovalManager.deny_all on disconnect / idle / conversation-end; chat had no terminal sweep — a documented divergence in the HITL deny-on- terminal policy. Add terminate_pending_approvals(session_id): atomically claim every pending row as EXPIRED (timeout result) and write the coalesced synthetic tool_result, so the dangling-tool_use invariant holds even on an ended session. Wire it into end_idle_chat_sessions right after end_chat_session, under the same per-session Redis lock (best-effort — a failure never undoes the end; the atomic claim makes a racing /approval simply lose). This is the deny-on-terminal slice of the HITL "one policy, two transports" unification: the terminal status differs from resolve_dangling_approvals only in EXPIRED (the session itself ended) vs SUPERSEDED (a mid-session new message). The remaining coordinator slices (when-to-gate shadow-gating; the supersede trigger keyed by tool_call_id) change working behavior and need a product call, so they are intentionally left out of this PR. Tests: tests/test_chat_approval_terminal_sweep.py. pyrefly 0 errors; full suite 443 passed. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
Warning Review limit reached
More reviews will be available in 13 minutes and 38 seconds. Learn how PR review limits work. Your organization has run out of usage credits. Purchase more credits in the billing tab to continue. ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (3)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Tara-ag
left a comment
There was a problem hiding this comment.
Review Summary
Files reviewed: 3
New issues: 0
Assessment
This PR successfully implements the chat-side "deny-on-terminal" policy to achieve parity with voice mode's ApprovalManager.deny_all behavior.
Changes Verified:
-
chat/approvals.py- Newterminate_pending_approvals()function:- Uses existing parameterized SQL queries (safe from injection)
- Properly marks pending approvals as
EXPIREDon session termination - Writes coalesced synthetic
tool_resultto maintain the dangling-tool_use invariant - Well-documented with clear explanation of the voice/chat divergence
-
chat/cleanup.py- Integration into idle session cleanup:- Called under the same per-session Redis lock (race-safe)
- Best-effort error handling (failure doesn't undo session end)
- Appropriate warning logging on failures
-
tests/test_chat_approval_terminal_sweep.py- Test coverage:- Validates EXPIRED status assignment
- Validates synthetic result row insertion
- Tests noop behavior when no pending approvals exist
Security & Quality Checks:
- ✅ SQL injection safe (uses
$1, $2parameterized queries via existing accessor layer) - ✅ No hardcoded secrets
- ✅ Proper async/await usage
- ✅ Consistent with existing codebase patterns
- ✅ Migration-safe (no existing migration files modified)
Approved - Clean implementation with appropriate test coverage.
What
The HITL deny-on-terminal policy slice of the "one approval policy, two transports" unification.
Chat's idle-cleanup task (
end_idle_chat_sessions) marks a session ENDED but never resolved its still-PENDINGtool_approvalrows — they were left dangling until lazy expiry. Voice already does the equivalent eagerly (ApprovalManager.deny_allon disconnect / idle / conversation-end); chat had no terminal sweep. That's a documented chat↔voice divergence (the deny-on-terminal decision differs between the two transports).How
chat/approvals.py—terminate_pending_approvals(session_id): atomically claims every pending row as EXPIRED (timeout result) and writes the coalesced synthetictool_result, so the dangling-tool_useinvariant holds even on an ended session. It differs fromresolve_dangling_approvals(only_expired=False)only in the terminal status: EXPIRED (the session itself ended) vs SUPERSEDED (a mid-session new message).chat/cleanup.py— calls it right afterend_chat_session, under the same per-session Redis lock, best-effort (a failure never undoes the end; the atomic claim makes a racing/approvalsimply lose the CAS and 409).Why this slice (and not the rest of the coordinator)
The HITL policy is already half-unified —
gate_calland the status vocabulary (approval.py:36-45) are shared. Of the three forked decisions, this PR fixes the cleanest one. The other two change working behavior and need a product call, so they're deliberately out of scope:tool_call_idthreading (findings doc Prompt enhancememts #5).Verification
tests/test_chat_approval_terminal_sweep.py(2) — claims-all-as-EXPIRED + writes the synthetic result; no-op when nothing pending.