fix(chat): terminal sweep — resolve pending approvals on session end #837

Open
swaroopvarma1 wants to merge 1 commit into release from fix/chat-approval-terminal-sweep

@swaroopvarma1

Collaborator

What

The HITL deny-on-terminal policy slice of the "one approval policy, two transports" unification.

Chat's idle-cleanup task (end_idle_chat_sessions) marks a session ENDED but never resolved its still-PENDING tool_approval rows — they were left dangling until lazy expiry. Voice already does the equivalent eagerly (ApprovalManager.deny_all on disconnect / idle / conversation-end); chat had no terminal sweep. That's a documented chat↔voice divergence (the deny-on-terminal decision differs between the two transports).

How

  • chat/approvals.py adds terminate_pending_approvals(session_id), which atomically claims every pending row as EXPIRED (timeout result) and writes the coalesced synthetic tool_result, so the dangling-tool_use invariant holds even on an ended session. It differs from resolve_dangling_approvals(only_expired=False) only in the terminal status: EXPIRED (the session itself ended) vs SUPERSEDED (a mid-session new message).
  • chat/cleanup.py — calls it right after end_chat_session, under the same per-session Redis lock, best-effort (a failure never undoes the end; the atomic claim makes a racing /approval simply lose the CAS and 409).
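
The atomic claim described above can be sketched as a compare-and-set per row. This is a minimal illustration only, not the PR's code: it uses the standard-library sqlite3 module in place of the real accessor layer, and the table layout, the `messages` table, the `claim` helper, and the JSON shape of the coalesced result are all assumptions made for the example.

```python
import json
import sqlite3


def make_db() -> sqlite3.Connection:
    """Hypothetical schema, just large enough to show the claim."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE tool_approval ("
        "tool_call_id TEXT PRIMARY KEY, session_id TEXT, status TEXT)"
    )
    db.execute("CREATE TABLE messages (session_id TEXT, role TEXT, content TEXT)")
    return db


def claim(db: sqlite3.Connection, tool_call_id: str, new_status: str) -> bool:
    """Compare-and-set: succeeds only while the row is still PENDING."""
    cur = db.execute(
        "UPDATE tool_approval SET status = ? "
        "WHERE tool_call_id = ? AND status = 'PENDING'",
        (new_status, tool_call_id),
    )
    return cur.rowcount == 1


def terminate_pending_approvals(db: sqlite3.Connection, session_id: str) -> list:
    """Claim every pending row as EXPIRED, then write one coalesced result."""
    pending = [
        row[0]
        for row in db.execute(
            "SELECT tool_call_id FROM tool_approval "
            "WHERE session_id = ? AND status = 'PENDING' ORDER BY tool_call_id",
            (session_id,),
        )
    ]
    # Only rows this caller actually claimed are answered; a row that a
    # concurrent /approval resolved first is skipped.
    claimed = [tc for tc in pending if claim(db, tc, "EXPIRED")]
    if claimed:
        content = json.dumps(
            [{"tool_call_id": tc, "result": "timeout"} for tc in claimed]
        )
        db.execute(
            "INSERT INTO messages VALUES (?, 'tool_result', ?)",
            (session_id, content),
        )
    db.commit()
    return claimed
```

Once the sweep has run, a late `claim(db, "tc1", "APPROVED")` returns `False` because the row is no longer PENDING. That is the point at which the real /approval endpoint would respond with 409.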

Why this slice (and not the rest of the coordinator)

The HITL policy is already half-unified — gate_call and the status vocabulary (approval.py:36-45) are shared. Of the three forked decisions, this PR fixes the cleanest one. The other two change working behavior and need a product call, so they're deliberately out of scope:

  • when-to-gate / shadow-gating — a gated global shadowed by a same-named per-node function is gated by chat (name-keyed) but not voice; picking one rule is a behavior change with an ambiguous "intended" answer.
  • supersede trigger — voice keys on duplicate-same-function, chat on new-user-message; unifying needs the tool_call_id threading (findings doc Prompt enhancements #5).
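
To make the shadow-gating divergence concrete, here is a small sketch of two gating rules that disagree on a shadowed name. The `Tool` type and the functions `resolve`, `gate_by_name`, and `gate_by_resolved` are hypothetical names invented for this illustration. The "resolved" rule is only one plausible way a transport could end up not gating the shadowing function; it is not taken from the voice code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    name: str
    scope: str  # "global" or "node"
    requires_approval: bool = False


def resolve(name, node_tools, global_tools):
    """Per-node definitions shadow same-named globals."""
    return node_tools.get(name) or global_tools.get(name)


def gate_by_name(name, global_tools) -> bool:
    """Name-keyed rule: gate whenever a gated global has this name."""
    tool = global_tools.get(name)
    return bool(tool and tool.requires_approval)


def gate_by_resolved(name, node_tools, global_tools) -> bool:
    """Assumed alternative: gate only if the function that would run is gated."""
    tool = resolve(name, node_tools, global_tools)
    return bool(tool and tool.requires_approval)


global_tools = {"refund": Tool("refund", "global", requires_approval=True)}
node_tools = {"refund": Tool("refund", "node")}  # shadows the gated global

# The two rules disagree only on the shadowed name.
print(gate_by_name("refund", global_tools))
print(gate_by_resolved("refund", node_tools, global_tools))
```

With no per-node override both rules agree, so unifying the transports means choosing which answer is intended for the shadowed case.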

Verification

  • tests/test_chat_approval_terminal_sweep.py (2) — claims-all-as-EXPIRED + writes the synthetic result; no-op when nothing pending.
  • pyrefly 0 errors; full suite 443 passed; field_reference coverage green.

Chat's idle-cleanup task ends a session but never resolved its still-PENDING
tool_approval rows — they were left dangling until lazy expiry (a tool_use a
late reload/audit would see unanswered, and a pending_approvals query would
report as live on an ended session). Voice already does the equivalent
eagerly via ApprovalManager.deny_all on disconnect / idle / conversation-end;
chat had no terminal sweep — a documented divergence in the HITL deny-on-
terminal policy.

Add terminate_pending_approvals(session_id): atomically claim every pending
row as EXPIRED (timeout result) and write the coalesced synthetic tool_result,
so the dangling-tool_use invariant holds even on an ended session. Wire it
into end_idle_chat_sessions right after end_chat_session, under the same
per-session Redis lock (best-effort — a failure never undoes the end; the
atomic claim makes a racing /approval simply lose).
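
A rough sketch of that wiring, under stated assumptions: an asyncio.Lock per session stands in for the per-session Redis lock, the two callables are passed in rather than imported, and the wrapper name end_idle_session and the logger name are hypothetical. It only illustrates the best-effort ordering (end first, sweep second, and a sweep failure is logged, not raised).

```python
import asyncio
import logging
from collections import defaultdict

logger = logging.getLogger("chat.cleanup")

# Stand-in for the per-session Redis lock (assumption for this sketch).
_session_locks = defaultdict(asyncio.Lock)


async def end_idle_session(session_id, end_chat_session, terminate_pending_approvals) -> bool:
    """End the session, then sweep its pending approvals under the same lock.

    Returns True if the sweep succeeded, False if it failed. The session
    stays ended either way.
    """
    async with _session_locks[session_id]:
        await end_chat_session(session_id)
        try:
            await terminate_pending_approvals(session_id)
        except Exception:
            # Best-effort: a sweep failure never undoes the end.
            logger.warning(
                "terminal approval sweep failed for session %s",
                session_id,
                exc_info=True,
            )
            return False
    return True
```

Rows left behind by a failed sweep would still fall back to the lazy expiry path described earlier.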

This is the deny-on-terminal slice of the HITL "one policy, two transports"
unification: the terminal status differs from resolve_dangling_approvals only
in EXPIRED (the session itself ended) vs SUPERSEDED (a mid-session new
message). The remaining coordinator slices (when-to-gate shadow-gating; the
supersede trigger keyed by tool_call_id) change working behavior and need a
product call, so they are intentionally left out of this PR.

Tests: tests/test_chat_approval_terminal_sweep.py. pyrefly 0 errors; full
suite 443 passed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Jun 16, 2026


Warning

Review limit reached

@swaroopvarma1, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 13 minutes and 38 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more credits in the billing tab to continue.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 07a6cfa2-536e-4033-bcef-05113ec5e65c

📥 Commits

Reviewing files that changed from the base of the PR and between 58d3fbc and 6353337.

📒 Files selected for processing (3)
  • app/ai/voice/agents/breeze_buddy/chat/approvals.py
  • app/ai/voice/agents/breeze_buddy/chat/cleanup.py
  • tests/test_chat_approval_terminal_sweep.py

@Tara-ag Tara-ag left a comment

Collaborator

Review Summary

Files reviewed: 3
New issues: 0

Assessment

This PR successfully implements the chat-side "deny-on-terminal" policy to achieve parity with voice mode's ApprovalManager.deny_all behavior.

Changes Verified:

  1. chat/approvals.py - New terminate_pending_approvals() function:

    • Uses existing parameterized SQL queries (safe from injection)
    • Properly marks pending approvals as EXPIRED on session termination
    • Writes coalesced synthetic tool_result to maintain the dangling-tool_use invariant
    • Well-documented with clear explanation of the voice/chat divergence
  2. chat/cleanup.py - Integration into idle session cleanup:

    • Called under the same per-session Redis lock (race-safe)
    • Best-effort error handling (failure doesn't undo session end)
    • Appropriate warning logging on failures
  3. tests/test_chat_approval_terminal_sweep.py - Test coverage:

    • Validates EXPIRED status assignment
    • Validates synthetic result row insertion
    • Tests no-op behavior when no pending approvals exist

Security & Quality Checks:

  • ✅ SQL injection safe (uses $1, $2 parameterized queries via existing accessor layer)
  • ✅ No hardcoded secrets
  • ✅ Proper async/await usage
  • ✅ Consistent with existing codebase patterns
  • ✅ Migration-safe (no existing migration files modified)

Approved - Clean implementation with appropriate test coverage.

3 participants