Skip to content

DRAFT: feat(hitl): Add Human-In-The-Loop support for agent tool approval (#336)#682

Draft
ryaneggz wants to merge 3 commits into
developmentfrom
feat/336-add-human-in-the-loop
Draft

DRAFT: feat(hitl): Add Human-In-The-Loop support for agent tool approval (#336)#682
ryaneggz wants to merge 3 commits into
developmentfrom
feat/336-add-human-in-the-loop

Conversation

@ryaneggz

@ryaneggz ryaneggz commented Jan 16, 2026

Copy link
Copy Markdown
Collaborator

Summary

  • Implements Human-In-The-Loop (HITL) workflow for agent tool approval
  • Users can now approve, edit, reject, or respond to agent tool calls before execution
  • Adds comprehensive frontend UI for interrupt handling and agent HITL configuration

Backend Changes

  • Interrupt Schemas (backend/src/schemas/entities/interrupt.py): New types for InterruptPayload, InterruptDecision, ActionRequest, ResumeRequest, and InterruptResponse
  • Interrupt Service (backend/src/services/interrupt.py): Service for managing HITL state and decision processing
  • Stream Detection (backend/src/utils/stream.py): Enhanced streaming to detect __interrupt__ events from LangGraph
  • Tool Utils (backend/src/utils/tools.py): Added interrupt configuration building
  • Thread Routes (backend/src/routes/v0/thread.py):
    • POST /threads/{thread_id}/resume - Resume execution after interrupt decision
    • GET /threads/{thread_id}/interrupts - Check pending interrupt status

Frontend Changes

  • InterruptApprovalDialog (frontend/src/components/modals/InterruptApprovalDialog.tsx): Modal dialog with tabbed interface for approve/edit/reject/respond actions
  • useInterrupt Hook (frontend/src/hooks/useInterrupt.ts): Hook for managing interrupt state
  • ChatContext Integration: Added interrupt state management to chat context
  • ChatPanel Integration: Integrated interrupt dialog into main chat interface
  • Agent Form HITL Config (frontend/src/components/forms/agents/agent-create-form.tsx):
    • Enable/disable HITL toggle
    • Tools requiring approval selection
    • Timeout configuration
    • Default action selection

Test plan

  • TypeScript compilation passes (npx tsc --noEmit)
  • All frontend tests pass (163 passed, 3 skipped)
  • Manual test: Create agent with HITL enabled
  • Manual test: Trigger tool call and verify interrupt dialog appears
  • Manual test: Test approve/edit/reject/respond actions
  • Manual test: Verify agent resumes correctly after decision

Related Issues

Closes #336


πŸ€– Generated with Claude Code

Summary by CodeRabbit

  • New Features
    • Human-In-The-Loop (HITL) agent settings in agent creation/edit forms.
    • In-chat interrupt approval dialog with Approve/Edit/Reject/Respond options and editable tool arguments.
    • Resume and fetch-pending endpoints to apply user decisions and view pending interrupts.
  • UX
    • Notifications and dialog flow for streaming interrupts with countdown/timeout and default-action handling.
  • Types/Integrations
    • Frontend hooks and services to manage pending interrupts and submit decisions.

✏️ Tip: You can customize this high-level summary in your review settings.

ryaneggz and others added 2 commits January 16, 2026 13:46
Implements HITL workflow allowing users to approve, edit, or reject
agent tool calls before execution. Key features:

Backend:
- Add interrupt schemas (InterruptPayload, InterruptDecision, etc.)
- Add interrupt service for managing HITL state
- Detect interrupts in streaming responses
- Add resume API endpoint (POST /threads/{id}/resume)
- Add interrupt status endpoint (GET /threads/{id}/interrupts)

Frontend:
- Add InterruptApprovalDialog component with approve/edit/reject/respond
- Add useInterrupt hook for managing interrupt state
- Integrate interrupt handling into ChatContext
- Add HITL configuration UI to agent create/edit forms
- Add InterruptConfig type to Agent service

Closes #336

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Jan 16, 2026

Copy link
Copy Markdown
Contributor
πŸ“ Walkthrough

Walkthrough

Adds a Human‑In‑The‑Loop (HITL) design and implementation: new backend interrupt schemas and service, streaming interrupt detection and SSE emission, resume and interrupts endpoints, tool wrapper validation, frontend types/hooks/UI for interrupt approval, and multiple specification documents and tasks describing rollout, security, and optimization.

Changes

Cohort / File(s) Summary
Spec & Planning
.claude/specs/feat-336-human-in-the-loop/*
New proposal, craftsman, guardian, integrator, optimizer, tasks, review, user stories documents describing architecture, security, phased plan, risks and acceptance criteria.
Backend β€” Schemas
Interrupt schemas: backend/src/schemas/entities/interrupt.py, backend/src/schemas/entities/__init__.py
LLM/Assistant updates: backend/src/schemas/entities/llm.py
New Pydantic models/enums (InterruptTrigger, DecisionType, InterruptStatus, InterruptConfig, InterruptRequest/Response/EventData/Interrupt/List). Assistant/LLMRequest/LLMInput gain optional hitl / alias interrupt_config; re-exports added.
Backend β€” Service & Logic
backend/src/services/interrupt.py, backend/src/utils/tools.py
New InterruptService with strategy handlers (Approve/Edit/Reject/Respond), nonce/replay and ownership checks, arg validation, lifecycle management, should_interrupt_tool util; added InterruptValidationError and related exceptions; tool wrapper updated to validate edited args and call async tool invocation.
Backend β€” Streaming & Workers
backend/src/utils/stream.py, backend/src/workers/tasks.py, backend/src/controllers/llm.py
stream_generator signature extended with hitl_config; helpers create_interrupt_event/check_for_interrupt and DEFAULT_HITL_TIMEOUT_SECONDS added; run_agent_stream detects/langgraph Interrupts and publishes interrupt payloads; llm_stream passes assistant.hitl into streaming.
Backend β€” Routes / Resume
backend/src/routes/v0/thread.py
New endpoints: POST /threads/{thread_id}/resume (handle decision, reconstruct graph, error mappings) and GET /threads/{thread_id}/interrupts (list pending), with authorization and error handling.
Backend β€” Middleware
backend/src/utils/middleware.py
Guard added when updating last_msg.model to handle None runtime.context during resume paths.
Frontend β€” Types & Entities
frontend/src/lib/entities/interrupt.ts, frontend/src/lib/entities/index.ts, frontend/src/lib/entities/stream.ts
New TS types/interfaces and helpers for interrupts (DecisionType, InterruptStatus, InterruptTrigger, InterruptConfig, InterruptRequest/Response/EventData/Interrupt/List, utils), SSE interrupt event type added; re-exports updated.
Frontend β€” Services
frontend/src/lib/services/threadService.ts, frontend/src/lib/services/agentService.ts
Added resumeThread and getThreadInterrupts client functions; Agent type extended with optional interrupt_config/InterruptConfig.
Frontend β€” Hooks / Context
frontend/src/hooks/useInterrupt.ts, frontend/src/hooks/useChat.ts, frontend/src/context/ChatContext.tsx
New useInterrupt hook to manage pending interrupts, decision submission, fetch pending interrupts; useChat and ChatContext gain setOnInterrupt handler wiring and expose interrupt-related context/actions.
Frontend β€” UI
frontend/src/components/modals/InterruptApprovalDialog.tsx, frontend/src/components/forms/agents/agent-create-form.tsx, frontend/src/pages/chat/ChatPanel.tsx
New InterruptApprovalDialog modal (Approve/Edit/Reject/Respond tabs, JSON arg editing, countdown); agent create/edit form gains HITL configuration UI; ChatPanel integrates dialog and handlers.
Misc Frontend
frontend/src/lib/utils/fetchStreamReader.ts, frontend/src/components/panels/FileEditorPanel.tsx, minor files
fetchStreamReader now recognizes interrupt SSE events; small formatting tweaks in few components.
Changelog
Changelog.md
Adds entry for feat/336-add-human-in-the-loop.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant Frontend
    participant SSE as Streaming/SSE
    participant Backend
    participant LG as LangGraph
    participant DB as PersistentState

    User->>Frontend: Submit chat (agent with hitl)
    Frontend->>Backend: POST /threads/{id} (create/run)
    Backend->>LG: Execute agent (with interrupt_on)
    LG-->>Backend: Emits Interrupt marker (__interrupt__)
    Backend->>DB: create_interrupt (id, nonce, timeout)
    Backend->>SSE: Emit SSE "interrupt" (InterruptEventData)
    SSE-->>Frontend: Deliver InterruptEventData
    Frontend->>Frontend: set pendingInterrupt & show dialog
    User->>Frontend: Choose decision (approve/edit/reject/respond)
    Frontend->>Backend: POST /threads/{id}/resume (decision + nonce)
    Backend->>Backend: InterruptService.handle_decision (validate ownership/nonce/args)
    alt edit
        Backend->>Backend: validate edited args against tool schema
    end
    Backend->>LG: Command(resume=...) with decisions
    LG->>LG: Resume graph execution
    LG-->>Backend: Final result stream
    Backend-->>Frontend: Stream final response
    Frontend->>User: Display final output
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Poem

πŸ‡ I paused the tools with a careful hop,
A nonce in paw, I held the stop,
Approve or edit, decide with cheer,
Resume the graph β€” the path is clear.
Together we nudge the agent on its way.

πŸš₯ Pre-merge checks | βœ… 4 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 69.35% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
βœ… Passed checks (4 passed)
Check name Status Explanation
Description Check βœ… Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check βœ… Passed The PR title clearly identifies the main feature (HITL) and its primary purpose (agent tool approval), matching the substantial changes documented across backend and frontend components.
Linked Issues check βœ… Passed The PR implements the core HITL requirements from issue #336: interrupt schemas, service layer, streaming integration, resume endpoints, frontend UI, and agent HITL configuration support.
Out of Scope Changes check βœ… Passed All changes are aligned with HITL feature scope. Minor formatting tweaks in unrelated files (FileEditorPanel, useProject, shareService, ProjectPage, SharedThreadPage) appear to be incidental but do not constitute out-of-scope functional changes.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • πŸ“ Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❀️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 11

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
frontend/src/pages/chat/ChatPanel.tsx (1)

18-45: Handle submitDecision errors to avoid unhandled promise rejections.
submitDecision throws on failure; without a guard this can surface as unhandled rejections and leave the dialog in a confusing state. Consider catching and surfacing the hook’s error state.

πŸ› οΈ Suggested guard
-const handleInterruptDecision = async (request: Parameters<typeof submitDecision>[0]) => {
-	await submitDecision(request);
-};
+const handleInterruptDecision = async (request: Parameters<typeof submitDecision>[0]) => {
+	try {
+		await submitDecision(request);
+	} catch {
+		// error state is already set in the hook; keep dialog open or surface a toast
+	}
+};
backend/src/utils/stream.py (1)

245-255: Wire HITL interrupt detection into the stream loop.

The hitl_config parameter is accepted but never used. The helper functions check_for_interrupt() and create_interrupt_event() are defined but never called within the stream loop (lines 301–319). As a result, __interrupt__ chunks are never emitted to the frontend. Either integrate the interrupt detection logic or remove the unused parameter.

πŸ€– Fix all issues with AI agents
In @.claude/specs/feat-336-human-in-the-loop/PROPOSAL_CRAFTSMAN.md:
- Around line 55-63: The proposal references a non-existent folder/component
name; update the paths/names in PROPOSAL_CRAFTSMAN.md to match the implemented
components: replace references to components/modals/InterruptModal/ with
components/modals/InterruptApprovalDialog.tsx and ensure mentions of
useInterrupt.ts and lib/entities/interrupt.ts (or interrupt.ts) are accurate;
specifically, change the directory entry to reflect
frontend/src/components/modals/InterruptApprovalDialog.tsx, keep
frontend/src/hooks/useInterrupt.ts, and frontend/src/lib/entities/interrupt.ts
so the doc aligns with the actual component and hook names.

In @.claude/specs/feat-336-human-in-the-loop/PROPOSAL_OPTIMIZER.md:
- Around line 55-83: The fenced architecture diagram labeled "Architecture:
Checkpoint-First, Connection-Free Waits" is missing a language tag on its
triple-backtick block; update the opening fence to include a language identifier
(for example use ```text) so the code block becomes ```text ... ``` in
PROPOSAL_OPTIMIZER.md, ensuring the diagram retains exact content but adds the
language specifier for markdown linter and renderer consistency.

In `@backend/src/routes/v0/thread.py`:
- Around line 467-483: The code creates a ServiceContext (service_context via
ServiceContext(user_id=user.id, store=store, checkpointer=checkpointer)) that is
never used and the except block re-raises the HTTPException without chaining the
original exception; either remove the unused service_context creation (the three
lines that call get_checkpoint_db and instantiate ServiceContext) or actually
pass service_context into the call that needs it (e.g., into InterruptService or
whatever API should consume it), and update the except block to preserve the
original traceback by using exception chaining when raising the HTTPException
(raise HTTPException(... ) from e) so the original error is retained.
- Around line 410-441: Remove the unused ServiceContext instantiation (the
three-line block creating service_context using ServiceContext(user_id=user.id,
store=store, checkpointer=checkpointer)) or replace it with a one-line TODO
comment indicating future graph-resume usage, and update all raised
HTTPException/raise statements in this try/except block (the handlers for
InterruptNotFoundError, InterruptExpiredError, InterruptAuthorizationError,
InterruptValidationError and the generic Exception) to use exception chaining
(e.g., raise HTTPException(... ) from e) so the original exception context is
preserved; locate these changes around the logic that opens get_checkpoint_db(),
calls InterruptService(user_id=user.id).handle_decision(request), and returns
InterruptResponse(...).

In `@backend/src/services/interrupt.py`:
- Around line 188-207: The class currently defines mutable caches
_pending_interrupts and _used_nonces at class scope, causing cross-user data
leakage; move these to instance scope by initializing self._pending_interrupts =
{} and self._used_nonces = set() inside the __init__ of the InterruptService (or
the class containing _handlers and __init__), remove or convert the class-level
mutable definitions, and update all internal references to use
self._pending_interrupts and self._used_nonces (leave _handlers as a class-level
immutable mapping if desired). Ensure any methods that previously referenced the
class attributes now reference the instance attributes to restore per-user
isolation.
- Around line 281-287: Update validate_nonce to require the Interrupt object and
a nonce (make nonce mandatory) and verify both that the provided nonce equals
interrupt.nonce and that it hasn't been used before: change the signature from
validate_nonce(self, interrupt_id: str, nonce: str) to validate_nonce(self,
interrupt: Interrupt, nonce: str), first return False if nonce is falsy, then
check nonce == interrupt.nonce, then check the replay set self._used_nonces
using a key like f"{interrupt.id}:{nonce}" (or reuse existing key pattern) and
add it after a successful check; update any callers that passed interrupt_id to
pass the Interrupt instance instead.
- Around line 387-389: Replace the fragile enum lookup with an explicit mapping
from DecisionType to InterruptStatus: add RESPONDED = "responded" to the
InterruptStatus enum (in interrupt.py) and create a mapping dict (e.g.
DECISION_TO_STATUS) that maps DecisionType.APPROVE->InterruptStatus.APPROVED,
DecisionType.EDIT->InterruptStatus.EDITED,
DecisionType.REJECT->InterruptStatus.REJECTED,
DecisionType.RESPOND->InterruptStatus.RESPONDED; then set interrupt.status =
DECISION_TO_STATUS[request.action] (or .value as appropriate) and keep setting
interrupt.resolved_at as before to avoid KeyError from name mismatches.

In `@backend/src/utils/tools.py`:
- Around line 76-86: Replace the broad except Exception with an explicit import
and catch of pydantic.ValidationError when validating edited_args with
tool.args_schema, log the failure using logger.error(..., exc_info=True) to
include the traceback, and re-raise InterruptValidationError using "raise
InterruptValidationError(... ) from e" to preserve the exception chain; update
the block around tool.args_schema, edited_args, validated_args, logger.error,
and InterruptValidationError to implement these changes and ensure only
validation errors are handled here.

In `@frontend/src/hooks/useInterrupt.ts`:
- Around line 2-10: The import list in useInterrupt.ts includes an unused symbol
InterruptActions which triggers a TS6133 error; remove InterruptActions from the
import statement that currently imports Interrupt, InterruptEventData,
InterruptRequest, InterruptResponse, InterruptState, InterruptActions,
InterruptContextType so the file only imports the symbols actually used (e.g.,
Interrupt, InterruptEventData, InterruptRequest, InterruptResponse,
InterruptState, InterruptContextType) and then re-run the TypeScript build to
confirm the unused import error is resolved.
- Around line 54-72: The SSE event handler is creating a client-side interrupt
id and nonce which mismatch the backend; update the InterruptEventData type to
include interrupt_id and nonce, then in handleInterruptEvent stop generating
interruptId and crypto.randomUUID() and instead set interrupt.id =
data.interrupt_id and interrupt.nonce = data.nonce (replace the interruptId
variable and the crypto.randomUUID() usage). Keep the other fields (e.g.,
expires_at = data.timeout_at) intact so the created Interrupt object matches the
server record for resume lookups.

In `@frontend/src/lib/entities/interrupt.ts`:
- Around line 74-84: The InterruptEventData interface is missing interrupt_id
and nonce, and the hook useInterrupt is fabricating those values which breaks
resume matching; add interrupt_id: string and nonce: string to the
InterruptEventData interface and update the useInterrupt hook to use the values
from the SSE payload (data.interrupt_id instead of generating an id, and
data.nonce instead of creating a new nonce) where the hook currently fabricates
them (references: InterruptEventData interface and the useInterrupt hook's event
handling logic).
🧹 Nitpick comments (9)
frontend/src/lib/services/agentService.ts (1)

4-12: Duplicate type definition β€” consider importing from @/lib/entities/interrupt.ts.

The InterruptConfig type is already defined identically in frontend/src/lib/entities/interrupt.ts (lines 26-34). Duplicating it here violates DRY and creates a maintenance burden if the type needs to change.

♻️ Suggested refactor
 import apiClient from "@/lib/utils/apiClient";
 import { Schedule, ScheduleCreate } from "@/lib/entities/schedule";
+import { InterruptConfig } from "@/lib/entities/interrupt";

-export type InterruptConfig = {
-	enabled: boolean;
-	tools_requiring_approval: string[];
-	timeout_seconds: number;
-	default_action: "approve" | "reject" | "timeout";
-	allow_approve: boolean;
-	allow_edit: boolean;
-	allow_reject: boolean;
-};
-
+export type { InterruptConfig };

As per coding guidelines, use @/ alias for src imports.

frontend/src/components/forms/agents/agent-create-form.tsx (3)

156-167: Consider making allow_* options configurable or documenting the hardcoded values.

The allow_approve, allow_edit, and allow_reject flags are hardcoded to true. If this is intentional to simplify the UI, consider adding a comment explaining this design decision. Otherwise, these could be exposed as advanced configuration options in the future.

The fallback pattern using || [] and || 300 is appropriate for ensuring valid defaults.


911-944: Consider memoizing tool selection logic to reduce re-renders.

The form.watch("hitl_tools") on line 913 is called for each tool in the loop, which may cause performance issues with many tools. Consider extracting the watched value outside the map callback.

♻️ Suggested optimization
+ const hitlTools = form.watch("hitl_tools") || [];
  {agent.tools.map((tool: string) => {
-   const hitlTools = form.watch("hitl_tools") || [];
    const isSelected = hitlTools.includes(tool);
    return (
      // ... rest of the component
    );
  })}

Move the form.watch("hitl_tools") call outside the .map() to avoid redundant subscriptions.


982-984: Add radix parameter to parseInt for safety.

While modern browsers default to base 10, explicitly specifying the radix is a best practice to avoid potential edge cases.

♻️ Suggested fix
  onChange={(e) =>
-   field.onChange(parseInt(e.target.value) || 300)
+   field.onChange(parseInt(e.target.value, 10) || 300)
  }
.claude/specs/feat-336-human-in-the-loop/REVIEW.md (1)

86-132: Add language specifier to the architecture diagram code block.

The ASCII diagram fenced code block lacks a language identifier, which triggers markdown lint warnings. Use text or plaintext for better tooling support.

♻️ Suggested fix
-```
+```text
 β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
 β”‚                    UNIFIED HITL ARCHITECTURE                         β”‚
backend/src/utils/tools.py (1)

23-27: Consolidate InterruptValidationError definition.
There’s already an InterruptValidationError in backend/src/services/interrupt.py. Consider centralizing (and importing) a single definition to avoid mismatched catches across modules.

frontend/src/components/modals/InterruptApprovalDialog.tsx (1)

85-102: Consider stabilizing onClose dependency.

The timer effect depends on onClose. If the parent component doesn't memoize onClose, this effect will reset on every render, potentially causing flickering or missed intervals. Ensure the parent passes a stable callback (via useCallback).

Verify that the parent component (likely ChatPanel) wraps onClose in useCallback to prevent unnecessary effect re-runs.

frontend/src/hooks/useInterrupt.ts (1)

190-240: Consider threading the interrupt nonce into decision helpers.

If the backend enforces nonce matching, generating a fresh nonce here will fail. Prefer accepting the interrupt’s nonce and passing it through.

♻️ Example pattern
-export function createApproveRequest(interruptId: string): InterruptRequest {
+export function createApproveRequest(
+	interruptId: string,
+	nonce: string,
+): InterruptRequest {
 	return {
 		interrupt_id: interruptId,
 		action: "approve",
-		nonce: crypto.randomUUID(),
+		nonce,
 	};
 }

-export function createEditRequest(
-	interruptId: string,
-	editedArgs: Record<string, unknown>,
-): InterruptRequest {
+export function createEditRequest(
+	interruptId: string,
+	editedArgs: Record<string, unknown>,
+	nonce: string,
+): InterruptRequest {
 	return {
 		interrupt_id: interruptId,
 		action: "edit",
 		edited_args: editedArgs,
-		nonce: crypto.randomUUID(),
+		nonce,
 	};
 }

-export function createRejectRequest(
-	interruptId: string,
-	reason?: string,
-): InterruptRequest {
+export function createRejectRequest(
+	interruptId: string,
+	reason: string | undefined,
+	nonce: string,
+): InterruptRequest {
 	return {
 		interrupt_id: interruptId,
 		action: "reject",
 		reason: reason || "User rejected the tool call",
-		nonce: crypto.randomUUID(),
+		nonce,
 	};
 }

-export function createRespondRequest(
-	interruptId: string,
-	feedback: string,
-): InterruptRequest {
+export function createRespondRequest(
+	interruptId: string,
+	feedback: string,
+	nonce: string,
+): InterruptRequest {
 	return {
 		interrupt_id: interruptId,
 		action: "respond",
 		reason: feedback,
-		nonce: crypto.randomUUID(),
+		nonce,
 	};
 }
backend/src/schemas/entities/interrupt.py (1)

12-12: Add explicit type hints to the validator.

Type hints are missing on the validator signature and are required.

♻️ Proposed fix
-from pydantic import BaseModel, Field, field_validator
+from pydantic import BaseModel, Field, ValidationInfo, field_validator
@@
-    def validate_edited_args_required_for_edit(cls, v, info):
+    def validate_edited_args_required_for_edit(
+        cls, v: Optional[Dict[str, Any]], info: ValidationInfo
+    ) -> Optional[Dict[str, Any]]:
         """Ensure edited_args is provided when action is 'edit'."""
         if info.data.get("action") == DecisionType.EDIT and not v:
             raise ValueError("edited_args is required when action is 'edit'")
         return v
As per coding guidelines, please add explicit type hints.

Also applies to: 101-107

Comment on lines +55 to +63
frontend/
β”œβ”€β”€ src/
β”‚ β”œβ”€β”€ components/modals/
β”‚ β”‚ └── InterruptModal/ # NEW: HITL decision UI
β”‚ β”œβ”€β”€ hooks/
β”‚ β”‚ └── useInterrupt.ts # NEW: Interrupt state
β”‚ └── lib/entities/
β”‚ └── interrupt.ts # NEW: TypeScript interfaces
```

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟑 Minor

Update proposal paths to match current component names.
The doc references components/modals/InterruptModal/, but the implementation uses frontend/src/components/modals/InterruptApprovalDialog.tsx. Aligning names avoids confusion.

πŸ€– Prompt for AI Agents
In @.claude/specs/feat-336-human-in-the-loop/PROPOSAL_CRAFTSMAN.md around lines
55 - 63, The proposal references a non-existent folder/component name; update
the paths/names in PROPOSAL_CRAFTSMAN.md to match the implemented components:
replace references to components/modals/InterruptModal/ with
components/modals/InterruptApprovalDialog.tsx and ensure mentions of
useInterrupt.ts and lib/entities/interrupt.ts (or interrupt.ts) are accurate;
specifically, change the directory entry to reflect
frontend/src/components/modals/InterruptApprovalDialog.tsx, keep
frontend/src/hooks/useInterrupt.ts, and frontend/src/lib/entities/interrupt.ts
so the doc aligns with the actual component and hook names.

Comment on lines +55 to +83
**Architecture: Checkpoint-First, Connection-Free Waits**

```
Request β†’ Agent Run β†’ Interrupt Detected
β”‚
β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ 1. Checkpoint State to Postgres β”‚
β”‚ 2. Write Interrupt Metadata to Redis │◄── Fast Path
β”‚ 3. Release DB Connection β”‚
β”‚ 4. Return SSE: {"interrupt": {...}} β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β”‚
β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Client Polls: GET /threads/{id}/status β”‚
β”‚ - No DB connection held │◄── Wait Period
β”‚ - Redis-only reads β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β”‚
β–Ό (User Approves)
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ POST /threads/{id}/interrupt/resume β”‚
β”‚ 1. Acquire DB connection from pool │◄── Resume Path
β”‚ 2. Load checkpoint β”‚
β”‚ 3. Inject approval into state β”‚
β”‚ 4. Continue streaming β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
```

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟑 Minor

Add language specifier to fenced code block.

The architecture diagram code block lacks a language specifier. While this is a documentation file, adding a language identifier (e.g., text or leaving it explicit) improves markdown linter compliance and rendering consistency.

πŸ“ Suggested fix
 **Architecture: Checkpoint-First, Connection-Free Waits**

-```
+```text
 Request β†’ Agent Run β†’ Interrupt Detected
πŸ“ Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
**Architecture: Checkpoint-First, Connection-Free Waits**
```
Request β†’ Agent Run β†’ Interrupt Detected
β”‚
β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ 1. Checkpoint State to Postgres β”‚
β”‚ 2. Write Interrupt Metadata to Redis │◄── Fast Path
β”‚ 3. Release DB Connection β”‚
β”‚ 4. Return SSE: {"interrupt": {...}} β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β”‚
β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Client Polls: GET /threads/{id}/status β”‚
β”‚ - No DB connection held │◄── Wait Period
β”‚ - Redis-only reads β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β”‚
β–Ό (User Approves)
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ POST /threads/{id}/interrupt/resume β”‚
β”‚ 1. Acquire DB connection from pool │◄── Resume Path
β”‚ 2. Load checkpoint β”‚
β”‚ 3. Inject approval into state β”‚
β”‚ 4. Continue streaming β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
```
**Architecture: Checkpoint-First, Connection-Free Waits**
🧰 Tools
πŸͺ› markdownlint-cli2 (0.18.1)

55-55: Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)


57-57: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

πŸ€– Prompt for AI Agents
In @.claude/specs/feat-336-human-in-the-loop/PROPOSAL_OPTIMIZER.md around lines
55 - 83, The fenced architecture diagram labeled "Architecture:
Checkpoint-First, Connection-Free Waits" is missing a language tag on its
triple-backtick block; update the opening fence to include a language identifier
(for example use ```text) so the code block becomes ```text ... ``` in
PROPOSAL_OPTIMIZER.md, ensuring the diagram retains exact content but adds the
language specifier for markdown linter and renderer consistency.

Comment on lines +410 to +441
try:
async with get_checkpoint_db() as checkpointer:
service_context = ServiceContext(
user_id=user.id, store=store, checkpointer=checkpointer
)

# Get the interrupt service (graph would be loaded for actual resume)
interrupt_service = InterruptService(user_id=user.id)

# Handle the decision
await interrupt_service.handle_decision(request)

return InterruptResponse(
status="resumed",
interrupt_id=request.interrupt_id,
thread_id=thread_id,
message=f"Thread resumed with action: {request.action.value}",
)

except InterruptNotFoundError as e:
raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail=str(e))
except InterruptExpiredError as e:
raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail=str(e))
except InterruptAuthorizationError as e:
raise HTTPException(status_code=status.HTTP_403_FORBIDDEN, detail=str(e))
except InterruptValidationError as e:
raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail=str(e))
except Exception as e:
logger.exception(f"Error resuming thread {thread_id}: {e}")
raise HTTPException(
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR, detail=str(e)
)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟑 Minor

🧩 Analysis chain

🏁 Script executed:

# First, let's look at the full context around the lines in question
cat -n backend/src/routes/v0/thread.py | sed -n '380,445p'

Repository: ruska-ai/orchestra

Length of output: 2944


🏁 Script executed:

# Also check if service_context is referenced anywhere else in the function
# by looking at a wider range and searching for usages
grep -n "service_context" backend/src/routes/v0/thread.py | head -20

Repository: ruska-ai/orchestra

Length of output: 1408


🏁 Script executed:

# Check the function signature and surrounding context to understand intent
cat -n backend/src/routes/v0/thread.py | sed -n '370,450p'

Repository: ruska-ai/orchestra

Length of output: 3542


🏁 Script executed:

# Check if there are any linting issues or TODO comments related to this
grep -n "TODO\|FIXME" backend/src/routes/v0/thread.py | grep -A 5 -B 5 "410\|411\|412\|413\|414\|415\|416\|417"

Repository: ruska-ai/orchestra

Length of output: 44


🏁 Script executed:

# Check the InterruptService class to understand if it needs the checkpointer/store
rg "class InterruptService" -A 10 backend/

Repository: ruska-ai/orchestra

Length of output: 738


🏁 Script executed:

# Look for any comments about graph resume or actual implementation
grep -n "graph.*resume\|actual resume" backend/src/routes/v0/thread.py

Repository: ruska-ai/orchestra

Length of output: 147


🏁 Script executed:

# Check the InterruptService __init__ method to see what it accepts
cat -n backend/src/services/interrupt.py | sed -n '1,100p'

Repository: ruska-ai/orchestra

Length of output: 3374


🏁 Script executed:

# Check if service_context is used anywhere by looking at the full InterruptService implementation
grep -n "def __init__\|self.checkpointer\|self.store" backend/src/services/interrupt.py | head -20

Repository: ruska-ai/orchestra

Length of output: 146


🏁 Script executed:

# Look at other endpoints that create ServiceContext to see the pattern
cat -n backend/src/routes/v0/thread.py | sed -n '50,80p'

Repository: ruska-ai/orchestra

Length of output: 1627


🏁 Script executed:

# Get the full InterruptService.__init__ method
cat -n backend/src/services/interrupt.py | sed -n '200,250p'

Repository: ruska-ai/orchestra

Length of output: 2105


🏁 Script executed:

# Check if there's any usage of service_context in the resume_thread function or nearby
sed -n '410,441p' backend/src/routes/v0/thread.py | grep -c "service_context"

Repository: ruska-ai/orchestra

Length of output: 63


🏁 Script executed:

# Verify the coding guidelines - check if exception chaining is a pre-commit hook requirement
cat -n .pre-commit-config.yaml 2>/dev/null || echo "File not found"

Repository: ruska-ai/orchestra

Length of output: 1024


🏁 Script executed:

# Check for any B904 or similar flake8/ruff configurations
cat -n pyproject.toml 2>/dev/null | grep -A 20 "tool.ruff\|tool.flake8" | head -40

Repository: ruska-ai/orchestra

Length of output: 44


🏁 Script executed:

# Check the full pyproject.toml for ruff configuration
cat -n pyproject.toml | head -100

Repository: ruska-ai/orchestra

Length of output: 108


🏁 Script executed:

# Also check the Makefile to see what format command does
cat backend/Makefile

Repository: ruska-ai/orchestra

Length of output: 1266


🏁 Script executed:

# Check if there's a setup.cfg or ruff.toml or any ruff configuration
find . -name "ruff.toml" -o -name "setup.cfg" -o -name ".flake8" | head -10

Repository: ruska-ai/orchestra

Length of output: 73


🏁 Script executed:

# Check if the generic exception handler at line 437-441 is sufficient for tracebacks
# by looking at how logger.exception works (it's standard Python logging which preserves traceback)
python3 << 'EOF'
import logging
# Verify that logger.exception preserves the full traceback
# This is standard Python behavior - logger.exception includes the exception info
print("logger.exception() in Python logging automatically includes exception info and traceback")
print("Using 'from e' in raise statements is still recommended for exception chaining")
EOF

Repository: ruska-ai/orchestra

Length of output: 148


🏁 Script executed:

# Check the ruff configuration to see if exception chaining is enforced
cat ruff.toml

Repository: ruska-ai/orchestra

Length of output: 968


🏁 Script executed:

# Verify that service_context is truly not used by checking the entire function body
sed -n '387,441p' backend/src/routes/v0/thread.py | grep -o "service_context" | wc -l

Repository: ruska-ai/orchestra

Length of output: 63


🏁 Script executed:

# Let's verify if B904 (exception chaining) is indeed a linting rule being enforced
grep -n "B904\|exception.*chain" ruff.toml

Repository: ruska-ai/orchestra

Length of output: 44


🏁 Script executed:

# Check if there are other functions in the file that properly chain exceptions
rg "raise.*from\s+e" backend/src/routes/v0/thread.py

Repository: ruska-ai/orchestra

Length of output: 44


Remove unused service_context and add exception chaining.

The service_context variable is created but never used. While the comment indicates this is scaffolding for future graph resume functionality, it should either be removed now or annotated with a TODO comment. Additionally, the exception handlers should chain exceptions using from e to preserve the original exception context, as required by the flake8-bugbear B904 rule enforced in the ruff configuration.

Suggested improvement for exception chaining
     except InterruptNotFoundError as e:
-        raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail=str(e))
+        raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail=str(e)) from e
     except InterruptExpiredError as e:
-        raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail=str(e))
+        raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail=str(e)) from e
     except InterruptAuthorizationError as e:
-        raise HTTPException(status_code=status.HTTP_403_FORBIDDEN, detail=str(e))
+        raise HTTPException(status_code=status.HTTP_403_FORBIDDEN, detail=str(e)) from e
     except InterruptValidationError as e:
-        raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail=str(e))
+        raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail=str(e)) from e
     except Exception as e:
         logger.exception(f"Error resuming thread {thread_id}: {e}")
         raise HTTPException(
             status_code=status.HTTP_500_INTERNAL_SERVER_ERROR, detail=str(e)
-        )
+        ) from e

Also remove lines 412-414 (service_context creation) or add a TODO comment if this is needed for future implementation.

🧰 Tools
πŸͺ› Ruff (0.14.11)

412-412: Local variable service_context is assigned to but never used

Remove assignment to unused variable service_context

(F841)


430-430: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


432-432: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


434-434: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


436-436: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


437-437: Do not catch blind exception: Exception

(BLE001)


439-441: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

πŸ€– Prompt for AI Agents
In `@backend/src/routes/v0/thread.py` around lines 410 - 441, Remove the unused
ServiceContext instantiation (the three-line block creating service_context
using ServiceContext(user_id=user.id, store=store, checkpointer=checkpointer))
or replace it with a one-line TODO comment indicating future graph-resume usage,
and update all raised HTTPException/raise statements in this try/except block
(the handlers for InterruptNotFoundError, InterruptExpiredError,
InterruptAuthorizationError, InterruptValidationError and the generic Exception)
to use exception chaining (e.g., raise HTTPException(... ) from e) so the
original exception context is preserved; locate these changes around the logic
that opens get_checkpoint_db(), calls
InterruptService(user_id=user.id).handle_decision(request), and returns
InterruptResponse(...).

Comment thread backend/src/routes/v0/thread.py
Comment on lines +188 to +207
# Handler registry
_handlers: Dict[DecisionType, type] = {
DecisionType.APPROVE: ApproveHandler,
DecisionType.EDIT: EditHandler,
DecisionType.REJECT: RejectHandler,
DecisionType.RESPOND: RespondHandler,
}

# In-memory cache for pending interrupts (could be replaced with Redis)
_pending_interrupts: Dict[str, Interrupt] = {}
_used_nonces: set = set()

def __init__(
self,
user_id: str,
graph: Optional[CompiledStateGraph] = None,
):
self.user_id = user_id
self.graph = graph

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, find the file and check its existence
find . -name "interrupt.py" -path "*/backend/*" -type f

Repository: ruska-ai/orchestra

Length of output: 141


🏁 Script executed:

# Read the file to see lines 188-207 and surrounding context
wc -l backend/src/services/interrupt.py

Repository: ruska-ai/orchestra

Length of output: 99


🏁 Script executed:

# Read the relevant section of the file
sed -n '180,220p' backend/src/services/interrupt.py

Repository: ruska-ai/orchestra

Length of output: 1263


🏁 Script executed:

# Search for all usages of _pending_interrupts and _used_nonces throughout the file
rg "_pending_interrupts|_used_nonces" backend/src/services/interrupt.py -n

Repository: ruska-ai/orchestra

Length of output: 648


🏁 Script executed:

# Get the full class definition to understand scope and initialization
sed -n '1,250p' backend/src/services/interrupt.py

Repository: ruska-ai/orchestra

Length of output: 7654


🏁 Script executed:

# Check the create_interrupt method and how _pending_interrupts is used
sed -n '240,290p' backend/src/services/interrupt.py

Repository: ruska-ai/orchestra

Length of output: 1998


🏁 Script executed:

# Check the nonce validation logic
sed -n '275,295p' backend/src/services/interrupt.py

Repository: ruska-ai/orchestra

Length of output: 812


🏁 Script executed:

# Check if there are any class-level state management patterns elsewhere
rg "^    [a-z_]+\s*:" backend/src/services/interrupt.py -A 1 | head -30

Repository: ruska-ai/orchestra

Length of output: 314


🏁 Script executed:

# Check how the InterruptService is instantiated to understand multi-instance usage
rg "InterruptService" backend/src --type py -B 2 -A 2

Repository: ruska-ai/orchestra

Length of output: 1300


🏁 Script executed:

# Check cleanup logic and any other references to class-level state
sed -n '400,420p' backend/src/services/interrupt.py

Repository: ruska-ai/orchestra

Length of output: 638


🏁 Script executed:

# Verify if there's any documentation or tests that rely on class-level sharing
rg "InterruptService" backend/tests --type py -B 1 -A 1 2>/dev/null | head -40

Repository: ruska-ai/orchestra

Length of output: 44


🏁 Script executed:

# Check the full __init__ to see if instance state is already initialized there
sed -n '195,215p' backend/src/services/interrupt.py

Repository: ruska-ai/orchestra

Length of output: 576


Move pending/nonce caches to instance scope to avoid cross-user leakage.

Class-level mutable caches are shared across all InterruptService instances. Since each user gets a separate instance (as seen in routes), all users' interrupts and nonces will be stored in the same dictionaries, creating data exposure and breaking replay protection.

πŸ› οΈ Proposed fix
-    # In-memory cache for pending interrupts (could be replaced with Redis)
-    _pending_interrupts: Dict[str, Interrupt] = {}
-    _used_nonces: set = set()
 
     def __init__(
         self,
         user_id: str,
         graph: Optional[CompiledStateGraph] = None,
     ):
         self.user_id = user_id
         self.graph = graph
+        # In-memory cache for pending interrupts (could be replaced with Redis)
+        self._pending_interrupts: Dict[str, Interrupt] = {}
+        self._used_nonces: set[str] = set()
🧰 Tools
πŸͺ› Ruff (0.14.11)

189-194: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)


197-197: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)


198-198: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)

πŸ€– Prompt for AI Agents
In `@backend/src/services/interrupt.py` around lines 188 - 207, The class
currently defines mutable caches _pending_interrupts and _used_nonces at class
scope, causing cross-user data leakage; move these to instance scope by
initializing self._pending_interrupts = {} and self._used_nonces = set() inside
the __init__ of the InterruptService (or the class containing _handlers and
__init__), remove or convert the class-level mutable definitions, and update all
internal references to use self._pending_interrupts and self._used_nonces (leave
_handlers as a class-level immutable mapping if desired). Ensure any methods
that previously referenced the class attributes now reference the instance
attributes to restore per-user isolation.

Comment thread backend/src/services/interrupt.py Outdated
Comment on lines +387 to +389
# Update interrupt status
interrupt.status = InterruptStatus[request.action.value.upper()]
interrupt.resolved_at = datetime.now(timezone.utc)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | πŸ”΄ Critical

🧩 Analysis chain

🏁 Script executed:

find backend -name "interrupt.py" -type f

Repository: ruska-ai/orchestra

Length of output: 137


🏁 Script executed:

cd backend && find . -name "interrupt.py" -type f

Repository: ruska-ai/orchestra

Length of output: 125


🏁 Script executed:

cat -n backend/src/services/interrupt.py | head -400 | tail -50

Repository: ruska-ai/orchestra

Length of output: 2263


🏁 Script executed:

grep -n "class InterruptStatus" backend/src/services/interrupt.py

Repository: ruska-ai/orchestra

Length of output: 44


🏁 Script executed:

rg "class InterruptStatus|InterruptStatus.*=" backend --type py -B 2 -A 5

Repository: ruska-ai/orchestra

Length of output: 1141


🏁 Script executed:

ls -la backend/src/services/ | head -20

Repository: ruska-ai/orchestra

Length of output: 1156


🏁 Script executed:

find backend -name "*.py" -type f | grep -i interrupt

Repository: ruska-ai/orchestra

Length of output: 137


🏁 Script executed:

git ls-files backend | grep -i interrupt | head -20

Repository: ruska-ai/orchestra

Length of output: 137


🏁 Script executed:

git ls-files backend | head -30

Repository: ruska-ai/orchestra

Length of output: 1015


🏁 Script executed:

git ls-files backend/src/services/ 2>/dev/null | head -20

Repository: ruska-ai/orchestra

Length of output: 611


🏁 Script executed:

test -f backend/src/services/interrupt.py && echo "File exists" || echo "File does not exist"

Repository: ruska-ai/orchestra

Length of output: 73


🏁 Script executed:

ls -la backend/src/ 2>/dev/null | head -20

Repository: ruska-ai/orchestra

Length of output: 967


🏁 Script executed:

find . -name "interrupt.py" -type f 2>/dev/null

Repository: ruska-ai/orchestra

Length of output: 141


🏁 Script executed:

git ls-files | grep -E "\.py$" | head -30

Repository: ruska-ai/orchestra

Length of output: 1114


🏁 Script executed:

rg "InterruptStatus" --type py | head -20

Repository: ruska-ai/orchestra

Length of output: 812


🏁 Script executed:

git ls-files | head -50

Repository: ruska-ai/orchestra

Length of output: 2051


🏁 Script executed:

git status

Repository: ruska-ai/orchestra

Length of output: 340


🏁 Script executed:

rg "DecisionType" --type py -B 2 -A 10 | head -50

Repository: ruska-ai/orchestra

Length of output: 3092


🏁 Script executed:

rg "request\.action" --type py -B 3 -A 3 | head -100

Repository: ruska-ai/orchestra

Length of output: 2500


🏁 Script executed:

pwd && ls -la

Repository: ruska-ai/orchestra

Length of output: 2087


🏁 Script executed:

find . -type f -name "*.py" | head -20

Repository: ruska-ai/orchestra

Length of output: 816


🏁 Script executed:

rg "InterruptStatus\[request\.action" --type py

Repository: ruska-ai/orchestra

Length of output: 168


🏁 Script executed:

rg "interrupt.status = InterruptStatus" --type py -B 5 -A 5

Repository: ruska-ai/orchestra

Length of output: 1698


🏁 Script executed:

git ls-files "*.py" | wc -l

Repository: ruska-ai/orchestra

Length of output: 65


🏁 Script executed:

git ls-files | grep "\.py$"

Repository: ruska-ai/orchestra

Length of output: 5899


🏁 Script executed:

git ls-files | head -20 && echo "---" && git log --oneline | head -3

Repository: ruska-ai/orchestra

Length of output: 777


🏁 Script executed:

cat -n backend/src/schemas/entities/interrupt.py | head -100

Repository: ruska-ai/orchestra

Length of output: 3805


🏁 Script executed:

rg "class InterruptStatus|class DecisionType" backend --type py -A 10

Repository: ruska-ai/orchestra

Length of output: 1304


🏁 Script executed:

sed -n '/class InterruptStatus/,/^class /p' backend/src/schemas/entities/interrupt.py | head -20

Repository: ruska-ai/orchestra

Length of output: 312


🏁 Script executed:

sed -n '/class DecisionType/,/^class /p' backend/src/schemas/entities/interrupt.py | head -30

Repository: ruska-ai/orchestra

Length of output: 281


🏁 Script executed:

cat -n backend/src/schemas/entities/interrupt.py | head -80

Repository: ruska-ai/orchestra

Length of output: 2937


🏁 Script executed:

python3 << 'EOF'
import sys
sys.path.insert(0, '/home/jailuser/git/backend/src')

# Read the interrupt.py schema file
with open('/home/jailuser/git/backend/src/schemas/entities/interrupt.py', 'r') as f:
    content = f.read()
    
# Find InterruptStatus enum
import re
interrupt_status_match = re.search(r'class InterruptStatus\(str, Enum\):(.*?)(?=\n(?:class |\Z))', content, re.DOTALL)
if interrupt_status_match:
    print("=== InterruptStatus ===")
    print(interrupt_status_match.group(0)[:500])
    
# Find DecisionType enum
decision_type_match = re.search(r'class DecisionType\(str, Enum\):(.*?)(?=\n(?:class |\Z))', content, re.DOTALL)
if decision_type_match:
    print("\n=== DecisionType ===")
    print(decision_type_match.group(0)[:500])
EOF

Repository: ruska-ai/orchestra

Length of output: 510


Use an explicit status mapping instead of relying on enum name matching.

The code InterruptStatus[request.action.value.upper()] raises a KeyError for all decision types, not just RESPOND. The enum names don't align: DecisionType.APPROVE becomes "APPROVE" but the enum defines APPROVED; similarly, EDIT β†’ "EDITED", REJECT β†’ "REJECTED", and RESPOND has no corresponding status at all.

Implement an explicit mapping between DecisionType and InterruptStatus, and add RESPONDED to the InterruptStatus enum to support the RESPOND action.

Suggested fix
-        interrupt.status = InterruptStatus[request.action.value.upper()]
+        status_map = {
+            DecisionType.APPROVE: InterruptStatus.APPROVED,
+            DecisionType.EDIT: InterruptStatus.EDITED,
+            DecisionType.REJECT: InterruptStatus.REJECTED,
+            DecisionType.RESPOND: InterruptStatus.RESPONDED,
+        }
+        interrupt.status = status_map[request.action]

Add RESPONDED = "responded" to the InterruptStatus enum in backend/src/schemas/entities/interrupt.py.

πŸ€– Prompt for AI Agents
In `@backend/src/services/interrupt.py` around lines 387 - 389, Replace the
fragile enum lookup with an explicit mapping from DecisionType to
InterruptStatus: add RESPONDED = "responded" to the InterruptStatus enum (in
interrupt.py) and create a mapping dict (e.g. DECISION_TO_STATUS) that maps
DecisionType.APPROVE->InterruptStatus.APPROVED,
DecisionType.EDIT->InterruptStatus.EDITED,
DecisionType.REJECT->InterruptStatus.REJECTED,
DecisionType.RESPOND->InterruptStatus.RESPONDED; then set interrupt.status =
DECISION_TO_STATUS[request.action] (or .value as appropriate) and keep setting
interrupt.resolved_at as before to avoid KeyError from name mismatches.

Comment on lines +76 to +86
# Re-validate edited args against the tool's schema before execution
if tool.args_schema:
try:
validated_args = tool.args_schema(**edited_args).model_dump()
except Exception as e:
logger.error(
f"Edited args validation failed for tool {tool.name}: {e}"
)
raise InterruptValidationError(
f"Edited arguments failed validation for tool '{tool.name}': {e}"
)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟑 Minor

🧩 Analysis chain

🏁 Script executed:

# First, examine the file structure and current imports
head -50 backend/src/utils/tools.py

Repository: ruska-ai/orchestra

Length of output: 1810


🏁 Script executed:

# Check for pydantic imports and version
rg "import.*pydantic|from pydantic" backend/src/utils/tools.py -A 2

Repository: ruska-ai/orchestra

Length of output: 175


🏁 Script executed:

# Look for pyproject.toml or requirements to determine pydantic version
fd -e "pyproject.toml" -e "requirements.txt" -e "setup.py" | head -5 | xargs -I {} sh -c 'echo "=== {} ===" && cat {}'

Repository: ruska-ai/orchestra

Length of output: 44


🏁 Script executed:

# Search for pydantic ValidationError usage patterns in the codebase
rg "ValidationError|pydantic" backend --type py -l | head -10

Repository: ruska-ai/orchestra

Length of output: 371


🌐 Web query:

Pydantic v2 ValidationError exception type BaseModel validation fails

πŸ’‘ Result:

When a BaseModel fails validation in Pydantic v2, a pydantic.ValidationError is raised. Catch it like:

from pydantic import BaseModel, ValidationError

class M(BaseModel):
x: int

try:
M(x='nope')
except ValidationError as e:
print(e.errors())

Note: in v2 some exceptions (e.g., TypeError raised inside a validator) are no longer automatically wrapped as ValidationError β€” they will propagate as their original exception unless you catch/convert them yourself. [1][2]

Sources:

  • Pydantic v2 β€” Validation Errors. [1]
  • Pydantic v2 migration notes (TypeError behavior change). [2]

🏁 Script executed:

sed -n '70,90p' backend/src/utils/tools.py

Repository: ruska-ai/orchestra

Length of output: 986


🏁 Script executed:

# Check for other similar Exception catches in the file
rg "except Exception" backend/src/utils/tools.py -n

Repository: ruska-ai/orchestra

Length of output: 134


🏁 Script executed:

# Verify pydantic version from pyproject.toml or requirements
rg "pydantic" backend/pyproject.toml backend/requirements.txt -i | head -5

Repository: ruska-ai/orchestra

Length of output: 197


🏁 Script executed:

# Check if ValidationError is already imported elsewhere in the codebase
rg "from pydantic import.*ValidationError|import.*ValidationError" backend --type py

Repository: ruska-ai/orchestra

Length of output: 205


Catch ValidationError explicitly and chain the exception.
Catching Exception hides non-validation bugs. Pydantic v2 raises ValidationError for schema validation failures. Use raise ... from err to preserve the exception chain, and add exc_info to capture the full traceback in logs.

πŸ› οΈ Suggested refinement
-from pydantic import BaseModel
+from pydantic import BaseModel, ValidationError
@@
-			if tool.args_schema:
-				try:
-					validated_args = tool.args_schema(**edited_args).model_dump()
-				except Exception as e:
-					logger.error(
-						f"Edited args validation failed for tool {tool.name}: {e}"
-					)
-					raise InterruptValidationError(
-						f"Edited arguments failed validation for tool '{tool.name}': {e}"
-					)
+			if tool.args_schema:
+				try:
+					validated_args = tool.args_schema(**edited_args).model_dump()
+				except ValidationError as err:
+					logger.error(
+						f"Edited args validation failed for tool {tool.name}: {err}",
+						exc_info=err,
+					)
+					raise InterruptValidationError(
+						f"Edited arguments failed validation for tool '{tool.name}'."
+					) from err
πŸ“ Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# Re-validate edited args against the tool's schema before execution
if tool.args_schema:
try:
validated_args = tool.args_schema(**edited_args).model_dump()
except Exception as e:
logger.error(
f"Edited args validation failed for tool {tool.name}: {e}"
)
raise InterruptValidationError(
f"Edited arguments failed validation for tool '{tool.name}': {e}"
)
# Re-validate edited args against the tool's schema before execution
if tool.args_schema:
try:
validated_args = tool.args_schema(**edited_args).model_dump()
except ValidationError as err:
logger.error(
f"Edited args validation failed for tool {tool.name}: {err}",
exc_info=err,
)
raise InterruptValidationError(
f"Edited arguments failed validation for tool '{tool.name}'."
) from err
🧰 Tools
πŸͺ› Ruff (0.14.11)

80-80: Do not catch blind exception: Exception

(BLE001)


84-86: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


84-86: Avoid specifying long messages outside the exception class

(TRY003)

πŸ€– Prompt for AI Agents
In `@backend/src/utils/tools.py` around lines 76 - 86, Replace the broad except
Exception with an explicit import and catch of pydantic.ValidationError when
validating edited_args with tool.args_schema, log the failure using
logger.error(..., exc_info=True) to include the traceback, and re-raise
InterruptValidationError using "raise InterruptValidationError(... ) from e" to
preserve the exception chain; update the block around tool.args_schema,
edited_args, validated_args, logger.error, and InterruptValidationError to
implement these changes and ensure only validation errors are handled here.

Comment on lines +2 to +10
import {
Interrupt,
InterruptEventData,
InterruptRequest,
InterruptResponse,
InterruptState,
InterruptActions,
InterruptContextType,
} from "@/lib/entities/interrupt";

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | πŸ”΄ Critical

🧩 Analysis chain

🏁 Script executed:

# Locate the file
find . -path "*frontend/src/hooks/useInterrupt.ts" -type f

Repository: ruska-ai/orchestra

Length of output: 98


🏁 Script executed:

# Read the full file to check InterruptActions usage
cat -n frontend/src/hooks/useInterrupt.ts

Repository: ruska-ai/orchestra

Length of output: 7445


🏁 Script executed:

# Search for InterruptActions references in the file
rg "InterruptActions" frontend/src/hooks/useInterrupt.ts

Repository: ruska-ai/orchestra

Length of output: 80


Remove unused InterruptActions import.

InterruptActions is imported but never referenced in this file, causing a TS6133 unused variable build error.

Proposed fix
 import {
 	Interrupt,
 	InterruptEventData,
 	InterruptRequest,
 	InterruptResponse,
 	InterruptState,
-	InterruptActions,
 	InterruptContextType,
 } from "@/lib/entities/interrupt";
πŸ“ Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
import {
Interrupt,
InterruptEventData,
InterruptRequest,
InterruptResponse,
InterruptState,
InterruptActions,
InterruptContextType,
} from "@/lib/entities/interrupt";
import {
Interrupt,
InterruptEventData,
InterruptRequest,
InterruptResponse,
InterruptState,
InterruptContextType,
} from "@/lib/entities/interrupt";
🧰 Tools
πŸͺ› GitHub Actions: Test

[error] 8-8: TS6133: 'InterruptActions' is declared but its value is never read.

πŸ€– Prompt for AI Agents
In `@frontend/src/hooks/useInterrupt.ts` around lines 2 - 10, The import list in
useInterrupt.ts includes an unused symbol InterruptActions which triggers a
TS6133 error; remove InterruptActions from the import statement that currently
imports Interrupt, InterruptEventData, InterruptRequest, InterruptResponse,
InterruptState, InterruptActions, InterruptContextType so the file only imports
the symbols actually used (e.g., Interrupt, InterruptEventData,
InterruptRequest, InterruptResponse, InterruptState, InterruptContextType) and
then re-run the TypeScript build to confirm the unused import error is resolved.

Comment thread frontend/src/hooks/useInterrupt.ts
Comment on lines +74 to +84
export interface InterruptEventData {
thread_id: string;
checkpoint_id: string;
tool_name: string;
tool_args: Record<string, unknown>;
tool_call_id: string;
tool_description?: string;
reason: string;
timeout_at: string;
created_at: string;
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | πŸ”΄ Critical

🧩 Analysis chain

🏁 Script executed:

# First, let's check if there's a backend definition of InterruptEventData
find . -type f -name "*.py" | xargs grep -l "InterruptEventData" | head -20

Repository: ruska-ai/orchestra

Length of output: 178


🏁 Script executed:

# Search for interrupt-related files in the backend
fd -e py interrupt | head -20

Repository: ruska-ai/orchestra

Length of output: 137


🏁 Script executed:

# Check the frontend interrupt.ts file structure
cat -n frontend/src/lib/entities/interrupt.ts

Repository: ruska-ai/orchestra

Length of output: 6284


🏁 Script executed:

# Search for usage of interrupt_id and nonce in the frontend codebase
rg "interrupt_id|nonce" frontend/src --type ts --type tsx -i

Repository: ruska-ai/orchestra

Length of output: 89


🏁 Script executed:

cat -n backend/src/schemas/entities/interrupt.py

Repository: ruska-ai/orchestra

Length of output: 9390


🏁 Script executed:

cat -n backend/src/services/interrupt.py | head -100

Repository: ruska-ai/orchestra

Length of output: 3374


🏁 Script executed:

rg "handleInterruptEvent\|setPendingInterrupt" frontend/src --type ts -A 10 | head -60

Repository: ruska-ai/orchestra

Length of output: 44


🏁 Script executed:

# Search for hook implementations that handle interrupts
fd -e ts -e tsx frontend/src | xargs grep -l "useInterrupt\|InterruptEventData" | head -10

Repository: ruska-ai/orchestra

Length of output: 415


🏁 Script executed:

# Check if there's any workaround or ID extraction happening
rg "interrupt_id" frontend/src --type ts -B 2 -A 2

Repository: ruska-ai/orchestra

Length of output: 2252


🏁 Script executed:

cat -n frontend/src/hooks/useInterrupt.ts

Repository: ruska-ai/orchestra

Length of output: 7445


Add interrupt_id and nonce to InterruptEventData interface.

The backend schema includes both interrupt_id and nonce in the SSE event payload, but the frontend interface omits them. The hook currently fabricates these values (lines 56 and 69 in useInterrupt.ts), which breaks the resume contractβ€”the generated ID won't match the backend's ID, causing interrupt resolution to fail.

export interface InterruptEventData {
+	interrupt_id: string;
	thread_id: string;
	checkpoint_id: string;
	tool_name: string;
	tool_args: Record<string, unknown>;
	tool_call_id: string;
	tool_description?: string;
	reason: string;
	timeout_at: string;
	created_at: string;
+	nonce: string;
}

Update useInterrupt.ts line 56 to use data.interrupt_id instead of fabricating it, and line 69 to use data.nonce instead of generating a new one.

πŸ“ Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
export interface InterruptEventData {
thread_id: string;
checkpoint_id: string;
tool_name: string;
tool_args: Record<string, unknown>;
tool_call_id: string;
tool_description?: string;
reason: string;
timeout_at: string;
created_at: string;
}
export interface InterruptEventData {
interrupt_id: string;
thread_id: string;
checkpoint_id: string;
tool_name: string;
tool_args: Record<string, unknown>;
tool_call_id: string;
tool_description?: string;
reason: string;
timeout_at: string;
created_at: string;
nonce: string;
}
πŸ€– Prompt for AI Agents
In `@frontend/src/lib/entities/interrupt.ts` around lines 74 - 84, The
InterruptEventData interface is missing interrupt_id and nonce, and the hook
useInterrupt is fabricating those values which breaks resume matching; add
interrupt_id: string and nonce: string to the InterruptEventData interface and
update the useInterrupt hook to use the values from the SSE payload
(data.interrupt_id instead of generating an id, and data.nonce instead of
creating a new nonce) where the hook currently fabricates them (references:
InterruptEventData interface and the useInterrupt hook's event handling logic).

@ryaneggz ryaneggz changed the title feat(hitl): Add Human-In-The-Loop support for agent tool approval (#336) DRAFT: feat(hitl): Add Human-In-The-Loop support for agent tool approval (#336) Jan 17, 2026
@ryaneggz ryaneggz marked this pull request as draft January 17, 2026 02:51

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 11

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
backend/src/workers/tasks.py (1)

37-42: Update docstring example to match current signature.

The actual invocation in backend/src/routes/v0/llm.py correctly passes task_dict, user_id, and thread_id. However, the usage example in backend/src/workers/__init__.py (line 12) shows an outdated call with a config_dict argument that is no longer part of the function signature. Update the docstring to:

await run_agent_stream.kiq(task_dict, user_id, thread_id)
πŸ€– Fix all issues with AI agents
In @.claude/specs/resolve-graph-interrupt-error/COUNCIL_REVIEW.md:
- Line 125: Replace the emphasized closing line "*Council Review Complete*" with
a proper Markdown heading (e.g., "## Council Review Complete" or "### Council
Review Complete") to satisfy markdownlint MD036; locate the literal text
"*Council Review Complete*" in the file and change the surrounding asterisks to
a heading marker instead of emphasis.

In @.claude/specs/resolve-graph-interrupt-error/PROPOSAL_GUARDIAN.md:
- Line 278: The markdown in PROPOSAL_GUARDIAN.md is using underscore-based bold
which triggers MD050; update the offending bold markup to use asterisk-based
strong (e.g., change _Current Risk_ to **Current Risk**) so it conforms to
markdownlint expectations; search for other underscore-based bold instances in
the same file and replace them with **...** (reference: PROPOSAL_GUARDIAN.md and
MD050 rule).

In @.claude/specs/resolve-graph-interrupt-error/TASKS.md:
- Around line 32-40: The import for ToolRuntime is wrong; replace the existing
import `from langchain.tools import ToolRuntime` with `from langgraph.prebuilt
import ToolRuntime` so resume_thread's graph-reconstruction imports use the
current prebuilt tool source; update the import statement alongside the other
imports (e.g., near LLMRequest, LLMInput/LLMMetadata, ContextSchema,
construct_agent, init_config, init_backend) to reference ToolRuntime from
langgraph.prebuilt.

In `@backend/src/schemas/entities/interrupt.py`:
- Around line 32-120: The InterruptStatus enum is missing RESPONDED and the
decision-to-status mapping is incorrectly mapping DecisionType.RESPOND to
INTERRUPTSTATUS.REJECTED; add RESPONDED = "responded" to the InterruptStatus
enum and update the mapping in resolve_interrupt() to map DecisionType.RESPOND
(and any code paths used by RespondHandler.handle()) to
InterruptStatus.RESPONDED so internal state reflects a "responded" outcome
instead of "rejected".

In `@backend/src/services/llm.py`:
- Around line 161-170: The wrapped HITL tools are created without the
assistant's decision policy so add_human_in_the_loop falls back to default
allow/accept/edit/respond behavior; update the wrapping loop to pass the
assistant's HITL decision config into add_human_in_the_loop (use the relevant
property on assistant.hitl, e.g., assistant.hitl.interrupt_config or
assistant.hitl.decision_config) whenever should_interrupt_tool(tool.name,
assistant.hitl) is used, so that add_human_in_the_loop(tool,
interrupt_config=...) receives the configured restrictions and assistant.tools
reflects the correct runtime policy.

In `@backend/src/utils/tools.py`:
- Line 60: The async wrapper call_tool_with_interrupt is missing type hints: add
an explicit return type annotation (e.g., -> Any or the specific result type)
and annotate the **tool_input kwargs** parameter (e.g., **tool_input: Any or
Mapping[str, Any]) as per backend typing rules; keep the existing RunnableConfig
type on the config parameter and update the function signature in tools.py
accordingly so all parameters and return types are fully annotated.

In `@backend/src/workers/tasks.py`:
- Around line 16-34: Annotate the contains_interrupt function to satisfy backend
type-hint rules: add parameter and return type hints such as def
contains_interrupt(obj: Any) -> Optional[Interrupt]:, import Any and Optional
from typing, and keep the current recursive logic and return values (return an
Interrupt instance or None). Ensure any iterable checks still accept dict, list,
tuple as before; update the function signature and imports (e.g., from typing
import Any, Optional) accordingly.
- Around line 155-189: The interrupt payloads built inline are missing required
fields for the frontend HITL UI; replace the manual dict creation in both the
top-level Interrupt branch and the nested_interrupt branch with a call to the
shared builder create_interrupt_event (from src.utils.stream) to produce the
complete InterruptEventData schema (including tool_name, tool_args,
tool_call_id, reason, timeout, nonce, created_at, etc.), pass thread_id and
checkpoint_id (config["configurable"].get("checkpoint_id")) as needed, and then
ujson.dumps the tuple ("interrupt", interrupt_event) before xadd so the emitted
stream matches the frontend schema.

In `@frontend/src/hooks/useChat.ts`:
- Around line 14-16: The import list in useChat.ts includes an unused symbol
InterruptEvent which breaks the TypeScript build; remove InterruptEvent from the
import statement (the line importing StreamEvent, InterruptEvent from
"@/lib/entities/stream") so only used types (e.g., StreamEvent) are imported, or
alternatively use InterruptEvent where intendedβ€”update the import at the top of
the file to eliminate the unused InterruptEvent reference.

In `@frontend/src/hooks/useInterrupt.ts`:
- Around line 95-106: The current resolve flow in resumeThread clears
pendingInterrupt and closes the dialog even when other pending interrupts
remain; update the setState call after resumeThread to compute the new
pendingInterrupts (filtering out request.interrupt_id) and then set
pendingInterrupt to the next item (e.g., the first element of the filtered
array) or null if none remain, and set showDialog to true only if a next
pendingInterrupt exists (keep isLoading false). Modify the state update around
setState in useInterrupt.ts to use prev.pendingInterrupts, produce filtered =
prev.pendingInterrupts.filter((i) => i.id !== request.interrupt_id), and set
pendingInterrupt to filtered[0] || null and showDialog to Boolean(filtered[0]).
- Around line 188-238: The helper functions createApproveRequest,
createEditRequest, createRejectRequest, and createRespondRequest currently
generate new nonces with crypto.randomUUID(), which breaks replay protection;
change each function signature to accept a nonce parameter (e.g., nonce: string)
and use that value instead of calling crypto.randomUUID(), then update the
callsites in InterruptApprovalDialog.tsx to pass the interrupt's existing nonce
(interrupt.nonce) into those helpers when building the InterruptRequest.
♻️ Duplicate comments (7)
backend/src/utils/tools.py (1)

76-86: Catch ValidationError explicitly and preserve the exception chain.

Catching Exception will mask non-validation bugs and loses traceback context. Pydantic v2 raises ValidationError for schema failures; log with exc_info=True and re-raise with from err.

πŸ› οΈ Targeted fix
-from pydantic import BaseModel
+from pydantic import BaseModel, ValidationError
@@
-                try:
-                    validated_args = tool.args_schema(**edited_args).model_dump()
-                except Exception as e:
-                    logger.error(
-                        f"Edited args validation failed for tool {tool.name}: {e}"
-                    )
-                    raise InterruptValidationError(
-                        f"Edited arguments failed validation for tool '{tool.name}': {e}"
-                    )
+                try:
+                    validated_args = tool.args_schema(**edited_args).model_dump()
+                except ValidationError as err:
+                    logger.error(
+                        f"Edited args validation failed for tool {tool.name}: {err}",
+                        exc_info=True,
+                    )
+                    raise InterruptValidationError(
+                        f"Edited arguments failed validation for tool '{tool.name}'."
+                    ) from err
backend/src/routes/v0/thread.py (2)

546-565: Add exception chaining and remove debug code before production.

As previously noted, exception handlers should use raise ... from e to preserve the original traceback. Additionally, the inline import traceback and TODO comment indicate debug code that should be cleaned up.

πŸ”§ Suggested fix
     except InterruptNotFoundError as e:
-        raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail=str(e))
+        raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail=str(e)) from e
     except InterruptExpiredError as e:
-        raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail=str(e))
+        raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail=str(e)) from e
     except InterruptAuthorizationError as e:
-        raise HTTPException(status_code=status.HTTP_403_FORBIDDEN, detail=str(e))
+        raise HTTPException(status_code=status.HTTP_403_FORBIDDEN, detail=str(e)) from e
     except InterruptValidationError as e:
-        raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail=str(e))
+        raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail=str(e)) from e
     except HTTPException:
         raise
     except Exception as e:
         logger.exception(f"Error resuming thread {thread_id}: {e}")
-        # Include error details for debugging (TODO: remove in production)
-        import traceback
-
-        error_details = f"{type(e).__name__}: {str(e)}"
         raise HTTPException(
-            status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
-            detail=error_details,
-        )
+            status_code=status.HTTP_500_INTERNAL_SERVER_ERROR, detail=str(e)
+        ) from e

591-607: Remove unused service_context and add exception chaining.

The service_context created on lines 593-595 is never used. Additionally, the exception handler should use raise ... from e.

πŸ”§ Suggested fix
 async def get_thread_interrupts(
     thread_id: str,
     user: ProtectedUser = Depends(verify_credentials),
     store: AsyncPostgresStore = Depends(get_store),
 ):
     """..."""
     try:
         async with get_checkpoint_db() as checkpointer:
-            service_context = ServiceContext(
-                user_id=user.id, store=store, checkpointer=checkpointer
-            )
-
             # Get pending interrupts for this thread
             interrupt_service = InterruptService(user_id=user.id)
             pending = interrupt_service.get_pending_interrupts(thread_id)

             return InterruptList(interrupts=pending)

     except Exception as e:
         logger.exception(f"Error getting interrupts for thread {thread_id}: {e}")
         raise HTTPException(
             status_code=status.HTTP_500_INTERNAL_SERVER_ERROR, detail=str(e)
-        )
+        ) from e
frontend/src/hooks/useInterrupt.ts (1)

2-10: Remove unused InterruptActions import to avoid TS6133.

Line 8 imports InterruptActions but it’s never referenced. With noUnusedLocals, this fails typecheck.

πŸ› οΈ Proposed fix
 import {
 	Interrupt,
 	InterruptEventData,
 	InterruptRequest,
 	InterruptResponse,
 	InterruptState,
-	InterruptActions,
 	InterruptContextType,
 } from "@/lib/entities/interrupt";
backend/src/services/interrupt.py (3)

198-208: Move pending/nonce caches to instance scope.

Lines 206–208 define mutable caches at class scope, which are shared across users and can leak interrupts/nonces across requests.

πŸ› οΈ Proposed fix
-    # In-memory cache for pending interrupts (could be replaced with Redis)
-    _pending_interrupts: Dict[str, Interrupt] = {}
-    _used_nonces: set = set()
-
     def __init__(
         self,
         user_id: str,
         graph: Optional[CompiledStateGraph] = None,
     ):
         self.user_id = user_id
         self.graph = graph
+        # In-memory cache for pending interrupts (could be replaced with Redis)
+        self._pending_interrupts: Dict[str, Interrupt] = {}
+        self._used_nonces: set[str] = set()

291-533: Nonce validation should verify the interrupt’s stored nonce and require it.

Line 503 only checks for nonce reuse and allows missing nonces, which defeats replay protection. Validate request.nonce against interrupt.nonce and require it for all decisions.

πŸ› οΈ Proposed fix
-    def check_nonce(self, interrupt_id: str, nonce: str) -> bool:
-        """Check if a nonce has been used before (without consuming it)."""
-        nonce_key = f"{interrupt_id}:{nonce}"
-        return nonce_key not in self._used_nonces
+    def check_nonce(self, interrupt: Interrupt, nonce: str) -> bool:
+        """Validate that the nonce matches the interrupt and has not been used."""
+        if not nonce or nonce != interrupt.nonce:
+            return False
+        nonce_key = f"{interrupt.id}:{nonce}"
+        return nonce_key not in self._used_nonces
 
-    def consume_nonce(self, interrupt_id: str, nonce: str) -> None:
+    def consume_nonce(self, interrupt: Interrupt, nonce: str) -> None:
         """Mark a nonce as used after successful operation."""
-        nonce_key = f"{interrupt_id}:{nonce}"
+        nonce_key = f"{interrupt.id}:{nonce}"
         self._used_nonces.add(nonce_key)
-        if request.nonce and not self.check_nonce(request.interrupt_id, request.nonce):
-            raise InterruptValidationError(
-                f"Nonce for interrupt {request.interrupt_id} has already been used"
-            )
+        if not request.nonce or not self.check_nonce(interrupt, request.nonce):
+            raise InterruptValidationError(
+                f"Invalid or missing nonce for interrupt {request.interrupt_id}"
+            )
-        if request.nonce:
-            self.consume_nonce(request.interrupt_id, request.nonce)
+        self.consume_nonce(interrupt, request.nonce)

534-543: RESPOND should map to a RESPONDED status, not REJECTED.

Line 539 maps DecisionType.RESPOND to InterruptStatus.REJECTED, which collapses two distinct outcomes. Once RESPONDED exists in the enum, use it here.

πŸ› οΈ Proposed fix
         decision_to_status = {
             DecisionType.APPROVE: InterruptStatus.APPROVED,
             DecisionType.EDIT: InterruptStatus.EDITED,
             DecisionType.REJECT: InterruptStatus.REJECTED,
-            DecisionType.RESPOND: InterruptStatus.REJECTED,  # RESPOND maps to REJECTED
+            DecisionType.RESPOND: InterruptStatus.RESPONDED,
         }
🧹 Nitpick comments (7)
backend/src/utils/middleware.py (1)

44-46: Defensive guard looks good.

The null check correctly prevents an AttributeError when runtime.context is None during HITL resume flows. The inline comment provides helpful context for future maintainers.

One optional consideration: you might add a debug-level log when context is None to aid troubleshooting HITL-related issues in production.

Optional: Add debug logging for observability
             # Guard against None context (e.g., during HITL resume)
             if runtime.context is not None:
                 last_msg.model = runtime.context.model
+            else:
+                logger.debug("Skipping model metadata: runtime.context is None (HITL resume?)")
backend/src/utils/tools.py (1)

23-27: Avoid redefining InterruptValidationError in multiple modules.

There is already an InterruptValidationError in backend/src/services/interrupt.py; a second class with the same name can make except InterruptValidationError handlers miss errors raised here. Consider importing the shared exception or moving it to a small common module.

♻️ Suggested consolidation (if no import cycle)
-from compileall import Compileall
-from pydantic import BaseModel
+from pydantic import BaseModel
+from src.services.interrupt import InterruptValidationError
@@
-class InterruptValidationError(Exception):
-    """Raised when edited arguments fail schema validation."""
-
-    pass
frontend/src/components/modals/InterruptApprovalDialog.tsx (3)

85-102: Consider stabilizing onClose reference to prevent timer resets.

If onClose changes identity between renders (e.g., inline arrow function in parent), the timer effect will reset. Ensure the parent component memoizes onClose with useCallback, or consider removing it from the dependency array and using a ref pattern if this becomes an issue.


125-159: Consider handling rejection from onDecision promise.

If onDecision throws or rejects, the error will propagate uncaught. Consider wrapping the call in try-catch to display an error message to the user rather than silently failing or crashing.

♻️ Suggested improvement
 const handleSubmit = useCallback(async () => {
   if (!interrupt) return;

   let request;
   switch (activeTab) {
     // ... switch cases ...
   }

-  await onDecision(request);
+  try {
+    await onDecision(request);
+  } catch (error) {
+    setParseError(error instanceof Error ? error.message : "Failed to submit decision");
+  }
 }, [
   interrupt,
   activeTab,
   editedArgs,
   feedback,
   validateEditedArgs,
   onDecision,
 ]);

263-273: Shared feedback state between Reject and Respond tabs may cause confusion.

The feedback state is shared between the Reject and Respond tabs. If a user types feedback in the Respond tab, switches to Reject, that feedback persists. Consider either clearing feedback on tab switch or using separate state variables.

.claude/specs/resolve-graph-interrupt-error/TASKS.md (1)

91-113: Documentation includes example credentials - ensure these are test-only.

The curl examples use admin@example.com / test1234. This is fine for documentation, but ensure these credentials don't exist in production environments.

backend/src/utils/stream.py (1)

307-349: Narrow the interrupt parsing exception handling.

Line 347 catches all Exception and continues, which can hide real parsing bugs in the stream. Prefer catching expected errors (KeyError, TypeError, ValueError) or re-raising unexpected ones.

πŸ› οΈ Proposed fix
-                    except Exception as e:
-                        logger.exception(f"Error processing interrupt: {e}")
-                        continue
+                    except (KeyError, TypeError, ValueError) as e:
+                        logger.exception(f"Error processing interrupt: {e}")
+                        continue


---

*Council Review Complete*

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟑 Minor

Use a heading instead of emphasis for the closing line.

Line 125 uses emphasis as a heading; switch to a proper heading to satisfy markdownlint MD036.

✍️ Suggested heading
-*Council Review Complete*
+## Council Review Complete
πŸ“ Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
*Council Review Complete*
## Council Review Complete
🧰 Tools
πŸͺ› markdownlint-cli2 (0.18.1)

125-125: Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)

πŸ€– Prompt for AI Agents
In @.claude/specs/resolve-graph-interrupt-error/COUNCIL_REVIEW.md at line 125,
Replace the emphasized closing line "*Council Review Complete*" with a proper
Markdown heading (e.g., "## Council Review Complete" or "### Council Review
Complete") to satisfy markdownlint MD036; locate the literal text "*Council
Review Complete*" in the file and change the surrounding asterisks to a heading
marker instead of emphasis.


**What if the agent construction fails?**

**Current Risk**: `construct_agent()` (flows/__init__.py line 240-242) catches all exceptions and re-raises:

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟑 Minor

Markdown strong style mismatch (MD050).

Line 278 uses underscore-based bold; markdownlint expects asterisk-based bold. Please switch to **bold**.

🧰 Tools
πŸͺ› markdownlint-cli2 (0.18.1)

278-278: Strong style
Expected: asterisk; Actual: underscore

(MD050, strong-style)


278-278: Strong style
Expected: asterisk; Actual: underscore

(MD050, strong-style)

πŸ€– Prompt for AI Agents
In @.claude/specs/resolve-graph-interrupt-error/PROPOSAL_GUARDIAN.md at line
278, The markdown in PROPOSAL_GUARDIAN.md is using underscore-based bold which
triggers MD050; update the offending bold markup to use asterisk-based strong
(e.g., change _Current Risk_ to **Current Risk**) so it conforms to markdownlint
expectations; search for other underscore-based bold instances in the same file
and replace them with **...** (reference: PROPOSAL_GUARDIAN.md and MD050 rule).

Comment on lines +32 to +40
```python
# Imports for graph reconstruction in resume_thread
from deepagents.backends import StoreBackend
from langchain.tools import ToolRuntime
from src.schemas.entities import LLMRequest
from src.schemas.entities.llm import LLMInput, LLMMetadata
from src.schemas.contexts import ContextSchema
from src.flows import construct_agent, init_config, init_backend
```

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟑 Minor

🧩 Analysis chain

🏁 Script executed:

# Search for ToolRuntime usage in the codebase
rg "ToolRuntime" -A 2 -B 2

Repository: ruska-ai/orchestra

Length of output: 180


🏁 Script executed:

# Check the TASKS.md file content around the import statement
cat -n .claude/specs/resolve-graph-interrupt-error/TASKS.md | sed -n '25,45p'

Repository: ruska-ai/orchestra

Length of output: 760


🏁 Script executed:

# Search for langchain imports in the codebase
rg "from langchain" | head -20

Repository: ruska-ai/orchestra

Length of output: 2123


🏁 Script executed:

# Search for deepagents imports
rg "from deepagents" | head -20

Repository: ruska-ai/orchestra

Length of output: 1305


🏁 Script executed:

# Search for ToolRuntime definitions in source code (not examples or notebooks)
rg "class ToolRuntime"

Repository: ruska-ai/orchestra

Length of output: 44


🏁 Script executed:

# Check the actual source thread.py file to see what imports it currently has
cat -n backend/src/routes/v0/thread.py | head -50

Repository: ruska-ai/orchestra

Length of output: 2193


🏁 Script executed:

# Search for any actual usage of ToolRuntime in source code (not specs/examples)
rg "ToolRuntime" backend/src

Repository: ruska-ai/orchestra

Length of output: 2511


Incorrect import path for ToolRuntime.

The import should be from langgraph.prebuilt import ToolRuntime, not from langchain.tools import ToolRuntime. The codebase shows newer files use langgraph.prebuilt as the correct source (e.g., backend/src/utils/format.py, backend/src/tools/test.py, backend/src/tools/api.py), while older files still reference langchain.tools. Update this import to match the current pattern.

πŸ€– Prompt for AI Agents
In @.claude/specs/resolve-graph-interrupt-error/TASKS.md around lines 32 - 40,
The import for ToolRuntime is wrong; replace the existing import `from
langchain.tools import ToolRuntime` with `from langgraph.prebuilt import
ToolRuntime` so resume_thread's graph-reconstruction imports use the current
prebuilt tool source; update the import statement alongside the other imports
(e.g., near LLMRequest, LLMInput/LLMMetadata, ContextSchema, construct_agent,
init_config, init_backend) to reference ToolRuntime from langgraph.prebuilt.

Comment on lines +32 to +120
class InterruptStatus(str, Enum):
"""Status of an interrupt."""

PENDING = "pending"
APPROVED = "approved"
EDITED = "edited"
REJECTED = "rejected"
TIMEOUT = "timeout"
EXPIRED = "expired"


class InterruptConfig(BaseModel):
"""Configuration for human-in-the-loop interrupts on an assistant.

This config determines which tools require human approval and what
actions are allowed during the approval process.
"""

enabled: bool = Field(
default=False, description="Whether HITL is enabled for this assistant"
)
tools_requiring_approval: List[str] = Field(
default_factory=list,
description="List of tool names that require human approval before execution",
)
timeout_seconds: int = Field(
default=300,
ge=30,
le=3600,
description="Timeout in seconds for pending interrupts (30s to 1 hour)",
)
default_action: Literal["approve", "reject", "timeout"] = Field(
default="timeout",
description="Default action to take when an interrupt times out",
)
allow_approve: bool = Field(
default=True, description="Whether approving tool calls is allowed"
)
allow_edit: bool = Field(
default=True, description="Whether editing tool call arguments is allowed"
)
allow_reject: bool = Field(
default=True, description="Whether rejecting tool calls is allowed"
)


class InterruptRequest(BaseModel):
"""Request to resolve a pending interrupt.

Sent by the frontend when a user makes a decision on a pending
tool call approval request.
"""

interrupt_id: str = Field(..., description="The unique ID of the interrupt")
action: DecisionType = Field(
..., description="The decision: approve, edit, reject, or respond"
)
edited_args: Optional[Dict[str, Any]] = Field(
default=None,
description="Modified tool arguments (required when action is 'edit')",
)
reason: Optional[str] = Field(
default=None,
description="Optional reason for the decision (useful for reject/respond)",
)
nonce: Optional[str] = Field(
default=None, description="One-time-use token for replay protection"
)

@field_validator("edited_args")
@classmethod
def validate_edited_args_required_for_edit(cls, v, info):
"""Ensure edited_args is provided when action is 'edit'."""
if info.data.get("action") == DecisionType.EDIT and not v:
raise ValueError("edited_args is required when action is 'edit'")
return v


class InterruptResponse(BaseModel):
"""Response after resolving an interrupt."""

status: Literal["resumed", "rejected", "error"] = Field(
..., description="The result status of the interrupt resolution"
)
interrupt_id: str = Field(..., description="The interrupt ID that was resolved")
thread_id: str = Field(..., description="The thread ID the interrupt belongs to")
message: Optional[str] = Field(
default=None, description="Optional message about the resolution"
)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟑 Minor

🧩 Analysis chain

🏁 Script executed:

# First, find where DecisionType is defined
fd -t f "\.py$" backend/src | xargs rg "class DecisionType|DecisionType\s*=" -A 10

Repository: ruska-ai/orchestra

Length of output: 1663


🏁 Script executed:

# Check the interrupt service file
fd -t f "interrupt\.py$" backend/src/services

Repository: ruska-ai/orchestra

Length of output: 95


🏁 Script executed:

# Look at what statuses are actually returned in handlers
rg "approved|edited|responded|rejected" backend/src/services --type py -B 2 -A 2

Repository: ruska-ai/orchestra

Length of output: 4733


🏁 Script executed:

# Get more context on how InterruptResponse is used and constructed
rg "InterruptResponse" backend/src --type py -B 3 -A 3

Repository: ruska-ai/orchestra

Length of output: 2732


🏁 Script executed:

# Look at the complete flow in interrupt.py to see handler output handling
cat -n backend/src/services/interrupt.py | head -150

Repository: ruska-ai/orchestra

Length of output: 5807


🏁 Script executed:

# Continue reading the interrupt.py service file
cat -n backend/src/services/interrupt.py | sed -n '150,250p'

Repository: ruska-ai/orchestra

Length of output: 4265


🏁 Script executed:

# Look specifically at where handlers are invoked and how results are used
rg "resolve_interrupt|handler.*handle" backend/src/services/interrupt.py -A 15 -B 5

Repository: ruska-ai/orchestra

Length of output: 1329


🏁 Script executed:

# Check how resolve_interrupt is used in the route and if handler status is used
rg "resolve_interrupt" backend/src/routes/v0/thread.py -B 5 -A 20

Repository: ruska-ai/orchestra

Length of output: 44


🏁 Script executed:

# Verify the complete mapping and see if there's actual usage of handler status
cat -n backend/src/services/interrupt.py | sed -n '300,380p'

Repository: ruska-ai/orchestra

Length of output: 3492


🏁 Script executed:

# Check the route to see exactly how InterruptResponse is constructed and used
cat -n backend/src/routes/v0/thread.py | sed -n '1,50p'

Repository: ruska-ai/orchestra

Length of output: 2193


🏁 Script executed:

# Find the resume_thread endpoint completely
rg "async def resume_thread" backend/src/routes/v0/thread.py -A 50

Repository: ruska-ai/orchestra

Length of output: 2109


🏁 Script executed:

# Continue reading the resume_thread endpoint to see InterruptResponse construction
rg "async def resume_thread" backend/src/routes/v0/thread.py -A 120 | tail -70

Repository: ruska-ai/orchestra

Length of output: 2936


🏁 Script executed:

# Search for all places where InterruptResponse is instantiated
rg "InterruptResponse\(" backend/src --type py -B 3 -A 3

Repository: ruska-ai/orchestra

Length of output: 1005


🏁 Script executed:

# Check what the complete resume_thread function returns and how the handler result is used
rg "async def resume_thread" backend/src/routes/v0/thread.py -A 180 | tail -40

Repository: ruska-ai/orchestra

Length of output: 1574


🏁 Script executed:

# Look for error handling in resume_thread to understand the "error" status
rg "InterruptResponse.*error\|status.*error" backend/src/routes/v0/thread.py -B 5 -A 3

Repository: ruska-ai/orchestra

Length of output: 44


Add missing RESPONDED status to InterruptStatus enum.

InterruptStatus enum lacks a RESPONDED status, but handlers return "responded" as a decision outcome (see RespondHandler.handle() in backend/src/services/interrupt.py). Additionally, the decision-to-status mapping currently maps DecisionType.RESPOND to InterruptStatus.REJECTED, which is semantically incorrect. Add RESPONDED = "responded" to the enum and update the mapping accordingly.

Note: InterruptResponse.status does not need modificationβ€”it correctly uses "resumed" (success) and HTTP exceptions for errors. The handler status values are internal state tracking, not part of the API response contract.

Fix for InterruptStatus
 class InterruptStatus(str, Enum):
     """Status of an interrupt."""

     PENDING = "pending"
     APPROVED = "approved"
     EDITED = "edited"
     REJECTED = "rejected"
+    RESPONDED = "responded"
     TIMEOUT = "timeout"
     EXPIRED = "expired"

Update the mapping in resolve_interrupt():

 decision_to_status = {
     DecisionType.APPROVE: InterruptStatus.APPROVED,
     DecisionType.EDIT: InterruptStatus.EDITED,
     DecisionType.REJECT: InterruptStatus.REJECTED,
-    DecisionType.RESPOND: InterruptStatus.REJECTED,
+    DecisionType.RESPOND: InterruptStatus.RESPONDED,
 }
🧰 Tools
πŸͺ› Ruff (0.14.13)

106-106: Avoid specifying long messages outside the exception class

(TRY003)

πŸ€– Prompt for AI Agents
In `@backend/src/schemas/entities/interrupt.py` around lines 32 - 120, The
InterruptStatus enum is missing RESPONDED and the decision-to-status mapping is
incorrectly mapping DecisionType.RESPOND to INTERRUPTSTATUS.REJECTED; add
RESPONDED = "responded" to the InterruptStatus enum and update the mapping in
resolve_interrupt() to map DecisionType.RESPOND (and any code paths used by
RespondHandler.handle()) to InterruptStatus.RESPONDED so internal state reflects
a "responded" outcome instead of "rejected".

Comment on lines +161 to +170
# Wrap tools with HITL if configured
if assistant.hitl and assistant.hitl.enabled:
wrapped_tools = []
for tool in assistant.tools:
if should_interrupt_tool(tool.name, assistant.hitl):
logger.info(f"Wrapping tool '{tool.name}' with HITL")
wrapped_tools.append(add_human_in_the_loop(tool))
else:
wrapped_tools.append(tool)
assistant.tools = wrapped_tools

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

HITL decision restrictions aren’t applied to wrapped tools.

add_human_in_the_loop() is called without interrupt_config, so the wrapper falls back to allow-accept/edit/respond even if assistant.hitl limits decisions. Please pass the HITL decision config so runtime behavior matches the configured policy.

πŸ› οΈ Suggested fix (conceptual)
- wrapped_tools.append(add_human_in_the_loop(tool))
+ wrapped_tools.append(
+     add_human_in_the_loop(
+         tool,
+         interrupt_config=assistant.hitl.to_human_interrupt_config(),  # or explicit mapping
+     )
+ )
πŸ€– Prompt for AI Agents
In `@backend/src/services/llm.py` around lines 161 - 170, The wrapped HITL tools
are created without the assistant's decision policy so add_human_in_the_loop
falls back to default allow/accept/edit/respond behavior; update the wrapping
loop to pass the assistant's HITL decision config into add_human_in_the_loop
(use the relevant property on assistant.hitl, e.g.,
assistant.hitl.interrupt_config or assistant.hitl.decision_config) whenever
should_interrupt_tool(tool.name, assistant.hitl) is used, so that
add_human_in_the_loop(tool, interrupt_config=...) receives the configured
restrictions and assistant.tools reflects the correct runtime policy.

Comment on lines +16 to +34
from langgraph.types import Interrupt
from src.workers.broker import broker, REDIS_URL


def contains_interrupt(obj):
"""Recursively check if an object contains an Interrupt instance."""
if isinstance(obj, Interrupt):
return obj
if isinstance(obj, dict):
for v in obj.values():
result = contains_interrupt(v)
if result:
return result
if isinstance(obj, (list, tuple)):
for item in obj:
result = contains_interrupt(item)
if result:
return result
return None

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

πŸ› οΈ Refactor suggestion | 🟠 Major

Add type hints to comply with backend standards.

All backend functions require type hints. Please annotate contains_interrupt. As per coding guidelines, type hints are required for all functions.

πŸ› οΈ Suggested fix
-from langgraph.types import Interrupt
+from langgraph.types import Interrupt
+from typing import Any

-def contains_interrupt(obj):
+def contains_interrupt(obj: Any) -> Interrupt | None:
πŸ€– Prompt for AI Agents
In `@backend/src/workers/tasks.py` around lines 16 - 34, Annotate the
contains_interrupt function to satisfy backend type-hint rules: add parameter
and return type hints such as def contains_interrupt(obj: Any) ->
Optional[Interrupt]:, import Any and Optional from typing, and keep the current
recursive logic and return values (return an Interrupt instance or None). Ensure
any iterable checks still accept dict, list, tuple as before; update the
function signature and imports (e.g., from typing import Any, Optional)
accordingly.

Comment on lines +155 to +189
# Check if chunk contains an Interrupt (HITL)
if isinstance(chunk, Interrupt):
# Convert Interrupt to JSON-serializable format
interrupt_data = {
"type": "interrupt",
"interrupt_id": chunk.id,
"value": chunk.value,
"thread_id": thread_id,
"checkpoint_id": config["configurable"].get("checkpoint_id"),
}
data = ujson.dumps(("interrupt", interrupt_data))
await redis_client.xadd(stream_key, {"data": data})
logger.info(f"πŸ›‘ HITL interrupt sent for thread: {thread_id}")
continue

# CRITICAL: Use handle_multi_mode to maintain LangGraph format
stream_chunk = handle_multi_mode(chunk)
if stream_chunk:
# Check for nested Interrupt objects (HITL)
nested_interrupt = contains_interrupt(stream_chunk)
if nested_interrupt:
# Convert nested Interrupt to JSON-serializable format
interrupt_data = {
"type": "interrupt",
"interrupt_id": nested_interrupt.id,
"value": nested_interrupt.value,
"thread_id": thread_id,
"checkpoint_id": config["configurable"].get(
"checkpoint_id"
),
}
data = ujson.dumps(("interrupt", interrupt_data))
await redis_client.xadd(stream_key, {"data": data})
logger.info(f"πŸ›‘ HITL interrupt sent for thread: {thread_id}")
continue

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Interrupt payload schema is incomplete for frontend HITL UI.

The emitted interrupt event only includes interrupt_id, value, thread_id, and checkpoint_id, but the frontend expects InterruptEventData (tool_name, tool_args, tool_call_id, reason, timeout, nonce, created_at, etc.). This will break HITL rendering in distributed mode. Consider reusing the shared interrupt event builder (e.g., create_interrupt_event in src.utils.stream) to keep schema consistent.

πŸ› οΈ Suggested fix (reuse shared builder)
-                    interrupt_data = {
-                        "type": "interrupt",
-                        "interrupt_id": chunk.id,
-                        "value": chunk.value,
-                        "thread_id": thread_id,
-                        "checkpoint_id": config["configurable"].get("checkpoint_id"),
-                    }
+                    interrupt_data = create_interrupt_event(
+                        interrupt=chunk,
+                        thread_id=thread_id,
+                        checkpoint_id=config["configurable"].get("checkpoint_id"),
+                    )
πŸ€– Prompt for AI Agents
In `@backend/src/workers/tasks.py` around lines 155 - 189, The interrupt payloads
built inline are missing required fields for the frontend HITL UI; replace the
manual dict creation in both the top-level Interrupt branch and the
nested_interrupt branch with a call to the shared builder create_interrupt_event
(from src.utils.stream) to produce the complete InterruptEventData schema
(including tool_name, tool_args, tool_call_id, reason, timeout, nonce,
created_at, etc.), pass thread_id and checkpoint_id
(config["configurable"].get("checkpoint_id")) as needed, and then ujson.dumps
the tuple ("interrupt", interrupt_event) before xadd so the emitted stream
matches the frontend schema.

Comment on lines +14 to +16
import type { StreamEvent, InterruptEvent } from "@/lib/entities/stream";
import type { StreamSource } from "@/lib/utils/streamSource";
import type { InterruptEventData } from "@/lib/entities/interrupt";

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | πŸ”΄ Critical

Fix unused InterruptEvent import to unblock TypeScript build.

The pipeline is failing because InterruptEvent is imported but never used. Remove it (or use it) to restore compilation.

πŸ› οΈ Suggested fix
-import type { StreamEvent, InterruptEvent } from "@/lib/entities/stream";
+import type { StreamEvent } from "@/lib/entities/stream";
πŸ“ Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
import type { StreamEvent, InterruptEvent } from "@/lib/entities/stream";
import type { StreamSource } from "@/lib/utils/streamSource";
import type { InterruptEventData } from "@/lib/entities/interrupt";
import type { StreamEvent } from "@/lib/entities/stream";
import type { StreamSource } from "@/lib/utils/streamSource";
import type { InterruptEventData } from "@/lib/entities/interrupt";
🧰 Tools
πŸͺ› GitHub Actions: Test

[error] 14-14: TypeScript error: 'InterruptEvent' is declared but never used.

πŸ€– Prompt for AI Agents
In `@frontend/src/hooks/useChat.ts` around lines 14 - 16, The import list in
useChat.ts includes an unused symbol InterruptEvent which breaks the TypeScript
build; remove InterruptEvent from the import statement (the line importing
StreamEvent, InterruptEvent from "@/lib/entities/stream") so only used types
(e.g., StreamEvent) are imported, or alternatively use InterruptEvent where
intendedβ€”update the import at the top of the file to eliminate the unused
InterruptEvent reference.

Comment on lines +95 to +106
const response = await resumeThread(threadId, request);

// Remove the resolved interrupt from pending list
setState((prev) => ({
...prev,
pendingInterrupt: null,
pendingInterrupts: prev.pendingInterrupts.filter(
(i) => i.id !== request.interrupt_id,
),
isLoading: false,
showDialog: false,
}));

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Advance to the next pending interrupt after resolve.

Line 98 clears pendingInterrupt and hides the dialog even if other pending interrupts remain, so additional approvals never surface. Consider selecting the next pending item instead of always nulling the state.

πŸ› οΈ Proposed fix
-				setState((prev) => ({
-					...prev,
-					pendingInterrupt: null,
-					pendingInterrupts: prev.pendingInterrupts.filter(
-						(i) => i.id !== request.interrupt_id,
-					),
-					isLoading: false,
-					showDialog: false,
-				}));
+				setState((prev) => {
+					const remaining = prev.pendingInterrupts.filter(
+						(i) => i.id !== request.interrupt_id,
+					);
+					const nextPending =
+						remaining.find((i) => i.status === "pending") ?? null;
+					return {
+						...prev,
+						pendingInterrupt: nextPending,
+						pendingInterrupts: remaining,
+						isLoading: false,
+						showDialog: nextPending !== null,
+					};
+				});
πŸ“ Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const response = await resumeThread(threadId, request);
// Remove the resolved interrupt from pending list
setState((prev) => ({
...prev,
pendingInterrupt: null,
pendingInterrupts: prev.pendingInterrupts.filter(
(i) => i.id !== request.interrupt_id,
),
isLoading: false,
showDialog: false,
}));
const response = await resumeThread(threadId, request);
// Remove the resolved interrupt from pending list
setState((prev) => {
const remaining = prev.pendingInterrupts.filter(
(i) => i.id !== request.interrupt_id,
);
const nextPending =
remaining.find((i) => i.status === "pending") ?? null;
return {
...prev,
pendingInterrupt: nextPending,
pendingInterrupts: remaining,
isLoading: false,
showDialog: nextPending !== null,
};
});
πŸ€– Prompt for AI Agents
In `@frontend/src/hooks/useInterrupt.ts` around lines 95 - 106, The current
resolve flow in resumeThread clears pendingInterrupt and closes the dialog even
when other pending interrupts remain; update the setState call after
resumeThread to compute the new pendingInterrupts (filtering out
request.interrupt_id) and then set pendingInterrupt to the next item (e.g., the
first element of the filtered array) or null if none remain, and set showDialog
to true only if a next pendingInterrupt exists (keep isLoading false). Modify
the state update around setState in useInterrupt.ts to use
prev.pendingInterrupts, produce filtered = prev.pendingInterrupts.filter((i) =>
i.id !== request.interrupt_id), and set pendingInterrupt to filtered[0] || null
and showDialog to Boolean(filtered[0]).

Comment on lines +188 to +238
export function createApproveRequest(interruptId: string): InterruptRequest {
return {
interrupt_id: interruptId,
action: "approve",
nonce: crypto.randomUUID(),
};
}

/**
* Helper to create an edit decision request
*/
export function createEditRequest(
interruptId: string,
editedArgs: Record<string, unknown>,
): InterruptRequest {
return {
interrupt_id: interruptId,
action: "edit",
edited_args: editedArgs,
nonce: crypto.randomUUID(),
};
}

/**
* Helper to create a reject decision request
*/
export function createRejectRequest(
interruptId: string,
reason?: string,
): InterruptRequest {
return {
interrupt_id: interruptId,
action: "reject",
reason: reason || "User rejected the tool call",
nonce: crypto.randomUUID(),
};
}

/**
* Helper to create a respond decision request
*/
export function createRespondRequest(
interruptId: string,
feedback: string,
): InterruptRequest {
return {
interrupt_id: interruptId,
action: "respond",
reason: feedback,
nonce: crypto.randomUUID(),
};

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's check the file structure and find the useInterrupt.ts file
find . -name "useInterrupt.ts" -type f

Repository: ruska-ai/orchestra

Length of output: 98


🏁 Script executed:

# Let's read the entire useInterrupt.ts file to understand the context
wc -l frontend/src/hooks/useInterrupt.ts

Repository: ruska-ai/orchestra

Length of output: 100


🏁 Script executed:

# Read the file to understand the full context
cat -n frontend/src/hooks/useInterrupt.ts | head -300

Repository: ruska-ai/orchestra

Length of output: 7377


🏁 Script executed:

# Also search for where these functions are called
rg -n "createApproveRequest|createEditRequest|createRejectRequest|createRespondRequest" --type ts --type tsx

Repository: ruska-ai/orchestra

Length of output: 89


🏁 Script executed:

# Check the InterruptRequest type definition
rg -n "type InterruptRequest|interface InterruptRequest" --type ts --type tsx -A 15

Repository: ruska-ai/orchestra

Length of output: 89


🏁 Script executed:

# Check where interrupt object comes from and if it has nonce property
rg -n "interrupt\." frontend/src/hooks/useInterrupt.ts | head -30

Repository: ruska-ai/orchestra

Length of output: 44


🏁 Script executed:

# Search for usages of these helper functions - try without type filters
rg "createApproveRequest|createEditRequest|createRejectRequest|createRespondRequest" frontend/src --no-type-list

Repository: ruska-ai/orchestra

Length of output: 505


🏁 Script executed:

# Find the InterruptRequest and Interrupt type definitions
fd "interrupt" frontend/src/lib/entities --type f

Repository: ruska-ai/orchestra

Length of output: 100


🏁 Script executed:

# Check the interrupt entity file
cat frontend/src/lib/entities/interrupt.ts

Repository: ruska-ai/orchestra

Length of output: 4902


🏁 Script executed:

# Search for actual usage of these helper functions in the codebase
grep -r "createApproveRequest\|createEditRequest\|createRejectRequest\|createRespondRequest" frontend/src --include="*.ts" --include="*.tsx" -n

Repository: ruska-ai/orchestra

Length of output: 1622


🏁 Script executed:

# Let's also check if there are any other files that might import from useInterrupt
grep -r "useInterrupt" frontend/src --include="*.ts" --include="*.tsx" -n

Repository: ruska-ai/orchestra

Length of output: 528


🏁 Script executed:

# Let's look at the InterruptApprovalDialog component to see how these functions are called
cat -n frontend/src/components/modals/InterruptApprovalDialog.tsx | head -200

Repository: ruska-ai/orchestra

Length of output: 6931


Pass the interrupt's nonce to these helper functions instead of generating new ones.

Lines 192, 207, 222, and 237 call crypto.randomUUID(), which defeats replay protection and will fail if the backend starts validating that the request nonce matches the interrupt's nonce. Update each function to accept the interrupt's nonce as a parameter and pass it from the callsites in InterruptApprovalDialog.tsx (lines 131, 135, 138, 145).

πŸ› οΈ Proposed fix
-export function createApproveRequest(interruptId: string): InterruptRequest {
+export function createApproveRequest(
+	interruptId: string,
+	nonce: string,
+): InterruptRequest {
 	return {
 		interrupt_id: interruptId,
 		action: "approve",
-		nonce: crypto.randomUUID(),
+		nonce,
 	};
 }

-export function createEditRequest(
+export function createEditRequest(
 	interruptId: string,
 	editedArgs: Record<string, unknown>,
+	nonce: string,
 ): InterruptRequest {
 	return {
 		interrupt_id: interruptId,
 		action: "edit",
 		edited_args: editedArgs,
-		nonce: crypto.randomUUID(),
+		nonce,
 	};
 }

-export function createRejectRequest(
+export function createRejectRequest(
 	interruptId: string,
 	reason?: string,
+	nonce: string,
 ): InterruptRequest {
 	return {
 		interrupt_id: interruptId,
 		action: "reject",
 		reason: reason || "User rejected the tool call",
-		nonce: crypto.randomUUID(),
+		nonce,
 	};
 }

-export function createRespondRequest(
+export function createRespondRequest(
 	interruptId: string,
 	feedback: string,
+	nonce: string,
 ): InterruptRequest {
 	return {
 		interrupt_id: interruptId,
 		action: "respond",
 		reason: feedback,
-		nonce: crypto.randomUUID(),
+		nonce,
 	};
 }
πŸ€– Prompt for AI Agents
In `@frontend/src/hooks/useInterrupt.ts` around lines 188 - 238, The helper
functions createApproveRequest, createEditRequest, createRejectRequest, and
createRespondRequest currently generate new nonces with crypto.randomUUID(),
which breaks replay protection; change each function signature to accept a nonce
parameter (e.g., nonce: string) and use that value instead of calling
crypto.randomUUID(), then update the callsites in InterruptApprovalDialog.tsx to
pass the interrupt's existing nonce (interrupt.nonce) into those helpers when
building the InterruptRequest.

@ryaneggz ryaneggz added the invalid This doesn't seem right label Jan 19, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

invalid This doesn't seem right

Projects

None yet

Development

Successfully merging this pull request may close these issues.

FEATURE (High Priority): Add human in the loop

1 participant