[Bug] Messages sent with assign get lost intermittently. #20

@patricka3125

Bug: Assign flow intermittently loses messages due to missing post-creation readiness check

Description

The assign MCP tool intermittently loses the task message sent to a newly created agent terminal. The message is neither delivered to the agent's tmux pane nor stored in the inbox — it is silently lost.

Root Cause

In _assign_impl (src/cli_agent_orchestrator/mcp_server/server.py:474-492), the message is sent immediately after _create_terminal() returns, with no post-creation readiness verification or stabilization delay:

def _assign_impl(agent_profile, message, working_directory=None):
    terminal_id, _ = _create_terminal(agent_profile, working_directory)
    _send_direct_input_assign(terminal_id, message)  # Sent immediately — no wait

By contrast, the handoff flow (server.py:313-325) includes both:

  1. A status poll via wait_until_terminal_status() (120s timeout) confirming the terminal is IDLE through the API/database
  2. A 2-second stabilization sleep (asyncio.sleep(2)) before sending
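For reference, the two steps above (poll until ready, then pause) can be sketched as one small helper. This is only an illustration under assumptions: wait_for_status, get_status, and the clock/sleep parameters are hypothetical names, not the actual wait_until_terminal_status implementation, and the sketch is synchronous whereas the handoff flow uses asyncio.sleep.

```python
import time
from typing import Callable, Collection


def wait_for_status(
    get_status: Callable[[], str],   # hypothetical: returns the terminal's current status
    ready: Collection[str],          # statuses that count as "ready"
    timeout: float = 120.0,          # same timeout the handoff flow uses
    interval: float = 1.0,           # assumed poll interval
    settle: float = 2.0,             # stabilization delay after ready is seen
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until a ready status is seen, then pause for `settle` seconds.

    Returns False if no ready status is seen before `timeout` elapses.
    """
    deadline = clock() + timeout
    while True:
        if get_status() in ready:
            sleep(settle)  # give the TUI time to finish redrawing
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

The clock and sleep parameters exist only so the helper can be exercised in tests without real waiting.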

While _create_terminal() internally calls provider.initialize(), which waits for the CLI to reach IDLE (via local tmux pane content inspection), there is a timing gap between:

  • The provider reporting IDLE locally (tmux pane check)
  • The REST API response propagating back to the MCP server
  • The subsequent send_keys call arriving at the tmux pane

During this gap, the agent's TUI may redraw or transition state (e.g., Claude Code's Ink renderer updating), causing the pasted message to be lost or not processed.
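To make the suspected race concrete, here is a toy model of it. It is not the real tmux or Claude Code behavior: ToyPane, settled_at, and the 0.5 s redraw window are made-up values chosen only to show how input that arrives right after IDLE is reported can vanish without any error.

```python
from dataclasses import dataclass, field


@dataclass
class ToyPane:
    """Toy model: input arriving before `settled_at` is discarded by a redraw."""
    settled_at: float
    received: list = field(default_factory=list)

    def send_keys(self, text: str, now: float) -> None:
        if now >= self.settled_at:
            self.received.append(text)
        # Otherwise the input is dropped: no exception, nothing queued.


idle_reported_at = 10.0  # moment the provider sees the IDLE prompt
pane = ToyPane(settled_at=idle_reported_at + 0.5)  # assumed 0.5 s of further redraws

pane.send_keys("task A", now=idle_reported_at + 0.1)  # sent right away, as assign does
pane.send_keys("task B", now=idle_reported_at + 2.0)  # sent after a 2 s delay
print(pane.received)  # ['task B']
```

In this model the sender cannot tell the two calls apart, which matches the symptom: no error is returned, and the message is simply absent.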

Impact

  • Severity: High — task messages are silently lost with no error returned
  • Frequency: Intermittent (estimated ~10-20% of assign calls, more frequent under system load)
  • Affected flow: assign only; handoff is not affected due to its readiness checks

Expected Behavior

The assign flow should verify terminal readiness before sending the task message, similar to the handoff flow.

Proposed Fix

Add a post-creation readiness check and stabilization delay in _assign_impl, similar to the handoff flow:

async def _assign_impl(agent_profile, message, working_directory=None):  # async so asyncio.sleep can be awaited
    terminal_id, _ = _create_terminal(agent_profile, working_directory)

    # Verify terminal is ready (mirrors handoff flow)
    if not wait_until_terminal_status(
        terminal_id,
        {TerminalStatus.IDLE, TerminalStatus.COMPLETED},
        timeout=120.0,
    ):
        return {"success": False, "terminal_id": terminal_id,
                "message": f"Terminal {terminal_id} did not reach ready status"}

    await asyncio.sleep(2)  # Stabilization delay

    _send_direct_input_assign(terminal_id, message)

Note: _assign_impl is currently synchronous, so it would need to become async to await asyncio.sleep. A simpler alternative is time.sleep(2), since assign is non-blocking from the caller's perspective.
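A rough sketch of the synchronous time.sleep variant follows. The collaborators are passed in as parameters so the snippet runs on its own; in the real code they would be _create_terminal, wait_until_terminal_status, and _send_direct_input_assign. The function name, the READY_STATUSES strings, and the shape of the success return value are assumptions for illustration, not taken from server.py.

```python
import time
from typing import Callable, Optional

# Stand-in for {TerminalStatus.IDLE, TerminalStatus.COMPLETED}; the values are assumed.
READY_STATUSES = {"idle", "completed"}


def assign_with_readiness_check(
    agent_profile: str,
    message: str,
    create_terminal: Callable[[str, Optional[str]], tuple],
    wait_until_ready: Callable[..., bool],
    send_input: Callable[[str, str], None],
    working_directory: Optional[str] = None,
    settle_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    terminal_id, _ = create_terminal(agent_profile, working_directory)

    # Verify the terminal is ready (mirrors the handoff flow).
    if not wait_until_ready(terminal_id, READY_STATUSES, timeout=120.0):
        return {
            "success": False,
            "terminal_id": terminal_id,
            "message": f"Terminal {terminal_id} did not reach ready status",
        }

    sleep(settle_seconds)  # stabilization delay; blocking, no event loop needed
    send_input(terminal_id, message)

    # Assumed success shape, chosen to match the failure dict above.
    return {"success": True, "terminal_id": terminal_id, "message": "Task sent"}
```

The point of this variant is that a readiness failure comes back as an explicit error result rather than a silently dropped message.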

References

  • Handoff readiness check: server.py:313-325
  • Assign implementation: server.py:474-492
  • Provider initialization: claude_code.py:226-256
  • Terminal creation: terminal_service.py:94-207
  • Tmux send_keys: tmux.py:193-246
