feat(cockpit): native ACP rendering surface (Beta) for all supported agents by njbrake · Pull Request #868 · njbrake/agent-of-empires

njbrake · 2026-04-30T20:31:21Z

Description

Adds cockpit, an ACP-based structured rendering surface that runs alongside the existing tmux passthrough. Every aoe session is now a per-session pick: tmux (legacy, raw bytes through wterm) or cockpit (Beta, agent speaks Agent Client Protocol; aoe renders typed events as React cards). Tmux remains the default — cockpit is opt-in via aoe add --cockpit or the new substrate picker on the web wizard.

The data model (cockpit_mode: bool per session) is already merged on main; this branch is the polish + ecosystem expansion that turns cockpit into something you'd actually want to use, plus per-tool ACP support so it's not Claude-only.

What's in here

Core substrate (was already on the branch from earlier merges):

Per-session ACP supervisor (src/cockpit/supervisor.rs) with restart budget, drain task, fs/terminal handlers
Replay-buffered WebSocket fanout (src/server/cockpit_ws.rs)
React surface built on @assistant-ui/react primitives
Sandbox support via unix-socket transport

Per-tool ACP (e4ad824): verified each agent's ACP invocation against agentclientprotocol.com/get-started/agents.md and seeded the registry:

| Tool | Path |
| --- | --- |
| `claude` | `claude-agent-acp` (Zed adapter) |
| `opencode` | `opencode acp` (native, SST) |
| `gemini` | `gemini --acp` (native, Google) |
| `codex` | `codex-acp` (Zed adapter) |
| `vibe` | `vibe-acp` (native, Mistral) |
| `pi` | `pi-acp` (Hermes coding agent) |
| `aoe-agent` | bundled multi-provider fallback |
| aider, cursor, copilot, droid, settl, hermes | greyed out (terminal-only) |

Supervisor::pick_agent_for_tool(tool, override) replaces three copy-pasted "claude → claude-code, else aoe-agent" fallbacks.
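
As a rough illustration of what a single lookup with an override can look like, here is a minimal sketch built only from the table above. The function name `pick_command_for_tool`, its return type, the rule that an override always wins, and the fallback to `aoe-agent` for unlisted tools are all assumptions for illustration; the real `Supervisor::pick_agent_for_tool` signature is not shown in this PR description.

```rust
/// Spawn commands copied from the table in this description.
const ACP_COMMANDS: &[(&str, &str)] = &[
    ("claude", "claude-agent-acp"),
    ("opencode", "opencode acp"),
    ("gemini", "gemini --acp"),
    ("codex", "codex-acp"),
    ("vibe", "vibe-acp"),
    ("pi", "pi-acp"),
];

/// Tools the table lists as terminal-only (no ACP support).
const TERMINAL_ONLY: &[&str] = &["aider", "cursor", "copilot", "droid", "settl", "hermes"];

/// Assumed fallback for tools that are neither listed nor terminal-only.
const FALLBACK: &str = "aoe-agent";

/// Hypothetical lookup: an explicit override wins, terminal-only tools yield
/// `None`, known tools map to their table entry, anything else falls back.
pub fn pick_command_for_tool(tool: &str, override_cmd: Option<&str>) -> Option<String> {
    if let Some(cmd) = override_cmd {
        return Some(cmd.to_string());
    }
    if TERMINAL_ONLY.contains(&tool) {
        return None;
    }
    let cmd = ACP_COMMANDS
        .iter()
        .find(|(name, _)| *name == tool)
        .map(|(_, cmd)| *cmd)
        .unwrap_or(FALLBACK);
    Some(cmd.to_string())
}
```

Keeping the mapping in one table-driven function is what removes the need for per-call-site fallbacks.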

UX polish (56a8822 → dd249aa):

Composer rebuilt VSCode/Cursor-style — multi-line, lucide icons, focus glow, paper-plane Send / square Stop
Markdown rendering with shiki code blocks + smooth streaming (@assistant-ui/react-markdown)
Per-kind tool cards: bash / read / edit / search / fetch / think with proper input parsing
react-diff-viewer-continued for edit cards
@-mention file picker (assistant-ui's Unstable_TriggerPopover + workspace file index endpoint)
/ slash commands
Empire-themed working spinner ("Conscripting villagers", braille rattle)
Hover affordances (copy/edit/regenerate via ActionBarPrimitive)
Approval cards realigned with tool-card visual language
Mode picker: real ACP-advertised modes from NewSessionResponse.modes, drop-up menu in composer footer

Reliability fixes:

3cccf46 — drain-task/send_prompt deadlock (drain held client mutex across recv().await)
28e8066 + 30a21f8 — TUI no longer marks cockpit sessions as errored ("tmux pane is gone")
d244ad6 — composer wins focus race against wterm's async init
c1bb7e0 — auth env forwarded by default (ANTHROPIC_API_KEY, CLAUDE_CONFIG_DIR, etc.)

Library refactor (f1298bc): replaced ~520 lines of reinvention with first-party assistant-ui primitives (TriggerPopover, MarkdownTextPrimitive, smooth streaming).

Doctor + docs:

aoe cockpit doctor walks the full registry and prints per-agent install hints; --fix runs npm install -g for the npm-distributed adapters
docs/cockpit.md gets a Beta callout, per-agent support/auth table, and updated doctor sample

PR Type

New Feature

Checklist

I understand the code I am submitting
New and existing tests pass (5/5 cockpit_acp_smoke, 13/13 e2e Playwright steps)
Documentation was updated (docs/cockpit.md)
For UI changes: included screenshot or recording (see screenshots in commits)

AI Usage

AI was used for drafting/refactoring

AI Model/Tool used: Claude Opus 4.7 via Claude Code

Any Additional AI Details you'd like to share:
The branch was developed iteratively across many conversations. Architecture decisions (ACP as substrate B, per-session toggle, supervisor ownership of agent processes, drain-task pattern, assistant-ui for the React surface) were human-directed; the AI handled implementation, testing, and UX iteration. Notable AI-caught issues that humans would have caught later: the deadlock in Supervisor's drain task, the focus race against wterm's async init, the missing title fallback for tools with empty raw_input. Notable AI-missed issues that humans caught: the mode picker initially said "Default" when the agent was actually in yolo mode (we weren't reading agent-advertised modes), the bash card showed $ {} for the same reason, and the early version reinvented Unstable_TriggerPopover + MarkdownTextPrimitive instead of using the assistant-ui primitives that were already in our deps.

I am an AI Agent filling out this form (check box if true)

How to build & run

cargo build --features serve --profile dev-release
./target/dev-release/aoe add . --cmd claude --cockpit   # one cockpit session
./target/dev-release/aoe serve                          # web dashboard

Note: the cockpit cargo feature was folded into serve (commit
ec5cfb0). If you saw earlier instructions saying --features 'serve cockpit',
just use --features serve now. Cockpit ships alongside the dashboard.

How tested

cargo test --features serve --test cockpit_acp_smoke   # 5/5 pass
cargo build --features serve --profile dev-release
cargo clippy --features serve -- -D warnings   # clean

Plus a Playwright e2e harness against a live aoe serve with a Node ACP test shim:

13 steps: session create → cockpit composer mounts → user prompt → agent text streams → tool call card renders → final "done" → REQUEST_PERMISSION flow → Allow click → permission_outcome=yes
Verified the @ file picker + / slash command popovers
Verified the working spinner with verb cycling
Verified focus reclaim after wterm async init

Caveats

The five non-Claude agents (opencode/gemini/codex/vibe/pi) were verified at the documentation level (matching agentclientprotocol.com/agents.md against upstream docs) and via the supervisor/registry tests, but I didn't exercise each adapter end-to-end on real hardware in this branch; the smoke and e2e tests use the shim. The first time you run aoe add . --cmd opencode --cockpit you may hit per-agent quirks.
aoe cockpit doctor only checks binary presence, not auth state. A future improvement would spawn each adapter's initialize and inspect the response's auth_methods.
The React surface depends on one Unstable_* primitive from assistant-ui (Unstable_TriggerPopover); if upstream renames it on a minor bump we'd need a mechanical update.

Test plan

claude /login then aoe add . --cmd claude --cockpit — verify cockpit conversation works end-to-end with a real Claude subscription
aoe add . --cmd opencode --cockpit — verify the registry expansion picks up opencode acp correctly
aoe cockpit doctor --fix — verify it installs the missing adapters
Web wizard — verify the substrate picker greys out for tools we know don't have ACP
TUI — verify no spurious "tmux pane is gone" errors on cockpit sessions

🤖 Generated with Claude Code

Adds the cockpit feature behind a Cargo flag. Implements the ACP client spine: subprocess spawn, JSON-RPC handshake, session creation, and prompt loop, plus the typed state/approval/replay-buffer modules from the v4 design. Validated end-to-end by a Node ACP shim agent that replays scripted session/update events.

Deferred to follow-up slices:
- Permission responder side-channel (currently auto-approves yolo-style)
- Typed mapping of session/update kinds to CockpitState fields
- AcpClient hooking fs/* and terminal/* into existing handlers
- aoe-agent tool stubs that delegate via ACP
- Settings TUI wiring, CLI commands, migration, WebSocket fanout
- React components, push notifications, Docker socket transport, docs

Tests: 22 cockpit unit + 1 e2e integration, 1094 existing tests still pass.
Build: cockpit feature opt-in; default build unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Replace the yolo auto-approve with a proper responder side-channel: on_receive_request parks the ACP responder keyed by a server-side nonce; resolve_permission(nonce, decision) wakes the parked future and answers with the matching option_id from the agent's offered options. Map ACP SessionUpdate variants to typed CockpitState Event variants (AgentMessageChunk, ToolCallStarted, ToolCallCompleted, PlanUpdated, ModeChanged) instead of passing everything through as RawAgentUpdate. Add a permission round-trip e2e test against the test shim agent. Tests: 26 cockpit unit + 2 e2e integration, 1094 existing tests still pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
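
The park-and-resolve pattern described in the commit above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the cockpit's code: `PendingPermissions`, `PermissionOption`, and the `allows` flag are hypothetical names, a counter stands in for the server-side nonce, and a blocking `std::sync::mpsc` channel stands in for the parked async responder.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Receiver, Sender};

/// One option offered by the agent (hypothetical shape).
pub struct PermissionOption {
    pub option_id: String,
    pub allows: bool,
}

/// Parked permission requests keyed by nonce.
#[derive(Default)]
pub struct PendingPermissions {
    next_nonce: u64,
    parked: HashMap<u64, (Vec<PermissionOption>, Sender<String>)>,
}

impl PendingPermissions {
    pub fn new() -> Self {
        Self::default()
    }

    /// Park a request; the receiver yields the chosen option_id once resolved.
    /// A real nonce should be unguessable; a counter keeps the sketch short.
    pub fn park(&mut self, options: Vec<PermissionOption>) -> (u64, Receiver<String>) {
        let nonce = self.next_nonce;
        self.next_nonce += 1;
        let (tx, rx) = channel();
        self.parked.insert(nonce, (options, tx));
        (nonce, rx)
    }

    /// Wake the parked request with the first offered option matching the
    /// decision. Returns false for an unknown nonce or when nothing matches.
    pub fn resolve_permission(&mut self, nonce: u64, allow: bool) -> bool {
        let Some((options, tx)) = self.parked.remove(&nonce) else {
            return false;
        };
        match options.into_iter().find(|o| o.allows == allow) {
            Some(opt) => tx.send(opt.option_id).is_ok(),
            None => false,
        }
    }
}
```

Removing the entry on resolve means a nonce can be answered at most once, so a replayed approval is rejected.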

Hook the cockpit's FsPolicy + TerminalManager into the ACP client's incoming-request callbacks. Now agents can issue fs/read_text_file, fs/write_text_file, terminal/create, terminal/output, terminal/wait, terminal/kill, terminal/release and aoe handles them with sandbox enforcement (worktree-rooted FsPolicy). Update aoe-agent to declare Read/Write/Bash tools via Vercel AI SDK 6 whose execute() bodies delegate back to aoe over ACP. The model never touches the filesystem or shell directly. Declare client capabilities (fs.readTextFile, fs.writeTextFile, terminal) in the ACP initialize so agents know they can use them. Tests: 26 cockpit unit + 4 e2e integration (added fs + terminal round-trip tests against the shim). 1094 existing tests still pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Add migration v005_cockpit_defaults: seeds [cockpit] section in the global config.toml on upgrade so users can flip the flag on without hand-editing.
* Add CockpitConfig struct to session::config with 9 documented fields matching the v4 design doc: enabled, default_for_claude, default_agent, approval_timeout_secs, destructive_require_double_confirm, max_concurrent_workers, replay_events, replay_bytes, node_path. All with serde defaults; loadable from config.toml.
* Add `aoe add` flags: --cockpit, --no-cockpit, --agent <name>, --model <id>. The first two are mutually exclusive; the agent flag implies cockpit.
* Add `aoe cockpit` subcommand with:
  - doctor [--json] [--fix]: checks Node runtime + each configured agent's spawn command. Exits 0/1/2 for ok/fail/partial.
  - agents: lists the registry with present/missing markers.
  - logs/restart: stubs reserved for the worker supervisor slice.

Full settings TUI editing wiring (FieldKey + build_*_fields + merge logic across 9 fields × 5 touchpoints) is deferred to a follow-up; config loads cleanly via serde defaults today.

Tests: 1263 lib tests + 4 e2e + 5 cockpit-acp integration all green; 1095 default-feature tests still pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Add CockpitBroadcastFrame (session_id + seq + event JSON) and a per-AppState broadcast::Sender<CockpitBroadcastFrame> with a 256-event capacity. Behaves like the existing status_tx fanout. * New WebSocket route /sessions/{id}/cockpit/ws (gated on cockpit feature). Subscribes to the broadcast and forwards frames matching the route session_id; emits a `lagged` notice frame so clients can request a snapshot+replay rather than diverge silently. * trigger_approval_push() helper that fires a Web Push payload to all subscribers when an ApprovalRequested event is observed. Reuses the existing PushState + push_send infrastructure. Wired so the worker supervisor (next slice) can call it without further plumbing. * Refactored build_router to use a let-bound chain so the cockpit route can be conditionally added under #[cfg(feature = "cockpit")]. Tests: lib tests now 1264 (up from 1263) with the cockpit-ws unit test guarding publish-with-no-receivers behavior. 1095 default tests still pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Adds src/cockpit/node.rs with the documented resolve order:

1. AOE_COCKPIT_NODE env
2. cockpit.node_path setting
3. node on PATH (>= 20 enforced)
4. previously-extracted bundled Node at $AOE_DATA_DIR/cockpit/node-vX

Tarball download is stubbed with a typed NotYetWired error so the cockpit doctor can surface a clear "install Node yourself for now" message until the auto-download lands in a follow-up.

Docker unix-socket transport for sandboxed cockpit sessions is also deferred: the architecture supports it (acp_client takes a generic ByteStreams) but the spawn path needs sandbox-aware plumbing that's its own slice.

Tests: 4 new unit tests covering env/PATH/bundled paths, including a serial pair that scrubs+restores PATH/AOE_COCKPIT_NODE.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
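
The four-step resolve order in the commit above can be expressed as a pure function over pre-gathered candidates. This is a minimal sketch, not the contents of src/cockpit/node.rs: `NodeCandidates`, `NodeSource`, and `resolve_node` are hypothetical names, and applying the version floor only to the PATH candidate is an assumption read off the "(>= 20 enforced)" note on that step.

```rust
use std::path::PathBuf;

/// Minimum Node major version; the commit notes ">= 20 enforced" on the PATH step.
pub const MIN_NODE_MAJOR: u32 = 20;

#[derive(Debug, PartialEq, Eq)]
pub enum NodeSource {
    EnvOverride,
    Setting,
    Path,
    Bundled,
}

/// Candidates gathered by the caller (hypothetical struct for illustration).
#[derive(Default)]
pub struct NodeCandidates {
    pub env_override: Option<PathBuf>,   // AOE_COCKPIT_NODE
    pub setting: Option<PathBuf>,        // cockpit.node_path
    pub on_path: Option<(PathBuf, u32)>, // node found on PATH and its major version
    pub bundled: Option<PathBuf>,        // $AOE_DATA_DIR/cockpit/node-vX
}

/// Walk the candidates in the documented order and return the first usable one.
pub fn resolve_node(c: &NodeCandidates) -> Option<(NodeSource, PathBuf)> {
    if let Some(p) = &c.env_override {
        return Some((NodeSource::EnvOverride, p.clone()));
    }
    if let Some(p) = &c.setting {
        return Some((NodeSource::Setting, p.clone()));
    }
    if let Some((p, major)) = &c.on_path {
        // A too-old Node on PATH is skipped rather than treated as an error.
        if *major >= MIN_NODE_MAJOR {
            return Some((NodeSource::Path, p.clone()));
        }
    }
    c.bundled.as_ref().map(|p| (NodeSource::Bundled, p.clone()))
}
```

Separating "gather" from "decide" keeps the ordering testable without touching the process environment, which sidesteps the need to scrub and restore PATH in tests of the ordering itself.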

* lib/cockpitTypes.ts: typed wire model mirroring CockpitBroadcastFrame + a pure reducer applyEvent() that materialises CockpitState from a stream of frames. Bounded activity log (200 rows) and recentDiffs (16).
* hooks/useCockpit.ts: WebSocket subscription to /sessions/{id}/cockpit/ws, dispatched through a useReducer; lagged control frames flag the state so the UI can request a snapshot. resolveApproval helper POSTs decisions to a REST endpoint that the worker supervisor will wire up.
* components/cockpit/ApprovalCard: 3 phases (pending / submitting / rolled-back), destructive-vs-benign affordance per the design spike (long-press 800ms with progress ring + haptic for destructive; single tap for benign). Swipe never approves.
* components/cockpit/PlanPanel: sticky current step, collapsed completed disclosure, expanded upcoming. Cancelled steps are rendered as struck-through.
* components/cockpit/ActivityStream: tool rows with kind glyphs + colours (start=amber, complete=emerald, error=red, message=teal), thinking/in-flight chrome.
* components/cockpit/CockpitView: top-level mobile-first layout composing the above plus connection chrome (connecting / lagged / closed banners) and the rate-limit notice.

Type-checks pass; Vite production bundle builds clean. Mobile-vs-desktop layout split (3-pane on >=768px) + ChatDrawer + push-tap deep-linking are deferred polish; production wiring of the REST endpoint ships with the worker supervisor.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs/cockpit.md follows the 10-section outline from the DX review: what cockpit is, quickstart, requirements, verify (doctor), enabling per-session + globally, escape hatches, tool compatibility matrix, approvals UX, security, troubleshooting, deferred items. * website/scripts/sync-docs.mjs: register docs/cockpit.md in PAGES + URL_MAP so the nav link resolves on agent-of-empires.com. * website/src/data/docsNav.ts: link the new page under Guides. The upgrade messaging story is covered today by: - v005 migration silently seeds [cockpit] section in config.toml - aoe cockpit doctor is discoverable via aoe --help - docs/cockpit.md is the canonical reference Explicit first-run TUI card is deferred to a follow-up; the doctor command serves the same affordance and is already wired. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* src/cockpit/supervisor.rs: per-aoe-process Supervisor that owns AcpClients keyed by session_id. Spawn/shutdown lifecycle, drain task bridges client events to a BroadcastSink, restart-budget bookkeeping (3 restarts in 60s window before parking the session in Status::Error). ChannelSink impl publishes to AppState::cockpit_events_tx and fires approval-side hooks.
* src/server/api/cockpit.rs: REST endpoints
  - POST /api/sessions/{id}/cockpit/spawn (start a worker)
  - DELETE /api/sessions/{id}/cockpit (shutdown)
  - POST /api/sessions/{id}/cockpit/prompt (send user input)
  - POST /api/sessions/{id}/cockpit/approvals/{nonce} (resolve approval)
* AppState gets cockpit_supervisor: Arc<Supervisor<ChannelSink>>; the router wires the new routes under #[cfg(feature = "cockpit")].
* Instance gains cockpit_mode + cockpit_agent + cockpit_model fields, hidden from serde when default. aoe add --cockpit/--no-cockpit/--agent/--model now flow through into Instance.

Tests: 4 supervisor unit tests (spawn-unknown-agent, double-spawn, count, restart budget) + 4 e2e + 1273 lib tests with cockpit feature on. 1095 default-feature tests still pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
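
The restart-budget bookkeeping mentioned in the supervisor commit above (3 restarts in a 60s window) amounts to a sliding-window counter. The sketch below is one plausible shape for it; `RestartBudget` and `try_restart` are hypothetical names, and whether the real supervisor uses a sliding or fixed window is an assumption.

```rust
use std::collections::VecDeque;
use std::time::{Duration, Instant};

/// Sliding-window restart budget (hypothetical type for illustration).
pub struct RestartBudget {
    max_restarts: usize,
    window: Duration,
    recent: VecDeque<Instant>,
}

impl RestartBudget {
    pub fn new(max_restarts: usize, window: Duration) -> Self {
        Self { max_restarts, window, recent: VecDeque::new() }
    }

    /// Record a restart attempt at `now`. Returns true if the restart is
    /// allowed, false if the budget is exhausted and the session should be
    /// parked in an error state instead.
    pub fn try_restart(&mut self, now: Instant) -> bool {
        // Forget restarts that have aged out of the window.
        while let Some(&oldest) = self.recent.front() {
            if now.duration_since(oldest) >= self.window {
                self.recent.pop_front();
            } else {
                break;
            }
        }
        if self.recent.len() >= self.max_restarts {
            return false;
        }
        self.recent.push_back(now);
        true
    }
}
```

Passing `now` in as a parameter, rather than calling `Instant::now()` inside, lets a unit test drive the clock deterministically.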

* Add tar + xz2 deps under the cockpit feature for tarball extraction.
* node::download() resolves the host platform (linux x64/arm64, macOS x64/arm64; Windows is intentionally unsupported because it ships .zip), fetches the pinned Node 22.21.0 tarball from nodejs.org/dist, verifies SHA-256 against an embedded table, and extracts to $AOE_DATA_DIR/cockpit/node-vX.Y.Z/.
* Pinned SHAs come straight from nodejs.org's SHASUMS256.txt; bumping PINNED_NODE_VERSION requires refreshing every entry. A unit test enforces that all four supported platforms are covered.
* aoe cockpit doctor --fix now triggers the download when no usable Node is on PATH. The CLI command is now async so it can await the fetch + extract.

Tests: 6 node unit tests (was 4; added sha256_hex against the empty-string vector + a coverage check on the SHA table).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Wires every cockpit setting through the documented FieldKey + override + merge pipeline so they're editable in the settings TUI with profile overrides that round-trip correctly.

* CockpitConfigOverride struct in profile_config with Option<T> for every field; merge_configs honors each override.
* New SettingsCategory::Cockpit; build_cockpit_fields renders all 9 fields (3 bool, 4 number, 2 text) with inheritance markers.
* apply_field_to_global covers each field; apply_field_to_profile uses the existing set_profile_override helper.
* clear_profile_override sets each Option to None when the user hits the 'r' key.
* Re-export CockpitConfigOverride from session::mod.

Tests: 1274 lib tests with cockpit feature on (was 1268, +6 from the config + node module additions). 1095 default-feature tests still pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* SpawnConfig.socket_path option: when set, aoe binds a unix listener at that path BEFORE spawning the agent, exports AOE_ACP_SOCKET=<path> to the agent's env, waits up to 10s for the agent to connect, and uses the connected UnixStream's split halves as the ByteStreams transport. On task exit the socket file is unlinked.
* run_connection_task is now generic over <W: AsyncWrite, R: AsyncRead> so the same body handles stdio (ChildStdin/ChildStdout) and socket (UnixStream split halves). socket_path is also threaded in for cleanup.
* test-shim honors AOE_ACP_SOCKET: connects to the socket and uses it as the ndJsonStream transport. Falls back to stdio when unset.
* New e2e test shim_agent_round_trips_via_unix_socket exercises the full round-trip end-to-end: aoe creates the socket, spawns the shim, shim connects, prompt + session/update flow back. Same shape as the stdio path.

Tests: 5 cockpit e2e tests (was 4); 1274 lib tests; 1095 default tests.

Docker bind-mount integration (one -v line in src/containers/runtime_base.rs for sandboxed cockpit sessions) lands when the cockpit session type is wired into the sandbox spawn path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* CockpitView is now responsive via a useIsDesktop matchMedia hook: mobile (<768px) renders single-column stack with a chat drawer FAB; desktop (>=768px) renders three-pane (plan left 300px, activity center, chat dock right 360px). * ChatDrawer component supports both variants: - mobile: bottom-anchored sheet with FAB to open/close, slides from bottom; close button visible - desktop: always-docked column on the right Enter sends, Shift+Enter for newline, optimistic disable while sending, plain hover/focus styling matching the cockpit palette. * useCockpit gains sendPrompt(text) helper that POSTs to /api/sessions/{id}/cockpit/prompt and forwards through to the worker supervisor. * Approval and connection chrome moves to a top header so it overlays on mobile but inlines on desktop. Type-checks pass; Vite production bundle builds clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Add AppStateConfig.has_seen_cockpit_intro (cfg-gated on cockpit). Tracked separately from has_seen_welcome / last_seen_version so the one-time intro fires once, regardless of which version actually introduced cockpit on the user's machine. * New CockpitIntroDialog: 70x18 centered modal with the quickstart command, doctor command, docs URL, and a quiet note about the Node prereq. Same key handling as the existing welcome dialog (Enter / Esc / Space / q to dismiss). * Wired into HomeView like the existing one-time dialogs: - cockpit_intro_dialog: Option<CockpitIntroDialog> field - show_cockpit_intro() helper - input.rs dispatch - render.rs dispatch (cfg-gated branch after the macro for the other dialogs since the macro is shared with non-cockpit builds) * App::new fires it after the welcome+changelog flow when the flag is unset, then persists the flag. Tests: 1274 lib tests, 5 cockpit e2e all pass; 1095 default tests unaffected. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* web/App.tsx: dispatch on activeSession.cockpit_mode: sessions with the flag render <CockpitView/> in place of <TerminalView/>. The fallback for tmux-mode sessions is unchanged so existing terminal sessions keep working exactly as before.
* SessionResponse gains cockpit_mode (gated on the cockpit feature server-side; optional in the TS shape so non-cockpit builds still satisfy the type).
* CreateSessionBody learns cockpit_mode (defaults TRUE via default_cockpit_for_web so browser-created sessions land in the cockpit by default), cockpit_agent, cockpit_model. The fields flow through into the constructed Instance.
* Cockpit-mode sessions skip tmux start(): no empty pane is created for sessions whose backend is the ACP supervisor.
* After a successful create, if the session is cockpit_mode, kick off Supervisor::spawn() on a background task. claude tool defaults to the claude-code agent; everything else defaults to aoe-agent. Spawn failures (missing Node, etc.) log a warning but don't fail the request; the user can retry via the cockpit/spawn endpoint.
* aoe serve startup now sweeps persisted instances with cockpit_mode and spawns workers for them too. Same best-effort semantics; happens in parallel so a slow agent doesn't block the listener bind.

TUI default behavior is unchanged: NewSessionData doesn't set cockpit_mode so it defaults false and `n` continues to create tmux-backed sessions. Users opt in via aoe add --cockpit from the CLI. A visible toggle in the new-session dialog is a small UI follow-up.

Tests: 1299 cockpit lib tests, 5 e2e, 1116 default. Clippy clean. Auto-formatted by cargo fmt as part of the precommit hook.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The first cockpit view reinvented the layout with its own three-pane split, which collided with the app shell — <ContentSplit> in App.tsx already owns the workspace sidebar (left) and terminal/diff (right). The cockpit's job is just the middle pane, like Conductor's chat window. * CockpitView now renders a single scrollable conversation: - Optional plan strip pinned at the top, click to expand the steps. - Message-style cells: user prompts as right-aligned bubbles, agent text as full-width prose. Consecutive agent_message_chunk events fuse into one bubble. - Tool calls render INLINE as collapsible cards (status dot + one-line summary, click to reveal output). - Pending approvals appear inline at the bottom of the feed. - Thinking indicator as a small italic bubble. - Input area pinned at the bottom (auto-grow, Enter sends, Shift+Enter newline). - System notices (connecting / lagged / rate-limited) as a thin bar above the feed. - Auto-stick-to-bottom unless the user scrolled up >80px. * useCockpit::sendPrompt now also dispatches a `user_prompt` action that appends a user-side ActivityRow so the user's outgoing turns appear in the conversation timeline. ActivityRow.kind gains `user_prompt`. * Drop the now-stale subcomponents: ChatDrawer (replaced by inline Composer), PlanPanel (replaced by PlanStrip header), ActivityStream (replaced by ConversationFeed). * Drop useIsDesktop / 3-pane split entirely. Tests: 1299 lib + 5 e2e, type-check + Vite build clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…ssions Two related cockpit bugs: - ACP spawn could hang silently (e.g. `npx -y` downloading on first run). Gate spawn() on a 30s handshake deadline; on timeout, kill the wedged child and publish a new AgentStartupError event end-to-end (broadcast -> WS -> reducer -> red banner with `aoe cockpit doctor --fix` hint). Default `claude-code` agent now uses the installed `claude-agent-acp` binary instead of `npx -y`; `doctor --fix` runs the global npm install. - Cockpit-mode sessions were polled like tmux sessions and surfaced a spurious "tmux session is gone" Error. Short-circuit update_status_with_metadata_inner for cockpit_mode and clear any stale error state. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

# Conflicts: # Cargo.lock # src/server/api/sessions.rs # web/src/lib/types.ts

The supervisor's drain task held client.lock() across next_event().await, which blocks indefinitely waiting on the inbound mpsc. Any concurrent send_prompt (or any other Supervisor method) tried to acquire the same mutex and hung forever, so the very first prompt from the web UI never made it past the API layer. Move the inbound mpsc::Receiver out of AcpClient (now Option<...>) when the supervisor builds the worker, and let the drain task own the receiver directly. The mutex now only guards the cmd_tx side, which is fine because that side never await-blocks past a channel send. Found by an e2e test of the cockpit UI; verified by reusing the existing cockpit_acp_smoke tests (5/5 still pass) plus a Playwright run that sends two prompts and resolves an approval through the real React surface. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
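
The shape of that fix can be shown in miniature. The sketch below swaps in `std::thread` and blocking `std::sync::mpsc` channels for the async runtime the supervisor actually uses, so it illustrates the ownership change only; `Client`, `spawn_worker`, and `send_prompt` are hypothetical stand-ins, not the real AcpClient API.

```rust
use std::sync::mpsc::{Receiver, Sender};
use std::sync::{Arc, Mutex};
use std::thread;

/// Stand-in for the client: a command sender plus an inbound event receiver
/// that can be moved out exactly once.
pub struct Client {
    pub cmd_tx: Sender<String>,
    pub events_rx: Option<Receiver<String>>,
}

/// Build a worker: the drain thread takes ownership of the receiver, so it
/// never needs the mutex while it blocks waiting for the next event.
pub fn spawn_worker(
    mut client: Client,
    sink: Sender<String>,
) -> (Arc<Mutex<Client>>, thread::JoinHandle<()>) {
    let events_rx = client.events_rx.take().expect("receiver already taken");
    let shared = Arc::new(Mutex::new(client));
    let drain = thread::spawn(move || {
        // Blocks on the inbound channel without holding any lock.
        for event in events_rx {
            if sink.send(event).is_err() {
                break;
            }
        }
    });
    (shared, drain)
}

/// The mutex now only guards the command side, which never blocks for long.
pub fn send_prompt(client: &Arc<Mutex<Client>>, text: &str) -> bool {
    client.lock().unwrap().cmd_tx.send(text.to_string()).is_ok()
}
```

In the buggy arrangement the drain loop would have locked the client before waiting on the receiver, so `send_prompt` could never acquire the lock; here the two paths share no lock at all while waiting.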

Four small fixes that together make the first-run cockpit experience not feel broken:

- Remove the one-time "New: Cockpit (Native Agent Rendering)" TUI popup and its has_seen_cockpit_intro tracking. Discoverability lives in the docs and the `aoe cockpit` subcommand; we don't need an extra dialog on every first launch.
- acp_client.spawn_subprocess now also forwards ANTHROPIC_API_KEY, ANTHROPIC_AUTH_TOKEN, CLAUDE_CODE_OAUTH_TOKEN, and CLAUDE_CONFIG_DIR by default. Without this, users who already have ANTHROPIC_API_KEY exported (the common case) hit "Authentication required" because the agent inherits an env_clear()'d environment and can't see the key.
- StartupErrorBanner branches on the error message: when the failure is auth-shaped (matches /authentic|login|api[_ -]?key/i), show "set ANTHROPIC_API_KEY or run claude /login" instead of the install-the-adapter hint, which is misleading once the binary is already on PATH.
- CockpitView and ApprovalCard now use the shared design tokens (surface-*, text-*, brand-*) instead of Tailwind's blue-tinted slate and amber palettes, matching the surrounding app shell.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
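
The auth-shaped check in the StartupErrorBanner item above is a case-insensitive regex in the web code. As a hedged illustration of the same branching, here is a std-only Rust port that enumerates the four spellings `api[_ -]?key` can match; `is_auth_shaped` and `startup_hint` are hypothetical names, and the non-auth hint text is an assumption based on the `aoe cockpit doctor --fix` hint mentioned elsewhere in this PR.

```rust
/// Returns true when the message matches /authentic|login|api[_ -]?key/i,
/// spelled out without a regex dependency.
pub fn is_auth_shaped(message: &str) -> bool {
    let m = message.to_lowercase();
    // api[_ -]?key expands to exactly these four literal spellings.
    let api_key_spellings = ["apikey", "api_key", "api key", "api-key"];
    m.contains("authentic")
        || m.contains("login")
        || api_key_spellings.iter().any(|p| m.contains(*p))
}

/// Pick the banner hint for a startup failure (hint wording is illustrative).
pub fn startup_hint(message: &str) -> &'static str {
    if is_auth_shaped(message) {
        "set ANTHROPIC_API_KEY or run claude /login"
    } else {
        "run aoe cockpit doctor --fix"
    }
}
```

Matching on message text is a heuristic; a typed error code from the backend would be sturdier if one becomes available.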

…polish) Brings the cockpit conversation surface in line with Cursor / VSCode agent chat patterns. All seven items from the UX research land here: 1. Agent text now renders as markdown (paragraphs, headings, lists, blockquotes, fenced code blocks with shiki syntax highlighting, inline code, links). New `Markdown.tsx` parses with `marked` and renders each token through tailwind-styled React components so colors come from the design tokens, not the library defaults. 2. Per-tool renderers in `ToolCards.tsx`, dispatched by the new `kind` field on `ToolCall` (plumbed from ACP `ToolKind`): - `execute` shows a `$ command` line with collapsible body - `read` shows file path + optional line range - `edit` / `delete` show a mini-diff with `-` / `+` styling - `search` shows the query (and scope when known) - `fetch` shows the URL - `think` is a one-line italic note - `other` falls back to a generic expandable card 3. Vertical rhythm: a soft horizontal divider above each user turn (except the first), tighter spacing for tool→tool runs, more breathing room for agent text. 4. Stop button: when the agent is thinking or has a tool in flight, the Send button is replaced by a stop-square that POSTs to a new `/api/sessions/{id}/cockpit/cancel` endpoint. The endpoint sends an ACP `session/cancel` notification via a new `ClientCmd::Cancel` and `AcpClient::cancel_prompt`. 5. Empty state: three starter prompt chips ("Explain this codebase" etc.) replace the bare "type a prompt" placeholder. Clicking a chip sends the prompt immediately. 6. Refined user bubble: smaller right-aligned chip with a rounded-br corner, subtle border, no longer competing with agent text for visual weight. 7. Composer affordances: keyboard hint (`Enter to send · Shift+Enter for newline`) under the textarea, auto-focus on mount so the user can type immediately after picking a session. Backend: `ToolCall` gains a `kind: String` field carrying the ACP `ToolKind` lowercased (read/edit/execute/search/...). 
The web `ActivityRow` now keeps the full `ToolCall` payload on `tool_start` rows so the UI doesn't have to look it up by id. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Two follow-ups from the cockpit overhaul: 1. wterm's async init() at node_modules/@wterm/dom/dist/wterm.js:56 calls input.focus() unconditionally after the WASM bridge loads, firing 200-500ms after mount and stealing focus from the agent composer. The composer's onMount focus runs sync and loses the race. Re-claim focus at 250ms and 700ms, but only when focus is on document.body or inside .wterm — so an intentional click into the host shell during the window sticks. 2. The "Thinking…" bubble only showed during ACP AgentThoughtChunk events, leaving the user staring at a blank pane while the agent was running tools or waiting for first text chunk. Track a new `turnActive` flag (set on user_prompt, cleared by Stopped / AgentStartupError) and render a spinning glyph + contextual label whenever the turn is open: "Thinking…", "Running <tool>…", or "Working…" otherwise. Also drives the Send → Stop button swap so cancel is reachable for the entire turn. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Replace the plain "Working…" / "Thinking…" / "Running X…" labels with a braille-spinner glyph and a rotating verb pool themed around Agent of Empires' civilization-building flavor. A nod to Claude Code's "ruminating" / "noodling" verbs and the Rust `rattles` spinner crate the TUI already uses for ratatui status indicators. `cockpitRattle.ts` defines: - SPINNER_FRAMES: 10-step braille rotation (⠋⠙⠹⠸⠼⠴⠦⠧⠇⠏), 80ms/frame - WORKING_VERBS: 35 empire verbs (Conscripting villagers, Marshalling forces, Quarrying granite, Calibrating trebuchets, Plundering archives, Negotiating treaties, …) - THINKING_VERBS: 14 mystical/divinatory verbs for AgentThoughtChunk state (Consulting auguries, Casting bones, Whispering with elders, Studying the stars, …) - chooseVerb() — deterministic from a seed so the verb stays stable across re-renders within a tick. The seed bumps every 4s so long turns rotate through different verbs without flickering. Tool runs override the verb pool: instead of a generic empire word, the tool's own name is dressed up with a pool of action prefixes ("Wielding read", "Dispatching write", "Marshalling search", …), so the user always sees what's actually running. Test infra: the shim now honors a "SLOW" prompt keyword that adds 800ms gaps between session/update events, so e2e tests can observe the mid-turn UI without a real model. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
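
The timing rules in the commit above (80ms per spinner frame, a verb seed that bumps every 4s) can be sketched as pure functions of elapsed time. The real `cockpitRattle.ts` is TypeScript; this Rust port is illustrative only, lists just four of the 35 working verbs, and assumes a simple modulo mapping from seed to verb, which the commit does not specify.

```rust
/// The 10-step braille rotation listed in the commit.
pub const SPINNER_FRAMES: [char; 10] = ['⠋', '⠙', '⠹', '⠸', '⠼', '⠴', '⠦', '⠧', '⠇', '⠏'];

/// A small subset of the working verbs named in the commit.
pub const WORKING_VERBS: [&str; 4] = [
    "Conscripting villagers",
    "Marshalling forces",
    "Quarrying granite",
    "Calibrating trebuchets",
];

/// Spinner frame for a given elapsed time, advancing every 80ms.
pub fn spinner_frame(elapsed_ms: u64) -> char {
    SPINNER_FRAMES[((elapsed_ms / 80) % SPINNER_FRAMES.len() as u64) as usize]
}

/// Seed that stays constant within a 4s tick, so re-renders don't flicker.
pub fn verb_seed(elapsed_ms: u64) -> u64 {
    elapsed_ms / 4000
}

/// Deterministic verb choice from a seed (modulo mapping is an assumption).
pub fn choose_verb(seed: u64) -> &'static str {
    WORKING_VERBS[(seed % WORKING_VERBS.len() as u64) as usize]
}
```

Because both functions depend only on elapsed time, any number of re-renders within the same tick produce the same frame and verb.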

…itives

Replace the hand-rolled conversation feed with assistant-ui's headless React primitives. We keep ownership of the chat *state* (it streams from the cockpit WS, not from a chat protocol assistant-ui knows about); assistant-ui owns the chat *surface*: scroll viewport, message list, keyboard shortcuts, accessibility, message-edit affordances, the running/idle gating.

Architecture

    ws frame → applyEvent → CockpitState.activity (ours)
                      │
                      ▼
    activityToThreadMessages() → ThreadMessageLike[]
                      │
                      ▼
    useExternalStoreRuntime(adapter) → AssistantRuntime
                      │
                      ▼
    <AssistantRuntimeProvider runtime>
                      │
                      ▼
    <ThreadPrimitive.Messages components={…}>
    <ComposerPrimitive.Root>

The new `CockpitRuntime` component wraps the cockpit and exposes the raw state via render-prop for the bits assistant-ui doesn't own (plan strip, system notices, startup error banner, ACP approval cards).

Renderers we wrote earlier all keep their place:
- Markdown.tsx → injected as `Text` part component
- ToolCards.tsx → injected as `tools.Override` so per-kind cards (read/edit/execute/search/…) render inside assistant messages
- cockpitRattle (verbs + braille frames) → driven by the new `<ThreadPrimitive.If running>` gate
- ApprovalCard.tsx → rendered below the message stream as before

Composer behaviour:
- `<ComposerPrimitive.Input>` replaces our textarea; auto-grow + the wterm focus-race reclaim logic still attach to its element ref
- Send/Stop swap via `<ThreadPrimitive.If running>`; Stop calls `useThreadRuntime().cancelRun()` which the runtime adapter forwards to our `cockpit/cancel` REST endpoint
- `onNew` flattens AppendMessage parts to plain text (ACP only accepts text prompts); attachments/images dropped silently for now

Verified: e2e (13/13 steps), spinner mid-turn snapshot still shows the empire-themed rattle, focus reclaim still beats wterm's async init, all 5 cockpit_acp_smoke tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Lift the composer out of CockpitView.tsx into its own file and rebuild the chrome around assistant-ui's `<ComposerPrimitive>` to match the feel of VSCode chat and Cursor agent chat.

What changed visually:
- Tall multi-line input by default (rows=2, min-h-14, max-h-50) so the composer reads as a writing surface, not a one-liner. Auto-grows up to 200px before scrolling.
- Top-affordance toolbar row inside the composer card with `@` files, `/` commands, and paperclip attachment icons (lucide-react, the icon family that VSCode and Cursor visually resemble). Disabled with "coming soon" tooltips for now; their presence sets the visual frame.
- Send button: paper-plane icon on a brand-amber rounded pill, with a hover lift and 0.98 active-scale so it feels press-able.
- Stop button: square-stop icon + "Stop" text, hover styling tinted rose to read as a destructive/cancel action.
- Keyboard hint as a kbd-styled chip ("↵ Send") next to the send button instead of dim text floating below.
- Subtle inner shadow on the composer well + amber focus ring (3px glow) on focus-within. Clear visual hierarchy: input > toolbar > actions.

Lifted out of the view so it can grow features (model picker, file chips, slash command popover) without cluttering CockpitView.tsx.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

… tool cards

Closes the eight remaining naive UI items in one pass.

Backend (new endpoints + ACP set_mode):
- GET /api/sessions/:id/cockpit/files → workspace file list (5k cap, skips .git/node_modules/target/dist/build/.next/.venv/.cache/etc.)
- POST /api/sessions/:id/cockpit/mode → ACP `session/set_mode` via new ClientCmd::SetMode + AcpClient::set_mode + Supervisor::set_mode

Frontend additions:
- TriggerPopover.tsx: generic @-/-trigger combobox. Detects trigger chars at word boundaries (whitespace before, word chars after), arrow-keys/Enter/Tab/Esc navigation, mousedown-insert so the textarea doesn't lose focus. Plugged into the composer twice: once for @ (file picker, fed by /cockpit/files + fuzzyFilter) and once for / (slash commands, hard-coded /help/clear/tools/model for now).
- ModePicker in PlanStrip: clickable mode chip with Default/Plan/AcceptEdits/Yolo (BypassPermissions) options, each with a one-line hint. Click → POST /cockpit/mode. Tinted by current mode (rose for Yolo, amber for AcceptEdits, cyan for Plan).
- Plan strip itself: progress bar (visual, animated transition), completed/total counter, chevron rotation on expand, and a thin always-visible bar with the mode picker even when no plan is active.
- Hover affordances on messages via ActionBarPrimitive: copy + edit on user messages, copy + regenerate on assistant. Visible on hover/focus only (group-hover:opacity-100), Lucide icons.
- useStreamReveal hook: char-budget reveal (24 chars/16ms baseline, accelerates when >200 chars behind). Smooths ACP's chunky text delivery into typewriter-style streaming.
- Composer toolbar: @ and / icon buttons now insert the trigger char at the caret (with space-padding when mid-word) so the popover opens. Paperclip remains a coming-soon stub.

ToolCards.tsx rewrite (VSCode/Cursor-style):
- Common CardChrome with status dot + per-kind icon + label + meta + collapse chevron.
- Bash: $-prefixed command, output highlighted as `bash` lang in shiki, line count in meta, "Show N more" expand for long output.
- Read: file path + line range + line count, content highlighted by extension, 16-line preview default.
- Edit: real diff lines with +/− gutter, shiki-highlighted body, +N/−N counters in meta.
- Search: query + scope, line-numbered match list capped at 50.
- Fetch: URL primary, JSON-highlighted output.
- Think: one-line italic, no chrome.
- Generic fallback: input + output sections with copy buttons.

ApprovalCard rewrite:
- Aligned with tool-card visual language: same border-md, same bg-surface-800/50 base, header strip with shield/alert icon and label.
- Three-button block: Allow / Always (benign) or Hold-to-allow (destructive), plus Deny on the right. Lucide icons throughout.
- Args preview lives in a max-h-32 scrollable code block.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
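The char-budget reveal behind `useStreamReveal` could be modelled as a pure step function like the one below. The 24-character baseline per 16 ms tick and the 200-character threshold come from the commit message; `nextRevealCount` and the catch-up rule (reveal a quarter of the backlog per tick) are assumptions, since the source does not say how the acceleration is computed.

```typescript
const BASE_CHARS_PER_TICK = 24; // baseline budget per 16 ms tick (from the commit message)
const CATCH_UP_THRESHOLD = 200; // accelerate when more than this many chars behind

// Given how many chars are already shown and how many have arrived,
// return how many chars should be shown after one more tick.
function nextRevealCount(revealed: number, total: number): number {
  const backlog = total - revealed;
  if (backlog <= 0) return total; // nothing left to reveal
  const budget =
    backlog > CATCH_UP_THRESHOLD
      ? Math.max(BASE_CHARS_PER_TICK, Math.ceil(backlog / 4)) // assumed catch-up rule
      : BASE_CHARS_PER_TICK;
  return Math.min(total, revealed + budget);
}
```

A hook built on this would call `nextRevealCount` from a timer or `requestAnimationFrame` and render `text.slice(0, revealed)`, so a large chunk from the agent appears as fast typewriter output instead of a single jump.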

Three pieces I shipped from scratch that were already in the toolbox. Net code: similar line count after factoring out package-lock churn, but ~520 lines of bespoke logic deleted in favor of battle-tested upstream code that the assistant-ui team maintains.

1. TriggerPopover.tsx (287 lines) → ComposerPrimitive.Unstable_TriggerPopover

   I had the exports printed out earlier and skipped over the Unstable_ family. The official primitive ships:
   - Trigger detection at word boundaries
   - Arrow-key/Enter/Tab/Esc navigation with data-highlighted
   - Mousedown insertion (so the textarea doesn't lose focus)
   - Plugin registry integration with ComposerInput so cursor position is wired automatically
   - Search vs categories drill-down with isSearchMode

   We provide an Unstable_TriggerAdapter (categories/categoryItems/search) that returns items for @ files / / commands. Pairing the `@` trigger with .Directive (chip-into-text) and the `/` trigger with .Action (handler-fires-immediately) gets us both UXes declaratively. Empty `categories` skips the drill-down step so a flat file list shows the moment `@` is typed.

2. Markdown.tsx (~200 lines + useStreamReveal hook) → MarkdownTextPrimitive from @assistant-ui/react-markdown (already in deps, never imported).

   The primitive handles:
   - Streaming-aware rendering (incomplete fenced blocks)
   - Built-in `smooth` char-budget reveal (replaces useStreamReveal)
   - Standard markdown via remark/rehype

   We plug in a SyntaxHighlighter component backed by our existing shiki integration, plus a CodeHeader matching the design tokens. index.css gets descendant selectors for prose because the primitive emits unclassed elements.

3. EditToolCard custom +/- diff renderer → react-diff-viewer-continued

   The library does word-level diff, line numbers, gutter colors, expand-context. We override its dark theme variables to match our zinc/brand palette so the diff doesn't read as "pasted in from another app". Diff lib is loaded only inside the Edit card, so the bundle penalty stays in the tool-card chunk.

Composer also gains the proper plugin-registry path: the TriggerPopoverRoot wraps ComposerPrimitive.Root, so the input fires setCursorPosition() into all registered triggers automatically. Both `@` and `/` use the same primitive, just with different adapters and behaviors (Directive vs Action).

The custom code that stays:
- CockpitRuntime.tsx (ACP-specific, not generic)
- useFilesIndex (one-shot fetch + memo)
- fuzzyFilter (small enough to not warrant a dep)
- The ApprovalCard, ModePicker, and PlanStrip remain bespoke because they're product-specific (ACP approvals, ACP modes).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Four issues from real-Claude usage:

1. Bash tool card showed literal "$ {}" when raw_input was empty. Forward the ACP `tool.title` through CockpitRuntime via a namespaced `_aoe_title` key in the args JSON so per-kind renderers can fall back to a descriptive label. Updated Execute/Read/Edit/Delete/Search/Fetch cards to chain: real arg field → forwarded title → bare tool kind.

2. Mode picker said "Default" even when the agent was running in yolo mode. Root cause: we never read agent-advertised modes. We only listened for a CurrentModeUpdate event, which we never received because the ACP `session/update` for mode change is rare. Now:
   - Capture the mode set from `NewSessionResponse.modes` on session creation and emit a new `Event::ModesAvailable` carrying the agent's actual modes (id + name + description).
   - Map ACP `SessionUpdate::CurrentModeUpdate` to a new `Event::CurrentModeChanged` in addition to the legacy enum-based `ModeChanged`.
   - UI tracks `state.availableModes` + `state.currentModeId` and renders the picker from those (falls back to the hard-coded four-mode taxonomy when the agent doesn't advertise any).

3. Mode picker also moved from the top PlanStrip strip into the composer footer (Cursor-style: inline with @ / / / paperclip toolbar buttons, opens upward via `bottom-full`). PlanStrip is now hidden entirely when there's no plan and mode is Default, instead of rendering an empty bar to host the picker.

4. Cockpit-mode session in the TUI showed "tmux pane is gone" because `Instance::start_with_size_opts` unconditionally created a tmux session. Cockpit sessions don't have a tmux backing; the supervisor spawns the ACP agent process directly. Short-circuit start() for cockpit_mode at the top.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
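The label fallback chain from the Bash-card fix (real arg field → forwarded title → bare tool kind) can be sketched as below. The `_aoe_title` key is from the commit message; `toolLabel`, `ToolArgs`, and the rule that blank strings count as missing are hypothetical.

```typescript
type ToolArgs = Record<string, unknown>;

// True only for strings with visible content.
function nonBlank(value: unknown): value is string {
  return typeof value === "string" && value.trim() !== "";
}

// Resolve the label a tool card shows: the kind-specific argument if present,
// else the ACP title forwarded under `_aoe_title`, else the bare tool kind.
function toolLabel(kind: string, args: ToolArgs, primaryField: string): string {
  const real = args[primaryField];
  if (nonBlank(real)) return real;
  const forwarded = args["_aoe_title"];
  if (nonBlank(forwarded)) return forwarded;
  return kind;
}
```

For example, `toolLabel("execute", { _aoe_title: "Run tests" }, "command")` yields "Run tests" instead of rendering an empty `$ {}` line.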

…r (Beta)

Make cockpit a real per-session opt-in across every aoe-supported agent that has a published ACP server, not just Claude.

Default registry (`AgentRegistry::with_defaults()`) now seeds one entry per tool, keyed by the same name the tmux substrate uses, so the spawn path can map `instance.tool` directly to a registry key:

  claude    → claude-agent-acp (Zed adapter for Claude SDK)
  opencode  → `opencode acp` (native, SST)
  gemini    → `gemini --acp` (native, Google)
  codex     → codex-acp (Zed adapter, OpenAI Codex CLI)
  vibe      → vibe-acp (native, Mistral)
  pi        → pi-acp (adapter, Hermes coding agent)
  aoe-agent → bundled multi-provider fallback

Verified each invocation against agentclientprotocol.com/get-started/agents.md and the upstream agent docs (Jan 2026). The legacy "claude-code" key stays as an alias so persisted sessions with cockpit_agent="claude-code" still resolve.

Spawn path (`Supervisor::pick_agent_for_tool`) replaces the hard-coded `"claude → claude-code, else aoe-agent"` fallback in three places:

  src/server/api/cockpit.rs (POST /cockpit/spawn)
  src/server/api/sessions.rs (auto-spawn after create)
  src/server/mod.rs (auto-spawn at serve startup)

Precedence: explicit override → registry entry keyed on tool → legacy fallback. So `aoe add . --cmd opencode --cockpit` now spawns real opencode-via-ACP, not the generic aoe-agent.

`aoe cockpit doctor` walks the new registry and prints a per-agent status with tailored install hints. `--fix` runs `npm install -g` for the npm-distributed adapters (claude / codex / pi); native CLIs get a one-line install hint pointing at the upstream installer. Doctor banner now reads "Cockpit doctor (Beta)" with a one-line explainer about substrate selection.

Web wizard gains an explicit two-card substrate picker (Cockpit Beta vs Terminal Stable). Greys out cockpit when the selected tool isn't in our `ACP_CAPABLE_TOOLS` allowlist (aider, cursor, copilot, droid, settl, hermes for now). The wizard sends `cockpit_mode` on the create-session request; server's `default_cockpit_for_web` default still applies when omitted, but the wizard always sends it explicitly now.

docs/cockpit.md gets a Beta callout at the top, a per-agent support/auth table, and an updated doctor sample reflecting the new format.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
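The precedence rule (explicit override → registry entry keyed on tool → legacy fallback) is sketched below in TypeScript purely for illustration; the real `Supervisor::pick_agent_for_tool` is Rust and may differ. The registry keys and the `claude-code` alias are from the commit message; `pickAgentForTool`, `resolveAlias`, mapping the alias to `claude`, and returning `aoe-agent` as the fallback are assumptions.

```typescript
// Registry keys as seeded by the default registry (per the commit message).
const REGISTRY_KEYS = new Set<string>([
  "claude", "opencode", "gemini", "codex", "vibe", "pi", "aoe-agent",
]);

// Legacy key kept so persisted sessions still resolve (target is an assumption).
const LEGACY_ALIASES = new Map<string, string>([["claude-code", "claude"]]);

const FALLBACK_AGENT = "aoe-agent"; // assumed legacy fallback

function resolveAlias(key: string): string {
  return LEGACY_ALIASES.get(key) ?? key;
}

// Precedence: explicit override, then a registry entry for the tool, then fallback.
function pickAgentForTool(tool: string, override?: string): string {
  if (override !== undefined && override !== "") return resolveAlias(override);
  if (REGISTRY_KEYS.has(tool)) return tool;
  return FALLBACK_AGENT;
}
```

Under these assumptions `pickAgentForTool("opencode")` resolves to `opencode` rather than the generic agent, which matches the `aoe add . --cmd opencode --cockpit` behaviour described in the commit message.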

# Conflicts: # src/session/mod.rs

Two new endpoints + a discreet UI button per view make `cockpit_mode` runtime-mutable. The data model was already a per-session bool; this plumbs the actual transitions:

  POST /api/sessions/:id/cockpit/enable
  POST /api/sessions/:id/cockpit/disable

Both are idempotent (200 with no work when already in the target state). Enable validates the tool has an ACP-capable registry entry before flipping (so we don't strand a session in cockpit mode with no agent to spawn). Persistence happens before the new substrate starts, so a crash mid-swap leaves the declared end state on disk rather than a half-broken intermediate.

Cockpit → tmux:
- Supervisor::shutdown drops the ACP worker (UnknownSession is fine if startup never completed).
- cockpit_mode = false, save.
- Instance::start() creates a fresh tmux pane and runs the agent.

Tmux → cockpit:
- Instance::kill() drops the tmux pane.
- cockpit_mode = true, save.
- Supervisor::spawn fires off a worker. If it fails (binary missing, auth missing, etc.) the standard AgentStartupError flows through to the UI's red banner. The swap itself returns 200 because the substrate state on disk is correct.

UI: new SwitchSubstrateAction component with a destructive-confirm modal. Plugged into:
- the cockpit composer toolbar (icon-only, next to mode picker)
- the TerminalView top-right corner (icon-only, absolute-positioned)

The cockpit-side button always works; the terminal-side button greys out when the tool isn't in our ACP_CAPABLE_TOOLS allowlist (aider, cursor, copilot, droid, settl, hermes for now). Confirm dialog explains what's destroyed (cockpit conversation log / tmux scrollback) and what's preserved (worktree, open files, session id).

After the API returns the session-list poll picks up the new cockpit_mode within ~3s and the parent flips between <CockpitView> and <TerminalView>; the explicit refresh wasn't worth threading through.

Verified end-to-end with a Playwright suite (13/13 pass): both directions through the UI, idempotent re-enable/re-disable through the API, backend/frontend stay in sync, confirm dialog gating works. The pre-existing 14-scenario comprehensive suite still passes (23/23 + 1 skip on the pre-existing replay-buffer item).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
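The ordering of the tmux → cockpit swap (idempotency check, capability check, teardown, persist, then spawn) can be modelled as below. This is a TypeScript illustration of a Rust handler, not the handler itself: `enableCockpit`, `AoeSession`, the step strings, the contents of the capable-tool set, and the 400 status for a non-capable tool are all assumptions. Only the 200 responses and the step order come from the commit message.

```typescript
type AoeSession = { id: string; tool: string; cockpitMode: boolean };

// Assumed positive list, inferred from the registry table in the PR description.
const ACP_CAPABLE_TOOLS = new Set<string>([
  "claude", "opencode", "gemini", "codex", "vibe", "pi",
]);

// Returns an HTTP-style status code and records each step taken in `steps`.
function enableCockpit(
  session: AoeSession,
  steps: string[],
  spawnWorker: () => void,
): number {
  if (session.cockpitMode) return 200; // idempotent: already in target state
  if (!ACP_CAPABLE_TOOLS.has(session.tool)) return 400; // assumed status code
  steps.push("kill tmux pane");
  session.cockpitMode = true; // persist before starting the new substrate
  steps.push("save cockpit_mode=true");
  try {
    spawnWorker();
    steps.push("spawn ACP worker");
  } catch {
    // A failed spawn is reported to the UI separately; the swap still succeeds.
    steps.push("spawn failed");
  }
  return 200;
}
```

Persisting before spawning means a crash between the two steps leaves `cockpit_mode = true` on disk, so the next startup scan can retry the spawn instead of finding a half-switched session.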

# Conflicts: # web/src/components/TerminalView.tsx # website/scripts/sync-docs.mjs

Seluj78 · 2026-05-06T19:38:40Z

do you need some testing on this?

njbrake · 2026-05-07T08:58:08Z

@Seluj78 If you have a chance, absolutely! I think this is the next-generation feature that will make the web mode of AOE great, but I haven't had the time to finish it.

Seluj78 · 2026-05-07T09:23:52Z

oooh fancy ! I like the new UI!

Commands to run to get it working:

cargo build --features 'serve cockpit' --profile dev-release
cd web && npm install && npm run build && cd ..
./target/dev-release/aoe add . --cmd claude --cockpit
./target/dev-release/aoe serve

Here is my feedback so far:

I was not expecting this, but now that I see the new central window, I feel like it's weird to have the terminal on the right side to still be a tmux.
These kind of interfaces are amazing but I am always reluctant in using them compared to claude's CLI because it might be missing features, like the AskUserQuestion tool for example, or other tools that might be added by the agents AOE supports that then need a downstream update to support them afterwards

I asked a suggested question and the right side isn't spinning

After a bit more investigation, it looks like the session never starts

njbrake · 2026-05-07T12:06:30Z

> These kind of interfaces are amazing but I am always reluctant in using them compared to claude's CLI because it might be missing features, like the AskUserQuestion tool for example, or other tools that might be added by the agents AOE supports that then need a downstream update to support them afterwards

💯 this is my concern too. My hope 🤞 was that by using ACP we would have to worry about less of this and things would just work https://github.com/agentclientprotocol/agent-client-protocol .

Thank you for giving it a try. One of the biggest hurdles I'm trying to figure out is how to manage a session between TUI and Web in cockpit mode. Like, in the web dashboard I certainly want it using cockpit mode, but then if I go to use the TUI I think I would rather have it use the tmux+native Claude Code etc mode. So is there a way where I can run a single session and have it be able to switch between cockpit and tmux views. Idk 🤷

Seluj78 · 2026-05-07T12:43:02Z

> 💯 this is my concern too. My hope 🤞 was that by using ACP we would have to worry about less of this and things would just work agentclientprotocol/agent-client-protocol

Oooh that's what this is! I didn't bother checking it out earlier. It makes so much sense! Then yeah, 100% this is the right way to go, my concern has disappeared. The only thing I would keep in mind is a way to know when new versions of the SDK are released so we know that we need to update AOE to support them (a CI check?)

> Thank you for giving it a try. One of the biggest hurdles I'm trying to figure out is how to manage a session between TUI and Web in cockpit mode. Like, in the web dashboard I certainly want it using cockpit mode, but then if I go to use the TUI I think I would rather have it use the tmux+native Claude Code etc mode. So is there a way where I can run a single session and have it be able to switch between cockpit and tmux views. Idk 🤷

From what I saw in the webUI (I didn't try the TUI in this PR) you had a toggle to switch back to tmux. So I don't understand right now the problem you're having :)

@njbrake could you update this PR (43 commits behind)? I'll try to do some sessions on it, see how it feels, and give more feedback :)

# Conflicts: # Cargo.lock

The cockpit (ACP-based structured rendering) only ships alongside the web dashboard, so the standalone `cockpit` cargo feature was redundant. Consolidating means one feature flag for "I want the web surface" and no risk of someone enabling cockpit without realising they need serve.

- Drop the `cockpit` feature from Cargo.toml; roll its deps (agent-client-protocol, agent-client-protocol-tokio, tar, xz2) into the existing `serve` feature list.
- Sweep `#[cfg(feature = "cockpit")]` → `#[cfg(feature = "serve")]` across 44 sites in src/ and tests/.
- Update docs/cockpit.md and the one stale comment in tui/settings/fields.rs to reference `serve`.
- Fix three pre-existing test fixtures missing the `kind` field on ToolCall (uncovered now that cockpit tests run with `--features serve` instead of being routed to a separate feature build).

Cockpit code still lives entirely under src/cockpit/, so the rip-out story is unchanged: delete the module + its handler files in src/server/, and the rest of `serve` keeps working.

Cockpit sessions spawned a `claude-agent-acp` subprocess (plus its SDK child) that lived on past two cleanup paths:

1. `DELETE /api/sessions/{id}` ran `perform_deletion` (worktree + tmux teardown) but never told the cockpit supervisor to shut down. The worker handle stayed in `Supervisor::workers`, the spawned process stayed alive.

2. `aoe serve` graceful shutdown handled SIGINT/SIGTERM/SIGHUP for the daemon itself but never called `Supervisor::shutdown_all`, so each running cockpit session leaked its wrapper + SDK child when the daemon exited.

After repeated probe runs we'd accumulate 6+ orphan node processes per session. Wire the supervisor shutdown into both paths; verified end-to-end that two cockpit sessions drop their processes to zero on delete, and a SIGTERM to the daemon reaps the rest.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
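The two cleanup paths can be illustrated with a toy supervisor, written in TypeScript only to match the other sketches; the real `Supervisor` is Rust. `AcpWorker`, the method bodies, and `liveWorkers` are hypothetical. The point is that both session deletion and daemon shutdown must go through the worker map.

```typescript
type AcpWorker = { kill: () => void };

class Supervisor {
  private workers = new Map<string, AcpWorker>();

  spawn(sessionId: string, worker: AcpWorker): void {
    this.workers.set(sessionId, worker);
  }

  // Path 1: called from the session-delete handler.
  // Returns false for an unknown session, which callers can treat as fine.
  shutdown(sessionId: string): boolean {
    const worker = this.workers.get(sessionId);
    if (worker === undefined) return false;
    worker.kill();
    this.workers.delete(sessionId);
    return true;
  }

  // Path 2: called once from the daemon's graceful-shutdown hook.
  // Returns how many workers were reaped.
  shutdownAll(): number {
    const ids = Array.from(this.workers.keys());
    for (const id of ids) this.shutdown(id);
    return ids.length;
  }

  get liveWorkers(): number {
    return this.workers.size;
  }
}
```

Snapshotting the keys before iterating keeps `shutdownAll` safe even though `shutdown` mutates the map.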

Seluj78 · 2026-05-07T15:31:09Z

@njbrake are there specific things you need tested in this PR? Or just general feedback on the experience of using it? Or something else? :)

njbrake · 2026-05-07T15:45:01Z

@Seluj78 yes at this point just general feedback. There are a few dimensions we need to test/debug before merge.

How does the web app behave? Is it a smooth experience and can you easily view both cockpit sessions and non-cockpit sessions?
How does the TUI handle the cockpit sessions and can you cleanly convert a session between cockpit and tmux modes?
What is the risk? Is there any impact to the TUI? The TUI is stable and needs to stay that way, the Web App is beta/experimental so I'm less worried about lil rough edges on the web app that we can smooth out in follow up prs

Basically I don't need this PR to handle all the edge cases but I need to be sure that the web app is at least generally as usable as it was before the change, and that it doesn't regress anything in the TUI to degrade the TUI experience

Seluj78 · 2026-05-07T16:05:56Z

@njbrake okay, well I can't go any further in my testing because I cannot get claude to respond to me in the cockpit mode
but switching back to tmux mode and asking the same question works

Actually I wanted to reproduce the issue above and I can't even get a new session to show up:

I ran those commands and it said session started but nothing happened in the webui.

Seluj78 · 2026-05-07T16:15:05Z

Comment from Claude (Opus 4.7), posted by @Seluj78 — investigating the "session started but nothing in the webui" report.

Traced the flow. Two-tier bug:

1. aoe session start is a no-op for cockpit sessions.

Instance::start_with_size_opts (src/session/instance.rs:797-804) returns Ok(()) immediately when cockpit_mode is true:

#[cfg(feature = "serve")]
if self.cockpit_mode {
    return Ok(());
}

The CLI prints ✓ Started session: ... regardless, so it looks like it worked.

2. Cockpit workers only auto-spawn at aoe serve startup or via REST.

ACP worker spawn happens in three places:

aoe serve startup scan (src/server/mod.rs:471-511) — runs once, over sessions that exist at that moment.
POST /api/sessions (web-UI create) — src/server/api/sessions.rs:745-767.
POST /api/sessions/:id/cockpit/enable (substrate switch) — src/server/api/cockpit.rs:344-356.

Nothing watches for cockpit sessions added after aoe serve is already running. The 2 s status_poll_loop (src/server/mod.rs:1218) reloads instances from disk, so the new session does appear in GET /api/sessions, but no code path calls cockpit_supervisor.spawn(...) for it.

Repro of the exact flow in the report:

Term A: aoe serve                              # daemon up, 0 cockpit sessions known
Term B: aoe add . --cmd claude --cockpit       # written to disk only
Term B: aoe session start Ethiopians           # no-op (cockpit early return)
                                               # 2s later: poll reloads disk, session
                                               # appears in webui list, but no worker
                                               # was ever spawned → agent silent.

This matches both screenshots in the comment above: "claude doesn't respond" (worker missing for an existing cockpit session) and "session never starts" (worker missing for the freshly-added one).

Fix options:

1. Daemon-side reconciler (recommended). Extend status_poll_loop to track attempted spawns and, on each tick, call supervisor.spawn for any cockpit session on disk that has no running worker and hasn't been attempted yet. Idempotent, mirrors the startup auto-spawn, fixes both aoe add --cockpit while serve is running and any race where serve starts before a session is fully written. Needs an "already attempted" set so a permanently-failing spawn doesn't retry every 2 s. The supervisor already has restart bookkeeping for in-process crashes, but the initial spawn is currently a one-shot.
2. CLI → daemon IPC. aoe session start for cockpit POSTs to the running daemon. Heavier (needs daemon discovery + auth token plumbing) and only fixes that one entry point: aoe add --cockpit while serve is already up still wouldn't trigger a spawn.
3. Loud CLI error. aoe session start on cockpit prints "cockpit sessions are managed by `aoe serve`; restart serve to spawn the worker." Cheap, but the UX is bad and the underlying mismatch stays.

Going with #1 unless you'd rather a different shape — happy to push a fix on top of this branch.
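A rough model of the daemon-side reconciler, in TypeScript for illustration only (the daemon is Rust, and the fix itself lives in #953, which this sketch does not claim to reproduce). `SpawnReconciler`, `DiskSession`, and the callback shapes are hypothetical; the behaviour modelled is "spawn once per cockpit session that has no worker, and never retry a failed initial spawn on later ticks".

```typescript
type DiskSession = { id: string; cockpitMode: boolean };

class SpawnReconciler {
  private attempted = new Set<string>(); // the "already attempted" set
  private hasWorker: (id: string) => boolean;
  private spawn: (id: string) => void;

  constructor(hasWorker: (id: string) => boolean, spawn: (id: string) => void) {
    this.hasWorker = hasWorker;
    this.spawn = spawn;
  }

  // Run once per poll tick; returns the ids for which a spawn was attempted.
  tick(sessionsOnDisk: readonly DiskSession[]): string[] {
    const tried: string[] = [];
    for (const session of sessionsOnDisk) {
      if (!session.cockpitMode) continue; // tmux sessions are not our concern
      if (this.hasWorker(session.id)) continue; // already running
      if (this.attempted.has(session.id)) continue; // do not retry every tick
      this.attempted.add(session.id);
      tried.push(session.id);
      try {
        this.spawn(session.id);
      } catch {
        // Startup errors are assumed to surface through the usual UI banner.
      }
    }
    return tried;
  }
}
```

Marking the id as attempted before calling `spawn` is what prevents a permanently failing agent from being relaunched on every 2 s poll.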

njbrake · 2026-05-07T16:35:00Z

@Seluj78 thanks! I'm good with you pushing, maybe you need to create a branch in your fork to be able to do this though.

Seluj78 · 2026-05-07T16:36:40Z

> @Seluj78 thanks! I'm good with you pushing, maybe you need to create a branch in your fork to be able to do this though.

Yes, cause I'm not a maintainer in this repo, I'll make a branch in my fork that targets the native branch here :) gimme a few

Seluj78 · 2026-05-07T17:32:41Z

@njbrake #953 is ready to review, it fixes a few different bugs. Let me know there if you disagree with some choices!

Seluj78 · 2026-05-07T17:34:33Z

(once this PR is merged, or at least in a working state with my PR merged into it, I will start using AOE to work on AOE 😉)

njbrake and others added 30 commits April 25, 2026 09:09

Merge remote-tracking branch 'origin/main' into native

969ecab

Merge remote-tracking branch 'origin/main' into native

d407757

Merge remote-tracking branch 'origin/main' into native

adce029

Merge remote-tracking branch 'origin/main' into native

2639505

# Conflicts: # Cargo.lock # src/server/api/sessions.rs # web/src/lib/types.ts

njbrake and others added 3 commits April 30, 2026 18:36

Merge remote-tracking branch 'origin/main' into native

2e09f2c

# Conflicts: # src/session/mod.rs

njbrake mentioned this pull request Apr 30, 2026

RFC: Stop hook semantics — should Claude's Stop event mark a session 'waiting' instead of 'idle'? #863

Closed

njbrake and others added 2 commits May 1, 2026 11:01

Merge remote-tracking branch 'origin/main' into native

ca74ff7

# Conflicts: # web/src/components/TerminalView.tsx # website/scripts/sync-docs.mjs

njbrake mentioned this pull request May 7, 2026

WebUI doesn't support cmd/ctrl+click to open links #930

Open

njbrake added this to the Cockpit Mode milestone May 7, 2026

njbrake and others added 3 commits May 7, 2026 12:48

Merge remote-tracking branch 'origin/main' into native

9518d98

# Conflicts: # Cargo.lock

Seluj78 mentioned this pull request May 7, 2026

fix(cockpit): auto-spawn ACP workers for sessions added while serve runs #953

Open

14 tasks
