Adds the cockpit feature behind a Cargo flag. Implements the ACP client
spine: subprocess spawn, JSON-RPC handshake, session creation, and
prompt loop, plus the typed state/approval/replay-buffer modules from
the v4 design. Validated end-to-end by a Node ACP shim agent that
replays scripted session/update events.
Deferred to follow-up slices:
- Permission responder side-channel (currently auto-approves yolo-style)
- Typed mapping of session/update kinds to CockpitState fields
- AcpClient hooking fs/* and terminal/* into existing handlers
- aoe-agent tool stubs that delegate via ACP
- Settings TUI wiring, CLI commands, migration, WebSocket fanout
- React components, push notifications, Docker socket transport, docs
Tests: 22 cockpit unit + 1 e2e integration, 1094 existing tests still
pass.
Build: cockpit feature opt-in; default build unchanged.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace the yolo auto-approve with a proper responder side-channel:
on_receive_request parks the ACP responder keyed by a server-side
nonce; resolve_permission(nonce, decision) wakes the parked future and
answers with the matching option_id from the agent's offered options.
Map ACP SessionUpdate variants to typed CockpitState Event variants
(AgentMessageChunk, ToolCallStarted, ToolCallCompleted, PlanUpdated,
ModeChanged) instead of passing everything through as RawAgentUpdate.
Add a permission round-trip e2e test against the test shim agent.
Tests: 26 cockpit unit + 2 e2e integration, 1094 existing tests still
pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Hook the cockpit's FsPolicy + TerminalManager into the ACP client's
incoming-request callbacks. Agents can now issue fs/read_text_file,
fs/write_text_file, terminal/create, terminal/output, terminal/wait,
terminal/kill, and terminal/release, and aoe handles them with sandbox
enforcement (worktree-rooted FsPolicy).
Update aoe-agent to declare Read/Write/Bash tools via Vercel AI SDK 6
whose execute() bodies delegate back to aoe over ACP. The model never
touches the filesystem or shell directly.
Declare client capabilities (fs.readTextFile, fs.writeTextFile,
terminal) in the ACP initialize so agents know they can use them.
Tests: 26 cockpit unit + 4 e2e integration (added fs + terminal
round-trip tests against the shim). 1094 existing tests still pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Add migration v005_cockpit_defaults: seeds [cockpit] section in the
global config.toml on upgrade so users can flip the flag on without
hand-editing.
* Add CockpitConfig struct to session::config with 9 documented fields
matching the v4 design doc: enabled, default_for_claude,
default_agent, approval_timeout_secs,
destructive_require_double_confirm, max_concurrent_workers,
replay_events, replay_bytes, node_path. All with serde defaults;
loadable from config.toml.
* Add `aoe add` flags: --cockpit, --no-cockpit, --agent <name>,
--model <id>. The first two are mutually exclusive; the agent
flag implies cockpit.
* Add `aoe cockpit` subcommand with:
- doctor [--json] [--fix]: checks Node runtime + each configured
agent's spawn command. Exits 0/1/2 for ok/fail/partial.
- agents: lists the registry with present/missing markers.
- logs/restart: stubs reserved for the worker supervisor slice.
Full settings TUI editing wiring (FieldKey + build_*_fields + merge
logic across 9 fields × 5 touchpoints) is deferred to a follow-up;
config loads cleanly via serde defaults today.
Tests: 1263 lib tests + 4 e2e + 5 cockpit-acp integration all green;
1095 default-feature tests still pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Add CockpitBroadcastFrame (session_id + seq + event JSON) and a
per-AppState broadcast::Sender<CockpitBroadcastFrame> with a 256-event
capacity. Behaves like the existing status_tx fanout.
* New WebSocket route /sessions/{id}/cockpit/ws (gated on cockpit
feature). Subscribes to the broadcast and forwards frames matching
the route session_id; emits a `lagged` notice frame so clients can
request a snapshot+replay rather than diverge silently.
* trigger_approval_push() helper that fires a Web Push payload to all
subscribers when an ApprovalRequested event is observed. Reuses the
existing PushState + push_send infrastructure. Wired so the worker
supervisor (next slice) can call it without further plumbing.
* Refactored build_router to use a let-bound chain so the cockpit
route can be conditionally added under #[cfg(feature = "cockpit")].
Tests: lib tests now 1264 (up from 1263) with the cockpit-ws unit test
guarding publish-with-no-receivers behavior. 1095 default tests still
pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds src/cockpit/node.rs with the documented resolve order:
1. AOE_COCKPIT_NODE env
2. cockpit.node_path setting
3. node on PATH (>= 20 enforced)
4. previously-extracted bundled Node at $AOE_DATA_DIR/cockpit/node-vX
Tarball download is stubbed with a typed NotYetWired error so the
cockpit doctor can surface a clear "install Node yourself for now"
message until the auto-download lands in a follow-up.
Docker unix-socket transport for sandboxed cockpit sessions is also
deferred: the architecture supports it (acp_client takes a generic
ByteStreams) but the spawn path needs sandbox-aware plumbing that's its
own slice.
Tests: 4 new unit tests covering env/PATH/bundled paths, including a
serial pair that scrubs+restores PATH/AOE_COCKPIT_NODE.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* lib/cockpitTypes.ts: typed wire model mirroring CockpitBroadcastFrame
+ a pure reducer applyEvent() that materialises CockpitState from a
stream of frames. Bounded activity log (200 rows) and recentDiffs (16).
* hooks/useCockpit.ts: WebSocket subscription to
/sessions/{id}/cockpit/ws, dispatched through a useReducer; lagged
control frames flag the state so the UI can request a snapshot.
resolveApproval helper POSTs decisions to a REST endpoint that the
worker supervisor will wire up.
* components/cockpit/ApprovalCard: 3 phases (pending / submitting /
rolled-back), destructive-vs-benign affordance per the design spike
(long-press 800ms with progress ring + haptic for destructive;
single tap for benign). Swipe never approves.
* components/cockpit/PlanPanel: sticky current step, collapsed
completed disclosure, expanded upcoming. Cancelled steps are
rendered as struck-through.
* components/cockpit/ActivityStream: tool rows with kind glyphs +
colours (start=amber, complete=emerald, error=red, message=teal),
thinking/in-flight chrome.
* components/cockpit/CockpitView: top-level mobile-first layout
composing the above plus connection chrome (connecting / lagged /
closed banners) and the rate-limit notice.
Type-checks pass; Vite production bundle builds clean. The
mobile-vs-desktop layout split (3-pane on >=768px), ChatDrawer, and
push-tap deep-linking are deferred as polish; production wiring of the
REST endpoint ships with the worker supervisor.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
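The bounded reducer described in the cockpitTypes.ts bullet above can be sketched roughly as follows. This is a minimal illustration, not the shipped code: the event shapes, the `lastSeq` field, and the duplicate-frame rule are assumptions; only the frame fields (session_id, seq, event) and the 200-row bound come from the commit message.

```typescript
// Hypothetical, simplified shapes; the real cockpitTypes.ts model is richer.
type CockpitEvent =
  | { type: "agent_message_chunk"; text: string }
  | { type: "tool_call_started"; id: string; title: string }
  | { type: "tool_call_completed"; id: string; ok: boolean };

interface CockpitFrame {
  session_id: string;
  seq: number;
  event: CockpitEvent;
}

interface ActivityRow {
  seq: number;
  kind: "message" | "start" | "complete" | "error";
  text: string;
}

interface CockpitState {
  lastSeq: number;
  activity: ActivityRow[];
}

const MAX_ACTIVITY_ROWS = 200; // bound stated in the commit message

const initialState: CockpitState = { lastSeq: -1, activity: [] };

// Map one wire event to one activity row (illustrative mapping only).
function toRow(frame: CockpitFrame): ActivityRow {
  const ev = frame.event;
  switch (ev.type) {
    case "agent_message_chunk":
      return { seq: frame.seq, kind: "message", text: ev.text };
    case "tool_call_started":
      return { seq: frame.seq, kind: "start", text: ev.title };
    case "tool_call_completed":
      return { seq: frame.seq, kind: ev.ok ? "complete" : "error", text: ev.id };
  }
}

// Pure reducer: returns a new state and never mutates its input.
function applyEvent(state: CockpitState, frame: CockpitFrame): CockpitState {
  // Assumption: frames at or below the last seen seq are duplicates.
  if (frame.seq <= state.lastSeq) return state;
  // Keep only the newest MAX_ACTIVITY_ROWS rows.
  const activity = [...state.activity, toRow(frame)].slice(-MAX_ACTIVITY_ROWS);
  return { lastSeq: frame.seq, activity };
}
```

Because the reducer is pure, it can be unit-tested without a WebSocket and plugged into `useReducer` unchanged.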
* docs/cockpit.md follows the 10-section outline from the DX review:
what cockpit is, quickstart, requirements, verify (doctor), enabling
per-session + globally, escape hatches, tool compatibility matrix,
approvals UX, security, troubleshooting, deferred items.
* website/scripts/sync-docs.mjs: register docs/cockpit.md in PAGES +
URL_MAP so the nav link resolves on agent-of-empires.com.
* website/src/data/docsNav.ts: link the new page under Guides.
The upgrade messaging story is covered today by:
- v005 migration silently seeds [cockpit] section in config.toml
- aoe cockpit doctor is discoverable via aoe --help
- docs/cockpit.md is the canonical reference
Explicit first-run TUI card is deferred to a follow-up; the doctor
command serves the same affordance and is already wired.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* src/cockpit/supervisor.rs: per-aoe-process Supervisor that owns
AcpClients keyed by session_id. Spawn/shutdown lifecycle, drain task
bridges client events to a BroadcastSink, restart-budget bookkeeping
(3 restarts in 60s window before parking the session in Status::Error).
ChannelSink impl publishes to AppState::cockpit_events_tx and fires
approval-side hooks.
* src/server/api/cockpit.rs: REST endpoints
- POST /api/sessions/{id}/cockpit/spawn (start a worker)
- DELETE /api/sessions/{id}/cockpit (shutdown)
- POST /api/sessions/{id}/cockpit/prompt (send user input)
- POST /api/sessions/{id}/cockpit/approvals/{nonce} (resolve approval)
* AppState gets cockpit_supervisor: Arc<Supervisor<ChannelSink>>; the
router wires the new routes under #[cfg(feature = "cockpit")].
* Instance gains cockpit_mode + cockpit_agent + cockpit_model fields,
hidden from serde when default. aoe add --cockpit/--no-cockpit/--agent
/--model now flow through into Instance.
Tests: 4 supervisor unit tests (spawn-unknown-agent, double-spawn,
count, restart budget) + 4 e2e + 1273 lib tests with cockpit feature
on. 1095 default-feature tests still pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
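The restart-budget bookkeeping mentioned above (3 restarts in a 60s window) amounts to a sliding-window counter. The real supervisor is Rust; the sketch below is written in TypeScript purely to illustrate the bookkeeping, and the class name, method name, and the reading "a fourth restart inside the window is refused" are assumptions.

```typescript
const MAX_RESTARTS = 3;       // from the commit message
const WINDOW_MS = 60 * 1000;  // 60s window, from the commit message

// Hypothetical helper: tracks restart timestamps inside a sliding window.
class RestartBudget {
  private stamps: number[] = [];

  // Records a restart at `nowMs` and returns true if the budget allows it.
  // Returns false when the budget is exhausted, which is the point where
  // the supervisor would park the session in an error status.
  tryRestart(nowMs: number): boolean {
    // Forget restarts that have aged out of the window.
    this.stamps = this.stamps.filter((t) => nowMs - t < WINDOW_MS);
    if (this.stamps.length >= MAX_RESTARTS) return false;
    this.stamps.push(nowMs);
    return true;
  }
}
```

Passing the clock in as an argument keeps the helper deterministic, so a unit test can cover the window boundary without sleeping.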
* Add tar + xz2 deps under the cockpit feature for tarball extraction.
* node::download() resolves the host platform (linux x64/arm64, macOS
x64/arm64; Windows is intentionally unsupported because it ships
.zip), fetches the pinned Node 22.21.0 tarball from nodejs.org/dist,
verifies SHA-256 against an embedded table, and extracts to
$AOE_DATA_DIR/cockpit/node-vX.Y.Z/.
* Pinned SHAs come straight from nodejs.org's SHASUMS256.txt; bumping
PINNED_NODE_VERSION requires refreshing every entry. A unit test
enforces that all four supported platforms are covered.
* aoe cockpit doctor --fix now triggers the download when no usable
Node is on PATH. The CLI command is now async so it can await the
fetch + extract.
Tests: 6 node unit tests (was 4; added sha256_hex against the
empty-string vector + a coverage check on the SHA table).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Wires every cockpit setting through the documented FieldKey + override
+ merge pipeline so they're editable in the settings TUI with profile
overrides that round-trip correctly.
* CockpitConfigOverride struct in profile_config with Option<T> for
every field; merge_configs honors each override.
* New SettingsCategory::Cockpit; build_cockpit_fields renders all 9
fields (3 bool, 4 number, 2 text) with inheritance markers.
* apply_field_to_global covers each field; apply_field_to_profile uses
the existing set_profile_override helper.
* clear_profile_override sets each Option to None when the user hits
the 'r' key.
* Re-export CockpitConfigOverride from session::mod.
Tests: 1274 lib tests with cockpit feature on (was 1268, +6 from the
config + node module additions). 1095 default-feature tests still pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* SpawnConfig.socket_path option: when set, aoe binds a unix listener
at that path BEFORE spawning the agent, exports
AOE_ACP_SOCKET=<path> to the agent's env, waits up to 10s for the
agent to connect, and uses the connected UnixStream's split halves as
the ByteStreams transport. On task exit the socket file is unlinked.
* run_connection_task is now generic over <W: AsyncWrite, R: AsyncRead>
so the same body handles stdio (ChildStdin/ChildStdout) and socket
(UnixStream split halves). socket_path is also threaded in for
cleanup.
* test-shim honors AOE_ACP_SOCKET: connects to the socket and uses it
as the ndJsonStream transport. Falls back to stdio when unset.
* New e2e test shim_agent_round_trips_via_unix_socket exercises the
full round-trip end-to-end: aoe creates the socket, spawns the shim,
shim connects, prompt + session/update flow back. Same shape as the
stdio path.
Tests: 5 cockpit e2e tests (was 4); 1274 lib tests; 1095 default tests.
Docker bind-mount integration (one -v line in
src/containers/runtime_base.rs for sandboxed cockpit sessions) lands
when the cockpit session type is wired into the sandbox spawn path.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* CockpitView is now responsive via a useIsDesktop matchMedia hook:
mobile (<768px) renders single-column stack with a chat drawer FAB;
desktop (>=768px) renders three-pane (plan left 300px, activity
center, chat dock right 360px).
* ChatDrawer component supports both variants:
- mobile: bottom-anchored sheet with FAB to open/close, slides from
bottom; close button visible
- desktop: always-docked column on the right
Enter sends, Shift+Enter for newline, optimistic disable while
sending, plain hover/focus styling matching the cockpit palette.
* useCockpit gains sendPrompt(text) helper that POSTs to
/api/sessions/{id}/cockpit/prompt and forwards through to the
worker supervisor.
* Approval and connection chrome moves to a top header so it overlays
on mobile but inlines on desktop.
Type-checks pass; Vite production bundle builds clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Add AppStateConfig.has_seen_cockpit_intro (cfg-gated on cockpit).
Tracked separately from has_seen_welcome / last_seen_version so the
one-time intro fires once, regardless of which version actually
introduced cockpit on the user's machine.
* New CockpitIntroDialog: 70x18 centered modal with the quickstart
command, doctor command, docs URL, and a quiet note about the Node
prereq. Same key handling as the existing welcome dialog (Enter /
Esc / Space / q to dismiss).
* Wired into HomeView like the existing one-time dialogs:
- cockpit_intro_dialog: Option<CockpitIntroDialog> field
- show_cockpit_intro() helper
- input.rs dispatch
- render.rs dispatch (cfg-gated branch after the macro for the
other dialogs since the macro is shared with non-cockpit builds)
* App::new fires it after the welcome+changelog flow when the flag is
unset, then persists the flag.
Tests: 1274 lib tests, 5 cockpit e2e all pass; 1095 default tests
unaffected.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* web/App.tsx: dispatch on activeSession.cockpit_mode; sessions with
the flag render <CockpitView/> in place of <TerminalView/>. The
fallback for tmux-mode sessions is unchanged so existing terminal
sessions keep working exactly as before.
* SessionResponse gains cockpit_mode (gated on the cockpit feature
server-side; optional in the TS shape so non-cockpit builds still
satisfy the type).
* CreateSessionBody learns cockpit_mode (defaults TRUE via
default_cockpit_for_web so browser-created sessions land in the
cockpit by default), cockpit_agent, cockpit_model. The fields flow
through into the constructed Instance.
* Cockpit-mode sessions skip tmux start(): no empty pane is created
for sessions whose backend is the ACP supervisor.
* After a successful create, if the session is cockpit_mode, kick off
Supervisor::spawn() on a background task. claude tool defaults to the
claude-code agent; everything else defaults to aoe-agent. Spawn
failures (missing Node, etc.) log a warning but don't fail the
request; the user can retry via the cockpit/spawn endpoint.
* aoe serve startup now sweeps persisted instances with cockpit_mode
and spawns workers for them too. Same best-effort semantics; happens
in parallel so a slow agent doesn't block the listener bind.
TUI default behavior is unchanged: NewSessionData doesn't set
cockpit_mode so it defaults false and `n` continues to create
tmux-backed sessions. Users opt in via aoe add --cockpit from the CLI.
A visible toggle in the new-session dialog is a small UI follow-up.
Tests: 1299 cockpit lib tests, 5 e2e, 1116 default. Clippy clean.
Auto-formatted by cargo fmt as part of the precommit hook.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The first cockpit view reinvented the layout with its own three-pane
split, which collided with the app shell — <ContentSplit> in App.tsx
already owns the workspace sidebar (left) and terminal/diff (right).
The cockpit's job is just the middle pane, like Conductor's chat
window.
* CockpitView now renders a single scrollable conversation:
- Optional plan strip pinned at the top, click to expand the steps.
- Message-style cells: user prompts as right-aligned bubbles, agent
text as full-width prose. Consecutive agent_message_chunk events
fuse into one bubble.
- Tool calls render INLINE as collapsible cards (status dot +
one-line summary, click to reveal output).
- Pending approvals appear inline at the bottom of the feed.
- Thinking indicator as a small italic bubble.
- Input area pinned at the bottom (auto-grow, Enter sends,
Shift+Enter newline).
- System notices (connecting / lagged / rate-limited) as a thin
bar above the feed.
- Auto-stick-to-bottom unless the user scrolled up >80px.
* useCockpit::sendPrompt now also dispatches a `user_prompt` action
that appends a user-side ActivityRow so the user's outgoing turns
appear in the conversation timeline. ActivityRow.kind gains
`user_prompt`.
* Drop the now-stale subcomponents: ChatDrawer (replaced by inline
Composer), PlanPanel (replaced by PlanStrip header), ActivityStream
(replaced by ConversationFeed).
* Drop useIsDesktop / 3-pane split entirely.
Tests: 1299 lib + 5 e2e, type-check + Vite build clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
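Two of the feed rules above are small pure functions: fusing consecutive agent_message_chunk rows into one bubble, and sticking to the bottom unless the user has scrolled up more than 80px. A rough sketch, assuming a simplified `Row` shape and hypothetical function names:

```typescript
// Hypothetical, simplified row shape for illustration.
interface Row {
  kind: "agent_message_chunk" | "user_prompt" | "tool_start";
  text: string;
}

// Merge runs of consecutive agent chunks into a single row.
function fuseAgentChunks(rows: Row[]): Row[] {
  const out: Row[] = [];
  for (const row of rows) {
    const last = out[out.length - 1];
    if (
      last !== undefined &&
      last.kind === "agent_message_chunk" &&
      row.kind === "agent_message_chunk"
    ) {
      // Extend the previous bubble instead of starting a new one.
      out[out.length - 1] = { kind: last.kind, text: last.text + row.text };
    } else {
      out.push({ ...row });
    }
  }
  return out;
}

const STICK_THRESHOLD_PX = 80; // from the commit message

// True when the viewport is within the threshold of the bottom, i.e. the
// user has not scrolled up far enough to opt out of auto-scroll.
function shouldStickToBottom(
  scrollTop: number,
  clientHeight: number,
  scrollHeight: number,
): boolean {
  const distanceFromBottom = scrollHeight - (scrollTop + clientHeight);
  return distanceFromBottom <= STICK_THRESHOLD_PX;
}
```

In a component, `shouldStickToBottom` would typically be evaluated in the scroll handler and consulted before scrolling on each new row.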
…ssions
Two related cockpit bugs:
- ACP spawn could hang silently (e.g. `npx -y` downloading on first
run). Gate spawn() on a 30s handshake deadline; on timeout, kill the
wedged child and publish a new AgentStartupError event end-to-end
(broadcast -> WS -> reducer -> red banner with
`aoe cockpit doctor --fix` hint). Default `claude-code` agent now
uses the installed `claude-agent-acp` binary instead of `npx -y`;
`doctor --fix` runs the global npm install.
- Cockpit-mode sessions were polled like tmux sessions and surfaced a
spurious "tmux session is gone" Error. Short-circuit
update_status_with_metadata_inner for cockpit_mode and clear any
stale error state.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
# Conflicts:
# Cargo.lock
# src/server/api/sessions.rs
# web/src/lib/types.ts
The supervisor's drain task held client.lock() across
next_event().await, which blocks indefinitely waiting on the inbound
mpsc. Any concurrent send_prompt (or any other Supervisor method) tried
to acquire the same mutex and hung forever, so the very first prompt
from the web UI never made it past the API layer.
Move the inbound mpsc::Receiver out of AcpClient (now Option<...>) when
the supervisor builds the worker, and let the drain task own the
receiver directly. The mutex now only guards the cmd_tx side, which is
fine because that side never await-blocks past a channel send.
Found by an e2e test of the cockpit UI; verified by reusing the
existing cockpit_acp_smoke tests (5/5 still pass) plus a Playwright run
that sends two prompts and resolves an approval through the real React
surface.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Four small fixes that together make the first-run cockpit experience
not feel broken:
- Remove the one-time "New: Cockpit (Native Agent Rendering)" TUI popup
and its has_seen_cockpit_intro tracking. Discoverability lives in the
docs and the `aoe cockpit` subcommand; we don't need an extra dialog
on every first launch.
- acp_client.spawn_subprocess now also forwards ANTHROPIC_API_KEY,
ANTHROPIC_AUTH_TOKEN, CLAUDE_CODE_OAUTH_TOKEN, and CLAUDE_CONFIG_DIR
by default. Without this, users who already have ANTHROPIC_API_KEY
exported (the common case) hit "Authentication required" because the
agent inherits an env_clear()'d environment and can't see the key.
- StartupErrorBanner branches on the error message: when the failure is
auth-shaped (matches /authentic|login|api[_ -]?key/i), show "set
ANTHROPIC_API_KEY or run claude /login" instead of the
install-the-adapter hint, which is misleading once the binary is
already on PATH.
- CockpitView and ApprovalCard now use the shared design tokens
(surface-*, text-*, brand-*) instead of Tailwind's blue-tinted slate
and amber palettes, matching the surrounding app shell.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
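The StartupErrorBanner branching described above reduces to one regex test. A minimal sketch: the regex and the auth hint text are taken from the commit message, while the function name and the exact wording of the install hint are assumptions.

```typescript
// Regex quoted in the commit message; no `g` flag, so test() is stateless.
const AUTH_SHAPED = /authentic|login|api[_ -]?key/i;

// Hypothetical helper: pick the hint to show under a startup error.
function startupHint(errorMessage: string): string {
  return AUTH_SHAPED.test(errorMessage)
    ? "set ANTHROPIC_API_KEY or run claude /login"
    : "run `aoe cockpit doctor --fix` to install the adapter"; // assumed wording
}
```

Matching on message text is a heuristic: an unrelated error that happens to mention "login" would also get the auth hint.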
…polish)
Brings the cockpit conversation surface in line with Cursor / VSCode
agent chat patterns. All seven items from the UX research land here:
1. Agent text now renders as markdown (paragraphs, headings, lists,
blockquotes, fenced code blocks with shiki syntax highlighting,
inline code, links). New `Markdown.tsx` parses with `marked` and
renders each token through tailwind-styled React components so
colors come from the design tokens, not the library defaults.
2. Per-tool renderers in `ToolCards.tsx`, dispatched by the new
`kind` field on `ToolCall` (plumbed from ACP `ToolKind`):
- `execute` shows a `$ command` line with collapsible body
- `read` shows file path + optional line range
- `edit` / `delete` show a mini-diff with `-` / `+` styling
- `search` shows the query (and scope when known)
- `fetch` shows the URL
- `think` is a one-line italic note
- `other` falls back to a generic expandable card
3. Vertical rhythm: a soft horizontal divider above each user turn
(except the first), tighter spacing for tool→tool runs, more
breathing room for agent text.
4. Stop button: when the agent is thinking or has a tool in flight,
the Send button is replaced by a stop-square that POSTs to a new
`/api/sessions/{id}/cockpit/cancel` endpoint. The endpoint sends an
ACP `session/cancel` notification via a new `ClientCmd::Cancel` and
`AcpClient::cancel_prompt`.
5. Empty state: three starter prompt chips ("Explain this codebase"
etc.) replace the bare "type a prompt" placeholder. Clicking a chip
sends the prompt immediately.
6. Refined user bubble: smaller right-aligned chip with a rounded-br
corner, subtle border, no longer competing with agent text for
visual weight.
7. Composer affordances: keyboard hint (`Enter to send · Shift+Enter
for newline`) under the textarea, auto-focus on mount so the user
can type immediately after picking a session.
Backend: `ToolCall` gains a `kind: String` field carrying the ACP
`ToolKind` lowercased (read/edit/execute/search/...). The web
`ActivityRow` now keeps the full `ToolCall` payload on `tool_start`
rows so the UI doesn't have to look it up by id.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two follow-ups from the cockpit overhaul:
1. wterm's async init() at node_modules/@wterm/dom/dist/wterm.js:56
calls input.focus() unconditionally after the WASM bridge loads,
firing 200-500ms after mount and stealing focus from the agent
composer. The composer's onMount focus runs sync and loses the race.
Re-claim focus at 250ms and 700ms, but only when focus is on
document.body or inside .wterm, so an intentional click into the
host shell during the window sticks.
2. The "Thinking…" bubble only showed during ACP AgentThoughtChunk
events, leaving the user staring at a blank pane while the agent was
running tools or waiting for first text chunk. Track a new
`turnActive` flag (set on user_prompt, cleared by Stopped /
AgentStartupError) and render a spinning glyph + contextual label
whenever the turn is open: "Thinking…", "Running <tool>…", or
"Working…" otherwise. Also drives the Send → Stop button swap so
cancel is reachable for the entire turn.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
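The `turnActive` bookkeeping and label choice from item 2 can be sketched as two pure functions. The type and function names are hypothetical, and the precedence of a running tool over the thinking state is an assumption; the commit message only lists the three labels.

```typescript
// Events that matter for the turn flag; everything else is "other".
type TurnEvent = "user_prompt" | "Stopped" | "AgentStartupError" | "other";

// Set on user_prompt, cleared by Stopped / AgentStartupError.
function nextTurnActive(current: boolean, ev: TurnEvent): boolean {
  if (ev === "user_prompt") return true;
  if (ev === "Stopped" || ev === "AgentStartupError") return false;
  return current;
}

interface TurnState {
  turnActive: boolean;
  thinking: boolean;          // an AgentThoughtChunk is streaming
  runningTool: string | null; // name of the in-flight tool, if any
}

// Returns the indicator label, or null when no turn is open.
function turnLabel(s: TurnState): string | null {
  if (!s.turnActive) return null;
  // Assumption: a running tool takes precedence over the thinking state.
  if (s.runningTool !== null) return `Running ${s.runningTool}…`;
  if (s.thinking) return "Thinking…";
  return "Working…";
}
```

The same `turnActive` value can drive the Send/Stop swap, so the label and the button cannot disagree.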
Replace the plain "Working…" / "Thinking…" / "Running X…" labels with a
braille-spinner glyph and a rotating verb pool themed around Agent of
Empires' civilization-building flavor. A nod to Claude Code's
"ruminating" / "noodling" verbs and the Rust `rattles` spinner crate
the TUI already uses for ratatui status indicators.
`cockpitRattle.ts` defines:
- SPINNER_FRAMES: 10-step braille rotation (⠋⠙⠹⠸⠼⠴⠦⠧⠇⠏), 80ms/frame
- WORKING_VERBS: 35 empire verbs (Conscripting villagers, Marshalling
forces, Quarrying granite, Calibrating trebuchets, Plundering
archives, Negotiating treaties, …)
- THINKING_VERBS: 14 mystical/divinatory verbs for AgentThoughtChunk
state (Consulting auguries, Casting bones, Whispering with elders,
Studying the stars, …)
- chooseVerb() — deterministic from a seed so the verb stays stable
across re-renders within a tick. The seed bumps every 4s so long
turns rotate through different verbs without flickering.
Tool runs override the verb pool: instead of a generic empire word,
the tool's own name is dressed up with a pool of action prefixes
("Wielding read", "Dispatching write", "Marshalling search", …), so
the user always sees what's actually running.
Test infra: the shim now honors a "SLOW" prompt keyword that adds
800ms gaps between session/update events, so e2e tests can observe
the mid-turn UI without a real model.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
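The timing rules for the rattle (80ms per spinner frame, verb seed bumping every 4s, deterministic verb choice) can be sketched as below. The frame list, the 80ms and 4s figures, and the sample verbs come from the commit message; the hash used in `chooseVerb` and the helper names `spinnerFrame` and `verbSeed` are assumptions, and the verb pool here is only a subset.

```typescript
const SPINNER_FRAMES = ["⠋", "⠙", "⠹", "⠸", "⠼", "⠴", "⠦", "⠧", "⠇", "⠏"];
const FRAME_MS = 80;         // per-frame duration
const VERB_ROTATE_MS = 4000; // the seed bumps every 4s

// Subset of the verb pool, for illustration only.
const WORKING_VERBS = [
  "Conscripting villagers",
  "Marshalling forces",
  "Quarrying granite",
  "Calibrating trebuchets",
];

// Frame to draw after `elapsedMs` of an open turn.
function spinnerFrame(elapsedMs: number): string {
  const i = Math.floor(elapsedMs / FRAME_MS) % SPINNER_FRAMES.length;
  return SPINNER_FRAMES[i] ?? "⠋";
}

// The seed is constant within each 4s tick, so re-renders inside a tick
// always pick the same verb.
function verbSeed(elapsedMs: number): number {
  return Math.floor(elapsedMs / VERB_ROTATE_MS);
}

// Deterministic pick: the same (pool, seed) always yields the same verb.
// Assumption: a multiplicative integer hash spreads consecutive seeds.
function chooseVerb(pool: readonly string[], seed: number): string {
  const hashed = Math.imul(seed + 1, 2654435761) >>> 0;
  return pool[hashed % pool.length] ?? "Working";
}
```

A render would call `chooseVerb(WORKING_VERBS, verbSeed(elapsed))` alongside `spinnerFrame(elapsed)`; only the spinner changes between ticks.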
…itives
Replace the hand-rolled conversation feed with assistant-ui's headless
React primitives. We keep ownership of the chat *state* (it streams
from the cockpit WS, not from a chat protocol assistant-ui knows about);
assistant-ui owns the chat *surface* — scroll viewport, message list,
keyboard shortcuts, accessibility, message-edit affordances, the
running/idle gating.
Architecture
ws frame → applyEvent → CockpitState.activity (ours)
│
▼
activityToThreadMessages() → ThreadMessageLike[]
│
▼
useExternalStoreRuntime(adapter) → AssistantRuntime
│
▼
<AssistantRuntimeProvider runtime>
│
▼
<ThreadPrimitive.Messages components={…}>
<ComposerPrimitive.Root>
The new `CockpitRuntime` component wraps the cockpit and exposes the
raw state via render-prop for the bits assistant-ui doesn't own (plan
strip, system notices, startup error banner, ACP approval cards).
Renderers we wrote earlier all keep their place:
- Markdown.tsx → injected as `Text` part component
- ToolCards.tsx → injected as `tools.Override` so per-kind cards
(read/edit/execute/search/…) render inside assistant messages
- cockpitRattle (verbs + braille frames) → driven by the new
`<ThreadPrimitive.If running>` gate
- ApprovalCard.tsx → rendered below the message stream as before
Composer behaviour:
- `<ComposerPrimitive.Input>` replaces our textarea; auto-grow + the
wterm focus-race reclaim logic still attach to its element ref
- Send/Stop swap via `<ThreadPrimitive.If running>`; Stop calls
`useThreadRuntime().cancelRun()` which the runtime adapter
forwards to our `cockpit/cancel` REST endpoint
- `onNew` flattens AppendMessage parts to plain text (ACP only
accepts text prompts); attachments/images dropped silently for now
Verified: e2e (13/13 steps), spinner mid-turn snapshot still shows the
empire-themed rattle, focus reclaim still beats wterm's async init,
all 5 cockpit_acp_smoke tests pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Lift the composer out of CockpitView.tsx into its own file and rebuild
the chrome around assistant-ui's `<ComposerPrimitive>` to match the feel
of VSCode chat and Cursor agent chat.
What changed visually:
- Tall multi-line input by default (rows=2, min-h-14, max-h-50) so
the composer reads as a writing surface, not a one-liner. Auto-
grows up to 200px before scrolling.
- Top-affordance toolbar row inside the composer card with `@`
files, `/` commands, and paperclip attachment icons (lucide-react,
visually close to the icon family VSCode and Cursor use). Disabled
with "coming soon" tooltips for now; their presence sets the visual
frame.
- Send button: paper-plane icon on a brand-amber rounded pill, with
a hover lift and 0.98 active-scale so it feels press-able.
- Stop button: square-stop icon + "Stop" text, hover styling tinted
rose to read as a destructive/cancel action.
- Keyboard hint as a kbd-styled chip ("↵ Send") next to the send
button instead of dim text floating below.
- Subtle inner shadow on the composer well + amber focus ring (3px
glow) on focus-within. Clear visual hierarchy: input > toolbar >
actions.
Lifted out of the view so it can grow features (model picker, file
chips, slash command popover) without cluttering CockpitView.tsx.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… tool cards
Closes the eight remaining naive UI items in one pass.
Backend (new endpoints + ACP set_mode):
- GET /api/sessions/:id/cockpit/files → workspace file list (5k cap,
skips .git/node_modules/target/dist/build/.next/.venv/.cache/etc.)
- POST /api/sessions/:id/cockpit/mode → ACP `session/set_mode`
via new ClientCmd::SetMode + AcpClient::set_mode + Supervisor::
set_mode
Frontend additions:
- TriggerPopover.tsx: generic @-/-trigger combobox. Detects trigger
chars at word boundaries (whitespace before, word chars after),
arrow-keys/Enter/Tab/Esc navigation, mousedown-insert so the
textarea doesn't lose focus. Plugged into the composer twice —
once for @ (file picker, fed by /cockpit/files + fuzzyFilter)
and once for / (slash commands, hard-coded /help/clear/tools/model
for now).
- ModePicker in PlanStrip: clickable mode chip with Default/Plan/
AcceptEdits/Yolo (BypassPermissions) options, each with a one-line
hint. Click → POST /cockpit/mode. Tinted by current mode (rose
for Yolo, amber for AcceptEdits, cyan for Plan).
- Plan strip itself: progress bar (visual, animated transition),
completed/total counter, chevron rotation on expand, and a thin
always-visible bar with the mode picker even when no plan is
active.
- Hover affordances on messages via ActionBarPrimitive: copy + edit
on user messages, copy + regenerate on assistant. Visible on
hover/focus only (group-hover:opacity-100), Lucide icons.
- useStreamReveal hook: char-budget reveal (24 chars/16ms baseline,
accelerates when >200 chars behind). Smooths ACP's chunky text
delivery into typewriter-style streaming.
- Composer toolbar: @ and / icon buttons now insert the trigger char
at the caret (with space-padding when mid-word) so the popover
opens. Paperclip remains a coming-soon stub.
ToolCards.tsx rewrite — VSCode/Cursor-style:
- Common CardChrome with status dot + per-kind icon + label + meta
+ collapse chevron.
- Bash: $-prefixed command, output highlighted as `bash` lang in
shiki, line count in meta, "Show N more" expand for long output.
- Read: file path + line range + line count, content highlighted
by extension, 16-line preview default.
- Edit: real diff lines with +/− gutter, shiki-highlighted body,
+N/−N counters in meta.
- Search: query + scope, line-numbered match list capped at 50.
- Fetch: URL primary, JSON-highlighted output.
- Think: one-line italic, no chrome.
- Generic fallback: input + output sections with copy buttons.
ApprovalCard rewrite:
- Aligned with tool-card visual language: same border-md, same
bg-surface-800/50 base, header strip with shield/alert icon and
label.
- Three-button block: Allow / Always (benign) or Hold-to-allow
(destructive), plus Deny on the right. Lucide icons throughout.
- Args preview lives in a max-h-32 scrollable code block.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
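The char-budget reveal behind useStreamReveal (24 chars per 16ms tick, accelerating when more than 200 chars behind) can be sketched as one step function. The two constants come from the commit message; the function name and the acceleration formula (an extra quarter of the backlog per tick) are assumptions.

```typescript
const BASE_CHARS_PER_TICK = 24; // baseline budget per 16ms tick
const CATCH_UP_THRESHOLD = 200; // accelerate when further behind than this

// Given how many chars are already revealed and how many have arrived,
// return how many should be revealed after the next tick.
function nextRevealed(revealed: number, total: number): number {
  const behind = total - revealed;
  if (behind <= 0) return total; // already caught up
  // Assumption: when far behind, add a quarter of the backlog per tick so
  // the lag shrinks geometrically instead of linearly.
  const budget =
    behind > CATCH_UP_THRESHOLD
      ? BASE_CHARS_PER_TICK + Math.ceil(behind / 4)
      : BASE_CHARS_PER_TICK;
  return Math.min(total, revealed + budget);
}
```

A hook would call this from a 16ms timer and render `text.slice(0, revealed)`.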
Three pieces I shipped from scratch that were already in the toolbox.
Net code: similar line count after factoring out package-lock churn,
but ~520 lines of bespoke logic deleted in favor of battle-tested
upstream code that the assistant-ui team maintains.
1. TriggerPopover.tsx (287 lines) → ComposerPrimitive.Unstable_TriggerPopover
I had the exports printed out earlier and skipped over the
Unstable_ family. The official primitive ships:
- Trigger detection at word boundaries
- Arrow-key/Enter/Tab/Esc navigation with data-highlighted
- Mousedown insertion (so the textarea doesn't lose focus)
- Plugin registry integration with ComposerInput so cursor
position is wired automatically
- Search vs categories drill-down with isSearchMode
We provide an Unstable_TriggerAdapter (categories/categoryItems/
search) that returns items for @ files / / commands. Pairing the
`@` trigger with .Directive (chip-into-text) and the `/` trigger
with .Action (handler-fires-immediately) gets us both UXes
declaratively. Empty `categories` skips the drill-down step so
a flat file list shows the moment `@` is typed.
2. Markdown.tsx (~200 lines + useStreamReveal hook) → MarkdownTextPrimitive
from @assistant-ui/react-markdown (already in deps, never imported).
The primitive handles:
- Streaming-aware rendering (incomplete fenced blocks)
- Built-in `smooth` char-budget reveal (replaces useStreamReveal)
- Standard markdown via remark/rehype
We plug in a SyntaxHighlighter component backed by our existing
shiki integration, plus a CodeHeader matching the design tokens.
index.css gets descendant selectors for prose because the primitive
emits unclassed elements.
3. EditToolCard custom +/- diff renderer → react-diff-viewer-continued
The library does word-level diff, line numbers, gutter colors,
expand-context. We override its dark theme variables to match
our zinc/brand palette so the diff doesn't read as "pasted in
from another app". Diff lib is loaded only inside the Edit
card, so the bundle penalty stays in the tool-card chunk.
Composer also gains the proper plugin-registry path: the
TriggerPopoverRoot wraps ComposerPrimitive.Root, so the input fires
setCursorPosition() into all registered triggers automatically.
Both `@` and `/` use the same primitive, just with different adapters
and behaviors (Directive vs Action).
The custom code that stays:
- CockpitRuntime.tsx (ACP-specific, not generic)
- useFilesIndex (one-shot fetch + memo)
- fuzzyFilter (small enough to not warrant a dep)
- The ApprovalCard, ModePicker, and PlanStrip remain bespoke
because they're product-specific (ACP approvals, ACP modes).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Four issues from real-Claude usage:
1. Bash tool card showed literal "$ {}" when raw_input was empty.
Forward the ACP `tool.title` through CockpitRuntime via a namespaced
`_aoe_title` key in the args JSON so per-kind renderers can fall
back to a descriptive label. Updated Execute/Read/Edit/Delete/
Search/Fetch cards to chain: real arg field → forwarded title →
bare tool kind.
2. Mode picker said "Default" even when the agent was running in
yolo mode. Root cause: we never read agent-advertised modes —
only listened for a CurrentModeUpdate event we never received
because the ACP `session/update` for mode change is rare. Now:
- Capture the mode set from `NewSessionResponse.modes` on
session creation and emit a new `Event::ModesAvailable`
carrying the agent's actual modes (id + name + description).
- Map ACP `SessionUpdate::CurrentModeUpdate` to a new
`Event::CurrentModeChanged` in addition to the legacy
enum-based `ModeChanged`.
- UI tracks `state.availableModes` + `state.currentModeId`
and renders the picker from those (falls back to the
hard-coded four-mode taxonomy when the agent doesn't
advertise any).
3. Mode picker also moved from the top PlanStrip bar into the
composer footer (Cursor-style: inline with the `@`, `/`, and paperclip
toolbar buttons; opens upward via `bottom-full`). PlanStrip is
now hidden entirely when there's no plan and mode is Default,
instead of rendering an empty bar to host the picker.
4. Cockpit-mode session in the TUI showed "tmux pane is gone"
because `Instance::start_with_size_opts` unconditionally
created a tmux session. Cockpit sessions don't have a tmux
backing — the supervisor spawns the ACP agent process directly.
Short-circuit start() for cockpit_mode at the top.
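The mode tracking from item 2 can be sketched as a small reducer. This is a minimal illustration, not the branch's code: the `Mode` and `ModeState` types, the field shapes on the two events, and the `label` helper are all hypothetical, and the fallback to the hard-coded four-mode taxonomy is left out.

```rust
/// One agent-advertised mode (id + name + description, as the commit describes).
/// The struct itself is a hypothetical stand-in.
#[derive(Debug, Clone, PartialEq)]
struct Mode {
    id: String,
    name: String,
    description: Option<String>,
}

/// Only the two mode-related events; field shapes are assumptions.
#[derive(Debug, Clone)]
enum Event {
    ModesAvailable { modes: Vec<Mode>, current: Option<String> },
    CurrentModeChanged { mode_id: String },
}

/// Hypothetical mirror of `state.availableModes` + `state.currentModeId`.
#[derive(Default)]
struct ModeState {
    available_modes: Vec<Mode>,
    current_mode_id: Option<String>,
}

impl ModeState {
    fn apply(&mut self, event: Event) {
        match event {
            // Sent once at session creation with whatever the agent advertised.
            Event::ModesAvailable { modes, current } => {
                self.available_modes = modes;
                if current.is_some() {
                    self.current_mode_id = current;
                }
            }
            // Sent whenever the agent reports a mode switch.
            Event::CurrentModeChanged { mode_id } => {
                self.current_mode_id = Some(mode_id);
            }
        }
    }

    /// Picker label: advertised name if known, else the raw id, else "Default".
    fn label(&self) -> String {
        match &self.current_mode_id {
            None => "Default".to_string(),
            Some(id) => self
                .available_modes
                .iter()
                .find(|m| &m.id == id)
                .map(|m| m.name.clone())
                .unwrap_or_else(|| id.clone()),
        }
    }
}
```

With this shape the picker only reads "Default" when no mode has been reported at all, which is the symptom item 2 describes.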
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…r (Beta)
Make cockpit a real per-session opt-in across every aoe-supported
agent that has a published ACP server, not just Claude.
Default registry (`AgentRegistry::with_defaults()`) now seeds one
entry per tool, keyed by the same name the tmux substrate uses, so
the spawn path can map `instance.tool` directly to a registry key:
claude → claude-agent-acp (Zed adapter for Claude SDK)
opencode → `opencode acp` (native, SST)
gemini → `gemini --acp` (native, Google)
codex → codex-acp (Zed adapter, OpenAI Codex CLI)
vibe → vibe-acp (native, Mistral)
pi → pi-acp (adapter, Hermes coding agent)
aoe-agent → bundled multi-provider fallback
Verified each invocation against agentclientprotocol.com/get-started/
agents.md and the upstream agent docs (Jan 2026). The legacy
"claude-code" key stays as an alias so persisted sessions with
cockpit_agent="claude-code" still resolve.
Spawn path (`Supervisor::pick_agent_for_tool`) replaces the
hard-coded `"claude → claude-code, else aoe-agent"` fallback in
three places:
src/server/api/cockpit.rs (POST /cockpit/spawn)
src/server/api/sessions.rs (auto-spawn after create)
src/server/mod.rs (auto-spawn at serve startup)
Precedence: explicit override → registry entry keyed on tool →
legacy fallback. So `aoe add . --cmd opencode --cockpit` now spawns
real opencode-via-ACP, not the generic aoe-agent.
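The precedence above can be sketched as follows. This is an illustration under assumptions, not the actual `Supervisor::pick_agent_for_tool`: the registry is modelled as a plain `HashMap`, the function is free-standing, and the override parameter is named `explicit` because `override` is a reserved word in Rust.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the default registry: tool name -> ACP invocation.
fn default_registry() -> HashMap<&'static str, &'static str> {
    HashMap::from([
        ("claude", "claude-agent-acp"),
        ("opencode", "opencode acp"),
        ("gemini", "gemini --acp"),
        ("codex", "codex-acp"),
        ("vibe", "vibe-acp"),
        ("pi", "pi-acp"),
    ])
}

/// Returns the registry key to spawn for a session's tool.
fn pick_agent_for_tool(
    registry: &HashMap<&'static str, &'static str>,
    tool: &str,
    explicit: Option<&str>,
) -> String {
    // 1. An explicit override always wins.
    if let Some(key) = explicit {
        return key.to_string();
    }
    // 2. Otherwise use the registry entry keyed on the tool name.
    if registry.contains_key(tool) {
        return tool.to_string();
    }
    // 3. Legacy fallback: "claude -> claude-code, else aoe-agent".
    if tool == "claude" {
        "claude-code".to_string()
    } else {
        "aoe-agent".to_string()
    }
}
```

In this sketch `pick_agent_for_tool(&default_registry(), "opencode", None)` resolves to the `opencode` entry, while a tool with no registry entry falls through to `aoe-agent`.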
`aoe cockpit doctor` walks the new registry and prints a per-agent
status with tailored install hints. `--fix` runs `npm install -g`
for the npm-distributed adapters (claude / codex / pi); native CLIs
get a one-line install hint pointing at the upstream installer.
Doctor banner now reads "Cockpit doctor (Beta)" with a one-line
explainer about substrate selection.
Web wizard gains an explicit two-card substrate picker (Cockpit Beta
vs Terminal Stable). The cockpit card is greyed out when the selected
tool isn't in our `ACP_CAPABLE_TOOLS` allowlist (for now that excludes
aider, cursor, copilot, droid, settl, and hermes). The wizard sends
`cockpit_mode` on
the create-session request; server's `default_cockpit_for_web`
default still applies when omitted, but the wizard always sends it
explicitly now.
docs/cockpit.md gets a Beta callout at the top, a per-agent
support/auth table, and an updated doctor sample reflecting the
new format.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
# Conflicts: # src/session/mod.rs
Two new endpoints + a discreet UI button per view make `cockpit_mode`
runtime-mutable. The data model was already a per-session bool; this
plumbs the actual transitions:
POST /api/sessions/:id/cockpit/enable
POST /api/sessions/:id/cockpit/disable
Both are idempotent (200 with no work when already in the target
state). Enable validates the tool has an ACP-capable registry entry
before flipping (so we don't strand a session in cockpit mode with no
agent to spawn). Persistence happens before the new substrate starts,
so a crash mid-swap leaves the declared end state on disk rather than
a half-broken intermediate.
Cockpit → tmux:
- Supervisor::shutdown drops the ACP worker (UnknownSession is fine
if startup never completed).
- cockpit_mode = false, save.
- Instance::start() creates a fresh tmux pane and runs the agent.
Tmux → cockpit:
- Instance::kill() drops the tmux pane.
- cockpit_mode = true, save.
- Supervisor::spawn fires off a worker. If it fails (binary missing,
auth missing, etc.) the standard AgentStartupError flows through
to the UI's red banner — the swap itself returns 200 because the
substrate state on disk is correct.
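The ordering of the two transitions can be modelled as a pure planning function. This is a sketch only: `Step`, `Session`, `acp_capable`, and `set_cockpit_mode` are hypothetical names, and the real handlers perform I/O rather than returning a list of steps.

```rust
/// Hypothetical labels for the side effects of a substrate swap.
#[derive(Debug, PartialEq, Clone, Copy)]
enum Step {
    StopTmux,
    StopCockpit,
    Persist,
    StartTmux,
    StartCockpit,
}

/// Hypothetical slice of session state relevant to the swap.
struct Session {
    cockpit_mode: bool,
    /// Whether the session's tool has an ACP-capable registry entry.
    acp_capable: bool,
}

/// Plans (and applies to `session`) a switch to `target`, returning the steps in order.
fn set_cockpit_mode(session: &mut Session, target: bool) -> Result<Vec<Step>, String> {
    // Idempotent: already in the target state means no work.
    if session.cockpit_mode == target {
        return Ok(Vec::new());
    }
    // Enable validates before flipping, so a session is never stranded
    // in cockpit mode with no agent to spawn.
    if target && !session.acp_capable {
        return Err("tool has no ACP-capable registry entry".to_string());
    }
    let mut steps = Vec::new();
    // Tear down the old substrate first.
    steps.push(if target { Step::StopTmux } else { Step::StopCockpit });
    // Persist the declared end state before starting the new substrate,
    // so a crash mid-swap leaves the end state on disk.
    session.cockpit_mode = target;
    steps.push(Step::Persist);
    steps.push(if target { Step::StartCockpit } else { Step::StartTmux });
    Ok(steps)
}
```

A failed start after `Persist` is deliberately not modelled as an error here, matching the description that the swap returns 200 and the startup failure surfaces separately.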
UI: new SwitchSubstrateAction component with a destructive-confirm
modal. Plugged into:
- the cockpit composer toolbar (icon-only, next to mode picker)
- the TerminalView top-right corner (icon-only, absolute-positioned)
The cockpit-side button always works; the terminal-side button greys
out when the tool isn't in our ACP_CAPABLE_TOOLS allowlist (for now
that excludes aider, cursor, copilot, droid, settl, and hermes). Confirm dialog
explains what's destroyed (cockpit conversation log / tmux scrollback)
and what's preserved (worktree, open files, session id).
After the API returns, the session-list poll picks up the new
cockpit_mode within ~3s and the parent flips between <CockpitView>
and <TerminalView>; the explicit refresh wasn't worth threading
through.
Verified end-to-end with a Playwright suite (13/13 pass): both
directions through the UI, idempotent re-enable/re-disable through
the API, backend/frontend stay in sync, confirm dialog gating works.
The pre-existing 14-scenario comprehensive suite still passes
(23/23 + 1 skip on the pre-existing replay-buffer item).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
# Conflicts: # web/src/components/TerminalView.tsx # website/scripts/sync-docs.mjs
do you need some testing on this?
@Seluj78 If you would have a chance, absolutely! This is the next generation feature I think to make the web mode of AOE great, but I haven't had the time to finish the feature.
💯 this is my concern too. My hope 🤞 was that by using ACP we would have to worry about less of this and things would just work https://github.com/agentclientprotocol/agent-client-protocol . Thank you for giving it a try. One of the biggest hurdles I'm trying to figure out is how to manage a session between TUI and Web in cockpit mode. Like, in the web dashboard I certainly want it using cockpit mode, but then if I go to use the TUI I think I would rather have it use the tmux+native Claude Code etc mode. So is there a way where I can run a single session and have it be able to switch between cockpit and tmux views? Idk 🤷
Oooh that's what this is! I didn't bother checking it out earlier. It makes so much sense! Then yeah 100% this is the right way to go, my concern has disappeared. The only thing I would keep in mind is a way to know when new versions of the SDK are released so we know that we need to update AOE to support them (a CI check?)
From what I saw in the webUI (I didn't try the TUI in this PR) you had a toggle to switch back to tmux. So I don't understand right now the problem you're having :) @njbrake could you update this PR (43 commits behind) and I'll try to do some sessions on it and see how it feels and give more feedback :)
# Conflicts: # Cargo.lock
The cockpit (ACP-based structured rendering) only ships alongside the web dashboard, so the standalone `cockpit` cargo feature was redundant. Consolidating means one feature flag for "I want the web surface" and no risk of someone enabling cockpit without realising they need serve.
- Drop the `cockpit` feature from Cargo.toml; roll its deps (agent-client-protocol, agent-client-protocol-tokio, tar, xz2) into the existing `serve` feature list.
- Sweep `#[cfg(feature = "cockpit")]` → `#[cfg(feature = "serve")]` across 44 sites in src/ and tests/.
- Update docs/cockpit.md and the one stale comment in tui/settings/fields.rs to reference `serve`.
- Fix three pre-existing test fixtures missing the `kind` field on ToolCall (uncovered now that cockpit tests run with `--features serve` instead of being routed to a separate feature build).
Cockpit code still lives entirely under src/cockpit/, so the rip-out story is unchanged: delete the module + its handler files in src/server/, and the rest of `serve` keeps working.
Cockpit sessions spawned a `claude-agent-acp` subprocess (plus its SDK
child) that lived on past two cleanup paths:
1. `DELETE /api/sessions/{id}` ran `perform_deletion` (worktree + tmux
teardown) but never told the cockpit supervisor to shut down. The
worker handle stayed in `Supervisor::workers`, the spawned process
stayed alive.
2. `aoe serve` graceful shutdown handled SIGINT/SIGTERM/SIGHUP for the
daemon itself but never called `Supervisor::shutdown_all`, so each
running cockpit session leaked its wrapper + SDK child when the
daemon exited.
After repeated probe runs we'd accumulate 6+ orphan node processes per
session. Wire the supervisor shutdown into both paths; verified end-to-
end that two cockpit sessions drop their processes to zero on delete,
and a SIGTERM to the daemon reaps the rest.
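A minimal sketch of the wiring described above, assuming a supervisor that keeps one worker handle per session id. `Worker`, its `alive` flag, and `delete_session` are hypothetical stand-ins; the real worker owns the ACP subprocess and stopping it would terminate that process.

```rust
use std::collections::HashMap;

/// Hypothetical worker handle; the real one would own the agent subprocess.
struct Worker {
    alive: bool,
}

impl Worker {
    fn stop(&mut self) {
        // Stand-in for terminating the wrapper process and its SDK child.
        self.alive = false;
    }
}

#[derive(Default)]
struct Supervisor {
    workers: HashMap<String, Worker>,
}

impl Supervisor {
    fn spawn(&mut self, session_id: &str) {
        self.workers
            .insert(session_id.to_string(), Worker { alive: true });
    }

    /// Stops and forgets one session's worker; returns false if none was registered.
    fn shutdown(&mut self, session_id: &str) -> bool {
        match self.workers.remove(session_id) {
            Some(mut worker) => {
                worker.stop();
                true
            }
            None => false,
        }
    }

    /// Stops every remaining worker; returns how many were stopped.
    fn shutdown_all(&mut self) -> usize {
        let count = self.workers.len();
        for (_, mut worker) in self.workers.drain() {
            worker.stop();
        }
        count
    }
}

/// Path 1: session deletion must also tell the supervisor to shut down.
fn delete_session(supervisor: &mut Supervisor, session_id: &str) {
    // Worktree + tmux teardown would run here.
    supervisor.shutdown(session_id);
}
```

Path 2 is the same idea at daemon scope: the graceful-shutdown handler calls `shutdown_all` before the process exits.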
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@njbrake are there specific things you need tested in this PR? Or just general feedback on the experience of using it? Or something else? :)
@Seluj78 yes at this point just general feedback. There are a few dimensions we need to test/debug before merge.
Basically I don't need this PR to handle all the edge cases but I need to be sure that the web app is at least generally as usable as it was before the change, and that it doesn't regress anything in the TUI to degrade the TUI experience
@njbrake okay, well I can't go any further in my testing because I cannot get claude to respond to me in the cockpit mode. Actually I wanted to reproduce the issue above and I can't even get a new session to show up:
I ran those commands and it said
Traced the flow. Two-tier bug:
1.
#[cfg(feature = "serve")]
if self.cockpit_mode {
    return Ok(());
}
The CLI prints
2. Cockpit workers only auto-spawn at
ACP worker spawn happens in three places:
Nothing watches for cockpit sessions added after
Repro of the exact flow in the report:
This matches both screenshots in the comment above: "claude doesn't respond" (worker missing for an existing cockpit session) and "session never starts" (worker missing for the freshly-added one).
Fix options:
Going with #1 unless you'd rather a different shape; happy to push a fix on top of this branch.
@Seluj78 thanks! I'm good with you pushing, maybe you need to create a branch in your fork to be able to do this though.
Yes, cause I'm not a maintainer in this repo, I'll make a branch in my fork that targets the
(once this PR is merged, or at least in a working state with my PR merged into it I will start using AOE to work on AOE 😉 )




Description
Adds cockpit, an ACP-based structured rendering surface that runs alongside the existing tmux passthrough. Every aoe session is now a per-session pick: `tmux` (legacy, raw bytes through wterm) or `cockpit` (Beta, agent speaks Agent Client Protocol; aoe renders typed events as React cards). Tmux remains the default; cockpit is opt-in via `aoe add --cockpit` or the new substrate picker on the web wizard.
The data model (`cockpit_mode: bool` per session) is already merged on `main`; this branch is the polish + ecosystem expansion that turns cockpit into something you'd actually want to use, plus per-tool ACP support so it's not Claude-only.
What's in here
Core substrate (was already on the branch from earlier merges):
- src/cockpit/supervisor.rs) with restart budget, drain task, fs/terminal handlers
- src/server/cockpit_ws.rs)
- @assistant-ui/react primitives
Per-tool ACP (e4ad824): verified each agent's ACP invocation against agentclientprotocol.com/get-started/agents.md and seeded the registry:
claude → `claude-agent-acp` (Zed adapter)
opencode → `opencode acp` (native, SST)
gemini → `gemini --acp` (native, Google)
codex → `codex-acp` (Zed adapter)
vibe → `vibe-acp` (native, Mistral)
pi → `pi-acp` (Hermes coding agent)
aoe-agent
`Supervisor::pick_agent_for_tool(tool, override)` replaces three copy-pasted `"claude → claude-code, else aoe-agent"` fallbacks.
UX polish (56a8822 → dd249aa):
- @assistant-ui/react-markdown)
- `react-diff-viewer-continued` for edit cards
- `@`-mention file picker (assistant-ui's `Unstable_TriggerPopover` + workspace file index endpoint)
- `/` slash commands
- ActionBarPrimitive)
- NewSessionResponse.modes, drop-up menu in composer footer
Reliability fixes:
- 3cccf46: drain-task/send_prompt deadlock (drain held client mutex across `recv().await`)
- 28e8066 + 30a21f8: TUI no longer marks cockpit sessions as errored ("tmux pane is gone")
- d244ad6: composer wins focus race against wterm's async init
- c1bb7e0: auth env forwarded by default (ANTHROPIC_API_KEY, CLAUDE_CONFIG_DIR, etc.)
Library refactor (f1298bc): replaced ~520 lines of reinvention with first-party assistant-ui primitives (TriggerPopover, MarkdownTextPrimitive, smooth streaming).
Doctor + docs:
- `aoe cockpit doctor` walks the full registry, prints per-agent install hints, `--fix` `npm install -g`s the npm-distributed adapters
- `docs/cockpit.md` gets a Beta callout, per-agent support/auth table, and updated doctor sample
PR Type
Checklist
docs/cockpit.md)
AI Usage
AI Model/Tool used: Claude Opus 4.7 via Claude Code
Any Additional AI Details you'd like to share:
The branch was developed iteratively across many conversations. Architecture decisions (ACP as substrate B, per-session toggle, supervisor ownership of agent processes, drain-task pattern, assistant-ui for the React surface) were human-directed; the AI handled implementation, testing, and UX iteration. Notable AI-caught issues that humans would have caught later: the deadlock in `Supervisor`'s drain task, the focus race against wterm's async init, the missing title fallback for tools with empty `raw_input`. Notable AI-missed issues that humans caught: the mode picker initially said "Default" when the agent was actually in yolo mode (we weren't reading agent-advertised modes), the bash card showed `$ {}` for the same reason, and the early version reinvented `Unstable_TriggerPopover` + `MarkdownTextPrimitive` instead of using the assistant-ui primitives that were already in our deps.
How to build & run
How tested
Plus a Playwright e2e harness against a live `aoe serve` with a Node ACP test shim:
- `@` file picker + `/` slash command popovers
Caveats
- agentclientprotocol.com/agents.md against upstream docs) and via the supervisor/registry tests, but I didn't exercise each adapter end-to-end on real hardware in this branch; the smoke and e2e tests use the shim. First time you `aoe add . --cmd opencode --cockpit` you may hit per-agent quirks.
- `aoe cockpit doctor` only checks binary presence, not auth state. A future improvement would spawn each adapter's `initialize` and inspect the response's `auth_methods`.
- `Unstable_*` primitive from assistant-ui (`Unstable_TriggerPopover`); if upstream renames it on a minor bump we'd need a mechanical update.
Test plan
- `claude /login` then `aoe add . --cmd claude --cockpit`: verify cockpit conversation works end-to-end with a real Claude subscription
- `aoe add . --cmd opencode --cockpit`: verify the registry expansion picks up `opencode acp` correctly
- `aoe cockpit doctor --fix`: verify it installs the missing adapters
🤖 Generated with Claude Code