feat(plugin-pty): PTY_SERVICE — real interactive eliza-code CLI (on cerebras) in the cockpit terminal by NubsCarson · Pull Request #10668 · elizaOS/eliza

NubsCarson · 2026-07-01T04:29:39Z

What & why

The elizaOS app already ships the entire web terminal — the xterm UI
(PtyTerminalPane), the typed client (spawnShellSession /
subscribePtyOutput / sendPtyInput / resizePty), and the agent-server
WebSocket handlers (pty-subscribe / pty-input / pty-output /
pty-resize). Those handlers resolve
runtime.getService("PTY_SERVICE")?.consoleBridge — which was always null,
because nothing registered a PTY_SERVICE. So the terminal was inert.

This PR adds that missing keystone and wires it into the coding cockpit: a real
interactive eliza-code CLI on Eliza Cloud/cerebras, driveable from the web
terminal on any device — all slash commands, an agent we own, zero TOS
exposure (running a real CLI on a subscription is inherently the TOS-unsafe
tier; eliza-code-on-cerebras avoids that entirely).

Changes

New plugin @elizaos/plugin-pty (opt-in; no autoEnable, so dormant
unless a character lists it; disabled on store builds):

PtyService (serviceType = "PTY_SERVICE") exposing the consoleBridge the
agent server drives + session lifecycle.
Routes: POST / GET /api/pty/sessions, DELETE /api/pty/sessions/:id
(authenticated).
buildElizaCodeCerebrasSpec: launches the interactive eliza-code CLI
(packages/examples/code) pointed at Eliza Cloud's OpenAI-compatible endpoint
→ inference on cerebras (gpt-oss-120b fast / zai-glm-4.7 smart).

Runtime-aware PTY engine. @lydell/node-pty's write path is broken under
Bun (this._socket.write is not a function — output streams, keystrokes throw),
and the agent runs under Bun in dev. So defaultSpawnResolver uses Bun's native
truePty (Bun.spawn({ terminal }), the same engine the Electrobun host uses)
under Bun and @lydell/node-pty under Node — both behind one PtyHandle.
(Gotcha handled: Bun's terminal exit callback reports the PTY-teardown status,
always 1; the real exit code is taken from proc.exited.)

Wiring.

client: spawnPtySession() / stopPtySession() (mirror spawnShellSession).
agent: register @elizaos/plugin-pty as a deferred, opt-in optional plugin.
cockpit: CockpitInteractiveTerminal spawns via spawnPtySession and mounts
the existing PtyTerminalPane; CockpitRoute gets a Fast/Smart launch control
that opens it as a full-panel overlay (presentational CockpitView untouched).

Testing / evidence

41 unit tests — plugin-pty (36): bridge routing, output streaming,
session lifecycle, cwd confinement, session cap, the cerebras spec builder, the
bin resolver, and every route gate/error, all against an injected fake PTY;
cockpit (5): spawn→attach at tier, cwd passthrough, error+retry,
unmount-kills-session, close→stop, via the real component path (mock only at the
client boundary + xterm pane). Full plugin-task-coordinator suite green.
Real-runtime proof — the actual engine (Bun truePty and node-pty) was
exercised end to end: spawn a real process, stream its output through the
bridge, round-trip a keystroke, exit code 0, clean session teardown.
Typecheck + build clean across plugin-pty, agent, ui,
plugin-task-coordinator (xterm stays lazy code-split).
N/A here, handoff: the full "real /help on cerebras, on a phone" device
capture requires a built eliza-code bundle + a live Eliza Cloud key on the
device; captured as the device handoff rather than in CI.

Notes for reviewers

services/pty-contract.ts intentionally re-declares ConsoleBridge /
PTYService instead of importing from @elizaos/agent (which depends on the
plugin set — importing it would create a cycle); the runtime binds them
structurally at the getService cast. A comment flags the sync requirement.
The Claude/Codex interactive CLIs can reuse this same PTY_SERVICE later —
deferred as the explicitly-gated experimental tier.

…the web terminal The app already ships the whole web terminal — the xterm UI (PtyTerminalPane), the typed client (spawnShellSession/subscribePtyOutput/sendPtyInput/resizePty), and the agent-server WS handlers (pty-subscribe/pty-input/pty-output/pty-resize). Those handlers resolve `runtime.getService("PTY_SERVICE")?.consoleBridge`, which was always null because nothing registered a PTY_SERVICE. This adds that missing keystone. New opt-in plugin @elizaos/plugin-pty: - PtyService (serviceType "PTY_SERVICE") exposing the consoleBridge the agent server drives, plus session lifecycle (start/stop/list). - Routes: POST/GET /api/pty/sessions, DELETE /api/pty/sessions/:id (authenticated). - buildElizaCodeCerebrasSpec: launches the interactive eliza-code CLI (a real slash-command TUI we own) pointed at Eliza Cloud's OpenAI-compatible endpoint so inference runs on cerebras (gpt-oss-120b fast / zai-glm-4.7 smart) — a real CLI with all slash commands, zero TOS exposure. Runtime-aware PTY engine: node-pty's write path is broken under Bun (this._socket.write is not a function), and the agent runs under Bun in dev, so defaultSpawnResolver uses Bun's native truePty (Bun.spawn({terminal})) under Bun and @lydell/node-pty under Node — both behind one PtyHandle. (The Bun terminal `exit` callback reports the PTY-teardown status, always 1; the real exit code is taken from proc.exited.) Wiring: - client: spawnPtySession() -> POST /api/pty/sessions (mirrors spawnShellSession), then reuse the existing subscribe/input/resize path. - agent: register @elizaos/plugin-pty as a deferred, opt-in optional plugin (no autoEnable -> dormant unless a character lists it; disabled on store builds). Tests: 36 unit tests (bridge routing, streaming, lifecycle, cwd confinement, session cap, spec builder, bin resolver, route gates/errors) against an injected fake PTY, plus a gated pty-real.e2e.test.ts. Real-runtime path verified under Bun (truePty) and Node (node-pty): real process output, keystroke round-trip, exit code 0, clean session teardown.

Wire the cockpit's "tap-in, drive it directly" pillar to the new PTY_SERVICE: launch a REAL interactive eliza-code CLI on Eliza Cloud/cerebras from the cockpit and drive it in the live xterm pane — all slash commands, an agent we own, zero TOS exposure. - CockpitInteractiveTerminal: spawns via client.spawnPtySession({kind:"eliza-code", tier}), mounts the existing self-contained PtyTerminalPane on the returned sessionId, surfaces spawn errors with a retry, and kills the session on unmount/close so the REPL never orphans. - CockpitRoute: a Fast/Smart launch control over the deck opens the terminal as a full-panel overlay (leaves the presentational CockpitView untouched). - client: stopPtySession() (DELETE /api/pty/sessions/:id) alongside spawnPtySession. Tests: 5 new component tests (spawn→attach at tier, cwd passthrough, error+retry, unmount-kills-session, close→stop+onClose) via the real component path, mocking only the client boundary + xterm pane. Full plugin suite green (192 tests); typecheck + view-bundle build clean (xterm stays lazy-split). Device/pixel verification (real /help on cerebras on-device) is the handoff.

greptile-apps

Your trial has ended. Reactivate Greptile to resume code reviews.

coderabbitai · 2026-07-01T04:29:48Z

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: b18348e1-7ff8-4c26-af61-ed07a7d1423c

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

🔍 Trigger review

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feat/plugin-pty-interactive-terminal

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands.}

greptile-apps

Your trial has ended. Reactivate Greptile to resume code reviews.

greptile-apps

Your trial has ended. Reactivate Greptile to resume code reviews.

NubsCarson · 2026-07-01T04:40:16Z

Validation/update for #10668 after reviewing the PTY lane.

I found and pushed a cleanup follow-up, then refreshed the branch on current develop:

07216ee743 fix(plugin-pty): stabilize PTY package validation
- adds an ambient declaration for optional native @lydell/node-pty so plugins/plugin-pty typecheck does not require the native package to be installed locally
- replaces PtyDisposable | void with PtyDisposable | undefined to satisfy Biome's no-confusing-void rule
- formats/organizes the new plugin-pty files and the cockpit terminal component
- makes CockpitInteractiveTerminal declare the small PTY client surface it consumes so plugin-task-coordinator typecheck is stable even when @elizaos/ui/dist is stale before a UI build
c36f0d57f8 merges current origin/develop (0a3fd649de) into the PR branch; conflict-free.

Security/runtime notes from review:

/api/pty/* routes are private by default (public is not set), and the Hono route dispatcher returns 401 unless normal API/local auth passes.
spawn is disabled for store builds and when PTY_INTERACTIVE_ENABLED=false|0.
cwd is confined in PtySessionStore against PTY_ALLOWED_DIRECTORY/process cwd before spawning.
concurrent sessions have a hard cap and unmount/stop paths kill sessions.

Focused validation on head c36f0d57f8:

bun run --cwd plugins/plugin-pty typecheck ✅
bun run --cwd plugins/plugin-pty test ✅ 4 files / 36 tests
bun run --cwd plugins/plugin-pty build ✅
bun run --cwd plugins/plugin-task-coordinator typecheck ✅
bun run --cwd plugins/plugin-task-coordinator test src/CockpitInteractiveTerminal.test.tsx ✅ 1 file / 5 tests
bunx @biomejs/biome check plugins/plugin-pty plugins/plugin-task-coordinator/src/CockpitInteractiveTerminal.tsx plugins/plugin-task-coordinator/src/CockpitInteractiveTerminal.test.tsx plugins/plugin-task-coordinator/src/CockpitRoute.tsx packages/ui/src/api/client-agent.ts packages/agent/src/runtime/eliza.ts ✅
git diff --check origin/develop...HEAD ✅

GitHub checks are still queued/UNSTABLE on the refreshed head; no failed check was visible when I posted this.

lalalune · 2026-07-01T04:41:22Z

Review status for current head de60cf7af40a24b3f768c2afeea2e463d13a277c: not merge-ready yet.

Blocking findings:

plugins/plugin-pty/routes/pty-routes.ts:76 starts a real interactive process for any caller that passes the generic private-route auth check. The dispatcher only enforces route.public !== true && isAuthorized() (packages/agent/src/api/dispatch-route.ts:365), and this handler has no OWNER/ADMIN/developer-role gate before svc.startSession(spec) at plugins/plugin-pty/routes/pty-routes.ts:122. For a route that spawns a PTY and grants file/tool access under a server-side cwd, generic authenticated access is too broad. Please add a sensitive-route/role gate (OWNER at minimum, or equivalent cockpit developer gate) and a regression that a normal USER/authenticated non-owner cannot spawn/list/stop PTY sessions.
plugins/plugin-pty/services/pty-session-store.ts:169 passes { ...process.env, ...spec.env } into every spawned CLI. That gives eliza-code every secret available to the agent process, not just the Eliza Cloud key and terminal/cwd settings it needs. Since this is an interactive coding/tool process, please switch to an explicit allowlist/minimal env and cover it with a test that unrelated secrets from process.env are not inherited.
Required evidence is missing for the feature being claimed. This PR adds a new process-spawning backend route plus cockpit UI, but the changed file list has no .github/issue-evidence/... artifact, and the PR body explicitly says the real cerebras /help + phone/app-terminal proof is a handoff. For this repo's DoD, a non-draft feature like this needs real backend/frontend logs, screenshots/video for the cockpit UI, and a live run showing the claimed interactive CLI path, or a clear maintainer-approved reason to keep the PR draft until those are captured.
Biome currently fails on the changed files. Examples: plugins/plugin-pty/services/pty-types.ts:21/:24 (void in union), packages/agent/src/external-modules.d.ts:882 redeclares EmbeddedAppViewer, plus formatting/import-order failures in multiple plugins/plugin-pty/* files and CockpitInteractiveTerminal.tsx.

Validation I ran:

bun install --frozen-lockfile --ignore-scripts fails under Bun 1.4.0-canary.1 with lockfile had changes, but lockfile is frozen. Running unfrozen install left no tracked diff, so this may be a Bun/toolchain quirk, but frozen install is still red in this worktree.
bun run build:core passed: 65/65 Turbo build tasks successful.
bun run --cwd plugins/plugin-pty test passed after build:core: 4 files, 36 tests.
bun run --cwd plugins/plugin-task-coordinator test CockpitInteractiveTerminal.test.tsx passed: 1 file, 5 tests.
git diff --check origin/develop...HEAD passed.
bunx @biomejs/biome check ...changed focused files... failed as described above.

lalalune · 2026-07-01T04:42:38Z

Follow-up after rechecking head c36f0d57f8fcc5b622dd594ac3b46d4bf8a0e872.

The validation cleanup commit did fix the local Biome/typecheck class of issues I called out earlier:

bun run --cwd plugins/plugin-pty test ✅ 4 files / 36 tests
bun run --cwd plugins/plugin-task-coordinator test src/CockpitInteractiveTerminal.test.tsx ✅ 1 file / 5 tests
focused bunx @biomejs/biome check ... ✅
git diff --check origin/develop...HEAD ✅

Remaining blockers are still the substantive ones:

The spawn route is still only protected by generic private-route auth; plugins/plugin-pty/routes/pty-routes.ts:79 has no OWNER/ADMIN/developer-role gate before svc.startSession(spec) at line 127. Generic authenticated API access is still too broad for a real PTY/process-spawn capability.
plugins/plugin-pty/services/pty-session-store.ts:175 still passes full process.env into the child process. Please move this to an explicit minimal env/allowlist and add a regression that unrelated process secrets are not inherited.
The PR still has no PTY/cockpit evidence artifact under .github/issue-evidence/...; the real cerebras /help + UI/log/screenshot/video proof is still a handoff. That keeps this non-draft feature below the repo DoD for a process-spawning backend route plus cockpit UI.

One validation note: bun install --frozen-lockfile --ignore-scripts still fails for me under Bun 1.4.0-canary.1 with lockfile had changes, but lockfile is frozen, even though an unfrozen install leaves no tracked diff. I would not treat that as the primary blocker without CI corroboration, but it is still part of the local validation record.

greptile-apps

Your trial has ended. Reactivate Greptile to resume code reviews.

greptile-apps

Your trial has ended. Reactivate Greptile to resume code reviews.

NubsCarson · 2026-07-01T05:05:44Z

Update on #10668 after Shaw/lalalune review blockers.

Pushed head: 79307e8196 (current with origin/develop, 0 behind / 7 ahead).

What changed:

Added a PTY route step-up gate: generic private-route auth is no longer enough for HTTP callers. HTTP spawn/list/buffer/stop now require ELIZA_TERMINAL_RUN_TOKEN via X-Eliza-Terminal-Token, terminalToken, or query token. Local in-process compatibility remains only when no terminal token is configured.
Added regressions that HTTP callers without/with invalid terminal token cannot spawn, and cannot list/stop an existing session.
Replaced child env inheritance from { ...process.env, ...spec.env } with an explicit minimal allowlist plus eliza-code spec env. Added regression proving unrelated server secrets like AWS_SECRET_ACCESS_KEY, DATABASE_URL, and ELIZA_API_TOKEN are not inherited.
Updated PTY docs/type comments to document the terminal-token gate and minimal child env behavior.

Local validation after merging latest origin/develop:

bun run --cwd plugins/plugin-pty test ✅ 4 files / 45 tests
bun run --cwd plugins/plugin-pty typecheck ✅
bun run --cwd plugins/plugin-pty lint ✅
bun run --cwd plugins/plugin-pty build ✅
bun run --cwd plugins/plugin-task-coordinator test src/CockpitInteractiveTerminal.test.tsx ✅ 1 file / 5 tests
git diff --check origin/develop...HEAD && git diff --check ✅

I also tried the gated real PTY e2e explicitly with a temporary Vitest config. The file was discovered, but it skipped because native @lydell/node-pty is not available in this checkout, so this still needs live/native/device evidence before I would call it merge-ready.

NubsCarson · 2026-07-01T05:07:09Z

Follow-up: Shaw merged #10658 into develop after my previous update, so I fetched again, merged current origin/develop (24a3b6b97d) into #10668, reran the focused validation, and pushed refreshed head dfb52e2e22.

Validation after this second develop refresh:

bun run --cwd plugins/plugin-pty test ✅ 4 files / 45 tests
bun run --cwd plugins/plugin-pty typecheck ✅
bun run --cwd plugins/plugin-pty lint ✅
bun run --cwd plugins/plugin-pty build ✅
bun run --cwd plugins/plugin-task-coordinator test src/CockpitInteractiveTerminal.test.tsx ✅ 1 file / 5 tests
git diff --check origin/develop...HEAD && git diff --check ✅

PR remains draft pending live/native/device evidence for the real PTY path.

NubsCarson · 2026-07-01T05:15:20Z

Follow-up after another develop advance: merged current origin/develop (76d21eba14) into #10668 and pushed refreshed head 7b00b2219d.

Validation after this latest refresh:

bun run --cwd plugins/plugin-pty test ✅ 4 files / 45 tests
bun run --cwd plugins/plugin-pty typecheck ✅
bun run --cwd plugins/plugin-pty lint ✅
bun run --cwd plugins/plugin-pty build ✅
bun run --cwd plugins/plugin-task-coordinator test src/CockpitInteractiveTerminal.test.tsx ✅ 1 file / 5 tests
git diff --check origin/develop...HEAD && git diff --check ✅

Additional real PTY evidence from this machine, using the actual PtySessionStore + defaultSpawnResolver under Bun, so it exercises bunTruePtySpawn / Bun.spawn({ terminal }) rather than the fake PTY:

Real process spawn/output/exit through the bridge:

bun --eval '... store.start({ command: "sh", args: ["-c", "printf BUNPTYHELLO"], ... }) ...'

Output:

{"ok":true,"engine":"bunTruePty","output":"BUNPTYHELLO","exitCode":0,"sessionId":"f70837ea-0e54-4297-9874-30aa5ef02b0c"}

Real keystroke round-trip through bridge.writeRaw() into cat:

bun --eval '... store.start({ command: "cat", args: [], ... }); bridge.writeRaw(sessionId, "ROUNDTRIP42\r") ...'

Output:

{"ok":true,"engine":"bunTruePty","saw":"ROUNDTRIP42","output":"ROUNDTRIP42\\r\\nROUNDTRIP42\\r\\n","sessionId":"58072cb1-9712-416a-bd65-195afa9dfe63"}

This now gives local real Bun-native PTY evidence for spawn/output/exit and input round-trip. I am keeping the PR draft for now because the remaining highest-confidence proof is still frontend/device/on-phone evidence for the full cockpit/eliza-code path.

NubsCarson · 2026-07-01T05:17:15Z

Additional frontend/plugin evidence for the cockpit terminal side:

bun run --cwd plugins/plugin-task-coordinator test ✅ 27 files / 206 tests

This includes the cockpit terminal panel tests that drive the pretty/CLI toggle, xterm pane mount, PTY websocket output into the rendered <pre>, typed input forwarding through sendPtyInput, and the interactive-terminal spawn/attach/cleanup path. Combined with the earlier Bun-native PTY smoke, the PR now has backend real-PTY evidence plus broader React/plugin wiring coverage.

Current head remains 7b00b2219d, current with origin/develop; GitHub check rollup currently has no failed checks, queued jobs only. PR remains draft pending full app/device/on-phone cockpit proof.

lalalune · 2026-07-01T05:40:12Z

Deep review — 4-agent sweep (security · correctness/leak · intent/2nd-order · architecture)

Reviewed at tip 7b00b2219d3. Verdict: legitimate, clean, well-structured feature with the right diagnosis (the whole web-terminal stack already existed and was inert because nothing registered PTY_SERVICE — this is the missing keystone). But NOT merge-ready: one broken end-to-end path + two HIGH-severity risks + design decisions to make. Recommend keeping it draft until the blockers below are cleared. Strong points confirmed: proper opt-in deferred/required:false/no-autoEnable plugin, store-build-disabled, env is a strict allowlist (EVM_PRIVATE_KEY/ELIZA_API_TOKEN/AGENT_SERVER_SHARED_SECRET are not inherited — good), timingSafeEqual token compare, cwd confinement, 41 real unit tests. Not a backdoor.

🔴 Blockers

Broken end-to-end — the cockpit can't spawn a session. ptyAccessRejection (routes/pty-routes.ts:81) requires X-Eliza-Terminal-Token for non-inProcess callers once ELIZA_TERMINAL_RUN_TOKEN is set, but the cockpit client spawnPtySession/stopPtySession (client-agent.ts) never sends it (and HTTP is always inProcess:false). Net: token set → 401; token unset → 403. As wired, the browser cockpit cannot start a terminal at all — consistent with the PR's own "device capture N/A / handoff" note. Fix: decide the auth model — either thread the token through the client, or explicitly allow the trusted-local/loopback cockpit origin to spawn (OWNER-gated), and add a test that the cockpit path actually spawns.
[HIGH] Recursive agent spawn / cost blow-up. The PTY launches eliza-code --interactive with the orchestrator loaded (no includeOrchestrator:false/codingOnly — lib/eliza-code-spec.ts). The recursion guard at packages/examples/code/src/lib/agent.ts:71 applies only to the ACP sub-agent variant, which this doesn't use. A user typing /task spawn N agents…, or the parent agent driving the terminal, can spawn orchestrator sub-agents recursively — unbounded tree, every call billing the same cerebras/Eliza-Cloud key, no depth/concurrency/spend cap (the 24-session PTY cap doesn't help — it's one PTY spawning many agents inside it). Fix: launch the cockpit terminal in coding-only / includeOrchestrator:false mode, or enforce explicit depth + concurrency + spend caps.
[HIGH] Session leak on disconnect. WS close/error (packages/agent/src/api/server.ts:5087,5102) only unsub() the listener — they never stopSession. eliza-code is a REPL that never self-exits, so a mobile-tab kill / browser crash / network drop / sleep (anything that skips the React unmount at CockpitInteractiveTerminal.tsx:82) orphans the child forever. After 24 orphans (DEFAULT_MAX_SESSIONS) the service refuses all new sessions. No idle timeout/GC for live sessions. Fix: track session ownership per WS connection and stopSession owned sessions on close/error; add a last-activity idle-timeout sweep in the store.

🟠 Should-fix before merge

[MED] Unvalidated baseUrl egress. spawnHandler passes body.baseUrl straight into the child as OPENAI_BASE_URL (routes/pty-routes.ts). A caller who clears the spawn gate can point eliza-code (+ the resolved API key) at an attacker endpoint → key exfil / SSRF-flavored egress; the "cerebras / zero-TOS" framing only holds for the default. Fix: allowlist baseUrl to Eliza Cloud (+ explicit operator allowlist).
[MED] Injected key is the agent's primary OPENAI_API_KEY. resolveCloudApiKey() falls back to the agent's own OPENAI_API_KEY; a terminal user can env/echo $OPENAI_API_KEY to read the agent's real production key. Fix: require a dedicated PTY_ELIZA_CLOUD_API_KEY (or a per-session minted, short-lived token) — never the primary key — so an env read leaks only a scoped, revocable credential.
[MED] The X-Eliza-Terminal-Token gate protects only spawn, not keystrokes. WS pty-input/pty-output (server.ts:5006-5028) are gated by the generic loopback/API-token model, not the step-up token. On loopback (desktop) any same-origin tab that opens the WS can drive a live session. Fix: apply the same step-up trust to the WS input path as to spawn.
[Brand — blocking under repo UI rules] Green accent. The launch button in plugin-task-coordinator/src/CockpitRoute.tsx uses background: "var(--accent, #5a9a2a)" (green) + raw #fff inline styles. Repo rule is orange-accent-only, no green/blue, no orange→black hover. Use the Button component + accent token.

🟡 Architecture / duplication (design call)

Third PTY engine. services/bun-pty-spawn.ts reimplements the Bun Bun.spawn({terminal}) truePty engine that already exists in packages/app-core/platforms/electrobun/remotes/pty/src/bun/pty-service.ts (same data/exit callbacks, same TextDecoder streaming, same "exit reports teardown status → take code from proc.exited" gotcha). Prefer extracting a shared session store both consume, per the consolidation mandate.
Second cockpit terminal. CockpitInteractiveTerminal.tsx mounts PtyTerminalPane — the same pane already in CockpitTerminalPanel.tsx (which was authored anticipating this PTY_SERVICE build). Prefer teaching CockpitTerminalPanel/CockpitSessionPane to spawn an eliza-code session rather than a parallel overlay.
Delete the dead PTY_SERVICE.coordinator fallback in coordinator-wiring.ts:43-46 and server-helpers-swarm.ts:261-267 — it's proven-dead scaffolding (get-acp-service-order.test.ts:37 pins "ignores removed PTY_SERVICE registrations"). Deleting it in this PR gives PTY_SERVICE exactly one meaning (the console bridge) instead of a semantic overload with first-wins-registration ambiguity.

🧪 Test gap

The "real e2e" (test/pty-real.e2e.test.ts:20) gates on @lydell/node-pty, an optional dep not built in CI → it skips; and under Bun the code runs bunTruePtySpawn, not node-pty. So the actual production engine has zero real-process coverage despite the PR's "real-runtime proof" claim. Add a real e2e that drives bunTruePtySpawn under isBunRuntime() (spawn sh -c, assert streamed output + keystroke round-trip + exit 0), plus regressions for: env-secret isolation (assert EVM_PRIVATE_KEY/ELIZA_API_TOKEN never in the child env), non-owner/no-token rejection, and disconnect-reaping.

Is it fixing a surface or the deeper problem?

Registering PTY_SERVICE is the correct root-cause fix. But "raw PTY streaming a full agent-with-shell-and-orchestrator over a WS" is the maximally-powerful, minimally-structured answer to the deeper need ("drive a coding agent from the cockpit, any device"). The cockpit already has the structured path (createOrchestratorTask + CockpitSessionPane, transcript + typed controls) that's introspectable/cost-attributable. Worth deciding: ship the raw PTY (blockers 1–3 fixed, coding-only, desktop/single-user only, never multi-tenant cloud / default-off on phone), or serve ~80% of the need via the existing structured session path + a /help command palette.

Reviewed by an 8-agent sweep (this PR + the launcher #10669). Happy to open a follow-up PR with the concrete safety patches (blockers 2/3, items 4/7/10) once the auth model (blocker 1) is decided — that decision is yours/the author's, and the fixes need a real device+cluster run to verify, which is why I'm not force-landing them.

NubsCarson · 2026-07-01T06:14:08Z

Pushed follow-up head 471768731b for the deep-review blockers on #10668.

What changed:

Cockpit/local browser path: route dispatch now carries a trusted-loopback bit into plugin route handlers, and PTY routes allow trusted local cockpit calls without exposing ELIZA_TERMINAL_RUN_TOKEN to the browser. Remote/non-local HTTP still needs the terminal step-up token.
Recursive-agent risk: eliza-code PTY sessions now launch in coding-only mode via --coding-only + ELIZA_CODE_CODING_ONLY=1, and the example-code interactive path disables orchestrator loading in that mode.
Session leaks / input ownership: spawned sessions capture x-elizaos-client-id; websocket subscribe/input/resize reject mismatched owners; websocket close/error stops sessions owned by that client; the PTY store also has an idle reaper.
Egress/key hardening: baseUrl is constrained to Eliza Cloud plus explicit PTY_ALLOWED_BASE_URLS / PTY_ELIZA_CLOUD_BASE_URL_ALLOWLIST; PTY no longer falls back to the agent's primary OPENAI_API_KEY, only explicit body apiKey or PTY_ELIZA_CLOUD_API_KEY.
Cleanup: removed the dead PTY_SERVICE.coordinator fallback, switched the cockpit terminal buttons to the shared Button, and moved the real PTY test into the repo's *.real.test.ts lane.

Validation on this worktree after merging current origin/develop:

bun run --cwd plugins/plugin-pty test ✅ 4 files / 49 tests
bunx --bun vitest run --config packages/test/vitest/real.config.ts plugins/plugin-pty/test/pty.real.test.ts ✅ 1 file / 2 tests, exercises Bun-native real PTY
bun run --cwd plugins/plugin-pty typecheck ✅
bun run --cwd plugins/plugin-pty lint ✅
bun run --cwd plugins/plugin-pty build ✅
bun run --cwd plugins/plugin-task-coordinator typecheck ✅
bun run --cwd plugins/plugin-task-coordinator test ✅ 27 files / 206 tests
bun run --cwd plugins/plugin-task-coordinator build ✅
bun run --cwd packages/agent typecheck ✅
bun run --cwd packages/core typecheck ✅
bun run --cwd packages/core build ✅
bun run --cwd packages/examples/code typecheck ✅
focused agent API tests around runtime plugin routes / PTY service plumbing / terminal auth / parse-action-block ✅ 5 files / 40 tests
changed-file Biome sweep ✅
git diff --check ✅

Notes:

Running the same real test with the default Node-hosted Vitest runner skips because optional native @lydell/node-pty is unavailable in this checkout; under Bun it passes and uses Bun.spawn({ terminal }).
I kept the PR draft. The remaining non-code proof before merge is still full app/device cockpit evidence for the end-to-end browser → PTY → eliza-code path, plus the larger architecture follow-ups around shared PTY engine / duplicate terminal surface if maintainers want those before landing.

NubsCarson · 2026-07-01T06:27:19Z

Additional browser-origin PTY smoke for current head 471768731b:\n\n- Built the interactive eliza-code CLI locally with bun run --cwd packages/examples/code build so packages/examples/code/dist/index.js existed.\n- Started the full dev stack on isolated temp ports/state/config with plugins.allow=["@elizaos/plugin-pty","@elizaos/plugin-task-coordinator"], ELIZA_TERMINAL_RUN_TOKEN configured, and PTY_ELIZA_CLOUD_API_KEY set to a dummy smoke key.\n- Waited specifically for deferred /api/pty/sessions route registration before exercising it.\n- From a Playwright page loaded at the Vite UI origin, called fetch("/api/pty/sessions", { method: "POST", headers: { "content-type": "application/json", "x-elizaos-client-id": "pty-browser-smoke-client" }, body: ... }) with no terminal token header/body.\n- Result: 200, spawned bun packages/examples/code/dist/index.js --interactive --coding-only, label eliza-code · fast · gpt-oss-120b, ownerClientId pty-browser-smoke-client.\n- Then DELETE /api/pty/sessions/:id from the same browser-origin client returned 200 { ok: true }, and GET /api/pty/sessions returned 200 { sessions: [] }.\n\nThis proves the local trusted cockpit/browser path works without exposing ELIZA_TERMINAL_RUN_TOKEN to the browser, and that the coding-only eliza-code spawn/cleanup path is live. Scope note: this was a route/browser-origin smoke, not a full visual onboarding/cloud-auth proof; the isolated no-provider UI shell stayed visually blank, so I am keeping that separate from the PTY route claim.

NubsCarson · 2026-07-01T06:47:34Z

Refreshed #10668 onto current origin/develop (10bb61bbc0) and pushed head 37481b8b2a. The merge only pulled in the #10676 benchmark/server evidence change; no PTY/cockpit conflict.

Focused validation after refresh:

bun run --cwd plugins/plugin-pty test ✅ 4 files / 49 tests
bun run --cwd plugins/plugin-pty typecheck ✅
bun run --cwd plugins/plugin-task-coordinator test src/CockpitInteractiveTerminal.test.tsx ✅ 1 file / 5 tests
git diff --check origin/develop...HEAD && git diff --check ✅

This intentionally restarts CI but removes the branch-behind-develop state.

lalalune · 2026-07-01T07:01:44Z

Env-leak HIGH hold — LIFTED ✅

Re-reviewed the current committed head (37481b8) directly (not the gh pr diff output). The prior wholesale-env concern is resolved:

plugins/plugin-pty/services/pty-session-store.ts — buildPtyEnv() (L76) constructs a fresh env object from two explicit allowlists: SAFE_INHERITED_ENV_KEYS (L35: PATH/HOME/USER/SHELL/TMPDIR/locale/TERM/color/XDG — no secrets) and ALLOWED_SPEC_ENV_KEYS (L60) for spec-supplied keys. It is fed to spawn() at L252.
No ...process.env spread anywhere in the plugin's non-test source (grepped). OPENAI_API_KEY/OPENAI_BASE_URL reach the child only via the explicit spec allowlist, sourced from the dedicated PTY_ELIZA_CLOUD_API_KEY rather than the agent's primary key — intentional and scoped, not a leak.
Surrounding controls verified: step-up token gate on /api/pty/* (403 if unconfigured, 401 on bad token), resolveAllowedBaseUrl SSRF allowlist, confineCwd root confinement, opt-in/dormant registration (no autoEnable), and a real-spawn test (test/pty.real.test.ts).

Footgun before un-drafting: gh pr diff still surfaces a superseded env: { ...process.env, ... } hunk from a pre-force-push revision. It is NOT in the committed tree, but any reviewer reading the raw diff will re-trigger the alarm. Please squash the history so that stale hunk is gone from what a reviewer sees.

Remaining before ready: (1) squash to drop the stale diff, (2) follow-up to enforce the pty-contract.ts structural re-declaration of ConsoleBridge/PTYService via a shared type or contract test (manual sync drift risk), (3) the real-device /help-on-cerebras capture the PR marks handoff/N/A, per PR_EVIDENCE. Design is sound; security blocker cleared.

NubsCarson · 2026-07-01T07:02:28Z

Refreshed #10668 onto current origin/develop (0ec487bb7f) after #10678 and #10679 landed, and pushed head e1925606d5.

Pulled in:

fix(agent): allow PATCH in CORS preflight #10678 fix(agent): allow PATCH in CORS preflight
fix(benchmarks): calibrate trust injection case #10679 fix(benchmarks): calibrate trust injection case

Focused validation after refresh:

bun run --cwd plugins/plugin-pty test ✅ 4 files / 49 tests
bun run --cwd plugins/plugin-pty typecheck ✅
bun run --cwd plugins/plugin-task-coordinator test src/CockpitInteractiveTerminal.test.tsx ✅ 1 file / 5 tests
bun test packages/agent/test/api/server-helpers-auth.test.ts ✅ 21 tests
git diff --check origin/develop...HEAD && git diff --check ✅

Current GitHub state after push: mergeable against develop; CI restarted and is queued/pending.

lalalune · 2026-07-01T07:24:51Z

Re-verified at current head `e1925606` — merge-ready pending CI

Re-ran the security/blast-radius review at the current head (past the earlier 37481b8 review):

Safety properties — all hold:

No env leak — grepped every non-test plugin source: zero ...process.env / env: process.env. buildPtyEnv uses the two allowlists (SAFE_INHERITED_ENV_KEYS / ALLOWED_SPEC_ENV_KEYS).
Opt-in / dormant — eliza.ts:483: registered as a deferred optional plugin, "dormant unless a character lists @elizaos/plugin-pty (no autoEnable)". Merging cannot activate a terminal by default.
Routes auth-gated — pty-routes.ts: 403 when no ELIZA_TERMINAL_RUN_TOKEN configured (disabled by default), 401 on missing/invalid token, plus x-elizaos-client-id ownership.
Core change additive — plugin.ts only adds optional isTrustedLocal?: boolean to RouteHandlerContext. Non-breaking.

Always-loaded wiring (the real blast radius) — reviewed, sound:

hono-mount auth refactor: app is now cached without baking in a per-request isAuthorized closure; auth + trusted-local are conveyed per-request via internal headers that the mount layer overwrites server-side (headers.set(...)), so a client-supplied x-eliza-internal-authorized: 1 cannot spoof auth. Functionally equivalent, and fixes the latent stale-closure caching. Applies to all routes.
Coordinator discovery: removing the PTY_SERVICE.coordinator fallback is dead-code removal — PTY_SERVICE isn't registered in the normal runtime pre-plugin, the live orchestrator fixtures access service.coordinator directly (not via agent-server discovery), and the orchestrator unit test already asserts "ignores removed PTY_SERVICE registrations". No regression.
WS PTY-session ownership helpers (server.ts): guarded — no-op when getPtyService returns null (PTY absent).

Holding the merge only until the typecheck/unit/integration lanes conclude (the auth refactor touches every route, so it gets a real CI signal before landing). The squash-merge will also collapse the stale intermediate ...process.env diff hunk out of history.

NubsCarson · 2026-07-01T07:25:23Z

Refreshed #10668 again onto current origin/develop (7d5f944c2a) after the latest merged batch (#10673, #10681, #10677, #10675, #10674, #10683), and pushed head 1336be3f4f.

Focused validation after this larger refresh:

bun run --cwd plugins/plugin-pty test ✅ 4 files / 49 tests
bun run --cwd plugins/plugin-pty typecheck ✅
bun run --cwd plugins/plugin-task-coordinator test src/CockpitInteractiveTerminal.test.tsx ✅ 1 file / 5 tests
bun test packages/agent/test/api/server-helpers-auth.test.ts ✅ 21 tests
git diff --check origin/develop...HEAD && git diff --check ✅

No PTY/cockpit conflicts from the merged launcher/icon/cloud-app/app-control changes. CI restarted on the new head.

NubsCarson added 2 commits July 1, 2026 04:26

greptile-apps Bot reviewed Jul 1, 2026

View reviewed changes

fix(plugin-pty): stabilize PTY package validation

07216ee

greptile-apps Bot reviewed Jul 1, 2026

View reviewed changes

Merge remote-tracking branch 'origin/develop' into HEAD

c36f0d5

greptile-apps Bot reviewed Jul 1, 2026

View reviewed changes

NubsCarson mentioned this pull request Jul 1, 2026

📋 Open-issues kanban — agent coordination board (all open work, verified 2026-06-30) #10561

Open

15 tasks

fix(pty): buffer interactive session startup output

6ba58ab

greptile-apps Bot reviewed Jul 1, 2026

View reviewed changes

NubsCarson added 2 commits July 1, 2026 05:03

fix(plugin-pty): gate PTY routes and sanitize child env

61d4f1e

Merge remote-tracking branch 'origin/develop' into HEAD

79307e8

greptile-apps Bot reviewed Jul 1, 2026

View reviewed changes

NubsCarson marked this pull request as draft July 1, 2026 05:05

Merge remote-tracking branch 'origin/develop' into HEAD

dfb52e2

NubsCarson added 2 commits July 1, 2026 05:12

Merge remote-tracking branch 'origin/develop' into HEAD

5a1379d

Merge remote-tracking branch 'origin/develop' into HEAD

7b00b22

NubsCarson added 2 commits July 1, 2026 05:58

Merge remote-tracking branch 'origin/develop' into HEAD

c26ff6b

fix(plugin-pty): harden cockpit terminal sessions

4717687

Merge remote-tracking branch 'origin/develop' into HEAD

37481b8

Merge remote-tracking branch 'origin/develop' into HEAD

e192560

Merge remote-tracking branch 'origin/develop' into HEAD

1336be3

Uh oh!

Conversation

NubsCarson commented Jul 1, 2026

What & why

Changes

Testing / evidence

Notes for reviewers

Uh oh!

greptile-apps Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot commented Jul 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review skipped

Uh oh!

greptile-apps Bot left a comment

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot left a comment

Choose a reason for hiding this comment

Uh oh!

NubsCarson commented Jul 1, 2026

Uh oh!

lalalune commented Jul 1, 2026

Uh oh!

lalalune commented Jul 1, 2026

Uh oh!

greptile-apps Bot left a comment

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot left a comment

Choose a reason for hiding this comment

Uh oh!

NubsCarson commented Jul 1, 2026

Uh oh!

NubsCarson commented Jul 1, 2026

Uh oh!

NubsCarson commented Jul 1, 2026

Uh oh!

NubsCarson commented Jul 1, 2026

Uh oh!

lalalune commented Jul 1, 2026

Deep review — 4-agent sweep (security · correctness/leak · intent/2nd-order · architecture)

🔴 Blockers

🟠 Should-fix before merge

🟡 Architecture / duplication (design call)

🧪 Test gap

Is it fixing a surface or the deeper problem?

Uh oh!

NubsCarson commented Jul 1, 2026

Uh oh!

NubsCarson commented Jul 1, 2026

Uh oh!

NubsCarson commented Jul 1, 2026

Uh oh!

lalalune commented Jul 1, 2026

Env-leak HIGH hold — LIFTED ✅

Uh oh!

NubsCarson commented Jul 1, 2026

Uh oh!

lalalune commented Jul 1, 2026

Re-verified at current head e1925606 — merge-ready pending CI

Uh oh!

NubsCarson commented Jul 1, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

coderabbitai Bot commented Jul 1, 2026 •

edited

Loading

Re-verified at current head `e1925606` — merge-ready pending CI