Skip to content

feat(plugin-pty): PTY_SERVICE — real interactive eliza-code CLI (on cerebras) in the cockpit terminal#10668

Draft
NubsCarson wants to merge 15 commits into
developfrom
feat/plugin-pty-interactive-terminal
Draft

feat(plugin-pty): PTY_SERVICE — real interactive eliza-code CLI (on cerebras) in the cockpit terminal#10668
NubsCarson wants to merge 15 commits into
developfrom
feat/plugin-pty-interactive-terminal

Conversation

@NubsCarson

Copy link
Copy Markdown
Member

What & why

The elizaOS app already ships the entire web terminal — the xterm UI
(PtyTerminalPane), the typed client (spawnShellSession /
subscribePtyOutput / sendPtyInput / resizePty), and the agent-server
WebSocket handlers (pty-subscribe / pty-input / pty-output /
pty-resize). Those handlers resolve
runtime.getService("PTY_SERVICE")?.consoleBridge — which was always null,
because nothing registered a PTY_SERVICE. So the terminal was inert.

This PR adds that missing keystone and wires it into the coding cockpit: a real
interactive eliza-code CLI on Eliza Cloud/cerebras, driveable from the web
terminal on any device — all slash commands, an agent we own, zero TOS
exposure
(running a real CLI on a subscription is inherently the TOS-unsafe
tier; eliza-code-on-cerebras avoids that entirely).

Changes

New plugin @elizaos/plugin-pty (opt-in; no autoEnable, so dormant
unless a character lists it; disabled on store builds):

  • PtyService (serviceType = "PTY_SERVICE") exposing the consoleBridge the
    agent server drives + session lifecycle.
  • Routes: POST / GET /api/pty/sessions, DELETE /api/pty/sessions/:id
    (authenticated).
  • buildElizaCodeCerebrasSpec: launches the interactive eliza-code CLI
    (packages/examples/code) pointed at Eliza Cloud's OpenAI-compatible endpoint
    → inference on cerebras (gpt-oss-120b fast / zai-glm-4.7 smart).

Runtime-aware PTY engine. @lydell/node-pty's write path is broken under
Bun (this._socket.write is not a function — output streams, keystrokes throw),
and the agent runs under Bun in dev. So defaultSpawnResolver uses Bun's native
truePty (Bun.spawn({ terminal }), the same engine the Electrobun host uses)
under Bun and @lydell/node-pty under Node — both behind one PtyHandle.
(Gotcha handled: Bun's terminal exit callback reports the PTY-teardown status,
always 1; the real exit code is taken from proc.exited.)

Wiring.

  • client: spawnPtySession() / stopPtySession() (mirror spawnShellSession).
  • agent: register @elizaos/plugin-pty as a deferred, opt-in optional plugin.
  • cockpit: CockpitInteractiveTerminal spawns via spawnPtySession and mounts
    the existing PtyTerminalPane; CockpitRoute gets a Fast/Smart launch control
    that opens it as a full-panel overlay (presentational CockpitView untouched).

Testing / evidence

  • 41 unit testsplugin-pty (36): bridge routing, output streaming,
    session lifecycle, cwd confinement, session cap, the cerebras spec builder, the
    bin resolver, and every route gate/error, all against an injected fake PTY;
    cockpit (5): spawn→attach at tier, cwd passthrough, error+retry,
    unmount-kills-session, close→stop, via the real component path (mock only at the
    client boundary + xterm pane). Full plugin-task-coordinator suite green.
  • Real-runtime proof — the actual engine (Bun truePty and node-pty) was
    exercised end to end: spawn a real process, stream its output through the
    bridge, round-trip a keystroke, exit code 0, clean session teardown.
  • Typecheck + build clean across plugin-pty, agent, ui,
    plugin-task-coordinator (xterm stays lazy code-split).
  • N/A here, handoff: the full "real /help on cerebras, on a phone" device
    capture requires a built eliza-code bundle + a live Eliza Cloud key on the
    device; captured as the device handoff rather than in CI.

Notes for reviewers

  • services/pty-contract.ts intentionally re-declares ConsoleBridge /
    PTYService instead of importing from @elizaos/agent (which depends on the
    plugin set — importing it would create a cycle); the runtime binds them
    structurally at the getService cast. A comment flags the sync requirement.
  • The Claude/Codex interactive CLIs can reuse this same PTY_SERVICE later —
    deferred as the explicitly-gated experimental tier.

…the web terminal

The app already ships the whole web terminal — the xterm UI (PtyTerminalPane),
the typed client (spawnShellSession/subscribePtyOutput/sendPtyInput/resizePty),
and the agent-server WS handlers (pty-subscribe/pty-input/pty-output/pty-resize).
Those handlers resolve `runtime.getService("PTY_SERVICE")?.consoleBridge`, which
was always null because nothing registered a PTY_SERVICE. This adds that missing
keystone.

New opt-in plugin @elizaos/plugin-pty:
- PtyService (serviceType "PTY_SERVICE") exposing the consoleBridge the agent
  server drives, plus session lifecycle (start/stop/list).
- Routes: POST/GET /api/pty/sessions, DELETE /api/pty/sessions/:id (authenticated).
- buildElizaCodeCerebrasSpec: launches the interactive eliza-code CLI (a real
  slash-command TUI we own) pointed at Eliza Cloud's OpenAI-compatible endpoint
  so inference runs on cerebras (gpt-oss-120b fast / zai-glm-4.7 smart) — a real
  CLI with all slash commands, zero TOS exposure.

Runtime-aware PTY engine: node-pty's write path is broken under Bun
(this._socket.write is not a function), and the agent runs under Bun in dev, so
defaultSpawnResolver uses Bun's native truePty (Bun.spawn({terminal})) under Bun
and @lydell/node-pty under Node — both behind one PtyHandle. (The Bun terminal
`exit` callback reports the PTY-teardown status, always 1; the real exit code is
taken from proc.exited.)

Wiring:
- client: spawnPtySession() -> POST /api/pty/sessions (mirrors spawnShellSession),
  then reuse the existing subscribe/input/resize path.
- agent: register @elizaos/plugin-pty as a deferred, opt-in optional plugin
  (no autoEnable -> dormant unless a character lists it; disabled on store builds).

Tests: 36 unit tests (bridge routing, streaming, lifecycle, cwd confinement,
session cap, spec builder, bin resolver, route gates/errors) against an injected
fake PTY, plus a gated pty-real.e2e.test.ts. Real-runtime path verified under Bun
(truePty) and Node (node-pty): real process output, keystroke round-trip, exit
code 0, clean session teardown.
Wire the cockpit's "tap-in, drive it directly" pillar to the new PTY_SERVICE:
launch a REAL interactive eliza-code CLI on Eliza Cloud/cerebras from the cockpit
and drive it in the live xterm pane — all slash commands, an agent we own, zero
TOS exposure.

- CockpitInteractiveTerminal: spawns via client.spawnPtySession({kind:"eliza-code",
  tier}), mounts the existing self-contained PtyTerminalPane on the returned
  sessionId, surfaces spawn errors with a retry, and kills the session on
  unmount/close so the REPL never orphans.
- CockpitRoute: a Fast/Smart launch control over the deck opens the terminal as a
  full-panel overlay (leaves the presentational CockpitView untouched).
- client: stopPtySession() (DELETE /api/pty/sessions/:id) alongside spawnPtySession.

Tests: 5 new component tests (spawn→attach at tier, cwd passthrough, error+retry,
unmount-kills-session, close→stop+onClose) via the real component path, mocking
only the client boundary + xterm pane. Full plugin suite green (192 tests);
typecheck + view-bundle build clean (xterm stays lazy-split).

Device/pixel verification (real /help on cerebras on-device) is the handoff.

@greptile-apps greptile-apps Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your trial has ended. Reactivate Greptile to resume code reviews.

@coderabbitai

coderabbitai Bot commented Jul 1, 2026

Copy link
Copy Markdown
Contributor

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: b18348e1-7ff8-4c26-af61-ed07a7d1423c

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/plugin-pty-interactive-terminal

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@greptile-apps greptile-apps Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your trial has ended. Reactivate Greptile to resume code reviews.

@greptile-apps greptile-apps Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your trial has ended. Reactivate Greptile to resume code reviews.

@NubsCarson

Copy link
Copy Markdown
Member Author

Validation/update for #10668 after reviewing the PTY lane.

I found and pushed a cleanup follow-up, then refreshed the branch on current develop:

  • 07216ee743 fix(plugin-pty): stabilize PTY package validation
    • adds an ambient declaration for optional native @lydell/node-pty so plugins/plugin-pty typecheck does not require the native package to be installed locally
    • replaces PtyDisposable | void with PtyDisposable | undefined to satisfy Biome's no-confusing-void rule
    • formats/organizes the new plugin-pty files and the cockpit terminal component
    • makes CockpitInteractiveTerminal declare the small PTY client surface it consumes so plugin-task-coordinator typecheck is stable even when @elizaos/ui/dist is stale before a UI build
  • c36f0d57f8 merges current origin/develop (0a3fd649de) into the PR branch; conflict-free.

Security/runtime notes from review:

  • /api/pty/* routes are private by default (public is not set), and the Hono route dispatcher returns 401 unless normal API/local auth passes.
  • spawn is disabled for store builds and when PTY_INTERACTIVE_ENABLED=false|0.
  • cwd is confined in PtySessionStore against PTY_ALLOWED_DIRECTORY/process cwd before spawning.
  • concurrent sessions have a hard cap and unmount/stop paths kill sessions.

Focused validation on head c36f0d57f8:

  • bun run --cwd plugins/plugin-pty typecheck
  • bun run --cwd plugins/plugin-pty test ✅ 4 files / 36 tests
  • bun run --cwd plugins/plugin-pty build
  • bun run --cwd plugins/plugin-task-coordinator typecheck
  • bun run --cwd plugins/plugin-task-coordinator test src/CockpitInteractiveTerminal.test.tsx ✅ 1 file / 5 tests
  • bunx @biomejs/biome check plugins/plugin-pty plugins/plugin-task-coordinator/src/CockpitInteractiveTerminal.tsx plugins/plugin-task-coordinator/src/CockpitInteractiveTerminal.test.tsx plugins/plugin-task-coordinator/src/CockpitRoute.tsx packages/ui/src/api/client-agent.ts packages/agent/src/runtime/eliza.ts
  • git diff --check origin/develop...HEAD

GitHub checks are still queued/UNSTABLE on the refreshed head; no failed check was visible when I posted this.

@lalalune

lalalune commented Jul 1, 2026

Copy link
Copy Markdown
Member

Review status for current head de60cf7af40a24b3f768c2afeea2e463d13a277c: not merge-ready yet.

Blocking findings:

  1. plugins/plugin-pty/routes/pty-routes.ts:76 starts a real interactive process for any caller that passes the generic private-route auth check. The dispatcher only enforces route.public !== true && isAuthorized() (packages/agent/src/api/dispatch-route.ts:365), and this handler has no OWNER/ADMIN/developer-role gate before svc.startSession(spec) at plugins/plugin-pty/routes/pty-routes.ts:122. For a route that spawns a PTY and grants file/tool access under a server-side cwd, generic authenticated access is too broad. Please add a sensitive-route/role gate (OWNER at minimum, or equivalent cockpit developer gate) and a regression that a normal USER/authenticated non-owner cannot spawn/list/stop PTY sessions.

  2. plugins/plugin-pty/services/pty-session-store.ts:169 passes { ...process.env, ...spec.env } into every spawned CLI. That gives eliza-code every secret available to the agent process, not just the Eliza Cloud key and terminal/cwd settings it needs. Since this is an interactive coding/tool process, please switch to an explicit allowlist/minimal env and cover it with a test that unrelated secrets from process.env are not inherited.

  3. Required evidence is missing for the feature being claimed. This PR adds a new process-spawning backend route plus cockpit UI, but the changed file list has no .github/issue-evidence/... artifact, and the PR body explicitly says the real cerebras /help + phone/app-terminal proof is a handoff. For this repo's DoD, a non-draft feature like this needs real backend/frontend logs, screenshots/video for the cockpit UI, and a live run showing the claimed interactive CLI path, or a clear maintainer-approved reason to keep the PR draft until those are captured.

  4. Biome currently fails on the changed files. Examples: plugins/plugin-pty/services/pty-types.ts:21/:24 (void in union), packages/agent/src/external-modules.d.ts:882 redeclares EmbeddedAppViewer, plus formatting/import-order failures in multiple plugins/plugin-pty/* files and CockpitInteractiveTerminal.tsx.

Validation I ran:

  • bun install --frozen-lockfile --ignore-scripts fails under Bun 1.4.0-canary.1 with lockfile had changes, but lockfile is frozen. Running unfrozen install left no tracked diff, so this may be a Bun/toolchain quirk, but frozen install is still red in this worktree.
  • bun run build:core passed: 65/65 Turbo build tasks successful.
  • bun run --cwd plugins/plugin-pty test passed after build:core: 4 files, 36 tests.
  • bun run --cwd plugins/plugin-task-coordinator test CockpitInteractiveTerminal.test.tsx passed: 1 file, 5 tests.
  • git diff --check origin/develop...HEAD passed.
  • bunx @biomejs/biome check ...changed focused files... failed as described above.

@lalalune

lalalune commented Jul 1, 2026

Copy link
Copy Markdown
Member

Follow-up after rechecking head c36f0d57f8fcc5b622dd594ac3b46d4bf8a0e872.

The validation cleanup commit did fix the local Biome/typecheck class of issues I called out earlier:

  • bun run --cwd plugins/plugin-pty test ✅ 4 files / 36 tests
  • bun run --cwd plugins/plugin-task-coordinator test src/CockpitInteractiveTerminal.test.tsx ✅ 1 file / 5 tests
  • focused bunx @biomejs/biome check ...
  • git diff --check origin/develop...HEAD

Remaining blockers are still the substantive ones:

  1. The spawn route is still only protected by generic private-route auth; plugins/plugin-pty/routes/pty-routes.ts:79 has no OWNER/ADMIN/developer-role gate before svc.startSession(spec) at line 127. Generic authenticated API access is still too broad for a real PTY/process-spawn capability.
  2. plugins/plugin-pty/services/pty-session-store.ts:175 still passes full process.env into the child process. Please move this to an explicit minimal env/allowlist and add a regression that unrelated process secrets are not inherited.
  3. The PR still has no PTY/cockpit evidence artifact under .github/issue-evidence/...; the real cerebras /help + UI/log/screenshot/video proof is still a handoff. That keeps this non-draft feature below the repo DoD for a process-spawning backend route plus cockpit UI.

One validation note: bun install --frozen-lockfile --ignore-scripts still fails for me under Bun 1.4.0-canary.1 with lockfile had changes, but lockfile is frozen, even though an unfrozen install leaves no tracked diff. I would not treat that as the primary blocker without CI corroboration, but it is still part of the local validation record.

@greptile-apps greptile-apps Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your trial has ended. Reactivate Greptile to resume code reviews.

@greptile-apps greptile-apps Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your trial has ended. Reactivate Greptile to resume code reviews.

@NubsCarson

Copy link
Copy Markdown
Member Author

Update on #10668 after Shaw/lalalune review blockers.

Pushed head: 79307e8196 (current with origin/develop, 0 behind / 7 ahead).

What changed:

  • Added a PTY route step-up gate: generic private-route auth is no longer enough for HTTP callers. HTTP spawn/list/buffer/stop now require ELIZA_TERMINAL_RUN_TOKEN via X-Eliza-Terminal-Token, terminalToken, or query token. Local in-process compatibility remains only when no terminal token is configured.
  • Added regressions that HTTP callers without/with invalid terminal token cannot spawn, and cannot list/stop an existing session.
  • Replaced child env inheritance from { ...process.env, ...spec.env } with an explicit minimal allowlist plus eliza-code spec env. Added regression proving unrelated server secrets like AWS_SECRET_ACCESS_KEY, DATABASE_URL, and ELIZA_API_TOKEN are not inherited.
  • Updated PTY docs/type comments to document the terminal-token gate and minimal child env behavior.

Local validation after merging latest origin/develop:

  • bun run --cwd plugins/plugin-pty test ✅ 4 files / 45 tests
  • bun run --cwd plugins/plugin-pty typecheck
  • bun run --cwd plugins/plugin-pty lint
  • bun run --cwd plugins/plugin-pty build
  • bun run --cwd plugins/plugin-task-coordinator test src/CockpitInteractiveTerminal.test.tsx ✅ 1 file / 5 tests
  • git diff --check origin/develop...HEAD && git diff --check

I also tried the gated real PTY e2e explicitly with a temporary Vitest config. The file was discovered, but it skipped because native @lydell/node-pty is not available in this checkout, so this still needs live/native/device evidence before I would call it merge-ready.

@NubsCarson NubsCarson marked this pull request as draft July 1, 2026 05:05
@NubsCarson

Copy link
Copy Markdown
Member Author

Follow-up: Shaw merged #10658 into develop after my previous update, so I fetched again, merged current origin/develop (24a3b6b97d) into #10668, reran the focused validation, and pushed refreshed head dfb52e2e22.

Validation after this second develop refresh:

  • bun run --cwd plugins/plugin-pty test ✅ 4 files / 45 tests
  • bun run --cwd plugins/plugin-pty typecheck
  • bun run --cwd plugins/plugin-pty lint
  • bun run --cwd plugins/plugin-pty build
  • bun run --cwd plugins/plugin-task-coordinator test src/CockpitInteractiveTerminal.test.tsx ✅ 1 file / 5 tests
  • git diff --check origin/develop...HEAD && git diff --check

PR remains draft pending live/native/device evidence for the real PTY path.

@NubsCarson

Copy link
Copy Markdown
Member Author

Follow-up after another develop advance: merged current origin/develop (76d21eba14) into #10668 and pushed refreshed head 7b00b2219d.

Validation after this latest refresh:

  • bun run --cwd plugins/plugin-pty test ✅ 4 files / 45 tests
  • bun run --cwd plugins/plugin-pty typecheck
  • bun run --cwd plugins/plugin-pty lint
  • bun run --cwd plugins/plugin-pty build
  • bun run --cwd plugins/plugin-task-coordinator test src/CockpitInteractiveTerminal.test.tsx ✅ 1 file / 5 tests
  • git diff --check origin/develop...HEAD && git diff --check

Additional real PTY evidence from this machine, using the actual PtySessionStore + defaultSpawnResolver under Bun, so it exercises bunTruePtySpawn / Bun.spawn({ terminal }) rather than the fake PTY:

  1. Real process spawn/output/exit through the bridge:
bun --eval '... store.start({ command: "sh", args: ["-c", "printf BUNPTYHELLO"], ... }) ...'

Output:

{"ok":true,"engine":"bunTruePty","output":"BUNPTYHELLO","exitCode":0,"sessionId":"f70837ea-0e54-4297-9874-30aa5ef02b0c"}
  1. Real keystroke round-trip through bridge.writeRaw() into cat:
bun --eval '... store.start({ command: "cat", args: [], ... }); bridge.writeRaw(sessionId, "ROUNDTRIP42\r") ...'

Output:

{"ok":true,"engine":"bunTruePty","saw":"ROUNDTRIP42","output":"ROUNDTRIP42\\r\\nROUNDTRIP42\\r\\n","sessionId":"58072cb1-9712-416a-bd65-195afa9dfe63"}

This now gives local real Bun-native PTY evidence for spawn/output/exit and input round-trip. I am keeping the PR draft for now because the remaining highest-confidence proof is still frontend/device/on-phone evidence for the full cockpit/eliza-code path.

@NubsCarson

Copy link
Copy Markdown
Member Author

Additional frontend/plugin evidence for the cockpit terminal side:

  • bun run --cwd plugins/plugin-task-coordinator test ✅ 27 files / 206 tests

This includes the cockpit terminal panel tests that drive the pretty/CLI toggle, xterm pane mount, PTY websocket output into the rendered <pre>, typed input forwarding through sendPtyInput, and the interactive-terminal spawn/attach/cleanup path. Combined with the earlier Bun-native PTY smoke, the PR now has backend real-PTY evidence plus broader React/plugin wiring coverage.

Current head remains 7b00b2219d, current with origin/develop; GitHub check rollup currently has no failed checks, queued jobs only. PR remains draft pending full app/device/on-phone cockpit proof.

@lalalune

lalalune commented Jul 1, 2026

Copy link
Copy Markdown
Member

Deep review — 4-agent sweep (security · correctness/leak · intent/2nd-order · architecture)

Reviewed at tip 7b00b2219d3. Verdict: legitimate, clean, well-structured feature with the right diagnosis (the whole web-terminal stack already existed and was inert because nothing registered PTY_SERVICE — this is the missing keystone). But NOT merge-ready: one broken end-to-end path + two HIGH-severity risks + design decisions to make. Recommend keeping it draft until the blockers below are cleared. Strong points confirmed: proper opt-in deferred/required:false/no-autoEnable plugin, store-build-disabled, env is a strict allowlist (EVM_PRIVATE_KEY/ELIZA_API_TOKEN/AGENT_SERVER_SHARED_SECRET are not inherited — good), timingSafeEqual token compare, cwd confinement, 41 real unit tests. Not a backdoor.

🔴 Blockers

  1. Broken end-to-end — the cockpit can't spawn a session. ptyAccessRejection (routes/pty-routes.ts:81) requires X-Eliza-Terminal-Token for non-inProcess callers once ELIZA_TERMINAL_RUN_TOKEN is set, but the cockpit client spawnPtySession/stopPtySession (client-agent.ts) never sends it (and HTTP is always inProcess:false). Net: token set → 401; token unset → 403. As wired, the browser cockpit cannot start a terminal at all — consistent with the PR's own "device capture N/A / handoff" note. Fix: decide the auth model — either thread the token through the client, or explicitly allow the trusted-local/loopback cockpit origin to spawn (OWNER-gated), and add a test that the cockpit path actually spawns.

  2. [HIGH] Recursive agent spawn / cost blow-up. The PTY launches eliza-code --interactive with the orchestrator loaded (no includeOrchestrator:false/codingOnlylib/eliza-code-spec.ts). The recursion guard at packages/examples/code/src/lib/agent.ts:71 applies only to the ACP sub-agent variant, which this doesn't use. A user typing /task spawn N agents…, or the parent agent driving the terminal, can spawn orchestrator sub-agents recursively — unbounded tree, every call billing the same cerebras/Eliza-Cloud key, no depth/concurrency/spend cap (the 24-session PTY cap doesn't help — it's one PTY spawning many agents inside it). Fix: launch the cockpit terminal in coding-only / includeOrchestrator:false mode, or enforce explicit depth + concurrency + spend caps.

  3. [HIGH] Session leak on disconnect. WS close/error (packages/agent/src/api/server.ts:5087,5102) only unsub() the listener — they never stopSession. eliza-code is a REPL that never self-exits, so a mobile-tab kill / browser crash / network drop / sleep (anything that skips the React unmount at CockpitInteractiveTerminal.tsx:82) orphans the child forever. After 24 orphans (DEFAULT_MAX_SESSIONS) the service refuses all new sessions. No idle timeout/GC for live sessions. Fix: track session ownership per WS connection and stopSession owned sessions on close/error; add a last-activity idle-timeout sweep in the store.

🟠 Should-fix before merge

  1. [MED] Unvalidated baseUrl egress. spawnHandler passes body.baseUrl straight into the child as OPENAI_BASE_URL (routes/pty-routes.ts). A caller who clears the spawn gate can point eliza-code (+ the resolved API key) at an attacker endpoint → key exfil / SSRF-flavored egress; the "cerebras / zero-TOS" framing only holds for the default. Fix: allowlist baseUrl to Eliza Cloud (+ explicit operator allowlist).

  2. [MED] Injected key is the agent's primary OPENAI_API_KEY. resolveCloudApiKey() falls back to the agent's own OPENAI_API_KEY; a terminal user can env/echo $OPENAI_API_KEY to read the agent's real production key. Fix: require a dedicated PTY_ELIZA_CLOUD_API_KEY (or a per-session minted, short-lived token) — never the primary key — so an env read leaks only a scoped, revocable credential.

  3. [MED] The X-Eliza-Terminal-Token gate protects only spawn, not keystrokes. WS pty-input/pty-output (server.ts:5006-5028) are gated by the generic loopback/API-token model, not the step-up token. On loopback (desktop) any same-origin tab that opens the WS can drive a live session. Fix: apply the same step-up trust to the WS input path as to spawn.

  4. [Brand — blocking under repo UI rules] Green accent. The launch button in plugin-task-coordinator/src/CockpitRoute.tsx uses background: "var(--accent, #5a9a2a)" (green) + raw #fff inline styles. Repo rule is orange-accent-only, no green/blue, no orange→black hover. Use the Button component + accent token.

🟡 Architecture / duplication (design call)

  1. Third PTY engine. services/bun-pty-spawn.ts reimplements the Bun Bun.spawn({terminal}) truePty engine that already exists in packages/app-core/platforms/electrobun/remotes/pty/src/bun/pty-service.ts (same data/exit callbacks, same TextDecoder streaming, same "exit reports teardown status → take code from proc.exited" gotcha). Prefer extracting a shared session store both consume, per the consolidation mandate.
  2. Second cockpit terminal. CockpitInteractiveTerminal.tsx mounts PtyTerminalPane — the same pane already in CockpitTerminalPanel.tsx (which was authored anticipating this PTY_SERVICE build). Prefer teaching CockpitTerminalPanel/CockpitSessionPane to spawn an eliza-code session rather than a parallel overlay.
  3. Delete the dead PTY_SERVICE.coordinator fallback in coordinator-wiring.ts:43-46 and server-helpers-swarm.ts:261-267 — it's proven-dead scaffolding (get-acp-service-order.test.ts:37 pins "ignores removed PTY_SERVICE registrations"). Deleting it in this PR gives PTY_SERVICE exactly one meaning (the console bridge) instead of a semantic overload with first-wins-registration ambiguity.

🧪 Test gap

The "real e2e" (test/pty-real.e2e.test.ts:20) gates on @lydell/node-pty, an optional dep not built in CI → it skips; and under Bun the code runs bunTruePtySpawn, not node-pty. So the actual production engine has zero real-process coverage despite the PR's "real-runtime proof" claim. Add a real e2e that drives bunTruePtySpawn under isBunRuntime() (spawn sh -c, assert streamed output + keystroke round-trip + exit 0), plus regressions for: env-secret isolation (assert EVM_PRIVATE_KEY/ELIZA_API_TOKEN never in the child env), non-owner/no-token rejection, and disconnect-reaping.

Is it fixing a surface or the deeper problem?

Registering PTY_SERVICE is the correct root-cause fix. But "raw PTY streaming a full agent-with-shell-and-orchestrator over a WS" is the maximally-powerful, minimally-structured answer to the deeper need ("drive a coding agent from the cockpit, any device"). The cockpit already has the structured path (createOrchestratorTask + CockpitSessionPane, transcript + typed controls) that's introspectable/cost-attributable. Worth deciding: ship the raw PTY (blockers 1–3 fixed, coding-only, desktop/single-user only, never multi-tenant cloud / default-off on phone), or serve ~80% of the need via the existing structured session path + a /help command palette.


Reviewed by an 8-agent sweep (this PR + the launcher #10669). Happy to open a follow-up PR with the concrete safety patches (blockers 2/3, items 4/7/10) once the auth model (blocker 1) is decided — that decision is yours/the author's, and the fixes need a real device+cluster run to verify, which is why I'm not force-landing them.

@NubsCarson

Copy link
Copy Markdown
Member Author

Pushed follow-up head 471768731b for the deep-review blockers on #10668.

What changed:

  • Cockpit/local browser path: route dispatch now carries a trusted-loopback bit into plugin route handlers, and PTY routes allow trusted local cockpit calls without exposing ELIZA_TERMINAL_RUN_TOKEN to the browser. Remote/non-local HTTP still needs the terminal step-up token.
  • Recursive-agent risk: eliza-code PTY sessions now launch in coding-only mode via --coding-only + ELIZA_CODE_CODING_ONLY=1, and the example-code interactive path disables orchestrator loading in that mode.
  • Session leaks / input ownership: spawned sessions capture x-elizaos-client-id; websocket subscribe/input/resize reject mismatched owners; websocket close/error stops sessions owned by that client; the PTY store also has an idle reaper.
  • Egress/key hardening: baseUrl is constrained to Eliza Cloud plus explicit PTY_ALLOWED_BASE_URLS / PTY_ELIZA_CLOUD_BASE_URL_ALLOWLIST; PTY no longer falls back to the agent's primary OPENAI_API_KEY, only explicit body apiKey or PTY_ELIZA_CLOUD_API_KEY.
  • Cleanup: removed the dead PTY_SERVICE.coordinator fallback, switched the cockpit terminal buttons to the shared Button, and moved the real PTY test into the repo's *.real.test.ts lane.

Validation on this worktree after merging current origin/develop:

  • bun run --cwd plugins/plugin-pty test ✅ 4 files / 49 tests
  • bunx --bun vitest run --config packages/test/vitest/real.config.ts plugins/plugin-pty/test/pty.real.test.ts ✅ 1 file / 2 tests, exercises Bun-native real PTY
  • bun run --cwd plugins/plugin-pty typecheck
  • bun run --cwd plugins/plugin-pty lint
  • bun run --cwd plugins/plugin-pty build
  • bun run --cwd plugins/plugin-task-coordinator typecheck
  • bun run --cwd plugins/plugin-task-coordinator test ✅ 27 files / 206 tests
  • bun run --cwd plugins/plugin-task-coordinator build
  • bun run --cwd packages/agent typecheck
  • bun run --cwd packages/core typecheck
  • bun run --cwd packages/core build
  • bun run --cwd packages/examples/code typecheck
  • focused agent API tests around runtime plugin routes / PTY service plumbing / terminal auth / parse-action-block ✅ 5 files / 40 tests
  • changed-file Biome sweep ✅
  • git diff --check

Notes:

  • Running the same real test with the default Node-hosted Vitest runner skips because optional native @lydell/node-pty is unavailable in this checkout; under Bun it passes and uses Bun.spawn({ terminal }).
  • I kept the PR draft. The remaining non-code proof before merge is still full app/device cockpit evidence for the end-to-end browser → PTY → eliza-code path, plus the larger architecture follow-ups around shared PTY engine / duplicate terminal surface if maintainers want those before landing.

@NubsCarson

Copy link
Copy Markdown
Member Author

Additional browser-origin PTY smoke for current head 471768731b:\n\n- Built the interactive eliza-code CLI locally with bun run --cwd packages/examples/code build so packages/examples/code/dist/index.js existed.\n- Started the full dev stack on isolated temp ports/state/config with plugins.allow=["@elizaos/plugin-pty","@elizaos/plugin-task-coordinator"], ELIZA_TERMINAL_RUN_TOKEN configured, and PTY_ELIZA_CLOUD_API_KEY set to a dummy smoke key.\n- Waited specifically for deferred /api/pty/sessions route registration before exercising it.\n- From a Playwright page loaded at the Vite UI origin, called fetch("/api/pty/sessions", { method: "POST", headers: { "content-type": "application/json", "x-elizaos-client-id": "pty-browser-smoke-client" }, body: ... }) with no terminal token header/body.\n- Result: 200, spawned bun packages/examples/code/dist/index.js --interactive --coding-only, label eliza-code · fast · gpt-oss-120b, ownerClientId pty-browser-smoke-client.\n- Then DELETE /api/pty/sessions/:id from the same browser-origin client returned 200 { ok: true }, and GET /api/pty/sessions returned 200 { sessions: [] }.\n\nThis proves the local trusted cockpit/browser path works without exposing ELIZA_TERMINAL_RUN_TOKEN to the browser, and that the coding-only eliza-code spawn/cleanup path is live. Scope note: this was a route/browser-origin smoke, not a full visual onboarding/cloud-auth proof; the isolated no-provider UI shell stayed visually blank, so I am keeping that separate from the PTY route claim.

@NubsCarson

Copy link
Copy Markdown
Member Author

Refreshed #10668 onto current origin/develop (10bb61bbc0) and pushed head 37481b8b2a. The merge only pulled in the #10676 benchmark/server evidence change; no PTY/cockpit conflict.

Focused validation after refresh:

  • bun run --cwd plugins/plugin-pty test ✅ 4 files / 49 tests
  • bun run --cwd plugins/plugin-pty typecheck
  • bun run --cwd plugins/plugin-task-coordinator test src/CockpitInteractiveTerminal.test.tsx ✅ 1 file / 5 tests
  • git diff --check origin/develop...HEAD && git diff --check

This intentionally restarts CI but removes the branch-behind-develop state.

@lalalune

lalalune commented Jul 1, 2026

Copy link
Copy Markdown
Member

Env-leak HIGH hold — LIFTED ✅

Re-reviewed the current committed head (37481b8) directly (not the gh pr diff output). The prior wholesale-env concern is resolved:

  • plugins/plugin-pty/services/pty-session-store.tsbuildPtyEnv() (L76) constructs a fresh env object from two explicit allowlists: SAFE_INHERITED_ENV_KEYS (L35: PATH/HOME/USER/SHELL/TMPDIR/locale/TERM/color/XDG — no secrets) and ALLOWED_SPEC_ENV_KEYS (L60) for spec-supplied keys. It is fed to spawn() at L252.
  • No ...process.env spread anywhere in the plugin's non-test source (grepped). OPENAI_API_KEY/OPENAI_BASE_URL reach the child only via the explicit spec allowlist, sourced from the dedicated PTY_ELIZA_CLOUD_API_KEY rather than the agent's primary key — intentional and scoped, not a leak.
  • Surrounding controls verified: step-up token gate on /api/pty/* (403 if unconfigured, 401 on bad token), resolveAllowedBaseUrl SSRF allowlist, confineCwd root confinement, opt-in/dormant registration (no autoEnable), and a real-spawn test (test/pty.real.test.ts).

Footgun before un-drafting: gh pr diff still surfaces a superseded env: { ...process.env, ... } hunk from a pre-force-push revision. It is NOT in the committed tree, but any reviewer reading the raw diff will re-trigger the alarm. Please squash the history so that stale hunk is gone from what a reviewer sees.

Remaining before ready: (1) squash to drop the stale diff, (2) follow-up to enforce the pty-contract.ts structural re-declaration of ConsoleBridge/PTYService via a shared type or contract test (manual sync drift risk), (3) the real-device /help-on-cerebras capture the PR marks handoff/N/A, per PR_EVIDENCE. Design is sound; security blocker cleared.

@NubsCarson

Copy link
Copy Markdown
Member Author

Refreshed #10668 onto current origin/develop (0ec487bb7f) after #10678 and #10679 landed, and pushed head e1925606d5.

Pulled in:

Focused validation after refresh:

  • bun run --cwd plugins/plugin-pty test ✅ 4 files / 49 tests
  • bun run --cwd plugins/plugin-pty typecheck
  • bun run --cwd plugins/plugin-task-coordinator test src/CockpitInteractiveTerminal.test.tsx ✅ 1 file / 5 tests
  • bun test packages/agent/test/api/server-helpers-auth.test.ts ✅ 21 tests
  • git diff --check origin/develop...HEAD && git diff --check

Current GitHub state after push: mergeable against develop; CI restarted and is queued/pending.

@lalalune

lalalune commented Jul 1, 2026

Copy link
Copy Markdown
Member

Re-verified at current head e1925606 — merge-ready pending CI

Re-ran the security/blast-radius review at the current head (past the earlier 37481b8 review):

Safety properties — all hold:

  • No env leak — grepped every non-test plugin source: zero ...process.env / env: process.env. buildPtyEnv uses the two allowlists (SAFE_INHERITED_ENV_KEYS / ALLOWED_SPEC_ENV_KEYS).
  • Opt-in / dormanteliza.ts:483: registered as a deferred optional plugin, "dormant unless a character lists @elizaos/plugin-pty (no autoEnable)". Merging cannot activate a terminal by default.
  • Routes auth-gatedpty-routes.ts: 403 when no ELIZA_TERMINAL_RUN_TOKEN configured (disabled by default), 401 on missing/invalid token, plus x-elizaos-client-id ownership.
  • Core change additiveplugin.ts only adds optional isTrustedLocal?: boolean to RouteHandlerContext. Non-breaking.

Always-loaded wiring (the real blast radius) — reviewed, sound:

  • hono-mount auth refactor: app is now cached without baking in a per-request isAuthorized closure; auth + trusted-local are conveyed per-request via internal headers that the mount layer overwrites server-side (headers.set(...)), so a client-supplied x-eliza-internal-authorized: 1 cannot spoof auth. Functionally equivalent, and fixes the latent stale-closure caching. Applies to all routes.
  • Coordinator discovery: removing the PTY_SERVICE.coordinator fallback is dead-code removal — PTY_SERVICE isn't registered in the normal runtime pre-plugin, the live orchestrator fixtures access service.coordinator directly (not via agent-server discovery), and the orchestrator unit test already asserts "ignores removed PTY_SERVICE registrations". No regression.
  • WS PTY-session ownership helpers (server.ts): guarded — no-op when getPtyService returns null (PTY absent).

Holding the merge only until the typecheck/unit/integration lanes conclude (the auth refactor touches every route, so it gets a real CI signal before landing). The squash-merge will also collapse the stale intermediate ...process.env diff hunk out of history.

@NubsCarson

Copy link
Copy Markdown
Member Author

Refreshed #10668 again onto current origin/develop (7d5f944c2a) after the latest merged batch (#10673, #10681, #10677, #10675, #10674, #10683), and pushed head 1336be3f4f.

Focused validation after this larger refresh:

  • bun run --cwd plugins/plugin-pty test ✅ 4 files / 49 tests
  • bun run --cwd plugins/plugin-pty typecheck
  • bun run --cwd plugins/plugin-task-coordinator test src/CockpitInteractiveTerminal.test.tsx ✅ 1 file / 5 tests
  • bun test packages/agent/test/api/server-helpers-auth.test.ts ✅ 21 tests
  • git diff --check origin/develop...HEAD && git diff --check

No PTY/cockpit conflicts from the merged launcher/icon/cloud-app/app-control changes. CI restarted on the new head.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants