Add experimental RPC bridge API #664

Draft
aron-cf wants to merge 5 commits into rpc-bridge-ci from bridge-rpc-api

@aron-cf aron-cf commented May 1, 2026

Adds a typed capnweb RPC endpoint to @cloudflare/sandbox/bridge and a thin TypeScript client at @cloudflare/sandbox/bridge-client. One WebSocket per client gives consumers a typed handle to every sandbox method, no per-route HTTP plumbing, no warm pool indirection.

⚠️ Experimental. The Sandbox RPC API differs from the Sandbox class interface, and until we consolidate transports this is going to be messy. For now this interface should be considered exploratory. The /v1/rpc route returns 404 unless explicitly enabled. See Why experimental below.

What's in this PR

  • @cloudflare/sandbox/bridge: GET /v1/rpc WebSocket endpoint, gated behind a new enableExperimentalRPC flag on BridgeConfig. Subprotocol-only bearer auth (Sec-WebSocket-Protocol: cloudflare-sandbox-bridge.bearer.<token>) so browser WebSocket clients work uniformly with Bun, Node 22+, and Cloudflare Workers. The endpoint deliberately bypasses the /sandbox/* HTTP middleware: sandbox-id validation lives inside the RPC call, container resolution is direct, and one connection can address many sandboxes.

  • @cloudflare/sandbox/bridge-client — typed client subpath: createBridgeClient({ url, token }) returns one BridgeClient instance that manages many sandboxes over a single socket. client.sandbox(id) returns a lazy proxy structurally typed as SandboxRPCAPI (the 10 domains — commands, files, processes, ports, git, interpreter, utils, backup, desktop, watch — plus an id getter). Auth failures surface as a typed BridgeAuthError. await using is supported via Symbol.asyncDispose.

  • bridge/ (the deployable bridge Worker) — reads SANDBOX_EXPERIMENTAL_RPC=true from the deployment vars and passes enableExperimentalRPC: true to bridge(). Route returns 404 by default. Integration tests cover the new endpoint end-to-end.
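To make the subprotocol-only auth concrete, here is a minimal sketch of how a token could be packed into and read back out of a Sec-WebSocket-Protocol value. Only the `cloudflare-sandbox-bridge.bearer.` prefix comes from this PR; `bearerSubprotocol` and `extractBearerToken` are hypothetical helper names and are not claimed to match the actual implementation.

```ts
// Prefix taken from the PR description; the helpers below are illustrative only.
const BEARER_PREFIX = 'cloudflare-sandbox-bridge.bearer.';

/** Client side: build the subprotocol value that carries the token. */
function bearerSubprotocol(token: string): string {
  return `${BEARER_PREFIX}${token}`;
}

/**
 * Server side: pull the token out of a Sec-WebSocket-Protocol header value.
 * The header is a comma-separated list, so each entry is checked in turn.
 * Returns null when no entry carries a non-empty bearer token.
 */
function extractBearerToken(headerValue: string | null): string | null {
  if (!headerValue) return null;
  for (const entry of headerValue.split(',')) {
    const protocol = entry.trim();
    if (
      protocol.startsWith(BEARER_PREFIX) &&
      protocol.length > BEARER_PREFIX.length
    ) {
      return protocol.slice(BEARER_PREFIX.length);
    }
  }
  return null;
}
```

A browser or Node client would pass the value through the standard constructor, for example `new WebSocket(wsUrl, [bearerSubprotocol(token)])`. Subprotocol names must be valid HTTP tokens, so this only works for tokens without spaces or commas.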

Enabling the endpoint

For the deployable bridge Worker, set SANDBOX_EXPERIMENTAL_RPC=true in the deployment vars.

If you're embedding bridge() directly, opt in via config:

import { bridge } from '@cloudflare/sandbox/bridge';

export default bridge(
  { /* user handlers */ },
  { enableExperimentalRPC: true }
);

Using the client

Run a command

import { createBridgeClient } from '@cloudflare/sandbox/bridge-client';

const client = createBridgeClient({
  url: 'https://bridge.example.com/v1',
  token: process.env.SANDBOX_API_KEY,
});

const sandbox = client.sandbox();
const result = await sandbox.commands.execute(
  'echo hello',
  'my-session-id'
);
console.log(result.stdout); // "hello\n"

await client.close();

Read and write files

const sandbox = client.sandbox();

await sandbox.files.writeFile(
  '/workspace/hello.txt',
  'hi from rpc',
  'my-session-id'
);

const { content } = await sandbox.files.readFile(
  '/workspace/hello.txt',
  'my-session-id'
);
console.log(content); // "hi from rpc"

Stream command output

commands.executeStream returns a ReadableStream<Uint8Array> of SSE-encoded ExecEvents, identical to the wire format produced by Sandbox.execStream(). Capnweb forwards the stream natively across the WebSocket:

import { parseSSEStream } from '@cloudflare/sandbox';

const stream = await client
  .sandbox('my-sandbox')
  .commands.executeStream('npm test', 'my-session-id');

for await (const event of parseSSEStream(stream)) {
  if (event.type === 'stdout') process.stdout.write(event.data);
  if (event.type === 'complete') console.log('exit:', event.exitCode);
}

Re-connect to a previous sandbox ID

const sandbox = client.sandbox();
const id = await sandbox.id; // "ar3zchxcgucsl2gzw7oku4ynbe"
await sandbox.commands.execute('hostname', sessionId);

const sandbox2 = client.sandbox(id);
await sandbox2.commands.execute('hostname', sessionId);

Address many sandboxes from one client

A single BridgeClient opens one WebSocket and reuses it for every sandbox you address. Per-sandbox stubs are cached on first access:

const client = createBridgeClient({ url, token });

await Promise.all([
  client.sandbox().commands.execute('uname -a', sessionA),
  client.sandbox().commands.execute('uname -a', sessionB),
  client.sandbox().files.listFiles('/workspace', sessionC),
]);

await client.close(); // tears the WebSocket down
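For readers wondering what "cached on first access" could look like, the following is a rough sketch and not the client's actual code. `StubCache` and the `resolve` callback are hypothetical stand-ins for whatever the client uses to fetch a per-sandbox stub over the shared socket.

```ts
// Hypothetical stand-in for the remote call that resolves a per-sandbox stub.
type Resolver<T> = (id: string) => Promise<T>;

class StubCache<T> {
  private readonly stubs = new Map<string, Promise<T>>();
  private readonly resolve: Resolver<T>;

  constructor(resolve: Resolver<T>) {
    this.resolve = resolve;
  }

  /**
   * Returns the cached promise for `id`, calling the resolver only the first
   * time that id is requested. Caching the promise (not the resolved value)
   * means concurrent callers share a single in-flight resolution.
   */
  get(id: string): Promise<T> {
    let stub = this.stubs.get(id);
    if (stub === undefined) {
      stub = this.resolve(id);
      this.stubs.set(id, stub);
    }
    return stub;
  }
}
```

One design point worth noting: a rejected promise stays cached in this sketch, so a production version would probably evict failed entries to allow a retry.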

Handle auth failures

import {
  BridgeAuthError,
  createBridgeClient,
} from '@cloudflare/sandbox/bridge-client';

const client = createBridgeClient({ url, token });

try {
  await client.sandbox('my-sandbox').utils.ping();
} catch (err) {
  if (err instanceof BridgeAuthError) {
    console.error(`auth failed (${err.status}):`, err.message);
  } else {
    throw err;
  }
}

await using (TypeScript 5.2+ / Node 22+)

{
  await using client = createBridgeClient({ url, token });
  await client.sandbox('my-sandbox').commands.execute('ls', sessionId);
} // socket closed automatically here

Why experimental

The RPC surface (SandboxRPCAPI) was sketched to mirror the container's internal SandboxAPI shape — ten neatly-typed domains, each with a small set of methods. Wiring it up against the SDK's actual Sandbox class exposed several places where the two have drifted:

  • Method shapes: Several methods exist on both surfaces with different signatures. commands.execute takes (command, sessionId, { timeoutMs, env, cwd }) on the wire; Sandbox.exec takes (command, { timeout, env, cwd }) with sessionId either implicit (default session) or supplied through a separate getSession(id).exec(...) call. The shim translates timeoutMs ↔ timeout and routes through getSession() for explicit-session calls, but the types in @repo/shared's ISandbox interface have drifted from the runtime Sandbox class. We had to switch the bridge surface from ISandbox to Sandbox<any> (via PublicInterface<T>) to access what the proxy actually exposes (desktop, exposePort, etc.).

  • Return shapes: processes.startProcess returns a flat ProcessStartResult DTO on the wire ({ processId, pid, command, timestamp }); Sandbox.startProcess returns a rich Process object with methods (kill, getStatus, waitForLog, waitForPort, waitForExit).

  • Hostname coupling: ports.exposePort needs a hostname to synthesise preview URLs. The bridge captures it from the upgrade request's Host header (or the BRIDGE_PREVIEW_HOSTNAME env var) and threads it through every RPC session.

  • Method gaps: The wire shape lists ten domains' worth of methods; some have no public equivalent on Sandbox and had to be wired through the underlying client.{utils,backup,interpreter,ports}.X() calls.
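The method-shape mismatch is easiest to see in code. Below is a minimal sketch of the kind of translation the shim has to do, assuming `timeoutMs` and `timeout` carry the same unit and differ only in name. `SandboxLike`, `ExecTarget`, `toSdkOptions`, and `execute` are hypothetical names for illustration, not the bridge's real types.

```ts
// Wire-side options, as described for commands.execute.
interface WireExecOptions {
  timeoutMs?: number;
  env?: Record<string, string>;
  cwd?: string;
}

// SDK-side options, as described for Sandbox.exec.
interface SdkExecOptions {
  timeout?: number;
  env?: Record<string, string>;
  cwd?: string;
}

// Hypothetical minimal slice of the SDK surface that the shim touches.
interface ExecTarget {
  exec(command: string, options?: SdkExecOptions): Promise<{ stdout: string }>;
}
interface SandboxLike extends ExecTarget {
  getSession(id: string): ExecTarget | Promise<ExecTarget>;
}

/** Renames timeoutMs to timeout and passes env/cwd through unchanged. */
function toSdkOptions(wire: WireExecOptions = {}): SdkExecOptions {
  const { timeoutMs, ...rest } = wire;
  return timeoutMs === undefined ? rest : { ...rest, timeout: timeoutMs };
}

/** Wire-shaped execute: explicit sessionId, routed through getSession(). */
async function execute(
  sandbox: SandboxLike,
  command: string,
  sessionId: string,
  options?: WireExecOptions
): Promise<{ stdout: string }> {
  const session = await sandbox.getSession(sessionId);
  return session.exec(command, toSdkOptions(options));
}
```

Every domain method with a drifted signature needs an adapter of roughly this shape, which is part of why the surface is still considered exploratory.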

These rough edges all map back to the same root cause: the SDK's public Sandbox interface and the container's SandboxAPI were designed independently, and the bridge sits between them. These should evolve over time so that the public Sandbox interface is a subset of the DO/Container interface.

aron-cf added 3 commits May 1, 2026 12:51
Currently e2e tests will only run when `run-e2e-bridge` label is applied
to the PR (for testing).

Introduces a typed capnweb RPC layer alongside the existing HTTP routes.
Exposed at `${apiPrefix}/rpc` (default `/v1/rpc`), gated behind a new
`enableExperimentalRPC` flag on `BridgeConfig`. Returns 404 by default.

Wire shape:

    interface BridgeRPCAPI {
      sandbox(id?: string): Promise<SandboxRPCAPI>;
    }

`sandbox()` validates the id (or generates a fresh one via
`generateSandboxId`), then returns a `SandboxRPCAPI` stub bound to that
sandbox. `SandboxRPCAPI` mirrors the container's internal `SandboxAPI`
and exposes all ten domains: commands, files, processes, ports, git,
interpreter, utils, backup, desktop, watch — plus an `id` getter so
callers can read back a server-generated id. Each domain shim forwards
to the SDK's `BridgeSandbox` proxy (the runtime `Sandbox` class typed
via `PublicInterface<Sandbox<any>>` so private members can't leak).

Authentication is carried in `Sec-WebSocket-Protocol`
(`cloudflare-sandbox-bridge.bearer.<token>`) — the only auth-bearing
header browser `WebSocket` constructors can set. Factored into
`authenticateRpcUpgrade()` so the check is independently testable.

The endpoint deliberately bypasses the `/sandbox/*` HTTP middleware:
sandbox-id validation lives inside the RPC call, container resolution
is direct via `getBridgeSandbox(ns, id)` (no warm-pool indirection),
and one connection can address many sandboxes via repeated
`rpc.sandbox(id)` calls. The route is registered in the same Hono app
as the rest of the `/v1/*` surface; the `enableExperimentalRPC` gate
lives inside the single handler (404 fast-path before any auth or
upgrade work).
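A minimal sketch of that check ordering follows, using the status codes listed in the integration-test notes (404 when disabled, 400 for a plain GET, 401 for a missing or wrong subprotocol, 101 on success). `rpcUpgradeStatus`, `RpcGateConfig`, and `HeaderBag` are hypothetical names, and the relative order of the 400 and 401 checks is an assumption rather than something this PR states.

```ts
interface RpcGateConfig {
  enableExperimentalRPC?: boolean;
}

// Header names are assumed to be lower-cased by the caller.
type HeaderBag = Record<string, string | undefined>;

const PREFIX = 'cloudflare-sandbox-bridge.bearer.';

function rpcUpgradeStatus(
  config: RpcGateConfig,
  headers: HeaderBag,
  isValidToken: (token: string) => boolean
): 404 | 400 | 401 | 101 {
  // 1. Feature gate first: a disabled route looks like it does not exist.
  if (!config.enableExperimentalRPC) return 404;

  // 2. Anything that is not a WebSocket upgrade is a bad request.
  if ((headers['upgrade'] ?? '').toLowerCase() !== 'websocket') return 400;

  // 3. Subprotocol-only auth: the Authorization header is never consulted.
  const offered = (headers['sec-websocket-protocol'] ?? '')
    .split(',')
    .map((protocol) => protocol.trim());
  const bearer = offered.find((protocol) => protocol.startsWith(PREFIX));
  if (bearer === undefined || !isValidToken(bearer.slice(PREFIX.length))) {
    return 401;
  }

  // 4. Upgrade accepted; a real handler would echo the chosen subprotocol.
  return 101;
}
```

Putting the gate first means a deployment that has not opted in does no auth or upgrade work and does not reveal that the route exists.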

Adds `listSessions()` to `UtilityClient` so the HTTP transport
satisfies `SandboxUtilsAPI` (the capnweb transport already had it);
the bridge `utils.listSessions` shim now delegates instead of throwing
not_implemented.

Tests live in `packages/sandbox/tests/`:
- `bridge-rpc.test.ts` (27 tests) drives `handleRpcUpgrade()` directly
  through an in-process WebSocket pair, plus a small `createBridgeApp`
  slice for the gating behaviour.
- `bridge-test-helpers.ts` provides the focused `createMockSandbox` /
  `createMockEnv` used by both the RPC tests and the bridge-client
  tests.

This endpoint is **experimental**: the surface mirrors the still-
evolving sandbox interface and is subject to breaking changes.

changeset-bot Bot commented May 1, 2026

🦋 Changeset detected

Latest commit: 3936cc8

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
| Name | Type |
| --- | --- |
| @cloudflare/sandbox | Patch |

pkg-pr-new Bot commented May 1, 2026

npm i https://pkg.pr.new/cloudflare/sandbox-sdk/@cloudflare/sandbox@664

commit: 3936cc8

github-actions Bot commented May 1, 2026

🐳 Docker Images Published

| Variant | Image |
| --- | --- |
| Default | cloudflare/sandbox:0.0.0-pr-664-3936cc8 |
| Python | cloudflare/sandbox:0.0.0-pr-664-3936cc8-python |
| OpenCode | cloudflare/sandbox:0.0.0-pr-664-3936cc8-opencode |
| Musl | cloudflare/sandbox:0.0.0-pr-664-3936cc8-musl |
| Desktop | cloudflare/sandbox:0.0.0-pr-664-3936cc8-desktop |

Usage:

FROM cloudflare/sandbox:0.0.0-pr-664-3936cc8

Version: 0.0.0-pr-664-3936cc8


📦 Standalone Binary

For arbitrary Dockerfiles:

COPY --from=cloudflare/sandbox:0.0.0-pr-664-3936cc8 /container-server/sandbox /sandbox
ENTRYPOINT ["/sandbox"]

Download via GitHub CLI:

gh run download 25215867899 -n sandbox-binary

Extract from Docker:

docker run --rm cloudflare/sandbox:0.0.0-pr-664-3936cc8 cat /container-server/sandbox > sandbox && chmod +x sandbox

aron-cf added 2 commits May 1, 2026 13:17
Typed capnweb client for the bridge Worker's `GET /v1/rpc` endpoint.
One `BridgeClient` instance manages many sandboxes over a single
WebSocket; method calls on `client.sandbox(id)` lazily resolve a
per-sandbox stub and forward the call through capnweb.

```ts
import { createBridgeClient } from '@cloudflare/sandbox/bridge-client';

const client = createBridgeClient({
  url: 'https://bridge.example.com',
  token: process.env.SANDBOX_API_KEY,
});

const sandbox = client.sandbox('my-sandbox');
const result = await sandbox.commands.execute('ls', sessionId);

await client.close();
```

Implementation notes:

- Single WebSocket per `BridgeClient`. Per-sandbox stubs are cached;
  `client.sandbox(id)` returns a lazy proxy that resolves the stub on
  first method call and reuses it thereafter. `sandbox()` (no id)
  asks the server to generate one; the result is readable via the
  `id` getter on the handle.
- Auth is carried in `Sec-WebSocket-Protocol`
  (`cloudflare-sandbox-bridge.bearer.<token>`), so the same client
  works in browsers, Bun, Node 22+, and Cloudflare Workers without
  per-runtime adapters.
- Browsers and Bun hide the upgrade response status on a failed WS
  upgrade. To keep the `BridgeAuthError` contract honest, the client
  probes the endpoint with a follow-up `fetch()` on close-without-open
  and uses the real HTTP status — 401 reliably becomes
  `BridgeAuthError { status: 401 }`; everything else stays
  `BridgeConnectError`.
- `Symbol.asyncDispose` is implemented for `await using` syntax.
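
The close-without-open probe described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the client's code: `SketchAuthError` and `SketchConnectError` are local stand-ins for `BridgeAuthError` and `BridgeConnectError`, and `probe` is a hypothetical callback that performs the follow-up `fetch()` and returns the HTTP status.

```ts
// Local stand-ins so the sketch is self-contained; not the real error classes.
class SketchConnectError extends Error {
  constructor(message: string) {
    super(message);
    this.name = 'SketchConnectError';
  }
}

class SketchAuthError extends Error {
  readonly status: number;
  constructor(status: number, message: string) {
    super(message);
    this.name = 'SketchAuthError';
    this.status = status;
  }
}

/**
 * Called when the socket closes before it ever opened. The WebSocket API in
 * browsers and Bun hides the upgrade status, so the real status is recovered
 * from a plain HTTP probe: 401 maps to an auth error, anything else
 * (including a probe that itself fails) maps to a connect error.
 */
async function classifyFailedUpgrade(
  probe: () => Promise<number>
): Promise<Error> {
  try {
    const status = await probe();
    if (status === 401) {
      return new SketchAuthError(status, 'bridge rejected the token');
    }
    return new SketchConnectError(`upgrade failed (probe status ${status})`);
  } catch {
    return new SketchConnectError('upgrade failed and the probe did not complete');
  }
}
```

A caller might pass something like `() => fetch(rpcUrl).then((res) => res.status)` as the probe. How the real client presents the token during that probe is not specified in this PR.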

Wired into the build via `tsdown.config.ts` and the `./bridge-client`
subpath in `packages/sandbox/package.json`.

Tests in `packages/sandbox/tests/bridge-client.test.ts` (9 tests) run
against the real bridge route via `handleRpcUpgrade()` and an
in-process WebSocket pair, exercising the full subprotocol auth and
capnweb wiring through the public client surface.

Includes a changeset describing the experimental endpoint and the
opt-in flag from the consumer's perspective.

Plumbs the new `enableExperimentalRPC` flag through the bridge worker
behind a `SANDBOX_EXPERIMENTAL_RPC=true` deployment var:

- `bridge/worker/src/index.ts` reads `env.SANDBOX_EXPERIMENTAL_RPC`
  and passes the flag to `bridge()`.
- `bridge/worker/wrangler.jsonc` declares the new var (empty default,
  override per deployment).
- `bridge/worker/.dev.vars.example` documents the opt-in alongside
  `SANDBOX_API_KEY` (using the bare `KEY=` style consistently).
- `bridge/worker/package.json` updates `cf-typegen` to pass
  `--env-file .dev.vars.example` so vars get a usable `string` type
  in the generated `Env` interface, not a literal-narrowed `""`.
  Regenerated `worker-configuration.d.ts` to match.
- `bridge/worker/README.md` documents the route in the API table and
  adds a full reference section with the experimental warning, the
  `SANDBOX_EXPERIMENTAL_RPC=true` flag, the subprotocol-only auth
  scheme, and a `bridge-client` usage snippet.

`bridge/script/integration` exercises the new endpoint end-to-end:
- Raw HTTP coverage (plain GET → 400, missing/wrong subprotocol →
  401, Authorization header rejected, correct subprotocol → 101 with
  echoed protocol).
- `@cloudflare/sandbox/bridge-client` coverage (auth-error path,
  end-to-end `utils.ping`, pool-error fallthrough when no container
  backend is available).
- Spawns `wrangler dev` with `--var SANDBOX_EXPERIMENTAL_RPC:true`
  so the local route is reachable; bare process env doesn't reach
  workerd.