Add lifecycle event log and listEvents API #509
Conversation
Add a first pass at sandbox lifecycle events with durable storage and replay support. This gives callers a simple way to inspect sandbox, process, and port state changes without building their own polling log. The new listEvents API is intentionally small and synchronous. It is a safe base for later webhook and streaming work because it defines the event schema, sequence model, and retention behavior in one place.
🦋 Changeset detected. Latest commit: 1e5dc07. The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package.
🐳 Docker Images Published
Usage: `FROM cloudflare/sandbox:0.0.0-pr-509-1e5dc07`
📦 Standalone Binary, for arbitrary Dockerfiles:
```
COPY --from=cloudflare/sandbox:0.0.0-pr-509-1e5dc07 /container-server/sandbox /sandbox
ENTRYPOINT ["/sandbox"]
```
Download via GitHub CLI: `gh run download 23413619720 -n sandbox-binary`
Extract from Docker: `docker run --rm cloudflare/sandbox:0.0.0-pr-509-1e5dc07 cat /container-server/sandbox > sandbox && chmod +x sandbox`
All call sites bypassed enqueueLifecycleEvent and called recordLifecycleEvent directly, defeating the write queue's serialization guarantee. Concurrent calls could read the same seq value and overwrite each other's events in storage. Route every write through enqueueLifecycleEvent and return the Promise so callers that need to await the result can do so.
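The serialization this comment asks for can be sketched as a promise-chain write queue. This is an illustrative sketch, not the PR's actual code: the class name, the in-memory array standing in for durable storage, and the internals of `recordLifecycleEvent` are assumptions; only the `enqueueLifecycleEvent`/`recordLifecycleEvent` split mirrors the comment.

```typescript
// Sketch: every write is chained onto a single tail promise, so seq
// allocation and the storage write are serialized across callers.
type LifecycleEvent = { seq: number; type: string; timestamp: number };

class EventLog {
  private tail: Promise<unknown> = Promise.resolve();
  private nextSeq = 1;
  private events: LifecycleEvent[] = []; // stands in for durable storage

  // All call sites go through enqueue; none call recordLifecycleEvent
  // directly, so concurrent writers can never read the same seq.
  enqueueLifecycleEvent(type: string): Promise<LifecycleEvent> {
    const result = this.tail.then(() => this.recordLifecycleEvent(type));
    // Keep the chain alive even if one write fails.
    this.tail = result.catch(() => undefined);
    // Return the Promise so callers that need to await the result can.
    return result;
  }

  private async recordLifecycleEvent(type: string): Promise<LifecycleEvent> {
    const event = { seq: this.nextSeq++, type, timestamp: Date.now() };
    this.events.push(event);
    return event;
  }

  list(): LifecycleEvent[] {
    return this.events;
  }
}
```

With this shape, two synchronous `enqueueLifecycleEvent` calls are guaranteed distinct, increasing `seq` values even though neither awaits the other.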
A storage failure in any enqueueLifecycleEvent call should never prevent the primary operation from completing. The constructor fix also moves the lifecycleEventsInitialized flag write to after the event attempt, so a failed write does not suppress future retries.
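A minimal sketch of that pattern, assuming hypothetical names (`LifecycleRecorder` and `ensureCreatedEvent` are illustrative, not the PR's actual identifiers):

```typescript
// Sketch: the primary operation never fails because of event storage,
// and the initialized flag is only set after a successful write.
class LifecycleRecorder {
  lifecycleEventsInitialized = false;

  constructor(private write: (type: string) => Promise<void>) {}

  async ensureCreatedEvent(): Promise<void> {
    if (this.lifecycleEventsInitialized) return;
    try {
      await this.write("sandbox.created");
      // Set only after the attempt succeeds, so a failed write
      // does not suppress future retries.
      this.lifecycleEventsInitialized = true;
    } catch {
      // Swallow: a storage failure must never block the caller.
    }
  }
}
```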
I went through this change and have some high-level feedback. Largely, I think the event schema is good and roughly similar to your earlier changes under #456. But the fact that we've now landed #456 is super interesting and powerful in the context of this PR, because the canonical logs carry more context per event (e.g., duration, outcome, origin) and broader coverage (file ops, git ops, command exec, backups).
If I were to hash out a few use cases users could have around this change and propose alternate approaches:
Audit trails / dashboards
This one is more straightforward. With observability.enabled, all canonical events flow into Workers Logs and are queryable in the dashboard, filterable by any field, retained for 7 days, and, importantly, cross-sandbox. For longer retention, Logpush ships workers_trace_events to many different destinations. Both of these are zero-code-change and more capable than our in-SDK implementation here, which offers a 1000-event, DO-scoped log.
Orchestration / reacting to events
A Tail Worker can be set up rather easily to receive every canonical event in real time. It can filter by event name and forward to a Queue, HTTP endpoint, or any other destination.
```js
export default {
  async tail(events, env) {
    for (const event of events) {
      for (const log of event.logs) {
        if (log.message?.[0]?.event === 'process.exit') {
          await env.MY_QUEUE.send(log.message[0]);
        }
      }
    }
  }
}
```
This does not cover replay-on-restart, but that can still theoretically be done with a Tail Worker writing to D1 or KV with a sequence number, giving the same `listEvents({ afterSeq })` semantics but with unlimited retention and cross-sandbox queries, both of which we can't offer within the SDK.
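The ingestion half of that idea could look roughly like this. The `TraceItem` shape is simplified, and the `Map` stands in for a real D1 or KV write; everything here is a sketch, not a working Tail Worker.

```typescript
// Sketch: a tail handler assigns its own sequence numbers and persists
// each canonical event, enabling listEvents({ afterSeq })-style reads.
type TraceItem = { logs: { message: unknown[] }[] };

const store = new Map<number, unknown>(); // stands in for D1/KV
let nextSeq = 1;

function ingest(events: TraceItem[]): void {
  for (const event of events) {
    for (const log of event.logs) {
      const first = log.message?.[0] as { event?: string } | undefined;
      if (first?.event) {
        store.set(nextSeq++, first); // durable write in the real version
      }
    }
  }
}
```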
If there are any other use cases that emerge beyond these, we should discuss them and see how we can work this out at the overall Workers platform level rather than solving for it within the DO/container layers. Or, if it's just these, maybe we can more simply write an example and/or docs to clearly illustrate how to achieve the same setup with just the existing primitives.
Summary
This is PR 1 for lifecycle events.
It adds the first durable event log and replay API for sandboxes so we can
build webhook and streaming delivery on top of a stable internal model.
What changed
- Shared lifecycle event types in `@repo/shared`
- `sandbox.listEvents()` added to the public SDK surface
- Per-sandbox monotonic `seq` for ordering and replay
- Event types: `sandbox.created`, `sandbox.started`, `sandbox.destroyed`, `process.started`, `process.exited`, `port.exposed`, `port.unexposed`

Why this shape
I kept this PR intentionally narrow.
The goal here is to establish the canonical event schema and a replayable
storage model before adding webhook delivery or streaming subscriptions.
That gives us one source of truth for ordering, filtering, and retention.
API
Each event includes:
- `id`
- `seq`
- `sandboxId`
- `timestamp`
- `type`
- type-specific fields such as `processId`, `exitCode`, `port`, or `url`

Reviewer notes
Please review this PR primarily for the lifecycle event model rather than for the final product surface.
The main things worth pressure-testing are:
- Whether `listEvents({ afterSeq, limit, types })` is the right replay API.

A few intentional constraints in this PR:
Those should be easier to add once the event schema and ordering semantics are
settled.
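To make the replay semantics concrete, here is a minimal sketch of how `listEvents({ afterSeq, limit, types })` could behave. The event field types (e.g. `timestamp` as epoch milliseconds) and the in-memory implementation are assumptions for illustration, not the PR's actual code.

```typescript
// Assumed event shape, based on the fields listed under "API" above.
interface LifecycleEvent {
  id: string;
  seq: number;        // per-sandbox monotonic sequence number
  sandboxId: string;
  timestamp: number;  // assumption: epoch milliseconds
  type: string;       // e.g. "process.exited", "port.exposed"
  processId?: string;
  exitCode?: number;
  port?: number;
  url?: string;
}

// Sketch of the replay semantics: afterSeq for resuming, types for
// filtering, limit for paging, results strictly in seq order.
function listEvents(
  all: LifecycleEvent[],
  opts: { afterSeq?: number; limit?: number; types?: string[] } = {},
): LifecycleEvent[] {
  let out = all
    .filter((e) => e.seq > (opts.afterSeq ?? 0))
    .sort((a, b) => a.seq - b.seq);
  if (opts.types) {
    out = out.filter((e) => opts.types!.includes(e.type));
  }
  return opts.limit !== undefined ? out.slice(0, opts.limit) : out;
}
```

Under these semantics, a caller resuming after a restart would persist the last `seq` it processed and pass it back as `afterSeq`.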
Testing
Ran:
- `npm run check -w @cloudflare/sandbox`
- `npm run typecheck -w @repo/shared`
- `npm test -w @cloudflare/sandbox -- sandbox.test.ts get-sandbox.test.ts`

Follow-up
I plan to stack PR 2 on top of this branch to continue lifecycle event work.