Add disconnected file change checks by whoiskatrin · Pull Request #493 · cloudflare/sandbox-sdk

whoiskatrin · 2026-03-13T15:18:18Z

Summary

- Add sandbox.checkChanges() for apps that disconnect and reconnect later but still need to know whether files changed in the meantime.
- Keep sandbox.watch() as the live event-stream API for connected consumers.
- Return a simple status (unchanged, changed, or resync) plus a version token so callers can cheaply skip work, sync incrementally, or do a full rescan.

Why

Some consumers do not stay connected to a watch stream. They just need to reconnect later and ask whether a path changed while they were away.

This change adds that simpler workflow directly instead of exposing the lower-level watch coordination protocol. The API is intentionally an invalidation check, not an event log, and retained state only lasts for the current container lifetime.

Example

const first = await sandbox.checkChanges("/workspace")

const next = await sandbox.checkChanges("/workspace", {
  since: first.version,
})

if (next.status === "changed") {
  await backup()
}

if (next.status === "resync") {
  await fullRescan()
}
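
One way to read the example: a reconnecting consumer stores a single version token and turns each status into an action. The sketch below assumes the result carries only the `status` and `version` fields used above; the `SyncAction` names, the `SyncPlan` shape, and the `planSync` helper are hypothetical and not part of the SDK. It also assumes a `resync` result still returns a usable version token.

```typescript
// Assumed result shape: only `status` and `version` appear in the PR example.
type ChangeStatus = "unchanged" | "changed" | "resync";

interface ChangeCheckResult {
  status: ChangeStatus;
  version: string;
}

// Hypothetical action names, for illustration only.
type SyncAction = "skip" | "incremental-sync" | "full-rescan";

interface SyncPlan {
  action: SyncAction;
  // Token to persist and pass as `since` on the next check
  // (assumes every status, including `resync`, returns a fresh token).
  nextSince: string;
}

// Map an invalidation result to the work the caller should do.
function planSync(result: ChangeCheckResult): SyncPlan {
  switch (result.status) {
    case "unchanged":
      return { action: "skip", nextSince: result.version };
    case "changed":
      return { action: "incremental-sync", nextSince: result.version };
    case "resync":
      return { action: "full-rescan", nextSince: result.version };
  }
}
```

A caller would persist `nextSince` (for example in Durable Object storage) and pass it as `since` after the next reconnect.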

Keep file watches usable for hibernating Durable Objects by separating live SSE delivery from retained watch state in the container. Add owner-scoped acknowledgement and idle expiry so background consumers can reconcile safely without sharing a global dirty bit.

changeset-bot · 2026-03-13T15:18:26Z

🦋 Changeset detected

Latest commit: e736ac1

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package

Name	Type
@cloudflare/sandbox	Patch

whoiskatrin · 2026-03-13T15:23:51Z

/bonk review this PR

pkg-pr-new · 2026-03-13T15:37:47Z

npm i https://pkg.pr.new/cloudflare/sandbox-sdk/@cloudflare/sandbox@493

commit: e736ac1

github-actions · 2026-03-13T15:37:48Z

🐳 Docker Images Published

Variant	Image
Default	`cloudflare/sandbox:0.0.0-pr-493-e736ac1`
Python	`cloudflare/sandbox:0.0.0-pr-493-e736ac1-python`
OpenCode	`cloudflare/sandbox:0.0.0-pr-493-e736ac1-opencode`
Musl	`cloudflare/sandbox:0.0.0-pr-493-e736ac1-musl`
Desktop	`cloudflare/sandbox:0.0.0-pr-493-e736ac1-desktop`

Usage:

FROM cloudflare/sandbox:0.0.0-pr-493-e736ac1

Version: 0.0.0-pr-493-e736ac1

📦 Standalone Binary

For arbitrary Dockerfiles:

COPY --from=cloudflare/sandbox:0.0.0-pr-493-e736ac1 /container-server/sandbox /sandbox
ENTRYPOINT ["/sandbox"]

Download via GitHub CLI:

gh run download 23848723310 -n sandbox-binary

Extract from Docker:

docker run --rm cloudflare/sandbox:0.0.0-pr-493-e736ac1 cat /container-server/sandbox > sandbox && chmod +x sandbox

ask-bonk · 2026-03-13T15:49:18Z

@whoiskatrin Bonk workflow was cancelled.

View workflow run · To retry, trigger Bonk again.

Replace the initial ownership-flavoured watch API with a cleaner checkpoint and lease model for background consumers. Use `changed`, `checkpointWatch()`, and returned lease tokens for the public flow, while `resumeToken` keeps `ensureWatch()` retryable without exposing another consumer's lease.

ask-bonk

Review: Persistent file watch state

The design is well-thought-out. The watch-as-invalidation-signal model (cursor + changed + overflowed) is the right abstraction for DO-based consumers that wake up periodically — much better than trying to replay an event log. The lease/resume token split is clean, and the expiry-on-idle approach avoids leaking watches.

I found several issues worth addressing, ranging from a concurrency bug to some cleanup items.

Issues

1. Race condition in stopWatch lease verification (watch-service.ts:208-236)

stopWatch calls verifyPersistentWatchLease before awaiting watch.ready.promise, unlike checkpointWatch which correctly awaits readiness first. If stopWatch is called while the watch is still initializing, watch.persistent may still be false even though ensureWatch is in-flight, causing verifyPersistentWatchLease to return an error ("Only persistent watches can stop") when it should succeed.

Consider adding the same await watch.ready.promise guard as in checkpointWatch, or document why the asymmetry is intentional.
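
A minimal sketch of the suggested guard, assuming the shapes described in this review. The `Deferred` class, the `ActiveWatchSketch` fields, the "Lease token mismatch" message, and the function bodies are illustrative stand-ins, not the code in watch-service.ts.

```typescript
// Illustrative Deferred: exposes a promise and its resolver.
class Deferred {
  promise: Promise<void>;
  resolve!: () => void;
  constructor() {
    this.promise = new Promise<void>((res) => {
      this.resolve = res;
    });
  }
}

// Hypothetical subset of the watch state discussed in the review.
interface ActiveWatchSketch {
  ready: Deferred;
  persistent: boolean;
  leaseToken?: string;
}

// Returns an error message, or null when the lease is valid.
function verifyPersistentWatchLease(
  watch: ActiveWatchSketch,
  leaseToken: string | undefined
): string | null {
  if (!watch.persistent) return "Only persistent watches can stop";
  if (watch.leaseToken !== leaseToken) return "Lease token mismatch";
  return null;
}

// Await readiness first, so `persistent` is settled before the lease check.
async function stopWatchSketch(
  watch: ActiveWatchSketch,
  leaseToken: string | undefined
): Promise<string | null> {
  await watch.ready.promise;
  return verifyPersistentWatchLease(watch, leaseToken);
}
```

With the guard in place, a stop request issued during initialization waits until the watch has been marked persistent instead of failing the lease check early.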

2. handleStopWatch validates leaseToken even when undefined (watch-handler.ts:228-233)

extractQueryParam returns null which gets coerced to undefined via ?? undefined. Then validateToken('leaseToken', undefined) is called, which returns null (passes validation). This works by accident but the flow is confusing — validateToken silently accepts undefined because it treats it as "not provided", yet stopWatch at the service level requires a lease token for persistent watches. The handler does no semantic validation that leaseToken is actually present when stopping a persistent watch; that burden falls entirely on the service layer.

This is fine as-is (the service layer catches it), but the handler-level validation feels like it's doing work that doesn't accomplish anything for the stop case. Consider documenting the intentional pass-through, or adding a comment that lease enforcement is in the service layer.

3. normalizePatterns called redundantly in getOrCreateWatch (watch-service.ts:260-262)

normalizePatterns is called at the top of getOrCreateWatch, then the results are stored on the ActiveWatch. But createWatchKey (called on line 264) also calls normalizePatterns internally. The patterns are normalized 3 times total for a single getOrCreateWatch call. Minor, but easy to clean up by passing the already-normalized values into createWatchKey.
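
The cleanup could look roughly like the sketch below: normalize once, then pass the normalized values into the key builder. The function names come from this review, but the bodies, the `WatchFilters` shape, and the normalization rules (trim, drop empties, dedupe, sort) are assumptions made for illustration.

```typescript
// Hypothetical filter shape; the real options may differ.
interface WatchFilters {
  include?: string[];
  exclude?: string[];
}

// Assumed normalization: trim, drop empty entries, dedupe, sort.
function normalizePatterns(patterns: string[] = []): string[] {
  const cleaned = patterns.map((p) => p.trim()).filter((p) => p.length > 0);
  return Array.from(new Set(cleaned)).sort();
}

// Takes already-normalized patterns and does no normalization of its own.
function createWatchKey(
  path: string,
  include: readonly string[],
  exclude: readonly string[]
): string {
  return JSON.stringify({ path, include, exclude });
}

// Normalizes each pattern list exactly once and reuses the result
// for both the key and the stored watch metadata.
function getOrCreateWatchKey(path: string, filters: WatchFilters = {}) {
  const include = normalizePatterns(filters.include);
  const exclude = normalizePatterns(filters.exclude);
  return { key: createWatchKey(path, include, exclude), include, exclude };
}
```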

4. Missing resumeToken in createWatchKey (watch-service.ts:342-354)

The watch key does not include resumeToken, which means two ensureWatch calls with different resume tokens but the same path/filters will resolve to the same ActiveWatch. This is likely intentional (the key identifies the filesystem watch, not the lease holder), but worth a comment since it's a subtle design choice.

5. WatchStopOptions.leaseToken is optional in the type but semantically required for persistent watches

The WatchStopOptions interface (types.ts:823-825) has leaseToken?: string, which makes it easy for callers to forget the token. The error only surfaces at runtime. Consider documenting this requirement in the JSDoc, or adding a PersistentWatchStopOptions with a required leaseToken field.
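
A sketch of the stricter type, assuming only the `leaseToken?: string` field quoted above. `PersistentWatchStopOptions` is the name proposed in this review; the `toPersistentStopOptions` helper is hypothetical and shows how a boundary could narrow the loose type to the strict one.

```typescript
// As quoted in the review: the token is optional in the general type.
interface WatchStopOptions {
  leaseToken?: string;
}

// Proposed variant: the token is required, so omissions fail at compile time.
interface PersistentWatchStopOptions {
  leaseToken: string;
}

// Hypothetical helper: narrows loose options at an API boundary and
// fails fast with a clear message when the token is missing.
function toPersistentStopOptions(
  options: WatchStopOptions
): PersistentWatchStopOptions {
  if (options.leaseToken === undefined || options.leaseToken.length === 0) {
    throw new Error("leaseToken is required to stop a persistent watch");
  }
  return { leaseToken: options.leaseToken };
}
```

Code paths that only ever stop persistent watches could then accept `PersistentWatchStopOptions` directly, leaving the runtime check for untyped input.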

6. E2E test helpers use wrong API paths (file-watch-workflow.test.ts:250-268)

The createFile, createDir, and deleteFile helpers in the E2E test use paths like /api/file/write, /api/file/mkdir, /api/file/delete — but looking at the test-worker (index.ts), the actual endpoints are /api/write, /api/mkdir, /api/delete (no /file/ prefix). These helpers will 404.

Unless there's middleware in the test worker I'm missing, these E2E tests won't work as written. This needs verification.

Minor / Nits

- The WatchState.expiresAt is string | null — consider making it consistently null when not persistent, and only setting it to a non-null ISO string when the idle timer is active. Currently it looks correct, just noting it for documentation.
- The changeset description is good — user-focused and explains the use case.
- PERSISTENT_WATCH_IDLE_TTL_MS at 10 minutes is reasonable. Consider whether this should be configurable (probably not for v1, but worth a TODO).

What looks good

- The Deferred pattern with readyState tracking is clean and avoids double-resolve bugs
- Event coalescing with pendingEvents Map keyed by event identity is a good optimization
- The stopPromise deduplication prevents concurrent stop races
- Lease token verification is consistent across checkpoint/stop paths (modulo the race noted above)
- Good test coverage of the checkpoint cursor semantics (match vs stale)
- The broadcastTerminalEvent / closeSubscriber separation handles the subscriber lifecycle correctly

ask-bonk · 2026-03-13T16:44:40Z

Review is complete. I posted:

- A detailed top-level review comment covering 6 issues found across the PR, from a race condition in stopWatch to minor cleanup items, plus positive callouts for what's well-done.
- Three inline comments on specific lines:
  - watch-service.ts:221 - Race condition where stopWatch doesn't await watch.ready.promise before lease verification (unlike checkpointWatch, which does)
  - watch-service.ts:260 - Redundant normalizePatterns calls
  - file-watch-workflow.test.ts:436 - Suggestion for negative E2E test cases

The most actionable issue is the stopWatch race condition -- if a caller gets a lease token from ensureWatch and immediately calls stopWatch before the watch finishes initializing, the lease check will fail because watch.persistent hasn't been set to true yet. The fix is straightforward: add the same await watch.ready.promise guard that checkpointWatch already has.

devin-ai-integration

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no bugs or issues to report.

scuffi

nice! 🚀

ghostwriternr

Super clean. Good call to refactor the existing watch layer to support this too.

Forward persistent watch owner IDs

b970fd5

Pass ownerId through Sandbox.ensureWatch so persistent watches keep their ownership metadata and the reconnect workflow can validate the same consumer across ack and stop operations.

ask-bonk bot reviewed Mar 13, 2026

Polish persistent watch compatibility

aa31d8c

Clarify stopWatch token validation, remove redundant key normalization work, and normalize legacy watch responses so clients still see `changed` while older paths return `dirty`.

whoiskatrin marked this pull request as ready for review March 13, 2026 18:19

whoiskatrin requested review from aron-cf, ghostwriternr and scuffi as code owners March 13, 2026 18:19

devin-ai-integration bot reviewed Mar 13, 2026

scuffi previously approved these changes Mar 16, 2026

Refocus file watches on change checks

47d0d30

Background consumers only need to know whether a path changed while disconnected. Replace the lease-based persistent watch API with checkChanges() so callers store one version token and choose whether to skip work, sync incrementally, or rescan.

whoiskatrin dismissed scuffi’s stale review via 47d0d30 March 17, 2026 14:39

whoiskatrin changed the title from "Add persistent file watch state" to "Add disconnected file change checks" Mar 17, 2026

ghostwriternr added 2 commits April 1, 2026 12:04

Merge remote-tracking branch 'origin/main' into kate/watch-state-design

16e91f2

Use canonical log events in WatchService

e736ac1

ghostwriternr approved these changes Apr 1, 2026

ghostwriternr enabled auto-merge (squash) April 1, 2026 12:33

ghostwriternr merged commit fdd3efa into main Apr 1, 2026
20 checks passed

ghostwriternr deleted the kate/watch-state-design branch April 1, 2026 12:45

sandy-bonk bot mentioned this pull request Apr 1, 2026

Version Packages #548

Merged
