CS-10623: Reimplement boxel realm watch command #4554
Conversation
Polls a realm's `_mtimes` endpoint, accumulates changes between ticks, and applies them in a debounced batch — downloading new and modified files, removing locally whatever has been deleted remotely, and writing a checkpoint. Reuses the `RealmSyncBase` + `CheckpointManager` + sync-manifest plumbing the other realm commands share, and accepts `RealmAuthenticator` so both the profile flow and `--realm-secret-seed` work.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
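Read together with the code excerpts quoted in the review below, the overall shape is roughly the following sketch; the method names mirror those excerpts, while the timer handling here is illustrative rather than the exact implementation:

```ts
// Rough shape of the watch loop: poll() accumulates pending changes, and each
// batch of newly seen changes re-arms a debounce timer that applies them.
class WatcherSketch {
  private pendingChanges = new Map<
    string,
    { status: 'added' | 'modified' | 'deleted'; mtime: number }
  >();
  private flushTimer: ReturnType<typeof setTimeout> | null = null;

  constructor(private debounceMs: number) {}

  async tick(): Promise<void> {
    const hasNewChanges = await this.poll(); // diff remote _mtimes against last known state
    if (hasNewChanges) this.scheduleFlush();
  }

  private scheduleFlush(): void {
    // Debounce: reset the timer whenever a poll finds more changes, so a burst
    // of remote edits is downloaded and checkpointed as one batch.
    if (this.flushTimer) clearTimeout(this.flushTimer);
    this.flushTimer = setTimeout(() => void this.flushPending(), this.debounceMs);
  }

  private async poll(): Promise<boolean> {
    // fetch _mtimes, record added/modified/deleted files into pendingChanges
    return this.pendingChanges.size > 0;
  }

  private async flushPending(): Promise<void> {
    // download new/modified files, unlink deleted ones, write a checkpoint
  }
}
```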
Pull request overview
This PR introduces a new boxel realm watch command implementation in the monorepo Boxel CLI, along with integration tests, to continuously poll a realm for remote changes and pull them into a local directory (with checkpoint creation).
Changes:
- Added `packages/boxel-cli/src/commands/realm/watch.ts` implementing `RealmWatcher`, `watchRealms()`, and CLI registration for `boxel realm watch`.
- Added integration tests covering initial sync, modify/delete detection, debounced batching, abort handling, and error cases.
- Wired the new `watch` subcommand into the `realm` command group.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| packages/boxel-cli/src/commands/realm/watch.ts | Implements polling-based realm watcher, debounced apply/flush, checkpointing, and CLI command registration. |
| packages/boxel-cli/tests/integration/realm-watch.test.ts | Adds end-to-end integration coverage for watcher behaviors and error handling. |
| packages/boxel-cli/src/commands/realm/index.ts | Registers the new watch subcommand under boxel realm. |
```ts
const intervalId = setInterval(tickAll, intervalMs);

await new Promise<void>((resolve) => {
  let stopped = false;
  const cleanup = () => {
    if (stopped) return;
    stopped = true;
    clearInterval(intervalId);
```
setInterval(tickAll, intervalMs) will invoke tickAll again even if the previous async run hasn’t finished, causing overlapping polls (and potentially overlapping scheduleFlush timers) for the same watcher. This can amplify the lost-update race on pendingChanges and increase load. Prefer a self-scheduling loop that awaits tickAll() (e.g., while + await sleep(intervalMs)), or track an inFlight flag/promise to prevent reentrancy.
Suggested change, replacing:

```ts
const intervalId = setInterval(tickAll, intervalMs);
await new Promise<void>((resolve) => {
  let stopped = false;
  const cleanup = () => {
    if (stopped) return;
    stopped = true;
    clearInterval(intervalId);
```

with:

```ts
let stopped = false;
let timeoutId: ReturnType<typeof setTimeout> | null = null;
const scheduleNextTick = () => {
  if (stopped) return;
  timeoutId = setTimeout(async () => {
    if (stopped) return;
    await tickAll();
    scheduleNextTick();
  }, intervalMs);
};
scheduleNextTick();
await new Promise<void>((resolve) => {
  const cleanup = () => {
    if (stopped) return;
    stopped = true;
    if (timeoutId !== null) {
      clearTimeout(timeoutId);
      timeoutId = null;
    }
```
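For reference, the other approach the comment mentions (a loop that awaits `tickAll()` and then sleeps) could look roughly like the sketch below; the `isStopped` flag and `sleep` helper are assumptions, not code from the PR:

```ts
// Self-scheduling loop: the next tick is only started after the previous one
// has fully completed, so polls can never overlap.
const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

async function runLoop(
  tickAll: () => Promise<void>,
  intervalMs: number,
  isStopped: () => boolean, // hypothetical stop flag supplied by the caller
): Promise<void> {
  while (!isStopped()) {
    await tickAll();
    if (isStopped()) break;
    await sleep(intervalMs);
  }
}
```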
```ts
private async persistManifest(): Promise<void> {
  const manifest: SyncManifest = {
    realmUrl: this.normalizedRealmUrl,
    files: {},
    remoteMtimes: {},
  };
  for (const [file, mtime] of this.lastKnownMtimes) {
    const localPath = path.join(this.options.localDir, file);
    try {
      const hash = await computeFileHash(localPath);
      manifest.files[file] = hash;
      if (mtime !== 0) {
        manifest.remoteMtimes![file] = mtime;
      }
    } catch (err: any) {
      if (err.code !== 'ENOENT') throw err;
    }
```
persistManifest() recomputes an md5 for every file in lastKnownMtimes on every flush (and computeFileHash reads the full file into memory). For large realms this makes each applied batch O(total files) even if only one file changed. Consider incrementally updating the prior manifest (update hashes only for pulled/deleted files) or storing only remoteMtimes for watch so applying a batch is O(changed files).
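A sketch of the incremental variant (roughly what a later cleanup commit in this PR describes): only files from the current batch are touched. `loadManifest` / `saveManifest` are hypothetical helpers standing in for however the manifest is read and written; the other names come from the excerpt above.

```ts
// Sketch: update the previous manifest rather than rehashing every known file.
private async persistManifest(pulled: string[], deleted: string[]): Promise<void> {
  const manifest: SyncManifest =
    (await this.loadManifest()) ??
    { realmUrl: this.normalizedRealmUrl, files: {}, remoteMtimes: {} };

  for (const file of deleted) {
    delete manifest.files[file]; // entry is gone locally and remotely
  }
  for (const file of pulled) {
    // Only files downloaded in this batch are (re)hashed: O(changed files).
    manifest.files[file] = await computeFileHash(path.join(this.options.localDir, file));
  }
  // Remote mtimes are already tracked in memory; copy them wholesale.
  manifest.remoteMtimes = Object.fromEntries(this.lastKnownMtimes);

  await this.saveManifest(manifest);
}
```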
```ts
.argument(
  '<realm-url>',
  'The URL of the realm to watch (e.g., https://app.boxel.ai/demo/)',
)
.argument('<local-dir>', 'The local directory to write changes into')
.option(
```
PR description calls for multi-realm support (“accept multiple positional realms, run independent loops concurrently”), but the CLI wiring here only accepts a single <realm-url> and <local-dir> pair and passes a single spec into watchRealms. If multi-realm is still intended, the command signature and argument parsing need to be updated accordingly (e.g., variadic realm URLs and a strategy for local dirs per realm).
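Purely for illustration (the PR ultimately keeps the single-realm signature), variadic wiring with commander could look like the sketch below; the `watchRealms(specs)` call shape follows the description above, and the per-realm subdirectory strategy is an assumption:

```ts
// Hypothetical multi-realm signature: the last positional is variadic, and each
// realm syncs into a subdirectory of one local root.
command
  .argument('<local-dir>', 'Local root directory; each realm gets a subdirectory')
  .argument('<realm-urls...>', 'One or more realm URLs to watch')
  .action(async (localDir: string, realmUrls: string[]) => {
    const specs = realmUrls.map((realmUrl) => ({
      realmUrl,
      localDir: path.join(localDir, new URL(realmUrl).pathname),
    }));
    await watchRealms(specs); // assumes the existing array-of-specs API
  });
```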
```ts
async poll(): Promise<boolean> {
  const remoteMtimes = await this.getRemoteMtimes();
  let hasNewChanges = false;

  for (const [file, mtime] of remoteMtimes) {
    if (isProtectedFile(file)) continue;
```
poll() relies on RealmSyncBase.getRemoteMtimes(), which swallows fetch/parsing errors and returns an empty map. In a watcher, an empty mtimes result will be interpreted as “everything deleted” (see deletion loop below) and can wipe the local directory on transient network/auth issues. Consider overriding mtimes fetch here (or adding a “throw on error” mode) so polling failures don’t translate into destructive deletes.
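A sketch of the throw-on-error override the comment asks for (and that a later commit describes). The method name `getRemoteMtimes` comes from the excerpts above; `requestMtimes` and the response shape are assumptions:

```ts
// Sketch: surface _mtimes fetch failures instead of returning an empty map, so a
// transient network/auth error is never read as "every file was deleted remotely".
protected override async getRemoteMtimes(): Promise<Map<string, number>> {
  const response = await this.requestMtimes(); // hypothetical helper hitting the _mtimes endpoint
  if (!response.ok) {
    throw new Error(
      `Failed to poll _mtimes for ${this.normalizedRealmUrl}: ${response.status} ${response.statusText}`,
    );
  }
  const mtimes = (await response.json()) as Record<string, number>; // payload shape assumed
  return new Map(Object.entries(mtimes));
}
```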
```ts
if (!remoteMtimes.has(file) && !this.pendingChanges.has(file)) {
  this.pendingChanges.set(file, { status: 'deleted', mtime: 0 });
  hasNewChanges = true;
```
Deletion detection skips files that are already in pendingChanges (!this.pendingChanges.has(file)). If a file is pending as added/modified and then gets deleted remotely before the debounced flush runs, the watcher will keep the non-deleted pending status and try to download a now-missing file (likely 404), instead of switching the pending change to deleted. Update this logic to override any existing pending change to deleted when the file disappears remotely (unless it’s already marked deleted).
Suggested change, replacing:

```ts
if (!remoteMtimes.has(file) && !this.pendingChanges.has(file)) {
  this.pendingChanges.set(file, { status: 'deleted', mtime: 0 });
  hasNewChanges = true;
```

with:

```ts
if (!remoteMtimes.has(file)) {
  const pending = this.pendingChanges.get(file);
  if (pending?.status !== 'deleted') {
    this.pendingChanges.set(file, { status: 'deleted', mtime: 0 });
    hasNewChanges = true;
  }
```
```ts
for (const [file, info] of this.pendingChanges) {
  if (info.status === 'deleted') {
    const localPath = path.join(this.options.localDir, file);
    try {
      await fs.unlink(localPath);
    } catch (err: any) {
      if (err.code !== 'ENOENT') throw err;
    }
    deleted.push(file);
    changes.push({ file, status: 'deleted' });
  } else {
    const localPath = path.join(this.options.localDir, file);
    await this.downloadFile(file, localPath);
    pulled.push(file);
    changes.push({ file, status: info.status });
  }
}

for (const [file, info] of this.pendingChanges) {
  if (info.status === 'deleted') {
    this.lastKnownMtimes.delete(file);
  } else {
    this.lastKnownMtimes.set(file, info.mtime);
  }
}
this.pendingChanges.clear();
```
flushPending() applies changes while poll() can still be running (via setInterval(tickAll, ...)) and while new polls can append to pendingChanges. Because flushPending() iterates over this.pendingChanges and then calls this.pendingChanges.clear(), any changes recorded during an in-flight flush can be lost. Consider serializing poll/flush per watcher (e.g., a per-watcher mutex / in-flight promise), or snapshot+delete only the keys being applied so concurrently-recorded changes aren’t dropped.
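The snapshot approach the comment suggests (and that the follow-up commit below adopts) could look roughly like this; only the start of `flushPending()` changes, and the apply loop stays as excerpted above:

```ts
// Sketch: snapshot the pending map and clear it before any I/O, so changes a
// concurrent poll() records during the flush roll into the next batch instead
// of being dropped by a trailing clear().
private async flushPending(): Promise<void> {
  const batch = new Map(this.pendingChanges);
  this.pendingChanges.clear();

  for (const [file, info] of batch) {
    // ...apply the delete/download logic shown above, reading from `batch`...
  }
}
```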
Captures the five-commit plan for addressing the review feedback on PR #4554: correctness fixes (poll-error swallowing, flush/poll race, setInterval reentrancy, pending→delete transition), minimal lock file, PR description cleanup, code cleanups (option typing, persistManifest incrementality, duplicate _mtimes probe, TTY colors), and nits.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Override `getRemoteMtimes` so poll failures throw instead of returning an empty map. The base swallow-and-empty behavior is fine for `pull` but in the watcher it would be read as "every file was deleted remotely" and wipe the local directory on a transient network blip.
- Snapshot `pendingChanges` and clear it before any I/O in `flushPending`. Anything an interleaved `poll()` records during the flush now rolls into the next flush instead of being dropped by the trailing `clear()`.
- Replace `setInterval(tickAll, intervalMs)` with a self-scheduling `setTimeout` chain. Two ticks can no longer overlap, eliminating a reentrancy that compounded the flush/poll race above.
- When a previously-known file is missing from `_mtimes`, override any non-`deleted` pending entry to `deleted` instead of skipping the deletion sweep. Previously, a pending add/modify followed by a remote delete would try to download a 404'd file at flush time.

Adds two integration tests: one verifying a poll error doesn't delete local files, and one verifying that a remote delete supersedes a pending modify. Addresses Copilot review comments on PR #4554 lines 144, 161, 208, and 417.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Each `watchRealms()` call now writes `.boxel-watch.lock` (containing pid, start time, realm URL) into every spec.localDir before constructing watchers, and removes it during shutdown. A second `watchRealms()` against the same localDir returns an error referencing the running pid; a stale lock from a non-existent pid is detected via process.kill(pid, 0) and overwritten with a notice. Cross-command coordination (pull/push/sync warning when watch is active) is intentionally out of scope of this PR — that's a separate ticket. Two integration tests cover both the live-pid block and the stale-lock overwrite paths.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
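A minimal sketch of the lock-file handshake described above. The file name and fields (pid, start time, realm URL) come from the commit message; the JSON layout, helper name, and error wording are assumptions:

```ts
import * as fs from 'fs/promises';
import * as path from 'path';

const LOCK_FILE = '.boxel-watch.lock';

interface WatchLock {
  pid: number;
  startedAt: string;
  realmUrl: string;
}

// Returns an error message if another live watch holds the lock; otherwise
// writes (or overwrites a stale) lock file and returns null.
async function acquireWatchLock(localDir: string, realmUrl: string): Promise<string | null> {
  const lockPath = path.join(localDir, LOCK_FILE);
  try {
    const existing: WatchLock = JSON.parse(await fs.readFile(lockPath, 'utf8'));
    try {
      process.kill(existing.pid, 0); // signal 0: existence check only, throws if pid is gone
      return `Another boxel realm watch (pid ${existing.pid}) is already watching ${localDir}`;
    } catch {
      // pid no longer exists: stale lock, fall through and overwrite it
    }
  } catch {
    // no lock file (or unreadable): treat as not locked
  }
  const lock: WatchLock = { pid: process.pid, startedAt: new Date().toISOString(), realmUrl };
  await fs.writeFile(lockPath, JSON.stringify(lock, null, 2));
  return null;
}
```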
The CLI passes a single spec; the array-of-specs shape on `watchRealms()` exists for programmatic / test use. Make the single-tenant authenticator resolution explicit in the doc comment so future multi-realm callers know they must use realms that share a profile / secret seed. Companion change: PR description updated via `gh pr edit` to drop the "Multi-realm support" claim and resolve the `boxel stop` open question now that the lock file landed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Four small cleanups, no behavior change at the CLI surface:
- Drop `WatcherInternalOptions`. `RealmWatcher` now passes plain
`{ realmUrl, localDir }` to `super()` and keeps `debounceMs`/`quiet`
only as instance fields, matching the `pullOptions`/`pushOptions`
shape used by the sibling sync commands.
- Make `persistManifest` O(changed files): load the prior manifest,
delete the entries for `deleted`, rehash `pulled`, copy
`lastKnownMtimes` for `remoteMtimes`. Previously every applied
batch rehashed every file in `lastKnownMtimes`.
- Replace the explicit `_mtimes` probe in `initialize()` with a
call to `getRemoteMtimes()` — the override added in commit 1
already throws on access failure, so the duplicated probe code
was redundant.
- Make `lib/colors.ts` TTY-aware. Constants resolve to empty strings
  when `process.stdout.isTTY` is false or `NO_COLOR` is set, so
  `boxel realm watch ... > log.txt` no longer captures raw ANSI
  escapes. Affects every command that imports from `lib/colors.ts`
  (a sketch follows this commit message).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
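A minimal sketch of the TTY-aware color constants described in the last bullet above; only `process.stdout.isTTY` and `NO_COLOR` come from the commit message, the exported names are illustrative:

```ts
// lib/colors.ts sketch: resolve to real ANSI codes only when stdout is a TTY
// and NO_COLOR is unset, so redirected output contains no escape sequences.
const enabled = Boolean(process.stdout.isTTY) && !process.env.NO_COLOR;

const code = (n: number) => (enabled ? `\u001b[${n}m` : '');

export const reset = code(0);
export const bold = code(1);
export const green = code(32);
export const yellow = code(33);
export const cyan = code(36);
```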
- `localDirs: string[]` is `const` (the binding never gets reassigned).
- Reflow two prettier complaints introduced by the cleanup commit.
- Remove the plan doc per the project's plan-then-implement convention (matches how the original plan was rolled into c6076cf).

`pnpm --filter @cardstack/boxel-cli lint` now exits 0.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
- Reimplements `boxel watch` from the standalone `cardstack/boxel-cli` into `packages/boxel-cli/src/commands/realm/watch.ts` under the `realm` group (was top-level `boxel watch` in the legacy CLI). The Claude Code plugin's skill copy needs to use `boxel realm watch`.
- Options: `-i, --interval <seconds>` (default 30), `-d, --debounce <seconds>` (default 5), `-q, --quiet`; positional arguments `<realm-url> <local-dir>` (example invocation below).
- The internal `watchRealms()` API takes an array of specs to keep room for a future multi-realm CLI; today's CLI passes a single spec and the resolved authenticator is shared, so multi-realm callers must use realms that share a profile / secret seed.
- `boxel stop` — resolved. The monorepo's `watch` runs in the foreground, so SIGINT (Ctrl+C) is sufficient and a separate `stop` command isn't needed. A `.boxel-watch.lock` file is written into the local-dir while a watch is active so a second `boxel realm watch` against the same dir refuses to start (and overwrites a stale lock from a non-running pid). Cross-command coordination (pull/push/sync warning when watch is active) is intentionally out of scope here — that's a separate ticket if/when wanted.
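For example (illustrative values; the realm URL reuses the example from the option help text), a watch with a 10-second poll interval and a 2-second debounce would be:

```sh
boxel realm watch https://app.boxel.ai/demo/ ./demo-realm -i 10 -d 2
```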
Linear
CS-10623 — blocks Claude Code plugin marketplace submission in CS-10900
Plan doc
See `docs/cs-10623-boxel-realm-watch-plan.md` in this PR (added in the planning commit, will be removed with the final cleanup commit).
Test plan
- `pnpm --filter @cardstack/boxel-cli test:integration` passes (12 watch cases: add/modify/delete/burst/loop/abort/error paths/poll-error-doesn't-delete/pending-modify→delete-supersedes/lock-blocks-second-watch/stale-lock-overwrite)
- `pnpm --filter @cardstack/boxel-cli build` succeeds
- `boxel realm watch --help` documents `-i`, `-d`, `-q`
- `boxel realm watch <staging-url>`, edit a card via Boxel web UI, confirm local pull within ~30s and a `[remote]` checkpoint is created.
Depends on
`CheckpointManager` directly.

🤖 Generated with Claude Code