CS-10623: Reimplement boxel realm watch command#4554

Draft
FadhlanR wants to merge 8 commits into main from cs-10623-reimplement-boxel-realm-watch-command

FadhlanR commented Apr 28, 2026

Summary

  • Port boxel watch from the standalone cardstack/boxel-cli into packages/boxel-cli/src/commands/realm/watch.ts
  • The command is namespaced under the realm group (it was top-level boxel watch in the legacy CLI), so the Claude Code plugin's skill copy needs to use boxel realm watch.
  • Options: -i, --interval <seconds> (default 30), -d, --debounce <seconds> (default 5), -q, --quiet
  • Single realm at the CLI surface (<realm-url> <local-dir>). The internal watchRealms() API takes an array of specs to keep room for a future multi-realm CLI; today's CLI passes a single spec and the resolved authenticator is shared, so multi-realm callers must use realms that share a profile / secret seed.
  • Marquee Claude-Code workflow: collaborate on a card while teammates edit in the web UI.

boxel stop — resolved

The monorepo's watch runs in the foreground, so SIGINT (Ctrl+C) is sufficient and a separate stop command isn't needed. A .boxel-watch.lock file is written into the local-dir while a watch is active so a second boxel realm watch against the same dir refuses to start (and overwrites a stale lock from a non-running pid). Cross-command coordination (pull/push/sync warning when watch is active) is intentionally out of scope here — that's a separate ticket if/when wanted.
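
The lock behavior described above could be sketched roughly as follows. This is a minimal illustration, not the code in watch.ts: `acquireLock`, `releaseLock`, the `WatchLock` field names, and the example URL are hypothetical, and the liveness probe assumes `process.kill(pid, 0)`, as the lock-file commit message later in this PR describes.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical lock payload; the real field names may differ.
interface WatchLock {
  pid: number;
  startedAt: string;
  realmUrl: string;
}

const LOCK_NAME = ".boxel-watch.lock";

// Signal 0 delivers nothing; it only reports whether the pid exists.
function isProcessAlive(pid: number): boolean {
  try {
    process.kill(pid, 0);
    return true;
  } catch (err) {
    // EPERM: the process exists but we are not allowed to signal it.
    return (err as { code?: string }).code === "EPERM";
  }
}

function acquireLock(
  localDir: string,
  realmUrl: string,
  alive: (pid: number) => boolean = isProcessAlive,
): string {
  const lockPath = path.join(localDir, LOCK_NAME);
  if (fs.existsSync(lockPath)) {
    let holder: WatchLock | undefined;
    try {
      holder = JSON.parse(fs.readFileSync(lockPath, "utf8")) as WatchLock;
    } catch {
      holder = undefined; // an unreadable lock is treated as stale
    }
    if (holder && alive(holder.pid)) {
      throw new Error(`watch already running in ${localDir} (pid ${holder.pid})`);
    }
    console.warn(`overwriting stale lock at ${lockPath}`);
  }
  const lock: WatchLock = {
    pid: process.pid,
    startedAt: new Date().toISOString(),
    realmUrl,
  };
  fs.writeFileSync(lockPath, JSON.stringify(lock));
  return lockPath;
}

function releaseLock(lockPath: string): void {
  fs.rmSync(lockPath, { force: true });
}

// Usage: a second acquire against the same directory is refused.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "boxel-watch-"));
const demoLock = acquireLock(demoDir, "https://example.test/demo/");
let refused = false;
try {
  acquireLock(demoDir, "https://example.test/demo/");
} catch {
  refused = true;
}
releaseLock(demoLock);
```

The check-then-write sequence here is not atomic; a stricter version might create the file with the `wx` flag so two processes starting at the same moment cannot both succeed.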

Linear

CS-10623 — blocks Claude Code plugin marketplace submission in CS-10900

Plan doc

See docs/cs-10623-boxel-realm-watch-plan.md in this PR. It was added in the planning commit and will be removed in the final cleanup commit.

Test plan

  • pnpm --filter @cardstack/boxel-cli test:integration passes (12 watch cases: add/modify/delete/burst/loop/abort/error paths/poll-error-doesn't-delete/pending-modify→delete-supersedes/lock-blocks-second-watch/stale-lock-overwrite)
  • pnpm --filter @cardstack/boxel-cli build succeeds
  • boxel realm watch --help documents -i, -d, -q
  • Manual: boxel realm watch <staging-url>, edit a card via Boxel web UI, confirm local pull within ~30s and a [remote] checkpoint is created.

Depends on

  • CS-10625 for checkpoint creation on detected changes — already merged when this implementation landed; uses CheckpointManager directly.

🤖 Generated with Claude Code

FadhlanR and others added 2 commits April 28, 2026 14:09
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Polls a realm's `_mtimes` endpoint, accumulates changes between ticks,
and applies them in a debounced batch — downloading new/modified files,
removing locally what's gone remote, and writing a checkpoint. Reuses
the `RealmSyncBase` + `CheckpointManager` + sync-manifest plumbing the
other realm commands share, and accepts `RealmAuthenticator` so both
the profile flow and `--realm-secret-seed` work.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
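
As a rough illustration of the "accumulate changes between ticks" step in the commit message above, the sketch below diffs one poll's mtimes against the last known state and merges the result into a pending map. `recordChanges` and the type names are hypothetical, not taken from watch.ts, and the sketch assumes a missing file overrides any queued add/modify, which is the behavior the review later asks for.

```typescript
type Status = "added" | "modified" | "deleted";

interface PendingChange {
  status: Status;
  mtime: number;
}

// Merge one poll result into `pending`; returns true if anything new was queued.
function recordChanges(
  known: Map<string, number>,
  remote: Map<string, number>,
  pending: Map<string, PendingChange>,
): boolean {
  let hasNew = false;

  for (const [file, mtime] of remote) {
    const previous = known.get(file);
    if (previous === mtime) continue; // unchanged since the last applied batch
    const queued = pending.get(file);
    // Already queued at this exact mtime: nothing new to record.
    if (queued && queued.status !== "deleted" && queued.mtime === mtime) continue;
    pending.set(file, {
      status: previous === undefined ? "added" : "modified",
      mtime,
    });
    hasNew = true;
  }

  for (const file of known.keys()) {
    // A known file that vanished remotely supersedes any queued modify.
    if (!remote.has(file) && pending.get(file)?.status !== "deleted") {
      pending.set(file, { status: "deleted", mtime: 0 });
      hasNew = true;
    }
  }

  return hasNew;
}

// Two ticks in a row: a.json is modified, then deleted before any flush runs.
const known = new Map([["a.json", 1], ["b.json", 1]]);
const pending = new Map<string, PendingChange>();
const tick1 = recordChanges(known, new Map([["a.json", 2], ["b.json", 1], ["c.json", 5]]), pending);
const tick2 = recordChanges(known, new Map([["b.json", 1], ["c.json", 5]]), pending);
const tick3 = recordChanges(known, new Map([["b.json", 1], ["c.json", 5]]), pending);
```

A debounced flush would then apply whatever is in `pending` once no new changes have arrived for the debounce window; that timer is omitted here.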

Copilot AI left a comment

Pull request overview

This PR introduces a new boxel realm watch command implementation in the monorepo Boxel CLI, along with integration tests, to continuously poll a realm for remote changes and pull them into a local directory (with checkpoint creation).

Changes:

  • Added packages/boxel-cli/src/commands/realm/watch.ts implementing RealmWatcher, watchRealms(), and CLI registration for boxel realm watch.
  • Added integration tests covering initial sync, modify/delete detection, debounced batching, abort handling, and error cases.
  • Wired the new watch subcommand into the realm command group.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| packages/boxel-cli/src/commands/realm/watch.ts | Implements polling-based realm watcher, debounced apply/flush, checkpointing, and CLI command registration. |
| packages/boxel-cli/tests/integration/realm-watch.test.ts | Adds end-to-end integration coverage for watcher behaviors and error handling. |
| packages/boxel-cli/src/commands/realm/index.ts | Registers the new watch subcommand under boxel realm. |

Comment on lines +410 to +417
const intervalId = setInterval(tickAll, intervalMs);

await new Promise<void>((resolve) => {
  let stopped = false;
  const cleanup = () => {
    if (stopped) return;
    stopped = true;
    clearInterval(intervalId);

Copilot AI Apr 30, 2026

setInterval(tickAll, intervalMs) will invoke tickAll again even if the previous async run hasn’t finished, causing overlapping polls (and potentially overlapping scheduleFlush timers) for the same watcher. This can amplify the lost-update race on pendingChanges and increase load. Prefer a self-scheduling loop that awaits tickAll() (e.g., while + await sleep(intervalMs)), or track an inFlight flag/promise to prevent reentrancy.

Suggested change
- const intervalId = setInterval(tickAll, intervalMs);
- await new Promise<void>((resolve) => {
-   let stopped = false;
-   const cleanup = () => {
-     if (stopped) return;
-     stopped = true;
-     clearInterval(intervalId);
+ let stopped = false;
+ let timeoutId: ReturnType<typeof setTimeout> | null = null;
+ const scheduleNextTick = () => {
+   if (stopped) return;
+   timeoutId = setTimeout(async () => {
+     if (stopped) return;
+     await tickAll();
+     scheduleNextTick();
+   }, intervalMs);
+ };
+ scheduleNextTick();
+ await new Promise<void>((resolve) => {
+   const cleanup = () => {
+     if (stopped) return;
+     stopped = true;
+     if (timeoutId !== null) {
+       clearTimeout(timeoutId);
+       timeoutId = null;
+     }
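
The comment also mentions a `while` + `await sleep(intervalMs)` alternative. A minimal sketch of that variant is below; `runPollLoop` and the abortable `sleep` helper are hypothetical names, and driving shutdown through an `AbortSignal` is an assumption for illustration rather than how watch.ts is wired.

```typescript
// Resolves after `ms`, or immediately once the signal aborts.
function sleep(ms: number, signal: AbortSignal): Promise<void> {
  return new Promise((resolve) => {
    if (signal.aborted) {
      resolve();
      return;
    }
    const onAbort = () => {
      clearTimeout(timer);
      resolve();
    };
    const timer = setTimeout(() => {
      signal.removeEventListener("abort", onAbort);
      resolve();
    }, ms);
    signal.addEventListener("abort", onAbort, { once: true });
  });
}

// Waits, ticks, and only then waits again, so two ticks never overlap.
async function runPollLoop(
  tick: () => Promise<void>,
  intervalMs: number,
  signal: AbortSignal,
): Promise<void> {
  while (!signal.aborted) {
    await sleep(intervalMs, signal);
    if (signal.aborted) break;
    try {
      await tick(); // the next sleep starts only after this tick settles
    } catch (err) {
      console.error("tick failed; retrying on the next interval", err);
    }
  }
}
```

A caller could create an `AbortController`, pass `controller.signal` in, and call `controller.abort()` from a SIGINT handler. Because the sleep is abortable, shutdown does not have to wait out a full interval.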

Comment on lines +260 to +276
private async persistManifest(): Promise<void> {
  const manifest: SyncManifest = {
    realmUrl: this.normalizedRealmUrl,
    files: {},
    remoteMtimes: {},
  };
  for (const [file, mtime] of this.lastKnownMtimes) {
    const localPath = path.join(this.options.localDir, file);
    try {
      const hash = await computeFileHash(localPath);
      manifest.files[file] = hash;
      if (mtime !== 0) {
        manifest.remoteMtimes![file] = mtime;
      }
    } catch (err: any) {
      if (err.code !== 'ENOENT') throw err;
    }

Copilot AI Apr 30, 2026

persistManifest() recomputes an md5 for every file in lastKnownMtimes on every flush (and computeFileHash reads the full file into memory). For large realms this makes each applied batch O(total files) even if only one file changed. Consider incrementally updating the prior manifest (update hashes only for pulled/deleted files) or storing only remoteMtimes for watch so applying a batch is O(changed files).
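
One way the incremental update suggested here might look, as a sketch only: start from the prior manifest, drop deleted entries, and rehash just the pulled files. `updateManifest` and the injected `hashFile` callback are hypothetical; the manifest shape mirrors the snippet above, and hashing is synchronous here purely to keep the example short (the real `computeFileHash` is awaited).

```typescript
interface SyncManifest {
  realmUrl: string;
  files: Record<string, string>;
  remoteMtimes?: Record<string, number>;
}

// Hashing cost is proportional to `pulled`, not to the size of the realm.
function updateManifest(
  prior: SyncManifest,
  pulled: string[],
  deleted: string[],
  lastKnownMtimes: Map<string, number>,
  hashFile: (file: string) => string,
): SyncManifest {
  const files = { ...prior.files }; // in-memory copy, no file reads
  for (const file of deleted) delete files[file];
  for (const file of pulled) files[file] = hashFile(file);

  const remoteMtimes: Record<string, number> = {};
  for (const [file, mtime] of lastKnownMtimes) {
    if (mtime !== 0) remoteMtimes[file] = mtime;
  }
  return { realmUrl: prior.realmUrl, files, remoteMtimes };
}

// Usage: one file pulled, one deleted, one untouched.
let hashCalls = 0;
const nextManifest = updateManifest(
  {
    realmUrl: "https://example.test/demo/",
    files: { "a.json": "old-a", "b.json": "old-b", "c.json": "old-c" },
  },
  ["a.json"],
  ["b.json"],
  new Map([["a.json", 10], ["c.json", 5]]),
  (file) => {
    hashCalls++;
    return `hash:${file}`;
  },
);
```

The spread copy is still linear in the number of entries, but it touches no files on disk, which is where the cost the comment describes comes from.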

Comment on lines +496 to +501
.argument(
  '<realm-url>',
  'The URL of the realm to watch (e.g., https://app.boxel.ai/demo/)',
)
.argument('<local-dir>', 'The local directory to write changes into')
.option(

Copilot AI Apr 30, 2026

PR description calls for multi-realm support (“accept multiple positional realms, run independent loops concurrently”), but the CLI wiring here only accepts a single <realm-url> and <local-dir> pair and passes a single spec into watchRealms. If multi-realm is still intended, the command signature and argument parsing need to be updated accordingly (e.g., variadic realm URLs and a strategy for local dirs per realm).

Comment on lines +139 to +144
async poll(): Promise<boolean> {
  const remoteMtimes = await this.getRemoteMtimes();
  let hasNewChanges = false;

  for (const [file, mtime] of remoteMtimes) {
    if (isProtectedFile(file)) continue;

Copilot AI Apr 30, 2026

poll() relies on RealmSyncBase.getRemoteMtimes(), which swallows fetch/parsing errors and returns an empty map. In a watcher, an empty mtimes result will be interpreted as “everything deleted” (see deletion loop below) and can wipe the local directory on transient network/auth issues. Consider overriding mtimes fetch here (or adding a “throw on error” mode) so polling failures don’t translate into destructive deletes.
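
To make the failure mode concrete, here is a small sketch contrasting the two behaviors. `lenientMtimes` is a stand-in for the swallow-and-empty behavior described in the comment, not the actual `RealmSyncBase` code, and `deletionSweep` / `pollDeletions` are hypothetical helpers.

```typescript
type Mtimes = Map<string, number>;

// Stand-in for the swallow-and-empty behavior the comment describes.
async function lenientMtimes(fetcher: () => Promise<Mtimes>): Promise<Mtimes> {
  try {
    return await fetcher();
  } catch {
    return new Map();
  }
}

// Files we knew about that the remote no longer lists.
function deletionSweep(known: Mtimes, remote: Mtimes): string[] {
  return Array.from(known.keys()).filter((file) => !remote.has(file));
}

// Strict variant: a failed fetch skips the tick instead of producing deletes.
async function pollDeletions(
  known: Mtimes,
  fetcher: () => Promise<Mtimes>,
): Promise<string[]> {
  let remote: Mtimes;
  try {
    remote = await fetcher();
  } catch (err) {
    console.warn("poll failed; keeping local files", err);
    return [];
  }
  return deletionSweep(known, remote);
}
```

With a fetcher that rejects, the lenient path feeds an empty map into the sweep and every known file is marked for deletion, while the strict path marks none.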

Comment on lines +159 to +161
if (!remoteMtimes.has(file) && !this.pendingChanges.has(file)) {
  this.pendingChanges.set(file, { status: 'deleted', mtime: 0 });
  hasNewChanges = true;

Copilot AI Apr 30, 2026

Deletion detection skips files that are already in pendingChanges (!this.pendingChanges.has(file)). If a file is pending as added/modified and then gets deleted remotely before the debounced flush runs, the watcher will keep the non-deleted pending status and try to download a now-missing file (likely 404), instead of switching the pending change to deleted. Update this logic to override any existing pending change to deleted when the file disappears remotely (unless it’s already marked deleted).

Suggested change
- if (!remoteMtimes.has(file) && !this.pendingChanges.has(file)) {
-   this.pendingChanges.set(file, { status: 'deleted', mtime: 0 });
-   hasNewChanges = true;
+ if (!remoteMtimes.has(file)) {
+   const pending = this.pendingChanges.get(file);
+   if (pending?.status !== 'deleted') {
+     this.pendingChanges.set(file, { status: 'deleted', mtime: 0 });
+     hasNewChanges = true;
+   }

Comment on lines +183 to +208
for (const [file, info] of this.pendingChanges) {
  if (info.status === 'deleted') {
    const localPath = path.join(this.options.localDir, file);
    try {
      await fs.unlink(localPath);
    } catch (err: any) {
      if (err.code !== 'ENOENT') throw err;
    }
    deleted.push(file);
    changes.push({ file, status: 'deleted' });
  } else {
    const localPath = path.join(this.options.localDir, file);
    await this.downloadFile(file, localPath);
    pulled.push(file);
    changes.push({ file, status: info.status });
  }
}

for (const [file, info] of this.pendingChanges) {
  if (info.status === 'deleted') {
    this.lastKnownMtimes.delete(file);
  } else {
    this.lastKnownMtimes.set(file, info.mtime);
  }
}
this.pendingChanges.clear();

Copilot AI Apr 30, 2026

flushPending() applies changes while poll() can still be running (via setInterval(tickAll, ...)) and while new polls can append to pendingChanges. Because flushPending() iterates over this.pendingChanges and then calls this.pendingChanges.clear(), any changes recorded during an in-flight flush can be lost. Consider serializing poll/flush per watcher (e.g., a per-watcher mutex / in-flight promise), or snapshot+delete only the keys being applied so concurrently-recorded changes aren’t dropped.
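
The snapshot option mentioned here could look roughly like the sketch below: copy the pending map, clear the live one before any I/O, and apply only the copy. `PendingQueue` is a hypothetical wrapper used for illustration, not a class in watch.ts.

```typescript
interface Change {
  status: "added" | "modified" | "deleted";
  mtime: number;
}

class PendingQueue {
  private pending = new Map<string, Change>();

  record(file: string, change: Change): void {
    this.pending.set(file, change);
  }

  files(): string[] {
    return Array.from(this.pending.keys());
  }

  // Applies a snapshot; anything recorded while `apply` is awaiting
  // stays in the live map and is picked up by the next flush.
  async flush(
    apply: (file: string, change: Change) => Promise<void>,
  ): Promise<string[]> {
    const batch = new Map(this.pending);
    this.pending.clear();
    const applied: string[] = [];
    for (const [file, change] of batch) {
      await apply(file, change);
      applied.push(file);
    }
    return applied;
  }
}
```

One caveat with this shape: if `apply` throws partway through, the unapplied entries of the snapshot are gone from the queue, so a fuller version would probably re-queue them or rely on the next poll to rediscover them.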

FadhlanR and others added 6 commits May 5, 2026 15:29
Captures the five-commit plan for addressing the review feedback on
PR #4554: correctness fixes (poll-error swallowing, flush/poll race,
setInterval reentrancy, pending→delete transition), minimal lock file,
PR description cleanup, code cleanups (option typing, persistManifest
incrementality, duplicate _mtimes probe, TTY colors), and nits.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Override `getRemoteMtimes` so poll failures throw instead of
  returning an empty map. The base swallow-and-empty behavior is fine
  for `pull` but in the watcher it would be read as "every file was
  deleted remotely" and wipe the local directory on a transient
  network blip.
- Snapshot `pendingChanges` and clear it before any I/O in
  `flushPending`. Anything an interleaved `poll()` records during the
  flush now rolls into the next flush instead of being dropped by the
  trailing `clear()`.
- Replace `setInterval(tickAll, intervalMs)` with a self-scheduling
  `setTimeout` chain. Two ticks can no longer overlap, eliminating a
  reentrancy that compounded the flush/poll race above.
- When a previously-known file is missing from `_mtimes`, override
  any non-`deleted` pending entry to `deleted` instead of skipping
  the deletion sweep. Previously, a pending add/modify followed by a
  remote delete would try to download a 404'd file at flush time.

Adds two integration tests: one verifying a poll error doesn't
delete local files, and one verifying that a remote delete supersedes
a pending modify.

Addresses Copilot review comments on PR #4554 lines 144, 161, 208,
and 417.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Each `watchRealms()` call now writes `.boxel-watch.lock` (containing
pid, start time, realm URL) into every spec.localDir before constructing
watchers, and removes it during shutdown. A second `watchRealms()`
against the same localDir returns an error referencing the running pid;
a stale lock from a non-existent pid is detected via process.kill(pid, 0)
and overwritten with a notice.

Cross-command coordination (pull/push/sync warning when watch is active)
is intentionally out of scope of this PR — that's a separate ticket.

Two integration tests cover both the live-pid block and the stale-lock
overwrite paths.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The CLI passes a single spec; the array-of-specs shape on
`watchRealms()` exists for programmatic / test use. Make the
single-tenant authenticator resolution explicit in the doc comment so
future multi-realm callers know they must use realms that share a
profile / secret seed.

Companion change: PR description updated via `gh pr edit` to drop the
"Multi-realm support" claim and resolve the `boxel stop` open question
now that the lock file landed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Four small cleanups, no behavior change at the CLI surface:

- Drop `WatcherInternalOptions`. `RealmWatcher` now passes plain
  `{ realmUrl, localDir }` to `super()` and keeps `debounceMs`/`quiet`
  only as instance fields, matching the `pullOptions`/`pushOptions`
  shape used by the sibling sync commands.
- Make `persistManifest` O(changed files): load the prior manifest,
  delete the entries for `deleted`, rehash `pulled`, copy
  `lastKnownMtimes` for `remoteMtimes`. Previously every applied
  batch rehashed every file in `lastKnownMtimes`.
- Replace the explicit `_mtimes` probe in `initialize()` with a
  call to `getRemoteMtimes()` — the override added in commit 1
  already throws on access failure, so the duplicated probe code
  was redundant.
- Make `lib/colors.ts` TTY-aware. Constants resolve to empty strings
  when `process.stdout.isTTY` is false or `NO_COLOR` is set, so
  `boxel realm watch ... > log.txt` no longer captures raw ANSI
  escapes. Affects every command that imports from `lib/colors.ts`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
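
For the TTY-aware colors item in the commit above, a minimal sketch of the idea follows. `makeColors` and the specific color names are hypothetical and the real `lib/colors.ts` may be organized differently; the sketch assumes an empty `NO_COLOR` value counts as unset.

```typescript
interface ColorEnv {
  isTTY: boolean | undefined;
  noColor: string | undefined;
}

// Returns ANSI escapes when colors are enabled, empty strings otherwise.
function makeColors(env: ColorEnv) {
  const enabled = env.isTTY === true && !env.noColor;
  const code = (escape: string): string => (enabled ? escape : "");
  return {
    red: code("\x1b[31m"),
    green: code("\x1b[32m"),
    dim: code("\x1b[2m"),
    reset: code("\x1b[0m"),
  };
}

// Resolved once at import time from the current process.
const colors = makeColors({
  isTTY: process.stdout.isTTY,
  noColor: process.env.NO_COLOR,
});

console.log(`${colors.green}watching${colors.reset}`);
```

Taking the environment as a parameter keeps the decision testable without having to fake a terminal.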
- `localDirs: string[]` is `const` (the binding never gets reassigned).
- Reflow two prettier complaints introduced by the cleanup commit.
- Remove the plan doc per the project's plan-then-implement convention
  (matches how the original plan was rolled into c6076cf).

`pnpm --filter @cardstack/boxel-cli lint` now exits 0.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>