Commit a549315

kevinccbsg and claude authored
feat: abort throttled test runs + improve RUN_IN_PROGRESS recovery (#5)
* docs: add spec for throttle detection + abort

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: add implementation plan for throttle detection + abort

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat: add run:aborted event type and maxTestDurationMs option

* feat: expand RUN_IN_PROGRESS error with recovery guidance

* feat: add RunMonitor type + skeleton factory

* feat: implement runMonitor threshold detection

* test: pin runMonitor threshold boundary + document non-positive disable

  Code review surfaced that the exact-threshold case (duration === thresholdMs) was untested, and the JSDoc only mentioned "0 disables" even though the implementation treats any non-positive value the same way. Add a boundary test and tighten the comment.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat: wire runMonitor into browser client for throttle-based abort

* fix: paint favicon fail on abort tick + prune monitor state on runner crash

  Code review surfaced two issues:

  1. The abort-tick sent run:complete over the wire but left the favicon showing 'running' until the (possibly hung) runner promise resolved. Paint 'fail' in the tick so the tab indicator matches the wire immediately.
  2. A runner crash left monitor.currentTest populated, so a subsequent threshold check could read stale in-flight state. Call monitor.onTestEnd() in the catch block.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test: pin maxTestDurationMs forwarding and run:aborted broadcasting

* feat: CLI --max-test-duration flag + run:aborted handler

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: detect throttling via onTestEnd + lower default threshold to 5s

  Smoke-testing surfaced the flaw: tests under throttling cluster in the 4-8s band per test — individually under the 10s threshold. The heartbeat tick (3s interval) was the only detector, so tests that completed between ticks slipped through. Result: a 1s run stretched to 27s with no abort.

  Two fixes:

  - RunMonitor.onTestEnd() now returns breach info if the just-ended test exceeded threshold. createBrowserClient calls fireAbort() on that breach from the onPass/onFail/onSkip callbacks — catches every completed test, not just those still running at a heartbeat boundary.
  - Default threshold lowered from 10000ms to 5000ms. 5s is still 5-10× slower than any real TWD test and matches the observed per-test duration under Chrome's background throttling.

  Also refactored the abort block into a shared fireAbort() helper used by both the heartbeat tick and the new onTestEnd breach path. Spec updated to reflect the two-pronged detection design.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: document run:aborted threshold and RUN_IN_PROGRESS recovery

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
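The two-pronged detection described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code in `src/browser/runMonitor.ts`: the `Breach` shape, the `checkThreshold` method name, the injectable `now` clock, and the choice that a duration exactly equal to the threshold does not count as a breach are all assumptions made for the example.

```typescript
// Hypothetical sketch: names and shapes are assumptions, not the real runMonitor.
interface Breach {
  testName: string;
  durationMs: number;
  thresholdMs: number;
}

interface RunMonitor {
  onTestStart(testName: string): void;
  /** Clears in-flight state; returns a breach if the finished test ran past the threshold. */
  onTestEnd(): Breach | null;
  /** Meant for the heartbeat tick; reports a breach for a test that is still running. */
  checkThreshold(): Breach | null;
}

function createRunMonitor(
  thresholdMs: number,
  now: () => number = Date.now, // injectable clock so the sketch can be tested
): RunMonitor {
  let current: { testName: string; startedAt: number } | null = null;

  const measure = (): Breach | null => {
    // Any non-positive threshold disables detection.
    if (thresholdMs <= 0 || current === null) return null;
    const durationMs = now() - current.startedAt;
    // Assumption: only durations strictly above the threshold count as a breach.
    return durationMs > thresholdMs
      ? { testName: current.testName, durationMs, thresholdMs }
      : null;
  };

  return {
    onTestStart(testName) {
      current = { testName, startedAt: now() };
    },
    onTestEnd() {
      const breach = measure();
      current = null; // prune state so later checks cannot read a stale test
      return breach;
    },
    checkThreshold: measure,
  };
}
```

In this sketch a caller would invoke `checkThreshold()` from the heartbeat and inspect the return value of `onTestEnd()` from the pass/fail/skip callbacks, routing either breach to a single abort helper.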
1 parent 7cba63d commit a549315

13 files changed

Lines changed: 1872 additions & 24 deletions

CLAUDE.md

Lines changed: 1 addition & 1 deletion
@@ -23,7 +23,7 @@ twd-relay is a WebSocket relay that lets AI agents and external tools trigger an
**Relay Server** (`src/relay/`, exported as `twd-relay`) — A WebSocket server that attaches to an HTTP server. It manages exactly one browser connection and many client connections. Clients send commands (`run`, `status`); the relay forwards them to the browser. The browser sends test lifecycle events (`test:start`, `test:pass`, `test:fail`, `run:complete`, etc.); the relay broadcasts them to all clients. A `runInProgress` lock prevents concurrent test runs.

-**Browser Client** (`src/browser/`, exported as `twd-relay/browser`) — Runs in the browser. Connects to the relay, listens for commands, dynamically imports `twd-js/runner` to execute tests, and streams results back. Uses native browser `WebSocket` with auto-reconnect. Reads test state from `window.__TWD_STATE__` (set by twd-js). A small `faviconManager` (in `src/browser/faviconManager.ts`) sets a colored favicon + `document.title` prefix based on connection/run state so the active TWD tab is identifiable among multiple tabs to the same origin.
+**Browser Client** (`src/browser/`, exported as `twd-relay/browser`) — Runs in the browser. Connects to the relay, listens for commands, dynamically imports `twd-js/runner` to execute tests, and streams results back. Uses native browser `WebSocket` with auto-reconnect. Reads test state from `window.__TWD_STATE__` (set by twd-js). A small `faviconManager` (in `src/browser/faviconManager.ts`) sets a colored favicon + `document.title` prefix based on connection/run state so the active TWD tab is identifiable among multiple tabs to the same origin. A sibling `runMonitor` (in `src/browser/runMonitor.ts`) tracks per-test wall-clock time; on the 3 s heartbeat tick AND at the end of every test, the browser checks whether the test exceeded `maxTestDurationMs` (default 5 s) and, if so, emits a `run:aborted` event so the CLI can exit with a clear error instead of hanging on a throttled tab.

**Vite Plugin** (`src/vite/`, exported as `twd-relay/vite`) — A Vite plugin that hooks into `configureServer` to attach the relay to the dev server's HTTP instance.
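
The `runInProgress` lock mentioned in the Relay Server description, together with the expanded `RUN_IN_PROGRESS` error from this commit, might look something like the sketch below. The class name, the message text, and the assumption that only `run:complete` releases the lock are illustrative guesses, not the relay's actual implementation.

```typescript
// Hypothetical sketch of a single-run lock; not the code in src/relay/.
type StartResult =
  | { ok: true }
  | { ok: false; code: "RUN_IN_PROGRESS"; message: string };

class RunLock {
  private runInProgress = false;

  /** Intended to be called when a client sends `run`. */
  tryStart(): StartResult {
    if (this.runInProgress) {
      return {
        ok: false,
        code: "RUN_IN_PROGRESS",
        // Assumed wording; the real recovery guidance may differ.
        message:
          "A test run is already in progress. If it appears stuck, foreground the TWD tab and retry.",
      };
    }
    this.runInProgress = true;
    return { ok: true };
  }

  /** Assumption: only `run:complete` from the browser releases the lock. */
  onBrowserEvent(type: string): void {
    if (type === "run:complete") this.runInProgress = false;
  }
}
```

Returning a tagged result instead of throwing keeps the rejection path explicit, so a caller can forward the code and guidance to the requesting client as-is.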

README.md

Lines changed: 15 additions & 0 deletions
@@ -79,6 +79,21 @@ Once connected, the browser client sets a colored favicon and prefixes `document
On disconnect or eviction (another tab taking over), the original favicon and title are restored.

+### Aborting throttled runs
+
+Chrome aggressively throttles timers in backgrounded tabs, which can stretch a 1-second test run to 30+ seconds. To avoid AI/CI hangs, the browser client monitors per-test wall-clock time. If any single test runs longer than 5 seconds (configurable), the browser emits `run:aborted`, the CLI prints a clear error with recovery guidance, and the run ends with exit code 1.
+
+Override the threshold with `--max-test-duration <ms>` on `twd-relay run`, or pass `maxTestDurationMs` to `createBrowserClient`. Set it to `0` to disable detection entirely:
+
+```bash
+twd-relay run --max-test-duration 15000 # raise to 15s for heavy multistep tests
+twd-relay run --max-test-duration 0 # disable detection
+```
+
+The default of 5 s is deliberate: real TWD tests are typically sub-second, and 5 s is a strong throttling signal. Heavy legitimate tests (complex multistep forms, many API calls) may need to raise this.
+
+Recovery when an abort fires: foreground the TWD tab (identified by the `[TWD …]` title prefix set by the favicon indicator) and retry. For unattended runs (CI, agents), prefer `twd-cli`: it drives a headless browser where the tab is always focused and throttling doesn't apply.
+
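
How a CLI might read the `--max-test-duration` flag added in this hunk can be sketched as below. The helper names and the error message are hypothetical and are not taken from the twd-relay CLI; only the flag name, the 5000 ms default, and the rule that a non-positive value disables detection come from the commit.

```typescript
// Hypothetical helpers; not the actual twd-relay CLI code.
const DEFAULT_MAX_TEST_DURATION_MS = 5000;

/** Reads `--max-test-duration <ms>` from an argv-style array, falling back to the default. */
function parseMaxTestDuration(argv: string[]): number {
  const i = argv.indexOf("--max-test-duration");
  if (i === -1) return DEFAULT_MAX_TEST_DURATION_MS;
  // A missing value becomes NaN and is rejected below.
  const value = Number(argv[i + 1]);
  if (!Number.isFinite(value)) {
    throw new Error("--max-test-duration expects a number of milliseconds");
  }
  return value;
}

/** True when detection is active; 0 or any other non-positive value turns it off. */
function detectionEnabled(maxTestDurationMs: number): boolean {
  return maxTestDurationMs > 0;
}
```

For example, `parseMaxTestDuration(["run", "--max-test-duration", "15000"])` yields `15000`, while passing `0` yields a value for which `detectionEnabled` returns `false`.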
**3. Open your app in a browser** — the page connects to the relay as “browser”.

**4. Trigger a run** — something must connect as a **client** and send `run`:
