Commit 43de088

garrytan and claude committed
Merge origin/main into garrytan/slim-gstack-skills
VERSION → 1.15.0.0 (MINOR bump on top of main's v1.14.0.0).

Branch's v1.13.1.0 work (preamble compression + real-PTY harness + 5 plan-mode tests passing) consolidated with v1.15.0.0 work (6 new E2E tests on the harness + parseNumberedOptions + budget regression utils) into a single release entry — v1.13.1.0 never landed on main, so its content rolls into the final shippable version per the never-orphan rule in CLAUDE.md.

Conflicts resolved:
- VERSION: 1.13.1.0 (HEAD) + 1.14.0.0 (main) → 1.15.0.0
- package.json: matching 1.15.0.0
- CHANGELOG.md: replaced HEAD's 1.13.1.0 entry with a consolidated 1.15.0.0 entry above main's untouched 1.14.0.0 entry. Itemized changes split per-version (no shared header).

CLAUDE.md adds "Scale-aware bumps — use common sense" guidance under CHANGELOG + VERSION style. Big diffs (>2K LOC, new capability) bump MINOR; PATCH is for fixes/small adds; MAJOR for breaking changes. Codified after a v1.14.1.0 PATCH attempt got correctly pushed back on for a ~10K-line additions / -24K-line removals release.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2 parents e6fd776 + ed1e4be commit 43de088

35 files changed

Lines changed: 3075 additions & 5146 deletions

.gitignore

Lines changed: 4 additions & 0 deletions
@@ -20,6 +20,10 @@ bin/gstack-global-discover
 .gbrain/
 .context/
 extension/.auth.json
+# xterm assets are vendored from npm at build time; not source-of-truth.
+extension/lib/xterm.js
+extension/lib/xterm.css
+extension/lib/xterm-addon-fit.js
 .gstack-worktrees/
 /tmp/
 *.log

CHANGELOG.md

Lines changed: 78 additions & 7 deletions
Large diffs are not rendered by default.

CLAUDE.md

Lines changed: 54 additions & 6 deletions
@@ -225,12 +225,35 @@ When you need to interact with a browser (QA, dogfooding, cookie setup), use the
 project uses.
 
 **Sidebar architecture:** Before modifying `sidepanel.js`, `background.js`,
-`content.js`, `sidebar-agent.ts`, or sidebar-related server endpoints, read
-`docs/designs/SIDEBAR_MESSAGE_FLOW.md`. It documents the full initialization
-timeline, message flow, auth token chain, tab concurrency model, and known
-failure modes. The sidebar spans 5 files across 2 codebases (extension + server)
-with non-obvious ordering dependencies. The doc exists to prevent the kind of
-silent failures that come from not understanding the cross-component flow.
+`content.js`, `terminal-agent.ts`, or sidebar-related server endpoints,
+read `docs/designs/SIDEBAR_MESSAGE_FLOW.md`. The sidebar has one primary
+surface — the **Terminal** pane (interactive `claude` PTY) — with
+Activity / Refs / Inspector as debug overlays behind the footer's
+`debug` toggle. The chat queue path was ripped once the PTY proved out;
+`sidebar-agent.ts` and the `/sidebar-command` / `/sidebar-chat` /
+`/sidebar-agent/event` endpoints are gone. The doc covers the WS auth
+flow, dual-token model, and threat-model boundary — silent failures
+here usually trace to not understanding the cross-component flow.
+
+**WebSocket auth uses Sec-WebSocket-Protocol, not cookies.** Browsers
+can't set `Authorization` on a WebSocket upgrade, but they CAN set
+`Sec-WebSocket-Protocol` via `new WebSocket(url, [token])`. The agent
+reads it, validates against `validTokens`, and MUST echo the protocol
+back in the upgrade response — without the echo, Chromium closes the
+connection immediately. `Set-Cookie: gstack_pty=...` is kept as a
+fallback for non-browser callers (the cross-port `SameSite=Strict`
+cookie path doesn't survive from a chrome-extension origin).
+
+**Cross-pane PTY injection.** The toolbar's Cleanup button and the
+Inspector's "Send to Code" action both pipe text into the live claude
+PTY via `window.gstackInjectToTerminal(text)`, exposed by
+`sidepanel-terminal.js`. No `/sidebar-command` POST — the live REPL is
+the only execution surface in the sidebar now.
+
+**`/health` MUST NOT surface any shell-grant token.** It already leaks
+`AUTH_TOKEN` to localhost callers in headed mode (a v1.1+ TODO). Don't
+make that worse by adding the PTY session token there. PTY auth flows
+through `POST /pty-session` only.
 
 **Transport-layer security** (v1.6.0.0+). When `pair-agent` starts an ngrok tunnel,
 the daemon binds two HTTP listeners: a local listener (127.0.0.1, full command
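
Review note on the WebSocket auth paragraph in the hunk above: the read, validate, and echo steps can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in `terminal-agent.ts`. The helper names `selectAuthProtocol` and `upgradeHeaders` and the sample token are hypothetical; only `validTokens` and the `new WebSocket(url, [token])` call come from the diff.

```typescript
// Hypothetical helper (not from the repo): given the raw
// Sec-WebSocket-Protocol request header and the set of valid tokens,
// return the protocol value to echo back, or null to reject the upgrade.
function selectAuthProtocol(
  header: string | null,
  validTokens: ReadonlySet<string>,
): string | null {
  if (!header) return null;
  // The header carries a comma-separated list of offered subprotocols.
  const offered = header
    .split(",")
    .map((p) => p.trim())
    .filter((p) => p.length > 0);
  for (const candidate of offered) {
    if (validTokens.has(candidate)) return candidate;
  }
  return null;
}

// Hypothetical helper: headers to attach to the upgrade response.
// Echoing the accepted protocol is the step the diff says Chromium requires.
function upgradeHeaders(protocol: string): Record<string, string> {
  return { "Sec-WebSocket-Protocol": protocol };
}

// Sample token is made up for illustration.
const validTokens = new Set(["tok-abc123"]);
const accepted = selectAuthProtocol("tok-abc123", validTokens);
const rejected = selectAuthProtocol("tok-wrong", validTokens);
console.log(accepted, rejected);
```

On the browser side the token travels as the subprotocol argument, `new WebSocket(url, [token])`, so it has to be a valid subprotocol token (no spaces or commas).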
@@ -437,6 +460,31 @@ claims v1.7.0.0 as a MINOR and branch B is also a MINOR, B lands at v1.8.0.0
 `bin/gstack-next-version` advances within the chosen bump level rather than
 repicking the level when collisions happen.
 
+**Scale-aware bumps — use common sense.** When the diff is big, bump MINOR (or
+MAJOR), not PATCH. PATCH is for bug fixes and small additions; MINOR is for
+substantial new capability or substantial reduction; MAJOR is for breaking
+changes. Rough guideposts (don't treat as rules, treat as smell-checks):
+
+- **PATCH (X.Y.Z+1.0)**: bug fix, doc tweak, small additive change, single
+test/file added. Net diff under ~500 lines, no new user-facing capability.
+- **MINOR (X.Y+1.0.0)**: new capability shipped (skill, harness, command, big
+refactor), substantial code reduction (compression, migration), or coordinated
+multi-file change. Net diff over ~2000 lines added/removed, OR a user-visible
+feature you'd put in a tweet.
+- **MAJOR (X+1.0.0.0)**: breaking change to public surface (CLI flag rename,
+skill removed, config format changed), OR a release big enough to be the
+headline of a blog post.
+
+If you find yourself debating "is 10K added + 24K removed really a PATCH?" — it
+isn't. Bump MINOR. Same for "this adds a whole new test harness with 6 new E2E
+tests + helper utilities" — MINOR. The bump level is communication to the user
+about what kind of release this is; don't undersell it.
+
+When merging origin/main brings a higher VERSION, re-evaluate the bump level
+against the SCALE of your branch's work, not just whether main moved forward.
+If main bumped MINOR and your branch is also a substantial change, you bump
+MINOR again on top (e.g., main at v1.14.0.0, your branch lands v1.15.0.0).
+
 **VERSION and CHANGELOG are branch-scoped.** Every feature branch that ships gets its
 own version bump and CHANGELOG entry. The entry describes what THIS branch adds —
 not what was already on main.
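
Review note on the scale-aware bump guidance above: the guideposts and the four-part version arithmetic can be sketched as below. This is an illustration only, not `bin/gstack-next-version`. `DiffSummary`, `suggestBump`, and `applyBump` are hypothetical names. Reading "lines added/removed" as the sum of both is an assumption, and so is defaulting the unspecified 500 to 2000 line range to PATCH.

```typescript
type BumpLevel = "PATCH" | "MINOR" | "MAJOR";

// Hypothetical summary of a branch's diff; field names are illustrative.
interface DiffSummary {
  linesAdded: number;
  linesRemoved: number;
  breakingChange: boolean; // e.g. CLI flag rename, skill removed
  newCapability: boolean; // e.g. new skill, harness, or command
}

// Smell-check only: the guidance says these are guideposts, not rules.
function suggestBump(d: DiffSummary): BumpLevel {
  if (d.breakingChange) return "MAJOR";
  // Assumption: "added/removed" means the sum of both counts.
  const churn = d.linesAdded + d.linesRemoved;
  if (d.newCapability || churn > 2000) return "MINOR";
  return "PATCH";
}

// Apply a bump to a four-part version string such as "1.14.0.0".
function applyBump(version: string, level: BumpLevel): string {
  const parts = version.split(".").map(Number);
  if (parts.length !== 4 || parts.some((n) => !Number.isInteger(n) || n < 0)) {
    throw new Error(`expected a four-part version, got "${version}"`);
  }
  const [x, y, z] = parts as [number, number, number, number];
  switch (level) {
    case "MAJOR":
      return `${x + 1}.0.0.0`;
    case "MINOR":
      return `${x}.${y + 1}.0.0`;
    case "PATCH":
      return `${x}.${y}.${z + 1}.0`;
  }
}

// The commit's own case: ~10K added, ~24K removed, main already at 1.14.0.0.
const level = suggestBump({
  linesAdded: 10000,
  linesRemoved: 24000,
  breakingChange: false,
  newCapability: true,
});
const next = applyBump("1.14.0.0", level);
console.log(level, next);
```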

SKILL.md

Lines changed: 1 addition & 0 deletions
@@ -880,6 +880,7 @@ Refs are invalidated on navigation — run `snapshot` again after `goto`.
 | `closetab [id]` | Close tab |
 | `newtab [url] [--json]` | Open new tab. With --json, returns {"tabId":N,"url":...} for programmatic use (make-pdf). |
 | `tab <id>` | Switch to tab |
+| `tab-each <command> [args...]` | Run a command on every open tab. Returns JSON with per-tab results. |
 | `tabs` | List open tabs |
 
 ### Server

TODOS.md

Lines changed: 52 additions & 0 deletions
@@ -1,5 +1,57 @@
 # TODOS
 
+## Sidebar Terminal (cc-pty-import follow-ups)
+
+### v1.1: PTY session survives sidebar reload
+
+**What:** Today the Terminal tab's PTY dies with the WebSocket — sidebar
+reload, side-panel close, even a quick navigate-away in another tab close
+the session. v1.1 should key the PTY on a tab/session id so a reload
+reattaches to the existing claude process and you keep `/resume` history.
+
+**Why:** Mid-task resilience. When you've been pair-programming with claude
+for 20 minutes and an accidental Cmd-R blows it away, the cost is real.
+
+**Pros:** Better UX, fewer interrupted sessions. **Cons:** Session-tracking
+state, ghost-process risk, lifecycle bugs (when DOES the PTY actually go
+away?). v1 chose the simple "PTY dies with WS" model deliberately.
+
+**Context:** /plan-eng-review Issue 1C decision (cc-pty-import branch,
+2026-04-25). v1 ships with phoenix's lifecycle. **Depends on:**
+cc-pty-import landed.
+
+**Priority:** P2 (nice-to-have).
+**Effort:** M. Likely needs a per-tab session map keyed by chrome.tabs.id
+plus a TTL so abandoned PTYs eventually exit.
+
+---
+
+### v1.1+: Audit `/health` token distribution
+
+**What:** Codex's outside-voice review on cc-pty-import flagged that
+`/health` already surfaces `AUTH_TOKEN` to any localhost caller in headed
+mode (`server.ts:1657`). That's a pre-existing soft leak — anything
+running on localhost gets the root token by hitting `/health`.
+
+**Why:** cc-pty-import sidesteps it by NOT putting the PTY token there
+(uses an HttpOnly cookie path instead). But the underlying leak is still
+shippable surface. A second extension or a localhost web app could
+currently scrape `AUTH_TOKEN` and hit any browse-server endpoint.
+
+**Pros:** Closes a real privilege-escalation path on multi-extension
+machines. **Cons:** Either we tighten the gate (Origin must be OUR
+extension id, not just any chrome-extension://) or we move bootstrap
+discovery off `/health` entirely. Either has migration cost for tests
+and the existing extension.
+
+**Context:** codex finding #2 on cc-pty-import plan-eng review. Not in
+scope of that PR; deliberately deferred to keep PTY-import small.
+
+**Priority:** P2.
+**Effort:** M.
+
+---
+
 ## Testing
 
 ## P1: Structural STOP-Ask forcing function across all skills
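
Review note on the first TODO above ("per-tab session map keyed by chrome.tabs.id plus a TTL"): one possible shape for that structure is sketched below. Nothing here exists in the repo. `TabSessionMap`, `attach`, and `sweep` are hypothetical names, and a string stands in for the PTY handle. The clock is injectable so expiry can be exercised without waiting.

```typescript
// Hypothetical sketch of a per-tab session map with idle-time eviction.
interface Entry<T> {
  session: T;
  lastSeen: number; // timestamp (ms) of the most recent attach
}

class TabSessionMap<T> {
  private entries = new Map<number, Entry<T>>();
  private ttlMs: number;
  private now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Reattach to the existing session for this tab, or create one.
  attach(tabId: number, create: () => T): T {
    const existing = this.entries.get(tabId);
    if (existing) {
      existing.lastSeen = this.now();
      return existing.session;
    }
    const session = create();
    this.entries.set(tabId, { session, lastSeen: this.now() });
    return session;
  }

  // Evict sessions idle longer than the TTL; onExpire is where a real
  // implementation would terminate the abandoned PTY process.
  sweep(onExpire: (session: T) => void): number[] {
    const evicted: number[] = [];
    this.entries.forEach((entry, tabId) => {
      if (this.now() - entry.lastSeen > this.ttlMs) {
        onExpire(entry.session);
        this.entries.delete(tabId);
        evicted.push(tabId);
      }
    });
    return evicted;
  }
}

// Usage with a fake clock (values are illustrative).
let clock = 0;
const sessions = new TabSessionMap<string>(1000, () => clock);
const first = sessions.attach(7, () => "pty-A");
clock = 500;
const again = sessions.attach(7, () => "pty-B"); // reload: reuses the live session
clock = 2000;
const evicted = sessions.sweep(() => {});
console.log(first, again, evicted);
```

Keeping eviction in `sweep` rather than in `attach` means every expired session passes through `onExpire`, which speaks to the ghost-process risk listed under Cons.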

VERSION

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-1.13.1.0
+1.15.0.0

browse/SKILL.md

Lines changed: 1 addition & 0 deletions
@@ -804,6 +804,7 @@ $B prettyscreenshot --cleanup --scroll-to ".pricing" --width 1440 ~/Desktop/hero
 | `closetab [id]` | Close tab |
 | `newtab [url] [--json]` | Open new tab. With --json, returns {"tabId":N,"url":...} for programmatic use (make-pdf). |
 | `tab <id>` | Switch to tab |
+| `tab-each <command> [args...]` | Run a command on every open tab. Returns JSON with per-tab results. |
 | `tabs` | List open tabs |
 
 ### Server

browse/src/cli.ts

Lines changed: 32 additions & 47 deletions
@@ -853,7 +853,7 @@ Refs: After 'snapshot', use @e1, @e2... as selectors:
     // Delete stale state file
     safeUnlinkQuiet(config.stateFile);
 
-    console.log('Launching headed Chromium with extension + sidebar agent...');
+    console.log('Launching headed Chromium with extension + terminal agent...');
     try {
       // Start server in headed mode with extension auto-loaded
       // Use a well-known port so the Chrome extension auto-connects
@@ -882,56 +882,41 @@ Refs: After 'snapshot', use @e1, @e2... as selectors:
       const status = await resp.text();
       console.log(`Connected to real Chrome\n${status}`);
 
-      // Auto-start sidebar agent
-      // __dirname is inside $bunfs in compiled binaries — resolve from execPath instead
-      let agentScript = path.resolve(__dirname, 'sidebar-agent.ts');
-      if (!fs.existsSync(agentScript)) {
-        agentScript = path.resolve(path.dirname(process.execPath), '..', 'src', 'sidebar-agent.ts');
+      // sidebar-agent.ts spawn was here. Ripped alongside the chat queue —
+      // the Terminal pane runs an interactive PTY now, no more one-shot
+      // claude -p subprocesses to multiplex.
+
+      // Auto-start terminal agent (non-compiled bun process). Owns the PTY
+      // WebSocket for the sidebar Terminal pane.
+      let termAgentScript = path.resolve(__dirname, 'terminal-agent.ts');
+      if (!fs.existsSync(termAgentScript)) {
+        termAgentScript = path.resolve(path.dirname(process.execPath), '..', 'src', 'terminal-agent.ts');
       }
       try {
-        if (!fs.existsSync(agentScript)) {
-          throw new Error(`sidebar-agent.ts not found at ${agentScript}`);
-        }
-        // Clear old agent queue
-        const agentQueue = path.join(process.env.HOME || '/tmp', '.gstack', 'sidebar-agent-queue.jsonl');
-        try {
-          fs.mkdirSync(path.dirname(agentQueue), { recursive: true, mode: 0o700 });
-          fs.writeFileSync(agentQueue, '', { mode: 0o600 });
-        } catch (err: any) {
-          if (err?.code !== 'EACCES') throw err;
-        }
-
-        // Resolve browse binary path the same way — execPath-relative
-        let browseBin = path.resolve(__dirname, '..', 'dist', 'browse');
-        if (!fs.existsSync(browseBin)) {
-          browseBin = process.execPath; // the compiled binary itself
-        }
-
-        // Kill any existing sidebar-agent processes before starting a new one.
-        // Old agents have stale auth tokens and will silently fail to relay events,
-        // causing the server to mark the agent as "hung".
-        try {
-          const { spawnSync } = require('child_process');
-          spawnSync('pkill', ['-f', 'sidebar-agent\\.ts'], { stdio: 'ignore', timeout: 3000 });
-        } catch (err: any) {
-          if (err?.code !== 'ENOENT') throw err;
+        if (fs.existsSync(termAgentScript)) {
+          // Kill old terminal-agents so a stale port file can't trick the
+          // server into routing /pty-session at a dead listener.
+          try {
+            const { spawnSync } = require('child_process');
+            spawnSync('pkill', ['-f', 'terminal-agent\\.ts'], { stdio: 'ignore', timeout: 3000 });
+          } catch (err: any) {
+            if (err?.code !== 'ENOENT') throw err;
+          }
+          const termProc = Bun.spawn(['bun', 'run', termAgentScript], {
+            cwd: config.projectDir,
+            env: {
+              ...process.env,
+              BROWSE_STATE_FILE: config.stateFile,
+              BROWSE_SERVER_PORT: String(newState.port),
+            },
+            stdio: ['ignore', 'ignore', 'ignore'],
+          });
+          termProc.unref();
+          console.log(`[browse] Terminal agent started (PID: ${termProc.pid})`);
         }
-
-        const agentProc = Bun.spawn(['bun', 'run', agentScript], {
-          cwd: config.projectDir,
-          env: {
-            ...process.env,
-            BROWSE_BIN: browseBin,
-            BROWSE_STATE_FILE: config.stateFile,
-            BROWSE_SERVER_PORT: String(newState.port),
-          },
-          stdio: ['ignore', 'ignore', 'ignore'],
-        });
-        agentProc.unref();
-        console.log(`[browse] Sidebar agent started (PID: ${agentProc.pid})`);
       } catch (err: any) {
-        console.error(`[browse] Sidebar agent failed to start: ${err.message}`);
-        console.error(`[browse] Run manually: bun run ${agentScript}`);
+        // Non-fatal: chat still works without the terminal agent.
+        console.error(`[browse] Terminal agent failed to start: ${err.message}`);
       }
     } catch (err: any) {
       console.error(`[browse] Connect failed: ${err.message}`);

browse/src/commands.ts

Lines changed: 2 additions & 1 deletion
@@ -30,7 +30,7 @@ export const WRITE_COMMANDS = new Set([
 ]);
 
 export const META_COMMANDS = new Set([
-  'tabs', 'tab', 'newtab', 'closetab',
+  'tabs', 'tab', 'tab-each', 'newtab', 'closetab',
   'status', 'stop', 'restart',
   'screenshot', 'pdf', 'responsive',
   'chain', 'diff',
@@ -144,6 +144,7 @@ export const COMMAND_DESCRIPTIONS: Record<string, { category: string; descriptio
   'tab': { category: 'Tabs', description: 'Switch to tab', usage: 'tab <id>' },
   'newtab': { category: 'Tabs', description: 'Open new tab. With --json, returns {"tabId":N,"url":...} for programmatic use (make-pdf).', usage: 'newtab [url] [--json]' },
   'closetab':{ category: 'Tabs', description: 'Close tab', usage: 'closetab [id]' },
+  'tab-each':{ category: 'Tabs', description: 'Run a command on every open tab. Returns JSON with per-tab results.', usage: 'tab-each <command> [args...]' },
   // Server
   'status': { category: 'Server', description: 'Health check' },
   'stop': { category: 'Server', description: 'Shutdown server' },

browse/src/meta-commands.ts

Lines changed: 102 additions & 0 deletions
@@ -285,6 +285,108 @@ export async function handleMetaCommand(
       return `Closed tab${id ? ` ${id}` : ''}`;
     }
 
+    case 'tab-each': {
+      // Fan out a single command across every open tab. Returns a JSON
+      // object: { results: [{tabId, url, title, status, output}], total }.
+      // Restores the originally active tab when done so the user's view
+      // doesn't shift under them.
+      //
+      // Usage: $B tab-each <command> [args...]
+      // $B tab-each snapshot -i → snapshot every tab
+      // $B tab-each text → grab clean text from every tab
+      // $B tab-each goto https://x.y → load the same URL in every tab
+      if (args.length === 0) {
+        throw new Error(
+          'Usage: browse tab-each <command> [args...]\n' +
+          'Example: browse tab-each snapshot -i'
+        );
+      }
+
+      const innerRaw = args[0];
+      const innerName = canonicalizeCommand(innerRaw);
+      const innerArgs = args.slice(1);
+
+      // Scope check the inner command before fanning out, so a single
+      // permission failure aborts the whole batch instead of partially
+      // mutating tabs.
+      if (tokenInfo && tokenInfo.clientId !== 'root' && !checkScope(tokenInfo, innerName)) {
+        throw new Error(
+          `tab-each rejected: subcommand "${innerRaw}" not allowed by your token scope (${tokenInfo.scopes.join(', ')}).`
+        );
+      }
+
+      const tabs = await bm.getTabListWithTitles();
+      const originalActive = tabs.find(t => t.active)?.id ?? bm.getActiveTabId();
+
+      const executeCmd = opts?.executeCommand;
+      const results: Array<{
+        tabId: number;
+        url: string;
+        title: string;
+        status: number;
+        output: string;
+      }> = [];
+
+      try {
+        for (const tab of tabs) {
+          // Skip chrome:// internal pages — they aren't useful targets and
+          // many commands fail outright on them.
+          if (tab.url.startsWith('chrome://') || tab.url.startsWith('chrome-extension://')) {
+            results.push({
+              tabId: tab.id,
+              url: tab.url,
+              title: tab.title || '',
+              status: 0,
+              output: 'skipped: internal page',
+            });
+            continue;
+          }
+          // Switch to the tab. Don't pull focus away — we're a background
+          // operation; the user shouldn't see the OS window jump.
+          bm.switchTab(tab.id, { bringToFront: false });
+
+          let status = 0;
+          let output = '';
+          if (executeCmd) {
+            const r = await executeCmd(
+              { command: innerName, args: innerArgs, tabId: tab.id },
+              tokenInfo,
+            );
+            status = r.status;
+            output = r.result;
+            if (status !== 200) {
+              try { output = JSON.parse(output).error || output; } catch (err: any) { if (!(err instanceof SyntaxError)) throw err; }
+            }
+          } else {
+            // Fallback path (CLI / test harness without a server context).
+            // We don't recurse through read/write/meta directly here because
+            // tab-each is only meaningful with the live server; surface a
+            // clear error.
+            status = 500;
+            output = 'tab-each requires the browse server (no executeCommand context)';
+          }
+
+          results.push({
+            tabId: tab.id,
+            url: tab.url,
+            title: tab.title || '',
+            status,
+            output,
+          });
+        }
+      } finally {
+        // Restore the original active tab so the user's view is unchanged.
+        try { bm.switchTab(originalActive, { bringToFront: false }); } catch {}
+      }
+
+      return JSON.stringify({
+        command: innerName,
+        args: innerArgs,
+        total: results.length,
+        results,
+      }, null, 2);
+    }
+
     // ─── Server Control ────────────────────────────────
     case 'status': {
       const page = bm.getPage();
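
Review note on the `tab-each` return value above: a caller could consume the JSON roughly as sketched below. The interfaces mirror the fields in the `JSON.stringify` call in the diff. `summarizeTabEach` and the sample payload are hypothetical and made up for illustration. Status `0` is treated as "skipped" and `200` as success, matching how the handler sets those values; any other status is counted as a failure.

```typescript
// Shape of one per-tab entry, as built by results.push() in the diff.
interface TabEachResult {
  tabId: number;
  url: string;
  title: string;
  status: number;
  output: string;
}

// Shape of the whole payload returned by the tab-each handler.
interface TabEachOutput {
  command: string;
  args: string[];
  total: number;
  results: TabEachResult[];
}

// Hypothetical consumer: bucket tab ids by outcome.
function summarizeTabEach(raw: string): {
  ok: number[];
  skipped: number[];
  failed: number[];
} {
  const parsed = JSON.parse(raw) as TabEachOutput;
  const summary = {
    ok: [] as number[],
    skipped: [] as number[],
    failed: [] as number[],
  };
  for (const r of parsed.results) {
    if (r.status === 200) summary.ok.push(r.tabId);
    else if (r.status === 0) summary.skipped.push(r.tabId); // internal page
    else summary.failed.push(r.tabId);
  }
  return summary;
}

// Made-up payload for illustration; not captured from a real run.
const sample = JSON.stringify({
  command: "text",
  args: [],
  total: 3,
  results: [
    { tabId: 1, url: "https://example.com/", title: "Example", status: 200, output: "Example Domain" },
    { tabId: 2, url: "chrome://settings/", title: "Settings", status: 0, output: "skipped: internal page" },
    { tabId: 3, url: "https://example.org/", title: "", status: 500, output: "some error" },
  ],
});
const summary = summarizeTabEach(sample);
console.log(summary);
```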
