Commit a80bd3a

Merge remote-tracking branch 'origin/main' into garrytan/dublin-v1
# Conflicts:
#   CHANGELOG.md
#   VERSION
#   bin/gstack-memory-ingest.ts
#   package.json
#   test/gstack-memory-ingest.test.ts
2 parents f571ffb + 7489506 commit a80bd3a

58 files changed

Lines changed: 990 additions & 30 deletions

autoplan/SKILL.md

Lines changed: 21 additions & 0 deletions
@@ -324,6 +324,26 @@ Effort both-scales: when an option involves effort, label both human-team and CC
 
 Net line closes the tradeoff. Per-skill instructions may add stricter rules.
 
+12. **Non-ASCII characters — write directly, never \u-escape.** When any
+string field (question, option label, option description) contains
+Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit
+the literal UTF-8 characters in the JSON string. **Never escape them
+as `\uXXXX`.** Claude Code's tool parameter pipe is UTF-8 native
+and passes characters through unchanged. Manually escaping requires
+recalling each codepoint from training, which is unreliable for long
+CJK strings — the model regularly emits the wrong codepoint (e.g.
+writes `\u3103` thinking it is 管 U+7BA1, but `\u3103` is
+actually ㄃, so the user sees `管理工具` rendered as `㄃3用箱`).
+The trigger is long, multi-line questions with hundreds of CJK
+characters: that is exactly when reflexive escaping kicks in and
+exactly when miscoding is most damaging. Long ≠ escape. Keep
+characters literal.
+
+Wrong: `"question": "請選擇\uXXXX\uXXXX\uXXXX\uXXXX"`
+Right: `"question": "請選擇管理工具"`
+
+Only JSON-mandatory escapes remain allowed: `\n`, `\t`, `\"`, `\\`.
+
 ### Self-check before emitting
 
 Before calling AskUserQuestion, verify:
@@ -336,6 +356,7 @@ Before calling AskUserQuestion, verify:
 - [ ] Dual-scale effort labels on effort-bearing options (human / CC)
 - [ ] Net line closes the decision
 - [ ] You are calling the tool, not writing prose
+- [ ] Non-ASCII characters (CJK / accents) written directly, NOT \u-escaped
 
 
 ## Artifacts Sync (skill start)
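
The rule added above can be illustrated with a small sketch. It is not part of the commit: `hasUnicodeEscape` is a hypothetical helper name. It only shows that a hand-written escape with the wrong codepoint is still valid JSON and decodes to a different character, while literal text survives serialization unchanged.

```typescript
// Hypothetical helper (not in the commit): reports whether raw JSON text
// contains a \uXXXX escape. An escaped backslash followed by "u" is skipped.
function hasUnicodeEscape(json: string): boolean {
  for (let i = 0; i < json.length; i++) {
    if (json[i] !== "\\") continue;
    const isHex = /^[0-9a-fA-F]{4}$/.test(json.slice(i + 2, i + 6));
    if (json[i + 1] === "u" && isHex) return true;
    i++; // skip the escaped character so "\\\\u7ba1" is not misread
  }
  return false;
}

// Literal characters: JSON.stringify leaves well-formed non-ASCII text as-is.
const literalJson = JSON.stringify({ question: "請選擇管理工具" });

// A correct escape decodes to the intended character...
const correct = JSON.parse('"\\u7ba1"') as string;
// ...but a misremembered codepoint parses without error and decodes to
// some other character, so the mistake is silent.
const wrong = JSON.parse('"\\u3103"') as string;

console.log(hasUnicodeEscape(literalJson)); // false
console.log(correct === "管", wrong === "管"); // true false
```

A check like this could run as a lint over generated payloads, but that is an assumption about tooling; the commit itself only adds the instruction and the checklist item.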

bin/gstack-memory-ingest.ts

Lines changed: 7 additions & 0 deletions
@@ -866,6 +866,13 @@ function renderPageBody(page: PageRecord): string {
       body,
     ].join("\n");
   }
+  // Strip NUL bytes — Postgres rejects 0x00 in UTF-8 text columns. Some Claude
+  // Code transcripts contain NUL inside user-pasted content or tool output, and
+  // surfacing those as `internal_error: invalid byte sequence` from the brain
+  // is unhelpful when we can sanitize at write time. Originally landed in v1.32.0.0
+  // (PR #1411) on the per-file `gbrain put` path; moved here so all staged
+  // pages still get the same sanitization.
+  body = body.replace(/\x00/g, "");
   return body;
 }
 
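
As a minimal sketch of the sanitization described in the comment above: `stripNulBytes` is a hypothetical name introduced here for illustration, since the commit inlines the `replace` call inside `renderPageBody`.

```typescript
// Hypothetical standalone version of the inline replace in the hunk above.
// Removes every NUL (0x00) character and leaves all other text untouched.
function stripNulBytes(text: string): string {
  return text.replace(/\x00/g, "");
}

const pasted = "pasted\x00 content\x00";
console.log(JSON.stringify(stripNulBytes(pasted))); // "pasted content"
```

Doing this once at render time means every staged page passes through the same filter, which matches the stated reason for moving the fix out of the per-file path.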

browse/src/token-registry.ts

Lines changed: 14 additions & 1 deletion
@@ -155,7 +155,20 @@ export function getRootToken(): string {
 }
 
 export function isRootToken(token: string): boolean {
-  return token === rootToken;
+  // Constant-time compare so a tunnel-reachable caller who can provoke an
+  // isRootToken() call (e.g., via the 403 "root over tunnel" rejection path)
+  // can't measure byte-by-byte string-compare timing to recover the token.
+  // Compare UTF-8 byte lengths (not JS string length) before timingSafeEqual,
+  // which throws on length-mismatched buffers. A multibyte input whose JS
+  // string length matches rootToken but whose UTF-8 byte length differs must
+  // return false on the auth path, not error out.
+  if (!rootToken) return false;
+  const tokenBytes = Buffer.byteLength(token, 'utf8');
+  const rootBytes = Buffer.byteLength(rootToken, 'utf8');
+  if (tokenBytes !== rootBytes) return false;
+  const a = Buffer.from(token, 'utf8');
+  const b = Buffer.from(rootToken, 'utf8');
+  return crypto.timingSafeEqual(a, b);
 }
 
 function generateToken(prefix: string): string {
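
The byte-length guard in `isRootToken` can be tried in isolation with the sketch below. It assumes Node.js; `safeTokenEqual` is a hypothetical name, and it compares the encoded buffers' lengths rather than calling `Buffer.byteLength` first, which should be equivalent for this purpose.

```typescript
import { timingSafeEqual } from "node:crypto";

// Hypothetical standalone variant of the comparison in the hunk above.
function safeTokenEqual(candidate: string, secret: string): boolean {
  if (!secret) return false; // no secret configured: never authenticate
  const a = Buffer.from(candidate, "utf8");
  const b = Buffer.from(secret, "utf8");
  // timingSafeEqual throws when byte lengths differ, so reject early.
  // Buffer.length counts encoded bytes, not JS string characters.
  if (a.length !== b.length) return false;
  return timingSafeEqual(a, b);
}

const secret = "root-token-for-tests"; // 20 chars, 20 bytes
const multibyte = "é".repeat(20);      // 20 chars, 40 UTF-8 bytes

console.log(multibyte.length === secret.length); // true
console.log(safeTokenEqual(multibyte, secret));  // false
console.log(safeTokenEqual(secret, secret));     // true
```

The early return does reveal whether the lengths match, but the commit accepts the same trade-off, since the alternative is an exception on the auth path.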

browse/src/url-validation.ts

Lines changed: 6 additions & 7 deletions
@@ -19,14 +19,15 @@ export const BLOCKED_METADATA_HOSTS = new Set([
 ]);
 
 /**
- * IPv6 prefixes to block (CIDR-style). Any address starting with these
- * hex prefixes is rejected. Covers the full ULA range (fc00::/7 = fc00:: and fd00::).
+ * IPv6 prefixes to block (CIDR-style). ULA addresses cover fc00::/7 and
+ * link-local addresses cover fe80::/10.
  */
-const BLOCKED_IPV6_PREFIXES = ['fc', 'fd'];
+const BLOCKED_IPV6_PREFIXES = ['fc', 'fd', 'fe8', 'fe9', 'fea', 'feb'];
 
 /**
  * Check if an IPv6 address falls within a blocked prefix range.
- * Handles the full ULA range (fc00::/7), not just the exact literal fd00::.
+ * Handles the full ULA range (fc00::/7) and link-local range (fe80::/10),
+ * not just exact literals like fd00:: or fe80::1.
  * Only matches actual IPv6 addresses (must contain ':'), not hostnames
  * like fd.example.com or fcustomer.com.
  */
@@ -95,9 +96,7 @@ async function resolvesToBlockedIp(hostname: string): Promise<boolean> {
   const v6Check = resolve6(hostname).then(
     (addresses) => addresses.some(addr => {
       const normalized = addr.toLowerCase();
-      return BLOCKED_METADATA_HOSTS.has(normalized) || isBlockedIpv6(normalized) ||
-        // fe80::/10 is link-local — always block (covers all fe80:: addresses)
-        normalized.startsWith('fe80:');
+      return BLOCKED_METADATA_HOSTS.has(normalized) || isBlockedIpv6(normalized);
     }),
     () => false, // ENODATA / ENOTFOUND — no AAAA records, not a risk
   );
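
The body of `isBlockedIpv6` is not shown in this diff, so the following is only a sketch of the two CIDR ranges the new prefix list is meant to cover. `firstHextet` and `inBlockedRange` are hypothetical names, and the sketch uses bit masks on the first 16-bit group instead of string prefixes. When the first group is written with all four hex digits, the six prefixes `fc`, `fd`, `fe8`, `fe9`, `fea`, `feb` select the same addresses as these masks.

```typescript
// Hypothetical helpers, not the project's implementation.
// Returns the first 16-bit group of an IPv6 literal, or null for hostnames.
function firstHextet(addr: string): number | null {
  const bare = addr.toLowerCase().replace(/^\[/, "").replace(/\]$/, "");
  if (!bare.includes(":")) return null; // e.g. fd.example.com
  const head = bare.split(":")[0] ?? "";
  if (head === "") return 0; // a leading "::" means the first group is zero
  if (!/^[0-9a-f]{1,4}$/.test(head)) return null;
  return parseInt(head, 16);
}

function inBlockedRange(addr: string): boolean {
  const h = firstHextet(addr);
  if (h === null) return false;
  const isUla = (h & 0xfe00) === 0xfc00;       // fc00::/7  -> fc00..fdff
  const isLinkLocal = (h & 0xffc0) === 0xfe80; // fe80::/10 -> fe80..febf
  return isUla || isLinkLocal;
}

console.log(inBlockedRange("[fe80::2]"));      // true
console.log(inBlockedRange("febf::1"));        // true
console.log(inBlockedRange("fec0::1"));        // false
console.log(inBlockedRange("fd.example.com")); // false
```

Folding link-local into the shared prefix list, as the commit does, means direct literals such as `http://[fe80::2]/` and DNS-resolved addresses go through one check instead of two.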

browse/test/sidebar-tabs.test.ts

Lines changed: 12 additions & 0 deletions
@@ -254,3 +254,15 @@ describe('manifest: ws permission + xterm-safe CSP', () => {
     }
   });
 });
+
+describe('manifest: live tab awareness needs "tabs" permission', () => {
+  // Without "tabs", chrome.tabs.query() returns tab objects with undefined
+  // url/title for any site outside host_permissions (e.g., everything except
+  // 127.0.0.1). snapshotTabs() then writes empty strings into tabs.json and
+  // active-tab.json silently skips the write — the sidebar agent loses track
+  // of what page the user is on. activeTab is too narrow (only after a user
+  // gesture on the extension action) for background polling.
+  test('permissions includes "tabs"', () => {
+    expect(MANIFEST.permissions).toContain('tabs');
+  });
+});

browse/test/token-registry.test.ts

Lines changed: 33 additions & 0 deletions
@@ -28,6 +28,39 @@ describe('token-registry', () => {
       expect(info!.scopes).toEqual(['read', 'write', 'admin', 'meta', 'control']);
       expect(info!.rateLimit).toBe(0);
     });
+
+    // Regression: the previous fix did a JS string-length short-circuit before
+    // crypto.timingSafeEqual, but the buffers passed in are UTF-8. A multibyte
+    // input with matching string length but mismatched byte length would slip
+    // past the check and crash inside timingSafeEqual. Auth path must return
+    // false, not error.
+    it('returns false for a multibyte token whose string length matches but UTF-8 byte length differs', () => {
+      // 'root-token-for-tests' is 20 ASCII chars (20 bytes).
+      // 'é'.repeat(20) is 20 chars but 40 UTF-8 bytes.
+      const multibyte = 'é'.repeat(20);
+      expect(multibyte.length).toBe('root-token-for-tests'.length);
+      expect(Buffer.byteLength(multibyte, 'utf8')).not.toBe(
+        Buffer.byteLength('root-token-for-tests', 'utf8'),
+      );
+      expect(() => isRootToken(multibyte)).not.toThrow();
+      expect(isRootToken(multibyte)).toBe(false);
+    });
+
+    it('returns false for a token that differs only in length (same prefix)', () => {
+      expect(isRootToken('root-token-for-tests-extra')).toBe(false);
+      expect(isRootToken('root-token-for-test')).toBe(false);
+    });
+
+    it('returns false for a same-length token that differs only in the last byte', () => {
+      const expected = 'root-token-for-tests';
+      const wrong = expected.slice(0, -1) + (expected.endsWith('x') ? 'y' : 'x');
+      expect(wrong.length).toBe(expected.length);
+      expect(isRootToken(wrong)).toBe(false);
+    });
+
+    it('returns false for the empty string even when root is set', () => {
+      expect(isRootToken('')).toBe(false);
+    });
   });
 
   describe('createToken', () => {

browse/test/url-validation.test.ts

Lines changed: 4 additions & 0 deletions
@@ -99,6 +99,10 @@ describe('validateNavigationUrl', () => {
     await expect(validateNavigationUrl('http://[fc00::]/')).rejects.toThrow(/cloud metadata/i);
   });
 
+  it('blocks direct IPv6 link-local addresses', async () => {
+    await expect(validateNavigationUrl('http://[fe80::2]/')).rejects.toThrow(/cloud metadata/i);
+  });
+
   it('does not block hostnames starting with fd (e.g. fd.example.com)', async () => {
     await expect(validateNavigationUrl('https://fd.example.com/')).resolves.toBe('https://fd.example.com/');
   });

canary/SKILL.md

Lines changed: 21 additions & 0 deletions
@@ -316,6 +316,26 @@ Effort both-scales: when an option involves effort, label both human-team and CC
 
 Net line closes the tradeoff. Per-skill instructions may add stricter rules.
 
+12. **Non-ASCII characters — write directly, never \u-escape.** When any
+string field (question, option label, option description) contains
+Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit
+the literal UTF-8 characters in the JSON string. **Never escape them
+as `\uXXXX`.** Claude Code's tool parameter pipe is UTF-8 native
+and passes characters through unchanged. Manually escaping requires
+recalling each codepoint from training, which is unreliable for long
+CJK strings — the model regularly emits the wrong codepoint (e.g.
+writes `\u3103` thinking it is 管 U+7BA1, but `\u3103` is
+actually ㄃, so the user sees `管理工具` rendered as `㄃3用箱`).
+The trigger is long, multi-line questions with hundreds of CJK
+characters: that is exactly when reflexive escaping kicks in and
+exactly when miscoding is most damaging. Long ≠ escape. Keep
+characters literal.
+
+Wrong: `"question": "請選擇\uXXXX\uXXXX\uXXXX\uXXXX"`
+Right: `"question": "請選擇管理工具"`
+
+Only JSON-mandatory escapes remain allowed: `\n`, `\t`, `\"`, `\\`.
+
 ### Self-check before emitting
 
 Before calling AskUserQuestion, verify:
@@ -328,6 +348,7 @@ Before calling AskUserQuestion, verify:
 - [ ] Dual-scale effort labels on effort-bearing options (human / CC)
 - [ ] Net line closes the decision
 - [ ] You are calling the tool, not writing prose
+- [ ] Non-ASCII characters (CJK / accents) written directly, NOT \u-escaped
 
 
 ## Artifacts Sync (skill start)

codex/SKILL.md

Lines changed: 21 additions & 0 deletions
@@ -318,6 +318,26 @@ Effort both-scales: when an option involves effort, label both human-team and CC
 
 Net line closes the tradeoff. Per-skill instructions may add stricter rules.
 
+12. **Non-ASCII characters — write directly, never \u-escape.** When any
+string field (question, option label, option description) contains
+Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit
+the literal UTF-8 characters in the JSON string. **Never escape them
+as `\uXXXX`.** Claude Code's tool parameter pipe is UTF-8 native
+and passes characters through unchanged. Manually escaping requires
+recalling each codepoint from training, which is unreliable for long
+CJK strings — the model regularly emits the wrong codepoint (e.g.
+writes `\u3103` thinking it is 管 U+7BA1, but `\u3103` is
+actually ㄃, so the user sees `管理工具` rendered as `㄃3用箱`).
+The trigger is long, multi-line questions with hundreds of CJK
+characters: that is exactly when reflexive escaping kicks in and
+exactly when miscoding is most damaging. Long ≠ escape. Keep
+characters literal.
+
+Wrong: `"question": "請選擇\uXXXX\uXXXX\uXXXX\uXXXX"`
+Right: `"question": "請選擇管理工具"`
+
+Only JSON-mandatory escapes remain allowed: `\n`, `\t`, `\"`, `\\`.
+
 ### Self-check before emitting
 
 Before calling AskUserQuestion, verify:
@@ -330,6 +350,7 @@ Before calling AskUserQuestion, verify:
 - [ ] Dual-scale effort labels on effort-bearing options (human / CC)
 - [ ] Net line closes the decision
 - [ ] You are calling the tool, not writing prose
+- [ ] Non-ASCII characters (CJK / accents) written directly, NOT \u-escaped
 
 
 ## Artifacts Sync (skill start)

context-restore/SKILL.md

Lines changed: 21 additions & 0 deletions
@@ -320,6 +320,26 @@ Effort both-scales: when an option involves effort, label both human-team and CC
 
 Net line closes the tradeoff. Per-skill instructions may add stricter rules.
 
+12. **Non-ASCII characters — write directly, never \u-escape.** When any
+string field (question, option label, option description) contains
+Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit
+the literal UTF-8 characters in the JSON string. **Never escape them
+as `\uXXXX`.** Claude Code's tool parameter pipe is UTF-8 native
+and passes characters through unchanged. Manually escaping requires
+recalling each codepoint from training, which is unreliable for long
+CJK strings — the model regularly emits the wrong codepoint (e.g.
+writes `\u3103` thinking it is 管 U+7BA1, but `\u3103` is
+actually ㄃, so the user sees `管理工具` rendered as `㄃3用箱`).
+The trigger is long, multi-line questions with hundreds of CJK
+characters: that is exactly when reflexive escaping kicks in and
+exactly when miscoding is most damaging. Long ≠ escape. Keep
+characters literal.
+
+Wrong: `"question": "請選擇\uXXXX\uXXXX\uXXXX\uXXXX"`
+Right: `"question": "請選擇管理工具"`
+
+Only JSON-mandatory escapes remain allowed: `\n`, `\t`, `\"`, `\\`.
+
 ### Self-check before emitting
 
 Before calling AskUserQuestion, verify:
@@ -332,6 +352,7 @@ Before calling AskUserQuestion, verify:
 - [ ] Dual-scale effort labels on effort-bearing options (human / CC)
 - [ ] Net line closes the decision
 - [ ] You are calling the tool, not writing prose
+- [ ] Non-ASCII characters (CJK / accents) written directly, NOT \u-escaped
 
 
 ## Artifacts Sync (skill start)
