Commit 4b89406

test(domain-skills): cover #1369 classifier_score=0 quarantine + score>0 promote path
The pre-existing T6 test seeded skills via writeSkill (which defaults classifier_score to 0 until L4 is rewired) and then expected 3 uses to auto-promote. PR #1369 added `current.classifier_score > 0` to the gate specifically to block that path: a quarantined skill written under the influence of a poisoned page would otherwise auto-promote after three benign uses.

Updated test asserts both halves of the new contract:

- classifier_score=0 + 3 uses → stays quarantined (the security guarantee)
- classifier_score>0 + 3 more uses → promotes to active (unblock path)

Catches both regressions: the gate going away (would re-allow the bypass) and the unblock path breaking (would silently quarantine all skills forever once L4 is rewired).
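For readers who have not seen the gate from #1369, here is a minimal sketch of the contract the message describes. Only `classifier_score`, the `> 0` check, and the three-use threshold come from the commit text. The `SkillRow` shape, the `use_count` field, the `flagged` parameter, and the `nextState` name are hypothetical and may not match `src/domain-skills`.

```typescript
// Hypothetical row shape: only classifier_score is named in the commit message.
interface SkillRow {
  state: 'quarantined' | 'active';
  classifier_score: number;
  use_count: number; // assumed counter of unflagged uses recorded so far
}

// Threshold taken from the commit message ("3 uses").
const PROMOTE_AFTER_USES = 3;

// Returns the state a skill would be in after one more recorded use.
function nextState(current: SkillRow, flagged: boolean): SkillRow['state'] {
  if (current.state !== 'quarantined') return current.state;
  if (flagged) return 'quarantined'; // assumption: a flagged use never promotes

  const uses = current.use_count + 1;
  // The check added by #1369: an unscored skill (score 0) can never promote.
  const scored = current.classifier_score > 0;
  return uses >= PROMOTE_AFTER_USES && scored ? 'active' : 'quarantined';
}
```

Under these assumptions, a third unflagged use leaves a skill with `classifier_score: 0` quarantined, and the same call with any positive score returns `'active'`. These are the two halves the updated test pins down.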
1 parent 2944c31 commit 4b89406

1 file changed

Lines changed: 25 additions & 2 deletions

browse/test/domain-skills-e2e.test.ts

@@ -84,11 +84,34 @@ describe('$B domain-skill (E2E gate tier)', () => {
     expect(out).toContain('[quarantined] 127.0.0.1');
   });
 
-  test('readSkill returns null until the skill is promoted to active (T6)', async () => {
+  test('readSkill returns null while quarantined; classifier_score=0 blocks auto-promote (#1369)', async () => {
     const { readSkill, recordSkillUse } = await import('../src/domain-skills');
+    const jsonlPath = path.join(TMP_HOME, 'projects', 'e2e-test-slug', 'learnings.jsonl');
+
     // While quarantined, readSkill returns null
     expect(await readSkill('127.0.0.1', 'e2e-test-slug')).toBeNull();
-    // Three uses without flag triggers auto-promote
+
+    // Three uses without flag with classifier_score=0 (the default until L4 is
+    // rewired) MUST stay quarantined per #1369. The gate is load-bearing: a
+    // quarantined skill written under the influence of a poisoned page would
+    // otherwise auto-promote after three benign uses without the L4 body scan
+    // ever running.
+    await recordSkillUse('127.0.0.1', 'e2e-test-slug', false);
+    await recordSkillUse('127.0.0.1', 'e2e-test-slug', false);
+    await recordSkillUse('127.0.0.1', 'e2e-test-slug', false);
+    expect(await readSkill('127.0.0.1', 'e2e-test-slug')).toBeNull();
+
+    // Simulate L4 having scored the body (classifier_score > 0) by appending a
+    // new tombstone row with a non-zero score, then verify the next use
+    // promotes. This documents the unblock path the day L4 starts populating
+    // classifier_score for skill writes again.
+    const lines = (await fs.readFile(jsonlPath, 'utf8')).trim().split('\n').map((l) => JSON.parse(l));
+    const latest = lines.filter((r: any) => r.type === 'domain' && r.host === '127.0.0.1').pop();
+    expect(latest).toBeTruthy();
+    const scored = { ...latest, classifier_score: 0.05, version: latest.version + 1, updated_ts: new Date().toISOString() };
+    await fs.appendFile(jsonlPath, JSON.stringify(scored) + '\n');
+
+    // Now three uses promote
     await recordSkillUse('127.0.0.1', 'e2e-test-slug', false);
     await recordSkillUse('127.0.0.1', 'e2e-test-slug', false);
     await recordSkillUse('127.0.0.1', 'e2e-test-slug', false);
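
The test simulates an L4 score by appending a new row to `learnings.jsonl` instead of editing the existing one. That only works if the reader treats the last matching row as current. The sketch below shows that pattern on an in-memory string, as an illustration rather than the actual `src/domain-skills` code. `latestDomainRow`, `appendScored`, and the `DomainRow` type are hypothetical names, and last-row-wins resolution is an assumption inferred from how the test is written.

```typescript
// Hypothetical row type; the field names mirror those the test touches.
interface DomainRow {
  type: string;
  host: string;
  version: number;
  classifier_score: number;
}

// Assumed resolution rule: the last matching row in the log is the current one.
function latestDomainRow(jsonl: string, host: string): DomainRow | undefined {
  return jsonl
    .split('\n')
    .filter((line) => line.trim() !== '')
    .map((line) => JSON.parse(line) as DomainRow)
    .filter((row) => row.type === 'domain' && row.host === host)
    .pop();
}

// Appends a copy of the latest row with a new score and a bumped version,
// leaving every earlier row in place.
function appendScored(jsonl: string, host: string, score: number): string {
  const latest = latestDomainRow(jsonl, host);
  if (!latest) throw new Error(`no domain row for ${host}`);
  const scored: DomainRow = {
    ...latest,
    classifier_score: score,
    version: latest.version + 1,
  };
  return jsonl.trimEnd() + '\n' + JSON.stringify(scored) + '\n';
}
```

Appending keeps the earlier score-0 row in the log, so the history of the quarantine decision stays auditable. Bumping `version` gives a reader an explicit ordering signal in addition to file position.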
