Commit 6b775ea

backnotprop and claude authored
feat: /plannotator-last — annotate the last agent message (#325)
* feat: add /plannotator-last command to annotate last assistant message

Adds a new slash command that extracts the last rendered assistant message from Claude Code's session log and opens it in the annotation UI.

Session log parser (apps/hook/server/session-log.ts):
- Parses Claude Code JSONL logs at ~/.claude/projects/{slug}/*.jsonl
- Finds the last assistant message.id with text content blocks
- Skips noise entries (progress, system, file-history-snapshot, queue-operation)
- Filters system-generated user messages by prefix to avoid false turn boundaries
- Walks backward through empty turns when back-to-back user messages exist
- No anchoring — reads from end of log since <command-message> isn't written until after the binary completes

New files:
- apps/hook/commands/plannotator-last.md — slash command definition
- apps/hook/server/session-log.ts — Claude-Code-specific log parser
- apps/hook/server/session-log.test.ts — 30 tests covering streaming chunks, tool call turns, sub-agent noise, stop hooks, thinking blocks, and edge cases

Modified:
- apps/hook/server/index.ts — annotate-last subcommand

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: remove 3 redundant real-world scenario tests

These duplicated coverage already provided by focused unit tests:
- "full conversation" → covered by "grabs last message.id in multi-tool turn"
- "stop hook interrupted" → covered by "skips progress and system noise"
- "long tool-only sequence" → covered by "skips tool-only assistant entries"

Kept the thinking block test (unique coverage). 27 tests remain.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add /plannotator-last command to Pi extension

Uses Pi's session manager API to find the last assistant message — walks backward through ctx.sessionManager.getEntries(), finds the last entry with role "assistant" and text content, opens it in the annotation UI.

Reuses existing isAssistantMessage(), getTextContent(), startAnnotateServer(), and runBrowserReview() from the extension.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add /plannotator-last to OpenCode plugin + extract command handlers

Adds annotate-last command that fetches session messages via client.session.messages(), finds the last assistant message with text parts, and opens it in the annotation UI.

Refactors command handling: extracts review, annotate, and annotate-last handlers from the inline event hook into commands.ts module. Reduces index.ts by ~120 lines and makes adding future commands cleaner.

New files:
- apps/opencode-plugin/commands.ts — extracted command handlers
- apps/opencode-plugin/commands/plannotator-last.md — command metadata

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: context-aware UI labels for annotate-last mode

Adds "annotate-last" mode to the annotate server, passed through to the UI via /api/plan response. The editor uses this to show "Copy message" instead of "Copy plan", and "annotations on the message" in the completion overlay.

- packages/server/annotate.ts: new `mode` option on AnnotateServerOptions
- packages/editor/App.tsx: annotateSource state derived from mode
- packages/ui/components/Viewer.tsx: copyLabel prop for button text
- All three harnesses pass mode: "annotate-last" in their callers

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add Codex support to annotate-last command

Detects Codex via CODEX_THREAD_ID env var (injected by Codex into every spawned process). Uses the thread ID to find the rollout file in ~/.codex/sessions/, parses the Codex rollout JSONL format to extract the last assistant message.

Also adds `plannotator last` alias for shorter usage in Codex bang commands (!plannotator last).

New files:
- apps/hook/server/codex-session.ts — Codex rollout parser
- apps/hook/server/codex-session.test.ts — 9 tests

Modified:
- apps/hook/server/index.ts — Codex detection + `last` alias

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: context-aware feedback title + top spacing for paragraph-first content

- exportAnnotations now accepts a title param: "Message Feedback" for annotate-last, "File Feedback" for file annotation, "Plan Feedback" for plan review (default)
- Adds top spacer when content starts with a paragraph (not a heading) and has no frontmatter, fixing tight spacing in annotate-last mode

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: add sandbox scripts for Pi and Codex testing

- sandbox-pi.sh: builds extension, creates temp project, installs via `pi install`, launches Pi with sample files
- sandbox-codex.sh: compiles binary, creates temp project, launches Codex. Test with `!plannotator last`

Both follow the same pattern as sandbox-opencode.sh.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: add hook build step to opencode sandbox script

The opencode build copies HTML from hook/dist/ — without building hook first, the sandbox could use stale HTML. Pi and Codex sandboxes already had this step.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: remove command body from plannotator-last to prevent agent response

The .md body was being sent to the agent as a prompt, causing it to respond with "Opening annotation UI..." before the event handler could fetch messages. That response became the "last message" instead of the actual one. Empty body = agent stays silent, event handler intercepts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: use command.execute.before hook for OpenCode annotate-last

Moves plannotator-last from the passive event hook to the command.execute.before hook. This intercepts the command before the agent sees it, clears output.parts so the agent stays silent, fetches session messages, opens the annotation UI, then sends feedback via client.session.prompt() — same pattern as review/annotate.

Previously the agent would respond to the command body before the event handler could fetch messages, polluting the session history.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: add Codex to origin type and agent name mapping

Origin "codex" was falling through to the default "Coding Agent" label. Added "codex" to the origin union type across annotate server, editor, and removed the `as any` cast in the hook.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: remote share link, plan-specific prose, and codex type unions

- Add writeRemoteShareLink to annotate-last onReady callback so remote sessions get a reachable URL
- Add subject parameter to exportAnnotations so feedback says "message" or "file" instead of "plan" when appropriate
- Add 'codex' to origin type unions in useAgents, Settings, UpdateBanner, and App.tsx fetch handler

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: correct JSDoc for projectSlugFromCwd (leading dash is kept, not stripped)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor: use RenderedMessage type instead of inline structural type

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent b483214 commit 6b775ea
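
The Claude Code parser described in the commit message (apps/hook/server/session-log.ts) is not shown on this page. The following is only a rough sketch of the "last assistant message.id with text content blocks" idea. The `LogEntry` shape and the names `parseLines`, `textOf`, and `lastAssistantText` are assumptions made for illustration, not the project's actual types or API. The sketch deliberately omits the turn-boundary handling the commit mentions (prefix filtering of system-generated user messages and walking back through empty turns).

```typescript
// Assumed (hypothetical) shape of one JSONL log line; the real types may differ.
interface LogEntry {
  type: string;
  message?: {
    id?: string;
    content?: unknown;
  };
}

// Noise entry types listed in the commit message.
const NOISE_TYPES = new Set([
  "progress",
  "system",
  "file-history-snapshot",
  "queue-operation",
]);

// Parse JSONL, silently dropping blank lines and lines that are not JSON objects.
function parseLines(jsonl: string): LogEntry[] {
  const entries: LogEntry[] = [];
  for (const line of jsonl.split("\n")) {
    if (!line.trim()) continue;
    try {
      const parsed: unknown = JSON.parse(line);
      if (parsed && typeof parsed === "object") entries.push(parsed as LogEntry);
    } catch {
      // Malformed line: skip it.
    }
  }
  return entries;
}

// Non-empty text blocks of one entry (assumes blocks look like { type: "text", text }).
function textOf(entry: LogEntry): string[] {
  const blocks = entry.message?.content;
  if (!Array.isArray(blocks)) return [];
  const out: string[] = [];
  for (const b of blocks) {
    if (b && b.type === "text" && typeof b.text === "string" && b.text.trim() !== "") {
      out.push(b.text);
    }
  }
  return out;
}

function lastAssistantText(jsonl: string): string | null {
  const entries = parseLines(jsonl).filter((e) => !NOISE_TYPES.has(e.type));

  // Walk backward to the newest assistant entry that carries text.
  let targetId: string | undefined;
  for (const e of [...entries].reverse()) {
    if (e.type === "assistant" && e.message?.id && textOf(e).length > 0) {
      targetId = e.message.id;
      break;
    }
  }
  if (!targetId) return null;

  // Streaming may split one message across entries; gather every chunk with that id.
  const parts: string[] = [];
  for (const e of entries) {
    if (e.type === "assistant" && e.message?.id === targetId) parts.push(...textOf(e));
  }
  return parts.join("\n");
}
```

Keying on the message id rather than on the final log line is what lets tool-only chunks and trailing noise entries be ignored without losing earlier text chunks of the same message.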

20 files changed

Lines changed: 1921 additions & 162 deletions

Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
+---
+description: Annotate the last rendered assistant message
+allowed-tools: Bash(plannotator:*)
+---
+
+## Message Annotations
+
+!`plannotator annotate-last`
+
+## Your task
+
+Address the annotation feedback above. The user has reviewed your last message and provided specific annotations and comments.
Lines changed: 245 additions & 0 deletions
@@ -0,0 +1,245 @@
+/**
+ * Codex Session Parser Tests
+ *
+ * Run: bun test apps/hook/server/codex-session.test.ts
+ *
+ * Uses synthetic JSONL fixtures matching the real Codex rollout format.
+ */
+
+import { describe, expect, test, beforeEach, afterEach } from "bun:test";
+import { mkdtempSync, mkdirSync, writeFileSync, rmSync } from "node:fs";
+import { tmpdir } from "node:os";
+import { join } from "node:path";
+import { getLastCodexMessage } from "./codex-session";
+
+// --- Fixture Helpers ---
+
+function rolloutLine(type: string, payload: Record<string, unknown>): string {
+  return JSON.stringify({
+    timestamp: new Date().toISOString(),
+    type,
+    payload,
+  });
+}
+
+function assistantMessage(text: string): string {
+  return rolloutLine("response_item", {
+    type: "message",
+    role: "assistant",
+    content: [{ type: "output_text", text }],
+  });
+}
+
+function userMessage(text: string): string {
+  return rolloutLine("response_item", {
+    type: "message",
+    role: "user",
+    content: [{ type: "input_text", text }],
+  });
+}
+
+function developerMessage(text: string): string {
+  return rolloutLine("response_item", {
+    type: "message",
+    role: "developer",
+    content: [{ type: "input_text", text }],
+  });
+}
+
+function functionCall(name: string, args: string): string {
+  return rolloutLine("response_item", {
+    type: "function_call",
+    name,
+    arguments: args,
+    call_id: `call_${crypto.randomUUID().slice(0, 12)}`,
+  });
+}
+
+function functionOutput(callId: string, output: string): string {
+  return rolloutLine("response_item", {
+    type: "function_call_output",
+    call_id: callId,
+    output,
+  });
+}
+
+function sessionMeta(): string {
+  return rolloutLine("session_meta", {
+    id: crypto.randomUUID(),
+    cwd: "/tmp/test",
+    model_provider: "openai",
+  });
+}
+
+function turnContext(): string {
+  return rolloutLine("turn_context", {
+    cwd: "/tmp/test",
+    model: "o3",
+  });
+}
+
+function eventMsg(type: string): string {
+  return JSON.stringify({
+    timestamp: new Date().toISOString(),
+    type: "event_msg",
+    payload: { type },
+  });
+}
+
+function buildRollout(...lines: string[]): string {
+  return lines.join("\n");
+}
+
+// --- Temp file helpers ---
+
+let tempFiles: string[] = [];
+
+function writeTempRollout(content: string): string {
+  const dir = mkdtempSync(join(tmpdir(), "plannotator-codex-test-"));
+  const path = join(dir, "rollout.jsonl");
+  writeFileSync(path, content);
+  tempFiles.push(dir);
+  return path;
+}
+
+afterEach(() => {
+  for (const dir of tempFiles.splice(0)) {
+    rmSync(dir, { recursive: true, force: true });
+  }
+});
+
+// --- Tests ---
+
+describe("getLastCodexMessage", () => {
+  test("finds last assistant message", () => {
+    const path = writeTempRollout(
+      buildRollout(
+        sessionMeta(),
+        userMessage("Hello"),
+        assistantMessage("Hi there!"),
+        userMessage("Thanks"),
+        assistantMessage("You're welcome.")
+      )
+    );
+    const result = getLastCodexMessage(path);
+    expect(result).not.toBeNull();
+    expect(result!.text).toBe("You're welcome.");
+  });
+
+  test("skips function_call entries", () => {
+    const path = writeTempRollout(
+      buildRollout(
+        sessionMeta(),
+        userMessage("Fix the bug"),
+        assistantMessage("Let me look into that."),
+        functionCall("exec_command", '{"cmd":"ls"}'),
+        functionOutput("call_123", "file1.ts\nfile2.ts"),
+        assistantMessage("Found the issue.")
+      )
+    );
+    const result = getLastCodexMessage(path);
+    expect(result).not.toBeNull();
+    expect(result!.text).toBe("Found the issue.");
+  });
+
+  test("skips developer and user messages", () => {
+    const path = writeTempRollout(
+      buildRollout(
+        sessionMeta(),
+        developerMessage("System instructions..."),
+        userMessage("Do something"),
+        assistantMessage("The actual response"),
+        developerMessage("More instructions"),
+        userMessage("Another user message")
+      )
+    );
+    const result = getLastCodexMessage(path);
+    expect(result).not.toBeNull();
+    expect(result!.text).toBe("The actual response");
+  });
+
+  test("extracts multiple output_text blocks", () => {
+    const path = writeTempRollout(
+      buildRollout(
+        sessionMeta(),
+        rolloutLine("response_item", {
+          type: "message",
+          role: "assistant",
+          content: [
+            { type: "output_text", text: "First part." },
+            { type: "output_text", text: "Second part." },
+          ],
+        })
+      )
+    );
+    const result = getLastCodexMessage(path);
+    expect(result).not.toBeNull();
+    expect(result!.text).toBe("First part.\nSecond part.");
+  });
+
+  test("skips event_msg and turn_context entries", () => {
+    const path = writeTempRollout(
+      buildRollout(
+        sessionMeta(),
+        turnContext(),
+        userMessage("Hello"),
+        assistantMessage("Response here"),
+        eventMsg("task_started"),
+        turnContext(),
+        eventMsg("token_count")
+      )
+    );
+    const result = getLastCodexMessage(path);
+    expect(result).not.toBeNull();
+    expect(result!.text).toBe("Response here");
+  });
+
+  test("skips assistant messages with empty text", () => {
+    const path = writeTempRollout(
+      buildRollout(
+        sessionMeta(),
+        assistantMessage("Good response"),
+        rolloutLine("response_item", {
+          type: "message",
+          role: "assistant",
+          content: [{ type: "output_text", text: " " }],
+        })
+      )
+    );
+    const result = getLastCodexMessage(path);
+    expect(result).not.toBeNull();
+    expect(result!.text).toBe("Good response");
+  });
+
+  test("returns null when no assistant messages exist", () => {
+    const path = writeTempRollout(
+      buildRollout(
+        sessionMeta(),
+        developerMessage("Instructions"),
+        userMessage("Hello"),
+        functionCall("exec_command", '{"cmd":"pwd"}')
+      )
+    );
+    const result = getLastCodexMessage(path);
+    expect(result).toBeNull();
+  });
+
+  test("returns null for empty file", () => {
+    const path = writeTempRollout("");
+    const result = getLastCodexMessage(path);
+    expect(result).toBeNull();
+  });
+
+  test("skips malformed JSON lines", () => {
+    const path = writeTempRollout(
+      buildRollout(
+        assistantMessage("Valid message"),
+        "not valid json",
+        "{broken"
+      )
+    );
+    const result = getLastCodexMessage(path);
+    expect(result).not.toBeNull();
+    expect(result!.text).toBe("Valid message");
+  });
+});

apps/hook/server/codex-session.ts

Lines changed: 129 additions & 0 deletions
@@ -0,0 +1,129 @@
+/**
+ * Codex Session Parser
+ *
+ * Extracts the last rendered assistant message from a Codex rollout file.
+ * Codex stores sessions at ~/.codex/sessions/YYYY/MM/DD/rollout-<timestamp>-<uuid>.jsonl
+ *
+ * Detection: Codex injects CODEX_THREAD_ID into every spawned process.
+ * The thread ID is the UUID in the rollout filename.
+ *
+ * Rollout format (JSONL, one object per line):
+ *   {"timestamp":"...","type":"response_item","payload":{"type":"message","role":"assistant","content":[{"type":"output_text","text":"..."}]}}
+ *   {"timestamp":"...","type":"response_item","payload":{"type":"function_call","name":"exec_command","arguments":"...","call_id":"..."}}
+ */
+
+import { readFileSync, readdirSync, statSync } from "node:fs";
+import { join } from "node:path";
+import { homedir } from "node:os";
+
+// --- Types ---
+
+interface RolloutEntry {
+  timestamp?: string;
+  type: string;
+  payload?: {
+    type?: string;
+    role?: string;
+    content?: { type: string; text?: string }[];
+    [key: string]: unknown;
+  };
+}
+
+// --- Rollout File Discovery ---
+
+/**
+ * Find the Codex rollout file for a given thread ID.
+ * The thread ID is the UUID portion of the filename:
+ *   rollout-<timestamp>-<uuid>.jsonl
+ *
+ * Scans ~/.codex/sessions/ directory tree for a matching file.
+ */
+export function findCodexRolloutByThreadId(threadId: string): string | null {
+  const sessionsDir = join(homedir(), ".codex", "sessions");
+
+  try {
+    // Walk YYYY/MM/DD directories in reverse order (most recent first)
+    const years = readdirSync(sessionsDir).sort().reverse();
+    for (const year of years) {
+      const yearDir = join(sessionsDir, year);
+      if (!isDir(yearDir)) continue;
+
+      const months = readdirSync(yearDir).sort().reverse();
+      for (const month of months) {
+        const monthDir = join(yearDir, month);
+        if (!isDir(monthDir)) continue;
+
+        const days = readdirSync(monthDir).sort().reverse();
+        for (const day of days) {
+          const dayDir = join(monthDir, day);
+          if (!isDir(dayDir)) continue;
+
+          const files = readdirSync(dayDir);
+          for (const file of files) {
+            if (file.endsWith(".jsonl") && file.includes(threadId)) {
+              return join(dayDir, file);
+            }
+          }
+        }
+      }
+    }
+  } catch {
+    return null;
+  }
+
+  return null;
+}
+
+function isDir(path: string): boolean {
+  try {
+    return statSync(path).isDirectory();
+  } catch {
+    return false;
+  }
+}
+
+// --- Message Extraction ---
+
+/**
+ * Extract the last assistant message from a Codex rollout file.
+ *
+ * Walks backward through the JSONL, finds the last entry where:
+ *   type === "response_item"
+ *   payload.type === "message"
+ *   payload.role === "assistant"
+ *
+ * Extracts output_text blocks from payload.content.
+ */
+export function getLastCodexMessage(
+  rolloutPath: string
+): { text: string } | null {
+  const content = readFileSync(rolloutPath, "utf-8");
+  const lines = content.trim().split("\n");
+
+  // Walk backward
+  for (let i = lines.length - 1; i >= 0; i--) {
+    let entry: RolloutEntry;
+    try {
+      entry = JSON.parse(lines[i]);
+    } catch {
+      continue;
+    }
+
+    if (entry.type !== "response_item") continue;
+    if (entry.payload?.type !== "message") continue;
+    if (entry.payload?.role !== "assistant") continue;
+
+    const contentBlocks = entry.payload?.content;
+    if (!Array.isArray(contentBlocks)) continue;
+
+    const textParts = contentBlocks
+      .filter((b) => b.type === "output_text" && b.text?.trim())
+      .map((b) => b.text!);
+
+    if (textParts.length === 0) continue;
+
+    return { text: textParts.join("\n") };
+  }
+
+  return null;
+}
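
The diff for apps/hook/server/index.ts is not shown here, so the way the two exports above are wired to `CODEX_THREAD_ID` can only be guessed at. As a self-contained sketch of that flow, the code below uses compact stand-ins: `findRollout`, `lastAssistantText`, `lastMessageForThread`, and `demo` are hypothetical names, the sessions root is passed as a parameter so the example can run against a temp directory, and the timestamp portion of the filename is made up.

```typescript
import { mkdtempSync, mkdirSync, writeFileSync, readFileSync, readdirSync, rmSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Stand-in for findCodexRolloutByThreadId; takes the root as a parameter (assumption).
function findRollout(root: string, threadId: string): string | null {
  if (!threadId) return null; // an empty ID would match every filename
  let names: string[];
  try {
    names = readdirSync(root).sort().reverse(); // newest YYYY/MM/DD first
  } catch {
    return null; // missing directory, or `root` is not a directory
  }
  for (const name of names) {
    const full = join(root, name);
    if (name.endsWith(".jsonl")) {
      if (name.includes(threadId)) return full;
      continue;
    }
    const hit = findRollout(full, threadId);
    if (hit) return hit;
  }
  return null;
}

// Stand-in for getLastCodexMessage, reading the same rollout format.
function lastAssistantText(rolloutPath: string): string | null {
  const lines = readFileSync(rolloutPath, "utf-8").trim().split("\n");
  for (const line of [...lines].reverse()) {
    let entry: any;
    try {
      entry = JSON.parse(line);
    } catch {
      continue; // skip malformed lines
    }
    const p = entry?.payload;
    if (entry?.type !== "response_item" || p?.type !== "message" || p?.role !== "assistant") continue;
    if (!Array.isArray(p.content)) continue;
    const parts: string[] = p.content
      .filter((b: any) => b?.type === "output_text" && typeof b.text === "string" && b.text.trim() !== "")
      .map((b: any) => b.text as string);
    if (parts.length > 0) return parts.join("\n");
  }
  return null;
}

// Hypothetical wiring: resolve the last message for the current Codex thread.
function lastMessageForThread(sessionsRoot: string, threadId: string | undefined): string | null {
  if (!threadId) return null; // not running under Codex
  const path = findRollout(sessionsRoot, threadId);
  return path ? lastAssistantText(path) : null;
}

// Builds a throwaway sessions tree, runs the lookup, and cleans up.
function demo(): string | null {
  const root = mkdtempSync(join(tmpdir(), "codex-sketch-"));
  try {
    const dayDir = join(root, "2025", "01", "15");
    mkdirSync(dayDir, { recursive: true });
    const threadId = "123e4567-e89b-12d3-a456-426614174000";
    const line = (payload: object) =>
      JSON.stringify({ timestamp: new Date().toISOString(), type: "response_item", payload });
    writeFileSync(
      join(dayDir, `rollout-2025-01-15T10-00-00-${threadId}.jsonl`),
      [
        line({ type: "message", role: "user", content: [{ type: "input_text", text: "Fix the bug" }] }),
        line({ type: "message", role: "assistant", content: [{ type: "output_text", text: "Found the issue." }] }),
        line({ type: "function_call", name: "exec_command", arguments: "{}", call_id: "call_1" }),
      ].join("\n")
    );
    return lastMessageForThread(root, threadId);
  } finally {
    rmSync(root, { recursive: true, force: true });
  }
}
```

In the hook itself the thread ID would presumably come from `process.env.CODEX_THREAD_ID` and the root from `~/.codex/sessions`. The empty-ID guard is an addition in this sketch: `String.prototype.includes("")` is always true, so an unset or empty ID would otherwise match the first `.jsonl` file found.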
