Problem Summary
ClawBot/pi-coding-agent intermittently fails with:
LLM request rejected: messages.X.content.Y: unexpected tool_use_id found in tool_result blocks: toolu_XXXXX. Each tool_result block must have a corresponding tool_use block in the previous message.
This happens when tool calls fail mid-execution due to JSON serialization errors like "Bad control character in string literal" or "Bad escaped character in JSON".
Root Cause Analysis
The issue stems from three gaps in JSON sanitization where unpaired Unicode surrogates or control characters can cause JSON.stringify() to throw:
Gap 1: Tool Arguments Not Sanitized (PRIMARY CAUSE)
File: packages/pi-ai/src/providers/anthropic.ts (around line 473 in compiled JS)
// VULNERABLE - arguments not sanitized
blocks.push({
type: "tool_use",
id: block.id,
name: block.name,
input: block.arguments, // ❌ MISSING sanitizeSurrogates() call
});
Compare this to text content, which IS sanitized (lines 445, 457, and 463 in the compiled JS use sanitizeSurrogates()).
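The issue refers to sanitizeSurrogates() without showing its body. For readers unfamiliar with it, here is a minimal sketch of what such a helper might look like, assuming it replaces each unpaired UTF-16 surrogate with U+FFFD. The name replaceLoneSurrogates is hypothetical and the real helper in pi-ai may behave differently.

```typescript
// Hypothetical stand-in for sanitizeSurrogates(); not the actual pi-ai implementation.
// Matches a high surrogate not followed by a low surrogate, or a low surrogate
// not preceded by a high surrogate. Without the `u` flag the regex works on
// UTF-16 code units, which is what we want here.
const LONE_SURROGATE =
  /[\uD800-\uDBFF](?![\uDC00-\uDFFF])|(?<![\uD800-\uDBFF])[\uDC00-\uDFFF]/g;

function replaceLoneSurrogates(text: string): string {
  // Valid surrogate pairs (e.g. emoji) are left untouched.
  return text.replace(LONE_SURROGATE, "\uFFFD");
}
```

A string-level helper like this only covers plain strings, which is why tool arguments (a nested object) need a recursive wrapper around it.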
Gap 2: Session Manager Uses Raw JSON.stringify
File: packages/pi-coding-agent/src/core/session-manager.ts (around lines 505/510 in compiled JS)
// VULNERABLE - No error handling or custom replacer
appendFileSync(this.sessionFile, `${JSON.stringify(entry)}\n`);
If entry contains unpaired Unicode surrogates or control characters, this throws and the message is lost.
Gap 3: Tool Result Content Not Sanitized
File: ClawBot's dist/agents/tools/common.js (line 111)
export function jsonResult(payload) {
return {
content: [{
type: "text",
text: JSON.stringify(payload, null, 2), // ❌ No sanitization
}],
};
}
Failure Sequence
- Tool receives data with unpaired surrogates or control characters (common in file content, grep results, binary files)
- Tool arguments are parsed from streaming JSON but NOT sanitized before being sent back to the API
- Session manager calls JSON.stringify(entry), which throws
- Tool result message fails to persist to JSONL
- Tool call (assistant message) WAS already persisted → orphaned tool_use
- Next API call → Anthropic rejects: "unexpected tool_use_id"
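The pairing rule quoted in the error message can be checked locally before a request is sent. The sketch below is one way to do that, assuming a simplified message shape modelled on the tool_use / tool_result blocks named in the error. The types and the function name findUnpairedToolResults are hypothetical and are not part of ClawBot or pi-coding-agent.

```typescript
// Simplified, assumed shapes; the real session entries carry more fields.
type ContentBlock =
  | { type: "text"; text: string }
  | { type: "tool_use"; id: string; name: string; input: unknown }
  | { type: "tool_result"; tool_use_id: string; content: string };

interface Message {
  role: "user" | "assistant";
  content: ContentBlock[];
}

// Returns every tool_use_id that appears in a tool_result block without a
// matching tool_use block in the immediately preceding message.
function findUnpairedToolResults(messages: Message[]): string[] {
  const unpaired: string[] = [];
  messages.forEach((message, index) => {
    const previous = index > 0 ? messages[index - 1] : undefined;
    const knownIds = new Set<string>();
    for (const block of previous?.content ?? []) {
      if (block.type === "tool_use") knownIds.add(block.id);
    }
    for (const block of message.content) {
      if (block.type === "tool_result" && !knownIds.has(block.tool_use_id)) {
        unpaired.push(block.tool_use_id);
      }
    }
  });
  return unpaired;
}
```

Running a check like this when a session is loaded from JSONL would surface a broken history at the point of corruption rather than on the next LLM request.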
Suggested Fixes
Fix 1: Sanitize Tool Arguments in anthropic.ts
Add a recursive sanitization helper and apply it to tool arguments:
function sanitizeObject(obj: unknown): unknown {
if (obj === null || obj === undefined) return obj;
if (typeof obj === "string") return sanitizeSurrogates(obj);
if (Array.isArray(obj)) return obj.map(sanitizeObject);
if (typeof obj === "object") {
const result: Record<string, unknown> = {};
for (const key of Object.keys(obj)) {
result[key] = sanitizeObject((obj as Record<string, unknown>)[key]);
}
return result;
}
return obj;
}
// Then at line ~473:
input: sanitizeObject(block.arguments),
Fix 2: Add Safe JSON Stringify to Session Manager
function safeJsonStringify(obj: unknown): string {
try {
return JSON.stringify(obj, (_key, value) => {
if (typeof value === "string") {
return sanitizeSurrogates(value);
}
return value;
});
} catch (e) {
const errorMsg = e instanceof Error ? e.message : "unknown serialization error";
return JSON.stringify({ type: "serialization_error", error: errorMsg, timestamp: new Date().toISOString() });
}
}
Then replace all JSON.stringify(entry) calls with safeJsonStringify(entry).
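Gap 3 has no corresponding fix listed here. One possible shape, reusing the replacer idea from Fix 2, is sketched below. It assumes a string-level sanitizer equivalent to sanitizeSurrogates(); the stand-in replaceLoneSurrogates and the name safeJsonResult are hypothetical. The fallback branch is an assumption about what ClawBot would want to return when serialization fails.

```typescript
// Hypothetical stand-in for the existing sanitizeSurrogates() helper.
function replaceLoneSurrogates(text: string): string {
  return text.replace(
    /[\uD800-\uDBFF](?![\uDC00-\uDFFF])|(?<![\uD800-\uDBFF])[\uDC00-\uDFFF]/g,
    "\uFFFD",
  );
}

interface ToolResult {
  content: { type: "text"; text: string }[];
}

// Sketch of a jsonResult() variant that sanitizes string values and never throws.
function safeJsonResult(payload: unknown): ToolResult {
  let text: string;
  try {
    text = JSON.stringify(
      payload,
      // The replacer sees every value; only strings are rewritten.
      (_key, value) => (typeof value === "string" ? replaceLoneSurrogates(value) : value),
      2,
    );
  } catch (e) {
    // Still return a result block so the tool call keeps a matching tool result.
    const message = e instanceof Error ? e.message : "unknown serialization error";
    text = JSON.stringify({ type: "serialization_error", error: message }, null, 2);
  }
  return { content: [{ type: "text", text }] };
}
```

Note that a replacer function only rewrites values, so object keys pass through unchanged. The same is true of sanitizeObject in Fix 1.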
Environment
- Node.js v22.15.0
- ClawBot (latest from npm)
- macOS
Reproduction
- Use ClawBot to read a file containing binary data or unpaired Unicode surrogates
- The tool result will contain problematic characters
- Session persistence fails silently
- Next LLM request fails with "unexpected tool_use_id"
Workaround
I've applied local patches to the installed npm packages, but these will be lost on update. Happy to submit a PR if you'd prefer.