-
Notifications
You must be signed in to change notification settings - Fork 1.8k
Claude extended thinking 400 error caused by summarizeVirtualTools gpt-4o-mini responses in conversation history #4919
Description
Bug Description
When using Claude models (Opus 4.6 / Sonnet 4.6) with extended thinking enabled, pipeline-style workflows that span multiple tool calls eventually fail with an API 400 error. The root cause is summarizeVirtualTools responses generated by gpt-4o-mini being included in the conversation history sent to Claude's API.
Error Message
Request Failed: 400 {"message":"messages.9.content.0.type: Expected thinking or redacted_thinking,
but found text. When thinking is enabled, a final assistant message must start with a thinking block
(preceeding the lastmost set of tool_use and tool_result blocks). We recommend you include thinking
blocks from previous turns. To avoid this requirement, disable thinking."}
Reproduction Steps
- Open a workspace with agent
.mdfiles configured for Claude Opus 4.6 / Sonnet 4.6 - Have deferred tool groups active (e.g., SonarQube, Python environment tools)
- Start a multi-step agent workflow that uses
runSubagentmultiple times - Observe that
summarizeVirtualToolscalls usegpt-4o-miniat session start - After ~8-10 conversation turns (including subagent invocations and user checkpoints), the next Claude API call fails with the 400 error above
Root Cause Analysis
Claude's extended thinking API requires all assistant messages in the conversation history to begin with a thinking or redacted_thinking content block. The summarizeVirtualTools mechanism uses gpt-4o-mini-2024-07-18 to generate tool group summaries. These GPT-4o-mini responses are stored as assistant messages without thinking blocks. When VS Code later constructs the full messages[] array for a Claude API call, these non-thinking assistant messages violate the API contract.
From debug logs, the sequence is:
[20:57:31.429Z] MODEL_TURN: summarizeVirtualTools | model: gpt-4o-mini-2024-07-18 <- no thinking block
[20:57:31.450Z] MODEL_TURN: summarizeVirtualTools | model: gpt-4o-mini-2024-07-18 <- no thinking block
[20:57:34.657Z] MODEL_TURN: panel/editAgent | model: claude-sonnet-4 <- has thinking
...
[later] 400 error at messages.9 <- mixed-model history causes validation failure
The error manifests non-deterministically after enough conversation turns accumulate, because Claude's validation depends on the specific message index positions.
Suggested Fixes
- Inject
redacted_thinkingstubs into non-Claude assistant messages before sending them to Claude's API - Strip
summarizeVirtualToolsresponses from the conversation history after the first turn (they are only needed once for tool discovery) - Use the same model (Claude) for
summarizeVirtualToolswhen the primary session model is Claude
Environment
- VS Code: Latest Insiders (April 2026)
- Copilot Chat Extension: Latest
- Primary Model: Claude Opus 4.6 / Claude Sonnet 4.6
- OS: macOS
Workaround
Using shorter sessions (fewer conversation turns) reduces the probability of hitting the message index threshold. Retrying also works since the fresh conversation lacks the stale gpt-4o-mini messages.
Copilot Request ID
a8e136b1-cb39-47e8-9a84-5031c536c723
GH Request ID
B1D2:22790A:1D09BD5:20104C6:69CD8C28