docs(blog): fix Opus 4.6/4.7/4.8 thinking mode and effort guidance #255

Open
mateo-berri wants to merge 1 commit into main from litellm_docs_opus_thinking_effort_accuracy

@mateo-berri (Collaborator) commented May 29, 2026

Summary

The day-0 blog posts for Claude Opus 4.6, 4.7, and 4.8 carried a copy-pasted "Adaptive Thinking" note that told readers to pass the native thinking parameter directly to use explicit thinking budgets with type: "enabled". That guidance is wrong for Opus 4.7 and 4.8.

Per Anthropic's adaptive thinking and effort docs, adaptive thinking is the only supported mode on Opus 4.7 and 4.8, and a manual thinking: {type: "enabled", budget_tokens: N} is rejected with a 400 error. LiteLLM forwards the native thinking parameter unchanged on both /chat/completions and the /v1/messages passthrough, so following the old instructions just produces that 400. On Opus 4.6 the explicit budget still works but is deprecated and no longer recommended. Each note now states the actual behavior and points readers at output_config.effort, paired with adaptive thinking, as the way to control thinking depth.
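The two request shapes can be sketched as plain Python dicts. This is a minimal illustration rather than LiteLLM or Anthropic code: the helper names, the short version strings, and the example budget value are hypothetical, and the rejection rule only restates the behavior described above.

```python
# Hypothetical sketch: models the behavior described in this PR, not real client code.

# Per this PR, Opus 4.7 and 4.8 accept adaptive thinking only.
ADAPTIVE_ONLY_VERSIONS = {"4.7", "4.8"}


def recommended_params(effort: str) -> dict:
    """Adaptive thinking paired with output_config.effort (the recommended path)."""
    return {
        "thinking": {"type": "adaptive"},
        "output_config": {"effort": effort},
    }


def legacy_params(budget_tokens: int) -> dict:
    """Explicit thinking budget: deprecated on Opus 4.6, rejected on 4.7 and 4.8."""
    return {"thinking": {"type": "enabled", "budget_tokens": budget_tokens}}


def would_be_rejected(opus_version: str, params: dict) -> bool:
    """Return True if the request would get a 400 under the rules stated above."""
    manual_budget = params.get("thinking", {}).get("type") == "enabled"
    return manual_budget and opus_version in ADAPTIVE_ONLY_VERSIONS
```

Because LiteLLM forwards `thinking` unchanged, a gateway sitting in front of the API would not alter the outcome of either shape.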

Effort ladder correction

While verifying the above, I found that the 4.7 post claimed max effort was "Claude Opus 4.6 only and is not available on 4.7." Anthropic's effort docs list max as available on Opus 4.6, 4.7, and 4.8, and LiteLLM's REASONING_EFFORT_TO_OUTPUT_CONFIG_EFFORT maps max to max with no per-model gating. The 4.7 effort section therefore now lists all five levels (low, medium, high, xhigh, max), the 4.8 "max (previously Opus 4.6 only)" parenthetical is corrected to "also available on Opus 4.6 and 4.7," and the reasoning_effort enumeration in each Adaptive Thinking note now includes max so that it matches that model's own effort section. The 4.6 note gains max but not xhigh, since 4.6 never supported xhigh.
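The corrected ladders can be summarized as a small lookup table. This is a hypothetical summary for illustration, not a structure taken from LiteLLM or the blog posts. The 4.7 entry follows the five levels listed above; the 4.6 entry assumes low, medium, and high alongside the max it gains; the 4.8 entry assumes the same five levels as 4.7, since this PR only states explicitly that 4.8 supports max.

```python
# Hypothetical lookup table summarizing the effort ladders described in this PR.
EFFORT_LEVELS = {
    "4.6": ("low", "medium", "high", "max"),           # no xhigh on 4.6
    "4.7": ("low", "medium", "high", "xhigh", "max"),  # all five levels
    "4.8": ("low", "medium", "high", "xhigh", "max"),  # assumption: same as 4.7
}


def supports_effort(opus_version: str, effort: str) -> bool:
    """Return True if the given effort level appears in that version's ladder."""
    return effort in EFFORT_LEVELS.get(opus_version, ())
```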

Verification

Cross-checked against Anthropic's official docs (the adaptive thinking and effort pages) and the LiteLLM source. _map_reasoning_effort short-circuits every effort value to thinking: {type: "adaptive"} for adaptive-thinking models, the thinking parameter is passed through verbatim, and the effort mapping applies no per-model max restriction. These are prose-only changes to three blog markdown files, with no code or behavior changes.
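The three verified behaviors can be paraphrased in a few lines of Python. This is a hypothetical re-creation for illustration and is not the LiteLLM source: the function names and the table are invented here, only the max to max entry is confirmed by the text above, and the other identity entries are assumptions.

```python
# Hypothetical paraphrase of the behavior described above; NOT the LiteLLM source.

# Only "max" -> "max" is confirmed by this PR; the other entries are assumed.
EFFORT_TO_OUTPUT_CONFIG_EFFORT = {
    "low": "low",
    "medium": "medium",
    "high": "high",
    "xhigh": "xhigh",
    "max": "max",
}


def map_reasoning_effort_sketch(reasoning_effort: str) -> dict:
    """Every effort value yields adaptive thinking; no model-specific check is made."""
    return {
        "thinking": {"type": "adaptive"},
        "output_config": {"effort": EFFORT_TO_OUTPUT_CONFIG_EFFORT[reasoning_effort]},
    }


def forward_thinking_sketch(thinking: dict) -> dict:
    """The native thinking parameter is passed through without modification."""
    return thinking
```

Since nothing in this path inspects the model, a manual budget sent to Opus 4.7 or 4.8 reaches the API as written, which is where the 400 originates.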

Type

Documentation


Note

Low Risk
Markdown-only documentation corrections with no code or configuration changes.

Overview
Corrects Adaptive Thinking and Effort Levels guidance in the Opus 4.6, 4.7, and 4.8 day-0 blog posts so it matches Anthropic’s current API behavior and LiteLLM’s passthrough.

The shared note no longer tells readers to use native thinking: {type: "enabled", budget_tokens: ...} as the primary path. Opus 4.7 and 4.8 are documented as adaptive-only (explicit budgets → 400); depth should be tuned with output_config.effort alongside adaptive thinking. Opus 4.6 keeps explicit budgets as still accepted but deprecated, with the same effort-based recommendation.

Effort ladder fixes: reasoning_effort lists now include max where appropriate (4.6 adds max, not xhigh; 4.7/4.8 include max). 4.7 drops the wrong claim that max is 4.6-only and documents five levels including max, with updated guidance on xhigh vs max. 4.8 fixes the parenthetical that max was “previously Opus 4.6 only.”

Prose-only; no runtime or gateway behavior changes.

Reviewed by Cursor Bugbot for commit c1d5fb6. Bugbot is set up for automated code reviews on this repo.

The Adaptive Thinking notes told readers to pass the native thinking parameter directly to get explicit thinking budgets. That is wrong on Opus 4.7 and 4.8, where the Anthropic API rejects thinking {type: enabled, budget_tokens} with a 400 and adaptive is the only supported mode. On Opus 4.6 it still works but is deprecated. Each note now says so and points to output_config.effort for controlling thinking depth.

The 4.7 post also claimed max effort was Opus 4.6 only and unavailable on 4.7; max is in fact supported on 4.6, 4.7, and 4.8. The 4.7 effort section now lists all five levels, the 4.8 'previously 4.6 only' parenthetical is corrected, and the reasoning_effort enumeration in each note now includes max so it matches that model's effort section (4.6 gets max but not xhigh, which it never supported).

vercel Bot commented May 29, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
litellm Ready Ready Preview, Comment May 29, 2026 5:46pm


@mateo-berri
Collaborator Author

bugbot run

@cursor cursor Bot left a comment

✅ Bugbot reviewed your changes and found no new issues!

Comment @cursor review or bugbot run to trigger another review on this PR

Reviewed by Cursor Bugbot for commit c1d5fb6.
