Polyglot benchmark: Claude Opus 4.7 new #1 at 93.3% #5066

Open
anishesg wants to merge 1 commit into Aider-AI:main from anishesg:polyglot-opus47-adaptive-benchmark

@anishesg anishesg commented Apr 24, 2026

Summary

  • Model: Claude Opus 4.7 via AWS Bedrock (global inference profile) with adaptive thinking (model decides when to think, no fixed budget)
  • Score: 93.3% pass rate (210/225), a new all-time number 1 on the polyglot leaderboard
  • Beats the previous number 1 (GPT-5 high at 88.0%) by 5.3 percentage points
  • Response quality: 100% well-formed responses, with 0 malformed responses, 0 error outputs, 0 syntax/indentation errors, and 0 exhausted context windows
  • Cost: $26.27 total, 18.8 seconds per case
  • Command: aider --model bedrock/global.anthropic.claude-opus-4-7
  • Aider version: 0.86.3.dev
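
As a quick sanity check on the figures above, the arithmetic can be reproduced with a few lines of Python. This is only a sketch: the variable names are illustrative, and the summed-minutes figure assumes the 18.8 seconds is a simple per-case average, so it need not equal the wall-clock time of the run.

```python
# Figures copied from the summary above; variable names are illustrative.
passed, total = 210, 225
previous_best_pct = 88.0   # GPT-5 high, as quoted in the summary
total_cost_usd = 26.27
seconds_per_case = 18.8

# Pass rate and margin over the previous leader, rounded as in the summary.
pass_rate_pct = round(100 * passed / total, 1)
margin_pts = round(pass_rate_pct - previous_best_pct, 1)

# Derived figures (not stated in the summary): average cost per case and
# the sum of per-case durations, which may differ from wall-clock time.
cost_per_case_usd = round(total_cost_usd / total, 3)
summed_minutes = round(seconds_per_case * total / 60, 1)

print(pass_rate_pct, margin_pts, cost_per_case_usd, summed_minutes)
```

The first two values match the 93.3% and 5.3 points reported above; the last two are derived here for convenience only.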

Test plan

  • Verify YAML syntax is valid
  • Confirm entry appears on the rendered polyglot leaderboard page
  • Check that all field names match the existing schema
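
The field-name check in the test plan could be scripted roughly as follows. This is a minimal sketch, not the project's own tooling: `field_name_diff` is a hypothetical helper, and the records and field names below are made up for illustration rather than taken from the real leaderboard schema. In practice the entries would be loaded from the leaderboard YAML file with a YAML parser first, which also covers the syntax check.

```python
def field_name_diff(existing_entries, new_entry):
    """Compare a new entry's field names against those used by existing entries.

    Returns (missing, unexpected): names the existing entries use that the new
    entry lacks, and names the new entry uses that no existing entry has.
    """
    expected = set()
    for entry in existing_entries:
        expected |= set(entry)
    new_keys = set(new_entry)
    return sorted(expected - new_keys), sorted(new_keys - expected)


# Hypothetical records with illustrative field names (not the real schema).
existing = [{"model": "example-a", "pass_rate": 88.0, "total_cost": 1.0}]
candidate = {"model": "example-b", "pass_rate": 93.3, "totalcost": 26.27}

missing, unexpected = field_name_diff(existing, candidate)
print("missing:", missing)
print("unexpected:", unexpected)
```

Here the misspelled `totalcost` shows up as unexpected and `total_cost` as missing, which is the kind of mismatch the test plan item is meant to catch.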

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

CLAassistant commented Apr 24, 2026

CLA assistant check
All committers have signed the CLA.
