Skip to content

Fix anthropic thinking#1718

Draft
eformat wants to merge 13 commits intovllm-project:mainfrom
eformat:fix-anthropic-thinking
Draft

Fix anthropic thinking#1718
eformat wants to merge 13 commits intovllm-project:mainfrom
eformat:fix-anthropic-thinking

Conversation

@eformat
Copy link
Copy Markdown

@eformat eformat commented Apr 7, 2026

Purpose

Adds anthropic reasoning / thinking token support to golang server.

Test Plan

Testing using claude models in vertex, local hosted models with vllm-sr. Config is here:

https://github.com/eformat/vllm-sr-claude/blob/main/vllm-sr-config.yaml

Test Result

can now correctly see thinking tokens from claude models sonnet, opus.

{"level":"info","ts":"2026-04-05T22:43:31.547","caller":"client.go:180","msg":"Raw Anthropic response (7123 bytes): {\"model\":\"claude-sonnet-4-6\",\"id\":\"msg_vrtx_012AR97qvoL7RoMLcB7PpiJz\",\"type\":\"message\",\"role\":\"assistant\",\"content\":[{\"type\":\"thinking\",\"thinking\":\"The user wants an analysis of recursion - why it can be both elegant and dangerous. Let me provide a thorough, well-structured analysis covering both sides.\",\"signature\":\"EtwC..."}```

@netlify
Copy link
Copy Markdown

netlify Bot commented Apr 7, 2026

Deploy Preview for vllm-semantic-router ready!

Name Link
🔨 Latest commit 83a6a43
🔍 Latest deploy log https://app.netlify.com/projects/vllm-semantic-router/deploys/69f07ec8c23cca0008223911
😎 Deploy Preview https://deploy-preview-1718--vllm-semantic-router.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 7, 2026

✅ Supply Chain Security Report — All Clear

Scanner Status Findings
AST Codebase Scan (Py, Go, JS/TS, Rust) 27 finding(s) — MEDIUM: 21 · LOW: 6
AST PR Diff Scan No issues detected
Regex Fallback Scan No issues detected

Scanned at 2026-04-28T09:33:24.261Z · View full workflow logs

eformat added 2 commits April 7, 2026 16:39
Signed-off-by: Mike Hepburn <eformat@gmail.com>
Signed-off-by: Mike Hepburn <eformat@gmail.com>
@eformat eformat force-pushed the fix-anthropic-thinking branch from 68938bc to 966e49f Compare April 7, 2026 06:40
eformat and others added 11 commits April 7, 2026 16:40
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…c routing

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…election

When a client specifies a specific model (not "auto"), the decision engine
returns an empty selectedModel. This caused llm_model_requests_total to record
model="unknown". Fall back to originalModel for the metric label.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant