
fix: use per-model output token limits for Anthropic agent provider#5040

Open
elevatingcreativity wants to merge 1 commit into Mintplex-Labs:master from elevatingcreativity:fix/anthropic-agent-max-tokens

Conversation


elevatingcreativity (Contributor) commented Feb 21, 2026

Closes #5039

Bug

The Anthropic agent provider hardcodes max_tokens: 4096 for all API calls. Modern Claude models (Claude 3.5 and later) support output limits well above 4096 tokens, but the hardcoded value was set when older models had smaller limits. As a result, agent responses are silently cut off at 4096 tokens regardless of which model is selected.

Location: server/utils/agents/aibitat/providers/anthropic.js, where max_tokens: 4096 is hardcoded in both the stream() and complete() methods.

Fix

Replaces the hardcoded max_tokens: 4096 with a MODEL_MAX_OUTPUT_TOKENS lookup table mapping each Anthropic model to its actual API-enforced output token limit. Claude 3.5 models use 8192, Claude 3.7 uses 64,000, legacy models retain 4096, and any unknown or future models fall back to 4096 safely.

This also prevents 400 API errors that would occur if a hardcoded value exceeded an older model's output limit — the Anthropic API rejects requests where max_tokens exceeds the model's maximum rather than silently clamping it.
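A minimal sketch of the lookup-table approach described above (the model IDs and helper name here are illustrative assumptions, not the PR's exact code):

```javascript
// Sketch of a per-model output token lookup with a safe fallback.
// Model IDs below are examples; the PR's actual table may differ.
const DEFAULT_MAX_OUTPUT_TOKENS = 4096;

const MODEL_MAX_OUTPUT_TOKENS = {
  // Legacy models keep the original limit
  "claude-3-opus-20240229": 4096,
  "claude-3-haiku-20240307": 4096,
  // Claude 3.5 models support larger outputs
  "claude-3-5-sonnet-20241022": 8192,
  "claude-3-5-haiku-20241022": 8192,
  // Claude 3.7 supports much larger outputs
  "claude-3-7-sonnet-20250219": 64000,
};

function maxOutputTokens(model) {
  // Unknown or future models fall back to the safe legacy limit,
  // avoiding 400 errors from requesting more than a model allows.
  return MODEL_MAX_OUTPUT_TOKENS[model] ?? DEFAULT_MAX_OUTPUT_TOKENS;
}
```

Both stream() and complete() would then pass `max_tokens: maxOutputTokens(this.model)` instead of the literal 4096.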

Test plan

  • Added Jest tests covering all model tiers (legacy, Claude 3.5, Claude 3.7, unknown)
  • Verified the correct max_tokens value is passed in actual API calls via mock
  • Run: jest server/__tests__/utils/agents/aibitat/providers/anthropic.test.js

🤖 Generated with Claude Code

Replaces hardcoded max_tokens: 4096 in the Anthropic agent provider with
a MODEL_MAX_OUTPUT_TOKENS lookup table mapping each model to its actual
API-enforced output token limit. Claude 3.5 models use 8192, Claude 3.7
uses 64000, legacy models retain 4096, and unknown models fall back to
4096 safely.

Fixes truncation of agent responses on modern Anthropic models. Also
prevents 400 API errors that would occur if a hardcoded value exceeded
an older model's output limit.

Adds Jest tests covering all model tiers and verifying the correct
max_tokens value is sent in API calls.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

elevatingcreativity commented Mar 5, 2026

Hi @timothycarambat, I'd like to bump this one; I didn't see it pulled into the latest release. It's a simple but important fix: I run into truncated output from Claude within AnythingLLM in many instances (several times today). It would be great to see this fix land in the main branch. Thanks for your consideration.



Development

Successfully merging this pull request may close these issues.

[BUG]: Anthropic agent responses truncate at 4096 tokens on modern models
