fix(custom provider): enable prompt caching and accurate context window for custom endpoints serving Claude #50946

Open. Xiaochengzi2048 wants to merge 2 commits into NousResearch:main from Xiaochengzi2048:fix/custom-claude-prompt-caching

@Xiaochengzi2048 commented on Jun 22, 2026

Problem

Two related issues affect users running Claude through a custom OpenAI-compatible endpoint (LiteLLM, Bedrock proxies, local gateways):

1. Prompt caching disabled

anthropic_prompt_cache_policy() returns (False, False) for provider=custom, even when the model name clearly identifies it as Claude. Every request re-bills the full prompt with no cache hits.

2. Context window reported as 200k

lookup_models_dev_context() returns None for provider=custom (not in PROVIDER_TO_MODELS_DEV), so hermes falls back to the 200k default. Claude Opus 4 users see 200k instead of the real 1M window.

Both issues share the same root cause: hermes does not recognise custom as a provider with known model metadata.

Fix

agent/agent_runtime_helpers.py

Add a branch for provider=custom + is_claude that enables caching with envelope layout (same as the existing openrouter branch):
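
The diff itself is not reproduced here, so the following is only a rough sketch of the shape such a branch could take. It assumes the returned tuple means `(should_cache, use_native_layout)`, so that the envelope layout corresponds to `(True, False)`. The names `cache_policy_sketch` and `is_claude_model`, and the substring check on the model name, are hypothetical and not taken from the hermes codebase.

```python
from typing import Tuple


def is_claude_model(model: str) -> bool:
    """Hypothetical heuristic: any model id containing 'claude' counts as Claude."""
    return "claude" in model.lower()


def cache_policy_sketch(provider: str, model: str) -> Tuple[bool, bool]:
    """Return (should_cache, use_native_layout).

    The meaning of the two booleans is an assumption made for this sketch.
    """
    claude = is_claude_model(model)

    # Existing behaviour described in the PR: Claude behind openrouter
    # gets caching with the envelope (non-native) layout.
    if provider == "openrouter" and claude:
        return True, False

    # Proposed branch: a custom endpoint serving Claude is treated the same way.
    if provider == "custom" and claude:
        return True, False

    # Everything else keeps caching disabled.
    return False, False
```

Under these assumptions, `cache_policy_sketch("custom", "claude-opus-4")` yields `(True, False)`, while a non-Claude model on the same custom endpoint still gets `(False, False)`.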

agent/models_dev.py

When provider=custom has no mapping, fall back to the canonical provider based on model name:
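
As a minimal illustration of that fallback (not the actual patch), the sketch below resolves which provider's metadata to consult. The mapping contents, the `MODEL_NAME_HINTS` table, and the function name `resolve_metadata_provider` are all assumptions made for the example; only the idea of inferring the canonical provider from the model name comes from the description above.

```python
from typing import Dict, Optional, Tuple

# Stand-in for PROVIDER_TO_MODELS_DEV; the real mapping's contents are not shown in this PR.
PROVIDER_TO_MODELS_DEV_SKETCH: Dict[str, str] = {
    "anthropic": "anthropic",
    "openrouter": "openrouter",
}

# Hypothetical (substring, canonical provider) hints used only for provider=custom.
MODEL_NAME_HINTS: Tuple[Tuple[str, str], ...] = (
    ("claude", "anthropic"),
)


def resolve_metadata_provider(provider: str, model: str) -> Optional[str]:
    """Pick the provider key whose model metadata should be looked up."""
    mapped = PROVIDER_TO_MODELS_DEV_SKETCH.get(provider)
    if mapped is not None:
        return mapped

    # No direct mapping: for custom endpoints, infer the canonical provider
    # from the model name instead of giving up immediately.
    if provider == "custom":
        lowered = model.lower()
        for needle, canonical in MODEL_NAME_HINTS:
            if needle in lowered:
                return canonical

    # Unknown provider and unrecognised model: caller falls back to its default.
    return None
```

With a resolved provider of `"anthropic"`, the context-window lookup can proceed as it would for a native Anthropic configuration; a `None` result leaves the existing default in place.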

This gives Claude Opus 4 its real 1M context window and Sonnet its real 200k, instead of both hitting the 200k fallback.

Related

Closes #50939 (skill delete PermissionError is a separate issue, already in #50940).

@Xiaochengzi2048 changed the title from "fix(caching): enable prompt caching for custom provider serving Claude models" to "fix(custom provider): enable prompt caching and accurate context window for custom endpoints serving Claude" on Jun 22, 2026
@alt-glitch added the labels type/bug (Something isn't working), comp/agent (Core agent loop, run_agent.py, prompt builder), area/config (Config system, migrations, profiles), and P3 (Low: cosmetic, nice to have) on Jun 22, 2026

Development

Successfully merging this pull request may close these issues.

skill_manage: _delete_skill crashes on read-only bundled skills (PermissionError)
