Bug
When provider_budget_config is enabled (e.g., anthropic: 5.0/24h), every call to /v1/messages (Anthropic format) and /v1/embeddings triggers a ValueError in the budget-limiter callback. Calls succeed (200 OK), but stderr floods with tracebacks from budget_limiter.async_log_success_event.
Reproducer
LiteLLM v1.83.7, config:
```yaml
litellm_settings:
  callbacks: ["prometheus"]
  provider_budget_config:
    anthropic:
      budget_limit: 5.0
      time_period: "24h"

model_list:
  - model_name: claude-haiku-4-5-direct-anthropic
    litellm_params:
      model: anthropic/claude-haiku-4-5-20251001
      api_key: os.environ/ANTHROPIC_API_KEY
```
Then call:
```shell
curl -X POST http://litellm:4000/v1/messages \
  -H "x-api-key: $LITELLM_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{"model":"claude-haiku-4-5-direct-anthropic","max_tokens":10,"messages":[{"role":"user","content":"hi"}]}'
```
Returns 200 + valid response, BUT stderr emits a ValueError from router_strategy/budget_limiter.py complaining about a missing custom_llm_provider in the kwargs/data dict.
Same behavior for /v1/embeddings calls.
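For scripted reproduction, the same call can be built from the Python stdlib. This is a sketch mirroring the curl above; the base URL and key are placeholders from the repro, not fixed values:

```python
import json
import urllib.request

# Assumed placeholder endpoint, matching the curl repro above.
LITELLM_BASE = "http://litellm:4000"

def build_messages_request(api_key: str) -> urllib.request.Request:
    """Build the /v1/messages call that triggers the traceback."""
    body = json.dumps({
        "model": "claude-haiku-4-5-direct-anthropic",
        "max_tokens": 10,
        "messages": [{"role": "user", "content": "hi"}],
    }).encode()
    return urllib.request.Request(
        f"{LITELLM_BASE}/v1/messages",
        data=body,
        headers={
            "x-api-key": api_key,
            "anthropic-version": "2023-06-01",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# To actually send (requires a running proxy):
# resp = urllib.request.urlopen(build_messages_request("sk-..."))
# The response is 200 while the proxy's stderr shows the ValueError.
```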
Frequency
In our production deployment: 306 ValueError tracebacks / 2 hours during normal operation (call volume ~100 req/h split between /v1/messages and /v1/embeddings).
Workaround
Disable provider_budget_config entirely. This loses the hard-cap protection but stops the spam. We replaced it with a Prometheus alert on litellm_spend_metric_total as a soft-warning fallback.
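The soft-warning fallback can be expressed as a Prometheus alerting rule. A sketch, assuming litellm_spend_metric_total carries a model label as emitted by the LiteLLM Prometheus callback; the threshold and label matcher are illustrative:

```yaml
groups:
  - name: litellm-spend
    rules:
      - alert: AnthropicSpendNearBudget
        # 4.0 = 80% of the former 5.0/24h hard cap (illustrative threshold)
        expr: sum(increase(litellm_spend_metric_total{model=~"anthropic/.*"}[24h])) > 4.0
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Anthropic spend over the last 24h is nearing the old hard cap"
```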
Root-cause hypothesis
The provider_budget_config callback async_log_success_event reads custom_llm_provider from data (or kwargs), but the request-routing layer for /v1/messages and /v1/embeddings does NOT inject custom_llm_provider into the kwargs the way /v1/chat/completions does. We tried adding custom_llm_provider: under litellm_params: in the YAML — no effect (LiteLLM reads data.get at the deployment top level, not from litellm_params).
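The failing lookup, and one possible fix, can be sketched as follows. This is a minimal illustration of the hypothesis; the function and key names are ours, not LiteLLM's actual internals:

```python
def provider_from_kwargs(kwargs: dict) -> str:
    """Resolve the provider the way we believe the budget callback should."""
    provider = kwargs.get("custom_llm_provider")
    if provider is None:
        # /v1/chat/completions injects this key; /v1/messages and
        # /v1/embeddings apparently do not, so the real callback raises.
        # A possible fix: fall back to the model's "provider/" prefix.
        model = kwargs.get("model", "")
        if "/" in model:
            provider = model.split("/", 1)[0]
    if not provider:
        raise ValueError("custom_llm_provider missing from kwargs/data")
    return provider
```

With the fallback, both code paths resolve "anthropic" for anthropic/claude-haiku-4-5-20251001; without it, the second path is the ValueError we see.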
Distinct from existing issues
Our setup routes the anthropic/ prefix correctly, and the bug appears on every call regardless of UI involvement.
Environment
- LiteLLM proxy v1.83.7 (latest stable)
- Deployment via systemd on Ubuntu 24.04
- Python 3.12
- Anthropic provider via anthropic/ prefix routing