
[Bug] custom_llm_provider not propagated to budget_limiter.async_log_success_event for /v1/messages + /v1/embeddings #26701

@nucocloud

Description

Bug

When provider_budget_config is enabled (e.g., anthropic: 5.0/24h), every call to /v1/messages (Anthropic format) and /v1/embeddings triggers a ValueError in the budget-limiter callback. The calls themselves succeed (200 OK), but stderr floods with tracebacks from budget_limiter.async_log_success_event.

Reproducer

LiteLLM v1.83.7, config:

litellm_settings:
  callbacks: ["prometheus"]

provider_budget_config:
  anthropic:
    budget_limit: 5.0
    time_period: "24h"

model_list:
  - model_name: claude-haiku-4-5-direct-anthropic
    litellm_params:
      model: anthropic/claude-haiku-4-5-20251001
      api_key: os.environ/ANTHROPIC_API_KEY

Then call:

curl -X POST http://litellm:4000/v1/messages \
  -H "x-api-key: $LITELLM_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{"model":"claude-haiku-4-5-direct-anthropic","max_tokens":10,"messages":[{"role":"user","content":"hi"}]}'

The call returns 200 with a valid response, but stderr emits a ValueError traceback from router_strategy/budget_limiter.py complaining that custom_llm_provider is missing from the kwargs/data dict.

Same behavior for /v1/embeddings calls.

Frequency

In our production deployment: 306 ValueError tracebacks in 2 hours during normal operation, at a call volume of ~100 req/h split between /v1/messages and /v1/embeddings.

Workaround

Disable provider_budget_config entirely. This loses the hard-cap protection but stops the log spam. We replaced it with a Prometheus alert on litellm_spend_metric_total as a soft-warning fallback.
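For reference, the fallback alert looks roughly like this. This is a hedged sketch: the rule/alert names and the 24h increase() window are our choices, the threshold mirrors the 5.0 budget above, and depending on your LiteLLM version you may need a label matcher (or a sum() by provider label) on litellm_spend_metric_total:

# Prometheus alerting rule as a soft fallback for the hard cap
# (names and threshold are deployment-specific assumptions).
groups:
  - name: litellm-spend
    rules:
      - alert: LiteLLMSpendHigh
        # increase() over 24h approximates the 24h budget window
        expr: sum(increase(litellm_spend_metric_total[24h])) > 5.0
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "LiteLLM spend over 5 USD in 24h (soft budget warning)"

Unlike provider_budget_config, this only warns; it does not block requests once the budget is exhausted.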

Root-cause hypothesis

The provider_budget_config callback async_log_success_event reads custom_llm_provider from data (or kwargs), but the request-routing layer for /v1/messages and /v1/embeddings does NOT inject custom_llm_provider into the kwargs the way /v1/chat/completions does. We tried adding custom_llm_provider: under litellm_params: in the YAML, but it had no effect (LiteLLM reads the key via data.get at the top level of the deployment data, not from litellm_params).
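The suspected failure mode can be sketched as follows. This is a hypothetical simplification, not LiteLLM's actual code; the function name and exact lookup path are assumptions made for illustration:

```python
# Hypothetical simplification of the lookup we believe
# budget_limiter.async_log_success_event performs -- NOT LiteLLM's code.

def get_custom_llm_provider(data: dict) -> str:
    # The budget limiter reads the provider from the top level of the
    # request data. /v1/chat/completions populates this key; the
    # /v1/messages and /v1/embeddings paths apparently do not.
    provider = data.get("custom_llm_provider")
    if provider is None:
        raise ValueError("custom_llm_provider not found in request data")
    return provider

# /v1/chat/completions-style data: key present, lookup succeeds.
print(get_custom_llm_provider({"custom_llm_provider": "anthropic"}))

# /v1/messages-style data: key missing, so the ValueError fires even
# though the underlying request already returned 200.
try:
    get_custom_llm_provider({"model": "anthropic/claude-haiku-4-5-20251001"})
except ValueError as e:
    print(f"ValueError: {e}")
```

If this hypothesis is right, the fix would be to inject custom_llm_provider into the callback kwargs on the /v1/messages and /v1/embeddings paths, mirroring what /v1/chat/completions does.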

Distinct from existing issues

Environment

  • LiteLLM proxy v1.83.7 (latest stable)
  • Deployment via systemd on Ubuntu 24.04
  • Python 3.12
  • Anthropic provider via anthropic/ prefix routing
