
[Bug]: /rerank does not apply Virtual Key fallback chain (Fallbacks=None) #25298

@Sunsilkk

Description

Check for existing issues

What happened?

When a Virtual Key is configured in the LiteLLM UI with a fallback chain for a rerank model, requests to /rerank never pick up that fallback configuration.

In the Virtual Keys UI:

  • Primary model: Qwen3-Reranker-8B
  • Fallback chain includes: deepinfra/Qwen/Qwen3-Reranker-8B

Then I create/copy the virtual key and call /rerank with:

  • model: "Qwen3-Reranker-8B"
  • mock_testing_fallbacks: true

Expected behavior:

  • LiteLLM should pick up the fallback chain configured on the key
  • mock_testing_fallbacks: true should trigger fallback routing
  • Request should continue to the fallback model instead of failing with Fallbacks=None

Actual behavior:

  • LiteLLM throws an internal mock exception and reports:
    • Fallbacks=None
    • Available Model Group Fallbacks=None

This makes it look like the fallback chain saved on the Virtual Key is not being applied to /rerank requests at all.
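For comparison, the same fallback chain expressed in proxy-config form (a sketch using the model names from this report and LiteLLM's `router_settings.fallbacks` convention; the exact keys on the Virtual Key side may differ):

```yaml
router_settings:
  # Map the primary model group to an ordered list of fallback model groups.
  fallbacks:
    - {"Qwen3-Reranker-8B": ["deepinfra/Qwen/Qwen3-Reranker-8B"]}
```

Fallbacks configured this way at the router level do get reported in the error message (`Fallbacks=[...]`), which is why `Fallbacks=None` suggests the key-level chain is never reaching the /rerank code path.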

Steps to Reproduce

  1. Open LiteLLM UI -> Virtual Keys
  2. Create a new virtual key
  3. Set primary model to Qwen3-Reranker-8B
  4. In Router Settings, configure a fallback chain with deepinfra/Qwen/Qwen3-Reranker-8B
  5. Save the key and copy the generated key
  6. Send a rerank request like this:
curl -X POST "https://<proxy>/rerank" \
  -H "Authorization: Bearer <VIRTUAL_KEY>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen3-Reranker-8B",
    "query": "ping",
    "documents": [
      "say ping",
      "say pong",
      "hello world"
    ],
    "top_n": 2,
    "mock_testing_fallbacks": true
  }'
  7. Observe that the request fails immediately instead of routing to the configured fallback
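The curl request above can also be reproduced from Python. This is a minimal sketch: `build_rerank_payload` is a hypothetical helper (not part of LiteLLM), and the proxy URL / key are placeholders, so only the payload construction is shown here; sending it is a plain HTTP POST.

```python
import json

# Placeholders, not real values -- substitute your proxy URL and virtual key.
PROXY_URL = "https://<proxy>/rerank"
VIRTUAL_KEY = "<VIRTUAL_KEY>"


def build_rerank_payload(model, query, documents, top_n, mock_testing_fallbacks):
    """Assemble the /rerank request body used in the reproduction above."""
    return {
        "model": model,
        "query": query,
        "documents": documents,
        "top_n": top_n,
        # Asks LiteLLM to raise a mock error so fallback routing is exercised.
        "mock_testing_fallbacks": mock_testing_fallbacks,
    }


payload = build_rerank_payload(
    model="Qwen3-Reranker-8B",
    query="ping",
    documents=["say ping", "say pong", "hello world"],
    top_n=2,
    mock_testing_fallbacks=True,
)
print(json.dumps(payload, indent=2))
```

POSTing this payload to the proxy with an `Authorization: Bearer <VIRTUAL_KEY>` header reproduces the `Fallbacks=None` error shown below.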

Relevant log output

{
  "error": {
    "message": "litellm.InternalServerError: This is a mock exception for model=Qwen3-Reranker-8B, to trigger a fallback. Fallbacks=None. Received Model Group=Qwen3-Reranker-8B\nAvailable Model Group Fallbacks=None",
    "type": null,
    "param": null,
    "code": "500"
  }
}

What part of LiteLLM is this about?

Proxy

What LiteLLM version are you on?

v1.83.3

Twitter / LinkedIn details

No response

