feat: add FallbackLLM for built-in provider failover #21894

Open
Oxygen56 wants to merge 1 commit into run-llama:main from Oxygen56:feat/llm-failover-19631

Conversation

Oxygen56 commented Jun 6, 2026

Description

Implements automatic failover across multiple LLM providers with configurable retries and error classification.

Features

  • FallbackLLM class wraps a list of LLM instances in priority order
  • Error classification: transient errors (timeouts, 429s, 5xx, connection errors) trigger automatic failover to the next provider; permanent errors (4xx auth, bad requests, content policy violations) fail immediately
  • Configurable retries: retry_attempts parameter controls how many times each provider is retried before falling back
  • Observability: all fallback events are logged via Python's standard logging module
  • Full endpoint coverage: supports chat, complete, stream_chat, stream_complete, and all async variants
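
The classification and retry behaviour listed above can be illustrated with a small standalone sketch. This is not the code in this PR: `is_transient`, `call_with_fallback`, `AllProvidersFailedError`, the logger name, and the reliance on a `status_code` attribute are all hypothetical, and plain callables stand in for LLM instances. It also assumes that `retry_attempts` means "retries after the first call", which is only one plausible reading of the parameter.

```python
import logging
from typing import Callable, Optional, Sequence

# Hypothetical logger name, used only for this sketch.
logger = logging.getLogger("fallback_sketch")


class AllProvidersFailedError(RuntimeError):
    """Hypothetical error raised once every provider has been exhausted."""


def is_transient(exc: Exception) -> bool:
    """Return True for errors that justify a retry or a fallback."""
    # Built-in timeout and connection errors are treated as transient.
    if isinstance(exc, (TimeoutError, ConnectionError)):
        return True
    # Assumption: HTTP-style errors expose an integer `status_code` attribute.
    status = getattr(exc, "status_code", None)
    if isinstance(status, int):
        return status == 429 or 500 <= status < 600
    # Everything else (auth failures, bad requests, ...) counts as permanent.
    return False


def call_with_fallback(
    providers: Sequence[Callable[[str], str]],
    prompt: str,
    retry_attempts: int = 2,
) -> str:
    """Try providers in priority order; retry each one on transient errors."""
    last_error: Optional[Exception] = None
    for index, provider in enumerate(providers):
        # One initial call plus `retry_attempts` retries (assumed semantics).
        for attempt in range(retry_attempts + 1):
            try:
                return provider(prompt)
            except Exception as exc:
                if not is_transient(exc):
                    raise  # permanent error: fail immediately, no fallback
                last_error = exc
                logger.warning(
                    "provider %d failed (attempt %d): %r", index, attempt + 1, exc
                )
        logger.warning("provider %d exhausted, falling back", index)
    raise AllProvidersFailedError("all providers failed") from last_error
```

In this sketch a permanent error on the first provider propagates without consulting the second one, which mirrors the "fail immediately" behaviour described in the feature list.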

Example Usage

from llama_index.llms.openai import OpenAI
from llama_index.llms.anthropic import Anthropic
from llama_index.core.llms.fallback import FallbackLLM

openai_llm = OpenAI(model="gpt-4o")
anthropic_llm = Anthropic(model="claude-sonnet-4-20250514")

llm = FallbackLLM(
    llms=[openai_llm, anthropic_llm],
    retry_attempts=2,
)
response = llm.complete("Hello, world!")

Files Changed

  • llama-index-core/llama_index/core/llms/fallback.py - New FallbackLLM implementation
  • llama-index-core/llama_index/core/llms/__init__.py - Add FallbackLLM to exports
  • llama-index-core/tests/llms/test_fallback_llm.py - Comprehensive test suite (32 tests)

Closes #19631

Implements automatic failover across multiple LLM providers with
configurable retries and error classification.

- New FallbackLLM class wraps a list of LLM instances in priority order
- Transient errors (timeouts, 429s, 5xx) trigger automatic failover
- Permanent errors (4xx auth, bad requests) fail immediately
- Configurable retry_attempts per provider before falling back
- Logs fallback events for observability
- Supports sync, async, and streaming endpoints

Closes run-llama#19631

dosubot (bot) added the size:XL label Jun 6, 2026

Labels

size:XL This PR changes 500-999 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Feature Request]: Built-in LLM Failover for Reliability

1 participant