Model fallback chains #45

## Problem / Motivation

Load balancing works across deployments of the same model, but there is no way to define cross-model failover. If every deployment of the primary model is down, the request fails instead of falling back to an alternative model.

## Proposed Solution

Allow fallback chains to be defined at the model level. When all deployments of the primary model are unavailable, the proxy tries the next model in the chain. Example:

```yaml
models:
  - name: gpt-4o
    fallback: claude-sonnet
    # ...
  - name: claude-sonnet
    fallback: llama-70b
    # ...
```

## Acceptance Criteria

- [ ] Models can reference another model as fallback
- [ ] Fallback triggers when all deployments of the primary model are unavailable
- [ ] Maximum fallback chain depth is configurable (default: 3)
- [ ] Usage tracking records which model actually handled the request
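
To make the criteria above concrete, here is a minimal Python sketch of how chain resolution might work. It is not the proxy's actual implementation: `ModelConfig`, `resolve_chain`, `pick_model`, and the `is_available` callback are hypothetical names, and it assumes that "depth" means the maximum number of fallback hops after the primary model. Cycle detection is not listed in the criteria; it is added here as an assumption, since a chain like `a -> b -> a` would otherwise loop.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ModelConfig:
    # Hypothetical in-memory form of one entry under `models:` in the YAML.
    name: str
    fallback: Optional[str] = None  # name of the next model in the chain


def resolve_chain(models: dict, primary: str, max_depth: int = 3) -> list:
    """Return the primary model followed by at most `max_depth` fallbacks.

    Assumption: depth counts fallback hops, so max_depth=3 allows the
    primary plus three alternatives.
    """
    chain = [primary]
    current = models[primary]
    while current.fallback is not None and len(chain) - 1 < max_depth:
        nxt = current.fallback
        if nxt in chain:
            raise ValueError(f"fallback cycle detected at {nxt!r}")
        if nxt not in models:
            raise KeyError(f"unknown fallback model {nxt!r}")
        chain.append(nxt)
        current = models[nxt]
    return chain


def pick_model(chain: list, is_available: Callable[[str], bool]) -> Optional[str]:
    """Return the first model in the chain with an available deployment.

    `is_available` is a hypothetical callback that should return False only
    when all deployments of the named model are unavailable.
    """
    for name in chain:
        if is_available(name):
            return name
    return None  # every model in the chain is unavailable


models = {
    "gpt-4o": ModelConfig("gpt-4o", fallback="claude-sonnet"),
    "claude-sonnet": ModelConfig("claude-sonnet", fallback="llama-70b"),
    "llama-70b": ModelConfig("llama-70b"),
}

chain = resolve_chain(models, "gpt-4o")
# Simulate gpt-4o being fully down; the returned name is what usage
# tracking would record as the model that handled the request.
handled_by = pick_model(chain, lambda name: name != "gpt-4o")
```

In this sketch the value returned by `pick_model` is the name that usage tracking would record, which keeps the "which model actually handled the request" criterion separate from the model the client originally asked for.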
