## Problem
GAIA supports multiple LLM providers (Lemonade local, Claude, OpenAI) but has no intelligent routing between them. Users must manually choose a backend. There is no way to route simple tasks to local/cheap models and complex reasoning to cloud frontier models.
## Strategic Context
This is Track C's #1 deliverable and AMD's strongest cost differentiation story. From the strategy doc:
- A single heartbeat generates 2-3M tokens/day — costing $1-3/day on Sonnet, $30-45/day on Opus
- Routing routine traffic to a local model cuts that spend by 50-90%
- "This is the most concrete 'AMD saves you money' proof point"
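The figures above can be sanity-checked with back-of-envelope math. This is a minimal sketch; the per-million-token prices are assumptions inferred from the strategy doc's daily totals, not quoted provider rates.

```python
# Rough cost model for the heartbeat workload described above.
# Prices (USD per 1M tokens) are ASSUMED, derived from the doc's
# "$30-45/day on Opus for 2-3M tokens/day" figure; verify against
# current provider pricing before relying on them.
PRICE_PER_M_TOKENS = {
    "sonnet": 1.0,   # assumed mid-tier cloud rate
    "opus": 15.0,    # assumed frontier cloud rate
}

def daily_cost(tokens_millions: float, model: str) -> float:
    """Daily spend if all heartbeat tokens go to one cloud model."""
    return tokens_millions * PRICE_PER_M_TOKENS[model]

def savings_pct(local_fraction: float) -> float:
    """Cost avoided if this fraction of traffic is routed to a
    local model with ~zero marginal token cost."""
    return local_fraction * 100
```

For example, `daily_cost(3, "opus")` gives $45/day, matching the top of the doc's range, and routing 70% of traffic locally (`savings_pct(0.7)`) lands inside the quoted 50-90% band.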
## Proposed Architecture
- Task classifier (small local model or heuristic) determines task complexity
- Simple tasks (status checks, classification, formatting) → local Lemonade model
- Complex tasks (multi-step reasoning, code generation, analysis) → cloud model
- User-configurable routing rules and cost budgets
- Telemetry: track tokens saved, cost avoided
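The flow above can be sketched as a small heuristic router. All names here (`Router`, `classify_task`, the backend strings) are hypothetical and do not reflect GAIA's actual provider interface in `src/gaia/llm/`; a small local model could replace the keyword heuristic.

```python
from dataclasses import dataclass, field

# Keywords that suggest a cheap, single-step task (assumed examples).
SIMPLE_KEYWORDS = ("status", "format", "classify", "summarize")

def classify_task(prompt: str) -> str:
    """Cheap heuristic complexity classifier: keyword match plus a
    length cutoff. Returns "simple" or "complex"."""
    lowered = prompt.lower()
    if len(prompt) < 200 and any(k in lowered for k in SIMPLE_KEYWORDS):
        return "simple"
    return "complex"

@dataclass
class Router:
    """Routes prompts to a backend and counts calls for telemetry."""
    local_backend: str = "lemonade"   # hypothetical local endpoint name
    cloud_backend: str = "claude"     # hypothetical cloud endpoint name
    telemetry: dict = field(
        default_factory=lambda: {"local_calls": 0, "cloud_calls": 0}
    )

    def route(self, prompt: str) -> str:
        if classify_task(prompt) == "simple":
            self.telemetry["local_calls"] += 1
            return self.local_backend
        self.telemetry["cloud_calls"] += 1
        return self.cloud_backend
```

Keeping the classifier separate from the `Router` lets user-configurable rules or a learned model swap in later without touching the routing and telemetry logic.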
## Dependencies
- Multi-provider LLM support (already exists in `src/gaia/llm/`)
- Task complexity classification (new)
## Acceptance Criteria