## Problem
GAIA supports multiple LLM providers (Lemonade local, Claude, OpenAI) but has no intelligent routing between them. Users must manually choose a backend. There is no way to route simple tasks to local/cheap models and complex reasoning to cloud frontier models.
## Strategic Context
This is Track C's #1 deliverable and AMD's strongest cost differentiation story. From the strategy doc:
- A single heartbeat generates 2-3M tokens/day — costing $1-3/day on Sonnet, $30-45/day on Opus
- Routing routine traffic to a local model cuts that spend by 50-90%
- "This is the most concrete 'AMD saves you money' proof point"
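The figures above can be sanity-checked with back-of-envelope math. This is a minimal sketch; the per-million-token prices are assumptions inferred from the strategy doc's daily totals, not quoted provider rates.

```python
# Rough cost model for the heartbeat workload described above.
# Prices (USD per 1M tokens) are ASSUMED, derived from the doc's
# "$30-45/day on Opus for 2-3M tokens/day" figure; verify against
# current provider pricing before relying on them.
PRICE_PER_M_TOKENS = {
    "sonnet": 1.0,   # assumed mid-tier cloud rate
    "opus": 15.0,    # assumed frontier cloud rate
}

def daily_cost(tokens_millions: float, model: str) -> float:
    """Daily spend if all heartbeat tokens go to one cloud model."""
    return tokens_millions * PRICE_PER_M_TOKENS[model]

def savings_pct(local_fraction: float) -> float:
    """Cost avoided if this fraction of traffic is routed to a
    local model with ~zero marginal token cost."""
    return local_fraction * 100
```

For example, `daily_cost(3, "opus")` gives $45/day, matching the top of the doc's range, and routing 70% of traffic locally (`savings_pct(0.7)`) lands inside the quoted 50-90% band.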
## Proposed Architecture
- Task classifier (small local model or heuristic) determines task complexity
- Simple tasks (status checks, classification, formatting) → local Lemonade model
- Complex tasks (multi-step reasoning, code generation, analysis) → cloud model
- User-configurable routing rules and cost budgets
- Telemetry: track tokens saved, cost avoided
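The flow above can be sketched as a small heuristic router. All names here (`Router`, `classify_task`, the backend strings) are hypothetical and do not reflect GAIA's actual provider interface in `src/gaia/llm/`; a small local model could replace the keyword heuristic.

```python
from dataclasses import dataclass, field

# Keywords that suggest a cheap, single-step task (assumed examples).
SIMPLE_KEYWORDS = ("status", "format", "classify", "summarize")

def classify_task(prompt: str) -> str:
    """Cheap heuristic complexity classifier: keyword match plus a
    length cutoff. Returns "simple" or "complex"."""
    lowered = prompt.lower()
    if len(prompt) < 200 and any(k in lowered for k in SIMPLE_KEYWORDS):
        return "simple"
    return "complex"

@dataclass
class Router:
    """Routes prompts to a backend and counts calls for telemetry."""
    local_backend: str = "lemonade"   # hypothetical local endpoint name
    cloud_backend: str = "claude"     # hypothetical cloud endpoint name
    telemetry: dict = field(
        default_factory=lambda: {"local_calls": 0, "cloud_calls": 0}
    )

    def route(self, prompt: str) -> str:
        if classify_task(prompt) == "simple":
            self.telemetry["local_calls"] += 1
            return self.local_backend
        self.telemetry["cloud_calls"] += 1
        return self.cloud_backend
```

Keeping the classifier separate from the `Router` lets user-configurable rules or a learned model swap in later without touching the routing and telemetry logic.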
## Dependencies
- Multi-provider LLM support (already exists in `src/gaia/llm/`)
- Task complexity classification (new)
## Acceptance Criteria