Skip to content

Hybrid model routing: local + cloud task routing for cost optimization #632

@kovtcharov

Description

@kovtcharov

Problem

GAIA supports multiple LLM providers (Lemonade local, Claude, OpenAI) but has no intelligent routing between them. Users must manually choose a backend. There is no way to route simple tasks to local/cheap models and complex reasoning to cloud frontier models.

Strategic Context

This is Track C's #1 deliverable and AMD's strongest cost differentiation story. From the strategy doc:

  • A single heartbeat generates 2-3M tokens/day — costing $1-3/day on Sonnet, $30-45/day on Opus
  • Routing to a local model saves 50-90%
  • "This is the most concrete 'AMD saves you money' proof point"

Proposed Architecture

  • Task classifier (small local model or heuristic) determines task complexity
  • Simple tasks (status checks, classification, formatting) → local Lemonade model
  • Complex tasks (multi-step reasoning, code generation, analysis) → cloud model
  • User-configurable routing rules and cost budgets
  • Telemetry: track tokens saved, cost avoided

Dependencies

  • Multi-provider LLM support (already exists in `src/gaia/llm/`)
  • Task complexity classification (new)

Acceptance Criteria

  • Automatic routing based on task complexity
  • User can configure routing preferences
  • Cost savings telemetry visible to user
  • Fallback to cloud when local model quality is insufficient

Metadata

Metadata

Assignees

No one assigned

    Labels

    domain:agent-coreFramework, tools, registry, memory, skills, orchestrationenhancementNew feature or requestp1medium prioritytrack:consumer-appHermes-competitor consumer product — mobile-first, voice + messaging + memory + skills

    Type

    No type

    Projects

    No projects

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions