
# AI Router

Data-driven LLM model routing. Fetches cost, intelligence, and performance data from the Artificial Analysis API, computes the Pareto frontier, and generates tier-based routing profiles.

See `ROUTER_SPEC.md` for the runtime implementation spec (classifier, fallback logic, optimizations).
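Conceptually, the Pareto step keeps only models that no other model strictly dominates on both cost and intelligence. A minimal sketch (the `Model` shape and field names are illustrative, not the tracker's actual schema):

```typescript
// Hypothetical simplified model record; lower cost and higher
// intelligence are better.
interface Model {
  name: string;
  blendedCost: number;   // $/1M tokens
  intelligence: number;  // benchmark index
}

// A model is on the frontier if no other model is both cheaper and smarter.
// Sort by cost ascending, then keep each model that raises the running
// intelligence maximum.
function paretoFrontier(models: Model[]): Model[] {
  const sorted = [...models].sort((a, b) => a.blendedCost - b.blendedCost);
  const frontier: Model[] = [];
  let bestIntelligence = -Infinity;
  for (const m of sorted) {
    if (m.intelligence > bestIntelligence) {
      frontier.push(m);
      bestIntelligence = m.intelligence;
    }
  }
  return frontier;
}
```

Sorting once and scanning with a running maximum gives the frontier in O(n log n).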

## Setup

```sh
bun install
```

Get an API key from artificialanalysis.ai (free tier: 1,000 requests/day).

```sh
export AA_API_KEY=your_key_here
```

## Usage

### Quick start

```sh
bun run tracker:save
```

That's it. The tracker fetches model data, auto-calibrates tier thresholds from the cost/intelligence distribution, builds the model registry, generates all 5 routing profiles, and writes everything to `config/`.

### How thresholds work

Tier thresholds are auto-calibrated from the data by default. The system uses percentiles of the cost and intelligence distributions to derive boundaries:

| Tier      | Cost ceiling | Intelligence floor          |
| --------- | ------------ | --------------------------- |
| simple    | P35 cost     | P5 intelligence             |
| medium    | P60 cost     | P20 intelligence            |
| complex   | P85 cost     | P40 intelligence            |
| reasoning | P85 cost     | P40 intelligence + P30 math |
| premium   | uncapped     | P70 intelligence            |

Bands deliberately overlap so cost-effective models (e.g. MiniMax) appear across multiple tiers, creating a mixture. Within each tier, the profile strategy (cost-efficiency, maximize intelligence, etc.) determines the ranking.

As the model landscape shifts (new models, price changes, score updates), thresholds automatically adapt on each run.
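The percentile-based calibration can be sketched like this (linear interpolation between order statistics is an assumption; the tracker may use a different percentile method, and the function names here are hypothetical):

```typescript
// Hypothetical sketch of percentile-based tier calibration.
// Linear interpolation between sorted values is an assumption.
function percentile(values: number[], p: number): number {
  const sorted = [...values].sort((a, b) => a - b);
  const idx = (p / 100) * (sorted.length - 1);
  const lo = Math.floor(idx);
  const hi = Math.ceil(idx);
  return sorted[lo] + (sorted[hi] - sorted[lo]) * (idx - lo);
}

// Derive the "simple" tier bounds (P35 cost ceiling, P5 intelligence floor).
function calibrateSimpleTier(costs: number[], scores: number[]) {
  return {
    maxBlendedCost: percentile(costs, 35),   // cost ceiling
    minIntelligence: percentile(scores, 5),  // intelligence floor
  };
}
```

Because the bounds are relative to the current distribution, a wave of cheaper or smarter models shifts every tier boundary on the next run.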

After each run, `config/MODELS.md` shows the full breakdown of which models land in which tiers.

To override with manual thresholds, set `calibration.method` to `"manual"`:

```json
{
  "calibration": { "method": "manual" },
  "tiers": {
    "simple":    { "maxBlendedCost": 1.0,  "minIntelligence": 20 },
    "medium":    { "maxBlendedCost": 3.0,  "minIntelligence": 35 },
    "complex":   { "maxBlendedCost": 10.0, "minIntelligence": 50 },
    "reasoning": { "maxBlendedCost": 10.0, "minIntelligence": 50, "minMathIndex": 40 },
    "premium":   { "maxBlendedCost": null, "minIntelligence": 60 }
  }
}
```

## Commands

```sh
bun run tracker:save       # Fetch, calibrate, write config + changelog
bun run tracker:dry-run    # Fetch, calibrate, print report (no writes)
bun run tracker:explore    # Dump scatter table + Pareto frontier + auto thresholds
```

### Flags

| Flag         | Effect                                                      |
| ------------ | ----------------------------------------------------------- |
| `--explore`  | Dump scatter + Pareto frontier + auto thresholds, no writes |
| `--dry-run`  | Analyze + report, no writes (default)                       |
| `--save`     | Write config files + changelog                              |
| `--no-cache` | Force fresh API fetch (skip 6h local cache)                 |

## API caching

Responses are cached locally at `config/.aa-cache.json` with a 6-hour TTL. Repeated runs within that window don't hit the API. Use `--no-cache` to bypass.
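The freshness check amounts to comparing a cached timestamp against the 6-hour TTL; a sketch, assuming the cache file holds a `{ fetchedAt, data }` object (the actual shape of `config/.aa-cache.json` may differ):

```typescript
import { existsSync, readFileSync } from "node:fs";

const TTL_MS = 6 * 60 * 60 * 1000; // 6-hour TTL

// Assumed cache shape: { fetchedAt: epoch ms, data: API response }.
function isFresh(fetchedAt: number, now: number = Date.now()): boolean {
  return now - fetchedAt < TTL_MS;
}

// Returns cached data if present and fresh, otherwise null (caller refetches).
function readCache(path: string, now: number = Date.now()): unknown | null {
  if (!existsSync(path)) return null;
  const { fetchedAt, data } = JSON.parse(readFileSync(path, "utf8"));
  return isFresh(fetchedAt, now) ? data : null;
}
```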

## Development

```sh
bun test            # run tests
bun run typecheck   # tsc --noEmit
bun run lint        # biome check
bun run lint:fix    # biome check --write
```

## Automation

`.github/workflows/tracker.yml` runs the tracker weekly (Monday 9am UTC). On config changes it commits to a branch and opens a PR for review.
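That weekly cadence corresponds to a cron trigger like the following (a sketch; the actual workflow file may configure this differently):

```yaml
on:
  schedule:
    - cron: "0 9 * * 1" # Mondays 09:00 UTC
```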

Requires `AA_API_KEY` set as a repository secret.

## Data attribution

Model benchmarks, pricing, and performance data provided by Artificial Analysis.