🚀 OmniRoute — The Free AI Gateway

🚀 OmniRoute — The Free AI Gateway

Never stop coding. Connect every AI tool to 231 providers — 50+ free — through one endpoint.

Plug Claude Code, Codex, Cursor, Cline, Copilot & Antigravity into FREE Claude / GPT / Gemini. Auto-fallback.

RTK + Caveman compression saves 15–95% tokens. Never hit limits.

~1.6B documented free tokens/month — up to ~2.1B in your first month with signup credits — aggregated across the free tiers, plus a long tail of permanently-free, no-cap providers, and the compression above stretches every one further. (how we count →)

💬 Join the community

Questions, provider tips, roadmap & support → Discord · Telegram · WhatsApp 🌍 Global / 🇧🇷 Brasil

🚀 Quick Start • 🎯 Combos • 🌐 Providers • 🔌 CLI & MCP • 🗜️ Compression • 🌍 Website

💥 The Promise • 🤔 Why • 🏆 What Sets Apart • 🤖 Compatible CLIs • 🖥️ Where It Runs • 🔒 Private • 🎬 In Action • 📚 Explore More • 📧 Support

🌐 Available in 40+ languages

🇺🇸	🇧🇷	🇪🇸	🇫🇷	🇮🇹	🇷🇺	🇨🇳	🇩🇪	🇯🇵	🇰🇷	🇮🇳
🇹🇭	🇻🇳	🇮🇩	🇲🇾	🇵🇭	🇸🇦	🇮🇱	🇦🇿	🇺🇦	🇵🇱	🇨🇿
🇳🇱	🇧🇬	🇩🇰	🇫🇮	🇳🇴	🇸🇪	🇭🇺	🇷🇴	🇸🇰	🇵🇹

💰 ~1.6B Free Tokens / Month

Stacking free tiers by hand is painful — dozens of SDKs, dozens of rate limits, and no idea how much you actually have. OmniRoute aggregates the documented free tiers of 40+ provider pools / 500+ models into one honest number and shows it live on the dashboard (/dashboard/free-tiers).

~1.6B free tokens / month (steady) — and up to ~2.1B in your first month with signup credits.
Pool-deduped, honest — we count each shared free pool once, so the headline isn't inflated by rate-limit ceilings the way multi-billion competitor claims are. (Counting every rate limit 24/7 would read ~10B; we don't publish that.)
Plus the un-countable — permanently-free, no-token-cap providers (SiliconFlow, Z.AI GLM-Flash, Kilo, OpenCode Zen…) and a $10 OpenRouter top-up that unlocks +24M/mo, both surfaced separately so they never inflate the headline.
Per-model breakdown, live used / remaining for the current month, and a transparent terms flag per provider.

Preview mockup — a real screenshot lands once the /dashboard/free-tiers page is validated. Full methodology (pool dedupe, credit tiers, provider terms): docs/reference/FREE_TIERS.md.

💥 The Promise

One endpoint. 231 providers. Never stop building — and let OmniRoute pick the cheapest one that works.

🚫 Never hit limits _{Auto-fallback across 231 providers in milliseconds. Quota out? Next provider takes over — zero downtime.}	💸 Save up to 95% tokens _{RTK + Caveman stacked compression cuts 15–95% of eligible tokens (~89% avg on tool-heavy sessions).}	🆓 $0 to start _{50+ providers with a free tier, 11 free forever (Kiro, Qoder, Pollinations, LongCat…). No card needed.}
🔌 Every tool works _{16+ coding agents — Claude Code, Codex, Cursor, Cline, Copilot, Antigravity — through one config.}	🧩 One endpoint _{OpenAI ↔ Claude ↔ Gemini ↔ Responses API translation. Point any tool at /v1 and it just works.}	🛡️ Production-grade _{Circuit breakers, TLS stealth, MCP (87 tools), A2A, memory, guardrails, evals. 14,965 tests.}

🤔 Why OmniRoute?

Stop juggling 10 dashboards, dead API keys, and surprise bills.

❌ The daily pain	✅ How OmniRoute fixes it
📉 Subscription quota expires unused every month	Maximize subscriptions — track quota, use every token before reset
🛑 Rate limits stop you mid-coding	4-tier auto-fallback — Subscription → API → Cheap → Free, in milliseconds
🔥 Tool outputs (`git diff`, `grep`, logs) burn tokens	RTK + Caveman compression — save 15–95% eligible tokens per request
💸 Expensive APIs ($20–50/mo per provider)	Cost-optimized routing — auto-route to the cheapest viable model
🧰 Each AI tool wants its own setup	One endpoint, every tool, one dashboard
🌍 AI blocked in your country	3-level proxy + TLS fingerprint stealth — use AI from anywhere

┌──────────────────────────────────────────────────────────┐
│        Your IDE / CLI  (Claude Code, Cursor, Cline…)       │
└─────────────────────────┬──────────────────────────────────┘
                          │ http://localhost:20128/v1
                          ▼
┌──────────────────────────────────────────────────────────┐
│                  OmniRoute — Smart Router                  │
│  RTK + Caveman compression · 15 routing strategies         │
│  Circuit breakers · TLS stealth · MCP · A2A · Guardrails   │
└─────────────────────────┬──────────────────────────────────┘
        ┌─────────────┬────┴────────┬─────────────┐
        ▼ Tier 1      ▼ Tier 2      ▼ Tier 3       ▼ Tier 4
   SUBSCRIPTION     API KEY        CHEAP          FREE
   Claude Code,     DeepSeek,      GLM $0.5,      Kiro, Qoder,
   Codex, Copilot   Groq, xAI      MiniMax $0.2   Pollinations
   quota out? ───▶  budget hit? ─▶ budget hit? ─▶ always on

🎯 Combos — The Flagship

A combo is a chain of models OmniRoute routes across automatically. Quota runs out, a provider fails, or costs spike — the combo silently slides to the next model. This is what makes OmniRoute unbreakable. 🛡️

⚡ Zero-config — just use `auto`

No combo to create. Set your model to auto (or a variant) and OmniRoute builds a virtual combo from your connected providers, scored live:

Model ID	What it optimizes for
`auto`	🎯 Balanced default (LKGP — sticks to your last good provider)
`auto/coding`	🧑‍💻 Quality-first weights for code generation
`auto/fast`	⚡ Lowest latency first
`auto/cheap`	💰 Cheapest per token first
`auto/offline`	🔋 Most quota / rate-limit headroom first
`auto/smart`	🔭 Quality-first + 10% exploration to discover better models

🔀 Or build your own — 15 routing strategies

Goal	Strategy / combo
🥇 Drain my subscription before paying	`priority` / `fill-first`
⚖️ Spread load across accounts	`round-robin` · `weighted` · `p2c` · `least-used`
💸 Always cheapest viable model	`cost-optimized` · `auto/cheap`
🧠 Hand off long context between models	`context-relay` · `context-optimized`
🎲 Randomized / privacy routing	`random` · `strict-random`
🤖 Just make it smart	`auto` (9-factor scoring) · `lkgp` · `reset-aware`

_{The Auto-Combo engine scores every candidate on 9 factors (health, quota, cost, latency, success rate, freshness…) — see docs/routing/AUTO-COMBO.md.}

🧱 Resilience is built in (3 independent layers)

Layer	Scope	What it does
🔌 Circuit breaker	whole provider	Stops hammering a provider that's failing upstream; auto-probes to recover
💤 Connection cooldown	one account / key	Skips a rate-limited key while other keys keep serving
🎯 Model lockout	provider + model	Quarantines just one quota-limited model, not the whole connection

Combo: "always-on"                         Strategy: priority
  1. cc/claude-opus-4-7   ← subscription (use it fully)
  2. cx/gpt-5.5           ← second subscription
  3. glm/glm-5.1          ← cheap backup ($0.5/1M)
  4. kr/claude-sonnet-4.5 ← FREE, unlimited (never fails)
Result: 4 layers of fallback = zero downtime

_{📖 Auto-Combo Engine · Resilience Guide}

🏆 What Sets OmniRoute Apart

Feature	OmniRoute	Other routers
🌐 Providers	231	20–100
🆓 Free providers	50+ (11 free forever)	1–5
🔀 Routing strategies	15 (priority, weighted, cost-optimized, context-relay…)	1–3
🗜️ Token compression	RTK + Caveman stacked (15–95%)	None / 20–40%
🧰 Built-in MCP server	87 tools, 3 transports, 30 scopes	Rare
🤝 A2A agent protocol	6 skills, JSON-RPC 2.0	None
🧠 Memory (FTS5 + vector)	Yes	Rare
🛡️ Guardrails (PII, injection, vision)	Yes	Rare
☁️ Cloud agents	Codex, Devin, Jules	None
🥷 TLS fingerprint stealth	JA3/JA4 via wreq-js	None
🖥️ Multi-platform	Web · Desktop · Termux · PWA	Web only
🌍 i18n	42 locales	0–4

_{📊 Detailed comparison vs LiteLLM, OpenRouter & Portkey → docs/comparison/OMNIROUTE_VS_ALTERNATIVES.md}

✨ What's New

Recent highlights from v3.8.20 → v3.8.33. Full history in CHANGELOG.md.

🤖 One-command CLI/agent setup — a dedicated setup-* command configures each coding tool to route through OmniRoute (Claude Code, Codex, Cline, Continue, Cursor, Roo Code, Kilo Code, Crush, Goose, Qwen Code, Aider, OpenCode, Gemini CLI); omniroute launch / omniroute launch-codex are zero-config launchers. → CLI Integrations
🛰️ Remote mode — drive a remote OmniRoute from any machine with scoped access tokens (omniroute connect / omniroute contexts / omniroute tokens). → Remote Mode
🧭 Smarter auto-routing — OpenRouter-style auto/<category>:<tier> combos (e.g. auto/coding:fast, auto/reasoning:pro), live Arena-ELO + models.dev model intelligence, per-step account allowlists, provider-wildcard combo steps, nested combo-ref execution, sticky weighted selection, and web_search-aware routing to a configured model. → Auto-Combo
🗜️ Pluggable compression — an async compression pipeline with Compression Studios, a stable LLMLingua-2 ONNX engine, RTK, delegated Anthropic Context Editing, and a unified settings panel with named profiles + an active-profile selector as the single source for per-engine on/off + level. → Compression
🕵️ Transparent MITM decrypt (TPROXY) — capture & translate traffic from CLIs that ignore proxy env vars, with a per-SNI certificate authority and a trust-store installer. → MITM/TPROXY
💸 Cost telemetry everywhere — X-OmniRoute-* cost/usage headers on every endpoint (including media), a non-token cost engine, a cache-HIT X-OmniRoute-Cost-Saved header, and per-key USD spend quotas. → API Reference
🧠 Memory you control — opt-in int8 vector quantization (Qdrant + sqlite-vec), memory off by default, and a per-request x-omniroute-no-memory header. → Memory
🛡️ Security — a prompt-injection guard across every LLM route (backed by a red-team suite), plus a free DuckDuckGo last-resort web search. → Guardrails
🤝 More providers & agents — Cursor Cloud Agent (a 4th cloud agent), a refreshed 231-provider catalog (OrcaRouter, Wafer AI, OpenAdapter, dit.ai, TokenRouter, …), Vertex AI media generation (speech / transcription / music / video), and one-click account import from CLIProxyAPI (~/.cli-proxy-api/). → Providers
⚡ Local performance & infra — a one-click local Redis launcher (omniroute redis up, plus a dashboard Redis panel) and an optional Bifrost Go sidecar that offloads the hottest relay path (BIFROST_BASE_URL, with automatic fallback to the TypeScript path on timeout). → Environment

🤖 Compatible CLIs & Coding Agents

One config — http://localhost:20128/v1 — and every AI IDE or CLI runs on free & low-cost models.

Claude Code	Codex CLI	Gemini CLI	Cursor	Copilot	Continue
OpenCode	Kilo Code	Droid	OpenClaw	Kiro	Command

＋ also works with · Cline · Antigravity · Windsurf · AMP · Hermes · Qwen CLI · Roo · Continue · any OpenAI-compatible tool

_{📖 Per-tool setup for all 16+ tools → docs/reference/CLI-TOOLS.md · 🧩 OpenCode plugin → @omniroute/opencode-provider}

🌐 231 AI Providers — 50+ Free

The most complete catalog of any open-source router: 231 providers, 50+ with a free tier, 11 free forever.

🆓 Free Forever — $0, no card

_{GPT-5, Claude, Gemini $100 free credits}	_{Kimi-K2, DeepSeek-R1 Unlimited FREE}	_{GPT-5, Claude, Llama 4 No key needed}	_{Flash-Lite 50M tokens/day 🔥}
_{50+ models 10K neurons/day}	_{gemini-3-flash 180K/mo free}	_{129 models ~40 RPM free}	_{Qwen3 235B 1M tokens/day}

📖 Full machine-readable catalog → docs/reference/PROVIDER_REFERENCE.md

🖥️ Where OmniRoute Runs — Anywhere

Same app, your machine, your rules. From a global npm install to your phone via Termux.

Platform	Install	Highlights
📦 npm (global)	`npm install -g omniroute`	One command, any OS
🐳 Docker	`docker run … diegosouzapw/omniroute`	Multi-arch AMD64 + ARM64
🖥️ Desktop (Electron)	`npm run electron:build`	Native window + system tray — Windows / macOS / Linux
💪 ARM	native `arm64`	Raspberry Pi, ARM servers, Apple Silicon
📱 Android (Termux)	`pkg install nodejs-lts && npx -y omniroute`	Runs on your phone, 24/7, no root
📲 PWA	"Add to Home Screen"	Fullscreen, offline, installable from browser
🧩 OpenCode plugin	`@omniroute/opencode-provider`	Native OpenCode integration
🛠️ From source	`npm install && npm run dev`	Hack on it, contribute

_{📖 Docker Guide · Desktop · Termux · PWA · OpenCode}

🔒 Private & Local-First

Your keys, your machine, your data. OmniRoute is a local proxy — it never phones home.

🏠 Runs 100% on your hardware — npm, Docker, desktop, or your phone. No OmniRoute cloud sits in the request path.
🔐 Credentials encrypted at rest — API keys & OAuth tokens sealed with AES-256-GCM.
🚫 Zero telemetry by default — your prompts go only to the providers you choose, nowhere else.
🛡️ Hardened gateway — API-key scoping, IP filtering, rate limits, prompt-injection guard, loopback-only process routes.
📜 MIT licensed & fully open-source — audit every line, self-host forever.

_{📖 Authorization · Guardrails · Compliance}

🔌 Full CLI + A2A & MCP

OmniRoute isn't just a server — it's a full command-line cockpit with 60+ commands, plus open agent protocols so an AI agent can drive OmniRoute by itself.

⌨️ A real CLI (not just `start`)

omniroute               # serve gateway + dashboard (port 20128)
omniroute chat          # interactive TUI chat client (slash: /model /combo /skill /memory)
omniroute setup         # guided first-run wizard
omniroute doctor        # diagnose providers, ports, native deps

🛰️ Remote mode — run the CLI here, OmniRoute on a VPS

OmniRoute on a server? Drive it from your laptop with the same CLI. Log in once with a scoped access token; every command then targets the remote.

omniroute connect 192.168.0.15            # password → scoped token, saved as a context
omniroute models list                     # ← runs against the REMOTE server
omniroute configure codex                 # ← picks a remote model, writes a local Codex profile
omniroute tokens create --name ci --scope read   # mint narrower tokens for other machines
omniroute contexts use default            # ← switch back to the local server

Tokens are scoped read / write / admin; process-spawning routes stay loopback-only. _{📖 Remote Mode}

providers · oauth · keys · combo · nodes · models · cache · compression · cost · usage · quota · health · resilience · telemetry · logs · audit · mcp · a2a · cloud · memory · skills · eval · tunnel · backup · sync · webhooks · policy · pricing · translator · simulate …

🤝 Connect an agent — and it controls OmniRoute itself

Expose OmniRoute over MCP or A2A and any capable agent gets the keys to the whole gateway — routing, providers, combos, cache, compression, memory — autonomously.

Protocol	Endpoint	Use it for
🧰 MCP (stdio)	`omniroute --mcp`	Plug into Claude Desktop, Cursor, any MCP client
🌊 MCP (HTTP)	`http://localhost:20128/api/mcp/stream`	Remote MCP — 87 tools, 30 scopes, full audit trail
📡 MCP (SSE)	`http://localhost:20128/api/mcp/sse`	Streaming MCP transport
🤝 A2A	`http://localhost:20128/.well-known/agent.json`	Agent-to-agent, JSON-RPC 2.0 + SSE, 6 skills

# Give Claude Code the full OmniRoute toolset over MCP:
claude mcp add-server omniroute --type http --url http://localhost:20128/api/mcp/stream

_{📖 MCP Server · A2A Server · Agent Protocols}

🗜️ Save 15–95% Tokens — Automatically

Why use many token when few token do trick? Every request passes through OmniRoute's compression pipeline transparently — no client changes. It's now a stack of 9 composable engines that run in order and mix & match per routing combo — building on ideas from RTK, Caveman (⭐ 51K+), LLMLingua-2, and Troglodita (PT-BR).

🧱 The 9-engine stack

Engines run in pipeline order; each is independently toggleable and configurable per combo:

#	Engine	What it does
1	Session-Dedup	Drops content repeated across turns (content-addressed, cross-turn)
2	CCR	Archives large blocks behind retrieve markers, fetched on demand
3	RTK	Smart tool-result filtering, dedup & truncation (command-aware)
4	Headroom	Lossless tabular compaction of homogeneous JSON arrays (~30%+)
5	Caveman	Rule-based prose compression (~65–75% on output)
6	LLMLingua-2	ML semantic pruning via MobileBERT ONNX — code-safe, async
7	Lite	Whitespace + image-URL trimming (latency-light baseline)
8	Aggressive	Summarization + progressive aging of old turns
9	Ultra	Heuristic token pruning with an optional small-model (SLM) tier

Code blocks, URLs and structured data are always preserved byte-perfect. One-click presets combine the engines:

Mode	Savings	Best for
🪶 Lite	~15%	Always-on safe default
🪨 Standard (Caveman)	~30%	Daily coding
⚡ Aggressive	~50%	Long tool-heavy sessions
🔥 Ultra	~75%	Maximum savings
🧰 RTK	60–90%	Shell/test/build/git output
🔗 Stacked (RTK → Caveman)	78–95%	Mixed prompts + tool logs

Real example — Standard mode:

Before (69 tokens): "The reason your React component is re-rendering is likely because you're creating a new object reference on each render cycle. When you pass an inline object as a prop, React's shallow comparison sees it as a different object every time, which triggers a re-render. I would recommend using useMemo to memoize the object."

After (19 tokens): "New object ref each render. Inline object prop = new ref = re-render. Wrap in useMemo."

Same answer. 72% fewer tokens. Zero accuracy loss. ✅

PT-BR example — Troglodita mode:

Antes (42 tokens): "O problema é que o componente está re-renderizando porque uma nova referência de objeto está sendo criada em cada ciclo de renderização. Eu recomendaria usar useMemo."

Depois (12 tokens): "Re-render: ref nova cada ciclo (objeto inline recriado). Usar useMemo."

Mesma resposta. ~70% menos tokens. Precisão técnica intacta. ✅

📖 How it works — pipeline, architecture & savings math

Client (10,000 tok) ──▶ OmniRoute Compression (9 engines) ──▶ Provider (~1,080 tok, up to 95% saved)

Default stacked combo runs RTK → Caveman. When both act on the same tool/context payload, savings compound:

combined = 1 − (1 − RTK) × (1 − Caveman_input)
average  = 1 − (1 − 0.80) × (1 − 0.46) = 89.2%
range    = 78.4 – 94.6%

Code blocks, URLs, JSON and structured data are always protected by the preservation engine. Auto-trigger compression by token threshold, or assign a compression pipeline per routing combo.

📖 COMPRESSION_GUIDE.md · RTK_COMPRESSION.md · COMPRESSION_ENGINES.md

⚡ Quick Start

1) Install & run

npm install -g omniroute
omniroute

Dashboard at http://localhost:20128 · API at http://localhost:20128/v1.

2) Connect a FREE provider (no signup)

Dashboard → Providers → connect Kiro AI (free Claude unlimited) or OpenCode Free (no auth) → done.

3) Point your coding tool

Base URL: http://localhost:20128/v1
API Key:  [copy from Dashboard → Endpoints]
Model:    auto            (zero-config smart routing — or any provider/model)

4) Verify it's working

curl http://localhost:20128/v1/models -H "Authorization: Bearer YOUR_KEY"

You should see your connected models listed. 🎉 That's it — start coding, and OmniRoute auto-routes & falls back for you.

If your client cannot send custom headers, OmniRoute also exposes tokenized compatibility aliases:

OpenAI catalog:   http://localhost:20128/vscode/YOUR_KEY/
OpenAI models:    http://localhost:20128/vscode/YOUR_KEY/models
OpenAI chat:      http://localhost:20128/vscode/YOUR_KEY/chat/completions
OpenAI responses: http://localhost:20128/vscode/YOUR_KEY/responses
Ollama chat:      http://localhost:20128/vscode/YOUR_KEY/api/chat
Ollama tags:      http://localhost:20128/vscode/YOUR_KEY/api/tags

Use these only for clients that cannot attach Authorization: Bearer .... Header auth remains the preferred mode.

📦 More install methods — Docker, source, pnpm, Arch

🐳 Docker

docker run -d --name omniroute --restart unless-stopped --stop-timeout 40 \
  -p 20128:20128 -v omniroute-data:/app/data diegosouzapw/omniroute:latest

🛠️ From source

cp .env.example .env && npm install
PORT=20128 npm run dev

📦 pnpm

pnpm install -g omniroute && pnpm approve-builds -g && omniroute

🐧 Arch Linux (AUR)

yay -S omniroute-bin && systemctl --user enable --now omniroute.service

🔧 Nix (Flake)

# Using Nix flakes
nix develop
npm run dev

# Or using devbox
devbox run npm run dev

📖 Docker Guide — Compose profiles, Caddy HTTPS, Cloudflare tunnels.

🦭 Podman

# 1. Build the image
podman build --target runner-base -t omniroute:base .

# 2. Fix data directory permissions for rootless Podman
mkdir -p data && podman unshare chown 1000:1000 ./data

# 3. Set runtime in .env, then run (see contrib/podman/ for Quadlet)
echo "CONTAINER_HOST=podman" >> .env
podman compose --profile base up -d

📖 Podman Guide — Quadlet setup, podman-compose, Quadlet.

🎬 OmniRoute in Action

🇧🇷 Português
_{Guia completo}

🇺🇸 English
_{Complete walkthrough}

🇷🇺 Русский
_{Полное руководство}

🎬 Made a video about OmniRoute? Open an issue or discussion with the link — we'll feature it here.

📚 Explore More

💰 Pricing at a glance & the $0 Free Stack (11 providers)

Tier	Example	Cost
💳 Subscription	Claude Code Pro / Codex / Copilot	$10–200/mo
🔑 API Key (free tiers)	NVIDIA NIM, Cerebras, Groq	FREE
💰 Cheap	GLM-5 $0.5/1M · MiniMax M2.5 $0.3/1M	pennies
🆓 Free Forever	Kiro, Qoder, Qwen, Pollinations, LongCat	$0

The $0 Free Stack — combine into one unbreakable combo:

Provider	Prefix	Free models	Quota
Kiro	`kr/`	Claude Sonnet 4.5, Haiku 4.5, Opus 4.6	50 credits/mo
Qoder	`if/`	kimi-k2-thinking, qwen3-coder-plus, deepseek-r1	♾️ Unlimited
Qwen	`qw/`	qwen3-coder-plus/flash/next	♾️ Unlimited
Pollinations	`pol/`	GPT-5, Claude, Gemini, DeepSeek, Llama 4	No key needed
LongCat	`lc/`	LongCat-Flash-Lite	50M tokens/day 🔥
Cloudflare AI	`cf/`	50+ models	10K neurons/day
NVIDIA NIM	`nvidia/`	129 models	~40 RPM
Cerebras	`cerebras/`	Qwen3 235B, GPT-OSS 120B	1M tok/day

💡 The dashboard "cost" is a savings tracker, not a bill — OmniRoute never charges you. A "$290 total cost" using free models means $290 saved.

📖 Complete free directory → docs/reference/FREE_TIERS.md — 25+ providers, quotas, base URLs.

🎯 Use Cases — ready-made combo playbooks

$0 forever:

1. kr/claude-sonnet-4.5   (Kiro — unlimited)
2. if/kimi-k2-thinking    (Qoder — unlimited)
3. pol/gpt-5              (Pollinations — no key)
4. lc/longcat-flash-lite  (50M tok/day backup)
Compression: aggressive (~50%) → double your free quota · Cost: $0/mo

24/7 no interruptions: chain 2 subscriptions → cheap → free for 5 layers of fallback. Blocked region: free providers + global/per-provider proxy → access AI from any country. Max savings: subscription + cheap backup + ultra compression (~75%) → ~$150–300/mo saved for heavy users.

🌍 Bypass geo-blocks — 3-level proxy + stealth

🇷🇺 🇨🇳 🇮🇷 🇨🇺 🇹🇷 In a blocked region? OmniRoute's 3-level proxy (Global / Per-Provider / Per-Connection) proxies API requests, OAuth flows, connection tests, token refresh & model sync.

Protocols: HTTP/HTTPS, SOCKS5, authenticated proxies
🆓 1proxy marketplace — hundreds of free validated proxies, quality scores, auto-rotation
Anti-detection — TLS fingerprint spoofing (wreq-js), CLI fingerprint matching, proxy IP preservation

📖 docs/ops/PROXY_GUIDE.md

✨ Full feature list — 30+ capabilities (memory, evals, observability)

Routing: 15 strategies · task-aware smart routing · thinking budget controls · wildcard routing · system prompt injection. Compatibility: OpenAI ↔ Claude ↔ Gemini ↔ Responses API · auto OAuth refresh (PKCE, 8 providers) · multi-account round-robin · Batch + Files API · live OpenAPI 3.0. Protocols: MCP (87 tools, 3 transports, 30 scopes) · A2A (JSON-RPC 2.0, SSE, 6 skills) · ACP · cloud agents (Codex, Devin, Jules). Plugins: custom plugin marketplace (system-configured registry URL with SSRF-guarded fetch) · install / enable / disable · Notion + Obsidian knowledge-base integrations (WebDAV file server, vault search, note CRUD). Embedded services: one-click install & lifecycle management of local sidecar services (CLIProxy, NineRouter). Quality & Ops: built-in Evals (golden-set: exact/contains/regex/custom) · guardrails (PII, injection, vision) · health dashboard · p50/p95/p99 telemetry · webhooks · compliance audit. AI Agent Skills: drop-in markdown manifests — point any agent at a skills/*/SKILL.md manifest. 43 skills available.

📖 MCP Server · A2A Server · Resilience Guide · Features Gallery

📖 Setup, env vars & FAQ

Env var	Default	Purpose
`PORT`	`20128`	API + dashboard port
`REQUIRE_API_KEY`	`false`	Require API key for all requests
`DATA_DIR`	`~/.omniroute`	Database & config storage

Will I be charged by OmniRoute? No — it's free, open-source software on your machine. You only pay paid providers directly. OmniRoute has no billing system. Are FREE providers really unlimited? Yes — Kiro, Qoder, Pollinations, LongCat, Cloudflare. No catch. Will compression hurt quality? No — it only compresses the input; code, URLs, JSON are always protected. Does it work where AI is blocked? Yes — 3-level proxy + 1proxy marketplace reach all 231 providers.

📖 User Guide · API Reference · Environment Config

🐛 Troubleshooting

Problem	Quick fix
"Language model did not provide messages"	Provider quota exhausted → use a combo fallback
Rate limiting (429)	Add fallback: `cc/claude → glm/glm-4.7 → if/kimi-k2-thinking`
OAuth token expired	Auto-refreshed; if stuck, delete + re-auth in Providers
`unsupported_country_region_territory`	Configure proxy in Settings → Proxy
Docker SQLite locks	Use `--stop-timeout 40` for clean WAL checkpoint
Node runtime errors	Use Node `>=22.0.0 <23` or `>=24.0.0 <27`

🐛 Reporting a bug? Run npm run system-info and attach system-info.txt. 📖 docs/guides/TROUBLESHOOTING.md

📸 Dashboard screenshots

Page	Screenshot	Page	Screenshot
Providers		Combos
Analytics		Health
Translator		Settings
CLI Tools		Usage Logs

📧 Support & Community

💬 Chat with the community — Discord, Telegram & WhatsApp (🌍 / 🇧🇷) links are at the top of this README.

🌍 Website: omniroute.online
🐙 GitHub: github.com/diegosouzapw/OmniRoute
🐛 Issues: report a bug (attach npm run system-info output)
🤝 Contributing: see CONTRIBUTING.md or pick a good first issue

🛠️ Tech Stack

Runtime: Node.js 22.x or 24.x LTS (24 LTS recommended) — >=22.0.0 <23 || >=24.0.0 <27
Language: TypeScript 6.0 — 100% TypeScript across src/ and open-sse/ (zero any in core modules since v2.0)
Framework: Next.js 16 + React 19 + Tailwind CSS 4
Database: better-sqlite3 (SQLite) + LowDB (JSON legacy) — domain state, proxy logs, MCP audit, routing decisions, memory, skills
Schemas: Zod (MCP tool I/O validation, API contracts)
Protocols: MCP (stdio/HTTP) + A2A v0.3 (JSON-RPC 2.0 + SSE)
Streaming: Server-Sent Events (SSE) + WebSocket bridge (/v1/ws)
Auth: OAuth 2.0 (PKCE) + JWT + API Keys + MCP Scoped Authorization
Testing: Node.js test runner + Vitest (14,965 test cases across 517 files — unit, integration, E2E, security, ecosystem)
Platforms: Desktop (Electron), Android (Termux), PWA (any browser)
CI/CD: GitHub Actions (auto npm publish + Docker Hub on release)
Website: omniroute.online
Package: npmjs.com/package/omniroute
Docker: hub.docker.com/r/diegosouzapw/omniroute
Resilience: Circuit breaker, exponential backoff, anti-thundering herd, TLS spoofing, auto-combo self-healing

📖 Documentation

📘 Getting Started

Document	Description
User Guide	Providers, combos, CLI integration, deployment
Setup Guide	Full install methods, CLI tool configs, protocol setup, timeout tuning
CLI Tools Guide	Per-tool setup for Claude Code, Codex, Cursor, Cline, OpenClaw, Kilo, Copilot
Remote Mode	Drive a remote OmniRoute (VPS) from your laptop CLI via scoped access tokens
Claude Code Config	Point Claude Code at OmniRoute (local/remote) with `launch` + per-model profiles
Quick Start	3-step install → connect → configure

🔧 Operations & Deployment

Document	Description
Docker Guide	Docker run, Compose profiles, Caddy HTTPS, tunnels, image tags
Podman Guide	Quadlet systemd integration, podman-compose, SELinux
VM Deployment	Complete guide: VM + nginx + Cloudflare setup
Fly.io Deployment	Deploy to Fly.io with persistent storage
Termux Guide	Run OmniRoute on Android via Termux
PWA Guide	Progressive Web App install, caching, architecture
Uninstall Guide	Clean removal for all install methods
Environment Config	Complete `.env` variables and references

🧠 Features & Architecture

Document	Description
Architecture	System architecture, data flow, and internals
Compression Guide	7-option pipeline: off / lite / standard / aggressive / ultra / RTK / stacked
RTK Compression	Command-output compression, filters, trust, verify, raw-output recovery
Compression Engines	Caveman, RTK, stacked pipelines, dashboard/API/MCP surfaces
Compression Rules Format	JSON rule-pack schemas for Caveman and RTK filters
Compression Language Packs	Language detection and Caveman rule-pack authoring
Resilience Guide	Circuit breakers, cooldowns, queue, anti-thundering herd, TLS spoofing
Auto-Combo Engine	9-factor scoring, mode packs, self-healing
Proxy Guide	3-level proxy system, 1proxy marketplace, registry CRUD
Free Tiers	25+ free API providers consolidated directory
Features Gallery	Visual dashboard tour with screenshots
Codebase Documentation	Beginner-friendly codebase walkthrough

🤖 Protocols & APIs

Document	Description
API Reference	All endpoints with examples
OpenAPI Spec	OpenAPI 3.0 specification
MCP Server	87 MCP tools, IDE configs, Python/TS/Go clients
MCP Server Guide	MCP installation, transports, and tool reference
A2A Server	JSON-RPC 2.0 protocol, skills, streaming, task mgmt
A2A Server Guide	A2A agent card, tasks, skills, and streaming

📋 Project & Quality

Document	Description
Contributing	Development setup and guidelines
Changelog	Full per-version release history
Security Policy	Vulnerability reporting and security practices
i18n Guide	40+ language support, translation workflow, RTL
Release Checklist	Pre-release validation steps
Coverage Plan	Test coverage strategy and 14,965 test suite

⭐ Top Contributors

OmniRoute is shaped by a passionate open-source community. These individuals have made exceptional contributions that directly impact the quality, stability, and reach of the project. Thank you.

oyi77
_{🥇 190 commits • +72K lines}
_{Analytics engine, SQL aggregations,
proxy marketplace, test coverage}

Chris Staley
_{🥈 72 commits • +5.7K lines}
_{SSE stream hardening, Responses API,
Gemini pagination, test regression fixes}

zenobit
_{🥉 62 commits • +24K lines}
_{CI/CD pipeline, i18n for 33 languages,
Void Linux package, platform fixes}

R.D. & Randi
_{🏅 107 commits • +28K lines}
_{Endpoints page, tunnel integrations,
Docker workflows, A2A status, compression UI}

benzntech
_{🏅 20 commits • +7.5K lines}
_{Electron desktop app, auto-updater,
release build workflows, cross-platform CI}

🙏 These contributors' features, bug fixes, and infrastructure improvements are a core part of what makes OmniRoute reliable and feature-rich. Every pull request, every test case, and every i18n translation file matters. Open source is built by people like them.

👥 Contributors

How to Contribute

Fork the repository
Create your feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'Add amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request

See CONTRIBUTING.md for detailed guidelines.

Releasing a New Version

# Create a release — npm publish happens automatically
gh release create v3.8.2 --title "v3.8.2" --generate-notes

📊 Stars

🌍 StarMapper

🙏 Acknowledgments

OmniRoute stands on the shoulders of giants. It started as a fork of 9router and a TypeScript port of the Go project CLIProxyAPI — and from there, every subsystem below was inspired by an open-source project that got there first. Each one shaped a concrete piece of OmniRoute. This is our thank-you to all of them. 🙏

⭐ star counts as of June 2026 — go give these projects a star.

🧬 Lineage & gateway

Project	⭐	How it inspired OmniRoute
9router · decolua	17.9k	The original project this fork is built on — extended here with multi-modal APIs and a full TypeScript rewrite.
CLIProxyAPI · router-for-me	37.8k	The Go implementation that inspired this JavaScript / TypeScript port.
LiteLLM · BerriAI	50.8k	The AI gateway whose public pricing dataset feeds our cost-tracking sync and whose provider-normalization model informed our routing.

🗜️ Context & token compression — engines

Project	⭐	How it inspired OmniRoute
Caveman · JuliusBrussee	74.5k	The viral "why use many token when few token do trick" project — its caveman-speak philosophy powers our standard compression mode and 30+ filler/condensation rules.
RTK – Rust Token Killer · rtk-ai	63.6k	High-performance command-output compression — inspired our RTK engine, JSON filter DSL, raw-output recovery and the stacked RTK → Caveman pipeline.
headroom · chopratejas	33.6k	Reversible context-compression (SmartCrusher) — inspired our `headroom` engine and the `ccr` retrieve-marker pattern.
LLMLingua · Microsoft	6.3k	Prompt-compression research (LLMLingua / LLMLingua-2) — inspired our async, code-safe, fail-open `llmlingua` engine.
llmlingua-2-js · atjsh	27	The JS/ONNX port (MobileBERT / XLM-RoBERTa) used as the worker-thread backend for our LLMLingua engine.
Troglodita · Lenine Júnior	15	PT-BR token compression — powers our pt-BR language pack: pleonasm reduction and filler removal tuned for Brazilian-Portuguese grammar.

🧩 Compact formats, token research & code-aware tooling

Project	⭐	How it inspired OmniRoute
TOON · toon-format	24.6k	Token-Oriented Object Notation — its columnar, header-plus-rows model shaped our tabular compaction stage.
GCF – Graph Compact Format · Blackwell Systems	11	Schema-aware "JSON for LLMs" notation — co-inspired our lossless homogeneous-array compaction with `[N rows]` markers.
token-optimizer-mcp · ooples	409	Brotli/SQLite cache + per-session context-delta — inspired our `session-dedup` engine.
token-savior · Mibayy	993	Bash-output compaction + MCP profiles — inspired our compression bail-out discipline and MCP tool-manifest reduction.
token-saver · ppgranger	103	Content-aware, per-file-type output compression with failure-aware bail-out — validated our per-type dispatch and minimum-gain skip.
token-optimizer · alexgreensh	1.4k	"Find the ghost tokens" — its offload + recoverable-handle pattern informed our CCR offload thinking.
TokenMizer · Shweta-Mishra-ai	1	A session-graph + cross-turn line-dedup blueprint that informed our session-dedup design.
mcp-compressor · Atlassian Labs	80	MCP tool-schema/description compression — informed our MCP tool-manifest cardinality reduction.
RepoMapper · pdavis68	182	Aider-style repo-map ranking — informed our repo-map / retrieval-ranking exploration.
quiet-shell-mcp · mrsimpson	4	Declarative shell-output reduction over MCP — validated our declarative bash-output compaction.
ts-morph · David Sherret	6.1k	TypeScript Compiler API toolkit — inspired our parser-based comment removal that preserves string, template and regex literals.

🧠 Memory & RAG

Project	⭐	How it inspired OmniRoute
Mem0 · mem0ai	58.9k	Universal memory layer — its proxy-as-write/read-boundary model shaped our memory architecture.
Letta (MemGPT) · letta-ai	23.4k	Stateful agents with tiered memory — inspired our Context Control & Recovery (CCR) tiered model.
WFGY · onestardao	1.8k	The ProblemMap taxonomy of 16 recurring RAG/LLM failure modes — the shared vocabulary in our troubleshooting guide.

🛰️ Traffic inspection, MITM & transparent proxy

Project	⭐	How it inspired OmniRoute
llm-interceptor · chouzz	46	MITM interception/analysis of coding-assistant ↔ LLM traffic — our Traffic Inspector ports its SSE merge, conversation normalization, host passthrough and secret masking (MIT).
ProxyBridge · InterceptSuite	5.1k	Transparent per-process proxy routing — inspired our crash-safe MITM teardown, socket idle-timeouts, `/proc` process attribution and TPROXY capture.

📚 Model data, observability & UI

Project	⭐	How it inspired OmniRoute
models.dev · SST / OpenCode	5.1k	Open database of AI model specs, pricing and capabilities — synced natively into our model catalog.
React Flow / xyflow · xyflow	37.1k	The node-based graph library powering our real-time Compression Studio and Combo/Routing Studio.
LangGraph · LangChain	35.1k	LangGraph Studio's live workflow-graph visualization inspired our Studios' real-time cascade view.
Langfuse · Langfuse	29.3k	Its trace → span → generation observability model shaped our Compression Studio waterfall.
Kiali · Kiali	3.6k	Istio service-mesh observability — inspired our circuit-breaker badges and error-edge visuals in the Routing/Combo Studio.
lobe-icons · LobeHub	2.1k	AI/LLM brand logos that render the provider icons across our dashboard.

🛡️ Security

Project	⭐	How it inspired OmniRoute
awesome-secure-defaults · tldrsec	708	A curated list of secure-by-default libraries that guides our security choices (Helmet.js, DOMPurify, ssrf-req-filter, safe-regex, Google Tink).

❤️ Support

OmniRoute is free and open source, built and maintained in the open. If it saves you time or money, consider supporting development:

Name		Name	Last commit message	Last commit date
Latest commit History 4,517 Commits
.github		.github
.husky		.husky
.source		.source
.vale/styles/Vocab/OmniRoute		.vale/styles/Vocab/OmniRoute
.vscode		.vscode
@omniroute		@omniroute
bin		bin
config		config
contrib/podman		contrib/podman
docs		docs
electron		electron
examples		examples
images		images
open-sse		open-sse
public		public
scripts		scripts
skills		skills
src		src
tests		tests
.dockerignore		.dockerignore
.env.example		.env.example
.gitignore		.gitignore
.gitleaks.toml		.gitleaks.toml
.markdownlint.json		.markdownlint.json
.mcp.json.example		.mcp.json.example
.node-version		.node-version
.npmignore		.npmignore
.npmrc		.npmrc
.nvmrc		.nvmrc
.size-limit.json		.size-limit.json
.vale.ini		.vale.ini
.zizmor.yml		.zizmor.yml
AGENTS.md		AGENTS.md
CHANGELOG.md		CHANGELOG.md
CLAUDE.md		CLAUDE.md
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
CONTRIBUTING.md		CONTRIBUTING.md
DESING.md		DESING.md
Dockerfile		Dockerfile
GEMINI.md		GEMINI.md
LICENSE		LICENSE
README.md		README.md
SECURITY.md		SECURITY.md
design.md		design.md
docker-compose.prod.yml		docker-compose.prod.yml
docker-compose.yml		docker-compose.yml
eslint.complexity.config.mjs		eslint.complexity.config.mjs
eslint.config.mjs		eslint.config.mjs
eslint.sonarjs.config.mjs		eslint.sonarjs.config.mjs
flake.lock		flake.lock
flake.nix		flake.nix
fly.toml		fly.toml
knip.json		knip.json
llm.txt		llm.txt
news.json		news.json
next.config.mjs		next.config.mjs
package-lock.json		package-lock.json
package.json		package.json
playwright.config.ts		playwright.config.ts
postcss.config.mjs		postcss.config.mjs
prettier.config.mjs		prettier.config.mjs
promptfooconfig.yaml		promptfooconfig.yaml
semcheck.yaml		semcheck.yaml
socket.yml		socket.yml
sonar-project.properties		sonar-project.properties
source.config.ts		source.config.ts
stryker.conf.json		stryker.conf.json
stryker.disablebail.json		stryker.disablebail.json
tsconfig.json		tsconfig.json
tsconfig.typecheck-core.json		tsconfig.typecheck-core.json
tsconfig.typecheck-noimplicit-core.json		tsconfig.typecheck-noimplicit-core.json
vitest.config.ts		vitest.config.ts
vitest.mcp.config.ts		vitest.mcp.config.ts

Folders and files

Latest commit

History

Repository files navigation

🚀 OmniRoute — The Free AI Gateway

Never stop coding. Connect every AI tool to 231 providers — 50+ free — through one endpoint.

💬 Join the community

💰 ~1.6B Free Tokens / Month

💥 The Promise

🤔 Why OmniRoute?

🎯 Combos — The Flagship

⚡ Zero-config — just use auto

🔀 Or build your own — 15 routing strategies

🧱 Resilience is built in (3 independent layers)

🏆 What Sets OmniRoute Apart

✨ What's New

🤖 Compatible CLIs & Coding Agents

🌐 231 AI Providers — 50+ Free

🆓 Free Forever — $0, no card

🖥️ Where OmniRoute Runs — Anywhere

🔒 Private & Local-First

🔌 Full CLI + A2A & MCP

⌨️ A real CLI (not just start)

🛰️ Remote mode — run the CLI here, OmniRoute on a VPS

🤝 Connect an agent — and it controls OmniRoute itself

🗜️ Save 15–95% Tokens — Automatically

🧱 The 9-engine stack

📖 How it works — pipeline, architecture & savings math

⚡ Quick Start

📦 More install methods — Docker, source, pnpm, Arch

🎬 OmniRoute in Action

📚 Explore More

📧 Support & Community

🛠️ Tech Stack

📖 Documentation

📘 Getting Started

🔧 Operations & Deployment

🧠 Features & Architecture

🤖 Protocols & APIs

📋 Project & Quality

⭐ Top Contributors

👥 Contributors

How to Contribute

Releasing a New Version

📊 Stars

🌍 StarMapper

🙏 Acknowledgments

🧬 Lineage & gateway

🗜️ Context & token compression — engines

🧩 Compact formats, token research & code-aware tooling

🧠 Memory & RAG

🛰️ Traffic inspection, MITM & transparent proxy

📚 Model data, observability & UI

🛡️ Security

❤️ Support

📄 License

About

Resources

License

Code of conduct

Contributing

Security policy

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

⚡ Zero-config — just use `auto`

⌨️ A real CLI (not just `start`)

Packages