One endpoint. 20+ providers. Zero vendor lock-in. Full security.
Chimera Gateway is a production-grade, self-hosted AI API gateway that routes your LLM requests across 15 free providers with intelligent fallback, latency-aware routing, local Ollama support, and battle-tested security defences directly informed by the "Your Agent Is Mine" research paper (arXiv:2604.08407).
Drop in as a replacement for any OpenAI-compatible client — zero code changes required.
# Before (locked to one provider)
curl https://api.openai.com/v1/chat/completions ...
# After (21 providers, smart routing, full security)
curl http://localhost:8000/v1/chat/completions ...
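The same switch can be sketched in Python using only the standard library. This is a minimal illustration, not the project's client: `build_chat_request` is a hypothetical helper, the Bearer header is an assumption based on the "Auth Gate (Bearer)" box in the diagram, and the request is only built, not sent.

```python
import json
import urllib.request

# Assumption: the gateway listens on localhost:8000, as in the curl example.
GATEWAY_URL = "http://localhost:8000/v1/chat/completions"


def build_chat_request(prompt, model="auto", api_key=None, url=GATEWAY_URL):
    """Build (but do not send) an OpenAI-style chat completion request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Assumption: the gateway accepts a Bearer token when auth is enabled.
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


request = build_chat_request("Hello!", api_key="your_chimera_api_key")
# To send it: urllib.request.urlopen(request, timeout=30)
```

Only the URL differs from a direct provider call, which is the point of the drop-in design.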
            ┌─────────────────────────────────────────────┐
            │              Your Application               │
            │      (OpenAI SDK · Claude SDK · curl ·      │
            │              any HTTP client)               │
            └──────────────────────┬──────────────────────┘
                                   │
               POST /v1/chat/completions or /v1/messages
                                   │
                                   ▼
┌─────────────────────────────────────────────────────────────────────┐
│                       Chimera Gateway v8.2.0                        │
│                                                                     │
│  ┌───────────┐    ┌─────────┐    ┌───────────┐    ┌──────────┐      │
│  │ Auth Gate │ →  │   WAF   │ →  │  Prompt   │ →  │ Content  │      │
│  │ (Bearer)  │    │         │    │  Shield   │    │ Policy   │      │
│  └───────────┘    └─────────┘    └───────────┘    └──────────┘      │
│                                  │                                  │
│  ┌─────────────────────────────────────────────────────────────┐    │
│  │                 Intelligent Routing Engine                  │    │
│  │        auto · fast · quality · balanced · reasoning         │    │
│  │       + 21 provider circuit breakers & rate limiters        │    │
│  └─────────────────────────────────────────────────────────────┘    │
│                                                                     │
│  ┌───────────┐    ┌───────────┐    ┌────────────────────┐           │
│  │ Canaries  │    │   E2EE    │    │  Transparency Log  │           │
│  │ (key leak)│    │ (AES-GCM) │    │  (SHA-256 audit)   │           │
│  └───────────┘    └───────────┘    └────────────────────┘           │
└──────────────────────────────────┬──────────────────────────────────┘
                                   │
                  ┌────────────────┼────────────────┐
                  ▼                ▼                ▼
           ┌────────────┐  ┌──────────────┐  ┌──────────────┐
           │   Cloud    │  │    Local     │  │    Custom    │
           │ Providers  │  │    Ollama    │  │     BYOK     │
           │ (20 APIs)  │  │  (offline)   │  │  (vLLM etc)  │
           └────────────┘  └──────────────┘  └──────────────┘
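The routing engine in the diagram pairs each provider with a circuit breaker. The sketch below shows one common way such fallback can work; it is not Chimera's actual implementation. `CircuitBreaker`, `route_with_fallback`, the failure threshold, and the cooldown are all hypothetical names and values chosen for illustration.

```python
import time


class CircuitBreaker:
    """Opens after `max_failures` consecutive errors; allows retries after `cooldown` seconds."""

    def __init__(self, max_failures=3, cooldown=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.cooldown = cooldown
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def allows_request(self):
        if self.opened_at is None:
            return True
        # Once the cooldown has elapsed, requests are allowed again (a simple half-open state).
        return self.clock() - self.opened_at >= self.cooldown

    def record_success(self):
        self.failures = 0
        self.opened_at = None

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.max_failures:
            self.opened_at = self.clock()


def route_with_fallback(providers, breakers, send):
    """Try providers in order, skipping any whose breaker is open."""
    errors = {}
    for name in providers:
        breaker = breakers[name]
        if not breaker.allows_request():
            errors[name] = "circuit open"
            continue
        try:
            result = send(name)
        except Exception as exc:  # a real gateway would catch narrower errors
            breaker.record_failure()
            errors[name] = str(exc)
            continue
        breaker.record_success()
        return name, result
    raise RuntimeError(f"all providers failed: {errors}")
```

Here `send` stands in for whatever function performs the upstream call. Ordering `providers` by measured latency or by quality before calling `route_with_fallback` would give the latency-aware or quality-first behaviour described above.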
git clone https://github.com/Mr-DS-ML-85/chimera-ai-gateway.git
cd chimera-ai-gateway
pip install -r requirements.txt
cp .env.example .env

Edit .env and add at least one API key. Pollinations.AI works with zero keys.
# Minimum: one key (or nothing for Pollinations free)
GROQ_API_KEY=your_key_here

# Development
DEV=1 python main.py
# Production
python main.py
# Health check
curl http://localhost:8000/v1/health

# OpenAI-compatible
curl -X POST http://localhost:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{"model": "auto", "messages": [{"role": "user", "content": "Hello!"}]}'
# Anthropic / Claude Code — /v1/messages
curl -X POST http://localhost:8000/v1/messages \
-H "Content-Type: application/json" \
-H "x-api-key: your-key" \
-d '{"model": "auto", "messages": [{"role": "user", "content": "Hello!"}]}'

# Build & run
docker build -t chimera-gateway .
docker run -d --name chimera -p 8000:8000 --env-file .env chimera-gateway
# Docker Compose (Chimera + Ollama)
docker compose up -d

Use these special model names for automatic intelligent routing:
- `auto`: Best free non-reasoning model (fast by default)
- `auto:free`: Free-tier only, non-reasoning
- `auto:reasoning`: Best free reasoning/math/code model
- `auto:free:reasoning`: Free-tier only, reasoning
- `fast` / `fast:free`: Prioritise latency
- `quality` / `balanced`: Prioritise output quality
- `reasoning` / `reasoning:free`: Reasoning models
- `non-reasoning` / `non-reasoning:free`: General-purpose models
Provider-prefixed variants also work: groq/auto, openrouter/reasoning, ollama/fast, etc.
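To make the naming scheme concrete, here is a minimal sketch of how such names could be split into a provider, a routing mode, and flags. The alias names come from the list above; the parsing rules, `VirtualRoute`, and `parse_virtual_model` are assumptions for illustration and may differ from what `providers/virtual_routes.py` actually does.

```python
from dataclasses import dataclass
from typing import Optional

# Alias heads taken from the list above; the parsing rules are an assumption.
ROUTING_MODES = {"auto", "fast", "quality", "balanced", "reasoning", "non-reasoning"}


@dataclass(frozen=True)
class VirtualRoute:
    provider: Optional[str]  # e.g. "groq", or None for any provider
    mode: str                # one of ROUTING_MODES
    free_only: bool
    reasoning: bool


def parse_virtual_model(name: str) -> Optional[VirtualRoute]:
    """Return a VirtualRoute for alias names, or None for concrete model IDs."""
    provider, _, alias = name.rpartition("/")
    mode, *flags = alias.split(":")
    if mode not in ROUTING_MODES or not set(flags) <= {"free", "reasoning"}:
        return None
    return VirtualRoute(
        provider=provider or None,
        mode=mode,
        free_only="free" in flags,
        reasoning=mode == "reasoning" or "reasoning" in flags,
    )
```

Under these rules, `groq/auto:free:reasoning` resolves to the `groq` provider in `auto` mode with both flags set, while a concrete ID such as `google/gemini-3-flash` returns `None` and would be passed through unchanged.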
Chimera natively supports the Anthropic API format via /v1/messages and /v1/responses — works directly with Claude Code, Claude SDK (Node/Python), and any Anthropic-compatible client.
# Set environment
export ANTHROPIC_BASE_URL="http://localhost:8000"
export ANTHROPIC_AUTH_TOKEN="your_chimera_api_key"

// Node.js: Claude SDK
import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic({ baseURL: process.env.ANTHROPIC_BASE_URL });
const msg = await client.messages.create({
model: "auto:reasoning",
max_tokens: 1024,
messages: [{ role: "user", content: "Explain quantum entanglement" }],
});

# Python: Anthropic SDK
from anthropic import Anthropic
client = Anthropic(base_url="http://localhost:8000")
msg = client.messages.create(
    model="auto:reasoning",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}],
)

# Python: Anthropic SDK with an explicit key and a model alias
from anthropic import Anthropic
client = Anthropic(
    base_url="http://127.0.0.1:8000",  # Use 127.0.0.1 for Claude Desktop
    api_key="your_chimera_api_key",
)
msg = client.messages.create(
    model="sonnet",  # or "haiku", "opus", "claude-4.6-sonnet"
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}],
)
| Endpoint | Description |
|---|---|
| POST /v1/messages | Messages API: non-streaming and streaming (SSE) |
| POST /v1/messages/count_tokens | Token counting for pre-flight checks |
| POST /v1/responses | Responses API format wrapper |
All virtual model aliases work with /v1/messages:
- `auto` / `auto:free`: Best non-reasoning model (fast by default)
- `auto:reasoning` / `auto:free:reasoning`: Best reasoning model
- `fast` / `quality` / `balanced`: Latency vs quality trade-offs
Model alias shortcuts: sonnet → claude-sonnet-4-7, opus → claude-opus-4-5, haiku → claude-haiku-4-7
Claude Desktop (and Cowork) validates model names in Gateway mode — only names containing claude, sonnet, opus, haiku, or anthropic are accepted (see GitHub #56990). Chimera automatically rewrites free/third-party model names to the opencode-zen/ prefix so they pass validation while routing to the correct backend:
| Client model name | Rewritten to |
|---|---|
| claude-opus-4.7 | opencode-zen/minimax-m2.5-free |
| claude-4.5-Haiku | google/gemini-3-flash |
| claude-opus-4.6 | auto-reasoning |

You can add your own aliases via .env.
This applies to both /v1/chat/completions and /v1/messages endpoints. No client config change needed — just use the model name as-is.
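The two steps described here, a client-side name check followed by a gateway-side rewrite, can be sketched as follows. The accepted substrings come from the paragraph above; the case-insensitive match, the exact-match lookup, and the function names are assumptions made for illustration, not Chimera's confirmed behaviour.

```python
import json

# Substrings Claude Desktop accepts in Gateway mode, per the paragraph above.
ACCEPTED_MARKERS = ("claude", "sonnet", "opus", "haiku", "anthropic")


def passes_client_validation(model: str) -> bool:
    """True if the name contains an accepted marker (case-insensitivity is assumed)."""
    lowered = model.lower()
    return any(marker in lowered for marker in ACCEPTED_MARKERS)


def resolve_alias(model: str, aliases_json: str) -> str:
    """Map a client-facing name to a backend model; unknown names pass through."""
    aliases = json.loads(aliases_json) if aliases_json else {}
    return aliases.get(model, model)


# Hypothetical alias map in the same JSON shape as MODEL_ALIASES_JSON.
aliases = '{"claude-opus-4.7": "opencode-zen/minimax-m2.5-free"}'
backend = resolve_alias("claude-opus-4.7", aliases)
```

The client only ever sees a name that passes its own validation; the mapping to the real backend model stays inside the gateway.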
Set ANTHROPIC_API_KEY in .env to route directly to Anthropic for model names starting with anthropic/:
ANTHROPIC_API_KEY=sk-ant-...
ANTHROPIC_BASE_URL=https://api.anthropic.com
ANTHROPIC_TIMEOUT=120

These headers are forwarded through to the upstream provider:
- `anthropic-beta`: enables beta features
- `anthropic-version`: API version (e.g. 2023-06-01)
- `anthropic-dangerous-direct-browser-access`: for direct browser clients
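Header forwarding of this kind is usually an allow-list, so that credentials and cookies from the client never reach the upstream provider. A minimal sketch, assuming case-insensitive header names (as in HTTP) and a hypothetical helper named `select_forwarded_headers`:

```python
# The three headers listed above; everything else is dropped.
FORWARDED_HEADERS = (
    "anthropic-beta",
    "anthropic-version",
    "anthropic-dangerous-direct-browser-access",
)


def select_forwarded_headers(incoming: dict) -> dict:
    """Keep only the allow-listed Anthropic headers, matching names case-insensitively."""
    lowered = {name.lower(): value for name, value in incoming.items()}
    return {name: lowered[name] for name in FORWARDED_HEADERS if name in lowered}


upstream_headers = select_forwarded_headers(
    {"Anthropic-Version": "2023-06-01", "x-api-key": "secret", "Cookie": "a=b"}
)
```

In this example only `anthropic-version` survives; the client's `x-api-key` and `Cookie` values are not passed on.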
| # | Provider | Free Tier | RPM | Capabilities | Key Variable |
|---|---|---|---|---|---|
| 1 | Groq | ✅ No CC | 30 | Tools · System · Stream | GROQ_API_KEY |
| 2 | Google AI Studio | ✅ No CC | 15 | Vision · Tools · System · Stream | GOOGLE_API_KEY |
| 3 | OpenRouter | ✅ No CC | 20 | Vision · Tools · System · Stream | OPENROUTER_API_KEY |
| 4 | Cloudflare Workers AI | 10K neurons/day | 300 | System · Stream | CF_ACCOUNT_ID + CF_API_TOKEN |
| 5 | GitHub Models | ✅ 50 req/day | 5–15 | Tools · System · Stream | GITHUB_TOKEN |
| 6 | NVIDIA NIM | | 40 | Tools · System · Stream | NVIDIA_NIM_API_KEY |
| 7 | a4f.co | ✅ No CC | 20 | System · Stream | A4F_API_KEY |
| 8 | Cerebras | ✅ No CC | 30 | Tools · System · Stream | CEREBRAS_API_KEY |
| 9 | Pollinations.AI | ✅ Zero-key | 10 | Vision · System | None needed |
| 10 | Ollama (Local) | ✅ Unlimited | ∞ | Vision · Tools · System · Stream | OLLAMA_BASE_URL |
| 11 | HuggingFace | ✅ No CC | 60 | System · Stream | HUGGINGFACE_API_KEY |
| 12 | SambaNova | ✅ No CC | 30 | System · Stream | SAMBANOVA_API_KEY |
| 13 | Together AI | $5 credit | 60 | Tools · System · Stream | TOGETHER_API_KEY |
| 14 | LLM7.io | ✅ Anonymous | 30 | Vision · System | LLM7_API_KEY |
| 15 | Mistral AI | ✅ Experiment | 2 | Vision · Tools · System · Stream | MISTRAL_API_KEY |
| 16 | xAI / Grok | | 60 | Vision · Tools · System · Stream | XAI_API_KEY |
| 17 | DeepSeek | ✅ Trial credits | 60 | Tools · System · Stream | DEEPSEEK_API_KEY |
| 18 | Perplexity | ❌ Paid API | 20 | System · Stream · Search | PERPLEXITY_API_KEY |
| 19 | Fireworks AI | $1 credit | 6,000 | Vision · Tools · System · Stream | FIREWORKS_API_KEY |
| 20 | DeepInfra | ✅ Trial credits | 12,000 | Tools · System · Stream | DEEPINFRA_API_KEY |
| 21 | Anthropic (direct) | | — | Vision · Tools · Thinking · Stream | ANTHROPIC_API_KEY |
| 22 | Custom (BYOK) | ✅ User-defined | ∞ | Vision · Tools · System · Stream | CUSTOM_OPENAI_* |
See docs/PROVIDERS.md for full provider details and model lists.
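The RPM column is the kind of limit a sliding-window rate limiter enforces per provider. The sketch below shows the general technique only; `SlidingWindowLimiter` and its parameters are hypothetical and not taken from `providers/rate_limiter.py`.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `limit` calls in any trailing `window` seconds."""

    def __init__(self, limit: int, window: float = 60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self._stamps = deque()

    def try_acquire(self) -> bool:
        now = self.clock()
        # Drop timestamps that have slid out of the window.
        while self._stamps and now - self._stamps[0] >= self.window:
            self._stamps.popleft()
        if len(self._stamps) >= self.limit:
            return False
        self._stamps.append(now)
        return True


# Example: a 30 RPM budget, matching the Groq row in the table.
groq_limiter = SlidingWindowLimiter(limit=30, window=60.0)
```

A router could call `try_acquire()` before each upstream request and move to the next provider when it returns `False`. The injectable `clock` exists only to make the limiter easy to test.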
Built on the "Your Agent Is Mine" threat model (arXiv:2604.08407):
- AC-1 Payload Injection — Pattern + Base64 nested scan for injected code
- AC-2 Secret Exfiltration — Canary token detection for API key leaks
- Prompt Shield — Many-shot and encoding-bypass injection detection
- Content Policy — CSAM, WMD, self-harm block lists
- WAF — SQLi, XSS, CMDi, path traversal protection
- PII Redaction — Automatic sensitive data masking in logs
- SSRF Guard — No redirect following to internal resources
- AES-256-GCM E2EE — Optional per-request response encryption
- HMAC Request Signing — Request integrity verification
- Transparency Log — Append-only SHA-256 audit trail
See docs/SECURITY.md for full details.
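An append-only SHA-256 audit trail is commonly built as a hash chain, where each entry's hash covers the previous entry's hash. The sketch below illustrates that idea under stated assumptions: the chaining scheme, the `GENESIS` value, and the entry layout are hypothetical, and the real format is whatever docs/SECURITY.md specifies.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry (assumption)


def _digest(prev_hash: str, event: dict) -> str:
    """Hash the previous hash together with a canonical JSON form of the event."""
    body = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(log: list, event: dict) -> dict:
    """Append an event whose hash covers the previous entry's hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    entry = {"event": event, "prev_hash": prev_hash, "hash": _digest(prev_hash, event)}
    log.append(entry)
    return entry


def verify_log(log: list) -> bool:
    """Recompute every hash; an edited or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True
```

Because each hash depends on everything before it, changing one historical entry invalidates every later hash, which is what makes tampering detectable.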
Key .env variables:
# Gateway
CHIMERA_API_KEY=your_master_key
ROUTE_BY=quality # quality | latency | random
# Provider keys (add as needed)
GROQ_API_KEY=
GOOGLE_API_KEY=
OPENROUTER_API_KEY=
POLLINATIONS_API_KEY= # optional — works without it
OLLAMA_BASE_URL=http://localhost:11434
CUSTOM_OPENAI_BASE_URL= # your vLLM / ollama server
# Anthropic direct routing (optional — for claude-sonnet-4, claude-opus-4, etc.)
ANTHROPIC_API_KEY=sk-ant-...
ANTHROPIC_BASE_URL=https://api.anthropic.com
ANTHROPIC_TIMEOUT=120
# OpenCode Zen — free model gateway (minimax-m2.5-free, gemini-3-flash, glm-5, etc.)
# Also used for Claude Desktop model name rewriting (see README)
OPENCODE_ZEN_API_KEY=
# Security
ENABLE_WAF=1
ENABLE_PII_REDACTION=1
# Runtime
PORT=8000
DEV=0 # 1 for development

chimera-ai-gateway/
├── main.py # Entry point (python main.py)
├── api/
│ ├── app.py # FastAPI app factory
│ ├── middleware.py # CORS, request ID, client IP
│ └── routes/
│ ├── chat.py # /v1/chat/completions + /v1/messages
│ ├── health.py # /v1/health
│ ├── models.py # /v1/models
│ ├── metrics.py # /v1/metrics
│ ├── admin.py # /v1/admin/* (keys, providers)
│ └── transparency.py # /v1/transparency-log
├── providers/
│ ├── catalogue.py # 21 provider definitions + live discovery
│ ├── router.py # Intelligent routing engine
│ ├── virtual_routes.py # Virtual model alias resolution
│ ├── circuit_breaker.py # Per-provider health/failover
│ ├── rate_limiter.py # Sliding-window RPM/RPD limits
│ └── auto_models.py # Hourly live model refresh
├── security/
│ ├── waf.py # Web Application Firewall
│ ├── prompt_shield.py # Injection detection
│ ├── canary.py # API key exfil detection
│ ├── pii.py # PII redaction
│ └── ssrf.py # SSRF protection
├── crypto/
│ └── e2ee.py # AES-256-GCM E2EE
├── keys/
│ └── virtual_keys.py # Scoped API key management
├── transparency/
│ └── log.py # SHA-256 audit log
└── docs/
├── API.md # Full API reference
├── PROVIDERS.md # Detailed provider docs
└── SECURITY.md # Security architecture
Note: This is a flat-package project. Run it with `PYTHONPATH=. python main.py`, or from the project root. Do NOT `pip install` the directory; it is not a published package.
# Full test suite (mocked — no API calls)
pytest tests/ -v
# With coverage
pytest tests/ -v --cov=. --cov-report=html
# Specific test categories
pytest tests/ -v -m security # security tests
pytest tests/ -v -m integration # integration tests
# Lint
ruff check .

MIT. See LICENSE.
- cheahjs/free-llm-api-resources — free provider catalogue
- "Your Agent Is Mine" (arXiv:2604.08407) — security threat model
- FastAPI — web framework
- All 21 integrated AI providers
Whitelist LDAP/XML payloads in .env:
WAF_BYPASS_PATTERNS=ldap|xml|/v1/models

Define model aliases in .env:
MODEL_ALIASES_JSON='{"sonnet": "gemini-3.1-flash-lite", "opus": "opencode/minimax-m2.5-free"}'

MIT License, © 2026 Mr-DS-ML-85