Releases: Inebrio/Routerly

Routerly v0.1.5

27 Mar 18:02
d23230c

Routerly is a self-hosted LLM gateway that sits between your application and any AI provider. Swap one URL, keep every existing SDK, and get full control over routing, cost, and access — without changing a line of application code.


Core Features

OpenAI & Anthropic Wire-Compatible Proxy

Drop-in replacement for the OpenAI and Anthropic APIs. Any SDK, tool, or framework that speaks either protocol works without modification — Python, Node.js, .NET, Go, Cursor, Continue.dev, LibreChat, LangChain, LlamaIndex, Open WebUI, and more.
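Because the gateway speaks the providers' wire formats, the only client-side changes are the base URL and the Bearer token; the request body stays pure OpenAI wire format. A minimal sketch (the gateway URL and project token below are placeholder assumptions, not defaults shipped by Routerly):

```python
import json

# Hypothetical values — substitute your own gateway URL and project token.
GATEWAY_URL = "http://localhost:8080/v1/chat/completions"  # assumed address
PROJECT_TOKEN = "rtly-example-token"                       # per-project Bearer token

# The request body is plain OpenAI wire format; nothing gateway-specific.
body = json.dumps({
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "Hello"}],
})
headers = {
    "Authorization": f"Bearer {PROJECT_TOKEN}",
    "Content-Type": "application/json",
}
```

An existing SDK achieves the same thing by pointing its base URL at the gateway instead of the provider.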

Intelligent Multi-Policy Routing

Each request is scored against up to 9 pluggable routing policies applied simultaneously. Routerly picks the best candidate and falls back automatically if a provider fails.

| Policy | What it does |
| --- | --- |
| `llm` | Asks a language model to pick the best candidate given request context |
| `cheapest` | Minimises cost per token |
| `health` | Deprioritises models with recent errors |
| `performance` | Favours models with lower average latency |
| `capability` | Matches models to task requirements (vision, tools, JSON mode…) |
| `context` | Filters models by context window size relative to the prompt |
| `budget-remaining` | Excludes models that would push a project over its budget |
| `rate-limit` | Steers traffic away from rate-limited providers |
| `fairness` | Balances load across candidates |
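The scoring idea can be sketched as follows — each policy assigns every candidate a score, the totals decide the order, and a failed primary automatically falls back to the next-best candidate. The score functions and weights here are illustrative assumptions, not Routerly's internal values:

```python
# Hypothetical candidate metadata for illustration.
candidates = [
    {"model": "model-a", "cost_per_1k": 0.15, "avg_latency_ms": 400, "recent_errors": 0},
    {"model": "model-b", "cost_per_1k": 0.25, "avg_latency_ms": 300, "recent_errors": 2},
]

def cheapest(c):      # lower cost per token → higher score
    return 1.0 / c["cost_per_1k"]

def performance(c):   # lower average latency → higher score
    return 1000.0 / c["avg_latency_ms"]

def health(c):        # deprioritise models with recent errors
    return -5.0 * c["recent_errors"]

policies = [cheapest, performance, health]

def rank(cands):
    # All policies score all candidates simultaneously; the sorted order
    # doubles as the fallback chain if the top choice fails.
    return sorted(cands, key=lambda c: sum(p(c) for p in policies), reverse=True)

order = rank(candidates)
# model-a wins: cheaper and error-free outweighs model-b's latency edge.
```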

Real-Time Cost Tracking & Budgets

Every request is priced at the token level. Costs accumulate per project, and hard limits (hourly / daily / weekly / monthly / per-request) block overspending before it happens.
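A sketch of the token-level pricing and hard-limit gate described above (prices, field names, and the pre-dispatch check are illustrative assumptions, not Routerly's schema):

```python
# Per-token prices, expressed per million tokens (illustrative figures).
PRICES = {"model-a": {"in": 0.15 / 1_000_000, "out": 0.60 / 1_000_000}}

def price(model, tokens_in, tokens_out):
    p = PRICES[model]
    return tokens_in * p["in"] + tokens_out * p["out"]

project = {"spent_today": 4.98, "daily_limit": 5.00}

def admit(project, estimated_cost):
    # Hard limits block the request *before* it is forwarded upstream.
    return project["spent_today"] + estimated_cost <= project["daily_limit"]

cost = price("model-a", 2_000, 500)   # 2k prompt tokens, 500 completion tokens
ok = admit(project, cost)             # within today's budget?
```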

Multi-Tenant Project Isolation

Separate Bearer tokens per project. Each project has its own model pool, routing policies, and budget envelope — ideal for multi-tenant SaaS, team isolation, or dev/staging/prod separation.
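Conceptually, the Bearer token is the tenancy key: it resolves to exactly one project, and that project's model pool, policies, and budget bound everything the request can do. A sketch under assumed field names (Routerly's actual config lives in its JSON files):

```python
# Hypothetical project records keyed by Bearer token (illustrative only).
projects = {
    "rtly-prod-token": {
        "name": "prod",
        "models": ["model-a", "model-b"],
        "policies": ["cheapest", "health"],
        "daily_limit": 50.0,
    },
    "rtly-staging-token": {
        "name": "staging",
        "models": ["model-a"],
        "policies": ["cheapest"],
        "daily_limit": 5.0,
    },
}

def resolve(auth_header):
    token = auth_header.removeprefix("Bearer ")
    return projects.get(token)  # unknown token → no project → reject

proj = resolve("Bearer rtly-staging-token")
```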

Streaming & Tool Calling

Full streaming SSE support and OpenAI-compatible tool/function calling pass-through, including streaming tool call deltas.
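In the OpenAI streaming protocol, a tool call arrives as a sequence of deltas — the function name in the first chunk, then argument fragments — which the client reassembles. A minimal sketch of that reassembly (the chunk payloads are hand-written samples):

```python
import json

# Sample OpenAI-style streaming tool-call deltas, as a client would see them.
chunks = [
    {"index": 0, "function": {"name": "get_weather", "arguments": ""}},
    {"index": 0, "function": {"arguments": '{"city": '}},
    {"index": 0, "function": {"arguments": '"Paris"}'}},
]

calls = {}
for delta in chunks:
    call = calls.setdefault(delta["index"], {"name": "", "arguments": ""})
    fn = delta["function"]
    if "name" in fn:
        call["name"] = fn["name"]
    call["arguments"] += fn.get("arguments", "")  # concatenate fragments

args = json.loads(calls[0]["arguments"])  # full arguments, once complete
```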

Supported Providers

OpenAI · Anthropic · Google Gemini · Ollama (local) · Mistral · Cohere · xAI (Grok) · Any custom HTTP endpoint. Mix cloud and local models in the same project.


Dashboard

Built-in React web dashboard with:

  • Overview — live spend, call volume, success rate, and daily cost trend
  • Models — manage registered models with provider badges and pricing info
  • Projects — create and configure projects with routing policies and model pools
  • Usage analytics — per-model breakdown of calls, tokens in/out, errors, and cost; filterable by period, project, and status
  • Users & Roles — RBAC with configurable roles and permissions
  • Light / dark theme

CLI (routerly)

Full-featured admin CLI:

  • routerly status — server health, reachability, uptime, model/project counts; --json for scripting
  • routerly model — add, list, remove models
  • routerly project — add, list, remove projects; manage model assignments
  • routerly auth — manage multiple server accounts (login, logout, switch, rename, whoami)
  • routerly user — add, list, remove users
  • routerly role — add, list, edit, remove roles
  • routerly report — pull usage and call reports
  • routerly service — start/stop/configure the local service

Installer

One-line install for macOS, Linux, and Windows:

```shell
curl -fsSL https://github.com/Inebrio/Routerly/releases/latest/download/install.sh | bash
```

The installer detects the platform, optionally installs Node.js 20+, builds packages, generates an encryption key, sets up an auto-start daemon, and walks through first-run setup (model, project, admin user). On existing installs it shows an Update / Reinstall / Uninstall menu.

A Docker Compose setup is also available for zero-dependency deployment.


Design Philosophy

  • No external database — config and usage data live in plain JSON files
  • Fully decoupled — service, dashboard, and CLI can run on different machines
  • No telemetry — your prompts and keys never leave your infrastructure
  • Zero migration cost — one env-var change, nothing else
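For instance, the official OpenAI Python SDK resolves its base URL from the `OPENAI_BASE_URL` environment variable, so redirecting an app through the gateway can be a single environment change. A sketch of that resolution pattern (the gateway URL is an assumption):

```python
import os

# One env-var change: point the SDK's base URL at the gateway
# (http://localhost:8080/v1 is a placeholder, not a Routerly default).
os.environ["OPENAI_BASE_URL"] = "http://localhost:8080/v1"

# How SDKs typically resolve it: env var first, provider default otherwise.
base_url = os.environ.get("OPENAI_BASE_URL", "https://api.openai.com/v1")
```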