title: Getting Started
description: Install and run FrugalRoute, send your first request, and connect any OpenAI-compatible client.

Getting Started

FrugalRoute is a local-first LLM routing layer that sits between your application and your models. Instead of hardcoding a specific model, you describe what you need -- and FrugalRoute picks the cheapest capable model, starting with local inference via Ollama and escalating to cloud providers only when necessary.

It exposes an OpenAI-compatible API on localhost:3100. Any client that speaks the OpenAI chat completions protocol works -- the OpenAI SDK, the Anthropic SDK (FrugalRoute normalizes its messages format internally), fetch, curl, or any HTTP client. FrugalRoute is not tied to OpenAI; it uses the OpenAI wire format as a universal interface.

Prerequisites

  • Ollama (for local model inference)
  • Either Node.js (v16+) or Bun (v1.1+) to run FrugalRoute
  • Optionally, API keys for cloud providers: OpenAI, Anthropic, Google (Gemini), Groq, Mistral, Kimi (Moonshot), or DeepSeek
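
If you want to confirm these are in place before continuing, a quick version check covers them (Bun only if that's your runtime):

ollama --version
node --version   # or: bun --version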

Installation

From npm (recommended)

npm install -g frugalroute

Or run without installing:

npx frugalroute
# or with bun
bunx frugalroute

That's it. A global install makes the frugalroute command available everywhere; npx/bunx runs it without installing. Either way, the server starts on http://localhost:3100.

From source

If you want to hack on FrugalRoute or use unreleased features:

git clone https://github.com/SimplyLiz/FrugalRoute && cd FrugalRoute
bun install
bun run dev

Configuration

Copy the example environment file and edit it:

cp .env.example .env

The defaults work for local-only usage. Add API keys for any cloud providers you want:

# .env
PORT=3100

OLLAMA_BASE_URL=http://localhost:11434

# Cloud providers — all optional, add the ones you have
OPENAI_API_KEY=sk-your-key-here
ANTHROPIC_API_KEY=sk-ant-your-key-here
GOOGLE_API_KEY=your-key-here
GROQ_API_KEY=gsk_your-key-here
MISTRAL_API_KEY=your-key-here
KIMI_API_KEY=your-key-here
DEEPSEEK_API_KEY=your-key-here

EMBEDDING_MODEL=nomic-embed-text
DEFAULT_MAX_COST_PER_REQUEST=0.01

Each key you add registers that provider's models. No key = no registration, no errors. You can start with just Ollama and add cloud providers later.
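
For a local-only setup, a minimal .env is just the Ollama-related settings -- a sketch drawn from the example above, nothing new:

# .env -- local-only
PORT=3100
OLLAMA_BASE_URL=http://localhost:11434
EMBEDDING_MODEL=nomic-embed-text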

Pull local models

FrugalRoute needs at least one local model and the embedding model for semantic routing:

ollama pull gemma3:4b
ollama pull nomic-embed-text
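
To confirm both models are available before starting the server, list what Ollama has pulled; you should see gemma3:4b and nomic-embed-text in the output:

ollama list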

Start the server

If installed globally via npm:

frugalroute

If running from source:

bun run dev

You should see the server listening on http://localhost:3100.

First request

Send a chat completion request with curl:

curl http://localhost:3100/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      { "role": "user", "content": "Explain what a merkle tree is in two sentences." }
    ]
  }'

FrugalRoute will select the most cost-effective model that can handle the request, run inference, and return a standard OpenAI-shaped response.
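
The response body is a standard OpenAI chat completion object. An abridged sketch (field values are illustrative -- the actual model and token counts depend on what the router picked):

{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "model": "gemma3:4b",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "A Merkle tree is a hash tree in which ..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": { "prompt_tokens": 18, "completion_tokens": 42, "total_tokens": 60 }
}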

Connecting your application

FrugalRoute speaks the OpenAI wire format, so any client that supports a custom base_url works. You do not need an OpenAI account or API key to use FrugalRoute -- Ollama runs locally by default with zero cost.

curl

curl http://localhost:3100/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      { "role": "user", "content": "What is a B-tree?" }
    ]
  }'

Omit the model field entirely to let the router decide. Or use a model alias:

curl http://localhost:3100/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "fast",
    "messages": [
      { "role": "user", "content": "Format this as JSON: name=Alice age=30" }
    ]
  }'

Default aliases: fast -> local model, smart -> GPT-4o, best -> Claude Sonnet.

Python (OpenAI SDK)

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:3100/v1",
    api_key="unused",  # FrugalRoute does not require an API key by default
)

response = client.chat.completions.create(
    model="auto",  # let the router decide
    messages=[
        {"role": "user", "content": "What is a B-tree?"}
    ],
)

print(response.choices[0].message.content)

TypeScript (OpenAI SDK)

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "http://localhost:3100/v1",
  apiKey: "unused",
});

const response = await client.chat.completions.create({
  model: "auto",
  messages: [
    { role: "user", content: "What is a B-tree?" },
  ],
});

console.log(response.choices[0].message.content);

TypeScript (fetch)

No SDK needed -- FrugalRoute is just an HTTP endpoint:

const response = await fetch("http://localhost:3100/v1/chat/completions", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    messages: [{ role: "user", content: "What is a B-tree?" }],
  }),
});

const data = await response.json();
console.log(data.choices[0].message.content);

Python (Anthropic SDK)

The Anthropic SDK can also point at FrugalRoute -- override base_url and FrugalRoute normalizes the Anthropic messages format internally:

import anthropic

client = anthropic.Anthropic(
    base_url="http://localhost:3100",
    api_key="unused",
)

# Use the messages API -- FrugalRoute normalizes the format internally.
# (Illustrative completion: "auto" and max_tokens=256 are placeholder values, not documented defaults.)
response = client.messages.create(
    model="auto",
    max_tokens=256,
    messages=[{"role": "user", "content": "What is a B-tree?"}],
)

print(response.content[0].text)  # assumes the reply comes back Anthropic-shaped

What model should I set?

Value                              Behaviour
(omitted or "auto")                Router picks the cheapest capable model based on your prompt
"fast" / "smart" / "best"          Model alias -- resolved to a real model ID before routing
"gemma3-4b" / "gpt-4o" / etc.      Direct model -- bypasses the router entirely

Running without cloud keys

FrugalRoute works with just Ollama. If you don't set any cloud provider keys, the router uses local models exclusively. Cost: $0 for every request.

Cloud providers are optional escalation targets. Add keys only if you want the router to escalate complex tasks (reasoning, coding) to more capable models when local confidence is low.
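
When you're ready to add escalation, a single key is enough. For instance (illustrative -- any of the supported providers works the same way):

# .env -- local-first, with one cloud escalation target
OLLAMA_BASE_URL=http://localhost:11434
ANTHROPIC_API_KEY=sk-ant-your-key-here
DEFAULT_MAX_COST_PER_REQUEST=0.01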

Next steps

  • Configuration -- tune routing thresholds, cost limits, and provider priorities
  • Routing -- understand capability matching, aliases, sticky sessions, and the escalation cascade
  • Observability -- circuit breaker, latency tracking, and health probing
  • API Reference -- full endpoint documentation