Coven Code supports a wide range of LLM providers through a unified provider abstraction. Every provider implements the same LlmProvider trait, so switching between them requires only a configuration change.
Use the --provider flag on any invocation to override the active provider:
coven-code --provider openai "refactor this module"
coven-code --provider ollama "explain this function"
coven-code --provider groq --model llama-3.3-70b-versatile "write tests"
The provider can also be set persistently in ~/.coven-code/settings.json:
{
"provider": "openai"
}When no provider is specified, Coven Code defaults to Anthropic.
The default provider. Uses the /v1/messages streaming endpoint.
Authentication: ANTHROPIC_API_KEY environment variable, or set api_key in settings.json.
Default model: claude-sonnet-4-6
Available models (bundled snapshot):
| Model ID | Context Window | Max Output | Input ($/1M) | Output ($/1M) |
|---|---|---|---|---|
claude-opus-4-6 |
200,000 | 32,000 | $15.00 | $75.00 |
claude-sonnet-4-6 |
200,000 | 16,000 | $3.00 | $15.00 |
claude-haiku-4-5-20251001 |
200,000 | 8,096 | $0.80 | $4.00 |
All Anthropic models support tool calling, vision, and extended reasoning.
Configuration:
{
"provider": "anthropic",
"providers": {
"anthropic": {
"api_key": "sk-ant-...",
"models_whitelist": ["claude-sonnet-4-6", "claude-haiku-4-5-20251001"]
}
}
}Base URL override: Set ANTHROPIC_BASE_URL to point at a proxy or local mirror.
Uses the OpenAI Chat Completions API (/v1/chat/completions).
Authentication: OPENAI_API_KEY environment variable.
Default model: gpt-4o
Available models (bundled snapshot):
| Model ID | Context Window | Max Output | Reasoning |
|---|---|---|---|
gpt-4o |
128,000 | 16,384 | No |
gpt-4o-mini |
128,000 | 16,384 | No |
o3 |
200,000 | 100,000 | Yes |
o4-mini |
200,000 | 100,000 | Yes |
Configuration:
{
"provider": "openai",
"providers": {
"openai": {
"api_key": "sk-...",
"api_base": "https://api.openai.com/v1"
}
}
}Uses the Google Generative Language / Vertex AI API.
Authentication: GOOGLE_API_KEY environment variable (for AI Studio) or GOOGLE_APPLICATION_CREDENTIALS (for Vertex AI).
Default model: gemini-2.5-flash
Available models (bundled snapshot):
| Model ID | Context Window | Max Output |
|---|---|---|
gemini-2.5-pro |
1,048,576 | 65,536 |
gemini-2.5-flash |
1,048,576 | 65,536 |
gemini-2.0-flash |
1,048,576 | 8,192 |
Configuration:
{
"provider": "google",
"providers": {
"google": {
"api_key": "AIza..."
}
}
}Uses the Azure OpenAI Chat Completions endpoint. The deployment name acts as the model identifier.
Authentication: Three environment variables are required:
AZURE_API_KEY— your Azure OpenAI API keyAZURE_RESOURCE_NAME— your Azure resource name (the subdomain of.openai.azure.com)AZURE_API_VERSION— API version (defaults to2024-08-01-preview)
Default model: gpt-4o
Request URL format:
https://{AZURE_RESOURCE_NAME}.openai.azure.com/openai/deployments/{deployment}/chat/completions?api-version={version}
Configuration:
{
"provider": "azure",
"providers": {
"azure": {
"api_key": "...",
"options": {
"resource_name": "my-azure-resource",
"api_version": "2024-08-01-preview"
}
}
}
}Uses the Bedrock Converse Streaming API. Supports all Claude models deployed on Bedrock.
Authentication (two modes):
- Bearer token: Set
AWS_BEARER_TOKEN_BEDROCK(takes priority over SigV4). - SigV4 credentials: Set
AWS_ACCESS_KEY_ID,AWS_SECRET_ACCESS_KEY, and optionallyAWS_SESSION_TOKEN.
Region: Reads AWS_REGION or AWS_DEFAULT_REGION (defaults to us-east-1).
Default model: anthropic.claude-sonnet-4-6-v1
The adapter automatically prepends regional cross-inference prefixes (e.g. us.anthropic.claude-...) for US-region deployments.
Configuration:
{
"provider": "amazon-bedrock",
"providers": {
"amazon-bedrock": {
"options": {
"region": "us-east-1"
}
}
}
}Uses the GitHub Copilot Chat Completions API (https://api.githubcopilot.com/chat/completions).
Authentication: GITHUB_TOKEN environment variable.
Default model: gpt-4o
Configuration:
{
"provider": "github-copilot",
"providers": {
"github-copilot": {
"api_key": "ghu_..."
}
}
}Native Cohere API adapter.
Authentication: COHERE_API_KEY environment variable.
Default model: command-r-plus
Configuration:
{
"provider": "cohere",
"providers": {
"cohere": {
"api_key": "..."
}
}
}Connects to a locally running Ollama instance. No API key required.
Base URL: Reads OLLAMA_HOST (defaults to http://localhost:11434). Coven Code appends /v1 to construct the OpenAI-compatible endpoint.
Default model: llama3.2
Model list: When using /connect or /model, the picker queries your local Ollama server via /api/tags and shows only the models you have installed (ollama list). Cloud models (e.g., kimi-k2.6:cloud) appear after you run ollama pull <model>:cloud.
Configuration:
{
"provider": "ollama",
"providers": {
"ollama": {
"api_base": "http://localhost:11434"
}
}
}Run a model locally first with ollama pull llama3.2, then:
coven-code --provider ollama --model llama3.2 "explain this code"
Connects to a locally running LM Studio server. No API key required.
Base URL: Reads LM_STUDIO_HOST (defaults to http://localhost:1234). Coven Code appends /v1.
Default model: default (whichever model is loaded in LM Studio)
Configuration:
{
"provider": "lmstudio",
"providers": {
"lmstudio": {
"api_base": "http://localhost:1234/v1"
}
}
}Connects to a locally running llama.cpp HTTP server. No API key required.
Base URL: Reads LLAMA_CPP_HOST (defaults to http://localhost:8080). Coven Code appends /v1.
Default model: default
Configuration:
{
"provider": "llamacpp",
"providers": {
"llamacpp": {
"api_base": "http://localhost:8080/v1"
}
}
}Start llama.cpp with the --server flag before use.
Fast inference cloud with OpenAI-compatible API.
Authentication: GROQ_API_KEY environment variable.
Base URL: https://api.groq.com/openai/v1
Default model: llama-3.3-70b-versatile
Configuration:
{
"provider": "groq",
"providers": {
"groq": {
"api_key": "gsk_..."
}
}
}OpenAI-compatible API with extended reasoning output via a reasoning_content field.
Authentication: DEEPSEEK_API_KEY environment variable.
Base URL: https://api.deepseek.com/v1
Default model: deepseek-chat
Configuration:
{
"provider": "deepseek",
"providers": {
"deepseek": {
"api_key": "sk-..."
}
}
}OpenAI-compatible API with Mistral-specific protocol quirks (tool call ID formatting, tool-user sequence injection).
Authentication: MISTRAL_API_KEY environment variable.
Base URL: https://api.mistral.ai/v1
Default model: mistral-large-latest
Configuration:
{
"provider": "mistral",
"providers": {
"mistral": {
"api_key": "..."
}
}
}Authentication: XAI_API_KEY environment variable.
Base URL: https://api.x.ai/v1
Default model: grok-2
Configuration:
{
"provider": "xai",
"providers": {
"xai": {
"api_key": "xai-..."
}
}
}Unified API gateway to many models. Sends HTTP-Referer: https://coven-code.ai/ and `` headers automatically.
Authentication: OPENROUTER_API_KEY environment variable.
Base URL: https://openrouter.ai/api/v1
Default model: anthropic/claude-sonnet-4
Model identifiers use OpenRouter's routing format: provider/model-name.
Configuration:
{
"provider": "openrouter",
"providers": {
"openrouter": {
"api_key": "sk-or-..."
}
}
}Hosted open-source models.
Authentication: TOGETHER_API_KEY environment variable.
Base URL: https://api.together.xyz/v1
Default model: meta-llama/Llama-3.3-70B-Instruct-Turbo
Configuration:
{
"provider": "togetherai",
"providers": {
"togetherai": {
"api_key": "..."
}
}
}Search-augmented LLM API.
Authentication: PERPLEXITY_API_KEY environment variable.
Base URL: https://api.perplexity.ai
Default model: sonar-pro
Configuration:
{
"provider": "perplexity",
"providers": {
"perplexity": {
"api_key": "pplx-..."
}
}
}Hosted open-weight models on OpenAI-compatible API.
Authentication: DEEPINFRA_API_KEY environment variable.
Base URL: https://api.deepinfra.com/v1/openai
Default model: meta-llama/Llama-3.3-70B-Instruct
Configuration:
{
"provider": "deepinfra",
"providers": {
"deepinfra": {
"api_key": "..."
}
}
}Privacy-focused inference.
Authentication: VENICE_API_KEY environment variable.
Base URL: https://api.venice.ai/api/v1
Default model: llama-3.3-70b (resolved from the model registry at runtime)
Configuration:
{
"provider": "venice",
"providers": {
"venice": {
"api_key": "..."
}
}
}Wafer-scale inference hardware.
Authentication: CEREBRAS_API_KEY environment variable.
Base URL: https://api.cerebras.ai/v1
Default model: llama-3.3-70b
Configuration:
{
"provider": "cerebras",
"providers": {
"cerebras": {
"api_key": "..."
}
}
}The providers map in ~/.coven-code/settings.json accepts per-provider ProviderConfig objects:
{
"provider": "anthropic",
"providers": {
"anthropic": {
"api_key": "sk-ant-...",
"api_base": "https://api.anthropic.com",
"enabled": true,
"models_whitelist": [],
"models_blacklist": [],
"options": {}
},
"openai": {
"api_key": "sk-...",
"enabled": true
},
"ollama": {
"enabled": true,
"api_base": "http://192.168.1.50:11434/v1"
}
}
}Fields:
| Field | Type | Description |
|---|---|---|
api_key |
string | Override the environment variable API key |
api_base |
string | Override the default base URL |
enabled |
bool | Enable or disable the provider (default: true) |
models_whitelist |
array of strings | If non-empty, only listed model IDs are allowed |
models_blacklist |
array of strings | Listed model IDs are refused |
options |
object | Provider-specific pass-through options |
When models_whitelist is non-empty for a provider, only the listed model IDs can be selected for that provider. Any model ID in models_blacklist is rejected regardless of the whitelist:
{
"providers": {
"openai": {
"models_whitelist": ["gpt-4o", "gpt-4o-mini"],
"models_blacklist": ["gpt-4o-mini"]
}
}
}The above example allows only gpt-4o (whitelist minus blacklist).
Coven Code ships a bundled snapshot of models for Anthropic, OpenAI, and Google. At runtime it optionally refreshes from the public https://models.dev/api.json API (cached to ~/.coven-code/models_cache.json, refreshed at most every 5 minutes). Network failures are swallowed silently; the bundled snapshot is always sufficient for normal operation.
When no model is explicitly set, Coven Code scores available models by priority patterns to pick the best default. Well-known model prefixes (claude-*, gpt-*, gemini-*, etc.) are always routed to their canonical provider regardless of gateway entries in the remote cache.