
ZeroClaw Providers Reference

This document maps provider IDs, aliases, and credential environment variables.

Last verified: March 1, 2026.

How to List Providers

zeroclaw providers

Credential Resolution Order

Runtime resolution order is:

  1. Explicit credential from config/CLI
  2. Provider-specific env var(s)
  3. Generic fallback env vars: ZEROCLAW_API_KEY then API_KEY

For resilient fallback chains (reliability.fallback_providers), each fallback provider resolves credentials independently. The primary provider's explicit credential is not reused for fallback providers.
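The resolution order above can be sketched in pure shell using parameter expansion. This is an illustrative model, not ZeroClaw's actual code; `EXPLICIT_KEY` is a hypothetical stand-in for a config/CLI-supplied credential, and openrouter is used as the example provider.

```shell
# Hypothetical sketch of the documented resolution order for one provider.
EXPLICIT_KEY=""                 # 1. explicit credential from config/CLI (empty here)
OPENROUTER_API_KEY=""           # 2. provider-specific env var (also empty)
ZEROCLAW_API_KEY="zc-generic"   # 3a. generic fallback
API_KEY="last-resort"           # 3b. final fallback

# ${var:-fallback} treats empty the same as unset, so the chain walks
# down until it finds a non-empty value.
resolved="${EXPLICIT_KEY:-${OPENROUTER_API_KEY:-${ZEROCLAW_API_KEY:-$API_KEY}}}"
echo "$resolved"
```

With the values above, the chain resolves to the generic `ZEROCLAW_API_KEY` value because both higher-priority sources are empty.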

Provider Catalog

| Canonical ID | Aliases | Local | Provider-specific env var(s) |
|---|---|---|---|
| openrouter | — | No | OPENROUTER_API_KEY |
| anthropic | — | No | ANTHROPIC_OAUTH_TOKEN, ANTHROPIC_API_KEY |
| openai | — | No | OPENAI_API_KEY |
| ollama | — | Yes | OLLAMA_API_KEY (optional) |
| gemini | google, google-gemini | No | GEMINI_API_KEY, GOOGLE_API_KEY |
| venice | — | No | VENICE_API_KEY |
| vercel | vercel-ai | No | VERCEL_API_KEY |
| cloudflare | cloudflare-ai | No | CLOUDFLARE_API_KEY |
| moonshot | kimi | No | MOONSHOT_API_KEY |
| stepfun | step, step-ai, step_ai | No | STEP_API_KEY, STEPFUN_API_KEY |
| kimi-code | kimi_coding, kimi_for_coding | No | KIMI_CODE_API_KEY, MOONSHOT_API_KEY |
| synthetic | — | No | SYNTHETIC_API_KEY |
| opencode | opencode-zen | No | OPENCODE_API_KEY |
| zai | z.ai | No | ZAI_API_KEY |
| glm | zhipu | No | GLM_API_KEY |
| minimax | minimax-intl, minimax-io, minimax-global, minimax-cn, minimaxi, minimax-oauth, minimax-oauth-cn, minimax-portal, minimax-portal-cn | No | MINIMAX_OAUTH_TOKEN, MINIMAX_API_KEY |
| bedrock | aws-bedrock | No | AWS_ACCESS_KEY_ID + AWS_SECRET_ACCESS_KEY (optional: AWS_REGION) |
| qianfan | baidu | No | QIANFAN_API_KEY |
| doubao | volcengine, ark, doubao-cn | No | ARK_API_KEY, DOUBAO_API_KEY |
| siliconflow | silicon-cloud, siliconcloud | No | SILICONFLOW_API_KEY |
| hunyuan | tencent | No | HUNYUAN_API_KEY |
| qwen | dashscope, qwen-intl, dashscope-intl, qwen-us, dashscope-us, qwen-code, qwen-oauth, qwen_oauth | No | QWEN_OAUTH_TOKEN, DASHSCOPE_API_KEY |
| groq | — | No | GROQ_API_KEY |
| mistral | — | No | MISTRAL_API_KEY |
| xai | grok | No | XAI_API_KEY |
| deepseek | — | No | DEEPSEEK_API_KEY |
| together | together-ai | No | TOGETHER_API_KEY |
| fireworks | fireworks-ai | No | FIREWORKS_API_KEY |
| novita | — | No | NOVITA_API_KEY |
| perplexity | — | No | PERPLEXITY_API_KEY |
| cohere | — | No | COHERE_API_KEY |
| copilot | github-copilot | No | (use config/API_KEY fallback with GitHub token) |
| cursor | — | Yes | (none; Cursor manages its own credentials) |
| lmstudio | lm-studio | Yes | (optional; local by default) |
| llamacpp | llama.cpp | Yes | LLAMACPP_API_KEY (optional; only if server auth is enabled) |
| sglang | — | Yes | SGLANG_API_KEY (optional) |
| vllm | — | Yes | VLLM_API_KEY (optional) |
| osaurus | — | Yes | OSAURUS_API_KEY (optional; defaults to "osaurus") |
| nvidia | nvidia-nim, build.nvidia.com | No | NVIDIA_API_KEY |

Cursor (Headless CLI) Notes

  • Provider ID: cursor
  • Invocation: cursor --headless [--model <model>] - (prompt is sent via stdin)
  • The cursor binary must be in PATH, or override its location with CURSOR_PATH env var.
  • Authentication is managed by Cursor itself (its own credential store); no API key is required.
  • The model argument is forwarded to cursor as-is; use "default" (or leave model empty) to let Cursor select the model.
  • This provider spawns a subprocess per request and is best suited for batch/script usage rather than high-throughput inference.
  • Limitations: Only the system prompt (if any) and the last user message are forwarded per request. Full multi-turn conversation history is not preserved because the headless CLI accepts a single prompt per invocation. Temperature control is not supported; non-default values return an explicit error.
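A minimal invocation sketch, assuming the `cursor` binary is installed and authenticated (the trailing `-` and stdin piping follow the invocation shape described above; the prompt text is a placeholder):

```shell
# Pipe a single prompt to headless Cursor; the model flag is optional.
printf 'Summarize this repository\n' | cursor --headless --model default -
```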

Vercel AI Gateway Notes

  • Provider ID: vercel (alias: vercel-ai)
  • Base API URL: https://ai-gateway.vercel.sh/v1
  • Authentication: VERCEL_API_KEY
  • Vercel AI Gateway usage does not require a project deployment.
  • If you see DEPLOYMENT_NOT_FOUND, verify the provider is targeting the gateway endpoint above instead of https://api.vercel.ai.
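A quick manual check against the gateway, assuming `curl` and an OpenAI-compatible model-listing route at `/v1/models` (an assumption based on the gateway's OpenAI-compatible base URL above):

```shell
# Should return a JSON model list; a DEPLOYMENT_NOT_FOUND error here
# usually means the wrong base URL (https://api.vercel.ai) is in use.
curl -s https://ai-gateway.vercel.sh/v1/models \
  -H "Authorization: Bearer $VERCEL_API_KEY"
```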

Gemini Notes

  • Provider ID: gemini (aliases: google, google-gemini)
  • Auth can come from GEMINI_API_KEY, GOOGLE_API_KEY, or Gemini CLI OAuth cache (~/.gemini/oauth_creds.json)
  • API key requests use generativelanguage.googleapis.com/v1beta
  • Gemini CLI OAuth requests use cloudcode-pa.googleapis.com/v1internal with Code Assist request envelope semantics
  • Thinking models (e.g. gemini-3-pro-preview) are supported β€” internal reasoning parts are automatically filtered from the response

Qwen (Alibaba Cloud) Notes

  • Provider IDs: qwen, qwen-code (OAuth), qwen-oauth, dashscope, qwen-intl, qwen-us
  • OAuth Free Tier: Use qwen-code or set api_key = "qwen-oauth" in config
    • Endpoint: portal.qwen.ai/v1
    • Credentials: ~/.qwen/oauth_creds.json (use qwen login to authenticate)
    • Daily quota: 1000 requests
    • Available model: qwen3-coder-plus (verified 2026-02-24)
    • Context window: ~32K tokens
  • API Key Access: Use qwen or dashscope provider with DASHSCOPE_API_KEY
    • Endpoint: dashscope.aliyuncs.com/compatible-mode/v1
    • Higher quotas and more models available with paid API key
  • Authentication: QWEN_OAUTH_TOKEN (for OAuth) or DASHSCOPE_API_KEY (for API key)
  • Recommended Model: qwen3-coder-plus - Optimized for coding tasks
  • Quota Tracking: zeroclaw providers-quota --provider qwen-code shows static quota info (?/1000 - unknown remaining, 1000/day total)
    • Qwen OAuth API does not return rate limit headers
    • Actual request counting requires local counter (not implemented)
    • Rate limit errors are detected and parsed for retry backoff
  • Limitations:
    • OAuth free tier limited to 1 model and 1000 requests/day
    • See test report: docs/qwen-provider-test-report.md

Volcengine ARK (Doubao) Notes

  • Runtime provider ID: doubao (aliases: volcengine, ark, doubao-cn)
  • Onboarding display/canonical name: volcengine
  • Base API URL: https://ark.cn-beijing.volces.com/api/v3
  • Chat endpoint: /chat/completions
  • Model discovery endpoint: /models
  • Authentication: ARK_API_KEY (fallback: DOUBAO_API_KEY)
  • Default model preset: doubao-1-5-pro-32k-250115

Minimal setup example:

export ARK_API_KEY="your-ark-api-key"
zeroclaw onboard --provider volcengine --api-key "$ARK_API_KEY" --model doubao-1-5-pro-32k-250115 --force

Quick validation:

zeroclaw models refresh --provider volcengine
zeroclaw agent --provider volcengine --model doubao-1-5-pro-32k-250115 -m "ping"

StepFun Notes

Minimal setup example:

export STEP_API_KEY="your-stepfun-api-key"
zeroclaw onboard --provider stepfun --api-key "$STEP_API_KEY" --model step-3.5-flash --force

Quick validation:

zeroclaw models refresh --provider stepfun
zeroclaw agent --provider stepfun --model step-3.5-flash -m "ping"

SiliconFlow Notes

  • Provider ID: siliconflow (aliases: silicon-cloud, siliconcloud)
  • Base API URL: https://api.siliconflow.cn/v1
  • Chat endpoint: /chat/completions
  • Model discovery endpoint: /models
  • Authentication: SILICONFLOW_API_KEY
  • Default model preset: Pro/zai-org/GLM-4.7

Minimal setup example:

export SILICONFLOW_API_KEY="your-siliconflow-api-key"
zeroclaw onboard --provider siliconflow --api-key "$SILICONFLOW_API_KEY" --model Pro/zai-org/GLM-4.7 --force

Quick validation:

zeroclaw models refresh --provider siliconflow
zeroclaw agent --provider siliconflow --model Pro/zai-org/GLM-4.7 -m "ping"

Ollama Vision Notes

  • Provider ID: ollama
  • Vision input is supported through user message image markers: [IMAGE:<source>].
  • After multimodal normalization, ZeroClaw sends image payloads through Ollama's native messages[].images field.
  • If a non-vision provider is selected, ZeroClaw returns a structured capability error instead of silently ignoring images.

Ollama Cloud Routing Notes

  • Use :cloud model suffix only with a remote Ollama endpoint.
  • Remote endpoint should be set in api_url (example: https://ollama.com).
  • ZeroClaw normalizes a trailing /api in api_url automatically.
  • If default_model ends with :cloud while api_url is local or unset, config validation fails early with an actionable error.
  • Local Ollama model discovery intentionally excludes :cloud entries to avoid selecting cloud-only models in local mode.
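An illustrative config.toml fragment for cloud routing; the model name is a placeholder, so substitute a `:cloud` model actually available on your account:

```toml
default_provider = "ollama"
default_model = "some-model:cloud"   # placeholder; must be a real :cloud model
api_url = "https://ollama.com"       # a trailing /api is normalized away
```

With a local or unset `api_url`, this same `:cloud` model name would fail config validation early, as described above.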

Hunyuan Notes

  • Provider ID: hunyuan (alias: tencent)
  • Base API URL: https://api.hunyuan.cloud.tencent.com/v1
  • Authentication: HUNYUAN_API_KEY (obtain from Tencent Cloud console)
  • Recommended models: hunyuan-t1-latest (deep reasoning), hunyuan-turbo-latest (fast), hunyuan-pro (high quality)
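A minimal setup example in the same style as the doubao/stepfun/siliconflow sections; the `onboard` flags are assumed to match those examples:

```shell
export HUNYUAN_API_KEY="your-hunyuan-api-key"
zeroclaw onboard --provider hunyuan --api-key "$HUNYUAN_API_KEY" --model hunyuan-turbo-latest --force
zeroclaw models refresh --provider hunyuan
```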

llama.cpp Server Notes

  • Provider ID: llamacpp (alias: llama.cpp)
  • Default endpoint: http://localhost:8080/v1
  • API key is optional by default; set LLAMACPP_API_KEY only when llama-server is started with --api-key.
  • Model discovery: zeroclaw models refresh --provider llamacpp
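A sketch of the authenticated setup; the model path is a placeholder, and `--api-key` is the llama-server flag referenced above:

```shell
# Start llama-server with auth enabled, then give ZeroClaw the same key.
llama-server -m ./your-model.gguf --port 8080 --api-key "my-local-key"
export LLAMACPP_API_KEY="my-local-key"
zeroclaw models refresh --provider llamacpp
```

Without `--api-key`, skip the export entirely; the key is optional by default.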

SGLang Server Notes

  • Provider ID: sglang
  • Default endpoint: http://localhost:30000/v1
  • API key is optional by default; set SGLANG_API_KEY only when the server requires authentication.
  • Tool calling requires launching SGLang with --tool-call-parser (e.g. hermes, llama3, qwen25).
  • Model discovery: zeroclaw models refresh --provider sglang
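A launch sketch with tool calling enabled; the model path is a placeholder, and the parser value should match your model family (one of the parsers listed above):

```shell
# Launch SGLang with a tool-call parser so tool calling works end to end.
python -m sglang.launch_server --model-path Qwen/Qwen2.5-7B-Instruct \
  --port 30000 --tool-call-parser qwen25
zeroclaw models refresh --provider sglang
```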

vLLM Server Notes

  • Provider ID: vllm
  • Default endpoint: http://localhost:8000/v1
  • API key is optional by default; set VLLM_API_KEY only when the server requires authentication.
  • Model discovery: zeroclaw models refresh --provider vllm
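A serving sketch using vLLM's OpenAI-compatible server; the model ID is a placeholder, and `--api-key` is only needed when you want authenticated access:

```shell
# Serve a model; the key here must then match VLLM_API_KEY on the client side.
vllm serve Qwen/Qwen2.5-7B-Instruct --port 8000 --api-key "my-vllm-key"
export VLLM_API_KEY="my-vllm-key"
zeroclaw models refresh --provider vllm
```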

Osaurus Server Notes

  • Provider ID: osaurus
  • Default endpoint: http://localhost:1337/v1
  • API key defaults to "osaurus" but is optional; set OSAURUS_API_KEY to override or leave unset for keyless access.
  • Model discovery: zeroclaw models refresh --provider osaurus
  • Osaurus is a unified AI edge runtime for macOS (Apple Silicon) that combines local MLX inference with cloud provider proxying through a single endpoint.
  • Supports multiple API formats simultaneously: OpenAI-compatible (/v1/chat/completions), Anthropic (/messages), Ollama (/chat), and Open Responses (/v1/responses).
  • Built-in MCP (Model Context Protocol) support for tool and context server connectivity.
  • Local models run via MLX (Llama, Qwen, Gemma, GLM, Phi, Nemotron, and others); cloud models are proxied transparently.

Bedrock Notes

  • Provider ID: bedrock (alias: aws-bedrock)
  • API: Converse API
  • Authentication: AWS AKSK (not a single API key). Set AWS_ACCESS_KEY_ID + AWS_SECRET_ACCESS_KEY environment variables.
  • Optional: AWS_SESSION_TOKEN for temporary/STS credentials, AWS_REGION or AWS_DEFAULT_REGION (default: us-east-1).
  • Default onboarding model: anthropic.claude-sonnet-4-5-20250929-v1:0
  • Supports native tool calling and prompt caching (cachePoint).
  • Cross-region inference profiles supported (e.g., us.anthropic.claude-*).
  • Model IDs use Bedrock format: anthropic.claude-sonnet-4-6, anthropic.claude-opus-4-6-v1, etc.
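A minimal setup example in the style of the other provider sections; note there is no single API key here, so only the AWS credential env vars are exported (the `onboard` flags are assumed to match the earlier examples):

```shell
export AWS_ACCESS_KEY_ID="your-access-key-id"
export AWS_SECRET_ACCESS_KEY="your-secret-access-key"
export AWS_REGION="us-east-1"   # optional; us-east-1 is the default anyway
zeroclaw onboard --provider bedrock --model anthropic.claude-sonnet-4-5-20250929-v1:0 --force
```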

Ollama Reasoning Toggle

You can control Ollama reasoning/thinking behavior from config.toml:

[runtime]
reasoning_enabled = false

Behavior:

  • false: sends think: false to Ollama /api/chat requests.
  • true: sends think: true.
  • Unset: omits think and keeps Ollama/model defaults.

Ollama Vision Override

Some Ollama models support vision (e.g. llava, llama3.2-vision) while others do not. Since ZeroClaw cannot auto-detect this, you can override it in config.toml:

default_provider = "ollama"
default_model = "llava"
model_support_vision = true

Behavior:

  • true: enables image attachment processing in the agent loop.
  • false: disables vision even if the provider reports support.
  • Unset: uses the provider's built-in default.

Environment override: ZEROCLAW_MODEL_SUPPORT_VISION=true

OpenAI Codex Reasoning Level

You can control OpenAI Codex reasoning effort from config.toml:

[provider]
reasoning_level = "high"

Behavior:

  • Supported values: minimal, low, medium, high, xhigh (case-insensitive).
  • When set, overrides ZEROCLAW_CODEX_REASONING_EFFORT.
  • Unset falls back to ZEROCLAW_CODEX_REASONING_EFFORT if present, otherwise defaults to xhigh.
  • Legacy compatibility: runtime.reasoning_level is accepted but deprecated; prefer provider.reasoning_level.
  • If both provider.reasoning_level and runtime.reasoning_level are set, provider-level value wins.
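The precedence rules above can be modeled in a few lines of shell. This is an illustrative sketch, not ZeroClaw's code; `CONFIG_REASONING_LEVEL` is a hypothetical stand-in for `provider.reasoning_level`:

```shell
# provider.reasoning_level > ZEROCLAW_CODEX_REASONING_EFFORT > "xhigh"
CONFIG_REASONING_LEVEL=""               # unset in config for this example
ZEROCLAW_CODEX_REASONING_EFFORT="HIGH"  # env var fallback

effort="${CONFIG_REASONING_LEVEL:-${ZEROCLAW_CODEX_REASONING_EFFORT:-xhigh}}"
# Lowercase to mirror the documented case-insensitive matching.
effort="$(printf '%s' "$effort" | tr '[:upper:]' '[:lower:]')"
echo "$effort"
```

Here the config value is empty, so the env var wins and is normalized to `high`; with both unset, the chain would bottom out at `xhigh`.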

Kimi Code Notes

  • Provider ID: kimi-code
  • Endpoint: https://api.kimi.com/coding/v1
  • Default onboarding model: kimi-for-coding (alternative: kimi-k2.5)
  • Runtime auto-adds User-Agent: KimiCLI/0.77 for compatibility.

NVIDIA NIM Notes

  • Canonical provider ID: nvidia
  • Aliases: nvidia-nim, build.nvidia.com
  • Base API URL: https://integrate.api.nvidia.com/v1
  • Model discovery: zeroclaw models refresh --provider nvidia

Recommended starter model IDs (verified against NVIDIA API catalog on February 18, 2026):

  • meta/llama-3.3-70b-instruct
  • deepseek-ai/deepseek-v3.2
  • nvidia/llama-3.3-nemotron-super-49b-v1.5
  • nvidia/llama-3.1-nemotron-ultra-253b-v1

Custom Endpoints

  • OpenAI-compatible endpoint:
default_provider = "custom:https://your-api.example.com"
  • Anthropic-compatible endpoint:
default_provider = "anthropic-custom:https://your-api.example.com"

MiniMax OAuth Setup (config.toml)

Set the MiniMax provider and OAuth placeholder in config:

default_provider = "minimax-oauth"
api_key = "minimax-oauth"

Then provide one of the following credentials via environment variables:

  • MINIMAX_OAUTH_TOKEN (preferred, direct access token)
  • MINIMAX_API_KEY (legacy/static token)
  • MINIMAX_OAUTH_REFRESH_TOKEN (auto-refreshes access token at startup)

Optional:

  • MINIMAX_OAUTH_REGION=global or MINIMAX_OAUTH_REGION=cn (the default depends on the provider alias)
  • MINIMAX_OAUTH_CLIENT_ID to override the default OAuth client id

Channel compatibility note:

  • For MiniMax-backed channel conversations, runtime history is normalized to keep valid user/assistant turn order.
  • Channel-specific delivery guidance (for example Telegram attachment markers) is merged into the leading system prompt instead of being appended as a trailing system turn.

Qwen Code OAuth Setup (config.toml)

Set Qwen Code OAuth mode in config:

default_provider = "qwen-code"
api_key = "qwen-oauth"

Credential resolution for qwen-code:

  1. Explicit api_key value (if not the placeholder qwen-oauth)
  2. QWEN_OAUTH_TOKEN
  3. ~/.qwen/oauth_creds.json (reuses Qwen Code cached OAuth credentials)
  4. Optional refresh via QWEN_OAUTH_REFRESH_TOKEN (or cached refresh token)
  5. If the qwen-oauth placeholder is not used, DASHSCOPE_API_KEY can still serve as a fallback

Optional endpoint override:

  • QWEN_OAUTH_RESOURCE_URL (normalized to https://.../v1 if needed)
  • If unset, resource_url from cached OAuth credentials is used when available
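The "normalized to https://.../v1" behavior can be approximated in shell. This `normalize` function is illustrative only, assuming the normalization adds a missing scheme and ensures a single trailing `/v1`:

```shell
# Hypothetical sketch of endpoint normalization, not ZeroClaw's code.
normalize() {
  url="$1"
  case "$url" in
    http://*|https://*) ;;       # scheme already present
    *) url="https://$url" ;;     # assume https when the scheme is missing
  esac
  url="${url%/}"                 # drop any trailing slash
  case "$url" in
    */v1) ;;                     # already ends in /v1
    *) url="$url/v1" ;;
  esac
  printf '%s\n' "$url"
}

normalize "portal.qwen.ai"
```

A bare host like `portal.qwen.ai` comes out as `https://portal.qwen.ai/v1`, while an already-normalized URL passes through unchanged.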

Model Routing (hint:<name>)

You can route model calls by hint using [[model_routes]]:

[[model_routes]]
hint = "reasoning"
provider = "openrouter"
model = "anthropic/claude-opus-4-20250514"
max_tokens = 8192

[[model_routes]]
hint = "fast"
provider = "groq"
model = "llama-3.3-70b-versatile"

Then call with a hint model name (for example from tool or integration paths):

hint:reasoning

Embedding Routing (hint:<name>)

You can route embedding calls with the same hint pattern using [[embedding_routes]]. Set [memory].embedding_model to a hint:<name> value to activate routing.

[memory]
embedding_model = "hint:semantic"

[[embedding_routes]]
hint = "semantic"
provider = "openai"
model = "text-embedding-3-small"
dimensions = 1536

[[embedding_routes]]
hint = "archive"
provider = "custom:https://embed.example.com/v1"
model = "your-embedding-model-id"
dimensions = 1024

Supported embedding providers:

  • none
  • openai
  • custom:<url> (OpenAI-compatible embeddings endpoint)

Optional per-route key override:

[[embedding_routes]]
hint = "semantic"
provider = "openai"
model = "text-embedding-3-small"
api_key = "sk-route-specific"

Upgrading Models Safely

Use stable hints and update only route targets when providers deprecate model IDs.

Recommended workflow:

  1. Keep call sites stable (hint:reasoning, hint:semantic).
  2. Change only the target model under [[model_routes]] or [[embedding_routes]].
  3. Run:
    • zeroclaw doctor
    • zeroclaw status
  4. Smoke test one representative flow (chat + memory retrieval) before rollout.

This minimizes breakage because integrations and prompts do not need to change when model IDs are upgraded.