Supported Providers

any-llm-go supports multiple LLM providers through a unified interface. Each provider is implemented as a separate package.

Provider Status

Provider ID
Anthropic anthropic
DeepSeek deepseek
Gemini gemini
Groq groq
llama.cpp llamacpp
Llamafile llamafile
Mistral mistral
Ollama ollama
OpenAI openai
z.ai zai

Legend

Each provider's support for the following capabilities is documented in its details section below:

  • Completion - Basic chat completion support
  • Streaming - Real-time streaming responses
  • Tools - Function calling / tool use
  • Reasoning - Extended thinking (e.g., Claude's thinking, OpenAI o1 reasoning)
  • Embeddings - Text embedding generation
  • List Models - API to list available models

Provider Details

Anthropic

import (
    anyllm "github.com/mozilla-ai/any-llm-go"
    "github.com/mozilla-ai/any-llm-go/providers/anthropic"
)

// Using environment variable (ANTHROPIC_API_KEY).
provider, err := anthropic.New()

// Or with explicit API key.
provider, err := anthropic.New(anyllm.WithAPIKey("sk-ant-..."))

Environment Variable: ANTHROPIC_API_KEY

Popular Models:

  • claude-sonnet-4-20250514 - Latest Sonnet model
  • claude-3-5-sonnet-latest - Previous Sonnet
  • claude-3-5-haiku-latest - Fast and cost-effective
  • claude-3-opus-latest - Most capable (legacy)

Extended Thinking:

Anthropic's Claude models support extended thinking for complex reasoning tasks:

response, err := provider.Completion(ctx, anyllm.CompletionParams{
    Model: "claude-sonnet-4-20250514",
    Messages: messages,
    ReasoningEffort: anyllm.ReasoningEffortMedium, // low, medium, or high
})

// Access the thinking content.
if response.Choices[0].Message.Reasoning != nil {
    fmt.Println("Thinking:", response.Choices[0].Message.Reasoning.Content)
}
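
Streaming:

Anthropic streaming is expected to follow the same CompletionStream pattern documented for Llamafile below; a minimal sketch:

provider, _ := anthropic.New()
chunks, errs := provider.CompletionStream(ctx, anyllm.CompletionParams{
    Model: "claude-3-5-haiku-latest",
    Messages: messages,
})

for chunk := range chunks {
    fmt.Print(chunk.Choices[0].Delta.Content)
}
if err := <-errs; err != nil {
    log.Fatal(err)
}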

DeepSeek

import (
    anyllm "github.com/mozilla-ai/any-llm-go"
    "github.com/mozilla-ai/any-llm-go/providers/deepseek"
)

// Using environment variable (DEEPSEEK_API_KEY).
provider, err := deepseek.New()

// Or with explicit API key.
provider, err := deepseek.New(anyllm.WithAPIKey("sk-..."))

Environment Variable: DEEPSEEK_API_KEY

Popular Models:

  • deepseek-chat - General-purpose chat model
  • deepseek-reasoner - Reasoning model (DeepSeek R1)

Reasoning/Thinking:

DeepSeek R1 supports extended thinking for complex reasoning tasks:

response, err := provider.Completion(ctx, anyllm.CompletionParams{
    Model: "deepseek-reasoner",
    Messages: messages,
    ReasoningEffort: anyllm.ReasoningEffortMedium,
})

if response.Choices[0].Message.Reasoning != nil {
    fmt.Println("Thinking:", response.Choices[0].Message.Reasoning.Content)
}

JSON Schema:

DeepSeek doesn't support json_schema response format directly. The provider automatically handles this by injecting the schema into the user message and using json_object mode instead.
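
For illustration, a sketch of requesting structured output; the ResponseFormat field and its shape are assumptions modeled on OpenAI's response_format parameter, not confirmed by this page:

// Hypothetical: assumes CompletionParams exposes a ResponseFormat field
// shaped like OpenAI's response_format parameter.
resp, err := provider.Completion(ctx, anyllm.CompletionParams{
    Model: "deepseek-chat",
    Messages: messages,
    ResponseFormat: &anyllm.ResponseFormat{
        Type: "json_schema",
        JSONSchema: schema, // your JSON schema definition
    },
})
// The provider rewrites this into json_object mode and injects the schema
// into the user message, as described above.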

Gemini

import (
    anyllm "github.com/mozilla-ai/any-llm-go"
    "github.com/mozilla-ai/any-llm-go/providers/gemini"
)

// Using environment variable (GEMINI_API_KEY or GOOGLE_API_KEY).
provider, err := gemini.New()

// Or with explicit API key.
provider, err := gemini.New(anyllm.WithAPIKey("your-key"))

Environment Variables: GEMINI_API_KEY or GOOGLE_API_KEY

Popular Models:

  • gemini-2.5-flash - Fast and cost-effective
  • gemini-2.5-pro - Most capable model
  • gemini-3-flash-preview - Reasoning-capable model

Embedding Models:

  • gemini-embedding-001 - Text embeddings
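
Embeddings:

Assuming the Gemini provider implements the same Embedding API shown for Mistral and Ollama below:

provider, _ := gemini.New()
resp, err := provider.Embedding(ctx, anyllm.EmbeddingParams{
    Model: "gemini-embedding-001",
    Input: "Hello, world!",
})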

Reasoning/Thinking:

Gemini models support extended thinking for complex reasoning tasks:

response, err := provider.Completion(ctx, anyllm.CompletionParams{
    Model: "gemini-3-flash-preview",
    Messages: messages,
    ReasoningEffort: anyllm.ReasoningEffortMedium, // low, medium, or high
})

// Access the thinking content.
if response.Choices[0].Message.Reasoning != nil {
    fmt.Println("Thinking:", response.Choices[0].Message.Reasoning.Content)
}

Groq

Groq provides fast inference through an OpenAI-compatible cloud API.

import (
    anyllm "github.com/mozilla-ai/any-llm-go"
    "github.com/mozilla-ai/any-llm-go/providers/groq"
)

// Using environment variable (GROQ_API_KEY).
provider, err := groq.New()

// Or with explicit API key.
provider, err := groq.New(anyllm.WithAPIKey("gsk_..."))

Environment Variable: GROQ_API_KEY

Popular Models:

  • llama-3.1-8b-instant - Fast and cost-effective
  • llama-3.3-70b-versatile - More capable model
  • mixtral-8x7b-32768 - Mixtral with 32k context

Completion:

provider, _ := groq.New()
resp, err := provider.Completion(ctx, anyllm.CompletionParams{
    Model: "llama-3.1-8b-instant",
    Messages: []anyllm.Message{
        {Role: anyllm.RoleUser, Content: "Hello!"},
    },
})
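
Streaming:

Groq streaming is expected to follow the same CompletionStream pattern documented for Llamafile below; a minimal sketch:

provider, _ := groq.New()
chunks, errs := provider.CompletionStream(ctx, anyllm.CompletionParams{
    Model: "llama-3.1-8b-instant",
    Messages: messages,
})

for chunk := range chunks {
    fmt.Print(chunk.Choices[0].Delta.Content)
}
if err := <-errs; err != nil {
    log.Fatal(err)
}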

Mistral

import (
    anyllm "github.com/mozilla-ai/any-llm-go"
    "github.com/mozilla-ai/any-llm-go/providers/mistral"
)

// Using environment variable (MISTRAL_API_KEY).
provider, err := mistral.New()

// Or with explicit API key.
provider, err := mistral.New(anyllm.WithAPIKey("your-key"))

Environment Variable: MISTRAL_API_KEY

Popular Models:

  • mistral-small-latest - Fast and cost-effective
  • mistral-large-latest - Most capable model
  • mistral-medium-latest - Balanced performance

Reasoning Models:

  • magistral-small-latest - Fast reasoning model
  • magistral-medium-latest - More capable reasoning model
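
Reasoning/Thinking:

Assuming the magistral models accept the same ReasoningEffort parameter shown for the other reasoning-capable providers, a minimal sketch:

provider, _ := mistral.New()
response, err := provider.Completion(ctx, anyllm.CompletionParams{
    Model: "magistral-small-latest",
    Messages: messages,
    ReasoningEffort: anyllm.ReasoningEffortMedium,
})

if response.Choices[0].Message.Reasoning != nil {
    fmt.Println("Thinking:", response.Choices[0].Message.Reasoning.Content)
}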

Embedding Models:

  • mistral-embed - Text embeddings

Completion:

provider, _ := mistral.New()
resp, err := provider.Completion(ctx, anyllm.CompletionParams{
    Model: "mistral-small-latest",
    Messages: []anyllm.Message{
        {Role: anyllm.RoleUser, Content: "Hello!"},
    },
})

Embeddings:

provider, _ := mistral.New()
resp, err := provider.Embedding(ctx, anyllm.EmbeddingParams{
    Model: "mistral-embed",
    Input: "Hello, world!",
})

Llamafile

Llamafile is a single-file executable that bundles a model with llama.cpp for easy local deployment. It exposes an OpenAI-compatible API. No API key is required.

import (
    anyllm "github.com/mozilla-ai/any-llm-go"
    "github.com/mozilla-ai/any-llm-go/providers/llamafile"
)

// Using default settings (localhost:8080/v1).
provider, err := llamafile.New()

// Or with custom base URL.
provider, err := llamafile.New(anyllm.WithBaseURL("http://localhost:8081/v1"))

Environment Variable: LLAMAFILE_BASE_URL (optional, defaults to http://localhost:8080/v1)

Running Llamafile:

Download a llamafile from Mozilla-Ocho/llamafile and run it:

# Download a llamafile (example: LLaVA)
curl -LO https://huggingface.co/Mozilla/llava-v1.5-7b-llamafile/resolve/main/llava-v1.5-7b-q4.llamafile
chmod +x llava-v1.5-7b-q4.llamafile
./llava-v1.5-7b-q4.llamafile --server

Completion:

provider, _ := llamafile.New()
resp, err := provider.Completion(ctx, anyllm.CompletionParams{
    Model: "LLaMA_CPP", // Llamafile uses "LLaMA_CPP" as the model name.
    Messages: []anyllm.Message{
        {Role: anyllm.RoleUser, Content: "Hello!"},
    },
})

Streaming:

provider, _ := llamafile.New()
chunks, errs := provider.CompletionStream(ctx, anyllm.CompletionParams{
    Model: "LLaMA_CPP",
    Messages: messages,
})

for chunk := range chunks {
    fmt.Print(chunk.Choices[0].Delta.Content)
}
if err := <-errs; err != nil {
    log.Fatal(err)
}

List Models:

provider, _ := llamafile.New()
models, err := provider.ListModels(ctx)
for _, model := range models.Data {
    fmt.Println(model.ID) // Typically "LLaMA_CPP"
}

Ollama

Ollama is a local LLM server that allows you to run models on your own hardware. No API key is required.

import (
    anyllm "github.com/mozilla-ai/any-llm-go"
    "github.com/mozilla-ai/any-llm-go/providers/ollama"
)

// Using default settings (localhost:11434).
provider, err := ollama.New()

// Or with custom base URL.
provider, err := ollama.New(anyllm.WithBaseURL("http://localhost:11435"))

Environment Variable: OLLAMA_HOST (optional, defaults to http://localhost:11434)

Popular Models:

  • llama3.2 - Meta's Llama 3.2
  • mistral - Mistral 7B
  • codellama - Code-focused Llama
  • deepseek-r1 - DeepSeek reasoning model
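
Completion:

A basic completion, following the same pattern as the other providers (assumes the model has already been pulled with ollama pull llama3.2):

provider, _ := ollama.New()
resp, err := provider.Completion(ctx, anyllm.CompletionParams{
    Model: "llama3.2",
    Messages: []anyllm.Message{
        {Role: anyllm.RoleUser, Content: "Hello!"},
    },
})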

Reasoning/Thinking:

Ollama supports extended thinking for models that support it:

response, err := provider.Completion(ctx, anyllm.CompletionParams{
    Model: "deepseek-r1",
    Messages: messages,
    ReasoningEffort: anyllm.ReasoningEffortMedium,
})

if response.Choices[0].Message.Reasoning != nil {
    fmt.Println("Thinking:", response.Choices[0].Message.Reasoning.Content)
}

Embeddings:

provider, _ := ollama.New()
resp, err := provider.Embedding(ctx, anyllm.EmbeddingParams{
    Model: "nomic-embed-text",
    Input: "Hello, world!",
})

List Models:

provider, _ := ollama.New()
models, err := provider.ListModels(ctx)
for _, model := range models.Data {
    fmt.Println(model.ID)
}

llama.cpp

llama.cpp offers a local server compatible with the OpenAI API. No API key is required by default.

import (
    anyllm "github.com/mozilla-ai/any-llm-go"
    "github.com/mozilla-ai/any-llm-go/providers/llamacpp"
)

// Using default settings (localhost:8080).
provider, err := llamacpp.New()

// Or with custom base URL.
provider, err := llamacpp.New(anyllm.WithBaseURL("http://localhost:9090/v1"))

Popular Models:

  • LLaMA_CPP - Default identifier used by the server.
  • Any GGUF model loaded into the server (the Model parameter is often ignored by llama.cpp if only one model is loaded).
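
Completion:

A basic completion against a local llama.cpp server, using the same pattern as the other providers:

provider, _ := llamacpp.New()
resp, err := provider.Completion(ctx, anyllm.CompletionParams{
    Model: "LLaMA_CPP",
    Messages: []anyllm.Message{
        {Role: anyllm.RoleUser, Content: "Hello!"},
    },
})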

Reasoning/Thinking:

llama.cpp supports reasoning for models that provide it (like DeepSeek-R1 GGUF):

response, err := provider.Completion(ctx, anyllm.CompletionParams{
    Model: "LLaMA_CPP",
    Messages: messages,
    ReasoningEffort: anyllm.ReasoningEffortMedium,
})

if response.Choices[0].Message.Reasoning != nil {
    fmt.Println("Thinking:", response.Choices[0].Message.Reasoning.Content)
}

Embeddings:

provider, _ := llamacpp.New()
resp, err := provider.Embedding(ctx, anyllm.EmbeddingParams{
    Model: "LLaMA_CPP",
    Input: "Hello, world!",
})

List Models:

provider, _ := llamacpp.New()
models, err := provider.ListModels(ctx)
for _, model := range models.Data {
    fmt.Println(model.ID)
}

OpenAI

import (
    anyllm "github.com/mozilla-ai/any-llm-go"
    "github.com/mozilla-ai/any-llm-go/providers/openai"
)

// Using environment variable (OPENAI_API_KEY).
provider, err := openai.New()

// Or with explicit API key.
provider, err := openai.New(anyllm.WithAPIKey("sk-..."))

// Or with custom base URL (for Azure, proxies, etc.).
provider, err := openai.New(
    anyllm.WithAPIKey("your-key"),
    anyllm.WithBaseURL("https://your-endpoint.openai.azure.com"),
)

Environment Variable: OPENAI_API_KEY

Popular Models:

  • gpt-4o - Most capable model
  • gpt-4o-mini - Fast and cost-effective
  • gpt-4-turbo - Previous generation flagship
  • o1-preview - Reasoning model
  • o1-mini - Smaller reasoning model

Embedding Models:

  • text-embedding-3-small - Cost-effective embeddings
  • text-embedding-3-large - Higher quality embeddings
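
Embeddings:

Assuming the OpenAI provider implements the same Embedding API shown for Mistral and Ollama:

provider, _ := openai.New()
resp, err := provider.Embedding(ctx, anyllm.EmbeddingParams{
    Model: "text-embedding-3-small",
    Input: "Hello, world!",
})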

z.ai

z.ai provides access to the GLM model family through an OpenAI-compatible API.

import (
    anyllm "github.com/mozilla-ai/any-llm-go"
    "github.com/mozilla-ai/any-llm-go/providers/zai"
)

// Using environment variable (ZAI_API_KEY).
provider, err := zai.New()

// Or with explicit API key.
provider, err := zai.New(anyllm.WithAPIKey("your-key"))

Environment Variable: ZAI_API_KEY

Popular Models:

  • glm-4.5-air - Fast and cost-effective
  • glm-4.5 - Capable general model
  • glm-4.6 - Vision-capable model
  • glm-4.7 - Advanced model
  • glm-5 - Most capable model

Completion:

provider, _ := zai.New()
resp, err := provider.Completion(ctx, anyllm.CompletionParams{
    Model: "glm-4.6",
    Messages: []anyllm.Message{
        {Role: anyllm.RoleUser, Content: "Hello!"},
    },
})

Coming Soon

The following providers are planned for future releases:

Provider Status
Cohere Planned
Together AI Planned
AWS Bedrock Planned
Azure OpenAI Planned (use OpenAI with custom base URL for now)

Adding a New Provider

Want to add support for a new provider? See our Contributing Guide for instructions on implementing a new provider.

The basic requirements are:

  1. Implement the Provider interface (see the sketch after this list)
  2. Use the official provider SDK when available
  3. Normalize responses to OpenAI format
  4. Add comprehensive tests
  5. Document the provider in this file
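
Based on the usage examples throughout this page, the Provider interface presumably looks roughly like the sketch below; the exact method signatures and return type names are assumptions:

type Provider interface {
    // Completion performs a single chat completion request.
    Completion(ctx context.Context, params CompletionParams) (*ChatCompletion, error)

    // CompletionStream streams completion chunks; errors arrive on the
    // second channel after the chunk channel closes.
    CompletionStream(ctx context.Context, params CompletionParams) (<-chan CompletionChunk, <-chan error)

    // Embedding generates text embeddings.
    Embedding(ctx context.Context, params EmbeddingParams) (*EmbeddingResponse, error)

    // ListModels returns the models available from the provider.
    ListModels(ctx context.Context) (*ModelList, error)
}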

Provider-Specific Notes

Response Format

All providers normalize their responses to OpenAI's format:

type ChatCompletion struct {
    ID      string   `json:"id"`
    Object  string   `json:"object"`
    Created int64    `json:"created"`
    Model   string   `json:"model"`
    Choices []Choice `json:"choices"`
    Usage   *Usage   `json:"usage,omitempty"`
}

This means you can write provider-agnostic code that works with any supported provider.
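
For example, a function that works with any provider (assuming the package exposes a Provider interface, as the examples above suggest):

// summarize runs a completion against whichever provider is passed in.
func summarize(ctx context.Context, p anyllm.Provider, model, text string) (string, error) {
    resp, err := p.Completion(ctx, anyllm.CompletionParams{
        Model: model,
        Messages: []anyllm.Message{
            {Role: anyllm.RoleUser, Content: "Summarize: " + text},
        },
    })
    if err != nil {
        return "", err
    }
    return resp.Choices[0].Message.Content, nil
}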

Error Handling

Provider-specific errors are normalized to common error types:

Error Type Description
ErrRateLimit Rate limit exceeded
ErrAuthentication Invalid or missing API key
ErrInvalidRequest Malformed request
ErrContextLength Input exceeds model's context window
ErrContentFilter Content blocked by safety filters
ErrModelNotFound Requested model doesn't exist
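
Assuming these are sentinel error values exported by the anyllm package, they can be matched with errors.Is; a minimal sketch:

resp, err := provider.Completion(ctx, params)
if err != nil {
    switch {
    case errors.Is(err, anyllm.ErrRateLimit):
        // Back off and retry.
    case errors.Is(err, anyllm.ErrAuthentication):
        // Verify the API key.
    default:
        log.Fatal(err)
    }
}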

See Error Handling for more details.