any-llm-go supports multiple LLM providers through a unified interface. Each provider is implemented as a separate package.
| Provider | ID | Completion | Streaming | Tools | Reasoning | Embeddings | List Models |
|---|---|---|---|---|---|---|---|
| Anthropic | anthropic |
✅ | ✅ | ✅ | ✅ | ❌ | ❌ |
| DeepSeek | deepseek |
✅ | ✅ | ✅ | ✅ | ❌ | ✅ |
| Gemini | gemini |
✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Groq | groq |
✅ | ✅ | ✅ | ❌ | ❌ | ✅ |
| llama.cpp | llamacpp |
✅ | ✅ | ✅ | ❌ | ✅ | ✅ |
| Llamafile | llamafile |
✅ | ✅ | ✅ | ❌ | ✅ | ✅ |
| Mistral | mistral |
✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Ollama | ollama |
✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| OpenAI | openai |
✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| z.ai | zai |
✅ | ✅ | ✅ | ✅ | ❌ | ✅ |
- Completion - Basic chat completion support
- Streaming - Real-time streaming responses
- Tools - Function calling / tool use
- Reasoning - Extended thinking (e.g., Claude's thinking, OpenAI o1 reasoning)
- Embeddings - Text embedding generation
- List Models - API to list available models
import (
anyllm "github.com/mozilla-ai/any-llm-go"
"github.com/mozilla-ai/any-llm-go/providers/anthropic"
)
// Using environment variable (ANTHROPIC_API_KEY).
provider, err := anthropic.New()
// Or with explicit API key.
provider, err := anthropic.New(anyllm.WithAPIKey("sk-ant-..."))Environment Variable: ANTHROPIC_API_KEY
Popular Models:
claude-sonnet-4-20250514- Latest Sonnet modelclaude-3-5-sonnet-latest- Previous Sonnetclaude-3-5-haiku-latest- Fast and cost-effectiveclaude-3-opus-latest- Most capable (legacy)
Extended Thinking:
Anthropic's Claude models support extended thinking for complex reasoning tasks:
response, err := provider.Completion(ctx, anyllm.CompletionParams{
Model: "claude-sonnet-4-20250514",
Messages: messages,
ReasoningEffort: anyllm.ReasoningEffortMedium, // low, medium, or high
})
// Access the thinking content.
if response.Choices[0].Message.Reasoning != nil {
fmt.Println("Thinking:", response.Choices[0].Message.Reasoning.Content)
}import (
anyllm "github.com/mozilla-ai/any-llm-go"
"github.com/mozilla-ai/any-llm-go/providers/deepseek"
)
// Using environment variable (DEEPSEEK_API_KEY).
provider, err := deepseek.New()
// Or with explicit API key.
provider, err := deepseek.New(anyllm.WithAPIKey("sk-..."))Environment Variable: DEEPSEEK_API_KEY
Popular Models:
deepseek-chat- General-purpose chat modeldeepseek-reasoner- Reasoning model (DeepSeek R1)
Reasoning/Thinking:
DeepSeek R1 supports extended thinking for complex reasoning tasks:
response, err := provider.Completion(ctx, anyllm.CompletionParams{
Model: "deepseek-reasoner",
Messages: messages,
ReasoningEffort: anyllm.ReasoningEffortMedium,
})
if response.Choices[0].Message.Reasoning != nil {
fmt.Println("Thinking:", response.Choices[0].Message.Reasoning.Content)
}JSON Schema:
DeepSeek doesn't support json_schema response format directly. The provider automatically handles this by injecting the schema into the user message and using json_object mode instead.
import (
anyllm "github.com/mozilla-ai/any-llm-go"
"github.com/mozilla-ai/any-llm-go/providers/gemini"
)
// Using environment variable (GEMINI_API_KEY or GOOGLE_API_KEY).
provider, err := gemini.New()
// Or with explicit API key.
provider, err := gemini.New(anyllm.WithAPIKey("your-key"))Environment Variables: GEMINI_API_KEY or GOOGLE_API_KEY
Popular Models:
gemini-2.5-flash- Fast and cost-effectivegemini-2.5-pro- Most capable modelgemini-3-flash-preview- Reasoning-capable model
Embedding Models:
gemini-embedding-001- Text embeddings
Reasoning/Thinking:
Gemini models support extended thinking for complex reasoning tasks:
response, err := provider.Completion(ctx, anyllm.CompletionParams{
Model: "gemini-3-flash-preview",
Messages: messages,
ReasoningEffort: anyllm.ReasoningEffortMedium, // low, medium, or high
})
// Access the thinking content.
if response.Choices[0].Message.Reasoning != nil {
fmt.Println("Thinking:", response.Choices[0].Message.Reasoning.Content)
}Groq provides fast inference through their cloud API. It exposes an OpenAI-compatible API.
import (
anyllm "github.com/mozilla-ai/any-llm-go"
"github.com/mozilla-ai/any-llm-go/providers/groq"
)
// Using environment variable (GROQ_API_KEY).
provider, err := groq.New()
// Or with explicit API key.
provider, err := groq.New(anyllm.WithAPIKey("gsk_..."))Environment Variable: GROQ_API_KEY
Popular Models:
llama-3.1-8b-instant- Fast and cost-effectivellama-3.3-70b-versatile- More capable modelmixtral-8x7b-32768- Mixtral with 32k context
Completion:
provider, _ := groq.New()
resp, err := provider.Completion(ctx, anyllm.CompletionParams{
Model: "llama-3.1-8b-instant",
Messages: []anyllm.Message{
{Role: anyllm.RoleUser, Content: "Hello!"},
},
})import (
anyllm "github.com/mozilla-ai/any-llm-go"
"github.com/mozilla-ai/any-llm-go/providers/mistral"
)
// Using environment variable (MISTRAL_API_KEY).
provider, err := mistral.New()
// Or with explicit API key.
provider, err := mistral.New(anyllm.WithAPIKey("your-key"))Environment Variable: MISTRAL_API_KEY
Popular Models:
mistral-small-latest- Fast and cost-effectivemistral-large-latest- Most capable modelmistral-medium-latest- Balanced performance
Reasoning Models:
magistral-small-latest- Fast reasoning modelmagistral-medium-latest- More capable reasoning model
Embedding Models:
mistral-embed- Text embeddings
Completion:
provider, _ := mistral.New()
resp, err := provider.Completion(ctx, anyllm.CompletionParams{
Model: "mistral-small-latest",
Messages: []anyllm.Message{
{Role: anyllm.RoleUser, Content: "Hello!"},
},
})Embeddings:
provider, _ := mistral.New()
resp, err := provider.Embedding(ctx, anyllm.EmbeddingParams{
Model: "mistral-embed",
Input: "Hello, world!",
})Llamafile is a single-file executable that bundles a model with llama.cpp for easy local deployment. It exposes an OpenAI-compatible API. No API key is required.
import (
anyllm "github.com/mozilla-ai/any-llm-go"
"github.com/mozilla-ai/any-llm-go/providers/llamafile"
)
// Using default settings (localhost:8080/v1).
provider, err := llamafile.New()
// Or with custom base URL.
provider, err := llamafile.New(anyllm.WithBaseURL("http://localhost:8081/v1"))Environment Variable: LLAMAFILE_BASE_URL (optional, defaults to http://localhost:8080/v1)
Running Llamafile:
Download a llamafile from Mozilla-Ocho/llamafile and run it:
# Download a llamafile (example: LLaVA)
curl -LO https://huggingface.co/Mozilla/llava-v1.5-7b-llamafile/resolve/main/llava-v1.5-7b-q4.llamafile
chmod +x llava-v1.5-7b-q4.llamafile
./llava-v1.5-7b-q4.llamafile --serverCompletion:
provider, _ := llamafile.New()
resp, err := provider.Completion(ctx, anyllm.CompletionParams{
Model: "LLaMA_CPP", // Llamafile uses "LLaMA_CPP" as the model name.
Messages: []anyllm.Message{
{Role: anyllm.RoleUser, Content: "Hello!"},
},
})Streaming:
provider, _ := llamafile.New()
chunks, errs := provider.CompletionStream(ctx, anyllm.CompletionParams{
Model: "LLaMA_CPP",
Messages: messages,
})
for chunk := range chunks {
fmt.Print(chunk.Choices[0].Delta.Content)
}
if err := <-errs; err != nil {
log.Fatal(err)
}List Models:
provider, _ := llamafile.New()
models, err := provider.ListModels(ctx)
for _, model := range models.Data {
fmt.Println(model.ID) // Typically "LLaMA_CPP"
}Ollama is a local LLM server that allows you to run models on your own hardware. No API key is required.
import (
anyllm "github.com/mozilla-ai/any-llm-go"
"github.com/mozilla-ai/any-llm-go/providers/ollama"
)
// Using default settings (localhost:11434).
provider, err := ollama.New()
// Or with custom base URL.
provider, err := ollama.New(anyllm.WithBaseURL("http://localhost:11435"))Environment Variable: OLLAMA_HOST (optional, defaults to http://localhost:11434)
Popular Models:
llama3.2- Meta's Llama 3.2mistral- Mistral 7Bcodellama- Code-focused Llamadeepseek-r1- DeepSeek reasoning model
Reasoning/Thinking:
Ollama supports extended thinking for models that support it:
response, err := provider.Completion(ctx, anyllm.CompletionParams{
Model: "deepseek-r1",
Messages: messages,
ReasoningEffort: anyllm.ReasoningEffortMedium,
})
if response.Choices[0].Message.Reasoning != nil {
fmt.Println("Thinking:", response.Choices[0].Message.Reasoning.Content)
}Embeddings:
provider, _ := ollama.New()
resp, err := provider.Embedding(ctx, anyllm.EmbeddingParams{
Model: "nomic-embed-text",
Input: "Hello, world!",
})List Models:
provider, _ := ollama.New()
models, err := provider.ListModels(ctx)
for _, model := range models.Data {
fmt.Println(model.ID)
}llama.cpp offers a local server compatible with the OpenAI API. No API key is required by default.
import (
anyllm "github.com/mozilla-ai/any-llm-go"
"github.com/mozilla-ai/any-llm-go/providers/llamacpp"
)
// Using default settings (localhost:8080).
provider, err := llamacpp.New()
// Or with custom base URL.
provider, err := llamacpp.New(anyllm.WithBaseURL("http://localhost:9090/v1"))Popular Models:
LLaMA_CPP- Default identifier used by the server.- Any GGUF model loaded into the server (the
Modelparameter is often ignored by llama.cpp if only one model is loaded).
Reasoning/Thinking:
llama.cpp supports reasoning for models that provide it (like DeepSeek-R1 GGUF):
response, err := provider.Completion(ctx, anyllm.CompletionParams{
Model: "LLaMA_CPP",
Messages: messages,
ReasoningEffort: anyllm.ReasoningEffortMedium,
})
if response.Choices[0].Message.Reasoning != nil {
fmt.Println("Thinking:", response.Choices[0].Message.Reasoning.Content)
}Embeddings:
provider, _ := llamacpp.New()
resp, err := provider.Embedding(ctx, anyllm.EmbeddingParams{
Model: "LLaMA_CPP",
Input: "Hello, world!",
})List Models:
provider, _ := llamacpp.New()
models, err := provider.ListModels(ctx)
for _, model := range models.Data {
fmt.Println(model.ID)
}import (
anyllm "github.com/mozilla-ai/any-llm-go"
"github.com/mozilla-ai/any-llm-go/providers/openai"
)
// Using environment variable (OPENAI_API_KEY).
provider, err := openai.New()
// Or with explicit API key.
provider, err := openai.New(anyllm.WithAPIKey("sk-..."))
// Or with custom base URL (for Azure, proxies, etc.).
provider, err := openai.New(
anyllm.WithAPIKey("your-key"),
anyllm.WithBaseURL("https://your-endpoint.openai.azure.com"),
)Environment Variable: OPENAI_API_KEY
Popular Models:
gpt-4o- Most capable modelgpt-4o-mini- Fast and cost-effectivegpt-4-turbo- Previous generation flagshipo1-preview- Reasoning modelo1-mini- Smaller reasoning model
Embedding Models:
text-embedding-3-small- Cost-effective embeddingstext-embedding-3-large- Higher quality embeddings
z.ai provides access to the GLM model family through an OpenAI-compatible API.
import (
anyllm "github.com/mozilla-ai/any-llm-go"
"github.com/mozilla-ai/any-llm-go/providers/zai"
)
// Using environment variable (ZAI_API_KEY).
provider, err := zai.New()
// Or with explicit API key.
provider, err := zai.New(anyllm.WithAPIKey("your-key"))Environment Variable: ZAI_API_KEY
Popular Models:
glm-4.5-air- Fast and cost-effectiveglm-4.5- Capable general modelglm-4.6- Vision-capable modelglm-4.7- Advanced modelglm-5- Most capable model
Completion:
provider, _ := zai.New()
resp, err := provider.Completion(ctx, anyllm.CompletionParams{
Model: "glm-4.6",
Messages: []anyllm.Message{
{Role: anyllm.RoleUser, Content: "Hello!"},
},
})The following providers are planned for future releases:
| Provider | Status |
|---|---|
| Cohere | Planned |
| Together AI | Planned |
| AWS Bedrock | Planned |
| Azure OpenAI | Planned (use OpenAI with custom base URL for now) |
Want to add support for a new provider? See our Contributing Guide for instructions on implementing a new provider.
The basic requirements are:
- Implement the
Providerinterface - Use the official provider SDK when available
- Normalize responses to OpenAI format
- Add comprehensive tests
- Document the provider in this file
All providers normalize their responses to OpenAI's format:
type ChatCompletion struct {
ID string `json:"id"`
Object string `json:"object"`
Created int64 `json:"created"`
Model string `json:"model"`
Choices []Choice `json:"choices"`
Usage *Usage `json:"usage,omitempty"`
}This means you can write provider-agnostic code that works with any supported provider.
Provider-specific errors are normalized to common error types:
| Error Type | Description |
|---|---|
ErrRateLimit |
Rate limit exceeded |
ErrAuthentication |
Invalid or missing API key |
ErrInvalidRequest |
Malformed request |
ErrContextLength |
Input exceeds model's context window |
ErrContentFilter |
Content blocked by safety filters |
ErrModelNotFound |
Requested model doesn't exist |
See Error Handling for more details.