---
title: "Bifrost AI Gateway"
description: "The fastest way to build AI applications that never go down. A high-performance AI gateway unifying 20+ providers through a single OpenAI-compatible API."
icon: "bridge"
---

Bifrost is a high-performance AI gateway that unifies access to 20+ providers (OpenAI, Anthropic, AWS Bedrock, Google Vertex, Azure, and more) behind a single OpenAI-compatible API. Deploy in seconds with zero configuration and get automatic failover, load balancing, semantic caching, and enterprise-grade governance. In sustained benchmarks at 5,000 requests per second, Bifrost adds only **11 µs** of overhead per request.
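Because the gateway speaks the OpenAI wire format, any HTTP client can talk to it. A minimal sketch using only Python's standard library; the gateway address is a placeholder to adapt to your deployment, and the payload is the standard OpenAI chat completion shape:

```python
import json
import urllib.request

# Placeholder address; point this at your Bifrost deployment.
BIFROST_BASE_URL = "http://localhost:8080/v1"


def build_chat_request(model: str, prompt: str) -> dict:
    """Build a standard OpenAI-style chat completion payload.

    Bifrost accepts the same shape, so only the base URL differs
    from a direct provider call.
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }


def chat(model: str, prompt: str) -> str:
    """POST the payload to the gateway and return the reply text."""
    req = urllib.request.Request(
        f"{BIFROST_BASE_URL}/chat/completions",
        data=json.dumps(build_chat_request(model, prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

Because the request and response shapes match OpenAI's, failover between providers happens inside the gateway without the caller changing anything.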

<Frame>
<img src="/media/architecture.png" alt="Bifrost architecture diagram" width="100%" />
</Frame>

## Get started

<CardGroup cols={2}>
  <Card title="Gateway setup" icon="server" href="/quickstart/gateway/setting-up">
    Deploy the HTTP API gateway with a built-in web UI for visual configuration and real-time monitoring
  </Card>
  <Card title="Go SDK" icon="code" href="/quickstart/go-sdk/setting-up">
    Integrate directly into your Go application for maximum performance and control
  </Card>
</CardGroup>

---

## Open source features

<CardGroup cols={2}>
  <Card title="Drop-in Replacement" icon="shuffle" href="/features/drop-in-replacement">
    Replace existing AI SDK connections by changing just the base URL. Keep your code, gain fallbacks and governance.
  </Card>
  <Card title="Automatic Fallbacks" icon="list-check" href="/features/fallbacks">
    Seamless failover between providers and models. When your primary provider fails, Bifrost switches to backups automatically.
  </Card>
  <Card title="Load Balancing" icon="scale-balanced" href="/features/keys-management">
    Intelligent API key distribution with weighted load balancing, model-specific filtering, and automatic failover.
  </Card>
  <Card title="Virtual Keys" icon="key" href="/features/governance/virtual-keys">
    The primary governance entity. Control access permissions, budgets, rate limits, and routing per consumer.
  </Card>
  <Card title="Routing" icon="arrow-progress" href="/features/governance/routing">
    Direct requests to specific models, providers, and keys. Implement weighted strategies and automatic fallbacks.
  </Card>
  <Card title="Budget & Rate Limits" icon="money-bills" href="/features/governance/budget-and-limits">
    Hierarchical cost control with budgets and rate limits at virtual key, team, and customer levels.
  </Card>
  <Card title="MCP Tool Filtering" icon="grid-2" href="/features/governance/mcp-tools">
    Control which MCP tools are available per virtual key with strict allow-lists.
  </Card>
  <Card title="Semantic Caching" icon="database" href="/features/semantic-caching">
    Intelligent response caching based on semantic similarity. Reduce costs and latency for similar queries.
  </Card>
  <Card title="Built-in Observability" icon="cube" href="/features/observability/default">
    Monitor every AI request in real-time. Track performance, debug issues, and analyze usage patterns.
  </Card>
  <Card title="Prometheus Metrics" icon="chart-line" href="/features/observability/prometheus">
    Native Prometheus metrics via scraping or Push Gateway for monitoring and alerting.
  </Card>
  <Card title="OpenTelemetry" icon="bolt" href="/features/observability/otel">
    OTLP integration for distributed tracing with Grafana, New Relic, Honeycomb, and more.
  </Card>
  <Card title="Telemetry" icon="gauge" href="/features/telemetry">
    Built-in Prometheus-based monitoring tracking HTTP-level and upstream provider metrics.
  </Card>
  <Card title="Custom Plugins" icon="puzzle-piece" href="/plugins/getting-started">
    Extensible middleware architecture. Build Go or WASM plugins for custom logic.
  </Card>
  <Card title="Mocker Plugin" icon="mask" href="/features/plugins/mocker">
    Mock AI provider responses for testing, development, and simulation.
  </Card>
</CardGroup>

---

## MCP Gateway

Enable AI models to discover and execute external tools dynamically via the **Model Context Protocol**. Bifrost acts as both an MCP client and server, connecting to external tool servers and exposing tools to clients like Claude Desktop.
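Under the hood, an MCP tool invocation is a JSON-RPC 2.0 message with the `tools/call` method defined by the Model Context Protocol. A sketch of the wire shape an MCP client forwards to a tool server when a model requests a call; the tool name and arguments here are hypothetical:

```python
import json

# Hypothetical tool name and arguments, purely for illustration.
tool_call = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "get_weather",
        "arguments": {"city": "Oslo"},
    },
}

# Serialized form sent over the MCP transport.
wire = json.dumps(tool_call)
```

The tool server replies with a JSON-RPC result, which the gateway hands back to the model as the tool's output.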

<CardGroup cols={2}>
  <Card title="Overview" icon="circle-info" href="/mcp/overview">
    Learn how Bifrost integrates MCP to transform static chat models into action-capable agents.
  </Card>
  <Card title="Tool Execution" icon="play" href="/mcp/tool-execution">
    Execute MCP tools with full control over approval, security validation, and conversation flow.
  </Card>
  <Card title="Agent Mode" icon="robot" href="/mcp/agent-mode">
    Autonomous tool execution with configurable auto-approval for trusted operations.
  </Card>
  <Card title="Code Mode" icon="code" href="/mcp/code-mode">
    Let AI write Python to orchestrate multiple tools — 50% fewer tokens, 40% lower latency.
  </Card>
  <Card title="OAuth Authentication" icon="shield" href="/mcp/oauth">
    OAuth 2.0 authentication with automatic token refresh, PKCE, and dynamic client registration.
  </Card>
  <Card title="Tool Hosting" icon="toolbox" href="/mcp/tool-hosting">
    Register custom tools directly in your application and expose them via MCP.
  </Card>
</CardGroup>

---

## Enterprise features

Advanced capabilities for teams running production AI systems at scale. Enterprise deployments include private networking, custom security controls, and governance features built for production reliability.

<CardGroup cols={2}>
  <Card title="Guardrails" icon="road-barrier" href="/enterprise/guardrails">
    Content safety with AWS Bedrock Guardrails, Azure Content Safety, and Patronus AI for real-time protection.
  </Card>
  <Card title="Adaptive Load Balancing" icon="brain" href="/enterprise/adaptive-load-balancing">
    Predictive scaling with real-time health monitoring, automatically optimizing traffic across providers.
  </Card>
  <Card title="Clustering" icon="circle-nodes" href="/enterprise/clustering">
    High availability with automatic service discovery, gossip-based sync, and zero-downtime deployments.
  </Card>
  <Card title="Identity Providers (Okta, Entra)" icon="shield-check" href="/enterprise/advanced-governance">
    OpenID Connect integration, user-level governance, team sync, and compliance frameworks.
  </Card>
  <Card title="Role-Based Access Control" icon="user-shield" href="/enterprise/rbac">
    Fine-grained permissions with custom roles controlling access across all Bifrost resources.
  </Card>
  <Card title="MCP with Federated Auth" icon="screwdriver-wrench" href="/enterprise/mcp-with-fa">
    Transform existing enterprise APIs into MCP tools using federated authentication — no code required.
  </Card>
  <Card title="Vault Support" icon="vault" href="/enterprise/vault-support">
    Secure key management with HashiCorp Vault, AWS Secrets Manager, Google Secret Manager, and Azure Key Vault.
  </Card>
  <Card title="In-VPC Deployments" icon="cloud" href="/enterprise/invpc-deployments">
    Deploy within your private cloud infrastructure with VPC isolation and enhanced security controls.
  </Card>
  <Card title="Audit Logs" icon="scroll" href="/enterprise/audit-logs">
    Immutable audit trails for SOC 2, GDPR, HIPAA, and ISO 27001 compliance.
  </Card>
  <Card title="Datadog Connector" icon="dog" href="/enterprise/datadog-connector">
    Native Datadog integration for APM traces, LLM Observability, and metrics.
  </Card>
  <Card title="Log Exports" icon="download" href="/enterprise/log-exports">
    Automated export of request logs and telemetry to storage systems and data lakes.
  </Card>
  <Card title="Custom Plugin Development" icon="plug" href="/enterprise/custom-plugins">
    Tailored plugin development for organization-specific AI workflows and business logic.
  </Card>
</CardGroup>

---

## SDK integrations

Use Bifrost as a drop-in replacement for popular AI SDKs with zero code changes — just update the base URL.
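One way to sketch the swap without touching any call sites, assuming your clients use the official OpenAI Python SDK (which reads the `OPENAI_BASE_URL` environment variable at client construction); the gateway address is a placeholder:

```python
import os

# Before: the OpenAI SDK reads this and calls the provider directly.
os.environ["OPENAI_BASE_URL"] = "https://api.openai.com/v1"

# After: point the same variable at Bifrost (placeholder address), and
# every SDK call in the process flows through the gateway instead.
os.environ["OPENAI_BASE_URL"] = "http://localhost:8080/v1"
```

Clients constructed with an explicit `base_url` argument need that one argument changed instead; either way, the rest of the application code stays the same.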

<CardGroup cols={2}>
  <Card title="OpenAI SDK" icon="openai" href="/integrations/openai-sdk/overview">
    Drop-in replacement for the OpenAI Python and Node.js SDKs.
  </Card>
  <Card title="Anthropic SDK" icon="asterisk" href="/integrations/anthropic-sdk/overview">
    Drop-in replacement for the Anthropic Python and TypeScript SDKs.
  </Card>
  <Card title="Bedrock SDK" icon="aws" href="/integrations/bedrock-sdk/overview">
    Native AWS Bedrock SDK integration with full model support.
  </Card>
  <Card title="GenAI SDK" icon="diamond" href="/integrations/genai-sdk/overview">
    Drop-in replacement for the Google GenAI SDK.
  </Card>
  <Card title="LiteLLM" icon="train" href="/integrations/litellm-sdk">
    Compatibility with LiteLLM proxy and SDK for unified model access.
  </Card>
  <Card title="LangChain" icon="link" href="/integrations/langchain-sdk">
    Integration with the LangChain framework for building AI applications.
  </Card>
  <Card title="PydanticAI" icon="robot" href="/integrations/pydanticai-sdk">
    Integration with PydanticAI for type-safe AI agent development.
  </Card>
</CardGroup>

---

## Supported providers

Bifrost supports 20+ AI providers through a single unified API. Configure multiple providers and Bifrost handles routing, failover, and load balancing automatically. See the [full provider support matrix](/providers/supported-providers/overview) for detailed capability comparisons.

<CardGroup cols={3}>
  <Card title="OpenAI" icon="openai" href="/providers/supported-providers/openai">
    GPT-4o, o1, GPT-4, and more with full feature support.
  </Card>
  <Card title="Anthropic" icon="asterisk" href="/providers/supported-providers/anthropic">
    Claude 4, Claude 3.5, and Claude 3 model family.
  </Card>
  <Card title="AWS Bedrock" icon="aws" href="/providers/supported-providers/bedrock">
    Multi-model access with native AWS authentication.
  </Card>
  <Card title="Google Vertex AI" icon="v" href="/providers/supported-providers/vertex">
    Gemini and PaLM models with OAuth2 authentication.
  </Card>
  <Card title="Azure OpenAI" icon="microsoft" href="/providers/supported-providers/azure">
    OpenAI models via Azure with deployment management.
  </Card>
  <Card title="Google Gemini" icon="diamond" href="/providers/supported-providers/gemini">
    Gemini models with vision, audio, and embeddings.
  </Card>
  <Card title="Groq" icon="bolt" href="/providers/supported-providers/groq">
    Ultra-fast inference with LPU hardware acceleration.
  </Card>
  <Card title="Mistral" icon="m" href="/providers/supported-providers/mistral">
    Mistral and Mixtral models with tool support.
  </Card>
  <Card title="Cohere" icon="c" href="/providers/supported-providers/cohere">
    Command models with chat, embeddings, and reasoning.
  </Card>
  <Card title="Cerebras" icon="c" href="/providers/supported-providers/cerebras">
    High-speed inference with full streaming support.
  </Card>
  <Card title="Ollama" icon="o" href="/providers/supported-providers/ollama">
    Local inference with OpenAI-compatible format.
  </Card>
  <Card title="Hugging Face" icon="face-smiling-hands" href="/providers/supported-providers/huggingface">
    Inference API with chat, vision, TTS, and STT.
  </Card>
  <Card title="OpenRouter" icon="split" href="/providers/supported-providers/openrouter">
    Route to multiple providers with reasoning support.
  </Card>
  <Card title="Perplexity" icon="hexagon-nodes" href="/providers/supported-providers/perplexity">
    Web search integration with reasoning support.
  </Card>
  <Card title="ElevenLabs" icon="pause" href="/providers/supported-providers/elevenlabs">
    Text-to-speech and speech-to-text models.
  </Card>
  <Card title="Nebius" icon="n" href="/providers/supported-providers/nebius">
    OpenAI-compatible with streaming and embeddings.
  </Card>
  <Card title="xAI" icon="x" href="/providers/supported-providers/xai">
    Grok models with vision and reasoning support.
  </Card>
  <Card title="Parasail" icon="p" href="/providers/supported-providers/parasail">
    Chat and streaming with tool calling support.
  </Card>
  <Card title="Replicate" icon="R" href="/providers/supported-providers/replicate">
    Prediction-based architecture with async modes.
  </Card>
  <Card title="SGL" icon="s" href="/providers/supported-providers/sgl">
    SGLang runtime with streaming and embeddings.
  </Card>
</CardGroup>