fix: separate mcp inference handler for auth consistency with other inference routes #1226
📝 Walkthrough
The pull request separates MCP tool execution into a dedicated inference handler, refactors virtual key extraction to support multiple HTTP header sources (x-bf-vk, Bearer tokens with the VK prefix, x-api-key, x-goog-api-key), and updates documentation to clarify authentication semantics with disable_auth_on_inference configurations.
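The header lookup order described in the walkthrough can be sketched roughly as follows. This is a minimal illustration, not the code from plugins/governance/utils.go: the `headerLookup` type, the `hasVKPrefix` helper, and the `"sk-bf-"` prefix value are assumptions made for the example (the real VirtualKeyPrefix value is not shown in this review), and whether x-bf-vk also requires the prefix is not stated, so the sketch accepts any non-empty value there.

```go
package main

import (
	"fmt"
	"strings"
)

// virtualKeyPrefix is a placeholder for the example; the real
// VirtualKeyPrefix constant is not shown in this review.
const virtualKeyPrefix = "sk-bf-"

// headerLookup is a hypothetical abstraction over the request headers so the
// sketch does not depend on fasthttp or net/http.
type headerLookup func(name string) string

// hasVKPrefix checks the prefix case-insensitively without modifying v.
func hasVKPrefix(v string) bool {
	return strings.HasPrefix(strings.ToLower(v), virtualKeyPrefix)
}

// parseVirtualKey returns a pointer to the first virtual key found, checking
// x-bf-vk, then an Authorization Bearer token, then x-api-key and
// x-goog-api-key. It returns nil when no header carries a virtual key.
func parseVirtualKey(get headerLookup) *string {
	if vk := strings.TrimSpace(get("x-bf-vk")); vk != "" {
		return &vk
	}
	if auth := get("Authorization"); len(auth) > 7 && strings.EqualFold(auth[:7], "bearer ") {
		if token := strings.TrimSpace(auth[7:]); hasVKPrefix(token) {
			return &token
		}
	}
	for _, name := range []string{"x-api-key", "x-goog-api-key"} {
		if v := get(name); hasVKPrefix(v) {
			return &v
		}
	}
	return nil
}

func main() {
	headers := map[string]string{"Authorization": "Bearer sk-bf-AbC123"}
	vk := parseVirtualKey(func(name string) string { return headers[name] })
	if vk != nil {
		fmt.Println(*vk) // the original casing of the key is preserved
	}
}
```

Taking a lookup function instead of a concrete request type keeps the precedence logic testable on its own; a real handler would pass a closure over the request headers.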
Sequence Diagram(s)
sequenceDiagram
participant Client
participant MCPInferenceHandler
participant ParseVK as Virtual Key Parser
participant BifrostClient
participant Tool as External Tool
Client->>MCPInferenceHandler: POST /v1/mcp/tool/execute<br/>(format=chat|responses)
MCPInferenceHandler->>MCPInferenceHandler: Route by format param
MCPInferenceHandler->>MCPInferenceHandler: Parse JSON request<br/>(ChatAssistantMessageToolCall<br/>or ResponsesToolMessage)
MCPInferenceHandler->>ParseVK: parseVirtualKey(ctx)
ParseVK->>ParseVK: Check x-bf-vk header
alt VK Found
ParseVK-->>MCPInferenceHandler: *string (VK pointer)
else Check Bearer/API Keys
ParseVK->>ParseVK: Parse Authorization Bearer token<br/>or x-api-key/x-goog-api-key
ParseVK-->>MCPInferenceHandler: *string or nil
end
MCPInferenceHandler->>MCPInferenceHandler: Convert HTTP context<br/>to Bifrost context
MCPInferenceHandler->>BifrostClient: ExecuteChatMCPTool<br/>or ExecuteResponsesMCPTool
BifrostClient->>Tool: Invoke tool function
Tool-->>BifrostClient: Tool response
BifrostClient-->>MCPInferenceHandler: Result
MCPInferenceHandler->>Client: JSON response (200)
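The "route by format, then parse and validate" steps in the diagram can be sketched as below. This is an illustration under stated assumptions: `chatToolCall` and `responsesToolMessage` are simplified, hypothetical stand-ins for ChatAssistantMessageToolCall and ResponsesToolMessage, `routeToolExecution` is an invented name, and rejecting unknown format values is an assumption rather than confirmed handler behavior.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// chatToolCall is a simplified, hypothetical stand-in for
// ChatAssistantMessageToolCall.
type chatToolCall struct {
	ID       string `json:"id"`
	Function struct {
		Name      string `json:"name"`
		Arguments string `json:"arguments"`
	} `json:"function"`
}

// responsesToolMessage is a simplified, hypothetical stand-in for
// ResponsesToolMessage.
type responsesToolMessage struct {
	CallID    string `json:"call_id"`
	Name      string `json:"name"`
	Arguments string `json:"arguments"`
}

// routeToolExecution selects a request shape from the format parameter,
// decodes the JSON body, and validates that a tool name is present.
// It returns a label describing which path would be executed.
func routeToolExecution(format string, body []byte) (string, error) {
	switch format {
	case "chat":
		var req chatToolCall
		if err := json.Unmarshal(body, &req); err != nil {
			return "", fmt.Errorf("invalid chat tool call: %w", err)
		}
		if req.Function.Name == "" {
			return "", errors.New("tool function name is required")
		}
		return "chat:" + req.Function.Name, nil
	case "responses":
		var req responsesToolMessage
		if err := json.Unmarshal(body, &req); err != nil {
			return "", fmt.Errorf("invalid responses tool message: %w", err)
		}
		if req.Name == "" {
			return "", errors.New("tool name is required")
		}
		return "responses:" + req.Name, nil
	default:
		// Assumption: unknown formats are rejected.
		return "", fmt.Errorf("unsupported format %q", format)
	}
}

func main() {
	body := []byte(`{"id":"call_1","function":{"name":"get_weather","arguments":"{}"}}`)
	out, err := routeToolExecution("chat", body)
	fmt.Println(out, err)
}
```

In the actual handler the two branches would go on to call ExecuteChatMCPTool or ExecuteResponsesMCPTool on the Bifrost client; the sketch stops at validation so it stays self-contained.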
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
This stack of pull requests is managed by Graphite.
Actionable comments posted: 1
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (2)
- docs/media/ui-disable-auth-on-inference.png is excluded by !**/*.png
- docs/media/ui-enforce-virtual-keys.png is excluded by !**/*.png
📒 Files selected for processing (9)
- docs/features/governance/virtual-keys.mdx
- docs/features/mcp/tool-execution.mdx
- plugins/governance/main.go
- plugins/governance/utils.go
- transports/bifrost-http/handlers/mcp.go
- transports/bifrost-http/handlers/mcpinference.go
- transports/bifrost-http/server/server.go
- transports/changelog.md
- ui/app/workspace/config/views/securityView.tsx
💤 Files with no reviewable changes (1)
- transports/bifrost-http/handlers/mcp.go
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)
Files:
- transports/changelog.md
- docs/features/mcp/tool-execution.mdx
- ui/app/workspace/config/views/securityView.tsx
- plugins/governance/main.go
- transports/bifrost-http/server/server.go
- plugins/governance/utils.go
- transports/bifrost-http/handlers/mcpinference.go
- docs/features/governance/virtual-keys.mdx
🧠 Learnings (6)
📚 Learning: 2025-12-30T05:37:48.365Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1180
File: docs/features/mcp/connecting-to-servers.mdx:452-458
Timestamp: 2025-12-30T05:37:48.365Z
Learning: When reviewing documentation PRs in a Graphite-managed stack, first check related or previous PRs in the stack for feature implementations before flagging documentation as incorrect or unsupported. Documentation MDX files often reference features implemented in earlier stack PRs; verify that the documented behavior exists in earlier changes and that the docs accurately reflect the implemented state before requesting edits.
Applied to files:
- docs/features/mcp/tool-execution.mdx
- docs/features/governance/virtual-keys.mdx
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.
Applied to files:
- plugins/governance/main.go
- transports/bifrost-http/server/server.go
- plugins/governance/utils.go
- transports/bifrost-http/handlers/mcpinference.go
📚 Learning: 2025-12-29T11:54:55.836Z
Learnt from: akshaydeo
Repo: maximhq/bifrost PR: 1153
File: framework/configstore/rdb.go:2221-2246
Timestamp: 2025-12-29T11:54:55.836Z
Learning: In Go reviews, do not flag range-over-int patterns like for i := range n as compile-time errors, assuming Go 1.22+ semantics. Only flag actual range-capable values (slices, arrays, maps, channels, strings) and other compile-time issues. This applies to all Go files across the repository.
Applied to files:
- plugins/governance/main.go
- transports/bifrost-http/server/server.go
- plugins/governance/utils.go
- transports/bifrost-http/handlers/mcpinference.go
📚 Learning: 2025-12-22T10:50:40.990Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1154
File: plugins/governance/store.go:1165-1186
Timestamp: 2025-12-22T10:50:40.990Z
Learning: In the Bifrost governance plugin, budgets and rate limits have 1:1 relationships with their parent entities (virtual keys, teams, customers). Do not assume sharing; ensure cascade deletion logic only deletes budgets/rate limits when there are no shared references. Enforce invariants in code and add tests to verify no cross-entity sharing and that cascade deletes only remove the specific child of the parent. If a counterexample arises, adjust data model or add guards.
Applied to files:
- plugins/governance/main.go
- plugins/governance/utils.go
📚 Learning: 2025-12-12T08:25:02.629Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: transports/bifrost-http/integrations/router.go:709-712
Timestamp: 2025-12-12T08:25:02.629Z
Learning: In transports/bifrost-http/**/*.go, update streaming response handling to align with OpenAI Responses API: use typed SSE events such as response.created, response.output_text.delta, response.done, etc., and do not rely on the legacy data: [DONE] termination marker. Note that data: [DONE] is only used by the older Chat Completions and Text Completions streaming APIs. Ensure parsers, writers, and tests distinguish SSE events from the [DONE] sentinel and handle each event type accordingly for correct stream termination and progress updates.
Applied to files:
- transports/bifrost-http/server/server.go
- transports/bifrost-http/handlers/mcpinference.go
📚 Learning: 2025-12-29T09:14:16.633Z
Learnt from: akshaydeo
Repo: maximhq/bifrost PR: 888
File: transports/bifrost-http/handlers/middlewares.go:246-256
Timestamp: 2025-12-29T09:14:16.633Z
Learning: In the bifrost HTTP transport, fasthttp.RequestCtx is the primary context carrier and should be passed directly to functions that expect a context.Context. Do not convert to context.Context unless explicitly required. Ensure tracer implementations and related components are designed to accept fasthttp.RequestCtx directly, and document this architectural decision for maintainers.
Applied to files:
- transports/bifrost-http/server/server.go
- transports/bifrost-http/handlers/mcpinference.go
🧬 Code graph analysis (3)
plugins/governance/main.go (2)
core/utils.go (1)
GetStringFromContext (293-300)
core/schemas/bifrost.go (1)
BifrostContextKeyVirtualKey (123-123)
transports/bifrost-http/server/server.go (1)
transports/bifrost-http/handlers/mcpinference.go (1)
NewMCPInferenceHandler (21-26)
transports/bifrost-http/handlers/mcpinference.go (4)
transports/bifrost-http/lib/middleware.go (1)
ChainMiddlewares (11-23)
transports/bifrost-http/handlers/utils.go (2)
SendBifrostError (47-62)
SendJSON (16-22)
core/schemas/chatcompletions.go (1)
ChatAssistantMessageToolCall (720-725)
transports/bifrost-http/lib/ctx.go (1)
ConvertToBifrostContext (80-409)
🔇 Additional comments (10)
transports/bifrost-http/server/server.go (1)
1006-1010: LGTM! Auth middleware now applies consistently to MCP inference routes.
The MCP inference handler is correctly registered with the same middleware chain as other inference routes, ensuring authentication is applied consistently when configured. This achieves the PR objective of separating the MCP tool execution endpoint for auth consistency.
plugins/governance/utils.go (1)
23-34: Critical: Case sensitivity bug will break virtual key authentication.
Lines 27 and 32-33 convert the entire Bearer token value to lowercase before checking the prefix and storing it. Virtual keys are case-sensitive, so this will cause authentication failures.
Only the prefix portion should be case-insensitive for the comparison, but the actual token value must preserve its original case.
🔎 Proposed fix
authHeader := string(ctx.Request.Header.Peek("Authorization"))
if authHeader != "" {
    if strings.HasPrefix(strings.ToLower(authHeader), "bearer ") {
        authHeaderValue := strings.TrimSpace(authHeader[7:]) // Remove "Bearer " prefix
-       if authHeaderValue != "" && strings.HasPrefix(strings.ToLower(authHeaderValue), VirtualKeyPrefix) {
+       if authHeaderValue != "" && strings.HasPrefix(strings.ToLower(authHeaderValue), strings.ToLower(VirtualKeyPrefix)) {
            virtualKeyValue = authHeaderValue
        }
    }
}
if virtualKeyValue != "" {
    return bifrost.Ptr(virtualKeyValue)
}
Likely an incorrect or invalid review comment.
docs/features/mcp/tool-execution.mdx (1)
21-32: LGTM! Clear authentication documentation.
The new Authentication section accurately describes the authentication behavior for the /v1/mcp/tool/execute endpoint and properly aligns with the PR objective of establishing auth consistency across inference routes. The cross-reference to the Virtual Keys documentation helps users understand the relationship between authentication and virtual keys.
docs/features/governance/virtual-keys.mdx (2)
491-544: LGTM! Clear documentation on enforcing virtual keys.
The updated section properly documents how to make virtual keys mandatory, with accurate UI paths and clear instructions across all configuration methods. The terminology updates and UI path changes (Config → Security) align well with the broader documentation improvements in this PR.
545-620: Excellent clarification of authentication and virtual keys relationship.
This new section effectively clarifies the independent-yet-complementary relationship between authentication and virtual keys. The examples for both disable_auth_on_inference scenarios are clear and practical, helping users understand when to use which headers. The comprehensive configuration guidance across Web UI, API, and config.json is thorough.
transports/bifrost-http/handlers/mcpinference.go (5)
15-26: LGTM! Clean handler structure.
The struct design is straightforward and the constructor properly initializes the handler with required dependencies. The fields are effectively read-only after construction, making this safe for concurrent request handling.
28-31: LGTM! Proper route registration with middleware support.
The route registration correctly uses ChainMiddlewares to apply the provided middlewares, which aligns with the PR objective of ensuring consistent authentication middleware application across inference routes.
48-79: LGTM! Solid error handling and context management.
The method demonstrates proper:
- JSON unmarshaling with error handling
- Field validation (nil and empty string checks)
- Context conversion with deferred cleanup
- Error propagation using SendBifrostError
- Response formatting using SendJSON
The false parameter in ConvertToBifrostContext appears appropriate for non-streaming tool execution.
81-112: LGTM! Consistent implementation for responses format.
The method follows the same solid patterns as executeChatMCPTool:
- Proper JSON unmarshaling and validation
- Context management with deferred cleanup
- Appropriate error and response handling
The pointer usage difference between ExecuteResponsesMCPTool(&req) here and ExecuteChatMCPTool(req) in the chat variant appears intentional, likely reflecting the underlying client API signatures.
33-46: No issues found.
The SendError() function is properly defined in handlers/utils.go and is being called correctly with matching parameters.
Likely an incorrect or invalid review comment.
xAPIKey := string(ctx.Request.Header.Peek("x-api-key"))
if xAPIKey != "" && strings.HasPrefix(strings.ToLower(xAPIKey), VirtualKeyPrefix) {
return bifrost.Ptr(xAPIKey)
}
// Checking x-goog-api-key header
xGoogleAPIKey := string(ctx.Request.Header.Peek("x-goog-api-key"))
if xGoogleAPIKey != "" && strings.HasPrefix(strings.ToLower(xGoogleAPIKey), VirtualKeyPrefix) {
return bifrost.Ptr(xGoogleAPIKey)
}
Critical: Case sensitivity bug in x-api-key and x-goog-api-key headers.
Lines 36 and 41 convert the entire key value to lowercase before checking the prefix and returning it. This will break authentication for case-sensitive virtual keys.
The same fix applies here as with the Authorization header - only the prefix comparison should be case-insensitive.
🔎 Proposed fix
xAPIKey := string(ctx.Request.Header.Peek("x-api-key"))
- if xAPIKey != "" && strings.HasPrefix(strings.ToLower(xAPIKey), VirtualKeyPrefix) {
+ if xAPIKey != "" && strings.HasPrefix(strings.ToLower(xAPIKey), strings.ToLower(VirtualKeyPrefix)) {
return bifrost.Ptr(xAPIKey)
}
// Checking x-goog-api-key header
xGoogleAPIKey := string(ctx.Request.Header.Peek("x-goog-api-key"))
- if xGoogleAPIKey != "" && strings.HasPrefix(strings.ToLower(xGoogleAPIKey), VirtualKeyPrefix) {
+ if xGoogleAPIKey != "" && strings.HasPrefix(strings.ToLower(xGoogleAPIKey), strings.ToLower(VirtualKeyPrefix)) {
return bifrost.Ptr(xGoogleAPIKey)
}
🤖 Prompt for AI Agents
In plugins/governance/utils.go around lines 35 to 43, the code lowercases the
entire x-api-key and x-goog-api-key values before checking the VirtualKeyPrefix,
which corrupts case-sensitive keys; change the checks to only perform a
case-insensitive comparison of the prefix while returning the original header
value untouched: ensure you first verify the header length is at least the
prefix length, compute a lowercase comparison of the header's leading substring
(or lowercase the prefix) to compare to the lowercase VirtualKeyPrefix, and if
it matches return the original xAPIKey/xGoogleAPIKey string as before; apply the
same fix to both header checks.
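
The approach described in the prompt above (length check, case-insensitive comparison of only the leading substring, original value returned untouched) could look roughly like this. It is a sketch under assumptions: `hasPrefixFold` and `virtualKeyFromHeader` are hypothetical helper names, and `"sk-bf-"` is a placeholder for VirtualKeyPrefix, whose real value is not shown in this review. The byte-based slicing assumes an ASCII prefix.

```go
package main

import (
	"fmt"
	"strings"
)

// virtualKeyPrefix is a placeholder for the example; the real
// VirtualKeyPrefix constant is not shown in this review.
const virtualKeyPrefix = "sk-bf-"

// hasPrefixFold reports whether s begins with prefix, ignoring case.
// It compares only the leading len(prefix) bytes and never modifies s.
func hasPrefixFold(s, prefix string) bool {
	return len(s) >= len(prefix) && strings.EqualFold(s[:len(prefix)], prefix)
}

// virtualKeyFromHeader returns the header value exactly as received when it
// carries the virtual key prefix, and false otherwise.
func virtualKeyFromHeader(value string) (string, bool) {
	if hasPrefixFold(value, virtualKeyPrefix) {
		return value, true
	}
	return "", false
}

func main() {
	vk, ok := virtualKeyFromHeader("SK-BF-AbCdEf")
	fmt.Println(vk, ok) // prints: SK-BF-AbCdEf true
}
```

Using strings.EqualFold on the leading slice avoids allocating a lowercased copy of the whole header value, and the explicit length check guards the slice expression against short inputs.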

Summary
Improves the documentation and implementation of virtual keys, clarifying their relationship with authentication and standardizing how they're handled across the application.
Changes
Type of change
Affected areas
How to test
disable_auth_on_inference setting:
Screenshots/Recordings
Screenshots added for "Enforce Virtual Keys" and "Disable Auth on Inference" UI settings.
Breaking changes
Related issues
Improves virtual key handling and documentation for better user experience.
Security considerations
This PR clarifies the relationship between authentication and virtual keys, which is important for proper security configuration.
Checklist
docs/contributing/README.md and followed the guidelines