Skip to content

fix: separte mcp inference handler for auth consistency with other inference routes#1226

Closed
Pratham-Mishra04 wants to merge 1 commit intomainfrom
01-03-fix_separate_mcp_inference_handler_for_auth_consistency
Closed

fix: separte mcp inference handler for auth consistency with other inference routes#1226
Pratham-Mishra04 wants to merge 1 commit intomainfrom
01-03-fix_separate_mcp_inference_handler_for_auth_consistency

Conversation

@Pratham-Mishra04
Copy link
Collaborator

Summary

Improves the documentation and implementation of virtual keys, clarifying their relationship with authentication and standardizing how they're handled across the application.

Changes

  • Enhanced documentation on virtual keys, explaining how they work with authentication
  • Added clear instructions on making virtual keys mandatory
  • Moved MCP tool execution endpoint to a separate handler for consistent auth middleware application
  • Refactored utility functions for better code organization
  • Added screenshots to improve documentation clarity
  • Updated UI text to be more consistent with terminology

Type of change

  • Bug fix
  • Feature
  • Refactor
  • Documentation
  • Chore/CI

Affected areas

  • Core (Go)
  • Transports (HTTP)
  • Providers/Integrations
  • Plugins
  • UI (Next.js)
  • Docs

How to test

  1. Test virtual key authentication with and without the disable_auth_on_inference setting:
# With auth disabled
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "x-bf-vk: sk-bf-your-key" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o-mini", "messages": [{"role": "user", "content": "Hello"}]}'

# With auth enabled
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Authorization: Basic <base64-credentials>" \
  -H "x-bf-vk: sk-bf-your-key" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o-mini", "messages": [{"role": "user", "content": "Hello"}]}'
  1. Test MCP tool execution with virtual keys:
curl -X POST http://localhost:8080/v1/mcp/tool/execute \
  -H "x-bf-vk: sk-bf-your-key" \
  -H "Content-Type: application/json" \
  -d '{
    "function": {
      "name": "get_weather",
      "arguments": "{\"location\":\"New York\"}"
    }
  }'
  1. Test enforcing virtual keys in the UI by going to Config → Security and enabling "Enforce Virtual Keys"

Screenshots/Recordings

Screenshots added for "Enforce Virtual Keys" and "Disable Auth on Inference" UI settings.

Breaking changes

  • Yes
  • No

Related issues

Improves virtual key handling and documentation for better user experience.

Security considerations

This PR clarifies the relationship between authentication and virtual keys, which is important for proper security configuration.

Checklist

  • I read docs/contributing/README.md and followed the guidelines
  • I added/updated tests where appropriate
  • I updated documentation where needed
  • I verified builds succeed (Go and UI)
  • I verified the CI pipeline passes locally if applicable

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jan 3, 2026

📝 Walkthrough

Walkthrough

The pull request separates MCP tool execution into a dedicated inference handler, refactors virtual key extraction to support multiple HTTP header sources (x-bf-vk, Bearer tokens with VK prefix, x-api-key, x-goog-api-key), and updates documentation to clarify authentication semantics with disable_auth_on_inference configurations.

Changes

Cohort / File(s) Summary
Documentation Updates
docs/features/governance/virtual-keys.mdx, docs/features/mcp/tool-execution.mdx, ui/app/workspace/config/views/securityView.tsx, transports/changelog.md
Terminology standardization: "x-bf-vk header" → "virtual key header"; reorganized sections with "Authentication and Virtual Keys" and "Making Virtual Keys Mandatory"; added disable_auth_on_inference behavior guidance with CLI/API examples; extended config.json documentation; UI description updated.
Governance Plugin Refactoring
plugins/governance/main.go, plugins/governance/utils.go
Replaced internal getStringFromContext with bifrost.GetStringFromContext in main.go; rewrote parseVirtualKey utility to extract virtual keys from multiple header sources (x-bf-vk, Authorization Bearer, x-api-key, x-goog-api-key) with FastHTTP context integration and pointer return semantics.
MCP Handler Restructuring
transports/bifrost-http/handlers/mcp.go, transports/bifrost-http/handlers/mcpinference.go, transports/bifrost-http/server/server.go
Removed POST /v1/mcp/tool/execute route and three tool execution methods (executeTool, executeChatMCPTool, executeResponsesMCPTool) from MCP handler; introduced new MCPInferenceHandler with identical tool execution methods as dedicated inference endpoint; registered new handler in server inference routes.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant MCPInferenceHandler
    participant ParseVK as Virtual Key Parser
    participant BifrostClient
    participant Tool as External Tool

    Client->>MCPInferenceHandler: POST /v1/mcp/tool/execute<br/>(format=chat|responses)
    MCPInferenceHandler->>MCPInferenceHandler: Route by format param
    MCPInferenceHandler->>MCPInferenceHandler: Parse JSON request<br/>(ChatAssistantMessageToolCall<br/>or ResponsesToolMessage)
    MCPInferenceHandler->>ParseVK: parseVirtualKey(ctx)
    ParseVK->>ParseVK: Check x-bf-vk header
    alt VK Found
        ParseVK-->>MCPInferenceHandler: *string (VK pointer)
    else Check Bearer/API Keys
        ParseVK->>ParseVK: Parse Authorization Bearer token<br/>or x-api-key/x-goog-api-key
        ParseVK-->>MCPInferenceHandler: *string or nil
    end
    MCPInferenceHandler->>MCPInferenceHandler: Convert HTTP context<br/>to Bifrost context
    MCPInferenceHandler->>BifrostClient: ExecuteChatMCPTool<br/>or ExecuteResponsesMCPTool
    BifrostClient->>Tool: Invoke tool function
    Tool-->>BifrostClient: Tool response
    BifrostClient-->>MCPInferenceHandler: Result
    MCPInferenceHandler->>Client: JSON response (200)
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 The tools once wandered in MCP's embrace,
Now sprinted to Inference's swifter space!
With headers diverse—from Google to Bearer's refrain,
Virtual keys dance through multiple lanes.
One handler splits two, auth flows run clear,
Consistency reigns—the separation is here! 🎯

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description check ✅ Passed The PR description is well-structured, follows the provided template, and includes all major sections (Summary, Changes, Type of change, Affected areas, How to test, Screenshots, Breaking changes, Related issues, Security considerations, and Checklist).
Title check ✅ Passed The title addresses the main change (separating MCP inference handler for auth consistency) but contains a typo ('separte' instead of 'separate').

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch 01-03-fix_separate_mcp_inference_handler_for_auth_consistency

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Collaborator Author

Pratham-Mishra04 commented Jan 3, 2026

This stack of pull requests is managed by Graphite. Learn more about stacking.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9556576 and 49ff4a9.

⛔ Files ignored due to path filters (2)
  • docs/media/ui-disable-auth-on-inference.png is excluded by !**/*.png
  • docs/media/ui-enforce-virtual-keys.png is excluded by !**/*.png
📒 Files selected for processing (9)
  • docs/features/governance/virtual-keys.mdx
  • docs/features/mcp/tool-execution.mdx
  • plugins/governance/main.go
  • plugins/governance/utils.go
  • transports/bifrost-http/handlers/mcp.go
  • transports/bifrost-http/handlers/mcpinference.go
  • transports/bifrost-http/server/server.go
  • transports/changelog.md
  • ui/app/workspace/config/views/securityView.tsx
💤 Files with no reviewable changes (1)
  • transports/bifrost-http/handlers/mcp.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • transports/changelog.md
  • docs/features/mcp/tool-execution.mdx
  • ui/app/workspace/config/views/securityView.tsx
  • plugins/governance/main.go
  • transports/bifrost-http/server/server.go
  • plugins/governance/utils.go
  • transports/bifrost-http/handlers/mcpinference.go
  • docs/features/governance/virtual-keys.mdx
🧠 Learnings (6)
📚 Learning: 2025-12-30T05:37:48.365Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1180
File: docs/features/mcp/connecting-to-servers.mdx:452-458
Timestamp: 2025-12-30T05:37:48.365Z
Learning: When reviewing documentation PRs in a Graphite-managed stack, first check related or previous PRs in the stack for feature implementations before flagging documentation as incorrect or unsupported. Documentation MDX files often reference features implemented in earlier stack PRs; verify that the documented behavior exists in earlier changes and that the docs accurately reflect the implemented state before requesting edits.

Applied to files:

  • docs/features/mcp/tool-execution.mdx
  • docs/features/governance/virtual-keys.mdx
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

  • plugins/governance/main.go
  • transports/bifrost-http/server/server.go
  • plugins/governance/utils.go
  • transports/bifrost-http/handlers/mcpinference.go
📚 Learning: 2025-12-29T11:54:55.836Z
Learnt from: akshaydeo
Repo: maximhq/bifrost PR: 1153
File: framework/configstore/rdb.go:2221-2246
Timestamp: 2025-12-29T11:54:55.836Z
Learning: In Go reviews, do not flag range-over-int patterns like for i := range n as compile-time errors, assuming Go 1.22+ semantics. Only flag actual range-capable values (slices, arrays, maps, channels, strings) and other compile-time issues. This applies to all Go files across the repository.

Applied to files:

  • plugins/governance/main.go
  • transports/bifrost-http/server/server.go
  • plugins/governance/utils.go
  • transports/bifrost-http/handlers/mcpinference.go
📚 Learning: 2025-12-22T10:50:40.990Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1154
File: plugins/governance/store.go:1165-1186
Timestamp: 2025-12-22T10:50:40.990Z
Learning: In the Bifrost governance plugin, budgets and rate limits have 1:1 relationships with their parent entities (virtual keys, teams, customers). Do not assume sharing; ensure cascade deletion logic only deletes budgets/rate limits when there are no shared references. Enforce invariants in code and add tests to verify no cross-entity sharing and that cascade deletes only remove the specific child of the parent. If a counterexample arises, adjust data model or add guards.

Applied to files:

  • plugins/governance/main.go
  • plugins/governance/utils.go
📚 Learning: 2025-12-12T08:25:02.629Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: transports/bifrost-http/integrations/router.go:709-712
Timestamp: 2025-12-12T08:25:02.629Z
Learning: In transports/bifrost-http/**/*.go, update streaming response handling to align with OpenAI Responses API: use typed SSE events such as response.created, response.output_text.delta, response.done, etc., and do not rely on the legacy data: [DONE] termination marker. Note that data: [DONE] is only used by the older Chat Completions and Text Completions streaming APIs. Ensure parsers, writers, and tests distinguish SSE events from the [DONE] sentinel and handle each event type accordingly for correct stream termination and progress updates.

Applied to files:

  • transports/bifrost-http/server/server.go
  • transports/bifrost-http/handlers/mcpinference.go
📚 Learning: 2025-12-29T09:14:16.633Z
Learnt from: akshaydeo
Repo: maximhq/bifrost PR: 888
File: transports/bifrost-http/handlers/middlewares.go:246-256
Timestamp: 2025-12-29T09:14:16.633Z
Learning: In the bifrost HTTP transport, fasthttp.RequestCtx is the primary context carrier and should be passed directly to functions that expect a context.Context. Do not convert to context.Context unless explicitly required. Ensure tracer implementations and related components are designed to accept fasthttp.RequestCtx directly, and document this architectural decision for maintainers.

Applied to files:

  • transports/bifrost-http/server/server.go
  • transports/bifrost-http/handlers/mcpinference.go
🧬 Code graph analysis (3)
plugins/governance/main.go (2)
core/utils.go (1)
  • GetStringFromContext (293-300)
core/schemas/bifrost.go (1)
  • BifrostContextKeyVirtualKey (123-123)
transports/bifrost-http/server/server.go (1)
transports/bifrost-http/handlers/mcpinference.go (1)
  • NewMCPInferenceHandler (21-26)
transports/bifrost-http/handlers/mcpinference.go (4)
transports/bifrost-http/lib/middleware.go (1)
  • ChainMiddlewares (11-23)
transports/bifrost-http/handlers/utils.go (2)
  • SendBifrostError (47-62)
  • SendJSON (16-22)
core/schemas/chatcompletions.go (1)
  • ChatAssistantMessageToolCall (720-725)
transports/bifrost-http/lib/ctx.go (1)
  • ConvertToBifrostContext (80-409)
🔇 Additional comments (10)
transports/bifrost-http/server/server.go (1)

1006-1010: LGTM! Auth middleware now applies consistently to MCP inference routes.

The MCP inference handler is correctly registered with the same middleware chain as other inference routes, ensuring authentication is applied consistently when configured. This achieves the PR objective of separating the MCP tool execution endpoint for auth consistency.

plugins/governance/utils.go (1)

23-34: Critical: Case sensitivity bug will break virtual key authentication.

Lines 27 and 32-33 convert the entire Bearer token value to lowercase before checking the prefix and storing it. Virtual keys are case-sensitive, so this will cause authentication failures.

Only the prefix portion should be case-insensitive for the comparison, but the actual token value must preserve its original case.

🔎 Proposed fix
 	authHeader := string(ctx.Request.Header.Peek("Authorization"))
 	if authHeader != "" {
 		if strings.HasPrefix(strings.ToLower(authHeader), "bearer ") {
 			authHeaderValue := strings.TrimSpace(authHeader[7:]) // Remove "Bearer " prefix
-			if authHeaderValue != "" && strings.HasPrefix(strings.ToLower(authHeaderValue), VirtualKeyPrefix) {
+			if authHeaderValue != "" && strings.HasPrefix(strings.ToLower(authHeaderValue), strings.ToLower(VirtualKeyPrefix)) {
 				virtualKeyValue = authHeaderValue
 			}
 		}
 	}
 	if virtualKeyValue != "" {
 		return bifrost.Ptr(virtualKeyValue)
 	}

Likely an incorrect or invalid review comment.

docs/features/mcp/tool-execution.mdx (1)

21-32: LGTM! Clear authentication documentation.

The new Authentication section accurately describes the authentication behavior for the /v1/mcp/tool/execute endpoint and properly aligns with the PR objective of establishing auth consistency across inference routes. The cross-reference to the Virtual Keys documentation helps users understand the relationship between authentication and virtual keys.

docs/features/governance/virtual-keys.mdx (2)

491-544: LGTM! Clear documentation on enforcing virtual keys.

The updated section properly documents how to make virtual keys mandatory, with accurate UI paths and clear instructions across all configuration methods. The terminology updates and UI path changes (Config → Security) align well with the broader documentation improvements in this PR.


545-620: Excellent clarification of authentication and virtual keys relationship.

This new section effectively clarifies the independent-yet-complementary relationship between authentication and virtual keys. The examples for both disable_auth_on_inference scenarios are clear and practical, helping users understand when to use which headers. The comprehensive configuration guidance across Web UI, API, and config.json is thorough.

transports/bifrost-http/handlers/mcpinference.go (5)

15-26: LGTM! Clean handler structure.

The struct design is straightforward and the constructor properly initializes the handler with required dependencies. The fields are effectively read-only after construction, making this safe for concurrent request handling.


28-31: LGTM! Proper route registration with middleware support.

The route registration correctly uses ChainMiddlewares to apply the provided middlewares, which aligns with the PR objective of ensuring consistent authentication middleware application across inference routes.


48-79: LGTM! Solid error handling and context management.

The method demonstrates proper:

  • JSON unmarshaling with error handling
  • Field validation (nil and empty string checks)
  • Context conversion with deferred cleanup
  • Error propagation using SendBifrostError
  • Response formatting using SendJSON

The false parameter in ConvertToBifrostContext appears appropriate for non-streaming tool execution.


81-112: LGTM! Consistent implementation for responses format.

The method follows the same solid patterns as executeChatMCPTool:

  • Proper JSON unmarshaling and validation
  • Context management with deferred cleanup
  • Appropriate error and response handling

The pointer usage difference between ExecuteResponsesMCPTool(&req) here and ExecuteChatMCPTool(req) in the chat variant appears intentional, likely reflecting the underlying client API signatures.


33-46: No issues found. The SendError() function is properly defined in handlers/utils.go and is being called correctly with matching parameters.

Likely an incorrect or invalid review comment.

Comment on lines +35 to +43
xAPIKey := string(ctx.Request.Header.Peek("x-api-key"))
if xAPIKey != "" && strings.HasPrefix(strings.ToLower(xAPIKey), VirtualKeyPrefix) {
return bifrost.Ptr(xAPIKey)
}
// Checking x-goog-api-key header
xGoogleAPIKey := string(ctx.Request.Header.Peek("x-goog-api-key"))
if xGoogleAPIKey != "" && strings.HasPrefix(strings.ToLower(xGoogleAPIKey), VirtualKeyPrefix) {
return bifrost.Ptr(xGoogleAPIKey)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Critical: Case sensitivity bug in x-api-key and x-goog-api-key headers.

Lines 36 and 41 convert the entire key value to lowercase before checking the prefix and returning it. This will break authentication for case-sensitive virtual keys.

The same fix applies here as with the Authorization header - only the prefix comparison should be case-insensitive.

🔎 Proposed fix
 	xAPIKey := string(ctx.Request.Header.Peek("x-api-key"))
-	if xAPIKey != "" && strings.HasPrefix(strings.ToLower(xAPIKey), VirtualKeyPrefix) {
+	if xAPIKey != "" && strings.HasPrefix(strings.ToLower(xAPIKey), strings.ToLower(VirtualKeyPrefix)) {
 		return bifrost.Ptr(xAPIKey)
 	}
 	// Checking x-goog-api-key header
 	xGoogleAPIKey := string(ctx.Request.Header.Peek("x-goog-api-key"))
-	if xGoogleAPIKey != "" && strings.HasPrefix(strings.ToLower(xGoogleAPIKey), VirtualKeyPrefix) {
+	if xGoogleAPIKey != "" && strings.HasPrefix(strings.ToLower(xGoogleAPIKey), strings.ToLower(VirtualKeyPrefix)) {
 		return bifrost.Ptr(xGoogleAPIKey)
 	}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
xAPIKey := string(ctx.Request.Header.Peek("x-api-key"))
if xAPIKey != "" && strings.HasPrefix(strings.ToLower(xAPIKey), VirtualKeyPrefix) {
return bifrost.Ptr(xAPIKey)
}
// Checking x-goog-api-key header
xGoogleAPIKey := string(ctx.Request.Header.Peek("x-goog-api-key"))
if xGoogleAPIKey != "" && strings.HasPrefix(strings.ToLower(xGoogleAPIKey), VirtualKeyPrefix) {
return bifrost.Ptr(xGoogleAPIKey)
}
xAPIKey := string(ctx.Request.Header.Peek("x-api-key"))
if xAPIKey != "" && strings.HasPrefix(strings.ToLower(xAPIKey), strings.ToLower(VirtualKeyPrefix)) {
return bifrost.Ptr(xAPIKey)
}
// Checking x-goog-api-key header
xGoogleAPIKey := string(ctx.Request.Header.Peek("x-goog-api-key"))
if xGoogleAPIKey != "" && strings.HasPrefix(strings.ToLower(xGoogleAPIKey), strings.ToLower(VirtualKeyPrefix)) {
return bifrost.Ptr(xGoogleAPIKey)
}
🤖 Prompt for AI Agents
In plugins/governance/utils.go around lines 35 to 43, the code lowercases the
entire x-api-key and x-goog-api-key values before checking the VirtualKeyPrefix,
which corrupts case-sensitive keys; change the checks to only perform a
case-insensitive comparison of the prefix while returning the original header
value untouched: ensure you first verify the header length is at least the
prefix length, compute a lowercase comparison of the header's leading substring
(or lowercase the prefix) to compare to the lowercase VirtualKeyPrefix, and if
it matches return the original xAPIKey/xGoogleAPIKey string as before; apply the
same fix to both header checks.

@Pratham-Mishra04 Pratham-Mishra04 changed the title fix: separte mcp inference handler for auth consistency with other inference routes [DO NOT MERGE] fix: separte mcp inference handler for auth consistency with other inference routes Jan 3, 2026
@Pratham-Mishra04 Pratham-Mishra04 changed the title [DO NOT MERGE] fix: separte mcp inference handler for auth consistency with other inference routes fix: separte mcp inference handler for auth consistency with other inference routes Jan 9, 2026
Base automatically changed from v1.4.0 to main January 12, 2026 09:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] inconsistency in the authentication design for MCP endpoints

1 participant