Skip to content

v1.4.0#1153

Merged
akshaydeo merged 217 commits intomainfrom
v1.4.0
Jan 12, 2026
Merged

v1.4.0#1153
akshaydeo merged 217 commits intomainfrom
v1.4.0

Conversation

@akshaydeo
Copy link
Contributor

  • MCP gateway with code mode
  • New plugins architecture

Pratham-Mishra04 and others added 5 commits November 28, 2025 16:14
## Summary

Briefly explain the purpose of this PR and the problem it solves.

## Changes

- What was changed and why
- Any notable design decisions or trade-offs

## Type of change

- [ ] Bug fix
- [ ] Feature
- [ ] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [ ] Core (Go)
- [ ] Transports (HTTP)
- [ ] Providers/Integrations
- [ ] Plugins
- [ ] UI (Next.js)
- [ ] Docs

## How to test

Describe the steps to validate this change. Include commands and expected outcomes.

```sh
# Core/Transports
go version
go test ./...

# UI
cd ui
pnpm i || npm i
pnpm test || npm test
pnpm build || npm run build
```

If adding new configs or environment variables, document them here.

## Screenshots/Recordings

If UI changes, add before/after screenshots or short clips.

## Breaking changes

- [ ] Yes
- [ ] No

If yes, describe impact and migration instructions.

## Related issues

Link related issues and discussions. Example: Closes #123

## Security considerations

Note any security implications (auth, secrets, PII, sandboxing, etc.).

## Checklist

- [ ] I read `docs/contributing/README.md` and followed the guidelines
- [ ] I added/updated tests where appropriate
- [ ] I updated documentation where needed
- [ ] I verified builds succeed (Go and UI)
- [ ] I verified the CI pipeline passes locally if applicable
## Summary

Implement MCP (Model Context Protocol) server functionality over HTTP, enabling Bifrost to act as an MCP server that can receive requests from MCP clients and execute tools.

## Changes

- Added a new `MCPServerHandler` that implements the MCP protocol over HTTP streaming
- Created endpoints for MCP server communication (`/mcp`) supporting both JSON-RPC (POST) and Server-Sent Events (GET)
- Implemented tool synchronization from connected MCP clients to the MCP server
- Added virtual key validation for MCP server requests when governance enforcement is enabled
- Extended the HTTP server to register MCP server routes and implement tool execution

## Type of change

- [x] Feature

## Affected areas

- [x] Core (Go)
- [x] Transports (HTTP)

## How to test

Test the MCP server functionality by sending MCP protocol requests to the `/mcp` endpoint:

```sh
# Test JSON-RPC endpoint
curl -X POST http://localhost:8080/mcp \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-bf-your-virtual-key" \
  -d '{"jsonrpc":"2.0","method":"list_tools","id":1}'

# Test SSE endpoint
curl -N http://localhost:8080/mcp \
  -H "Accept: text/event-stream" \
  -H "Authorization: Bearer sk-bf-your-virtual-key"
```

## Breaking changes

- [x] No

## Security considerations

The implementation includes security checks for virtual keys when governance enforcement is enabled, ensuring that MCP server requests are properly authenticated.

## Checklist

- [x] I added/updated tests where appropriate
- [x] I verified builds succeed (Go and UI)
@github-actions
Copy link
Contributor

🧪 Test Suite Available

This PR can be tested by a repository admin.

Run tests for PR #1153

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Dec 22, 2025

Caution

Review failed

The pull request is closed.

📝 Walkthrough

Summary by CodeRabbit

  • New Features

    • MCP: tool discovery, agent-mode automation, code-mode (execute TypeScript-like code) and client health monitoring.
    • Distributed tracing: spans, stream timing (TTFT) and per-request trace context.
    • WASM plugin examples (Go/Rust) and HTTP plugin middleware.
  • Improvements

    • Unified request context across providers; stronger streaming cancellation & timeout handling.
    • Vertex model listing shows custom + foundation models with friendlier names.
    • xAI/Grok compatibility tweaks.
  • Documentation

    • Updated plugin and provider docs; new prerelease changelogs.

✏️ Tip: You can customize this high-level summary in your review settings.

Walkthrough

Adds a full MCP subsystem (manager, client lifecycle, tool discovery, code-mode and agent-mode execution), migrates callers to a richer *schemas.BifrostContext and tracing infrastructure (Trace/Span/Tracer + TraceStore), refactors streaming accumulators/TTFT and plugin transport (HTTP middleware/WASM), and expands governance, migrations, and tests.

Changes

Cohort / File(s) Summary
MCP core & tooling
core/mcp/*, core/bifrost.go, core/mcp.go, core/schemas/mcp.go, tests/core-mcp/*
New MCPManager/ToolsManager, client lifecycle (Add/Edit/Remove/Reconnect), transports (HTTP/STDIO/SSE/InProcess), health monitors, tool registration, code-mode tools (list/read/execute), agent-mode loop (chat/responses), agent adapter layer, and extensive unit/integration tests. Bifrost lazily initializes and delegates MCP-related flows.
Tracing & context primitives
core/schemas/trace.go, core/schemas/tracer.go, framework/tracing/*, core/schemas/context.go, core/utils.go, core/bifrost.go
New Trace/Span/Tracer types and NoOp/default tracer, TraceStore and Tracer implementation, W3C propagation helpers, BifrostContext extended with trace/span keys and SetValue/GetValue behavior, Bifrost stores tracer (atomic.Value) and new tracer helpers.
Provider APIs (context migration)
core/providers/*/* (many files)
Mass migration of provider public APIs from context.Context to *schemas.BifrostContext; streaming paths updated to use ctx.SetValue for stream-end/cancellation; new error converters and provider-specific compatibility adjustments (xAI/grok, Responses/Chat conversions).
Streaming & accumulators
framework/streaming/*, core/providers/utils/utils.go
StreamAccumulator refactor: FirstChunkTimestamp, dedupe maps, refCount, getLast* accessors, idempotent finalization, Time-To-First-Token (TTFT) propagation, chunk deduplication, and deferred-span/aggregated post-hook finalization for streaming.
Code-mode execution & VFS
core/mcp/codemode*.go
TypeScript transpile + Goja VM runner with MCP tool bindings, console capture, timeouts and Promise handling; tools: executeToolCode, listToolFiles, readToolFile; rich error hints and environment metadata.
Tools & client manager
core/mcp/clientmanager.go, core/mcp/toolmanager.go, core/mcp/utils.go
Client lifecycle APIs (Add/Edit/Remove/Reconnect), RegisterTool, GetClients/GetToolPerClient, connect helpers for transports, UpdateToolManagerConfig, tool discovery/routing and code-mode binding logic.
Agent adaptors & loop
core/mcp/agent*.go, core/mcp/agentadaptors.go
API-agnostic agent adapters for Chat/Responses, shared executeAgent loop, classification and parallel execution of auto-exec tools, iteration up to max depth, tool-call parsing and result aggregation.
Governance & in-memory store
plugins/governance/*, framework/configstore/*
New GovernanceStore interface and LocalGovernanceStore, in-memory rate-limit/budget APIs, UsageTracker updates, configstore migrations and new MCP fields (agent depth, tool timeouts, code-mode binding), many governance tests and e2e scenarios.
Schemas & JSON helpers
core/schemas/*.go, core/schemas/json_*.go, core/schemas/*_wasm.go
Tracing schema types, Tracer/Span/Trace APIs, JSON Marshal/Unmarshal wrappers (native vs wasm), DeepCopyChatTool, plugin HTTP request/response types, context wasm/native helpers, minor field/tag adjustments (e.g., BifrostError.ExtraFields).
Plugin transport & WASM plugins
core/schemas/plugin.go, framework/plugins/dynamicplugin.go, examples/plugins/hello-world*, docs/plugins/*
Replace TransportInterceptor with HTTPTransportMiddleware/HTTPRequest+HTTPResponse; updated plugin short-circuit/middleware API; dynamic plugin loading updated; new Go/Rust WASM plugin examples; docs updated (header lowercase normalization).
Config store & migrations
framework/configstore/*, framework/configstore/tables/*
Persist new MCP client fields (IsCodeModeClient, ToolsToAutoExecute), ToolManagerConfig in MCPConfig, migrations for added columns and client name normalization, RDB store changes to assemble ToolManagerConfig and cascade delete semantics.
Tests & test utilities
core/internal/testutil/*, tests/core-mcp/*, plugins/governance/*_test.go
Extensive MCP unit/integration tests, test harness updates to use schemas.NewBifrostContext, new Responses retry conditions and test helpers, governance store/tracker tests and many updated internal test call sites.
CI, scripts, manifests & examples
.github/workflows/*, Makefile, core/go.mod, framework/go.mod, examples/*
Branch-aware release scripts, new integration test targets and debug flows, dependency bumps and new module files for tests/examples, core/framework version bumps, docs (migration guide, changelogs) updates.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Client
  participant Bifrost
  participant MCPManager
  participant ToolsManager
  participant MCPClient as MCP
  participant Provider as LLM

  Client->>Bifrost: Chat/Responses request (with tools requested)
  Bifrost->>MCPManager: AddToolsToRequest(ctx, req)
  Bifrost->>Provider: Initial LLM call
  Provider-->>Bifrost: Response (may include tool calls)
  Bifrost->>MCPManager: CheckAndExecuteAgentFor*(ctx, req, resp, makeReq)
  MCPManager->>ToolsManager: ExecuteAgentFor*Request(...)
  loop agent iterations (<= maxAgentDepth)
    ToolsManager->>MCPClient: Execute auto-exec tool(s) (parallel)
    MCPClient-->>ToolsManager: Tool results
    ToolsManager->>Provider: makeReq(next turn)
    Provider-->>ToolsManager: Next response
  end
  ToolsManager-->>MCPManager: Aggregated response
  MCPManager-->>Bifrost: Agent-processed response
  Bifrost-->>Client: Final response
Loading
sequenceDiagram
  autonumber
  actor Client
  participant Bifrost
  participant Provider
  participant Tracer
  participant PostHooks

  Client->>Bifrost: Streaming request
  Bifrost->>Tracer: CreateTrace + Start LLM span
  Bifrost->>Provider: Start stream
  loop for each chunk
    Provider-->>Bifrost: Stream chunk
    Bifrost->>Tracer: AddStreamingChunk(traceID, chunk)
    Bifrost->>PostHooks: (optional per-chunk post-hook)
  end
  alt cancelled/timeout
    Bifrost->>PostHooks: HandleStreamCancellation/Timeout
  else normal end
    Bifrost->>PostHooks: Finalize streaming post-hooks (aggregate timings)
  end
  Bifrost->>Tracer: PopulateLLMResponseAttributes + End span
  Bifrost-->>Client: Finalized aggregated response
Loading

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~180 minutes

Possibly related PRs

  • v1.4.0 #1153 — large MCP integration and tracing work that appears to overlap or be a predecessor to these MCP/agent/tracing additions.
  • chore: governance tests added #1041 — governance package and governance test additions that intersect with the governance/in-memory changes here.
  • review fixes #1309 — streaming-accumulator, retry and related helpers overlap with accumulator and retry-condition changes in this PR.

Suggested reviewers

  • Pratham-Mishra04
  • roroghost17

Poem

"A rabbit types as midnight glows,
Wiring tools where MCP wind blows.
Spans hop, traces bloom, streams find pace,
Code-mode dances with a gentle grace.
Hops of tests — v1.4 takes its place!" 🐇✨

🚥 Pre-merge checks | ❌ 3
❌ Failed checks (2 warnings, 1 inconclusive)
Check name Status Explanation Resolution
Description check ⚠️ Warning PR description is minimal and does not follow the repository's required template. Expand the description to follow the template: include Summary, detailed Changes, Type of change, Affected areas, How to test (commands), Breaking changes and migration notes, Related issues, Security considerations, and a completed checklist.
Docstring Coverage ⚠️ Warning Docstring coverage is 73.08% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check ❓ Inconclusive Title is a release tag but too generic to describe the primary code changes (MCP gateway, code-mode, plugins). Use a concise descriptive title summarizing the main change, e.g., "MCP gateway and code-mode; new plugins architecture (v1.4.0)".

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch v1.4.0

📜 Recent review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 613dfe3 and 22c5159.

📒 Files selected for processing (1)
  • core/providers/gemini/gemini.go

Comment @coderabbitai help to get the list of available commands and usage tips.

@akshaydeo akshaydeo changed the title V1.4.0 v1.4.0 Dec 22, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 14

🧹 Nitpick comments (56)
ui/lib/utils/validation.ts (1)

374-376: LGTM! Sensible optimization with early exit.

The early return for empty or undefined origins is correct and improves performance by avoiding unnecessary filtering operations. Returning isValid: true for an empty input follows the principle of vacuous truth.

Optional refinement: Consider making the type signature explicit if origins can be undefined:

🔎 Type signature refinement
-export function validateOrigins(origins: string[]): { isValid: boolean; invalidOrigins: string[] } {
+export function validateOrigins(origins: string[] | undefined): { isValid: boolean; invalidOrigins: string[] } {

This would make the function signature align with the defensive !origins check and provide better type safety at call sites.

core/providers/openai/openai.go (1)

1933-2044: Consider refactoring transcription handling to use HandleProviderResponse

The TODO correctly points out that this path is the odd one out: it decodes with sonic.Unmarshal and then (optionally) decodes again into rawResponse. When you pick this up, it would be cleaner to mirror the other handlers and centralize JSON/RawResponse handling via providerUtils.HandleProviderResponse, keeping the existing HTML-response guard intact to avoid regressions.

ui/lib/types/schemas.ts (1)

601-647: MCP client name and tool constraints—check for backward compatibility and consider deduping validators

  • The stricter name rules (no hyphens/spaces and not starting with a digit) make sense for identifier-like usage, but they may prevent editing existing MCP clients whose names don’t meet these constraints. If such clients already exist, consider either:

    • migrating their names, or
    • handling legacy names more leniently on update in the UI/schema.
  • tools_to_auto_execute reuses the same wildcard/duplicate semantics as tools_to_execute, which is good. The two refine blocks are identical; if this evolves further, extracting a shared helper validator would reduce duplication.

transports/bifrost-http/handlers/mcp.go (1)

348-357: Simplify unreachable conditional branch.

The nested check if !hasWildcardInExecute at line 351 is always true at this point in the code. When hasWildcardInExecute is true, the function returns early at line 341. This makes the inner condition redundant.

🔎 Proposed simplification
 		// Validate each tool in ToolsToAutoExecute
 		for _, tool := range toolsToAutoExecute {
 			if tool == "*" {
-				// Wildcard is allowed if "*" is in ToolsToExecute
-				if !hasWildcardInExecute {
-					return fmt.Errorf("tool '%s' in tools_to_auto_execute is not in tools_to_execute", tool)
-				}
+				// At this point, hasWildcardInExecute is always false
+				// (if it were true, we would have returned at line 341)
+				return fmt.Errorf("tool '%s' in tools_to_auto_execute is not in tools_to_execute", tool)
 			} else if !allowedTools[tool] {
 				return fmt.Errorf("tool '%s' in tools_to_auto_execute is not in tools_to_execute", tool)
 			}
 		}
tests/core-mcp/client_config_test.go (2)

24-31: Consider extracting the repeated client lookup into a helper.

The pattern of finding bifrostInternal client by iterating through clients and matching on ID is repeated in nearly every test function. Extracting this to a helper would reduce duplication.

🔎 Example helper function
// In utils.go or setup.go
func findBifrostInternalClient(clients []schemas.MCPClient) *schemas.MCPClient {
	for i := range clients {
		if clients[i].Config.ID == "bifrostInternal" {
			return &clients[i]
		}
	}
	return nil
}

Usage in tests:

bifrostClient := findBifrostInternalClient(clients)
require.NotNil(t, bifrostClient, "bifrostInternal client should exist")

129-158: Test name suggests mixed mode validation but only checks code mode count.

TestMixedCodeModeAndNonCodeModeClients implies it should verify both code mode and non-code mode clients exist, but only asserts on codeModeCount. Consider also asserting that non-code mode clients exist, or clarify the test's intent.

🔎 Proposed enhancement
 	codeModeCount := 0
+	nonCodeModeCount := 0

 	for _, client := range clients {
 		if client.Config.IsCodeModeClient {
 			codeModeCount++
+		} else {
+			nonCodeModeCount++
 		}
 	}

 	// At minimum, we should have bifrostInternal as code mode
 	assert.GreaterOrEqual(t, codeModeCount, 1, "Should have at least one code mode client")
+	// For a true "mixed" test, verify both types exist
+	// Note: This may require adding another client with non-code mode
ui/app/workspace/logs/views/columns.tsx (1)

40-50: Consider defensive handling for array block mapping.

On line 46, if lastMessage.output is an array but some blocks don't have a text property, the map will produce undefined values that get joined. Consider filtering or providing a fallback.

🔎 Proposed fix
 			} else if (Array.isArray(lastMessage.output)) {
-				return lastMessage.output.map((block) => block.text).join("\n");
+				return lastMessage.output.map((block) => block.text ?? "").filter(Boolean).join("\n");
 			} else if (lastMessage.output.type && lastMessage.output.type === "computer_screenshot") {
tests/core-mcp/edge_cases_test.go (1)

81-104: Consider testing actual timeout behavior.

The test name mentions "Timeout" but the test uses a 100ms delay which should complete successfully. Consider adding a separate test that actually triggers a timeout (e.g., delay longer than the configured tool execution timeout) to verify timeout error handling.

core/schemas/mux.go (1)

325-338: Function-call output routing looks consistent; consider future-proofing for block outputs.

The changes cleanly:

  • Omit Role for function_call_output messages and gate rm.Role on role != "".
  • Ensure function_call_output content is kept solely in ResponsesToolMessage.Output instead of rm.Content.
  • Skip Content / ContentBlocks population for function_call_output while preserving the existing block-type mapping for all other messages.

This matches the Responses API model and avoids duplicated content paths. One limitation to be aware of: if ChatMessageRoleTool ever starts using ContentBlocks for structured outputs, those blocks are currently ignored and not copied into ResponsesToolMessage.Output. If you plan to support structured tool outputs later, it may be worth extending the outputContent handling to cover block outputs as well (or documenting that tool outputs are string-only for now).

Also applies to: 351-366, 368-425, 435-448

tests/core-mcp/utils.go (1)

15-30: Tighten test helper robustness around JSON errors and generic error-string matching.

  • createToolCall currently ignores json.Marshal errors. Even in tests, this can silently convert bad argument shapes into empty {} payloads. Consider returning an error from createToolCall (and using require.NoError at the call site) or panicking on marshal failure so tests fail loudly rather than masking invalid inputs.
  • In assertExecutionResult, the generic assert.NotContains(t, responseText, "Error:") is stricter than the comment (“but allow console.error output”) and may produce false negatives if the successful tool output legitimately includes Error: as part of normal text. The two explicit checks for "Execution runtime error" and "Execution typescript error" already cover the structured failure modes; you might drop or relax the generic "Error:" assertion to avoid accidental test brittleness.

Also applies to: 32-60

core/mcp/clientmanager.go (1)

21-36: MCP client lifecycle, locking, and connection setup look solid; fix small doc inconsistencies.

  • GetClients, AddClient, ReconnectClient, RemoveClient/removeClientUnsafe, and connectToMCPClient follow a good pattern: short lock-held sections around m.clientMap, heavy network/IO setup done without holding the lock, SSE CancelFunc is stored and later invoked for cleanup, and transports are closed on error or removal.
  • RegisterTool correctly ensures the local in-process server/client exist, validates tool names, avoids duplicate registrations, wraps the typed handler via CallToolRequest.GetArguments(), and keeps the internal ToolMap in sync.

Two minor polish items:

  • The EditClient comment block still says it “refreshes its tool list” and retrieves tools from the server, but the implementation intentionally only updates execution filters and flags (Name/IsCodeModeClient/Headers/ToolsToExecute/ToolsToAutoExecute) and relies on ReconnectClient for tool refresh. Updating the comment to match behavior will avoid confusion.
  • In connectToMCPClient, the initial existingClient.ConnectionInfo.Type = config.ConnectionType is immediately superseded by the subsequent m.clientMap[config.ID] = &schemas.MCPClientState{...} assignment. You can safely drop that assignment (or reuse the existing struct instead of always allocating a new one) to reduce redundancy.

Also applies to: 47-110, 134-162, 176-200, 291-431, 433-534, 540-583, 591-679

transports/bifrost-http/lib/config.go (2)

217-231: New MCP client defaults are reasonable; confirm migrations initialize them for existing installs.

Adding MCPAgentDepth: 10 and MCPToolExecutionTimeout: 30 to DefaultClientConfig provides sensible defaults for agent depth and per-tool timeout. For fresh setups they flow through loadDefaultClientConfig as expected.

For existing deployments where client config already lives in the DB, these fields will remain 0 unless:

  • The DB migration sets sane non-zero defaults, or
  • A config file provides explicit values (handled in loadClientConfigFromFile).

It’s worth double‑checking the migration layer to ensure upgraded instances don’t silently run with unlimited/zero depth or timeout when no file-based overrides exist.


445-507: Inline client-config merge logic is fine; consider deleting or reusing mergeClientConfig to avoid drift.

loadClientConfigFromFile now inlines a clear, field-by-field merge:

  • DB config wins overall, with file values only filling zero/empty ints/slices.
  • Boolean flags are only flipped false → true based on the file.
  • New fields MCPAgentDepth and MCPToolExecutionTimeout are included in the same pattern.

This is correct, but the old mergeClientConfig helper immediately below implements almost the same logic (without the new MCP fields) and is no longer invoked here. To keep maintenance overhead down and avoid future divergence, consider either:

  • Deleting mergeClientConfig if it’s truly unused, or
  • Refactoring loadClientConfigFromFile to delegate to it and extending the helper to cover the new fields.

Also applies to: 534-573

ui/lib/types/config.ts (1)

326-327: Add documentation for new MCP config fields.

The new mcp_agent_depth and mcp_tool_execution_timeout fields lack documentation. Add JSDoc comments to clarify their purpose and valid ranges.

🔎 Suggested documentation
 	enable_litellm_fallbacks: boolean;
+	/** Maximum depth for MCP agent mode tool execution (minimum: 1) */
 	mcp_agent_depth: number;
+	/** Timeout for individual MCP tool execution in seconds (minimum: 1) */
 	mcp_tool_execution_timeout: number;
 }
transports/config.schema.json (1)

1766-1782: Consider adding maximum constraints for MCP config fields.

The tool_execution_timeout and max_agent_depth fields only have minimum constraints but no maximum values. Consider adding reasonable upper bounds to prevent misconfiguration that could cause resource exhaustion.

For example:

  • tool_execution_timeout: Maximum of 300 seconds (5 minutes) seems reasonable for a tool execution
  • max_agent_depth: Maximum of 50 to prevent infinite recursion scenarios
🔎 Proposed schema enhancement
       "tool_execution_timeout": {
         "type": "integer",
         "description": "Tool execution timeout in seconds",
         "minimum": 1,
+        "maximum": 300,
         "default": 30
       },
       "max_agent_depth": {
         "type": "integer",
         "description": "Max agent depth",
         "minimum": 1,
+        "maximum": 50,
         "default": 10
       }
tests/core-mcp/agent_mode_test.go (1)

73-77: Assertion doesn't verify the configured MaxAgentDepth value.

The test configures MaxAgentDepth = 2 but only asserts that clients is not nil. This doesn't verify that the max depth configuration was actually applied. Consider adding an assertion that validates the configured value is accessible and correct.

🔎 Suggested improvement
 	// Verify max depth is configured
 	clients, err := b.GetMCPClients()
 	require.NoError(t, err)
 	assert.NotNil(t, clients, "Should have clients")
+
+	// Verify the actual max depth configuration
+	toolManagerConfig := b.GetMCPToolManagerConfig()
+	assert.Equal(t, 2, toolManagerConfig.MaxAgentDepth, "MaxAgentDepth should be 2")
+	assert.Equal(t, 30*time.Second, toolManagerConfig.ToolExecutionTimeout, "ToolExecutionTimeout should be 30s")

If GetMCPToolManagerConfig() or equivalent is not exposed, consider whether the test should verify behavior (e.g., agent stops at depth 2) rather than just configuration presence.

framework/configstore/migrations_test.go (2)

34-41: Log capture may interfere with parallel tests.

captureLogOutput modifies the global log output, which can cause race conditions or log interleaving if tests run in parallel with t.Parallel(). While not currently using parallel execution, this could become an issue.

🔎 Consider using a custom logger
-// captureLogOutput captures log output during a function execution
-func captureLogOutput(fn func()) string {
-	var buf bytes.Buffer
-	log.SetOutput(&buf)
-	defer log.SetOutput(os.Stderr)
-
-	fn()
-	return buf.String()
-}
+// captureLogOutput captures log output during a function execution
+// Note: This modifies global log state and is not safe for parallel tests
+func captureLogOutput(fn func()) string {
+	var buf bytes.Buffer
+	log.SetOutput(&buf)
+	defer log.SetOutput(os.Stderr)
+
+	fn()
+	return buf.String()
+}

Alternatively, consider passing a *log.Logger instance to the functions under test to enable isolated testing.


248-259: Consider calling the actual normalizeMCPClientName function.

This local helper duplicates the normalization logic from migrations.go. If the production implementation changes, this test helper could become stale and tests might pass while actual behavior differs.

🔎 Suggested improvement
-	// Helper function to normalize (same logic as in migrations.go)
-	normalizeName := func(name string) string {
-		normalized := strings.ReplaceAll(name, "-", "_")
-		normalized = strings.ReplaceAll(normalized, " ", "_")
-		normalized = strings.TrimLeftFunc(normalized, func(r rune) bool {
-			return r >= '0' && r <= '9'
-		})
-		if normalized == "" {
-			normalized = "mcp_client"
-		}
-		return normalized
-	}
+	// Use the actual normalizeMCPClientName function from migrations.go
+	normalizeName := normalizeMCPClientName

This ensures tests exercise the actual production code path.

ui/app/workspace/config/views/mcpView.tsx (2)

11-26: Duplicate default config definition across views.

This defaultConfig object is duplicated across multiple view files (clientSettingsView.tsx, governanceView.tsx, observabilityView.tsx, performanceTuningView.tsx, securityView.tsx). Consider extracting this to a shared constant to ensure consistency and reduce maintenance burden when adding new fields.


76-100: Validation duplicates logic already enforced by input handlers.

The handleSave validation re-parses and validates localValues, but localConfig already contains validated positive integers from the change handlers. Consider simplifying by directly using localConfig.mcp_agent_depth and localConfig.mcp_tool_execution_timeout in the guard, or removing the duplicate parse:

🔎 Suggested simplification
 const handleSave = useCallback(async () => {
   try {
-    const agentDepth = Number.parseInt(localValues.mcp_agent_depth);
-    const toolTimeout = Number.parseInt(localValues.mcp_tool_execution_timeout);
-
-    if (isNaN(agentDepth) || agentDepth <= 0) {
+    if (!localConfig.mcp_agent_depth || localConfig.mcp_agent_depth <= 0) {
       toast.error("Max agent depth must be a positive number.");
       return;
     }
 
-    if (isNaN(toolTimeout) || toolTimeout <= 0) {
+    if (!localConfig.mcp_tool_execution_timeout || localConfig.mcp_tool_execution_timeout <= 0) {
       toast.error("Tool execution timeout must be a positive number.");
       return;
     }
framework/configstore/clientconfig.go (1)

103-113: Inconsistent hashing approach for integer fields.

The new MCP integer fields use strconv.Itoa with string concatenation, while existing integer fields (InitialPoolSize, LogRetentionDays, MaxRequestBodySizeMB at lines 116-132) use sonic.Marshal. Both are deterministic, but the inconsistency could cause confusion for future maintainers. Consider aligning the approach.

Additionally, the > 0 conditional means negative values hash identically to zero. If negative values should be rejected, consider validation elsewhere; if they're valid, they should produce distinct hashes.

🔎 Option to align with existing pattern
-	if c.MCPAgentDepth > 0 {
-		hash.Write([]byte("mcpAgentDepth:" + strconv.Itoa(c.MCPAgentDepth)))
-	} else {
-		hash.Write([]byte("mcpAgentDepth:0"))
-	}
-
-	if c.MCPToolExecutionTimeout > 0 {
-		hash.Write([]byte("mcpToolExecutionTimeout:" + strconv.Itoa(c.MCPToolExecutionTimeout)))
-	} else {
-		hash.Write([]byte("mcpToolExecutionTimeout:0"))
-	}
+	data, err = sonic.Marshal(c.MCPAgentDepth)
+	if err != nil {
+		return "", err
+	}
+	hash.Write(data)
+
+	data, err = sonic.Marshal(c.MCPToolExecutionTimeout)
+	if err != nil {
+		return "", err
+	}
+	hash.Write(data)
framework/configstore/utils.go (1)

186-202: Redacted value restoration logic is correct but pattern matching could have edge cases.

The restoration of redacted header values from existingHeaders correctly handles the UI flow where sensitive values are masked. However, the check strings.Contains(value, "****") could match legitimate header values containing four asterisks. Consider a more specific redaction pattern if one is consistently used (e.g., exact match or a distinctive prefix).

tests/core-mcp/mcp_connection_test.go (1)

182-217: Consider improving placeholder test documentation.

The skipped test is well-documented with a placeholder comment block showing intended usage. Consider adding a TODO or issue reference to track when this should be enabled.

ui/components/ui/tristateCheckbox.tsx (1)

90-104: Unreachable style condition detected.

Line 97 condition !isChecked && !isIndeterminate && state !== "none" can never be true. Given the three states:

  • state === "all"isChecked = true
  • state === "some"isIndeterminate = true
  • state === "none" → excluded by state !== "none"

This condition is dead code and can be removed.

🔎 Proposed fix
 				state === "none" && "bg-destructive/5 border-destructive/30 text-destructive dark:bg-destructive/20 dark:border-destructive/40",
-				!isChecked && !isIndeterminate && state !== "none" && "bg-background",
 				"focus-visible:border-ring focus-visible:ring-ring/50 focus-visible:ring-[3px]",
ui/app/workspace/mcp-gateway/views/mcpClientsTable.tsx (3)

54-56: Missing dependency in useEffect.

The useEffect calls loadClients on mount but doesn't include it in the dependency array. This could cause stale closures if loadClients changes.

🔎 Proposed fix
 	useEffect(() => {
 		loadClients();
+		// eslint-disable-next-line react-hooks/exhaustive-deps
 	}, []);

Or better, call refetch directly:

 	useEffect(() => {
-		loadClients();
+		refetch();
 	}, []);

163-175: Use strict equality for state comparison.

Lines 165, 171, 195, 204, 190 use loose equality (==) for comparing c.state. Use strict equality (===) for consistency and to avoid type coercion issues.

🔎 Proposed fix
-							c.state == "connected"
+							c.state === "connected"
-								? c.config.tools_to_execute?.includes("*")
-									? c.tools?.length
-									: (c.config.tools_to_execute?.length ?? 0)
-								: 0;
+								? c.config.tools_to_execute?.includes("*")
+									? c.tools?.length
+									: (c.config.tools_to_execute?.length ?? 0)
+								: 0;
 							const autoExecuteToolsCount =
-								c.state == "connected"
+								c.state === "connected"

184-192: Use strict equality for code mode badge state check.

Line 187 and 190 use loose equality. Apply consistent strict equality.

🔎 Proposed fix
-										c.state == "connected" ? MCP_STATUS_COLORS[c.config.is_code_mode_client ? "connected" : "disconnected"] : ""
+										c.state === "connected" ? MCP_STATUS_COLORS[c.config.is_code_mode_client ? "connected" : "disconnected"] : ""
 									}
 								>
-									{c.state == "connected" ? <>{c.config.is_code_mode_client ? "Enabled" : "Disabled"}</> : "-"}
+									{c.state === "connected" ? <>{c.config.is_code_mode_client ? "Enabled" : "Disabled"}</> : "-"}
core/mcp/codemode_listfiles.go (1)

62-73: Consider sorting client names for deterministic output.

Map iteration order in Go is not deterministic. If consistent ordering is desired for the file listing (e.g., for testing or user experience), consider sorting the client names.

🔎 Proposed fix
+	// Collect and sort client names for deterministic output
+	clientNames := make([]string, 0, len(availableToolsPerClient))
+	for clientName := range availableToolsPerClient {
+		clientNames = append(clientNames, clientName)
+	}
+	sort.Strings(clientNames)
+
 	// Build tree structure
 	treeLines := []string{"servers/"}
 	codeModeServerCount := 0
-	for clientName := range availableToolsPerClient {
+	for _, clientName := range clientNames {

Add "sort" to imports.

framework/configstore/migrations.go (3)

1280-1315: Consider backfilling default values for existing rows.

Unlike other migrations in this file, this one doesn't backfill default values for existing rows. Based on TableClientConfig struct (lines 23-24 in tables/clientconfig.go), defaults are 10 for mcp_agent_depth and 30 for mcp_tool_execution_timeout. Existing rows may have NULL values.

🔎 Proposed fix
 		if !migrator.HasColumn(&tables.TableClientConfig{}, "mcp_tool_execution_timeout") {
 			if err := migrator.AddColumn(&tables.TableClientConfig{}, "mcp_tool_execution_timeout"); err != nil {
 				return err
 			}
 		}
+		// Backfill default values for existing rows
+		if err := tx.Exec("UPDATE config_client SET mcp_agent_depth = 10 WHERE mcp_agent_depth IS NULL OR mcp_agent_depth = 0").Error; err != nil {
+			return fmt.Errorf("failed to backfill mcp_agent_depth: %w", err)
+		}
+		if err := tx.Exec("UPDATE config_client SET mcp_tool_execution_timeout = 30 WHERE mcp_tool_execution_timeout IS NULL OR mcp_tool_execution_timeout = 0").Error; err != nil {
+			return fmt.Errorf("failed to backfill mcp_tool_execution_timeout: %w", err)
+		}
 		return nil

1360-1404: Consider using structured logging instead of log.Printf.

The migration uses log.Printf for logging name transformations while other parts of the codebase may use a structured logger. Consider using consistent logging patterns.


1437-1441: Document the one-way nature of the migration in the ID or description.

The rollback explicitly states it cannot restore original names. Consider documenting this limitation more prominently.

🔎 Suggested documentation
 	m := migrator.New(db, migrator.DefaultOptions, []*migrator.Migration{{
-		ID: "normalize_mcp_client_names",
+		ID: "normalize_mcp_client_names_one_way",
 		Migrate: func(tx *gorm.DB) error {

Or add a more prominent comment at the migration function level.

ui/app/workspace/mcp-gateway/views/mcpClientSheet.tsx (1)

225-236: UI label says "Code Mode Client" but component manages "MCP server configuration".

There's a terminology inconsistency. The SheetDescription at line 193 says "MCP server configuration" and the section header at line 209 says "Basic Information", but the label here says "Code Mode Client". Consider using "Code Mode Server" for consistency with the rest of the UI changes in this PR (as done in mcpClientForm.tsx at line 209).

🔎 Suggested fix for consistency
-							<FormLabel>Code Mode Client</FormLabel>
+							<FormLabel>Code Mode Server</FormLabel>
ui/app/workspace/mcp-gateway/views/mcpClientForm.tsx (1)

101-109: Validation rules have redundancy - pattern already covers some checks.

The regex pattern /^[a-zA-Z0-9_]+$/ at line 104 already ensures no hyphens or spaces are allowed. The explicit checks at lines 105-106 are redundant but harmless - they provide more specific error messages. However, the pattern check will fail first, so users will see the generic pattern error rather than the specific hyphen/space messages.

Consider reordering validations so specific checks come before the pattern, or remove the redundant checks:

🔎 Option 1: Reorder for better error messages
 const validator = new Validator([
   // Name validation
   Validator.required(form.name?.trim(), "Server name is required"),
+  Validator.custom(!(form.name || "").includes("-"), "Server name cannot contain hyphens"),
+  Validator.custom(!(form.name || "").includes(" "), "Server name cannot contain spaces"),
+  Validator.custom((form.name || "").length === 0 || !/^[0-9]/.test(form.name || ""), "Server name cannot start with a number"),
   Validator.pattern(form.name || "", /^[a-zA-Z0-9_]+$/, "Server name can only contain letters, numbers, and underscores"),
-  Validator.custom(!(form.name || "").includes("-"), "Server name cannot contain hyphens"),
-  Validator.custom(!(form.name || "").includes(" "), "Server name cannot contain spaces"),
-  Validator.custom((form.name || "").length === 0 || !/^[0-9]/.test(form.name || ""), "Server name cannot start with a number"),
   Validator.minLength(form.name || "", 3, "Server name must be at least 3 characters"),
   Validator.maxLength(form.name || "", 50, "Server name cannot exceed 50 characters"),
core/mcp/utils.go (1)

438-466: Regex-based tool call extraction may have false positives.

The regex (?:await\s+)?([a-zA-Z_$][a-zA-Z0-9_$]*)\s*\.\s*([a-zA-Z_$][a-zA-Z0-9_$]*)\s*\( will match any method call pattern, including standard JavaScript calls like console.log(, JSON.stringify(, Math.random(. While line 471-474 in isToolCallAllowedForCodeMode handles this by checking against known client names, consider documenting this behavior.

🔎 Suggested documentation improvement
 // extractToolCallsFromCode extracts tool calls from TypeScript code
 // Tool calls are in the format: serverName.toolName(...) or await serverName.toolName(...)
+// Note: This will match all method call patterns including built-in objects (console.log, JSON.stringify, etc.)
+// The caller (isToolCallAllowedForCodeMode) is responsible for filtering out non-MCP tool calls
+// by checking if the server name exists in the list of known MCP clients.
 func extractToolCallsFromCode(code string) ([]toolCallInfo, error) {
core/mcp/codemode_readfile.go (1)

258-264: Consider using slices.Sort instead of manual bubble sort.

Go 1.21+ provides slices.Sort which is more efficient and idiomatic. The current bubble sort is O(n²) while slices.Sort is O(n log n).

🔎 Suggested improvement
+import "slices"

-			// Simple alphabetical sort
-			for i := 0; i < len(propNames)-1; i++ {
-				for j := i + 1; j < len(propNames); j++ {
-					if propNames[i] > propNames[j] {
-						propNames[i], propNames[j] = propNames[j], propNames[i]
-					}
-				}
-			}
+			// Sort properties for consistent output
+			slices.Sort(propNames)
tests/core-mcp/fixtures.go (1)

121-277: Consider using a raw string literal or external file for readability.

The LongCodeExecution fixture is functional but hard to read with concatenated strings. For test maintainability, consider using backtick raw strings or loading from an external .ts file.

That said, this is a minor readability concern and doesn't affect functionality.

tests/core-mcp/tool_execution_test.go (2)

107-126: Clarify the expected behavior in the assertion.

The test comment says "should have no code mode clients" but the assertion checks for "No servers" message. The conditional if bifrostErr == nil && result != nil on line 122 is redundant since requireNoBifrostError already ensures bifrostErr is nil.

🔎 Suggested simplification
 	// listToolFiles should still work but return empty/no servers message
-	if bifrostErr == nil && result != nil {
-		responseText := *result.Content.ContentStr
-		assert.Contains(t, responseText, "No servers", "Should indicate no servers")
-	}
+	responseText := *result.Content.ContentStr
+	assert.Contains(t, responseText, "No servers", "Should indicate no code-mode servers available")

152-159: Same redundant conditional pattern.

Similar to the previous test, the conditional check is unnecessary after requireNoBifrostError. Also, the assertion logic with len(responseText) == 0 || ... may be overly permissive.

tests/core-mcp/codemode_auto_execute_test.go (1)

19-39: Consider extracting config preservation to a helper function.

The pattern of fetching current config and preserving fields (lines 19-38) is repeated across multiple tests. A helper like getAndPreserveConfig(b, clientID string) could reduce duplication.

This is a minor maintainability suggestion and not blocking.

transports/bifrost-http/handlers/mcp_server.go (1)

206-211: Potential issue: Blank identifier in range.

Line 209 uses for toolName, _ := range toolMap which can be simplified to for toolName := range toolMap.

🔎 Suggested fix
 	// Clear existing tools
 	toolMap := server.ListTools()
-	for toolName, _ := range toolMap {
+	for toolName := range toolMap {
 		server.DeleteTools(toolName)
 	}
tests/core-mcp/responses_test.go (2)

14-14: Consider reducing the test timeout for faster CI feedback.

A 10-minute timeout (TestTimeout = 10 * time.Minute in setup.go) is quite long for unit tests. If tests are expected to complete quickly, a shorter timeout (e.g., 30 seconds to 1 minute) would provide faster feedback on failures.


237-265: Timeout test may not reliably trigger the timeout scenario.

The test configures a 100ms tool execution timeout and sends a request asking the LLM to "Call slow_tool with delay 500ms". However, this relies on the LLM interpreting the message correctly and actually calling slow_tool. The test's assertion only checks if a timeout error occurred but doesn't guarantee the slow_tool was actually invoked.

Consider directly calling ExecuteMCPTool with the slow_tool to ensure deterministic timeout behavior:

🔎 Suggested approach
// Test direct tool execution timeout
slowCall := createToolCall("slow_tool", map[string]interface{}{
    "delay_ms": float64(500), // 500ms delay, but timeout is 100ms
})
result, bifrostErr := b.ExecuteMCPTool(ctx, slowCall)
require.NotNil(t, bifrostErr, "Should timeout when tool execution exceeds timeout")
assert.Contains(t, bifrost.GetErrorMessage(bifrostErr), "timeout")
core/mcp/codemode_executecode.go (3)

69-124: Consider extracting the lengthy tool description to a constant.

The tool description is quite lengthy (40+ lines). While comprehensive documentation is valuable, consider extracting this to a package-level constant for better readability and maintainability of the createExecuteToolCodeTool function.


871-900: Regex compiled on every call - consider hoisting to package level.

The importExportRegex is compiled on every call to stripImportsAndExports. Consider hoisting it to a package-level variable for better performance:

🔎 Suggested fix
+// Package-level compiled regex for import/export detection
+var importExportRegex = regexp.MustCompile(`^\s*(import|export)\b`)
+
 func stripImportsAndExports(code string) (string, []int) {
     lines := strings.Split(code, "\n")
     keptLines := []string{}
     strippedLineNumbers := []int{}
 
-    importExportRegex := regexp.MustCompile(`^\s*(import|export)\b`)
-
     for i, line := range lines {

959-961: Regex compiled on every call in generateErrorHints.

Same issue as above - the regex (\w+)\s+is not defined is compiled on every error. Consider hoisting to package level.

🔎 Suggested fix
+var undefinedVarRegex = regexp.MustCompile(`(\w+)\s+is not defined`)
+var notFunctionRegex = regexp.MustCompile(`(\w+(?:\.\w+)?)\s+is not a function`)
+
 func generateErrorHints(errorMessage string, serverKeys []string) []string {
     hints := []string{}
 
     if strings.Contains(errorMessage, "is not defined") {
-        re := regexp.MustCompile(`(\w+)\s+is not defined`)
-        if match := re.FindStringSubmatch(errorMessage); len(match) > 1 {
+        if match := undefinedVarRegex.FindStringSubmatch(errorMessage); len(match) > 1 {
tests/core-mcp/setup.go (2)

23-31: Missing validation for OPENAI_API_KEY environment variable.

If OPENAI_API_KEY is not set, tests will silently use an empty string, leading to confusing failures. Consider adding validation or skipping tests when the key is unavailable:

🔎 Suggested improvement
func (a *TestAccount) GetKeysForProvider(ctx *context.Context, providerKey schemas.ModelProvider) ([]schemas.Key, error) {
    apiKey := os.Getenv("OPENAI_API_KEY")
    if apiKey == "" {
        return nil, fmt.Errorf("OPENAI_API_KEY environment variable is not set")
    }
    return []schemas.Key{
        {
            Value:  apiKey,
            Models: []string{},
            Weight: 1.0,
        },
    }, nil
}

142-360: Consider using table-driven tool registration to reduce repetition.

The tool registration code is highly repetitive. A table-driven approach would reduce duplication and make adding new test tools easier:

🔎 Example refactor
type testToolDef struct {
    name        string
    description string
    handler     func(args any) (string, error)
    params      *schemas.ToolFunctionParameters
}

func registerTestTools(b *bifrost.Bifrost) error {
    tools := []testToolDef{
        {
            name:        "echo",
            description: "Echoes back the input message",
            handler: func(args any) (string, error) {
                argsMap := args.(map[string]interface{})
                return argsMap["message"].(string), nil
            },
            params: &schemas.ToolFunctionParameters{
                Type: "object",
                Properties: &map[string]interface{}{
                    "message": map[string]interface{}{"type": "string", "description": "The message to echo"},
                },
                Required: []string{"message"},
            },
        },
        // ... other tools
    }
    
    for _, tool := range tools {
        schema := schemas.ChatTool{
            Type: schemas.ChatToolTypeFunction,
            Function: &schemas.ChatToolFunction{
                Name:        tool.name,
                Description: schemas.Ptr(tool.description),
                Parameters:  tool.params,
            },
        }
        if err := b.RegisterMCPTool(tool.name, tool.description, tool.handler, schema); err != nil {
            return fmt.Errorf("failed to register %s tool: %w", tool.name, err)
        }
    }
    return nil
}
core/mcp/agent.go (1)

196-201: Consider logging the actual unmarshal error for better debugging.

When sonic.Unmarshal fails, the error is logged but the original toolCall.Function.Arguments value is not included, which would help diagnose malformed JSON payloads in production.

🔎 Suggested improvement
 				if err := sonic.Unmarshal([]byte(toolCall.Function.Arguments), &arguments); err != nil {
-					logger.Debug(fmt.Sprintf("%s Failed to parse tool arguments: %v", CodeModeLogPrefix, err))
+					logger.Debug(fmt.Sprintf("%s Failed to parse tool arguments: %v (raw: %s)", CodeModeLogPrefix, err, toolCall.Function.Arguments))
 					nonAutoExecutableTools = append(nonAutoExecutableTools, toolCall)
 					continue
 				}
core/bifrost.go (2)

1887-1899: New public API UpdateToolManagerConfig exposes runtime configuration.

This allows hot-reloading of maxAgentDepth and toolExecutionTimeout. The implementation correctly delegates to the MCP manager. Consider adding input validation for edge cases (e.g., negative values).

🔎 Suggested validation
 func (bifrost *Bifrost) UpdateToolManagerConfig(maxAgentDepth int, toolExecutionTimeoutInSeconds int) error {
 	if bifrost.mcpManager == nil {
 		return fmt.Errorf("MCP is not configured in this Bifrost instance")
 	}
+	if maxAgentDepth < 0 {
+		return fmt.Errorf("maxAgentDepth must be non-negative")
+	}
+	if toolExecutionTimeoutInSeconds < 0 {
+		return fmt.Errorf("toolExecutionTimeoutInSeconds must be non-negative")
+	}
 
 	bifrost.mcpManager.UpdateToolManagerConfig(&schemas.MCPToolManagerConfig{
 		MaxAgentDepth:        maxAgentDepth,
 		ToolExecutionTimeout: time.Duration(toolExecutionTimeoutInSeconds) * time.Second,
 	})
 	return nil
 }

593-598: Refactor context passing to avoid anti-pattern: pass by value instead of by pointer.

The code passes &ctx to CheckAndExecuteAgentForChatRequest, allowing the agent function to mutate the caller's context variable via *ctx = context.WithValue(...) (confirmed at agent.go lines 159, 339). Passing context.Context by pointer is an anti-pattern in Go that offers no benefits and introduces the risk of unintended side effects by allowing modification of the caller's variable. The Go standard library packages all accept context.Context by value.

Refactor CheckAndExecuteAgentForChatRequest to accept and return context.Context by value instead of by pointer.

core/schemas/mcp.go (1)

4-10: Consider aliasing or avoiding direct dependency on mcp-go in schemas package.

The schemas package now directly imports github.com/mark3labs/mcp-go/client and github.com/mark3labs/mcp-go/server. This couples the schema definitions to a specific external library version. If the mcp-go library changes its API, the schemas package must also change. This may be acceptable given the tight MCP integration, but consider whether these types could be defined as interfaces in schemas to reduce coupling.

core/mcp/agent_adaptors.go (2)

82-90: Unchecked type assertions could cause panics.

Multiple methods perform unchecked type assertions (e.g., response.(*schemas.BifrostChatResponse) at lines 83, 88, 93, 126, 136). While the adapter pattern guarantees these types at compile-time design level, a programming error could cause a panic at runtime. Consider using the two-value form for defensive programming, especially in the makeLLMCall methods which involve external calls.

🔎 Example defensive assertion
func (c *chatAPIAdapter) makeLLMCall(ctx context.Context, request interface{}) (interface{}, *schemas.BifrostError) {
	chatRequest, ok := request.(*schemas.BifrostChatRequest)
	if !ok {
		return nil, &schemas.BifrostError{
			IsBifrostError: false,
			Error: &schemas.ErrorField{
				Message: fmt.Sprintf("expected *BifrostChatRequest, got %T", request),
			},
		}
	}
	return c.makeReq(ctx, chatRequest)
}

Also applies to: 125-128, 136-137


159-270: Significant code duplication between Chat and Responses response builders.

createChatResponseWithExecutedToolsAndNonAutoExecutableCalls and createResponsesResponseWithExecutedToolsAndNonAutoExecutableCalls share nearly identical logic for building toolCallIDToName map, extracting tool results, and formatting contentText. Consider extracting the common logic into a shared helper function.

Also applies to: 373-529

core/mcp/toolmanager.go (3)

52-68: Consider validating clientManager parameter.

The constructor validates config for nil but doesn't validate clientManager. If clientManager is nil, the manager will be created successfully but will panic on first use when methods like GetAvailableTools call m.clientManager.GetToolPerClient(ctx).

🔎 Proposed validation
 func NewToolsManager(config *schemas.MCPToolManagerConfig, clientManager ClientManager, fetchNewRequestIDFunc func(ctx context.Context) string) *ToolsManager {
+	if clientManager == nil {
+		logger.Error(fmt.Sprintf("%s clientManager cannot be nil", MCPLogPrefix))
+		return nil
+	}
 	if config == nil {
 		config = &schemas.MCPToolManagerConfig{

203-216: Redundant length check.

The check if len(availableTools) > 0 on line 216 is redundant since the function already returns early on lines 203-205 when len(availableTools) == 0.

🔎 Proposed simplification
 	if len(availableTools) == 0 {
 		return req
 	}

 	// Get integration user agent for duplicate checking
 	var integrationUserAgentStr string
 	integrationUserAgent := ctx.Value(schemas.BifrostContextKey("integration-user-agent"))
 	if integrationUserAgent != nil {
 		if str, ok := integrationUserAgent.(string); ok {
 			integrationUserAgentStr = str
 		}
 	}

-	if len(availableTools) > 0 {
-		switch req.RequestType {
+	switch req.RequestType {

And remove the corresponding closing brace before return req.


460-474: Log message may be misleading when partial updates occur.

The log message on line 473 logs config.ToolExecutionTimeout and config.MaxAgentDepth, but if either value is <= 0, it won't be updated. This could log misleading values (e.g., logging 0 for timeout when the actual stored value remains unchanged).

🔎 Proposed fix to log actual stored values
 func (m *ToolsManager) UpdateConfig(config *schemas.MCPToolManagerConfig) {
 	if config == nil {
 		return
 	}
 	if config.ToolExecutionTimeout > 0 {
 		m.toolExecutionTimeout.Store(config.ToolExecutionTimeout)
 	}
 	if config.MaxAgentDepth > 0 {
 		m.maxAgentDepth.Store(int32(config.MaxAgentDepth))
 	}

-	logger.Info(fmt.Sprintf("%s tool manager configuration updated with tool execution timeout: %v and max agent depth: %d", MCPLogPrefix, config.ToolExecutionTimeout, config.MaxAgentDepth))
+	logger.Info(fmt.Sprintf("%s tool manager configuration updated with tool execution timeout: %v and max agent depth: %d", MCPLogPrefix, m.toolExecutionTimeout.Load(), m.maxAgentDepth.Load()))
 }

Pratham-Mishra04 and others added 5 commits December 22, 2025 15:42
## Summary

Briefly explain the purpose of this PR and the problem it solves.

## Changes

- What was changed and why
- Any notable design decisions or trade-offs

## Type of change

- [ ] Bug fix
- [ ] Feature
- [ ] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [ ] Core (Go)
- [ ] Transports (HTTP)
- [ ] Providers/Integrations
- [ ] Plugins
- [ ] UI (Next.js)
- [ ] Docs

## How to test

Describe the steps to validate this change. Include commands and expected outcomes.

```sh
# Core/Transports
go version
go test ./...

# UI
cd ui
pnpm i || npm i
pnpm test || npm test
pnpm build || npm run build
```

If adding new configs or environment variables, document them here.

## Screenshots/Recordings

If UI changes, add before/after screenshots or short clips.

## Breaking changes

- [ ] Yes
- [ ] No

If yes, describe impact and migration instructions.

## Related issues

Link related issues and discussions. Example: Closes #123

## Security considerations

Note any security implications (auth, secrets, PII, sandboxing, etc.).

## Checklist

- [ ] I read `docs/contributing/README.md` and followed the guidelines
- [ ] I added/updated tests where appropriate
- [ ] I updated documentation where needed
- [ ] I verified builds succeed (Go and UI)
- [ ] I verified the CI pipeline passes locally if applicable
## Summary

Briefly explain the purpose of this PR and the problem it solves.

## Changes

- What was changed and why
- Any notable design decisions or trade-offs

## Type of change

- [ ] Bug fix
- [ ] Feature
- [ ] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [ ] Core (Go)
- [ ] Transports (HTTP)
- [ ] Providers/Integrations
- [ ] Plugins
- [ ] UI (Next.js)
- [ ] Docs

## How to test

Describe the steps to validate this change. Include commands and expected outcomes.

```sh
# Core/Transports
go version
go test ./...

# UI
cd ui
pnpm i || npm i
pnpm test || npm test
pnpm build || npm run build
```

If adding new configs or environment variables, document them here.

## Screenshots/Recordings

If UI changes, add before/after screenshots or short clips.

## Breaking changes

- [ ] Yes
- [ ] No

If yes, describe impact and migration instructions.

## Related issues

Link related issues and discussions. Example: Closes #123

## Security considerations

Note any security implications (auth, secrets, PII, sandboxing, etc.).

## Checklist

- [ ] I read `docs/contributing/README.md` and followed the guidelines
- [ ] I added/updated tests where appropriate
- [ ] I updated documentation where needed
- [ ] I verified builds succeed (Go and UI)
- [ ] I verified the CI pipeline passes locally if applicable
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

🧹 Nitpick comments (12)
.editorconfig (1)

8-10: Appropriate addition of Go-specific EditorConfig settings.

The new Go section aligns with the PR's introduction of Go code for MCP integration. Go standard practice is to use tabs for indentation (enforced by gofmt), so indent_style = tab is correct.

Consider removing indent_size = 4 from this section, as it is non-standard with tab indentation and has no practical effect in Go tooling (which ignores EditorConfig). Alternatively, if you intend this to indicate visual tab display width, you may want to document this intent.

🔎 Optional: Remove or clarify `indent_size`
 [*.go]
 indent_style = tab
-indent_size = 4

This keeps EditorConfig focused on what it can effectively control for Go files. Go tooling is indifferent to EditorConfig and will enforce its own standards.

plugins/governance/config_update_sync_test.go (1)

14-235: Consider using WaitForCondition consistently for async sync verification.

This test uses time.Sleep(500 * time.Millisecond) at lines 95, 148, and 201 for async sync, while TestTeamBudgetUpdateSyncToMemory demonstrates a more robust pattern using WaitForCondition. The fixed sleeps may cause flaky tests under load.

That said, the test logic is correct and covers the important scenario of usage reset when the new limit is lower than current usage.

plugins/governance/test_utils.go (1)

143-152: rand.Seed is deprecated in Go 1.20+.

Consider using the newer approach with a local random source for better concurrency safety:

🔎 Suggested fix
+var rng = rand.New(rand.NewSource(time.Now().UnixNano()))
+
 // generateRandomID generates a random ID for test resources
 func generateRandomID() string {
-	rand.Seed(time.Now().UnixNano())
 	const letters = "abcdefghijklmnopqrstuvwxyz0123456789"
 	b := make([]byte, 8)
 	for i := range b {
-		b[i] = letters[rand.Intn(len(letters))]
+		b[i] = letters[rng.Intn(len(letters))]
 	}
 	return string(b)
 }
plugins/governance/team_budget_test.go (1)

78-95: Consider extracting the long prompt string to a shared constant.

This longPrompt string is duplicated across multiple test files (team_budget_test.go, edge_cases_test.go, provider_budget_test.go, advanced_scenarios_test.go). Extracting it to a constant in test_utils.go would improve maintainability.

// In test_utils.go
const LongTestPrompt = "Please provide a comprehensive and detailed response to the following question. " +
    "I need extensive information covering all aspects of the topic. " +
    // ... rest of the string
plugins/governance/store_test.go (1)

200-207: Direct sync.Map manipulation couples test to implementation details.

Directly accessing store.budgets (the internal sync.Map) to modify budget state ties the test to the store's internal implementation. If the internal data structure changes, this test will break.

Consider exposing a test helper method or using the public UpdateBudgetUsageInMemory method if available.

plugins/governance/tracker_test.go (1)

80-81: Remove no-op assertion.

assert.True(t, true) always passes and provides no value. The comment "Just verify it doesn't crash" is valid intent, but if execution reaches this point without panicking, the test already succeeded.

🔎 Suggested fix
 	time.Sleep(100 * time.Millisecond)
-	// Just verify it doesn't crash
-	assert.True(t, true)
+	// Test passes if execution reaches here without panicking
 }
plugins/governance/customer_budget_test.go (1)

66-160: Consider extracting shared test logic to reduce duplication.

Both TestCustomerBudgetExceededWithMultipleVKs and TestCustomerBudgetExceededWithMultipleTeams share nearly identical request/response handling logic (lines 66-160 and 241-334). Consider extracting a helper function like runBudgetExhaustionLoop(t, vkValues []string, customerBudget float64) to reduce duplication.

This is optional since the current structure provides good readability and each test is self-contained.

Also applies to: 241-334

plugins/governance/rate_limit_enforcement_test.go (1)

489-491: Minor: string(rune('0'+i)) only works correctly for single digits.

For i >= 10, string(rune('0'+i)) produces unexpected characters (e.g., ':' for 10, ';' for 11). Since the loop only goes to 5 iterations, this works correctly here, but using strconv.Itoa(i) would be more robust.

Suggested fix
-				Content: "Request " + string(rune('0'+i)) + " to test both limits.",
+				Content: "Request " + strconv.Itoa(i) + " to test both limits.",

Add "strconv" to imports.

framework/configstore/rdb.go (3)

749-763: Clarify the contract for missing client configuration.

The code returns default values when TableClientConfig is not found, with a comment stating "This will never happen, but just in case." This creates ambiguity:

  • If this truly never happens, consider removing the defensive code and letting the error propagate
  • If this can happen in valid scenarios (e.g., fresh installation), update the comment to explain when and why
  • If this is a critical invariant, consider using a panic or fatal error instead of silently returning defaults

The current implementation masks a potentially unexpected state, making debugging harder.


1393-1468: Consider consolidating cascade deletion logic.

The cascade deletion logic is manually implemented in multiple places (DeleteVirtualKey, DeleteVirtualKeyProviderConfig, DeleteTeam, DeleteCustomer). While the pattern is consistent, the duplication increases maintenance burden and the risk of inconsistencies.

Based on learnings, budgets and rate limits maintain 1:1 relationships with their parent entities, so the manual deletion ensures no cross-entity sharing occurs.

Consider:

  1. Extracting common deletion logic into helper functions (e.g., deleteBudgetIfExists, deleteRateLimitIfExists)
  2. Documenting why manual cascade is used instead of database CASCADE constraints
  3. Adding tests to verify the cascade behavior remains correct

The current implementation is functionally correct based on the 1:1 relationship invariant from learnings, but could be more maintainable.

Also applies to: 1549-1577, 1693-1729, 1781-1821


2177-2202: Well-implemented retry helper with minor suggestions.

The RetryOnNotFound function correctly handles retries for not-found scenarios with proper context cancellation support.

Consider these optional enhancements:

  1. Add debug logging for retry attempts to aid troubleshooting
  2. Consider making this a package-level function since it doesn't use any receiver state
Optional: Add retry attempt logging
 func (s *RDBConfigStore) RetryOnNotFound(ctx context.Context, fn func(ctx context.Context) (any, error), maxRetries int, retryDelay time.Duration) (any, error) {
 	var lastErr error
 	for attempt := range maxRetries {
 		result, err := fn(ctx)
 		if err == nil {
 			return result, nil
 		}
 		if !errors.Is(err, ErrNotFound) && !errors.Is(err, gorm.ErrRecordNotFound) {
 			return nil, err
 		}
 
 		lastErr = err
+		s.logger.Debug("RetryOnNotFound: attempt %d/%d failed, retrying after %v", attempt+1, maxRetries, retryDelay)
 
 		// Don't wait after the last attempt
 		if attempt < maxRetries-1 {
 			select {
 			case <-ctx.Done():
 				return nil, ctx.Err()
 			case <-time.After(retryDelay):
 				// Continue to next retry
 			}
 		}
 	}
 	return nil, lastErr
 }
plugins/governance/store.go (1)

166-217: Improved governance checks with baseline support.

The refactored CheckBudget and CheckRateLimit methods add:

  • Multi-node baseline support for distributed usage tracking
  • Independent expiration handling for token vs request limits
  • Defensive nil checks preventing panics
  • Enhanced logging for troubleshooting

Consider adding debug logging when baselines are nil to catch potential bugs in caller code:

Optional: Log warning for nil baselines
 func (gs *LocalGovernanceStore) CheckBudget(ctx context.Context, vk *configstoreTables.TableVirtualKey, request *EvaluationRequest, baselines map[string]float64) error {
 	if vk == nil {
 		return fmt.Errorf("virtual key cannot be nil")
 	}
 
 	// This is to prevent nil pointer dereference
 	if baselines == nil {
+		gs.logger.Debug("CheckBudget: baselines is nil, using empty map")
 		baselines = map[string]float64{}
 	}

Also applies to: 220-308

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bd2355b and e42e8da.

⛔ Files ignored due to path filters (11)
  • core/go.sum is excluded by !**/*.sum
  • framework/go.sum is excluded by !**/*.sum
  • plugins/governance/go.sum is excluded by !**/*.sum
  • plugins/jsonparser/go.sum is excluded by !**/*.sum
  • plugins/logging/go.sum is excluded by !**/*.sum
  • plugins/maxim/go.sum is excluded by !**/*.sum
  • plugins/mocker/go.sum is excluded by !**/*.sum
  • plugins/otel/go.sum is excluded by !**/*.sum
  • plugins/semanticcache/go.sum is excluded by !**/*.sum
  • plugins/telemetry/go.sum is excluded by !**/*.sum
  • transports/go.sum is excluded by !**/*.sum
📒 Files selected for processing (54)
  • .editorconfig
  • core/go.mod
  • core/mcp.go
  • core/providers/nebius/nebius.go
  • core/schemas/context.go
  • docs/changelogs/v1.3.47.mdx
  • docs/features/governance/virtual-keys.mdx
  • framework/configstore/rdb.go
  • framework/configstore/store.go
  • framework/configstore/tables/budget.go
  • framework/configstore/tables/mcp.go
  • framework/configstore/tables/ratelimit.go
  • framework/go.mod
  • framework/modelcatalog/main.go
  • framework/modelcatalog/sync.go
  • plugins/governance/advanced_scenarios_test.go
  • plugins/governance/changelog.md
  • plugins/governance/config_update_sync_test.go
  • plugins/governance/customer_budget_test.go
  • plugins/governance/e2e_test.go
  • plugins/governance/edge_cases_test.go
  • plugins/governance/fixtures_test.go
  • plugins/governance/go.mod
  • plugins/governance/in_memory_sync_test.go
  • plugins/governance/main.go
  • plugins/governance/provider_budget_test.go
  • plugins/governance/rate_limit_enforcement_test.go
  • plugins/governance/rate_limit_test.go
  • plugins/governance/resolver.go
  • plugins/governance/resolver_test.go
  • plugins/governance/store.go
  • plugins/governance/store_test.go
  • plugins/governance/team_budget_test.go
  • plugins/governance/test_utils.go
  • plugins/governance/tracker.go
  • plugins/governance/tracker_test.go
  • plugins/governance/usage_tracking_test.go
  • plugins/governance/utils.go
  • plugins/governance/vk_budget_test.go
  • plugins/jsonparser/go.mod
  • plugins/logging/go.mod
  • plugins/maxim/go.mod
  • plugins/mocker/go.mod
  • plugins/otel/go.mod
  • plugins/semanticcache/go.mod
  • plugins/telemetry/go.mod
  • transports/bifrost-http/handlers/governance.go
  • transports/bifrost-http/handlers/middlewares.go
  • transports/bifrost-http/lib/config.go
  • transports/bifrost-http/lib/config_test.go
  • transports/bifrost-http/lib/lib.go
  • transports/bifrost-http/server/server.go
  • transports/go.mod
  • ui/lib/constants/logs.ts
✅ Files skipped from review due to trivial changes (2)
  • docs/changelogs/v1.3.47.mdx
  • framework/go.mod
🚧 Files skipped from review as they are similar to previous changes (1)
  • docs/features/governance/virtual-keys.mdx
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • plugins/governance/store_test.go
  • plugins/governance/provider_budget_test.go
  • plugins/governance/advanced_scenarios_test.go
  • plugins/governance/changelog.md
  • plugins/governance/team_budget_test.go
  • plugins/governance/config_update_sync_test.go
  • framework/modelcatalog/main.go
  • core/schemas/context.go
  • plugins/governance/rate_limit_enforcement_test.go
  • plugins/governance/customer_budget_test.go
  • framework/modelcatalog/sync.go
  • plugins/governance/fixtures_test.go
  • plugins/governance/resolver_test.go
  • plugins/governance/in_memory_sync_test.go
  • plugins/governance/edge_cases_test.go
  • plugins/governance/e2e_test.go
  • core/mcp.go
  • plugins/governance/tracker_test.go
  • plugins/governance/main.go
  • framework/configstore/tables/budget.go
  • framework/configstore/store.go
  • plugins/governance/go.mod
  • framework/configstore/tables/ratelimit.go
  • core/providers/nebius/nebius.go
  • framework/configstore/rdb.go
  • plugins/governance/test_utils.go
  • framework/configstore/tables/mcp.go
  • plugins/governance/resolver.go
  • plugins/governance/tracker.go
  • plugins/governance/rate_limit_test.go
  • core/go.mod
  • plugins/governance/store.go
🧠 Learnings (3)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

  • plugins/governance/store_test.go
  • plugins/governance/provider_budget_test.go
  • plugins/governance/advanced_scenarios_test.go
  • plugins/governance/team_budget_test.go
  • plugins/governance/config_update_sync_test.go
  • framework/modelcatalog/main.go
  • core/schemas/context.go
  • plugins/governance/rate_limit_enforcement_test.go
  • plugins/governance/customer_budget_test.go
  • framework/modelcatalog/sync.go
  • plugins/governance/fixtures_test.go
  • plugins/governance/resolver_test.go
  • plugins/governance/in_memory_sync_test.go
  • plugins/governance/edge_cases_test.go
  • plugins/governance/e2e_test.go
  • core/mcp.go
  • plugins/governance/tracker_test.go
  • plugins/governance/main.go
  • framework/configstore/tables/budget.go
  • framework/configstore/store.go
  • framework/configstore/tables/ratelimit.go
  • core/providers/nebius/nebius.go
  • framework/configstore/rdb.go
  • plugins/governance/test_utils.go
  • framework/configstore/tables/mcp.go
  • plugins/governance/resolver.go
  • plugins/governance/tracker.go
  • plugins/governance/rate_limit_test.go
  • plugins/governance/store.go
📚 Learning: 2025-12-22T10:50:40.990Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1154
File: plugins/governance/store.go:1165-1186
Timestamp: 2025-12-22T10:50:40.990Z
Learning: In the Bifrost governance plugin, budgets and rate limits have 1:1 relationships with their parent entities (virtual keys, teams, customers). Do not assume sharing; ensure cascade deletion logic only deletes budgets/rate limits when there are no shared references. Enforce invariants in code and add tests to verify no cross-entity sharing and that cascade deletes only remove the specific child of the parent. If a counterexample arises, adjust data model or add guards.

Applied to files:

  • plugins/governance/store_test.go
  • plugins/governance/provider_budget_test.go
  • plugins/governance/advanced_scenarios_test.go
  • plugins/governance/team_budget_test.go
  • plugins/governance/config_update_sync_test.go
  • plugins/governance/rate_limit_enforcement_test.go
  • plugins/governance/customer_budget_test.go
  • plugins/governance/fixtures_test.go
  • plugins/governance/resolver_test.go
  • plugins/governance/in_memory_sync_test.go
  • plugins/governance/edge_cases_test.go
  • plugins/governance/e2e_test.go
  • plugins/governance/tracker_test.go
  • plugins/governance/main.go
  • plugins/governance/test_utils.go
  • plugins/governance/resolver.go
  • plugins/governance/tracker.go
  • plugins/governance/rate_limit_test.go
  • plugins/governance/store.go
📚 Learning: 2025-12-19T09:26:54.961Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/providers/utils/utils.go:1050-1051
Timestamp: 2025-12-19T09:26:54.961Z
Learning: Update streaming end-marker handling so HuggingFace is treated as a non-[DONE] provider for backends that do not emit a DONE marker (e.g., meta llama on novita). In core/providers/utils/utils.go, adjust ProviderSendsDoneMarker() (or related logic) to detect providers that may not emit DONE and avoid relying on DONE as the sole end signal. Add tests to cover both DONE-emitting and non-DONE backends, with clear documentation in code comments explaining the rationale and any fallback behavior.

Applied to files:

  • core/providers/nebius/nebius.go
🧬 Code graph analysis (17)
plugins/governance/provider_budget_test.go (5)
plugins/governance/test_utils.go (11)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (155-164)
  • BudgetRequest (177-180)
  • ProviderConfigRequest (167-174)
  • ExtractIDFromResponse (248-268)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • CheckErrorMessage (272-292)
  • CalculateCost (64-73)
examples/plugins/hello-world/main.go (1)
  • Cleanup (41-44)
ui/lib/types/governance.ts (1)
  • Budget (5-11)
core/schemas/models.go (1)
  • Model (109-129)
core/providers/gemini/types.go (2)
  • Role (17-17)
  • Content (977-985)
plugins/governance/team_budget_test.go (1)
plugins/governance/test_utils.go (10)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateTeamRequest (183-187)
  • BudgetRequest (177-180)
  • ExtractIDFromResponse (248-268)
  • CreateVirtualKeyRequest (155-164)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • CheckErrorMessage (272-292)
  • CalculateCost (64-73)
plugins/governance/config_update_sync_test.go (4)
plugins/governance/test_utils.go (14)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (155-164)
  • CreateRateLimitRequest (202-207)
  • UpdateVirtualKeyRequest (210-218)
  • BudgetRequest (177-180)
  • UpdateBudgetRequest (196-199)
  • ProviderConfigRequest (167-174)
  • CreateTeamRequest (183-187)
  • WaitForCondition (371-394)
  • UpdateTeamRequest (221-224)
  • CreateCustomerRequest (190-193)
  • UpdateCustomerRequest (227-230)
examples/plugins/hello-world/main.go (1)
  • Cleanup (41-44)
ui/lib/types/governance.ts (2)
  • RateLimit (13-25)
  • Budget (5-11)
core/providers/gemini/types.go (2)
  • Role (17-17)
  • Content (977-985)
framework/modelcatalog/main.go (2)
transports/bifrost-http/lib/config.go (1)
  • Config (189-220)
framework/modelcatalog/config.go (1)
  • Config (13-16)
plugins/governance/rate_limit_enforcement_test.go (3)
plugins/governance/test_utils.go (9)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateVirtualKeyRequest (155-164)
  • CreateRateLimitRequest (202-207)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • CheckErrorMessage (272-292)
  • ProviderConfigRequest (167-174)
ui/lib/types/governance.ts (1)
  • RateLimit (13-25)
core/providers/gemini/types.go (2)
  • Role (17-17)
  • Content (977-985)
plugins/governance/resolver_test.go (5)
plugins/governance/store.go (1)
  • NewLocalGovernanceStore (75-94)
plugins/governance/resolver.go (11)
  • NewBudgetResolver (70-75)
  • DecisionAllow (18-18)
  • DecisionVirtualKeyNotFound (19-19)
  • DecisionVirtualKeyBlocked (20-20)
  • DecisionProviderBlocked (26-26)
  • DecisionModelBlocked (25-25)
  • DecisionTokenLimited (23-23)
  • DecisionRequestLimited (24-24)
  • DecisionBudgetExceeded (22-22)
  • DecisionRateLimited (21-21)
  • Decision (15-15)
ui/lib/types/governance.ts (5)
  • VirtualKey (61-80)
  • RateLimit (13-25)
  • Budget (5-11)
  • Customer (37-44)
  • Team (27-35)
core/schemas/bifrost.go (2)
  • OpenAI (35-35)
  • BifrostContextKey (117-117)
core/schemas/models.go (1)
  • Model (109-129)
plugins/governance/in_memory_sync_test.go (3)
plugins/governance/test_utils.go (3)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
examples/plugins/hello-world/main.go (1)
  • Cleanup (41-44)
ui/lib/types/governance.ts (1)
  • Budget (5-11)
plugins/governance/edge_cases_test.go (1)
plugins/governance/test_utils.go (13)
  • NewGlobalTestData (307-313)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • CreateCustomerRequest (190-193)
  • BudgetRequest (177-180)
  • ExtractIDFromResponse (248-268)
  • CreateTeamRequest (183-187)
  • CreateVirtualKeyRequest (155-164)
  • ProviderConfigRequest (167-174)
  • ChatCompletionRequest (233-239)
  • ChatMessage (242-245)
  • CheckErrorMessage (272-292)
  • CalculateCost (64-73)
plugins/governance/e2e_test.go (3)
plugins/governance/test_utils.go (5)
  • MakeRequest (91-141)
  • APIRequest (76-81)
  • ExtractIDFromResponse (248-268)
  • WaitForAPICondition (398-424)
  • APIResponse (84-88)
ui/lib/types/governance.ts (2)
  • Budget (5-11)
  • RateLimit (13-25)
framework/configstore/tables/utils.go (1)
  • ParseDuration (9-43)
plugins/governance/tracker_test.go (6)
plugins/governance/fixtures_test.go (1)
  • NewMockLogger (24-32)
framework/configstore/clientconfig.go (1)
  • GovernanceConfig (720-727)
plugins/governance/resolver.go (1)
  • NewBudgetResolver (70-75)
plugins/governance/tracker.go (2)
  • NewUsageTracker (52-66)
  • UsageUpdate (17-30)
core/schemas/bifrost.go (1)
  • OpenAI (35-35)
core/utils.go (1)
  • IsFinalChunk (212-227)
plugins/governance/main.go (3)
plugins/governance/store.go (2)
  • GovernanceStore (50-72)
  • NewLocalGovernanceStore (75-94)
plugins/governance/resolver.go (2)
  • BudgetResolver (64-67)
  • NewBudgetResolver (70-75)
plugins/governance/tracker.go (2)
  • UsageTracker (33-45)
  • NewUsageTracker (52-66)
framework/configstore/tables/budget.go (1)
framework/configstore/tables/utils.go (1)
  • ParseDuration (9-43)
framework/configstore/store.go (1)
framework/configstore/tables/ratelimit.go (2)
  • TableRateLimit (11-36)
  • TableRateLimit (39-39)
plugins/governance/test_utils.go (2)
ui/lib/types/governance.ts (2)
  • Budget (5-11)
  • RateLimit (13-25)
core/schemas/models.go (1)
  • Model (109-129)
framework/configstore/tables/mcp.go (2)
core/schemas/mcp.go (1)
  • MCPStdioConfig (71-75)
ui/lib/types/mcp.ts (1)
  • MCPStdioConfig (7-11)
plugins/governance/resolver.go (4)
plugins/governance/store.go (1)
  • GovernanceStore (50-72)
framework/configstore/tables/virtualkey.go (2)
  • TableVirtualKey (95-122)
  • TableVirtualKey (125-125)
core/schemas/bifrost.go (1)
  • ModelProvider (32-32)
framework/configstore/tables/ratelimit.go (2)
  • TableRateLimit (11-36)
  • TableRateLimit (39-39)
plugins/governance/store.go (7)
framework/configstore/tables/virtualkey.go (2)
  • TableVirtualKey (95-122)
  • TableVirtualKey (125-125)
framework/configstore/tables/team.go (2)
  • TableTeam (12-38)
  • TableTeam (41-41)
framework/configstore/tables/budget.go (2)
  • TableBudget (11-27)
  • TableBudget (30-30)
framework/configstore/tables/ratelimit.go (2)
  • TableRateLimit (11-36)
  • TableRateLimit (39-39)
framework/configstore/tables/utils.go (1)
  • ParseDuration (9-43)
core/schemas/models.go (1)
  • Model (109-129)
ui/lib/types/governance.ts (4)
  • Budget (5-11)
  • RateLimit (13-25)
  • Team (27-35)
  • Customer (37-44)
🪛 GitHub Actions: PR #1153
framework/configstore/rdb.go

[error] 740-740: unknown field IsCodeModeClient in struct literal of type schemas.MCPClientConfig

🔇 Additional comments (68)
core/providers/nebius/nebius.go (1)

48-54: LGTM! Configuration plumbing is correct and consistent.

The initialization of sendBackRawRequest from config.SendBackRawRequest aligns with the existing pattern for sendBackRawResponse and properly enables configuration-driven control over whether raw requests are included in responses.

core/schemas/context.go (1)

174-176: LGTM: Good defensive programming for plugin safety.

The lazy initialization of userValues makes SetValue resilient to BifrostContext instances that might be created without using the constructor. This is particularly valuable in the new plugins architecture where external code could potentially misuse the API.

While NewBifrostContext always initializes the map (line 55), this defensive check ensures robustness against edge cases and future changes.

framework/modelcatalog/sync.go (1)

61-67: LGTM!

The optional pre-check pattern is correctly implemented with a nil guard before invoking the callback. Returning nil on cancellation is appropriate since this is an optional gating mechanism, not an error condition.

framework/modelcatalog/main.go (3)

81-86: LGTM!

Clean type definition with clear documentation explaining its optional nature and purpose.


32-33: LGTM!

Field placement is appropriate. Since the callback is set once during initialization and remains read-only, no additional synchronization is required.


88-108: All callers of modelcatalog.Init have been properly updated with the new shouldSyncPricingFunc parameter. The breaking API change is correctly handled across the codebase in all three call sites.

plugins/governance/go.mod (2)

10-10: Testify dependency is justified. The governance plugin uses github.com/stretchr/testify/assert and github.com/stretchr/testify/require across multiple test files (fixtures_test.go, tracker_test.go, store_test.go, resolver_test.go). The dependency addition is appropriate.


80-80: Governance plugin uses mcp-go indirectly; API verification applies to core layer, not governance.

The mcp-go v0.43.2 upgrade was applied to core/go.mod and transports/go.mod (direct dependencies). The governance plugin has mcp-go as an indirect dependency and does not import mark3labs packages—it only manages MCP configuration through configstore tables. No API changes to mcp-go will impact the governance plugin directly.

However, verify that core/mcp.go and transports/bifrost-http/handlers/mcp_server.go use the v0.43.2 API correctly, and note that test directories (tests/core-mcp/go.mod, tests/scripts/1millogs/go.mod) still reference v0.41.1, creating a version mismatch in the stack.

Likely an incorrect or invalid review comment.

plugins/governance/changelog.md (1)

1-3: LGTM!

The changelog entries clearly document the key changes: governance store interface extraction for extensibility, rate-limit handling improvements, and e2e test additions.

core/mcp.go (1)

1134-1151: LGTM!

The removal of the strict type assertion to *server.MCPServer correctly broadens compatibility. The validation at lines 998-1003 ensures InProcessServer is non-nil before reaching this point, and client.NewInProcessClient accepts the interface type directly.

framework/configstore/tables/budget.go (3)

24-27: LGTM!

The virtual field LastDBUsage with gorm:"-" json:"-" correctly excludes it from persistence and serialization, enabling runtime delta tracking without affecting the database schema.


32-46: LGTM!

The BeforeSave hook properly validates:

  • ResetDuration format using ParseDuration and ensures it's positive
  • MaxLimit is non-negative (zero is valid for "no budget allocated" scenarios)

This provides robust input validation before persistence.


47-52: LGTM!

The AfterFind hook correctly captures the database value into LastDBUsage after loading, enabling runtime tracking of usage deltas. This pattern is consistent with the parallel implementation in ratelimit.go.

framework/configstore/tables/ratelimit.go (2)

32-36: LGTM!

The virtual fields LastDBTokenUsage and LastDBRequestUsage with gorm:"-" json:"-" tags correctly enable runtime delta tracking without affecting persistence or serialization, consistent with the pattern in budget.go.


82-88: LGTM!

The AfterFind hook correctly captures both token and request usage values from the database into virtual fields for runtime tracking. This enables detecting in-memory usage changes since the last DB read.

plugins/governance/fixtures_test.go (3)

14-66: LGTM!

The MockLogger implementation is well-designed:

  • Thread-safe with sync.Mutex for concurrent test scenarios
  • Stores logs by category (errors, warnings, infos, debugs) for test assertions
  • Fatal appending to errors (rather than panicking) is appropriate for test contexts

68-193: LGTM!

The test data builders provide clean, composable factory functions for creating test fixtures. The consistent pattern (e.g., buildVirtualKeybuildVirtualKeyWithBudget) enables flexible test setup while maintaining DRY principles.


195-221: LGTM!

The assertion helpers appropriately use t.Helper() for accurate test failure line reporting and wrap testify assertions with domain-specific semantics.

plugins/governance/config_update_sync_test.go (3)

1-6: LGTM!

Clean package declaration and minimal imports for the test file.


686-764: Good use of WaitForCondition for reliable async verification.

This test properly uses WaitForCondition with progressive backoff for both usage updates and limit sync verification. This pattern should be preferred over fixed time.Sleep calls in other tests for consistency and reliability.


960-1123: LGTM!

The provider budget update sync test covers the complete flow: create VK with provider budget, consume budget via request, update to lower limit, and verify usage reset. The test appropriately skips when consumption fails.

plugins/governance/resolver.go (3)

64-75: LGTM!

The change from *GovernanceStore to GovernanceStore (interface type) is correct Go idiom. Interfaces should be passed by value since they're already reference types internally, and this enables dependency injection with any compatible implementation.


194-217: LGTM!

The new checkRateLimitHierarchy method consolidates rate-limit checking into a single hierarchy-aware function. It correctly:

  1. Calls CheckRateLimit on the store
  2. On error, determines the appropriate RateLimitInfo (provider-level first, then VK-level)
  3. Returns a structured EvaluationResult with the decision from the store

219-233: LGTM!

The updated checkBudgetHierarchy correctly takes the full *EvaluationRequest parameter, enabling the store to perform atomic budget checking across the VK → Team → Customer hierarchy with access to request context.

plugins/governance/test_utils.go (4)

15-73: LGTM!

The cost modeling utilities provide realistic test data for budget consumption testing. The CalculateCost function correctly calculates based on per-token pricing.


90-141: LGTM!

The MakeRequest function is well-implemented:

  • Properly handles optional body marshaling
  • Sets appropriate headers including virtual key when provided
  • Gracefully handles non-JSON responses by storing raw body

299-367: LGTM!

The GlobalTestData struct with cleanup logic is well-designed:

  • Properly tracks resources by type for ordered cleanup
  • Handles 404 responses gracefully (resource already deleted)
  • Logs cleanup summary for debugging

369-424: LGTM!

The polling utilities are well-implemented with:

  • Progressive backoff (50ms→500ms for conditions, 100ms→500ms for API)
  • Descriptive logging on timeout and multi-attempt success
  • Clean separation between generic condition checking and API-specific polling
plugins/governance/rate_limit_test.go (1)

1-991: Comprehensive rate limit test coverage looks good.

The test suite covers a wide range of rate limit scenarios including VK-level and provider-level limits for both tokens and requests. The use of t.Parallel() for concurrent test execution and proper cleanup via testData.Cleanup(t) is well implemented.

A few observations:

  • The time.Sleep(500 * time.Millisecond) calls for async sync verification are reasonable for integration tests, though they could make tests flaky under load.
  • Tests properly handle cases where rate limits might not be hit by using t.Skip rather than failing.
framework/configstore/store.go (2)

104-104: LGTM - GetRateLimits method follows existing patterns.

This method is consistent with the existing GetBudgets method at line 111, providing a clean way to retrieve all rate limits.


146-148: RetryOnNotFound properly handles context cancellation during retries.

The implementation in rdb.go (lines 2178-2202) includes a select statement that monitors ctx.Done() alongside the retry delay, immediately returning ctx.Err() if the context is cancelled. This prevents retries from continuing after context cancellation.

plugins/governance/team_budget_test.go (1)

8-159: Team budget test implementation is solid.

The test properly validates that a shared team budget is enforced across multiple VKs by:

  1. Creating a team with a restrictive budget
  2. Creating two VKs under the team with higher individual budgets
  3. Alternating requests between VKs to exhaust the shared team budget
  4. Verifying budget enforcement kicks in

The use of CalculateCost for tracking consumed budget and proper error message checking is well implemented.

plugins/governance/store_test.go (2)

68-101: Concurrent reads test is well designed.

The test launches 100 goroutines each performing 100 reads, validating that the lock-free read path handles high concurrency correctly. The use of atomic.Int64 for counters is appropriate.


1-351: Store test suite provides good coverage of governance store operations.

The tests cover key scenarios including VK retrieval, concurrent access, budget validation, rate limit updates, and reset operations. The use of builder functions for test data keeps tests readable.

plugins/governance/provider_budget_test.go (2)

57-145: Verify subtest state isolation for provider budget tests.

Both subtests (OpenAIProviderBudgetExceeded and AnthropicProviderBudgetExceeded) use the same vkValue and overall VK budget ($1.0). While provider budgets are tracked separately, requests in the OpenAI subtest will also consume from the overall VK budget.

If the first subtest consumes significant overall budget, it could affect the second subtest's behavior. Consider whether you want to:

  1. Create separate VKs for each subtest for complete isolation
  2. Document that subtests are intentionally testing shared VK behavior

1-236: Provider budget test structure is sound.

The test creates a VK with distinct provider-specific budgets and verifies that each provider's budget is independently enforced. The pattern of tracking consumed budget via CalculateCost and checking for budget-related error messages is consistent with other tests.

plugins/governance/advanced_scenarios_test.go (2)

1-1675: Comprehensive advanced scenarios test suite.

The test file provides excellent coverage of complex governance scenarios including:

  • VK switching between teams/customers after budget exhaustion
  • Hierarchical budget chain switching
  • Budget updates after exhaustion (VK, team, customer, provider config)
  • Deletion cascades for VKs, teams, and customers
  • Verification that VK entity IDs are set to nil when parent entities are deleted

The tests align well with the 1:1 relationship model between budgets/rate limits and their parent entities mentioned in the learnings.


1254-1259: The review comment is based on an incorrect premise. Rate limits are cascade-deleted when a VirtualKey is deleted, not left orphaned. The DeleteVirtualKeyInMemory function explicitly deletes both VK-level rate limits (line 1246) and provider-config-level rate limits (line 1256), using the same cascade pattern as budgets. The test comment at lines 1254–1259 suggesting that rate limits "may still exist in memory (orphaned)" is misleading and inconsistent with the actual cleanup logic.

Likely an incorrect or invalid review comment.

plugins/governance/tracker_test.go (1)

84-152: Streaming optimization test correctly validates chunk handling.

The test properly verifies that:

  1. Non-final streaming chunks update token usage but not request count
  2. Final chunks increment the request counter

This aligns with the UsageUpdate struct fields (IsStreaming, IsFinalChunk, HasUsageData) from the tracker.

plugins/governance/resolver_test.go (1)

1-551: Comprehensive test coverage for BudgetResolver - well structured.

The test suite covers essential governance scenarios including VK lifecycle, provider/model filtering, rate limits, budgets, and multi-level hierarchy. The use of table-driven tests for IsProviderAllowed and IsModelAllowed (lines 403-511) is a good pattern.

One observation on line 338: direct access to store.budgets.Store() tightly couples the test to the internal sync.Map implementation. If the storage mechanism changes, this test will break. Consider adding a method like UpdateBudgetForTest() to the store interface for controlled test manipulation, or accept this coupling as acceptable for package-internal tests.

plugins/governance/customer_budget_test.go (1)

10-160: Good integration tests for customer budget enforcement.

The tests correctly verify POST-HOC budget enforcement (the request that exceeds the limit is allowed, but subsequent requests are blocked). The shouldStop logic on lines 149-156 correctly implements this pattern.

The long lorem ipsum prompts help consume budget faster, making the tests more efficient.

plugins/governance/tracker.go (3)

47-49: Good choice for worker interval.

The 10-second interval provides a reasonable balance between responsiveness (detecting expired limits quickly) and performance (not hammering the store with resets).


128-148: Well-structured reset and dump workflow.

The three-part approach (reset rate limits → reset budgets → dump to DB) correctly separates in-memory operations from persistence. This ensures fast in-memory updates while batching DB writes for efficiency.


228-254: Selective field updates prevent config overwrites.

The selective update approach (lines 233-237) correctly updates only usage-related fields (token_current_usage, token_last_reset, request_current_usage, request_last_reset) while preserving max_limit and reset_duration. This prevents race conditions where config changes during startup could be lost.

plugins/governance/rate_limit_enforcement_test.go (1)

1-129: Well-documented POST-HOC enforcement behavior.

The test comments clearly explain that rate limit enforcement is POST-HOC: the request that exceeds the limit is ALLOWED, but subsequent requests are BLOCKED. This is important documentation for understanding the system behavior.

plugins/governance/in_memory_sync_test.go (1)

1-554: Comprehensive in-memory synchronization tests.

The tests properly verify that in-memory state is updated after CRUD operations. The pattern of verifying both the resource and its associated budget separately (e.g., lines 116-127) ensures complete coverage of the synchronization logic.

The pragmatic handling of SQLite locking issues (lines 547-551) with a warning instead of failure is a good approach for parallel test reliability.

plugins/governance/e2e_test.go (3)

766-857: Proper concurrent test implementation.

TestConcurrentRequestsToSameVK correctly uses sync.WaitGroup for goroutine synchronization and sync.Mutex for thread-safe success counting. This is the right pattern for concurrent testing.


1012-1036: Good use of polling instead of fixed sleep.

The WaitForAPICondition pattern (lines 1014-1032) is better than fixed sleeps because it adapts to actual system response time and has a clear timeout. This reduces flakiness while keeping tests responsive.


1-1543: Comprehensive e2e test coverage for governance features.

The test file covers critical governance scenarios including:

  • Multi-VK budget sharing (lines 20-183)
  • Full budget hierarchy enforcement (lines 189-354)
  • Failed requests not consuming budget (lines 360-500)
  • Rate limit reset boundaries (lines 652-760)
  • Concurrent request handling (lines 766-857)
  • Cascade deletion behavior (lines 1077-1307)
  • Provider load balancing and fallback (lines 1315-1543)

The section organization (CRITICAL, HIGH, FEATURE) makes it easy to understand test priorities.

plugins/governance/main.go (4)

40-47: Good abstraction with BaseGovernancePlugin interface.

The new interface provides a clean contract for the governance plugin, enabling easier mocking in tests and potential alternative implementations. The method signatures cover the full plugin lifecycle (hooks, interceptor, cleanup, store access).


166-224: InitFromStore enables dependency injection for testing.

The new constructor allows injecting a custom GovernanceStore implementation, which is valuable for:

  • Unit testing with mock stores
  • Integration with non-standard storage backends
  • Pre-configured store instances

The nil check on line 193-195 correctly validates the required dependency.


210-211: Context shadowing is intentional for plugin lifecycle.

Line 210 shadows the ctx parameter with a new cancellable context. This is intentional - the plugin manages its own lifecycle via cancelFunc, independent of the caller's context. This pattern ensures proper cleanup regardless of the caller's context state.


462-464: Error message change improves abstraction.

Changing from "x-bf-vk header is missing" to "virtual key is missing in headers" is a good improvement. It doesn't expose the specific header name implementation detail to API consumers, making the API more stable if header naming changes.

framework/configstore/tables/mcp.go (1)

16-35: Consistent pattern for new MCP client fields.

The new fields (IsCodeModeClient, ToolsToAutoExecuteJSON, ToolsToAutoExecute) follow the established pattern of the existing ToolsToExecute field pair. The serialization in BeforeSave (lines 63-71) and deserialization in AfterFind (lines 101-105) are symmetric and handle nil/empty cases correctly. Database migrations exist for both new columns with proper initialization of existing rows.

framework/configstore/rdb.go (4)

9-9: LGTM!

The time import is correctly added to support the new RetryOnNotFound function and MCP timeout duration handling.


45-46: Verify validation requirements for MCP configuration fields.

The new MCPAgentDepth and MCPToolExecutionTimeout fields are correctly persisted and retrieved. Consider whether these fields need validation (e.g., minimum/maximum bounds for agent depth, reasonable timeout ranges) either at the API layer or in the database layer.

Also applies to: 200-201


787-817: Blocked by schema mismatch - logic is correct.

The MCP client configuration logic properly handles:

  • Environment variable substitution with correct context (no existing headers for create, existing headers for update)
  • Explicit Select() to include zero-value boolean fields that GORM would otherwise skip

However, this code cannot function until the IsCodeModeClient field is added to schemas.MCPClientConfig (see previous comment about pipeline failure).

Note: The explicit Select() usage on lines 848-851 is the correct approach to ensure boolean fields with false values are updated in the database.

Also applies to: 820-856


1823-1830: LGTM!

The GetRateLimits method correctly retrieves all rate limits from the database, following the same pattern as other collection getters like GetBudgets.

plugins/governance/store.go (5)

17-72: Excellent interface design with clear error semantics.

The refactoring introduces a well-documented GovernanceStore interface with an explicit error contract:

  • Errors from CheckRateLimit and CheckBudget indicate policy violations, not infrastructure failures
  • This prevents inappropriate retry logic and ensures consistent behavior

The renamed LocalGovernanceStore implementation and new GovernanceData snapshot type provide clean abstractions for in-memory governance management.


96-149: LGTM!

The GetGovernanceData implementation correctly materializes an immutable snapshot from the in-memory sync.Map fields. The type-checking and nil-safety handling ensure robustness even with corrupted data.


1225-1266: Cascade deletion logic correctly maintains in-memory integrity.

The in-memory cascade deletion in DeleteVirtualKeyInMemory, DeleteTeamInMemory, and DeleteCustomerInMemory correctly:

  • Deletes associated budgets and rate limits (maintaining 1:1 relationship invariant from learnings)
  • Nullifies foreign key references in related entities
  • Maintains consistency with database cascade deletion logic in rdb.go

The implementation correctly enforces that budgets and rate limits are not shared across entities, as documented in learnings.

Also applies to: 1330-1362, 1426-1474


1576-1668: The helper functions correctly implement the budget and rate limit update semantics.

The checkAndUpdateBudget and checkAndUpdateRateLimit functions properly handle the reset logic:

  • Reset usage to 0 when current usage would exceed newly tightened limits
  • Preserve usage otherwise to avoid data loss
  • Maintain LastDB fields for multi-node baseline consistency

The equalPtr helper function referenced in checkAndUpdateRateLimit is defined in plugins/governance/utils.go and is properly accessible.


311-349: Shallow copy pattern is safe—no changes required.

The clone-first pattern correctly prevents race conditions. Verification confirms all modified fields in both TableBudget and TableRateLimit are value types (counters, timestamps) that are properly copied. Pointer fields (TokenResetDuration, RequestResetDuration) are only read via dereference, never reassigned, making shallow copying safe. Strings in Go are immutable, so shared pointer references do not create mutation races.

core/go.mod (5)

8-10: Approve patch and minor version updates to existing dependencies.

The updates to AWS SDK components (v1.32.6, v1.19.6, v1.94.0), sonic, fasthttp, oauth2, text, compress, and arch are all patch or minor version bumps that align with maintenance and security practices. No apparent compatibility concerns.

Also applies to: 12-12, 17-17, 20-22, 39-39, 45-45, 52-52, 64-66


49-49: The +incompatible tag is correct; v2.1.4+incompatible is available for potential upgrade.

The v2.1.3+incompatible version is the expected state. The +incompatible tag is necessary because go-sourcemap was originally released as v1.0.x and later had a breaking change release as v2.x; no compatible v2.1.3 variant exists. Additionally, there is no v3.x version. However, a newer patch version v2.1.4+incompatible is available in the module registry if you wish to upgrade this transitive dependency (pulled in by github.com/dop251/goja).


17-17: Version bump is compatible; agent/code-mode features are bifrost-specific implementations.

The mcp-go v0.43.2 upgrade is safe and brings improvements for JSON Schema handling and notification processing. Note that v0.43.2 does not introduce agent-mode or code-execution capabilities—these are implemented within bifrost's code. The MCP library provides the client/server foundation; bifrost's agent orchestration and code-execution features (core/mcp/agent.go, core/mcp/codemode_*.go) are independent implementations. No breaking changes exist in the v0.41.1 → v0.43.2 transition.


13-14: Dependencies are appropriately selected for production use.

Both go-typescript v0.7.0 and goja v0.0.0-20251103141225-af2ceb9156d7 are legitimate and well-maintained:

  • go-typescript v0.7.0 is a stable release (latest available version in the repo)
  • goja uses a pseudo-version because the project doesn't publish formal GitHub releases—this is standard Go practice and not indicative of instability
  • Both projects show active maintenance and no known security issues

27-27: All transitive dependencies are necessary and secure—no issues found.

The four indirect dependencies (Masterminds/semver/v3, regexp2, go-sourcemap, and pprof) are legitimate requirements pulled by the two direct dependencies go-typescript and goja, which power the new TypeScript/JavaScript execution layer.

Regarding potential optimizations:

  • regexp2 cannot be replaced with the standard library regexp package; goja (JavaScript runtime) requires ECMAScript-compatible regex semantics, not Go's syntax.
  • Masterminds/semver, go-sourcemap, and pprof have no standard library equivalents.
  • All versions are current, stable, and have no reported CVEs.

## Summary

Added a configurable code mode binding level for MCP tools, allowing tools to be exposed in the virtual file system (VFS) at either server-level or tool-level granularity.

## Changes

- Added a new `CodeModeBindingLevel` configuration option with two possible values:
  - `server`: All tools for a server are exposed in a single file (e.g., `servers/calculator.d.ts`)
  - `tool`: Each tool is exposed in its own file (e.g., `servers/calculator/add.d.ts`)
- Enhanced the `listToolFiles` and `readToolFile` tools to support both binding levels
- Implemented a tree-based VFS structure for better organization of tool files
- Added UI configuration in the MCP Gateway settings page with visual examples
- Updated database schema and migrations to store the new configuration

## Type of change

- [x] Feature
- [x] UI (Next.js)

## Affected areas

- [x] Core (Go)
- [x] Transports (HTTP)
- [x] UI (Next.js)

## How to test

1. Configure the code mode binding level in the UI:
   - Navigate to Workspace → Config → MCP Gateway
   - Select either "Server-Level" or "Tool-Level" binding
   - Save changes

2. Test with code mode in a chat:
   ```typescript
   // First list available tools
   await listToolFiles()
   
   // Read tool definitions based on binding level
   // For server-level:
   await readToolFile({ fileName: "servers/calculator.d.ts" })
   
   // For tool-level:
   await readToolFile({ fileName: "servers/calculator/add.d.ts" })
   
   // Execute tool code as usual
   await executeToolCode(`
     const result = await calculator.add({ a: 5, b: 3 });
     return result;
   `)
   ```

## Screenshots/Recordings

N/A

## Breaking changes

- [x] No

## Related issues

N/A

## Security considerations

No additional security implications. This change only affects how tools are organized and presented in the virtual file system.

## Checklist

- [x] I added/updated tests where appropriate
- [x] I updated documentation where needed
- [x] I verified builds succeed (Go and UI)
## Summary

Add support for tool execution in the Responses API, enabling agent mode and tool execution for both Chat Completions and Responses APIs with full feature parity.

## Changes

- Renamed `ExecuteMCPTool` to `ExecuteChatMCPTool` to clarify it works with Chat API format
- Added `ExecuteResponsesMCPTool` to support tool execution with Responses API format
- Implemented format converters between Chat and Responses API tool formats
- Added adapter pattern to make agent execution API-agnostic
- Updated HTTP endpoints to support both Chat and Responses API tool execution
- Added comprehensive tests for format conversion and tool execution

## Type of change

- [ ] Bug fix
- [x] Feature
- [ ] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [x] Core (Go)
- [x] Transports (HTTP)
- [ ] Providers/Integrations
- [ ] Plugins
- [ ] UI (Next.js)
- [ ] Docs

## How to test

Test tool execution with both Chat and Responses APIs:

```sh
# Run all tests
go test ./...

# Run specific MCP tests
go test ./core/mcp -v

# Test HTTP endpoints
curl -X POST http://localhost:8080/v1/mcp/tool/chat/execute -d '{"id":"call123","type":"function","function":{"name":"calculator","arguments":"{\"x\":10,\"y\":5}"}}'

curl -X POST http://localhost:8080/v1/mcp/tool/responses/execute -d '{"call_id":"call123","name":"calculator","arguments":"{\"x\":10,\"y\":5}"}'
```

## Breaking changes

- [x] Yes
- [ ] No

The `ExecuteMCPTool` method has been renamed to `ExecuteChatMCPTool`. Any code directly calling this method will need to be updated. The HTTP endpoint has also changed from `/v1/mcp/tool/execute` to `/v1/mcp/tool/chat/execute`.

## Related issues

Enables Responses API to have full feature parity with Chat API for tool execution and agent mode.

## Security considerations

No new security implications. Tool execution permissions and sandboxing remain unchanged.

## Checklist

- [x] I added/updated tests where appropriate
- [x] I verified builds succeed (Go and UI)
- [x] I verified the CI pipeline passes locally if applicable
## Summary

Implement health monitoring for MCP clients to detect and handle connection state changes automatically.

## Changes

- Added a new `health_monitor.go` file with a health monitoring system for MCP clients
- Implemented `ClientHealthMonitor` to periodically check client health via ping
- Created `HealthMonitorManager` to manage multiple client health monitors
- Updated client state tracking to use the actual connection state instead of inferring it
- Added connection lost detection for SSE connections
- Improved client cleanup by stopping health monitoring when removing clients
- Enhanced logging with more descriptive messages for connection events

## Type of change

- [x] Feature
- [x] Bug fix

## Affected areas

- [x] Core (Go)
- [x] Transports (HTTP)

## How to test

1. Configure multiple MCP clients with different connection types
2. Start Bifrost and verify clients connect properly
3. Disconnect a client (e.g., stop an external MCP server)
4. Verify the client state changes to "disconnected" after several failed health checks
5. Reconnect the client and verify state returns to "connected"
6. Test SSE connections specifically to verify connection lost detection

```sh
# Core/Transports
go version
go test ./...
```

## Breaking changes

- [x] No

## Related issues

Improves reliability of MCP client connections and provides better visibility into connection states.

## Security considerations

No significant security implications. The health monitoring system only interacts with already established connections.

## Checklist

- [x] I added/updated tests where appropriate
- [x] I verified builds succeed (Go and UI)
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
framework/configstore/migrations.go (1)

1803-1844: Include MCP fields in ClientConfig hash backfill

In migrationAddAdditionalConfigHashColumns, the ClientConfig struct used to backfill hashes is missing the MCP fields that are now part of GenerateClientConfigHash:

clientConfig := ClientConfig{
  DropExcessRequests:      cc.DropExcessRequests,
  InitialPoolSize:         cc.InitialPoolSize,
  PrometheusLabels:        cc.PrometheusLabels,
  EnableLogging:           cc.EnableLogging,
  DisableContentLogging:   cc.DisableContentLogging,
  LogRetentionDays:        cc.LogRetentionDays,
  EnableGovernance:        cc.EnableGovernance,
  EnforceGovernanceHeader: cc.EnforceGovernanceHeader,
  AllowDirectKeys:         cc.AllowDirectKeys,
  AllowedOrigins:          cc.AllowedOrigins,
  MaxRequestBodySizeMB:    cc.MaxRequestBodySizeMB,
  EnableLiteLLMFallbacks:  cc.EnableLiteLLMFallbacks,
}

Add the three MCP fields that are now hashed:

MCPAgentDepth:           cc.MCPAgentDepth,
MCPToolExecutionTimeout: cc.MCPToolExecutionTimeout,
MCPCodeModeBindingLevel: cc.MCPCodeModeBindingLevel,

Without these, the backfilled hashes won't match the complete configuration state, causing spurious resync mismatches later.

♻️ Duplicate comments (3)
core/mcp/agent_adaptors.go (1)

139-153: Type assertion could panic on conversation type mismatch.

At line 143, msg.(schemas.ChatMessage) assumes all items in conversation are of type ChatMessage. If a non-ChatMessage type is accidentally added to the conversation slice, this will panic. The same issue exists in responsesAPIAdapter.createNewRequest at line 356.

This was previously flagged. Consider adding a defensive type check with a safe assertion pattern.

core/mcp/toolmanager.go (2)

29-33: Typo in documentation comment.

Line 32 contains a typo: "If not provider" should be "If not provided".

This was previously flagged.


134-145: Outdated parameter documentation.

Line 141 documents availableToolsPerClient as a parameter, but this parameter doesn't exist in the function signature. The function only takes existingTools and integrationUserAgent.

This was previously flagged.

🧹 Nitpick comments (11)
core/chatbot_test.go (1)

531-588: Propagate a request-scoped context into ExecuteChatMCPTool instead of context.Background()

ExecuteChatMCPTool is currently always invoked with context.Background(), so cancellation/timeouts from the outer chat request cannot affect tool execution. If you have (or later add) per-request contexts with deadlines or governance metadata, consider threading that through handleToolCalls and into ExecuteChatMCPTool to avoid orphaned tool runs.

core/schemas/mux.go (2)

402-543: Function-call outputs are now routed via Output instead of Content; verify all call sites expect this

The changes in ChatMessage.ToResponsesMessages for ChatMessageRoleTool:

  • set Type=function_call_output and intentionally omit Role,
  • avoid populating rm.Content for these messages, and
  • populate ResponsesToolMessage.Output.ResponsesToolCallOutputStr from cm.Content.ContentStr.

This matches the Responses API’s function_call_output shape and aligns with the reverse conversion in ToChatMessages, which reconstructs Content from Output.

Just confirm that any existing code reading Responses-style tool outputs is looking at ResponsesToolMessage.Output rather than Content, especially for newly introduced MCP / code-mode paths.

#!/bin/bash
# Look for places that read function_call_output messages and ensure they use .Output, not .Content.
rg -n "function_call_output" core

152-207: ToResponsesToolMessage does drop non-text fields from content blocks for tool outputs

When ToResponsesToolMessage converts ContentBlocks (lines 191–202), it only copies Type and Text, discarding ImageURLStruct, InputAudio, File, and CacheControl. While the underlying ResponsesToolMessageOutputStruct does support structured output via ResponsesFunctionToolCallOutputBlocks, the conversion logic is incomplete compared to the field-preserving logic in ToResponsesMessages for other message types.

In practice, tool outputs in the codebase are created exclusively with ContentStr, never ContentBlocks, so this limitation is not currently manifested. However, documenting that tool outputs are text-only, or aligning the block-conversion logic with the richer handling in ToResponsesMessages, would clarify intent and prevent potential data loss if tool output generation changes.

core/mcp/codemode_listfiles.go (1)

69-118: Handle the rare case where code-mode servers exist but no virtual files are emitted

handleListToolFiles returns helpful messages when no MCP clients are connected or when no clients are configured for code mode. For tool-level binding, it's possible (though unlikely) to have code-mode servers where none of the tools have a non-empty Function.Name, resulting in codeModeServerCount > 0 but an empty files slice. In that case, buildVFSTree returns an empty string and the tool response is blank.

Consider adding a check for len(files) == 0 after the codeModeServerCount check to emit a clearer message like "Code-mode servers are connected but none expose function tools, so there are no virtual .d.ts files."

framework/configstore/clientconfig.go (1)

35-54: Add test coverage for new MCP config fields in hash generation

The hash implementation for MCPAgentDepth, MCPToolExecutionTimeout, and MCPCodeModeBindingLevel is correct and deterministic. All three fields always hash: integer fields hash as their actual value (or 0 if unset), and the string field hashes as its value or defaults to "server" when empty. Config changes from unset to explicit default will correctly produce different hashes.

However, the existing test suite (TestGenerateClientConfigHash and TestGenerateClientConfigHash_RuntimeVsMigrationParity) contains no test cases for these new MCP fields. Add explicit tests to verify hash behavior when these fields change.

core/mcp/codemode_readfile.go (2)

294-431: Property names and enum values in TypeScript generation need careful handling

The generated TypeScript code has two limitations worth addressing:

  1. Property names containing illegal TS identifier characters: Properties like foo-bar or some prop are emitted verbatim at line 399 and will produce syntactically invalid TypeScript. These should be quoted (e.g., "foo-bar": string;).

  2. Enum values always converted to strings: The jsonSchemaToTypeScript function uses fmt.Sprintf("%q", e) for all enum values, converting numeric and boolean enums to quoted strings (e.g., 1 | 2 becomes "1" | "2" instead of preserving their types).

Consider:

  • Quoting property names that contain non-identifier-safe characters
  • Preserving the original types for numeric and boolean enum values

486-503: toPascalCase will corrupt non-ASCII characters due to byte-slicing

toPascalCase uses byte indexing (part[:1], part[1:]) which breaks multi-byte UTF-8 characters. For example, if a tool name like "école" reaches this function, the byte slice corrupts the leading byte, producing invalid UTF-8 output. This is unlikely in practice since MCP tool identifiers conventionally use ASCII, but parseToolName allows non-ASCII characters through via unicode.IsLetter().

For more robust handling of any non-ASCII tool names, switch to rune-based slicing (convert part to runes, slice by rune index, then convert back). Otherwise, this pragmatic choice is acceptable given MCP naming conventions.

core/mcp/clientmanager.go (1)

567-604: Consider potential TOCTOU race in setupLocalHost double-check pattern.

The double-checked locking pattern here could allow two goroutines to both create server/client objects before the second lock acquisition. While the second check prevents duplicate assignment, the first goroutine's created objects would be discarded without explicit cleanup.

This is a minor concern since the created objects are lightweight and Go's GC will reclaim them, but for clarity you might consider holding the lock through creation or using sync.Once.

core/bifrost.go (1)

1879-1895: Thread-safe initialization in AddMCPClient has a subtle race condition.

If Init was called without MCPConfig (so mcpManager is nil and mcpInitOnce is unused), then a concurrent call to AddMCPClient from multiple goroutines could result in only one goroutine initializing the manager while others see mcpManager as nil after sync.Once completes (since only the first caller's config is used).

The current check at lines 1889-1892 handles this, but there's a subtle issue: if mcpInitOnce.Do was already executed by another AddMCPClient call, subsequent callers will skip initialization and then call AddClient on the manager initialized by the first caller, which is correct behavior. However, consider adding a comment to clarify this intentional behavior.

🔎 Suggested clarification
 func (bifrost *Bifrost) AddMCPClient(config schemas.MCPClientConfig) error {
 	if bifrost.mcpManager == nil {
 		// Use sync.Once to ensure thread-safe initialization
+		// Note: Only the first caller's config is used to initialize the manager.
+		// Subsequent callers will add their config via AddClient below.
 		bifrost.mcpInitOnce.Do(func() {
 			bifrost.mcpManager = mcp.NewMCPManager(bifrost.ctx, schemas.MCPConfig{
 				ClientConfigs: []schemas.MCPClientConfig{config},
 			}, bifrost.logger)
 		})
 	}
core/mcp/agent_adaptors.go (2)

219-270: Consider extracting duplicated tool results formatting logic.

The logic for building toolResultsMap and formatting contentText (lines 219-270) is duplicated almost identically in createResponsesResponseWithExecutedToolsAndNonAutoExecutableCalls (lines 455-506). This could be extracted into a shared helper function to reduce code duplication and maintenance burden.

🔎 Suggested helper extraction
// formatToolResultsAsText builds a formatted text representation of executed tool results.
func formatToolResultsAsText(
    executedToolResults []*schemas.ChatMessage,
    toolCallIDToName map[string]string,
) string {
    if len(executedToolResults) == 0 {
        return "Now I shall call these tools next..."
    }
    
    toolResultsMap := make(map[string]interface{})
    for _, toolResult := range executedToolResults {
        // ... extraction logic ...
    }
    
    jsonBytes, err := sonic.Marshal(toolResultsMap)
    if err != nil {
        return fmt.Sprintf("The Output from allowed tools calls is - %v\n\nNow I shall call these tools next...", toolResultsMap)
    }
    return fmt.Sprintf("The Output from allowed tools calls is - %s\n\nNow I shall call these tools next...", string(jsonBytes))
}

112-120: Type assertions in adapter methods assume correct types.

Methods like hasToolCalls, extractToolCalls, and addAssistantMessage use direct type assertions (e.g., response.(*schemas.BifrostChatResponse)) that will panic if the wrong type is passed. Since this is internal code and the adapters are instantiated with matching types, this is acceptable, but consider adding defensive checks for robustness.

🔎 Example defensive pattern
func (c *chatAPIAdapter) hasToolCalls(response interface{}) bool {
    chatResponse, ok := response.(*schemas.BifrostChatResponse)
    if !ok {
        logger.Warn(fmt.Sprintf("hasToolCalls received unexpected type: %T", response))
        return false
    }
    return hasToolCallsForChatResponse(chatResponse)
}
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e42e8da and bca600b.

⛔ Files ignored due to path filters (2)
  • core/go.sum is excluded by !**/*.sum
  • transports/go.sum is excluded by !**/*.sum
📒 Files selected for processing (43)
  • core/bifrost.go
  • core/changelog.md
  • core/chatbot_test.go
  • core/mcp/agent_adaptors.go
  • core/mcp/agent_test.go
  • core/mcp/clientmanager.go
  • core/mcp/codemode_listfiles.go
  • core/mcp/codemode_readfile.go
  • core/mcp/health_monitor.go
  • core/mcp/mcp.go
  • core/mcp/toolmanager.go
  • core/providers/nebius/nebius.go
  • core/providers/openai/openai.go
  • core/schemas/bifrost.go
  • core/schemas/mcp.go
  • core/schemas/mux.go
  • framework/configstore/clientconfig.go
  • framework/configstore/migrations.go
  • framework/configstore/rdb.go
  • framework/configstore/tables/clientconfig.go
  • framework/configstore/tables/mcp.go
  • plugins/governance/go.mod
  • transports/bifrost-http/handlers/config.go
  • transports/bifrost-http/handlers/inference.go
  • transports/bifrost-http/handlers/mcp.go
  • transports/bifrost-http/handlers/mcp_server.go
  • transports/bifrost-http/lib/config.go
  • transports/bifrost-http/server/server.go
  • transports/changelog.md
  • transports/go.mod
  • ui/app/workspace/config/logging/page.tsx
  • ui/app/workspace/config/mcp-gateway/page.tsx
  • ui/app/workspace/config/views/clientSettingsView.tsx
  • ui/app/workspace/config/views/governanceView.tsx
  • ui/app/workspace/config/views/loggingView.tsx
  • ui/app/workspace/config/views/mcpView.tsx
  • ui/app/workspace/config/views/observabilityView.tsx
  • ui/app/workspace/config/views/performanceTuningView.tsx
  • ui/app/workspace/config/views/securityView.tsx
  • ui/components/sidebar.tsx
  • ui/lib/constants/logs.ts
  • ui/lib/types/config.ts
  • ui/lib/types/schemas.ts
✅ Files skipped from review due to trivial changes (1)
  • core/changelog.md
🚧 Files skipped from review as they are similar to previous changes (2)
  • core/providers/openai/openai.go
  • core/providers/nebius/nebius.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • core/mcp/clientmanager.go
  • framework/configstore/migrations.go
  • core/mcp/mcp.go
  • core/mcp/agent_adaptors.go
  • core/mcp/codemode_readfile.go
  • core/chatbot_test.go
  • framework/configstore/tables/mcp.go
  • core/mcp/health_monitor.go
  • core/mcp/codemode_listfiles.go
  • core/mcp/agent_test.go
  • core/schemas/mcp.go
  • plugins/governance/go.mod
  • core/schemas/mux.go
  • framework/configstore/rdb.go
  • framework/configstore/tables/clientconfig.go
  • core/mcp/toolmanager.go
  • core/bifrost.go
  • framework/configstore/clientconfig.go
  • core/schemas/bifrost.go
🧠 Learnings (1)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

  • core/mcp/clientmanager.go
  • framework/configstore/migrations.go
  • core/mcp/mcp.go
  • core/mcp/agent_adaptors.go
  • core/mcp/codemode_readfile.go
  • core/chatbot_test.go
  • framework/configstore/tables/mcp.go
  • core/mcp/health_monitor.go
  • core/mcp/codemode_listfiles.go
  • core/mcp/agent_test.go
  • core/schemas/mcp.go
  • core/schemas/mux.go
  • framework/configstore/rdb.go
  • framework/configstore/tables/clientconfig.go
  • core/mcp/toolmanager.go
  • core/bifrost.go
  • framework/configstore/clientconfig.go
  • core/schemas/bifrost.go
🧬 Code graph analysis (10)
core/mcp/clientmanager.go (4)
core/mcp/mcp.go (5)
  • MCPManager (41-49)
  • MCPLogPrefix (23-23)
  • BifrostMCPClientKey (22-22)
  • MCPClientConnectionEstablishTimeout (24-24)
  • BifrostMCPClientName (21-21)
core/mcp.go (6)
  • MCPManager (49-56)
  • MCPLogPrefix (31-31)
  • BifrostMCPClientKey (30-30)
  • MCPClientConnectionInfo (69-73)
  • MCPClientConnectionEstablishTimeout (32-32)
  • BifrostMCPClientName (29-29)
core/schemas/mcp.go (9)
  • MCPClientState (96-104)
  • MCPClientConfig (45-67)
  • MCPClientConnectionInfo (107-111)
  • MCPConnectionTypeHTTP (73-73)
  • MCPConnectionTypeSTDIO (74-74)
  • MCPConnectionTypeSSE (75-75)
  • MCPConnectionTypeInProcess (76-76)
  • MCPConnectionStateConnected (89-89)
  • MCPConnectionStateDisconnected (90-90)
core/mcp/health_monitor.go (2)
  • NewClientHealthMonitor (35-53)
  • DefaultHealthCheckInterval (14-14)
core/mcp/mcp.go (4)
core/mcp/toolmanager.go (2)
  • ToolsManager (22-34)
  • NewToolsManager (53-82)
core/schemas/mcp.go (5)
  • MCPClientState (96-104)
  • MCPConfig (14-23)
  • MCPToolManagerConfig (25-29)
  • DefaultToolExecutionTimeout (33-33)
  • DefaultMaxAgentDepth (32-32)
core/mcp/health_monitor.go (2)
  • HealthMonitorManager (185-188)
  • NewHealthMonitorManager (191-195)
core/mcp/init.go (1)
  • SetLogger (7-9)
core/mcp/agent_adaptors.go (2)
core/schemas/chatcompletions.go (5)
  • ChatAssistantMessageToolCall (719-725)
  • BifrostChatRequest (12-19)
  • BifrostChatResponse (27-42)
  • BifrostResponseChoice (745-753)
  • ChatAssistantMessage (658-665)
core/schemas/responses.go (5)
  • BifrostResponsesRequest (32-39)
  • BifrostResponsesResponse (45-85)
  • ResponsesMessageTypeMessage (297-297)
  • ResponsesInputMessageRoleAssistant (339-339)
  • ResponsesMessageTypeFunctionCall (302-302)
framework/configstore/tables/mcp.go (2)
core/schemas/mcp.go (1)
  • MCPStdioConfig (80-84)
ui/lib/types/mcp.ts (1)
  • MCPStdioConfig (7-11)
core/mcp/health_monitor.go (2)
core/mcp/mcp.go (2)
  • MCPManager (41-49)
  • MCPLogPrefix (23-23)
core/schemas/mcp.go (3)
  • MCPConnectionStateDisconnected (90-90)
  • MCPConnectionStateConnected (89-89)
  • MCPConnectionState (86-86)
core/mcp/agent_test.go (6)
core/schemas/chatcompletions.go (4)
  • BifrostChatResponse (27-42)
  • BifrostChatRequest (12-19)
  • ChatTool (261-266)
  • BifrostResponseChoice (745-753)
core/schemas/responses.go (7)
  • BifrostResponsesResponse (45-85)
  • BifrostResponsesRequest (32-39)
  • ResponsesMessageTypeFunctionCall (302-302)
  • ResponsesMessageTypeMessage (297-297)
  • ResponsesInputMessageRoleAssistant (339-339)
  • ResponsesInputMessageRoleUser (340-340)
  • ResponsesMessageTypeFunctionCallOutput (303-303)
core/schemas/bifrost.go (4)
  • BifrostError (482-491)
  • ErrorField (500-507)
  • ResponsesRequest (96-96)
  • OpenAI (35-35)
core/schemas/utils.go (1)
  • Ptr (16-18)
transports/bifrost-http/server/server.go (1)
  • SetLogger (116-118)
core/mcp/init.go (1)
  • SetLogger (7-9)
core/schemas/mcp.go (2)
ui/lib/types/mcp.ts (3)
  • MCPConnectionType (3-3)
  • MCPStdioConfig (7-11)
  • MCPConnectionState (5-5)
core/mcp.go (1)
  • MCPClientConnectionInfo (69-73)
core/schemas/mux.go (3)
core/schemas/responses.go (6)
  • ResponsesToolMessage (472-492)
  • ResponsesMessage (321-334)
  • ResponsesMessageTypeFunctionCallOutput (303-303)
  • ResponsesToolMessageOutputStruct (541-545)
  • ResponsesMessageContentBlock (406-421)
  • ResponsesMessageContent (346-351)
core/schemas/chatcompletions.go (4)
  • ChatAssistantMessageToolCall (719-725)
  • ChatAssistantMessageToolCallFunction (728-731)
  • ChatMessage (487-496)
  • ChatToolMessage (653-655)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/mcp/toolmanager.go (7)
core/schemas/mcp.go (4)
  • MCPToolManagerConfig (25-29)
  • DefaultToolExecutionTimeout (33-33)
  • CodeModeBindingLevel (37-37)
  • CodeModeBindingLevelServer (40-40)
core/schemas/chatcompletions.go (6)
  • ChatTool (261-266)
  • ChatParameters (156-186)
  • ChatToolFunction (269-274)
  • ChatAssistantMessageToolCall (719-725)
  • BifrostChatRequest (12-19)
  • BifrostChatResponse (27-42)
core/mcp/mcp.go (1)
  • MCPLogPrefix (23-23)
core/mcp.go (1)
  • MCPLogPrefix (31-31)
core/schemas/bifrost.go (1)
  • BifrostError (482-491)
core/schemas/responses.go (1)
  • ResponsesParameters (87-114)
core/mcp/agent.go (1)
  • ExecuteAgentForChatRequest (30-64)
core/bifrost.go (3)
core/mcp/mcp.go (2)
  • MCPManager (41-49)
  • NewMCPManager (68-94)
core/mcp.go (2)
  • MCPManager (49-56)
  • MCPClient (59-66)
core/schemas/mcp.go (4)
  • MCPConfig (14-23)
  • MCPClient (116-120)
  • MCPClientConfig (45-67)
  • MCPToolManagerConfig (25-29)
🪛 GitHub Actions: PR #1153
framework/configstore/rdb.go

[error] 742-742: unknown field IsCodeModeClient in struct literal of type schemas.MCPClientConfig

🔇 Additional comments (48)
plugins/governance/go.mod (1)

10-10: ✓ Dependency additions look good — previous duplicate issue resolved.

The critical duplicate dependency issue from the previous review has been fixed. The new dependency additions (testify and its transitive dependencies go-spew and go-difflib) are standard testing utilities appropriate for the governance plugin, with versions properly pinned to specific commits.

Also applies to: 42-42, 85-85

core/schemas/bifrost.go (2)

120-143: Context key additions and “DO NOT SET” comments look consistent

The new BifrostMCPAgentOriginalRequestID context key and the strengthened comments for internal-only keys align with the existing context-key conventions and should help prevent misuse from plugins or callers.


476-491: No action needed. BifrostError.ExtraFields already uses json:"extra_fields" (without omitempty), consistent with all other response struct types throughout the codebase (BifrostChatResponse, BifrostTextCompletionResponse, BifrostResponsesResponse, etc.). The test in plugins/governance/e2e_test.go (lines 1406–1407) directly accesses extra_fields from the error JSON, confirming the field is expected to be always present. This is not a behavioral change—it's the established pattern.

framework/configstore/migrations.go (2)

1092-1158: New MCP client columns migrations look safe

The migrations that add tools_to_auto_execute_json and is_code_mode_client to the config_mcp_clients table properly:

  • Check for column existence before adding
  • Initialize existing rows with sensible defaults ('[]' for tools_to_auto_execute_json, false for is_code_mode_client)
  • Provide rollback support
  • Follow idempotent patterns suitable for both SQLite and PostgreSQL

1351-1482: MCP client name normalization is necessary and correctly implemented

The migration addresses a backward-compatibility issue: current validation (validateMCPClientName) rejects names with hyphens and leading digits, but existing configs may have them. The normalization logic:

  • Converts hyphens/spaces to underscores and strips leading digits, making old configs valid
  • Leaves already-safe names unchanged (no unnecessary modifications)
  • Handles collisions robustly using an in-memory assignedNames map plus DB checks
  • Applies numeric suffixes (_2, _3, …) only to normalized names that collide

The in-memory map combined with database lookups ensures correctness during the transaction, preventing race conditions and accurately detecting duplicates across the migration. The GetMCPClientByName lookups and tool execution paths that depend on the name field will work correctly after normalization.

core/schemas/mux.go (1)

116-151: Tool-call conversion from Responses → Chat is correct; arguments defaulting is sensible

ResponsesToolMessage.ToChatAssistantMessageToolCall correctly maps CallID, Name, and Arguments into a ChatAssistantMessageToolCall, and defaulting Arguments to "{}" ensures JSON unmarshaling won't blow up when the field is absent. The method is used exclusively for generic function-style tool execution in ExecuteResponsesTool(), with comprehensive test coverage including round-trip conversion validation.

framework/configstore/tables/mcp.go (3)

13-35: LGTM! New MCP client fields properly integrated.

The new IsCodeModeClient and ToolsToAutoExecute/ToolsToAutoExecuteJSON fields follow the established patterns for JSON serialization in this table. The struct correctly separates persisted JSON columns from runtime fields using the gorm:"-" tag.


63-71: Serialization logic for ToolsToAutoExecute is correct.

The implementation properly handles both nil and non-nil cases, defaulting to "[]" when nil to ensure valid JSON is always stored. This matches the existing ToolsToExecute pattern.


102-106: Deserialization logic correctly mirrors the serialization.

The AfterFind hook properly deserializes ToolsToAutoExecuteJSON back to the runtime slice, consistent with the existing ToolsToExecuteJSON handling.

core/mcp/clientmanager.go (6)

21-36: LGTM! GetClients creates proper defensive copies.

Using maps.Copy to create a shallow copy of the ToolMap prevents data races when callers iterate over returned clients while the manager modifies internal state.


47-64: ReconnectClient correctly handles lock/unlock pattern for external calls.

The method properly releases the lock before calling connectToMCPClient (which handles its own locking), avoiding potential deadlocks. The config is copied before releasing the lock.


75-110: AddClient implements proper placeholder pattern with cleanup on failure.

The two-phase approach (create placeholder → unlock → connect → cleanup on failure) prevents holding locks during potentially slow network operations while ensuring consistency.


134-165: removeClientUnsafe properly cleans up all resources.

The method correctly stops health monitoring, cancels SSE contexts, closes connections, and clears the tool map before removing from the client map. Logging errors without failing allows cleanup to continue.


231-288: RegisterTool validates tool naming and prevents overwrites.

The validation for hyphens, spaces, and numeric prefixes enforces naming conventions. Checking for existing tools prevents silent overwrites which could cause confusion.


296-452: connectToMCPClient uses proper two-phase locking pattern.

The method correctly:

  1. Acquires lock to initialize/validate client entry
  2. Releases lock for heavy I/O operations (transport creation, initialization)
  3. Re-acquires lock to update final state
  4. Handles cleanup if client was removed during connection setup
  5. Registers SSE connection-lost callback for state updates
framework/configstore/rdb.go (6)

45-47: LGTM! New MCP tool manager config fields properly persisted.

The MCPAgentDepth, MCPToolExecutionTimeout, and MCPCodeModeBindingLevel fields are correctly mapped in both UpdateClientConfig and GetClientConfig.


739-773: GetMCPConfig properly constructs ToolManagerConfig with defaults.

The method correctly handles the case where no client config exists by providing sensible defaults (30s timeout, depth 10). The duration conversion from seconds stored in DB to time.Duration is correct.


803-813: CreateMCPClientConfig correctly handles new fields.

The IsCodeModeClient and ToolsToAutoExecute fields are properly included in the database record creation.


846-854: UpdateMCPClientConfig uses Select to handle zero-value boolean.

Good use of Select() to explicitly include is_code_mode_client even when false, since GORM's Updates() skips zero values by default.


1397-1470: DeleteVirtualKey implements thorough cascading cleanup.

The method properly:

  1. Collects budget/rate limit IDs from provider configs
  2. Deletes join table entries
  3. Deletes provider configs before their associated budgets/rate limits
  4. Deletes MCP configs
  5. Deletes the virtual key itself
  6. Finally deletes the key's own budget/rate limit

This ordering respects foreign key dependencies.


2180-2205: RetryOnNotFound is a useful resilience pattern.

The implementation correctly:

  • Respects context cancellation
  • Only retries on ErrNotFound/gorm.ErrRecordNotFound
  • Avoids sleeping after the last attempt
  • Returns the last error on exhaustion
core/mcp/agent_test.go (6)

11-47: MockLLMCaller properly implements call tracking for tests.

The mock correctly maintains separate counters for chat and responses APIs, and returns an error when responses are exhausted. This enables tests to verify expected call counts.


75-149: TestHasToolCallsForChatResponse covers key edge cases.

The test correctly validates:

  • Nil response handling
  • Empty choices
  • tool_calls finish reason detection
  • Tool calls presence in message
  • stop finish reason should return false even with tool calls present

201-248: TestExecuteAgentForChatRequest validates no-op path.

The test confirms that responses without tool calls are returned unchanged without additional LLM calls, which is the expected behavior.


250-318: TestExecuteAgentForChatRequest_WithNonAutoExecutableTools validates early return.

Good test case verifying that non-auto-executable tools don't trigger additional LLM calls and are returned to the caller for handling.


487-538: Conversion tests verify round-trip integrity.

The tests for ToChatAssistantMessageToolCall and ToResponsesToolMessage properly validate that data survives format conversions, including edge cases like nil arguments defaulting to "{}".


613-681: TestExecuteAgentForResponsesRequest_ConversionRoundTrip is thorough.

This end-to-end conversion test validates that tool call IDs, names, arguments, and result outputs are preserved through the full Responses → Chat → Responses conversion cycle.

core/mcp/health_monitor.go (5)

12-17: Health check constants are reasonable defaults.

10-second interval, 5-second timeout, and 5 consecutive failures before marking unhealthy provide a good balance between responsiveness and tolerance for transient issues.


55-70: Start method properly initializes monitoring.

The method correctly checks if already monitoring, creates a cancellable context, initializes the ticker, and launches the background goroutine.


103-139: performHealthCheck has correct failure counting logic.

The method properly:

  1. Stops monitoring if client no longer exists
  2. Marks disconnected immediately if connection is nil
  3. Uses timeout context for ping
  4. Only marks disconnected after reaching max consecutive failures
  5. Resets failure count on successful ping

141-161: updateClientState avoids redundant state change logs.

Good optimization to only log when state actually changes, reducing log noise during normal operation.


197-220: HealthMonitorManager correctly handles monitor lifecycle.

StartMonitoring properly stops any existing monitor before starting a new one for the same client, preventing resource leaks. StopMonitoring cleans up and removes from the map.

core/schemas/mcp.go (4)

14-23: MCPConfig properly extended with tool manager config and request ID hook.

The ToolManagerConfig field enables runtime configuration of tool execution parameters. The FetchNewRequestIDFunc hook allows tracking individual tool call results in agent mode, which is useful for observability and plugin integration.


25-42: MCPToolManagerConfig and CodeModeBindingLevel provide clear configuration options.

The struct captures essential tool execution parameters. The CodeModeBindingLevel enum with "server" and "tool" values provides flexibility in how tools are exposed in the VFS for code execution.


44-67: MCPClientConfig enhancements are well-documented.

The added IsCodeModeClient, ToolsToAutoExecute, and updated InProcessServer fields extend the client configuration capabilities. The semantics comments for ToolsToAutoExecute clearly explain the wildcard, empty, and specific tool behaviors.


94-111: MCPClientState and MCPClientConnectionInfo provide complete client tracking.

These types properly capture:

  • Connection state and active client reference
  • Tool filtering configuration
  • Available tools mapping
  • Connection metadata for management UI
  • Cancel function for SSE cleanup (correctly excluded from JSON)
core/mcp/mcp.go (5)

18-32: Constants provide clear configuration and context key definitions.

The versioning, naming, and timeout constants establish consistent behavior. The context keys with documented semantics enable per-request filtering of clients and tools.


38-49: MCPManager struct properly encapsulates MCP integration state.

The struct includes all necessary components: context, tools manager, local server, client map with mutex protection, and health monitor manager. The separation of concerns is clear.


68-94: NewMCPManager properly initializes with defaults and error handling.

The constructor:

  1. Sets logger for the package
  2. Applies default tool manager config if not provided
  3. Initializes client map and health monitor manager
  4. Creates tools manager with config
  5. Adds configured clients (logging warnings on failure without blocking)

180-201: CheckAndExecuteAgentForChatRequest validates input and delegates appropriately.

The method correctly:

  1. Validates that makeReq is provided
  2. Short-circuits if no tool calls are present
  3. Delegates to the tools manager for actual agent execution

261-288: Cleanup method is thorough and handles all resource types.

The cleanup properly:

  1. Stops all health monitors first
  2. Acquires lock for client cleanup
  3. Removes all clients (which handles SSE cancel, connection close)
  4. Clears client map
  5. Clears local server reference

The order ensures health monitors don't try to access clients being cleaned up.

core/bifrost.go (1)

183-188: LGTM - Thread-safe MCP initialization.

The sync.Once pattern correctly ensures the MCP manager is initialized only once, even if called concurrently. The initialization is properly guarded and logs success.

core/mcp/agent_adaptors.go (1)

25-64: Well-designed adapter interface for API-agnostic agent execution.

The agentAPIAdapter interface cleanly abstracts the differences between Chat and Responses APIs, enabling the agent loop to remain format-neutral. The documentation clearly explains the conversion boundaries and design rationale.

core/mcp/toolmanager.go (5)

386-398: Good timeout handling for tool execution.

The code correctly creates a timeout context for tool execution and distinguishes between timeout errors (context.DeadlineExceeded) and other execution errors. The defer pattern ensures the context is cancelled properly.


84-132: LGTM - Tool aggregation with proper deduplication.

GetAvailableTools correctly aggregates tools from all clients, deduplicates by name, and conditionally adds code-mode tools when a code-mode client is present. The seenToolNames map ensures no duplicate tools are included.


338-356: Tool permission validation ensures security.

The code correctly validates that the requested tool is available and permitted before execution. The nested loop could be optimized to O(1) with a pre-built map, but this is acceptable for typical tool counts.


481-497: Agent execution wires dependencies correctly.

The ExecuteAgentForChatRequest method properly delegates to the shared agent execution logic, passing the manager's configuration (maxAgentDepth), request ID function, tool executor, and client manager. The pointer-to-context pattern was addressed in previous commits.


530-557: Thread-safe configuration management.

The UpdateConfig method correctly uses atomic operations for thread-safe updates, and GetCodeModeBindingLevel includes a nil check for robustness. The logging provides good observability for configuration changes.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (1)
framework/configstore/rdb.go (1)

705-777: Handle zero/empty MCP ToolManagerConfig values to avoid bad defaults

GetMCPConfig currently reads MCPToolExecutionTimeout, MCPAgentDepth, and MCPCodeModeBindingLevel directly from TableClientConfig and uses them as‑is in MCPToolManagerConfig. For existing DB rows created before these columns existed, these fields can be zero/empty even though the intended logical defaults are 10, 30s, and "server". That can lead to:

  • ToolExecutionTimeout of 0s (effectively immediate timeout).
  • MaxAgentDepth of 0.
  • An empty CodeModeBindingLevel, even in the “no client config” fallback path where you currently don’t set it at all.

A previous review already called out this risk when the GORM defaults were added to TableClientConfig without a backfill. The migrations in this PR still don’t backfill existing rows, so you need runtime guards here.

A minimal fix is to apply sane defaults whenever these fields are unset, in both the “no client config row” and “row exists but values are zero/empty” cases:

Proposed fix: apply runtime MCP defaults in GetMCPConfig
@@ func (s *RDBConfigStore) GetMCPConfig(ctx context.Context) (*schemas.MCPConfig, error) {
-	var clientConfig tables.TableClientConfig
-	if err := s.db.WithContext(ctx).First(&clientConfig).Error; err != nil {
-		if errors.Is(err, gorm.ErrRecordNotFound) {
-			// Return MCP config with default ToolManagerConfig if no client config exists
-			// This will never happen, but just in case.
-			return &schemas.MCPConfig{
-				ClientConfigs: clientConfigs,
-				ToolManagerConfig: &schemas.MCPToolManagerConfig{
-					ToolExecutionTimeout: 30 * time.Second, // default from TableClientConfig
-					MaxAgentDepth:        10,               // default from TableClientConfig
-				},
-			}, nil
-		}
-		return nil, err
-	}
-	toolManagerConfig := schemas.MCPToolManagerConfig{
-		ToolExecutionTimeout: time.Duration(clientConfig.MCPToolExecutionTimeout) * time.Second,
-		MaxAgentDepth:        clientConfig.MCPAgentDepth,
-		CodeModeBindingLevel: schemas.CodeModeBindingLevel(clientConfig.MCPCodeModeBindingLevel),
-	}
+	const (
+		defaultMCPAgentDepth           = 10
+		defaultMCPToolExecutionTimeout = 30 // seconds
+		defaultMCPCodeModeBindingLevel = "server"
+	)
+
+	var clientConfig tables.TableClientConfig
+	if err := s.db.WithContext(ctx).First(&clientConfig).Error; err != nil {
+		if errors.Is(err, gorm.ErrRecordNotFound) {
+			// No client config row yet: fall back to hard-coded MCP defaults.
+			return &schemas.MCPConfig{
+				ClientConfigs: clientConfigs,
+				ToolManagerConfig: &schemas.MCPToolManagerConfig{
+					ToolExecutionTimeout: time.Duration(defaultMCPToolExecutionTimeout) * time.Second,
+					MaxAgentDepth:        defaultMCPAgentDepth,
+					CodeModeBindingLevel: schemas.CodeModeBindingLevel(defaultMCPCodeModeBindingLevel),
+				},
+			}, nil
+		}
+		return nil, err
+	}
+
+	// Normalize zero/empty values from existing rows to the same defaults.
+	timeout := clientConfig.MCPToolExecutionTimeout
+	if timeout <= 0 {
+		timeout = defaultMCPToolExecutionTimeout
+	}
+	depth := clientConfig.MCPAgentDepth
+	if depth <= 0 {
+		depth = defaultMCPAgentDepth
+	}
+	binding := clientConfig.MCPCodeModeBindingLevel
+	if binding == "" {
+		binding = defaultMCPCodeModeBindingLevel
+	}
+
+	toolManagerConfig := schemas.MCPToolManagerConfig{
+		ToolExecutionTimeout: time.Duration(timeout) * time.Second,
+		MaxAgentDepth:        depth,
+		CodeModeBindingLevel: schemas.CodeModeBindingLevel(binding),
+	}
@@
-	return &schemas.MCPConfig{
-		ClientConfigs:     clientConfigs,
-		ToolManagerConfig: &toolManagerConfig,
-	}, nil
+	return &schemas.MCPConfig{
+		ClientConfigs:     clientConfigs,
+		ToolManagerConfig: &toolManagerConfig,
+	}, nil
 }

This keeps behavior consistent across fresh installs and upgraded databases without relying on GORM’s column defaults to backfill existing rows.

🧹 Nitpick comments (5)
plugins/governance/changelog.md (1)

1-3: Changelog entries look appropriate for governance plugin changes.

The three entries clearly document the key improvements: extracting the governance store into an interface for extensibility, extending rate limit handling, and adding e2e tests. The format and conventions align with standard changelog practices.

However, I notice the PR description mentions "MCP gateway with code mode" as a primary objective. If there are governance-related changes to support MCP tooling execution, rate limits, or budget enforcement for MCP operations, consider clarifying entry 2 slightly—e.g., "extended rate limit handling for MCP tool execution" or similar—to make the scope of that refactor more transparent.

framework/configstore/clientconfig.go (1)

8-8: ClientConfig MCP fields and hash integration look correct (consider centralizing defaults)

Including MCPAgentDepth, MCPToolExecutionTimeout, and MCPCodeModeBindingLevel in both the struct and GenerateClientConfigHash (with explicit fallback encodings) gives deterministic change detection and matches the DB schema. As an optional cleanup, you may want to centralize the MCP defaults (10, 30, "server") into shared constants so the struct tags, hash logic, migrations, and GetMCPConfig all refer to the same source of truth.

Also applies to: 38-55, 105-121

framework/configstore/migrations.go (1)

6-10: MCP migrations and client-name normalization are sound; keep defaults centralized

The new MCP-related migrations and normalization utilities look consistent:

  • migrationAddToolsToAutoExecuteJSONColumn and migrationAddIsCodeModeClientColumn add columns on config_mcp_clients and sensibly backfill existing rows ('[]' and false respectively).
  • migrationAddMCPAgentDepthAndMCPToolExecutionTimeoutColumns and migrationAddMCPCodeModeBindingLevelColumn follow the existing pattern of adding columns without backfill; with the runtime defaults wired into GetMCPConfig this is acceptable.
  • normalizeMCPClientName plus migrationNormalizeMCPClientNames correctly:
    • replace hyphens/spaces with underscores,
    • strip leading digits,
    • and ensure uniqueness via numeric suffixes while logging each transformation, using an in-memory assignedNames map plus DB lookups to avoid collisions.

The ordering in triggerMigrations (adding MCP columns before normalization and before config-hash migrations) is also coherent. As with the runtime code, consider centralizing the MCP defaults (10, 30, "server") to avoid divergence between tags, migrations, and GetMCPConfig, but the current behavior is functionally correct.

Also applies to: 83-86, 98-106, 1286-1352, 1354-1485

plugins/governance/main.go (1)

40-47: Governance plugin DI refactor and BaseGovernancePlugin interface look solid

The introduction of BaseGovernancePlugin, the switch to a GovernanceStore interface field, and the split between Init (builds a LocalGovernanceStore) and InitFromStore (injects a custom store) all cleanly improve testability and separation of concerns:

  • GovernancePlugin now clearly separates data access (store), decision engine (resolver), and business logic (tracker).
  • Both constructors share the same wiring pattern (resolver → tracker → optional startup resets) and propagate configStore, modelCatalog, and inMemoryStore consistently.
  • GetGovernanceStore now exposes the interface type directly, which matches the rest of the plugin ecosystem.

If you ever need tighter control over background work cancellation, you might consider passing the plugin’s derived ctx into NewUsageTracker rather than the outer ctx, but that’s an optional refinement rather than a blocker.

Also applies to: 49-69, 71-164, 166-224, 673-675

framework/configstore/rdb.go (1)

1399-1473: Add DB-level constraints to enforce 1:1 budget/rate-limit invariants

The cascade deletion logic in DeleteVirtualKey, DeleteVirtualKeyProviderConfig, DeleteTeam, and DeleteCustomer correctly assumes that each budget and rate limit is owned by exactly one parent entity. However, this invariant is not enforced at the database level—multiple parents can reference the same BudgetID or RateLimitID. If that occurs (whether by accident or future changes), these deletions will remove budgets/rate limits while other rows still reference them, causing dangling foreign keys and inconsistent governance state.

Consider:

  • Adding UNIQUE constraints at the database schema level to prevent the same budget/rate-limit ID from being assigned to more than one owner, or
  • Adding assertions/tests that verify no budget or rate limit is referenced by more than one parent before deletion

This would harden the cascades against accidental future changes that might introduce sharing.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bca600b and 0a6427c.

📒 Files selected for processing (22)
  • core/schemas/mux.go
  • framework/configstore/clientconfig.go
  • framework/configstore/migrations.go
  • framework/configstore/rdb.go
  • framework/configstore/store.go
  • framework/configstore/tables/clientconfig.go
  • plugins/governance/changelog.md
  • plugins/governance/main.go
  • transports/bifrost-http/handlers/config.go
  • transports/bifrost-http/handlers/inference.go
  • transports/bifrost-http/handlers/mcp.go
  • transports/bifrost-http/lib/config.go
  • transports/bifrost-http/server/server.go
  • transports/changelog.md
  • transports/config.schema.json
  • ui/app/workspace/config/views/clientSettingsView.tsx
  • ui/app/workspace/config/views/securityView.tsx
  • ui/app/workspace/logs/views/columns.tsx
  • ui/components/sidebar.tsx
  • ui/lib/constants/logs.ts
  • ui/lib/types/config.ts
  • ui/lib/types/schemas.ts
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • framework/configstore/clientconfig.go
  • framework/configstore/store.go
  • framework/configstore/tables/clientconfig.go
  • framework/configstore/rdb.go
  • framework/configstore/migrations.go
  • core/schemas/mux.go
  • plugins/governance/main.go
  • plugins/governance/changelog.md
🧠 Learnings (2)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

  • framework/configstore/clientconfig.go
  • framework/configstore/store.go
  • framework/configstore/tables/clientconfig.go
  • framework/configstore/rdb.go
  • framework/configstore/migrations.go
  • core/schemas/mux.go
  • plugins/governance/main.go
📚 Learning: 2025-12-22T10:50:40.990Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1154
File: plugins/governance/store.go:1165-1186
Timestamp: 2025-12-22T10:50:40.990Z
Learning: In the Bifrost governance plugin, budgets and rate limits have 1:1 relationships with their parent entities (virtual keys, teams, customers). Do not assume sharing; ensure cascade deletion logic only deletes budgets/rate limits when there are no shared references. Enforce invariants in code and add tests to verify no cross-entity sharing and that cascade deletes only remove the specific child of the parent. If a counterexample arises, adjust data model or add guards.

Applied to files:

  • plugins/governance/main.go
🧬 Code graph analysis (6)
framework/configstore/clientconfig.go (2)
framework/configstore/tables/config.go (1)
  • GlobalHeaderFilterConfig (46-49)
ui/lib/types/config.ts (1)
  • GlobalHeaderFilterConfig (304-307)
framework/configstore/store.go (1)
framework/configstore/tables/ratelimit.go (2)
  • TableRateLimit (11-36)
  • TableRateLimit (39-39)
framework/configstore/rdb.go (6)
core/schemas/mcp.go (3)
  • MCPConnectionType (70-70)
  • MCPConfig (14-23)
  • CodeModeBindingLevel (37-37)
framework/configstore/tables/clientconfig.go (2)
  • TableClientConfig (11-43)
  • TableClientConfig (46-46)
framework/configstore/tables/mcp.go (2)
  • TableMCPClient (12-36)
  • TableMCPClient (39-39)
framework/configstore/tables/virtualkey.go (6)
  • TableVirtualKey (95-122)
  • TableVirtualKey (125-125)
  • TableVirtualKeyProviderConfig (23-36)
  • TableVirtualKeyProviderConfig (39-41)
  • TableVirtualKeyMCPConfig (81-87)
  • TableVirtualKeyMCPConfig (90-92)
framework/configstore/tables/ratelimit.go (2)
  • TableRateLimit (11-36)
  • TableRateLimit (39-39)
framework/configstore/tables/team.go (2)
  • TableTeam (12-38)
  • TableTeam (41-41)
framework/configstore/migrations.go (2)
framework/configstore/tables/mcp.go (2)
  • TableMCPClient (12-36)
  • TableMCPClient (39-39)
framework/configstore/tables/clientconfig.go (2)
  • TableClientConfig (11-43)
  • TableClientConfig (46-46)
core/schemas/mux.go (3)
core/schemas/responses.go (6)
  • ResponsesToolMessage (473-493)
  • ResponsesMessage (321-334)
  • ResponsesMessageTypeFunctionCallOutput (303-303)
  • ResponsesToolMessageOutputStruct (542-546)
  • ResponsesMessageContentBlock (406-421)
  • ResponsesMessageContent (346-351)
core/schemas/chatcompletions.go (4)
  • ChatAssistantMessageToolCall (720-725)
  • ChatAssistantMessageToolCallFunction (728-731)
  • ChatMessage (487-496)
  • ChatToolMessage (654-656)
core/schemas/utils.go (1)
  • Ptr (16-18)
plugins/governance/main.go (5)
core/schemas/bifrost.go (1)
  • BifrostRequest (167-188)
core/schemas/plugin.go (1)
  • PluginShortCircuit (6-10)
plugins/governance/store.go (1)
  • GovernanceStore (50-72)
plugins/governance/resolver.go (2)
  • BudgetResolver (64-67)
  • NewBudgetResolver (70-75)
plugins/governance/tracker.go (2)
  • UsageTracker (33-45)
  • NewUsageTracker (52-66)
🪛 GitHub Actions: PR #1153
framework/configstore/rdb.go

[error] 744-744: unknown field IsCodeModeClient in struct literal of type schemas.MCPClientConfig

🔇 Additional comments (7)
core/schemas/mux.go (3)

130-150: LGTM - Clean conversion with sensible defaults.

The method handles nil receiver appropriately and provides a sensible default of "{}" for arguments to ensure valid JSON. The implementation correctly preserves CallID and Name as pointer fields, allowing nil values to propagate, which aligns with the target struct's optional field semantics.


418-431: LGTM - Correct role handling for function_call_output.

The change correctly omits the Role field for function_call_output messages, which aligns with the Responses API specification where tool call outputs don't include a role field.


446-518: LGTM - Proper content routing for function_call_output messages.

The logic correctly bypasses Content population for function_call_output messages, routing the content to the ResponsesToolMessage.Output field instead. The content block conversion for other message types is comprehensive, handling Text, Image, File, and Audio types with proper field mapping.

framework/configstore/store.go (1)

7-7: ConfigStore interface extensions look consistent

The added GetRateLimits and RetryOnNotFound methods, plus the time import, line up with the RDB implementation and overall configstore usage. No interface-level issues from this change; the main behavior concerns are in the RetryOnNotFound implementation in rdb.go (commented there).

Also applies to: 103-105, 151-152

framework/configstore/tables/clientconfig.go (1)

19-27: MCP fields on TableClientConfig are well-shaped

The three MCP-related columns (MCPAgentDepth, MCPToolExecutionTimeout, MCPCodeModeBindingLevel) have appropriate GORM defaults and JSON tags and match the runtime ClientConfig representation. No table‑schema issues from this change.

framework/configstore/rdb.go (2)

31-57: MCP fields are correctly persisted and re-hydrated

UpdateClientConfig and GetClientConfig now round‑trip the MCP fields (MCPAgentDepth, MCPToolExecutionTimeout, MCPCodeModeBindingLevel) between ClientConfig and TableClientConfig as expected, so the basic plumbing for these settings looks sound. The remaining behavioral nuance for defaults is addressed in the GetMCPConfig comment.

Also applies to: 181-207


1828-1835: GetRateLimits implementation is straightforward and symmetric with other getters

GetRateLimits mirrors the existing GetBudgets/GetModelPrices patterns (simple Find with ErrRecordNotFound treated as error) and correctly returns an empty slice vs. nil error behavior consistent with the rest of the store.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In @core/internal/testutil/file_base64.go:
- Line 207: Replace the direct call to FileInputResponsesRetryConfig() with the
standard pattern: call GetTestRetryConfigForScenario("FileInput", testConfig) to
obtain the generic retry config, then construct a ResponsesRetryConfig value
with Conditions set to an empty slice and copy MaxAttempts, BaseDelay, MaxDelay,
OnRetry, and OnFinalFail from the generic config; essentially replace the
FileInputResponsesRetryConfig() usage with this two-step creation so it matches
the Chat test pattern.
🧹 Nitpick comments (2)
core/internal/testutil/file_base64.go (1)

81-233: Consider extracting shared test setup logic.

The Chat and Responses tests have significant code duplication:

  • Retry context setup (lines 100-116 vs 180-196) differs only in ScenarioName
  • Expectations setup (lines 119-125 vs 199-205) is identical
  • Validation pattern (lines 155-156 vs 228-229) is identical

While test code duplication is acceptable for clarity, extracting shared setup into helper functions (e.g., setupFileBase64RetryContext, setupFileBase64Expectations) would improve maintainability if the validation logic or expectations evolve.

core/internal/testutil/test_retry_conditions.go (1)

982-1150: Optional: Consider extracting common phrase lists to reduce duplication.

While the current implementation is correct and well-structured, the phrase lists are duplicated between Chat Completions and Responses API conditions (e.g., fileProcessingFailurePhrases at lines 1020-1040 vs lines 390-410, genericPhrases at lines 1071-1082 vs lines 446-457).

Potential refactoring to reduce duplication

Extract shared phrase lists to package-level variables:

+// Shared phrase lists for retry conditions
+var (
+	fileProcessingFailurePhrases = []string{
+		"i can't read",
+		"i cannot read",
+		// ... rest of phrases
+	}
+	
+	genericAIPhrases = []string{
+		"as an ai",
+		"as a language model",
+		// ... rest of phrases
+	}
+)

 func (c *FileNotProcessedCondition) ShouldRetry(...) (bool, string) {
-	fileProcessingFailurePhrases := []string{
-		"i can't read",
-		...
-	}
	
	for _, phrase := range fileProcessingFailurePhrases {

This would ensure consistency when updating phrase lists in the future.

Note: This is a minor optimization and can be deferred. The current implementation is maintainable as-is.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 217e276 and c94be44.

⛔ Files ignored due to path filters (10)
  • examples/plugins/hello-world/go.sum is excluded by !**/*.sum
  • framework/go.sum is excluded by !**/*.sum
  • plugins/governance/go.sum is excluded by !**/*.sum
  • plugins/jsonparser/go.sum is excluded by !**/*.sum
  • plugins/logging/go.sum is excluded by !**/*.sum
  • plugins/maxim/go.sum is excluded by !**/*.sum
  • plugins/mocker/go.sum is excluded by !**/*.sum
  • plugins/otel/go.sum is excluded by !**/*.sum
  • plugins/semanticcache/go.sum is excluded by !**/*.sum
  • plugins/telemetry/go.sum is excluded by !**/*.sum
📒 Files selected for processing (14)
  • core/internal/testutil/file_base64.go
  • core/internal/testutil/file_url.go
  • core/internal/testutil/test_retry_conditions.go
  • core/internal/testutil/test_retry_framework.go
  • examples/plugins/hello-world/go.mod
  • framework/go.mod
  • plugins/governance/go.mod
  • plugins/jsonparser/go.mod
  • plugins/logging/go.mod
  • plugins/maxim/go.mod
  • plugins/mocker/go.mod
  • plugins/otel/go.mod
  • plugins/semanticcache/go.mod
  • plugins/telemetry/go.mod
🚧 Files skipped from review as they are similar to previous changes (1)
  • core/internal/testutil/file_url.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • core/internal/testutil/test_retry_framework.go
  • core/internal/testutil/file_base64.go
  • core/internal/testutil/test_retry_conditions.go
🧠 Learnings (3)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

  • core/internal/testutil/test_retry_framework.go
  • core/internal/testutil/file_base64.go
  • core/internal/testutil/test_retry_conditions.go
📚 Learning: 2025-12-29T11:54:55.836Z
Learnt from: akshaydeo
Repo: maximhq/bifrost PR: 1153
File: framework/configstore/rdb.go:2221-2246
Timestamp: 2025-12-29T11:54:55.836Z
Learning: In Go reviews, do not flag range-over-int patterns like for i := range n as compile-time errors, assuming Go 1.22+ semantics. Only flag actual range-capable values (slices, arrays, maps, channels, strings) and other compile-time issues. This applies to all Go files across the repository.

Applied to files:

  • core/internal/testutil/test_retry_framework.go
  • core/internal/testutil/file_base64.go
  • core/internal/testutil/test_retry_conditions.go
📚 Learning: 2025-12-19T08:29:20.286Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1095
File: core/internal/testutil/count_tokens.go:30-67
Timestamp: 2025-12-19T08:29:20.286Z
Learning: In core/internal/testutil test files, enforce using GetTestRetryConfigForScenario() to obtain a generic retry config, then construct a typed retry config (e.g., CountTokensRetryConfig, EmbeddingRetryConfig, TranscriptionRetryConfig) with an empty Conditions slice. Copy only MaxAttempts, BaseDelay, MaxDelay, OnRetry, and OnFinalFail from the generic config. This convention should be consistently applied across all test files in this directory.

Applied to files:

  • core/internal/testutil/test_retry_framework.go
  • core/internal/testutil/file_base64.go
  • core/internal/testutil/test_retry_conditions.go
🧬 Code graph analysis (3)
core/internal/testutil/test_retry_framework.go (1)
core/internal/testutil/test_retry_conditions.go (4)
  • ResponsesEmptyCondition (989-989)
  • ResponsesFileNotProcessedCondition (1010-1010)
  • ResponsesGenericResponseCondition (1061-1061)
  • ResponsesContentValidationCondition (1108-1108)
core/internal/testutil/file_base64.go (3)
core/internal/testutil/test_retry_framework.go (6)
  • GetTestRetryConfigForScenario (1186-1222)
  • TestRetryContext (174-179)
  • ChatRetryConfig (192-199)
  • ChatRetryCondition (126-129)
  • WithChatTestRetry (290-440)
  • WithResponsesTestRetry (443-593)
core/internal/testutil/utils.go (3)
  • GetErrorMessage (646-679)
  • GetChatContent (368-395)
  • GetResponsesContent (414-459)
core/internal/testutil/validation_presets.go (2)
  • GetExpectationsForScenario (256-350)
  • ModifyExpectationsForProvider (357-404)
core/internal/testutil/test_retry_conditions.go (3)
core/schemas/responses.go (1)
  • BifrostResponsesResponse (43-83)
core/internal/testutil/test_retry_framework.go (1)
  • TestRetryContext (174-179)
core/internal/testutil/utils.go (1)
  • GetResponsesContent (414-459)
🔇 Additional comments (9)
core/internal/testutil/file_base64.go (4)

13-14: LGTM!

Clear and helpful documentation for the test constant.


67-79: LGTM!

Clean refactoring that delegates to per-API test functions while preserving the public API surface.


81-160: LGTM!

The Chat Completions test correctly follows the established testutil retry config pattern: obtaining a generic retry config via GetTestRetryConfigForScenario, then constructing a ChatRetryConfig with an empty Conditions slice and copying only the appropriate fields.

Based on learnings, this is the correct pattern for testutil tests.


235-262: LGTM!

The validation helper provides comprehensive checks for PDF document processing with clear error messages and appropriate fallback logic.

core/internal/testutil/test_retry_framework.go (1)

852-868: LGTM! Clean implementation following established patterns.

The new FileInputResponsesRetryConfig() correctly mirrors FileInputRetryConfig() (lines 834-850) for the Responses API, maintaining consistency in retry behavior across both APIs. The configuration is appropriate for file/document processing scenarios with comprehensive condition coverage.

core/internal/testutil/test_retry_conditions.go (4)

983-1007: LGTM! Clean adaptation of empty response detection for Responses API.

The section header and ResponsesEmptyCondition implementation correctly:

  • Documents the Responses API condition group
  • Implements ResponsesRetryCondition interface with proper signature
  • Uses GetResponsesContent() helper for consistent content extraction
  • Simplifies logic appropriately (no tool call checking needed for file input scenarios)

1009-1058: LGTM! Comprehensive file processing failure detection.

The condition correctly identifies file/document processing failures by:

  • Checking for 20 common failure phrases (matching FileNotProcessedCondition)
  • Validating content length (< 15 chars triggers retry)
  • Using consistent content extraction via GetResponsesContent()

The phrase list covers typical failure scenarios including access errors, format issues, and missing file indicators.


1060-1105: LGTM! Effective generic response detection.

The condition appropriately identifies template/generic AI responses through:

  • Priority checking for responses starting with generic phrases
  • Secondary checking for short responses (<30 chars) containing generic phrases
  • 10 common generic AI phrases covering typical non-engaging responses

Logic mirrors GenericResponseCondition for consistency across APIs.


1107-1150: LGTM! Focused content validation for file/document scenarios.

The condition is appropriately simplified for Responses API file input:

  • Scoped to file/document/pdf scenarios only (line 1124)
  • Validates presence of expected keywords from TestMetadata
  • Reasonable content length bounds (20-2000 chars) prevent false positives
  • Triggers retry when ALL expected keywords are missing

The simplification compared to ContentValidationCondition (no vision handling) is intentional and correct for the file input use case.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (5)
core/internal/testutil/file_base64.go (1)

81-233: Consider extracting common setup logic.

Both Chat and Responses test functions share similar retry context and expectations setup code. While the per-API isolation is valuable for clarity, extracting the common setup into a helper function could reduce duplication and improve maintainability.

Example: Shared setup helper
// setupFileBase64Context creates common retry context and expectations for file base64 tests
func setupFileBase64Context(testConfig ComprehensiveTestConfig, scenarioName string) (TestRetryContext, Expectations) {
	retryContext := TestRetryContext{
		ScenarioName: scenarioName,
		ExpectedBehavior: map[string]interface{}{
			"should_process_pdf":     true,
			"should_read_document":   true,
			"should_extract_content": true,
			"document_understanding": true,
		},
		TestMetadata: map[string]interface{}{
			"provider":          testConfig.Provider,
			"model":             testConfig.ChatModel,
			"file_type":         "pdf",
			"encoding":          "base64",
			"test_content":      "Hello World!",
			"expected_keywords": []string{"hello", "world", "pdf", "document"},
		},
	}
	
	expectations := GetExpectationsForScenario("FileInput", testConfig, map[string]interface{}{})
	expectations = ModifyExpectationsForProvider(expectations, testConfig.Provider)
	expectations.ShouldContainKeywords = append(expectations.ShouldContainKeywords, "hello", "world")
	expectations.ShouldNotContainWords = append(expectations.ShouldNotContainWords, []string{
		"cannot process", "invalid format", "decode error",
		"unable to read", "no file", "corrupted", "unsupported",
	}...)
	
	return retryContext, expectations
}

Then use it in both tests to reduce duplication.

core/internal/testutil/test_retry_conditions.go (1)

1107-1146: Harden expected_keywords parsing against []interface{} payloads (prevents silent non-retries). Some callers may set TestMetadata["expected_keywords"] as a generic slice, which will currently bypass validation.

Proposed diff
 func (c *ResponsesContentValidationCondition) ShouldRetry(response *schemas.BifrostResponsesResponse, err *schemas.BifrostError, context TestRetryContext) (bool, string) {
 	if err != nil || response == nil {
 		return false, ""
 	}

 	content := strings.ToLower(GetResponsesContent(response))

 	// Skip if response is too short (other conditions will handle these)
 	if len(content) < 10 {
 		return false, ""
 	}

 	// Check for file/document processing scenarios
 	scenarioName := strings.ToLower(context.ScenarioName)
 	if strings.Contains(scenarioName, "file") || strings.Contains(scenarioName, "document") || strings.Contains(scenarioName, "pdf") {
-		// Check if this test has expected keywords from the TestRetryContext
-		if testMetadata, exists := context.TestMetadata["expected_keywords"]; exists {
-			if expectedKeywords, ok := testMetadata.([]string); ok && len(expectedKeywords) > 0 {
+		// Check if this test has expected keywords from the TestRetryContext
+		var expectedKeywords []string
+		if context.TestMetadata != nil {
+			if v, exists := context.TestMetadata["expected_keywords"]; exists {
+				switch vv := v.(type) {
+				case []string:
+					expectedKeywords = vv
+				case []interface{}:
+					for _, item := range vv {
+						if s, ok := item.(string); ok && strings.TrimSpace(s) != "" {
+							expectedKeywords = append(expectedKeywords, s)
+						}
+					}
+				}
+			}
+		}
+		if len(expectedKeywords) > 0 {
 				// Check if ANY of the expected keywords are present
 				foundExpectedKeyword := false
 				for _, keyword := range expectedKeywords {
 					if strings.Contains(content, strings.ToLower(keyword)) {
 						foundExpectedKeyword = true
 						break
 					}
 				}

 				// If valid response but missing ALL expected keywords, retry
 				if !foundExpectedKeyword && len(content) > 20 && len(content) < 2000 {
 					return true, fmt.Sprintf("response missing expected keywords %v, might include them on retry", expectedKeywords)
 				}
-			}
 		}
 	}

 	return false, ""
 }
core/internal/testutil/test_retry_framework.go (1)

852-868: Consider adding OnFinalFail to FileInputResponsesRetryConfig for parity/debuggability. Today a final failure here may log less context than other configs that set OnFinalFail.

Proposed diff
 func FileInputResponsesRetryConfig() ResponsesRetryConfig {
 	return ResponsesRetryConfig{
 		MaxAttempts: 10,
 		BaseDelay:   2000 * time.Millisecond,
 		MaxDelay:    10 * time.Second,
 		Conditions: []ResponsesRetryCondition{
 			&ResponsesEmptyCondition{},
 			&ResponsesFileNotProcessedCondition{},
 			&ResponsesGenericResponseCondition{},
 			&ResponsesContentValidationCondition{},
 		},
 		OnRetry: func(attempt int, reason string, t *testing.T) {
 			t.Logf("🔄 Retrying file input test (attempt %d): %s", attempt, reason)
 		},
+		OnFinalFail: func(attempts int, finalErr error, t *testing.T) {
+			t.Logf("❌ File input (Responses API) test failed after %d attempts: %v", attempts, finalErr)
+		},
 	}
 }
core/internal/testutil/file_url.go (2)

76-80: Skip logic is correct (Chat Completions only) + consider reusing bfCtx across retry attempts. Creating a new BifrostContext per attempt can be unnecessary overhead if it spawns cancellation watchers; reusing one per subtest is typically enough. Based on learnings, skipping OpenAI/OpenRouter only for Chat Completions is the right behavior.

Proposed diff
 	response, chatError := WithChatTestRetry(t, chatRetryConfig, retryContext, expectations, "FileURL", func() (*schemas.BifrostChatResponse, *schemas.BifrostError) {
-		bfCtx := schemas.NewBifrostContext(ctx, schemas.NoDeadline)
+		// Consider lifting this bfCtx outside the retry closure (once per subtest) if safe.
+		bfCtx := schemas.NewBifrostContext(ctx, schemas.NoDeadline)
 		chatReq := &schemas.BifrostChatRequest{
 			Provider: testConfig.Provider,
 			Model:    testConfig.ChatModel,
 			Input:    chatMessages,
 			Params: &schemas.ChatParameters{
 				MaxCompletionTokens: bifrost.Ptr(500),
 			},
 			Fallbacks: testConfig.Fallbacks,
 		}
 		return client.ChatCompletionRequest(bfCtx, chatReq)
 	})

Also applies to: 127-139


170-214: Responses retry config is now standardized; consider inheriting base timing/callbacks from scenario config (optional). If the suite expects MaxAttempts/BaseDelay/MaxDelay (or OnFinalFail) to be centrally controlled via GetTestRetryConfigForScenario("FileInput", testConfig), you may want to copy those fields into responsesRetryConfig while keeping the Responses-specific Conditions.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 217e276 and c94be44.

⛔ Files ignored due to path filters (10)
  • examples/plugins/hello-world/go.sum is excluded by !**/*.sum
  • framework/go.sum is excluded by !**/*.sum
  • plugins/governance/go.sum is excluded by !**/*.sum
  • plugins/jsonparser/go.sum is excluded by !**/*.sum
  • plugins/logging/go.sum is excluded by !**/*.sum
  • plugins/maxim/go.sum is excluded by !**/*.sum
  • plugins/mocker/go.sum is excluded by !**/*.sum
  • plugins/otel/go.sum is excluded by !**/*.sum
  • plugins/semanticcache/go.sum is excluded by !**/*.sum
  • plugins/telemetry/go.sum is excluded by !**/*.sum
📒 Files selected for processing (14)
  • core/internal/testutil/file_base64.go
  • core/internal/testutil/file_url.go
  • core/internal/testutil/test_retry_conditions.go
  • core/internal/testutil/test_retry_framework.go
  • examples/plugins/hello-world/go.mod
  • framework/go.mod
  • plugins/governance/go.mod
  • plugins/jsonparser/go.mod
  • plugins/logging/go.mod
  • plugins/maxim/go.mod
  • plugins/mocker/go.mod
  • plugins/otel/go.mod
  • plugins/semanticcache/go.mod
  • plugins/telemetry/go.mod
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • core/internal/testutil/test_retry_framework.go
  • core/internal/testutil/test_retry_conditions.go
  • core/internal/testutil/file_base64.go
  • core/internal/testutil/file_url.go
🧠 Learnings (4)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

  • core/internal/testutil/test_retry_framework.go
  • core/internal/testutil/test_retry_conditions.go
  • core/internal/testutil/file_base64.go
  • core/internal/testutil/file_url.go
📚 Learning: 2025-12-29T11:54:55.836Z
Learnt from: akshaydeo
Repo: maximhq/bifrost PR: 1153
File: framework/configstore/rdb.go:2221-2246
Timestamp: 2025-12-29T11:54:55.836Z
Learning: In Go reviews, do not flag range-over-int patterns like for i := range n as compile-time errors, assuming Go 1.22+ semantics. Only flag actual range-capable values (slices, arrays, maps, channels, strings) and other compile-time issues. This applies to all Go files across the repository.

Applied to files:

  • core/internal/testutil/test_retry_framework.go
  • core/internal/testutil/test_retry_conditions.go
  • core/internal/testutil/file_base64.go
  • core/internal/testutil/file_url.go
📚 Learning: 2025-12-19T08:29:20.286Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1095
File: core/internal/testutil/count_tokens.go:30-67
Timestamp: 2025-12-19T08:29:20.286Z
Learning: In core/internal/testutil test files, enforce using GetTestRetryConfigForScenario() to obtain a generic retry config, then construct a typed retry config (e.g., CountTokensRetryConfig, EmbeddingRetryConfig, TranscriptionRetryConfig) with an empty Conditions slice. Copy only MaxAttempts, BaseDelay, MaxDelay, OnRetry, and OnFinalFail from the generic config. This convention should be consistently applied across all test files in this directory.

Applied to files:

  • core/internal/testutil/test_retry_framework.go
  • core/internal/testutil/test_retry_conditions.go
  • core/internal/testutil/file_base64.go
  • core/internal/testutil/file_url.go
📚 Learning: 2026-01-03T13:25:22.368Z
Learnt from: TejasGhatte
Repo: maximhq/bifrost PR: 1191
File: core/internal/testutil/file_url.go:76-80
Timestamp: 2026-01-03T13:25:22.368Z
Learning: In tests for OpenAI and OpenRouter providers, document/implement that file URLs are not supported in the Chat Completions API but are supported in the Responses API. The test file core/internal/testutil/file_url.go should skip file URL tests only for Chat Completions when these providers are used, and not skip for the Responses API. This ensures test coverage remains correct for provider capabilities and avoids false positives/negatives in Chat Completions scenarios.

Applied to files:

  • core/internal/testutil/file_url.go
🧬 Code graph analysis (4)
core/internal/testutil/test_retry_framework.go (1)
core/internal/testutil/test_retry_conditions.go (4)
  • ResponsesEmptyCondition (989-989)
  • ResponsesFileNotProcessedCondition (1010-1010)
  • ResponsesGenericResponseCondition (1061-1061)
  • ResponsesContentValidationCondition (1108-1108)
core/internal/testutil/test_retry_conditions.go (4)
core/schemas/responses.go (1)
  • BifrostResponsesResponse (43-83)
core/schemas/bifrost.go (1)
  • BifrostError (491-500)
core/internal/testutil/test_retry_framework.go (1)
  • TestRetryContext (174-179)
core/internal/testutil/utils.go (1)
  • GetResponsesContent (414-459)
core/internal/testutil/file_base64.go (1)
core/internal/testutil/utils.go (3)
  • GetErrorMessage (646-679)
  • GetChatContent (368-395)
  • GetResponsesContent (414-459)
core/internal/testutil/file_url.go (2)
core/schemas/context.go (2)
  • NewBifrostContext (48-71)
  • NoDeadline (11-11)
core/internal/testutil/test_retry_framework.go (2)
  • FileInputResponsesRetryConfig (853-868)
  • WithResponsesTestRetry (443-593)
🔇 Additional comments (6)
core/internal/testutil/file_base64.go (5)

13-65: LGTM!

The PDF constant and helper functions are well-documented and correctly structured for creating test messages with base64-encoded PDF documents for both Chat and Responses APIs.


67-79: Good separation of concerns.

The orchestrator function cleanly delegates to per-API test functions, improving test isolation and maintainability.


81-160: Excellent adherence to retry config convention.

The Chat Completions test correctly follows the established pattern: obtains a generic retry config via GetTestRetryConfigForScenario() (line 99), then constructs a typed ChatRetryConfig with an empty Conditions slice while copying only the allowed fields (lines 127-134).


235-262: LGTM!

The validation helper appropriately checks for PDF document processing success with reasonable leniency—accepting either exact content extraction ("hello"+"world") or evidence of document-related processing. Clear error messages aid debugging.


207-208: Inconsistent retry config pattern - follow Chat test convention.

The Responses test calls FileInputResponsesRetryConfig() directly without first obtaining a generic retry config. This deviates from the established pattern used in the Chat test (lines 99, 127-134) and the convention documented in learnings.

Based on learnings, as per coding guidelines, the pattern should be:

  1. Get generic config: retryConfig := GetTestRetryConfigForScenario("FileInput", testConfig)
  2. Construct typed config with empty Conditions: Copy only MaxAttempts, BaseDelay, MaxDelay, OnRetry, OnFinalFail
♻️ Proposed fix to align with convention
+		retryConfig := GetTestRetryConfigForScenario("FileInput", testConfig)
+		responsesRetryConfig := ResponsesRetryConfig{
+			MaxAttempts: retryConfig.MaxAttempts,
+			BaseDelay:   retryConfig.BaseDelay,
+			MaxDelay:    retryConfig.MaxDelay,
+			Conditions:  []ResponsesRetryCondition{},
+			OnRetry:     retryConfig.OnRetry,
+			OnFinalFail: retryConfig.OnFinalFail,
+		}
-		responsesRetryConfig := FileInputResponsesRetryConfig()

Likely an incorrect or invalid review comment.

core/internal/testutil/test_retry_conditions.go (1)

988-1007: ResponsesEmptyCondition may over-retry if “valid” Responses output is tool-call-only. If any tests/scenarios legitimately return tool calls with empty text content, this condition will force retries even though the response is usable. Consider confirming this condition is only used for “expected text” scenarios (like file summaries).

akshaydeo and others added 5 commits January 11, 2026 21:42
## Summary

Fix race condition in semantic cache plugin by retrieving context metadata before spawning a goroutine.

## Changes

- Moved the retrieval of `paramsHash` from context to before the goroutine is spawned
- Added a comment explaining the importance of getting metadata from context before the goroutine to avoid race conditions when the same context is reused across multiple requests

## Type of change

- [x] Bug fix
- [ ] Feature
- [ ] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [ ] Core (Go)
- [ ] Transports (HTTP)
- [ ] Providers/Integrations
- [x] Plugins
- [ ] UI (Next.js)
- [ ] Docs

## How to test

Run the semantic cache plugin with concurrent requests using the same context:

```sh
go test ./plugins/semanticcache/... -race
```

## Breaking changes

- [ ] Yes
- [x] No

## Related issues

Fixes potential race conditions when the semantic cache plugin processes multiple requests with the same context.

## Security considerations

No security implications.

## Checklist

- [x] I read `docs/contributing/README.md` and followed the guidelines
- [x] I added/updated tests where appropriate
- [ ] I updated documentation where needed
- [x] I verified builds succeed (Go and UI)
- [x] I verified the CI pipeline passes locally if applicable
)

## Summary

Remove `grok-4-fast-non-reasoning` from the list of Grok reasoning models and update the comment in the `IsGrokReasoningModel` function.

## Changes

- Removed `grok-4-fast-non-reasoning` from the `grokReasoningModels` slice as it's no longer needed in the list of reasoning models
- Updated the comment in the `IsGrokReasoningModel` function to be more generic about safety checking for variants

## Type of change

- [x] Bug fix
- [ ] Feature
- [ ] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [x] Core (Go)
- [ ] Transports (HTTP)
- [ ] Providers/Integrations
- [ ] Plugins
- [ ] UI (Next.js)
- [ ] Docs

## How to test

```sh
# Core/Transports
go version
go test ./core/schemas/...
```

Verify that the `IsGrokReasoningModel` function correctly identifies reasoning models while excluding non-reasoning variants.

## Breaking changes

- [ ] Yes
- [x] No

## Security considerations

No security implications as this is a minor update to model identification logic.

## Checklist

- [x] I read `docs/contributing/README.md` and followed the guidelines
- [x] I added/updated tests where appropriate
- [x] I updated documentation where needed
- [x] I verified builds succeed (Go and UI)
- [x] I verified the CI pipeline passes locally if applicable
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
core/providers/openai/chat_test.go (1)

216-252: Non-reasoning model test aligns with commit fix.

This test case correctly validates that grok-2-latest (a non-reasoning model) preserves all parameters unchanged, aligning with the commit "remove non reasoning variant from grok reasoning models list."

Consider adding edge case tests for empty model string or unknown model patterns to ensure robust fallback behavior.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0a5f2af and b42d387.

📒 Files selected for processing (2)
  • core/providers/openai/chat_test.go
  • core/schemas/utils.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • core/providers/openai/chat_test.go
  • core/schemas/utils.go
🧠 Learnings (8)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

  • core/providers/openai/chat_test.go
  • core/schemas/utils.go
📚 Learning: 2025-12-29T11:54:55.836Z
Learnt from: akshaydeo
Repo: maximhq/bifrost PR: 1153
File: framework/configstore/rdb.go:2221-2246
Timestamp: 2025-12-29T11:54:55.836Z
Learning: In Go reviews, do not flag range-over-int patterns like for i := range n as compile-time errors, assuming Go 1.22+ semantics. Only flag actual range-capable values (slices, arrays, maps, channels, strings) and other compile-time issues. This applies to all Go files across the repository.

Applied to files:

  • core/providers/openai/chat_test.go
  • core/schemas/utils.go
📚 Learning: 2025-12-11T11:58:25.307Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/openai/responses.go:42-84
Timestamp: 2025-12-11T11:58:25.307Z
Learning: In core/providers/openai/responses.go (and related OpenAI response handling), document and enforce the API format constraint: if ResponsesReasoning != nil and the response contains content blocks, all content blocks should be treated as reasoning blocks by default. Implement type guards or parsing logic accordingly, and add unit tests to verify that when ResponsesReasoning is non-nil, content blocks are labeled as reasoning blocks. Include clear comments in the code explaining the rationale and ensure downstream consumers rely on this behavior.

Applied to files:

  • core/providers/openai/chat_test.go
📚 Learning: 2025-12-14T14:43:30.902Z
Learnt from: Radheshg04
Repo: maximhq/bifrost PR: 980
File: core/providers/openai/images.go:10-22
Timestamp: 2025-12-14T14:43:30.902Z
Learning: Enforce the OpenAI image generation SSE event type values across the OpenAI image flow in the repository: use "image_generation.partial_image" for partial chunks, "image_generation.completed" for the final result, and "error" for errors. Apply this consistently in schemas, constants, tests, accumulator routing, and UI code within core/providers/openai (and related Go files) to ensure uniform event typing and avoid mismatches.

Applied to files:

  • core/providers/openai/chat_test.go
📚 Learning: 2025-12-15T10:16:21.909Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/providers/huggingface/huggingface_test.go:12-63
Timestamp: 2025-12-15T10:16:21.909Z
Learning: In provider tests under core/providers/<provider>/*_test.go, do not require or flag the use of defer for Shutdown(); instead call client.Shutdown() at the end of each test function. This pattern appears consistent across all provider tests. Apply this rule only within this path; for other tests or resources, defer may still be appropriate.

Applied to files:

  • core/providers/openai/chat_test.go
📚 Learning: 2025-12-19T09:26:54.961Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/providers/utils/utils.go:1050-1051
Timestamp: 2025-12-19T09:26:54.961Z
Learning: Update streaming end-marker handling so HuggingFace is treated as a non-[DONE] provider for backends that do not emit a DONE marker (e.g., meta llama on novita). In core/providers/utils/utils.go, adjust ProviderSendsDoneMarker() (or related logic) to detect providers that may not emit DONE and avoid relying on DONE as the sole end signal. Add tests to cover both DONE-emitting and non-DONE backends, with clear documentation in code comments explaining the rationale and any fallback behavior.

Applied to files:

  • core/providers/openai/chat_test.go
📚 Learning: 2026-01-10T11:27:47.535Z
Learnt from: Radheshg04
Repo: maximhq/bifrost PR: 1256
File: core/providers/openai/openai.go:2276-2385
Timestamp: 2026-01-10T11:27:47.535Z
Learning: Validate image generation requests for nil and missing prompts before dispatch. Follow the same pattern used here: core/bifrost.go validates nil/empty prompts, providerUtils.CheckContextAndGetRequestBody returns a structured error when the request converter yields nil, and apply this across all providers (including OpenAI) to avoid sending null bodies.

Applied to files:

  • core/providers/openai/chat_test.go
📚 Learning: 2026-01-12T06:41:13.643Z
Learnt from: TejasGhatte
Repo: maximhq/bifrost PR: 1153
File: core/schemas/utils.go:1159-1183
Timestamp: 2026-01-12T06:41:13.643Z
Learning: In core/schemas/utils.go, IsGrokReasoningModel (and the grokReasoningModels list) is the canonical utility for model matching. Treat it as the source of truth and not as a duplicate; if extending model support, update this file rather than introducing separate matching logic, and verify changes here to avoid diverging behavior.

Applied to files:

  • core/schemas/utils.go
🧬 Code graph analysis (1)
core/providers/openai/chat_test.go (3)
core/providers/openai/types.go (2)
  • OpenAIChatRequest (48-61)
  • OpenAIMessage (63-72)
core/schemas/chatcompletions.go (2)
  • ChatParameters (154-184)
  • ChatReasoning (238-241)
core/schemas/utils.go (1)
  • Ptr (14-16)
🔇 Additional comments (8)
core/providers/openai/chat_test.go (5)

9-15: Well-structured table-driven test.

The test uses idiomatic Go table-driven testing with a flexible validate function per case, allowing tailored assertions for each model's expected behavior.


253-281: Good nil-safety coverage.

The test correctly validates that applyXAICompatibility handles nil Reasoning gracefully without panicking while still processing other parameters according to model rules.


282-305: Good isolation test for unrelated parameters.

Testing that Temperature and TopP are preserved ensures the compatibility function doesn't inadvertently modify unrelated fields.


308-316: LGTM!

Clean test runner using t.Run for proper subtest isolation and reporting.


56-95: No test needed for grok-4-mini—it doesn't exist in the codebase.

The suggestion assumes a general "mini-variant pattern," but grok-3-mini is the only mini variant in the reasoning models list. The implementation specifically handles grok-3-mini with hardcoded string matching, not a pattern for all mini models. Test coverage is complete for all existing models: grok-3, grok-3-mini, grok-4, grok-4-fast-reasoning, and grok-code-fast-1.

Likely an incorrect or invalid review comment.

core/schemas/utils.go (3)

728-826: LGTM!

The DeepCopyChatTool implementation follows the established deep copy pattern used elsewhere in this file (e.g., DeepCopyChatMessage). The handling of nested structures (Function.Parameters, Custom.Format.Grammar, CacheControl) and proper nil checks are consistent with the codebase conventions.


1159-1182: Good defensive check for non-reasoning variants.

The implementation correctly excludes "non-reasoning" variants after substring matching, providing a safety net for future model name variants. This aligns with the commit message indicating removal of non-reasoning variants from the list.

The strings.Contains approach is consistent with other model detection functions in this file (IsNovaModel, IsAnthropicModel, etc.) and allows matching versioned variants like grok-3-20250101.

Based on learnings, this is the canonical utility for grok reasoning model matching.


268-268: The MarshalString function is properly defined in core/schemas/json_native.go and core/schemas/json_wasm.go with the expected signature func MarshalString(v interface{}) (string, error), making it available to the same package. The usage at line 268 is correct.

akshaydeo and others added 3 commits January 12, 2026 14:01
## Summary

This PR fixes several type safety issues in provider implementations, improves error handling, and enhances stream accumulator thread safety to prevent race conditions.

## Fixes

Closes #1306

## Changes

- Added proper type safety checks for Gemini provider's safety settings, labels, and request metadata
- Fixed empty string check for Anthropic instructions parameter
- Added mutex locks to stream accumulator methods to prevent race conditions
- Improved deferred span cleanup to ensure proper resource release even during panics
- Enhanced string map extraction with a new utility function `SafeExtractStringMap`
- Updated file base64 test to use scenario-specific retry configuration
- Fixed Gemini speech handler to safely check for nil params
- Added proper cleanup for deferred spans and stream accumulators

## Type of change

- [x] Bug fix
- [ ] Feature
- [x] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [x] Core (Go)
- [ ] Transports (HTTP)
- [x] Providers/Integrations
- [x] Plugins
- [ ] UI (Next.js)
- [ ] Docs

## How to test

```sh
# Core/Transports
go version
go test ./...

# Test specific providers
go test ./core/providers/gemini/...
go test ./core/providers/anthropic/...
go test ./core/providers/bedrock/...

# Test streaming functionality
go test ./framework/streaming/...
```

## Breaking changes

- [ ] Yes
- [x] No

## Related issues

Fixes type safety issues and race conditions in stream processing.

## Security considerations

Improves type safety which helps prevent potential security issues from unexpected data types.

## Checklist

- [x] I added/updated tests where appropriate
- [x] I verified builds succeed (Go and UI)
- [x] I verified the CI pipeline passes locally if applicable
…d query parameters (#1297)

## Summary

Added case-insensitive helper methods for header and query parameter lookups in HTTPRequest to improve plugin development experience and consistency.

## Changes

- Added `CaseInsensitiveHeaderLookup()` and `CaseInsensitiveQueryLookup()` helper methods to the HTTPRequest struct
- Implemented a shared `caseInsensitiveLookup()` function that handles various lookup scenarios
- Updated the governance plugin to use these new helper methods
- Updated documentation to recommend using these helper methods
- Added changelog entry

## Type of change

- [ ] Bug fix
- [x] Feature
- [ ] Refactor
- [x] Documentation
- [x] Chore/CI

## Affected areas

- [x] Core (Go)
- [x] Transports (HTTP)
- [ ] Providers/Integrations
- [x] Plugins
- [ ] UI (Next.js)
- [x] Docs

## How to test

Test the new helper methods with various header and query parameter cases:

```sh
# Core/Transports
go test ./core/schemas -run TestHTTPRequest_CaseInsensitiveLookup

# Test with a plugin that uses the new methods
go run cmd/bifrost/main.go run --plugin plugins/governance
```

## Breaking changes

- [ ] Yes
- [x] No

## Related issues

Improves developer experience when working with HTTP headers and query parameters in plugins.

## Security considerations

No security implications. This change only affects how headers and query parameters are accessed.

## Checklist

- [x] I added/updated tests where appropriate
- [x] I updated documentation where needed
- [x] I verified builds succeed (Go and UI)
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
core/providers/bedrock/utils.go (1)

735-774: Nil pointer dereference on format.JSONSchema.

Line 751 accesses format.JSONSchema.Description before checking if format.JSONSchema is nil (line 759). If format.Type == "json_schema" but format.JSONSchema is nil, this will panic.

🐛 Proposed fix: Check JSONSchema nil before accessing Description
 func convertTextFormatToTool(ctx *schemas.BifrostContext, textConfig *schemas.ResponsesTextConfig) *BedrockTool {
 	if textConfig == nil || textConfig.Format == nil {
 		return nil
 	}
 
 	format := textConfig.Format
 	if format.Type != "json_schema" {
 		return nil
 	}
 
+	if format.JSONSchema == nil {
+		return nil // Schema is required for Bedrock tooling
+	}
+
 	toolName := "json_response"
 	if format.Name != nil && strings.TrimSpace(*format.Name) != "" {
 		toolName = strings.TrimSpace(*format.Name)
 	}
 
 	description := "Returns structured JSON output"
 	if format.JSONSchema.Description != nil {
 		description = *format.JSONSchema.Description
 	}
 
 	toolName = fmt.Sprintf("bf_so_%s", toolName)
 	ctx.SetValue(schemas.BifrostContextKeyStructuredOutputToolName, toolName)
 
-	var schemaObj any
-	if format.JSONSchema != nil {
-		schemaObj = *format.JSONSchema
-	} else {
-		return nil // Schema is required for Bedrock tooling
-	}
+	schemaObj := *format.JSONSchema
 
 	return &BedrockTool{
core/internal/testutil/file_base64.go (1)

260-267: Unreachable code in validation logic.

The else block at lines 265-267 is unreachable. After the check at line 255 (if !foundHelloWorld && !foundDocument), at least one of foundHelloWorld or foundDocument must be true. The subsequent if/else if at lines 260-263 covers both cases, making the final else dead code.

🧹 Proposed fix to remove dead code
 	if foundHelloWorld {
 		t.Logf("✅ %s model successfully extracted 'Hello World' content from PDF document", apiName)
 	} else if foundDocument {
 		t.Logf("✅ %s model processed PDF document but may not have clearly identified the exact text", apiName)
-	} else {
-		t.Errorf("❌ %s response doesn't reference document content or expected keywords: %s", apiName, content)
-		return
 	}
core/providers/utils/utils.go (1)

1094-1140: Add deferred span completion logic to ProcessAndSendError for consistency.

ProcessAndSendError is missing the deferred span completion logic present in ProcessAndSendResponse and ProcessAndSendBifrostError. Since all 20+ call sites across providers explicitly set StreamEndIndicator to true before calling this function, error conditions are always final chunks and require proper span completion. Add the same pattern used in the other functions: check for StreamEndIndicator in the early return path (after HandleStreamControlSkip check) and after the channel send, calling completeDeferredSpan(ctx, processedResponse, processedError) when the flag is true.

🤖 Fix all issues with AI agents
In @core/providers/gemini/gemini.go:
- Around line 723-727: HandleGeminiResponsesStream (and similarly SpeechStream
and TranscriptionStream) can panic if resp.BodyStream() is nil because you call
bufio.NewScanner(resp.BodyStream()) directly; add the same nil-check pattern
used in HandleGeminiChatCompletionStream: after creating stopCancellation via
providerUtils.SetupStreamCancellation(ctx, resp.BodyStream(), logger), verify if
resp.BodyStream() == nil and if so call stopCancellation(), log the error and
return early (or close the stream if needed) before creating bufio.NewScanner,
ensuring you reference resp.BodyStream(), providerUtils.SetupStreamCancellation,
bufio.NewScanner, and the function names mentioned when making the change.
- Around line 383-390: The scanner loop can exit without setting the
end-indicator (isLastChunk) or producing a scanner/context error, so add a
fallback end signal: track a local flag (e.g., streamEnded) set to true when you
handle isLastChunk, scanner.Err(), or context errors; in the existing defer
(which currently calls
providerUtils.HandleStreamCancellation/HandleStreamTimeout and closes
responseChan) add a branch that, when ctx.Err() is nil and streamEnded is false,
emits the same final end-of-stream signal used elsewhere (i.e., send the
terminal response on responseChan or invoke the same post-hook/handler used for
normal completion) using the same parameters (postHookRunner, responseChan,
providerName, model, schemas.ChatCompletionStreamRequest, logger) before closing
responseChan so consumers always receive an explicit end indicator.

In @core/providers/openai/models.go:
- Around line 40-42: The OpenAIListModelsResponse.Object field is not being set
when constructing openaiResponse; set openaiResponse.Object = "list" after
creating the OpenAIListModelsResponse (the struct named
OpenAIListModelsResponse) so the response follows OpenAI's convention for list
endpoints and returns "list" in the Object field.

In @core/providers/utils/utils.go:
- Around line 887-892: The code calls ctx.Value(...) without first verifying ctx
is non-nil, which can panic if a nil *schemas.BifrostContext is passed; add a
nil check before accessing context values (e.g., if ctx == nil { /* skip tracing
*/ } ) and only attempt the tracer lookup when ctx is non-nil, then proceed to
assert tracer as schemas.Tracer and call tracer.AddStreamingChunk(traceID,
response) as before; update the block around ctx,
schemas.BifrostContextKeyTracer, schemas.BifrostContextKeyTraceID,
schemas.Tracer and tracer.AddStreamingChunk to guard against a nil ctx.
🧹 Nitpick comments (4)
core/schemas/plugin_wasm.go (1)

1-11: LGTM! Clean WASM-specific short-circuit type definition.

The build constraint and struct definition are appropriate for the WASM/TinyGo plugin architecture. The explicit note about streams not being supported in WASM plugins is helpful for plugin developers.

Minor nit: Line 6 has a small grammatical issue and double space.

✏️ Optional: Fix minor typo in comment
 // PluginShortCircuit represents a plugin's decision to short-circuit the normal flow.
-// It can contain either a response (success short-circuit),  an error (error short-circuit).
+// It can contain either a response (success short-circuit) or an error (error short-circuit).
 // Streams are not supported in WASM plugins.
core/internal/testutil/file_base64.go (2)

81-86: Redundant FileBase64 support check.

Both RunFileBase64ChatCompletionsTest and RunFileBase64ResponsesTest duplicate the FileBase64 support check that's already performed in RunFileBase64Test (line 69). This is defensive but adds noise. Consider removing these checks from the individual functions if they're only called from RunFileBase64Test, or document that they can be called independently.

Also applies to: 162-167


98-125: Consider extracting duplicated expectations setup.

The retry context and expectations setup is nearly identical between RunFileBase64ChatCompletionsTest and RunFileBase64ResponsesTest. The ExpectedBehavior, TestMetadata structure, and expectations modification logic are duplicated. Consider extracting a helper function to reduce duplication.

♻️ Example helper extraction
func setupFileBase64TestContext(scenarioName string, testConfig ComprehensiveTestConfig) (TestRetryContext, TestExpectations) {
	retryContext := TestRetryContext{
		ScenarioName: scenarioName,
		ExpectedBehavior: map[string]interface{}{
			"should_process_pdf":     true,
			"should_read_document":   true,
			"should_extract_content": true,
			"document_understanding": true,
		},
		TestMetadata: map[string]interface{}{
			"provider":          testConfig.Provider,
			"model":             testConfig.ChatModel,
			"file_type":         "pdf",
			"encoding":          "base64",
			"test_content":      "Hello World!",
			"expected_keywords": []string{"hello", "world", "pdf", "document"},
		},
	}

	expectations := GetExpectationsForScenario("FileInput", testConfig, map[string]interface{}{})
	expectations = ModifyExpectationsForProvider(expectations, testConfig.Provider)
	expectations.ShouldContainKeywords = append(expectations.ShouldContainKeywords, "hello", "world")
	expectations.ShouldNotContainWords = append(expectations.ShouldNotContainWords,
		"cannot process", "invalid format", "decode error",
		"unable to read", "no file", "corrupted", "unsupported",
	)

	return retryContext, expectations
}

Also applies to: 179-205

core/providers/gemini/gemini.go (1)

874-910: Edge case: Empty finalResponses leaves end-indicator unset.

If FinalizeGeminiResponsesStream returns an empty slice, the for-loop (lines 876-910) never executes, and BifrostContextKeyStreamEndIndicator is never set. This could leave consumers waiting indefinitely for a stream-end signal.

Consider adding a fallback:

 		// Finalize the stream by closing any open items
 		finalResponses := FinalizeGeminiResponsesStream(streamState, lastUsageMetadata, sequenceNumber)
+		if len(finalResponses) == 0 {
+			ctx.SetValue(schemas.BifrostContextKeyStreamEndIndicator, true)
+		}
 		for i, finalResponse := range finalResponses {
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b42d387 and d1c337f.

📒 Files selected for processing (17)
  • core/internal/testutil/file_base64.go
  • core/providers/anthropic/responses.go
  • core/providers/bedrock/utils.go
  • core/providers/gemini/chat.go
  • core/providers/gemini/gemini.go
  • core/providers/gemini/responses.go
  • core/providers/gemini/transcription.go
  • core/providers/gemini/types.go
  • core/providers/openai/models.go
  • core/providers/openai/types.go
  • core/providers/utils/utils.go
  • core/schemas/plugin_wasm.go
  • core/schemas/utils.go
  • framework/streaming/accumulator.go
  • framework/streaming/types.go
  • plugins/governance/advancedscenarios_test.go
  • plugins/semanticcache/stream.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • core/providers/anthropic/responses.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • core/providers/gemini/chat.go
  • core/providers/gemini/types.go
  • core/providers/gemini/transcription.go
  • core/schemas/plugin_wasm.go
  • core/providers/gemini/responses.go
  • core/providers/openai/types.go
  • core/internal/testutil/file_base64.go
  • core/providers/bedrock/utils.go
  • core/providers/openai/models.go
  • core/providers/utils/utils.go
  • core/providers/gemini/gemini.go
🧠 Learnings (9)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

  • core/providers/gemini/chat.go
  • core/providers/gemini/types.go
  • core/providers/gemini/transcription.go
  • core/schemas/plugin_wasm.go
  • core/providers/gemini/responses.go
  • core/providers/openai/types.go
  • core/internal/testutil/file_base64.go
  • core/providers/bedrock/utils.go
  • core/providers/openai/models.go
  • core/providers/utils/utils.go
  • core/providers/gemini/gemini.go
📚 Learning: 2025-12-29T11:54:55.836Z
Learnt from: akshaydeo
Repo: maximhq/bifrost PR: 1153
File: framework/configstore/rdb.go:2221-2246
Timestamp: 2025-12-29T11:54:55.836Z
Learning: In Go reviews, do not flag range-over-int patterns like for i := range n as compile-time errors, assuming Go 1.22+ semantics. Only flag actual range-capable values (slices, arrays, maps, channels, strings) and other compile-time issues. This applies to all Go files across the repository.

Applied to files:

  • core/providers/gemini/chat.go
  • core/providers/gemini/types.go
  • core/providers/gemini/transcription.go
  • core/schemas/plugin_wasm.go
  • core/providers/gemini/responses.go
  • core/providers/openai/types.go
  • core/internal/testutil/file_base64.go
  • core/providers/bedrock/utils.go
  • core/providers/openai/models.go
  • core/providers/utils/utils.go
  • core/providers/gemini/gemini.go
📚 Learning: 2025-12-19T09:26:54.961Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/providers/utils/utils.go:1050-1051
Timestamp: 2025-12-19T09:26:54.961Z
Learning: Update streaming end-marker handling so HuggingFace is treated as a non-[DONE] provider for backends that do not emit a DONE marker (e.g., meta llama on novita). In core/providers/utils/utils.go, adjust ProviderSendsDoneMarker() (or related logic) to detect providers that may not emit DONE and avoid relying on DONE as the sole end signal. Add tests to cover both DONE-emitting and non-DONE backends, with clear documentation in code comments explaining the rationale and any fallback behavior.

Applied to files:

  • core/providers/gemini/chat.go
  • core/providers/gemini/types.go
  • core/providers/gemini/transcription.go
  • core/providers/gemini/responses.go
  • core/providers/openai/types.go
  • core/providers/bedrock/utils.go
  • core/providers/openai/models.go
  • core/providers/utils/utils.go
  • core/providers/gemini/gemini.go
📚 Learning: 2026-01-10T11:27:47.535Z
Learnt from: Radheshg04
Repo: maximhq/bifrost PR: 1256
File: core/providers/openai/openai.go:2276-2385
Timestamp: 2026-01-10T11:27:47.535Z
Learning: Validate image generation requests for nil and missing prompts before dispatch. Follow the same pattern used here: core/bifrost.go validates nil/empty prompts, providerUtils.CheckContextAndGetRequestBody returns a structured error when the request converter yields nil, and apply this across all providers (including OpenAI) to avoid sending null bodies.

Applied to files:

  • core/providers/gemini/chat.go
  • core/providers/gemini/types.go
  • core/providers/gemini/transcription.go
  • core/providers/gemini/responses.go
  • core/providers/openai/types.go
  • core/providers/bedrock/utils.go
  • core/providers/openai/models.go
  • core/providers/utils/utils.go
  • core/providers/gemini/gemini.go
📚 Learning: 2025-12-11T11:58:25.307Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/openai/responses.go:42-84
Timestamp: 2025-12-11T11:58:25.307Z
Learning: In core/providers/openai/responses.go (and related OpenAI response handling), document and enforce the API format constraint: if ResponsesReasoning != nil and the response contains content blocks, all content blocks should be treated as reasoning blocks by default. Implement type guards or parsing logic accordingly, and add unit tests to verify that when ResponsesReasoning is non-nil, content blocks are labeled as reasoning blocks. Include clear comments in the code explaining the rationale and ensure downstream consumers rely on this behavior.

Applied to files:

  • core/providers/openai/types.go
  • core/providers/openai/models.go
📚 Learning: 2025-12-14T14:43:30.902Z
Learnt from: Radheshg04
Repo: maximhq/bifrost PR: 980
File: core/providers/openai/images.go:10-22
Timestamp: 2025-12-14T14:43:30.902Z
Learning: Enforce the OpenAI image generation SSE event type values across the OpenAI image flow in the repository: use "image_generation.partial_image" for partial chunks, "image_generation.completed" for the final result, and "error" for errors. Apply this consistently in schemas, constants, tests, accumulator routing, and UI code within core/providers/openai (and related Go files) to ensure uniform event typing and avoid mismatches.

Applied to files:

  • core/providers/openai/types.go
  • core/providers/openai/models.go
📚 Learning: 2025-12-19T08:29:20.286Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1095
File: core/internal/testutil/count_tokens.go:30-67
Timestamp: 2025-12-19T08:29:20.286Z
Learning: In core/internal/testutil test files, enforce using GetTestRetryConfigForScenario() to obtain a generic retry config, then construct a typed retry config (e.g., CountTokensRetryConfig, EmbeddingRetryConfig, TranscriptionRetryConfig) with an empty Conditions slice. Copy only MaxAttempts, BaseDelay, MaxDelay, OnRetry, and OnFinalFail from the generic config. This convention should be consistently applied across all test files in this directory.

Applied to files:

  • core/internal/testutil/file_base64.go
📚 Learning: 2025-12-26T05:46:14.625Z
Learnt from: TejasGhatte
Repo: maximhq/bifrost PR: 1172
File: core/providers/bedrock/responses.go:2929-2985
Timestamp: 2025-12-26T05:46:14.625Z
Learning: In Bedrock file handling (core/providers/bedrock/responses.go and utils.go), ensure that the FileData field is treated as either a proper data URL (prefix 'data:') or as plain text. Do not expect or accept naked base64-encoded strings without the data URL prefix. The code should first check for the data URL prefix; if missing, process the value as plain text. This pattern applies to all Go files in the bedrock provider directory.

Applied to files:

  • core/providers/bedrock/utils.go
📚 Learning: 2026-01-10T15:52:10.995Z
Learnt from: akshaydeo
Repo: maximhq/bifrost PR: 1298
File: core/providers/utils/utils.go:983-1006
Timestamp: 2026-01-10T15:52:10.995Z
Learning: In Go with fasthttp v1.68.0, Response.BodyStream() returns an io.Reader, but the concrete type also implements io.Closer. In reviews, consider adding a safe type assertion pattern to close the stream when possible, e.g. if closer, ok := bodyStream.(io.Closer); ok { if err := closer.Close(); err != nil { // handle } }. This check should be applied in all code paths where BodyStream() is consumed, not only in this specific file, to ensure proper resource cleanup when the underlying reader supports closing.

Applied to files:

  • core/providers/utils/utils.go
🧬 Code graph analysis (5)
core/providers/gemini/chat.go (2)
core/providers/gemini/types.go (1)
  • SafeExtractSafetySettings (103-133)
core/schemas/utils.go (1)
  • SafeExtractStringMap (527-547)
core/providers/gemini/transcription.go (2)
core/providers/gemini/types.go (1)
  • SafeExtractSafetySettings (103-133)
core/schemas/utils.go (1)
  • SafeExtractStringMap (527-547)
core/schemas/plugin_wasm.go (2)
plugins/mocker/main.go (1)
  • Response (97-103)
core/schemas/bifrost.go (1)
  • BifrostResponse (348-369)
core/providers/gemini/responses.go (3)
core/schemas/responses.go (5)
  • ResponsesMessage (319-332)
  • ResponsesMessageContentBlock (404-419)
  • ResponsesInputMessageContentBlockTypeText (393-393)
  • ResponsesInputMessageRoleSystem (339-339)
  • ResponsesMessageContent (344-349)
core/providers/gemini/types.go (4)
  • Content (1013-1021)
  • Part (1027-1051)
  • Type (822-822)
  • Role (22-22)
core/schemas/utils.go (3)
  • SafeExtractFromMap (517-523)
  • SafeExtractString (278-295)
  • Ptr (14-16)
core/providers/gemini/gemini.go (4)
core/schemas/context.go (1)
  • BifrostContext (32-43)
core/schemas/provider.go (2)
  • PostHookRunner (314-314)
  • Provider (317-366)
core/schemas/responses.go (2)
  • BifrostResponsesRequest (30-37)
  • BifrostResponsesResponse (43-83)
core/providers/gemini/responses.go (1)
  • FinalizeGeminiResponsesStream (1376-1378)
🔇 Additional comments (32)
core/providers/bedrock/utils.go (3)

48-59: LGTM on context type migration.

The signature change to *schemas.BifrostContext aligns with the broader migration across providers, and the context is correctly propagated to convertResponseFormatToTool.


278-282: LGTM on safe extraction helper.

Using SafeExtractStringMap instead of direct type assertion is a good defensive coding practice.


678-732: LGTM on BifrostContext integration.

The signature change and ctx.SetValue usage correctly implements the new context mutation pattern for storing the structured output tool name.

core/providers/openai/types.go (1)

528-544: LGTM!

The new OpenAIModel and OpenAIListModelsResponse structs are well-defined with appropriate JSON tags. The GROQ-specific fields (Active, ContextWindow) are correctly documented and marked as optional with omitempty.

core/providers/openai/models.go (1)

9-33: LGTM!

The conversion logic is clean and well-structured:

  • Appropriate nil guard on the receiver
  • Efficient pre-allocation of the slice
  • Clean filtering using slices.Contains
  • Provider key prefix on model IDs enables cross-provider disambiguation
core/internal/testutil/file_base64.go (4)

127-134: Retry config construction follows conventions.

The ChatRetryConfig and ResponsesRetryConfig are correctly constructed using GetTestRetryConfigForScenario() with an empty Conditions slice, copying only MaxAttempts, BaseDelay, MaxDelay, OnRetry, and OnFinalFail. This aligns with the established pattern. Based on learnings, this is the expected convention for test retry configs.

Also applies to: 207-215


67-79: Clean orchestration pattern for per-API subtests.

The refactored RunFileBase64Test provides a clean delegation pattern that improves test organization and allows each API path to be tested independently. The early return on unsupported scenarios is appropriate.


243-270: Validation helper is well-structured.

The validateDocumentContent function properly uses t.Helper(), has clear validation logic with meaningful error messages, and handles edge cases like short responses. The emoji prefixes in logs aid visual scanning of test output.


188-195: No separate model field exists for Responses API.

The ComprehensiveTestConfig struct does not have a ResponsesModel field. It uses ChatModel for both ChatCompletions and Responses tests, which is the correct design as indicated by the Fallbacks comment mentioning "for chat, responses, image and reasoning tests." Using testConfig.ChatModel in both tests is appropriate and maintains proper test isolation.

Likely an incorrect or invalid review comment.

core/providers/utils/utils.go (9)

835-853: LGTM!

The signature migration to *schemas.BifrostContext is consistent with the broader context migration pattern and correctly passes the context to ProcessAndSendResponse.


856-874: LGTM!

Consistent signature migration matching SendCreatedEventResponsesChunk.


937-981: LGTM - same nil check recommendation applies.

The implementation correctly runs post-hooks before completing deferred spans. The same nil-check recommendation from ProcessAndSendResponse applies here for consistency.


983-1006: LGTM! Good use of the io.Closer type assertion pattern.

The implementation correctly handles stream cancellation with proper cleanup semantics. The type assertion pattern (if closer, ok := bodyStream.(io.Closer); ok) aligns with the learning that fasthttp's BodyStream() returns io.Reader but the concrete type implements io.Closer.

The documentation clearly states the cleanup function must be called - callers should use defer cleanup() immediately after calling this function to prevent goroutine leaks.


1016-1047: LGTM! Good atomic handling of stream end indicator.

The use of GetAndSetValue for atomic check-and-set prevents race conditions when multiple goroutines might try to finalize the stream simultaneously. Status code 499 (Client Closed Request) is the appropriate convention for client disconnection scenarios.


1057-1088: LGTM!

Consistent implementation with HandleStreamCancellation. Status code 504 (Gateway Timeout) is appropriate for deadline exceeded scenarios.


1254-1262: LGTM!

Minor comment update. The drain logic is appropriate for releasing streaming responses.


1391-1433: LGTM!

Signature migration is consistent with the broader BifrostContext adoption. The concurrent request handling and aggregation logic remains correct.


1531-1621: LGTM! Well-structured span completion logic.

The function properly handles:

  • Nil context checks upfront
  • Type assertions with proper fallback paths
  • TTFT (Time to First Token) and chunk count metrics for streaming observability
  • Parent-child span relationship for post-hook spans
  • Proper cleanup via ClearDeferredSpan

The fallback to final chunk data (lines 1587-1590) provides resilience if chunk accumulation fails.

core/providers/gemini/types.go (1)

101-133: LGTM - Well-designed safe extraction helper.

The function correctly handles the three expected input cases: nil, direct []SafetySetting, and JSON-deserialized []interface{}. The early return on non-conforming items prevents partial/corrupted data from being used. This pattern aligns well with SafeExtractStringMap in core/schemas/utils.go.

core/providers/gemini/chat.go (1)

39-55: LGTM - Safe extraction pattern correctly applied.

The migration from direct type assertions to SafeExtractSafetySettings and SafeExtractStringMap improves resilience when extraParams contains JSON-deserialized values (e.g., []interface{} instead of []SafetySetting). The nested extraction pattern is consistent and correctly propagates extraction failures silently.

core/providers/gemini/transcription.go (1)

117-133: LGTM - Consistent safe extraction pattern.

The changes mirror the pattern applied in chat.go, ensuring transcription request conversion also handles JSON-deserialized extraParams safely. This maintains consistency across the Gemini provider's request conversion functions.

core/providers/gemini/responses.go (3)

30-49: LGTM - System instruction conversion flow is well-structured.

The new flow correctly:

  1. Converts SystemInstruction to a system-role message first
  2. Appends converted Contents messages after
  3. Only sets bifrostReq.Input if there are actual messages

This ensures system instructions are properly represented as system messages in the Bifrost Responses format.


111-132: LGTM - Instructions fallback and safe extraction.

The logic correctly:

  1. Falls back to creating SystemInstruction from Params.Instructions only when one doesn't already exist
  2. Uses the same safe extraction pattern for safety_settings and cached_content as other files

This maintains consistency with the broader safe extraction migration.


1380-1420: LGTM - Well-implemented system instruction converter.

The helper function:

  1. Correctly handles nil/empty input
  2. Filters for text content only (non-text parts like inline data are ignored, which is appropriate for system instructions)
  3. Uses ContentStr for single text blocks (cleaner) and ContentBlocks for multiple (preserves structure)
  4. Returns nil when no text content exists, preventing empty messages
core/providers/gemini/gemini.go (8)

72-73: LGTM: Context type migration for non-streaming methods.

The migration from context.Context to *schemas.BifrostContext across completeRequest, listModelsByKey, ListModels, TextCompletion, TextCompletionStream, and ChatCompletion aligns correctly with the Provider interface definition in core/schemas/provider.go.


1136-1144: LGTM: SpeechStream has proper stream finalization.

The streaming handler correctly:

  1. Sets up cancellation handling in defer (lines 1137-1144)
  2. Returns early on scanner errors with end-indicator (lines 1267-1275)
  3. Always sends a final "done" response with end-indicator on normal completion (lines 1276-1292)

This ensures the end-indicator is always set regardless of how the stream terminates.

Also applies to: 1276-1292


1419-1427: LGTM: TranscriptionStream follows the correct finalization pattern.

The handler properly sets the end-indicator in all termination paths:

  • Cancellation/timeout via defer (lines 1420-1427)
  • Scanner errors (lines 1557-1565)
  • Normal completion with final response (lines 1567-1590)

Also applies to: 1567-1590


1035-1038: LGTM: Response format propagation via context.

The response format is correctly set on the context for use by ToBifrostSpeechResponse. The nil check on request.Params prevents panics, and setting even empty values allows downstream code to distinguish between "not specified" (key absent) vs "specified as empty" (key present with empty value).


1602-1602: LGTM: Batch operations context migration.

All batch operation methods (BatchCreate, batchListByKey, BatchList, batchRetrieveByKey, BatchRetrieve, batchCancelByKey, BatchCancel, batchResultsByKey, BatchResults) correctly use *schemas.BifrostContext and pass it to providerUtils.MakeRequestWithContext for proper cancellation/timeout handling.

Also applies to: 1773-1773, 1883-1883, 1956-1956, 2040-2040, 2071-2071, 2137-2137, 2190-2190, 2350-2350


2381-2381: LGTM: File operations context migration.

All file operation methods correctly updated to use *schemas.BifrostContext.

Also applies to: 2518-2518, 2623-2623, 2696-2696, 2780-2780, 2811-2811, 2862-2862, 2895-2895


2912-2912: LGTM: CountTokens context migration.

The CountTokens method correctly uses *schemas.BifrostContext and follows the standard pattern for non-streaming requests.


382-406: LGTM: Defer ordering in streaming handlers.

The defer statements are correctly ordered (LIFO execution):

  1. stopCancellation() - stops cancellation monitoring first
  2. ReleaseStreamingResponse(resp) - releases HTTP response
  3. Cancellation handling + close(responseChan) - final cleanup

This ensures proper resource cleanup regardless of how the goroutine exits.

Also applies to: 711-725

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
core/providers/bedrock/utils.go (1)

750-763: Potential nil pointer dereference on format.JSONSchema.

Line 751 accesses format.JSONSchema.Description before checking if format.JSONSchema is nil (done later at line 759). If format.JSONSchema is nil, this will cause a panic.

🐛 Proposed fix: check JSONSchema before accessing its fields
 	description := "Returns structured JSON output"
-	if format.JSONSchema.Description != nil {
+	if format.JSONSchema != nil && format.JSONSchema.Description != nil {
 		description = *format.JSONSchema.Description
 	}

 	toolName = fmt.Sprintf("bf_so_%s", toolName)
 	ctx.SetValue(schemas.BifrostContextKeyStructuredOutputToolName, toolName)

 	var schemaObj any
 	if format.JSONSchema != nil {
 		schemaObj = *format.JSONSchema
 	} else {
 		return nil // Schema is required for Bedrock tooling
 	}
core/internal/testutil/file_base64.go (1)

260-267: Dead code: the else branch is unreachable.

After the early return at line 255-258, we know foundHelloWorld || foundDocument is true. The if foundHelloWorld and else if foundDocument branches exhaustively cover all remaining cases, making the else branch at lines 264-267 unreachable.

🧹 Suggested fix: remove the dead else branch
 	if foundHelloWorld {
 		t.Logf("✅ %s model successfully extracted 'Hello World' content from PDF document", apiName)
-	} else if foundDocument {
+	} else {
+		// foundDocument must be true here (already checked at line 255)
 		t.Logf("✅ %s model processed PDF document but may not have clearly identified the exact text", apiName)
-	} else {
-		t.Errorf("❌ %s response doesn't reference document content or expected keywords: %s", apiName, content)
-		return
 	}
core/providers/utils/utils.go (1)

1094-1140: Add StreamEndIndicator check and deferred span completion to ProcessAndSendError.

This function lacks the span lifecycle handling present in ProcessAndSendBifrostError. All callers set StreamEndIndicator before calling this function, but unlike ProcessAndSendBifrostError, it does not check this flag or call completeDeferredSpan. Add the same final chunk detection and span completion logic:

if isFinalChunk := ctx.Value(schemas.BifrostContextKeyStreamEndIndicator); isFinalChunk != nil {
	if final, ok := isFinalChunk.(bool); ok && final {
		completeDeferredSpan(ctx, processedResponse, processedError)
	}
}

This should be added in the early-return path (after HandleStreamControlSkip) and before sending the response, consistent with ProcessAndSendBifrostError.

🤖 Fix all issues with AI agents
In @core/providers/openai/models.go:
- Around line 40-42: The OpenAIListModelsResponse is being constructed without
setting its Object field so responses return an empty string; update the
OpenAIListModelsResponse initialization (where openaiResponse is created) to set
Object to "list" (e.g., OpenAIListModelsResponse{Object: "list", Data: ...}) so
the response conforms to the OpenAI API spec.
🧹 Nitpick comments (4)
core/providers/gemini/types.go (1)

101-133: Consider validating required fields in SafeExtractSafetySettings.

The function silently appends a SafetySetting even when required fields (category, threshold) are missing or have incorrect types. Per the struct comments, Category and Threshold are marked as "Required". This could lead to empty/incomplete safety settings being passed to the Gemini API.

💡 Optional: Add validation for required fields
 		for _, item := range v {
 			if m, ok := item.(map[string]interface{}); ok {
 				setting := SafetySetting{}
 				if method, ok := m["method"].(string); ok {
 					setting.Method = method
 				}
-				if category, ok := m["category"].(string); ok {
-					setting.Category = category
+				category, hasCategory := m["category"].(string)
+				if hasCategory {
+					setting.Category = category
+				}
-				if threshold, ok := m["threshold"].(string); ok {
-					setting.Threshold = threshold
+				threshold, hasThreshold := m["threshold"].(string)
+				if hasThreshold {
+					setting.Threshold = threshold
+				}
+				// Skip incomplete settings missing required fields
+				if !hasCategory || !hasThreshold {
+					continue
 				}
 				settings = append(settings, setting)
core/providers/utils/utils.go (1)

887-930: Tracing integration looks correct; consider extracting duplicated final-chunk logic.

The approach is sound: accumulating chunks before post-hooks preserves raw provider data while finalizing spans with post-processed results. However, the StreamEndIndicator check and completeDeferredSpan call is duplicated in both the skip path (lines 899-903) and the normal path (lines 926-930).

♻️ Optional: Extract final chunk handling to reduce duplication
 func ProcessAndSendResponse(
 	ctx *schemas.BifrostContext,
 	postHookRunner schemas.PostHookRunner,
 	response *schemas.BifrostResponse,
 	responseChan chan *schemas.BifrostStream,
 ) {
+	// Helper to complete deferred span if this is the final chunk
+	maybeCompleteDeferredSpan := func(resp *schemas.BifrostResponse, err *schemas.BifrostError) {
+		if isFinalChunk := ctx.Value(schemas.BifrostContextKeyStreamEndIndicator); isFinalChunk != nil {
+			if final, ok := isFinalChunk.(bool); ok && final {
+				completeDeferredSpan(ctx, resp, err)
+			}
+		}
+	}
+
 	// Accumulate chunk for tracing (common for all providers)
 	if tracer, ok := ctx.Value(schemas.BifrostContextKeyTracer).(schemas.Tracer); ok && tracer != nil {
 		if traceID, ok := ctx.Value(schemas.BifrostContextKeyTraceID).(string); ok && traceID != "" {
 			tracer.AddStreamingChunk(traceID, response)
 		}
 	}

 	// Run post hooks on the response (note: accumulated chunks above contain pre-hook data)
 	processedResponse, processedError := postHookRunner(ctx, response, nil)

 	if HandleStreamControlSkip(processedError) {
-		// Even if skipping, complete the deferred span if this is the final chunk
-		if isFinalChunk := ctx.Value(schemas.BifrostContextKeyStreamEndIndicator); isFinalChunk != nil {
-			if final, ok := isFinalChunk.(bool); ok && final {
-				completeDeferredSpan(ctx, processedResponse, processedError)
-			}
-		}
+		maybeCompleteDeferredSpan(processedResponse, processedError)
 		return
 	}
 	// ... rest of function ...
-	// Check if this is the final chunk and complete deferred span with post-processed data
-	if isFinalChunk := ctx.Value(schemas.BifrostContextKeyStreamEndIndicator); isFinalChunk != nil {
-		if final, ok := isFinalChunk.(bool); ok && final {
-			completeDeferredSpan(ctx, processedResponse, processedError)
-		}
-	}
+	maybeCompleteDeferredSpan(processedResponse, processedError)
 }
core/schemas/plugin.go (2)

49-71: The lowercase key check may be redundant.

The middle check (lines 60-63) only helps when data already contains lowercase keys. If the map has "Content-Type" and you search for "CONTENT-TYPE", lowerKey becomes "content-type" which won't match "Content-Type" in the map. The final EqualFold iteration handles all cases correctly.

Consider removing the intermediate check or documenting it as an optimization for maps that are expected to be pre-lowercased.

♻️ Optional simplification
 func caseInsensitiveLookup(data map[string]string, key string) string {
 	if data == nil || key == "" {
 		return ""
 	}
 	// exact match
 	if v, ok := data[key]; ok {
 		return v
 	}
-	// lower key checks
-	lowerKey := strings.ToLower(key)
-	if v, ok := data[lowerKey]; ok {
-		return v
-	}
 	// case-insensitive iteration
 	for k, v := range data {
 		if strings.EqualFold(k, key) {
 			return v
 		}
 	}
 	return ""
 }

150-161: Consider clarifying HTTPRequest lifecycle ownership.

The documentation explains return semantics well, but doesn't clarify who is responsible for releasing the pooled HTTPRequest. Since plugins receive a pointer and can modify it in-place, it would be helpful to note that the caller (transport layer) owns the lifecycle and will release it after all plugins complete.

 	// HTTPTransportIntercept is called at the HTTP transport layer before requests enter Bifrost core.
 	// It receives a serializable HTTPRequest and allows plugins to modify it in-place.
 	// Only invoked when using HTTP transport (bifrost-http), not when using Bifrost as a Go SDK directly.
 	// Works with both native .so plugins and WASM plugins due to serializable types.
+	// The caller owns the HTTPRequest lifecycle and is responsible for releasing it to the pool.
 	//
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b42d387 and 613dfe3.

📒 Files selected for processing (22)
  • core/changelog.md
  • core/internal/testutil/file_base64.go
  • core/providers/anthropic/responses.go
  • core/providers/bedrock/utils.go
  • core/providers/gemini/chat.go
  • core/providers/gemini/gemini.go
  • core/providers/gemini/responses.go
  • core/providers/gemini/transcription.go
  • core/providers/gemini/types.go
  • core/providers/openai/models.go
  • core/providers/openai/types.go
  • core/providers/utils/utils.go
  • core/schemas/plugin.go
  • core/schemas/plugin_wasm.go
  • core/schemas/utils.go
  • docs/plugins/writing-go-plugin.mdx
  • docs/plugins/writing-wasm-plugin.mdx
  • framework/streaming/accumulator.go
  • framework/streaming/types.go
  • plugins/governance/advancedscenarios_test.go
  • plugins/governance/main.go
  • plugins/semanticcache/stream.go
✅ Files skipped from review due to trivial changes (1)
  • core/changelog.md
🚧 Files skipped from review as they are similar to previous changes (2)
  • core/providers/anthropic/responses.go
  • core/schemas/plugin_wasm.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • core/providers/gemini/transcription.go
  • core/providers/gemini/chat.go
  • core/providers/gemini/types.go
  • core/providers/gemini/responses.go
  • core/internal/testutil/file_base64.go
  • core/providers/openai/types.go
  • core/providers/bedrock/utils.go
  • core/providers/utils/utils.go
  • core/schemas/plugin.go
  • core/providers/openai/models.go
  • core/providers/gemini/gemini.go
🧠 Learnings (9)
📚 Learning: 2025-12-09T17:07:42.007Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/schemas/account.go:9-18
Timestamp: 2025-12-09T17:07:42.007Z
Learning: In core/schemas/account.go, the HuggingFaceKeyConfig field within the Key struct is currently unused and reserved for future Hugging Face inference endpoint deployments. Do not flag this field as missing from OpenAPI documentation or require its presence in the API spec until the feature is actively implemented and used. When the feature is added, update the OpenAPI docs accordingly; otherwise, treat this field as non-breaking and not part of the current API surface.

Applied to files:

  • core/providers/gemini/transcription.go
  • core/providers/gemini/chat.go
  • core/providers/gemini/types.go
  • core/providers/gemini/responses.go
  • core/internal/testutil/file_base64.go
  • core/providers/openai/types.go
  • core/providers/bedrock/utils.go
  • core/providers/utils/utils.go
  • core/schemas/plugin.go
  • core/providers/openai/models.go
  • core/providers/gemini/gemini.go
📚 Learning: 2025-12-29T11:54:55.836Z
Learnt from: akshaydeo
Repo: maximhq/bifrost PR: 1153
File: framework/configstore/rdb.go:2221-2246
Timestamp: 2025-12-29T11:54:55.836Z
Learning: In Go reviews, do not flag range-over-int patterns like for i := range n as compile-time errors, assuming Go 1.22+ semantics. Only flag actual range-capable values (slices, arrays, maps, channels, strings) and other compile-time issues. This applies to all Go files across the repository.

Applied to files:

  • core/providers/gemini/transcription.go
  • core/providers/gemini/chat.go
  • core/providers/gemini/types.go
  • core/providers/gemini/responses.go
  • core/internal/testutil/file_base64.go
  • core/providers/openai/types.go
  • core/providers/bedrock/utils.go
  • core/providers/utils/utils.go
  • core/schemas/plugin.go
  • core/providers/openai/models.go
  • core/providers/gemini/gemini.go
📚 Learning: 2025-12-19T09:26:54.961Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1006
File: core/providers/utils/utils.go:1050-1051
Timestamp: 2025-12-19T09:26:54.961Z
Learning: Update streaming end-marker handling so HuggingFace is treated as a non-[DONE] provider for backends that do not emit a DONE marker (e.g., meta llama on novita). In core/providers/utils/utils.go, adjust ProviderSendsDoneMarker() (or related logic) to detect providers that may not emit DONE and avoid relying on DONE as the sole end signal. Add tests to cover both DONE-emitting and non-DONE backends, with clear documentation in code comments explaining the rationale and any fallback behavior.

Applied to files:

  • core/providers/gemini/transcription.go
  • core/providers/gemini/chat.go
  • core/providers/gemini/types.go
  • core/providers/gemini/responses.go
  • core/providers/openai/types.go
  • core/providers/bedrock/utils.go
  • core/providers/utils/utils.go
  • core/providers/openai/models.go
  • core/providers/gemini/gemini.go
📚 Learning: 2026-01-10T11:27:47.535Z
Learnt from: Radheshg04
Repo: maximhq/bifrost PR: 1256
File: core/providers/openai/openai.go:2276-2385
Timestamp: 2026-01-10T11:27:47.535Z
Learning: Validate image generation requests for nil and missing prompts before dispatch. Follow the same pattern used here: core/bifrost.go validates nil/empty prompts, providerUtils.CheckContextAndGetRequestBody returns a structured error when the request converter yields nil, and apply this across all providers (including OpenAI) to avoid sending null bodies.

Applied to files:

  • core/providers/gemini/transcription.go
  • core/providers/gemini/chat.go
  • core/providers/gemini/types.go
  • core/providers/gemini/responses.go
  • core/providers/openai/types.go
  • core/providers/bedrock/utils.go
  • core/providers/utils/utils.go
  • core/providers/openai/models.go
  • core/providers/gemini/gemini.go
📚 Learning: 2025-12-19T08:29:20.286Z
Learnt from: qwerty-dvorak
Repo: maximhq/bifrost PR: 1095
File: core/internal/testutil/count_tokens.go:30-67
Timestamp: 2025-12-19T08:29:20.286Z
Learning: In core/internal/testutil test files, enforce using GetTestRetryConfigForScenario() to obtain a generic retry config, then construct a typed retry config (e.g., CountTokensRetryConfig, EmbeddingRetryConfig, TranscriptionRetryConfig) with an empty Conditions slice. Copy only MaxAttempts, BaseDelay, MaxDelay, OnRetry, and OnFinalFail from the generic config. This convention should be consistently applied across all test files in this directory.

Applied to files:

  • core/internal/testutil/file_base64.go
📚 Learning: 2025-12-11T11:58:25.307Z
Learnt from: Pratham-Mishra04
Repo: maximhq/bifrost PR: 1000
File: core/providers/openai/responses.go:42-84
Timestamp: 2025-12-11T11:58:25.307Z
Learning: In core/providers/openai/responses.go (and related OpenAI response handling), document and enforce the API format constraint: if ResponsesReasoning != nil and the response contains content blocks, all content blocks should be treated as reasoning blocks by default. Implement type guards or parsing logic accordingly, and add unit tests to verify that when ResponsesReasoning is non-nil, content blocks are labeled as reasoning blocks. Include clear comments in the code explaining the rationale and ensure downstream consumers rely on this behavior.

Applied to files:

  • core/providers/openai/types.go
  • core/providers/openai/models.go
📚 Learning: 2025-12-14T14:43:30.902Z
Learnt from: Radheshg04
Repo: maximhq/bifrost PR: 980
File: core/providers/openai/images.go:10-22
Timestamp: 2025-12-14T14:43:30.902Z
Learning: Enforce the OpenAI image generation SSE event type values across the OpenAI image flow in the repository: use "image_generation.partial_image" for partial chunks, "image_generation.completed" for the final result, and "error" for errors. Apply this consistently in schemas, constants, tests, accumulator routing, and UI code within core/providers/openai (and related Go files) to ensure uniform event typing and avoid mismatches.

Applied to files:

  • core/providers/openai/types.go
  • core/providers/openai/models.go
📚 Learning: 2025-12-26T05:46:14.625Z
Learnt from: TejasGhatte
Repo: maximhq/bifrost PR: 1172
File: core/providers/bedrock/responses.go:2929-2985
Timestamp: 2025-12-26T05:46:14.625Z
Learning: In Bedrock file handling (core/providers/bedrock/responses.go and utils.go), ensure that the FileData field is treated as either a proper data URL (prefix 'data:') or as plain text. Do not expect or accept naked base64-encoded strings without the data URL prefix. The code should first check for the data URL prefix; if missing, process the value as plain text. This pattern applies to all Go files in the bedrock provider directory.

Applied to files:

  • core/providers/bedrock/utils.go
📚 Learning: 2026-01-10T15:52:10.995Z
Learnt from: akshaydeo
Repo: maximhq/bifrost PR: 1298
File: core/providers/utils/utils.go:983-1006
Timestamp: 2026-01-10T15:52:10.995Z
Learning: In Go with fasthttp v1.68.0, Response.BodyStream() returns an io.Reader, but the concrete type also implements io.Closer. In reviews, consider adding a safe type assertion pattern to close the stream when possible, e.g. if closer, ok := bodyStream.(io.Closer); ok { if err := closer.Close(); err != nil { // handle } }. This check should be applied in all code paths where BodyStream() is consumed, not only in this specific file, to ensure proper resource cleanup when the underlying reader supports closing.

Applied to files:

  • core/providers/utils/utils.go
🧬 Code graph analysis (7)
core/providers/gemini/transcription.go (2)
core/providers/gemini/types.go (1)
  • SafeExtractSafetySettings (103-133)
core/schemas/utils.go (1)
  • SafeExtractStringMap (527-547)
core/providers/gemini/chat.go (2)
core/providers/gemini/types.go (1)
  • SafeExtractSafetySettings (103-133)
core/schemas/utils.go (1)
  • SafeExtractStringMap (527-547)
core/providers/gemini/responses.go (3)
core/schemas/responses.go (5)
  • ResponsesMessage (319-332)
  • ResponsesMessageContentBlock (404-419)
  • ResponsesInputMessageContentBlockTypeText (393-393)
  • ResponsesInputMessageRoleSystem (339-339)
  • ResponsesMessageContent (344-349)
core/providers/gemini/types.go (5)
  • Content (1013-1021)
  • Part (1027-1051)
  • SafeExtractSafetySettings (103-133)
  • Type (822-822)
  • Role (22-22)
core/schemas/utils.go (3)
  • SafeExtractFromMap (517-523)
  • SafeExtractString (278-295)
  • Ptr (14-16)
core/internal/testutil/file_base64.go (3)
core/internal/testutil/test_retry_framework.go (7)
  • GetTestRetryConfigForScenario (1186-1222)
  • TestRetryContext (174-179)
  • ChatRetryConfig (192-199)
  • ChatRetryCondition (126-129)
  • WithChatTestRetry (290-440)
  • ResponsesRetryConfig (212-219)
  • WithResponsesTestRetry (443-593)
core/internal/testutil/utils.go (2)
  • GetChatContent (368-395)
  • GetResponsesContent (414-459)
core/internal/testutil/validation_presets.go (2)
  • GetExpectationsForScenario (256-350)
  • ModifyExpectationsForProvider (357-404)
core/providers/bedrock/utils.go (5)
core/schemas/context.go (1)
  • BifrostContext (32-43)
core/schemas/chatcompletions.go (2)
  • BifrostChatRequest (10-17)
  • ChatParameters (154-184)
core/providers/bedrock/types.go (2)
  • BedrockConverseRequest (55-75)
  • BedrockTool (285-288)
core/schemas/utils.go (1)
  • SafeExtractStringMap (527-547)
core/schemas/bifrost.go (1)
  • BifrostContextKeyStructuredOutputToolName (140-140)
core/providers/utils/utils.go (4)
core/schemas/context.go (1)
  • BifrostContext (32-43)
core/schemas/provider.go (1)
  • PostHookRunner (314-314)
core/schemas/bifrost.go (5)
  • BifrostResponse (348-369)
  • BifrostContextKeyTracer (146-146)
  • BifrostContextKeyTraceID (142-142)
  • BifrostContextKeyStreamEndIndicator (130-130)
  • BifrostError (491-500)
core/schemas/tracer.go (1)
  • Tracer (37-115)
core/providers/gemini/gemini.go (4)
core/schemas/context.go (1)
  • BifrostContext (32-43)
core/schemas/bifrost.go (3)
  • BifrostError (491-500)
  • BifrostStream (456-463)
  • BifrostContextKeyStreamEndIndicator (130-130)
core/schemas/provider.go (1)
  • Provider (317-366)
core/providers/gemini/responses.go (1)
  • FinalizeGeminiResponsesStream (1376-1378)
🔇 Additional comments (33)
core/providers/bedrock/utils.go (2)

48-48: LGTM on BifrostContext migration and safer type extraction.

The function signature update to *schemas.BifrostContext aligns with the broader context migration in this PR. The use of schemas.SafeExtractStringMap at line 279 is a safer approach than direct type assertion, as it handles both map[string]string and map[string]interface{} types gracefully.

Also applies to: 279-281


678-732: LGTM on convertResponseFormatToTool migration.

The migration to *schemas.BifrostContext and the use of ctx.SetValue at line 720 aligns with the new context handling pattern. The function correctly extracts the JSON schema and sets the structured output tool name in the context.

core/providers/gemini/responses.go (2)

30-49: LGTM!

The system instruction handling correctly prepends system messages before content messages, maintaining proper message ordering. The nil checks ensure graceful handling when system instructions are empty or contain no text content.


1380-1420: LGTM!

The helper function correctly:

  1. Handles nil/empty input gracefully
  2. Extracts only text content from system instruction parts
  3. Uses ContentStr for single-block optimization vs ContentBlocks for multiple blocks
  4. Sets the appropriate system role

The pointer usage on line 1393 (&part.Text) is safe since part is a pointer to the struct element, not a copy.

core/internal/testutil/file_base64.go (6)

1-11: LGTM!

Imports are clean and all appear to be used in the file.


13-26: LGTM!

The updated documentation clearly describes the constant's purpose and content.


28-65: LGTM!

Both message creation helpers are well-structured and correctly use the schema types for their respective APIs.


67-79: LGTM!

Clean orchestration function that properly delegates to per-API subtests. The early return on unsupported scenarios is appropriate.


81-160: LGTM!

The Chat Completions test follows the established retry framework pattern correctly. The retry config is constructed with GetTestRetryConfigForScenario() and an empty Conditions slice as per the test conventions. Based on learnings, this is the expected approach for test files in this directory.


162-241: LGTM!

The Responses API test mirrors the Chat Completions test structure appropriately, using the correct API-specific retry config and content extraction helpers. The symmetry between the two test functions improves maintainability.

core/providers/openai/types.go (1)

528-544: LGTM!

The new OpenAIModel fields and OpenAIListModelsResponse type are well-structured. Using pointer types with omitempty for optional fields (Created, Active, ContextWindow) is appropriate, and the clear comment distinguishing GROQ-specific fields helps maintainability.

core/providers/openai/models.go (1)

9-33: LGTM!

The conversion logic is correct: nil-safety, proper capacity allocation, filtering by allowedModels, and consistent ID prefixing with providerKey.

core/providers/gemini/chat.go (1)

39-55: LGTM! Safe extraction helpers improve robustness.

The switch from direct type assertions to SafeExtractSafetySettings and schemas.SafeExtractStringMap correctly handles both pre-typed values (e.g., []SafetySetting) and JSON-deserialized generic types (e.g., []interface{}). This prevents panics when ExtraParams contains data from different serialization paths.

core/providers/gemini/transcription.go (1)

117-133: LGTM! Consistent with chat.go extraction pattern.

The safe extraction helpers are applied consistently here, matching the pattern in chat.go. This ensures uniform handling of ExtraParams across both chat and transcription request converters.

core/providers/gemini/gemini.go (7)

383-406: LGTM! Robust streaming lifecycle with proper cancellation handling.

The streaming goroutine correctly:

  1. Uses deferred cleanup to handle cancellation/timeout via ctx.Err() checks
  2. Sets up stream cancellation handler to close body stream when context is cancelled
  3. Handles nil body stream edge case by setting end indicator and sending error
  4. Defers are ordered correctly (cancellation handler runs first, then response release, then channel close)

874-910: LGTM! Stream finalization handles incomplete streams gracefully.

The finalization block correctly handles streams that end without an explicit completion event. It:

  1. Calls FinalizeGeminiResponsesStream to close any open items
  2. Properly skips nil responses with warnings
  3. Applies postResponseConverter consistently
  4. Sets the end indicator only on the last finalized response

1276-1293: LGTM! SpeechStream finalization follows consistent pattern.

The finalization correctly sends a SpeechStreamResponseTypeDone event with usage metadata, sets the end indicator, and includes raw request data if configured.


1567-1591: LGTM! TranscriptionStream finalization consistent with other streams.

The finalization correctly aggregates the full transcription text, includes token usage metadata, and sends a TranscriptionStreamResponseTypeDone event with the end indicator set.


1035-1038: LGTM! Context value propagation for response format.

Setting BifrostContextKeyResponseFormat in the context allows ToBifrostSpeechResponse to access the requested audio format for proper response encoding.


73-73: Context migration to *schemas.BifrostContext is consistent throughout.

The migration from context.Context to *schemas.BifrostContext is applied uniformly across all provider methods. The new context type enables:

  • Streaming lifecycle signaling via SetValue/GetValue
  • Unified cancellation and timeout handling
  • Post-hook processing integration

The streaming patterns are consistent with proper cleanup in deferred functions.


530-538: LGTM! Scanner error handling is consistent across all streaming handlers.

The error handling pattern correctly:

  1. Checks if context was cancelled/timed out first (letting defer handle those cases)
  2. Sets the stream end indicator
  3. Logs the error and sends it through the post-hook chain

This pattern is applied consistently in HandleGeminiChatCompletionStream, HandleGeminiResponsesStream, SpeechStream, and TranscriptionStream.

core/providers/utils/utils.go (8)

835-853: LGTM!

The context type migration to *schemas.BifrostContext is correctly applied, maintaining the existing functionality while enabling access to tracing infrastructure.


856-874: LGTM!

Consistent context type migration matching the pattern established in SendCreatedEventResponsesChunk.


937-981: LGTM!

Error handling correctly runs post-hooks before span finalization without chunk accumulation (which is appropriate for error cases). The same duplication pattern exists as in ProcessAndSendResponse - the optional refactor suggestion above could be applied here as well.


1016-1047: LGTM!

The atomic check-and-set pattern using GetAndSetValue correctly prevents duplicate handling. The 499 status code follows the nginx convention for client disconnection, and the error flows through the post-hook chain to properly update logging status.


1057-1088: LGTM!

Consistent implementation with HandleStreamCancellation, using the appropriate 504 Gateway Timeout status for deadline-exceeded scenarios.


1391-1433: LGTM!

The context type migration is correctly applied. The goroutine closure correctly captures the ctx pointer, ensuring all concurrent requests share the same context for coordinated cancellation.


1531-1621: Well-structured deferred span finalization with proper cleanup.

The implementation correctly:

  1. Defensively handles nil/empty values at each step
  2. Populates attributes from accumulated chunks (including TTFT and chunk count) before ending the span
  3. Finalizes post-hook spans as children of the LLM span before closing it
  4. Sets error attributes and appropriate span status
  5. Cleans up the deferred span from TraceStore

The context wrapping at line 1599 (context.WithValue(ctx, ...)) correctly leverages that *schemas.BifrostContext implements context.Context, producing a standard context.Context for the finalizer callback.


983-1006: Well-implemented stream cancellation with proper io.Closer handling.

The type assertion pattern for io.Closer correctly handles both fasthttp.BodyStream() (io.Reader) and net/http.resp.Body (io.ReadCloser). All 19 callers across providers (OpenAI, Mistral, Gemini, Anthropic, etc.) properly invoke the cleanup function via defer, ensuring the goroutine completes and the done channel is closed when streaming finishes.

core/schemas/plugin.go (4)

4-8: LGTM!

Imports are appropriate for the new functionality: sync for the pool, strings for case-insensitive lookups, and context for the ObservabilityPlugin.Inject method.


73-79: LGTM!

The HTTPResponse struct is well-designed for short-circuit scenarios. Not pooling responses is reasonable since they're only created when short-circuiting.


98-113: No action needed. The project's minimum Go version is 1.25.5 (confirmed across all go.mod files), which fully supports the clear() builtin introduced in Go 1.21. The code is compatible.


191-219: Well-designed observability extension interface.

The design follows Go idioms well:

  • Clean interface composition with Plugin
  • Using standard context.Context for the async method is appropriate
  • Type assertion for identification avoids marker method boilerplate

The documentation clearly communicates async execution semantics and fresh context behavior. The Trace type is properly defined and exported in core/schemas/trace.go with appropriate fields for trace representation.

## Summary

Added error handling for empty response streams in the Gemini provider to prevent potential nil pointer dereferences.

## Changes

- Added a check to verify if the response body stream is nil before attempting to scan it
- If the stream is nil, the code now creates a Bifrost operation error, sets the stream end indicator, and sends the error through the response channel
- This prevents the application from crashing when Gemini returns an empty response

## Type of change

- [x] Bug fix
- [ ] Feature
- [ ] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [x] Core (Go)
- [ ] Transports (HTTP)
- [x] Providers/Integrations
- [ ] Plugins
- [ ] UI (Next.js)
- [ ] Docs

## How to test

Test the Gemini provider with a scenario that would result in an empty response:

```sh
# Core/Transports
go version
go test ./core/providers/gemini/...

# Create a test that mocks an empty response from Gemini API
```

## Breaking changes

- [ ] Yes
- [x] No

## Related issues

Fixes potential nil pointer dereference when Gemini returns empty responses

## Security considerations

Prevents potential application crashes that could be triggered by empty responses from the Gemini API.

## Checklist

- [x] I read `docs/contributing/README.md` and followed the guidelines
- [x] I added/updated tests where appropriate
- [ ] I updated documentation where needed
- [x] I verified builds succeed (Go and UI)
- [x] I verified the CI pipeline passes locally if applicable
@akshaydeo akshaydeo merged commit 0e8c38f into main Jan 12, 2026
18 of 20 checks passed
@akshaydeo akshaydeo deleted the v1.4.0 branch January 12, 2026 09:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants