feat: support for image generation (#980) #1312
Conversation
This PR introduces image generation support to Bifrost and wires it into the existing gateway infrastructure. It adds core schemas, provider extensions, HTTP transport, streaming plumbing, cache integration, and initial UI support to address #950. Implementation is in progress: completed functionality is marked below, and the remaining work will land in follow-up commits before this PR is marked ready for review.

Scoped Changes:
- [x] New /v1/images/generations endpoint (OpenAI-compatible)
- [x] Image generation via Chat Completion API (tool use pattern)
- [x] Image generation via Responses API (native support)
- [x] Streaming image delivery (base64 chunks)
- [x] Semantic caching for image generation
- [x] UI components for image rendering
- [x] Provider implementations: OpenAI DALL-E, Azure DALL-E

Rollout Plan:
- [x] Phase 1: Core schema and provider implementation (OpenAI + Azure)
- [x] Phase 2: HTTP transport and non-streaming endpoint
- [x] Phase 3: Streaming support and accumulator
- [x] Phase 4: Semantic cache integration (Base64 storage, 5min TTL)
- [x] Phase 5: UI components and documentation

- [ ] Bug fix
- [x] Feature
- [ ] Refactor
- [x] Documentation
- [ ] Chore/CI

- [x] Core (Go)
- [x] Transports (HTTP)
- [x] Providers/Integrations
- [x] Plugins
- [x] UI (Next.js)
- [x] Docs

Unit Tests
- [x] Schema serialization/deserialization
- [x] Request transformation (Bifrost → OpenAI format)
- [x] Response transformation (OpenAI → Bifrost format)
- [x] Stream chunk accumulation
- [x] Cache key generation

Integration Tests
- [x] End-to-end image generation (non-streaming)
- [x] End-to-end streaming image generation
- [x] Fallback to secondary provider
- [x] Cache hit/miss scenarios
- [x] Error handling (rate limits, invalid prompts)

Load Tests
- [x] Concurrent image generation requests
- [x] Stream memory usage under load
- [x] Cache performance at scale

End-to-end image generation:

```sh
curl -X POST http://localhost:8080/v1/images/generations \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $API_KEY" \
  -d '{
    "model": "openai/dall-e-3",
    "prompt": "A serene Japanese garden with cherry blossoms",
    "size": "1024x1024",
    "quality": "high",
    "response_format": "b64_json"
  }'
```

Expected outcome:

```json
{
  "id": "abc123",
  "created": 1699999999,
  "model": "dall-e-3",
  "data": [
    {
      "b64_json": "iVBORw0KGgo...",
      "revised_prompt": "A tranquil Japanese garden featuring blooming cherry blossom trees...",
      "index": 0
    }
  ],
  "usage": {
    "prompt_tokens": 15,
    "total_tokens": 15
  },
  "extra_fields": {
    "provider": "openai",
    "latency_ms": 8500,
    "cache_debug": null
  }
}
```

Unit tests:

```sh
go test -v github.com/maximhq/bifrost/core/providers/openai -run TestImage
go test -v github.com/maximhq/bifrost/core -run TestImage
```

Expected outcome: PASS

Note: You might need to replace remote paths with local modules in go.mod (once this PR is merged, this directive is not needed):

```sh
replace github.com/maximhq/bifrost/core => ../../core
```

N/A

- [ ] Yes
- [x] No

Closes #950

N/A

- [x] I read `docs/contributing/README.md` and followed the guidelines
- [x] I added/updated tests where appropriate
- [x] I updated documentation where needed
- [x] I verified builds succeed (Go and UI)
- [ ] I verified the CI pipeline passes locally if applicable
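As a companion to the curl example above, here is a minimal Go client sketch against the same endpoint. The URL, payload fields, and bearer token mirror the request shown; the response struct is an assumption that decodes only `data[].b64_json` and `revised_prompt`, not the full schema.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"os"
)

// imageResult decodes only the fields this sketch cares about; the full
// response shape is shown in the JSON example above.
type imageResult struct {
	Data []struct {
		B64JSON       string `json:"b64_json"`
		RevisedPrompt string `json:"revised_prompt"`
	} `json:"data"`
}

func main() {
	payload := map[string]any{
		"model":           "openai/dall-e-3",
		"prompt":          "A serene Japanese garden with cherry blossoms",
		"size":            "1024x1024",
		"quality":         "high",
		"response_format": "b64_json",
	}
	body, _ := json.Marshal(payload)

	req, _ := http.NewRequest(http.MethodPost, "http://localhost:8080/v1/images/generations", bytes.NewReader(body))
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Authorization", "Bearer "+os.Getenv("API_KEY"))

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	var out imageResult
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		panic(err)
	}
	for i, img := range out.Data {
		fmt.Printf("image %d: %d base64 bytes (revised prompt: %q)\n", i, len(img.B64JSON), img.RevisedPrompt)
	}
}
```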
… tests to support image gen; minor code fixes
📝 Walkthrough

Adds end-to-end image generation: new request/response/stream types, Bifrost public methods and provider image APIs (implemented or stubbed), streaming accumulator and assembly, HTTP/OpenAPI routes, schemas, logging/storage/pricing, semantic-cache and tests, and UI components.
Sequence Diagram(s)
sequenceDiagram
participant Client
participant HTTPHandler
participant Bifrost
participant Provider
participant StreamAccumulator
participant RemoteAPI
Client->>HTTPHandler: POST /v1/images/generations (stream? true/false)
HTTPHandler->>Bifrost: ImageGenerationRequest / ImageGenerationStreamRequest
alt Streaming
Bifrost->>Provider: ImageGenerationStream(req)
Provider->>RemoteAPI: open streaming/SSE
RemoteAPI-->>Provider: stream chunks (partial/completed/error)
Provider->>StreamAccumulator: addImageStreamChunk(chunk)
StreamAccumulator->>StreamAccumulator: processImageStreamingResponse()
StreamAccumulator-->>Bifrost: assembled ImageGenerationStreamResponse(s)
else Non‑streaming
Bifrost->>Provider: ImageGeneration(req)
Provider->>RemoteAPI: HTTP POST
RemoteAPI-->>Provider: JSON ImageGenerationResponse
Provider-->>Bifrost: BifrostImageGenerationResponse
end
Bifrost-->>HTTPHandler: JSON or SSE (image chunks / final)
HTTPHandler-->>Client: 200 OK with image data
Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~120 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 3 failed (3 warnings)
🧪 Test Suite Available: This PR can be tested by a repository admin.
Actionable comments posted: 9
Note
Due to the large number of review comments, Critical severity comments were prioritized as inline comments.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (18)
docs/providers/supported-providers/perplexity.mdx (1)
31-33: Update the Note to include Image Generation in the unsupported operations list. For consistency with the updated table, add "Image Generation" to the list of unsupported operations in the Note.
📝 Suggested documentation update
<Note> -**Unsupported Operations** (❌): Text Completions, Embeddings, Speech, Transcriptions, Files, Batch, and List Models are not supported by the upstream Perplexity API. These return `UnsupportedOperationError`. +**Unsupported Operations** (❌): Text Completions, Embeddings, Image Generation, Speech, Transcriptions, Files, Batch, and List Models are not supported by the upstream Perplexity API. These return `UnsupportedOperationError`. </Note>

docs/providers/supported-providers/cohere.mdx (1)
18-33: Add Image Generation to the Unsupported Operations note to match the table. The table includes
`Image Generation | ❌ | ❌ | -` (line 25), but the note omits it from the unsupported operations list (lines 31-33). This inconsistency should be resolved by adding Image Generation to the note. Cohere does not offer image generation APIs (only multimodal embeddings and image understanding), so the ❌ status is correct.

Proposed doc fix
<Note> -**Unsupported Operations** (❌): Text Completions, Speech, Transcriptions, Files, and Batch are not supported by the upstream Cohere API. These return `UnsupportedOperationError`. +**Unsupported Operations** (❌): Text Completions, Image Generation, Speech, Transcriptions, Files, and Batch are not supported by the upstream Cohere API. These return `UnsupportedOperationError`. </Note>

core/schemas/responses.go (1)
540-575: Error message mentions image generation output but the struct doesn't handle it. The error messages on lines 556 and 574 reference "image generation call output" as a valid type, but
`ResponsesToolMessageOutputStruct` only has three fields and no corresponding field or handling logic for image generation output:
- `ResponsesToolCallOutputStr`
- `ResponsesFunctionToolCallOutputBlocks`
- `ResponsesComputerToolCallOutput`

The marshal/unmarshal methods don't check for or process image generation output data. Either add a field and handling for image generation output, or remove the reference from the error messages to match the actual struct capabilities.
plugins/semanticcache/stream.go (1)
91-129: Prevent a possible panic in stream chunk sorting (missing `j`-side type guard).
`ImageGenerationStreamResponse` sorting checks `i` but assumes `j` is also an image chunk; if a mixed variant ever lands in `accumulator.Chunks`, this will nil-deref at line 123/126. (This same pattern exists in the earlier branches too, but this new branch is an immediate place to harden.)

Proposed minimal fix (guard `j` before deref)
if accumulator.Chunks[i].Response.TranscriptionStreamResponse != nil { return accumulator.Chunks[i].Response.TranscriptionStreamResponse.ExtraFields.ChunkIndex < accumulator.Chunks[j].Response.TranscriptionStreamResponse.ExtraFields.ChunkIndex } - if accumulator.Chunks[i].Response.ImageGenerationStreamResponse != nil { - // For image generation, sort by Index first, then ChunkIndex - if accumulator.Chunks[i].Response.ImageGenerationStreamResponse.Index != accumulator.Chunks[j].Response.ImageGenerationStreamResponse.Index { - return accumulator.Chunks[i].Response.ImageGenerationStreamResponse.Index < accumulator.Chunks[j].Response.ImageGenerationStreamResponse.Index - } - return accumulator.Chunks[i].Response.ImageGenerationStreamResponse.ChunkIndex < accumulator.Chunks[j].Response.ImageGenerationStreamResponse.ChunkIndex - } + if iImg := accumulator.Chunks[i].Response.ImageGenerationStreamResponse; iImg != nil { + jImg := accumulator.Chunks[j].Response.ImageGenerationStreamResponse + if jImg == nil { + // Mixed response variants: keep stable input order, but don't panic. + return false + } + // For image generation, sort by Index first, then ChunkIndex + if iImg.Index != jImg.Index { + return iImg.Index < jImg.Index + } + return iImg.ChunkIndex < jImg.ChunkIndex + } return false })docs/providers/supported-providers/elevenlabs.mdx (1)
18-35: Keep the "Unsupported Operations" note consistent with the table. The table adds "Image Generation" as ❌, but the note's explicit list doesn't include it; worth updating to avoid confusion.
docs/providers/supported-providers/azure.mdx (1)
18-34: Reconcile the Image Generation endpoint in the table vs the section. The Supported Operations table says
`/openai/v1/images/generations`, but the Image Generation section uses `/openai/deployments/{deployment}/images/generations?api-version={version}`. Pick one or explicitly label them as "Gateway endpoint" vs "Azure upstream endpoint".

Also applies to: 364-433
core/internal/testutil/account.go (1)
676-710: Enable the ImageGeneration scenario flags for OpenAI. OpenAI has
`ImageGenerationModel: "dall-e-2"` configured, but the corresponding `Scenarios.ImageGeneration` and `Scenarios.ImageGenerationStream` flags are not enabled. This is inconsistent with other capabilities (e.g., `SpeechSynthesisModel` is paired with `SpeechSynthesis: true`). The image generation tests will be skipped for OpenAI unless these flags are set to `true`.

core/providers/gemini/types.go (1)
1114-1123: Update the `Blob.Data` comment to clarify base64 encoding semantics. The comment states "Required. Raw bytes." but the field actually contains a base64-encoded string. All callsites (utils.go, responses.go, speech.go, transcription.go, images.go) correctly encode bytes via
`encodeBytesToBase64String()`. Update the comment to: `// Required. Base64-encoded bytes.`

docs/providers/supported-providers/vertex.mdx (1)
20-31: Clarify parameter naming in the Image Generation request examples: `prompt` vs `input.prompt`. The "Core Parameter Mapping" table (line 325) shows
`input.prompt`, but the Gateway example (line 345) demonstrates `"prompt"` at the root level of the request JSON. This creates ambiguity about the expected request structure. The Go SDK example correctly reflects the struct (`Input: &ImageGenerationInput{Prompt: ...}`), but the Gateway curl example should be clarified: does it accept `"prompt"` directly, or should it be nested as `"input": {"prompt": ...}`? Update the documentation or table to remove this inconsistency.

The Endpoint column concern is already addressed in the "Endpoint Selection" section (lines 388-394), which clearly explains these are Vertex AI endpoints.
core/providers/gemini/speech.go (1)
146-148: Unsafe type assertion may cause a panic. Line 147 uses a type assertion without the comma-ok idiom. If
`BifrostContextKeyResponseFormat` is missing from the context or has a different type, this will panic at runtime.

🛠️ Suggested fix with safe type assertion
if len(audioData) > 0 { - responseFormat := ctx.Value(BifrostContextKeyResponseFormat).(string) + responseFormat, _ := ctx.Value(BifrostContextKeyResponseFormat).(string) // Gemini returns PCM audio (s16le, 24000 Hz, mono)

Using the comma-ok idiom ensures
`responseFormat` defaults to an empty string if the context value is missing or has an unexpected type, which will then fall through to the `else` branch returning raw `audioData`.

tests/integrations/python/tests/test_openai.py (1)
74-152: Import additions look fine; update the big module docstring test list to include image generation if that list is meant to stay authoritative. Not required for correctness, but right now the header docstring enumerates many cases and doesn't mention these new ones.
core/providers/bedrock/bedrock.go (1)
1360-1365: Fix the logger call to use the structured logging format: line 1362 passes a printf-style format string (`"%s"`) to `Logger.Error`, but the Logger interface uses structured logging with signature `Error(msg string, args ...any)`. The `"%s"` will not be interpolated; instead, the message will log literally and the argument becomes an unkeyed field. Use `fmt.Sprintf()` to format the message before logging.

Fix

- if err.Error != nil { - provider.logger.Error("file upload operation not allowed: %s", err.Error.Message) - } + if err.Error != nil { + provider.logger.Error(fmt.Sprintf("file upload operation not allowed: %s", err.Error.Message)) + } return nil, err

core/providers/openai/openai.go (1)
2286-2293: Transcription stream: re-add keepalive/comment skipping (":" lines) to avoid bogus JSON parsing.
Right now you only skip empty lines; SSE streams commonly include `:` keepalive lines (and sometimes `event:`). This likely regresses behavior vs other stream handlers in this file.

Proposed fix

- // Skip empty lines and comments - if line == "" { + // Skip empty lines and comments / keepalive pings + if line == "" || strings.HasPrefix(line, ":") || strings.HasPrefix(line, "event:") { continue }

framework/logstore/tables.go (1)
98-138: BuildContentSummary is missing image generation prompts, making them unsearchable. Image generation input prompts and outputs are stored but not included in
`BuildContentSummary()`, unlike speech input and transcription output. Add `ImageGenerationInputParsed.Prompt` to the searchable content summary so users can find logs by image generation prompts.

Additionally, storing full base64-encoded image outputs in the database may impact performance and storage. Consider whether this data should be truncated, metadata-only, or stored externally.
framework/streaming/types.go (1)

13-20: Initialize `Data` to an empty slice when constructing an empty `BifrostImageGenerationResponse`. In
`ToBifrostResponse` (framework/streaming/types.go, StreamTypeImage case), when creating an empty response: `imageResp = &schemas.BifrostImageGenerationResponse{}`

The `Data` field remains `nil` and marshals to JSON as `null`. Since the field lacks `omitempty`, this is inconsistent with other code paths (e.g., framework/streaming/images.go:102-110) that initialize `Data` as an empty slice. Initialize it to avoid serialization inconsistency: `imageResp = &schemas.BifrostImageGenerationResponse{ Data: make([]schemas.ImageData, 0), }`

transports/bifrost-http/handlers/inference.go (1)
1258-1334: Do not log entire streaming chunks on marshal failure (can leak base64 image data and explode logs). This line can print the full
`chunk` (including `b64_json` image bytes) into logs:
logger.Warn(..., chunk: %v", err, chunk)That’s both a data leakage risk and a reliability risk (huge log lines, backpressure, cost).
Also,
`skipDoneMarker` is inferred from seeing an image-gen chunk; if the channel closes without yielding chunks, you can still emit `[DONE]` for an image stream.
- chunkJSON, err := sonic.Marshal(chunk) + chunkJSON, err := sonic.Marshal(chunk) if err != nil { - logger.Warn(fmt.Sprintf("Failed to marshal streaming response: %v, chunk: %v", err, chunk)) + // Avoid logging full chunk (may contain large base64 payloads / sensitive data) + logger.Warn(fmt.Sprintf( + "Failed to marshal streaming response: %v (has_image=%t has_responses=%t has_error=%t)", + err, + chunk.BifrostImageGenerationStreamResponse != nil, + chunk.BifrostResponsesStreamResponse != nil, + chunk.BifrostError != nil, + )) continue } @@ - } else if chunk.BifrostError != nil { - eventType = string(schemas.ResponsesStreamResponseTypeError) + } else if chunk.BifrostError != nil { + // Prefer an image-generation-specific error type constant if one exists; otherwise "error". + eventType = "error" }Based on learnings, keeping image-generation SSE semantics consistent (“error” event, no
[DONE]) is important across transports/tests/UI.framework/streaming/accumulator.go (1)
460-491: Cleanup(): image chunks should be reset before pooling to avoid retaining large base64 buffers.
`Cleanup()` currently does `a.imageStreamChunkPool.Put(chunk)` without clearing `chunk.Delta` / `chunk.RawResponse` etc. For images, this can pin large memory in the pool.

Proposed fix

for _, chunk := range accumulator.ImageStreamChunks { - a.imageStreamChunkPool.Put(chunk) + a.putImageStreamChunk(chunk) }

core/providers/huggingface/huggingface.go (1)
220-265: `completeRequest` uses an image-specific error parser for all request types.
`parseHuggingFaceImageError()` is called for every non-200 response at line 245, regardless of whether the request is chat, embedding, audio, speech, or transcription. The codebase has two different error structures (`HuggingFaceResponseError` for images and `HuggingFaceHubError` for other endpoints), yet all errors flow through the image parser. This creates a risk of parsing failures if error formats differ by endpoint type.

The function receives the `isHFInferenceAudioRequest` and `isHFInferenceImageRequest` flags but lacks the request type context needed to select the appropriate error handler. `completeRequestWithModelAliasCache` already has access to `requestType`; pass it through to enable proper error branching.
🤖 Fix all issues with AI agents
In @core/providers/gemini/images.go:
- Around line 123-183: In ToBifrostImageGenerationResponse ensure
bifrostResp.Data is never nil (initialize to an empty []schemas.ImageData when
no inline images are found) and set ImageData.Index for each inline image part
(track the part index while iterating candidate.Content.Parts and assign it to
the ImageData.Index field when appending). Update the loop that builds imageData
to include an index counter and always assign bifrostResp.Data (either the
populated slice or an empty slice) before returning.
In @core/providers/huggingface/huggingface.go:
- Around line 1024-1278: The image stream chunks never set the Type field,
breaking consumers and tests; update the per-image chunk construction
(BifrostImageGenerationStreamResponse variable chunk inside the for loop) to set
chunk.Type = string(openai.ImageGenerationDelta) for non-final incremental
chunks, and set finalChunk.Type = string(openai.ImageGenerationCompleted) when
building the completion chunk (finalChunk). Ensure the Type is set consistently
whether URL or B64JSON is used and preserve existing ExtraFields and raw
request/response handling.
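A sketch of the chunk typing described above. The string values for the delta/completed types are assumptions that mirror the `image_generation.partial_image` / `image_generation.completed` SSE event names mentioned elsewhere in this review; the actual `openai.ImageGenerationDelta` / `openai.ImageGenerationCompleted` constants may differ.

```go
package huggingface

// Assumed constants; see lead-in note.
const (
	imageGenerationDelta     = "image_generation.partial_image"
	imageGenerationCompleted = "image_generation.completed"
)

type imageStreamChunk struct {
	Type    string
	B64JSON string
	URL     string
	Index   int
}

// buildChunks tags every incremental chunk with the delta type (whether the
// payload is a URL or base64) and appends one final completed chunk.
func buildChunks(images []imageStreamChunk) []imageStreamChunk {
	out := make([]imageStreamChunk, 0, len(images)+1)
	for i, img := range images {
		img.Type = imageGenerationDelta
		img.Index = i
		out = append(out, img)
	}
	out = append(out, imageStreamChunk{Type: imageGenerationCompleted})
	return out
}
```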
In @core/providers/nebius/nebius.go:
- Around line 244-343: The ImageGeneration method can panic when accessing
request.Params.ExtraParams["ai_project_id"] if ExtraParams is nil; update the
conditional that currently checks request.Params != nil to also ensure
request.Params.ExtraParams != nil before indexing (i.e., only read
request.Params.ExtraParams["ai_project_id"] when ExtraParams is non-nil), and
preserve the existing logic that url-escapes and appends the ai_project_id to
path; modify the block in ImageGeneration that inspects request.Params to
perform this nil check on ExtraParams.
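A minimal sketch of the guarded lookup described above, using simplified request/params types; exactly how the escaped `ai_project_id` is appended to the path is an assumption here.

```go
package nebius

import "net/url"

type params struct {
	ExtraParams map[string]any
}

type request struct {
	Params *params
}

// appendAIProjectID only indexes ExtraParams when both Params and ExtraParams
// are non-nil, and URL-escapes the value before appending it to the path.
func appendAIProjectID(path string, req *request) string {
	if req == nil || req.Params == nil || req.Params.ExtraParams == nil {
		return path
	}
	projectID, ok := req.Params.ExtraParams["ai_project_id"].(string)
	if !ok || projectID == "" {
		return path
	}
	return path + "/" + url.PathEscape(projectID)
}
```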
In @core/providers/vertex/vertex.go:
- Around line 1420-1437: Guard against a nil request and validate the prompt
before calling getModelDeployment: at the start of
VertexProvider.ImageGeneration, check if request == nil and return a
providerUtils.NewConfigurationError with a clear message (using providerName),
then validate the prompt field on the schemas.BifrostImageGenerationRequest
(e.g., ensure request.Prompt is not nil/empty after trimming) and return a
similar configuration error if missing/empty; perform these checks before
calling provider.getModelDeployment(key, request.Model) so the function cannot
panic and fails fast on invalid input.
- Around line 1493-1559: The code incorrectly allows API-key query auth for
custom fine-tuned, Imagen, and Gemini endpoints (authQuery, completeURL
construction in the branches for schemas.IsAllDigitsASCII,
schemas.IsImagenModel, schemas.IsGeminiModel) and conditionally skips OAuth2;
remove the authQuery logic entirely (do not set or append "?key=...") and always
obtain an OAuth2 token via getAuthTokenSource and token.AccessToken, ensuring
the Authorization header is always set before the request; update the branch
code to only build completeURL (no authQuery) and remove the conditional that
bypasses token fetching when key.Value is present so that all requests use
bearer tokens.
In @framework/streaming/images.go:
- Around line 13-110: The special-case branch in
buildCompleteImageFromImageStreamChunks can nil-deref when the single final
chunk has Delta == nil and it also treats URL-only streams as empty; update the
initial single-chunk check to verify chunks[0].Delta != nil before accessing
fields, and when reconstructing images collect both Delta.B64JSON and Delta.URL
(or treat URL-only chunks as valid image entries) into the images/result
construction so the function doesn't return nil for URL-only streams; also
prefer using chunk.Delta.CreatedAt (if non-zero/non-empty) when setting Created
on the final response (falling back to time.Now() only if provider CreatedAt is
missing), and ensure RevisedPrompt handling and response ID selection still work
when some chunks have nil Delta by guarding accesses to chunk.Delta.
- Around line 207-337: The request-id handling and ImageIndex fallback are
incorrect: in processImageStreamingResponse change the request-id branch to log
the missing/empty requestID and return nil, nil (to match the comment) instead
of returning an error, and fix the ImageIndex selection so it doesn't always
choose Index (which defaults to 0); update the logic around
ImageGenerationStreamResponse.Index and SequenceNumber to use Index only when it
is explicitly set (e.g., Index != 0) and otherwise fall back to SequenceNumber
so chunk ordering is preserved.
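A tiny sketch of the index-selection rule described in the second item above, treating a zero `Index` as "not explicitly set" (a simplifying assumption):

```go
package streaming

// chooseImageIndex trusts the provider-supplied Index only when it is
// explicitly set; otherwise it falls back to the SequenceNumber so chunk
// ordering is preserved.
func chooseImageIndex(index, sequenceNumber int) int {
	if index != 0 {
		return index
	}
	return sequenceNumber
}
```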
🟠 Major comments (23)
docs/providers/supported-providers/mistral.mdx-31-33 (1)
31-33: Correct the attribution: the claim that the "upstream Mistral API" does not support image generation is factually inaccurate. Mistral added image generation support in late 2024 and expanded it in 2025. The unsupported operations are limitations of Bifrost's current integration, not Mistral's API capabilities. Update the docs to attribute the limitation correctly to the integration layer.
Proposed fix
-**Unsupported Operations** (❌): Text Completions, Image Generation, Speech (TTS), Files, and Batch are not supported by the upstream Mistral API. These return `UnsupportedOperationError`. +**Unsupported Operations** (❌): Text Completions, Image Generation, Speech (TTS), Files, and Batch are not currently supported by Bifrost's Mistral integration. These return `UnsupportedOperationError`.

Also update the Unsupported Features table (lines 237–245) to reflect this distinction.
transports/bifrost-http/integrations/genai.go-436-440 (1)
436-440: Consolidate the model detection function with the canonical implementation in core/schemas. The function
`isImagenModel()` duplicates the logic of `schemas.IsImagenModel()` (in core/schemas/utils.go), which is already used throughout the codebase (vertex.go, gemini.go). Since genai.go already imports the schemas package, use `schemas.IsImagenModel(modelStr)` instead of maintaining a local copy. This eliminates unnecessary duplication and ensures consistent model detection across the codebase.

The substring matching approach using `strings.Contains(modelLower, "imagen")` is acceptable given that Imagen models follow a consistent naming convention (e.g., "imagen-3.0-generate-001"), so no change to the detection logic is needed, only consolidation to the existing canonical function.

transports/bifrost-http/integrations/router.go-1085-1087 (1)
1085-1087: The streaming path can also panic if `ImageGenerationStreamResponseConverter` isn't set.
Right now it's called unconditionally when an image stream chunk arrives. Consider failing fast before starting the stream if the request is image-generation streaming but the converter is nil.

Proposed fix (fail fast for streaming converter)
@@ func (g *GenericRouter) handleStreamingRequest(ctx *fasthttp.RequestCtx, config RouteConfig, bifrostReq *schemas.BifrostRequest, bifrostCtx *schemas.BifrostContext, cancel context.CancelFunc) { @@ // Check if streaming is configured for this route if config.StreamConfig == nil { @@ } + + // Fail fast on misconfigured image-generation streaming routes + if bifrostReq.ImageGenerationRequest != nil && config.StreamConfig.ImageGenerationStreamResponseConverter == nil { + cancel() + g.sendStreamError(ctx, bifrostCtx, config, newBifrostError(nil, "ImageGenerationStreamResponseConverter not configured")) + return + } // Handle streaming using the centralized approach // Pass cancel function so it can be called when the writer exits (errors, completion, etc.) g.handleStreaming(ctx, bifrostCtx, config, stream, cancel) }Also applies to: 1285-1287
transports/bifrost-http/integrations/router.go-203-210 (1)
203-210: Good plumbing, but add safety rails for missing image-generation converters (avoid panics). Route config validation currently doesn't enforce these new converters, and later code assumes they're non-nil. Suggest adding runtime checks where they're used (similar to the existing CountTokens converter guard) so misconfigured routes fail with a 5xx/clear error instead of panicking.
Proposed fix (nil guards similar to CountTokens)
@@ case bifrostReq.ImageGenerationRequest != nil: imageGenerationResponse, bifrostErr := g.client.ImageGenerationRequest(bifrostCtx, bifrostReq.ImageGenerationRequest) if bifrostErr != nil { g.sendError(ctx, bifrostCtx, config.ErrorConverter, bifrostErr) return } @@ if imageGenerationResponse == nil { g.sendError(ctx, bifrostCtx, config.ErrorConverter, newBifrostError(nil, "Bifrost response is nil after post-request callback")) return } + if config.ImageGenerationResponseConverter == nil { + g.sendError(ctx, bifrostCtx, config.ErrorConverter, newBifrostError(nil, "ImageGenerationResponseConverter not configured")) + return + } + // Convert Bifrost response to integration-specific format and send response, err = config.ImageGenerationResponseConverter(bifrostCtx, imageGenerationResponse)Also applies to: 257-264, 278-309
tests/integrations/python/tests/test_openai.py-1166-1253 (1)
1166-1253: Guard cross-provider image params (`quality`, non-square `size`) to avoid flaky failures.
`quality="low"` and `size="1024x1536"` are not universally supported across providers/models; if the `image_generation` scenario can include providers that don't support those fields, these will fail hard rather than skip. Consider:
- capability filtering in `get_cross_provider_params_with_vk_for_scenario("image_generation")`, or
- skipping/xfailing when the provider/model doesn't support `quality` / the requested size, or
- catching a known "unsupported" error and `pytest.skip(...)`.

ui/components/chat/ImageMessage.tsx-39-49 (1)
39-49: Pass `output_format` to the ImageMessage component to construct the correct MIME type for `b64_json`.
The OpenAI Images API supports multiple output formats (`png`, `jpeg`, `webp`). Hardcoding `data:image/png` breaks rendering when upstream emits other formats. The `BifrostImageGenerationOutput` contains `output_format` in `params`, but this information is lost in `ImageMessageData`. Either include the format in the data object passed to `ImageMessage` and use it to construct the correct data URL, or add a `mime_type` field to preserve the actual image format.

core/providers/openai/openai_image_test.go-13-107 (1)
13-107: Add an “empty prompt” test to match request validation expectations
You cover nil request and nil input, but not `Input.Prompt == ""`. Given the repo emphasis on rejecting missing prompts before dispatch, this test suite should lock in the intended behavior. Based on learnings, validate nil/empty prompts before dispatch.

core/providers/elevenlabs/elevenlabs.go-442-442 (1)
442-442: Verify `GetBifrostResponseForStreamResponse` argument ordering after signature expansion
These call sites assume the “speech stream response” position is still the 4th argument; if the helper’s parameter list was extended for image generation, it’s easy to shift the slot and silently route the chunk into the wrong union field. A focused unit test around the helper would guard this.Also applies to: 464-464
tests/integrations/python/tests/utils/common.py-2629-2705 (1)
2629-2705: Fix the Imagen "predictions" validation: the object path incorrectly uses `hasattr()` instead of extracting the actual base64 value.
In the Imagen prediction branch, when `prediction` is not a dict, `has_b64` becomes a boolean from `hasattr()` rather than the actual base64 string. This allows assertions to pass even when `bytesBase64Encoded` is `None` or empty. Additionally, the Gemini `inlineData` parsing doesn't account for casing variations (`inlineData` vs `inline_data`, `mimeType` vs `mime_type`) that can occur in actual API responses.

Proposed fix
elif (isinstance(response, dict) and "predictions" in response) or hasattr(response, "predictions"): # Imagen response predictions = response.get("predictions") if isinstance(response, dict) else response.predictions assert len(predictions) > 0, "Response should have at least one prediction" for i, prediction in enumerate(predictions): - has_b64 = (prediction.get("bytesBase64Encoded") if isinstance(prediction, dict) - else (hasattr(prediction, "bytesBase64Encoded") or hasattr(prediction, "bytes_base64_encoded"))) - assert has_b64, f"Prediction {i} should have base64 encoded bytes" + if isinstance(prediction, dict): + b64 = ( + prediction.get("bytesBase64Encoded") + or prediction.get("bytes_base64_encoded") + or prediction.get("bytesBase64") + ) + else: + b64 = ( + getattr(prediction, "bytes_base64_encoded", None) + or getattr(prediction, "bytesBase64Encoded", None) + or getattr(prediction, "bytesBase64", None) + ) + assert b64, f"Prediction {i} should contain base64 encoded bytes" + assert isinstance(b64, str), f"Prediction {i} base64 should be a string, got {type(b64)}" + assert len(b64) > 100, f"Prediction {i} base64 seems too short" inline_data = part.get("inlineData") if isinstance(part, dict) else getattr(part, "inline_data", None) if inline_data: found_image = True - mime_type = inline_data.get("mimeType") if isinstance(inline_data, dict) else getattr(inline_data, "mime_type", "") - data = inline_data.get("data") if isinstance(inline_data, dict) else getattr(inline_data, "data", "") + mime_type = ( + (inline_data.get("mimeType") or inline_data.get("mime_type", "")) if isinstance(inline_data, dict) + else (getattr(inline_data, "mime_type", "") or getattr(inline_data, "mimeType", "")) + ) + data = inline_data.get("data") if isinstance(inline_data, dict) else getattr(inline_data, "data", "") assert mime_type.startswith("image/"), \ f"Expected image mime type, got {mime_type}" assert len(data) > 100, "Image data seems too short"Also update the
`inlineData` lookup to handle casing: - inline_data = part.get("inlineData") if isinstance(part, dict) else getattr(part, "inline_data", None) + inline_data = ( + (part.get("inlineData") or part.get("inline_data")) if isinstance(part, dict) + else (getattr(part, "inline_data", None) or getattr(part, "inlineData", None)) + )

core/providers/nebius/images.go-11-71 (1)
11-71: Add prompt + size validation (empty/zero/negative) and consider normalizing case.
`ToNebiusImageGenerationRequest`: currently allows an empty `Prompt` and non-positive `Width`/`Height` (e.g., `"0x-1"`). This should fail early (per learnings: validate nil/missing prompts before dispatch). The `OutputFormat` conversion only matches `"jpeg"` exactly; consider `strings.ToLower` before comparison.

Proposed fix
diff --git a/core/providers/nebius/images.go b/core/providers/nebius/images.go @@ func (provider *NebiusProvider) ToNebiusImageGenerationRequest(bifrostReq *schemas.BifrostImageGenerationRequest) (*NebiusImageGenerationRequest, error) { if bifrostReq == nil || bifrostReq.Input == nil { return nil, fmt.Errorf("bifrost request is nil or input is nil") } + if strings.TrimSpace(bifrostReq.Input.Prompt) == "" { + return nil, fmt.Errorf("prompt is required") + } @@ width, err := strconv.Atoi(size[0]) if err != nil { return nil, fmt.Errorf("invalid width in size %q: %w", *bifrostReq.Params.Size, err) } @@ height, err := strconv.Atoi(size[1]) if err != nil { return nil, fmt.Errorf("invalid height in size %q: %w", *bifrostReq.Params.Size, err) } + if width <= 0 || height <= 0 { + return nil, fmt.Errorf("invalid size %q: width and height must be > 0", *bifrostReq.Params.Size) + } @@ if bifrostReq.Params.OutputFormat != nil { - req.ResponseExtension = bifrostReq.Params.OutputFormat + ext := strings.ToLower(strings.TrimSpace(*bifrostReq.Params.OutputFormat)) + req.ResponseExtension = &ext } - if req.ResponseExtension != nil && *req.ResponseExtension == "jpeg" { + if req.ResponseExtension != nil && *req.ResponseExtension == "jpeg" { req.ResponseExtension = schemas.Ptr("jpg") }core/providers/nebius/images.go-62-68 (1)
62-68: Use `SafeExtractFloat64Pointer` and round to `int` for the guidance_scale extraction.
`NebiusImageGenerationRequest.GuidanceScale` is typed as `*int` (0-100 range), but the current `SafeExtractIntPointer` will fail silently if the input is a float. Guidance scale is commonly provided as a fractional value (e.g., 7.5). Instead, extract as float64 and convert to int with proper rounding:

if v, ok := schemas.SafeExtractFloat64Pointer(bifrostReq.Params.ExtraParams["guidance_scale"]); ok { rounded := int(math.Round(*v)) req.GuidanceScale = &rounded }

plugins/semanticcache/utils.go-114-116 (1)
114-116: Potential nil deref in hash generation for image requests.
`hashInput.Params = req.ImageGenerationRequest.Params` will panic if `RequestType` is image-generation but `req.ImageGenerationRequest` is nil.

Proposed fix

case schemas.ImageGenerationRequest, schemas.ImageGenerationStreamRequest: - hashInput.Params = req.ImageGenerationRequest.Params + if req.ImageGenerationRequest != nil { + hashInput.Params = req.ImageGenerationRequest.Params + }

core/providers/openai/images.go (1)
31-42: Fix the nil receiver and avoid always emitting an empty `params`. Two issues:
- `request` can be nil (panic on `request.Model`).
- `Params: &request.ImageGenerationParameters` forces `params` to be present even when empty (`{}`), which can change semantics and cache keys.

Proposed fix

func (request *OpenAIImageGenerationRequest) ToBifrostImageGenerationRequest() *schemas.BifrostImageGenerationRequest { + if request == nil { + return nil + } provider, model := schemas.ParseModelString(request.Model, schemas.OpenAI) + var params *schemas.ImageGenerationParameters + if request.ImageGenerationParameters != (schemas.ImageGenerationParameters{}) { + p := request.ImageGenerationParameters + params = &p + } + return &schemas.BifrostImageGenerationRequest{ Provider: provider, Model: model, Input: &schemas.ImageGenerationInput{ Prompt: request.Prompt, }, - Params: &request.ImageGenerationParameters, + Params: params, Fallbacks: schemas.ParseFallbacks(request.Fallbacks), } }

docs/openapi/paths/inference/images.yaml (1)
3-57: OpenAPI: move `description` out of the `text/event-stream` media type; document the partial/error SSE event types too.
- `description` under `content > text/event-stream` is not a valid Media Type Object field in OAS; put that text in the `200.description` or in the referenced schema's `description`.
- Also document the `image_generation.partial_image` and `error` event types (not just `image_generation.completed`). Based on learnings, enforce the SSE event type values consistently.

core/providers/openai/openai.go-2496-2533 (1)
2496-2533: Streaming ImageGeneration: nil-check is good; also validate Input/prompt.
You guard `request == nil`, but you can still send a null/empty prompt (and then rely on provider errors). Align this with the non-streaming validation and the repo's image-generation validation pattern. Based on learnings, validate missing prompts before dispatch.

core/providers/openai/openai.go-2382-2491 (1)
2382-2491: Validate nil request / missing prompt before dispatching image generation.
`ImageGeneration()` doesn't guard `req == nil` or `req.Input == nil` / empty prompt; later you deref `request.Model` in error paths. This should follow the existing pattern used elsewhere in the repo for image generation validation. Based on learnings, enforce nil/empty prompt checks.

Proposed fix

func (provider *OpenAIProvider) ImageGeneration(ctx *schemas.BifrostContext, key schemas.Key, req *schemas.BifrostImageGenerationRequest) (*schemas.BifrostImageGenerationResponse, *schemas.BifrostError) { + if req == nil || req.Input == nil || strings.TrimSpace(req.Input.Prompt) == "" { + return nil, providerUtils.NewBifrostOperationError("invalid request: prompt is required", nil, provider.GetProviderKey()) + } + if err := providerUtils.CheckOperationAllowed(schemas.OpenAI, provider.customProviderConfig, schemas.ImageGenerationRequest); err != nil { return nil, err }

plugins/semanticcache/plugin_image_generation_test.go-103-110 (1)
103-110: Config missing required fields (Provider, Keys, Dimension) disables semantic caching. The test intends to exercise semantic search but silently falls back to direct image generation.
`Config{Threshold: 0.5}` omits `Provider`, `Keys`, and `Dimension`, causing `Init()` to log a warning and disable semantic caching. Match the pattern from `NewTestSetup()` and the tests in plugin_core_test.go by providing all required fields alongside the custom `Threshold`.

framework/streaming/types.go-360-387 (1)
360-387: Ensure image responses always marshal `data` as `[]`, not `null`.
When `imageResp == nil`, you create `&schemas.BifrostImageGenerationResponse{}` but don't initialize `Data`. Given core/schemas/images.go defines ``Data []ImageData `json:"data"` ``, a nil slice can serialize to `null`, which is a common client-breaker.

Proposed fix

case StreamTypeImage: imageResp := p.Data.ImageGenerationOutput if imageResp == nil { imageResp = &schemas.BifrostImageGenerationResponse{} + imageResp.Data = []schemas.ImageData{} if p.RequestID != "" { imageResp.ID = p.RequestID } if p.Model != "" { imageResp.Model = p.Model } }

core/providers/azure/azure.go-1212-1252 (1)
1212-1252: Add nil/prompt validation before accessing `request.Model` (prevents a panic and bad upstream calls).
`request.Model` is read at line 1222; if `request == nil` you'll panic. Also validate `request.Input.Prompt` to avoid dispatching empty prompts. Based on learnings, this should be consistently enforced.

Proposed fix
func (provider *AzureProvider) ImageGeneration(ctx *schemas.BifrostContext, key schemas.Key, request *schemas.BifrostImageGenerationRequest) (*schemas.BifrostImageGenerationResponse, *schemas.BifrostError) { + if request == nil || request.Input == nil || strings.TrimSpace(request.Input.Prompt) == "" { + return nil, providerUtils.NewBifrostOperationError("prompt is required", nil, provider.GetProviderKey()) + } + if strings.TrimSpace(request.Model) == "" { + return nil, providerUtils.NewBifrostOperationError("model is required", nil, provider.GetProviderKey()) + } // Validate api key configs if err := provider.validateKeyConfig(key); err != nil { return nil, err }Based on learnings, please apply the same guard pattern across providers that implement image generation.
tests/integrations/python/tests/test_google.py-1690-1853 (1)
1690-1853: Don’t let missing GEMINI_API_KEY / transient HTTP issues become hard failures or get silently skipped.
- `test_41a_*` / `test_41c_*` call `get_api_key(provider)` but are not decorated with `@skip_if_no_api_key(...)`, so they will raise and fail the suite when the env isn't set.
- `test_41b_imagen_predict` catches `Exception` and skips, which can hide real regressions (e.g., schema/response shape changes). Narrow the exception and only skip on expected "not available" conditions.

Proposed diff
@@ # IMAGE GENERATION TEST CASES # ========================================================================= + @skip_if_no_api_key("gemini") @pytest.mark.parametrize("provider,model", get_cross_provider_params_for_scenario("image_generation")) def test_41a_image_generation_simple(self, test_config, provider, model): @@ - from .utils.config_loader import get_integration_url, get_config - from .utils.common import get_api_key + from .utils.config_loader import get_integration_url + from .utils.common import get_api_key @@ response = requests.post(url, json=payload, headers=headers, timeout=60) assert response.status_code == 200, f"Request failed with status {response.status_code}: {response.text}" @@ - @pytest.mark.parametrize("provider,model", get_cross_provider_params_for_scenario("imagen")) + @skip_if_no_api_key("gemini") + @pytest.mark.parametrize("provider,model", get_cross_provider_params_for_scenario("imagen")) def test_41b_imagen_predict(self, test_config, provider, model): @@ - try: - response = requests.post(url, json=payload, headers=headers, timeout=60) - assert response.status_code == 200, f"Request failed with status {response.status_code}: {response.text}" - - response_data = response.json() - - # Validate response structure - assert_valid_image_generation_response(response_data, "google") - except Exception as e: - # Imagen may not be available in all regions or configurations - pytest.skip(f"Imagen generation failed (may not be available): {e}") + response = requests.post(url, json=payload, headers=headers, timeout=60) + if response.status_code in (404, 501): + pytest.skip(f"Imagen endpoint not available (status={response.status_code})") + response.raise_for_status() + response_data = response.json() + assert_valid_image_generation_response(response_data, "google") @@ + @skip_if_no_api_key("gemini") @pytest.mark.parametrize("provider,model", get_cross_provider_params_for_scenario("image_generation")) def test_41c_image_generation_with_text(self, test_config, provider, model): @@ - if "candidates" in response_data and response_data["candidates"]: - for candidate in response_data["candidates"]: - if "content" in candidate and "parts" in candidate["content"]: - for part in candidate["content"]["parts"]: - if "text" in part and part["text"]: - found_text = True - if "inlineData" in part and part["inlineData"]: - found_image = True + for candidate in (response_data.get("candidates") or []): + for part in ((candidate.get("content") or {}).get("parts") or []): + if part.get("text"): + found_text = True + if part.get("inlineData"): + found_image = Trueframework/streaming/accumulator.go-293-321 (1)
293-321: addImageStreamChunk: FinalTimestamp shouldn’t depend on de-dup + fix indentation bug.
Right now FinalTimestamp is only set when the chunk is "new"; if a final chunk is duplicated (or arrives after a duplicate key), FinalTimestamp can remain unset. Also, line 317 indentation is off (easy to miss in review; can hide logic mistakes).

Proposed fix
func (a *Accumulator) addImageStreamChunk(requestID string, chunk *ImageStreamChunk, isFinalChunk bool) error { acc := a.getOrCreateStreamAccumulator(requestID) acc.mu.Lock() defer acc.mu.Unlock() // ... chunkKey := imageChunkKey(chunk.ImageIndex, chunk.ChunkIndex) if _, seen := acc.ImageChunksSeen[chunkKey]; !seen { acc.ImageChunksSeen[chunkKey] = struct{}{} acc.ImageStreamChunks = append(acc.ImageStreamChunks, chunk) - if isFinalChunk { - acc.FinalTimestamp = chunk.Timestamp - } } + if isFinalChunk { + acc.FinalTimestamp = chunk.Timestamp + } return nil }core/providers/gemini/gemini.go-1614-1671 (1)
1614-1671: ImageGeneration: add nil/empty prompt validation + guard against nil request-body conversion.
`schemas.IsImagenModel(request.Model)` will panic if `request` is nil, and the request converter can still return `nil` (e.g., missing `Input`). Also, per learnings, the prompt should be validated before dispatch.

Proposed fix
func (provider *GeminiProvider) ImageGeneration(ctx *schemas.BifrostContext, key schemas.Key, request *schemas.BifrostImageGenerationRequest) (*schemas.BifrostImageGenerationResponse, *schemas.BifrostError) { + if request == nil || request.Input == nil || strings.TrimSpace(request.Input.Prompt) == "" { + return nil, providerUtils.NewBifrostOperationError( + schemas.ErrProviderCreateRequest, + fmt.Errorf("image generation prompt is required"), + provider.GetProviderKey(), + ) + } + // Check if image gen is allowed for this provider if err := providerUtils.CheckOperationAllowed(schemas.Gemini, provider.customProviderConfig, schemas.ImageGenerationRequest); err != nil { return nil, err } // check for imagen models if schemas.IsImagenModel(request.Model) { return provider.handleImagenImageGeneration(ctx, key, request) } // Prepare body jsonData, bifrostErr := providerUtils.CheckContextAndGetRequestBody( ctx, request, - func() (any, error) { return ToGeminiImageGenerationRequest(request), nil }, + func() (any, error) { + reqBody := ToGeminiImageGenerationRequest(request) + if reqBody == nil { + return nil, fmt.Errorf("image generation request could not be converted to Gemini format") + } + return reqBody, nil + }, provider.GetProviderKey())core/providers/gemini/gemini.go-1673-1749 (1)
1673-1749: Imagen handler: respect configured BaseURL/path overrides and avoid logging entire image payloads.
- `baseURL := "https://generativelanguage.googleapis.com/..."` ignores `provider.networkConfig.BaseURL` and `providerUtils.GetPathFromContext(...)` overrides (common in this repo for routing/proxy).
- `Debug(... string(resp.Body()))` can log huge base64 payloads (and potentially user content), which is risky and noisy.

Proposed fix
func (provider *GeminiProvider) handleImagenImageGeneration(ctx *schemas.BifrostContext, key schemas.Key, request *schemas.BifrostImageGenerationRequest) (*schemas.BifrostImageGenerationResponse, *schemas.BifrostError) { providerName := provider.GetProviderKey() // Prepare Imagen request body jsonData, bifrostErr := providerUtils.CheckContextAndGetRequestBody( ctx, request, - func() (any, error) { return ToImagenImageGenerationRequest(request), nil }, + func() (any, error) { + reqBody := ToImagenImageGenerationRequest(request) + if reqBody == nil { + return nil, fmt.Errorf("imagen request could not be converted (missing input?)") + } + return reqBody, nil + }, providerName) if bifrostErr != nil { return nil, bifrostErr } - baseURL := fmt.Sprintf("https://generativelanguage.googleapis.com/v1beta/models/%s:predict", request.Model) + path := providerUtils.GetPathFromContext(ctx, "/models/"+request.Model+":predict") + baseURL := provider.networkConfig.BaseURL + path // ... if resp.StatusCode() != fasthttp.StatusOK { - provider.logger.Debug(fmt.Sprintf("error from %s provider (Imagen): %s", providerName, string(resp.Body()))) + provider.logger.Debug(fmt.Sprintf("error from %s provider (Imagen): status=%d", providerName, resp.StatusCode())) return nil, parseGeminiError(resp, &providerUtils.RequestMetadata{ Provider: providerName, Model: request.Model, RequestType: schemas.ImageGenerationRequest, }) }
Force-pushed from 56f1f0b to 40211ea
Actionable comments posted: 17
Note
Due to the large number of review comments, Critical, Major severity comments were prioritized as inline comments.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (9)
plugins/semanticcache/utils.go (2)
88-116: Guard against a nil `ImageGenerationRequest` in request-hash generation.
`hashInput.Params = req.ImageGenerationRequest.Params` will panic if `req.ImageGenerationRequest` is nil (even if `RequestType` says image gen). Other cases sometimes guard (e.g., Speech). Suggest mirroring that pattern.

Proposed diff
@@ case schemas.ImageGenerationRequest, schemas.ImageGenerationStreamRequest: - hashInput.Params = req.ImageGenerationRequest.Params + if req.ImageGenerationRequest != nil { + hashInput.Params = req.ImageGenerationRequest.Params + } }
379-413: Don’t store full base64 images in semantic-cache metadata (likely size/cost blow-up).
`metadata["response"]` already contains the whole image generation response; additionally storing `metadata["image_b64"]` duplicates the largest fields and can exceed vector-store metadata limits / increase write costs. Strongly suggest storing only URLs (or a short hash/preview) and maybe counts, not full b64 blobs.

Proposed diff (drop b64 duplication; keep urls + revised prompts)
@@ if res.ImageGenerationResponse != nil { var imageURLs []string - var imageB64 []string var revisedPrompts []string @@ if img.B64JSON != "" { - imageB64 = append(imageB64, img.B64JSON) + // Intentionally not duplicating base64 blobs into metadata. } @@ metadata["image_urls"] = imageURLs - metadata["image_b64"] = imageB64 metadata["revised_prompts"] = revisedPrompts }plugins/semanticcache/stream.go (1)
91-129: Fix a potential nil deref in the image-generation chunk sorting comparator. In the new block,
`accumulator.Chunks[j].Response.ImageGenerationStreamResponse` may be nil even when `i` is non-nil, which would panic during sort.

Proposed diff (nil-safe compare)
@@ if accumulator.Chunks[i].Response.ImageGenerationStreamResponse != nil { + if accumulator.Chunks[j].Response.ImageGenerationStreamResponse == nil { + // Push non-image-gen chunks after image-gen chunks deterministically + return true + } // For image generation, sort by Index first, then ChunkIndex if accumulator.Chunks[i].Response.ImageGenerationStreamResponse.Index != accumulator.Chunks[j].Response.ImageGenerationStreamResponse.Index { return accumulator.Chunks[i].Response.ImageGenerationStreamResponse.Index < accumulator.Chunks[j].Response.ImageGenerationStreamResponse.Index } return accumulator.Chunks[i].Response.ImageGenerationStreamResponse.ChunkIndex < accumulator.Chunks[j].Response.ImageGenerationStreamResponse.ChunkIndex }transports/bifrost-http/integrations/genai.go (2)
36-62: Fix misrouting: `IsEmbedding` can shadow `IsImageGeneration`.
Right now `RequestConverter` checks `geminiReq.IsEmbedding` before `geminiReq.IsImageGeneration`. In `extractAndSetModelFromURL`, `:predict` can set `IsEmbedding = true` while `isImageGenerationRequest(r)` can also set `IsImageGeneration = true`, causing the request to be treated as embedding. Make these flags mutually exclusive (or reorder the converter).

Proposed fix (mutual exclusivity + safer ordering)
- } else if geminiReq.IsEmbedding { + } else if geminiReq.IsImageGeneration { + return &schemas.BifrostRequest{ + ImageGenerationRequest: geminiReq.ToBifrostImageGenerationRequest(), + }, nil + } else if geminiReq.IsEmbedding { return &schemas.BifrostRequest{ EmbeddingRequest: geminiReq.ToBifrostEmbeddingRequest(), }, nil } ... - } else if geminiReq.IsImageGeneration { - return &schemas.BifrostRequest{ - ImageGenerationRequest: geminiReq.ToBifrostImageGenerationRequest(), - }, nil - }- r.IsImageGeneration = isImagenPredict || isImageGenerationRequest(r) + r.IsImageGeneration = isImagenPredict || isImageGenerationRequest(r) + if r.IsImageGeneration { + r.IsEmbedding = false + r.IsSpeech = false + r.IsTranscription = false + r.IsCountTokens = false + }Also applies to: 413-431
362-369: Avoid panics: type-assert the `model` path param safely.
`modelStr := model.(string)` can panic if the router ever passes a non-string. Defensive `ok` checking is cheap here.

Proposed fix

- modelStr := model.(string) + modelStr, ok := model.(string) + if !ok || modelStr == "" { + return fmt.Errorf("model parameter must be a non-empty string") + }

framework/streaming/types.go (1)
345-393: Bug: Transcription response extra fields look truncated (RawResponse/CacheDebug missing).
Right now StreamTypeTranscription sets RawRequest then falls into StreamTypeImage without populating RawResponse/CacheDebug (unlike other response types).

Proposed fix
case StreamTypeTranscription: transcriptionResp := p.Data.TranscriptionOutput if transcriptionResp == nil { transcriptionResp = &schemas.BifrostTranscriptionResponse{} } resp.TranscriptionResponse = transcriptionResp resp.TranscriptionResponse.ExtraFields = schemas.BifrostResponseExtraFields{ RequestType: schemas.TranscriptionRequest, Provider: p.Provider, ModelRequested: p.Model, Latency: p.Data.Latency, } if p.RawRequest != nil { resp.TranscriptionResponse.ExtraFields.RawRequest = p.RawRequest } + if p.Data.RawResponse != nil { + resp.TranscriptionResponse.ExtraFields.RawResponse = *p.Data.RawResponse + } + if p.Data.CacheDebug != nil { + resp.TranscriptionResponse.ExtraFields.CacheDebug = p.Data.CacheDebug + } case StreamTypeImage:transports/bifrost-http/handlers/inference.go (1)
1258-1334: Major: the `[DONE]` emission decision should not depend on the last processed chunk (Responses/ImageGen correctness).
Right now `includeEventType` is reset per chunk, but the post-loop `[DONE]` check uses its final value. If the last non-nil chunk doesn't match the typed-event condition, you could incorrectly emit `[DONE]` for Responses/ImageGen streams. Based on learnings, Responses-style streams must not rely on `[DONE]`.

Proposed fix
- var includeEventType bool + var includeEventType bool // Use streaming response writer ctx.Response.SetBodyStreamWriter(func(w *bufio.Writer) { @@ - var skipDoneMarker bool + var skipDoneMarker bool + var streamUsesTypedEvents bool @@ for chunk := range stream { @@ includeEventType = false if chunk.BifrostResponsesStreamResponse != nil || chunk.BifrostImageGenerationStreamResponse != nil || (chunk.BifrostError != nil && (chunk.BifrostError.ExtraFields.RequestType == schemas.ResponsesStreamRequest || chunk.BifrostError.ExtraFields.RequestType == schemas.ImageGenerationStreamRequest)) { includeEventType = true + streamUsesTypedEvents = true } @@ - if !includeEventType && !skipDoneMarker { + if !streamUsesTypedEvents && !skipDoneMarker { // Send the [DONE] marker ...core/providers/huggingface/huggingface.go (2)
114-218: URL path inconsistency in the retry logic for the hfInference provider. The initial request builds the URL with
`originalModelName` (line 128), but the retry (line 202) rebuilds it with `modelName` (validated) for non-skipValidation cases. Since hfInference embeds the model name in the URL path (`/hf-inference/models/{modelName}/...`), the retry will hit a different endpoint than the initial attempt.

The skipValidation optimization correctly avoids this for falAI, nebius, and together image generation by ensuring `retryModelName` equals `originalModelName`. However, for hfInference and other request types where `skipValidation` is false, the URL changes on retry after cache invalidation, creating inconsistent retry behavior.

Either the initial URL should be built after validation, or the retry should consistently use `originalModelName` as originally provided.

220-265: Branch error parsing on `isHFInferenceImageRequest` or request type; all non-200 responses currently route through `parseHuggingFaceImageError` regardless of request type.
The parameter `isHFInferenceImageRequest` is unused in the error handler (lines 251-252). This causes chat, embedding, speech, and transcription errors to be parsed through an image-specific error parser, which is incorrect. Additionally, metadata is passed as `nil`, unlike the streaming path (line 1108), losing Provider/Model/RequestType context in error responses.

Refactor to either branch error parsing on `isHFInferenceImageRequest` or pass `requestType` to `completeRequest` and select the appropriate error handler, and include metadata.
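A sketch of the request-type branch described above. The request-type tags and error shapes are simplified placeholders standing in for `schemas.RequestType`, `HuggingFaceResponseError`, and `HuggingFaceHubError`; the parser helpers here are hypothetical.

```go
package huggingface

import "fmt"

type requestType string

const (
	imageGenerationRequest requestType = "image_generation"
	chatCompletionRequest  requestType = "chat_completion"
)

type providerError struct {
	Message string
}

// parseErrorBody routes a non-200 body to the parser matching the request
// type, instead of funneling everything through the image-specific parser.
func parseErrorBody(rt requestType, body []byte) *providerError {
	switch rt {
	case imageGenerationRequest:
		return parseImageError(body)
	default:
		return parseHubError(body)
	}
}

func parseImageError(body []byte) *providerError {
	return &providerError{Message: fmt.Sprintf("image endpoint error: %s", body)}
}

func parseHubError(body []byte) *providerError {
	return &providerError{Message: fmt.Sprintf("hub error: %s", body)}
}
```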
🤖 Fix all issues with AI agents
In @core/providers/azure/azure.go:
- Around line 1212-1305: The ImageGenerationStream method should not attempt SSE
streaming for Azure because Azure OpenAI image generation doesn't support
streaming; replace the current implementation of ImageGenerationStream to
immediately return
providerUtils.NewUnsupportedOperationError(schemas.ImageGenerationStreamRequest,
provider.GetProviderKey()). Also ensure the non-streaming ImageGeneration path
sends the required "model" field: verify
openai.HandleOpenAIImageGenerationRequest is called with request.Model
(deployment) included in the request body/params per Azure docs (the function
call in ImageGeneration and the handler HandleOpenAIImageGenerationRequest
should ensure the "model" field is present).
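A minimal sketch of the early-return behavior described above, with simplified types standing in for the real schemas/providerUtils helpers (the actual fix returns `providerUtils.NewUnsupportedOperationError`):

```go
package azure

import "errors"

type provider struct{ key string }

func (p *provider) GetProviderKey() string { return p.key }

// ImageGenerationStream: Azure OpenAI image generation has no SSE variant, so
// the method refuses the operation up front instead of opening a stream.
func (p *provider) ImageGenerationStream() (<-chan struct{}, error) {
	return nil, errors.New("image generation streaming is not supported by provider " + p.GetProviderKey())
}
```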
In @core/providers/gemini/gemini.go:
- Around line 1673-1749: The Imagen handler hard-codes the endpoint URL in
handleImagenImageGeneration which bypasses provider.networkConfig.BaseURL and
providerUtils.GetPathFromContext; change the baseURL construction to use
provider.networkConfig.BaseURL + providerUtils.GetPathFromContext(ctx,
"/models/"+request.Model+":predict") (preserving the model substitution) so the
request respects configured BaseURL and any context path overrides, then use
that baseURL for req.SetRequestURI as before.
In @core/providers/gemini/images.go:
- Around line 123-180: The ToBifrostImageGenerationResponse function can
overwrite bifrostResp.Data with a nil slice and may panic when iterating
candidate.Content.Parts if any part is nil; ensure you keep bifrostResp.Data as
the initialized empty slice when no images are found by only assigning imageData
when it is non-nil/has length, and add nil guards when iterating
candidate.Content.Parts (check that candidate.Content != nil and each part !=
nil before accessing part.InlineData and part.InlineData.MIMEType); also only
append to imageMetadata and set ImageGenerationResponseParameters when you
actually pushed metadata elements.
In @core/providers/openai/openai_test.go:
- Around line 43-83: The test scenarios block named TestScenarios currently
disables every scenario except ImageGeneration/ImageGenerationStream; restore
full coverage by re-enabling the needed scenarios (set the boolean flags like
TextCompletion, SimpleChat, CompletionStream, MultiTurnConversation, ToolCalls,
Embedding, Transcription, ListModels, File* tests, Batch* tests,
StructuredOutputs, etc. to true) or replace the manual override with the
canonical default (e.g., use testutil.DefaultScenarios or the existing baseline
scenario constant) so that openai_test.go runs the full suite rather than only
image generation before merging.
In @core/providers/openai/openai.go:
- Around line 2637-2641: The streaming goroutine for image responses is missing
the context cancellation and timeout cleanup used elsewhere; update the
goroutine that currently defers close(responseChan) and
providerUtils.ReleaseStreamingResponse(resp) to also defer the same
cancellation/timeout handlers used in HandleOpenAIChatCompletionStreaming —
e.g., add defer calls to providerUtils.HandleStreamingContextCancellation(ctx,
cancel, resp, responseChan) and providerUtils.HandleStreamingExitTimeout(ctx,
cancel, resp, responseChan) (or the exact handler names used in that handler) so
the stream correctly propagates cancellations/timeouts and performs identical
cleanup.
In @docs/providers/supported-providers/nebius.mdx:
- Around line 153-156: Update the guidance_scale type and extraction to use
float64: change the documented type for guidance_scale from int to float in the
provider docs, change the struct field GuidanceScale *int to GuidanceScale
*float64 in the Nebius provider types, and replace the use of
SafeExtractIntPointer() with SafeExtractFloat64Pointer() where guidance_scale is
parsed (e.g., in the image extraction logic that currently calls
SafeExtractIntPointer for guidance_scale) so the value is handled as a float64
pointer throughout.
In @framework/modelcatalog/pricing.go:
- Around line 101-113: The code captures imageUsage for
ImageGenerationStreamResponse but CalculateCostFromUsage only treats request
types equal to schemas.ImageGenerationRequest as image pricing; to fix,
normalize stream request types to the image request type before calling
CalculateCostFromUsage (or pass a derived requestType variable): detect when
imageUsage != nil and extraFields.RequestType indicates a stream (e.g.,
"image_generation_stream" or the enum for ImageGenerationStreamRequest) and set
requestType = schemas.ImageGenerationRequest (or the equivalent constant) so
CalculateCostFromUsage treats stream responses as images; update the call site
that uses extraFields.RequestType to use this normalized requestType (affected
symbols: imageUsage, extraFields.RequestType, ImageGenerationStreamResponse,
CalculateCostFromUsage).
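For illustration, the normalization could be as small as the following sketch; the constant names come from the comment above, while the function and import wiring are assumptions rather than the repository's actual code.

```go
package modelcatalog

import "github.com/maximhq/bifrost/core/schemas" // import path assumed from the module layout

// normalizeRequestTypeForPricing collapses the streaming variant onto the
// plain image-generation request type so image pricing applies to streamed
// responses as well.
func normalizeRequestTypeForPricing(requestType schemas.RequestType, hasImageUsage bool) schemas.RequestType {
	if hasImageUsage && requestType == schemas.ImageGenerationStreamRequest {
		return schemas.ImageGenerationRequest
	}
	return requestType
}
```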
In @framework/streaming/images.go:
- Around line 13-110: In buildCompleteImageFromImageStreamChunks, preserve the
original provider timestamps and any streamed URL fields: don't set Created to
time.Now().Unix() for the multi-chunk path—use the first non-empty
chunk.Delta.CreatedAt (same source used in the single-chunk path) when
assembling finalResponse.ID/Created, and while reconstructing images, collect
and set per-image URL values (chunk.Delta.URL) similar to how B64JSON is
concatenated so the final schemas.BifrostImageGenerationResponse.Data entries
include URL when present (use the URL from the final chunk for that image if
multiple chunks provide it).
In @plugins/semanticcache/plugin_image_generation_test.go:
- Around line 11-216: Both tests are integration tests that should be gated;
update TestImageGenerationCacheBasicFunctionality and
TestImageGenerationSemanticSearch to early-skip when running in short mode or
when the OpenAI API key is missing (os.Getenv("OPENAI_API_KEY") == ""), and
consider allowing an opt-in build tag for CI. Insert checks at the top of each
test (use testing.Short() and the env var) to call t.Skipf with a clear message
if gating conditions are not met, and keep the existing
CreateContextWithCacheKey/CreateImageGenerationRequest logic unchanged so only
the tests’ entry points are gated.
In @tests/integrations/python/tests/test_google.py:
- Around line 1691-1856: Rename the unused fixture parameter test_config to
_test_config in test_41a_image_generation_simple, test_41b_imagen_predict, and
test_41c_image_generation_with_text to silence ARG002; in
test_41b_imagen_predict replace the broad except Exception with targeted
handlers for network/HTTP and JSON decode errors (catch
requests.exceptions.RequestException and json.JSONDecodeError around the
requests.post/response.json() block); and in test_41c_image_generation_with_text
replace the nested "if 'candidates' in response_data and
response_data['candidates']" and deep indexing with safe .get() calls (e.g.,
response_data.get("candidates", []) and candidate.get("content",
{}).get("parts", [])) when iterating to avoid KeyError and improve readability.
In @ui/app/workspace/logs/views/imageView.tsx:
- Around line 83-93: The pagination buttons render icons only and lack
accessible labels; update the two Button usages that call goToPrevious and
goToNext (the elements rendering ChevronLeft and ChevronRight) to include
descriptive aria-label attributes (e.g., "Previous image" and "Next image") and
optionally add matching title attributes for hover/tooltips so screen readers
and mouse users can discover the controls.
In @ui/lib/types/logs.ts:
- Around line 142-174: The BifrostImageGenerationData interface currently
requires index but the UI doesn't rely on it; change the
BifrostImageGenerationData type so index is optional (index?: number) to match
ImageMessageData and avoid forcing callers to manufacture it; then run a quick
grep for usages of BifrostImageGenerationData (and any code referencing
image.index) and ensure callers use the existing fallback pattern (e.g.,
image.index ?? 0) so nothing breaks.
🟡 Minor comments (6)
transports/bifrost-http/handlers/inference.go-1414-1499 (1)
1414-1499: Good: request validation and streaming hookup are in the right place; minor message nit.
The early model parsing + prompt checks are good; consider changing `"prompt can not be empty"` to `"prompt cannot be empty"` for consistency.

docs/openapi/schemas/inference/images.yaml-21-31 (1)
21-31: Missing `"auto"` option in size enum.

The Go struct `ImageGenerationParameters` (from relevant snippets) includes `"auto"` as a valid size option, but it's missing from this OpenAPI schema.

📝 Suggested fix
```diff
       size:
         type: string
         enum:
           - "256x256"
           - "512x512"
           - "1024x1024"
           - "1792x1024"
           - "1024x1792"
           - "1536x1024"
           - "1024x1536"
+          - "auto"
         description: Size of the generated image
```

core/providers/openai/openai.go-2857-2860 (1)
2857-2860: Missing context error check before processing scanner errors.

Other streaming handlers check `ctx.Err()` before processing scanner errors to let the defer handle cancellation/timeout. Without this check, scanner errors from cancelled contexts may be incorrectly logged and processed.

🐛 Proposed fix
```diff
 	if err := scanner.Err(); err != nil {
+		// If context was cancelled/timed out, let defer handle it
+		if ctx.Err() != nil {
+			return
+		}
 		logger.Warn(fmt.Sprintf("Error reading stream: %v", err))
 		providerUtils.ProcessAndSendError(ctx, postHookRunner, err, responseChan, schemas.ImageGenerationStreamRequest, providerName, request.Model, logger)
 	}
```

core/providers/gemini/images.go-348-389 (1)
348-389: MIME parsing fallback should strip parameters (e.g., `image/png; charset=binary`).

`convertMimeTypeToExtension` will return `png; charset=binary` today. Consider splitting on `;` first.
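A small standard-library sketch of the parameter stripping; the helper name here is illustrative, not the repo's.

```go
package gemini

import (
	"mime"
	"strings"
)

// normalizeMimeType drops parameters so "image/png; charset=binary" maps
// cleanly to "image/png" before any extension lookup.
func normalizeMimeType(contentType string) string {
	if mt, _, err := mime.ParseMediaType(contentType); err == nil {
		return mt
	}
	base, _, _ := strings.Cut(contentType, ";")
	return strings.TrimSpace(base)
}
```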
2947-3087: Minor: function name/comment mismatch (`StreamValidationRetry` vs `WithImageGenerationStreamRetry`).

Either rename the function to `WithImageGenerationStreamValidationRetry` or adjust the comment to match the actual API.
182-250: Use the `SafeExtractSafetySettings` helper and validate that the prompt is not empty before creating the request.

The `SafetySettings` extraction at lines 201–209 bypasses the type-safe helper already defined and used elsewhere in the codebase. Replace the direct type assertion `safetySettings.([]SafetySetting)` with the `SafeExtractSafetySettings` helper (see `gemini/transcription.go:123` and `gemini/responses.go:125` for the pattern).

Additionally, validate that the prompt is not empty before returning the request. Return `nil` when `bifrostReq.Input == nil` or the prompt is blank, so `CheckContextAndGetRequestBody` can raise a structured error instead of allowing a malformed request to propagate downstream.
🧹 Nitpick comments (39)
core/schemas/plugin.go (1)
49-51: Stale comment references old function name.

The comment still references the unexported `caseInsensitiveLookup` name, but the function has been renamed to the exported `CaseInsensitiveLookup`.

📝 Suggested fix
```diff
-// caseInsensitiveLookup looks up a key in a case-insensitive manner for a map of strings
+// CaseInsensitiveLookup looks up a key in a case-insensitive manner for a map of strings
 // Returns the value if found, otherwise an empty string
 func CaseInsensitiveLookup(data map[string]string, key string) string {
```
25-25: Verify the endpoint value in the supported operations table.

The endpoint column shows `/generateContent` or `/predict` (Imagen), but the endpoint selection section (lines 392-394) shows more complete paths. Consider aligning with the detailed endpoint format for consistency, or clarify that these are simplified endpoint suffixes.
571-576: Consider clarifying valid aspect ratio mappings.

The size conversion mentions aspect ratios like `"1:1"`, `"3:4"`, `"4:3"`, `"9:16"`, `"16:9"`, but it's unclear what happens if a user provides a size that doesn't map cleanly to these ratios (e.g., `"800x600"`, which is 4:3, but `"850x600"`, which isn't). Consider adding a note about how non-standard ratios are handled or if they default to the nearest supported ratio.

framework/configstore/tables/modelpricing.go (1)
40-46: LGTM! Consider adding cache creation cost for symmetry.

The new image generation pricing fields are well-structured with proper GORM column tags and nullable types. The naming follows the existing conventions.

One consideration: you have `CacheReadInputImageTokenCost` but no corresponding `CacheCreationInputImageTokenCost`. The text token pricing has both `CacheReadInputTokenCost` and `CacheCreationInputTokenCost`. If image token caching follows the same pattern, you may want to add the creation cost field for completeness.

Optional: Add cache creation cost field for symmetry
```diff
 	// Image generation pricing
 	InputCostPerImageToken       *float64 `gorm:"default:null;column:input_cost_per_image_token" json:"input_cost_per_image_token,omitempty"`
 	OutputCostPerImageToken      *float64 `gorm:"default:null;column:output_cost_per_image_token" json:"output_cost_per_image_token,omitempty"`
 	InputCostPerImage            *float64 `gorm:"default:null;column:input_cost_per_image" json:"input_cost_per_image,omitempty"`
 	OutputCostPerImage           *float64 `gorm:"default:null;column:output_cost_per_image" json:"output_cost_per_image,omitempty"`
 	CacheReadInputImageTokenCost *float64 `gorm:"default:null;column:cache_read_input_image_token_cost" json:"cache_read_input_image_token_cost,omitempty"`
+	CacheCreationInputImageTokenCost *float64 `gorm:"default:null;column:cache_creation_input_image_token_cost" json:"cache_creation_input_image_token_cost,omitempty"`
```

core/providers/xai/xai.go (1)
230-244: Minor grammar fix in doc comment.

Line 230: "a image" → "an image".

Verify xAI image generation API support.

This implementation assumes xAI's API is OpenAI-compatible for image generation at `/v1/images/generations`. Please confirm xAI actually exposes this endpoint. Does the xAI Grok API support an image generation endpoint?

plugins/semanticcache/utils.go (2)
3-15: Verify `sonic` is safe/compatible for this repo's Go/runtime constraints before relying on it.

`sonic` can be great for throughput, but it's an additional dependency with different performance/unsafe characteristics than `encoding/json`. Please confirm it's supported in your Go version/build flags and that marshaling output is acceptable for how the cached `metadata["response"]` is consumed.

477-516: Image-generation input normalization + params→metadata extraction are reasonable; consider a key-collision strategy.

The new `extractImageGenerationParametersToMetadata` mirrors other param extractors well. One thing to sanity-check across the whole stack: metadata keys like `"n"`, `"seed"`, `"size"` are now shared across multiple request types—if downstream filtering assumes keys are type-specific, you may want a `"request_type"` tag or `image_*` prefixing.

Also applies to: 650-657, 956-1008
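If you go the prefixing route, a rough sketch (function name and tag values are illustrative):

```go
// withImagePrefix tags image-generation parameters so keys like "n" or "size"
// cannot collide with metadata written by other request types.
func withImagePrefix(params map[string]interface{}) map[string]interface{} {
	out := make(map[string]interface{}, len(params)+1)
	out["request_type"] = "image_generation"
	for k, v := range params {
		out["image_"+k] = v // e.g. "size" -> "image_size"
	}
	return out
}
```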
tests/integrations/python/tests/test_openai.py (1)
1170-1257: Reduce flakiness in cross-provider image-gen tests (lint + per-parameter compatibility).
- Unused fixture arg (Ruff ARG002): rename `test_config` → `_test_config` in `test_52a`..`test_52d`.
- Quality/size compatibility: `quality="low"` and `"1024x1536"` won't be accepted by every provider/model even if they support basic image generation—consider skipping on "unsupported parameter / invalid size" errors unless the scenario is guaranteed to be that strict.

Proposed diff (lint-only)
@@ - def test_52a_image_generation_simple(self, test_config, provider, model, vk_enabled): + def test_52a_image_generation_simple(self, _test_config, provider, model, vk_enabled): @@ - def test_52b_image_generation_multiple(self, test_config, provider, model, vk_enabled): + def test_52b_image_generation_multiple(self, _test_config, provider, model, vk_enabled): @@ - def test_52c_image_generation_quality(self, test_config, provider, model, vk_enabled): + def test_52c_image_generation_quality(self, _test_config, provider, model, vk_enabled): @@ - def test_52d_image_generation_different_sizes(self, test_config, provider, model, vk_enabled): + def test_52d_image_generation_different_sizes(self, _test_config, provider, model, vk_enabled):core/providers/nebius/nebius_test.go (1)
47-49: Minor: Consider consistent ordering of test scenario flags.

The `ImageGenerationStream` flag (line 49) is placed after `CompleteEnd2End` (line 48), whereas in `gemini_test.go` it immediately follows `ImageGeneration`. Consider aligning the ordering for consistency across provider tests.

Suggested reordering for consistency
MultipleImages: true, ImageGeneration: true, + ImageGenerationStream: false, CompleteEnd2End: true, - ImageGenerationStream: false, Embedding: true, // Nebius supports embeddingsdocs/providers/supported-providers/xai.mdx (1)
124-141: LGTM - Image generation documentation section.

The documentation clearly explains request/response conversion, endpoint path, and streaming limitations. The reference to the OpenAI Image Generation docs is helpful for detailed parameter information.
Minor style suggestion: The static analysis tool flagged three lines starting with similar patterns (Model, Parameters, Endpoint). Consider varying the format for improved readability:
📝 Optional style improvement
-- **Model & Prompt**: `bifrostReq.Model` → `req.Model`, `bifrostReq.Input.Prompt` → `req.Prompt` -- **Parameters**: All fields from `bifrostReq.Params` (`ImageGenerationParameters`) are embedded directly into the request struct via struct embedding -- **Endpoint**: `/v1/images/generations` +| Field | Mapping | +|-------|---------| +| Model & Prompt | `bifrostReq.Model` → `req.Model`, `bifrostReq.Input.Prompt` → `req.Prompt` | +| Parameters | All fields from `bifrostReq.Params` (`ImageGenerationParameters`) embedded via struct embedding | +| Endpoint | `/v1/images/generations` |core/providers/nebius/types.go (1)
26-29: Consider the Go naming convention for the URL acronym.

Go convention prefers `URL` over `Url` for acronyms. However, since this must match the JSON field name `"url"` for API compatibility, this is acceptable as-is.
222-222: Remove leftover implementation comment.

This comment appears to be a leftover note from development and should be removed as it's redundant — the function is already in this file.
🧹 Proposed fix
```diff
-// In validation_presets.go - add this function
+// ImageGenerationExpectations returns validation expectations for image generation scenarios
```

core/providers/openai/openai_image_test.go (1)
12-13: Consider renaming the test function for clarity.

The function name `TestImageGenerationStreamingRequestConversion` suggests it tests streaming-specific behavior, but it actually tests the general `ToOpenAIImageGenerationRequest` conversion. Consider renaming to `TestToOpenAIImageGenerationRequest` for accuracy.

✏️ Suggested rename
-// TestImageGenerationStreamingRequestConversion -func TestImageGenerationStreamingRequestConversion(t *testing.T) { +// TestToOpenAIImageGenerationRequest tests Bifrost to OpenAI request conversion +func TestToOpenAIImageGenerationRequest(t *testing.T) {core/schemas/provider.go (1)
352-357: Minor grammar issue in doc comments.

The comments say "a image", which should be "an image".
📝 Suggested fix
- // ImageGeneration performs a image generation request + // ImageGeneration performs an image generation request ImageGeneration(ctx *BifrostContext, key Key, request *BifrostImageGenerationRequest) ( *BifrostImageGenerationResponse, *BifrostError) - // ImageGenerationStream performs a image generation stream request + // ImageGenerationStream performs an image generation stream request ImageGenerationStream(ctx *BifrostContext, postHookRunner PostHookRunner, key Key, request *BifrostImageGenerationRequest) (chan *BifrostStream, *BifrostError)core/internal/testutil/response_validation.go (1)
1096-1167: Harden `ProviderSpecific` parsing (avoid brittle `.(int)`/`.(string)` assertions)

`expectations.ProviderSpecific["min_images"].(int)` / `["expected_size"].(string)` will silently skip validation when the value isn't the exact asserted type (common if values originate from JSON/untyped maps). Consider accepting `int`/`float64`/`json.Number` and coercing. Also consider asserting `response.ExtraFields.RequestType` is an image generation type when present, to catch misrouted responses early.
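A hedged sketch of the coercion; the package name is assumed, and `json.Number` only matters if the expectations map comes from a decoder using `UseNumber()`.

```go
package testutil

import "encoding/json"

// asInt coerces int, float64, or json.Number expectation values and reports
// failure instead of silently skipping the validation.
func asInt(v interface{}) (int, bool) {
	switch n := v.(type) {
	case int:
		return n, true
	case float64:
		return int(n), true
	case json.Number:
		i, err := n.Int64()
		return int(i), err == nil
	default:
		return 0, false
	}
}
```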
22-39: Consider memoizing `images` to avoid effect churn

`images` is a new array each render, so the `[images, totalImages]` effect will run frequently. This likely bails out, but `useMemo` (or depending on `imageOutput?.data` and `totalImages`) would make the clamp effect more stable.
31-57: Avoid Params aliasing (`params = &request.ImageGenerationParameters`)

Returning a pointer to the receiver's embedded params can leak mutations across layers. Safer to copy into a local before taking the address (and deep-copy `ExtraParams` if you rely on it later).

Proposed fix
- var params *schemas.ImageGenerationParameters + var params *schemas.ImageGenerationParameters if request.N != nil || request.Background != nil || request.Moderation != nil || ... len(request.ExtraParams) > 0 { - params = &request.ImageGenerationParameters + p := request.ImageGenerationParameters + params = &p }docs/openapi/paths/inference/images.yaml (2)
3-58: Clarify where `stream` is modeled (query vs body) so clients can discover SSE behavior.

The description says `stream=true`, but this path definition doesn't declare a query parameter. If `stream` is a request-body field (likely), consider explicitly stating "`stream` in request body" or adding a `parameters:` entry if it's query-based. This reduces SDK/client confusion.
13-20: Avoid over-asserting “Supported Providers” in the spec unless it’s enforced.
If runtime support is feature-flagged / stack-dependent, consider wording like “Examples” or “Known supported providers” to avoid docs drifting from actual routing/provider enablement.transports/bifrost-http/integrations/router.go (1)
1085-1087: Done-marker suppression for `/images/generations` is directionally right, but path-based checks are brittle.

You're correctly avoiding `data: [DONE]` for typed-event streams, but `strings.Contains(config.Path, "/images/generations")` can drift if paths change. A per-route `DisableDoneMarker` (or a `StreamTerminationMode`) would be easier to maintain than hardcoding substrings.

Also applies to: 1190-1193
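Illustrative shape only—the field and type names are hypothetical:

```go
// RouteConfig sketches a per-route switch instead of substring-matching paths.
type RouteConfig struct {
	Path              string
	DisableDoneMarker bool // typed-event streams (Responses, image generation) set this to true
}
```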
core/providers/huggingface/images.go (2)
95-129: Size parsing is strict for Nebius but best-effort for fal-ai; consider consistent behavior.

Right now, an invalid `size` errors for Nebius but silently no-ops for fal-ai (and stream). If you want predictable UX, either (a) validate the `size` format once in a shared layer, or (b) consistently "best-effort" across providers and document it.

Also applies to: 156-231

233-323: Response conversion is reasonable; consider populating `ID`/`Created` when upstream provides it.

Some branches return only `Model` + `Data`. If upstream responses contain IDs/timestamps (Nebius/fal/together), mapping them would improve consistency with other providers' image responses.
Some branches return onlyModel+Data. If upstream responses contain IDs/timestamps (Nebius/fal/together), mapping them would improve consistency with other providers’ image responses.core/providers/gemini/responses.go (1)
2443-2556: Consider avoiding decode+re-encode for base64 image/audio/file inline data (if Blob.Data invariant is base64).
Right now you decode base64 and then re-encode to base64 again for `Blob.Data`. If the invariant is already "base64 string", passing it through avoids CPU/memory overhead on large media payloads.
71-86: Avoid failing on performance ratios in tests (log-only), unless the repo explicitly treats perf regressions as test failures.
Even the “10x slower” threshold can happen under CI contention, causing noisy failures unrelated to correctness. Prefer logging perf stats or gating perf asserts behind an env flag.Also applies to: 198-213, 377-396
103-196: Semantic cache test may be non-deterministic; consider soft-asserting or widening diagnostics.
A strict “semantic match expected” can fail if embeddings shift or if the cache keying differs; consider logging CacheDebug fields and skipping (or retrying) if no semantic hit occurs, depending on how stable you expect this to be.core/bifrost.go (1)
1008-1094: Add ctx-nil handling consistency + clarify nil-response behavior for ImageGenerationRequest.
`ImageGenerationRequest` relies on `handleRequest()` to normalize `ctx` when nil, which is fine, but the method's own nil-response guard (`response == nil || response.ImageGenerationResponse == nil`) can turn a plugin short-circuit/mis-route into a generic "nil response" error that will not trigger fallbacks (fallbacks already happened inside `handleRequest`). Consider either:
- treating this as an internal invariant violation (and include more details), or
- moving this check into provider dispatch where you can attach provider/model and preserve a clearer cause.
core/schemas/images.go (2)
4-16: Request schema: consider serializing ExtraParams (or document that it’s internal-only).
`ImageGenerationParameters.ExtraParams` is `json:"-"`, so callers can't actually send arbitrary provider params over the API surface unless there's separate merge logic elsewhere. If the intent is to support pass-through params, consider `json:"extra_params,omitempty"` (and validate/sanitize) or document clearly that this is internal/populated by plugins/transports only.

22-38: Parameter typing: prefer `map[string]any` over `map[string]interface{}`.

Not a behavior change, but aligns with modern Go style and improves readability across the codebase.
framework/streaming/images.go (1)
112-205: Potential perf issue: rebuilding full base64 on every delta chunk is O(n²).
`processAccumulatedImageStreamingChunks` calls `buildCompleteImageFromImageStreamChunks` even for non-final chunks, which sorts and concatenates all chunks every time. If images are large or chunk counts are high, this will be a hot path.

Consider returning a true delta response for non-final chunks (or incrementally appending to per-image builders stored in the accumulator, as sketched below).
Also applies to: 322-336
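A rough sketch of the incremental variant (accumulator and field names are illustrative):

```go
package streaming

import "strings"

// imageAccumulator keeps one builder per image index so each delta is an
// amortized O(1) append instead of re-joining every chunk seen so far.
type imageAccumulator struct {
	builders map[int]*strings.Builder
}

func (a *imageAccumulator) addChunk(index int, b64Delta string) {
	if a.builders == nil {
		a.builders = make(map[int]*strings.Builder)
	}
	b, ok := a.builders[index]
	if !ok {
		b = &strings.Builder{}
		a.builders[index] = b
	}
	b.WriteString(b64Delta)
}

// assembled returns the full base64 for an image index; call it once, on the
// final chunk.
func (a *imageAccumulator) assembled(index int) string {
	if b, ok := a.builders[index]; ok {
		return b.String()
	}
	return ""
}
```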
framework/streaming/types.go (1)
360-393: Consider setting ImageGenerationResponse.Created when you synthesize a fallback response.
You already set `ID` and `Model`; setting `Created` from `p.Data.StartTimestamp` would better match other API responses.
352-440: Optional: consider moving the image-generation structs into a dedicated file (e.g., images_types.go).
This file is already a “kitchen sink” and will be harder to maintain as more providers/models are added.core/providers/openai/types.go (1)
546-546: Define constants for `ImageGenerationEventType` to ensure consistency.

The type is declared but no constants are defined. Based on learnings, the SSE event type values should be `"image_generation.partial_image"`, `"image_generation.completed"`, and `"error"`. Without defined constants, consumers may use inconsistent string literals.

♻️ Suggested constants
```diff
 type ImageGenerationEventType string
+
+const (
+	ImageGenerationPartialImage ImageGenerationEventType = "image_generation.partial_image"
+	ImageGenerationCompleted    ImageGenerationEventType = "image_generation.completed"
+	ImageGenerationError        ImageGenerationEventType = "error"
+)
```
2706-2706: Use constant instead of string literal for error type check.

For consistency with the established patterns and to avoid typos, consider using a constant instead of the string literal `"error"`. This aligns with the learnings about enforcing SSE event type values consistently.

♻️ Suggested change
```diff
-	if response.Type == "error" {
+	if response.Type == ImageGenerationError {
```

This requires defining `ImageGenerationError` as a constant in `types.go` (as suggested in the earlier comment).
64-88: Clarify/normalize the dual request shapes (GenerateContent vs Imagen :predict) to avoid accidental mixed payloads.
GeminiGenerationRequestnow carries Imagen-only fields (Instances,Parameters) alongsideContents. Consider enforcing mutual exclusivity at the conversion layer (or adding a comment/guard) so callers don’t accidentally populate both and send an ambiguous request.Also applies to: 79-85
90-93: Consolidate Imagen instance modeling—prompt must be required, not optional.Two conflicting types exist for Imagen instances:
ImagenInstance.Prompt(string, omitempty) — optionalGeminiImagenRequest.Instances[].Prompt(*string) — requiredGoogle Vertex AI Imagen API requires the prompt field. The optional semantics in
ImagenInstancecontradict this contract and force defensive checks in conversion logic (images.go:27). Use a single exported type across bothGeminiGenerationRequest.InstancesandGeminiImagenRequest.Instances, and make Prompt required (*string without omitempty).Also applies to: 1702-1707
core/providers/gemini/images.go (2)
252-346: Validate empty prompt and add legacy key support consistently for Imagen-specific ExtraParams.
`ToImagenImageGenerationRequest` should return `nil` if `bifrostReq.Input.Prompt` is empty (same structured-error rationale). Also, if you're supporting legacy snake_case for safety, consider doing the same for `image_size`/`aspect_ratio` if those ever existed.

440-468: Consider surfacing filtered/blocked images (`raiFilteredReason`) as errors or metadata.

Right now `RaiFilteredReason` is ignored; consumers may get empty/invalid image data without explanation. At least propagate it into `ExtraFields` or the error path when present.
968-1022: Streaming URL construction: verify it matches router expectations and custom path overrides.

You bypass `getInferenceProviderRouteURL` and assemble `"/fal-ai/%s/stream"` directly (then run through `GetRequestPath`). Please confirm the final URL matches the non-stream route conventions and any custom provider overrides.
980-994: `DefaultImageGenerationRetryConfig` returns `TestRetryConfig` (generic) — ensure typed configs follow the repo convention.

Given the convention in `core/internal/testutil/` (generic `GetTestRetryConfigForScenario()`, then build a typed config copying only selected fields), please ensure new image-generation tests construct `ImageGenerationRetryConfig` the same way (with an empty `Conditions` slice unless explicitly needed).

Based on learnings, the typed-config construction convention should be applied consistently.
Also applies to: 1250-1251
force-pushed from 40211ea to b136412
Actionable comments posted: 6
Note
Due to the large number of review comments, Critical severity comments were prioritized as inline comments.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (25)
docs/providers/supported-providers/groq.mdx (1)
16-35: Add Image Generation to the Unsupported Operations note for consistency.Image Generation is marked as ❌ in the Supported Operations table (line 25) but is omitted from the "Unsupported Operations" explanation (line 34), creating a documentation inconsistency. The implementation correctly returns
UnsupportedOperationErrorfor bothImageGenerationandImageGenerationStreammethods.Proposed doc fix
-**Unsupported Operations** (❌): Embeddings, Speech, Transcriptions, Files, and Batch are not supported by the upstream Groq API. These return `UnsupportedOperationError`. +**Unsupported Operations** (❌): Image Generation, Embeddings, Speech, Transcriptions, Files, and Batch are not supported by the upstream Groq API. These return `UnsupportedOperationError`.plugins/semanticcache/utils.go (3)
88-116: Prevent nil deref for image-generation requests in hash generation.
hashInput.Params = req.ImageGenerationRequest.Params(Line 115) will panic ifRequestTypeis image-generation butreq.ImageGenerationRequestis nil (similar guard exists for Speech).Proposed fix
case schemas.TranscriptionRequest, schemas.TranscriptionStreamRequest: hashInput.Params = req.TranscriptionRequest.Params case schemas.ImageGenerationRequest, schemas.ImageGenerationStreamRequest: - hashInput.Params = req.ImageGenerationRequest.Params + if req.ImageGenerationRequest != nil { + hashInput.Params = req.ImageGenerationRequest.Params + } }
379-413: Guard againstres == niland avoid unbounded metadata growth (esp. base64 duplication).
res.ImageGenerationResponseaccess (Line 392) will panic ifresis nil.- Storing
metadata["response"]plusimage_b64(Line 410) can explode metadata size and potentially exceed vector store limits / hurt performance (base64 images can be very large). Consider storing only URLs, counts, hashes, or gating base64 fields behind a config/size cap.Proposed fix (nil guard + lighter metadata defaults)
func (plugin *Plugin) addSingleResponse(ctx context.Context, responseID string, res *schemas.BifrostResponse, embedding []float32, metadata map[string]interface{}, ttl time.Duration) error { + if res == nil { + return fmt.Errorf("failed to store response: nil response") + } // Marshal response as string responseData, err := sonic.Marshal(res) if err != nil { return fmt.Errorf("failed to marshal response: %w", err) } // Add response field to metadata metadata["response"] = string(responseData) metadata["stream_chunks"] = []string{} // image specific metadata if res.ImageGenerationResponse != nil { var imageURLs []string - var imageB64 []string var revisedPrompts []string for _, img := range res.ImageGenerationResponse.Data { if img.URL != "" { imageURLs = append(imageURLs, img.URL) } - if img.B64JSON != "" { - imageB64 = append(imageB64, img.B64JSON) - } if img.RevisedPrompt != "" { revisedPrompts = append(revisedPrompts, img.RevisedPrompt) } } metadata["image_urls"] = imageURLs - metadata["image_b64"] = imageB64 metadata["revised_prompts"] = revisedPrompts }
480-516: Prevent nil deref ingetInputForCachingfor image generation.
return req.ImageGenerationRequest.Input(Line 513) will panic ifreq.ImageGenerationRequestis nil.Proposed fix
case schemas.TranscriptionRequest, schemas.TranscriptionStreamRequest: return req.TranscriptionRequest.Input case schemas.ImageGenerationRequest, schemas.ImageGenerationStreamRequest: - return req.ImageGenerationRequest.Input + if req.ImageGenerationRequest == nil { + return nil + } + return req.ImageGenerationRequest.Input default: return nil }docs/providers/supported-providers/overview.mdx (1)
53-61: Document "Images (stream)" semantics in NotesThe matrix includes an "Images (stream)" column, but the Notes section defines "Images" without explaining the streaming variant. Add a note clarifying that streaming is invoked by passing
stream=truein the request to/v1/images/generations, which triggers Server-Sent Events (SSE) delivery with base64-encoded image chunks.Suggested addition: "Images (stream)" refers to streaming image generation using the same
/v1/images/generationsendpoint withstream=true, which delivers image chunks as base64 data via Server-Sent Events (SSE).ui/lib/types/schemas.ts (1)
327-345: Back-compat risk: newallowed_requestskeys may break parsing of existing configs.If older configs omit these keys,
z.boolean()will reject them. Consider defaulting them tofalse(so missing keys still parse).Proposed fix
@@ export const allowedRequestsSchema = z.object({ @@ - image_generation: z.boolean(), - image_generation_stream: z.boolean(), + image_generation: z.boolean().default(false), + image_generation_stream: z.boolean().default(false), count_tokens: z.boolean(), list_models: z.boolean(), });core/providers/huggingface/models.go (1)
46-90: Add image generation tags to enable discovery via tags alone.Tag-based model discovery currently has no image generation matching. Models with image generation tags will be incorrectly filtered out when
pipeline_tagis missing or incorrect.Fix: Add image generation tag cases to switch statement
for _, tag := range tags { tagLower := strings.ToLower(tag) switch { case tagLower == "automatic-speech-recognition" || tagLower == "speech-to-text" || strings.Contains(tagLower, "speech-recognition"): addMethods(schemas.TranscriptionRequest) + case tagLower == "text-to-image" || + strings.Contains(tagLower, "text-to-image") || + strings.Contains(tagLower, "image-generation"): + addMethods(schemas.ImageGenerationRequest) } }Regarding
ImageGenerationStreamRequest: HuggingFace supports streaming via the fal-ai inference provider only. The current logic should addImageGenerationStreamRequestalongsideImageGenerationRequestfor consistency with other capability patterns (chat, speech), since the runtime will handle provider-specific restrictions.docs/providers/supported-providers/mistral.mdx (1)
18-33: Correct the upstream capability claims—Mistral API actually supports Files, Batch, and TTS.Verification shows the doc incorrectly claims these features are "not supported by the upstream Mistral API":
- Files API — Mistral DOES support
/v1/files(batch uploads up to 512 MB per file)- Batch API — Mistral DOES support asynchronous batch jobs via
/v1/filesupload + job creation- TTS/Speech — Mistral DOES support text-to-speech (Voxtral models, available since July 2025)
- Text Completions — Mistral deprecated the old generic endpoint in favor of
/v1/chat/completionsand/v1/fim/completions(so the claim is partly correct but misleading)- Image Generation — Correctly noted as supported upstream but not yet in Bifrost (verified: launched Nov 18, 2024)
Update the "Unsupported Operations" note and the operations table to accurately reflect which features Mistral's public API actually supports vs. which are missing from Bifrost's integration.
docs/providers/supported-providers/openrouter.mdx (1)
134-144: Minor consistency: "Embedding" vs "Embeddings" naming.

The table uses "Embeddings" (Line 24) while the "Unsupported Features" section uses "Embedding" (Line 138). Consider aligning the label to reduce confusion.
core/providers/gemini/transcription.go (2)
34-45: Don’t silently drop invalid inline audio without any signal.Currently (Line 36-40) invalid base64 just gets skipped, which can lead to “empty audio” requests later with no actionable error. At minimum, consider tracking a flag/count in
Params.ExtraParamsso the caller/debug logs can explain why audio is missing.
106-166: Fix potential nil-pointer panic onbifrostReq.Input.File.
ToGeminiTranscriptionRequestassumesbifrostReq.Inputis non-nil (Line 158), but the function accepts arbitrary input and will panic ifInputis nil.Proposed fix
- // Add audio file if present - if len(bifrostReq.Input.File) > 0 { + // Add audio file if present + if bifrostReq.Input != nil && len(bifrostReq.Input.File) > 0 { parts = append(parts, &Part{ InlineData: &Blob{ MIMEType: utils.DetectAudioMimeType(bifrostReq.Input.File), Data: encodeBytesToBase64String(bifrostReq.Input.File), }, }) }core/schemas/responses.go (1)
540-575: Blocker: error message claims image-generation output support, but code doesn’t implement it (and may mis-unmarshal objects).Right now you only updated the failure message (Line 556-557, 574-575).
ResponsesToolMessageOutputStructstill can’t marshal/unmarshal an image-generation output variant, andUnmarshalJSONwill also happily treat any JSON object asResponsesComputerToolCallOutputData(because missing fields don’t error), which will likely break once image-gen outputs are objects.Concrete direction (add an explicit image-gen variant + discriminate objects)
type ResponsesToolMessageOutputStruct struct { ResponsesToolCallOutputStr *string // Common output string for tool calls and outputs (used by function, custom and local shell tool calls) ResponsesFunctionToolCallOutputBlocks []ResponsesMessageContentBlock ResponsesComputerToolCallOutput *ResponsesComputerToolCallOutputData + ResponsesImageGenerationCallOutput *ResponsesImageGenerationCall } func (output ResponsesToolMessageOutputStruct) MarshalJSON() ([]byte, error) { if output.ResponsesToolCallOutputStr != nil { return Marshal(*output.ResponsesToolCallOutputStr) } if output.ResponsesFunctionToolCallOutputBlocks != nil { return Marshal(output.ResponsesFunctionToolCallOutputBlocks) } if output.ResponsesComputerToolCallOutput != nil { return Marshal(output.ResponsesComputerToolCallOutput) } + if output.ResponsesImageGenerationCallOutput != nil { + return Marshal(output.ResponsesImageGenerationCallOutput) + } return nil, fmt.Errorf("responses tool message output struct is neither a string nor an array of responses message content blocks nor a computer tool call output data nor an image generation call output") } func (output *ResponsesToolMessageOutputStruct) UnmarshalJSON(data []byte) error { var str string if err := Unmarshal(data, &str); err == nil { output.ResponsesToolCallOutputStr = &str return nil } var array []ResponsesMessageContentBlock if err := Unmarshal(data, &array); err == nil { output.ResponsesFunctionToolCallOutputBlocks = array return nil } - var computerToolCallOutput ResponsesComputerToolCallOutputData - if err := Unmarshal(data, &computerToolCallOutput); err == nil { - output.ResponsesComputerToolCallOutput = &computerToolCallOutput - return nil - } + // Discriminate object shapes to avoid false-positive unmarshalling into structs with optional fields. + var obj map[string]any + if err := Unmarshal(data, &obj); err == nil { + if t, ok := obj["type"].(string); ok && t == "computer_screenshot" { + var computerToolCallOutput ResponsesComputerToolCallOutputData + if err := Unmarshal(data, &computerToolCallOutput); err == nil { + output.ResponsesComputerToolCallOutput = &computerToolCallOutput + return nil + } + } + if _, ok := obj["result"]; ok { + var imageGenOutput ResponsesImageGenerationCall + if err := Unmarshal(data, &imageGenOutput); err == nil { + output.ResponsesImageGenerationCallOutput = &imageGenOutput + return nil + } + } + } return fmt.Errorf("responses tool message output struct is neither a string nor an array of responses message content blocks nor a computer tool call output data nor an image generation call output") }core/providers/xai/xai_test.go (1)
25-54: RemoveImageGenerationStream: truefrom test config—XAI provider explicitly rejects this operation.The XAI provider's
ImageGenerationStreammethod (xai.go:247) returnsproviderUtils.NewUnsupportedOperationError()with the comment "ImageGenerationStream is not supported by the xAI provider." Enabling this scenario in the test will cause failures since the provider cannot support it.Change line 46 in xai_test.go:
ImageGenerationStream: false,plugins/semanticcache/test_utils.go (1)
356-365: Skipping tests when Weaviate isn’t reachable is pragmatic, but consider an explicit opt-in/out.
Right now any connection/config failure becomes a skip, which can mask integration regressions in CI. If CI should enforce Weaviate-backed tests, consider gating this behavior behind an env var (e.g.,REQUIRE_WEAVIATE=1).core/providers/vertex/vertex_test.go (1)
25-55: Vertex test config conflicts with global capability matrix for streaming image generation.
core/internal/testutil/account.gomarks VertexImageGenerationStream: false, but this test enables it. If Vertex doesn’t implement streaming image-gen, the suite will fail or hang.Proposed fix (if Vertex image-gen streaming is not supported)
@@ Scenarios: testutil.TestScenarios{ @@ ImageGeneration: true, - ImageGenerationStream: true, + ImageGenerationStream: false, MultipleImages: true,plugins/logging/operations.go (1)
544-555: ExtractimageUsagefromImageGenerationOutputParsedfor cost recalculation of image-generation requests.The parameter list is correct (9 parameters match all call sites), but
calculateCostForLogunconditionally passesnilfor theimageUsageparameter even when handling image-generation requests. For image-generation requests withImageGenerationRequestorImageGenerationStreamRequesttypes, extract theUsagefield fromlogEntry.ImageGenerationOutputParsedand pass it as the 9th argument toCalculateCostFromUsage—similar to how the mainCalculateCostfunction handles it (lines 108–110 in pricing.go). Also ensureImageGenerationOutputParsedis deserialized alongsideTokenUsageParsedandCacheDebugParsedwhen needed.framework/modelcatalog/main.go (1)
223-230: MissingImageGenerationRequestin request type lookup.
GetPricingEntryForModeliterates over request types but does not includeschemas.ImageGenerationRequest. This means callers using this method won't find pricing entries for image generation models.Proposed fix
for _, mode := range []schemas.RequestType{ schemas.TextCompletionRequest, schemas.ChatCompletionRequest, schemas.ResponsesRequest, schemas.EmbeddingRequest, schemas.SpeechRequest, schemas.TranscriptionRequest, + schemas.ImageGenerationRequest, } {framework/streaming/types.go (2)
345-392: Fix missing RawResponse/CacheDebug propagation for transcription (likely regression).
Incase StreamTypeTranscription, onlyRawRequestis set; other stream types includeRawResponseandCacheDebug. This can silently drop debug data for transcription streams.Proposed fix
case StreamTypeTranscription: @@ if p.RawRequest != nil { resp.TranscriptionResponse.ExtraFields.RawRequest = p.RawRequest } + if p.Data.RawResponse != nil { + resp.TranscriptionResponse.ExtraFields.RawResponse = *p.Data.RawResponse + } + if p.Data.CacheDebug != nil { + resp.TranscriptionResponse.ExtraFields.CacheDebug = p.Data.CacheDebug + } case StreamTypeImage:
13-20: StreamTypeImage is properly handled; however, StreamTypeTranscription is missing RawResponse and CacheDebug field population.StreamTypeImage is correctly added to the switch statement in ToBifrostResponse (lines 360–391) and properly copies RawRequest, RawResponse, and CacheDebug fields just like other types (Text, Chat, Audio, Responses). It is also correctly used in framework/streaming/images.go.
The issue is that StreamTypeTranscription (lines 345–358) does not populate RawResponse and CacheDebug, unlike all other stream types. This creates an inconsistency where transcription responses will have missing debug information that other types include.
transports/bifrost-http/integrations/genai.go (2)
36-64: Fix request routing precedence: ImageGeneration must win over Embedding when both flags are set.
Today the converter checksIsEmbeddingbeforeIsImageGeneration, butextractAndSetModelFromURLcan set both (e.g.,:predictdefaulting to embedding + body indicates IMAGE). That will misroute image generation requests.Proposed fix (reorder + enforce mutual exclusivity)
RequestConverter: func(ctx *schemas.BifrostContext, req interface{}) (*schemas.BifrostRequest, error) { if geminiReq, ok := req.(*gemini.GeminiGenerationRequest); ok { if geminiReq.IsCountTokens { return &schemas.BifrostRequest{ CountTokensRequest: geminiReq.ToBifrostResponsesRequest(), }, nil - } else if geminiReq.IsEmbedding { + } else if geminiReq.IsImageGeneration { + return &schemas.BifrostRequest{ + ImageGenerationRequest: geminiReq.ToBifrostImageGenerationRequest(), + }, nil + } else if geminiReq.IsEmbedding { return &schemas.BifrostRequest{ EmbeddingRequest: geminiReq.ToBifrostEmbeddingRequest(), }, nil } else if geminiReq.IsSpeech { return &schemas.BifrostRequest{ SpeechRequest: geminiReq.ToBifrostSpeechRequest(), }, nil } else if geminiReq.IsTranscription { return &schemas.BifrostRequest{ TranscriptionRequest: geminiReq.ToBifrostTranscriptionRequest(), }, nil - } else if geminiReq.IsImageGeneration { - return &schemas.BifrostRequest{ - ImageGenerationRequest: geminiReq.ToBifrostImageGenerationRequest(), - }, nil } else { return &schemas.BifrostRequest{ ResponsesRequest: geminiReq.ToBifrostResponsesRequest(), }, nil } } return nil, errors.New("invalid request type") },And in
extractAndSetModelFromURL:- r.IsEmbedding = isEmbedding + r.IsEmbedding = isEmbedding && !(isImagenPredict || isImageGenerationRequest(r)) @@ - r.IsImageGeneration = isImagenPredict || isImageGenerationRequest(r) + r.IsImageGeneration = isImagenPredict || isImageGenerationRequest(r)Also applies to: 426-429
370-411::predictdefaulting to embedding is risky and not mitigated by body checks; requests will route to embedding before image generation can be detected.For non-Imagen models with
:predict,isEmbeddingis set to true (line 410). WhileisImageGenerationRequest()checks the request body forresponseModalities: ["IMAGE"](line 509), this provides no safety because the downstream if-else chain (lines 40–62) checksIsEmbeddingbeforeIsImageGeneration. Any future non-Imagen image models using:predictwill be incorrectly routed to embedding regardless of request body content.Consider:
- Explicitly checking request body for image modalities before defaulting to embedding, or
- Requiring
isImageGenerationRequest()to return true for non-Imagen:predictendpoints instead of treating them as embedding by default.transports/bifrost-http/handlers/inference.go (1)
1258-1334: Fix SSE termination for Responses API: emit[DONE]based on API type, not chunk observation.The current implementation derives whether to emit
[DONE]from observing chunks at runtime:includeEventTypeis set only if chunks are observed from Responses/ImageGen APIs. If a Responses stream closes without yielding any chunks,includeEventTyperemains false and the handler incorrectly emits[DONE]. However, the code explicitly documents that "OpenAI responses API doesn't use [DONE] marker, it ends when the stream closes," making this behavior incorrect.Replace the chunk-derived logic with an explicit
emitDoneMarker boolparameter passed from each endpoint handler. This moves the decision from unreliable runtime state to static configuration: Responses API (handleStreamingResponsesandhandleStreamingImageGeneration) passesfalse, all others passtrue.Proposed diff
-func (h *CompletionHandler) handleStreamingTextCompletion(ctx *fasthttp.RequestCtx, req *schemas.BifrostTextCompletionRequest, bifrostCtx *schemas.BifrostContext, cancel context.CancelFunc) { +func (h *CompletionHandler) handleStreamingTextCompletion(ctx *fasthttp.RequestCtx, req *schemas.BifrostTextCompletionRequest, bifrostCtx *schemas.BifrostContext, cancel context.CancelFunc) { getStream := func() (chan *schemas.BifrostStream, *schemas.BifrostError) { return h.client.TextCompletionStreamRequest(bifrostCtx, req) } - h.handleStreamingResponse(ctx, getStream, cancel) + h.handleStreamingResponse(ctx, getStream, cancel, true) } func (h *CompletionHandler) handleStreamingChatCompletion(ctx *fasthttp.RequestCtx, req *schemas.BifrostChatRequest, bifrostCtx *schemas.BifrostContext, cancel context.CancelFunc) { getStream := func() (chan *schemas.BifrostStream, *schemas.BifrostError) { return h.client.ChatCompletionStreamRequest(bifrostCtx, req) } - h.handleStreamingResponse(ctx, getStream, cancel) + h.handleStreamingResponse(ctx, getStream, cancel, true) } func (h *CompletionHandler) handleStreamingResponses(ctx *fasthttp.RequestCtx, req *schemas.BifrostResponsesRequest, bifrostCtx *schemas.BifrostContext, cancel context.CancelFunc) { getStream := func() (chan *schemas.BifrostStream, *schemas.BifrostError) { return h.client.ResponsesStreamRequest(bifrostCtx, req) } - h.handleStreamingResponse(ctx, getStream, cancel) + h.handleStreamingResponse(ctx, getStream, cancel, false) } func (h *CompletionHandler) handleStreamingSpeech(ctx *fasthttp.RequestCtx, req *schemas.BifrostSpeechRequest, bifrostCtx *schemas.BifrostContext, cancel context.CancelFunc) { getStream := func() (chan *schemas.BifrostStream, *schemas.BifrostError) { return h.client.SpeechStreamRequest(bifrostCtx, req) } - h.handleStreamingResponse(ctx, getStream, cancel) + h.handleStreamingResponse(ctx, getStream, cancel, true) } func (h *CompletionHandler) handleStreamingTranscriptionRequest(ctx *fasthttp.RequestCtx, req *schemas.BifrostTranscriptionRequest, bifrostCtx *schemas.BifrostContext, cancel context.CancelFunc) { getStream := func() (chan *schemas.BifrostStream, *schemas.BifrostError) { return h.client.TranscriptionStreamRequest(bifrostCtx, req) } - h.handleStreamingResponse(ctx, getStream, cancel) + h.handleStreamingResponse(ctx, getStream, cancel, true) } -func (h *CompletionHandler) handleStreamingResponse(ctx *fasthttp.RequestCtx, getStream func() (chan *schemas.BifrostStream, *schemas.BifrostError), cancel context.CancelFunc) { +func (h *CompletionHandler) handleStreamingResponse(ctx *fasthttp.RequestCtx, getStream func() (chan *schemas.BifrostStream, *schemas.BifrostError), cancel context.CancelFunc, emitDoneMarker bool) { // ... - var skipDoneMarker bool - // Process streaming responses for chunk := range stream { // ... - // Image generation streams don't use [DONE] marker - if chunk.BifrostImageGenerationStreamResponse != nil { - skipDoneMarker = true - } // ... } - if !includeEventType && !skipDoneMarker { + if emitDoneMarker && !includeEventType { if _, err := fmt.Fprint(w, "data: [DONE]\n\n"); err != nil { // ... } } cancel() }) }core/providers/gemini/types.go (2)
1114-1123: Removeomitemptytag fromBlob.Datafield to align with "Required" semantics.
Blob.Datais marked as "Required" in the comment but tagged withjson:"data,omitempty", creating a semantic conflict. If the field is required by the Gemini API, the tag should bejson:"data"withoutomitempty.Recommended fix
type Blob struct { // Optional. Display name of the blob. Used to provide a label or filename to distinguish // blobs. This field is not currently used in the Gemini GenerateContent calls. DisplayName string `json:"displayName,omitempty"` - // Required. Base64-encoded bytes. - Data string `json:"data,omitempty"` + // Required. Base64-encoded bytes. + Data string `json:"data"` // Required. The IANA standard MIME type of the source data. MIMEType string `json:"mimeType,omitempty"` }Current implementation correctly encodes/decodes base64 data through helper functions; no blast radius concerns or double-encoding issues detected.
64-88: Normalize Imagen parameter casing and unify instance representation to prevent silent mapping failures.The
GeminiImagenParameters.ImageSizetype comment documents"1K", "2K", "4K"(uppercase), but the converter functionconvertImagenFormatToSize()(lines 96–100) only handles lowercase"1k","2k". Google/Vertex Imagen API docs show the canonical values as uppercase but accept both cases; however, the internal converter is case-sensitive and will silently default to 1024 if uppercase values are encountered.Additionally, there are two incompatible Imagen instance representations:
ImagenInstance(line 91–93) hasPrompt string, whileGeminiImagenRequest.Instances(line 1703–1705) uses an inline struct withPrompt *string. Unify these to a single canonical type to avoid pointer/nullability mismatches.core/providers/huggingface/huggingface.go (1)
114-218: Build URL with validated model ID for hfInference to avoid preventable 404s.For hfInference (chat, embedding, transcription),
getInferenceProviderRouteURLencodes the modelName into the URL path:/hf-inference/models/{modelName}/pipeline/{pipeline}. The code builds the URL at line 128 usingoriginalModelName, but only validates it at line 136—causing a 404 on the first attempt whenoriginalModelNamediffers from theProviderModelIDreturned by validation.For skipValidation providers (falAI, nebius, together image generation), this is not an issue: falAI encodes the model in the path but uses
originalModelNameconsistently (the comment confirms format is already correct), while nebius and together use fixed URL paths without encoding the model name.Reorder validation to occur before URL construction, or defer URL building until after
getValidatedProviderModelIDreturns for non-skipValidation cases.
🤖 Fix all issues with AI agents
In @core/internal/testutil/test_retry_conditions.go:
- Around line 907-956: EmptyImageGenerationCondition currently implements
TestRetryCondition by accepting *schemas.BifrostResponse; change its ShouldRetry
signature (EmptyImageGenerationCondition.ShouldRetry) to accept
*schemas.BifrostImageGenerationResponse, keep the same logic but adapt
references to the image-generation response type; update
DefaultImageGenerationRetryConfig to return ImageGenerationRetryConfig (not
TestRetryConfig) and ensure the new EmptyImageGenerationCondition is appended to
that config's conditions slice so it implements ImageGenerationRetryCondition
and is registered for image-generation retry handling.
In @core/providers/gemini/gemini.go:
- Around line 1614-1671: ImageGeneration currently dereferences request (calling
schemas.IsImagenModel(request.Model)) without validating request and does not
enforce presence of a prompt; add an early nil-check for the request in
ImageGeneration and return a proper *schemas.BifrostError when request == nil,
then validate the prompt/content field on request (reject empty/missing prompt
per repo pattern) before calling schemas.IsImagenModel or building the JSON body
(affects ImageGeneration, ToGeminiImageGenerationRequest usage and
provider.completeRequest); ensure the error includes ExtraFields.Provider,
ModelRequested and RequestType like other failures.
In @docs/openapi/schemas/inference/images.yaml:
- Around line 100-124: The OpenAPI schema lists
ImageGenerationResponse.parameters nested under "params", but the Go type
BifrostImageGenerationResponse embeds *ImageGenerationResponseParameters without
a JSON tag so those fields are flattened at the top level; remove the "params"
property from ImageGenerationResponse and instead promote all properties defined
in ImageGenerationResponseParameters to be top-level properties of
ImageGenerationResponse (or use an allOf referencing
ImageGenerationResponseParameters to achieve the same flattening), ensuring the
names and types exactly match ImageGenerationResponseParameters and keeping
existing refs like ImageUsage and BifrostResponseExtraFields intact.
In @docs/providers/supported-providers/openrouter.mdx:
- Around line 24-33: Update the documentation text that currently states these
unsupported operations return `UnsupportedOperationError` to instead state they
return a `BifrostError` with an error code of `"unsupported_operation"`;
specifically change the note on unsupported operations in the OpenRouter
provider doc (the table/Note block describing Embeddings, Image Generation,
Speech, Transcriptions, Files, Batch) to mention `BifrostError` and include the
`"unsupported_operation"` code so the documented error structure matches the
provider's actual response.
In @framework/streaming/images.go:
- Around line 207-214: Guard against nil ctx (or nil *ctx) at the start of
Accumulator.processImageStreamingResponse to avoid dereferencing and panics:
check that ctx != nil and *ctx != nil before using (*ctx).Value(...), and return
a clear error if missing; keep the existing behavior of not failing the request
flow beyond returning an error. Specifically update
processImageStreamingResponse to validate ctx and *ctx before extracting
requestID using schemas.BifrostContextKeyRequestID.
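A sketch of that guard, with the accumulator method signature assumed from the description above:

```go
// Sketch: validate ctx and *ctx before extracting the request ID.
func (a *Accumulator) processImageStreamingResponse(ctx *context.Context, chunk *schemas.BifrostImageGenerationStreamResponse) error {
	if ctx == nil || *ctx == nil {
		// Do not fail the request flow beyond returning an error.
		return fmt.Errorf("image streaming accumulator: nil context, cannot resolve request ID")
	}
	requestID, ok := (*ctx).Value(schemas.BifrostContextKeyRequestID).(string)
	if !ok || requestID == "" {
		return fmt.Errorf("image streaming accumulator: request ID missing from context")
	}
	// ...existing accumulation keyed by requestID continues here...
	return nil
}
```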
🟠 Major comments (20)
docs/providers/supported-providers/xai.mdx-124-141 (1)
124-141: Avoid brittle cross-doc anchor & verify schema superset claim.
- The link to `openai#7-image-generation` (line 130) depends on section numbering; if the OpenAI docs get reorganized, this link will break. Consider whether your doc system supports stable anchor IDs (e.g., `#image-generation-section`).
- "Bifrost's response schema is a superset of OpenAI's format" (line 138) is a strong architectural claim. Verify in this PR that the implementation preserves all fields from xAI's response, especially any provider-specific extensions beyond OpenAI's standard fields.
core/providers/openai/images.go-13-28 (1)
13-28: ToOpenAIImageGenerationRequest: missing fallbacks mapping and asymmetric model handling.

The function ignores the `Fallbacks` field from `bifrostReq`, which the reverse function `ToBifrostImageGenerationRequest` properly handles. Additionally, while the reverse function defensively strips provider prefixes from models using `ParseModelString`, the forward direction passes the model through unchanged—this should be normalized for consistency.

Proposed fix (model normalization + fallbacks mapping)
```diff
 func ToOpenAIImageGenerationRequest(bifrostReq *schemas.BifrostImageGenerationRequest) *OpenAIImageGenerationRequest {
 	if bifrostReq == nil || bifrostReq.Input == nil || bifrostReq.Input.Prompt == "" {
 		return nil
 	}
+	_, model := schemas.ParseModelString(bifrostReq.Model, schemas.OpenAI)
+
 	req := &OpenAIImageGenerationRequest{
-		Model:  bifrostReq.Model,
+		Model:  model,
 		Prompt: bifrostReq.Input.Prompt,
 	}
 	if bifrostReq.Params != nil {
 		req.ImageGenerationParameters = *bifrostReq.Params
 	}
+
+	if len(bifrostReq.Fallbacks) > 0 {
+		req.Fallbacks = make([]string, 0, len(bifrostReq.Fallbacks))
+		for _, fb := range bifrostReq.Fallbacks {
+			if fb.Provider == "" || fb.Model == "" {
+				continue
+			}
+			req.Fallbacks = append(req.Fallbacks, string(fb.Provider)+"/"+fb.Model)
+		}
+	}
 	return req
 }
```
plugins/logging/operations.go-41-42 (1)
41-42: Be careful logging image-generation outputs: base64 payloads can explode log size.
If an image response contains `b64_json`, persisting it into `image_generation_output` can massively bloat the DB and degrade log search/UI. Consider truncating/omitting base64 fields (or storing only URLs + metadata) when serializing these fields.

Also applies to: 127-134, 287-295
tests/integrations/python/tests/utils/common.py-2624-2724 (1)
2624-2724: Tighten Google “predictions” validation to check actual base64 bytes, not just attribute presence.
Right now an object with an empty `bytes_base64_encoded`-like field can still pass.

Proposed fix
```diff
@@
     elif (isinstance(response, dict) and "predictions" in response) or hasattr(response, "predictions"):
         # Imagen response
         predictions = response.get("predictions") if isinstance(response, dict) else response.predictions
         assert len(predictions) > 0, "Response should have at least one prediction"
         for i, prediction in enumerate(predictions):
-            has_b64 = (prediction.get("bytesBase64Encoded") if isinstance(prediction, dict)
-                       else (hasattr(prediction, "bytesBase64Encoded") or hasattr(prediction, "bytes_base64_encoded")))
-            assert has_b64, f"Prediction {i} should have base64 encoded bytes"
+            if isinstance(prediction, dict):
+                b64 = prediction.get("bytesBase64Encoded") or prediction.get("bytes_base64_encoded")
+            else:
+                b64 = getattr(prediction, "bytesBase64Encoded", None) or getattr(prediction, "bytes_base64_encoded", None)
+
+            assert b64, f"Prediction {i} should have base64 encoded bytes"
+            # Base64 should not be trivially small
+            assert len(b64) > 100, f"Prediction {i} base64 seems too short"
```
tests/integrations/python/tests/test_google.py-80-83 (1)
80-83: `test_41b_imagen_predict` will fail (not skip) when the API key is missing.

Unlike 41a/41c, `test_41b_imagen_predict` is missing `@skip_if_no_api_key("google")` (or equivalent), and `get_api_key(provider)` will raise before you reach the current `try/except` that calls `pytest.skip(...)`.

Proposed fix
```diff
@@
     @pytest.mark.parametrize("provider,model", get_cross_provider_params_for_scenario("imagen"))
+    @skip_if_no_api_key("google")
     def test_41b_imagen_predict(self, _test_config, provider, model):
```
Also applies to: 96-96, 1691-1792
transports/bifrost-http/integrations/router.go-203-210 (1)
203-210: Add nil-guards for new image generation converters to avoid runtime panics.
`handleNonStreamingRequest` calls `config.ImageGenerationResponseConverter(...)` (Line 718) and the streaming path calls `config.StreamConfig.ImageGenerationStreamResponseConverter(...)` (Line 1286) without checking they're configured. If an integration route enables image generation request conversion but forgets these, this will panic.

Proposed fix
```diff
@@
 	case bifrostReq.ImageGenerationRequest != nil:
+		if config.ImageGenerationResponseConverter == nil {
+			g.sendError(ctx, bifrostCtx, config.ErrorConverter, newBifrostError(nil, "ImageGenerationResponseConverter not configured"))
+			return
+		}
 		imageGenerationResponse, bifrostErr := g.client.ImageGenerationRequest(bifrostCtx, bifrostReq.ImageGenerationRequest)
@@
-		response, err = config.ImageGenerationResponseConverter(bifrostCtx, imageGenerationResponse)
+		response, err = config.ImageGenerationResponseConverter(bifrostCtx, imageGenerationResponse)
@@
 	case chunk.BifrostImageGenerationStreamResponse != nil:
+		if config.StreamConfig.ImageGenerationStreamResponseConverter == nil {
+			log.Printf("ImageGenerationStreamResponseConverter not configured for route: %s", config.Path)
+			continue
+		}
 		eventType, convertedResponse, err = config.StreamConfig.ImageGenerationStreamResponseConverter(bifrostCtx, chunk.BifrostImageGenerationStreamResponse)
```
Also applies to: 256-264, 277-307, 696-718, 1284-1287
docs/openapi/paths/integrations/openai/images.yaml-1-86 (1)
1-86: Mark Azure `api-version` as `required: true` and add concrete SSE event examples for clarity.

Azure OpenAI image generation endpoints require an explicit `api-version` query parameter per Azure REST API specifications. Currently, `api-version` is documented but not required (lines 62–67). Update to `required: true` to enforce compliance with Azure's actual API contract and reduce client integration friction.

Additionally, while the schema in `docs/openapi/schemas/integrations/openai/images.yaml` defines `OpenAIImageStreamResponse` event types (`image_generation.partial_image`, `image_generation.completed`) and payloads, adding a brief inline SSE example in the endpoint description (showing the actual `event:` and `data:` format) will help client developers understand the concrete wire protocol without cross-referencing the schema definition.

Note: This api-version pattern affects all Azure OpenAI endpoints in the codebase (audio, embeddings, text, models, responses, chat). Consider applying the same `required: true` fix consistently across all Azure integration endpoints.

framework/streaming/images.go-14-110 (1)
14-110: Multi-chunk image streaming drops URL-only outputs and uses inconsistent timestamps.

The multi-chunk reconstruction path has two issues:

1. URL-only outputs are lost: The code only accumulates `B64JSON` chunks into the `images` map (line 47). When a provider streams only `URL` without `B64JSON`, `len(images) == 0` triggers `return nil` (lines 70-72), discarding valid image data. The single-chunk fast path correctly handles `URL`, but the multi-chunk path ignores it entirely (line 89).
2. Timestamp inconsistency: The single-chunk path uses
chunks[0].Delta.CreatedAt(line 20), but the multi-chunk path usestime.Now().Unix()(line 104). Should prefer the stream'sCreatedAtwhen available for consistency.framework/streaming/images.go-248-259 (1)
248-259: `ExtraFields.ChunkIndex > 0` ignores a valid 0 index, breaking the priority rule.

When the first chunk arrives (`ChunkIndex = 0`), the condition is false, so the code falls through to use `PartialImageIndex` instead—violating the stated priority of `ExtraFields.ChunkIndex`. Since `ExtraFields.ChunkIndex` is always set by the provider (line 2814 in openai.go), the condition should not exclude 0. This can cause chunk mis-ordering downstream. Change the condition to
>= 0or remove the guard entirely and always prioritizeExtraFields.ChunkIndexover the nullablePartialImageIndex.tests/integrations/python/tests/test_openai.py-1170-1257 (1)
1170-1257: Gate unsupported image parameters to prevent cross-provider test failures.

The `quality="low"` (line 1228) and `size="1024x1536"` (line 1252) parameters are specific to OpenAI's `gpt-image-1` model. These tests run against all providers supporting the `image_generation` scenario (including Gemini, which does not support these parameters), causing failures for non-OpenAI providers. Add error handling to skip on unsupported parameter errors, or create separate scenario categories (`image_generation_quality`, `image_generation_sizes`) that gate to OpenAI-only.

Separately, the `test_config` fixture is unused in all four test methods. For consistency with the pattern used in `test_google.py` (e.g., `_test_config`), consider renaming it to `_test_config` if the fixture serves only to satisfy test infrastructure requirements.

Suggested approach (parameter handling)
```diff
-    def test_52c_image_generation_quality(self, test_config, provider, model, vk_enabled):
+    def test_52c_image_generation_quality(self, _test_config, provider, model, vk_enabled):
         ...
+        try:
             response = client.images.generate(
                 model=format_provider_model(provider, model),
                 prompt=IMAGE_GENERATION_SIMPLE_PROMPT,
                 n=1,
                 size="1024x1024",
                 quality="low",
+            )
+        except Exception as e:
+            if "quality" in str(e).lower() and ("unsupported" in str(e).lower() or "invalid" in str(e).lower()):
+                pytest.skip(f"quality parameter not supported for {provider}/{model}")
+            raise
```
core/providers/azure/azure.go-1215-1304 (1)
1215-1304: Add nil validation for image generation requests to avoid panics; reuse getModelDeployment for consistency.
Both methods read `request.Model` without checking `request != nil`.

Proposed fix
```diff
 func (provider *AzureProvider) ImageGeneration(ctx *schemas.BifrostContext, key schemas.Key, request *schemas.BifrostImageGenerationRequest) (*schemas.BifrostImageGenerationResponse, *schemas.BifrostError) {
+	if request == nil {
+		return nil, providerUtils.NewBifrostOperationError("image generation request is required", nil, provider.GetProviderKey())
+	}
 	// Validate api key configs
 	if err := provider.validateKeyConfig(key); err != nil {
 		return nil, err
 	}
-	deployment := key.AzureKeyConfig.Deployments[request.Model]
-	if deployment == "" {
-		return nil, providerUtils.NewConfigurationError(fmt.Sprintf("deployment not found for model %s", request.Model), provider.GetProviderKey())
-	}
+	deployment, err := provider.getModelDeployment(key, request.Model)
+	if err != nil {
+		return nil, err
+	}
@@
 func (provider *AzureProvider) ImageGenerationStream(
@@
 	request *schemas.BifrostImageGenerationRequest,
 ) (chan *schemas.BifrostStream, *schemas.BifrostError) {
+	if request == nil {
+		return nil, providerUtils.NewBifrostOperationError("image generation request is required", nil, provider.GetProviderKey())
+	}
 	// Validate api key configs
 	if err := provider.validateKeyConfig(key); err != nil {
 		return nil, err
 	}
-	//
-	deployment := key.AzureKeyConfig.Deployments[request.Model]
-	if deployment == "" {
-		return nil, providerUtils.NewConfigurationError(fmt.Sprintf("deployment not found for model %s", request.Model), provider.GetProviderKey())
-	}
+	deployment, err := provider.getModelDeployment(key, request.Model)
+	if err != nil {
+		return nil, err
+	}
```
core/providers/gemini/gemini.go-1673-1749 (1)
1673-1749: Don’t hardcode generativelanguage base URL; normalize model and reuse configured BaseURL/context path.
`handleImagenImageGeneration` ignores `provider.networkConfig.BaseURL` and `providerUtils.GetPathFromContext`, and uses `request.Model` directly in the URL. This can break custom base URLs and any model normalization the rest of the provider relies on.

Proposed fix
```diff
 func (provider *GeminiProvider) handleImagenImageGeneration(ctx *schemas.BifrostContext, key schemas.Key, request *schemas.BifrostImageGenerationRequest) (*schemas.BifrostImageGenerationResponse, *schemas.BifrostError) {
 	providerName := provider.GetProviderKey()
+	if request == nil {
+		return nil, providerUtils.NewBifrostOperationError("image generation request is required", nil, providerName)
+	}
@@
-	baseURL := fmt.Sprintf("https://generativelanguage.googleapis.com/v1beta/models/%s:predict", request.Model)
+	// Ensure we use configured BaseURL + request-scoped path rewriting
+	_, model := schemas.ParseModelString(request.Model, schemas.Gemini)
+	baseURL := strings.TrimRight(provider.networkConfig.BaseURL, "/")
+	urlPath := providerUtils.GetPathFromContext(ctx, "/models/"+model+":predict")
@@
-	req.SetRequestURI(baseURL)
+	req.SetRequestURI(baseURL + urlPath)
```
transports/bifrost-http/handlers/inference.go-1414-1486 (1)
1414-1486: Guard `cancel()` usage when `ConvertToBifrostContext` fails; also tighten prompt validation.

At Line 1466 you call `cancel()` even when `bifrostCtx == nil`. Unless `ConvertToBifrostContext` guarantees a non-nil cancel func on failure, this can panic. Also, the prompt check should likely be `strings.TrimSpace(req.Prompt) == ""` to reject whitespace-only prompts.

Proposed diff
```diff
 	bifrostCtx, cancel := lib.ConvertToBifrostContext(ctx, h.handlerStore.ShouldAllowDirectKeys(), h.config.GetHeaderFilterConfig())
 	if bifrostCtx == nil {
-		cancel()
+		if cancel != nil {
+			cancel()
+		}
 		SendError(ctx, fasthttp.StatusInternalServerError, "Failed to convert context")
 		return
 	}
-	if req.ImageGenerationInput == nil || req.Prompt == "" {
+	if req.ImageGenerationInput == nil || strings.TrimSpace(req.Prompt) == "" {
 		SendError(ctx, fasthttp.StatusBadRequest, "prompt can not be empty")
 		return
 	}
```
core/internal/testutil/image_generation.go-58-69 (1)
58-69: Enforce deterministic image decode validation by setting `response_format=b64_json` and `output_format=png`.

The test currently accepts URL-only responses (lines 104-106), but dimension validation only executes when `b64_json` is present (line 109). Without explicitly requesting `b64_json`, providers may return URLs instead, silently skipping dimension checks. Additionally, some providers may default to `webp` format, which fails to decode since only PNG and JPEG decoders are imported (lines 9-10), causing flaky test failures.

Proposed diff
```diff
 		Params: &schemas.ImageGenerationParameters{
 			Size:    bifrost.Ptr("1024x1024"),
 			Quality: bifrost.Ptr("low"),
 			N:       bifrost.Ptr(1),
+			ResponseFormat: bifrost.Ptr("b64_json"),
+			OutputFormat:   bifrost.Ptr("png"),
 		},
```
core/providers/openai/openai.go-2534-2874 (1)
2534-2874: Fix image streaming to handle multi-image generation: multiple issues with event indexing, usage fields, and premature stream termination.

The implementation has several critical issues for multi-image streams:

1. Premature stream termination: The code returns after the first `isCompleted` event (`if isCompleted { return }`), but OpenAI's spec indicates "Final event for each image"—meaning N-image requests should receive N completed events. This truncates multi-image streams.
2. Fragile image indexing: When `incompleteImages` is empty (no prior partial chunks), completed events default to `imageIndex = 0`, collapsing multiple images. This occurs when:
   - the provider emits only completed events (no partials),
   - the provider omits `partial_image_index`, or
   - chunks arrive out of expected order.
   Consider tracking a monotonic counter for completed events, falling back when partial indices are unavailable.
3. Incomplete usage field copy: Only `InputTokens`, `OutputTokens`, and `TotalTokens` are copied from `response.Usage`. The `InputTokensDetails` field is dropped, which may contain important token breakdown information. Copy all fields or preserve `response.Usage` as-is.

Event types (
image_generation.partial_image,image_generation.completed) correctly match the OpenAI spec.core/providers/gemini/images.go-183-251 (1)
183-251: Enforce “non-empty prompt” before building image-generation provider payloads.Both
ToGeminiImageGenerationRequest()andToImagenImageGenerationRequest()will happily build requests withPrompt == "". Based on learnings, image-gen requests should be rejected early to avoid dispatching null/empty bodies to providers.Proposed minimal guard (lets CheckContextAndGetRequestBody produce the structured nil-body error)
func ToGeminiImageGenerationRequest(bifrostReq *schemas.BifrostImageGenerationRequest) *GeminiGenerationRequest { if bifrostReq == nil { return nil } - if bifrostReq.Input == nil { + if bifrostReq.Input == nil || strings.TrimSpace(bifrostReq.Input.Prompt) == "" { return nil } ... } func ToImagenImageGenerationRequest(bifrostReq *schemas.BifrostImageGenerationRequest) *GeminiImagenRequest { - if bifrostReq == nil || bifrostReq.Input == nil { + if bifrostReq == nil || bifrostReq.Input == nil || strings.TrimSpace(bifrostReq.Input.Prompt) == "" { return nil } ... }Also applies to: 253-351
core/providers/gemini/images.go-396-443 (1)
396-443: convertSizeToImagenFormat() only implements 5 of 10 documented aspect ratios.The type definition in types.go documents support for "1:1", "2:3", "3:2", "3:4", "4:3", "4:5", "5:4", "9:16", "16:9", and "21:9", but the function only handles 1:1, 3:4, 4:3, 9:16, and 16:9. For input sizes matching 2:3, 3:2, 4:5, 5:4, or 21:9, the function returns an empty aspect ratio string, causing data loss in callers.
core/providers/gemini/images.go-90-121 (1)
90-121: Missing support for "4K" size and additional aspect ratios, with case-sensitivity issue.
convertImagenFormatToSize()is incomplete and case-sensitive. The type comments document support for"1K", "2K", "4K"and 10 aspect ratios ("1:1", "2:3", "3:2", "3:4", "4:3", "4:5", "5:4", "9:16", "16:9", "21:9"), but the function only handles lowercase"1k"/"2k"(no"4K") and only 5 aspect ratios, missing"2:3", "3:2", "4:5", "5:4", "21:9". This will cause silent failures for valid requests using uppercase sizes or unsupported ratios.Add case-insensitivity with
strings.ToLower(), support"4k", and add all documented aspect ratios to match the type definition.core/providers/huggingface/huggingface.go-882-966 (1)
882-966: Add empty/whitespace prompt validation to converter (match OpenAI/Nebius pattern).
ToHuggingFaceImageGenerationRequest()only validatesbifrostReq.Input == nilbut not empty or whitespace-only prompts. OpenAI checksbifrostReq.Input.Prompt == ""and Nebius validatesstrings.TrimSpace(request.Input.Prompt) == "". Add validation for empty/whitespace prompt to match the repo-wide pattern.core/providers/huggingface/huggingface.go-1024-1278 (1)
1024-1278: Always set StreamEndIndicator when the stream goroutine exits, even with zero chunks.

The handler only sets `BifrostContextKeyStreamEndIndicator` on error paths or when `chunkIndex > 0`. If the stream ends cleanly without yielding any image chunks (e.g., the stream yields only non-`data:` lines), consumers won't receive the end indicator, breaking stream completion logic in plugins like telemetry and logging.

Use a defer at the goroutine entry to unconditionally set the indicator:
Proposed fix
```diff
 go func() {
 	defer providerUtils.ReleaseStreamingResponse(resp)
+	defer ctx.SetValue(schemas.BifrostContextKeyStreamEndIndicator, true)
 	defer close(responseChan)
 	...
 	if chunkIndex > 0 {
 		...
-		ctx.SetValue(schemas.BifrostContextKeyStreamEndIndicator, true)
 		providerUtils.ProcessAndSendResponse(...)
 	}
 	...
 	if err := scanner.Err(); err != nil {
 		...
-		ctx.SetValue(schemas.BifrostContextKeyStreamEndIndicator, true)
 		providerUtils.ProcessAndSendBifrostError(...)
 	}
 }()
```
data: {json}\n\n). Codify this assumption in a comment if confirmed.
Force-pushed from b136412 to 8b8d1f8 (Compare)
Actionable comments posted: 18
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (6)
core/providers/utils/utils.go (1)
907-918: Correct propagation, but missing in sibling error-handling functions.

This addition correctly propagates `ImageGenerationStreamResponse` through the streaming pipeline, following the established pattern for other response types.

However, the same propagation is missing in `ProcessAndSendBifrostError` (lines 959-966) and `ProcessAndSendError` (lines 1125-1131). If a post-hook transforms an error into a valid image generation response, that response will be silently dropped.

🔧 Proposed fix for ProcessAndSendBifrostError
```diff
 if processedResponse != nil {
 	streamResponse.BifrostTextCompletionResponse = processedResponse.TextCompletionResponse
 	streamResponse.BifrostChatResponse = processedResponse.ChatResponse
 	streamResponse.BifrostResponsesStreamResponse = processedResponse.ResponsesStreamResponse
 	streamResponse.BifrostSpeechStreamResponse = processedResponse.SpeechStreamResponse
 	streamResponse.BifrostTranscriptionStreamResponse = processedResponse.TranscriptionStreamResponse
+	streamResponse.BifrostImageGenerationStreamResponse = processedResponse.ImageGenerationStreamResponse
 }
```
🔧 Proposed fix for ProcessAndSendError
```diff
 if processedResponse != nil {
 	streamResponse.BifrostTextCompletionResponse = processedResponse.TextCompletionResponse
 	streamResponse.BifrostChatResponse = processedResponse.ChatResponse
 	streamResponse.BifrostResponsesStreamResponse = processedResponse.ResponsesStreamResponse
 	streamResponse.BifrostSpeechStreamResponse = processedResponse.SpeechStreamResponse
 	streamResponse.BifrostTranscriptionStreamResponse = processedResponse.TranscriptionStreamResponse
+	streamResponse.BifrostImageGenerationStreamResponse = processedResponse.ImageGenerationStreamResponse
 }
```
docs/providers/supported-providers/groq.mdx (1)
34-34: Update the Note to include Image Generation.The Note lists unsupported operations but doesn't include the newly added "Image Generation" row from the table above.
📝 Suggested fix
-**Unsupported Operations** (❌): Embeddings, Speech, Transcriptions, Files, and Batch are not supported by the upstream Groq API. These return `UnsupportedOperationError`. +**Unsupported Operations** (❌): Embeddings, Image Generation, Speech, Transcriptions, Files, and Batch are not supported by the upstream Groq API. These return `UnsupportedOperationError`.plugins/semanticcache/utils.go (1)
379-412: Avoid duplicating / storing full base64 images in semantic-cache metadata (size + retention risk).
`metadata["response"]` already contains the marshaled response; additionally storing `image_b64` can explode metadata size and may exceed vectorstore limits or retention expectations.

Proposed change: keep URLs + revised prompts, drop image_b64 from metadata
// image specific metadata if res.ImageGenerationResponse != nil { var imageURLs []string - var imageB64 []string var revisedPrompts []string for _, img := range res.ImageGenerationResponse.Data { if img.URL != "" { imageURLs = append(imageURLs, img.URL) } - if img.B64JSON != "" { - imageB64 = append(imageB64, img.B64JSON) - } if img.RevisedPrompt != "" { revisedPrompts = append(revisedPrompts, img.RevisedPrompt) } } metadata["image_urls"] = imageURLs - metadata["image_b64"] = imageB64 metadata["revised_prompts"] = revisedPrompts }core/providers/gemini/speech.go (1)
147-147: Potential panic: unchecked type assertion on context value.

Line 147 performs an unguarded type assertion `ctx.Value(BifrostContextKeyResponseFormat).(string)`. If the context value is missing or not a string, this will panic.

🐛 Proposed fix
- responseFormat := ctx.Value(BifrostContextKeyResponseFormat).(string) + responseFormat, _ := ctx.Value(BifrostContextKeyResponseFormat).(string)docs/providers/supported-providers/gemini.mdx (1)
19-30: Correct Files endpoint in table to match Google's official API pathThe table lists
/upload/storage/v1beta/files, but Google's official Generative AI Files API specifies/upload/v1beta/files(without "storage" in the path). Update line 27 to/upload/v1beta/filesfor accuracy.core/providers/openai/openai.go (1)
2286-2311: TranscriptionStream: comment lines (':') should be skipped to avoid noisy unmarshal warnings.

Line 2290 only checks `line == ""`, but the comment says "Skip empty lines and comments". Other SSE handlers skip `strings.HasPrefix(line, ":")` as well; doing so avoids repeatedly trying to unmarshal comment lines.

Proposed fix
```diff
-	// Skip empty lines and comments
-	if line == "" {
+	// Skip empty lines and comments
+	if line == "" || strings.HasPrefix(line, ":") {
 		continue
 	}
```
🤖 Fix all issues with AI agents
In @core/internal/testutil/response_validation.go:
- Around line 240-275: The image validation can panic when
expectations.ProviderSpecific is nil because code accesses
expectations.ProviderSpecific["min_images"] and ["expected_size"] without
guarding; update ValidateImageGenerationResponse/validateImageGenerationFields
(where ProviderSpecific keys are read) to first check if
expectations.ProviderSpecific != nil (or use a safe lookup with value, ok :=
expectations.ProviderSpecific["min_images"]) before indexing, and treat missing
keys as absent/defaults; apply the same nil-guard pattern to the other
occurrences referenced (lines ~1092-1166) to prevent nil-map panics.
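For the guarded lookup, a small hypothetical helper along these lines keeps the defaults explicit (field and key names are taken from the instruction above):

```go
// Sketch: read optional expectation keys defensively, treating a nil map
// or a missing/mistyped key as "not set" and falling back to a default.
func lookupIntExpectation(providerSpecific map[string]interface{}, key string, def int) int {
	if providerSpecific == nil {
		return def
	}
	if v, ok := providerSpecific[key]; ok {
		if n, isInt := v.(int); isInt {
			return n
		}
	}
	return def
}

// usage inside validateImageGenerationFields (names per the instruction above):
//   minImages := lookupIntExpectation(expectations.ProviderSpecific, "min_images", 1)
```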
In @core/providers/azure/azure.go:
- Around line 1212-1305: Both ImageGeneration and ImageGenerationStream
dereference request.Model before checking for nil; add an early guard to return
a configuration error when request == nil and also validate that request.Model
is not empty, then use the existing getModelDeployment (or
provider.getModelDeployment) helper to resolve the deployment instead of
directly indexing key.AzureKeyConfig.Deployments; update both functions to
perform: if request == nil { return nil,
providerUtils.NewConfigurationError("request is nil", provider.GetProviderKey())
} then if request.Model == "" { return nil,
providerUtils.NewConfigurationError("model not specified",
provider.GetProviderKey()) } and replace the direct deployment lookup with a
call to getModelDeployment to obtain the deployment and handle missing
deployment uniformly.
In @core/providers/gemini/gemini.go:
- Around line 1614-1754: Add the same nil and prompt validation used across
providers to both ImageGeneration and handleImagenImageGeneration: check request
!= nil and that request.Input (or equivalent) contains a non-empty Prompt
(return the appropriate Bifrost error as other providers do) before referencing
request.Model or building the request body. In handleImagenImageGeneration, stop
hardcoding the Google URL; construct the request URL from
provider.networkConfig.BaseURL combined with the path helper (e.g.,
providerUtils.GetPathFromContext or the repo’s path-building utility) and the
model predict suffix instead of using
"https://generativelanguage.googleapis.com/…", so routing/proxy/BaseURL
overrides are respected. Ensure these validations return early and match
existing error creation patterns used elsewhere in the provider code.
In @core/providers/gemini/images.go:
- Around line 90-121: The convertImagenFormatToSize function is missing several
aspect ratio cases documented in GeminiImagenParameters (2:3, 3:2, 4:5, 5:4,
21:9), causing those inputs to fall back to a square size; update
convertImagenFormatToSize to handle these ratios by adding switch branches for
"2:3" (width = baseSize*2/3, height = baseSize), "3:2" (width = baseSize, height
= baseSize*2/3), "4:5" (width = baseSize*4/5, height = baseSize), "5:4" (width =
baseSize, height = baseSize*4/5), and "21:9" (width = baseSize*21/9, height =
baseSize) using strconv.Itoa on the computed integers so the function returns
correct WxH strings consistent with existing cases.
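As a sketch of the added branches (mirroring the arithmetic spelled out above; the `WxH` return format and `baseSize` handling follow the existing 1K/2K cases):

```go
// Sketch: the missing aspect-ratio branches, shown as a standalone helper.
// baseSize is the pixel count implied by the "1K"/"2K" size class.
func extraAspectRatioSize(aspectRatio string, baseSize int) string {
	switch aspectRatio {
	case "2:3":
		return strconv.Itoa(baseSize*2/3) + "x" + strconv.Itoa(baseSize)
	case "3:2":
		return strconv.Itoa(baseSize) + "x" + strconv.Itoa(baseSize*2/3)
	case "4:5":
		return strconv.Itoa(baseSize*4/5) + "x" + strconv.Itoa(baseSize)
	case "5:4":
		return strconv.Itoa(baseSize) + "x" + strconv.Itoa(baseSize*4/5)
	case "21:9":
		return strconv.Itoa(baseSize*21/9) + "x" + strconv.Itoa(baseSize)
	}
	return ""
}
```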
In @core/providers/gemini/transcription.go:
- Around line 36-41: The code silently skips parts when
decodeBase64StringToBytes(part.InlineData.Data) fails, causing inconsistent
error handling with ToBifrostSpeechResponse; update the handler so decoding
failures are propagated (or at minimum logged) rather than silently continued:
replace the current "continue" behavior with returning or bubbling up a wrapped
error (e.g., fmt.Errorf or errors.Wrap) from the enclosing function, or if
returning isn't possible, call the module logger with a clear message including
part identifiers and the decode error before continuing to keep observability
consistent with ToBifrostSpeechResponse.
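One way to surface the decode failure rather than dropping the part silently — a sketch only, with the enclosing loop and part fields assumed from the description above:

```go
// Sketch: propagate inline-data decode failures instead of continuing silently.
for i, part := range parts {
	if part.InlineData == nil {
		continue
	}
	raw, err := decodeBase64StringToBytes(part.InlineData.Data)
	if err != nil {
		return nil, fmt.Errorf("transcription: failed to decode inline data for part %d: %w", i, err)
	}
	decoded = append(decoded, raw...)
}
```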
In @core/providers/openai/openai.go:
- Around line 2382-2491: Add explicit validation to prevent nil/missing-prompt
panics: in ImageGeneration ensure the incoming req is non-nil and contains a
non-empty prompt (or prompts) before calling HandleOpenAIImageGenerationRequest;
in HandleOpenAIImageGenerationRequest check request != nil and that the prompt
field(s) are present/non-empty before calling ToOpenAIImageGenerationRequest or
providerUtils.CheckContextAndGetRequestBody, and return an appropriate
BifrostError (using the existing providerUtils error constructors and
schemas.ImageGenerationRequest/providerName) when validation fails.
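A sketch of the validation at the top of the shared handler; the exact signature and error constructor are assumptions based on the `providerUtils` patterns quoted elsewhere in this review:

```go
// Sketch: reject nil requests and empty prompts before building the provider body.
func HandleOpenAIImageGenerationRequest(ctx *schemas.BifrostContext /* ...client, url, headers... */, request *schemas.BifrostImageGenerationRequest, providerName schemas.ModelProvider) (*schemas.BifrostImageGenerationResponse, *schemas.BifrostError) {
	if request == nil || request.Input == nil || strings.TrimSpace(request.Input.Prompt) == "" {
		return nil, providerUtils.NewBifrostOperationError("image generation requires a non-empty prompt", nil, providerName)
	}
	// ...ToOpenAIImageGenerationRequest + providerUtils.CheckContextAndGetRequestBody continue here...
}
```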
In @core/schemas/plugin.go:
- Around line 49-51: Update the top-of-function doc comment to reference the
exported function name CaseInsensitiveLookup (instead of the old unexported
caseInsensitiveLookup) and ensure the comment follows Go's convention of
starting with the function name: e.g., "CaseInsensitiveLookup looks up..." and
then keep the rest of the description unchanged.
In @docs/openapi/paths/integrations/openai/images.yaml:
- Around line 62-66: The OpenAPI spec is missing the required flag for the Azure
OpenAI image generation query parameter named "api-version"; update the
parameter block for name: api-version (in: query) in
docs/openapi/paths/integrations/openai/images.yaml to include required: true so
the spec enforces the mandatory api-version query parameter for the
images/generations endpoint.
In @docs/providers/supported-providers/vertex.mdx:
- Line 314: The header " # 5. Image Generation" in
docs/providers/supported-providers/vertex.mdx is misnumbered; change that header
to "# 4. Image Generation" and update the subsequent "List Models" header to "#
5. List Models" (or renumber any following top-level headers accordingly) so the
section sequence 1, 2, 3, 4, 5 is preserved; locate and edit the literal header
lines "5. Image Generation" and the "List Models" header to their corrected
numeric values.
- Line 402: Update the Markdown heading in the vertex provider doc: change the
heading text from "# 6. List Models" to "# 5. List Models" (the header in the
file that currently reads "6. List Models"); if the "Image Generation" section
is moved to section 4 as noted, ensure this header becomes section 5 so
numbering remains sequential.
In @docs/quickstart/go-sdk/multimodal.mdx:
- Around line 85-88: The image generation example should note that token-based
usage metrics are not provided by image-generation providers, so response.Usage
will often be nil; update the multimodal example around the response.Usage check
to either remove the Usage printing or add a short inline comment clarifying
that for image generation endpoints response.Usage (and Usage.TotalTokens) may
be empty/not populated and thus won’t produce metrics. Ensure the change
references response.Usage and Usage.TotalTokens so readers know which fields are
affected.
In @plugins/semanticcache/plugin_image_generation_test.go:
- Around line 388-402: The branch that handles duration2 < duration1 computes
speedup := float64(duration1)/float64(duration2) which is always >1, so the
subsequent check if speedup < 0.1 is unreachable; remove that inner conditional
and its t.Errorf call (keep the t.Logf). Locate the block that compares
duration1 and duration2 (variables duration1, duration2, speedup) inside the
test and delete the dead-speedup check to avoid dead code.
In @transports/bifrost-http/handlers/inference.go:
- Around line 141-158: The imageParamsKnownFields map is missing keys present on
schemas.ImageGenerationParameters which causes typed fields to be treated as
ExtraParams; update the imageParamsKnownFields variable to include at least
"seed", "negative_prompt", and "num_inference_steps" so those keys are
recognized as known fields and not placed into ExtraParams, ensuring
ImageGenerationParameters handling remains authoritative.
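Concretely, the map addition could look like this (a sketch — only the three keys named above are confirmed missing; the existing entries are illustrative):

```go
// Sketch: extend the known-fields set so typed parameters are not demoted to ExtraParams.
var imageParamsKnownFields = map[string]bool{
	// ...existing keys such as "size", "quality", "n", "response_format"...
	"seed":                true,
	"negative_prompt":     true,
	"num_inference_steps": true,
}
```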
In @transports/bifrost-http/integrations/router.go:
- Around line 203-210: RegisterRoutes currently only validates
RequestConverter/ErrorConverter but can panic when route configs lack other
required converters; update RegisterRoutes (and any route config validation) to
explicitly check for nil on route-specific converter fields such as
ImageGenerationResponseConverter and ImageGenerationStreamResponseConverter (and
any other converters referenced in the blocks around lines ~256-264 and
~277-309), and either return a clear structured error indicating the missing
converter or assign a safe default/no-op converter to avoid panics at runtime;
reference the converter types ImageGenerationResponseConverter and
ImageGenerationStreamResponseConverter when adding these checks.
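Building on the check above, a startup-time validation could look roughly like this (hypothetical field and type names, mirroring the converters referenced in this comment):

```go
// Sketch: fail fast at registration time instead of panicking at request time.
func validateRouteConfig(config *RouteConfig) error {
	if config.ImageGenerationRequestConverter != nil {
		if config.ImageGenerationResponseConverter == nil {
			return fmt.Errorf("route %s: ImageGenerationResponseConverter is required", config.Path)
		}
		if config.StreamConfig != nil && config.StreamConfig.ImageGenerationStreamResponseConverter == nil {
			return fmt.Errorf("route %s: ImageGenerationStreamResponseConverter is required for streaming", config.Path)
		}
	}
	return nil
}
```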
- Around line 1085-1087: The ImageGenerationStreamResponseConverter on
StreamConfig can be nil and cause a panic during the streaming loop; add a
nil-check guard where you call g.client.ImageGenerationStreamRequest and where
the stream is consumed (same pattern used for other converters), returning a
clear error if StreamConfig.ImageGenerationStreamResponseConverter is nil, or
add a startup validation that ensures ImageGenerationStreamResponseConverter is
set when StreamConfig exists so ImageGenerationStreamRequest and its stream loop
never dereference a nil converter.
- Around line 696-718: Guard the call to ImageGenerationResponseConverter:
before invoking config.ImageGenerationResponseConverter(bifrostCtx,
imageGenerationResponse) check whether config.ImageGenerationResponseConverter
is nil and if so call g.sendError(ctx, bifrostCtx, config.ErrorConverter,
newBifrostError(nil, "missing ImageGenerationResponseConverter for
integration")) and return; ensure this nil-check happens after verifying
imageGenerationResponse is non-nil (and keep existing sendError/newBifrostError
usage and the surrounding post-callback handling intact).
In @ui/app/workspace/logs/views/imageView.tsx:
- Around line 27-34: The effect that clamps currentIndex is running
unnecessarily because `images` is recreated each render; update the dependency
array for the useEffect that references `setCurrentIndex` and `totalImages` to
only depend on `totalImages` (i.e., change useEffect([...]) to useEffect(() => {
... }, [totalImages]);); alternatively, if you need the effect to run when the
actual image content changes, memoize `images` using `useMemo` (wrap the
`imageOutput?.data?.filter(...)` expression in useMemo with
`[imageOutput?.data]`) and keep `images` in the dependency array so the effect
only triggers on real changes.
🧹 Nitpick comments (29)
core/schemas/utils.go (1)
1183-1186: Consider aligning case-sensitivity behavior with other model-checking functions.

This function applies `strings.ToLower()` before matching, whereas the adjacent functions (`IsNovaModel`, `IsAnthropicModel`, `IsMistralModel`, `IsGeminiModel`) perform case-sensitive matching. This creates inconsistent behavior: `IsImagenModel("IMAGEN-3")` returns `true`, but `IsGeminiModel("GEMINI-1.5")` returns `false`.

If case-insensitive matching is intentional for Imagen models, the logic is correct as-is. Otherwise, consider aligning with the existing pattern:
🔧 Suggested change for consistency
// IsImagenModel checks if the model is an Imagen model. func IsImagenModel(model string) bool { - return strings.Contains(strings.ToLower(model), "imagen") + return strings.Contains(model, "imagen") }framework/logstore/migrations.go (1)
684-748: Consider combining both image generation column migrations.Both
migrationAddImageGenerationOutputColumnandmigrationAddImageGenerationInputColumnadd columns for the same feature. Similar to howmigrationAddResponsesOutputColumn(lines 198-259) adds multiple related columns in a single migration, consolidating these would reduce migration overhead and keep related schema changes atomic.♻️ Suggested consolidation
-func migrationAddImageGenerationOutputColumn(ctx context.Context, db *gorm.DB) error { +func migrationAddImageGenerationColumns(ctx context.Context, db *gorm.DB) error { opts := *migrator.DefaultOptions opts.UseTransaction = true m := migrator.New(db, &opts, []*migrator.Migration{{ - ID: "logs_add_image_generation_output_column", + ID: "logs_add_image_generation_columns", Migrate: func(tx *gorm.DB) error { tx = tx.WithContext(ctx) migrator := tx.Migrator() if !migrator.HasColumn(&Log{}, "image_generation_output") { if err := migrator.AddColumn(&Log{}, "image_generation_output"); err != nil { return err } } + if !migrator.HasColumn(&Log{}, "image_generation_input") { + if err := migrator.AddColumn(&Log{}, "image_generation_input"); err != nil { + return err + } + } return nil }, Rollback: func(tx *gorm.DB) error { tx = tx.WithContext(ctx) migrator := tx.Migrator() if migrator.HasColumn(&Log{}, "image_generation_output") { if err := migrator.DropColumn(&Log{}, "image_generation_output"); err != nil { return err } } + if migrator.HasColumn(&Log{}, "image_generation_input") { + if err := migrator.DropColumn(&Log{}, "image_generation_input"); err != nil { + return err + } + } return nil }, }}) err := m.Migrate() if err != nil { - return fmt.Errorf("error while adding image generation output column: %s", err.Error()) + return fmt.Errorf("error while adding image generation columns: %s", err.Error()) } return nil } - -func migrationAddImageGenerationInputColumn(ctx context.Context, db *gorm.DB) error { - opts := *migrator.DefaultOptions - opts.UseTransaction = true - m := migrator.New(db, &opts, []*migrator.Migration{{ - ID: "logs_add_image_generation_input_column", - Migrate: func(tx *gorm.DB) error { - tx = tx.WithContext(ctx) - migrator := tx.Migrator() - if !migrator.HasColumn(&Log{}, "image_generation_input") { - if err := migrator.AddColumn(&Log{}, "image_generation_input"); err != nil { - return err - } - } - return nil - }, - Rollback: func(tx *gorm.DB) error { - tx = tx.WithContext(ctx) - migrator := tx.Migrator() - if migrator.HasColumn(&Log{}, "image_generation_input") { - if err := migrator.DropColumn(&Log{}, "image_generation_input"); err != nil { - return err - } - } - return nil - }, - }}) - err := m.Migrate() - if err != nil { - return fmt.Errorf("error while adding image generation input column: %s", err.Error()) - } - return nil -}And update
triggerMigrations:- if err := migrationAddImageGenerationOutputColumn(ctx, db); err != nil { - return err - } - if err := migrationAddImageGenerationInputColumn(ctx, db); err != nil { + if err := migrationAddImageGenerationColumns(ctx, db); err != nil { return err }That said, the current implementation is correct and follows the defensive patterns established in the codebase.
ui/app/workspace/logs/views/imageView.tsx (1)
75-81: Simplify the misleading `isStreaming` prop.

The expression `isStreaming && !currentImage` is always `false` here because this code is inside the `currentImage && (...)` block where `currentImage` is truthy. This is confusing to read.

Since `ImageMessage` only uses `isStreaming` to show a skeleton when there's no usable image data, and `currentImage` always has usable data (it passed the filter), you can simply omit the prop or explicitly pass `false`.

Proposed simplification
<ImageMessage image={{ ...currentImage, output_format: imageOutput?.params?.output_format, }} - isStreaming={isStreaming && !currentImage} />core/providers/huggingface/models.go (1)
67-69: Consider adding tag-based detection for image generation.The pipeline-based detection for "text-to-image" is correct. However, other capabilities like embeddings (lines 74-77), chat (lines 78-83), speech (lines 84-86), and transcription (lines 87-89) have both pipeline-based AND tag-based detection. For consistency, consider adding tag-based detection for image generation models.
🔧 Suggested addition for tag-based image generation detection
case tagLower == "automatic-speech-recognition" || tagLower == "speech-to-text" || strings.Contains(tagLower, "speech-recognition"): addMethods(schemas.TranscriptionRequest) + case tagLower == "text-to-image" || tagLower == "image-generation" || + strings.Contains(tagLower, "text-to-image") || strings.Contains(tagLower, "image-gen"): + addMethods(schemas.ImageGenerationRequest) }docs/providers/supported-providers/openai.mdx (1)
240-246: Minor style suggestion: Vary sentence beginnings.Per static analysis, three successive bullet points begin with the same structural pattern. Consider varying the phrasing slightly:
📝 Suggested rewording
-OpenAI is the baseline schema for image generation. Parameters are passed through with minimal conversion: +OpenAI serves as Bifrost's baseline schema for image generation, requiring minimal conversion: -- **Model & Prompt**: `bifrostReq.Model` → `req.Model`, `bifrostReq.Input.Prompt` → `req.Prompt` -- **Parameters**: All fields from `bifrostReq.Params` (`ImageGenerationParameters`) are embedded directly into the OpenAI request struct via struct embedding. No field mapping or transformation is performed. +- **Model & Prompt**: Maps `bifrostReq.Model` → `req.Model` and `bifrostReq.Input.Prompt` → `req.Prompt` +- **Parameters**: Fields from `bifrostReq.Params` (`ImageGenerationParameters`) embed directly into the OpenAI request struct. No field mapping or transformation is performed. - **Streaming**: When streaming is requested, `stream: true` is set in the request body.core/providers/huggingface/utils.go (1)
170-186: Consider defining explicit behavior for image-gen streaming onnebius/togetherroutes.At Line 173–176 and Line 180–185,
nebiusandtogetheronly handleschemas.ImageGenerationRequest. If image-gen streaming is intentionally unsupported for these, it may be cleaner to return anUnsupportedOperationErrorforschemas.ImageGenerationStreamRequest(vs a genericfmt.Errorf(...)), to keep error semantics consistent across providers.plugins/semanticcache/test_utils.go (1)
356-365: Be careful:t.Skipfcan mask regressions if CI expects Weaviate to be present.If CI/environment should guarantee the vector store, consider gating the skip behind an env like
REQUIRE_WEAVIATE=1(fail in CI, skip locally).docs/providers/supported-providers/nebius.mdx (1)
135-189: Consider looseningoutput_formatdocs to include"jpg"as accepted input (not just"jpeg").The code path typically normalizes
"jpeg"→"jpg", but clients often send"jpg"directly—documenting both reduces confusion.Also, the LanguageTool nit about repeated sentence starts is safe to ignore unless you’re already editing that block.
docs/providers/supported-providers/azure.mdx (2)
20-31: Potential doc confusion: table endpoint/openai/v1/images/generationsvs examples using/v1/images/generations.If both routes exist (gateway vs provider-prefixed), consider adding a one-liner clarifying which endpoint family the table refers to and which the examples use.
364-429: Verify the documented request JSON matches the actual image-generation request schema exposed by the gateway.This section mixes OpenAI-style fields (e.g.,
prompt,size,n) with the internal Bifrost struct shape (input.prompt,params.*) used elsewhere—please align with the OpenAPI spec in this Graphite stack (or document both shapes explicitly, if both are supported).core/internal/testutil/account.go (1)
1129-1155: Minor: XAI scenario field ordering differs from other providers.The
ImageGenerationandImageGenerationStreamfields are placed afterListModelsin the XAI config (lines 1153-1154), whereas other providers place them earlier in theScenariosstruct. This doesn't affect functionality but slightly reduces consistency.Suggested reordering for consistency
Scenarios: TestScenarios{ TextCompletion: false, // Not typical SimpleChat: true, CompletionStream: true, MultiTurnConversation: true, ToolCalls: true, MultipleToolCalls: true, End2EndToolCalling: true, AutomaticFunctionCall: true, ImageURL: true, ImageBase64: true, MultipleImages: true, CompleteEnd2End: true, SpeechSynthesis: false, // Not supported SpeechSynthesisStream: false, // Not supported Transcription: false, // Not supported TranscriptionStream: false, // Not supported Embedding: false, // Not supported + ImageGeneration: true, + ImageGenerationStream: false, ListModels: true, - ImageGeneration: true, - ImageGenerationStream: false, },tests/integrations/python/tests/utils/common.py (1)
2707-2723: Consider adding type hints for consistency.The new functions
assert_valid_image_generation_responseandassert_image_generation_usagelack return type hints, while other similar functions in this file (e.g.,assert_valid_embedding_response) have-> Noneannotations.Add return type hints
-def assert_valid_image_generation_response(response: Any, library: str = "openai"): +def assert_valid_image_generation_response(response: Any, library: str = "openai") -> None:-def assert_image_generation_usage(response: Any, library: str = "openai"): +def assert_image_generation_usage(response: Any, library: str = "openai") -> None:plugins/semanticcache/utils.go (1)
956-1008: ImageGeneration parameter-to-metadata mapping looks complete; watch key collisions via ExtraParams.
maps.Copy(metadata, params.ExtraParams)can overwrite typed keys (e.g.,"size","quality"). If that’s intended (provider-specific override), ok; otherwise consider copying ExtraParams under a prefix like"extra_*"to avoid silent overrides.transports/bifrost-http/handlers/inference.go (1)
1258-1334: Avoid logging entire streaming chunks on marshal failure (can leak large/secret payloads).
logger.Warn(..., chunk: %v)may dump base64 images, attachments, or other sensitive content into logs. Consider logging only the chunk type + request id/correlation fields.framework/streaming/accumulator.go (1)
293-324: Optional: reduce allocations inimageChunkKeyif image streaming is high-throughput.
fmt.Sprintf("%d:%d", ...)allocates; if this ends up in a hot path, consider a smallstrconv.AppendInt-based builder or a fixed-width encoding.core/internal/testutil/test_retry_framework.go (1)
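If this path does become hot, an allocation-light key builder using `strconv.AppendInt` is straightforward — a sketch, with parameter names assumed rather than taken from the actual accumulator:

```go
// Sketch: build "index:chunkIndex" keys without fmt.Sprintf allocations.
func imageChunkKey(outputIndex, chunkIndex int) string {
	buf := make([]byte, 0, 24)
	buf = strconv.AppendInt(buf, int64(outputIndex), 10)
	buf = append(buf, ':')
	buf = strconv.AppendInt(buf, int64(chunkIndex), 10)
	return string(buf)
}
```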
1250-1252: Consider separating retry configs for ImageGeneration vs ImageGenerationStream scenarios.Right now both map to
DefaultImageGenerationRetryConfig(). If you want stream-specific behavior (e.g., retry on incomplete stream / stream errors), usingStreamingRetryConfig()(or a new image-stream-specific config) for"ImageGenerationStream"would be clearer.core/providers/openai/types.go (1)
546-584: Add explicit constants for image SSE event types to prevent drift/typos.The unconstrained
ImageGenerationEventTypeshould have typed constants defined. OpenAI's official image generation streaming API usesimage_generation.partial_imageandimage_generation.completedas event types—defining these as constants enforces the contract and prevents typos.Proposed change
type ImageGenerationEventType string +const ( + ImageGenerationEventPartial ImageGenerationEventType = "image_generation.partial_image" + ImageGenerationEventCompleted ImageGenerationEventType = "image_generation.completed" +)core/providers/gemini/utils.go (1)
1454-1469: LGTM! Clean base64 encoding helper.The
encodeBytesToBase64Stringfunction correctly handles nil/empty input and uses standard base64 encoding as documented.One minor observation: the local variable declaration could be simplified, but this is a stylistic preference.
✨ Optional simplification
func encodeBytesToBase64String(bytes []byte) string { - var base64str string - - if len(bytes) > 0 { - // Use standard base64 encoding to match external SDK expectations - base64str = base64.StdEncoding.EncodeToString(bytes) - } - - return base64str + if len(bytes) == 0 { + return "" + } + return base64.StdEncoding.EncodeToString(bytes) }core/providers/nebius/nebius_test.go (1)
31-32: LGTM! Image generation test configuration added for Nebius.The configuration correctly enables image generation testing with the
black-forest-labs/flux-schnellmodel and sets streaming to false.Minor observation:
ImageGenerationStream(line 49) is placed afterCompleteEnd2End(line 48), which slightly breaks the logical grouping where related flags are adjacent. Consider placing it immediately afterImageGenerationfor consistency with other provider test files.Also applies to: 47-49
core/providers/xai/xai.go (1)
230-244: LGTM! ImageGeneration method correctly delegates to OpenAI handler.The implementation properly:
- Uses the shared
HandleOpenAIImageGenerationRequesthandler- Passes all required configuration (client, URL, headers, raw request/response flags)
- Uses
providerUtils.GetPathFromContextfor path resolution- Follows the established pattern of other methods in this file
Minor: Line 230 has a grammatical issue: "performs a image" should be "performs an image".
📝 Minor grammar fix
-// ImageGeneration performs a image generation request to the xAI API. +// ImageGeneration performs an image generation request to the xAI API.core/schemas/provider.go (1)
352-357: Minor grammar nit in doc comments.The doc comments say "a image" instead of "an image".
📝 Suggested fix
- // ImageGeneration performs a image generation request + // ImageGeneration performs an image generation request ImageGeneration(ctx *BifrostContext, key Key, request *BifrostImageGenerationRequest) ( *BifrostImageGenerationResponse, *BifrostError) - // ImageGenerationStream performs a image generation stream request + // ImageGenerationStream performs an image generation stream request ImageGenerationStream(ctx *BifrostContext, postHookRunner PostHookRunner, key Key, request *BifrostImageGenerationRequest) (chan *BifrostStream, *BifrostError)core/providers/nebius/nebius.go (1)
259-259: Consider usingprovider.GetProviderKey()for consistency.The hardcoded
schemas.Nebiusis used here, but other methods in this file (e.g.,Speech,Transcription, batch operations) useprovider.GetProviderKey(). While Nebius doesn't currently support custom provider configs, using the getter would maintain consistency and future-proof the code.♻️ Suggested change
path := providerUtils.GetPathFromContext(ctx, "/v1/images/generations") - providerName := schemas.Nebius + providerName := provider.GetProviderKey()core/providers/bedrock/bedrock.go (1)
1347-1355: Unsupported ImageGeneration endpoints: OK, but provider naming is now inconsistent vs other unsupported ops in this file.
Here you useprovider.GetProviderKey()whileSpeech*/Transcription*useschemas.Bedrock. If custom provider names are supported, consider standardizing all unsupported ops toprovider.GetProviderKey()for consistentExtraFields.Provider.docs/openapi/paths/inference/images.yaml (1)
1-58: OpenAPI path wiring looks consistent; consider de-risking the hard-coded model list.
The “Supported Providers” section enumerates specific model IDs that will drift; recommend either (a) keeping it provider-level without concrete model IDs, or (b) generating from a central source.plugins/semanticcache/plugin_image_generation_test.go (1)
38-41: Redundantreturnaftert.Skipf.
t.Skipfmarks the test as skipped and returns from the test function via runtime.Goexit(), so the explicitreturnon line 40 is unnecessary. This pattern repeats throughout the file (lines 139-141, 239-241, 251-253, 265-267, 296-298).♻️ Suggested cleanup
if err1 != nil { - t.Skipf("First image generation request failed (may be rate limited): %v", err1) - return + t.Skipf("First image generation request failed (may be rate limited): %v", err1) }core/internal/testutil/image_generation.go (1)
154-287: Avoid coupling generic stream testutil to the OpenAI provider for completion detectionLine 233 uses
openai.ImageGenerationCompletedto detect stream completion. That couplescore/internal/testutilto a specific provider package. If the intent is “canonical event types across all providers”, prefer moving the canonical constants tocore/schemas(orcore/providers/utils) and referencing them from there.core/providers/gemini/types.go (1)
1702-1707: Inconsistent struct definition:GeminiImagenRequest.Instancesuses anonymous struct instead ofImagenInstance.
GeminiImagenRequest.Instancesdefines an anonymous struct withPrompt *string, whileImagenInstance(line 91-93) usesPrompt string. Consider using theImagenInstancetype for consistency:♻️ Suggested refactor for consistency
type GeminiImagenRequest struct { - Instances *[]struct { - Prompt *string `json:"prompt"` - } `json:"instances"` + Instances *[]ImagenInstance `json:"instances"` Parameters GeminiImagenParameters `json:"parameters"` }Note: This would require updating
ImagenInstance.Promptto be*stringif nil prompts need to be distinguished, or updating callers to handle the non-pointer type.core/providers/gemini/images.go (2)
379-394: Asymmetric format handling:gifis handled inconvertMimeTypeToExtensionbut not inconvertOutputFormatToMimeType.For consistency in round-trip conversions, consider adding
gifsupport:♻️ Add gif support for symmetry
func convertOutputFormatToMimeType(outputFormat string) string { format := strings.ToLower(strings.TrimSpace(outputFormat)) switch format { case "png": return "image/png" case "jpg", "jpeg": return "image/jpeg" case "webp": return "image/webp" + case "gif": + return "image/gif" default: return "" } }
425-443: Missing aspect ratio detection inconvertSizeToImagenFormat.Similar to
convertImagenFormatToSize, this function only detects a subset of aspect ratios. Consider adding detection for the full set supported by Imagen.♻️ Add missing aspect ratio detection
// Common aspect ratios with tolerance if ratio >= 0.99 && ratio <= 1.01 { aspectRatio = "1:1" + } else if ratio >= 0.66 && ratio <= 0.68 { + aspectRatio = "2:3" + } else if ratio >= 1.49 && ratio <= 1.51 { + aspectRatio = "3:2" } else if ratio >= 0.74 && ratio <= 0.76 { aspectRatio = "3:4" } else if ratio >= 1.32 && ratio <= 1.34 { aspectRatio = "4:3" + } else if ratio >= 0.79 && ratio <= 0.81 { + aspectRatio = "4:5" + } else if ratio >= 1.24 && ratio <= 1.26 { + aspectRatio = "5:4" } else if ratio >= 0.56 && ratio <= 0.57 { aspectRatio = "9:16" } else if ratio >= 1.77 && ratio <= 1.78 { aspectRatio = "16:9" + } else if ratio >= 2.32 && ratio <= 2.34 { + aspectRatio = "21:9" }
Force-pushed from 8b8d1f8 to d283d40 (Compare)
Actionable comments posted: 20
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (9)
docs/providers/supported-providers/ollama.mdx (1)
31-35: Update the Note to include Image Generation.The Note lists unsupported operations but doesn't include Image Generation, which is now shown as ❌ in the table above.
📝 Suggested fix
<Note> -**Unsupported Operations** (❌): Speech, Transcriptions, Files, and Batch are not supported by the upstream Ollama API. These return `UnsupportedOperationError`. +**Unsupported Operations** (❌): Image Generation, Speech, Transcriptions, Files, and Batch are not supported by the upstream Ollama API. These return `UnsupportedOperationError`. Ollama is self-hosted. Ensure you have an Ollama instance running and configured with the correct BaseURL (e.g., `http://localhost:11434`). </Note>core/utils.go (1)
50-53: Add ImageGenerationRequest and ImageGenerationStreamRequest to model validation.The
isModelRequiredfunction currently excludesImageGenerationRequestandImageGenerationStreamRequest, but the HTTP handler explicitly validates that a model is required for these requests (lines 1427-1429 in inference.go). This validation should be enforced at the core API level inisModelRequiredto ensure consistent model requirement checking for all code paths, including direct API calls that bypass the HTTP handler.Suggested change
func isModelRequired(reqType schemas.RequestType) bool { - return reqType == schemas.TextCompletionRequest || reqType == schemas.TextCompletionStreamRequest || reqType == schemas.ChatCompletionRequest || reqType == schemas.ChatCompletionStreamRequest || reqType == schemas.ResponsesRequest || reqType == schemas.ResponsesStreamRequest || reqType == schemas.SpeechRequest || reqType == schemas.SpeechStreamRequest || reqType == schemas.TranscriptionRequest || reqType == schemas.TranscriptionStreamRequest || reqType == schemas.EmbeddingRequest + return reqType == schemas.TextCompletionRequest || reqType == schemas.TextCompletionStreamRequest || reqType == schemas.ChatCompletionRequest || reqType == schemas.ChatCompletionStreamRequest || reqType == schemas.ResponsesRequest || reqType == schemas.ResponsesStreamRequest || reqType == schemas.SpeechRequest || reqType == schemas.SpeechStreamRequest || reqType == schemas.TranscriptionRequest || reqType == schemas.TranscriptionStreamRequest || reqType == schemas.EmbeddingRequest || reqType == schemas.ImageGenerationRequest || reqType == schemas.ImageGenerationStreamRequestui/lib/types/schemas.ts (1)
327-345: Add defaults for new required `allowed_requests` fields to prevent validation failures on existing configs. When `allowed_requests` is stored with older field subsets and later re-validated, missing `image_generation` and `image_generation_stream` will fail schema parsing. Use `.default(false)` for these new fields to maintain backward compatibility with existing persisted configurations.
Proposed fix
```diff
 export const allowedRequestsSchema = z.object({
 	text_completion: z.boolean(),
 	text_completion_stream: z.boolean(),
 	chat_completion: z.boolean(),
 	chat_completion_stream: z.boolean(),
 	responses: z.boolean(),
 	responses_stream: z.boolean(),
 	embedding: z.boolean(),
 	speech: z.boolean(),
 	speech_stream: z.boolean(),
 	transcription: z.boolean(),
 	transcription_stream: z.boolean(),
-	image_generation: z.boolean(),
-	image_generation_stream: z.boolean(),
+	image_generation: z.boolean().default(false),
+	image_generation_stream: z.boolean().default(false),
 	count_tokens: z.boolean(),
 	list_models: z.boolean(),
 });
```
core/schemas/responses.go (1)
540-575: Correctness gap: error message claims image generation output support that isn't implemented.
`ResponsesToolMessageOutputStruct` error messages (lines 556, 574) claim support for "image generation call output", but the struct has no corresponding field and no marshal/unmarshal branch to handle it. The struct only contains:
- `ResponsesToolCallOutputStr`
- `ResponsesFunctionToolCallOutputBlocks`
- `ResponsesComputerToolCallOutput`

Image generation outputs will fail to serialize/deserialize with a misleading error message.
Add a `ResponsesImageGenerationOutput` field (using the appropriate image generation output type) and handle it in both `MarshalJSON` and `UnmarshalJSON`, following the same pattern as `ResponsesComputerToolCallOutput`.
135-158: Fix unsafe type assertion on context value that will panic whenrequest.Paramsis nil.The code assumes
BifrostContextKeyResponseFormatis always set in context, butctx.SetValue()is only called whenrequest.Params != nil(line in gemini.go). WhenParamsis omitted, the unsafe assertionctx.Value(BifrostContextKeyResponseFormat).(string)panics with "interface {} is nil, not string".Use a safe type assertion with a default fallback to
"wav":Fix
- if len(audioData) > 0 { - responseFormat := ctx.Value(BifrostContextKeyResponseFormat).(string) + if len(audioData) > 0 { + responseFormat, _ := ctx.Value(BifrostContextKeyResponseFormat).(string) + if responseFormat == "" { + responseFormat = "wav" + } // Gemini returns PCM audio (s16le, 24000 Hz, mono) // Convert to WAV for standard playable output format if responseFormat == "wav" {framework/modelcatalog/main.go (1)
223-236: AddImageGenerationRequestto pricing lookup.
ImageGenerationRequestis defined in the request types but missing from the pricing lookup loop. The codebase has full pricing support for image generation (seepricing.gowhich explicitly handlesImageGenerationResponseand normalizesImageGenerationStreamRequest), so the method should includeschemas.ImageGenerationRequestin the iteration at line 223 to avoid pricing lookup failures for image generation models.transports/bifrost-http/handlers/inference.go (1)
1261-1336: Use hardcoded"error"for image-generation SSE error events instead ofschemas.ResponsesStreamResponseTypeError.The error event type is currently sourced from a Responses API constant, which creates unnecessary coupling. Since
BifrostImageGenerationStreamResponse.Typeis a plain string field (distinct from the Responses API enum), and there is no dedicated image-generation error type constant, the error event type should be explicitly hardcoded as"error".Fix
```diff
-	} else if chunk.BifrostError != nil {
-		eventType = string(schemas.ResponsesStreamResponseTypeError)
-	}
+	} else if chunk.BifrostError != nil {
+		eventType = "error"
+	}
```
64-88: Unify Imagen request types; current split risks omitting requiredprompt.You now have:
ImagenInstance{ Prompt string \json:"prompt,omitempty"` }` (omitempty + non-pointer)GeminiImagenRequest.Instances *[]struct{ Prompt *string \json:"prompt"` }` (required + pointer + anonymous type)This makes it easy to accidentally serialize an Imagen request without a
prompt(or maintain two converters forever). Suggest: use a singleImagenInstanceeverywhere, and makepromptrequired (noomitempty) for the outbound Imagen request type.Concrete direction
type ImagenInstance struct { - Prompt string `json:"prompt,omitempty"` + Prompt string `json:"prompt"` } type GeminiImagenRequest struct { - Instances *[]struct { - Prompt *string `json:"prompt"` - } `json:"instances"` - Parameters GeminiImagenParameters `json:"parameters"` + Instances []ImagenInstance `json:"instances"` + Parameters GeminiImagenParameters `json:"parameters,omitempty"` }Also applies to: 90-93, 1702-1727
core/providers/huggingface/huggingface.go (1)
220-265: Unused parameterisHFInferenceImageRequestincompleteRequest.The parameter
isHFInferenceImageRequestis added to the function signature but is never used within the function body. This is dead code that adds unnecessary complexity to the API surface.Either remove the unused parameter or implement the intended differentiation for image requests.
🐛 If no special handling is needed, remove the parameter
```diff
-func (provider *HuggingFaceProvider) completeRequest(ctx *schemas.BifrostContext, jsonData []byte, url string, key string, isHFInferenceAudioRequest bool, isHFInferenceImageRequest bool) ([]byte, time.Duration, *schemas.BifrostError) {
+func (provider *HuggingFaceProvider) completeRequest(ctx *schemas.BifrostContext, jsonData []byte, url string, key string, isHFInferenceAudioRequest bool) ([]byte, time.Duration, *schemas.BifrostError) {
```
And update all call sites accordingly.
🤖 Fix all issues with AI agents
In @core/internal/testutil/test_retry_framework.go:
- Around line 2939-3095: The top comment block references
WithImageGenerationStreamValidationRetry but the actual function is named
WithImageGenerationStreamRetry; fix the mismatch by either renaming the function
to WithImageGenerationStreamValidationRetry (and update all call sites/tests) or
by updating the comment header and any documentation to say
WithImageGenerationStreamRetry to match the function; ensure the chosen name
follows the existing "*StreamValidationRetry" helper naming convention and
update any related references (comments, tests, or other helpers) to remain
consistent.
- Around line 161-165: DefaultImageGenerationRetryConfig currently returns a
TestRetryConfig with []TestRetryCondition while WithImageGenerationRetry expects
ImageGenerationRetryConfig with []ImageGenerationRetryCondition, so default
conditions are never applied; update DefaultImageGenerationRetryConfig to return
an ImageGenerationRetryConfig (or provide a conversion) that copies retry
timings from GetTestRetryConfigForScenario() and leaves Conditions nil/empty per
testutil convention, or alternatively change WithImageGenerationRetry to accept
TestRetryConfig—ensure the types for Conditions match (TestRetryCondition vs
ImageGenerationRetryCondition) and update any factory/constructor usages (e.g.,
DefaultImageGenerationRetryConfig, WithImageGenerationRetry,
ImageGenerationRetryConfig, TestRetryConfig, GetTestRetryConfigForScenario) so
default conditions are preserved and consistent across the codebase.
In @core/providers/gemini/gemini.go:
- Around line 1673-1749: The handleImagenImageGeneration function currently
hardcodes baseURL to
"https://generativelanguage.googleapis.com/v1beta/models/%s:predict", which
ignores provider.networkConfig.BaseURL; change baseURL construction to prefer
provider.networkConfig.BaseURL (falling back to the hardcoded default) and
append the path "/v1beta/models/{model}:predict" using request.Model so
custom/proxy/regional endpoints are honored; update any tests or callers that
assume the old constant and ensure request URI still uses
req.SetRequestURI(baseURL) as before.
In @core/providers/gemini/images.go:
- Around line 253-351: In ToImagenImageGenerationRequest, guard against empty
prompts by checking bifrostReq.Input.Prompt before building instances and return
nil when the prompt is empty or all-whitespace; specifically, after the nil
checks for bifrostReq and bifrostReq.Input, retrieve the prompt
(bifrostReq.Input.Prompt) and if strings.TrimSpace(prompt) == "" then return nil
instead of constructing the instances/GeminiImagenRequest, ensuring you import
strings and keep all other mapping logic unchanged.
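A minimal Go sketch of the guard this item describes; the trimmed request and instance types are hypothetical stand-ins for the real Bifrost and Imagen structs, not the PR's definitions.
```go
package sketch

import "strings"

// Illustrative, pared-down shapes for the sketch only.
type imageInput struct{ Prompt string }
type imageRequest struct{ Input *imageInput }
type imagenRequest struct{ Instances []struct{ Prompt string } }

// toImagenRequest returns nil instead of building an Imagen payload
// when the prompt is missing or all-whitespace.
func toImagenRequest(req *imageRequest) *imagenRequest {
	if req == nil || req.Input == nil {
		return nil
	}
	prompt := strings.TrimSpace(req.Input.Prompt)
	if prompt == "" {
		return nil // nothing to send; the caller should surface a validation error
	}
	out := &imagenRequest{}
	out.Instances = append(out.Instances, struct{ Prompt string }{Prompt: prompt})
	return out
}
```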
- Around line 90-121: In convertImagenFormatToSize, normalize the imageSize
input to lower-case and add a "4k" case so values like "1K"/"2K"/"4K" are
handled correctly; update the switch on *imageSize in the
convertImagenFormatToSize function to use strings.ToLower(*imageSize) and
include cases for "1k" => 1024, "2k" => 2048, and "4k" => 4096 so the baseSize
calculation produces correct dimensions for subsequent aspectRatio handling.
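For reference, a small Go sketch of the case-insensitive normalization this item asks for; `convertImagenSize` is an illustrative helper name, not the function in the PR.
```go
package sketch

import "strings"

// convertImagenSize maps Imagen size labels ("1K"/"2K"/"4K", any case)
// to a base pixel dimension; unknown labels fall back to 1024.
func convertImagenSize(imageSize string) int {
	switch strings.ToLower(imageSize) {
	case "1k":
		return 1024
	case "2k":
		return 2048
	case "4k":
		return 4096
	default:
		return 1024
	}
}
```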
In @core/providers/huggingface/images.go:
- Around line 14-154: The Nebius mapping has two fixes: change
NebiusImageGenerationRequest.GuidanceScale from *int to *float64 in
core/providers/nebius/types.go, then in ToHuggingFaceImageGenerationRequest
update the extraction call from schemas.SafeExtractIntPointer(...) to
schemas.SafeExtractFloat64Pointer(...) and assign the returned *float64 to
req.GuidanceScale; also add prompt validation in the nebius case by checking
strings.TrimSpace(bifrostReq.Input.Prompt) == "" and returning an error if empty
(keep references to NebiusImageGenerationRequest,
ToHuggingFaceImageGenerationRequest, GuidanceScale, SafeExtractFloat64Pointer,
and bifrostReq.Input.Prompt).
In @core/providers/huggingface/types.go:
- Around line 352-440: The Fal‑AI and Together image structs are incomplete and
need schema alignment and tests: add the missing optional fields to
HuggingFaceFalAIImageGenerationRequest (negative_prompt, guidance_scale,
num_inference_steps, seed, output_format, sync_mode, enable_safety_checker and
any others mirrored from HuggingFaceFalAIImageStreamRequest), update
HuggingFaceFalAIImageGenerationResponse to match the API envelope (e.g.,
top‑level metadata like requestId/status/createdAt and a data array wrapping
Images/FalAIImage) instead of the flattened struct, and modify
HuggingFaceTogetherImageGenerationRequest to accept explicit width/height (or
both width/height and size) and a steps parameter; then add focused
unmarshalling unit tests that feed representative mocked JSON payloads for each
of HuggingFaceFalAIImageGenerationRequest,
HuggingFaceFalAIImageGenerationResponse, HuggingFaceFalAIImageStreamResponse,
and HuggingFaceTogetherImageGenerationRequest to validate decoding and catch
schema drift.
In @core/providers/nebius/types.go:
- Around line 11-24: The GuidanceScale field on the NebiusImageGenerationRequest
struct is typed as *int but must accept decimals per Nebius API; change the
GuidanceScale field declaration in NebiusImageGenerationRequest from *int to
*float64 (retain the json tag `guidance_scale,omitempty`) and update any
construction, unmarshalling, validation, or usages that assume an integer (e.g.,
code that sets or reads GuidanceScale) to handle *float64 values instead.
In @core/providers/openai/images.go:
- Around line 35-39: The filterXAISpecificParameters function currently only
clears Quality, Style, and Size but must also remove all XAI-unsupported image
params so they aren't sent to the Grok API; update filterXAISpecificParameters
to set Background, Moderation, PartialImages, OutputCompression, OutputFormat,
Seed, NegativePrompt, and NumInferenceSteps on req.ImageGenerationParameters to
nil (in addition to the existing Quality, Style, Size) so only model, prompt, n,
and response_format remain.
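A hedged sketch of the filtering behaviour described above; the `imageParams` struct is an illustrative subset with assumed field types, not the PR's actual `ImageGenerationParameters`.
```go
package sketch

// imageParams is a trimmed stand-in: only the fields cleared below are shown,
// and their concrete types are assumptions for the sketch.
type imageParams struct {
	Quality, Style, Size         *string
	Background, Moderation       *string
	OutputFormat, NegativePrompt *string

	PartialImages, OutputCompression, Seed, NumInferenceSteps *int
}

// filterXAISpecificParameters drops everything the Grok image API does not
// accept, leaving only model, prompt, n, and response_format on the request.
func filterXAISpecificParameters(p *imageParams) {
	if p == nil {
		return
	}
	p.Quality, p.Style, p.Size = nil, nil, nil
	p.Background, p.Moderation = nil, nil
	p.OutputFormat, p.NegativePrompt = nil, nil
	p.PartialImages, p.OutputCompression, p.Seed, p.NumInferenceSteps = nil, nil, nil, nil
}
```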
In @core/providers/openai/openai.go:
- Around line 2493-2874: HandleOpenAIImageGenerationStreaming currently returns
on the first ImageGenerationCompleted event, which closes the stream and drops
images when n>1; modify the loop so that when isCompleted is true you only exit
the goroutine after all images have finished (e.g., after removing the completed
index from incompleteImages check that len(incompleteImages) == 0 before
returning), ensuring you continue processing interleaved partial chunks until
every image is completed; update the return logic in the loop (inside
HandleOpenAIImageGenerationStreaming) to only return when no incomplete images
remain rather than on the first completed event.
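A simplified sketch of the loop-exit condition this item asks for, with a hypothetical `Event` type standing in for the provider's stream payload.
```go
package sketch

// Event is a stand-in for an image-generation stream event.
type Event struct {
	Index     int
	Completed bool
}

// processEvents drains stream events and returns only once every image index
// seen as in-progress has completed; returning on the first completion would
// drop images when n > 1.
func processEvents(events <-chan Event, emit func(Event)) {
	incomplete := map[int]bool{}
	for ev := range events {
		emit(ev)
		if ev.Completed {
			delete(incomplete, ev.Index)
			if len(incomplete) == 0 {
				return
			}
			continue
		}
		incomplete[ev.Index] = true
	}
}
```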
In @core/providers/xai/xai.go:
- Around line 230-244: The doc comment above the ImageGeneration function
contains a grammar issue; update the comment string to read cleanly (e.g.,
"ImageGeneration performs an image generation request to the xAI API.") so it
correctly uses "an image" and matches the function name and purpose; edit the
comment immediately above the ImageGeneration method in xai.go and keep the rest
of the implementation unchanged.
In @docs/openapi/schemas/inference/images.yaml:
- Around line 1-242: The OpenAPI schema enums and token-detail fields are out of
sync with the implementation: add "auto" to the size enum (used by
core/schemas/images.go), add "jpg" as an allowed value alongside "jpeg" for
output_format (providers normalize "jpg"↔"jpeg"), and extend ImageTokenDetails
to include the NImages field (match the implementation's
OutputTokensDetails.NImages naming or use snake_case consistent with other
schemas) so streaming/usage payloads align with
ImageGenerationRequest/ImageGenerationResponse and
ImageGenerationStreamResponse. Ensure the changes touch the size and
output_format definitions in ImageGenerationRequest and ImageGenerationResponse
and update the ImageTokenDetails object to include the NImages token count
field.
In @docs/quickstart/gateway/multimodal.mdx:
- Around line 49-85: In the "Basic Image Generation" section update formatting
by adding a blank line after the closing bash/code block and another blank line
before the "## Audio Understanding" header, and expand the JSON example under
**Response format:** (the Image generation response example) to include the
missing fields: add "output_tokens" in the "usage" object, include optional
ImageGenerationResponseParameters fields ("background", "output_format",
"quality", "size") where applicable, and add an "extra_fields" object at the
top-level response example so the documented response matches the actual
implementation.
In @framework/logstore/tables.go:
- Around line 98-102: Summary: Persisting full base64 image outputs in
ImageGenerationOutput (and similar fields) can be huge and slow queries;
sanitize or separate blobs. Fix: before serializing/storing into the struct
fields ImageGenerationOutput (and other image-related log fields), strip or
redact any b64_json/base64 fields (or replace with metadata/URLs and
revised_prompt) by adding a sanitizer helper (e.g., sanitizeImageGenOutput or
ImageGenerationOutputParsedSanitize) that removes or truncates large base64
blobs; call this helper wherever image outputs are marshaled for DB persistence
(the code paths that populate ImageGenerationOutput, ImageGenerationOutputParsed
and analogous fields around lines noted); alternatively implement storing the
raw blob in a separate blob store with retention controls and save only a
reference/metadata in ImageGenerationOutput. Ensure tests cover that stored JSON
no longer contains b64_json and that existing URL/metadata fields remain.
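One possible shape for the sanitizer suggested here; the helper name comes from the suggestion above, and the JSON layout (an OpenAI-style `data[].b64_json` envelope) is an assumption.
```go
package sketch

import "encoding/json"

// sanitizeImageGenOutput re-marshals an image generation payload with any
// b64_json blobs replaced by a short placeholder, so only URLs, revised
// prompts, and metadata reach the log store.
func sanitizeImageGenOutput(raw []byte) ([]byte, error) {
	var payload map[string]any
	if err := json.Unmarshal(raw, &payload); err != nil {
		return nil, err
	}
	if data, ok := payload["data"].([]any); ok {
		for _, item := range data {
			if m, ok := item.(map[string]any); ok {
				if _, has := m["b64_json"]; has {
					m["b64_json"] = "[redacted base64 image]"
				}
			}
		}
	}
	return json.Marshal(payload)
}
```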
In @framework/streaming/images.go:
- Around line 18-35: The loop accesses chunks[i].Delta.Type and other Delta
fields without ensuring Delta is non-nil, which can panic; update the condition
to first check chunks[i].Delta != nil (e.g., if chunks[i].Delta != nil &&
(chunks[i].Delta.Type == string(openai.ImageGenerationCompleted) ||
chunks[i].FinishReason != nil)) and only then read Delta.ID, Delta.CreatedAt,
Delta.ExtraFields.ModelRequested, Delta.B64JSON, Delta.URL and
Delta.RevisedPrompt when building the BifrostImageGenerationResponse so all
Delta field accesses are guarded; also ensure the loop iterates correctly over
chunks (use for i := range chunks or a classic index loop) so you’re referencing
valid indices.
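A compact sketch of the guarded iteration, with trimmed stand-in types for the chunk and its Delta; the completed-event string mirrors the OpenAI event name used elsewhere in this PR.
```go
package sketch

type delta struct {
	Type, B64JSON, URL string
}

type chunk struct {
	Delta        *delta
	FinishReason *string
}

// collectFinalImages only dereferences Delta after a nil check, so malformed
// or error-only chunks cannot panic the assembler.
func collectFinalImages(chunks []chunk) []string {
	var images []string
	for i := range chunks {
		c := chunks[i]
		if c.Delta == nil {
			continue
		}
		if c.Delta.Type == "image_generation.completed" || c.FinishReason != nil {
			if c.Delta.B64JSON != "" {
				images = append(images, c.Delta.B64JSON)
			} else if c.Delta.URL != "" {
				images = append(images, c.Delta.URL)
			}
		}
	}
	return images
}
```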
In @framework/streaming/types.go:
- Around line 14-20: In convertAccResultToProcessedStreamResponse in
plugins/maxim/main.go the code fails to detect ImageGenerationOutput and so
falls back to StreamTypeChat; add an else-if branch checking
accResult.ImageGenerationOutput != nil and set streamType =
streaming.StreamTypeImage, and include accResult.ImageGenerationOutput in the
returned AccumulatedData (alongside AudioOutput and TranscriptionOutput) so the
caller receives the image output field.
In @plugins/semanticcache/plugin_image_generation_test.go:
- Around line 388-394: The branch that handles the case where duration2 <
duration1 (cache faster) contains an inverted and redundant failure check:
remove the inner if that tests speedup < 0.1 (which would only be true if the
cache were massively slower) and its t.Errorf call; keep only the t.Logf
reporting the speedup. This change affects the test variables duration1,
duration2 and speedup in the streaming cache timing block so locate that block
in plugin_image_generation_test.go and delete the erroneous conditional and
error emission.
In @tests/integrations/python/tests/test_google.py:
- Around line 1740-1748: The test function test_41b_imagen_predict is missing
the @skip_if_no_api_key("google") decorator which other imagen tests (test_41a,
test_41c) use; add the @skip_if_no_api_key("google") decorator directly above
the def test_41b_imagen_predict(...) declaration so the test is skipped when the
Google API key is not configured, keeping behavior consistent with the other
imagen tests.
🧹 Nitpick comments (43)
docs/providers/supported-providers/bedrock.mdx (1)
41-49: Update the Note to include Image Generation or clarify its unsupported status.The new row correctly documents that Image Generation is unsupported for Bedrock. However, the Note at lines 45-49 only mentions Speech and Transcriptions as unsupported "by the upstream AWS Bedrock API."
Unlike Speech/STT, AWS Bedrock does support image generation through models like Amazon Titan Image Generator and Stable Diffusion. Consider updating the Note to either:
- Include Image Generation in the list and adjust the wording (e.g., "not yet implemented" vs. "not supported upstream"), or
- Add a separate clarification that Image Generation support for Bedrock is planned for a future release.
This helps users understand whether the limitation is temporary (Bifrost implementation gap) or permanent (upstream API limitation).
📝 Suggested documentation update
```diff
 <Note>
-**Unsupported Operations** (❌): Speech (TTS) and Transcriptions (STT) are not supported by the upstream AWS Bedrock API. These return `UnsupportedOperationError`.
+**Unsupported Operations** (❌): Speech (TTS) and Transcriptions (STT) are not supported by the upstream AWS Bedrock API. Image Generation is not yet implemented for Bedrock in Bifrost. These return `UnsupportedOperationError`.

 **Limitations**: Images must be in base64 or data URI format (remote URLs not supported). Text completion streaming is not supported.
 </Note>
```
docs/providers/supported-providers/overview.mdx (1)
18-38: Minor table formatting inconsistencies.The image generation columns are correctly documented. Two small spacing issues in the table:
- Line 28 (Hugging Face): `✅| ✅` — missing space before the pipe after the Images column.
- Line 30 (Nebius): `|❌` — missing space after the pipe near the Files column.

These won't break rendering but create inconsistency in the raw markdown.
🔧 Suggested fix
```diff
-| Hugging Face (`huggingface/<model>`) | ✅ | ❌ | ❌ | ✅ | ✅ | ✅ | ✅ | ✅| ✅ | ✅ | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ |
+| Hugging Face (`huggingface/<model>`) | ✅ | ❌ | ❌ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ |
-| Nebius (`nebius/<model>`) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ |❌ | ❌ | ❌ |
+| Nebius (`nebius/<model>`) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
```
ui/app/workspace/providers/fragments/allowedRequestsFields.tsx (1)
76-78: LGTM!The new request type entries follow the existing naming convention and are correctly placed in the array.
Optional: The column split (lines 82-83) now creates a 6 vs 9 imbalance. Consider adjusting to `slice(0, 8)` for a more balanced 8 vs 7 layout:
```diff
-  const leftColumn = REQUEST_TYPES.slice(0, 6);
-  const rightColumn = REQUEST_TYPES.slice(6);
+  const leftColumn = REQUEST_TYPES.slice(0, 8);
+  const rightColumn = REQUEST_TYPES.slice(8);
```
tests/integrations/python/tests/test_google.py (3)
1691-1696: Inconsistent fixture parameter naming.The parameter
_test_configuses an underscore prefix (indicating unused), whereas all other tests in this file usetest_config. If the fixture is intentionally unused, consider removing it from the signature entirely for clarity. If it might be used later, use the consistent nametest_config.Suggested fix
Either remove the unused parameter:
- def test_41a_image_generation_simple(self, _test_config, provider, model): + def test_41a_image_generation_simple(self, provider, model):Or use consistent naming if it may be needed:
- def test_41a_image_generation_simple(self, _test_config, provider, model): + def test_41a_image_generation_simple(self, test_config, provider, model):Apply the same change to
test_41b_imagen_predictandtest_41c_image_generation_with_text.
1702-1703: Remove redundant in-function imports.
get_integration_urlandget_api_keyare already imported at the module level (lines 107 and 102 respectively). These in-function imports are redundant and add unnecessary noise.Suggested fix
Remove the in-function imports from all three test methods:
- from .utils.config_loader import get_integration_url, get_config - from .utils.common import get_api_keyAlso applies to: 1750-1751, 1804-1805
1841-1857: Consider strengthening the multi-modal response assertion.The test requests both
IMAGEandTEXTmodalities but only assertsfound_text or found_image. Since the test is explicitly requesting both modalities, consider asserting that at least an image is returned (the primary expectation), or logging when only text is returned to help debug potential issues.Suggested enhancement
# At least one of text or image should be present assert found_text or found_image, "Response should contain text or image" + # Log what was received for debugging + if not found_image: + print(f"⚠ Multi-modal request returned text only (no image)") if found_image: print("✓ Multi-modal response with image generated successfully") + if found_text and found_image: + print("✓ Multi-modal response with both text and image generated successfully")tests/integrations/python/config.yml (1)
490-717: Consider addingmodel_capabilitiesentries for the new image generation models.For consistency with other specialized models (e.g.,
tts-1,whisper-1, embedding models), consider adding capability entries for the image generation models (gpt-image-1,gemini-2.5-flash-image,imagen-4.0-generate-001). This would document their capabilities (or lack thereof for chat/tools/vision) and any relevant limits.Example structure
# OpenAI Image Generation Models "gpt-image-1": chat: false tools: false vision: false streaming: false speech: false transcription: false embeddings: false image_generation: true max_tokens: null context_window: null # Gemini Image Generation Models "gemini-2.5-flash-image": chat: false tools: false vision: false streaming: false image_generation: true max_tokens: null context_window: null "imagen-4.0-generate-001": chat: false tools: false vision: false streaming: false image_generation: true max_tokens: null context_window: nulltests/integrations/python/tests/test_openai.py (2)
91-93: Unused import:IMAGE_GENERATION_DETAILED_PROMPT
IMAGE_GENERATION_DETAILED_PROMPTis imported but never used in the test file. All image generation tests useIMAGE_GENERATION_SIMPLE_PROMPTinstead.Consider either removing the unused import or adding a test case that exercises the detailed prompt (e.g., a test that validates more complex prompt handling).
Option 1: Remove unused import
- IMAGE_GENERATION_DETAILED_PROMPT, - # Image Generation utilities - IMAGE_GENERATION_SIMPLE_PROMPT, + # Image Generation utilities + IMAGE_GENERATION_SIMPLE_PROMPT,Option 2: Add a test using detailed prompt
@pytest.mark.parametrize( "provider,model,vk_enabled", get_cross_provider_params_with_vk_for_scenario("image_generation") ) def test_52e_image_generation_detailed_prompt(self, test_config, provider, model, vk_enabled): """Test Case 52e: Image generation with detailed prompt""" if provider == "_no_providers_" or model == "_no_model_": pytest.skip("No providers configured for this scenario") client = get_provider_openai_client(provider, vk_enabled=vk_enabled) response = client.images.generate( model=format_provider_model(provider, model), prompt=IMAGE_GENERATION_DETAILED_PROMPT, n=1, size="1024x1024", ) assert_valid_image_generation_response(response, "openai") assert len(response.data) == 1
1240-1256: Add image count assertion for consistency with other tests.Unlike
test_52a,test_52b, andtest_52c, this test doesn't assert the expected image count. For consistency and completeness, consider adding the assertion.Suggested addition
# Validate response structure assert_valid_image_generation_response(response, "openai") + + # Verify we got exactly 1 image + assert len(response.data) == 1, f"Expected 1 image, got {len(response.data)}"ui/lib/types/logs.ts (2)
142-174: Consider tightening the image data shape (URL vs base64) and avoiding redundant types.Right now
BifrostImageGenerationData/ImageMessageDataallow bothurlandb64_jsonsimultaneously and duplicate fields. Consider a discriminated union (or at least a type-level “at least one of url/b64_json”) to prevent impossible states and reduce drift between the two interfaces. Also: base64 fields can be huge—ensure the UI/log views won’t eagerly render them.
309-314: LogEntry image_generation_input may be missing important request context.
image_generation_input?: { prompt: string }likely won’t be enough to reproduce/debug (e.g., size/quality/output_format/background). If those are available in backend logs, consider mirroring them here (or reusing the request type shape).core/internal/testutil/test_retry_conditions.go (1)
907-956: Streaming retry condition may false-positive on non-payload chunks.
`EmptyImageGenerationCondition` assumes `ImageGenerationStreamResponse` always contains `B64JSON` or `URL`. If the stream includes "in_progress / generating / partial_image metadata" events where the payload is legitimately empty, this will trigger retries incorrectly. Suggest gating the check on the stream event/type (e.g., only validate the payload on "partial_image" / "completed/done" events).
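A sketch of the gated check, assuming the OpenAI-style event names used elsewhere in this PR; the chunk type is a pared-down stand-in for the real stream response.
```go
package sketch

type imageStreamChunk struct {
	Type    string
	B64JSON string
	URL     string
}

// emptyImagePayload only flags chunks that are supposed to carry image data;
// progress/metadata events with an empty payload no longer trigger retries.
func emptyImagePayload(c imageStreamChunk) bool {
	switch c.Type {
	case "image_generation.partial_image", "image_generation.completed":
		return c.B64JSON == "" && c.URL == ""
	default:
		return false
	}
}
```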
36-41: Silent skip on base64 decode failure may hide data corruption.When
decodeBase64StringToBytesfails, the code silently skips the part without logging or propagating the error. This could mask malformed input data or encoding issues.Consider logging a warning when decode fails to aid debugging:
Proposed improvement
if part.InlineData != nil && strings.HasPrefix(strings.ToLower(part.InlineData.MIMEType), "audio/") { decodedData, err := decodeBase64StringToBytes(part.InlineData.Data) if err != nil { - // Skip this part if decoding fails + // Log and skip this part if decoding fails + // Consider adding a logger parameter or using a package-level logger continue }docs/providers/supported-providers/openai.mdx (1)
218-273: Image Generation documentation is comprehensive and well-structured.The section covers all key aspects:
- Request parameters with appropriate types and constraints
- Request/response conversion details
- Streaming event types (
image_generation.partial_image,image_generation.completed,error) are consistent with the expected SSE format per repository patterns.One minor stylistic note from static analysis: lines 244-246 have three successive bullet points starting with the same pattern. Consider varying the sentence structure for readability, though this is optional.
core/providers/gemini/utils.go (1)
1454-1469: Consider simplifying the encoding function.The implementation is correct but slightly verbose. A more idiomatic approach:
♻️ Optional simplification
func encodeBytesToBase64String(bytes []byte) string { - var base64str string - - if len(bytes) > 0 { - // Use standard base64 encoding to match external SDK expectations - base64str = base64.StdEncoding.EncodeToString(bytes) + if len(bytes) == 0 { + return "" } - - return base64str + return base64.StdEncoding.EncodeToString(bytes) }framework/logstore/migrations.go (1)
684-748: Consider combining into a single migration.Both migrations follow the correct pattern with proper
HasColumnchecks and rollback safety. However, since these columns are part of the same feature (image generation), you could consider combining them into a single migration (e.g.,logs_add_image_generation_columns) similar to howmigrationAddCostAndCacheDebugColumncombines related columns. This reduces migration table entries and keeps related schema changes atomic.That said, the current approach is functionally correct and keeps rollbacks granular.
♻️ Optional: Combined migration
-func migrationAddImageGenerationOutputColumn(ctx context.Context, db *gorm.DB) error { +func migrationAddImageGenerationColumns(ctx context.Context, db *gorm.DB) error { opts := *migrator.DefaultOptions opts.UseTransaction = true m := migrator.New(db, &opts, []*migrator.Migration{{ - ID: "logs_add_image_generation_output_column", + ID: "logs_add_image_generation_columns", Migrate: func(tx *gorm.DB) error { tx = tx.WithContext(ctx) migrator := tx.Migrator() if !migrator.HasColumn(&Log{}, "image_generation_output") { if err := migrator.AddColumn(&Log{}, "image_generation_output"); err != nil { return err } } + if !migrator.HasColumn(&Log{}, "image_generation_input") { + if err := migrator.AddColumn(&Log{}, "image_generation_input"); err != nil { + return err + } + } return nil }, Rollback: func(tx *gorm.DB) error { tx = tx.WithContext(ctx) migrator := tx.Migrator() if migrator.HasColumn(&Log{}, "image_generation_output") { if err := migrator.DropColumn(&Log{}, "image_generation_output"); err != nil { return err } } + if migrator.HasColumn(&Log{}, "image_generation_input") { + if err := migrator.DropColumn(&Log{}, "image_generation_input"); err != nil { + return err + } + } return nil }, }}) err := m.Migrate() if err != nil { - return fmt.Errorf("error while adding image generation output column: %s", err.Error()) + return fmt.Errorf("error while adding image generation columns: %s", err.Error()) } return nil }core/internal/testutil/validation_presets.go (1)
222-237: Remove leftover instruction comment.Line 222 contains a leftover instruction comment that should be removed.
🧹 Proposed fix
-// In validation_presets.go - add this function +// ImageGenerationExpectations returns validation expectations for image generation scenarios func ImageGenerationExpectations(minImages int, expectedSize string) ResponseExpectations {core/schemas/provider.go (1)
352-357: Minor: Fix grammar in doc comments.The interface method signatures are correct and consistent with other methods.
📝 Proposed fix
- // ImageGeneration performs a image generation request + // ImageGeneration performs an image generation request ImageGeneration(ctx *BifrostContext, key Key, request *BifrostImageGenerationRequest) ( *BifrostImageGenerationResponse, *BifrostError) - // ImageGenerationStream performs a image generation stream request + // ImageGenerationStream performs an image generation stream request ImageGenerationStream(ctx *BifrostContext, postHookRunner PostHookRunner, key Key, request *BifrostImageGenerationRequest) (chan *BifrostStream, *BifrostError)docs/providers/supported-providers/azure.mdx (1)
364-429: LGTM! Comprehensive documentation for Azure Image Generation.The section covers all essential aspects:
- Parameter mapping with deployment ID handling
- Gateway and Go SDK examples
- Response conversion and streaming behavior
Minor formatting note: Line 429 is missing a newline before the
---separator.📝 Proposed fix for formatting
Image generation streaming is supported and uses OpenAI's streaming format with Server-Sent Events (SSE). + ---core/providers/utils/utils.go (1)
959-966: Consider addingImageGenerationStreamResponsetoProcessAndSendBifrostError.The
ProcessAndSendResponsefunction (line 914) was updated to propagateImageGenerationStreamResponse, butProcessAndSendBifrostError(lines 959-966) was not updated with the same field. This could lead to inconsistent behavior if an image generation stream encounters an error that needs to include partial response data.💡 Suggested addition for consistency
streamResponse := &schemas.BifrostStream{} if processedResponse != nil { streamResponse.BifrostTextCompletionResponse = processedResponse.TextCompletionResponse streamResponse.BifrostChatResponse = processedResponse.ChatResponse streamResponse.BifrostResponsesStreamResponse = processedResponse.ResponsesStreamResponse streamResponse.BifrostSpeechStreamResponse = processedResponse.SpeechStreamResponse streamResponse.BifrostTranscriptionStreamResponse = processedResponse.TranscriptionStreamResponse + streamResponse.BifrostImageGenerationStreamResponse = processedResponse.ImageGenerationStreamResponse }core/internal/testutil/account.go (1)
360-368: Missing: Nebius provider test config lacks image generation settings.The Nebius provider block doesn't include
ImageGenerationModelor enableImageGenerationscenario, butcore/providers/nebius/nebius.goimplements theImageGenerationmethod. Consider adding test coverage.Suggested addition for Nebius provider config
You may need to add a Nebius entry to
AllProviderConfigswith image generation settings, or update the existing Nebius key configuration with anImageGenerationModeland enable theImageGenerationscenario flag.core/providers/azure/azure.go (3)
1212-1252: UsegetModelDeploymenthelper for consistency.Other methods in this file (e.g.,
TextCompletion,ChatCompletion,Embedding,Speech) useprovider.getModelDeployment(key, request.Model)which provides a more descriptive error message. The direct map access here works but is inconsistent with the established pattern.♻️ Suggested refactor for consistency
func (provider *AzureProvider) ImageGeneration(ctx *schemas.BifrostContext, key schemas.Key, request *schemas.BifrostImageGenerationRequest) (*schemas.BifrostImageGenerationResponse, *schemas.BifrostError) { // Validate api key configs if err := provider.validateKeyConfig(key); err != nil { return nil, err } - deployment := key.AzureKeyConfig.Deployments[request.Model] - if deployment == "" { - return nil, providerUtils.NewConfigurationError(fmt.Sprintf("deployment not found for model %s", request.Model), provider.GetProviderKey()) + deployment, err := provider.getModelDeployment(key, request.Model) + if err != nil { + return nil, err }
1269-1269: Remove stray empty comment.Line 1269 has an empty comment (
//) that appears to be leftover and should be removed.🧹 Remove empty comment
- // deployment := key.AzureKeyConfig.Deployments[request.Model]
1257-1304: UsegetModelDeploymenthelper for consistency.Same issue as in
ImageGeneration- use the helper method for consistency with other provider methods.Also, the method signature has an extra blank line at line 1263 which could be cleaned up.
♻️ Suggested refactor
func (provider *AzureProvider) ImageGenerationStream( ctx *schemas.BifrostContext, postHookRunner schemas.PostHookRunner, key schemas.Key, request *schemas.BifrostImageGenerationRequest, ) (chan *schemas.BifrostStream, *schemas.BifrostError) { - // Validate api key configs if err := provider.validateKeyConfig(key); err != nil { return nil, err } - // - deployment := key.AzureKeyConfig.Deployments[request.Model] - if deployment == "" { - return nil, providerUtils.NewConfigurationError(fmt.Sprintf("deployment not found for model %s", request.Model), provider.GetProviderKey()) + deployment, err := provider.getModelDeployment(key, request.Model) + if err != nil { + return nil, err }core/internal/testutil/response_validation.go (1)
236-238: Minor formatting inconsistency.The extra blank lines and brace placement at lines 236-238 differ slightly from the other validator functions (e.g.,
ValidateSpeechResponseends at line 235 without extra blank lines before the closing brace).🧹 Clean up formatting
logValidationResults(t, result, scenarioName) - - return result - + return result }core/providers/nebius/types.go (1)
26-29: Consider Go naming convention forUrlfield.Go convention for acronyms is to use all caps (
URLinstead ofUrl). This is a minor style issue.🧹 Naming convention fix
type NebiusLora struct { - Url string `json:"url"` + URL string `json:"url"` Scale int `json:"scale"` }framework/modelcatalog/pricing.go (1)
282-319: Per-image pricing tier check uses zero value for tier determination.When tokens are all zero (the condition at line 282),
imageTotalTokenswill be 0, making theisAbove128kcheck at line 292 always false. This is likely intentional (per-image pricing doesn't use token tiers), but the variable and check are misleading. Consider removing or clarifying.♻️ Suggested simplification
// Use per-image pricing when tokens are nil/zero // Extract number of images from ImageTokenDetails if available numImages := 1 if imageUsage.OutputTokensDetails != nil && imageUsage.OutputTokensDetails.NImages > 0 { numImages = imageUsage.OutputTokensDetails.NImages } else if imageUsage.InputTokensDetails != nil && imageUsage.InputTokensDetails.NImages > 0 { numImages = imageUsage.InputTokensDetails.NImages } - isAbove128k := imageTotalTokens > TokenTierAbove128K - var inputPerImageRate, outputPerImageRate *float64 - if isAbove128k { - inputPerImageRate = pricing.InputCostPerImageAbove128kTokens - // Note: OutputCostPerImageAbove128kTokens may not exist in TableModelPricing - // For now, use regular OutputCostPerImage even above 128k - } else { - inputPerImageRate = pricing.InputCostPerImage - } + // Per-image pricing doesn't use token tiers (tokens are zero in this path) + inputPerImageRate = pricing.InputCostPerImage // Use OutputCostPerImage if available outputPerImageRate = pricing.OutputCostPerImagedocs/providers/supported-providers/gemini.mdx (1)
560-576: Consider varying sentence structure for readability.The "Request Conversion" subsections have repeated sentence patterns starting with bold terms. While functional, varying the structure slightly would improve readability.
plugins/semanticcache/utils.go (1)
382-382: Mixed JSON marshaling libraries.Line 382 uses
sonic.Marshalwhile other places in this file usejson.Marshal(e.g., lines 119, 348, 746, 835). Consider using sonic consistently throughout for performance, or document the rationale for mixing.♻️ Option: Standardize on sonic for consistency
If performance is the goal, consider replacing
json.Marshalcalls withsonic.Marshalthroughout the file for consistency. Alternatively, ifjson.Marshalis preferred for stability, revert line 382 to usejson.Marshal.plugins/semanticcache/plugin_image_generation_test.go (1)
38-41: Unreachable code aftert.Skipf.
t.Skipfmarks the test as skipped and continues execution, so thereturnstatement is reachable. However, the patternt.Skipf(...); returnis redundant ast.Skipfalready stops test execution. This pattern appears multiple times in the file.♻️ Simplified pattern
if err1 != nil { - t.Skipf("First image generation request failed (may be rate limited): %v", err1) - return + t.Skipf("First image generation request failed (may be rate limited): %v", err1) }Note:
t.Skipfcallsruntime.Goexit()internally, so thereturnis technically unreachable. Removing it makes the code cleaner.framework/streaming/images.go (1)
270-274: Remove empty/dead code block.This comment block with no actionable code appears to be leftover from development.
♻️ Proposed cleanup
- // Extract usage if available - if result.ImageGenerationStreamResponse.Usage != nil { - // Note: ImageUsage doesn't directly map to BifrostLLMUsage, but we can store it - // The actual usage will be extracted in processAccumulatedImageStreamingChunks - }core/providers/openai/types.go (1)
549-562: Consider makingPrompta*stringfor parity with other request types (if you need “omitted vs empty”).
CurrentlyPromptis a requiredstring. If upstream code ever needs to distinguish “missing” from “empty”, this won’t be possible. If not needed, current approach is fine.core/internal/testutil/image_generation.go (2)
57-130: Avoid image decode flakes: request an explicit output format (or skip dimension checks when format isn’t supported).
Right now the test only registersimage/jpegandimage/png, but providers may returnwebp(or other) inb64_json. Consider forcingOutputFormat(and/orResponseFormat) inParams, or making the dimension assertion conditional on successfulimage.DecodeConfig.
228-238: Remove provider-specific constant from shared testutil; use canonical string instead.The testutil imports
core/providers/openaiand usesopenai.ImageGenerationCompletedat line 232, but this module is used by multiple providers (Azure, Gemini, Vertex, Nebius, XAI). Other providers don't define their ownImageGenerationCompletedconstant, creating an avoidable coupling. Since all providers normalize to the canonical"image_generation.completed"string, replace the provider-specific constant:Proposed diff
- if imgResp.Type == string(openai.ImageGenerationCompleted) { + if imgResp.Type == "image_generation.completed" {This also allows removing the
core/providers/openaiimport from testutil.core/providers/huggingface/types.go (1)
151-163: MakeHuggingFaceErrorDetail.Locoptional for resilience.
FastAPI error shapes can vary; iflocis absent/null, strictjson:"loc"still unmarshals (nil slice), but marking itomitempty(and possiblyMsg/Typetoo) makes this struct more tolerant across providers and versions.framework/streaming/accumulator.go (2)
293-325: Avoid per-chunk string allocations in image de-dup key (hot path).
imageChunkKey()usesfmt.Sprintf, which will allocate for every chunk; this can get expensive for long streams / many images. Prefer a non-alloc key (e.g.,uint64packing or a small struct key).Proposed refactor (uint64 key)
- ImageChunksSeen make(map[string]struct{}), + ImageChunksSeen make(map[uint64]struct{}), - func imageChunkKey(imageIndex, chunkIndex int) string { - return fmt.Sprintf("%d:%d", imageIndex, chunkIndex) - } + func imageChunkKey(imageIndex, chunkIndex int) uint64 { + return (uint64(uint32(imageIndex)) << 32) | uint64(uint32(chunkIndex)) + } - chunkKey := imageChunkKey(chunk.ImageIndex, chunk.ChunkIndex) - if _, seen := acc.ImageChunksSeen[chunkKey]; !seen { + chunkKey := imageChunkKey(chunk.ImageIndex, chunk.ChunkIndex) + if _, seen := acc.ImageChunksSeen[chunkKey]; !seen { acc.ImageChunksSeen[chunkKey] = struct{}{} acc.ImageStreamChunks = append(acc.ImageStreamChunks, chunk) }
114-132: Nice: pooling reset wrappers reduce reuse bugs; consider consistency for index fields across chunk types.
putImageStreamChunk()resetsChunkIndex/ImageIndex, but the otherput*StreamChunk()helpers don’t reset their index fields (if any). Not required, but aligning the reset policy across chunk types reduces risk if future code reads stale indices after reuse.Also applies to: 469-486
core/providers/gemini/images.go (2)
183-251:labelsextraction is likely too strict (map[string]string); consider accepting map[string]any.
labels.(map[string]string)will usually fail ifExtraParamscame from JSON (commonlymap[string]any). Consider converting entry-by-entry tomap[string]string(or supporting both).
445-473: ImagenRaiFilteredReasonis dropped; consider surfacing as an error or metadata.If predictions are filtered, returning base64 data may be empty and you’ll return “success” with unusable output. Consider mapping
RaiFilteredReasoninto a Bifrost error (or at leastExtraParams/ debug metadata) so clients can distinguish “model refused” vs “transport failed”.core/internal/testutil/test_retry_framework.go (1)
1249-1252: Good: scenario mapping is wired; ensure stream scenario uses the correct retry wrapper.You map
"ImageGenerationStream"toDefaultImageGenerationRetryConfig()(generic). Please confirm the stream test path usesWithImageGenerationStreamRetry()(generic) rather than the typed non-stream helper, so the intended config type actually matches.core/providers/huggingface/huggingface.go (3)
133-163: Consider extracting duplicate body modification logic.The JSON body modification code (parsing, setting model field, re-encoding) is duplicated between the initial request path (lines 155-162) and the retry path (lines 187-194). While functional, this could be extracted to a small helper function to reduce duplication.
♻️ Optional refactor to reduce duplication
// Helper function to update model in JSON body func updateModelInJSONBody(jsonData []byte, modelName string) []byte { var reqBody map[string]interface{} if err := sonic.Unmarshal(jsonData, &reqBody); err == nil { reqBody["model"] = modelName if newJSON, err := sonic.Marshal(reqBody); err == nil { return newJSON } } return jsonData }Then use in both locations:
- var reqBody map[string]interface{} - if err := sonic.Unmarshal(jsonData, &reqBody); err == nil { - // For other providers (embeddings, hf-inference images), use validated model ID - reqBody["model"] = modelName - if newJSON, err := sonic.Marshal(reqBody); err == nil { - updatedJSONData = newJSON - } - } + updatedJSONData = updateModelInJSONBody(jsonData, modelName)
1127-1130: Minor: Variable initialization could benefit from explicit nil handling.The variables
lastB64Data,lastURLData,lastJsonDataare initialized as empty strings. When used in the final chunk (line 1252), an emptylastJsonDatawould setRawResponseto an empty string rather than omitting it. Consider checking for non-empty before setting.♻️ Optional: Only set RawResponse if data exists
if sendBackRawResponse { - finalChunk.ExtraFields.RawResponse = lastJsonData + if lastJsonData != "" { + finalChunk.ExtraFields.RawResponse = lastJsonData + } }
1231-1274: Minor: Stream end indicator potentially set twice.When
chunkIndex > 0, the stream end indicator is set at line 1254 (for the completion chunk). Ifscanner.Err()also returns an error, it's set again at line 1272. This is harmless but redundant. Consider restructuring to only set once.
d283d40 to 675a340
Actionable comments posted: 17
Note
Due to the large number of review comments, Critical, Major severity comments were prioritized as inline comments.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (6)
docs/providers/supported-providers/anthropic.mdx (1)
26-33: Update the Note to include Image Generation. The Image Generation row was added to the Supported Operations table showing it's unsupported (❌), but the Note on line 32 still only lists "Embeddings, Speech, and Transcriptions" as unsupported operations. For consistency, Image Generation should be added to the Note.
Suggested fix
```diff
 <Note>
-**Unsupported Operations** (❌): Embeddings, Speech, and Transcriptions are not supported by the upstream Anthropic API. These return `UnsupportedOperationError`.
+**Unsupported Operations** (❌): Embeddings, Image Generation, Speech, and Transcriptions are not supported by the upstream Anthropic API. These return `UnsupportedOperationError`.
 </Note>
```
transports/bifrost-http/integrations/genai.go (1)
36-65: Fix mutual exclusivity ofIsEmbeddingvsIsImageGeneration(current order misroutes).
Right now a:predictrequest can end up with both flags true, andRequestConverterwill pickEmbeddingRequestfirst (Line 42-46), even if the client asked for IMAGE modality (Line 426-429).Proposed fix (ensure flags don’t conflict, and only force embedding when it’s clearly embedding)
func extractAndSetModelFromURL(ctx *fasthttp.RequestCtx, bifrostCtx *schemas.BifrostContext, req interface{}) error { @@ - // Determine if :predict is for image generation (Imagen) or embedding - // Imagen models use :predict for image generation - isImagenPredict := isPredict && schemas.IsImagenModel(modelStr) - if isPredict && !isImagenPredict { - // :predict for non-Imagen models is embedding - isEmbedding = true - } + // Determine if :predict is for image generation (Imagen) or embedding + // NOTE: Do not force embedding if the request clearly asks for IMAGE output. + isImagenPredict := isPredict && schemas.IsImagenModel(modelStr) @@ case *gemini.GeminiGenerationRequest: r.Model = modelStr r.Stream = isStreaming - r.IsEmbedding = isEmbedding r.IsCountTokens = isCountTokens @@ - // Detect if this is an image generation request - // isImagenPredict takes precedence for :predict endpoints - r.IsImageGeneration = isImagenPredict || isImageGenerationRequest(r) + // Detect image generation intent first + r.IsImageGeneration = isImagenPredict || isImageGenerationRequest(r) + + // :predict for non-Imagen defaults to embedding, but only if it's not image generation + r.IsEmbedding = isEmbedding || (isPredict && !isImagenPredict && !r.IsImageGeneration) return nil }Also applies to: 370-434
plugins/logging/operations.go (1)
78-191: Fixraw_responsegating precedence (can write"null"whenRawResponseis nil).
Current condition effectively becomesdisableContentLogging==nil || (... && RawResponse!=nil). Likely intended: “content logging enabled AND raw response present”.Proposed fix
diff --git a/plugins/logging/operations.go b/plugins/logging/operations.go @@ - if p.disableContentLogging == nil || !*p.disableContentLogging && data.RawResponse != nil { + if (p.disableContentLogging == nil || !*p.disableContentLogging) && data.RawResponse != nil { rawResponseBytes, err := sonic.Marshal(data.RawResponse) if err != nil { p.logger.Error("failed to marshal raw response: %v", err)framework/modelcatalog/main.go (1)
218-236: Consider includingschemas.ImageGenerationRequestinGetPricingEntryForModellookup.
With image-generation pricing added, this function may fail to return pricing for image-only models (or models whose pricing is stored under the image-generation mode).Proposed fix
```diff
 for _, mode := range []schemas.RequestType{
 	schemas.TextCompletionRequest,
 	schemas.ChatCompletionRequest,
 	schemas.ResponsesRequest,
 	schemas.EmbeddingRequest,
 	schemas.SpeechRequest,
 	schemas.TranscriptionRequest,
+	schemas.ImageGenerationRequest,
 } {
```
1597-1661: Add defensive prefix check to image and audio handlers matching the file handler pattern.The file handler (line 1652) defensively checks
if !strings.HasPrefix(fileDataURL, "data:")before wrapping as data URL, but image (line 1616) and audio (line 1627) handlers do not. WhileBlob.Datais documented as "Base64-encoded bytes," apply the same defensive check consistently:For images (line 1616):
```go
if !strings.HasPrefix(blob.Data, "data:") {
	imageURL = fmt.Sprintf("data:%s;base64,%s", mimeType, blob.Data)
} else {
	imageURL = blob.Data
}
```
transports/bifrost-http/handlers/inference.go (1)
1261-1333: Fix SSE termination + avoid logging full chunks (can emit wrong[DONE]/ leak huge base64).
includeEventType/skipDoneMarkerare inferred from chunks; if the upstream closes without emitting any chunks, the handler will still senddata: [DONE], which is incorrect for Responses/Image streams.- Logging
chunk: %von marshal errors can dump large image payloads / sensitive fields and cause log spikes.Proposed fix (make protocol explicit + truncate logs)
-func (h *CompletionHandler) handleStreamingResponse(ctx *fasthttp.RequestCtx, getStream func() (chan *schemas.BifrostStream, *schemas.BifrostError), cancel context.CancelFunc) { +type sseMode int + +const ( + sseModeLegacyDone sseMode = iota // data: ... + data: [DONE] + sseModeTypedEvents // event: ... + data: ... (no [DONE]) +) + +func (h *CompletionHandler) handleStreamingResponse( + ctx *fasthttp.RequestCtx, + getStream func() (chan *schemas.BifrostStream, *schemas.BifrostError), + cancel context.CancelFunc, + mode sseMode, +) { // Set SSE headers ctx.SetContentType("text/event-stream") ctx.Response.Header.Set("Cache-Control", "no-cache") ctx.Response.Header.Set("Connection", "keep-alive") @@ - var includeEventType bool + includeEventType := mode == sseModeTypedEvents @@ - var skipDoneMarker bool - // Process streaming responses for chunk := range stream { @@ - includeEventType = false - if chunk.BifrostResponsesStreamResponse != nil || - chunk.BifrostImageGenerationStreamResponse != nil || - (chunk.BifrostError != nil && (chunk.BifrostError.ExtraFields.RequestType == schemas.ResponsesStreamRequest || chunk.BifrostError.ExtraFields.RequestType == schemas.ImageGenerationStreamRequest)) { - includeEventType = true - } - - // Image generation streams don't use [DONE] marker - if chunk.BifrostImageGenerationStreamResponse != nil { - skipDoneMarker = true - } + // In typed-events mode, always use event lines (even for errors). + // In legacy mode, never use event lines. @@ chunkJSON, err := sonic.Marshal(chunk) if err != nil { - logger.Warn(fmt.Sprintf("Failed to marshal streaming response: %v, chunk: %v", err, chunk)) + // Avoid logging full chunks (may contain large base64 / sensitive content) + logger.Warn(fmt.Sprintf("Failed to marshal streaming response: %v (chunk types: resp=%t chat=%t img=%t err=%t)", + err, + chunk.BifrostResponsesStreamResponse != nil, + chunk.BifrostChatResponse != nil, + chunk.BifrostImageGenerationStreamResponse != nil, + chunk.BifrostError != nil, + )) continue } @@ } - - if !includeEventType && !skipDoneMarker { + if mode == sseModeLegacyDone { // Send the [DONE] marker to indicate the end of the stream (only for non-responses/image-gen APIs) if _, err := fmt.Fprint(w, "data: [DONE]\n\n"); err != nil { logger.Warn(fmt.Sprintf("Failed to write SSE [DONE] marker: %v", err)) cancel() // Client disconnected (write error), cancel upstream stream return } }And update callers:
h.handleStreamingResponse(ctx, getStream, cancel) +// text/chat/speech/transcription: legacy [DONE] +h.handleStreamingResponse(ctx, getStream, cancel, sseModeLegacyDone) // responses/image-gen: typed events, no [DONE] -h.handleStreamingResponse(ctx, getStream, cancel) +h.handleStreamingResponse(ctx, getStream, cancel, sseModeTypedEvents)Based on learnings, OpenAI Responses-style streaming should not rely on a
[DONE]sentinel.
🤖 Fix all issues with AI agents
In @core/bifrost.go:
- Around line 2411-2416: The helpers are missing ImageGenerationRequest and
ImageGenerationStreamRequest cases; update isModelRequired to treat both
ImageGenerationRequest and ImageGenerationStreamRequest as model-required types,
add cases in the fallback parsing switch that copy Provider and Model into
fallbackReq.ImageGenerationRequest / ImageGenerationStreamRequest (mirroring the
existing text/audio/chat handling), and extend plugins/mocker.getRequestInput to
handle ImageGenerationRequest and ImageGenerationStreamRequest by returning the
appropriate prompt/input string for image generation instead of the empty
default. Ensure you reference the types ImageGenerationRequest and
ImageGenerationStreamRequest and the symbols isModelRequired,
fallbackReq.ImageGenerationRequest/fallbackReq.ImageGenerationStreamRequest,
handleProviderStreamRequest behavior, and getRequestInput when making these
changes.
In @core/providers/gemini/gemini.go:
- Around line 1614-1671: In ImageGeneration, first guard against a nil request
and validate required fields before calling schemas.IsImagenModel: ensure
request != nil, that request.Model is non-empty, and that request.Prompt (or
request.Prompts/whatever the image payload requires) is present; only then call
schemas.IsImagenModel(request.Model) or delegate to handleImagenImageGeneration.
If validation fails, return a proper *schemas.BifrostError (matching the
existing validation/error pattern used elsewhere for image generation) with
ExtraFields set (Provider, ModelRequested, RequestType) so downstream code
cannot panic or send bad requests.
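A minimal validation sketch matching this item; the request shape and error wording are illustrative, and the real code should return a *schemas.BifrostError with ExtraFields populated as described.
```go
package sketch

import (
	"errors"
	"strings"
)

// imageGenRequest is an illustrative subset of the Bifrost image request.
type imageGenRequest struct {
	Model string
	Input *struct{ Prompt string }
}

// validateImageGenRequest rejects nil requests and missing model/prompt before
// any provider routing (for example, before checking IsImagenModel on Model).
func validateImageGenRequest(req *imageGenRequest) error {
	if req == nil {
		return errors.New("image generation request is nil")
	}
	if strings.TrimSpace(req.Model) == "" {
		return errors.New("image generation request is missing a model")
	}
	if req.Input == nil || strings.TrimSpace(req.Input.Prompt) == "" {
		return errors.New("image generation request is missing a prompt")
	}
	return nil
}
```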
In @core/providers/gemini/images.go:
- Around line 188-256: In ToGeminiImageGenerationRequest, do not propagate
bifrostReq.Params.N into geminiReq.GenerationConfig.CandidateCount because
Gemini image generation always returns a single image; remove or replace the
assignment that sets GenerationConfig.CandidateCount from Params.N and instead
set CandidateCount = 1 (or leave it unset/zero) for image modality. Locate the
conditional that checks bifrostReq.Params.N and change it so it either does
nothing or explicitly clamps/assigns int32(1) to GenerationConfig.CandidateCount
to avoid using Params.N.
In @core/providers/huggingface/images.go:
- Around line 14-154: In ToHuggingFaceImageGenerationRequest validate that
bifrostReq.Input.Prompt is non-nil and non-empty (trim whitespace and return an error if
empty) before constructing any provider-specific request (apply this check at
the top of the function before the switch), and in the nebius branch normalize
ResponseExtension by using strings.ToLower(*req.ResponseExtension) when
comparing to "jpeg" (i.e., replace the case-sensitive comparison with a
lowercase comparison) so output_format is handled case-insensitively.
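A rough sketch of both changes, assuming the converter returns a plain error and that ResponseExtension is a *string on the nebius request type; adjust to the real signatures in huggingface/images.go.

```go
// Top of ToHuggingFaceImageGenerationRequest: reject empty prompts early.
prompt := strings.TrimSpace(bifrostReq.Input.Prompt)
if prompt == "" {
	return nil, fmt.Errorf("image generation prompt cannot be empty")
}

// nebius branch: treat output_format case-insensitively.
if req.ResponseExtension != nil && strings.ToLower(*req.ResponseExtension) == "jpeg" {
	// ... existing jpeg output_format handling ...
}
```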
In @core/providers/nebius/images.go:
- Around line 62-68: The extracted guidance_scale from
bifrostReq.Params.ExtraParams is currently assigned directly to
req.GuidanceScale without bounds checking; after calling
schemas.SafeExtractIntPointer to get v, validate and enforce it is within 0–100
(either clamp values below 0 to 0 and above 100 to 100 or return a validation
error), then assign the validated value to req.GuidanceScale; update the
handling around bifrostReq.Params.ExtraParams and schemas.SafeExtractIntPointer
so req.GuidanceScale always receives a guaranteed 0–100 integer.
- Around line 74-95: The function ToBifrostImageResponse currently hard-codes
ExtraFields.Provider to schemas.Nebius; change it to respect custom provider
aliases by either turning ToBifrostImageResponse into a method on NebiusProvider
(e.g., func (p *NebiusProvider) ToBifrostImageResponse(...)) or by adding a
providerName parameter (e.g., providerName schemas.ModelProvider) and using that
value for ExtraFields.Provider; ensure you obtain the remapped name from
customProviderConfig lookup when called and populate
schemas.BifrostImageGenerationResponse.ExtraFields.Provider with the
passed/derived provider name instead of schemas.Nebius.
- Around line 12-20: In ToNebiusImageGenerationRequest validate that
bifrostReq.Input.Prompt is not nil/empty or only whitespace (trim and error if
empty), and avoid taking pointers into caller-owned structs by copying values
into the NebiusImageGenerationRequest (e.g., allocate local variables for model
and prompt and assign their addresses or change NebiusImageGenerationRequest to
hold values), update the construction of NebiusImageGenerationRequest to use
those copies, and return an error if prompt validation fails; reference:
function ToNebiusImageGenerationRequest, types NebiusImageGenerationRequest,
bifrostReq, Model, Prompt.
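Pulling the notes in this group together, a sketch of what the converter could do: validate the prompt, copy values instead of aliasing the caller's struct, and clamp guidance_scale into 0–100. The pointer-vs-value field types and the SafeExtractIntPointer signature are assumptions.

```go
prompt := strings.TrimSpace(bifrostReq.Input.Prompt)
if prompt == "" {
	return nil, fmt.Errorf("image generation prompt cannot be empty")
}
model := bifrostReq.Model // local copies: the request no longer points into caller-owned memory
req := &NebiusImageGenerationRequest{
	Model:  &model,
	Prompt: &prompt,
}
if bifrostReq.Params != nil && bifrostReq.Params.ExtraParams != nil {
	if v := schemas.SafeExtractIntPointer(bifrostReq.Params.ExtraParams["guidance_scale"]); v != nil {
		scale := *v
		if scale < 0 {
			scale = 0
		} else if scale > 100 {
			scale = 100
		}
		req.GuidanceScale = &scale // always a 0–100 value
	}
}
```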
In @core/providers/openai/openai.go:
- Around line 2756-2786: The code incorrectly defaults imageIndex to 0 and
ignores explicit indices; change logic so both partial and completed chunks use
response.PartialImageIndex when it is non-nil, and only fall back to tracked
state when it is nil: in the else branch for partial chunks, if
response.PartialImageIndex != nil use it and mark
incompleteImages[imageIndex]=true; if nil allocate a new unique index (e.g.,
nextUnused or maxKey+1) instead of leaving 0; in the isCompleted branch, first
check if response.PartialImageIndex != nil and use that index (and delete from
incompleteImages), otherwise pick the oldest incomplete as the fallback and log
as before (using logger.Warn when falling back).
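A sketch of the index selection, using the names from the note (response.PartialImageIndex, incompleteImages as a map[int]bool, isCompleted) rather than the actual variables in openai.go.

```go
var imageIndex int
switch {
case response.PartialImageIndex != nil:
	// An explicit index always wins, for partial and completed chunks alike.
	imageIndex = *response.PartialImageIndex
	if isCompleted {
		delete(incompleteImages, imageIndex)
	} else {
		incompleteImages[imageIndex] = true
	}
case isCompleted:
	// No index on a completion chunk: fall back to the oldest incomplete image.
	oldest := -1
	for idx := range incompleteImages {
		if oldest == -1 || idx < oldest {
			oldest = idx
		}
	}
	if oldest >= 0 {
		imageIndex = oldest
		delete(incompleteImages, imageIndex)
	}
	logger.Warn("image completion chunk has no partial_image_index; using tracked state as a fallback")
default:
	// No index on a partial chunk: allocate a fresh slot instead of reusing 0.
	for incompleteImages[imageIndex] {
		imageIndex++
	}
	incompleteImages[imageIndex] = true
}
```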
In @core/providers/vertex/vertex.go:
- Around line 1420-1644: Add a nil-request guard at the top of
VertexProvider.ImageGeneration to return a configuration/operation error if
request is nil before accessing request.Model, and ensure requestBody is
nil-safe after the conversion step by checking if requestBody != nil before
calling delete(requestBody, "region"); if requestBody can be nil create/return a
suitable error or initialize an empty map. Update references in ImageGeneration
to use these checks so neither request.Model dereference nor delete(requestBody,
"region") can panic.
In @framework/modelcatalog/pricing.go:
- Around line 276-383: The image-token counts are being priced with the generic
token rates; update the token-rate selection to prefer image-token rate fields
when present: create separate inputImageTokenRate and outputImageTokenRate
(checking Pricing fields InputCostPerImageToken,
InputCostPerImageTokenAbove128kTokens, InputCostPerImageTokenAbove200kTokens and
the corresponding OutputCostPerImageToken variants) and fall back to the generic
inputTokenRate/outputTokenRate if image-specific fields are nil; then use
inputTextTokens*inputTokenRate + inputImageTokens*inputImageTokenRate and
outputTextTokens*outputTokenRate + outputImageTokens*outputImageTokenRate when
computing inputCost/outputCost (modify the rate-selection logic around pricing,
and the final cost calculations where inputCost/outputCost are assigned).
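A sketch of the rate selection, assuming the image-specific pricing fields are optional (*float64) and that the generic inputTokenRate/outputTokenRate have already been resolved, including the 128k/200k tier handling.

```go
inputImageTokenRate := inputTokenRate // fall back to the generic rate
if pricing.InputCostPerImageToken != nil {
	inputImageTokenRate = *pricing.InputCostPerImageToken
}
outputImageTokenRate := outputTokenRate
if pricing.OutputCostPerImageToken != nil {
	outputImageTokenRate = *pricing.OutputCostPerImageToken
}
// The ...Above128kTokens / ...Above200kTokens variants would be checked here the
// same way the generic tiered rates already are.
inputCost := float64(inputTextTokens)*inputTokenRate + float64(inputImageTokens)*inputImageTokenRate
outputCost := float64(outputTextTokens)*outputTokenRate + float64(outputImageTokens)*outputImageTokenRate
```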
In @plugins/semanticcache/stream.go:
- Around line 121-127: The comparison block in the sort comparator only checks
accumulator.Chunks[i].Response.ImageGenerationStreamResponse for nil but then
reads
accumulator.Chunks[j].Response.ImageGenerationStreamResponse.Index/ChunkIndex,
risking a nil deref; update the comparator in stream.go to also verify
accumulator.Chunks[j].Response.ImageGenerationStreamResponse != nil (same
pattern used in other branches) before accessing its fields and handle the
mixed-type case (e.g., treat nil as less/greater or fall through to other
comparisons) so both i and j are safely validated when comparing Index and
ChunkIndex.
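A sketch of a nil-safe branch for that comparator; the Index and ChunkIndex fields are taken from the note, and the trailing nil handling simply orders image chunks ahead of non-image chunks.

```go
iImg := accumulator.Chunks[i].Response.ImageGenerationStreamResponse
jImg := accumulator.Chunks[j].Response.ImageGenerationStreamResponse
if iImg != nil && jImg != nil {
	if iImg.Index != jImg.Index {
		return iImg.Index < jImg.Index
	}
	return iImg.ChunkIndex < jImg.ChunkIndex
}
// Mixed/nil case: never dereference a nil response; pick a stable ordering
// or fall through to the comparisons used for the other chunk types.
if iImg != nil {
	return true
}
if jImg != nil {
	return false
}
```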
In @transports/bifrost-http/integrations/router.go:
- Around line 1290-1291: The switch branch invocations for streaming converters
(TextStreamResponseConverter, ChatStreamResponseConverter,
ResponsesStreamResponseConverter, SpeechStreamResponseConverter,
TranscriptionStreamResponseConverter, ImageGenerationStreamResponseConverter)
can panic if any converter is nil; update the switch in router.go to nil-check
each converter on config.StreamConfig before calling it (e.g., verify
config.StreamConfig.ImageGenerationStreamResponseConverter != nil) and handle
the missing-converter case gracefully (log an error via the router's logger and
return a proper error/skip the chunk instead of calling the nil func).
Alternatively, add a route initialization validation that ensures all required
StreamConfig converters are non-nil and fail fast; reference the existing switch
handling around chunk.BifrostImageGenerationStreamResponse and the StreamConfig
field names to locate the changes.
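One way the image-generation branch could guard itself (the same check applies to the other Stream*ResponseConverter fields); the logger call and the use of `continue` are assumptions about the surrounding loop.

```go
case chunk.BifrostImageGenerationStreamResponse != nil:
	if config.StreamConfig == nil || config.StreamConfig.ImageGenerationStreamResponseConverter == nil {
		// Skip the chunk instead of calling a nil converter func.
		logger.Warn("image generation stream converter not configured for this route; skipping chunk")
		continue
	}
	// ... existing conversion call ...
```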
🟡 Minor comments (15)
docs/providers/supported-providers/huggingface.mdx-19-19 (1)
19-19: Date reference may need updating. The table header references "as of December 2025" but the current date is January 2026. Consider updating to reflect the current state.
docs/providers/supported-providers/overview.mdx-28-28 (1)
28-28: Minor formatting inconsistency. The Hugging Face row has
✅|without a space before the pipe on the Images column, while other rows have consistent spacing. This is a minor visual inconsistency.-| Hugging Face (`huggingface/<model>`) | ✅ | ❌ | ❌ | ✅ | ✅ | ✅ | ✅ | ✅| ✅ | ✅ | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ | +| Hugging Face (`huggingface/<model>`) | ✅ | ❌ | ❌ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ |tests/integrations/python/config.yml-186-187 (1)
186-187: Inconsistent environment variable casing. The
nebiusAPI key usesNebius_API_KEYwhich has mixed casing. Environment variables are conventionally all uppercase for consistency with other entries (e.g.,OPENAI_API_KEY,GEMINI_API_KEY).huggingface: "HUGGINGFACE_API_KEY" - nebius: "Nebius_API_KEY" + nebius: "NEBIUS_API_KEY"plugins/semanticcache/utils.go-391-412 (1)
391-412: Potential metadata size concern with base64 image data. Storing full
image_b64data in metadata (line 402-403, 410) could result in very large entries. Base64-encoded images can be several megabytes, which may exceed metadata size limits in some vector stores (e.g., Pinecone has a 40KB metadata limit per vector).Consider:
- Storing only a hash or truncated prefix of the base64 data for cache key matching
- Storing image data separately from vector metadata
- Documenting the metadata size implications for users configuring semantic cache
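For the first option, a minimal, self-contained sketch of a fingerprint helper (the name imageDataFingerprint is hypothetical) that could be stored in metadata for cache-key matching instead of the raw base64 payload:

```go
package semanticcache

import (
	"crypto/sha256"
	"encoding/hex"
)

// imageDataFingerprint returns a stable 64-character key for a base64-encoded image,
// keeping vector-store metadata small regardless of image size.
func imageDataFingerprint(b64 string) string {
	sum := sha256.Sum256([]byte(b64))
	return hex.EncodeToString(sum[:])
}
```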
tests/integrations/python/tests/test_openai.py-1173-1173 (1)
1173-1173: Fix Ruff ARG002: unused `test_config` arg.
Rename to `_test_config` (or remove it) in the 4 new tests. Proposed minimal change:
-def test_52a_image_generation_simple(self, test_config, provider, model, vk_enabled): +def test_52a_image_generation_simple(self, _test_config, provider, model, vk_enabled): @@ -def test_52b_image_generation_multiple(self, test_config, provider, model, vk_enabled): +def test_52b_image_generation_multiple(self, _test_config, provider, model, vk_enabled): @@ -def test_52c_image_generation_quality(self, test_config, provider, model, vk_enabled): +def test_52c_image_generation_quality(self, _test_config, provider, model, vk_enabled): @@ -def test_52d_image_generation_different_sizes(self, test_config, provider, model, vk_enabled): +def test_52d_image_generation_different_sizes(self, _test_config, provider, model, vk_enabled):Also applies to: 1195-1195, 1218-1218, 1241-1241
core/internal/testutil/validation_presets.go-222-237 (1)
222-237: Remove the stray “add this function” comment.
Looks like a local note that shouldn’t ship. Proposed fix:
-// In validation_presets.go - add this function func ImageGenerationExpectations(minImages int, expectedSize string) ResponseExpectations {plugins/semanticcache/plugin_image_generation_test.go-110-223 (1)
110-223: Add consistent integration-test guards (Short mode + missing API key) to all tests.
OnlyTestImageGenerationCacheBasicFunctionalitycheckstesting.Short()andOPENAI_API_KEY. The other tests will run in-shortand/or with missing creds, and may “skip” for the wrong reason (e.g., auth failure).Proposed diff (pattern)
func TestImageGenerationSemanticSearch(t *testing.T) { + if testing.Short() { + t.Skip("skipping integration test in -short mode") + } + if os.Getenv("OPENAI_API_KEY") == "" { + t.Skip("OPENAI_API_KEY not set") + } // Initialize test with custom threshold config := &Config{Also applies to: 224-275, 276-401
plugins/semanticcache/plugin_image_generation_test.go-346-350 (1)
346-350: Streaming cache test: chunk-count equality may be flaky.
Cached streams may legitimately have different chunking/segmentation vs live streams (even if content is equivalent). Consider asserting completion + cache hit + non-empty data instead of strict chunk count equality.framework/modelcatalog/pricing.go-109-120 (1)
109-120: Normalize stream request types even when usage is missing.
Today normalization toImageGenerationRequestis gated onimageUsage != nil; if a provider returns no usage (but you still have requestType=stream), you’ll skip any image pricing lookup keyed by normalized type.Proposed diff
- if imageUsage != nil && requestType == schemas.ImageGenerationStreamRequest { + if requestType == schemas.ImageGenerationStreamRequest { requestType = schemas.ImageGenerationRequest }core/internal/testutil/image_generation.go-261-287 (1)
261-287: Avoid double-reporting stream errors on failure.
You foldstreamErrorsintovalidationErrors, and laterappend(validationResult.Errors, validationResult.StreamErrors...), so failures print duplicates.Proposed diff (keep streamErrors separate)
- // Stream errors should cause the test to fail - convert them to validation errors - if len(streamErrors) > 0 { - validationErrors = append(validationErrors, fmt.Sprintf("Stream errors encountered: %s", strings.Join(streamErrors, "; "))) - } + // Stream errors should cause the test to fail, but keep them separate for reporting.core/internal/testutil/image_generation.go-22-27 (1)
22-27: Add scenario flag gate to non-streaming image generation test.
RunImageGenerationTestcurrently runs wheneverImageGenerationModelis configured, ignoring thetestConfig.Scenarios.ImageGenerationflag. The streaming variant correctly gates onScenarios.ImageGenerationStreamfirst (line 155). Add the same gate to the non-streaming path to prevent unintended test execution in stacked PR scenarios where tests are selectively enabled via Scenarios flags.Proposed diff
func RunImageGenerationTest(t *testing.T, client *bifrost.Bifrost, ctx context.Context, testConfig ComprehensiveTestConfig) { + if !testConfig.Scenarios.ImageGeneration { + t.Logf("Image generation not enabled for provider %s", testConfig.Provider) + return + } if testConfig.ImageGenerationModel == "" { t.Logf("Image generation not configured for provider %s", testConfig.Provider) return }core/providers/openai/images.go-41-69 (1)
41-69: PopulateFallbacksin reverse conversion to achieve round-trip symmetry.
ToBifrostImageGenerationRequest()parsesrequest.Fallbacks(line 67), butToOpenAIImageGenerationRequest()never populatesreq.FallbacksfrombifrostReq.Fallbacks, causing data loss on round-trip conversion. Either set fallbacks in the reverse converter or remove it from the forward converter to maintain consistency.transports/bifrost-http/handlers/inference.go-1417-1489 (1)
1417-1489: Image generation handler: good validation + ExtraParams extraction; minor grammar correction needed.Change
"prompt can not be empty"→"prompt cannot be empty"for consistency with other error messages in the codebase (e.g., nebius.go, logging.go).The
cancel()call whenbifrostCtx == nilis safe—ConvertToBifrostContextalways returns a valid cancel function alongside the context.core/internal/testutil/test_retry_framework.go-2939-3095 (1)
2939-3095: Stream retry helper is consistent; minor naming/doc nit.
WithImageGenerationStreamRetrycomment references “ValidationRetry” but the function name doesn’t; consider aligning terminology for grep-ability.core/providers/huggingface/huggingface.go-943-947 (1)
943-947: Misleading comment - refers to "Nebius converter" but function name is generic.The comment says "Unmarshal response using Nebius converter" but the function is named
UnmarshalHuggingFaceImageGenerationResponse. This appears to be a copy-paste error from another provider's implementation.📝 Fix the misleading comment
- // Unmarshal response using Nebius converter + // Unmarshal response to Bifrost format bifrostResponse, convErr := UnmarshalHuggingFaceImageGenerationResponse(responseBody, request.Model)
🧹 Nitpick comments (30)
docs/providers/supported-providers/sgl.mdx (1)
25-25: Update the Note and Unsupported Features section for consistency.The table row correctly documents that SGLang does not support image generation. However, for documentation consistency, consider updating:
- The Note on line 32 to include "Image Generation" in the list of unsupported operations
- The "Unsupported Features" table (lines 105-110) to add an Image Generation row
📝 Suggested updates
Line 32:
-**Unsupported Operations** (❌): Speech, Transcriptions, Files, and Batch are not supported by the upstream SGL API. These return `UnsupportedOperationError`. +**Unsupported Operations** (❌): Image Generation, Speech, Transcriptions, Files, and Batch are not supported by the upstream SGL API. These return `UnsupportedOperationError`.Unsupported Features table:
| Feature | Reason | |---------|--------| +| Image Generation | Not offered by SGL API | | Speech/TTS | Not offered by SGL API | | Transcription/STT | Not offered by SGL API | | Batch Operations | Not offered by SGL API | | File Management | Not offered by SGL API |docs/providers/supported-providers/gemini.mdx (2)
560-576: Consider simplifying the bullet point formatting for readability.The bullet points in the "Standard Gemini Format" and "Imagen Format" subsections use a bold key-value pattern. While functional, consider using a table format similar to the Parameter Mapping section above for visual consistency.
This is a minor stylistic suggestion - the current format is acceptable.
600-604: Consider simplifying implementation detail reference. The mention of
schemas.IsImagenModel()exposes internal implementation details that may not be meaningful to end users. Consider rephrasing to focus on observable behavior rather than code references.📝 Suggested documentation improvement
## Endpoint Selection The provider automatically selects the endpoint based on model name: -- **Imagen models** (detected via `schemas.IsImagenModel()`): Uses `/v1beta/models/{model}:predict` endpoint +- **Imagen models** (e.g., `imagen-4.0-generate-001`): Uses `/v1beta/models/{model}:predict` endpoint - **Other models**: Uses `/v1beta/models/{model}:generateContent` endpoint with image response modalityui/app/workspace/providers/fragments/allowedRequestsFields.tsx (1)
82-83: Consider rebalancing column split. With the two new request types,
REQUEST_TYPESnow has 15 items, creating a 6/9 column split. For better visual balance, consider adjusting to a 7/8 or 8/7 split:- const leftColumn = REQUEST_TYPES.slice(0, 6); - const rightColumn = REQUEST_TYPES.slice(6); + const leftColumn = REQUEST_TYPES.slice(0, 8); + const rightColumn = REQUEST_TYPES.slice(8);docs/providers/supported-providers/azure.mdx (1)
429-429: Missing newline before horizontal rule. The horizontal rule
---should have a blank line before it for proper Markdown rendering to avoid it being interpreted as a setext heading.Image generation streaming is supported and uses OpenAI's streaming format with Server-Sent Events (SSE). + ---core/providers/azure/azure.go (1)
1269-1269: Remove orphan comment. Line 1269 has an empty comment that appears to be leftover from editing.
- // deployment := key.AzureKeyConfig.Deployments[request.Model]tests/integrations/python/tests/test_google.py (1)
1739-1741: Consider narrowing the exception type for better error visibility. The broad
except Exceptioncatch can mask unexpected errors (e.g., test logic bugs, network timeouts) that differ from expected "Imagen not available" scenarios. Consider catching more specific exceptions from the Google GenAI SDK (e.g.,google.api_core.exceptions.NotFoundor similar) to let genuine failures surface.♻️ Suggested refinement
- except Exception as e: - # Imagen may not be available in all regions or configurations - pytest.skip(f"Imagen generation failed: {e}") + except (google.api_core.exceptions.NotFound, + google.api_core.exceptions.PermissionDenied, + google.api_core.exceptions.InvalidArgument) as e: + # Imagen may not be available in all regions or configurations + pytest.skip(f"Imagen generation failed: {e}")plugins/semanticcache/utils.go (1)
11-11: Consider using `sonic.Marshal` consistently for response storage. The
addSingleResponsefunction usessonic.Marshal(line 382) to serialize the response, while other marshaling operations in this file usejson.Marshal(lines 119, 348, 746, 835). Since the project uses sonic as the standard serialization library throughout the codebase, consider aligning this usage for consistency.Note: This is a stylistic concern. The response string is stored but not hashed or compared, so the choice between sonic and json has no functional impact on cache correctness.
core/providers/bedrock/bedrock.go (1)
1347-1355: Use a consistent provider identifier in unsupported-operation errors.
In this file, other unsupported methods passschemas.Bedrockbut ImageGeneration passesprovider.GetProviderKey(). Please standardize (either allschemas.Bedrockor allprovider.GetProviderKey()), otherwise clients may see inconsistent “provider” values depending on operation.Proposed adjustment (pick one convention)
func (provider *BedrockProvider) ImageGeneration(ctx *schemas.BifrostContext, key schemas.Key, request *schemas.BifrostImageGenerationRequest) (*schemas.BifrostImageGenerationResponse, *schemas.BifrostError) { - return nil, providerUtils.NewUnsupportedOperationError(schemas.ImageGenerationRequest, provider.GetProviderKey()) + return nil, providerUtils.NewUnsupportedOperationError(schemas.ImageGenerationRequest, schemas.Bedrock) } func (provider *BedrockProvider) ImageGenerationStream(ctx *schemas.BifrostContext, postHookRunner schemas.PostHookRunner, key schemas.Key, request *schemas.BifrostImageGenerationRequest) (chan *schemas.BifrostStream, *schemas.BifrostError) { - return nil, providerUtils.NewUnsupportedOperationError(schemas.ImageGenerationStreamRequest, provider.GetProviderKey()) + return nil, providerUtils.NewUnsupportedOperationError(schemas.ImageGenerationStreamRequest, schemas.Bedrock) }transports/bifrost-http/integrations/genai.go (1)
504-522:isImageGenerationRequestheuristic is reasonable; consider tightening the fallback.
The model-name fallback (IsImagenModel(req.Model)) may cause false positives if non-image models include “imagen” in a path/alias. If you’ve got a more authoritative discriminator available (endpoint suffix, request fields, or provider capability map), prefer that over substring matching.core/providers/huggingface/models.go (1)
57-91: Also recognize image generation from tags (not justPipelineTag).
Many HF models rely on tags; consider addingtext-to-image,image-generation,stable-diffusion, etc. to the tags switch so listing doesn’t under-report capabilities.core/providers/openai/openai_test.go (1)
25-84: Test config change is clear; consider cost/flakiness gating for image generation.
Since image generation can be slow/expensive, it may be worth guarding these scenarios behind a dedicated env flag (in addition toOPENAI_API_KEY) if CI stability/cost is a concern.ui/app/workspace/logs/views/logDetailsSheet.tsx (1)
75-78: Consider adding image generation support tocopyRequestBody.The "Copy request body" feature doesn't support image generation requests. If this is intentional, consider updating the error message to explicitly mention image generation as unsupported. Otherwise, this could be a follow-up enhancement to support copying image generation request bodies.
core/providers/huggingface/errors.go (1)
12-65: FastAPIdetail.locparsing: consider handling non-float numeric types to avoid dropping path segments.
Today only(string|float64)are included; if loc elements decode asint/json.Number, the location string becomes incomplete.Proposed tweak
diff --git a/core/providers/huggingface/errors.go b/core/providers/huggingface/errors.go @@ for _, locPart := range detail.Loc { if locStr, ok := locPart.(string); ok { locParts = append(locParts, locStr) - } else if locNum, ok := locPart.(float64); ok { + } else if locNum, ok := locPart.(float64); ok { locParts = append(locParts, fmt.Sprintf("%.0f", locNum)) + } else { + // Best-effort fallback (covers int/json.Number/etc.) + locParts = append(locParts, fmt.Sprintf("%v", locPart)) } }core/internal/testutil/response_validation.go (1)
240-275: Image-generation validation helpers are reasonable; consider also assertingExtraFields.RequestTypewhen available.
Would catch provider wiring issues early (e.g., response tagged as the wrong request type).Also applies to: 1092-1166
core/internal/testutil/validation_presets.go (1)
363-370: MakecustomParams["min_images"]extraction tolerant to non-intnumeric types.
IfcustomParamsever comes from JSON/YAML, numbers often arrive asfloat64and the current code will always fall back to defaults.Possible fix
case "ImageGeneration": - if minImages, ok := customParams["min_images"].(int); ok { - if expectedSize, ok := customParams["expected_size"].(string); ok { - return ImageGenerationExpectations(minImages, expectedSize) - } - } + if expectedSize, ok := customParams["expected_size"].(string); ok { + if minImages, ok := schemas.SafeExtractInt(customParams["min_images"]); ok { + return ImageGenerationExpectations(minImages, expectedSize) + } + } return ImageGenerationExpectations(1, "1024x1024")ui/app/workspace/logs/views/imageView.tsx (2)
75-81: TheisStreamingprop logic is correct but could be clearer.The expression
isStreaming && !currentImageon line 80 will always evaluate tofalsehere since we're inside thecurrentImage &&block. This is actually the intended behavior (ImageMessage should only show streaming placeholder when there's no image data), but the logic is redundant.Consider simplifying to
isStreaming={false}or removing the prop entirely for clarity:♻️ Suggested simplification
<ImageMessage image={{ ...currentImage, output_format: imageOutput?.params?.output_format, }} - isStreaming={isStreaming && !currentImage} + isStreaming={false} />
9-11: Consider importingImageGenerationInputfrom shared types.This interface duplicates the definition from
core/schemas/images.go. If there's a corresponding TypeScript type in the shared types (similar toBifrostImageGenerationOutputinui/lib/types/logs.ts), consider importing it for consistency.framework/streaming/images.go (1)
270-274: Remove empty/unused code block.This if-block has only a comment and no actual code. If usage extraction is intended for future implementation, consider adding a TODO comment or removing the block entirely.
♻️ Suggested cleanup
- // Extract usage if available - if result.ImageGenerationStreamResponse.Usage != nil { - // Note: ImageUsage doesn't directly map to BifrostLLMUsage, but we can store it - // The actual usage will be extracted in processAccumulatedImageStreamingChunks - } + // Note: Usage (ImageUsage) is stored in chunk.Delta and extracted in processAccumulatedImageStreamingChunkscore/providers/gemini/images.go (2)
11-93:ToBifrostImageGenerationRequest: parameters are only mapped in the Imagen-instances path.
If a request comes viaContents+Parameters, those params get dropped. Consider applying the samerequest.Parametersmapping after theContentsprompt extraction too.
128-186: Response conversion looks reasonable; consider erroring on “no candidates / no predictions”.
Right now,ToBifrostImageGenerationResponsecan return a success with emptyData. If downstream assumes at least one image, returning a structured error would be safer.Also applies to: 450-478
core/providers/huggingface/images.go (1)
156-231: Stream request builder should assert fal-ai provider to avoid misuse.
Since this always returns a fal-ai stream request type, it’s safer to verifysplitIntoModelProvider(bifrostReq.Model)resolves tofalAIand otherwise error.framework/streaming/accumulator.go (1)
293-324: Image chunk dedup uses string keys; consider a cheaper key type.
fmt.Sprintf("%d:%d", ...)allocates per chunk; auint64composite key (or a small struct key) avoids that hot-path overhead.core/providers/gemini/types.go (1)
1702-1707: Redundant struct definition - consider reusingImagenInstancetype.
GeminiImagenRequest.Instancesuses an inline anonymous struct with*stringfor Prompt, whileImagenInstance(line 91-93) already exists withstringfor Prompt. This creates two different representations for the same concept.Consider consolidating to a single type for consistency and maintainability:
♻️ Suggested consolidation
type GeminiImagenRequest struct { - Instances *[]struct { - Prompt *string `json:"prompt"` - } `json:"instances"` + Instances *[]ImagenInstance `json:"instances"` Parameters GeminiImagenParameters `json:"parameters"` }If the pointer semantics are required for the API, update
ImagenInstance:type ImagenInstance struct { - Prompt string `json:"prompt,omitempty"` + Prompt *string `json:"prompt,omitempty"` }core/providers/huggingface/huggingface.go (5)
1086-1103: Ensure consistent error typing for timeout vs cancellation.The error handling correctly differentiates between context cancellation, timeout, and other errors. However, the context.Canceled check should also handle wrapped errors.
♻️ Use errors.Is for wrapped error compatibility
The current code already uses
errors.Is, which handles wrapped errors correctly. This is good practice.
1123-1129: Unused variableslastB64Data,lastURLData,lastJsonDatawhen no chunks are received.These variables are initialized but only used when
chunkIndex > 0. This is fine, but consider moving their declarations inside the loop or the final chunk block to reduce scope.
1261-1274: Scanner error handling occurs after completion chunk is sent.If
scanner.Err()returns an error after successfully processing chunks, both a completion chunk and an error will be sent to the channel. This could confuse consumers expecting either success or failure.Consider checking for scanner errors before sending the completion chunk:
♻️ Check scanner error before sending completion
+ // Check for scanner errors first + if err := scanner.Err(); err != nil { + bifrostErr := providerUtils.NewBifrostOperationError( + fmt.Sprintf("Error reading fal-ai stream: %v", err), + err, + providerName, + ) + bifrostErr.ExtraFields = schemas.BifrostErrorExtraFields{ + Provider: providerName, + ModelRequested: request.Model, + RequestType: schemas.ImageGenerationStreamRequest, + } + ctx.SetValue(schemas.BifrostContextKeyStreamEndIndicator, true) + providerUtils.ProcessAndSendBifrostError(ctx, postHookRunner, bifrostErr, responseChan, logger) + return + } + // Stream closed - send completion chunk if chunkIndex > 0 { // ... completion chunk logic } - - if err := scanner.Err(); err != nil { - // ... error handling (remove this block) - }
1231-1259: Consider sending the stream end indicator only once.The stream end indicator (
BifrostContextKeyStreamEndIndicator) is set in multiple places: on error (line 1179, 1272), and on successful completion (line 1254). While this works, it could be cleaner to set it in a single deferred function or just before closing the channel.
250-253: RenameparseHuggingFaceImageErrorfor clarity—it handles all request types, not just images.The function name is misleading since it's used for chat completions, embeddings, and other request types beyond image generation. While the underlying error structure (FastAPI validation errors in the Detail field) is consistent across HuggingFace inference endpoints, the function should be renamed to reflect its generic purpose, such as
parseHuggingFaceError.Additionally,
completeRequestpassesnilmetadata (line 252) whilecompleteRequestWithModelAliasCachepasses proper metadata—consider making this consistent for better error context.framework/streaming/types.go (1)
360-393: MissingMaxImageChunkIndextracking for image streaming consistency.ImageStreamChunk contains TokenUsage, Cost, and other metadata fields, and
images.goextracts these from what should be the final chunk. However, unlike Chat, Audio, and Transcription streaming which trackMax*ChunkIndexand use getter methods (e.g.,getLastChatChunk()), image streaming directly accesses the last array element viaacc.ImageStreamChunks[len(acc.ImageStreamChunks)-1]. This approach is fragile for out-of-order chunk arrival and inconsistent with other stream types.Adding
MaxImageChunkIndextoStreamAccumulatorand a correspondinggetLastImageChunk()method would improve robustness and maintain consistency across all streaming types.
675a340 to e6d8d09
Actionable comments posted: 20
Note
Due to the large number of review comments, Critical, Major severity comments were prioritized as inline comments.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
core/internal/testutil/account.go (1)
1131-1157: Add missing Fallbacks field to XAI provider configuration. The XAI provider configuration at lines 1131-1157 is missing the
Fallbacksfield. Every other provider in the file includes a Fallbacks field pointing to OpenAI with "gpt-4o-mini" as the fallback model. XAI should follow the same pattern for consistency.core/providers/anthropic/anthropic.go (1)
629-699: Don’t emit the final summary chunk after an in-stream error. When
event.ToBifrostChatCompletionStream()returnsbifrostErr != nil, the code sends an error andbreaks—then proceeds to emit the final usage/summary chunk after the loop. That can confuse clients (error followed by “final” chunk).Proposed fix
response, bifrostErr, isLastChunk := event.ToBifrostChatCompletionStream() if bifrostErr != nil { bifrostErr.ExtraFields = schemas.BifrostErrorExtraFields{ RequestType: schemas.ChatCompletionStreamRequest, Provider: providerName, ModelRequested: modelName, } ctx.SetValue(schemas.BifrostContextKeyStreamEndIndicator, true) providerUtils.ProcessAndSendBifrostError(ctx, postHookRunner, bifrostErr, responseChan, logger) - break + return } @@ if isLastChunk { break }core/providers/huggingface/huggingface.go (1)
220-265: `parseHuggingFaceImageError` should not be called with `nil` metadata—request context is lost for all callers. Despite the misleading name,
parseHuggingFaceImageErroris actually generic and safe for all endpoints (chat, embedding, speech, transcription, image generation). However,completeRequestcalls it withnilfor metadata at line 256, which preventsExtraFields(Provider, ModelRequested, RequestType) from being populated in error responses. This degrades error reporting across all endpoints usingcompleteRequest.Pass the appropriate request metadata instead of
nil, or refactor to construct metadata from the context and request type information available incompleteRequest.
🤖 Fix all issues with AI agents
In @core/internal/testutil/response_validation.go:
- Around line 240-275: ValidateImageGenerationResponse calls
validateImageGenerationFields which assumes expectations.ProviderSpecific is
non-nil; add a nil-guard so we never index into a nil map/slice. Update either
ValidateImageGenerationResponse to ensure expectations.ProviderSpecific is
initialized to an empty map/struct before calling validateImageGenerationFields,
or modify validateImageGenerationFields to check if
expectations.ProviderSpecific == nil and handle it safely (return early or use
safe defaults) before any indexing or range operations; reference the
ValidateImageGenerationResponse and validateImageGenerationFields functions and
the expectations.ProviderSpecific field when making the change.
- Around line 1096-1122: In validateImageGenerationFields, avoid panics by first
checking expectations.ProviderSpecific != nil before accessing ["min_images"],
and make parsing robust by using a type switch to accept int, int64, float64,
json.Number (or string that can be parsed to int) to derive minImages as an int;
if conversion fails, skip the min_images check or record an error. Then compare
len(response.Data) to the parsed minImages and update result.Passed,
result.Errors, and result.MetricsCollected["image_count"] as before.
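A sketch of a tolerant extraction, reusing the result/expectations field names from the note; json.Number handling assumes the encoding/json import.

```go
minImages := 0
if expectations.ProviderSpecific != nil {
	switch v := expectations.ProviderSpecific["min_images"].(type) {
	case int:
		minImages = v
	case int64:
		minImages = int(v)
	case float64:
		minImages = int(v)
	case json.Number:
		if n, err := v.Int64(); err == nil {
			minImages = int(n)
		}
	}
}
if minImages > 0 && len(response.Data) < minImages {
	result.Passed = false
	result.Errors = append(result.Errors, fmt.Sprintf("expected at least %d images, got %d", minImages, len(response.Data)))
}
result.MetricsCollected["image_count"] = len(response.Data)
```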
In @core/providers/gemini/gemini.go:
- Around line 1614-1753: ImageGeneration can panic if request == nil and
handleImagenImageGeneration may dereference a nil conversion result; add a nil
guard at the top of ImageGeneration (mirror nebius/openai pattern) that returns
a missing-parameter error when request is nil, and in
handleImagenImageGeneration verify the result of
imagenResponse.ToBifrostImageGenerationResponse() is non-nil before setting
response.ExtraFields; if nil, return an appropriate Bifrost error (e.g., a
provider response decode/operation error) including providerName and
request.Model to avoid panics.
In @core/providers/huggingface/images.go:
- Around line 14-154: In ToHuggingFaceImageGenerationRequest's falAI case, restore
parity with ToHuggingFaceImageStreamRequest by mapping missing fields from
bifrostReq.Params and bifrostReq.Params.ExtraParams into the
HuggingFaceFalAIImageGenerationRequest: copy OutputFormat ->
ResponseFormat/ResponseFormat field, Seed -> Seed, NegativePrompt ->
NegativePrompt, and extract NumInferenceSteps, GuidanceScale (and other numeric
extra params) using schemas.SafeExtractIntPointer into
NumInferenceSteps/GuidanceScale (or the fal-ai equivalents) and normalize image
size/format handling similar to the stream converter; update the falAI branch in
ToHuggingFaceImageGenerationRequest (function name
ToHuggingFaceImageGenerationRequest, type
HuggingFaceFalAIImageGenerationRequest) to include these mappings consistent
with ToHuggingFaceImageStreamRequest.
In @core/providers/openai/openai.go:
- Around line 2382-2491: Add the same nil/request validation used by
ImageGenerationStream to the non-streaming path: in
HandleOpenAIImageGenerationRequest (and the ImageGeneration wrapper) first check
request != nil and return a BifrostError if nil, then validate the prompt/input
fields on the request (e.g., ensure request.Prompt or the equivalent image
prompt field is non-empty) and return a clear Bifrost user/operation error when
missing; do this before calling
CheckContextAndGetRequestBody/ToOpenAIImageGenerationRequest so we never marshal
or send a null/invalid body to the provider.
In @core/providers/vertex/vertex_test.go:
- Around line 45-46: The test incorrectly enables streaming image generation by
setting ImageGenerationStream: true while the Vertex provider does not support
it; update the test configuration to set ImageGenerationStream: false so it
matches the provider capability (the code path that returns
UnsupportedOperationError in vertex.go for image streaming should no longer be
exercised). Ensure the test config object with fields ImageGeneration and
ImageGenerationStream is modified accordingly.
In @docs/openapi/schemas/inference/images.yaml:
- Around line 3-100: The OpenAPI ImageGenerationRequest schema is inconsistent
with the core BifrostImageGenerationRequest shape (core/schemas/images.go):
update the docs or the handler so they match; specifically, either modify
docs/docs/openapi/schemas/inference/images.yaml ImageGenerationRequest to
include a top-level provider (string) and an input object that nests prompt and
any per-request params (e.g., n, size, quality, style, response_format,
background, moderation, partial_images, output_compression, output_format, user,
stream, fallbacks) to mirror BifrostImageGenerationRequest.input/params, or
change the handler to accept the flat top-level fields currently in the YAML;
ensure the request field names and nesting (provider, model, input, params)
exactly match the struct names in BifrostImageGenerationRequest so generated
clients and decoding work correctly.
In @framework/modelcatalog/pricing.go:
- Around line 276-426: The tiering is computed from imageTotalTokens (set from
imageUsage.TotalTokens) but that value can be zero when providers omit
total_tokens; update the logic before computing isAbove128k/isAbove200k to
compute a fallback imageTotalTokens by summing available counts from
imageUsage.InputTokens, imageUsage.OutputTokens and the breakdowns
imageUsage.InputTokensDetails.{TextTokens,ImageTokens} and
imageUsage.OutputTokensDetails.{TextTokens,ImageTokens} when TotalTokens==0 so
tier flags (isAbove128k/isAbove200k) use the derived total; adjust the branch
that sets imageTotalTokens (and any places that later read it) to use this
computed fallback to determine the correct tiered rates.
In @framework/streaming/images.go:
- Around line 18-35: Loop body dereferences chunks[i].Delta and its ExtraFields
without nil checks. Add a guard at the top of the loop (e.g., if chunks[i].Delta
== nil { continue }) to skip chunks that lack Delta, and also defensively check
Delta.ExtraFields before accessing ModelRequested (e.g., model := ""; if
chunks[i].Delta.ExtraFields != nil { model =
chunks[i].Delta.ExtraFields.ModelRequested }). Then build and return the
BifrostImageGenerationResponse using these safe values (still referencing
chunks, Delta, Delta.ExtraFields, and BifrostImageGenerationResponse).
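A sketch of the guarded loop; Delta, ExtraFields, and ModelRequested come from the note, and the accumulation step is left as a placeholder.

```go
for i := range chunks {
	if chunks[i].Delta == nil {
		continue // skip malformed chunks instead of dereferencing nil
	}
	model := ""
	if chunks[i].Delta.ExtraFields != nil {
		model = chunks[i].Delta.ExtraFields.ModelRequested
	}
	// ... accumulate chunk data and carry `model` into the assembled
	// BifrostImageGenerationResponse ...
}
```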
In @plugins/logging/main.go:
- Around line 34-48: The pool cleanup in putUpdateLogData is missing resets for
ImageGenerationOutput, RawRequest, and RawResponse, which allows
image-generation and raw payloads to leak between reused UpdateLogData
instances; update putUpdateLogData to explicitly set data.ImageGenerationOutput
= nil, data.RawRequest = nil, and data.RawResponse = nil before calling
p.updateDataPool.Put(data) so all pooled fields are cleared; reference the
UpdateLogData struct fields ImageGenerationOutput, RawRequest, RawResponse and
the putUpdateLogData function in plugins/logging/pool.go when making the change.
In @plugins/mocker/main.go:
- Around line 718-722: PreHook currently filters out image generation requests
so the image prompt extraction in the case for schemas.ImageGenerationRequest
and schemas.ImageGenerationStreamRequest is never reached; update PreHook to
allow these types by including schemas.ImageGenerationRequest and
schemas.ImageGenerationStreamRequest in the allowed request types (modify the
condition that now only allows schemas.ChatCompletionRequest and
schemas.ResponsesRequest). After enabling the path, extend
generateSuccessShortCircuit and handleDefaultBehavior to handle image generation
responses by adding branches that construct appropriate image-generation
short-circuit responses (parallel to the existing ChatResponse handling) when
the request type is ImageGenerationRequest or ImageGenerationStreamRequest, and
ensure the code path uses the extracted prompt from the existing
ImageGenerationRequest.Input.Prompt handling.
In @plugins/semanticcache/plugin_image_generation_test.go:
- Around line 109-223: The test TestImageGenerationSemanticSearch should be made
opt-in: at its start check testing.Short() and ENV OPENAI_API_KEY and call
t.Skip with explanatory messages if short mode is set or the key is missing;
update the test initializer in TestImageGenerationSemanticSearch to do these
early guards (before creating Config/NewTestSetupWithConfig) so CI won’t run the
external OpenAI-dependent semantic-cache test when credentials are absent or
when -short is requested.
- Around line 224-401: Both TestImageGenerationDifferentParameters and
TestImageGenerationStreamCaching need the same integration-test guard used in
TestImageGenerationCacheBasicFunctionality to prevent accidental network calls
during unit runs; update each test to early-skip when the integration
flag/condition used by TestImageGenerationCacheBasicFunctionality is false
(i.e., replicate the same guard check and t.Skip behavior), placing the guard at
the top of TestImageGenerationDifferentParameters and
TestImageGenerationStreamCaching so they only run when the integration gate is
enabled.
In @tests/integrations/python/config.yml:
- Around line 45-46: Replace the incorrectly-cased env var string
"Nebius_API_KEY" with the uppercase "NEBIUS_API_KEY" throughout the test config
so the tests pick up the actual environment variable; update every occurrence
(e.g., the entries currently showing "Nebius_API_KEY" around the noted ranges
such as lines ~45, 104-106, 133-135, 170-175, 186-188, 208-209, 311-313,
353-355, 457-462, 486-487) to "NEBIUS_API_KEY".
In @tests/integrations/python/tests/test_google.py:
- Around line 1687-1787: The tests use get_provider_google_client(provider)
across multiple providers but only have @skip_if_no_api_key("google"), causing
get_api_key(provider) to raise ValueError at runtime; inside each test
(test_41a_image_generation_simple, test_41b_imagen_predict,
test_41c_image_generation_with_text) add an early provider-specific API key
check (call get_api_key(provider) or wrap get_provider_google_client(provider)
in a try/except that skips the test on ValueError) before creating the client,
rename the unused test_config parameter to _test_config in those test
signatures, and replace the broad except Exception in test_41b_imagen_predict
with except ValueError as e (or the specific exception raised by get_api_key) so
missing-key cases are skipped cleanly.
In @tests/integrations/python/tests/test_openai.py:
- Around line 1166-1258: Tests test_52b, test_52c, and test_52d assume
parameters unsupported by some configured models (e.g., dall-e-3); instead of
hardcoded model string checks, query provider/model capabilities before running
assertions: in each test (test_52b_image_generation_multiple,
test_52c_image_generation_quality, test_52d_image_generation_different_sizes)
call the capability helper (or extend
get_provider_openai_client/format_provider_model) to check flags like
supports_multiple_images, supported_quality_values, and supported_sizes and skip
if capability missing, or adapt the request to use a supported value (e.g., use
n=1 when multiple not supported, choose a quality from supported_quality_values,
and pick sizes from supported_sizes) so the tests exercise only valid parameter
combinations for the given provider/model.
In @tests/integrations/python/tests/utils/common.py:
- Around line 2620-2724: The Google branch in
assert_valid_image_generation_response is too permissive and has inconsistent
key casing and weak base64 checks; update assert_valid_image_generation_response
to accept both camelCase and snake_case for inlineData/inline_data and
mimeType/mime_type and data/data, validate image data with a real base64 check
(e.g., regex for base64 chars and padding) plus a length threshold, ensure
mime_type/mimeType starts with "image/", and tighten the Imagen predictions
check so that for each prediction in predictions you verify the bytes field
exists AND is non-empty (handle bytesBase64Encoded and bytes_base64_encoded)
rather than only asserting attribute presence; reference these symbols when
changing logic: assert_valid_image_generation_response, the local variables
candidates/parts/inline_data/inlineData, mimeType/mime_type, data, and
predictions/bytesBase64Encoded/bytes_base64_encoded.
🟡 Minor comments (6)
core/providers/bedrock/bedrock.go-691-692 (1)
691-692: Confirm: all GetBifrostResponseForStreamResponse call sites correctly migrated to 6-arg signature.Verified that all five call sites (lines 691, 937, 951, 1128, 1221) pass exactly 6 arguments with the expected parameter positioning (text completion, chat, responses variants). Signature alignment is sound.
Address inconsistency in unsupported method stubs: ImageGeneration (line 1349) and ImageGenerationStream (line 1354) use
provider.GetProviderKey()for the second argument, while TranscriptionStream (line 1344) usesschemas.Bedrock. Standardize to one approach for consistency.core/internal/testutil/validation_presets.go-222-237 (1)
222-237: Makemin_imagesparsing tolerant (int vs float64) and drop the stray inline comment.This is testutil, but if
customParamsis ever JSON-decoded, numbers will come through asfloat64and you’ll always fall back to defaults.Proposed adjustment
-// In validation_presets.go - add this function func ImageGenerationExpectations(minImages int, expectedSize string) ResponseExpectations { @@ case "ImageGeneration": - if minImages, ok := customParams["min_images"].(int); ok { - if expectedSize, ok := customParams["expected_size"].(string); ok { - return ImageGenerationExpectations(minImages, expectedSize) - } - } + var minImages int + switch v := customParams["min_images"].(type) { + case int: + minImages = v + case float64: + minImages = int(v) + } + if minImages > 0 { + if expectedSize, ok := customParams["expected_size"].(string); ok && expectedSize != "" { + return ImageGenerationExpectations(minImages, expectedSize) + } + return ImageGenerationExpectations(minImages, "1024x1024") + } return ImageGenerationExpectations(1, "1024x1024")Also applies to: 363-370
docs/providers/supported-providers/nebius.mdx-135-189 (1)
135-189: Good:guidance_scaledocumented asint (0-100)andai_project_idbehavior is clear.
This matches the Nebius image-gen typing/range expectation (as per learnings). Only nit: the “Model & Prompt” bullet list could be lightly reworded to avoid repetitive sentence starts.core/providers/nebius/types.go-56-75 (1)
56-75: UnmarshalJSON does not handlenullor empty input.If
dataisnull(JSON literal) or empty, the current implementation may behave unexpectedly. Thestrings.TrimSpaceon empty data results in an empty string, which passes thelen(trimmed) > 0check as false, then attempts to unmarshal as a string which may fail or produce an empty message.Suggested improvement
func (d *NebiusErrorDetail) UnmarshalJSON(data []byte) error { + // Handle null JSON value + if string(data) == "null" { + return nil + } + // First, try to unmarshal as an array (validation errors) trimmed := strings.TrimSpace(string(data)) + if len(trimmed) == 0 { + return nil + } if len(trimmed) > 0 && trimmed[0] == '[' {framework/modelcatalog/pricing.go-164-173 (1)
164-173: Update documentation example forCalculateCostFromUsagemethod signature.The example in
docs/architecture/framework/model-catalog.mdx(lines 169-180) is outdated. It shows only 8 arguments and is missing thedeploymentparameter (betweenmodelandusage) and theimageUsageparameter (at the end). All code call sites have been correctly updated with the 9 required parameters.core/providers/huggingface/huggingface.go-968-1022 (1)
968-1022: ImageGenerationStream error construction: avoid using dynamic strings as “operation code”.
providerUtils.NewBifrostOperationErrorappears to expect a stable error code (you use constants elsewhere). Passing a formatted sentence as the “code” will make error handling/analytics inconsistent. Prefer a dedicated constant (or useNewUnsupportedOperationErrorif that’s the intended category).
🧹 Nitpick comments (27)
core/schemas/utils.go (1)
1183-1186: Inconsistent case handling compared to sibling functions.This function uses
strings.ToLower(model)before checking for "imagen", but all other model detection functions in this file (IsNovaModel,IsAnthropicModel,IsMistralModel,IsGeminiModel) perform case-sensitive matching directly.If case-insensitivity is intentional for Imagen models (e.g., providers may report "Imagen" vs "imagen"), please add a comment explaining this. Otherwise, consider aligning with the existing pattern for consistency:
♻️ Suggested fix for consistency
// IsImagenModel checks if the model is an Imagen model. func IsImagenModel(model string) bool { - return strings.Contains(strings.ToLower(model), "imagen") + return strings.Contains(model, "imagen") }Please verify whether Imagen model identifiers can appear in mixed case across different providers, which would justify the case-insensitive check.
docs/quickstart/gateway/multimodal.mdx (1)
97-98: Minor: Remove extra blank line.There's a double blank line between the Image Generation section and Audio Understanding section.
📝 Suggested fix
}
Audio Understanding: Analyzing Audio with AI
</details> </blockquote></details> <details> <summary>docs/providers/supported-providers/xai.mdx (1)</summary><blockquote> `132-134`: **Consider varying sentence structure (optional).** Per static analysis, three successive bullet points begin with similar patterns. Consider rewording for better readability. <details> <summary>📝 Suggested alternative wording</summary> ```diff -**Request Conversion** - -xAI uses the same conversion as OpenAI (see [OpenAI Image Generation](/providers/supported-providers/openai#7-image-generation)): - -- **Model & Prompt**: `bifrostReq.Model` → `req.Model`, `bifrostReq.Input.Prompt` → `req.Prompt` -- **Parameters**: All fields from `bifrostReq.Params` (`ImageGenerationParameters`) are embedded directly into the request struct via struct embedding -- **Endpoint**: `/v1/images/generations` +**Request Conversion** + +xAI uses the same conversion as OpenAI (see [OpenAI Image Generation](/providers/supported-providers/openai#7-image-generation)): + +| Field | Mapping | +|-------|---------| +| Model & Prompt | `bifrostReq.Model` → `req.Model`, `bifrostReq.Input.Prompt` → `req.Prompt` | +| Parameters | All fields from `bifrostReq.Params` (`ImageGenerationParameters`) embedded via struct embedding | +| Endpoint | `/v1/images/generations` |docs/providers/supported-providers/vertex.mdx (1)
25-25: Consider consistency in streaming indicator.The Image Generation row shows ✅ for non-streaming and
-for streaming. Other unsupported streaming operations in this table use❌(e.g., Embeddings, List Models). Consider whether-is intentionally indicating "not applicable" or if it should be❌for consistency.Based on the code (
ImageGenerationStreamreturnsUnsupportedOperationError), using❌would be more consistent with other providers' documentation.📝 Suggested change for consistency
-| Image Generation | ✅ | - | `/generateContent` or `/predict` (Imagen) | +| Image Generation | ✅ | ❌ | `/generateContent` or `/predict` (Imagen) |tests/integrations/python/tests/test_google.py (1)
2318-2339: New function-call extractor looks fine; consider using it to reduce repeatedhasattr(response, "function_calls")checks.
-->core/providers/bedrock/bedrock.go (1)
1347-1355: Nit: make unsupported-operation provider naming consistent across this file.
These new stubs useprovider.GetProviderKey(), while older unsupported methods (e.g., Speech/Transcription) passschemas.Bedrock, which can produce mismatched provider names when custom provider config is used.Proposed consistency tweak
func (provider *BedrockProvider) Speech(ctx *schemas.BifrostContext, key schemas.Key, request *schemas.BifrostSpeechRequest) (*schemas.BifrostSpeechResponse, *schemas.BifrostError) { - return nil, providerUtils.NewUnsupportedOperationError(schemas.SpeechRequest, schemas.Bedrock) + return nil, providerUtils.NewUnsupportedOperationError(schemas.SpeechRequest, provider.GetProviderKey()) }core/providers/gemini/types.go (1)
1702-1707: Consider using a named type for Instances.The anonymous struct pattern for
Instancesis unusual and harder to work with:Instances *[]struct { Prompt *string `json:"prompt"` } `json:"instances"`Consider using the already-defined
ImagenInstancetype for consistency:♻️ Suggested improvement
type GeminiImagenRequest struct { - Instances *[]struct { - Prompt *string `json:"prompt"` - } `json:"instances"` + Instances []ImagenInstance `json:"instances"` Parameters GeminiImagenParameters `json:"parameters"` }Note:
ImagenInstanceis already defined at lines 91-93 and could be reused here.core/providers/nebius/errors.go (1)
11-68: Add a nil-guard forbifrostErrand consider pairing validation message + location.
HandleProviderAPIErroris assumed non-nil; a guard avoids a panic if that contract ever changes. Also, aggregating locations separately can make debugging harder when multiple validation errors exist.Proposed tweak
func parseNebiusImageError(resp *fasthttp.Response, meta *providerUtils.RequestMetadata) *schemas.BifrostError { var nebiusErr NebiusError bifrostErr := providerUtils.HandleProviderAPIError(resp, &nebiusErr) + if bifrostErr == nil { + return &schemas.BifrostError{ + IsBifrostError: false, + Error: &schemas.ErrorField{ + Message: "provider API error", + }, + } + } if bifrostErr.Error == nil { bifrostErr.Error = &schemas.ErrorField{} } @@ if len(nebiusErr.Detail.ValidationErrors) > 0 { - var messages []string - var locations []string + var messages []string @@ for _, detail := range nebiusErr.Detail.ValidationErrors { if detail.Msg != "" { - messages = append(messages, detail.Msg) - } - if len(detail.Loc) > 0 { - locations = append(locations, strings.Join(detail.Loc, ".")) + if len(detail.Loc) > 0 { + messages = append(messages, detail.Msg+" ["+strings.Join(detail.Loc, ".")+"]") + } else { + messages = append(messages, detail.Msg) + } } } @@ - if len(locations) > 0 { - locationStr := strings.Join(locations, ", ") - if message == "" { - message = "[" + locationStr + "]" - } else { - message = message + " [" + locationStr + "]" - } - } } }plugins/logging/main.go (1)
574-583: Consider DB/UI bloat from storing full image outputs in logs.If
BifrostImageGenerationResponse.Datacontains base64 payloads, logging it can balloon DB size and slow log queries/UI rendering. If this is expected, OK; otherwise consider storing only metadata (counts, sizes, URLs) or gating behind a dedicated config.ui/app/workspace/logs/views/imageView.tsx (2)
19-39: LGTM: index clamping avoids out-of-range on output updates; navigation is safe.
Optional:imagescould beuseMemo’d if this view re-renders frequently with large outputs.
56-99: Consider whetherImageMessageshould receiveisStreamingeven whencurrentImageexists.
Right now you passisStreaming={isStreaming && !currentImage}, so once the first image arrives you’ll never show the streaming skeleton/progress even if more images are still streaming in. If that’s intended, ignore; if not, wireisStreaming(and maybe progress) differently.framework/logstore/migrations.go (1)
684-748: LGTM: guarded add/drop column migrations with transactions enabled.
Minor: consider grouping input+output into one migration ID if you want them to be “atomic” as a pair (optional).docs/openapi/schemas/integrations/openai/images.yaml (1)
98-104: Minor inconsistency in stream event type enum.The
typeenum has inconsistent naming:image_generation.partial_imageandimage_generation.completedfollow a namespace pattern, buterrordoes not. Consider usingimage_generation.errorfor consistency with the other values.Suggested fix
enum: - "image_generation.partial_image" - "image_generation.completed" - - "error" + - "image_generation.error"core/providers/nebius/nebius.go (1)
129-135: Consider escapingai_project_idin ChatCompletion for consistency.The
ImageGenerationmethod at line 269 properly usesurl.QueryEscapewhen appendingai_project_idto the query string, butChatCompletionusesfmt.Sprint(rawID)directly without escaping. This could be a security concern if the value contains special characters.Suggested fix
```diff
 	// Append query parameter if present
 	if rawID, ok := request.Params.ExtraParams["ai_project_id"]; ok && rawID != nil {
+		escapedID := url.QueryEscape(fmt.Sprint(rawID))
 		if strings.Contains(path, "?") {
-			path = path + "&ai_project_id=" + fmt.Sprint(rawID)
+			path = path + "&ai_project_id=" + escapedID
 		} else {
-			path = path + "?ai_project_id=" + fmt.Sprint(rawID)
+			path = path + "?ai_project_id=" + escapedID
 		}
 	}
```

core/providers/nebius/types.go (1)
26-29: Consider using `URL` instead of `Url` for consistency with Go naming conventions.
Go convention typically uses all-caps for acronyms in exported identifiers (e.g., `URL`, `ID`, `HTTP`).

Suggested change
```diff
 type NebiusLora struct {
-	Url   string `json:"url"`
+	URL   string `json:"url"`
 	Scale int    `json:"scale"`
 }
```

core/internal/testutil/image_generation.go (1)
284-287: Slice append may unintentionally modify the original `Errors` slice.
Using `append(validationResult.Errors, validationResult.StreamErrors...)` can modify the backing array of `validationResult.Errors` if it has sufficient capacity. Since this is only used for error reporting and the result is immediately consumed, the practical impact is minimal, but it's worth noting for correctness.

Safer alternative
```diff
 	if !validationResult.Passed {
-		allErrors := append(validationResult.Errors, validationResult.StreamErrors...)
+		allErrors := make([]string, 0, len(validationResult.Errors)+len(validationResult.StreamErrors))
+		allErrors = append(allErrors, validationResult.Errors...)
+		allErrors = append(allErrors, validationResult.StreamErrors...)
 		t.Fatalf("❌ Image generation stream validation failed: %s", strings.Join(allErrors, "; "))
 	}
```

core/providers/azure/azure.go (1)
1269-1269: Remove orphaned empty comment.
Line 1269 has an empty comment `//` which appears to be a leftover. Either remove it or add meaningful documentation.

Suggested fix
```diff
-	// deployment := key.AzureKeyConfig.Deployments[request.Model]
```

framework/modelcatalog/pricing.go (1)
16-120: Normalize image-generation stream request type without depending on `imageUsage` presence.
imageUsage != nil(Line 116). If upstream ever produces an image stream response whereExtraFields.RequestTypeisimage_generation_streambutUsageis absent, you’ll silently skip normalization and risk a pricing miss. Consider normalizing purely based onrequestType.core/providers/openai/openai.go (1)
2493-2888: Streaming parser: `bufio.Scanner` max token (10MB) is a likely ceiling for base64-heavy chunks.
b64_jsonpayloads,scanner.Scan()will fail withErrTooLongand you’ll end up treating it as a stream read error at the end (Lines 2881-2884). Consider switching to abufio.Reader-based SSE parser, or at least raising the limit defensively.Minimal mitigation (bump scanner limit)
```diff
-	scanner := bufio.NewScanner(resp.BodyStream())
-	buf := make([]byte, 0, 1024*1024)
-	scanner.Buffer(buf, 10*1024*1024)
+	scanner := bufio.NewScanner(resp.BodyStream())
+	buf := make([]byte, 0, 1024*1024)
+	// Image payloads (base64) can be large; keep this comfortably above expected chunk sizes.
+	scanner.Buffer(buf, 64*1024*1024)
```
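For the `bufio.Reader` alternative mentioned above, a sketch that avoids the fixed token ceiling entirely (only `bufio`, `io`, and `strings` from the standard library; `processLine` is a hypothetical per-line handler):

```go
// readSSE drains an SSE body line by line without bufio.Scanner's max-token limit,
// since ReadString grows its buffer as needed for large base64 chunks.
func readSSE(body io.Reader, processLine func(string)) error {
    reader := bufio.NewReader(body)
    for {
        line, err := reader.ReadString('\n')
        if len(line) > 0 {
            processLine(strings.TrimRight(line, "\r\n"))
        }
        if err != nil {
            if err == io.EOF {
                return nil
            }
            return err
        }
    }
}
```

transports/bifrost-http/integrations/router.go (1)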
1195-1416: Avoid emitting duplicate `event:` lines when the converter returns a full SSE string.
event: <eventType>before checking whetherconvertedResponseis a preformatted SSE string (Lines 1308-1377). If an integration ever returns both (non-empty eventType +"event: ...\ndata: ...\n\n"), the stream output becomes malformed.Proposed fix
```diff
-	if eventType != "" {
-		// OPENAI RESPONSES FORMAT: Use event: and data: lines for OpenAI responses API compatibility
-		if _, err := fmt.Fprintf(w, "event: %s\n", eventType); err != nil {
-			cancel() // Client disconnected (write error), cancel upstream stream
-			return
-		}
-	}
+	// If the converter returned a *full* SSE string (already contains event/data),
+	// don't prepend a second event line.
+	if s, ok := convertedResponse.(string); !ok || (!strings.HasPrefix(s, "event:") && !strings.HasPrefix(s, "data:")) {
+		if eventType != "" {
+			if _, err := fmt.Fprintf(w, "event: %s\n", eventType); err != nil {
+				cancel()
+				return
+			}
+		}
+	}
```

framework/streaming/images.go (1)
271-274: Remove empty if block.
This if block contains only a comment explaining that usage will be extracted elsewhere. Consider removing it to reduce code noise.
♻️ Proposed fix
```diff
-	// Extract usage if available
-	if result.ImageGenerationStreamResponse.Usage != nil {
-		// Note: ImageUsage doesn't directly map to BifrostLLMUsage, but we can store it
-		// The actual usage will be extracted in processAccumulatedImageStreamingChunks
-	}
+	// Note: ImageUsage extraction is handled in processAccumulatedImageStreamingChunks
```

framework/streaming/types.go (1)
121-128: Consider adding `MaxImageChunkIndex` for consistency with other stream types.
Other stream types (Chat, Responses, Transcription, Audio) have corresponding `Max*ChunkIndex` fields in `StreamAccumulator` for metadata extraction. The image stream uses a composite string key for de-duplication (`ImageChunksSeen map[string]struct{}`), which is appropriate for per-image scoping, but there's no `MaxImageChunkIndex` field.
If image streaming metadata (TokenUsage, Cost, FinishReason) needs to be extracted from the last chunk similar to other stream types, consider adding a tracking mechanism.
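A minimal sketch of such tracking, mirroring the existing `Max*ChunkIndex` fields (field name and placement are assumptions):

```go
// StreamAccumulator (sketch; unrelated fields elided).
type StreamAccumulator struct {
    ImageChunksSeen    map[string]struct{} // existing composite-key de-duplication
    MaxImageChunkIndex int                 // highest image chunk index observed so far
}

// trackImageChunk records the running maximum; call it once per accepted image chunk.
func (acc *StreamAccumulator) trackImageChunk(chunkIndex int) {
    if chunkIndex > acc.MaxImageChunkIndex {
        acc.MaxImageChunkIndex = chunkIndex
    }
}
```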
docs/openapi/schemas/inference/images.yaml (1)
101-177: Either use or remove `ImageGenerationResponseParameters`.
`ImageGenerationResponseParameters` is defined but not referenced by `ImageGenerationResponse`. If the intent is to mirror the Go embedding (`*ImageGenerationResponseParameters`), consider composing via `allOf`, otherwise drop the unused schema to avoid drift.

framework/streaming/accumulator.go (1)
293-325: Dedup key allocs + missing defensive checks (optional).
- `fmt.Sprintf("%d:%d", ...)` allocates per chunk; you could use a `[2]int` key (or a small struct) to reduce GC in high-throughput streams.
- Consider guarding `chunk == nil` and negative indices to avoid panics / map blowups from malformed provider output.

Proposed (lower-allocation) key change
```diff
-func imageChunkKey(imageIndex, chunkIndex int) string {
-	return fmt.Sprintf("%d:%d", imageIndex, chunkIndex)
-}
+type imageChunkKey struct {
+	imageIndex int
+	chunkIndex int
+}
@@
-	chunkKey := imageChunkKey(chunk.ImageIndex, chunk.ChunkIndex)
+	chunkKey := imageChunkKey{imageIndex: chunk.ImageIndex, chunkIndex: chunk.ChunkIndex}
 	if _, seen := acc.ImageChunksSeen[chunkKey]; !seen {
@@
-	ImageChunksSeen: make(map[string]struct{}),
+	ImageChunksSeen: make(map[imageChunkKey]struct{}),
```

core/providers/huggingface/images.go (1)
15-18: Add “missing prompt” validation (align with existing repo behavior).
You guard `bifrostReq == nil || bifrostReq.Input == nil`, but a blank prompt currently passes through. If core-level validation is bypassed (or this helper is used directly), we'll send invalid requests downstream. Based on learnings, image generation should reject missing prompts before dispatch.
Also applies to: 157-160
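A minimal sketch of the guard, reusing the request shape referenced above (the error construction is an assumption; substitute this package's usual error helper):

```go
// Reject requests with no usable prompt before building the provider payload.
if bifrostReq == nil || bifrostReq.Input == nil || strings.TrimSpace(bifrostReq.Input.Prompt) == "" {
    return nil, fmt.Errorf("image generation prompt is required")
}
```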
core/providers/huggingface/huggingface.go (1)
882-966: ImageGeneration: add “missing prompt” validation (align with repo expectation).
This method relies on `ToHuggingFaceImageGenerationRequest`, which currently doesn't reject blank prompts; consider rejecting empty/whitespace prompts here before dispatch (same rationale as other providers per learnings).

core/internal/testutil/test_retry_framework.go (1)
980-994: Consider renaming `DefaultImageGenerationRetryConfig()` to clarify it returns the generic `TestRetryConfig`.
The function name suggests it returns `ImageGenerationRetryConfig` (the type-specific config), but it actually returns the deprecated generic `TestRetryConfig`. This creates confusion when callers manually convert it to `ImageGenerationRetryConfig` before passing to `WithImageGenerationRetry()` (see image_generation.go:48-55).
Either rename to indicate it's the generic stream-oriented config (e.g., `DefaultImageGenerationTestRetryConfig()`), or add a helper that converts `TestRetryConfig` → `ImageGenerationRetryConfig` to follow the established pattern for domain-specific configs.
Also applies to line 1250-1251 where it's returned in the scenario router.
```go
func WithImageGenerationRetry(
    t *testing.T,
    config ImageGenerationRetryConfig,
    context TestRetryContext,
    expectations ResponseExpectations,
    scenarioName string,
    operation func() (*schemas.BifrostImageGenerationResponse, *schemas.BifrostError),
) (*schemas.BifrostImageGenerationResponse, *schemas.BifrostError) {

    var lastResponse *schemas.BifrostImageGenerationResponse
    var lastError *schemas.BifrostError

    for attempt := 1; attempt <= config.MaxAttempts; attempt++ {
        context.AttemptNumber = attempt

        // Execute the operation
        response, err := operation()
        lastResponse = response
        lastError = err

        // If we have a response, validate it FIRST
        if response != nil {
            validationResult := ValidateImageGenerationResponse(t, response, err, expectations, scenarioName)

            // If validation passes, we're done!
            if validationResult.Passed {
                return response, err
            }

            // Validation failed - ALWAYS retry validation failures for functionality checks
            // Network errors are handled by bifrost core, so these are content/functionality validation errors
            if attempt < config.MaxAttempts {
                // ALWAYS retry on timeout errors - this takes precedence over all other conditions
                if err != nil && isTimeoutError(err) {
                    retryReason := fmt.Sprintf("❌ timeout error detected: %s", GetErrorMessage(err))
                    if config.OnRetry != nil {
                        config.OnRetry(attempt, retryReason, t)
                    }

                    // Calculate delay with exponential backoff
                    delay := calculateRetryDelay(attempt-1, config.BaseDelay, config.MaxDelay)
                    time.Sleep(delay)
                    continue
                }

                // Check other retry conditions first (for logging/debugging)
                shouldRetryFromConditions, conditionReason := checkImageGenerationRetryConditions(response, err, context, config.Conditions)

                // ALWAYS retry on validation failures - this is the primary purpose of these tests
                // Content validation errors indicate functionality issues that should be retried
                shouldRetry := len(validationResult.Errors) > 0
                var retryReason string

                if shouldRetry {
                    // Validation failures are the primary retry reason - ALWAYS prefix with ❌
                    retryReason = fmt.Sprintf("❌ validation failure (content/functionality check): %s", strings.Join(validationResult.Errors, "; "))
                    // Append condition-based reason if present for additional context
                    if shouldRetryFromConditions && conditionReason != "" {
                        retryReason += fmt.Sprintf(" | also: %s", conditionReason)
                    }
                } else if shouldRetryFromConditions {
                    // Fallback to condition-based retry if no validation errors (edge case)
                    // Ensure ❌ prefix for consistency with error logging
                    shouldRetry = true
                    if !strings.Contains(conditionReason, "❌") {
                        retryReason = fmt.Sprintf("❌ %s", conditionReason)
                    } else {
                        retryReason = conditionReason
                    }
                }

                if shouldRetry {
                    if config.OnRetry != nil {
                        config.OnRetry(attempt, retryReason, t)
                    }

                    // Calculate delay with exponential backoff
                    delay := calculateRetryDelay(attempt-1, config.BaseDelay, config.MaxDelay)
                    time.Sleep(delay)
                    continue
                }
            }

            // All retries failed validation - create a BifrostError to force test failure
            validationErrors := strings.Join(validationResult.Errors, "; ")

            if config.OnFinalFail != nil {
                finalErr := fmt.Errorf("❌ validation failed after %d attempts: %s", attempt, validationErrors)
                config.OnFinalFail(attempt, finalErr, t)
            }

            // Return nil response + BifrostError so calling test fails
            statusCode := 400
            testFailureError := &schemas.BifrostError{
                IsBifrostError: true,
                StatusCode:     &statusCode,
                Error: &schemas.ErrorField{
                    Message: fmt.Sprintf("❌ Validation failed after %d attempts: %s", attempt, validationErrors),
                },
            }
            return nil, testFailureError
        }

        // If we have an error without a response, check if we should retry
        if err != nil && attempt < config.MaxAttempts {
            // ALWAYS retry on timeout errors - this takes precedence over other conditions
            if isTimeoutError(err) {
                retryReason := fmt.Sprintf("❌ timeout error detected: %s", GetErrorMessage(err))
                if config.OnRetry != nil {
                    config.OnRetry(attempt, retryReason, t)
                }

                // Calculate delay with exponential backoff
                delay := calculateRetryDelay(attempt-1, config.BaseDelay, config.MaxDelay)
                time.Sleep(delay)
                continue
            }

            shouldRetry, retryReason := checkImageGenerationRetryConditions(response, err, context, config.Conditions)

            // ALWAYS retry on non-structural errors (network errors are handled by bifrost core)
            // If no condition matches, still retry on any error as it's likely transient
            if !shouldRetry {
                shouldRetry = true
                errorMsg := GetErrorMessage(err)
                if !strings.Contains(errorMsg, "❌") {
                    errorMsg = fmt.Sprintf("❌ %s", errorMsg)
                }
                retryReason = fmt.Sprintf("❌ non-structural error (will retry): %s", errorMsg)
            } else if !strings.Contains(retryReason, "❌") {
                retryReason = fmt.Sprintf("❌ %s", retryReason)
            }

            if shouldRetry {
                if config.OnRetry != nil {
                    config.OnRetry(attempt, retryReason, t)
                }

                // Calculate delay with exponential backoff
                delay := calculateRetryDelay(attempt-1, config.BaseDelay, config.MaxDelay)
                time.Sleep(delay)
                continue
            }
        }

        // If we get here, either we got a final error or no more retries
        break
    }

    // Final failure callback
    if config.OnFinalFail != nil && lastError != nil {
        errorMsg := "unknown error"
        if lastError.Error != nil {
            errorMsg = lastError.Error.Message
        }
        // Ensure error message has ❌ prefix if not already present
        if !strings.Contains(errorMsg, "❌") {
            errorMsg = fmt.Sprintf("❌ %s", errorMsg)
        }
        config.OnFinalFail(config.MaxAttempts, fmt.Errorf("❌ final error: %s", errorMsg), t)
    }

    return lastResponse, lastError
}
```
Potential false-pass: (nil response, nil err) currently ends retries with success-like return.
If operation() ever returns (nil, nil), this function breaks out and returns (nil, nil) without calling OnFinalFail—tests could incorrectly pass depending on the caller. Recommend treating response == nil as retryable failure (similar to how stream wrappers handle responseChannel == nil).
Minimal defensive fix
```diff
 			// Execute the operation
 			response, err := operation()
 			lastResponse = response
 			lastError = err
+			// Treat nil response as a failure (even if err is nil) to avoid false-passing tests
+			if response == nil {
+				if attempt < config.MaxAttempts {
+					retryReason := "❌ image generation response is nil"
+					if config.OnRetry != nil {
+						config.OnRetry(attempt, retryReason, t)
+					}
+					delay := calculateRetryDelay(attempt-1, config.BaseDelay, config.MaxDelay)
+					time.Sleep(delay)
+					continue
+				}
+				statusCode := 400
+				return nil, &schemas.BifrostError{
+					IsBifrostError: true,
+					StatusCode:     &statusCode,
+					Error: &schemas.ErrorField{
+						Message: fmt.Sprintf("❌ image generation response is nil after %d attempts", attempt),
+					},
+				}
+			}
-
-			// If we have a response, validate it FIRST
-			if response != nil {
+			// If we have a response, validate it FIRST
+			if response != nil {
```
```go
		var parsed int64
		parsed, parseErr = strconv.ParseInt(v, 10, 64)
		if parseErr == nil {
			minImages = int(parsed)
```
Check failure (Code scanning / CodeQL): Incorrect conversion between integer types (High, test)
Copilot Autofix
In general, to fix this class of problem you must ensure that the bit size used when parsing matches the size of the target type, or you must explicitly check that the parsed value is within the target type’s bounds before converting. Here, the target type is int (potentially 32‑bit), but we parse as 64‑bit and then cast without checks in the json.Number and string cases.
The best fix with minimal functional change is to validate that the parsed int64 lies within the representable range of int on all platforms before doing int(parsed). math.MaxInt/math.MinInt are only available on Go 1.17+, and Go defines int to be either 32 or 64 bits, so we can compute the appropriate bounds from the platform's int size ourselves. A simple, portable pattern is:
```go
const (
	maxInt = int(^uint(0) >> 1)
	minInt = -maxInt - 1
)
```

We can define these constants once in this file and then, before assigning minImages = int(parsed), check that parsed is between minInt and maxInt. If it is out of range, we set parseErr to a descriptive error and avoid performing the narrowing conversion. This needs to be applied to all code paths where an int64 from user/config parsing is converted to int; in this snippet, that is the json.Number case (1131) and the string case (1137). No new imports are required.
Concretely:
- Add two `const` declarations near the top of `core/internal/testutil/response_validation.go` to define `maxInt` and `minInt`.
- Replace `minImages = int(parsed)` in the `json.Number` case with a bounds check against `[minInt, maxInt]`, setting `parseErr` if out of range.
- Do the same in the `string` case.
- Leave the rest of the logic unchanged; if `parseErr` is set, the existing code already logs a warning and skips the `min_images` check.
```diff
@@ -11,6 +11,11 @@
 	"github.com/maximhq/bifrost/core/schemas"
 )
 
+const (
+	maxInt = int(^uint(0) >> 1)
+	minInt = -maxInt - 1
+)
+
 // =============================================================================
 // RESPONSE VALIDATION FRAMEWORK
 // =============================================================================
@@ -1128,13 +1133,21 @@
 			var parsed int64
 			parsed, parseErr = v.Int64()
 			if parseErr == nil {
-				minImages = int(parsed)
+				if parsed < int64(minInt) || parsed > int64(maxInt) {
+					parseErr = fmt.Errorf("min_images value %d out of range for int", parsed)
+				} else {
+					minImages = int(parsed)
+				}
 			}
 		case string:
 			var parsed int64
 			parsed, parseErr = strconv.ParseInt(v, 10, 64)
 			if parseErr == nil {
-				minImages = int(parsed)
+				if parsed < int64(minInt) || parsed > int64(maxInt) {
+					parseErr = fmt.Errorf("min_images value %d out of range for int", parsed)
+				} else {
+					minImages = int(parsed)
+				}
 			}
 		default:
 			parseErr = fmt.Errorf("unsupported type for min_images: %T", v)
```
addressed and resolved.
Actionable comments posted: 10
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (7)
plugins/maxim/main.go (2)
125-138: Add `ImageGenerationOutput` assignment to `AccumulatedData` struct initialization.
The `ImageGenerationOutput` field is checked to determine the stream type but is not being copied to the `AccumulatedData` struct. Both `AccumulatedData` and `StreamAccumulatorResult` have this field, and it should be assigned alongside `AudioOutput` and `TranscriptionOutput` for consistency: `ImageGenerationOutput: accResult.ImageGenerationOutput,`
This should be added to the struct initialization at lines 125-138.
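A sketch of the fix in context (the surrounding assignment is illustrative; only the last field is the requested addition):

```go
accumulated := AccumulatedData{
    AudioOutput:           accResult.AudioOutput,
    TranscriptionOutput:   accResult.TranscriptionOutput,
    ImageGenerationOutput: accResult.ImageGenerationOutput, // previously missing
}
```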
548-567: Add image generation request type handling to the result logging switch.ImageGenerationRequest and ImageGenerationStreamRequest types are not handled in this switch statement. Since these types are supported across the codebase (e.g., in mocker and logging plugins) and providers like OpenAI and Nebius support image generation, Maxim should log image generation results consistently. Add:
```go
case schemas.ImageGenerationRequest, schemas.ImageGenerationStreamRequest:
	if streamResponse != nil {
		logger.AddResultToGeneration(generationID, streamResponse.ToBifrostResponse().ImageGenerationResponse)
	} else {
		logger.AddResultToGeneration(generationID, result.ImageGenerationResponse)
	}
```

core/schemas/responses.go (1)
540-575: Error messages reference "image generation call output" but there's no corresponding field or handling logic.
The error messages at lines 556 and 574 mention "image generation call output" as a valid type, but:
- The `ResponsesToolMessageOutputStruct` struct (lines 540-544) doesn't include an image generation output field
- `MarshalJSON` only handles 3 types: string, array blocks, and computer tool call output
- `UnmarshalJSON` only attempts to unmarshal those same 3 types

This creates a misleading error message. Either:
- Remove "image generation call output" from the error messages if it's not a valid type for this struct, or
- Add the missing field and handling logic if image generation outputs should be supported here
🔧 Option A: Remove misleading text from error messages
- return nil, fmt.Errorf("responses tool message output struct is neither a string nor an array of responses message content blocks nor a computer tool call output data nor an image generation call output") + return nil, fmt.Errorf("responses tool message output struct is neither a string nor an array of responses message content blocks nor a computer tool call output data")- return fmt.Errorf("responses tool message output struct is neither a string nor an array of responses message content blocks nor a computer tool call output data nor an image generation call output") + return fmt.Errorf("responses tool message output struct is neither a string nor an array of responses message content blocks nor a computer tool call output data")framework/modelcatalog/main.go (1)
223-230: Addschemas.ImageGenerationRequestto the pricing lookup modes.
GetPricingEntryForModelat lines 223-230 uses a hardcoded list ofRequestTypemodes that excludesschemas.ImageGenerationRequest. Since image generation models are supported in this codebase (Gemini image, Imagen, DALL-E, etc.) and pricing.go explicitly handles image generation pricing, omittingImageGenerationRequestfrom the mode loop prevents these models' pricing from being retrieved.Add
schemas.ImageGenerationRequestto the mode list to ensure image generation models are discoverable viaGetPricingEntryForModel.core/providers/gemini/responses.go (2)
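A sketch of the change (the placeholder comment stands in for whatever request types the hardcoded list already contains):

```go
modes := []schemas.RequestType{
    // ...existing request types already iterated by GetPricingEntryForModel...
    schemas.ImageGenerationRequest, // without this, Imagen/DALL-E pricing lookups come back empty
}
```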
459-465: Fix thought-signature extraction for streamoutput_item.added(bad field check + missing base64 decode).Current code checks
bifrostResp.Item.EncryptedContent(likely non-existent) and then casts the encrypted base64 string to[]byte, which corrupts the signature and may not compile depending on theResponsesMessagestruct shape.Proposed fix
```diff
 case schemas.ResponsesStreamResponseTypeOutputItemAdded:
-	if bifrostResp.Item != nil && bifrostResp.Item.ResponsesReasoning != nil && bifrostResp.Item.EncryptedContent != nil {
-		candidate.Content.Parts = append(candidate.Content.Parts, &Part{
-			ThoughtSignature: []byte(*bifrostResp.Item.ResponsesReasoning.EncryptedContent),
-		})
-	}
+	if bifrostResp.Item != nil && bifrostResp.Item.ResponsesReasoning != nil &&
+		bifrostResp.Item.ResponsesReasoning.EncryptedContent != nil {
+		decodedSig, err := base64.StdEncoding.DecodeString(*bifrostResp.Item.ResponsesReasoning.EncryptedContent)
+		if err == nil {
+			candidate.Content.Parts = append(candidate.Content.Parts, &Part{
+				ThoughtSignature: decodedSig,
+			})
+		}
+	}
```
1609-1618: Apply consistent prefix guard for image data URLs, matching file handling.The image handling at lines 1609-1618 should include the same prefix check that protects file handling at lines 1649-1652. Currently, if
blob.Dataunexpectedly contains adata:prefix, the image code will produce malformed URLs likedata:image/...;base64,data:image/.... The file path already prevents this with a guard—apply the same pattern to images.Proposed fix
```diff
 if isImageMimeType(mimeType) {
-	// Convert to base64 data URL
-	imageURL := fmt.Sprintf("data:%s;base64,%s", mimeType, blob.Data)
+	imageDataURL := blob.Data
+	if !strings.HasPrefix(imageDataURL, "data:") {
+		imageDataURL = fmt.Sprintf("data:%s;base64,%s", mimeType, imageDataURL)
+	}
 	return &schemas.ResponsesMessageContentBlock{
 		Type: schemas.ResponsesInputMessageContentBlockTypeImage,
 		ResponsesInputMessageContentBlockImage: &schemas.ResponsesInputMessageContentBlockImage{
-			ImageURL: &imageURL,
+			ImageURL: &imageDataURL,
 		},
 	}
 }
```

Also applies to: 1649-1652
plugins/semanticcache/utils.go (1)
380-412: Avoid duplicating large image payloads in semantic cache metadata (base64 can explode storage)
`metadata["response"]` already contains the full response (potentially with `b64_json`), and you also add `metadata["image_b64"]`. Consider storing only URLs / revised prompts, or store hashes/lengths for base64 instead of the full strings.
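One way to keep cache entries small, sketched with the standard library only (`crypto/sha256`, `encoding/hex`, `strconv`; the key names and `b64JSON` variable are assumptions, not the plugin's exact schema):

```go
// Store a fingerprint and size of the base64 payload instead of the payload itself.
sum := sha256.Sum256([]byte(b64JSON))
metadata["image_b64_sha256"] = hex.EncodeToString(sum[:])
metadata["image_b64_bytes"] = strconv.Itoa(len(b64JSON))
```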
🤖 Fix all issues with AI agents
In `@core/internal/testutil/response_validation.go`:
- Around line 1097-1157: The min_images parsing in validateImageGenerationFields
silently records parse errors without failing and performs unchecked narrowing
to int; update the switch in validateImageGenerationFields to (1) validate
numeric ranges when converting int64/float64/json.Number to int by checking
against int64(intMin)/int64(intMax) (for float64 ensure it is an integer value
via math.Trunc and within bounds), (2) treat any parse/overflow error as a hard
validation failure by setting result.Passed = false and appending a clear error
message (instead of only recording a warning), and (3) keep recording
result.MetricsCollected["image_count"] and comparing against minImages only when
conversion succeeded; reference the ProviderSpecific["min_images"] handling and
result.Errors/result.Passed modifications in your patch.
In `@core/providers/azure/azure.go`:
- Around line 1257-1303: The streaming ImageGenerationStream call currently
passes nil converters to openai.HandleOpenAIImageGenerationStreaming so each
streamed BifrostImageGenerationResponse chunk never gets the deployment
metadata; update the openai.HandleOpenAIImageGenerationStreaming invocation in
ImageGenerationStream to supply a postResponseConverter that sets
resp.ExtraFields.ModelDeployment = deployment when resp != nil (i.e., a small
function that mutates and returns the response) so every streamed chunk is
stamped with the ModelDeployment value just like the sync path does.
In `@core/providers/gemini/gemini.go`:
- Around line 1614-1671: Update the incorrect comment in
handleImagenImageGeneration to say it uses the Generative Language API
(x-goog-api-key / /models/{model}:predict), not "Vertex AI endpoint with API key
auth", and add a nil-check after calling
GeminiImagenResponse.ToBifrostImageGenerationResponse() in
handleImagenImageGeneration (or wherever GeminiImagenResponse is converted) to
handle the case where ToBifrostImageGenerationResponse() returns nil before you
dereference response.ExtraFields; if nil, return a BifrostError with ExtraFields
populated (Provider, ModelRequested, RequestType) or a suitable error. Ensure
references to ToBifrostImageGenerationResponse, GeminiImagenResponse, and
handleImagenImageGeneration are used to locate the changes.
In `@core/providers/huggingface/huggingface_test.go`:
- Around line 34-36: The ImageGenerationModel value is using a duplicated
"fal-ai" prefix; update the test data to remove the extra segment so it matches
other fal-ai entries and the URL-building logic in utils.go (which already
prepends "/fal-ai/"); specifically change the ImageGenerationModel in
huggingface_test.go from "fal-ai/fal-ai/flux/dev" to either "fal-ai/flux/dev" or
just "flux/dev" as appropriate, and make the same fix in
core/internal/testutil/account.go where "fal-ai/fal-ai/flux-2" appears so both
locations are consistent with utils.go's model path handling.
In `@core/providers/huggingface/images.go`:
- Around line 329-349: The switch branch handling falAI currently only reads the
legacy flattened falResponse.Images; update it to handle both formats by
checking HuggingFaceFalAIImageGenerationResponse.Data.Images first and using
that if non-empty, otherwise fall back to falResponse.Images (or vice-versa per
types.go), then map whichever slice is present into schemas.ImageData
(preserving URL, B64JSON, Index) before returning the
schemas.BifrostImageGenerationResponse; alternatively, if you choose not to
implement fallback, add a clear comment in the falAI branch stating only the
flattened Images format is expected and why.
In `@core/providers/huggingface/models.go`:
- Around line 67-69: The text-to-image pipeline handling currently only
registers non-streaming requests; update the switch case handling
"text-to-image" to call addMethods with both schemas.ImageGenerationRequest and
schemas.ImageGenerationStreamRequest (so streaming image-generation is
supported), and extend the tag-based capability detection loop (the code that
checks tags for embeddings/chat/speech/transcription) to detect image-generation
tags such as "text-to-image", "image-generation", "image-generation.*" (or
similar patterns) and mark the model accordingly so tag detection matches the
pipeline case.
In `@tests/integrations/python/config.yml`:
- Around line 170-171: The huggingface image_generation model path has a
duplicated namespace; update the value of the image_generation key (currently
"fal-ai/fal-ai/flux/dev") to the correct model ID "fal-ai/flux/dev" so it
follows the namespace/model/variant format used by FAL.
♻️ Duplicate comments (19)
docs/openapi/schemas/inference/images.yaml (1)
195-203: Consider addingn_imagesfield toImageTokenDetails.Per the Go implementation,
OutputTokensDetails.NImagestracks the number of generated images. Adding this field would align the schema with the actual response shape.♻️ Suggested addition
```diff
   ImageTokenDetails:
     type: object
     properties:
       image_tokens:
         type: integer
         description: Tokens used for images
       text_tokens:
         type: integer
         description: Tokens used for text
+      n_images:
+        type: integer
+        description: Number of images generated
```

core/providers/openai/openai_test.go (1)
43-83: All test scenarios remain disabled - blocking for merge.This was previously flagged: all scenarios except
ImageGenerationandImageGenerationStreamare set tofalse. This effectively disables comprehensive OpenAI provider testing.Before merging to
main, please re-enable the appropriate test scenarios to maintain full test coverage. Compare with the HuggingFace test configuration which properly keeps other scenarios enabled alongside the new image generation tests.core/providers/vertex/vertex.go (2)
1493-1559: API-key authentication is not supported for Vertex AI predict endpoints.The code still uses API-key-in-query authentication for Imagen (lines 1514-1516), Gemini (lines 1523-1525), and fine-tuned models (lines 1503-1505). Per Vertex AI documentation, these endpoints require OAuth2 Bearer tokens—API keys will fail with "API keys are not supported by this API" errors.
The conditional at lines 1545-1558 skips OAuth2 token fetching when
authQueryis set, leaving no fallback.
1420-1431: Add nil-request guard to prevent panic onrequest.Modeldereference.Line 1427 dereferences
request.Modelbefore any nil check. While upstream validation may exist, defensive programming within the method prevents panics if called incorrectly.Based on learnings, the pattern used in Nebius (and OpenAI) should be applied here.
Proposed fix
func (provider *VertexProvider) ImageGeneration(ctx *schemas.BifrostContext, key schemas.Key, request *schemas.BifrostImageGenerationRequest) (*schemas.BifrostImageGenerationResponse, *schemas.BifrostError) { providerName := provider.GetProviderKey() + if request == nil { + return nil, providerUtils.NewBifrostOperationError("image generation request is nil", nil, providerName) + } + + if request.Input == nil || strings.TrimSpace(request.Input.Prompt) == "" { + return nil, providerUtils.NewBifrostOperationError("prompt cannot be empty", nil, providerName) + } + if key.VertexKeyConfig == nil { return nil, providerUtils.NewConfigurationError("vertex key config is not set", providerName) }docs/providers/supported-providers/nebius.mdx (1)
153-156: Changeguidance_scaletype frominttofloatto match Nebius API specification.The Nebius API specifies
guidance_scaleas a numeric float (range 0–100), but the documentation showsint. This was flagged in a previous review and requires updates in both documentation and implementation.core/providers/azure/azure.go (1)
1215-1225: Guard nil/empty requests and usegetModelDeployment(prevents panics).
This is the same issue already called out in prior review comments:request.Modelis read without a nil check, and deployment lookup bypassesgetModelDeployment.Also applies to: 1257-1273
framework/modelcatalog/pricing.go (1)
276-319: Tiering fallback still missing whenimageUsage.TotalTokensis zero but other token counts exist.When providers omit
total_tokens,imageTotalTokensremains 0 (Line 279), causingisAbove128k/isAbove200k(Lines 322-324) to always be false. This under-tiers requests that should use higher-tier pricing.The per-image path (Lines 282-319) correctly short-circuits when all tokens are zero, but the token-based path (Lines 321+) will mis-tier if
TotalTokensis zero whileInputTokens/OutputTokensare populated.Proposed fix to compute fallback total
// Use imageUsage.TotalTokens for tier determination imageTotalTokens := imageUsage.TotalTokens + if imageTotalTokens == 0 { + // Compute fallback from available token counts + if imageUsage.InputTokensDetails != nil { + imageTotalTokens += imageUsage.InputTokensDetails.ImageTokens + imageUsage.InputTokensDetails.TextTokens + } else { + imageTotalTokens += imageUsage.InputTokens + } + if imageUsage.OutputTokensDetails != nil { + imageTotalTokens += imageUsage.OutputTokensDetails.ImageTokens + imageUsage.OutputTokensDetails.TextTokens + } else { + imageTotalTokens += imageUsage.OutputTokens + } + } // Check if tokens are zero/nil - if so, use per-image pricing - if imageTotalTokens == 0 && imageUsage.InputTokens == 0 && imageUsage.OutputTokens == 0 { + if imageTotalTokens == 0 {framework/streaming/images.go (1)
45-114: Multi-chunk path still dropsURLand uses syntheticCreatedtimestamp.As noted in a prior review, the multi-chunk reconstruction path:
- Sets
Created: time.Now().Unix()(Line 108) instead of preserving provider timestamps from chunks- Does not track or preserve
URLfields (Lines 84-94 only setB64JSON,Index,RevisedPrompt)If a provider streams URL-only images or includes the timestamp only in certain chunks, this data is lost.
Proposed fix
// Reconstruct complete images from chunks images := make(map[int]*strings.Builder) + imageURLs := make(map[int]string) var model string var revisedPrompts map[int]string = make(map[int]string) + var createdAt int64 for _, chunk := range chunks { if chunk.Delta == nil { continue } // Extract metadata if model == "" && chunk.Delta.ExtraFields.ModelRequested != "" { model = chunk.Delta.ExtraFields.ModelRequested } + // Track created timestamp (first non-zero wins) + if createdAt == 0 && chunk.Delta.CreatedAt != 0 { + createdAt = chunk.Delta.CreatedAt + } // Store revised prompt if present (usually in first chunk) if chunk.Delta.RevisedPrompt != "" { revisedPrompts[chunk.ImageIndex] = chunk.Delta.RevisedPrompt } + // Track URL if present (last non-empty wins) + if chunk.Delta.URL != "" { + imageURLs[chunk.ImageIndex] = chunk.Delta.URL + } // Reconstruct base64 for each image // ... } // In ImageData construction: imageData = append(imageData, schemas.ImageData{ B64JSON: builder.String(), + URL: imageURLs[imageIndex], Index: imageIndex, RevisedPrompt: revisedPrompts[imageIndex], }) // In finalResponse: finalResponse := &schemas.BifrostImageGenerationResponse{ ID: responseID, - Created: time.Now().Unix(), + Created: func() int64 { if createdAt != 0 { return createdAt }; return time.Now().Unix() }(), Model: model, Data: imageData, }docs/openapi/paths/integrations/openai/images.yaml (1)
62-66: Addrequired: trueforapi-versionparameter.Azure OpenAI image generation endpoints require the
api-versionquery parameter. The endpoint pattern per Microsoft docs isPOST https://<resource>.openai.azure.com/openai/deployments/<deployment>/images/generations?api-version=<api_version>.🔧 Suggested fix
- name: api-version in: query + required: true schema: type: string description: Azure API versioncore/providers/openai/openai.go (1)
2665-2879: Fix image stream termination: handle[DONE]and don’t exit on the first completed image
Today the loop (a) treatsdata: [DONE]as JSON (warn-loop), and (b)returns on the firstimage_generation.completed, truncatingn>1.Proposed fix (handle [DONE], track expected images, only end once all are complete)
@@ - lastChunkTime := startTime + lastChunkTime := startTime + expectedImages := 1 + if request != nil && request.Params != nil && request.Params.N != nil && *request.Params.N > 0 { + expectedImages = *request.Params.N + } + completedImages := make(map[int]struct{}, expectedImages) @@ - jsonData := strings.TrimSpace(strings.TrimPrefix(line, "data:")) + jsonData := strings.TrimSpace(strings.TrimPrefix(line, "data:")) if jsonData == "" { continue } + if jsonData == "[DONE]" { + break + } @@ - // Determine if this is the final chunk + // Determine if this is a completed chunk (OpenAI emits one "completed" per image) isCompleted := response.Type == ImageGenerationCompleted @@ - // Only set PartialImageIndex for partial images, not for completed events - if !isCompleted { - chunk.PartialImageIndex = response.PartialImageIndex - } + // Preserve provider-supplied index for both partial and completed chunks + chunk.PartialImageIndex = response.PartialImageIndex @@ - if isCompleted { + if isCompleted { + completedImages[imageIndex] = struct{}{} if collectedUsage != nil { @@ - ctx.SetValue(schemas.BifrostContextKeyStreamEndIndicator, true) } @@ - if isCompleted { - return - } + if isCompleted && len(completedImages) >= expectedImages { + ctx.SetValue(schemas.BifrostContextKeyStreamEndIndicator, true) + return + } }transports/bifrost-http/integrations/router.go (1)
1279-1295: Streaming can still panic: converters are invoked without nil checks (incl. image generation).
config.StreamConfig.*ResponseConverter(...)is called directly (Line 1281-1291). If an integration forgets to set one converter, this will nil-deref at runtime.Proposed fix (pattern for each case)
switch { case chunk.BifrostTextCompletionResponse != nil: + if config.StreamConfig.TextStreamResponseConverter == nil { + err = fmt.Errorf("TextStreamResponseConverter not configured") + break + } eventType, convertedResponse, err = config.StreamConfig.TextStreamResponseConverter(bifrostCtx, chunk.BifrostTextCompletionResponse) case chunk.BifrostChatResponse != nil: + if config.StreamConfig.ChatStreamResponseConverter == nil { + err = fmt.Errorf("ChatStreamResponseConverter not configured") + break + } eventType, convertedResponse, err = config.StreamConfig.ChatStreamResponseConverter(bifrostCtx, chunk.BifrostChatResponse) @@ case chunk.BifrostImageGenerationStreamResponse != nil: + if config.StreamConfig.ImageGenerationStreamResponseConverter == nil { + err = fmt.Errorf("ImageGenerationStreamResponseConverter not configured") + break + } eventType, convertedResponse, err = config.StreamConfig.ImageGenerationStreamResponseConverter(bifrostCtx, chunk.BifrostImageGenerationStreamResponse)core/providers/gemini/images.go (3)
79-92: Potential nil-pointer panic when iteratingcontent.Parts(guardpart == nil).If
content.Partsis[]*Part(typical in this codebase),part.Text(Line 83) can panic.Proposed fix
for _, content := range request.Contents { for _, part := range content.Parts { - if part.Text != "" { + if part == nil { + continue + } + if part.Text != "" { bifrostReq.Input.Prompt = part.Text break } }
97-128: ImagenimageSizeparsing is case-sensitive and missing "4k".
convertImagenFormatToSize()only matches"1k"/"2k"(Line 102-107). If callers send"1K"/"2K"/"4K", this will produce wrong sizes.Proposed fix
func convertImagenFormatToSize(imageSize *string, aspectRatio *string) string { // Default size based on imageSize parameter baseSize := 1024 if imageSize != nil { - switch *imageSize { - case "2k": + switch strings.ToLower(strings.TrimSpace(*imageSize)) { + case "4k": + baseSize = 4096 + case "2k": baseSize = 2048 - case "1k": + case "1k": baseSize = 1024 } }
130-188: Nil receiver + “empty Data but success” risk inToBifrostImageGenerationResponse().
- If
responseis nil,response.ResponseIDwill panic (Line 131-134).- If there’s no
InlineData, you returnbifrostRespwithData: []and no error (Line 169-175), which is a silent failure mode.Proposed fix
func (response *GenerateContentResponse) ToBifrostImageGenerationResponse() (*schemas.BifrostImageGenerationResponse, *schemas.BifrostError) { + if response == nil { + return nil, &schemas.BifrostError{ + IsBifrostError: false, + Error: &schemas.ErrorField{Message: "nil Gemini response"}, + } + } bifrostResp := &schemas.BifrostImageGenerationResponse{ ID: response.ResponseID, Model: response.ModelVersion, Data: []schemas.ImageData{}, } @@ // Only assign imageData when it has elements - if len(imageData) > 0 { - bifrostResp.Data = imageData - // Only set ImageGenerationResponseParameters when metadata exists - if len(imageMetadata) > 0 { - bifrostResp.ImageGenerationResponseParameters = &imageMetadata[0] - } - } + if len(imageData) == 0 { + return nil, &schemas.BifrostError{ + IsBifrostError: false, + Error: &schemas.ErrorField{Message: "no image data found in Gemini response"}, + } + } + bifrostResp.Data = imageData + if len(imageMetadata) > 0 { + bifrostResp.ImageGenerationResponseParameters = &imageMetadata[0] + }core/internal/testutil/test_retry_framework.go (3)
980-995: Type mismatch:DefaultImageGenerationRetryConfig()returnsTestRetryConfig, but image-gen retry usesImageGenerationRetryConfigconditions.
This is easy to accidentally “wire wrong” (default conditions never evaluated), and it also diverges from thecore/internal/testutilconvention of building typed configs fromGetTestRetryConfigForScenario()and leavingConditionsempty. Based on learnings, this should be consistent acrosscore/internal/testutil.Suggested verification (confirm how image-gen tests build
ImageGenerationRetryConfigand whether conditions are ever applied):#!/bin/bash set -euo pipefail rg -n --type=go '\bDefaultImageGenerationRetryConfig\s*\(' -S rg -n --type=go '\bWithImageGenerationRetry\s*\(' -S rg -n --type=go '\bImageGenerationRetryConfig\s*\{' -S rg -n --type=go '\bGetTestRetryConfigForScenario\s*\(' core/internal/testutil -SIf the intent is “typed configs derive timings from scenario config, conditions empty”, then consider changing
DefaultImageGenerationRetryConfig()to return the same baseline timings but no conditions, and ensure the typed config is constructed in the test helper accordingly.Also applies to: 1250-1252
2097-2260: Avoid false-pass:(nil response, nil err)currently exits retries and returns success-like(nil, nil).
Ifoperation()ever returns(nil, nil), this breaks out withoutOnFinalFail, and callers may treat it as success.Proposed fix
@@ // Execute the operation response, err := operation() lastResponse = response lastError = err + // Treat nil response as a failure (even if err is nil) to avoid false-passing tests + if response == nil { + if attempt < config.MaxAttempts { + retryReason := "❌ image generation response is nil" + if config.OnRetry != nil { + config.OnRetry(attempt, retryReason, t) + } + delay := calculateRetryDelay(attempt-1, config.BaseDelay, config.MaxDelay) + time.Sleep(delay) + continue + } + statusCode := 400 + return nil, &schemas.BifrostError{ + IsBifrostError: true, + StatusCode: &statusCode, + Error: &schemas.ErrorField{ + Message: fmt.Sprintf("❌ image generation response is nil after %d attempts", attempt), + }, + } + } - - // If we have a response, validate it FIRST - if response != nil { + // If we have a response, validate it FIRST + if response != nil { validationResult := ValidateImageGenerationResponse(t, response, err, expectations, scenarioName)
2947-2955: Comment/name mismatch: “WithImageGenerationStreamValidationRetry” vsWithImageGenerationStreamRetry.
Either rename the function for consistency with other*StreamValidationRetryhelpers, or fix the comment header.tests/integrations/python/tests/test_google.py (1)
1691-1786: Fix cross-provider key gating + Ruff (ARG002/BLE001) in image-generation tests (avoid skipping real failures).
These tests are parametrized across providers, but@skip_if_no_api_key("google")only guards the Google key;get_provider_google_client(provider)can still raise for missing provider keys. Alsoexcept Exceptionwill swallowAssertionErrorand violate Ruff BLE001.Proposed diff
@@ `@skip_if_no_api_key`("google") `@pytest.mark.parametrize`("provider,model", get_cross_provider_params_for_scenario("image_generation")) - def test_41a_image_generation_simple(self, test_config, provider, model): + def test_41a_image_generation_simple(self, _test_config, provider, model): @@ - # Get provider-specific client - client = get_provider_google_client(provider) + # Get provider-specific client (skip cleanly if provider key is missing) + try: + client = get_provider_google_client(provider) + except ValueError: + pytest.skip(f"No API key available for provider {provider}") @@ `@skip_if_no_api_key`("google") `@pytest.mark.parametrize`("provider,model", get_cross_provider_params_for_scenario("imagen")) - def test_41b_imagen_predict(self, test_config, provider, model): + def test_41b_imagen_predict(self, _test_config, provider, model): @@ - # Get provider-specific client - client = get_provider_google_client(provider) + # Get provider-specific client (skip cleanly if provider key is missing) + try: + client = get_provider_google_client(provider) + except ValueError: + pytest.skip(f"No API key available for provider {provider}") @@ - try: - response = client.models.generate_content( + try: + response = client.models.generate_content( model=format_provider_model(provider, model), contents=IMAGE_GENERATION_SIMPLE_PROMPT, config=types.GenerateContentConfig() ) @@ - except Exception as e: + except Exception as e: # TODO: narrow to google-genai specific exceptions once confirmed # Imagen may not be available in all regions or configurations pytest.skip(f"Imagen generation failed: {e}") @@ `@skip_if_no_api_key`("google") `@pytest.mark.parametrize`("provider,model", get_cross_provider_params_for_scenario("image_generation")) - def test_41c_image_generation_with_text(self, test_config, provider, model): + def test_41c_image_generation_with_text(self, _test_config, provider, model): @@ - # Get provider-specific client - client = get_provider_google_client(provider) + # Get provider-specific client (skip cleanly if provider key is missing) + try: + client = get_provider_google_client(provider) + except ValueError: + pytest.skip(f"No API key available for provider {provider}") @@ - for candidate in response.candidates: - if hasattr(candidate, "content") and candidate.content: - if hasattr(candidate.content, "parts") and candidate.content.parts: - for part in candidate.content.parts: - if hasattr(part, "text") and part.text: - found_text = True - if hasattr(part, "inline_data") and part.inline_data: - found_image = True + for candidate in (getattr(response, "candidates", None) or []): + content = getattr(candidate, "content", None) + for part in (getattr(content, "parts", None) or []): + if getattr(part, "text", None): + found_text = True + if getattr(part, "inline_data", None): + found_image = TrueTo fully resolve BLE001 without risking false skips, please confirm which exception types
google.genairaises for unavailable Imagen / region restrictions, and replace theExceptioncatch accordingly.google genai python sdk (google-genai) what exception types are raised for API errors / invalid requests?tests/integrations/python/tests/test_openai.py (1)
1166-1258: Tighten cross-provider capability gating for image-gen params + fix Ruff ARG002.
As written,n=2,quality="low", andsize="1024x1536"can be unsupported for some provider/model combos; alsotest_configis unused.Proposed diff
@@ `@pytest.mark.parametrize`( "provider,model,vk_enabled", get_cross_provider_params_with_vk_for_scenario("image_generation") ) - def test_52a_image_generation_simple(self, test_config, provider, model, vk_enabled): + def test_52a_image_generation_simple(self, _test_config, provider, model, vk_enabled): @@ `@pytest.mark.parametrize`( "provider,model,vk_enabled", get_cross_provider_params_with_vk_for_scenario("image_generation") ) - def test_52b_image_generation_multiple(self, test_config, provider, model, vk_enabled): + def test_52b_image_generation_multiple(self, _test_config, provider, model, vk_enabled): @@ - if model == "gemini-2.5-flash-image": + if provider == "gemini" and model == "gemini-2.5-flash-image": pytest.skip("Gemini 2.5 flash image does not support multiple images") @@ - response = client.images.generate( - model=format_provider_model(provider, model), - prompt=IMAGE_GENERATION_SIMPLE_PROMPT, - n=2, - size="1024x1024", - ) + try: + response = client.images.generate( + model=format_provider_model(provider, model), + prompt=IMAGE_GENERATION_SIMPLE_PROMPT, + n=2, + size="1024x1024", + ) + except Exception as e: + pytest.skip(f"{provider}/{model} does not support n=2 image generation: {e}") @@ `@pytest.mark.parametrize`( "provider,model,vk_enabled", get_cross_provider_params_with_vk_for_scenario("image_generation") ) - def test_52c_image_generation_quality(self, test_config, provider, model, vk_enabled): + def test_52c_image_generation_quality(self, _test_config, provider, model, vk_enabled): @@ - response = client.images.generate( - model=format_provider_model(provider, model), - prompt=IMAGE_GENERATION_SIMPLE_PROMPT, - n=1, - size="1024x1024", - quality="low", # gpt-image-1 supports quality parameter - ) + try: + response = client.images.generate( + model=format_provider_model(provider, model), + prompt=IMAGE_GENERATION_SIMPLE_PROMPT, + n=1, + size="1024x1024", + quality="low", # intended for gpt-image-1 + ) + except Exception as e: + pytest.skip(f"{provider}/{model} does not support quality param: {e}") @@ `@pytest.mark.parametrize`( "provider,model,vk_enabled", get_cross_provider_params_with_vk_for_scenario("image_generation") ) - def test_52d_image_generation_different_sizes(self, test_config, provider, model, vk_enabled): + def test_52d_image_generation_different_sizes(self, _test_config, provider, model, vk_enabled): @@ - response = client.images.generate( - model=format_provider_model(provider, model), - prompt=IMAGE_GENERATION_SIMPLE_PROMPT, - n=1, - size="1024x1536", # Portrait orientation - ) + try: + response = client.images.generate( + model=format_provider_model(provider, model), + prompt=IMAGE_GENERATION_SIMPLE_PROMPT, + n=1, + size="1024x1536", # Portrait orientation + ) + except Exception as e: + pytest.skip(f"{provider}/{model} does not support size=1024x1536: {e}")Based on learnings, please also verify the stack updates the integration config so
providers.openai.image_generationis set togpt-image-1(and ideally document that in the test config to avoid DALLE-3 constraints leaking back in).
🧹 Nitpick comments (28)
core/schemas/utils.go (1)
1183-1186: Inconsistent case handling compared to peer functions.This function uses
strings.ToLower()for case-insensitive matching, while peer functions likeIsNovaModel,IsGeminiModel,IsMistralModel, andIsAnthropicModeldo not apply case normalization. Consider aligning with the existing pattern unless there's a specific reason Imagen model identifiers may vary in case.Suggested change for consistency
// IsImagenModel checks if the model is an Imagen model. func IsImagenModel(model string) bool { - return strings.Contains(strings.ToLower(model), "imagen") + return strings.Contains(model, "imagen") }If case-insensitivity is intentional (e.g., provider APIs return mixed-case model names), please disregard this suggestion and consider adding a brief comment explaining the rationale.
ui/components/chat/ImageMessage.tsx (2)
40-53: Move pure helper function outside component scope.
getMimeTypeis a pure function with no dependencies on component state or props. Defining it inside the component causes it to be recreated on every render.♻️ Suggested refactor
+// Convert output_format to MIME type for data URLs +const getMimeType = (format?: string): string => { + switch (format?.toLowerCase()) { + case 'png': + return 'image/png'; + case 'jpeg': + case 'jpg': + return 'image/jpeg'; + case 'webp': + return 'image/webp'; + default: + return 'image/png'; + } +}; + export const ImageMessage: React.FC<ImageMessageProps> = ({ image, isStreaming, streamProgress, }) => { // Streaming placeholder... // No usable image data... - // Convert output_format to MIME type for data URLs - const getMimeType = (format?: string): string => { - switch (format?.toLowerCase()) { - case 'png': - return 'image/png'; - case 'jpeg': - case 'jpg': - return 'image/jpeg'; - case 'webp': - return 'image/webp'; - default: - // Default to PNG for backward compatibility - return 'image/png'; - } - }; - const dataUrl = image.url
63-68: Consider enhanced alt text and image error handling.Two suggestions for improved UX and accessibility:
Alt text:
ImageMessageDataincludesrevised_promptwhich often better describes AI-generated images. Consider using it whenpromptis unavailable.Error handling: No
onErrorhandler means broken URLs or invalid base64 will show the browser's default broken image icon without user feedback.♻️ Suggested improvements
+ const altText = image.prompt || image.revised_prompt || `Generated image ${(image.index ?? 0) + 1}`; + return ( <div className="my-4"> <Card className="p-0"> <div className="border border-border overflow-auto"> <img src={dataUrl} - alt={image.prompt || `image-${image.index ?? 0}`} + alt={altText} className="w-auto h-auto" loading="lazy" + onError={(e) => { + e.currentTarget.style.display = 'none'; + }} /> </div> </Card> </div> );docs/quickstart/gateway/multimodal.mdx (1)
49-98: LGTM — Image generation section is well-documented.The new Image Generation section follows the established documentation patterns in this file, with a clear curl example and comprehensive response format including usage metrics and extra_fields.
Optional nit: Lines 97-98 introduce a double blank line before the next section, while other sections use a single blank line. Consider removing one for consistency.
Optional formatting fix
} }
Audio Understanding: Analyzing Audio with AI
</details> </blockquote></details> <details> <summary>tests/integrations/python/config.yml (1)</summary><blockquote> `456-461`: **Consider adding explicit `false` values for unsupported scenarios.** While the minimal configuration works (undefined scenarios likely default to false), other providers explicitly list all scenarios. For consistency and clarity, consider explicitly defining unsupported scenarios as `false`. This is a minor consistency suggestion - the current configuration is functional. </blockquote></details> <details> <summary>docs/providers/supported-providers/groq.mdx (1)</summary><blockquote> `25-25`: **Image Generation entry missing from Unsupported Features table.** The Image Generation row was added to the Supported Operations table (line 25), but unlike the Cerebras and Parasail documentation, it's not added to the Unsupported Features table (lines 126-135). For consistency across provider documentation, consider adding an entry. <details> <summary>📝 Suggested addition to Unsupported Features table</summary> Add after line 131: ```diff | Embedding | Not offered by Groq API | +| Image Generation | Not offered by Groq API | | Speech/TTS | Not offered by Groq API |core/providers/nebius/errors.go (1)
22-51: Consider preservingDetail.Messagewhen validation errors are also present.When both
Detail.MessageandValidationErrorsexist, the current logic overwrites the originalDetail.Message(line 24) with the aggregated validation error messages (line 41). This may lose valuable context from the original error message.Consider combining both messages instead:
💡 Suggested change to preserve both messages
if nebiusErr.Detail != nil { + var originalMessage string if nebiusErr.Detail.Message != nil { - message = *nebiusErr.Detail.Message + originalMessage = *nebiusErr.Detail.Message } if len(nebiusErr.Detail.ValidationErrors) > 0 { var messages []string var locations []string for _, detail := range nebiusErr.Detail.ValidationErrors { if detail.Msg != "" { messages = append(messages, detail.Msg) } if len(detail.Loc) > 0 { locations = append(locations, strings.Join(detail.Loc, ".")) } } if len(messages) > 0 { message = strings.Join(messages, "; ") } if len(locations) > 0 { locationStr := strings.Join(locations, ", ") if message == "" { message = "[" + locationStr + "]" } else { message = message + " [" + locationStr + "]" } } + // Prepend original message if present + if originalMessage != "" && message != "" { + message = originalMessage + ": " + message + } else if originalMessage != "" { + message = originalMessage + } + } else { + message = originalMessage } }core/providers/openai/types.go (1)
546-546: Define constants for `ImageGenerationEventType` values.

`ImageGenerationEventType` is declared as a type alias but no constants are defined in this file. Per learnings, the expected values are `"image_generation.partial_image"`, `"image_generation.completed"`, and `"error"`. Consider defining these as constants for type safety and discoverability.

♻️ Suggested constants
```go
const (
	ImageGenerationEventTypePartialImage ImageGenerationEventType = "image_generation.partial_image"
	ImageGenerationEventTypeCompleted    ImageGenerationEventType = "image_generation.completed"
	ImageGenerationEventTypeError        ImageGenerationEventType = "error"
)
```

docs/openapi/schemas/inference/images.yaml (2)
80-86: Add "jpg" as an accepted value for `output_format`.

The schema only lists `png`, `webp`, and `jpeg`, but providers may normalize `jpg` ↔ `jpeg`. For API completeness and to avoid client confusion, consider adding `jpg` to the enum.

♻️ Suggested change
```diff
       output_format:
         type: string
         enum:
           - "png"
           - "webp"
           - "jpeg"
+          - "jpg"
         description: Output image format
```

Also applies to: 122-128, 252-258
166-177: Remove orphaned `ImageGenerationResponseParameters` schema.

The schema is not referenced anywhere in the codebase and duplicates properties already defined inline in `ImageGenerationResponse` (lines 119–143). Removing it will eliminate schema bloat without impact.

docs/providers/supported-providers/openai.mdx (1)
218-273: Comprehensive documentation for Image Generation API.

The documentation thoroughly covers:
- Request parameters with types and notes
- Bifrost-to-OpenAI conversion details
- Streaming response handling with SSE event types
- Per-chunk metadata and tracking
One minor style suggestion from static analysis: Lines 244-246 have three consecutive bullet points starting with the same pattern. Consider varying the sentence structure for readability.
📝 Optional: Vary sentence beginnings
```diff
-**Request Conversion (Bifrost → OpenAI)**
-
-OpenAI is the baseline schema for image generation. Parameters are passed through with minimal conversion:
-
-- **Model & Prompt**: `bifrostReq.Model` → `req.Model`, `bifrostReq.Input.Prompt` → `req.Prompt`
-- **Parameters**: All fields from `bifrostReq.Params` (`ImageGenerationParameters`) are embedded directly into the OpenAI request struct via struct embedding. No field mapping or transformation is performed.
-- **Streaming**: When streaming is requested, `stream: true` is set in the request body.
+**Request Conversion (Bifrost → OpenAI)**
+
+OpenAI is the baseline schema for image generation. Parameters are passed through with minimal conversion:
+
+- **Model & Prompt**: `bifrostReq.Model` → `req.Model`, `bifrostReq.Input.Prompt` → `req.Prompt`
+- **Parameters**: All fields from `bifrostReq.Params` (`ImageGenerationParameters`) are embedded directly into the OpenAI request struct via struct embedding. No field mapping or transformation is performed.
+- For **Streaming** requests, `stream: true` is set in the request body.
```

core/internal/testutil/validation_presets.go (1)
222-237: Consider removing the inline comment on line 222.

The comment `// In validation_presets.go - add this function` appears to be a leftover from development notes. It's redundant since the function is already in the file.

Proposed fix

```diff
-// In validation_presets.go - add this function
+// ImageGenerationExpectations returns validation expectations for image generation scenarios
 func ImageGenerationExpectations(minImages int, expectedSize string) ResponseExpectations {
```

docs/providers/supported-providers/azure.mdx (1)
429-429: Minor formatting: Add newline before section separator.

Line 429 has the `---` separator immediately after the "Streaming" section content without a blank line, which differs from the formatting used elsewhere in this file.

📝 Suggested fix

```diff
 Image generation streaming is supported and uses OpenAI's streaming format with Server-Sent Events (SSE).
+
 ---
```

framework/logstore/migrations.go (1)
684-748: Consider consolidating migrations for related columns.

The two migrations (`migrationAddImageGenerationOutputColumn` and `migrationAddImageGenerationInputColumn`) follow the established pattern and are correct. However, since both columns are part of the same feature (image generation support), they could be consolidated into a single migration function, similar to how `migrationAddCostAndCacheDebugColumn` handles multiple related columns.

That said, keeping them separate provides finer-grained rollback control. This is a minor suggestion and can be deferred.

♻️ Optional: Consolidated migration approach

```go
func migrationAddImageGenerationColumns(ctx context.Context, db *gorm.DB) error {
	opts := *migrator.DefaultOptions
	opts.UseTransaction = true
	m := migrator.New(db, &opts, []*migrator.Migration{{
		ID: "logs_add_image_generation_columns",
		Migrate: func(tx *gorm.DB) error {
			tx = tx.WithContext(ctx)
			migrator := tx.Migrator()
			if !migrator.HasColumn(&Log{}, "image_generation_input") {
				if err := migrator.AddColumn(&Log{}, "image_generation_input"); err != nil {
					return err
				}
			}
			if !migrator.HasColumn(&Log{}, "image_generation_output") {
				if err := migrator.AddColumn(&Log{}, "image_generation_output"); err != nil {
					return err
				}
			}
			return nil
		},
		Rollback: func(tx *gorm.DB) error {
			tx = tx.WithContext(ctx)
			migrator := tx.Migrator()
			if migrator.HasColumn(&Log{}, "image_generation_input") {
				if err := migrator.DropColumn(&Log{}, "image_generation_input"); err != nil {
					return err
				}
			}
			if migrator.HasColumn(&Log{}, "image_generation_output") {
				if err := migrator.DropColumn(&Log{}, "image_generation_output"); err != nil {
					return err
				}
			}
			return nil
		},
	}})
	err := m.Migrate()
	if err != nil {
		return fmt.Errorf("error while adding image generation columns: %s", err.Error())
	}
	return nil
}
```

core/providers/nebius/types.go (2)
10-24: Avoid emitting `null` for required request fields (`model`, `prompt`).

Since these are pointers without `omitempty`, a nil value serializes as `null` (easy to hit if a converter returns nil). Consider making them non-pointer `string`, or ensure converters/dispatch validate non-nil/non-empty before marshaling (same pattern used elsewhere for image requests).
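A minimal sketch of the validation route, with an illustrative mirror of the struct and a hypothetical helper name (`validateNebiusImageRequest`); this is not the provider's actual code:

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Illustrative mirror of the request shape under discussion; pointers without
// omitempty marshal as null when left nil.
type NebiusImageGenerationRequest struct {
	Model  *string `json:"model"`
	Prompt *string `json:"prompt"`
}

// validateNebiusImageRequest rejects requests whose required fields would
// otherwise serialize as JSON null.
func validateNebiusImageRequest(req *NebiusImageGenerationRequest) error {
	if req == nil {
		return errors.New("nebius image generation request is nil")
	}
	if req.Model == nil || strings.TrimSpace(*req.Model) == "" {
		return errors.New("model is required")
	}
	if req.Prompt == nil || strings.TrimSpace(*req.Prompt) == "" {
		return errors.New("prompt is required")
	}
	return nil
}

func main() {
	fmt.Println(validateNebiusImageRequest(&NebiusImageGenerationRequest{})) // model is required
}
```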
55-75: `NebiusErrorDetail.UnmarshalJSON`: consider supporting object-shaped errors too (not just string/array).

Today, an object payload (e.g. `{"message":"..."}`) will fail to parse.

Possible hardening

```diff
 func (d *NebiusErrorDetail) UnmarshalJSON(data []byte) error {
 	// First, try to unmarshal as an array (validation errors)
 	trimmed := strings.TrimSpace(string(data))
 	if len(trimmed) > 0 && trimmed[0] == '[' {
@@
 	}

 	// If not an array, try to unmarshal as a string
 	var message string
 	if err := sonic.Unmarshal(data, &message); err != nil {
-		return err
+		// Fallback: object payloads like {"message":"..."}
+		var obj struct {
+			Message *string `json:"message"`
+			Detail  *string `json:"detail"`
+			Error   *string `json:"error"`
+		}
+		if err2 := sonic.Unmarshal(data, &obj); err2 != nil {
+			return err
+		}
+		if obj.Message != nil {
+			d.Message = obj.Message
+			return nil
+		}
+		if obj.Detail != nil {
+			d.Message = obj.Detail
+			return nil
+		}
+		if obj.Error != nil {
+			d.Message = obj.Error
+			return nil
+		}
+		// Unknown object shape; keep original error
+		return err
 	}
 	d.Message = &message
 	return nil
 }
```

core/providers/bedrock/bedrock.go (1)
1347-1355: Unsupported image ops: good to return `provider.GetProviderKey()`; consider aligning older unsupported ops too.

Right now `Speech`/`SpeechStream`/... pass `schemas.Bedrock` while image uses the computed provider key (helps custom providers).

core/providers/gemini/responses.go (1)
1873-1900: Deduplicate inline-data-to-content-block logic (and apply the same data-URL prefix guard).

`convertGeminiCandidatesToResponsesOutput` re-implements inline data handling that overlaps with `convertGeminiInlineDataToContentBlock`, and currently hardcodes `data:<mime>;base64,<data>` without guarding for pre-prefixed data URLs (same risk as above). A small shared guard like the sketch below keeps both paths consistent.
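A minimal sketch of that shared guard, assuming the inline payload is a base64 string with a known MIME type; the helper name `toDataURL` is illustrative:

```go
package main

import (
	"fmt"
	"strings"
)

// toDataURL prefixes raw base64 image data with a data-URL header, but leaves
// values that already carry a "data:" prefix untouched — the same guard the
// file handler applies, reused for inline image data.
func toDataURL(mimeType, b64 string) string {
	if strings.HasPrefix(b64, "data:") {
		return b64
	}
	return fmt.Sprintf("data:%s;base64,%s", mimeType, b64)
}

func main() {
	fmt.Println(toDataURL("image/png", "iVBORw0KGgo..."))
	fmt.Println(toDataURL("image/png", "data:image/png;base64,iVBORw0KGgo...")) // returned unchanged
}
```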
170-204: Align stream retry config construction with non-stream pattern in this directory.

The non-stream test explicitly constructs a typed `ImageGenerationRetryConfig` and sets `Conditions: []ImageGenerationRetryCondition{}`, while the stream test passes `retryConfig` (generic `TestRetryConfig` from the helper) directly to `WithImageGenerationStreamRetry`. Since the stream function expects `TestRetryConfig` but the non-stream expects `ImageGenerationRetryConfig`, consider either:

- Having the stream test construct a typed `ImageGenerationRetryConfig` to match the non-stream pattern (see the sketch after this list)
- Creating a dedicated `ImageGenerationStreamRetryConfig` type for consistency

This aligns retry config construction across similar test patterns in the directory.
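A sketch of the first option — copying the timing fields from the generic config into the typed one — using illustrative stand-ins for the testutil types rather than the harness's real definitions:

```go
package main

import (
	"fmt"
	"time"
)

// Illustrative stand-ins for the testutil types referenced above.
type ImageGenerationRetryCondition interface{ Reason() string }

type TestRetryConfig struct {
	MaxAttempts int
	BaseDelay   time.Duration
	MaxDelay    time.Duration
}

type ImageGenerationRetryConfig struct {
	MaxAttempts int
	BaseDelay   time.Duration
	MaxDelay    time.Duration
	Conditions  []ImageGenerationRetryCondition
}

// toImageGenerationRetryConfig copies the timing fields and keeps Conditions as
// an explicit empty slice, matching the non-stream test's local convention.
func toImageGenerationRetryConfig(generic TestRetryConfig) ImageGenerationRetryConfig {
	return ImageGenerationRetryConfig{
		MaxAttempts: generic.MaxAttempts,
		BaseDelay:   generic.BaseDelay,
		MaxDelay:    generic.MaxDelay,
		Conditions:  []ImageGenerationRetryCondition{},
	}
}

func main() {
	fmt.Printf("%+v\n", toImageGenerationRetryConfig(TestRetryConfig{MaxAttempts: 10, BaseDelay: 2 * time.Second, MaxDelay: 10 * time.Second}))
}
```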
framework/modelcatalog/pricing.go (1)
379-419: Consider consolidating repetitive image token rate logic.

The three tier branches (Lines 379-392, 393-406, 407-419) are identical since tiered image token pricing fields don't exist yet. This can be simplified to a single block.

Suggested simplification

```diff
-	// Determine image-specific token rates, with tiered pricing support
-	// Check for image token pricing fields and fall back to generic rates if not available
-	if isAbove200k {
-		// Prefer tiered image token pricing above 200k, fall back to base image token rate, then generic rate
-		// Note: InputCostPerImageTokenAbove200kTokens and OutputCostPerImageTokenAbove200kTokens
-		// may not exist in TableModelPricing yet, so we check base image token rate as fallback
-		if pricing.InputCostPerImageToken != nil {
-			inputImageTokenRate = *pricing.InputCostPerImageToken
-		} else {
-			inputImageTokenRate = inputTokenRate
-		}
-		if pricing.OutputCostPerImageToken != nil {
-			outputImageTokenRate = *pricing.OutputCostPerImageToken
-		} else {
-			outputImageTokenRate = outputTokenRate
-		}
-	} else if isAbove128k {
-		// ... same logic ...
-	} else {
-		// ... same logic ...
-	}
+	// Determine image-specific token rates
+	// TODO: Add tiered image token pricing when fields are added to TableModelPricing
+	if pricing.InputCostPerImageToken != nil {
+		inputImageTokenRate = *pricing.InputCostPerImageToken
+	} else {
+		inputImageTokenRate = inputTokenRate
+	}
+	if pricing.OutputCostPerImageToken != nil {
+		outputImageTokenRate = *pricing.OutputCostPerImageToken
+	} else {
+		outputImageTokenRate = outputTokenRate
+	}
```

core/bifrost.go (1)
1008-1059: Minor grammar nit in doc comments.

Lines 1008 and 1061 use "a image" which should be "an image" for correct grammar.

📝 Suggested fix

```diff
-// ImageGenerationRequest sends a image generation request to the specified provider.
+// ImageGenerationRequest sends an image generation request to the specified provider.
```

```diff
-// ImageGenerationStreamRequest sends a image generation stream request to the specified provider.
+// ImageGenerationStreamRequest sends an image generation stream request to the specified provider.
```

docs/openapi/paths/integrations/openai/images.yaml (1)
46-86: Consider adding streaming documentation to Azure endpoint.

The Azure endpoint description is minimal compared to the OpenAI endpoint. Since the response section includes `text/event-stream`, consider adding a note about SSE streaming support similar to the main endpoint.

core/schemas/images.go (2)
81-102: Streaming schema likely needs an externally-visible per-image identifier

Right now `Index` is not serialized (`json:"-"`). If you intend to support `n>1` image streaming for clients, consider emitting either:

- `index` (int) on the stream response, or
- always populate `partial_image_index` on completed chunks too (so it's present on every chunk).

A sketch of the first option follows this list.
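A sketch of the tag-only change that makes the per-image index visible on the wire; the fields shown are a partial, illustrative fragment, not the full schema:

```go
package schemas // illustrative fragment, not the actual file

// BifrostImageGenerationStreamResponse sketch with the per-image index serialized.
type BifrostImageGenerationStreamResponse struct {
	// Before: Index int `json:"-"`
	Index             int    `json:"index"`
	PartialImageIndex *int   `json:"partial_image_index,omitempty"`
	B64JSON           string `json:"b64_json,omitempty"`
	// ...remaining fields unchanged...
}
```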
22-38: Include ExtraParams in direct cache hash for image generation requests

`ExtraParams` is `json:"-"` and thus excluded from the request hash used for direct cache matching. Since providers use ExtraParams to modify output (verified in OpenAI, Gemini, Hugging Face, and other provider implementations), requests with identical parameters but different ExtraParams would incorrectly cache hit. For direct-only cache mode, this creates silent correctness issues. While ExtraParams appears in metadata for semantic caching, the hash-based direct cache should explicitly include ExtraParams to ensure cache key uniqueness.
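A sketch of folding ExtraParams into the direct-cache key; the `imageHashInput` shape and helper are illustrative stand-ins for the plugin's hashing code, and `encoding/json` is used here because it sorts map keys, keeping the hash deterministic:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// imageHashInput is an illustrative stand-in for the plugin's hash input,
// extended with ExtraParams so requests that differ only in extra params
// cannot collide in the direct cache.
type imageHashInput struct {
	Prompt      string                 `json:"prompt"`
	Params      interface{}            `json:"params,omitempty"`
	ExtraParams map[string]interface{} `json:"extra_params,omitempty"`
}

func hashImageRequest(in imageHashInput) (string, error) {
	// encoding/json sorts map keys, so equal inputs always hash identically.
	payload, err := json.Marshal(in)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(payload)
	return hex.EncodeToString(sum[:]), nil
}

func main() {
	h, _ := hashImageRequest(imageHashInput{
		Prompt:      "A serene Japanese garden with cherry blossoms",
		ExtraParams: map[string]interface{}{"guidance_scale": 7},
	})
	fmt.Println(h)
}
```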
47-48: Clarify fallback source of truth for image generation (generic `Fallbacks` vs `ImageGenerationFallbacks`).

`ComprehensiveTestConfig.Fallbacks` is commented as being used "for chat, responses, image and reasoning tests" (Line 78), but you also added `ImageGenerationFallbacks` (Line 85). This is ambiguous and can easily lead to the harness using the wrong list.

Proposed fix

```diff
-	Fallbacks                []schemas.Fallback // for chat, responses, image and reasoning tests
+	Fallbacks                []schemas.Fallback // for chat, responses, and reasoning tests
 	TextCompletionFallbacks  []schemas.Fallback // for text completion tests
 	TranscriptionFallbacks   []schemas.Fallback // for transcription tests
 	SpeechSynthesisFallbacks []schemas.Fallback // for speech synthesis tests
 	EmbeddingFallbacks       []schemas.Fallback // for embedding tests
 	SkipReason               string             // Reason to skip certain tests
 	ImageGenerationModel     string             // Model for image generation
 	ImageGenerationFallbacks []schemas.Fallback // Fallbacks for image generation
```

Also applies to: 84-90
framework/streaming/accumulator.go (1)
293-325: Potential hot-path allocs: image de-dup key uses `fmt.Sprintf("%d:%d", ...)`.

If image streaming is high-volume, consider a non-allocating key (e.g., `uint64(imageIndex)<<32 | uint64(uint32(chunkIndex))`) with `map[uint64]struct{}`.
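A tiny sketch of the packed-key idea; names are illustrative:

```go
package main

import "fmt"

// dedupKey packs (imageIndex, chunkIndex) into a single uint64, avoiding the
// fmt.Sprintf allocation on the streaming hot path.
func dedupKey(imageIndex, chunkIndex int) uint64 {
	return uint64(uint32(imageIndex))<<32 | uint64(uint32(chunkIndex))
}

func main() {
	seen := make(map[uint64]struct{})
	for _, pair := range [][2]int{{0, 3}, {0, 3}, {1, 3}} {
		seen[dedupKey(pair[0], pair[1])] = struct{}{}
	}
	fmt.Println(len(seen)) // 2: the duplicate (0,3) chunk is collapsed
}
```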
120-135: Size parsing is tolerant (silent failure) — verify this is intentional.

Unlike the Nebius case which returns errors for invalid size format, the fal-ai case silently skips setting `ImageSize` when parsing fails. This means malformed size strings like `"invalid"` or `"100"` won't produce an error. If this is intentional (relying on fal-ai API validation), consider adding a brief comment explaining the design choice for maintainability.
41-69: Avoid params aliasing + mirror prompt validation in reverse conversion.

`Params: &request.ImageGenerationParameters` couples the returned Bifrost request to the OpenAI request struct; also `Prompt` can be empty and still produce a non-nil Bifrost request.

Proposed diff

```diff
 func (request *OpenAIImageGenerationRequest) ToBifrostImageGenerationRequest() *schemas.BifrostImageGenerationRequest {
 	if request == nil {
 		return nil
 	}
+	if request.Prompt == "" {
+		return nil
+	}
@@
 	var params *schemas.ImageGenerationParameters
 	if request.N != nil ||
 		request.Background != nil ||
 		request.Moderation != nil ||
@@
-		params = &request.ImageGenerationParameters
+		paramsCopy := request.ImageGenerationParameters
+		params = &paramsCopy
 	}
```
```go
if result.ImageGenerationStreamResponse.Index >= 0 {
	chunk.ImageIndex = result.ImageGenerationStreamResponse.Index
} else {
	chunk.ImageIndex = result.ImageGenerationStreamResponse.SequenceNumber
}
```
🧩 Analysis chain
🏁 Script executed:

```sh
# First, let's examine the images.go file structure around the flagged lines
cd framework/streaming && head -n 300 images.go | tail -n 100
```

Length of output: 4545

🏁 Script executed:

```sh
# Check the ImageGenerationStreamResponse type definition
rg "type.*ImageGenerationStreamResponse" --type go -A 15
```

Length of output: 3495

🏁 Script executed:

```sh
# Look for how Index is typically set in providers
rg "\.Index\s*=" --type go -B 2 -A 2 | head -40
```

Length of output: 2411

🏁 Script executed:

```sh
# Check the git history of the images.go file to see what changed in b136412
git log --oneline framework/streaming/images.go | head -20
```

Length of output: 155

🏁 Script executed:

```sh
# Look at the actual commit to see what was changed
git show b136412 --stat
```

Length of output: 248
Fix the unreachable fallback logic for ImageIndex assignment.

The condition `Index >= 0` (Line 259) will always evaluate to true because `Index` is declared as `int` (not `*int`), making its zero value indistinguishable from an explicitly set 0. This renders the fallback to `SequenceNumber` unreachable.

The previous review marked this as addressed in commit b136412, but that commit does not exist in the repository. The issue persists: you need to either:

- Change `Index` to `*int` (pointer) so `nil` indicates "not set"
- Use a sentinel value (e.g., -1) to represent "unset"
- Add an explicit flag to track whether `Index` was explicitly set

Without fixing this, the `SequenceNumber` fallback will never execute, potentially causing image indexing failures for providers that don't explicitly set the `Index` field. A sketch of the sentinel option follows.
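A minimal sketch of the sentinel-value option, using illustrative stand-in types rather than the actual streaming structs:

```go
package main

import "fmt"

// imageIndexUnset marks "provider did not set Index" so the SequenceNumber
// fallback becomes reachable.
const imageIndexUnset = -1

type imageStreamDelta struct {
	Index          int // initialize to imageIndexUnset when the provider omits it
	SequenceNumber int
}

type imageChunk struct {
	ImageIndex int
}

func assignImageIndex(chunk *imageChunk, delta imageStreamDelta) {
	if delta.Index != imageIndexUnset {
		chunk.ImageIndex = delta.Index
		return
	}
	chunk.ImageIndex = delta.SequenceNumber
}

func main() {
	var c imageChunk
	assignImageIndex(&c, imageStreamDelta{Index: imageIndexUnset, SequenceNumber: 2})
	fmt.Println(c.ImageIndex) // 2 — the fallback now actually runs
}
```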
…nto feat/image-gen
8c8fcde to fb98017
Actionable comments posted: 9
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (8)
docs/providers/supported-providers/anthropic.mdx (1)
26-33: Update the Note to include Image Generation in the list of unsupported operations.

The new Image Generation row correctly shows it as unsupported (❌), but the Note on line 32 still only mentions "Embeddings, Speech, and Transcriptions" as unsupported operations. For consistency, Image Generation should be added to this list.

📝 Proposed fix

```diff
 <Note>
-**Unsupported Operations** (❌): Embeddings, Speech, and Transcriptions are not supported by the upstream Anthropic API. These return `UnsupportedOperationError`.
+**Unsupported Operations** (❌): Embeddings, Speech, Transcriptions, and Image Generation are not supported by the upstream Anthropic API. These return `UnsupportedOperationError`.
 </Note>
```

docs/providers/supported-providers/ollama.mdx (1)
25-32: Update the Note to include Image Generation in the unsupported operations list.

The table correctly shows Image Generation as unsupported (❌), but the Note on line 32 lists "Speech, Transcriptions, Files, and Batch" without mentioning Image Generation. For consistency, update the Note to include Image Generation.

📝 Suggested fix

```diff
 <Note>
-**Unsupported Operations** (❌): Speech, Transcriptions, Files, and Batch are not supported by the upstream Ollama API. These return `UnsupportedOperationError`.
+**Unsupported Operations** (❌): Image Generation, Speech, Transcriptions, Files, and Batch are not supported by the upstream Ollama API. These return `UnsupportedOperationError`.
```

core/internal/testutil/account.go (1)
217-253: Azure image-generation model/deployment mismatch will likely break tests.Azure has a deployment mapping for
"gpt-image-1", but the configuredImageGenerationModelis"gpt-image-1-mini". If Azure routing requires the deployment map (common), image-gen requests will fail to resolve the deployment.Proposed diff (pick one)
@@ Deployments: map[string]string{ @@ - "gpt-image-1": "gpt-image-1", + "gpt-image-1": "gpt-image-1", + "gpt-image-1-mini": "gpt-image-1-mini", @@ },or change the Azure test config to use the mapped model:
- ImageGenerationModel: "gpt-image-1-mini", + ImageGenerationModel: "gpt-image-1",Also applies to: 825-866
plugins/semanticcache/utils.go (2)
88-116: Fix possible nil deref for ImageGenerationRequest in hashing/input extraction.
BothgenerateRequestHashandgetInputForCachingassumereq.ImageGenerationRequestis non-nil for image request types. A malformed request (or a future refactor) would panic here; other cases (e.g., Speech) already guard.Proposed fix (nil-safe access)
func (plugin *Plugin) generateRequestHash(req *schemas.BifrostRequest) (string, error) { @@ switch req.RequestType { @@ case schemas.TranscriptionRequest, schemas.TranscriptionStreamRequest: hashInput.Params = req.TranscriptionRequest.Params case schemas.ImageGenerationRequest, schemas.ImageGenerationStreamRequest: - hashInput.Params = req.ImageGenerationRequest.Params + if req.ImageGenerationRequest != nil { + hashInput.Params = req.ImageGenerationRequest.Params + } } @@ } func (plugin *Plugin) getInputForCaching(req *schemas.BifrostRequest) interface{} { switch req.RequestType { @@ case schemas.ImageGenerationRequest, schemas.ImageGenerationStreamRequest: - return req.ImageGenerationRequest.Input + if req.ImageGenerationRequest == nil { + return nil + } + return req.ImageGenerationRequest.Input default: return nil } }Also applies to: 477-516
11-16: Don't store full base64 image data in metadata—vector DB limits will be exceeded.
addSingleResponsestores the entire marshaled response in metadata, and additionally stores the fullimage_b64arrays. For image generation, base64 data can easily be MBs per image, which will exceed vector DB metadata limits and consume unnecessary memory/network resources.Mitigation: For image-gen responses, avoid storing
image_b64in metadata; store onlyimage_urlsand/or persist base64 in an external blob store with references.Suggested fix (remove b64 from metadata)
// image specific metadata if res.ImageGenerationResponse != nil { var imageURLs []string - var imageB64 []string var revisedPrompts []string for _, img := range res.ImageGenerationResponse.Data { if img.URL != "" { imageURLs = append(imageURLs, img.URL) } - if img.B64JSON != "" { - imageB64 = append(imageB64, img.B64JSON) - } if img.RevisedPrompt != "" { revisedPrompts = append(revisedPrompts, img.RevisedPrompt) } } metadata["image_urls"] = imageURLs - metadata["image_b64"] = imageB64 metadata["revised_prompts"] = revisedPrompts }core/providers/huggingface/types.go (1)
32-52: Avoid “successful” unmarshal on non-model payloads (error objects, etc.).Right now, the object fallback (
var obj struct{ Models []HuggingFaceModel }) will “succeed” even when the JSON is{ "error": ... }, resulting inModels=nilwithout an error (Line 46-49). Prefer explicitly checking that themodelskey exists before accepting the object form.Proposed fix
func (r *HuggingFaceListModelsResponse) UnmarshalJSON(data []byte) error { // Try unmarshaling as an array first (most common for /api/models) var arr []HuggingFaceModel if err := sonic.Unmarshal(data, &arr); err == nil { r.Models = arr return nil } - // Fallback: try object with a `models` field - var obj struct { - Models []HuggingFaceModel `json:"models"` - } - if err := sonic.Unmarshal(data, &obj); err == nil { - r.Models = obj.Models - return nil - } + // Fallback: object with a `models` field (only accept if key exists) + var raw map[string]json.RawMessage + if err := sonic.Unmarshal(data, &raw); err == nil { + if modelsRaw, ok := raw["models"]; ok { + var models []HuggingFaceModel + if err := sonic.Unmarshal(modelsRaw, &models); err != nil { + return fmt.Errorf("failed to unmarshal HuggingFaceListModelsResponse.models: %w", err) + } + r.Models = models + return nil + } + } return fmt.Errorf("failed to unmarshal HuggingFaceListModelsResponse: unexpected JSON structure") }framework/streaming/types.go (1)
345-393: Regression: StreamTypeTranscription no longer sets RawResponse/CacheDebug.
StreamTypeChat/Responses/AudiosetExtraFields.RawResponseandExtraFields.CacheDebug(when present), butStreamTypeTranscriptionnow stops after settingRawRequest(Line 357-360). This looks like an accidental omission when inserting the newStreamTypeImagecase, and breaks parity/debuggability for transcription streams.Proposed fix
case StreamTypeTranscription: transcriptionResp := p.Data.TranscriptionOutput if transcriptionResp == nil { transcriptionResp = &schemas.BifrostTranscriptionResponse{} } resp.TranscriptionResponse = transcriptionResp resp.TranscriptionResponse.ExtraFields = schemas.BifrostResponseExtraFields{ RequestType: schemas.TranscriptionRequest, Provider: p.Provider, ModelRequested: p.Model, Latency: p.Data.Latency, } if p.RawRequest != nil { resp.TranscriptionResponse.ExtraFields.RawRequest = p.RawRequest } + if p.Data.RawResponse != nil { + resp.TranscriptionResponse.ExtraFields.RawResponse = *p.Data.RawResponse + } + if p.Data.CacheDebug != nil { + resp.TranscriptionResponse.ExtraFields.CacheDebug = p.Data.CacheDebug + } case StreamTypeImage:core/schemas/provider.go (1)
322-378: Fix grammar in ImageGeneration method comments: use "an" instead of "a" (since "image" starts with a vowel sound).

All 19+ provider implementations (BedrockProvider, XAIProvider, OpenAIProvider, NebiusProvider, SGLProvider, VertexProvider, PerplexityProvider, OllamaProvider, MistralProvider, HuggingFaceProvider, GroqProvider, OpenRouterProvider, CohereProvider, AnthropicProvider, CerebrasProvider, AzureProvider, ElevenlabsProvider, GeminiProvider, ParasailProvider) have already been updated with the ImageGeneration and ImageGenerationStream methods, so no build breaker exists.
🤖 Fix all issues with AI agents
In `@core/internal/testutil/test_retry_framework.go`:
- Around line 1250-1252: The non-stream ImageGeneration path is dropping retry
conditions when converting to ImageGenerationRetryConfig; update the branch
handling "ImageGeneration" so it matches the stream behavior by either returning
the TestRetryConfig directly (like ImageGenerationStream does) or by preserving
the default conditions instead of setting Conditions:
[]ImageGenerationRetryCondition{}; ensure the returned config includes
EmptyImageGenerationCondition and GenericResponseCondition as provided by
GetTestRetryConfigForScenario() and use DefaultImageGenerationRetryConfig() or
copy its Conditions into the ImageGenerationRetryConfig conversion.
In `@core/internal/testutil/validation_presets.go`:
- Around line 222-237: Remove the leftover development comment above the
function definition in validation_presets.go; specifically delete the line "//
In validation_presets.go - add this function" so only the
ImageGenerationExpectations function and its body remain (referencing the
ImageGenerationExpectations symbol to locate the block).
In `@core/providers/gemini/gemini.go`:
- Around line 1614-1671: In ImageGeneration, avoid a nil-deref when
geminiResponse.ToBifrostImageGenerationResponse() returns (nil, nil): check if
response == nil and bifrostErr == nil and in that case create and return a
proper *schemas.BifrostError (or populate bifrostErr) before accessing
bifrostErr.ExtraFields; similarly ensure bifrostErr is non-nil before assigning
bifrostErr.ExtraFields and only set response.ExtraFields after confirming
response != nil (references: ImageGeneration,
geminiResponse.ToBifrostImageGenerationResponse, variables response and
bifrostErr).
In `@core/providers/openai/types.go`:
- Around line 546-584: Move the ImageGeneration event constants from images.go
into types.go next to the ImageGenerationEventType declaration: define
ImageGenerationPartial, ImageGenerationCompleted, and ImageGenerationError as
constants of type ImageGenerationEventType (e.g. const ImageGenerationPartial
ImageGenerationEventType = "partial", etc.), remove their definitions from
images.go, and update any references/imports if needed so code uses the
relocated constants; keep the constants in types.go immediately adjacent to the
ImageGenerationEventType type definition for consistency with other providers.
In `@core/providers/vertex/vertex.go`:
- Around line 1493-1559: The IsAllDigitsASCII (custom fine-tuned) and
IsImagenModel branches currently set authQuery from key.Value which enables
API-key auth; update those branches to stop populating authQuery and instead
always obtain an OAuth2 token via getAuthTokenSource(key) and set
req.Header.Set("Authorization", "Bearer "+token.AccessToken) (same approach
already used later), keeping the existing projectNumber/URL construction logic
via completeURL; leave the IsGeminiModel branch behavior unchanged if you want
to continue supporting API key for Gemini. Ensure authQuery remains empty for
Imagen and fine-tuned paths so the later "if authQuery != \"\" { completeURL =
fmt.Sprintf(...)} else { tokenSource... }" flow uses OAuth2 for those models.
In `@docs/openapi/schemas/integrations/openai/images.yaml`:
- Around line 21-63: Update the size enum in the images schema to match OpenAI's
supported values: remove "256x256", "512x512", "1792x1024", and "1024x1792" and
add "auto", leaving "1024x1024", "1536x1024", and "1024x1536"; ensure the
default remains or is set to "1024x1024" as documented. Keep the existing
quality and style enums if this schema is DALL·E 3-specific, but add a short
note in the schema description for the quality/style fields (quality, style)
clarifying they apply to DALL·E 3 only or expand allowed values if the schema is
meant for multiple OpenAI models. Also verify the stream field description and
the response_format/streaming contract (response_format, stream) against
OpenAI's official streaming event shape (partial_images/SSE) and update the
OpenAIImageStreamResponse types elsewhere if needed.
In `@docs/providers/supported-providers/gemini.mdx`:
- Around line 489-610: Add a brief clarifying sentence above the "Parameter
Mapping" table stating that the Gateway accepts OpenAI-compatible flat fields
(e.g., prompt, size, n, output_format) while the SDK uses the
BifrostImageGenerationRequest with nested Input and Params objects
(Input.Prompt, Params.Size, Params.N, Params.OutputFormat); keep the sentence
short and then note that the table shows how those Gateway flat fields and SDK
nested fields map internally.
♻️ Duplicate comments (30)
docs/providers/supported-providers/nebius.mdx (1)
153-156:guidance_scaletype should befloatper Nebius API specification.Per a previous review comment, the Nebius API specifies
guidance_scaleas a numeric float (range 0–100), but the documentation shows it asint. This should be updated tofloatfor accuracy.📝 Suggested fix
| Parameter | Type | Notes | |-----------|------|-------| -| `guidance_scale` | int | Guidance scale (0-100) | +| `guidance_scale` | float | Guidance scale (0-100) | | `ai_project_id` | string | Nebius project ID (added as query parameter) |core/providers/nebius/images.go (3)
12-20: Validate empty prompt and avoid pointer aliasing to caller-owned memory.The current check allows empty or whitespace-only prompts. Additionally, taking pointers to
bifrostReq.ModelandbifrostReq.Input.Promptcouples output to caller-owned memory which can lead to subtle bugs if the caller mutates the original request. Based on learnings, validate nil/empty prompts before dispatch.📝 Proposed fix
func (provider *NebiusProvider) ToNebiusImageGenerationRequest(bifrostReq *schemas.BifrostImageGenerationRequest) (*NebiusImageGenerationRequest, error) { if bifrostReq == nil || bifrostReq.Input == nil { return nil, fmt.Errorf("bifrost request is nil or input is nil") } + if strings.TrimSpace(bifrostReq.Input.Prompt) == "" { + return nil, fmt.Errorf("prompt is required") + } + model := bifrostReq.Model + prompt := bifrostReq.Input.Prompt req := &NebiusImageGenerationRequest{ - Model: &bifrostReq.Model, - Prompt: &bifrostReq.Input.Prompt, + Model: &model, + Prompt: &prompt, }
28-46: Harden size parsing: trim whitespace, accept uppercase 'X', validate positive dimensions.The current parsing doesn't handle whitespace, is case-sensitive for the delimiter, and doesn't validate that dimensions are positive.
📝 Proposed fix
if bifrostReq.Params.Size != nil { - size := strings.Split(*bifrostReq.Params.Size, "x") + rawSize := strings.TrimSpace(*bifrostReq.Params.Size) + size := strings.Split(strings.ToLower(rawSize), "x") if len(size) != 2 { - return nil, fmt.Errorf("invalid size format: expected 'WIDTHxHEIGHT', got %q", *bifrostReq.Params.Size) + return nil, fmt.Errorf("invalid size format: expected 'WIDTHxHEIGHT', got %q", rawSize) } - width, err := strconv.Atoi(size[0]) + width, err := strconv.Atoi(strings.TrimSpace(size[0])) if err != nil { - return nil, fmt.Errorf("invalid width in size %q: %w", *bifrostReq.Params.Size, err) + return nil, fmt.Errorf("invalid width in size %q: %w", rawSize, err) } - height, err := strconv.Atoi(size[1]) + height, err := strconv.Atoi(strings.TrimSpace(size[1])) if err != nil { - return nil, fmt.Errorf("invalid height in size %q: %w", *bifrostReq.Params.Size, err) + return nil, fmt.Errorf("invalid height in size %q: %w", rawSize, err) } + if width <= 0 || height <= 0 { + return nil, fmt.Errorf("invalid size %q: width and height must be > 0", rawSize) + } req.Width = &width req.Height = &height }
62-68: Enforceguidance_scalebounds (0–100) per Nebius API specification.The Nebius API expects
guidance_scaleas a numeric value in range 0–100, but the current code accepts any integer without validation.📝 Proposed fix
if bifrostReq.Params.ExtraParams != nil { // Map guidance_scale if v, ok := schemas.SafeExtractIntPointer(bifrostReq.Params.ExtraParams["guidance_scale"]); ok { + if v != nil && (*v < 0 || *v > 100) { + return nil, fmt.Errorf("invalid guidance_scale %d: must be between 0 and 100", *v) + } req.GuidanceScale = v } }core/providers/gemini/transcription.go (1)
36-41: Base64 decoding correctly added for Gemini inline audio data.The change properly decodes base64-encoded
InlineData.Datato raw bytes before aggregating audio data. This aligns with Gemini's API format where inline data is base64-encoded on the wire.Note: The silent
continueon decode failure was flagged in a previous review and escalated to maintainers for evaluation.core/providers/openai/openai_test.go (1)
43-82: All other test scenarios are disabled - please re-enable before merging.This configuration disables comprehensive OpenAI provider testing (chat, text completion, tool calls, embeddings, transcription, etc.) and only runs image generation tests. While this may be intentional for focused development, it should be reverted to maintain full test coverage before merging to
main.Compare with the HuggingFace test in this same PR where most scenarios remain enabled alongside the new image generation scenarios.
🔧 Re-enable scenarios before merge
Scenarios: testutil.TestScenarios{ - TextCompletion: false, - TextCompletionStream: false, - SimpleChat: false, - CompletionStream: false, - MultiTurnConversation: false, - ToolCalls: false, - ToolCallsStreaming: false, - MultipleToolCalls: false, - End2EndToolCalling: false, - AutomaticFunctionCall: false, - ImageURL: false, - ImageBase64: false, - MultipleImages: false, - FileBase64: false, - FileURL: false, - CompleteEnd2End: false, - SpeechSynthesis: false, - SpeechSynthesisStream: false, - Transcription: false, - TranscriptionStream: false, - Embedding: false, - Reasoning: false, - ListModels: false, + TextCompletion: true, + TextCompletionStream: true, + SimpleChat: true, + CompletionStream: true, + MultiTurnConversation: true, + ToolCalls: true, + ToolCallsStreaming: true, + MultipleToolCalls: true, + End2EndToolCalling: true, + AutomaticFunctionCall: true, + ImageURL: true, + ImageBase64: true, + MultipleImages: true, + FileBase64: true, + FileURL: true, + CompleteEnd2End: true, + SpeechSynthesis: true, + SpeechSynthesisStream: true, + Transcription: true, + TranscriptionStream: true, + Embedding: true, + Reasoning: true, + ListModels: true, ImageGeneration: true, ImageGenerationStream: true, - BatchCreate: false, - BatchList: false, - BatchRetrieve: false, - BatchCancel: false, - BatchResults: false, - FileUpload: false, - FileList: false, - FileRetrieve: false, - FileDelete: false, - FileContent: false, - FileBatchInput: false, - CountTokens: false, - ChatAudio: false, - StructuredOutputs: false, + BatchCreate: true, + BatchList: true, + BatchRetrieve: true, + BatchCancel: true, + BatchResults: true, + FileUpload: true, + FileList: true, + FileRetrieve: true, + FileDelete: true, + FileContent: true, + FileBatchInput: true, + CountTokens: true, + ChatAudio: true, + StructuredOutputs: true, },core/providers/vertex/vertex.go (1)
1420-1431: Add nil-request guard to prevent panic onrequest.Modelaccess.The method accesses
request.Modelat line 1427 without first checking ifrequestis nil. Other providers in the codebase (e.g., OpenAI) include this guard. Based on learnings, validation should occur before dispatch.Proposed fix
func (provider *VertexProvider) ImageGeneration(ctx *schemas.BifrostContext, key schemas.Key, request *schemas.BifrostImageGenerationRequest) (*schemas.BifrostImageGenerationResponse, *schemas.BifrostError) { providerName := provider.GetProviderKey() + if request == nil { + return nil, providerUtils.NewBifrostOperationError("image generation request is nil", nil, providerName) + } + if key.VertexKeyConfig == nil { return nil, providerUtils.NewConfigurationError("vertex key config is not set", providerName) } deployment := provider.getModelDeployment(key, request.Model)core/internal/testutil/response_validation.go (1)
1096-1148: Fixmin_imagesconversion: avoid int overflow + don’t record “warnings” inErrors.
This still does uncheckedint64/float64 -> intnarrowing and appends parse issues toresult.Errorswhile “skipping check” (and without failing), which is inconsistent and may re-trigger CodeQL.Proposed fix (range-check + consistent warning path)
func validateImageGenerationFields(t *testing.T, response *schemas.BifrostImageGenerationResponse, expectations ResponseExpectations, result *ValidationResult) { @@ if expectations.ProviderSpecific != nil { if minImagesVal, ok := expectations.ProviderSpecific["min_images"]; ok { var minImages int var parseErr error + maxInt := int64(^uint(0) >> 1) + minInt := -maxInt - 1 // Use type switch to handle various numeric types switch v := minImagesVal.(type) { case int: minImages = v case int64: - minImages = int(v) + if v > maxInt || v < minInt { + parseErr = fmt.Errorf("min_images out of int range: %d", v) + } else { + minImages = int(v) + } case float64: - minImages = int(v) + if v > float64(maxInt) || v < float64(minInt) || v != float64(int64(v)) { + parseErr = fmt.Errorf("min_images must be an integer within int range: %v", v) + } else { + minImages = int(v) + } case json.Number: var parsed int64 parsed, parseErr = v.Int64() if parseErr == nil { - minImages = int(parsed) + if parsed > maxInt || parsed < minInt { + parseErr = fmt.Errorf("min_images out of int range: %d", parsed) + } else { + minImages = int(parsed) + } } default: parseErr = fmt.Errorf("unsupported type for min_images: %T", v) } if parseErr != nil { - // Skip the min_images check if conversion fails, but record a warning - result.Errors = append(result.Errors, - fmt.Sprintf("Failed to parse min_images: %v (skipping check)", parseErr)) - } else { + // Skip the min_images check if conversion fails, but record a warning + result.Warnings = append(result.Warnings, + fmt.Sprintf("Failed to parse min_images: %v (skipping check)", parseErr)) + return + } + + if minImages > 0 { actualCount := len(response.Data) result.MetricsCollected["image_count"] = actualCount if actualCount < minImages { result.Passed = false result.Errors = append(result.Errors, fmt.Sprintf("Too few images: got %d, expected at least %d", actualCount, minImages)) } } } }tests/integrations/python/tests/test_google.py (1)
1687-1787: Fix cross-provider skipping, narrow exception handling, and make response inspection format-safe.This block reintroduces issues already called out in prior reviews: unused
test_config(ARG002), broadexcept Exception(BLE001), and provider-specific API key gating (decorator checks only"google"). Also,test_41ccan blow up ifresponseis dict-shaped (your own comment says validation supports dict/object).Proposed diff (consolidated)
@@ - def test_41a_image_generation_simple(self, test_config, provider, model): + def test_41a_image_generation_simple(self, _test_config, provider, model): @@ - client = get_provider_google_client(provider) + # Skip cleanly if this provider isn't configured in env for this run + try: + get_api_key(provider) + except ValueError: + pytest.skip(f"No API key available for provider {provider}") + + client = get_provider_google_client(provider) @@ - def test_41b_imagen_predict(self, test_config, provider, model): + def test_41b_imagen_predict(self, _test_config, provider, model): @@ - client = get_provider_google_client(provider) + try: + get_api_key(provider) + except ValueError: + pytest.skip(f"No API key available for provider {provider}") + + client = get_provider_google_client(provider) @@ - except Exception as e: + except requests.exceptions.RequestException as e: # Imagen may not be available in all regions or configurations pytest.skip(f"Imagen generation failed: {e}") + except (json.JSONDecodeError, ValueError) as e: + pytest.skip(f"Imagen generation returned an invalid response: {e}") @@ - def test_41c_image_generation_with_text(self, test_config, provider, model): + def test_41c_image_generation_with_text(self, _test_config, provider, model): @@ - client = get_provider_google_client(provider) + try: + get_api_key(provider) + except ValueError: + pytest.skip(f"No API key available for provider {provider}") + + client = get_provider_google_client(provider) @@ - for candidate in response.candidates: - if hasattr(candidate, "content") and candidate.content: - if hasattr(candidate.content, "parts") and candidate.content.parts: - for part in candidate.content.parts: - if hasattr(part, "text") and part.text: - found_text = True - if hasattr(part, "inline_data") and part.inline_data: - found_image = True + candidates = [] + if isinstance(response, dict): + candidates = response.get("candidates") or [] + else: + candidates = getattr(response, "candidates", None) or [] + + for candidate in candidates: + content = candidate.get("content") if isinstance(candidate, dict) else getattr(candidate, "content", None) + parts = (content.get("parts") if isinstance(content, dict) else getattr(content, "parts", None)) or [] + for part in parts: + text = part.get("text") if isinstance(part, dict) else getattr(part, "text", None) + inline_data = part.get("inline_data") if isinstance(part, dict) else getattr(part, "inline_data", None) + if text: + found_text = True + if inline_data: + found_image = TrueAlso, based on learnings: for OpenAI image-generation scenarios under
tests/integrations/python, prefergpt-image-1viaproviders.openai.image_generationfor determinism and to avoid DALL·E-3 parameter constraints. Based on learnings, please ensure the config/fixtures for this scenario follow that.Google GenAI Python SDK: is `types.GenerateContentConfig(response_modalities=["IMAGE"|"TEXT"])` the correct way to request image generation, and what are the response fields for image bytes (`inline_data` vs `inlineData`)?framework/streaming/images.go (2)
259-263: Unreachable fallback:Index >= 0is always true for int type.The condition on line 259 will always be true since
Indexis declared asint(not*int), making its zero value (0) indistinguishable from an explicitly set 0. The fallback toSequenceNumberon line 262 is effectively dead code.This was flagged in a previous review but appears unresolved. Consider changing
Indexto*intin the schema or using a sentinel value like -1 to indicate "not set."
84-95: Multi-chunk reconstruction drops URL and usestime.Now()instead of provider timestamp.In the multi-chunk fallback path:
- URL is lost: Lines 90-94 only populate
B64JSON,Index, andRevisedPrompt— if a provider streams URL data, it's discarded.- Timestamp is fabricated: Line 108 sets
Created: time.Now().Unix()instead of preserving the provider'sCreatedAtfrom chunks.The single-chunk path (lines 22, 27) correctly preserves both values. This was flagged in a previous review.
Suggested approach
Track URL and CreatedAt while iterating chunks, similar to how
revisedPromptsis tracked:images := make(map[int]*strings.Builder) +imageURLs := make(map[int]string) var model string var revisedPrompts map[int]string = make(map[int]string) +var createdAt int64 for _, chunk := range chunks { // ... existing code ... + if createdAt == 0 && chunk.Delta.CreatedAt != 0 { + createdAt = chunk.Delta.CreatedAt + } + if chunk.Delta.URL != "" { + imageURLs[chunk.ImageIndex] = chunk.Delta.URL + } } // In imageData construction: imageData = append(imageData, schemas.ImageData{ B64JSON: builder.String(), + URL: imageURLs[imageIndex], Index: imageIndex, RevisedPrompt: revisedPrompts[imageIndex], }) // For Created: -Created: time.Now().Unix(), +Created: func() int64 { if createdAt != 0 { return createdAt }; return time.Now().Unix() }(),Also applies to: 106-111
core/providers/azure/azure.go (2)
1212-1252: Add nil guard forrequestbefore accessingrequest.Model.Line 1222 accesses
request.Modelbefore validating thatrequestis non-nil. If a nil request is passed, this will panic. This was flagged in a previous review.While upstream validation in
core/bifrost.gomay catch this in normal flows, defensive programming at the provider level prevents crashes from direct provider calls.Suggested fix
func (provider *AzureProvider) ImageGeneration(ctx *schemas.BifrostContext, key schemas.Key, request *schemas.BifrostImageGenerationRequest) (*schemas.BifrostImageGenerationResponse, *schemas.BifrostError) { + if request == nil { + return nil, providerUtils.NewBifrostOperationError("image generation request is nil", nil, provider.GetProviderKey()) + } // Validate api key configs if err := provider.validateKeyConfig(key); err != nil { return nil, err }
1257-1312: Add nil guard forrequestin streaming method and clean up stray comment.
Nil request: Line 1270 accesses
request.Modelwithout checking ifrequestis nil, same issue as the sync method.Stray comment: Line 1269 has an orphaned
//comment that should be removed.postResponseConverter: Good addition (lines 1280-1285) — this addresses the previous review about stamping
ModelDeploymenton streamed chunks.Suggested fix
func (provider *AzureProvider) ImageGenerationStream( ctx *schemas.BifrostContext, postHookRunner schemas.PostHookRunner, key schemas.Key, request *schemas.BifrostImageGenerationRequest, ) (chan *schemas.BifrostStream, *schemas.BifrostError) { + if request == nil { + return nil, providerUtils.NewBifrostOperationError("image generation request is nil", nil, provider.GetProviderKey()) + } // Validate api key configs if err := provider.validateKeyConfig(key); err != nil { return nil, err } - // deployment := key.AzureKeyConfig.Deployments[request.Model]core/providers/openai/openai.go (1)
2876-2878: Early return on firstcompletedevent drops remaining images forn > 1requests.The handler returns immediately upon the first
ImageGenerationCompletedevent (lines 2876-2878), which closes the stream and drops subsequent images whenn > 1. TheincompleteImagestracking map is correctly maintained (line 2764 removes completed images, line 2767 tracks started images) but isn't used to determine when all images have finished.Proposed fix
if isCompleted { + // Only exit when all images have completed + if len(incompleteImages) == 0 { + return + } + // More images still in progress, continue processing - return }tests/integrations/python/tests/test_openai.py (2)
1170-1258: Cross-provider gating still looks brittle for n>1 / quality / size.The new cases run against every
(provider, model)in theimage_generationscenario, but onlytest_52bhas a skip—and it’s keyed only onmodel(Line 1199-1201), notprovider+model. Also,quality="low"(Line 1229) andsize="1024x1536"(Line 1253-1254) may not be universally supported across all providers included in that scenario.This overlaps with prior review feedback on capability gating.
OpenAI Images API (gpt-image-1) supported values: does it support quality="low" and size="1024x1536", and do other providers behind an OpenAI-compatible /images.generate endpoint accept the same parameters?Minimal robustness tweak for the Gemini skip
- if model == "gemini-2.5-flash-image": + if provider == "gemini" and model == "gemini-2.5-flash-image": pytest.skip("Gemini 2.5 flash image does not support multiple images")
1173-1254: Silence Ruff ARG002 for the newly added tests.Ruff flags the unused
test_configparam on the 52a–52d methods. If you don’t need it, rename it to_test_configfor these tests.Proposed fix
- def test_52a_image_generation_simple(self, test_config, provider, model, vk_enabled): + def test_52a_image_generation_simple(self, _test_config, provider, model, vk_enabled): ... - def test_52b_image_generation_multiple(self, test_config, provider, model, vk_enabled): + def test_52b_image_generation_multiple(self, _test_config, provider, model, vk_enabled): ... - def test_52c_image_generation_quality(self, test_config, provider, model, vk_enabled): + def test_52c_image_generation_quality(self, _test_config, provider, model, vk_enabled): ... - def test_52d_image_generation_different_sizes(self, test_config, provider, model, vk_enabled): + def test_52d_image_generation_different_sizes(self, _test_config, provider, model, vk_enabled):core/providers/gemini/gemini.go (2)
1673-1748: Add nil-guard afterimagenResponse.ToBifrostImageGenerationResponse()before settingExtraFields.
Line 1733-1737 dereferencesresponseunconditionally; if the converter returnsnil, this will panic.Proposed fix
// Convert to Bifrost format response := imagenResponse.ToBifrostImageGenerationResponse() + if response == nil { + err := providerUtils.NewBifrostOperationError( + schemas.ErrProviderResponseDecode, + fmt.Errorf("failed to convert Imagen response to Bifrost format"), + providerName, + ) + err.ExtraFields = schemas.BifrostErrorExtraFields{ + Provider: providerName, + ModelRequested: request.Model, + RequestType: schemas.ImageGenerationRequest, + } + return nil, err + } response.ExtraFields.Provider = providerName
1673-1688: Nit: comment says “Vertex AI endpoint with API key auth”, but the default BaseURL is Generative Language API.
If this path truly targets Generative Language.../v1beta/models/{model}:predictwithx-goog-api-key, consider rewording the comment to avoid confusion.core/providers/gemini/images.go (3)
12-94: Guard against nilpartwhen extracting prompt fromcontent.Parts.
Line 81-83 can panic ifcontent.Partscontainsnil.Proposed fix
- for _, part := range content.Parts { - if part.Text != "" { + for _, part := range content.Parts { + if part == nil { + continue + } + if part.Text != "" { bifrostReq.Input.Prompt = part.Text break } }
96-127: NormalizeimageSizeand support “4k” (case-insensitive).
Right now this only matches"1k"/"2k"exactly;"1K","2K","4K"will fall back to 1024.Proposed fix
func convertImagenFormatToSize(imageSize *string, aspectRatio *string) string { // Default size based on imageSize parameter baseSize := 1024 if imageSize != nil { - switch *imageSize { - case "2k": + switch strings.ToLower(strings.TrimSpace(*imageSize)) { + case "4k": + baseSize = 4096 + case "2k": baseSize = 2048 - case "1k": + case "1k": baseSize = 1024 } }
129-187: Don’t silently succeed with emptyData; add nil receiver + candidate nil guards.
Today this can return a “success” response withData: []if the provider returns no inline image parts (or ifresponse/candidateis nil). That’s hard to distinguish from a real success.Proposed fix (minimal hardening)
func (response *GenerateContentResponse) ToBifrostImageGenerationResponse() (*schemas.BifrostImageGenerationResponse, *schemas.BifrostError) { + if response == nil { + return nil, &schemas.BifrostError{ + IsBifrostError: false, + Error: &schemas.ErrorField{Message: "nil Gemini response"}, + } + } bifrostResp := &schemas.BifrostImageGenerationResponse{ ID: response.ResponseID, Model: response.ModelVersion, Data: []schemas.ImageData{}, } @@ if len(response.Candidates) > 0 { - candidate := response.Candidates[0] + candidate := response.Candidates[0] + if candidate == nil { + return nil, &schemas.BifrostError{ + IsBifrostError: false, + Error: &schemas.ErrorField{Message: "empty Gemini candidate"}, + } + } if candidate.Content != nil && len(candidate.Content.Parts) > 0 { @@ - if part != nil && part.InlineData != nil { + if part != nil && part.InlineData != nil && part.InlineData.Data != "" { imageData = append(imageData, schemas.ImageData{ B64JSON: string(part.InlineData.Data), }) @@ // Only assign imageData when it has elements if len(imageData) > 0 { bifrostResp.Data = imageData @@ + } else { + return nil, &schemas.BifrostError{ + IsBifrostError: false, + Error: &schemas.ErrorField{Message: "no image data found in Gemini response"}, + } }docs/openapi/schemas/inference/images.yaml (3)
1-100: Verify request contract vs actual handler shape (flat model/prompt vs provider/input/params).
This schema documents a flat request withmodel(asprovider/model) +prompt. Please confirm that the HTTP inference endpoint actually accepts this shape (or update either the handler or the schema to match).#!/bin/bash set -euo pipefail # Find the images inference route handler and see what it unmarshals into. rg -n --type=go 'images/generations|ImageGenerationRequest|BifrostImageGenerationRequest' -S transports core # Compare core schema shape. rg -n --type=go 'type\s+BifrostImageGenerationRequest\b' -S core/schemas
178-204: Usage detail fields: confirm whether ImageTokenDetails needs additional counters (e.g., n_images).
Right now onlyimage_tokensandtext_tokensare documented; if the implementation returns more (or uses different casing), align the schema.#!/bin/bash set -euo pipefail rg -n --type=go 'type\s+Image(TokenDetails|Usage)\b' -S core/schemas rg -n --type=go 'OutputTokensDetails|InputTokensDetails|NImages|n_images' -S core
80-86: Enum drift risk: output_format likely needs"jpg"in addition to"jpeg".
If providers accept/normalize"jpg"↔"jpeg", document both to avoid client-side validation failures.Also applies to: 122-128, 252-258
core/internal/testutil/test_retry_framework.go (3)
980-995: DefaultImageGenerationRetryConfig returns TestRetryConfig/TestRetryCondition, but image-gen retry uses ImageGenerationRetryConfig/ImageGenerationRetryCondition.
As-is, “default” image-generation conditions here won’t flow intoWithImageGenerationRetry(...)unless there’s an explicit conversion step elsewhere. This is easy to silently misconfigure (conditions never evaluated).Proposed direction (pick one)
- Make the default return the typed config (and add a separate
DefaultImageGenerationTestRetryConfigif you still need the generic one), or- Keep returning TestRetryConfig, but add an explicit adapter that copies timing/hook fields into
ImageGenerationRetryConfig(and either converts conditions or intentionally leaves them empty per local convention).-// DefaultImageGenerationRetryConfig creates a retry config for image tests -func DefaultImageGenerationRetryConfig() TestRetryConfig { - return TestRetryConfig{ ... } -} +// DefaultImageGenerationRetryConfig creates a retry config for image generation tests +func DefaultImageGenerationRetryConfig() ImageGenerationRetryConfig { + return ImageGenerationRetryConfig{ + MaxAttempts: 10, + BaseDelay: 2000 * time.Millisecond, + MaxDelay: 10 * time.Second, + Conditions: []ImageGenerationRetryCondition{ + &EmptyImageGenerationCondition{}, + &GenericImageGenerationCondition{}, + }, + OnRetry: func(attempt int, reason string, t *testing.T) { + t.Logf("🔄 Retrying image generation test (attempt %d): %s", attempt, reason) + }, + } +}Based on learnings, typed retry configs in
core/internal/testutilare usually constructed fromGetTestRetryConfigForScenario()by copying timing/hook fields and leavingConditionsempty.
2097-2260: Fix false-pass:(nil response, nil err)currently exits retries and returns(nil, nil).
Ifoperation()ever returns(nil, nil), this can incorrectly look like success to callers (and bypassOnFinalFail).Minimal defensive fix
// Execute the operation response, err := operation() lastResponse = response lastError = err + // Treat nil response as failure (even if err is nil) to avoid false-pass. + if response == nil { + if attempt < config.MaxAttempts { + retryReason := "❌ image generation response is nil" + if config.OnRetry != nil { + config.OnRetry(attempt, retryReason, t) + } + delay := calculateRetryDelay(attempt-1, config.BaseDelay, config.MaxDelay) + time.Sleep(delay) + continue + } + statusCode := 400 + return nil, &schemas.BifrostError{ + IsBifrostError: true, + StatusCode: &statusCode, + Error: &schemas.ErrorField{ + Message: fmt.Sprintf("❌ image generation response is nil after %d attempts", attempt), + }, + } + } - - // If we have a response, validate it FIRST - if response != nil { + // If we have a response, validate it FIRST + if response != nil {
2947-3095: Naming/doc mismatch: comment saysWithImageGenerationStreamValidationRetry, function isWithImageGenerationStreamRetry.
Either rename the function to match the “*StreamValidationRetry” naming used elsewhere, or update the comment header.Smallest fix (doc-only)
-// WithImageGenerationStreamValidationRetry wraps an image generation streaming operation with retry logic that includes stream content validation +// WithImageGenerationStreamRetry wraps an image generation streaming operation with retry logic that includes stream content validationtransports/bifrost-http/integrations/router.go (1)
1290-1291: Potential nil-pointer panic: ImageGenerationStreamResponseConverter is invoked unconditionally.

If `StreamConfig` exists but `ImageGenerationStreamResponseConverter` is nil, this will panic mid-stream. Prefer validating at startup (see the sketch below) or guarding before invocation.

Minimal guard (local):

```diff
 case chunk.BifrostImageGenerationStreamResponse != nil:
+	if config.StreamConfig.ImageGenerationStreamResponseConverter == nil {
+		log.Printf("missing ImageGenerationStreamResponseConverter for route %s", config.Path)
+		continue
+	}
 	eventType, convertedResponse, err = config.StreamConfig.ImageGenerationStreamResponseConverter(bifrostCtx, chunk.BifrostImageGenerationStreamResponse)
```
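For illustration, a minimal standalone sketch of failing fast at registration time; `RouteConfig` and `StreamConfig` below are simplified stand-ins, not the actual router types:

```go
// Sketch: reject a stream-capable route at startup if its converter is missing,
// so the nil check never has to happen mid-stream.
package main

import "fmt"

type StreamConfig struct {
	ImageGenerationStreamResponseConverter func(chunk any) (string, any, error)
}

type RouteConfig struct {
	Path         string
	StreamConfig *StreamConfig
}

// validateRoute fails fast when a stream route is configured without its converter.
func validateRoute(cfg RouteConfig) error {
	if cfg.StreamConfig != nil && cfg.StreamConfig.ImageGenerationStreamResponseConverter == nil {
		return fmt.Errorf("route %s: StreamConfig is set but ImageGenerationStreamResponseConverter is nil", cfg.Path)
	}
	return nil
}

func main() {
	bad := RouteConfig{Path: "/v1/images/generations", StreamConfig: &StreamConfig{}}
	if err := validateRoute(bad); err != nil {
		fmt.Println("startup validation failed:", err)
	}
}
```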
tests/integrations/python/tests/utils/common.py (1)

2629-2705: LGTM (leaving current validation strictness as-is).
This is a solid cross-provider "shape" validator. There are known edge cases around Google casing/base64 strictness; since we already decided not to tighten those checks right now, I'd keep it unchanged.

plugins/logging/main.go (1)
34-48: PooledUpdateLogData: ensure new fields are zeroed on `putUpdateLogData` to avoid cross-request leakage.

`ImageGenerationOutput`, `RawRequest`, and `RawResponse` are request-specific and must be cleared when returning to the pool. A minimal pool-reset sketch follows below.

```bash
#!/bin/bash
set -euo pipefail
# Find pool put/reset logic and confirm new fields are cleared.
rg -n --hidden --glob '!**/vendor/**' 'putUpdateLogData\(' -n -C 3
echo "----"
# Show the full reset block(s) for UpdateLogData.
rg -n --hidden --glob '!**/vendor/**' 'data\.(Status|TokenUsage|Cost|ChatOutput|ResponsesOutput|EmbeddingOutput|ErrorDetails|SpeechOutput|TranscriptionOutput|ImageGenerationOutput|RawRequest|RawResponse)\s*=' -n -C 2
```
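As a standalone illustration of the reset pattern being asked for (simplified stand-in types, not the plugin's actual structs):

```go
// Sketch: clear request-scoped fields before returning a pooled struct,
// so the next borrower never observes stale data from a previous request.
package main

import (
	"fmt"
	"sync"
)

type UpdateLogData struct {
	Status                string
	ImageGenerationOutput any
	RawRequest            any
	RawResponse           any
}

var updateLogDataPool = sync.Pool{
	New: func() any { return &UpdateLogData{} },
}

func putUpdateLogData(d *UpdateLogData) {
	// Zero every request-specific field before pooling.
	d.Status = ""
	d.ImageGenerationOutput = nil
	d.RawRequest = nil
	d.RawResponse = nil
	updateLogDataPool.Put(d)
}

func main() {
	d := updateLogDataPool.Get().(*UpdateLogData)
	d.RawRequest = map[string]string{"prompt": "a cat"}
	putUpdateLogData(d)
	// Whether the pool reuses the object or allocates a new one, RawRequest is nil.
	fmt.Println(updateLogDataPool.Get().(*UpdateLogData).RawRequest == nil)
}
```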
🧹 Nitpick comments (15)
core/providers/bedrock/bedrock.go (1)
1347-1355: LGTM!

The image generation stubs correctly return `UnsupportedOperationError` with appropriate request types. Using `provider.GetProviderKey()` is the correct approach as it properly handles custom provider configurations.

Minor observation: the existing Speech/Transcription stubs (lines 1328-1345) use `schemas.Bedrock` directly while these new methods use `provider.GetProviderKey()`. The new approach is more correct. Consider updating the existing stubs in a follow-up for consistency:

```diff
 // Speech is not supported by the Bedrock provider.
 func (provider *BedrockProvider) Speech(ctx *schemas.BifrostContext, key schemas.Key, request *schemas.BifrostSpeechRequest) (*schemas.BifrostSpeechResponse, *schemas.BifrostError) {
-	return nil, providerUtils.NewUnsupportedOperationError(schemas.SpeechRequest, schemas.Bedrock)
+	return nil, providerUtils.NewUnsupportedOperationError(schemas.SpeechRequest, provider.GetProviderKey())
 }
```

framework/modelcatalog/pricing.go (1)
379-419: Consider extracting the tiered rate selection logic.

The `isAbove200k`, `isAbove128k`, and base-tier branches for image token rates are nearly identical (all fall back to `InputCostPerImageToken` → `inputTokenRate`). This pattern is repeated, which could be simplified.

♻️ Optional refactor to reduce duplication:

```go
// Helper to get image token rate with fallback
getImageTokenRate := func(imageRate *float64, fallback float64) float64 {
	if imageRate != nil {
		return *imageRate
	}
	return fallback
}

inputImageTokenRate = getImageTokenRate(pricing.InputCostPerImageToken, inputTokenRate)
outputImageTokenRate = getImageTokenRate(pricing.OutputCostPerImageToken, outputTokenRate)
```

This removes the three near-identical branches since tiered image token rates (`InputCostPerImageTokenAbove128kTokens`, etc.) don't exist yet in `TableModelPricing`.

core/providers/gemini/responses.go (1)
1598-1661: Add defensive guards for `data:` prefixes in inline image encoding for consistency.

The codebase documents `Blob.Data` as base64-encoded bytes (without a `data:` prefix), and all creation paths confirm this contract. However, the file handler (lines 1649-1651) includes a guard against malformed data, while the image paths (lines 1612 and 1887) don't. This inconsistency creates maintenance risk if specifications change or edge cases emerge.

Apply the same guard to the image and candidate inline-data blocks to match the file handler pattern.

Image handler guard:

```diff
 if isImageMimeType(mimeType) {
-	imageURL := fmt.Sprintf("data:%s;base64,%s", mimeType, blob.Data)
+	imageURL := blob.Data
+	if !strings.HasPrefix(imageURL, "data:") {
+		imageURL = fmt.Sprintf("data:%s;base64,%s", mimeType, imageURL)
+	}
 	return &schemas.ResponsesMessageContentBlock{
```

Candidate inline-image handler guard:

```diff
 if strings.HasPrefix(part.InlineData.MIMEType, "image/") {
 	return &schemas.ResponsesInputMessageContentBlockImage{
-		ImageURL: schemas.Ptr("data:" + part.InlineData.MIMEType + ";base64," + part.InlineData.Data),
+		ImageURL: func() *string {
+			data := part.InlineData.Data
+			if strings.HasPrefix(data, "data:") {
+				return schemas.Ptr(data)
+			}
+			return schemas.Ptr("data:" + part.InlineData.MIMEType + ";base64," + data)
+		}(),
```

Also applies to: 1872–1904
core/internal/testutil/image_generation.go (2)
21-151: Non-stream image validation may be too strict on format/dimensions across providers.

Right now you hard-require a successful `image.DecodeConfig` plus an exact `1024x1024` whenever `B64JSON` is present. That can become flaky if a provider returns a format you don't have a decoder for (or resizes/crops slightly). Consider making the decode/dimension check conditional (log + continue) while still enforcing "non-empty image bytes"; a standalone sketch of that shape follows below.
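A possible lenient shape, as a standalone sketch using only the standard library; this is not the testutil helper itself, and the registered decoders and expected size are assumptions:

```go
// Sketch: always require non-empty, valid base64 image bytes, but only log
// (rather than fail) when the format cannot be decoded or dimensions differ.
package main

import (
	"bytes"
	"encoding/base64"
	"fmt"
	"image"
	_ "image/png" // register the decoders you expect; unknown formats are tolerated below
)

func validateImagePayload(b64 string, wantW, wantH int) error {
	raw, err := base64.StdEncoding.DecodeString(b64)
	if err != nil || len(raw) == 0 {
		return fmt.Errorf("image payload is empty or not valid base64: %v", err)
	}
	cfg, format, err := image.DecodeConfig(bytes.NewReader(raw))
	if err != nil {
		fmt.Printf("note: could not decode image config (%v); skipping dimension check\n", err)
		return nil
	}
	if cfg.Width != wantW || cfg.Height != wantH {
		fmt.Printf("note: got %dx%d %s, expected %dx%d\n", cfg.Width, cfg.Height, format, wantW, wantH)
	}
	return nil
}

func main() {
	// Non-image bytes: logs a note and returns nil instead of failing the test.
	fmt.Println(validateImagePayload("aGVsbG8=", 1024, 1024))
}
```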
153-296: Align stream retry config with the directory's typed-config convention (and keep event typing consistent).

You already do the typed `ImageGenerationRetryConfig{ Conditions: []...{} }` pattern for non-stream. For consistency, consider doing the same for the stream retry config (if `WithImageGenerationStreamRetry` supports it) instead of passing the generic config through. Based on learnings, this convention is expected in `core/internal/testutil/*`.

Also: nice job keying completion off `openai.ImageGenerationCompleted`; that helps keep the stack consistent on event types.

core/providers/openai/openai.go (2)
2684-2692: Add explicit `[DONE]` marker handling for clean stream termination.

Unlike all other streaming handlers in this file (chat completion at line 947, text completion at line 487, responses at line 1433, speech at line 1913, transcription at line 2295), this handler does not explicitly check for the `data: [DONE]` stream termination marker. Currently, `[DONE]` would fail `sonic.Unmarshal` and log a warning rather than cleanly terminating. A standalone sketch of the pattern follows below.

♻️ Proposed fix:

```diff
 jsonData := strings.TrimSpace(strings.TrimPrefix(line, "data:"))
 if jsonData == "" {
 	continue
 }
+
+// Check for end of stream
+if jsonData == "[DONE]" {
+	break
+}
```
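A self-contained sketch of the suggested SSE handling; the event payloads are placeholders, not the real OpenAI chunk types:

```go
// Sketch: trim the "data:" prefix, skip blanks, stop cleanly on "[DONE]",
// and only then try to unmarshal the payload.
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"strings"
)

func readImageStream(raw string) []string {
	var payloads []string
	scanner := bufio.NewScanner(strings.NewReader(raw))
	for scanner.Scan() {
		line := scanner.Text()
		if !strings.HasPrefix(line, "data:") {
			continue
		}
		jsonData := strings.TrimSpace(strings.TrimPrefix(line, "data:"))
		if jsonData == "" {
			continue
		}
		if jsonData == "[DONE]" { // end-of-stream marker: terminate without logging a warning
			break
		}
		var event map[string]any
		if err := json.Unmarshal([]byte(jsonData), &event); err != nil {
			fmt.Println("warn: skipping malformed chunk:", err)
			continue
		}
		payloads = append(payloads, jsonData)
	}
	return payloads
}

func main() {
	stream := "data: {\"type\":\"image_generation.partial_image\"}\n\ndata: [DONE]\n"
	fmt.Println(len(readImageStream(stream))) // 1
}
```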
2881-2884: Add context error check before processing scanner errors (consistency with other handlers).

Other streaming handlers (e.g., transcription at lines 2368-2372, speech at lines 1991-1995) check `ctx.Err() != nil` and return early to let the defer block handle cancellation/timeout. This handler processes scanner errors without that check, potentially leading to duplicate error handling.

♻️ Proposed fix for consistency:

```diff
 if err := scanner.Err(); err != nil {
+	// If context was cancelled/timed out, let defer handle it
+	if ctx.Err() != nil {
+		return
+	}
+
 	ctx.SetValue(schemas.BifrostContextKeyStreamEndIndicator, true)
 	logger.Warn(fmt.Sprintf("Error reading stream: %v", err))
 	providerUtils.ProcessAndSendError(ctx, postHookRunner, err, responseChan, schemas.ImageGenerationStreamRequest, providerName, request.Model, logger)
 }
```

core/bifrost.go (3)
1008-1059: ImageGenerationRequest: good wiring; fix comment typo and consider a consistent ctx fallback.

Implementation matches the established pattern (populate pooled `BifrostRequest` → `handleRequest` → unwrap typed response). Minor: the docstring should read "an image generation request". Optionally mirror the other public APIs by defaulting `ctx` to `bifrost.ctx` early (today it relies on `handleRequest` doing so); a small caller-side sketch of that pattern follows below.

Proposed small polish:

```diff
-// ImageGenerationRequest sends a image generation request to the specified provider.
+// ImageGenerationRequest sends an image generation request to the specified provider.
 func (bifrost *Bifrost) ImageGenerationRequest(ctx *schemas.BifrostContext, req *schemas.BifrostImageGenerationRequest) (*schemas.BifrostImageGenerationResponse, *schemas.BifrostError) {
+	if ctx == nil {
+		ctx = bifrost.ctx
+	}
 	if req == nil {
 		return nil, &schemas.BifrostError{
```
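A tiny caller-side illustration of the nil-ctx fallback, using placeholder types rather than the real Bifrost structs:

```go
// Sketch: a public entrypoint that falls back to the client's root context
// when the caller passes nil, mirroring the polish suggested above.
package main

import (
	"context"
	"fmt"
)

type client struct{ baseCtx context.Context }

func (c *client) imageGeneration(ctx context.Context, prompt string) (string, error) {
	if ctx == nil { // same fallback the other public entrypoints use
		ctx = c.baseCtx
	}
	select {
	case <-ctx.Done():
		return "", ctx.Err()
	default:
		return "generated:" + prompt, nil
	}
}

func main() {
	c := &client{baseCtx: context.Background()}
	out, err := c.imageGeneration(nil, "a serene Japanese garden")
	fmt.Println(out, err)
}
```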
1061-1094: ImageGenerationStreamRequest: consistent with the stream pipeline; fix comment typo.

Same as the other stream entrypoints (sets `RequestType` and reuses the `ImageGenerationRequest` payload). Minor: the docstring should read "an image generation stream request".
2858-2873: Streaming short-circuit: image-gen mapping looks correct; add nil-guard symmetry.

The new mapping into/out of `BifrostResponse.ImageGenerationStreamResponse` is correct. To match the other fields, consider setting `streamResponse.BifrostImageGenerationStreamResponse` only when non-nil (helps keep the payload minimal and consistent).

Small consistency tweak:

```diff
 if processedResponse != nil {
 	streamResponse.BifrostTextCompletionResponse = processedResponse.TextCompletionResponse
 	streamResponse.BifrostChatResponse = processedResponse.ChatResponse
 	streamResponse.BifrostResponsesStreamResponse = processedResponse.ResponsesStreamResponse
 	streamResponse.BifrostSpeechStreamResponse = processedResponse.SpeechStreamResponse
 	streamResponse.BifrostTranscriptionStreamResponse = processedResponse.TranscriptionStreamResponse
-	streamResponse.BifrostImageGenerationStreamResponse = processedResponse.ImageGenerationStreamResponse
+	if processedResponse.ImageGenerationStreamResponse != nil {
+		streamResponse.BifrostImageGenerationStreamResponse = processedResponse.ImageGenerationStreamResponse
+	}
 }
```

core/providers/gemini/images.go (1)
189-256: Label extraction likely won't work with `map[string]any` (common after JSON unmarshal).

`labels.(map[string]string)` will usually fail unless the map was constructed with that exact type. Consider accepting `map[string]any` and stringifying values; a small sketch follows below.
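A minimal sketch of the more permissive label handling; the helper name is illustrative, not the provider code:

```go
// Sketch: accept both map[string]string and map[string]any label shapes and
// stringify values, since unmarshalling JSON into `any` yields map[string]any.
package main

import "fmt"

func extractLabels(labels any) map[string]string {
	out := map[string]string{}
	switch m := labels.(type) {
	case map[string]string:
		for k, v := range m {
			out[k] = v
		}
	case map[string]any:
		for k, v := range m {
			out[k] = fmt.Sprintf("%v", v)
		}
	}
	return out
}

func main() {
	fromJSON := map[string]any{"team": "imaging", "tier": 2}
	fmt.Println(extractLabels(fromJSON)) // map[team:imaging tier:2]
}
```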
core/internal/testutil/test_retry_framework.go (1)

247-255: ImageGenerationRetryConfig shape looks fine and matches the other typed configs.

Consider whether you want a shared "base retry config" struct embedded in all typed configs to reduce repetition (optional).
transports/bifrost-http/integrations/router.go (1)
278-309: Consider adding fail-fast validation for route-type-specific converters in RegisterRoutes.The ImageGenerationResponseConverter now has a runtime guard (line 717), but other converters marked "SHOULD NOT BE NIL" (ListModelsResponseConverter, TextResponseConverter, ChatResponseConverter, ResponsesResponseConverter, EmbeddingResponseConverter, TranscriptionResponseConverter) are called without nil checks. While ImageGenerationResponseConverter and CountTokensResponseConverter have runtime guards, adding upfront validation in RegisterRoutes for all required response converters would catch configuration errors at startup rather than during request handling—especially valuable in a multi-PR stack where routes and configs may be split across PRs.
tests/integrations/python/tests/utils/common.py (1)
2624-2627: Add a short note that OpenAI image-gen tests should prefer `gpt-image-1` for determinism.

These prompts are fine; adding a 1–2 line comment near them (or in the tests that consume them) to standardize on the config key `providers.openai.image_generation = gpt-image-1` will help avoid DALL-E 3 parameter limitations in integration tests. Based on learnings, prefer `gpt-image-1` for image-generation scenarios.

plugins/logging/main.go (1)
524-533: Simplify image-generation token usage: `TotalTokens` is guaranteed populated here.

You can directly use `result.ImageGenerationResponse.Usage.TotalTokens` without recomputing or falling back. Based on learnings, this invariant should hold for `ImageUsage`.

Proposed diff:

```diff
 case result.ImageGenerationResponse != nil && result.ImageGenerationResponse.Usage != nil:
 	usage = &schemas.BifrostLLMUsage{}
 	usage.PromptTokens = result.ImageGenerationResponse.Usage.InputTokens
 	usage.CompletionTokens = result.ImageGenerationResponse.Usage.OutputTokens
-	if result.ImageGenerationResponse.Usage.TotalTokens > 0 {
-		usage.TotalTokens = result.ImageGenerationResponse.Usage.TotalTokens
-	} else {
-		usage.TotalTokens = usage.PromptTokens + usage.CompletionTokens
-	}
+	usage.TotalTokens = result.ImageGenerationResponse.Usage.TotalTokens
```
| case "ImageGeneration", "ImageGenerationStream": | ||
| return DefaultImageGenerationRetryConfig() | ||
| default: |
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
rg -n --type=go '\bRunImageGeneration(Stream)?Test\b' core/internal/testutil
rg -n --type=go '\bWithImageGenerationRetry\s*\(' core/internal/testutil
rg -n --type=go '\bWithImageGenerationStreamRetry\s*\(' core/internal/testutil
rg -n --type=go '\bGetTestRetryConfigForScenario\s*\(' core/internal/testutilRepository: maximhq/bifrost
Length of output: 6799
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Check the GetTestRetryConfigForScenario implementation, especially the switch case
sed -n '1218,1350p' core/internal/testutil/test_retry_framework.go
# Check the specific case lines mentioned in the review
sed -n '1245,1260p' core/internal/testutil/test_retry_framework.goRepository: maximhq/bifrost
Length of output: 6460
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Check DefaultImageGenerationRetryConfig definition
rg -n --type=go 'func DefaultImageGenerationRetryConfig' core/internal/testutil
# Check WithImageGenerationRetry signature
sed -n '2097,2110p' core/internal/testutil/test_retry_framework.go
# Check WithImageGenerationStreamRetry signature
sed -n '2949,2962p' core/internal/testutil/test_retry_framework.go
# Check how it's used in image_generation.go
sed -n '30,45p' core/internal/testutil/image_generation.go
sed -n '165,180p' core/internal/testutil/image_generation.goRepository: maximhq/bifrost
Length of output: 2248
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Check the full signature and type def for DefaultImageGenerationRetryConfig
sed -n '981,1050p' core/internal/testutil/test_retry_framework.go
# Check ImageGenerationRetryConfig type definition
rg -n --type=go 'type ImageGenerationRetryConfig' core/internal/testutil
# Check if there's a type assertion or conversion in image_generation.go
sed -n '30,100p' core/internal/testutil/image_generation.go
# Check around line 86 in image_generation.go where WithImageGenerationRetry is called
sed -n '80,95p' core/internal/testutil/image_generation.goRepository: maximhq/bifrost
Length of output: 5352
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Check the ImageGenerationStream handling around line 170 in image_generation.go
sed -n '165,220p' core/internal/testutil/image_generation.go
# Check if there's any conversion for the stream version
rg -n --type=go -A 20 'GetTestRetryConfigForScenario.*ImageGenerationStream' core/internal/testutil/image_generation.goRepository: maximhq/bifrost
Length of output: 2668
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Check ImageGenerationRetryConfig type definition
sed -n '248,280p' core/internal/testutil/test_retry_framework.go
# Verify TestRetryConfig has OnFinalFail field that's being used in the conversion
sed -n '200,240p' core/internal/testutil/test_retry_framework.goRepository: maximhq/bifrost
Length of output: 5305
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Check TestRetryConfig definition to see if it has the same fields
sed -n '190,240p' core/internal/testutil/test_retry_framework.goRepository: maximhq/bifrost
Length of output: 3638
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Find TestRetryConfig definition
rg -n --type=go 'type TestRetryConfig struct' core/internal/testutil/test_retry_framework.go
# Check if there's an ImageGenerationStreamRetryConfig type
rg -n --type=go 'type ImageGenerationStreamRetryConfig' core/internal/testutil
# Check the complete signature of WithImageGenerationStreamRetry once more
sed -n '2949,2960p' core/internal/testutil/test_retry_framework.goRepository: maximhq/bifrost
Length of output: 92
🏁 Script executed:
#!/bin/bash
set -euo kapipefail
# Get the full TestRetryConfig definition
sed -n '188,196p' core/internal/testutil/test_retry_framework.go
# Now check the actual call to WithImageGenerationStreamRetry with the config being passed
sed -n '195,210p' core/internal/testutil/image_generation.goRepository: maximhq/bifrost
Length of output: 114
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Get the full TestRetryConfig definition
sed -n '188,196p' core/internal/testutil/test_retry_framework.go
# Now check the actual call to WithImageGenerationStreamRetry with the config being passed
sed -n '195,210p' core/internal/testutil/image_generation.goRepository: maximhq/bifrost
Length of output: 1207
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Check if conditions are being used in WithImageGenerationRetry
sed -n '2097,2150p' core/internal/testutil/test_retry_framework.go | head -60
# Check if conditions are being used in WithImageGenerationStreamRetry
sed -n '2949,3000p' core/internal/testutil/test_retry_framework.go | head -60Repository: maximhq/bifrost
Length of output: 4001
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Check what's in config.Conditions for WithImageGenerationRetry
sed -n '2115,2130p' core/internal/testutil/test_retry_framework.go
# See if checkImageGenerationRetryConditions is actually used
sed -n '2120,2165p' core/internal/testutil/test_retry_framework.goRepository: maximhq/bifrost
Length of output: 2767
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Verify the full conversion code in image_generation.go
sed -n '45,70p' core/internal/testutil/image_generation.go
# Double check what DefaultImageGenerationRetryConfig actually includes
sed -n '981,998p' core/internal/testutil/test_retry_framework.goRepository: maximhq/bifrost
Length of output: 1636
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Check if there's a conversion for stream version
sed -n '165,200p' core/internal/testutil/image_generation.go | grep -A 30 "ImageGenerationStream"Repository: maximhq/bifrost
Length of output: 1169
Fix inconsistent retry condition handling between ImageGeneration and ImageGenerationStreamRetry.
The non-stream ImageGeneration path (image_generation.go, line 48-54) explicitly discards retry conditions during conversion to ImageGenerationRetryConfig by setting Conditions: []ImageGenerationRetryCondition{}, while ImageGenerationStreamRetry preserves the conditions from DefaultImageGenerationRetryConfig(). Both should consistently include the EmptyImageGenerationCondition and GenericResponseCondition from GetTestRetryConfigForScenario().
Convert the non-stream path to either:
- Use `TestRetryConfig` directly like the stream version, or
- Preserve the conditions during the struct conversion instead of dropping them to an empty slice
🤖 Prompt for AI Agents
In `@core/internal/testutil/test_retry_framework.go` around lines 1250 - 1252, The
non-stream ImageGeneration path is dropping retry conditions when converting to
ImageGenerationRetryConfig; update the branch handling "ImageGeneration" so it
matches the stream behavior by either returning the TestRetryConfig directly
(like ImageGenerationStream does) or by preserving the default conditions
instead of setting Conditions: []ImageGenerationRetryCondition{}; ensure the
returned config includes EmptyImageGenerationCondition and
GenericResponseCondition as provided by GetTestRetryConfigForScenario() and use
DefaultImageGenerationRetryConfig() or copy its Conditions into the
ImageGenerationRetryConfig conversion.
```go
// In validation_presets.go - add this function
func ImageGenerationExpectations(minImages int, expectedSize string) ResponseExpectations {
	return ResponseExpectations{
		ShouldHaveContent:    false, // Image responses don't have text content
		ExpectedChoiceCount:  0,     // Image responses don't have choices
		ShouldHaveUsageStats: true,
		ShouldHaveTimestamps: true,
		ShouldHaveModel:      true,
		ShouldHaveLatency:    true, // Global expectation: latency should always be present
		ProviderSpecific: map[string]interface{}{
			"min_images":    minImages,
			"expected_size": expectedSize,
			"response_type": "image_generation",
		},
	}
}
```
Remove leftover development comment.
Line 222 contains a comment // In validation_presets.go - add this function that appears to be a leftover instruction from development. The function implementation itself follows the established pattern correctly.
Proposed fix:

```diff
-// In validation_presets.go - add this function
+// ImageGenerationExpectations returns validation expectations for image generation scenarios
 func ImageGenerationExpectations(minImages int, expectedSize string) ResponseExpectations {
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```diff
-// In validation_presets.go - add this function
+// ImageGenerationExpectations returns validation expectations for image generation scenarios
 func ImageGenerationExpectations(minImages int, expectedSize string) ResponseExpectations {
 	return ResponseExpectations{
 		ShouldHaveContent:    false, // Image responses don't have text content
 		ExpectedChoiceCount:  0,     // Image responses don't have choices
 		ShouldHaveUsageStats: true,
 		ShouldHaveTimestamps: true,
 		ShouldHaveModel:      true,
 		ShouldHaveLatency:    true, // Global expectation: latency should always be present
 		ProviderSpecific: map[string]interface{}{
 			"min_images":    minImages,
 			"expected_size": expectedSize,
 			"response_type": "image_generation",
 		},
 	}
 }
```
🤖 Prompt for AI Agents
In `@core/internal/testutil/validation_presets.go` around lines 222 - 237, Remove
the leftover development comment above the function definition in
validation_presets.go; specifically delete the line "// In validation_presets.go
- add this function" so only the ImageGenerationExpectations function and its
body remain (referencing the ImageGenerationExpectations symbol to locate the
block).
```go
// ImageGeneration performs an image generation request to the Gemini API.
func (provider *GeminiProvider) ImageGeneration(ctx *schemas.BifrostContext, key schemas.Key, request *schemas.BifrostImageGenerationRequest) (*schemas.BifrostImageGenerationResponse, *schemas.BifrostError) {
	// Check if image gen is allowed for this provider
	if err := providerUtils.CheckOperationAllowed(schemas.Gemini, provider.customProviderConfig, schemas.ImageGenerationRequest); err != nil {
		return nil, err
	}

	// check for imagen models
	if schemas.IsImagenModel(request.Model) {
		return provider.handleImagenImageGeneration(ctx, key, request)
	}
	// Prepare body
	jsonData, bifrostErr := providerUtils.CheckContextAndGetRequestBody(
		ctx,
		request,
		func() (any, error) { return ToGeminiImageGenerationRequest(request), nil },
		provider.GetProviderKey())
	if bifrostErr != nil {
		return nil, bifrostErr
	}

	// Use common request function
	geminiResponse, rawResponse, latency, bifrostErr := provider.completeRequest(ctx, request.Model, key, jsonData, ":generateContent", &providerUtils.RequestMetadata{
		Provider:    provider.GetProviderKey(),
		Model:       request.Model,
		RequestType: schemas.ImageGenerationRequest,
	})
	if bifrostErr != nil {
		return nil, bifrostErr
	}

	response, bifrostErr := geminiResponse.ToBifrostImageGenerationResponse()
	if response == nil || bifrostErr != nil {
		// Add extra fields to error
		bifrostErr.ExtraFields = schemas.BifrostErrorExtraFields{
			Provider:       provider.GetProviderKey(),
			ModelRequested: request.Model,
			RequestType:    schemas.ImageGenerationRequest,
		}
		return nil, bifrostErr
	}

	// Set ExtraFields
	response.ExtraFields.Provider = provider.GetProviderKey()
	response.ExtraFields.ModelRequested = request.Model
	response.ExtraFields.RequestType = schemas.ImageGenerationRequest
	response.ExtraFields.Latency = latency.Milliseconds()

	if providerUtils.ShouldSendBackRawRequest(ctx, provider.sendBackRawRequest) {
		providerUtils.ParseAndSetRawRequest(&response.ExtraFields, jsonData)
	}

	if providerUtils.ShouldSendBackRawResponse(ctx, provider.sendBackRawResponse) {
		response.ExtraFields.RawResponse = rawResponse
	}

	return response, nil
}
```
Fix possible nil-deref when conversion returns (nil, nil) (or nil error).
At Line 1646-1653, if response == nil and bifrostErr == nil, the code dereferences bifrostErr.ExtraFields and will panic.
Proposed fix
- response, bifrostErr := geminiResponse.ToBifrostImageGenerationResponse()
- if response == nil || bifrostErr != nil {
- // Add extra fields to error
- bifrostErr.ExtraFields = schemas.BifrostErrorExtraFields{
- Provider: provider.GetProviderKey(),
- ModelRequested: request.Model,
- RequestType: schemas.ImageGenerationRequest,
- }
- return nil, bifrostErr
- }
+ response, convErr := geminiResponse.ToBifrostImageGenerationResponse()
+ if convErr != nil {
+ convErr.ExtraFields = schemas.BifrostErrorExtraFields{
+ Provider: provider.GetProviderKey(),
+ ModelRequested: request.Model,
+ RequestType: schemas.ImageGenerationRequest,
+ }
+ return nil, convErr
+ }
+ if response == nil {
+ convErr := providerUtils.NewBifrostOperationError(
+ schemas.ErrProviderResponseDecode,
+ fmt.Errorf("empty image generation response"),
+ provider.GetProviderKey(),
+ )
+ convErr.ExtraFields = schemas.BifrostErrorExtraFields{
+ Provider: provider.GetProviderKey(),
+ ModelRequested: request.Model,
+ RequestType: schemas.ImageGenerationRequest,
+ }
+ return nil, convErr
+ }🤖 Prompt for AI Agents
In `@core/providers/gemini/gemini.go` around lines 1614 - 1671, In
ImageGeneration, avoid a nil-deref when
geminiResponse.ToBifrostImageGenerationResponse() returns (nil, nil): check if
response == nil and bifrostErr == nil and in that case create and return a
proper *schemas.BifrostError (or populate bifrostErr) before accessing
bifrostErr.ExtraFields; similarly ensure bifrostErr is non-nil before assigning
bifrostErr.ExtraFields and only set response.ExtraFields after confirming
response != nil (references: ImageGeneration,
geminiResponse.ToBifrostImageGenerationResponse, variables response and
bifrostErr).
no changes here.
@Radheshg04 Understood.
Quick note on the specific edge: at Line 1646–1653, if ToBifrostImageGenerationResponse() ever returns (nil, nil), the next line dereferences bifrostErr and will panic. If that converter guarantees “never (nil, nil)”, we’re fine. If not, a tiny guard avoids the nil-deref without changing behavior:
```go
resp, convErr := geminiResponse.ToBifrostImageGenerationResponse()
if convErr != nil {
	convErr.ExtraFields = schemas.BifrostErrorExtraFields{
		Provider:       provider.GetProviderKey(),
		ModelRequested: request.Model,
		RequestType:    schemas.ImageGenerationRequest,
	}
	return nil, convErr
}
if resp == nil {
	convErr := providerUtils.NewBifrostOperationError(
		schemas.ErrProviderResponseDecode,
		fmt.Errorf("empty image generation response"),
		provider.GetProviderKey(),
	)
	convErr.ExtraFields = schemas.BifrostErrorExtraFields{
		Provider:       provider.GetProviderKey(),
		ModelRequested: request.Model,
		RequestType:    schemas.ImageGenerationRequest,
	}
	return nil, convErr
}
```

If you'd like, I can open a follow-up issue to track this as a non-blocking hardening item.
}If you’d like, I can open a follow-up issue to track this as a non-blocking hardening item.
```go
// ToBifrostImageGenerationRequest converts an OpenAI image generation request to Bifrost format
func (request *OpenAIImageGenerationRequest) ToBifrostImageGenerationRequest() *schemas.BifrostImageGenerationRequest {
	if request == nil {
		return nil
	}

	provider, model := schemas.ParseModelString(request.Model, schemas.OpenAI)

	// Only set Params if the embedded struct is non-empty to avoid always emitting empty params
	var params *schemas.ImageGenerationParameters
	if request.N != nil || request.Background != nil || request.Moderation != nil ||
		request.PartialImages != nil || request.Size != nil || request.Quality != nil ||
		request.OutputCompression != nil || request.OutputFormat != nil || request.Style != nil ||
		request.ResponseFormat != nil || request.Seed != nil || request.NegativePrompt != nil ||
		request.NumInferenceSteps != nil || request.User != nil ||
		len(request.ExtraParams) > 0 {
		params = &request.ImageGenerationParameters
	}

	return &schemas.BifrostImageGenerationRequest{
		Provider: provider,
		Model:    model,
		Input: &schemas.ImageGenerationInput{
			Prompt: request.Prompt,
		},
		Params:    params,
		Fallbacks: schemas.ParseFallbacks(request.Fallbacks),
	}
}
```
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Find ParseModelString implementation and usage
echo "=== ParseModelString definition and calls ==="
rg -n 'ParseModelString|func.*ParseModelString' --type go -A 5 | head -100Repository: maximhq/bifrost
Length of output: 7714
🏁 Script executed:
#!/bin/bash
# Search for where model strings are created/assigned on ingress
echo "=== Model string construction patterns ==="
rg -n 'Model:\s*|\.Model\s*=' --type go core | head -50Repository: maximhq/bifrost
Length of output: 3464
🏁 Script executed:
#!/bin/bash
# Look for model validation or prefix enforcement
echo "=== Model format validation/prefix patterns ==="
rg -n 'x-model-provider|provider.*prefix|model.*prefix|FormatModelString' --type go -i | head -50Repository: maximhq/bifrost
Length of output: 6181
🏁 Script executed:
#!/bin/bash
# Find ParseModelString implementation
rg -n 'func.*ParseModelString|^func ParseModelString' --type go -A 20 core/schemas/Repository: maximhq/bifrost
Length of output: 1294
🏁 Script executed:
#!/bin/bash
# Check the validation logic at ingress points
cat -n transports/bifrost-http/handlers/inference.go | sed -n '565,570p'Repository: maximhq/bifrost
Length of output: 346
🏁 Script executed:
#!/bin/bash
# Look at core/schemas/utils.go to see ParseModelString implementation details
cat -n core/schemas/utils.go | head -40Repository: maximhq/bifrost
Length of output: 1469
🏁 Script executed:
#!/bin/bash
# Find how ToBifrostImageGenerationRequest is called
rg -n 'ToBifrostImageGenerationRequest' --type go -B 3 -A 3Repository: maximhq/bifrost
Length of output: 2545
🏁 Script executed:
#!/bin/bash
# Check if there are any stack/PR references or context
git log --oneline -20 2>/dev/null || echo "No git history available"Repository: maximhq/bifrost
Length of output: 155
🏁 Script executed:
#!/bin/bash
# Look for the OpenAI image generation request handler to understand the flow
rg -n 'ImageGeneration|openai.*image' --type go transports/bifrost-http/integrations/openai.go | head -30Repository: maximhq/bifrost
Length of output: 694
🏁 Script executed:
#!/bin/bash
# Check if there are any direct instantiations or calls to ToBifrostImageGenerationRequest
# that might bypass the ingress validation
rg -n 'ToBifrostImageGenerationRequest|OpenAIImageGenerationRequest' --type go -B 5 | grep -E '(ToBifrostImageGenerationRequest|NewOpenAIImageGenerationRequest|&OpenAIImageGenerationRequest)' | head -40Repository: maximhq/bifrost
Length of output: 966
🏁 Script executed:
#!/bin/bash
# Trace the flow: verify that RequestConverter is only called after ingress validation
rg -n 'RequestConverter|req\.Model' --type go transports/bifrost-http/integrations/openai.go -A 2 -B 2 | head -50Repository: maximhq/bifrost
Length of output: 2456
🏁 Script executed:
#!/bin/bash
# Check if openai integration validates model format before calling RequestConverter
cat -n transports/bifrost-http/integrations/openai.go | sed -n '1,100p'Repository: maximhq/bifrost
Length of output: 4066
🏁 Script executed:
#!/bin/bash
# Verify that the OpenAI integration goes through handler validation before RequestConverter
# Look at the handler registration and see how requests flow through validation
rg -n 'ImageGenerationHandler|images/generations' --type go transports/bifrost-http/ -B 5 -A 10 | head -80Repository: maximhq/bifrost
Length of output: 6020
🏁 Script executed:
#!/bin/bash
# Check if there's a direct handler for image generation or if it uses standard route patterns
fd -type f -name "*.go" transports/bifrost-http/handlers/ -exec grep -l "image" {} \;Repository: maximhq/bifrost
Length of output: 229
🏁 Script executed:
#!/bin/bash
# Verify the complete flow: examine how standard HTTP requests are validated before reaching integration layer
rg -n 'DefaultRoute|handleRequest|RequestConverter' --type go transports/bifrost-http/lib/ | head -20Repository: maximhq/bifrost
Length of output: 41
🏁 Script executed:
#!/bin/bash
# Look at the imageGeneration handler implementation to see if it validates model format
cat -n transports/bifrost-http/handlers/inference.go | sed -n '1417,1450p'Repository: maximhq/bifrost
Length of output: 1487
🏁 Script executed:
#!/bin/bash
# Check what ImageGenerationHTTPRequest looks like and if validation happens
rg -n 'type ImageGenerationHTTPRequest|model.*validation|ParseModelString.*imageGeneration' --type go transports/bifrost-http/handlers/ -A 5Repository: maximhq/bifrost
Length of output: 484
🏁 Script executed:
#!/bin/bash
# Continue reading the imageGeneration handler to see what happens after model validation
cat -n transports/bifrost-http/handlers/inference.go | sed -n '1450,1520p'Repository: maximhq/bifrost
Length of output: 3106
🏁 Script executed:
#!/bin/bash
# Check if the handler uses the integration router or does custom processing
rg -n 'router\.|integration\.|BifrostRequest' --type go transports/bifrost-http/handlers/inference.go | grep -A 5 -B 5 'imageGeneration'Repository: maximhq/bifrost
Length of output: 41
🏁 Script executed:
#!/bin/bash
# Check if there are any other integration paths for image generation besides OpenAI
rg -n 'ImageGeneration|images/generations' --type go transports/bifrost-http/integrations/ | grep -v openai.go | head -20Repository: maximhq/bifrost
Length of output: 3201
🏁 Script executed:
#!/bin/bash
# Verify the GenericRouter and how it processes requests vs the handler approach
rg -n 'GenericRouter|handleRequest|RequestParser.*RequestConverter' --type go transports/bifrost-http/integrations/router.go -B 3 -A 3 | head -60Repository: maximhq/bifrost
Length of output: 3368
🏁 Script executed:
#!/bin/bash
# Check if OpenAI integration routes handle x-model-provider header or similar overrides
rg -n 'x-model-provider|PreHook|PostHook' --type go transports/bifrost-http/integrations/openai.go | head -30Repository: maximhq/bifrost
Length of output: 1343
🏁 Script executed:
#!/bin/bash
# Check if there's validation or preprocessing in the OpenAI router before RequestConverter
cat -n transports/bifrost-http/integrations/openai.go | sed -n '1,50p'Repository: maximhq/bifrost
Length of output: 2139
🏁 Script executed:
#!/bin/bash
# Examine a complete route config for image generation to see if there are pre-hooks
cat -n transports/bifrost-http/integrations/openai.go | sed -n '415,445p'Repository: maximhq/bifrost
Length of output: 1490
🏁 Script executed:
#!/bin/bash
# Check route registration order and understand which handler takes precedence
# Look for how routes are registered in the main app
rg -n 'RegisterRoutes|GenericRouter|OpenAIRouter' --type go transports/bifrost-http/ | grep -E '(main|setup|register)' | head -20Repository: maximhq/bifrost
Length of output: 1717
🏁 Script executed:
#!/bin/bash
# Check if both the inference handler and openai integration both register for the same path
# and understand the precedence
rg -n 'RegisterRoutes|createHandler.*imageGeneration|CreateOpenAIRoutes' --type go transports/bifrost-http/handlers/ transports/bifrost-http/integrations/ | head -30Repository: maximhq/bifrost
Length of output: 4020
🏁 Script executed:
#!/bin/bash
# Look for the main entry point to understand which handlers are registered
find transports/bifrost-http -name "main.go" -o -name "server.go" -o -name "router.go" | head -5Repository: maximhq/bifrost
Length of output: 178
🏁 Script executed:
#!/bin/bash
# Check main.go to see the order of route registration
cat -n transports/bifrost-http/main.goRepository: maximhq/bifrost
Length of output: 8002
🏁 Script executed:
#!/bin/bash
# Check the server setup
cat -n transports/bifrost-http/server/server.go | head -100Repository: maximhq/bifrost
Length of output: 4519
🏁 Script executed:
#!/bin/bash
# Look for the Bootstrap method to see route registration order
cat -n transports/bifrost-http/server/server.go | sed -n '100,300p' | grep -A 50 'func.*Bootstrap'Repository: maximhq/bifrost
Length of output: 41
🏁 Script executed:
#!/bin/bash
# Search for where handlers are registered
rg -n 'CompletionHandler.*RegisterRoutes|IntegrationHandler.*RegisterRoutes|OpenAIRouter' --type go transports/bifrost-http/server/ -A 3 -B 3Repository: maximhq/bifrost
Length of output: 41
🏁 Script executed:
#!/bin/bash
# Find where handlers are actually registered in the server
rg -n 'RegisterRoutes' --type go transports/bifrost-http/server/server.go -A 2 -B 2Repository: maximhq/bifrost
Length of output: 1932
🏁 Script executed:
#!/bin/bash
# Search for where CompletionHandler and IntegrationHandler are created/used
rg -n 'CompletionHandler|IntegrationHandler' --type go transports/bifrost-http/server/server.go -B 3 -A 3Repository: maximhq/bifrost
Length of output: 561
🏁 Script executed:
#!/bin/bash
# Confirm the route precedence - check if fasthttp router allows duplicate routes or if first wins
rg -n 'router.Router|POST.*images' --type go transports/bifrost-http/handlers/inference.go -A 1 -B 1 | head -40Repository: maximhq/bifrost
Length of output: 719
🏁 Script executed:
#!/bin/bash
# Check if both handlers try to register the exact same path
rg -n '/v1/images/generations' --type go transports/bifrost-http/Repository: maximhq/bifrost
Length of output: 422
🏁 Script executed:
#!/bin/bash
# Critical finding: both routes registered for /v1/images/generations
# Check if there are other OpenAI integration paths that might bypass inference handler validation
rg -n 'CreateOpenAIRoutes|/images/generations|/openai/' --type go transports/bifrost-http/integrations/openai.go | head -30Repository: maximhq/bifrost
Length of output: 657
🏁 Script executed:
#!/bin/bash
# Check the other image generation paths in OpenAI integration
cat -n transports/bifrost-http/integrations/openai.go | sed -n '416,422p'Repository: maximhq/bifrost
Length of output: 326
🏁 Script executed:
#!/bin/bash
# Verify which handler actually handles the standard /v1/images/generations requests
# by checking if inference handler truly validates before calling ToBifrostImageGenerationRequest
cat -n transports/bifrost-http/handlers/inference.go | sed -n '1427,1465p'Repository: maximhq/bifrost
Length of output: 1575
🏁 Script executed:
#!/bin/bash
# Check what pathPrefix is used for OpenAI integration
rg -n 'CreateOpenAIRoutes|pathPrefix' --type go transports/bifrost-http/handlers/integrations.go -A 5 -B 5Repository: maximhq/bifrost
Length of output: 41
🏁 Script executed:
#!/bin/bash
# Verify if model format validation is enforced at the OpenAI integration level
# Check if there's any validation in the RequestParser or PreCallback
rg -n 'RequestParser|PreCallback' --type go transports/bifrost-http/integrations/openai.go | grep -A 2 "Image\|image"Repository: maximhq/bifrost
Length of output: 41
Fix the unvalidated model string handling in the OpenAI integration's image generation route.
The integration handler for /v1/images/generations (registered first) bypasses model format validation and directly calls ToBifrostImageGenerationRequest(), which defaults to schemas.OpenAI when the model string lacks a provider prefix. This creates a routing bug: unprefixed model strings (e.g., gpt-4o) will always be assigned the OpenAI provider, even when the request should be routed to another provider based on gateway configuration.
Add model format validation in the OpenAI integration's image generation route (either in a PreCallback or RequestParser) to enforce that model strings follow the provider/model format, matching the validation already present in the inference handler.
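A hedged sketch of the kind of check being requested; the function name and error text below are illustrative, not the existing handler code:

```go
// Sketch: reject model strings without an explicit "provider/model" prefix
// before converting the request, so routing never silently defaults to OpenAI.
package main

import (
	"fmt"
	"strings"
)

func requireProviderPrefix(model string) (provider, name string, err error) {
	parts := strings.SplitN(model, "/", 2)
	if len(parts) != 2 || parts[0] == "" || parts[1] == "" {
		return "", "", fmt.Errorf("model %q must be in provider/model format (e.g. openai/dall-e-3)", model)
	}
	return parts[0], parts[1], nil
}

func main() {
	if _, _, err := requireProviderPrefix("gpt-4o"); err != nil {
		fmt.Println(err) // rejected: no provider prefix
	}
	p, m, _ := requireProviderPrefix("openai/dall-e-3")
	fmt.Println(p, m)
}
```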
```go
type ImageGenerationEventType string

// OpenAIImageGenerationRequest is the struct for Image Generation requests by OpenAI.
type OpenAIImageGenerationRequest struct {
	Model  string `json:"model"`
	Prompt string `json:"prompt"`

	schemas.ImageGenerationParameters

	Stream    *bool    `json:"stream,omitempty"`
	Fallbacks []string `json:"fallbacks,omitempty"`
}

// IsStreamingRequested implements the StreamingRequest interface
func (r *OpenAIImageGenerationRequest) IsStreamingRequested() bool {
	return r.Stream != nil && *r.Stream
}

// OpenAIImageStreamResponse is the struct for Image Generation streaming responses by OpenAI.
type OpenAIImageStreamResponse struct {
	Type              ImageGenerationEventType `json:"type,omitempty"`
	SequenceNumber    *int                     `json:"sequence_number,omitempty"`
	B64JSON           *string                  `json:"b64_json,omitempty"`
	PartialImageIndex *int                     `json:"partial_image_index,omitempty"`
	CreatedAt         int64                    `json:"created_at,omitempty"`
	Size              string                   `json:"size,omitempty"`
	Quality           string                   `json:"quality,omitempty"`
	Background        string                   `json:"background,omitempty"`
	OutputFormat      string                   `json:"output_format,omitempty"`
	RawSSE            string                   `json:"-"` // For internal use
	Usage             *schemas.ImageUsage      `json:"usage,omitempty"`
	// Error fields for error events
	Error *struct {
		Code    *string `json:"code,omitempty"`
		Message string  `json:"message,omitempty"`
		Param   *string `json:"param,omitempty"`
		Type    *string `json:"type,omitempty"`
	} `json:"error,omitempty"`
}
```
🧩 Analysis chain
🏁 Script executed:
# Search for ImageGenerationEventType definition and usage
rg "ImageGenerationEventType" -nRepository: maximhq/bifrost
Length of output: 565
🏁 Script executed:
# Search for image generation event type constants that may already exist
rg "image_generation\.(partial_image|completed|error)" -nRepository: maximhq/bifrost
Length of output: 1418
🏁 Script executed:
# Check the context around the file to understand the structure
git log --oneline -n 20 -- core/providers/openai/types.go | head -10Repository: maximhq/bifrost
Length of output: 155
🏁 Script executed:
# Look for similar event type constants in the repo
rg "EventType.*string|event.*const" -A 3 -B 1 --type go | head -50Repository: maximhq/bifrost
Length of output: 1656
Move image generation event type constants to types.go alongside the type definition.
The constants (ImageGenerationPartial, ImageGenerationCompleted, ImageGenerationError) currently defined in images.go should be moved to types.go where ImageGenerationEventType is declared. This follows the repo convention established by other providers (Mistral, Anthropic, Cohere), where event type definitions and their constants are co-located for better organization and discoverability.
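A small sketch of the suggested co-location; the constant names come from this review, while the exact string values shown are assumptions based on the event names referenced above:

```go
// Sketch: keep the event constants next to the type they belong to.
package main

import "fmt"

type ImageGenerationEventType string

const (
	// Values assumed from the SSE event names seen in this review.
	ImageGenerationPartial   ImageGenerationEventType = "image_generation.partial_image"
	ImageGenerationCompleted ImageGenerationEventType = "image_generation.completed"
	ImageGenerationError     ImageGenerationEventType = "image_generation.error"
)

func main() {
	fmt.Println(ImageGenerationCompleted)
}
```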
🤖 Prompt for AI Agents
In `@core/providers/openai/types.go` around lines 546 - 584, Move the
ImageGeneration event constants from images.go into types.go next to the
ImageGenerationEventType declaration: define ImageGenerationPartial,
ImageGenerationCompleted, and ImageGenerationError as constants of type
ImageGenerationEventType (e.g. const ImageGenerationPartial
ImageGenerationEventType = "partial", etc.), remove their definitions from
images.go, and update any references/imports if needed so code uses the
relocated constants; keep the constants in types.go immediately adjacent to the
ImageGenerationEventType type definition for consistency with other providers.
```go
// Auth query is used for fine-tuned models to pass the API key in the query string
authQuery := ""
// Determine the URL based on model type
var completeURL string
if schemas.IsAllDigitsASCII(deployment) {
	// Custom Fine-tuned models use OpenAPI endpoint
	projectNumber := key.VertexKeyConfig.ProjectNumber
	if projectNumber == "" {
		return nil, providerUtils.NewConfigurationError("project number is not set for fine-tuned models", providerName)
	}
	if key.Value != "" {
		authQuery = fmt.Sprintf("key=%s", url.QueryEscape(key.Value))
	}
	if region == "global" {
		completeURL = fmt.Sprintf("https://aiplatform.googleapis.com/v1beta1/projects/%s/locations/global/endpoints/%s:generateContent", projectNumber, deployment)
	} else {
		completeURL = fmt.Sprintf("https://%s-aiplatform.googleapis.com/v1beta1/projects/%s/locations/%s/endpoints/%s:generateContent", region, projectNumber, region, deployment)
	}

} else if schemas.IsImagenModel(deployment) {
	// Imagen models are published models, use publishers/google/models path
	if key.Value != "" {
		authQuery = fmt.Sprintf("key=%s", url.QueryEscape(key.Value))
	}
	if region == "global" {
		completeURL = fmt.Sprintf("https://aiplatform.googleapis.com/v1/projects/%s/locations/global/publishers/google/models/%s:predict", projectID, deployment)
	} else {
		completeURL = fmt.Sprintf("https://%s-aiplatform.googleapis.com/v1/projects/%s/locations/%s/publishers/google/models/%s:predict", region, projectID, region, deployment)
	}
} else if schemas.IsGeminiModel(deployment) {
	if key.Value != "" {
		authQuery = fmt.Sprintf("key=%s", url.QueryEscape(key.Value))
	}
	if region == "global" {
		completeURL = fmt.Sprintf("https://aiplatform.googleapis.com/v1/projects/%s/locations/global/publishers/google/models/%s:generateContent", projectID, deployment)
	} else {
		completeURL = fmt.Sprintf("https://%s-aiplatform.googleapis.com/v1/projects/%s/locations/%s/publishers/google/models/%s:generateContent", region, projectID, region, deployment)
	}
}

// Create HTTP request for streaming
req := fasthttp.AcquireRequest()
resp := fasthttp.AcquireResponse()
defer fasthttp.ReleaseRequest(req)
defer fasthttp.ReleaseResponse(resp)

req.Header.SetMethod(http.MethodPost)
req.Header.SetContentType("application/json")
providerUtils.SetExtraHeaders(ctx, req, provider.networkConfig.ExtraHeaders, nil)

// If auth query is set, add it to the URL
// Otherwise, get the oauth2 token and set the Authorization header
if authQuery != "" {
	completeURL = fmt.Sprintf("%s?%s", completeURL, authQuery)
} else {
	// Getting oauth2 token
	tokenSource, err := getAuthTokenSource(key)
	if err != nil {
		return nil, providerUtils.NewBifrostOperationError("error creating auth token source", err, schemas.Vertex)
	}
	token, err := tokenSource.Token()
	if err != nil {
		return nil, providerUtils.NewBifrostOperationError("error getting token", err, schemas.Vertex)
	}
	req.Header.Set("Authorization", "Bearer "+token.AccessToken)
}
```
🧩 Analysis chain
🌐 Web query:
Vertex AI Imagen predict endpoint API key authentication support
💡 Result:
Short answer: No — the Vertex AI Imagen predict endpoint does not accept API key authentication. Use OAuth2 (service-account or user access tokens / Application Default Credentials) and send a Bearer token in the Authorization header. See the Imagen predict examples that use "Authorization: Bearer $(gcloud auth print-access-token)". [1][2][3]
Example (curl):
curl -X POST -H "Authorization: Bearer $(gcloud auth print-access-token)" -H "Content-Type: application/json" "https://{LOCATION}-aiplatform.googleapis.com/v1/projects/{PROJECT}/locations/{LOCATION}/publishers/google/models/{MODEL}:predict" -d '{ "instances":[{ "prompt":"..." }] }'
References:
- Imagen API docs (predict examples use Bearer token). [1]
- models.predict REST reference. [2]
- Issue/discussion reporting that Vertex AI rejects API keys ("API keys are not supported by this API"). [3]
🌐 Web query:
Vertex AI custom fine-tuned models endpoint authentication requirements
💡 Result:
Short answer — Vertex AI endpoints require Google OAuth 2.0 bearer tokens (service-account or user identity via Application Default Credentials / gcloud). API keys are not supported. You can call endpoints with an access/identity token in the Authorization: Bearer header, and for deployed custom models you may configure a custom service account (prediction containers otherwise run with a Vertex-managed service account). [1][3][4]
Sources:
- How to authenticate API requests to Vertex AI (use gcloud/ADC/service account tokens). [1][4]
- Vertex AI deployed prediction containers and custom service account behavior. [3]
🌐 Web query:
Vertex AI Gemini generateContent endpoint API key vs OAuth2 authentication
💡 Result:
Short answer — both work, but use API keys for quick tests and OAuth2/service‑account (Application Default Credentials) for production.
Key differences
- How to send:
- API key: supply x-goog-api-key header (or ?key=) when calling generateContent. [1][3]
- OAuth2 / service account: obtain an access token and send Authorization: Bearer . Vertex APIs accept Google Cloud credentials/ADC. [2]
- Security & scope:
- API keys are simple but bearer-like: less granular, riskier if leaked, and should be restricted (HTTP referrer/IP/app restrictions). Not recommended for client apps. [1]
- OAuth2/service accounts use IAM, scoped OAuth tokens, fine-grained permissions, audit logs, and are recommended for production. [1][2]
- Use cases:
- API key: rapid testing, prototypes, demos. [1]
- OAuth2/service account (ADC): production servers, pipelines, backend services, CI/CD, and cases needing IAM/auditing. [1][2]
- Quotas, billing, and IAM:
- Both consume project quota/billing, but only OAuth2 ties calls to a service account / IAM policy for fine-grained access control and auditing. [2]
Recommendation
- Development/test: API key with restrictions. [1]
- Production: use OAuth2 via service accounts / Application Default Credentials (ADC) for secure, auditable access. [1][2]
Sources
[1] Google Cloud — Get a Google Cloud API key (Vertex AI / Gemini).
[2] Google Cloud — Vertex AI Generative AI REST reference (generateContent endpoints).
[3] Google — Gemini API authentication (x-goog-api-key example).
API-key authentication will not work for Vertex AI Imagen and custom fine-tuned model endpoints.
The code uses ?key=... query parameter authentication for custom fine-tuned models (lines 1503–1509), Imagen (lines 1511–1520), and Gemini (lines 1522–1530). However, Vertex AI Imagen predict endpoints and custom fine-tuned model endpoints do not accept API key authentication and will return "API keys are not supported by this API" errors. These endpoints require OAuth2 Bearer tokens.
Gemini's generateContent endpoint does support API key auth, so that path is acceptable. However, for consistency and to support all model types reliably, switch to OAuth2 (Bearer tokens via ADC or service-account credentials) for all three branches, or remove API key support from Imagen and custom fine-tuned model paths.
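For reference, a minimal sketch of the Bearer-token path using Application Default Credentials via `golang.org/x/oauth2/google` (assumes that module is available; the project, region, and model segments in the URL are placeholders, and this stands in for the provider's `getAuthTokenSource` flow rather than reproducing it):

```go
// Sketch: build the Authorization header for Imagen / fine-tuned Vertex calls
// from ADC instead of a ?key=... query parameter.
package main

import (
	"context"
	"fmt"
	"net/http"

	"golang.org/x/oauth2/google"
)

func vertexAuthHeader(ctx context.Context) (string, error) {
	ts, err := google.DefaultTokenSource(ctx, "https://www.googleapis.com/auth/cloud-platform")
	if err != nil {
		return "", fmt.Errorf("creating token source: %w", err)
	}
	tok, err := ts.Token()
	if err != nil {
		return "", fmt.Errorf("fetching token: %w", err)
	}
	return "Bearer " + tok.AccessToken, nil
}

func main() {
	ctx := context.Background()
	auth, err := vertexAuthHeader(ctx)
	if err != nil {
		fmt.Println("ADC not configured:", err)
		return
	}
	req, _ := http.NewRequestWithContext(ctx, http.MethodPost,
		"https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/IMAGEN_MODEL:predict", nil)
	req.Header.Set("Authorization", auth) // never ?key=... for Imagen or fine-tuned endpoints
	fmt.Println(req.Header.Get("Authorization") != "")
}
```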
🤖 Prompt for AI Agents
In `@core/providers/vertex/vertex.go` around lines 1493 - 1559, The
IsAllDigitsASCII (custom fine-tuned) and IsImagenModel branches currently set
authQuery from key.Value which enables API-key auth; update those branches to
stop populating authQuery and instead always obtain an OAuth2 token via
getAuthTokenSource(key) and set req.Header.Set("Authorization", "Bearer
"+token.AccessToken) (same approach already used later), keeping the existing
projectNumber/URL construction logic via completeURL; leave the IsGeminiModel
branch behavior unchanged if you want to continue supporting API key for Gemini.
Ensure authQuery remains empty for Imagen and fine-tuned paths so the later "if
authQuery != \"\" { completeURL = fmt.Sprintf(...)} else { tokenSource... }"
flow uses OAuth2 for those models.
```yaml
size:
  type: string
  enum:
    - "256x256"
    - "512x512"
    - "1024x1024"
    - "1792x1024"
    - "1024x1792"
    - "1536x1024"
    - "1024x1536"
  description: Size of the generated image
quality:
  type: string
  enum:
    - "standard"
    - "hd"
  description: Quality of the generated image
style:
  type: string
  enum:
    - "natural"
    - "vivid"
  description: Style of the generated image
response_format:
  type: string
  enum:
    - "url"
    - "b64_json"
  default: "url"
  description: Format of the response. This parameter is not supported for streaming requests.
user:
  type: string
  description: User identifier for tracking
stream:
  type: boolean
  default: false
  description: |
    Whether to stream the response. When true, images are sent as base64 chunks via SSE.
fallbacks:
  type: array
  items:
    type: string
  description: Fallback models to try if primary model fails
```
🧩 Analysis chain
🌐 Web query:
OpenAI Images API supported size values 1536x1024 1024x1536 quality style
💡 Result:
Yes — the Images API supports those sizes and quality options.
- Supported sizes: 1024x1024, 1536x1024 (landscape), 1024x1536 (portrait), plus "auto". [1]
- Quality options: low, medium, high, and "auto" (affects token usage/cost). [1]
- Cost/token differences by size × quality are documented per image model (e.g., gpt-image-1/gpt-image-1.5). [2][3]
- There is no separate "style" parameter in the Images API docs — control style via your prompt (or model choice); other configurable fields include format, compression, background, and action. [1][3]
Sources: OpenAI Images guide and model docs. [1][2][3]
References
[1] OpenAI — Image generation (Customize output: size, quality, auto).
[2] OpenAI — GPT Image 1 model pricing (sizes/quality).
[3] OpenAI — GPT Image 1.5 / chatgpt-image-latest model pages (sizes/quality).
🌐 Web query:
OpenAI Images API official documentation parameters size quality style
💡 Result:
Short summary of the Images API parameters you asked about:
- size — Image dimensions (examples: "1024x1024", "1536x1024", "1024x1536"); also supports "auto" (model picks). Default 1024x1024. (platform.openai.com)
- quality — Rendering quality options: "low", "medium", "high" (and "auto"); for DALL·E‑3 there is also "standard" (default) and "hd" (higher‑quality/higher‑cost). (platform.openai.com)
- style — Optional control for overall look; current valid values include "vivid" (default for some endpoints) and "natural". (help.openai.com)
For full details (model-specific allowed sizes, defaults, compression/output format, and token/cost effects) see the official Image Generation docs and DALL·E 3 help article. (platform.openai.com)
Citations:
- 1: https://platform.openai.com/docs/guides/image-generation?utm_source=openai
- 2: https://platform.openai.com/docs/guides/image-generation?utm_source=openai
- 3: https://help.openai.com/en/articles/8555480-dall-e-3-api?utm_source=openai
- 4: https://platform.openai.com/docs/guides/image-generation?utm_source=openai
🌐 Web query:
OpenAI Images API streaming SSE support image generation
💡 Result:
Yes — OpenAI’s Images/Responses APIs support streaming image generation (Server‑Sent Events style). You can call the API with stream=true and use the partial_images parameter (0–3) to receive partial image frames as they are generated; SDK examples and event types are shown in the docs. [1][2][4]
Correct the size enum — remove unsupported OpenAI values and add "auto".
The size enum includes values not supported by OpenAI's official Images API:

- `"256x256"`, `"512x512"`, `"1792x1024"`, and `"1024x1792"` do not exist in OpenAI's API
- The supported OpenAI sizes are `"1024x1024"`, `"1536x1024"` (landscape), `"1024x1536"` (portrait), and `"auto"` (model-selected; default is `1024x1024`)
- Adjust to match the official OpenAI documentation
The quality enum ("standard", "hd") and style enum ("natural", "vivid") are accurate for DALL·E 3, but note that other OpenAI models support different quality values ("low", "medium", "high", "auto"). If this schema targets DALL·E 3 exclusively, document that; otherwise, clarify which models these parameters apply to.
Streaming is supported by OpenAI's Images API (with stream=true and partial_images parameter), but verify the exact event schema/types for OpenAIImageStreamResponse match OpenAI's official streaming format before finalizing.
Also applies to: 92-129
🤖 Prompt for AI Agents
In `@docs/openapi/schemas/integrations/openai/images.yaml` around lines 21 - 63,
Update the size enum in the images schema to match OpenAI's supported values:
remove "256x256", "512x512", "1792x1024", and "1024x1792" and add "auto",
leaving "1024x1024", "1536x1024", and "1024x1536"; ensure the default remains or
is set to "1024x1024" as documented. Keep the existing quality and style enums
if this schema is DALL·E 3-specific, but add a short note in the schema
description for the quality/style fields (quality, style) clarifying they apply
to DALL·E 3 only or expand allowed values if the schema is meant for multiple
OpenAI models. Also verify the stream field description and the
response_format/streaming contract (response_format, stream) against OpenAI's
official streaming event shape (partial_images/SSE) and update the
OpenAIImageStreamResponse types elsewhere if needed.
# 8. Image Generation

Gemini supports two image generation formats depending on the model:

1. **Standard Gemini Format**: Uses the `/v1beta/models/{model}:generateContent` endpoint
2. **Imagen Format**: Uses the `/v1beta/models/{model}:predict` endpoint for Imagen models (detected automatically)

## Request Parameters

### Parameter Mapping

| Parameter | Transformation |
|-----------|----------------|
| `input.prompt` | Text description of the image to generate |
| `params.n` | Number of images (mapped to `sampleCount` for Imagen, `candidateCount` for Gemini) |
| `params.size` | Image size in WxH format (e.g., `"1024x1024"`). Converted to Imagen's `imageSize` + `aspectRatio` format |
| `params.output_format` | Output format: `"png"`, `"jpeg"`, `"webp"`. Converted to MIME type for Imagen |
| `params.seed` | Seed for reproducible generation (passed directly) |
| `params.negative_prompt` | Negative prompt (passed directly) |

### Extra Parameters

Use `extra_params` (SDK) or pass directly in request body (Gateway) for Gemini-specific fields:

| Parameter | Type | Notes |
|-----------|------|-------|
| `personGeneration` | string | Person generation setting (Imagen only) |
| `language` | string | Language code (Imagen only) |
| `enhancePrompt` | bool | Prompt enhancement flag (Imagen only) |
| `safetySettings` / `safety_settings` | string/array | Safety settings configuration |
| `cachedContent` / `cached_content` | string | Cached content ID |
| `labels` | object | Custom labels map |

<Tabs>
<Tab title="Gateway">

```bash
curl -X POST http://localhost:8080/v1/images/generations \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini/imagen-4.0-generate-001",
    "prompt": "A sunset over the mountains",
    "size": "1024x1024",
    "n": 2,
    "output_format": "png"
  }'
```

</Tab>
<Tab title="Go SDK">

```go
resp, err := client.ImageGenerationRequest(ctx, &schemas.BifrostImageGenerationRequest{
    Provider: schemas.Gemini,
    Model:    "imagen-4.0-generate-001",
    Input: &schemas.ImageGenerationInput{
        Prompt: "A sunset over the mountains",
    },
    Params: &schemas.ImageGenerationParameters{
        Size:         schemas.Ptr("1024x1024"),
        N:            schemas.Ptr(2),
        OutputFormat: schemas.Ptr("png"),
    },
})
```

</Tab>
</Tabs>

## Request Conversion

### Standard Gemini Format

- **Model & Prompt**: `bifrostReq.Model` → `req.Model`, `bifrostReq.Input.Prompt` → `req.Contents[0].Parts[0].Text`
- **Response Modality**: Sets `generationConfig.responseModalities = ["IMAGE"]` to indicate image generation
- **Number of Images**: `params.n` → `generationConfig.candidateCount`
- **Extra Parameters**: `safetySettings`, `cachedContent`, `labels` mapped directly

### Imagen Format

- **Prompt**: `bifrostReq.Input.Prompt` → `req.Instances[0].Prompt`
- **Number of Images**: `params.n` → `req.Parameters.SampleCount`
- **Size Conversion**: `params.size` (WxH format) converted to:
  - `imageSize`: `"1k"` (if dimensions ≤ 1024) or `"2k"` (if dimensions ≤ 2048)
  - `aspectRatio`: `"1:1"`, `"3:4"`, `"4:3"`, `"9:16"`, or `"16:9"` (based on width/height ratio)
- **Output Format**: `params.output_format` (`"png"`, `"jpeg"`) → `parameters.outputOptions.mimeType` (`"image/png"`, `"image/jpeg"`)
- **Seed & Negative Prompt**: Passed directly to `parameters.seed` and `parameters.negativePrompt`
- **Extra Parameters**: `personGeneration`, `language`, `enhancePrompt`, `safetySettings` mapped to parameters (an illustrative payload sketch follows this list)
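
Taken together, these rules produce a `:predict` payload roughly like the following. This is a sketch only: it uses plain maps and example values, not the provider's actual request types.

```go
// Rough shape of the Imagen :predict body assembled from the fields above.
// Plain maps and example values are used for illustration; the provider
// builds this with its own typed request structs.
imagenBody := map[string]any{
	"instances": []map[string]any{
		{"prompt": "A sunset over the mountains"}, // Input.Prompt
	},
	"parameters": map[string]any{
		"sampleCount": 2,     // params.n
		"imageSize":   "1k",  // derived from params.size ("1024x1024")
		"aspectRatio": "1:1", // derived from params.size
		"outputOptions": map[string]any{
			"mimeType": "image/png", // params.output_format → MIME type
		},
		// seed, negativePrompt, personGeneration, language, and
		// enhancePrompt are added here when supplied.
	},
}
_ = imagenBody
```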

## Response Conversion

### Standard Gemini Format

- **Image Data**: Extracts `InlineData` from `candidates[0].content.parts[]` with MIME type `image/*`
- **Output Format**: Converts MIME type (`image/png`, `image/jpeg`, `image/webp`) → file extension (`png`, `jpeg`, `webp`)
- **Usage**: Extracts token usage from `usageMetadata`
- **Multiple Images**: Each image part becomes an `ImageData` entry in the response array

### Imagen Format

- **Image Data**: Each `prediction` in `response.predictions[]` → `ImageData` with `b64_json` from `bytesBase64Encoded`
- **Output Format**: Converts `prediction.mimeType` → file extension for `outputFormat` field
- **Index**: Each prediction gets an `index` (0, 1, 2, ...) in the response array

## Size Conversion

For Imagen format, size is converted between formats, as sketched after this list:
- **Bifrost → Imagen**: `"1024x1024"` → `imageSize: "1k"`, `aspectRatio: "1:1"`
- **Bifrost → Imagen**: `"768x1024"` → `imageSize: "1k"`, `aspectRatio: "3:4"`
- **Imagen → Bifrost**: `imageSize: "1k"`, `aspectRatio: "1:1"` → `"1024x1024"`
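
A minimal sketch of the Bifrost → Imagen direction, with the helper name and ratio bucketing assumed rather than taken from the provider's code:

```go
import "fmt"

// toImagenSize is a hypothetical helper that mirrors the rules above:
// "1k" vs "2k" by dimension, plus a coarse aspect-ratio bucket from W:H.
// The provider's real conversion may differ in details.
func toImagenSize(size string) (imageSize, aspectRatio string) {
	var w, h int
	if _, err := fmt.Sscanf(size, "%dx%d", &w, &h); err != nil || w <= 0 || h <= 0 {
		return "1k", "1:1" // fall back to defaults on unparseable input
	}
	if w <= 1024 && h <= 1024 {
		imageSize = "1k"
	} else {
		imageSize = "2k"
	}
	switch {
	case w == h:
		aspectRatio = "1:1"
	case w > h && w*9 >= h*16: // wide enough for 16:9
		aspectRatio = "16:9"
	case w > h:
		aspectRatio = "4:3"
	case h*9 >= w*16: // tall enough for 9:16
		aspectRatio = "9:16"
	default:
		aspectRatio = "3:4"
	}
	return imageSize, aspectRatio
}
```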

## Endpoint Selection

The provider automatically selects the endpoint based on model name (see the sketch below):
- **Imagen models** (detected via `schemas.IsImagenModel()`): Uses `/v1beta/models/{model}:predict` endpoint
- **Other models**: Uses `/v1beta/models/{model}:generateContent` endpoint with image response modality
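
In pseudo-Go, the choice amounts to the following. Only `schemas.IsImagenModel` is named above; its exact signature and the path-building shown here are assumptions for illustration:

```go
// Sketch of the endpoint choice: Imagen models go to :predict, everything
// else goes to :generateContent with the image response modality set.
endpoint := fmt.Sprintf("/v1beta/models/%s:generateContent", model)
if schemas.IsImagenModel(model) { // signature assumed for illustration
	endpoint = fmt.Sprintf("/v1beta/models/%s:predict", model)
}
```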

## Streaming

Image generation streaming is not supported by Gemini.

---
🧩 Analysis chain
🏁 Script executed:
```bash
#!/bin/bash
set -euo pipefail
# Search for /v1/images/generations endpoint definition
echo "=== Searching for /v1/images/generations endpoint ==="
rg -n --hidden --glob '!**/vendor/**' '/v1/images/generations|images/generations' -C 3
echo ""
echo "=== Searching for request DTO/schema fields ==="
rg -n --hidden --glob '!**/vendor/**' 'ImageGeneration.*(Request|Parameters)|output_format|response_format' -C 2
echo ""
echo "=== Searching for OpenAPI/Gateway schema files ==="
fd -e yaml -e yml -e json | head -20 | xargs grep -l 'images/generations' 2>/dev/null || echo "No matches in yaml/json files"
```

Repository: maximhq/bifrost

Length of output: 50374
🏁 Script executed:
```bash
#!/bin/bash
set -euo pipefail
# Look for BifrostImageGenerationRequest struct definition
echo "=== Searching for BifrostImageGenerationRequest ==="
rg -n --hidden --glob '!**/vendor/**' 'BifrostImageGenerationRequest' -A 10
echo ""
echo "=== Searching for gateway request handling (images endpoint) ==="
rg -n --hidden --glob '!**/vendor/**' 'generateContent.*endpoint|predict.*endpoint|HandleImageGeneration' -C 3
```

Repository: maximhq/bifrost

Length of output: 50371
🏁 Script executed:
```bash
#!/bin/bash
# Check if there's PR stack information
git log --oneline -20 | head -10
git branch -a | grep -E 'stack|parent|base' || echo "No stack branches found"
```

Repository: maximhq/bifrost

Length of output: 302
Add explicit clarification for Gateway vs SDK request schemas.
The mapping table describes parameter transformations but doesn't state that the Gateway endpoint accepts OpenAI-compatible flat fields (prompt, size, n, output_format) while the SDK uses BifrostImageGenerationRequest with nested Input/Params structure. Add a brief note above the mapping table clarifying: "Gateway uses OpenAI Images API schema (flat fields); SDK uses BifrostImageGenerationRequest with Input and Params objects. The table below shows how these map internally."
🧰 Tools
🪛 LanguageTool
[style] ~562-~562: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...Model & Prompt*: bifrostReq.Model → req.Model, bifrostReq.Input.Prompt → `req.Cont...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
🤖 Prompt for AI Agents
In `@docs/providers/supported-providers/gemini.mdx` around lines 489 - 610, Add a
brief clarifying sentence above the "Parameter Mapping" table stating that the
Gateway accepts OpenAI-compatible flat fields (e.g., prompt, size, n,
output_format) while the SDK uses the BifrostImageGenerationRequest with nested
Input and Params objects (Input.Prompt, Params.Size, Params.N,
Params.OutputFormat); keep the sentence short and then note that the table shows
how those Gateway flat fields and SDK nested fields map internally.