
feat: support for image generation (#980) #1312

Closed
Radheshg04 wants to merge 17 commits into `main` from `feat/image-gen-2`

Conversation

@Radheshg04
Contributor

Summary

Briefly explain the purpose of this PR and the problem it solves.

Changes

  • What was changed and why
  • Any notable design decisions or trade-offs

Type of change

  • Bug fix
  • Feature
  • Refactor
  • Documentation
  • Chore/CI

Affected areas

  • Core (Go)
  • Transports (HTTP)
  • Providers/Integrations
  • Plugins
  • UI (Next.js)
  • Docs

How to test

Describe the steps to validate this change. Include commands and expected outcomes.

```sh
# Core/Transports
go version
go test ./...

# UI
cd ui
pnpm i || npm i
pnpm test || npm test
pnpm build || npm run build
```

If adding new configs or environment variables, document them here.

Screenshots/Recordings

If UI changes, add before/after screenshots or short clips.

Breaking changes

  • Yes
  • No

If yes, describe impact and migration instructions.

Related issues

Link related issues and discussions. Example: Closes #123

Security considerations

Note any security implications (auth, secrets, PII, sandboxing, etc.).

Checklist

  • I read docs/contributing/README.md and followed the guidelines
  • I added/updated tests where appropriate
  • I updated documentation where needed
  • I verified builds succeed (Go and UI)
  • I verified the CI pipeline passes locally if applicable

Radheshg04 and others added 16 commits January 9, 2026 12:01
This PR introduces image generation support to Bifrost and wires it into the existing gateway infrastructure. It adds core schemas, provider extensions, HTTP transport, streaming plumbing, cache integration, and initial UI support to address #950.

Implementation is in progress. Completed functionality is marked, and remaining work will be completed in follow-up commits before marking this PR ready for review.

Scoped Changes:
- [x] New /v1/images/generations endpoint (OpenAI-compatible)
- [x] Image generation via Chat Completion API (tool use pattern)
- [x] Image generation via Responses API (native support)
- [x] Streaming image delivery (base64 chunks)
- [x] Semantic caching for image generation
- [x] UI components for image rendering
- [x] Provider implementations: OpenAI DALL-E, Azure DALL-E

Rollout Plan:

- [x] Phase 1: Core schema and provider implementation (OpenAI + Azure)
- [x] Phase 2: HTTP transport and non-streaming endpoint
- [x] Phase 3: Streaming support and accumulator
- [x] Phase 4: Semantic cache integration (Base64 storage, 5min TTL)
- [x] Phase 5: UI components and documentation

- [ ] Bug fix
- [x] Feature
- [ ] Refactor
- [x] Documentation
- [ ] Chore/CI

- [x] Core (Go)
- [x] Transports (HTTP)
- [x] Providers/Integrations
- [x] Plugins
- [x] UI (Next.js)
- [x] Docs

Unit Tests
- [x] Schema serialization/deserialization
- [x] Request transformation (Bifrost → OpenAI format)
- [x] Response transformation (OpenAI → Bifrost format)
- [x] Stream chunk accumulation
- [x] Cache key generation

Integration Tests

- [x] End-to-end image generation (non-streaming)
- [x] End-to-end streaming image generation
- [x] Fallback to secondary provider
- [x] Cache hit/miss scenarios
- [x] Error handling (rate limits, invalid prompts)

Load Tests

- [x] Concurrent image generation requests
- [x] Stream memory usage under load
- [x] Cache performance at scale

End-to-end image generation
```sh
curl -X POST http://localhost:8080/v1/images/generations \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $API_KEY" \
  -d '{
    "model": "openai/dall-e-3",
    "prompt": "A serene Japanese garden with cherry blossoms",
    "size": "1024x1024",
    "quality": "high",
    "response_format": "b64_json"
  }'
```

Expected Outcome:

```json
{
  "id": "abc123",
  "created": 1699999999,
  "model": "dall-e-3",
  "data": [
    {
      "b64_json": "iVBORw0KGgo...",
      "revised_prompt": "A tranquil Japanese garden featuring blooming cherry blossom trees...",
      "index": 0
    }
  ],
  "usage": {
    "prompt_tokens": 15,
    "total_tokens": 15
  },
  "extra_fields": {
    "provider": "openai",
    "latency_ms": 8500,
    "cache_debug": null
  }
}
```

Unit Tests:
```sh
go test -v github.com/maximhq/bifrost/core/providers/openai -run TestImage
go test -v github.com/maximhq/bifrost/core -run TestImage
```
Expected outcome: PASS
Note: until this PR is merged, you may need to point the remote module paths at local copies with a `replace` directive in `go.mod`; once the PR is merged, the directive is no longer needed.
```
replace github.com/maximhq/bifrost/core => ../../core
```

N/A

- [ ] Yes
- [x] No

Closes #950

N/A

- [x] I read `docs/contributing/README.md` and followed the guidelines
- [x] I added/updated tests where appropriate
- [x] I updated documentation where needed
- [x] I verified builds succeed (Go and UI)
- [ ] I verified the CI pipeline passes locally if applicable
… tests to support image gen; minor code fixes
@coderabbitai
Contributor

coderabbitai bot commented Jan 12, 2026

📝 Walkthrough

Summary by CodeRabbit

  • New Features

    • Image generation API (sync + SSE streaming) with multi-provider support, fallbacks, URL/base64 outputs, and rich generation parameters.
  • UI

    • Image viewer and image message components with navigation and streaming placeholders.
  • Docs

    • OpenAPI paths, schemas, and provider docs/examples for image generation.
  • Logging & Pricing

    • Image-generation logging, persistence, semantic caching, DB migrations, and pricing support.
  • Tests

    • End-to-end and caching tests for image generation (streaming and non-streaming).


Walkthrough

Adds end-to-end image generation: new request/response/stream types, Bifrost public methods and provider image APIs (implemented or stubbed), streaming accumulator and assembly, HTTP/OpenAPI routes, schemas, logging/storage/pricing, semantic-cache and tests, and UI components.

Changes

| Cohort / File(s) | Summary |
|------------------|---------|
| **Core API & Schemas**<br>`core/bifrost.go`, `core/schemas/bifrost.go`, `core/schemas/images.go`, `core/schemas/provider.go`, `core/schemas/utils.go`, `core/schemas/tracer.go` | Add ImageGenerationRequest/Response/Stream types, Bifrost ImageGeneration public methods, Provider interface methods, IsImagenModel helper, and tracing fields. |
| **Routing & HTTP handlers**<br>`transports/bifrost-http/handlers/inference.go`, `transports/bifrost-http/integrations/router.go`, `transports/bifrost-http/integrations/openai.go`, `transports/bifrost-http/integrations/genai.go` | Add /v1/images/generations endpoints, converters for JSON and SSE, route wiring, prehooks, and streaming emitter adjustments. |
| **Providers — implemented**<br>`core/providers/openai/...`, `core/providers/huggingface/...`, `core/providers/gemini/...`, `core/providers/vertex/...`, `core/providers/azure/...`, `core/providers/nebius/...`, `core/providers/xai/...` | Implement provider-specific image generation handlers, conversions, streaming support where applicable, and detailed error parsing for image endpoints. |
| **Providers — stubs / signatures**<br>`core/providers/{anthropic,bedrock,cerebras,cohere,elevenlabs,groq,mistral,ollama,openrouter,parasail,perplexity,sgl}/*.go` | Add ImageGeneration/ImageGenerationStream method stubs returning unsupported-operation errors; update streaming call-sites to new arity placeholder. |
| **Provider conversions & encodings**<br>`core/providers/gemini/*`, `core/providers/openai/*`, `core/providers/huggingface/*`, `core/providers/nebius/*` | Large Gemini Imagen/Gemini conversion work, blob/base64 handling changes, HuggingFace multi-provider mappings, OpenAI image stream types and converters. |
| **Streaming framework**<br>`framework/streaming/accumulator.go`, `framework/streaming/images.go`, `framework/streaming/types.go` | Add image stream chunk pool, accumulation/deduplication, assembly into final BifrostImageGenerationResponse, StreamTypeImage wiring, and processing logic. |
| **Logging, storage & migrations**<br>`framework/logstore/*`, `framework/configstore/migrations.go`, `framework/configstore/tables/modelpricing.go`, `framework/modelcatalog/*` | Persist image inputs/outputs in logs, add DB migrations/columns, update serialization/deserialization, and add image pricing fields and pricing logic. |
| **Semantic cache & tests**<br>`plugins/semanticcache/{utils.go,search.go,stream.go,plugin_image_generation_test.go,test_utils.go}` | Add hashing/metadata extraction for image params, caching support for image responses/streams, ordering for image chunks, and E2E semantic-cache tests. |
| **Test utilities & validation**<br>`core/internal/testutil/*` | New image generation test helpers, validators, retry frameworks and presets; note: duplicates/near-duplicates present that need cleanup. |
| **Transport/integration wiring**<br>`transports/bifrost-http/integrations/*` | Add image-generation converters, response converters for streams, and adapter changes across integrations. |
| **OpenAPI & Docs**<br>`docs/openapi/**`, `docs/providers/supported-providers/**`, `docs/quickstart/**` | Add OpenAPI paths/schemas for image generation, provider docs and examples, quickstart updates, and provider support matrices. |
| **UI / Frontend**<br>`ui/app/workspace/logs/views/imageView.tsx`, `ui/components/chat/ImageMessage.tsx`, `ui/app/workspace/logs/views/logDetailsSheet.tsx`, `ui/lib/{constants,types}/...` | Add ImageView and ImageMessage components, request-type labels/types, log typings, and integrate image view into log details. |
| **Plugins & misc wiring**<br>`plugins/logging/*`, `plugins/mocker/*`, `plugins/maxim/*` | Wire image fields into logging, stream processing, mock extraction, processed stream responses, and provider allowed-request UI. |

Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant HTTPHandler
    participant Bifrost
    participant Provider
    participant StreamAccumulator
    participant RemoteAPI

    Client->>HTTPHandler: POST /v1/images/generations (stream? true/false)
    HTTPHandler->>Bifrost: ImageGenerationRequest / ImageGenerationStreamRequest

    alt Streaming
        Bifrost->>Provider: ImageGenerationStream(req)
        Provider->>RemoteAPI: open streaming/SSE
        RemoteAPI-->>Provider: stream chunks (partial/completed/error)
        Provider->>StreamAccumulator: addImageStreamChunk(chunk)
        StreamAccumulator->>StreamAccumulator: processImageStreamingResponse()
        StreamAccumulator-->>Bifrost: assembled ImageGenerationStreamResponse(s)
    else Non‑streaming
        Bifrost->>Provider: ImageGeneration(req)
        Provider->>RemoteAPI: HTTP POST
        RemoteAPI-->>Provider: JSON ImageGenerationResponse
        Provider-->>Bifrost: BifrostImageGenerationResponse
    end

    Bifrost-->>HTTPHandler: JSON or SSE (image chunks / final)
    HTTPHandler-->>Client: 200 OK with image data
```

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes


Suggested reviewers

  • Pratham-Mishra04

Poem

🐇
I nibbled prompts beneath moonbeams bright,
I stitched the chunks into pictures at night,
From base64 crumbs to a finished sight,
The rabbit paints pixels with coding delight! 🎨🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 3

❌ Failed checks (3 warnings)

- **Description check** — ⚠️ Warning. The PR description is an unfilled template with no Summary, Changes, Type of change selection, Affected areas selection, How to test details, Screenshots, Breaking changes clarification, Related issues links, Security considerations, or Checklist completion; all required sections are missing. *Resolution:* Complete the PR description template by providing a summary of the image generation feature, detailed changes made across Go/providers/plugins/UI/docs, the type of change (Feature), the affected areas, testing instructions, a breaking-changes clarification, linked related issues, security considerations, and a completed checklist.
- **Linked Issues check** — ⚠️ Warning. The PR is titled 'feat: support for image generation' but the only linked issue #123 is about Files API Support (file uploads for RAG/fine-tuning); the PR's image generation implementation does not address the Files API requirements from issue #123. *Resolution:* Either link the correct image-generation-related issues that this PR addresses, or clarify whether this PR intentionally diverges from issue #123 and justify the alternative scope.
- **Out of Scope Changes check** — ⚠️ Warning. The changeset includes extensive image generation support (schemas, providers, handlers, streaming, logging, UI components, pricing) that is unrelated to the Files API requirements specified in linked issue #123, plus unrelated changes such as renaming caseInsensitiveLookup to CaseInsensitiveLookup and database migrations for image pricing. *Resolution:* Remove changes unrelated to the stated objective (image generation), or update the linked issues to include image generation as an intended feature, and verify that all included changes align with the PR's declared scope.

✅ Passed checks (2 passed)

- **Title check** — ✅ Passed. The title 'feat: support for image generation' clearly describes the primary change: adding image generation functionality. It is specific, concise, and accurately reflects the main scope of the changeset.
- **Docstring Coverage** — ✅ Passed. Docstring coverage is 81.08%, which meets the required 80.00% threshold.




@Radheshg04 Radheshg04 marked this pull request as ready for review January 12, 2026 13:36
Contributor Author

Radheshg04 commented Jan 12, 2026

This stack of pull requests is managed by Graphite. Learn more about stacking.

@github-actions
Contributor

🧪 Test Suite Available

This PR can be tested by a repository admin.

Run tests for PR #1312

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 9

Note

Due to the large number of review comments, Critical severity comments were prioritized as inline comments.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (18)
docs/providers/supported-providers/perplexity.mdx (1)

31-33: Update the Note to include Image Generation in the unsupported operations list.

For consistency with the updated table, add "Image Generation" to the list of unsupported operations in the Note.

📝 Suggested documentation update
```diff
 <Note>
-**Unsupported Operations** (❌): Text Completions, Embeddings, Speech, Transcriptions, Files, Batch, and List Models are not supported by the upstream Perplexity API. These return `UnsupportedOperationError`.
+**Unsupported Operations** (❌): Text Completions, Embeddings, Image Generation, Speech, Transcriptions, Files, Batch, and List Models are not supported by the upstream Perplexity API. These return `UnsupportedOperationError`.
 </Note>
```
docs/providers/supported-providers/cohere.mdx (1)

18-33: Add Image Generation to the Unsupported Operations note to match the table.

The table includes Image Generation | ❌ | ❌ | - (line 25), but the note omits it from the unsupported operations list (lines 31-33). This inconsistency should be resolved by adding Image Generation to the note. Cohere does not offer image generation APIs—only multimodal embeddings and image understanding—so the ❌ status is correct.

Proposed doc fix
```diff
 <Note>
-**Unsupported Operations** (❌): Text Completions, Speech, Transcriptions, Files, and Batch are not supported by the upstream Cohere API. These return `UnsupportedOperationError`.
+**Unsupported Operations** (❌): Text Completions, Image Generation, Speech, Transcriptions, Files, and Batch are not supported by the upstream Cohere API. These return `UnsupportedOperationError`.
 </Note>
```
core/schemas/responses.go (1)

540-575: Error message mentions image generation output but struct doesn't handle it.

The error messages on lines 556 and 574 reference "image generation call output" as a valid type, but ResponsesToolMessageOutputStruct only has three fields and no corresponding field or handling logic for image generation output:

  • ResponsesToolCallOutputStr
  • ResponsesFunctionToolCallOutputBlocks
  • ResponsesComputerToolCallOutput

The marshal/unmarshal methods don't check for or process image generation output data. Either add a field and handling for image generation output, or remove the reference from the error messages to match the actual struct capabilities.

plugins/semanticcache/stream.go (1)

91-129: Prevent possible panic in stream chunk sorting (missing j-side type guard).

ImageGenerationStreamResponse sorting checks i but assumes j is also an image chunk; if a mixed variant ever lands in accumulator.Chunks, this will nil-deref at Line 123/126. (This same pattern exists in the earlier branches too, but this new branch is an immediate place to harden.)

Proposed minimal fix (guard `j` before deref)
```diff
 	if accumulator.Chunks[i].Response.TranscriptionStreamResponse != nil {
 		return accumulator.Chunks[i].Response.TranscriptionStreamResponse.ExtraFields.ChunkIndex < accumulator.Chunks[j].Response.TranscriptionStreamResponse.ExtraFields.ChunkIndex
 	}
-	if accumulator.Chunks[i].Response.ImageGenerationStreamResponse != nil {
-		// For image generation, sort by Index first, then ChunkIndex
-		if accumulator.Chunks[i].Response.ImageGenerationStreamResponse.Index != accumulator.Chunks[j].Response.ImageGenerationStreamResponse.Index {
-			return accumulator.Chunks[i].Response.ImageGenerationStreamResponse.Index < accumulator.Chunks[j].Response.ImageGenerationStreamResponse.Index
-		}
-		return accumulator.Chunks[i].Response.ImageGenerationStreamResponse.ChunkIndex < accumulator.Chunks[j].Response.ImageGenerationStreamResponse.ChunkIndex
-	}
+	if iImg := accumulator.Chunks[i].Response.ImageGenerationStreamResponse; iImg != nil {
+		jImg := accumulator.Chunks[j].Response.ImageGenerationStreamResponse
+		if jImg == nil {
+			// Mixed response variants: keep stable input order, but don't panic.
+			return false
+		}
+		// For image generation, sort by Index first, then ChunkIndex
+		if iImg.Index != jImg.Index {
+			return iImg.Index < jImg.Index
+		}
+		return iImg.ChunkIndex < jImg.ChunkIndex
+	}
 	return false
 })
```
docs/providers/supported-providers/elevenlabs.mdx (1)

18-35: Keep “Unsupported Operations” note consistent with the table.

Table adds “Image Generation” as ❌, but the note’s explicit list doesn’t include it—worth updating to avoid confusion.

docs/providers/supported-providers/azure.mdx (1)

18-34: Reconcile Image Generation endpoint in the table vs the section.

The Supported Operations table says /openai/v1/images/generations, but the Image Generation section uses /openai/deployments/{deployment}/images/generations?api-version={version}. Pick one or explicitly label them as “Gateway endpoint” vs “Azure upstream endpoint”.

Also applies to: 364-433

core/internal/testutil/account.go (1)

676-710: Enable ImageGeneration scenario flags for OpenAI.

OpenAI has ImageGenerationModel: "dall-e-2" configured, but the corresponding Scenarios.ImageGeneration and Scenarios.ImageGenerationStream flags are not enabled. This is inconsistent with other capabilities (e.g., SpeechSynthesisModel is paired with SpeechSynthesis: true). The image generation tests will be skipped for OpenAI unless these flags are set to true.

core/providers/gemini/types.go (1)

1114-1123: Update Blob.Data comment to clarify base64 encoding semantics.

The comment states "Required. Raw bytes." but the field actually contains a base64-encoded string. All callsites (utils.go, responses.go, speech.go, transcription.go, images.go) correctly encode bytes via encodeBytesToBase64String(). Update the comment to: // Required. Base64-encoded bytes.

docs/providers/supported-providers/vertex.mdx (1)

20-31: Clarify parameter naming in Image Generation request examples — prompt vs input.prompt.

The "Core Parameter Mapping" table (line 325) shows input.prompt, but the Gateway example (line 345) demonstrates "prompt" at the root level of the request JSON. This creates ambiguity about the expected request structure. The Go SDK example correctly reflects the struct (Input: &ImageGenerationInput{Prompt: ...}), but the Gateway curl example should be clarified: does it accept "prompt" directly, or should it be nested as "input": {"prompt": ...}? Update the documentation or table to remove this inconsistency.

The Endpoint column concern is already addressed in the "Endpoint Selection" section (lines 388–394), which clearly explains these are Vertex AI endpoints.

core/providers/gemini/speech.go (1)

146-148: Unsafe type assertion may cause panic.

Line 147 uses a type assertion without the comma-ok idiom. If BifrostContextKeyResponseFormat is missing from the context or has a different type, this will panic at runtime.

🛠️ Suggested fix with safe type assertion
```diff
 		if len(audioData) > 0 {
-			responseFormat := ctx.Value(BifrostContextKeyResponseFormat).(string)
+			responseFormat, _ := ctx.Value(BifrostContextKeyResponseFormat).(string)
 			// Gemini returns PCM audio (s16le, 24000 Hz, mono)
```

Using the comma-ok idiom ensures responseFormat defaults to an empty string if the context value is missing or has an unexpected type, which will then fall through to the else branch returning raw audioData.

tests/integrations/python/tests/test_openai.py (1)

74-152: Import additions look fine; update the big module docstring test list to include image-generation if that list is meant to stay authoritative.

Not required for correctness, but right now the header docstring enumerates many cases and doesn’t mention these new ones.

core/providers/bedrock/bedrock.go (1)

1360-1365: Fix logger call to use structured logging format: Line 1362 passes a printf-style format string ("%s") to Logger.Error, but the Logger interface uses structured logging with signature Error(msg string, args ...any). The "%s" will not be interpolated; instead, the message will log literally and the argument becomes an unkeyed field. Use fmt.Sprintf() to format the message before logging.

Fix
```diff
-        if err.Error != nil {
-            provider.logger.Error("file upload operation not allowed: %s", err.Error.Message)
-        }
+        if err.Error != nil {
+            provider.logger.Error(fmt.Sprintf("file upload operation not allowed: %s", err.Error.Message))
+        }
         return nil, err
```
core/providers/openai/openai.go (1)

2286-2293: Transcription stream: re-add keepalive/comment skipping (":" lines) to avoid bogus JSON parsing.
Right now you only skip empty lines; SSE streams commonly include : keepalive lines (and sometimes event:). This likely regresses behavior vs other stream handlers in this file.

Proposed fix
```diff
-			// Skip empty lines and comments
-			if line == "" {
+			// Skip empty lines and comments / keepalive pings
+			if line == "" || strings.HasPrefix(line, ":") || strings.HasPrefix(line, "event:") {
 				continue
 			}
```
framework/logstore/tables.go (1)

98-138: BuildContentSummary is missing image generation prompts, making them unsearchable.

Image generation input prompts and outputs are stored but not included in BuildContentSummary(), unlike speech input and transcription output. Add ImageGenerationInputParsed.Prompt to the searchable content summary so users can find logs by image generation prompts.

Additionally, storing full base64-encoded image outputs in the database may impact performance and storage. Consider whether this data should be truncated, metadata-only, or stored externally.

framework/streaming/types.go (1)

13-20: Initialize Data to empty slice when constructing empty BifrostImageGenerationResponse.

In ToBifrostResponse (framework/streaming/types.go, StreamTypeImage case), when creating an empty response:

```go
imageResp = &schemas.BifrostImageGenerationResponse{}
```

The Data field remains nil and marshals to JSON as null. Since the field lacks omitempty, this is inconsistent with other code paths (e.g., framework/streaming/images.go:102-110) that initialize Data as an empty slice. Initialize it to avoid serialization inconsistency:

```go
imageResp = &schemas.BifrostImageGenerationResponse{
    Data: make([]schemas.ImageData, 0),
}
```
transports/bifrost-http/handlers/inference.go (1)

1258-1334: Do not log entire streaming chunks on marshal failure (can leak base64 image data + explode logs).

This line can print the full chunk (including b64_json image bytes) into logs:

  • logger.Warn(..., chunk: %v", err, chunk)

That’s both a data leakage risk and a reliability risk (huge log lines, backpressure, cost).

Also, skipDoneMarker is inferred from seeing an image-gen chunk; if the channel closes without yielding chunks, you can still emit [DONE] for an image stream.

Proposed diff
```diff
-			chunkJSON, err := sonic.Marshal(chunk)
+			chunkJSON, err := sonic.Marshal(chunk)
 			if err != nil {
-				logger.Warn(fmt.Sprintf("Failed to marshal streaming response: %v, chunk: %v", err, chunk))
+				// Avoid logging full chunk (may contain large base64 payloads / sensitive data)
+				logger.Warn(fmt.Sprintf(
+					"Failed to marshal streaming response: %v (has_image=%t has_responses=%t has_error=%t)",
+					err,
+					chunk.BifrostImageGenerationStreamResponse != nil,
+					chunk.BifrostResponsesStreamResponse != nil,
+					chunk.BifrostError != nil,
+				))
 				continue
 			}
@@
-				} else if chunk.BifrostError != nil {
-					eventType = string(schemas.ResponsesStreamResponseTypeError)
+				} else if chunk.BifrostError != nil {
+					// Prefer an image-generation-specific error type constant if one exists; otherwise "error".
+					eventType = "error"
 				}
```

Based on learnings, keeping image-generation SSE semantics consistent (“error” event, no [DONE]) is important across transports/tests/UI.

framework/streaming/accumulator.go (1)

460-491: Cleanup(): image chunks should be reset before pooling to avoid retaining large base64 buffers.
Cleanup() currently does a.imageStreamChunkPool.Put(chunk) without clearing chunk.Delta / chunk.RawResponse etc. For images, this can pin large memory in the pool.

Proposed fix
```diff
 		for _, chunk := range accumulator.ImageStreamChunks {
-			a.imageStreamChunkPool.Put(chunk)
+			a.putImageStreamChunk(chunk)
 		}
```
core/providers/huggingface/huggingface.go (1)

220-265: completeRequest uses image-specific error parser for all request types.

parseHuggingFaceImageError() is called for every non-200 response at line 245, regardless of whether the request is chat, embedding, audio, speech, or transcription. The codebase has two different error structures (HuggingFaceResponseError for images and HuggingFaceHubError for other endpoints), yet all errors flow through the image parser. This creates risk of parsing failures if error formats differ by endpoint type.

The function receives isHFInferenceAudioRequest and isHFInferenceImageRequest flags but lacks request type context needed to select the appropriate error handler. completeRequestWithModelAliasCache already has access to requestType—pass it through to enable proper error branching.

🤖 Fix all issues with AI agents
In @core/providers/gemini/images.go:
- Around line 123-183: In ToBifrostImageGenerationResponse ensure
bifrostResp.Data is never nil (initialize to an empty []schemas.ImageData when
no inline images are found) and set ImageData.Index for each inline image part
(track the part index while iterating candidate.Content.Parts and assign it to
the ImageData.Index field when appending). Update the loop that builds imageData
to include an index counter and always assign bifrostResp.Data (either the
populated slice or an empty slice) before returning.

In @core/providers/huggingface/huggingface.go:
- Around line 1024-1278: The image stream chunks never set the Type field,
breaking consumers and tests; update the per-image chunk construction
(BifrostImageGenerationStreamResponse variable chunk inside the for loop) to set
chunk.Type = string(openai.ImageGenerationDelta) for non-final incremental
chunks, and set finalChunk.Type = string(openai.ImageGenerationCompleted) when
building the completion chunk (finalChunk). Ensure the Type is set consistently
whether URL or B64JSON is used and preserve existing ExtraFields and raw
request/response handling.

In @core/providers/nebius/nebius.go:
- Around line 244-343: The ImageGeneration method can panic when accessing
request.Params.ExtraParams["ai_project_id"] if ExtraParams is nil; update the
conditional that currently checks request.Params != nil to also ensure
request.Params.ExtraParams != nil before indexing (i.e., only read
request.Params.ExtraParams["ai_project_id"] when ExtraParams is non-nil), and
preserve the existing logic that url-escapes and appends the ai_project_id to
path; modify the block in ImageGeneration that inspects request.Params to
perform this nil check on ExtraParams.

In @core/providers/vertex/vertex.go:
- Around line 1420-1437: Guard against a nil request and validate the prompt
before calling getModelDeployment: at the start of
VertexProvider.ImageGeneration, check if request == nil and return a
providerUtils.NewConfigurationError with a clear message (using providerName),
then validate the prompt field on the schemas.BifrostImageGenerationRequest
(e.g., ensure request.Prompt is not nil/empty after trimming) and return a
similar configuration error if missing/empty; perform these checks before
calling provider.getModelDeployment(key, request.Model) so the function cannot
panic and fails fast on invalid input.
- Around line 1493-1559: The code incorrectly allows API-key query auth for
custom fine-tuned, Imagen, and Gemini endpoints (authQuery, completeURL
construction in the branches for schemas.IsAllDigitsASCII,
schemas.IsImagenModel, schemas.IsGeminiModel) and conditionally skips OAuth2;
remove the authQuery logic entirely (do not set or append "?key=...") and always
obtain an OAuth2 token via getAuthTokenSource and token.AccessToken, ensuring
the Authorization header is always set before the request; update the branch
code to only build completeURL (no authQuery) and remove the conditional that
bypasses token fetching when key.Value is present so that all requests use
bearer tokens.

In @framework/streaming/images.go:
- Around line 13-110: The special-case branch in
buildCompleteImageFromImageStreamChunks can nil-deref when the single final
chunk has Delta == nil and it also treats URL-only streams as empty; update the
initial single-chunk check to verify chunks[0].Delta != nil before accessing
fields, and when reconstructing images collect both Delta.B64JSON and Delta.URL
(or treat URL-only chunks as valid image entries) into the images/result
construction so the function doesn't return nil for URL-only streams; also
prefer using chunk.Delta.CreatedAt (if non-zero/non-empty) when setting Created
on the final response (falling back to time.Now() only if provider CreatedAt is
missing), and ensure RevisedPrompt handling and response ID selection still work
when some chunks have nil Delta by guarding accesses to chunk.Delta.
- Around line 207-337: The request-id handling and ImageIndex fallback are
incorrect: in processImageStreamingResponse change the request-id branch to log
the missing/empty requestID and return nil, nil (to match the comment) instead
of returning an error, and fix the ImageIndex selection so it doesn't always
choose Index (which defaults to 0); update the logic around
ImageGenerationStreamResponse.Index and SequenceNumber to use Index only when it
is explicitly set (e.g., Index != 0) and otherwise fall back to SequenceNumber
so chunk ordering is preserved.
🟠 Major comments (23)
docs/providers/supported-providers/mistral.mdx-31-33 (1)

31-33: Correct the attribution: the claim that the upstream Mistral API does not support image generation is factually inaccurate.

Mistral added image generation support in late 2024 and expanded it in 2025. The unsupported operations are limitations of Bifrost's current integration, not Mistral's API capabilities. Update the docs to attribute the limitation correctly to the integration layer.

Proposed fix
-**Unsupported Operations** (❌): Text Completions, Image Generation, Speech (TTS), Files, and Batch are not supported by the upstream Mistral API. These return `UnsupportedOperationError`.
+**Unsupported Operations** (❌): Text Completions, Image Generation, Speech (TTS), Files, and Batch are not currently supported by Bifrost's Mistral integration. These return `UnsupportedOperationError`.

Also update the Unsupported Features table (lines 237–245) to reflect this distinction.

transports/bifrost-http/integrations/genai.go-436-440 (1)

436-440: Consolidate model detection function with canonical implementation in core/schemas.

The function isImagenModel() duplicates the logic of schemas.IsImagenModel() (in core/schemas/utils.go), which is already used throughout the codebase (vertex.go, gemini.go). Since genai.go already imports the schemas package, use schemas.IsImagenModel(modelStr) instead of maintaining a local copy. This eliminates unnecessary duplication and ensures consistent model detection across the codebase.

The substring matching approach using strings.Contains(modelLower, "imagen") is acceptable given that Imagen models follow a consistent naming convention (e.g., "imagen-3.0-generate-001"), so no change to the detection logic is needed—only consolidation to the existing canonical function.
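
For reference, the substring check behind that convention can be sketched as a self-contained stub (the real helper is schemas.IsImagenModel in core/schemas/utils.go; this local copy exists only to illustrate the behavior):

```go
package main

import (
	"fmt"
	"strings"
)

// isImagenModel is a local stand-in for schemas.IsImagenModel; the
// substring check works because Imagen models follow a consistent
// naming convention (e.g., "imagen-3.0-generate-001").
func isImagenModel(model string) bool {
	return strings.Contains(strings.ToLower(model), "imagen")
}

func main() {
	fmt.Println(isImagenModel("imagen-3.0-generate-001")) // true
	fmt.Println(isImagenModel("gemini-2.0-flash"))        // false
}
```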

transports/bifrost-http/integrations/router.go-1085-1087 (1)

1085-1087: Streaming path can also panic if ImageGenerationStreamResponseConverter isn’t set.

Right now it’s called unconditionally when an image stream chunk arrives. Consider failing fast before starting the stream if the request is image-generation streaming but the converter is nil.

Proposed fix (fail fast for streaming converter)
@@
 func (g *GenericRouter) handleStreamingRequest(ctx *fasthttp.RequestCtx, config RouteConfig, bifrostReq *schemas.BifrostRequest, bifrostCtx *schemas.BifrostContext, cancel context.CancelFunc) {
@@
 	// Check if streaming is configured for this route
 	if config.StreamConfig == nil {
@@
 	}
+
+	// Fail fast on misconfigured image-generation streaming routes
+	if bifrostReq.ImageGenerationRequest != nil && config.StreamConfig.ImageGenerationStreamResponseConverter == nil {
+		cancel()
+		g.sendStreamError(ctx, bifrostCtx, config, newBifrostError(nil, "ImageGenerationStreamResponseConverter not configured"))
+		return
+	}
 
 	// Handle streaming using the centralized approach
 	// Pass cancel function so it can be called when the writer exits (errors, completion, etc.)
 	g.handleStreaming(ctx, bifrostCtx, config, stream, cancel)
 }

Also applies to: 1285-1287

transports/bifrost-http/integrations/router.go-203-210 (1)

203-210: Good plumbing, but add safety rails for missing image-generation converters (avoid panics).

Route config validation currently doesn’t enforce these new converters, and later code assumes they’re non-nil. Suggest adding runtime checks where they’re used (similar to the existing CountTokens converter guard) so misconfigured routes fail with a 5xx/clear error instead of panicking.

Proposed fix (nil guards similar to CountTokens)
@@
 	case bifrostReq.ImageGenerationRequest != nil:
 		imageGenerationResponse, bifrostErr := g.client.ImageGenerationRequest(bifrostCtx, bifrostReq.ImageGenerationRequest)
 		if bifrostErr != nil {
 			g.sendError(ctx, bifrostCtx, config.ErrorConverter, bifrostErr)
 			return
 		}
@@
 		if imageGenerationResponse == nil {
 			g.sendError(ctx, bifrostCtx, config.ErrorConverter, newBifrostError(nil, "Bifrost response is nil after post-request callback"))
 			return
 		}
 
+		if config.ImageGenerationResponseConverter == nil {
+			g.sendError(ctx, bifrostCtx, config.ErrorConverter, newBifrostError(nil, "ImageGenerationResponseConverter not configured"))
+			return
+		}
+
 		// Convert Bifrost response to integration-specific format and send
 		response, err = config.ImageGenerationResponseConverter(bifrostCtx, imageGenerationResponse)

Also applies to: 257-264, 278-309

tests/integrations/python/tests/test_openai.py-1166-1253 (1)

1166-1253: Guard cross-provider image params (quality, non-square size) to avoid flaky failures.

quality="low" and size="1024x1536" are not universally supported across providers/models; if the image_generation scenario can include providers that don’t support those fields, these tests will fail hard rather than skip. Consider:

  • capability filtering in get_cross_provider_params_with_vk_for_scenario("image_generation"), or
  • skipping/xfailing when provider/model doesn’t support quality / requested size, or
  • catching a known “unsupported” error and pytest.skip(...).
ui/components/chat/ImageMessage.tsx-39-49 (1)

39-49: Pass output_format to ImageMessage component to construct correct MIME type for b64_json.

The OpenAI Images API supports multiple output formats (png, jpeg, webp). Hardcoding data:image/png breaks rendering when upstream emits other formats. The BifrostImageGenerationOutput contains output_format in params, but this information is lost in ImageMessageData. Either include the format in the data object passed to ImageMessage and use it to construct the correct data URL, or add a mime_type field to preserve the actual image format.

core/providers/openai/openai_image_test.go-13-107 (1)

13-107: Add an “empty prompt” test to match request validation expectations
You cover nil request and nil input, but not Input.Prompt == "". Given the repo emphasis on rejecting missing prompts before dispatch, this test suite should lock in the intended behavior. Based on learnings, validate nil/empty prompts before dispatch.
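
A minimal sketch of the pre-dispatch guard such a test would lock in (the function name and error text are illustrative, not the repo's actual API):

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// validateImagePrompt sketches the check the missing test case should
// pin down: nil, empty, and whitespace-only prompts are all rejected
// before any provider dispatch happens.
func validateImagePrompt(prompt *string) error {
	if prompt == nil || strings.TrimSpace(*prompt) == "" {
		return errors.New("prompt is required")
	}
	return nil
}

func main() {
	empty := "   "
	ok := "a watercolor fox"
	fmt.Println(validateImagePrompt(nil))    // prompt is required
	fmt.Println(validateImagePrompt(&empty)) // prompt is required
	fmt.Println(validateImagePrompt(&ok))    // <nil>
}
```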

core/providers/elevenlabs/elevenlabs.go-442-442 (1)

442-442: Verify GetBifrostResponseForStreamResponse argument ordering after signature expansion
These call sites assume the “speech stream response” position is still the 4th argument; if the helper’s parameter list was extended for image generation, it’s easy to shift the slot and silently route the chunk into the wrong union field. A focused unit test around the helper would guard this.

Also applies to: 464-464

tests/integrations/python/tests/utils/common.py-2629-2705 (1)

2629-2705: Fix Imagen "predictions" validation: object-path incorrectly uses hasattr() instead of extracting the actual base64 value

In the Imagen prediction branch, when prediction is not a dict, has_b64 becomes a boolean from hasattr() rather than the actual base64 string. This allows assertions to pass even when bytesBase64Encoded is None or empty. Additionally, the Gemini inlineData parsing doesn't account for casing variations (inlineData vs inline_data, mimeType vs mime_type) that can occur in actual API responses.

Proposed fix
         elif (isinstance(response, dict) and "predictions" in response) or hasattr(response, "predictions"):
             # Imagen response
             predictions = response.get("predictions") if isinstance(response, dict) else response.predictions
             assert len(predictions) > 0, "Response should have at least one prediction"
             for i, prediction in enumerate(predictions):
-                has_b64 = (prediction.get("bytesBase64Encoded") if isinstance(prediction, dict) 
-                          else (hasattr(prediction, "bytesBase64Encoded") or hasattr(prediction, "bytes_base64_encoded")))
-                assert has_b64, f"Prediction {i} should have base64 encoded bytes"
+                if isinstance(prediction, dict):
+                    b64 = (
+                        prediction.get("bytesBase64Encoded")
+                        or prediction.get("bytes_base64_encoded")
+                        or prediction.get("bytesBase64")
+                    )
+                else:
+                    b64 = (
+                        getattr(prediction, "bytes_base64_encoded", None)
+                        or getattr(prediction, "bytesBase64Encoded", None)
+                        or getattr(prediction, "bytesBase64", None)
+                    )
+                assert b64, f"Prediction {i} should contain base64 encoded bytes"
+                assert isinstance(b64, str), f"Prediction {i} base64 should be a string, got {type(b64)}"
+                assert len(b64) > 100, f"Prediction {i} base64 seems too short"
                     inline_data = part.get("inlineData") if isinstance(part, dict) else getattr(part, "inline_data", None)
                     if inline_data:
                         found_image = True
-                        mime_type = inline_data.get("mimeType") if isinstance(inline_data, dict) else getattr(inline_data, "mime_type", "")
-                        data = inline_data.get("data") if isinstance(inline_data, dict) else getattr(inline_data, "data", "")
+                        mime_type = (
+                            (inline_data.get("mimeType") or inline_data.get("mime_type", "")) if isinstance(inline_data, dict)
+                            else (getattr(inline_data, "mime_type", "") or getattr(inline_data, "mimeType", ""))
+                        )
+                        data = inline_data.get("data") if isinstance(inline_data, dict) else getattr(inline_data, "data", "")
                         assert mime_type.startswith("image/"), \
                             f"Expected image mime type, got {mime_type}"
                         assert len(data) > 100, "Image data seems too short"

Also update the inlineData lookup to handle casing:

-                    inline_data = part.get("inlineData") if isinstance(part, dict) else getattr(part, "inline_data", None)
+                    inline_data = (
+                        (part.get("inlineData") or part.get("inline_data")) if isinstance(part, dict)
+                        else (getattr(part, "inline_data", None) or getattr(part, "inlineData", None))
+                    )
core/providers/nebius/images.go-11-71 (1)

11-71: Add prompt + size validation (empty/zero/negative) and consider normalizing case.

  • ToNebiusImageGenerationRequest: currently allows empty Prompt and non-positive Width/Height (e.g., "0x-1"). This should fail early (per learnings: validate nil/missing prompts before dispatch).
  • OutputFormat conversion only matches "jpeg" exactly; consider strings.ToLower before comparison.
Proposed fix
diff --git a/core/providers/nebius/images.go b/core/providers/nebius/images.go
@@
 func (provider *NebiusProvider) ToNebiusImageGenerationRequest(bifrostReq *schemas.BifrostImageGenerationRequest) (*NebiusImageGenerationRequest, error) {
 	if bifrostReq == nil || bifrostReq.Input == nil {
 		return nil, fmt.Errorf("bifrost request is nil or input is nil")
 	}
+	if strings.TrimSpace(bifrostReq.Input.Prompt) == "" {
+		return nil, fmt.Errorf("prompt is required")
+	}
@@
 			width, err := strconv.Atoi(size[0])
 			if err != nil {
 				return nil, fmt.Errorf("invalid width in size %q: %w", *bifrostReq.Params.Size, err)
 			}
@@
 			height, err := strconv.Atoi(size[1])
 			if err != nil {
 				return nil, fmt.Errorf("invalid height in size %q: %w", *bifrostReq.Params.Size, err)
 			}
+			if width <= 0 || height <= 0 {
+				return nil, fmt.Errorf("invalid size %q: width and height must be > 0", *bifrostReq.Params.Size)
+			}
@@
 		if bifrostReq.Params.OutputFormat != nil {
-			req.ResponseExtension = bifrostReq.Params.OutputFormat
+			ext := strings.ToLower(strings.TrimSpace(*bifrostReq.Params.OutputFormat))
+			req.ResponseExtension = &ext
 		}
 		if req.ResponseExtension != nil && *req.ResponseExtension == "jpeg" {
 			req.ResponseExtension = schemas.Ptr("jpg")
 		}
core/providers/nebius/images.go-62-68 (1)

62-68: Use SafeExtractFloat64Pointer and round to int for guidance_scale extraction.

NebiusImageGenerationRequest.GuidanceScale is typed as *int (0-100 range), but the current SafeExtractIntPointer will fail silently if the input is a float. Guidance scale is commonly provided as a fractional value (e.g., 7.5). Instead, extract as float64 and convert to int with proper rounding:

if v, ok := schemas.SafeExtractFloat64Pointer(bifrostReq.Params.ExtraParams["guidance_scale"]); ok {
    rounded := int(math.Round(*v))
    req.GuidanceScale = &rounded
}
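
A self-contained sketch of that extraction, with the repo's SafeExtractFloat64Pointer type handling modeled by a plain type switch (names here are illustrative):

```go
package main

import (
	"fmt"
	"math"
)

// extractGuidanceScale accepts either a float or an int value and
// returns a rounded *int, matching the suggested fix: extract as
// float64, then math.Round before converting to the int-typed field.
func extractGuidanceScale(raw any) *int {
	var f float64
	switch v := raw.(type) {
	case float64:
		f = v
	case int:
		f = float64(v)
	default:
		return nil // non-numeric input: leave the field unset
	}
	rounded := int(math.Round(f))
	return &rounded
}

func main() {
	if p := extractGuidanceScale(7.5); p != nil {
		fmt.Println(*p) // 8
	}
}
```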
plugins/semanticcache/utils.go-114-116 (1)

114-116: Potential nil deref in hash generation for image requests.

hashInput.Params = req.ImageGenerationRequest.Params will panic if RequestType is image-generation but req.ImageGenerationRequest is nil.

Proposed fix
 case schemas.ImageGenerationRequest, schemas.ImageGenerationStreamRequest:
-	hashInput.Params = req.ImageGenerationRequest.Params
+	if req.ImageGenerationRequest != nil {
+		hashInput.Params = req.ImageGenerationRequest.Params
+	}
core/providers/openai/images.go-31-42 (1)

31-42: Fix nil receiver + avoid always emitting empty params.

Two issues:

  1. request can be nil (panic on request.Model).
  2. Params: &request.ImageGenerationParameters forces params to be present even when empty ({}), which can change semantics and cache keys.
Proposed fix
 func (request *OpenAIImageGenerationRequest) ToBifrostImageGenerationRequest() *schemas.BifrostImageGenerationRequest {
+	if request == nil {
+		return nil
+	}
 	provider, model := schemas.ParseModelString(request.Model, schemas.OpenAI)
 
+	var params *schemas.ImageGenerationParameters
+	if request.ImageGenerationParameters != (schemas.ImageGenerationParameters{}) {
+		p := request.ImageGenerationParameters
+		params = &p
+	}
+
 	return &schemas.BifrostImageGenerationRequest{
 		Provider: provider,
 		Model:    model,
 		Input: &schemas.ImageGenerationInput{
 			Prompt: request.Prompt,
 		},
-		Params:    &request.ImageGenerationParameters,
+		Params:    params,
 		Fallbacks: schemas.ParseFallbacks(request.Fallbacks),
 	}
 }
docs/openapi/paths/inference/images.yaml-3-57 (1)

3-57: OpenAPI: move description out of the text/event-stream media type; document partial/error SSE event types too.

  • description under content > text/event-stream is not a valid Media Type Object field in OAS; put that text in the 200.description or in the referenced schema’s description.
  • Also document image_generation.partial_image and error event types (not just image_generation.completed). Based on learnings, enforce the SSE event type values consistently.
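
A hedged sketch of the corrected layout, with the description moved onto the 200 response and all three event types named (the schema ref is illustrative):

```yaml
responses:
  "200":
    description: >
      Server-sent events stream. Each event carries an `event:` line of
      `image_generation.partial_image`, `image_generation.completed`, or
      `error`, followed by a JSON `data:` payload.
    content:
      text/event-stream:
        schema:
          $ref: "#/components/schemas/ImageGenerationStreamEvent"
```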
core/providers/openai/openai.go-2496-2533 (1)

2496-2533: Streaming ImageGeneration: nil-check is good; also validate Input/prompt.
You guard request == nil, but you can still send a null/empty prompt (and then rely on provider errors). Align this with the non-streaming validation and the repo’s image-generation validation pattern. Based on learnings, validate missing prompts before dispatch.

core/providers/openai/openai.go-2382-2491 (1)

2382-2491: Validate nil request / missing prompt before dispatching image generation.
ImageGeneration() doesn’t guard req == nil or req.Input == nil / empty prompt; later you deref request.Model in error paths. This should follow the existing pattern used elsewhere in the repo for image generation validation. Based on learnings, enforce nil/empty prompt checks.

Proposed fix
 func (provider *OpenAIProvider) ImageGeneration(ctx *schemas.BifrostContext, key schemas.Key,
 	req *schemas.BifrostImageGenerationRequest) (*schemas.BifrostImageGenerationResponse, *schemas.BifrostError) {
 
+	if req == nil || req.Input == nil || strings.TrimSpace(req.Input.Prompt) == "" {
+		return nil, providerUtils.NewBifrostOperationError("invalid request: prompt is required", nil, provider.GetProviderKey())
+	}
+
 	if err := providerUtils.CheckOperationAllowed(schemas.OpenAI, provider.customProviderConfig, schemas.ImageGenerationRequest); err != nil {
 		return nil, err
 	}
plugins/semanticcache/plugin_image_generation_test.go-103-110 (1)

103-110: Config missing required fields (Provider, Keys, Dimension) disables semantic caching.

The test intends to test semantic search but silently falls back to direct image generation. Config{Threshold: 0.5} omits Provider, Keys, and Dimension, causing Init() to log a warning and disable semantic caching. Match the pattern from NewTestSetup() and tests in plugin_core_test.go by providing all required fields alongside the custom Threshold.
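
A minimal sketch of a fully-populated config for this test; the struct and field types below are local stand-ins for the plugin's actual Config, and the concrete values are assumptions:

```go
package main

import "fmt"

// Config is a local stand-in mirroring the fields the plugin's Init()
// requires; the field types are illustrative, not the plugin's actual
// definitions. Threshold alone is not enough: omitting Provider, Keys,
// or Dimension makes Init() log a warning and disable semantic caching.
type Config struct {
	Provider  string
	Keys      []string
	Dimension int
	Threshold float64
}

// newTestConfig mirrors the NewTestSetup() pattern referenced above:
// all required fields set, plus the custom Threshold under test.
func newTestConfig() Config {
	return Config{
		Provider:  "openai",             // assumed embedding provider
		Keys:      []string{"test-key"}, // assumed key material
		Dimension: 1536,                 // assumed embedding dimension
		Threshold: 0.5,                  // the value the test cares about
	}
}

func main() {
	fmt.Printf("%+v\n", newTestConfig())
}
```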

framework/streaming/types.go-360-387 (1)

360-387: Ensure image responses always marshal data as [], not null.
When imageResp == nil, you create &schemas.BifrostImageGenerationResponse{} but don’t initialize Data. Given core/schemas/images.go defines ``Data []ImageData `json:"data"` ``, a nil slice can serialize to `null`, which is a common client-breaker.

Proposed fix
 	case StreamTypeImage:
 		imageResp := p.Data.ImageGenerationOutput
 		if imageResp == nil {
 			imageResp = &schemas.BifrostImageGenerationResponse{}
+			imageResp.Data = []schemas.ImageData{}
 			if p.RequestID != "" {
 				imageResp.ID = p.RequestID
 			}
 			if p.Model != "" {
 				imageResp.Model = p.Model
 			}
 		}
core/providers/azure/azure.go-1212-1252 (1)

1212-1252: Add nil/prompt validation before accessing request.Model (prevents panic + bad upstream calls).
request.Model is read at Line 1222; if request == nil you’ll panic. Also validate request.Input.Prompt to avoid dispatching empty prompts. Based on learnings, this should be consistently enforced.

Proposed fix
 func (provider *AzureProvider) ImageGeneration(ctx *schemas.BifrostContext, key schemas.Key,
 	request *schemas.BifrostImageGenerationRequest) (*schemas.BifrostImageGenerationResponse, *schemas.BifrostError) {
+	if request == nil || request.Input == nil || strings.TrimSpace(request.Input.Prompt) == "" {
+		return nil, providerUtils.NewBifrostOperationError("prompt is required", nil, provider.GetProviderKey())
+	}
+	if strings.TrimSpace(request.Model) == "" {
+		return nil, providerUtils.NewBifrostOperationError("model is required", nil, provider.GetProviderKey())
+	}
 	// Validate api key configs
 	if err := provider.validateKeyConfig(key); err != nil {
 		return nil, err
 	}

Based on learnings, please apply the same guard pattern across providers that implement image generation.

tests/integrations/python/tests/test_google.py-1690-1853 (1)

1690-1853: Don’t let missing GEMINI_API_KEY / transient HTTP issues become hard failures or get silently skipped.

  • test_41a_* / test_41c_* call get_api_key(provider) but are not decorated with @skip_if_no_api_key(...) → will raise and fail the suite when env isn’t set.
  • test_41b_imagen_predict catches Exception and skips, which can hide real regressions (e.g., schema/response shape changes). Narrow the exception and only skip on expected “not available” conditions.
Proposed diff
@@
     # IMAGE GENERATION TEST CASES
     # =========================================================================
 
+    @skip_if_no_api_key("gemini")
     @pytest.mark.parametrize("provider,model", get_cross_provider_params_for_scenario("image_generation"))
     def test_41a_image_generation_simple(self, test_config, provider, model):
@@
-        from .utils.config_loader import get_integration_url, get_config
-        from .utils.common import get_api_key
+        from .utils.config_loader import get_integration_url
+        from .utils.common import get_api_key
@@
         response = requests.post(url, json=payload, headers=headers, timeout=60)
         assert response.status_code == 200, f"Request failed with status {response.status_code}: {response.text}"
@@
-    @pytest.mark.parametrize("provider,model", get_cross_provider_params_for_scenario("imagen"))
+    @skip_if_no_api_key("gemini")
+    @pytest.mark.parametrize("provider,model", get_cross_provider_params_for_scenario("imagen"))
     def test_41b_imagen_predict(self, test_config, provider, model):
@@
-        try:
-            response = requests.post(url, json=payload, headers=headers, timeout=60)
-            assert response.status_code == 200, f"Request failed with status {response.status_code}: {response.text}"
-            
-            response_data = response.json()
-            
-            # Validate response structure
-            assert_valid_image_generation_response(response_data, "google")
-        except Exception as e:
-            # Imagen may not be available in all regions or configurations
-            pytest.skip(f"Imagen generation failed (may not be available): {e}")
+        response = requests.post(url, json=payload, headers=headers, timeout=60)
+        if response.status_code in (404, 501):
+            pytest.skip(f"Imagen endpoint not available (status={response.status_code})")
+        response.raise_for_status()
+        response_data = response.json()
+        assert_valid_image_generation_response(response_data, "google")
@@
+    @skip_if_no_api_key("gemini")
     @pytest.mark.parametrize("provider,model", get_cross_provider_params_for_scenario("image_generation"))
     def test_41c_image_generation_with_text(self, test_config, provider, model):
@@
-        if "candidates" in response_data and response_data["candidates"]:
-            for candidate in response_data["candidates"]:
-                if "content" in candidate and "parts" in candidate["content"]:
-                    for part in candidate["content"]["parts"]:
-                        if "text" in part and part["text"]:
-                            found_text = True
-                        if "inlineData" in part and part["inlineData"]:
-                            found_image = True
+        for candidate in (response_data.get("candidates") or []):
+            for part in ((candidate.get("content") or {}).get("parts") or []):
+                if part.get("text"):
+                    found_text = True
+                if part.get("inlineData"):
+                    found_image = True
framework/streaming/accumulator.go-293-321 (1)

293-321: addImageStreamChunk: FinalTimestamp shouldn’t depend on de-dup + fix indentation bug.
Right now FinalTimestamp is only set when the chunk is “new”; if a final chunk is duplicated (or arrives after a duplicate key), FinalTimestamp can remain unset. Also, Line 317 indentation is off (easy to miss in review; can hide logic mistakes).

Proposed fix
 func (a *Accumulator) addImageStreamChunk(requestID string, chunk *ImageStreamChunk, isFinalChunk bool) error {
 	acc := a.getOrCreateStreamAccumulator(requestID)
 	acc.mu.Lock()
 	defer acc.mu.Unlock()

 	// ...
 	chunkKey := imageChunkKey(chunk.ImageIndex, chunk.ChunkIndex)
 	if _, seen := acc.ImageChunksSeen[chunkKey]; !seen {
 		acc.ImageChunksSeen[chunkKey] = struct{}{}
 		acc.ImageStreamChunks = append(acc.ImageStreamChunks, chunk)
-		if isFinalChunk {
-		acc.FinalTimestamp = chunk.Timestamp
-		}
 	}
+	if isFinalChunk {
+		acc.FinalTimestamp = chunk.Timestamp
+	}
 	return nil
 }
core/providers/gemini/gemini.go-1614-1671 (1)

1614-1671: ImageGeneration: add nil/empty prompt validation + guard against nil request-body conversion.
schemas.IsImagenModel(request.Model) will panic if request is nil, and the request converter can still return nil (e.g., missing Input). Also, per learnings, prompt should be validated before dispatch.

Proposed fix
 func (provider *GeminiProvider) ImageGeneration(ctx *schemas.BifrostContext, key schemas.Key, request *schemas.BifrostImageGenerationRequest) (*schemas.BifrostImageGenerationResponse, *schemas.BifrostError) {
+	if request == nil || request.Input == nil || strings.TrimSpace(request.Input.Prompt) == "" {
+		return nil, providerUtils.NewBifrostOperationError(
+			schemas.ErrProviderCreateRequest,
+			fmt.Errorf("image generation prompt is required"),
+			provider.GetProviderKey(),
+		)
+	}
+
 	// Check if image gen is allowed for this provider
 	if err := providerUtils.CheckOperationAllowed(schemas.Gemini, provider.customProviderConfig, schemas.ImageGenerationRequest); err != nil {
 		return nil, err
 	}

 	// check for imagen models
 	if schemas.IsImagenModel(request.Model) {
 		return provider.handleImagenImageGeneration(ctx, key, request)
 	}
 	// Prepare body
 	jsonData, bifrostErr := providerUtils.CheckContextAndGetRequestBody(
 		ctx,
 		request,
-		func() (any, error) { return ToGeminiImageGenerationRequest(request), nil },
+		func() (any, error) {
+			reqBody := ToGeminiImageGenerationRequest(request)
+			if reqBody == nil {
+				return nil, fmt.Errorf("image generation request could not be converted to Gemini format")
+			}
+			return reqBody, nil
+		},
 		provider.GetProviderKey())
core/providers/gemini/gemini.go-1673-1749 (1)

1673-1749: Imagen handler: respect configured BaseURL/path overrides and avoid logging entire image payloads.

  • baseURL := "https://generativelanguage.googleapis.com/..." ignores provider.networkConfig.BaseURL and providerUtils.GetPathFromContext(...) overrides (common in this repo for routing/proxy).
  • Debug(... string(resp.Body())) can log huge base64 payloads (and potentially user content), which is risky and noisy.
Proposed fix
 func (provider *GeminiProvider) handleImagenImageGeneration(ctx *schemas.BifrostContext, key schemas.Key, request *schemas.BifrostImageGenerationRequest) (*schemas.BifrostImageGenerationResponse, *schemas.BifrostError) {
 	providerName := provider.GetProviderKey()

 	// Prepare Imagen request body
 	jsonData, bifrostErr := providerUtils.CheckContextAndGetRequestBody(
 		ctx,
 		request,
-		func() (any, error) { return ToImagenImageGenerationRequest(request), nil },
+		func() (any, error) {
+			reqBody := ToImagenImageGenerationRequest(request)
+			if reqBody == nil {
+				return nil, fmt.Errorf("imagen request could not be converted (missing input?)")
+			}
+			return reqBody, nil
+		},
 		providerName)
 	if bifrostErr != nil {
 		return nil, bifrostErr
 	}

-	baseURL := fmt.Sprintf("https://generativelanguage.googleapis.com/v1beta/models/%s:predict", request.Model)
+	path := providerUtils.GetPathFromContext(ctx, "/models/"+request.Model+":predict")
+	baseURL := provider.networkConfig.BaseURL + path

 	// ...
 	if resp.StatusCode() != fasthttp.StatusOK {
-		provider.logger.Debug(fmt.Sprintf("error from %s provider (Imagen): %s", providerName, string(resp.Body())))
+		provider.logger.Debug(fmt.Sprintf("error from %s provider (Imagen): status=%d", providerName, resp.StatusCode()))
 		return nil, parseGeminiError(resp, &providerUtils.RequestMetadata{
 			Provider:    providerName,
 			Model:       request.Model,
 			RequestType: schemas.ImageGenerationRequest,
 		})
 	}

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 17

Note

Due to the large number of review comments, Critical and Major severity comments were prioritized as inline comments.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (9)
plugins/semanticcache/utils.go (2)

88-116: Guard against nil ImageGenerationRequest in request-hash generation.

hashInput.Params = req.ImageGenerationRequest.Params will panic if req.ImageGenerationRequest is nil (even if RequestType says image gen). Other cases sometimes guard (e.g., Speech). Suggest mirroring that pattern.

Proposed diff
@@
 	case schemas.ImageGenerationRequest, schemas.ImageGenerationStreamRequest:
-		hashInput.Params = req.ImageGenerationRequest.Params
+		if req.ImageGenerationRequest != nil {
+			hashInput.Params = req.ImageGenerationRequest.Params
+		}
 	}

379-413: Don’t store full base64 images in semantic-cache metadata (likely size/cost blow-up).

metadata["response"] already contains the whole image generation response; additionally storing metadata["image_b64"] duplicates the largest fields and can exceed vector-store metadata limits / increase write costs. Strongly suggest storing only URLs (or a short hash/preview) and maybe counts, not full b64 blobs.

Proposed diff (drop b64 duplication; keep urls + revised prompts)
@@
 	if res.ImageGenerationResponse != nil {
 		var imageURLs []string
-		var imageB64 []string
 		var revisedPrompts []string
@@
 			if img.B64JSON != "" {
-				imageB64 = append(imageB64, img.B64JSON)
+				// Intentionally not duplicating base64 blobs into metadata.
 			}
@@
 		metadata["image_urls"] = imageURLs
-		metadata["image_b64"] = imageB64
 		metadata["revised_prompts"] = revisedPrompts
 	}
plugins/semanticcache/stream.go (1)

91-129: Fix potential nil deref in image-generation chunk sorting comparator.

In the new block, accumulator.Chunks[j].Response.ImageGenerationStreamResponse may be nil even when i is non-nil, which would panic during sort.

Proposed diff (nil-safe compare)
@@
 		if accumulator.Chunks[i].Response.ImageGenerationStreamResponse != nil {
+			if accumulator.Chunks[j].Response.ImageGenerationStreamResponse == nil {
+				// Push non-image-gen chunks after image-gen chunks deterministically
+				return true
+			}
 			// For image generation, sort by Index first, then ChunkIndex
 			if accumulator.Chunks[i].Response.ImageGenerationStreamResponse.Index != accumulator.Chunks[j].Response.ImageGenerationStreamResponse.Index {
 				return accumulator.Chunks[i].Response.ImageGenerationStreamResponse.Index < accumulator.Chunks[j].Response.ImageGenerationStreamResponse.Index
 			}
 			return accumulator.Chunks[i].Response.ImageGenerationStreamResponse.ChunkIndex < accumulator.Chunks[j].Response.ImageGenerationStreamResponse.ChunkIndex
 		}
transports/bifrost-http/integrations/genai.go (2)

36-62: Fix misrouting: IsEmbedding can shadow IsImageGeneration
Right now RequestConverter checks geminiReq.IsEmbedding before geminiReq.IsImageGeneration. In extractAndSetModelFromURL, :predict can set IsEmbedding = true while isImageGenerationRequest(r) can also set IsImageGeneration = true, causing the request to be treated as embedding. Make these flags mutually exclusive (or reorder the converter).

Proposed fix (mutual exclusivity + safer ordering)
- } else if geminiReq.IsEmbedding {
+ } else if geminiReq.IsImageGeneration {
+   return &schemas.BifrostRequest{
+     ImageGenerationRequest: geminiReq.ToBifrostImageGenerationRequest(),
+   }, nil
+ } else if geminiReq.IsEmbedding {
    return &schemas.BifrostRequest{
      EmbeddingRequest: geminiReq.ToBifrostEmbeddingRequest(),
    }, nil
  }
  ...
- } else if geminiReq.IsImageGeneration {
-   return &schemas.BifrostRequest{
-     ImageGenerationRequest: geminiReq.ToBifrostImageGenerationRequest(),
-   }, nil
- }
  r.IsImageGeneration = isImagenPredict || isImageGenerationRequest(r)
+ if r.IsImageGeneration {
+   r.IsEmbedding = false
+   r.IsSpeech = false
+   r.IsTranscription = false
+   r.IsCountTokens = false
+ }

Also applies to: 413-431


362-369: Avoid panics: type-assert model path param safely
modelStr := model.(string) can panic if the router ever passes a non-string. Defensive ok checking is cheap here.

Proposed fix
- modelStr := model.(string)
+ modelStr, ok := model.(string)
+ if !ok || modelStr == "" {
+   return fmt.Errorf("model parameter must be a non-empty string")
+ }
framework/streaming/types.go (1)

345-393: Bug: Transcription response extra fields look truncated (RawResponse/CacheDebug missing).
Right now the StreamTypeTranscription case sets RawRequest and then ends just before the StreamTypeImage case, never populating RawResponse/CacheDebug the way the other response types do.

Proposed fix
 case StreamTypeTranscription:
   transcriptionResp := p.Data.TranscriptionOutput
   if transcriptionResp == nil {
     transcriptionResp = &schemas.BifrostTranscriptionResponse{}
   }
   resp.TranscriptionResponse = transcriptionResp
   resp.TranscriptionResponse.ExtraFields = schemas.BifrostResponseExtraFields{
     RequestType:    schemas.TranscriptionRequest,
     Provider:       p.Provider,
     ModelRequested: p.Model,
     Latency:        p.Data.Latency,
   }
   if p.RawRequest != nil {
     resp.TranscriptionResponse.ExtraFields.RawRequest = p.RawRequest
   }
+  if p.Data.RawResponse != nil {
+    resp.TranscriptionResponse.ExtraFields.RawResponse = *p.Data.RawResponse
+  }
+  if p.Data.CacheDebug != nil {
+    resp.TranscriptionResponse.ExtraFields.CacheDebug = p.Data.CacheDebug
+  }
 case StreamTypeImage:
transports/bifrost-http/handlers/inference.go (1)

1258-1334: Major: [DONE] emission decision should not depend on the last processed chunk (Responses/ImageGen correctness).
Right now includeEventType is reset per chunk, but the post-loop [DONE] check uses its final value. If the last non-nil chunk doesn’t match the typed-event condition, you could incorrectly emit [DONE] for Responses/ImageGen streams. Based on learnings, Responses-style streams must not rely on [DONE].

Proposed fix
-  var includeEventType bool
+  var includeEventType bool
   // Use streaming response writer
   ctx.Response.SetBodyStreamWriter(func(w *bufio.Writer) {
@@
-    var skipDoneMarker bool
+    var skipDoneMarker bool
+    var streamUsesTypedEvents bool
@@
       for chunk := range stream {
@@
         includeEventType = false
         if chunk.BifrostResponsesStreamResponse != nil ||
           chunk.BifrostImageGenerationStreamResponse != nil ||
           (chunk.BifrostError != nil && (chunk.BifrostError.ExtraFields.RequestType == schemas.ResponsesStreamRequest || chunk.BifrostError.ExtraFields.RequestType == schemas.ImageGenerationStreamRequest)) {
           includeEventType = true
+          streamUsesTypedEvents = true
         }
@@
-    if !includeEventType && !skipDoneMarker {
+    if !streamUsesTypedEvents && !skipDoneMarker {
       // Send the [DONE] marker ...
core/providers/huggingface/huggingface.go (2)

114-218: URL path inconsistency in retry logic for hfInference provider.

The initial request builds the URL with originalModelName (line 128), but the retry (line 202) rebuilds it with modelName (validated) for non-skipValidation cases. Since hfInference embeds the model name in the URL path (/hf-inference/models/{modelName}/...), the retry will hit a different endpoint than the initial attempt.

The skipValidation optimization correctly avoids this for falAI, nebius, and together image generation by ensuring retryModelName equals originalModelName. However, for hfInference and other request types where skipValidation is false, the URL changes on retry after cache invalidation, creating an inconsistent retry behavior.

Either the initial URL should be built after validation, or the retry should consistently use originalModelName as originally provided.


220-265: Branch error parsing on isHFInferenceImageRequest or request type; all non-200 responses currently route through parseHuggingFaceImageError regardless of request type.

The parameter isHFInferenceImageRequest is unused in the error handler (line 251-252). This causes chat, embedding, speech, and transcription errors to be parsed through an image-specific error parser, which is incorrect. Additionally, metadata is passed as nil unlike the streaming path (line 1108), losing Provider/Model/RequestType context in error responses.

Refactor to either branch error parsing on isHFInferenceImageRequest or pass requestType to completeRequest and select the appropriate error handler, and include metadata.

🤖 Fix all issues with AI agents
In @core/providers/azure/azure.go:
- Around line 1212-1305: The ImageGenerationStream method should not attempt SSE
streaming for Azure because Azure OpenAI image generation doesn't support
streaming; replace the current implementation of ImageGenerationStream to
immediately return
providerUtils.NewUnsupportedOperationError(schemas.ImageGenerationStreamRequest,
provider.GetProviderKey()). Also ensure the non-streaming ImageGeneration path
sends the required "model" field: verify
openai.HandleOpenAIImageGenerationRequest is called with request.Model
(deployment) included in the request body/params per Azure docs (the function
call in ImageGeneration and the handler HandleOpenAIImageGenerationRequest
should ensure the "model" field is present).

In @core/providers/gemini/gemini.go:
- Around line 1673-1749: The Imagen handler hard-codes the endpoint URL in
handleImagenImageGeneration which bypasses provider.networkConfig.BaseURL and
providerUtils.GetPathFromContext; change the baseURL construction to use
provider.networkConfig.BaseURL + providerUtils.GetPathFromContext(ctx,
"/models/"+request.Model+":predict") (preserving the model substitution) so the
request respects configured BaseURL and any context path overrides, then use
that baseURL for req.SetRequestURI as before.

In @core/providers/gemini/images.go:
- Around line 123-180: The ToBifrostImageGenerationResponse function can
overwrite bifrostResp.Data with a nil slice and may panic when iterating
candidate.Content.Parts if any part is nil; ensure you keep bifrostResp.Data as
the initialized empty slice when no images are found by only assigning imageData
when it is non-nil/has length, and add nil guards when iterating
candidate.Content.Parts (check that candidate.Content != nil and each part !=
nil before accessing part.InlineData and part.InlineData.MIMEType); also only
append to imageMetadata and set ImageGenerationResponseParameters when you
actually pushed metadata elements.

In @core/providers/openai/openai_test.go:
- Around line 43-83: The test scenarios block named TestScenarios currently
disables every scenario except ImageGeneration/ImageGenerationStream; restore
full coverage by re-enabling the needed scenarios (set the boolean flags like
TextCompletion, SimpleChat, CompletionStream, MultiTurnConversation, ToolCalls,
Embedding, Transcription, ListModels, File* tests, Batch* tests,
StructuredOutputs, etc. to true) or replace the manual override with the
canonical default (e.g., use testutil.DefaultScenarios or the existing baseline
scenario constant) so that openai_test.go runs the full suite rather than only
image generation before merging.

In @core/providers/openai/openai.go:
- Around line 2637-2641: The streaming goroutine for image responses is missing
the context cancellation and timeout cleanup used elsewhere; update the
goroutine that currently defers close(responseChan) and
providerUtils.ReleaseStreamingResponse(resp) to also defer the same
cancellation/timeout handlers used in HandleOpenAIChatCompletionStreaming —
e.g., add defer calls to providerUtils.HandleStreamingContextCancellation(ctx,
cancel, resp, responseChan) and providerUtils.HandleStreamingExitTimeout(ctx,
cancel, resp, responseChan) (or the exact handler names used in that handler) so
the stream correctly propagates cancellations/timeouts and performs identical
cleanup.

In @docs/providers/supported-providers/nebius.mdx:
- Around line 153-156: Update the guidance_scale type and extraction to use
float64: change the documented type for guidance_scale from int to float in the
provider docs, change the struct field GuidanceScale *int to GuidanceScale
*float64 in the Nebius provider types, and replace the use of
SafeExtractIntPointer() with SafeExtractFloat64Pointer() where guidance_scale is
parsed (e.g., in the image extraction logic that currently calls
SafeExtractIntPointer for guidance_scale) so the value is handled as a float64
pointer throughout.

In @framework/modelcatalog/pricing.go:
- Around line 101-113: The code captures imageUsage for
ImageGenerationStreamResponse but CalculateCostFromUsage only treats request
types equal to schemas.ImageGenerationRequest as image pricing; to fix,
normalize stream request types to the image request type before calling
CalculateCostFromUsage (or pass a derived requestType variable): detect when
imageUsage != nil and extraFields.RequestType indicates a stream (e.g.,
"image_generation_stream" or the enum for ImageGenerationStreamRequest) and set
requestType = schemas.ImageGenerationRequest (or the equivalent constant) so
CalculateCostFromUsage treats stream responses as images; update the call site
that uses extraFields.RequestType to use this normalized requestType (affected
symbols: imageUsage, extraFields.RequestType, ImageGenerationStreamResponse,
CalculateCostFromUsage).

In @framework/streaming/images.go:
- Around line 13-110: In buildCompleteImageFromImageStreamChunks, preserve the
original provider timestamps and any streamed URL fields: don't set Created to
time.Now().Unix() for the multi-chunk path—use the first non-empty
chunk.Delta.CreatedAt (same source used in the single-chunk path) when
assembling finalResponse.ID/Created, and while reconstructing images, collect
and set per-image URL values (chunk.Delta.URL) similar to how B64JSON is
concatenated so the final schemas.BifrostImageGenerationResponse.Data entries
include URL when present (use the URL from the final chunk for that image if
multiple chunks provide it).

In @plugins/semanticcache/plugin_image_generation_test.go:
- Around line 11-216: Both tests are integration tests that should be gated;
update TestImageGenerationCacheBasicFunctionality and
TestImageGenerationSemanticSearch to early-skip when running in short mode or
when the OpenAI API key is missing (os.Getenv("OPENAI_API_KEY") == ""), and
consider allowing an opt-in build tag for CI. Insert checks at the top of each
test (use testing.Short() and the env var) to call t.Skipf with a clear message
if gating conditions are not met, and keep the existing
CreateContextWithCacheKey/CreateImageGenerationRequest logic unchanged so only
the tests’ entry points are gated.

In @tests/integrations/python/tests/test_google.py:
- Around line 1691-1856: Rename the unused fixture parameter test_config to
_test_config in test_41a_image_generation_simple, test_41b_imagen_predict, and
test_41c_image_generation_with_text to silence ARG002; in
test_41b_imagen_predict replace the broad except Exception with targeted
handlers for network/HTTP and JSON decode errors (catch
requests.exceptions.RequestException and json.JSONDecodeError around the
requests.post/response.json() block); and in test_41c_image_generation_with_text
replace the nested "if 'candidates' in response_data and
response_data['candidates']" and deep indexing with safe .get() calls (e.g.,
response_data.get("candidates", []) and candidate.get("content",
{}).get("parts", [])) when iterating to avoid KeyError and improve readability.

In @ui/app/workspace/logs/views/imageView.tsx:
- Around line 83-93: The pagination buttons render icons only and lack
accessible labels; update the two Button usages that call goToPrevious and
goToNext (the elements rendering ChevronLeft and ChevronRight) to include
descriptive aria-label attributes (e.g., "Previous image" and "Next image") and
optionally add matching title attributes for hover/tooltips so screen readers
and mouse users can discover the controls.

In @ui/lib/types/logs.ts:
- Around line 142-174: The BifrostImageGenerationData interface currently
requires index but the UI doesn't rely on it; change the
BifrostImageGenerationData type so index is optional (index?: number) to match
ImageMessageData and avoid forcing callers to manufacture it; then run a quick
grep for usages of BifrostImageGenerationData (and any code referencing
image.index) and ensure callers use the existing fallback pattern (e.g.,
image.index ?? 0) so nothing breaks.
🟡 Minor comments (6)
transports/bifrost-http/handlers/inference.go-1414-1499 (1)

1414-1499: Good: request validation and streaming hookup are in the right place; minor message nit.
The early model parsing + prompt checks are good; consider changing "prompt can not be empty" to "prompt cannot be empty" for consistency.

docs/openapi/schemas/inference/images.yaml-21-31 (1)

21-31: Missing "auto" option in size enum.

The Go struct ImageGenerationParameters (from relevant snippets) includes "auto" as a valid size option, but it's missing from this OpenAPI schema.

📝 Suggested fix
         size:
           type: string
           enum:
             - "256x256"
             - "512x512"
             - "1024x1024"
             - "1792x1024"
             - "1024x1792"
             - "1536x1024"
             - "1024x1536"
+            - "auto"
           description: Size of the generated image
core/providers/openai/openai.go-2857-2860 (1)

2857-2860: Missing context error check before processing scanner errors.

Other streaming handlers check ctx.Err() before processing scanner errors to let the defer handle cancellation/timeout. Without this check, scanner errors from cancelled contexts may be incorrectly logged and processed.

🐛 Proposed fix
 		if err := scanner.Err(); err != nil {
+			// If context was cancelled/timed out, let defer handle it
+			if ctx.Err() != nil {
+				return
+			}
 			logger.Warn(fmt.Sprintf("Error reading stream: %v", err))
 			providerUtils.ProcessAndSendError(ctx, postHookRunner, err, responseChan, schemas.ImageGenerationStreamRequest, providerName, request.Model, logger)
 		}
core/providers/gemini/images.go-348-389 (1)

348-389: MIME parsing fallback should strip parameters (e.g., image/png; charset=binary).

convertMimeTypeToExtension will return `png; charset=binary` today. Consider stripping parameters (splitting on `;`) first.
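A minimal sketch of the suggested normalization, using the standard library's `mime.ParseMediaType` with a manual fallback (the helper name `stripMimeParams` is illustrative, not the provider's actual API):

```go
package main

import (
	"fmt"
	"mime"
	"strings"
)

// stripMimeParams normalizes a MIME type like "image/png; charset=binary"
// down to "image/png" before extension lookup. mime.ParseMediaType also
// lowercases the media type, which keeps lookups case-insensitive.
func stripMimeParams(mimeType string) string {
	if mt, _, err := mime.ParseMediaType(mimeType); err == nil {
		return mt
	}
	// Fallback: split on ';' manually if parsing fails.
	base, _, _ := strings.Cut(mimeType, ";")
	return strings.TrimSpace(base)
}

func main() {
	fmt.Println(stripMimeParams("image/png; charset=binary")) // image/png
}
```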

core/internal/testutil/test_retry_framework.go-2947-3087 (1)

2947-3087: Minor: function name/comment mismatch (StreamValidationRetry vs WithImageGenerationStreamRetry).

Either rename the function to WithImageGenerationStreamValidationRetry or adjust the comment to match what the API is.

core/providers/gemini/images.go-182-250 (1)

182-250: Use SafeExtractSafetySettings helper and validate prompt is not empty before creating request.

The SafetySettings extraction at lines 201–209 bypasses the type-safe helper already defined and used elsewhere in the codebase. Replace the direct type assertion safetySettings.([]SafetySetting) with the SafeExtractSafetySettings helper (see gemini/transcription.go:123 and gemini/responses.go:125 for the pattern).

Additionally, validate that the prompt is not empty before returning the request. Return nil when bifrostReq.Input == nil or the prompt is blank, so CheckContextAndGetRequestBody can raise a structured error instead of allowing a malformed request to propagate downstream.

🧹 Nitpick comments (39)
core/schemas/plugin.go (1)

49-51: Stale comment references old function name.

The comment still references the unexported caseInsensitiveLookup name, but the function has been renamed to the exported CaseInsensitiveLookup.

📝 Suggested fix
-// caseInsensitiveLookup looks up a key in a case-insensitive manner for a map of strings
+// CaseInsensitiveLookup looks up a key in a case-insensitive manner for a map of strings
 // Returns the value if found, otherwise an empty string
 func CaseInsensitiveLookup(data map[string]string, key string) string {
docs/providers/supported-providers/vertex.mdx (1)

25-25: Verify the endpoint value in the supported operations table.

The endpoint column shows /generateContent or /predict (Imagen), but the endpoint selection section (lines 392-394) shows more complete paths. Consider aligning with the detailed endpoint format for consistency, or clarify that these are simplified endpoint suffixes.

docs/providers/supported-providers/gemini.mdx (1)

571-576: Consider clarifying valid aspect ratio mappings.

The size conversion mentions aspect ratios like "1:1", "3:4", "4:3", "9:16", "16:9", but it's unclear what happens if a user provides a size that doesn't map cleanly to these ratios (e.g., "800x600" which is 4:3 but "850x600" which isn't). Consider adding a note about how non-standard ratios are handled or if they default to the nearest supported ratio.
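If the intended behavior is nearest-ratio snapping, the docs could describe it along these lines; this is a hypothetical sketch (`nearestAspectRatio` is not the converter's actual name), assuming the five ratios listed above:

```go
package main

import (
	"fmt"
	"math"
)

// nearestAspectRatio snaps an arbitrary WxH size to the closest
// Gemini-supported aspect ratio by comparing width/height quotients.
func nearestAspectRatio(width, height float64) string {
	supported := map[string]float64{
		"1:1":  1.0,
		"3:4":  3.0 / 4.0,
		"4:3":  4.0 / 3.0,
		"9:16": 9.0 / 16.0,
		"16:9": 16.0 / 9.0,
	}
	target := width / height
	best, bestDiff := "1:1", math.MaxFloat64
	for ratio, value := range supported {
		if diff := math.Abs(value - target); diff < bestDiff {
			best, bestDiff = ratio, diff
		}
	}
	return best
}

func main() {
	fmt.Println(nearestAspectRatio(800, 600)) // 4:3 (exact)
	fmt.Println(nearestAspectRatio(850, 600)) // non-standard ratio, snapped
}
```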

framework/configstore/tables/modelpricing.go (1)

40-46: LGTM! Consider adding cache creation cost for symmetry.

The new image generation pricing fields are well-structured with proper GORM column tags and nullable types. The naming follows the existing conventions.

One consideration: you have CacheReadInputImageTokenCost but no corresponding CacheCreationInputImageTokenCost. The text token pricing has both CacheReadInputTokenCost and CacheCreationInputTokenCost. If image token caching follows the same pattern, you may want to add the creation cost field for completeness.

Optional: Add cache creation cost field for symmetry
 	// Image generation pricing
 	InputCostPerImageToken       *float64 `gorm:"default:null;column:input_cost_per_image_token" json:"input_cost_per_image_token,omitempty"`
 	OutputCostPerImageToken      *float64 `gorm:"default:null;column:output_cost_per_image_token" json:"output_cost_per_image_token,omitempty"`
 	InputCostPerImage            *float64 `gorm:"default:null;column:input_cost_per_image" json:"input_cost_per_image,omitempty"`
 	OutputCostPerImage           *float64 `gorm:"default:null;column:output_cost_per_image" json:"output_cost_per_image,omitempty"`
 	CacheReadInputImageTokenCost *float64 `gorm:"default:null;column:cache_read_input_image_token_cost" json:"cache_read_input_image_token_cost,omitempty"`
+	CacheCreationInputImageTokenCost *float64 `gorm:"default:null;column:cache_creation_input_image_token_cost" json:"cache_creation_input_image_token_cost,omitempty"`
core/providers/xai/xai.go (1)

230-244: Minor grammar fix in doc comment.

Line 230: "a image" → "an image".

Verify xAI image generation API support.

This implementation assumes xAI's API is OpenAI-compatible for image generation at /v1/images/generations. Please confirm xAI actually exposes this endpoint.

plugins/semanticcache/utils.go (2)

3-15: Verify sonic is safe/compatible for this repo’s Go/runtime constraints before relying on it.

sonic can be great for throughput, but it’s an additional dependency with different performance/unsafe characteristics than encoding/json. Please confirm it’s supported in your Go version/build flags and that marshaling output is acceptable for how the cached metadata["response"] is consumed.


477-516: Image-generation input normalization + params→metadata extraction are reasonable; consider key-collision strategy.

The new extractImageGenerationParametersToMetadata mirrors other param extractors well. One thing to sanity-check across the whole stack: metadata keys like "n", "seed", "size" are now shared across multiple request types—if downstream filtering assumes keys are type-specific, you may want a "request_type" tag or image_* prefixing.

Also applies to: 650-657, 956-1008
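One way the collision concern could be addressed, sketched with illustrative names (this is not the plugin's current API):

```go
package main

import "fmt"

// prefixImageParams namespaces image-generation metadata keys so generic
// names like "n", "seed", or "size" can't collide with the same keys
// emitted by other request types, and tags the entry with its request type.
func prefixImageParams(params map[string]any) map[string]any {
	out := make(map[string]any, len(params)+1)
	out["request_type"] = "image_generation"
	for k, v := range params {
		out["image_"+k] = v
	}
	return out
}

func main() {
	fmt.Println(prefixImageParams(map[string]any{"n": 2, "size": "1024x1024"}))
}
```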

tests/integrations/python/tests/test_openai.py (1)

1170-1257: Reduce flakiness in cross-provider image-gen tests (lint + per-parameter compatibility).

  1. Unused fixture arg (Ruff ARG002): rename `test_config` to `_test_config` in test_52a..test_52d.
  2. Quality/size compatibility: quality="low" and size "1024x1536" won’t be accepted by every provider/model, even ones that support basic image generation; consider skipping on “unsupported parameter” / “invalid size” errors unless every targeted provider is guaranteed to accept these parameters.
Proposed diff (lint-only)
@@
-    def test_52a_image_generation_simple(self, test_config, provider, model, vk_enabled):
+    def test_52a_image_generation_simple(self, _test_config, provider, model, vk_enabled):
@@
-    def test_52b_image_generation_multiple(self, test_config, provider, model, vk_enabled):
+    def test_52b_image_generation_multiple(self, _test_config, provider, model, vk_enabled):
@@
-    def test_52c_image_generation_quality(self, test_config, provider, model, vk_enabled):
+    def test_52c_image_generation_quality(self, _test_config, provider, model, vk_enabled):
@@
-    def test_52d_image_generation_different_sizes(self, test_config, provider, model, vk_enabled):
+    def test_52d_image_generation_different_sizes(self, _test_config, provider, model, vk_enabled):
core/providers/nebius/nebius_test.go (1)

47-49: Minor: Consider consistent ordering of test scenario flags.

The ImageGenerationStream flag (line 49) is placed after CompleteEnd2End (line 48), whereas in gemini_test.go it immediately follows ImageGeneration. Consider aligning the ordering for consistency across provider tests.

Suggested reordering for consistency
 			MultipleImages:        true,
 			ImageGeneration:       true,
+			ImageGenerationStream: false,
 			CompleteEnd2End:       true,
-			ImageGenerationStream: false,
 			Embedding:             true, // Nebius supports embeddings
docs/providers/supported-providers/xai.mdx (1)

124-141: LGTM - Image generation documentation section.

The documentation clearly explains request/response conversion, endpoint path, and streaming limitations. The reference to OpenAI Image Generation docs is helpful for detailed parameter information.

Minor style suggestion: The static analysis tool flagged three lines starting with similar patterns (Model, Parameters, Endpoint). Consider varying the format for improved readability:

📝 Optional style improvement
-- **Model & Prompt**: `bifrostReq.Model` → `req.Model`, `bifrostReq.Input.Prompt` → `req.Prompt`
-- **Parameters**: All fields from `bifrostReq.Params` (`ImageGenerationParameters`) are embedded directly into the request struct via struct embedding
-- **Endpoint**: `/v1/images/generations`
+| Field | Mapping |
+|-------|---------|
+| Model & Prompt | `bifrostReq.Model` → `req.Model`, `bifrostReq.Input.Prompt` → `req.Prompt` |
+| Parameters | All fields from `bifrostReq.Params` (`ImageGenerationParameters`) embedded via struct embedding |
+| Endpoint | `/v1/images/generations` |
core/providers/nebius/types.go (1)

26-29: Consider Go naming convention for URL acronym.

Go convention prefers URL over Url for acronyms. However, since this must match the JSON field name "url" for API compatibility, this is acceptable as-is.

core/internal/testutil/validation_presets.go (1)

222-222: Remove leftover implementation comment.

This comment appears to be a leftover note from development and should be removed as it's redundant — the function is already in this file.

🧹 Proposed fix
-// In validation_presets.go - add this function
+// ImageGenerationExpectations returns validation expectations for image generation scenarios
core/providers/openai/openai_image_test.go (1)

12-13: Consider renaming test function for clarity.

The function name TestImageGenerationStreamingRequestConversion suggests it tests streaming-specific behavior, but it actually tests the general ToOpenAIImageGenerationRequest conversion. Consider renaming to TestToOpenAIImageGenerationRequest for accuracy.

✏️ Suggested rename
-// TestImageGenerationStreamingRequestConversion
-func TestImageGenerationStreamingRequestConversion(t *testing.T) {
+// TestToOpenAIImageGenerationRequest tests Bifrost to OpenAI request conversion
+func TestToOpenAIImageGenerationRequest(t *testing.T) {
core/schemas/provider.go (1)

352-357: Minor grammar issue in doc comments.

The comments say "a image" which should be "an image".

📝 Suggested fix
-	// ImageGeneration performs a image generation request
+	// ImageGeneration performs an image generation request
 	ImageGeneration(ctx *BifrostContext, key Key, request *BifrostImageGenerationRequest) (
 		*BifrostImageGenerationResponse, *BifrostError)
-	// ImageGenerationStream performs a image generation stream request
+	// ImageGenerationStream performs an image generation stream request
 	ImageGenerationStream(ctx *BifrostContext, postHookRunner PostHookRunner, key Key,
 		request *BifrostImageGenerationRequest) (chan *BifrostStream, *BifrostError)
core/internal/testutil/response_validation.go (1)

1096-1167: Harden ProviderSpecific parsing (avoid brittle .(int) / .(string) assertions)
expectations.ProviderSpecific["min_images"].(int) / ["expected_size"].(string) will silently skip validation when the value isn’t the exact asserted type (common if values originate from JSON/untyped maps). Consider accepting int/float64/json.Number and coercing. Also consider asserting response.ExtraFields.RequestType is an image generation type when present, to catch misrouted responses early.
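A sketch of the suggested coercion, with illustrative names (not the testutil API): accept int, int64, and float64 so values that survive a JSON round-trip (where numbers decode as float64) still validate.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// coerceInt accepts the numeric types an untyped map is likely to carry
// and reports whether the value was usable, instead of a brittle .(int).
func coerceInt(v any) (int, bool) {
	switch n := v.(type) {
	case int:
		return n, true
	case int64:
		return int(n), true
	case float64:
		return int(n), true
	}
	return 0, false
}

func main() {
	// JSON unmarshals numbers into float64 by default, so a plain .(int)
	// assertion on this map would fail silently.
	var m map[string]any
	_ = json.Unmarshal([]byte(`{"min_images": 2}`), &m)
	n, ok := coerceInt(m["min_images"])
	fmt.Println(n, ok) // 2 true
}
```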

ui/app/workspace/logs/views/imageView.tsx (1)

22-39: Consider memoizing images to avoid effect churn
images is a new array each render, so the [images, totalImages] effect will run frequently. This likely bails out, but useMemo (or depending on imageOutput?.data and totalImages) would make the clamp effect more stable.

core/providers/openai/images.go (1)

31-57: Avoid Params aliasing (params = &request.ImageGenerationParameters)
Returning a pointer to the receiver’s embedded params can leak mutations across layers. Safer to copy into a local before taking the address (and deep-copy ExtraParams if you rely on it later).

Proposed fix
- var params *schemas.ImageGenerationParameters
+ var params *schemas.ImageGenerationParameters
  if request.N != nil || request.Background != nil || request.Moderation != nil ||
    ...
    len(request.ExtraParams) > 0 {
-   params = &request.ImageGenerationParameters
+   p := request.ImageGenerationParameters
+   params = &p
  }
docs/openapi/paths/inference/images.yaml (2)

3-58: Clarify where stream is modeled (query vs body) so clients can discover SSE behavior.
The description says stream=true, but this path definition doesn’t declare a query parameter. If stream is a request-body field (likely), consider explicitly stating “stream in request body” or adding a parameters: entry if it’s query-based. This reduces SDK/client confusion.


13-20: Avoid over-asserting “Supported Providers” in the spec unless it’s enforced.
If runtime support is feature-flagged / stack-dependent, consider wording like “Examples” or “Known supported providers” to avoid docs drifting from actual routing/provider enablement.

transports/bifrost-http/integrations/router.go (1)

1085-1087: Done-marker suppression for /images/generations is directionally right, but path-based checks are brittle.
You’re correctly avoiding data: [DONE] for typed-event streams, but strings.Contains(config.Path, "/images/generations") can drift if paths change. A per-route DisableDoneMarker (or a StreamTerminationMode) would be easier to maintain than hardcoding substrings.

Also applies to: 1190-1193
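A minimal sketch of the per-route flag idea; `RouteConfig` and `DisableDoneMarker` here are hypothetical names, not the router's existing types:

```go
package main

import "fmt"

// RouteConfig carries an explicit opt-out from the SSE "data: [DONE]"
// marker, so typed-event streams (Responses, image generation) declare the
// behavior per route instead of relying on path substring matching.
type RouteConfig struct {
	Path              string
	DisableDoneMarker bool
}

func shouldSendDone(cfg RouteConfig) bool {
	return !cfg.DisableDoneMarker
}

func main() {
	imageRoute := RouteConfig{Path: "/v1/images/generations", DisableDoneMarker: true}
	chatRoute := RouteConfig{Path: "/v1/chat/completions"}
	fmt.Println(shouldSendDone(imageRoute), shouldSendDone(chatRoute)) // false true
}
```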

core/providers/huggingface/images.go (2)

95-129: Size parsing is strict for Nebius but best-effort for fal-ai; consider consistent behavior.
Right now, invalid size errors for Nebius but silently no-ops for fal-ai (and stream). If you want predictable UX, either (a) validate size format once in a shared layer, or (b) consistently “best-effort” across providers and document it.

Also applies to: 156-231
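A sketch of the shared validation layer in option (a); `parseSize` is an illustrative name, and each provider would still decide whether a failure is a hard error or a best-effort fallback:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseSize parses a "WxH" size string once, so provider code branches on
// a single, consistent error instead of re-implementing the split.
func parseSize(size string) (width, height int, err error) {
	w, h, ok := strings.Cut(size, "x")
	if !ok {
		return 0, 0, fmt.Errorf("size %q is not in WxH form", size)
	}
	if width, err = strconv.Atoi(w); err != nil {
		return 0, 0, fmt.Errorf("invalid width in %q: %w", size, err)
	}
	if height, err = strconv.Atoi(h); err != nil {
		return 0, 0, fmt.Errorf("invalid height in %q: %w", size, err)
	}
	return width, height, nil
}

func main() {
	w, h, err := parseSize("1024x768")
	fmt.Println(w, h, err) // 1024 768 <nil>
}
```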


233-323: Response conversion is reasonable; consider populating ID/Created when upstream provides it.
Some branches return only Model + Data. If upstream responses contain IDs/timestamps (Nebius/fal/together), mapping them would improve consistency with other providers’ image responses.

core/providers/gemini/responses.go (1)

2443-2556: Consider avoiding decode+re-encode for base64 image/audio/file inline data (if Blob.Data invariant is base64).
Right now you decode base64 and then re-encode to base64 again for Blob.Data. If the invariant is already “base64 string”, passing it through avoids CPU/memory overhead on large media payloads.

plugins/semanticcache/plugin_image_generation_test.go (2)

71-86: Avoid failing on performance ratios in tests (log-only), unless the repo explicitly treats perf regressions as test failures.
Even the “10x slower” threshold can be exceeded under CI contention, causing noisy failures unrelated to correctness. Prefer logging perf stats or gating perf asserts behind an env flag.

Also applies to: 198-213, 377-396


103-196: Semantic cache test may be non-deterministic; consider soft-asserting or widening diagnostics.
A strict “semantic match expected” can fail if embeddings shift or if the cache keying differs; consider logging CacheDebug fields and skipping (or retrying) if no semantic hit occurs, depending on how stable you expect this to be.

core/bifrost.go (1)

1008-1094: Add ctx-nil handling consistency + clarify nil-response behavior for ImageGenerationRequest.

ImageGenerationRequest relies on handleRequest() to normalize ctx when nil, which is fine, but the method’s own nil-response guard (response == nil || response.ImageGenerationResponse == nil) can turn a plugin short-circuit/mis-route into a generic “nil response” error that will not trigger fallbacks (fallbacks already happened inside handleRequest). Consider either:

  • treating this as an internal invariant violation (and include more details), or
  • moving this check into provider dispatch where you can attach provider/model and preserve a clearer cause.
core/schemas/images.go (2)

4-16: Request schema: consider serializing ExtraParams (or document that it’s internal-only).

ImageGenerationParameters.ExtraParams is json:"-", so callers can’t actually send arbitrary provider params over the API surface unless there’s separate merge logic elsewhere. If the intent is to support pass-through params, consider json:"extra_params,omitempty" (and validate/sanitize) or document clearly that this is internal/populated by plugins/transports only.


22-38: Parameter typing: prefer map[string]any over map[string]interface{}.

Not a behavior change, but aligns with modern Go style and improves readability across the codebase.

framework/streaming/images.go (1)

112-205: Potential perf issue: rebuilding full base64 on every delta chunk is O(n²).

processAccumulatedImageStreamingChunks calls buildCompleteImageFromImageStreamChunks even for non-final chunks, which sorts and concatenates all chunks every time. If images are large / chunk counts high, this will be a hot path.

Consider returning a true delta response for non-final chunks (or incrementally appending to per-image builders stored in the accumulator).

Also applies to: 322-336
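The incremental alternative could look like the following sketch; type and method names are illustrative, not the framework's API. Each base64 delta is appended to a per-image strings.Builder as it arrives, so per-chunk work is O(len(delta)) instead of re-sorting and re-concatenating everything:

```go
package main

import (
	"fmt"
	"strings"
)

// imageAccumulator stores one growing base64 buffer per image index.
type imageAccumulator struct {
	builders map[int]*strings.Builder
}

func newImageAccumulator() *imageAccumulator {
	return &imageAccumulator{builders: make(map[int]*strings.Builder)}
}

// append adds a base64 delta for the given image without touching
// previously accumulated data.
func (a *imageAccumulator) append(imageIndex int, b64Delta string) {
	b, ok := a.builders[imageIndex]
	if !ok {
		b = &strings.Builder{}
		a.builders[imageIndex] = b
	}
	b.WriteString(b64Delta)
}

// final returns the fully assembled base64 payload for an image.
func (a *imageAccumulator) final(imageIndex int) string {
	if b, ok := a.builders[imageIndex]; ok {
		return b.String()
	}
	return ""
}

func main() {
	acc := newImageAccumulator()
	acc.append(0, "iVBO")
	acc.append(0, "Rw0K")
	fmt.Println(acc.final(0)) // iVBORw0K
}
```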

framework/streaming/types.go (1)

360-393: Consider setting ImageGenerationResponse.Created when you synthesize a fallback response.
You already set ID and Model; setting Created from p.Data.StartTimestamp would better match other API responses.

core/providers/huggingface/types.go (1)

352-440: Optional: consider moving the image-generation structs into a dedicated file (e.g., images_types.go).
This file is already a “kitchen sink” and will be harder to maintain as more providers/models are added.

core/providers/openai/types.go (1)

546-546: Define constants for ImageGenerationEventType to ensure consistency.

The type is declared but no constants are defined. Based on learnings, the SSE event type values should be "image_generation.partial_image", "image_generation.completed", and "error". Without defined constants, consumers may use inconsistent string literals.

♻️ Suggested constants
 type ImageGenerationEventType string
+
+const (
+	ImageGenerationPartialImage ImageGenerationEventType = "image_generation.partial_image"
+	ImageGenerationCompleted    ImageGenerationEventType = "image_generation.completed"
+	ImageGenerationError        ImageGenerationEventType = "error"
+)
core/providers/openai/openai.go (1)

2706-2706: Use constant instead of string literal for error type check.

For consistency with the established patterns and to avoid typos, consider using a constant instead of the string literal "error". This aligns with the learnings about enforcing SSE event type values consistently.

♻️ Suggested change
-			if response.Type == "error" {
+			if response.Type == ImageGenerationError {

This requires defining ImageGenerationError as a constant in types.go (as suggested in the earlier comment).

core/providers/gemini/types.go (2)

64-88: Clarify/normalize the dual request shapes (GenerateContent vs Imagen :predict) to avoid accidental mixed payloads.

GeminiGenerationRequest now carries Imagen-only fields (Instances, Parameters) alongside Contents. Consider enforcing mutual exclusivity at the conversion layer (or adding a comment/guard) so callers don’t accidentally populate both and send an ambiguous request.

Also applies to: 79-85


90-93: Consolidate Imagen instance modeling—prompt must be required, not optional.

Two conflicting types exist for Imagen instances:

  • ImagenInstance.Prompt (string, omitempty) — optional
  • GeminiImagenRequest.Instances[].Prompt (*string) — required

Google Vertex AI Imagen API requires the prompt field. The optional semantics in ImagenInstance contradict this contract and force defensive checks in conversion logic (images.go:27). Use a single exported type across both GeminiGenerationRequest.Instances and GeminiImagenRequest.Instances, and make Prompt required (*string without omitempty).

Also applies to: 1702-1707
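One way to consolidate, sketched with a standalone type (the real struct lives in core/providers/gemini/types.go and may carry more fields): a single exported `ImagenInstance` with `Prompt *string` and no omitempty, plus a `Validate` method so the required-prompt check lives in one place rather than in scattered defensive nil checks.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// ImagenInstance is a single canonical instance type shared by both request
// shapes. Prompt is *string without omitempty, so a missing prompt is
// visible as nil and always serializes (as null) rather than disappearing.
type ImagenInstance struct {
	Prompt *string `json:"prompt"`
}

// Validate rejects instances whose required prompt is missing or empty.
func (i ImagenInstance) Validate() error {
	if i.Prompt == nil || *i.Prompt == "" {
		return errors.New("imagen instance: prompt is required")
	}
	return nil
}

func main() {
	p := "a watercolor fox"
	ok := ImagenInstance{Prompt: &p}
	bad := ImagenInstance{}

	fmt.Println(ok.Validate())  // <nil>
	fmt.Println(bad.Validate()) // imagen instance: prompt is required

	b, _ := json.Marshal(ok)
	fmt.Println(string(b)) // {"prompt":"a watercolor fox"}
}
```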

core/providers/gemini/images.go (2)

252-346: Validate empty prompt and add legacy key support consistently for Imagen-specific ExtraParams.

ToImagenImageGenerationRequest should return nil if bifrostReq.Input.Prompt is empty (same structured-error rationale). Also, since you’re supporting legacy snake_case keys for the safety settings, consider doing the same for image_size / aspect_ratio if those keys ever existed.


440-468: Consider surfacing filtered/blocked images (raiFilteredReason) as errors or metadata.

Right now RaiFilteredReason is ignored; consumers may get empty/invalid image data without explanation. At least propagate it into ExtraFields / error path when present.
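A sketch of one propagation strategy, using a loose stand-in for the predict result entry (field names here mirror the Imagen response shape but the surrounding types are illustrative): keep the images that were generated, and surface filtered entries as an error so callers can distinguish "blocked" from "empty".

```go
package main

import "fmt"

// prediction loosely mirrors an Imagen :predict result entry:
// either image bytes or a responsible-AI filter reason.
type prediction struct {
	BytesBase64Encoded string
	RaiFilteredReason  string
}

// collectImages keeps generated images and surfaces filtered entries as an
// error instead of silently returning fewer (or empty) images.
func collectImages(preds []prediction) ([]string, error) {
	var images []string
	var filtered []string
	for _, p := range preds {
		if p.RaiFilteredReason != "" {
			filtered = append(filtered, p.RaiFilteredReason)
			continue
		}
		images = append(images, p.BytesBase64Encoded)
	}
	if len(filtered) > 0 {
		return images, fmt.Errorf("%d image(s) filtered by responsible-AI checks: %v",
			len(filtered), filtered)
	}
	return images, nil
}

func main() {
	imgs, err := collectImages([]prediction{
		{BytesBase64Encoded: "AAA="},
		{RaiFilteredReason: "safety"},
	})
	fmt.Println(len(imgs), err)
}
```

If a hard error is too strict (partial results may still be useful), the same loop could instead attach the reasons to ExtraFields as metadata.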

core/providers/huggingface/huggingface.go (1)

968-1022: Streaming URL construction: verify it matches router expectations and custom path overrides.

You bypass getInferenceProviderRouteURL and assemble "/fal-ai/%s/stream" directly (then run through GetRequestPath). Please confirm the final URL matches the non-stream route conventions and any custom provider overrides.

core/internal/testutil/test_retry_framework.go (1)

980-994: DefaultImageGenerationRetryConfig returns TestRetryConfig (generic) — ensure typed configs follow the repo convention.

Given the convention in core/internal/testutil/ (generic GetTestRetryConfigForScenario() then build typed config copying only selected fields), please ensure new image-generation tests construct ImageGenerationRetryConfig the same way (with an empty Conditions slice unless explicitly needed).

Based on learnings, the typed-config construction convention should be applied consistently.

Also applies to: 1250-1251
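The convention can be sketched as follows; the type and field names here are simplified stand-ins for the testutil types, not the repo's exact definitions:

```go
package main

import "fmt"

// Hypothetical mirrors of the testutil types: a generic retry config and a
// typed image-generation variant that copies only selected fields.
type TestRetryConfig struct {
	MaxAttempts int
	BackoffMs   int
	Conditions  []string
}

type ImageGenerationRetryConfig struct {
	MaxAttempts int
	BackoffMs   int
	Conditions  []string // left empty unless a test explicitly needs one
}

// toImageGenerationRetryConfig follows the convention: start from the
// generic config, copy selected fields, keep Conditions empty.
func toImageGenerationRetryConfig(g TestRetryConfig) ImageGenerationRetryConfig {
	return ImageGenerationRetryConfig{
		MaxAttempts: g.MaxAttempts,
		BackoffMs:   g.BackoffMs,
		Conditions:  []string{},
	}
}

func main() {
	generic := TestRetryConfig{MaxAttempts: 3, BackoffMs: 250, Conditions: []string{"empty-response"}}
	typed := toImageGenerationRetryConfig(generic)
	fmt.Println(typed.MaxAttempts, len(typed.Conditions)) // 3 0
}
```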


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 6

Note

Due to the large number of review comments, Critical severity comments were prioritized as inline comments.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (25)
docs/providers/supported-providers/groq.mdx (1)

16-35: Add Image Generation to the Unsupported Operations note for consistency.

Image Generation is marked as ❌ in the Supported Operations table (line 25) but is omitted from the "Unsupported Operations" explanation (line 34), creating a documentation inconsistency. The implementation correctly returns UnsupportedOperationError for both ImageGeneration and ImageGenerationStream methods.

Proposed doc fix
-**Unsupported Operations** (❌): Embeddings, Speech, Transcriptions, Files, and Batch are not supported by the upstream Groq API. These return `UnsupportedOperationError`.
+**Unsupported Operations** (❌): Image Generation, Embeddings, Speech, Transcriptions, Files, and Batch are not supported by the upstream Groq API. These return `UnsupportedOperationError`.
plugins/semanticcache/utils.go (3)

88-116: Prevent nil deref for image-generation requests in hash generation.
hashInput.Params = req.ImageGenerationRequest.Params (Line 115) will panic if RequestType is image-generation but req.ImageGenerationRequest is nil (similar guard exists for Speech).

Proposed fix
 case schemas.TranscriptionRequest, schemas.TranscriptionStreamRequest:
 	hashInput.Params = req.TranscriptionRequest.Params
 case schemas.ImageGenerationRequest, schemas.ImageGenerationStreamRequest:
-	hashInput.Params = req.ImageGenerationRequest.Params
+	if req.ImageGenerationRequest != nil {
+		hashInput.Params = req.ImageGenerationRequest.Params
+	}
 }

379-413: Guard against res == nil and avoid unbounded metadata growth (esp. base64 duplication).

  1. res.ImageGenerationResponse access (Line 392) will panic if res is nil.
  2. Storing metadata["response"] plus image_b64 (Line 410) can explode metadata size and potentially exceed vector store limits / hurt performance (base64 images can be very large). Consider storing only URLs, counts, hashes, or gating base64 fields behind a config/size cap.
Proposed fix (nil guard + lighter metadata defaults)
 func (plugin *Plugin) addSingleResponse(ctx context.Context, responseID string, res *schemas.BifrostResponse, embedding []float32, metadata map[string]interface{}, ttl time.Duration) error {
+	if res == nil {
+		return fmt.Errorf("failed to store response: nil response")
+	}
 	// Marshal response as string
 	responseData, err := sonic.Marshal(res)
 	if err != nil {
 		return fmt.Errorf("failed to marshal response: %w", err)
 	}

 	// Add response field to metadata
 	metadata["response"] = string(responseData)
 	metadata["stream_chunks"] = []string{}

 	// image specific metadata
 	if res.ImageGenerationResponse != nil {
 		var imageURLs []string
-		var imageB64 []string
 		var revisedPrompts []string

 		for _, img := range res.ImageGenerationResponse.Data {
 			if img.URL != "" {
 				imageURLs = append(imageURLs, img.URL)
 			}
-			if img.B64JSON != "" {
-				imageB64 = append(imageB64, img.B64JSON)
-			}
 			if img.RevisedPrompt != "" {
 				revisedPrompts = append(revisedPrompts, img.RevisedPrompt)
 			}
 		}

 		metadata["image_urls"] = imageURLs
-		metadata["image_b64"] = imageB64
 		metadata["revised_prompts"] = revisedPrompts
 	}

480-516: Prevent nil deref in getInputForCaching for image generation.
return req.ImageGenerationRequest.Input (Line 513) will panic if req.ImageGenerationRequest is nil.

Proposed fix
 case schemas.TranscriptionRequest, schemas.TranscriptionStreamRequest:
 	return req.TranscriptionRequest.Input
 case schemas.ImageGenerationRequest, schemas.ImageGenerationStreamRequest:
-	return req.ImageGenerationRequest.Input
+	if req.ImageGenerationRequest == nil {
+		return nil
+	}
+	return req.ImageGenerationRequest.Input
 default:
 	return nil
 }
docs/providers/supported-providers/overview.mdx (1)

53-61: Document "Images (stream)" semantics in Notes

The matrix includes an "Images (stream)" column, but the Notes section defines "Images" without explaining the streaming variant. Add a note clarifying that streaming is invoked by passing stream=true in the request to /v1/images/generations, which triggers Server-Sent Events (SSE) delivery with base64-encoded image chunks.

Suggested addition: "Images (stream)" refers to streaming image generation using the same /v1/images/generations endpoint with stream=true, which delivers image chunks as base64 data via Server-Sent Events (SSE).
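To make the note concrete, a minimal consumer sketch (the event names match the SSE event types used elsewhere in this PR; the parsing helper itself is illustrative): a real client would read these lines from the HTTP response body of POST /v1/images/generations with "stream": true.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// readSSEData pulls the payload out of "data: ..." lines in an SSE stream.
func readSSEData(stream string) []string {
	var events []string
	sc := bufio.NewScanner(strings.NewReader(stream))
	for sc.Scan() {
		line := sc.Text()
		if strings.HasPrefix(line, "data: ") {
			events = append(events, strings.TrimPrefix(line, "data: "))
		}
	}
	return events
}

func main() {
	raw := "event: image_generation.partial_image\ndata: {\"b64_json\":\"AAAA\"}\n\n" +
		"event: image_generation.completed\ndata: {\"b64_json\":\"AAAABBBB\"}\n\n"
	for _, e := range readSSEData(raw) {
		fmt.Println(e)
	}
}
```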

ui/lib/types/schemas.ts (1)

327-345: Back-compat risk: new allowed_requests keys may break parsing of existing configs.

If older configs omit these keys, z.boolean() will reject them. Consider defaulting them to false (so missing keys still parse).

Proposed fix
@@
 export const allowedRequestsSchema = z.object({
@@
-	image_generation: z.boolean(),
-	image_generation_stream: z.boolean(),
+	image_generation: z.boolean().default(false),
+	image_generation_stream: z.boolean().default(false),
 	count_tokens: z.boolean(),
 	list_models: z.boolean(),
 });
core/providers/huggingface/models.go (1)

46-90: Add image generation tags to enable discovery via tags alone.

Tag-based model discovery currently has no image generation matching. Models with image generation tags will be incorrectly filtered out when pipeline_tag is missing or incorrect.

Fix: Add image generation tag cases to switch statement
 	for _, tag := range tags {
 		tagLower := strings.ToLower(tag)
 		switch {
 		case tagLower == "automatic-speech-recognition" ||
 			tagLower == "speech-to-text" || strings.Contains(tagLower, "speech-recognition"):
 			addMethods(schemas.TranscriptionRequest)
+		case tagLower == "text-to-image" ||
+			strings.Contains(tagLower, "text-to-image") ||
+			strings.Contains(tagLower, "image-generation"):
+			addMethods(schemas.ImageGenerationRequest)
 		}
 	}

Regarding ImageGenerationStreamRequest: HuggingFace supports streaming via the fal-ai inference provider only. The current logic should add ImageGenerationStreamRequest alongside ImageGenerationRequest for consistency with other capability patterns (chat, speech), since the runtime will handle provider-specific restrictions.

docs/providers/supported-providers/mistral.mdx (1)

18-33: Correct the upstream capability claims—Mistral API actually supports Files, Batch, and TTS.

Verification shows the doc incorrectly claims these features are "not supported by the upstream Mistral API":

  • Files API — Mistral DOES support /v1/files (batch uploads up to 512 MB per file)
  • Batch API — Mistral DOES support asynchronous batch jobs via /v1/files upload + job creation
  • TTS/Speech — Mistral DOES support text-to-speech (Voxtral models, available since July 2025)
  • Text Completions — Mistral deprecated the old generic endpoint in favor of /v1/chat/completions and /v1/fim/completions (so the claim is partly correct but misleading)
  • Image Generation — Correctly noted as supported upstream but not yet in Bifrost (verified: launched Nov 18, 2024)

Update the "Unsupported Operations" note and the operations table to accurately reflect which features Mistral's public API actually supports vs. which are missing from Bifrost's integration.

docs/providers/supported-providers/openrouter.mdx (1)

134-144: Minor consistency: “Embedding” vs “Embeddings” naming.

The table uses “Embeddings” (Line 24) while the “Unsupported Features” section uses “Embedding” (Line 138). Consider aligning the label to reduce confusion.

core/providers/gemini/transcription.go (2)

34-45: Don’t silently drop invalid inline audio without any signal.

Currently (Line 36-40) invalid base64 just gets skipped, which can lead to “empty audio” requests later with no actionable error. At minimum, consider tracking a flag/count in Params.ExtraParams so the caller/debug logs can explain why audio is missing.
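A sketch of the counting approach, with a hypothetical helper name (`decodeAudioParts` is not the repo's function): invalid fragments are still skipped, but the caller gets a count it can record in, say, Params.ExtraParams so an "empty audio" request is explainable.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// decodeAudioParts decodes inline base64 audio; instead of silently skipping
// invalid entries, it counts them so the caller can record e.g.
// extraParams["dropped_invalid_audio_parts"] = dropped for debugging.
func decodeAudioParts(parts []string) (decoded [][]byte, dropped int) {
	for _, p := range parts {
		raw, err := base64.StdEncoding.DecodeString(p)
		if err != nil {
			dropped++
			continue
		}
		decoded = append(decoded, raw)
	}
	return decoded, dropped
}

func main() {
	decoded, dropped := decodeAudioParts([]string{"aGVsbG8=", "!!!not-base64!!!"})
	fmt.Println(len(decoded), dropped) // 1 1
}
```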


106-166: Fix potential nil-pointer panic on bifrostReq.Input.File.

ToGeminiTranscriptionRequest assumes bifrostReq.Input is non-nil (Line 158), but the function accepts arbitrary input and will panic if Input is nil.

Proposed fix
-	// Add audio file if present
-	if len(bifrostReq.Input.File) > 0 {
+	// Add audio file if present
+	if bifrostReq.Input != nil && len(bifrostReq.Input.File) > 0 {
 		parts = append(parts, &Part{
 			InlineData: &Blob{
 				MIMEType: utils.DetectAudioMimeType(bifrostReq.Input.File),
 				Data:     encodeBytesToBase64String(bifrostReq.Input.File),
 			},
 		})
 	}
core/schemas/responses.go (1)

540-575: Blocker: error message claims image-generation output support, but code doesn’t implement it (and may mis-unmarshal objects).

Right now you only updated the failure message (Line 556-557, 574-575). ResponsesToolMessageOutputStruct still can’t marshal/unmarshal an image-generation output variant, and UnmarshalJSON will also happily treat any JSON object as ResponsesComputerToolCallOutputData (because missing fields don’t error), which will likely break once image-gen outputs are objects.

Concrete direction (add an explicit image-gen variant + discriminate objects)
 type ResponsesToolMessageOutputStruct struct {
 	ResponsesToolCallOutputStr            *string // Common output string for tool calls and outputs (used by function, custom and local shell tool calls)
 	ResponsesFunctionToolCallOutputBlocks []ResponsesMessageContentBlock
 	ResponsesComputerToolCallOutput       *ResponsesComputerToolCallOutputData
+	ResponsesImageGenerationCallOutput    *ResponsesImageGenerationCall
 }

 func (output ResponsesToolMessageOutputStruct) MarshalJSON() ([]byte, error) {
 	if output.ResponsesToolCallOutputStr != nil {
 		return Marshal(*output.ResponsesToolCallOutputStr)
 	}
 	if output.ResponsesFunctionToolCallOutputBlocks != nil {
 		return Marshal(output.ResponsesFunctionToolCallOutputBlocks)
 	}
 	if output.ResponsesComputerToolCallOutput != nil {
 		return Marshal(output.ResponsesComputerToolCallOutput)
 	}
+	if output.ResponsesImageGenerationCallOutput != nil {
+		return Marshal(output.ResponsesImageGenerationCallOutput)
+	}
 	return nil, fmt.Errorf("responses tool message output struct is neither a string nor an array of responses message content blocks nor a computer tool call output data nor an image generation call output")
 }

 func (output *ResponsesToolMessageOutputStruct) UnmarshalJSON(data []byte) error {
 	var str string
 	if err := Unmarshal(data, &str); err == nil {
 		output.ResponsesToolCallOutputStr = &str
 		return nil
 	}
 	var array []ResponsesMessageContentBlock
 	if err := Unmarshal(data, &array); err == nil {
 		output.ResponsesFunctionToolCallOutputBlocks = array
 		return nil
 	}
-	var computerToolCallOutput ResponsesComputerToolCallOutputData
-	if err := Unmarshal(data, &computerToolCallOutput); err == nil {
-		output.ResponsesComputerToolCallOutput = &computerToolCallOutput
-		return nil
-	}
+	// Discriminate object shapes to avoid false-positive unmarshalling into structs with optional fields.
+	var obj map[string]any
+	if err := Unmarshal(data, &obj); err == nil {
+		if t, ok := obj["type"].(string); ok && t == "computer_screenshot" {
+			var computerToolCallOutput ResponsesComputerToolCallOutputData
+			if err := Unmarshal(data, &computerToolCallOutput); err == nil {
+				output.ResponsesComputerToolCallOutput = &computerToolCallOutput
+				return nil
+			}
+		}
+		if _, ok := obj["result"]; ok {
+			var imageGenOutput ResponsesImageGenerationCall
+			if err := Unmarshal(data, &imageGenOutput); err == nil {
+				output.ResponsesImageGenerationCallOutput = &imageGenOutput
+				return nil
+			}
+		}
+	}
 	return fmt.Errorf("responses tool message output struct is neither a string nor an array of responses message content blocks nor a computer tool call output data nor an image generation call output")
 }
core/providers/xai/xai_test.go (1)

25-54: Remove ImageGenerationStream: true from test config—XAI provider explicitly rejects this operation.

The XAI provider's ImageGenerationStream method (xai.go:247) returns providerUtils.NewUnsupportedOperationError() with the comment "ImageGenerationStream is not supported by the xAI provider." Enabling this scenario in the test will cause failures since the provider cannot support it.

Change line 46 in xai_test.go:

ImageGenerationStream: false,
plugins/semanticcache/test_utils.go (1)

356-365: Skipping tests when Weaviate isn’t reachable is pragmatic, but consider an explicit opt-in/out.
Right now any connection/config failure becomes a skip, which can mask integration regressions in CI. If CI should enforce Weaviate-backed tests, consider gating this behavior behind an env var (e.g., REQUIRE_WEAVIATE=1).

core/providers/vertex/vertex_test.go (1)

25-55: Vertex test config conflicts with global capability matrix for streaming image generation.
core/internal/testutil/account.go marks Vertex ImageGenerationStream: false, but this test enables it. If Vertex doesn’t implement streaming image-gen, the suite will fail or hang.

Proposed fix (if Vertex image-gen streaming is not supported)
@@
 		Scenarios: testutil.TestScenarios{
@@
 			ImageGeneration:       true,
-			ImageGenerationStream: true,
+			ImageGenerationStream: false,
 			MultipleImages:        true,
plugins/logging/operations.go (1)

544-555: Extract imageUsage from ImageGenerationOutputParsed for cost recalculation of image-generation requests.

The parameter list is correct (9 parameters match all call sites), but calculateCostForLog unconditionally passes nil for the imageUsage parameter even when handling image-generation requests. For image-generation requests with ImageGenerationRequest or ImageGenerationStreamRequest types, extract the Usage field from logEntry.ImageGenerationOutputParsed and pass it as the 9th argument to CalculateCostFromUsage—similar to how the main CalculateCost function handles it (lines 108–110 in pricing.go). Also ensure ImageGenerationOutputParsed is deserialized alongside TokenUsageParsed and CacheDebugParsed when needed.

framework/modelcatalog/main.go (1)

223-230: Missing ImageGenerationRequest in request type lookup.

GetPricingEntryForModel iterates over request types but does not include schemas.ImageGenerationRequest. This means callers using this method won't find pricing entries for image generation models.

Proposed fix
 	for _, mode := range []schemas.RequestType{
 		schemas.TextCompletionRequest,
 		schemas.ChatCompletionRequest,
 		schemas.ResponsesRequest,
 		schemas.EmbeddingRequest,
 		schemas.SpeechRequest,
 		schemas.TranscriptionRequest,
+		schemas.ImageGenerationRequest,
 	} {
framework/streaming/types.go (2)

345-392: Fix missing RawResponse/CacheDebug propagation for transcription (likely regression).
In case StreamTypeTranscription, only RawRequest is set; other stream types include RawResponse and CacheDebug. This can silently drop debug data for transcription streams.

Proposed fix
 case StreamTypeTranscription:
@@
 		if p.RawRequest != nil {
 			resp.TranscriptionResponse.ExtraFields.RawRequest = p.RawRequest
 		}
+		if p.Data.RawResponse != nil {
+			resp.TranscriptionResponse.ExtraFields.RawResponse = *p.Data.RawResponse
+		}
+		if p.Data.CacheDebug != nil {
+			resp.TranscriptionResponse.ExtraFields.CacheDebug = p.Data.CacheDebug
+		}
 	case StreamTypeImage:

13-20: StreamTypeImage is properly handled; however, StreamTypeTranscription is missing RawResponse and CacheDebug field population.

StreamTypeImage is correctly added to the switch statement in ToBifrostResponse (lines 360–391) and properly copies RawRequest, RawResponse, and CacheDebug fields just like other types (Text, Chat, Audio, Responses). It is also correctly used in framework/streaming/images.go.

The issue is that StreamTypeTranscription (lines 345–358) does not populate RawResponse and CacheDebug, unlike all other stream types. This creates an inconsistency where transcription responses will have missing debug information that other types include.

transports/bifrost-http/integrations/genai.go (2)

36-64: Fix request routing precedence: ImageGeneration must win over Embedding when both flags are set.
Today the converter checks IsEmbedding before IsImageGeneration, but extractAndSetModelFromURL can set both (e.g., :predict defaulting to embedding + body indicates IMAGE). That will misroute image generation requests.

Proposed fix (reorder + enforce mutual exclusivity)
 RequestConverter: func(ctx *schemas.BifrostContext, req interface{}) (*schemas.BifrostRequest, error) {
 	if geminiReq, ok := req.(*gemini.GeminiGenerationRequest); ok {
 		if geminiReq.IsCountTokens {
 			return &schemas.BifrostRequest{
 				CountTokensRequest: geminiReq.ToBifrostResponsesRequest(),
 			}, nil
-		} else if geminiReq.IsEmbedding {
+		} else if geminiReq.IsImageGeneration {
+			return &schemas.BifrostRequest{
+				ImageGenerationRequest: geminiReq.ToBifrostImageGenerationRequest(),
+			}, nil
+		} else if geminiReq.IsEmbedding {
 			return &schemas.BifrostRequest{
 				EmbeddingRequest: geminiReq.ToBifrostEmbeddingRequest(),
 			}, nil
 		} else if geminiReq.IsSpeech {
 			return &schemas.BifrostRequest{
 				SpeechRequest: geminiReq.ToBifrostSpeechRequest(),
 			}, nil
 		} else if geminiReq.IsTranscription {
 			return &schemas.BifrostRequest{
 				TranscriptionRequest: geminiReq.ToBifrostTranscriptionRequest(),
 			}, nil
-		} else if geminiReq.IsImageGeneration {
-			return &schemas.BifrostRequest{
-				ImageGenerationRequest: geminiReq.ToBifrostImageGenerationRequest(),
-			}, nil
 		} else {
 			return &schemas.BifrostRequest{
 				ResponsesRequest: geminiReq.ToBifrostResponsesRequest(),
 			}, nil
 		}
 	}
 	return nil, errors.New("invalid request type")
 },

And in extractAndSetModelFromURL:

- r.IsEmbedding = isEmbedding
+ r.IsEmbedding = isEmbedding && !(isImagenPredict || isImageGenerationRequest(r))
@@
- r.IsImageGeneration = isImagenPredict || isImageGenerationRequest(r)
+ r.IsImageGeneration = isImagenPredict || isImageGenerationRequest(r)

Also applies to: 426-429


370-411: :predict defaulting to embedding is risky and not mitigated by body checks; requests will route to embedding before image generation can be detected.

For non-Imagen models with :predict, isEmbedding is set to true (line 410). While isImageGenerationRequest() checks the request body for responseModalities: ["IMAGE"] (line 509), this provides no safety because the downstream if-else chain (lines 40–62) checks IsEmbedding before IsImageGeneration. Any future non-Imagen image models using :predict will be incorrectly routed to embedding regardless of request body content.

Consider:

  • Explicitly checking request body for image modalities before defaulting to embedding, or
  • Requiring isImageGenerationRequest() to return true for non-Imagen :predict endpoints instead of treating them as embedding by default.
transports/bifrost-http/handlers/inference.go (1)

1258-1334: Fix SSE termination for Responses API: emit [DONE] based on API type, not chunk observation.

The current implementation derives whether to emit [DONE] from observing chunks at runtime: includeEventType is set only if chunks are observed from Responses/ImageGen APIs. If a Responses stream closes without yielding any chunks, includeEventType remains false and the handler incorrectly emits [DONE]. However, the code explicitly documents that "OpenAI responses API doesn't use [DONE] marker, it ends when the stream closes," making this behavior incorrect.

Replace the chunk-derived logic with an explicit emitDoneMarker bool parameter passed from each endpoint handler. This moves the decision from unreliable runtime state to static configuration: Responses API (handleStreamingResponses and handleStreamingImageGeneration) passes false, all others pass true.

Proposed diff
-func (h *CompletionHandler) handleStreamingTextCompletion(ctx *fasthttp.RequestCtx, req *schemas.BifrostTextCompletionRequest, bifrostCtx *schemas.BifrostContext, cancel context.CancelFunc) {
+func (h *CompletionHandler) handleStreamingTextCompletion(ctx *fasthttp.RequestCtx, req *schemas.BifrostTextCompletionRequest, bifrostCtx *schemas.BifrostContext, cancel context.CancelFunc) {
 	getStream := func() (chan *schemas.BifrostStream, *schemas.BifrostError) {
 		return h.client.TextCompletionStreamRequest(bifrostCtx, req)
 	}
-	h.handleStreamingResponse(ctx, getStream, cancel)
+	h.handleStreamingResponse(ctx, getStream, cancel, true)
 }

 func (h *CompletionHandler) handleStreamingChatCompletion(ctx *fasthttp.RequestCtx, req *schemas.BifrostChatRequest, bifrostCtx *schemas.BifrostContext, cancel context.CancelFunc) {
 	getStream := func() (chan *schemas.BifrostStream, *schemas.BifrostError) {
 		return h.client.ChatCompletionStreamRequest(bifrostCtx, req)
 	}
-	h.handleStreamingResponse(ctx, getStream, cancel)
+	h.handleStreamingResponse(ctx, getStream, cancel, true)
 }

 func (h *CompletionHandler) handleStreamingResponses(ctx *fasthttp.RequestCtx, req *schemas.BifrostResponsesRequest, bifrostCtx *schemas.BifrostContext, cancel context.CancelFunc) {
 	getStream := func() (chan *schemas.BifrostStream, *schemas.BifrostError) {
 		return h.client.ResponsesStreamRequest(bifrostCtx, req)
 	}
-	h.handleStreamingResponse(ctx, getStream, cancel)
+	h.handleStreamingResponse(ctx, getStream, cancel, false)
 }

 func (h *CompletionHandler) handleStreamingSpeech(ctx *fasthttp.RequestCtx, req *schemas.BifrostSpeechRequest, bifrostCtx *schemas.BifrostContext, cancel context.CancelFunc) {
 	getStream := func() (chan *schemas.BifrostStream, *schemas.BifrostError) {
 		return h.client.SpeechStreamRequest(bifrostCtx, req)
 	}
-	h.handleStreamingResponse(ctx, getStream, cancel)
+	h.handleStreamingResponse(ctx, getStream, cancel, true)
 }

 func (h *CompletionHandler) handleStreamingTranscriptionRequest(ctx *fasthttp.RequestCtx, req *schemas.BifrostTranscriptionRequest, bifrostCtx *schemas.BifrostContext, cancel context.CancelFunc) {
 	getStream := func() (chan *schemas.BifrostStream, *schemas.BifrostError) {
 		return h.client.TranscriptionStreamRequest(bifrostCtx, req)
 	}
-	h.handleStreamingResponse(ctx, getStream, cancel)
+	h.handleStreamingResponse(ctx, getStream, cancel, true)
 }

-func (h *CompletionHandler) handleStreamingResponse(ctx *fasthttp.RequestCtx, getStream func() (chan *schemas.BifrostStream, *schemas.BifrostError), cancel context.CancelFunc) {
+func (h *CompletionHandler) handleStreamingResponse(ctx *fasthttp.RequestCtx, getStream func() (chan *schemas.BifrostStream, *schemas.BifrostError), cancel context.CancelFunc, emitDoneMarker bool) {
 	// ...
-		var skipDoneMarker bool
-
 		// Process streaming responses
 		for chunk := range stream {
 			// ...
-			// Image generation streams don't use [DONE] marker
-			if chunk.BifrostImageGenerationStreamResponse != nil {
-				skipDoneMarker = true
-			}
 			// ...
 		}
-		if !includeEventType && !skipDoneMarker {
+		if emitDoneMarker && !includeEventType {
 			if _, err := fmt.Fprint(w, "data: [DONE]\n\n"); err != nil {
 				// ...
 			}
 		}
 		cancel()
 	})
 }
core/providers/gemini/types.go (2)

1114-1123: Remove omitempty tag from Blob.Data field to align with "Required" semantics.

Blob.Data is marked as "Required" in the comment but tagged with json:"data,omitempty", creating a semantic conflict. If the field is required by the Gemini API, the tag should be json:"data" without omitempty.

Recommended fix
 type Blob struct {
   // Optional. Display name of the blob. Used to provide a label or filename to distinguish
   // blobs. This field is not currently used in the Gemini GenerateContent calls.
   DisplayName string `json:"displayName,omitempty"`
-  // Required. Base64-encoded bytes.
-  Data string `json:"data,omitempty"`
+  // Required. Base64-encoded bytes.
+  Data string `json:"data"`
   // Required. The IANA standard MIME type of the source data.
   MIMEType string `json:"mimeType,omitempty"`
 }

Current implementation correctly encodes/decodes base64 data through helper functions; no blast radius concerns or double-encoding issues detected.


64-88: Normalize Imagen parameter casing and unify instance representation to prevent silent mapping failures.

The GeminiImagenParameters.ImageSize type comment documents "1K", "2K", "4K" (uppercase), but the converter function convertImagenFormatToSize() (lines 96–100) only handles lowercase "1k", "2k". Google/Vertex Imagen API docs show the canonical values as uppercase but accept both cases; however, the internal converter is case-sensitive and will silently default to 1024 if uppercase values are encountered.

Additionally, there are two incompatible Imagen instance representations: ImagenInstance (line 91–93) has Prompt string, while GeminiImagenRequest.Instances (line 1703–1705) uses an inline struct with Prompt *string. Unify these to a single canonical type to avoid pointer/nullability mismatches.
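A case-insensitive version of the converter could look like this sketch; the pixel values (1024/2048/4096) are assumptions based on the "1K"/"2K"/"4K" naming, not verified against the repo's actual mapping:

```go
package main

import (
	"fmt"
	"strings"
)

// convertImagenFormatToSize maps a size label to pixels case-insensitively,
// so "2K" (the documented canonical casing) and "2k" resolve identically
// instead of silently falling through to the default.
func convertImagenFormatToSize(imageSize string) int {
	switch strings.ToLower(strings.TrimSpace(imageSize)) {
	case "2k":
		return 2048
	case "4k":
		return 4096
	default: // includes "1k" and anything unrecognized
		return 1024
	}
}

func main() {
	fmt.Println(convertImagenFormatToSize("2K")) // 2048
	fmt.Println(convertImagenFormatToSize("2k")) // 2048
	fmt.Println(convertImagenFormatToSize(""))   // 1024
}
```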

core/providers/huggingface/huggingface.go (1)

114-218: Build URL with validated model ID for hfInference to avoid preventable 404s.

For hfInference (chat, embedding, transcription), getInferenceProviderRouteURL encodes the modelName into the URL path: /hf-inference/models/{modelName}/pipeline/{pipeline}. The code builds the URL at line 128 using originalModelName, but only validates it at line 136—causing a 404 on the first attempt when originalModelName differs from the ProviderModelID returned by validation.

For skipValidation providers (falAI, nebius, together image generation), this is not an issue: falAI encodes the model in the path but uses originalModelName consistently (the comment confirms format is already correct), while nebius and together use fixed URL paths without encoding the model name.

Reorder validation to occur before URL construction, or defer URL building until after getValidatedProviderModelID returns for non-skipValidation cases.
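The reorder can be sketched as follows; `validateModel` stands in for getValidatedProviderModelID (hypothetical signature and alias behavior), and the route template is illustrative:

```go
package main

import "fmt"

// validateModel stands in for getValidatedProviderModelID: it may rewrite
// the caller-supplied model to the provider's canonical ID.
func validateModel(model string) (string, error) {
	if model == "my-alias" {
		return "org/real-model", nil
	}
	return model, nil
}

// buildRouteURL runs validation *before* the model ID is baked into the
// path, so the first attempt already targets the canonical route instead
// of 404ing on an alias.
func buildRouteURL(baseURL, model string) (string, error) {
	validated, err := validateModel(model)
	if err != nil {
		return "", err
	}
	return fmt.Sprintf("%s/hf-inference/models/%s/pipeline/text-generation", baseURL, validated), nil
}

func main() {
	u, _ := buildRouteURL("https://router.example", "my-alias")
	fmt.Println(u)
}
```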

🤖 Fix all issues with AI agents
In @core/internal/testutil/test_retry_conditions.go:
- Around line 907-956: EmptyImageGenerationCondition currently implements
TestRetryCondition by accepting *schemas.BifrostResponse; change its ShouldRetry
signature (EmptyImageGenerationCondition.ShouldRetry) to accept
*schemas.BifrostImageGenerationResponse, keep the same logic but adapt
references to the image-generation response type; update
DefaultImageGenerationRetryConfig to return ImageGenerationRetryConfig (not
TestRetryConfig) and ensure the new EmptyImageGenerationCondition is appended to
that config's conditions slice so it implements ImageGenerationRetryCondition
and is registered for image-generation retry handling.

In @core/providers/gemini/gemini.go:
- Around line 1614-1671: ImageGeneration currently dereferences request (calling
schemas.IsImagenModel(request.Model)) without validating request and does not
enforce presence of a prompt; add an early nil-check for the request in
ImageGeneration and return a proper *schemas.BifrostError when request == nil,
then validate the prompt/content field on request (reject empty/missing prompt
per repo pattern) before calling schemas.IsImagenModel or building the JSON body
(affects ImageGeneration, ToGeminiImageGenerationRequest usage and
provider.completeRequest); ensure the error includes ExtraFields.Provider,
ModelRequested and RequestType like other failures.

In @docs/openapi/schemas/inference/images.yaml:
- Around line 100-124: The OpenAPI schema lists
ImageGenerationResponse.parameters nested under "params", but the Go type
BifrostImageGenerationResponse embeds *ImageGenerationResponseParameters without
a JSON tag so those fields are flattened at the top level; remove the "params"
property from ImageGenerationResponse and instead promote all properties defined
in ImageGenerationResponseParameters to be top-level properties of
ImageGenerationResponse (or use an allOf referencing
ImageGenerationResponseParameters to achieve the same flattening), ensuring the
names and types exactly match ImageGenerationResponseParameters and keeping
existing refs like ImageUsage and BifrostResponseExtraFields intact.

In @docs/providers/supported-providers/openrouter.mdx:
- Around line 24-33: Update the documentation text that currently states these
unsupported operations return `UnsupportedOperationError` to instead state they
return a `BifrostError` with an error code of `"unsupported_operation"`;
specifically change the note on unsupported operations in the OpenRouter
provider doc (the table/Note block describing Embeddings, Image Generation,
Speech, Transcriptions, Files, Batch) to mention `BifrostError` and include the
`"unsupported_operation"` code so the documented error structure matches the
provider's actual response.

In @framework/streaming/images.go:
- Around line 207-214: Guard against nil ctx (or nil *ctx) at the start of
Accumulator.processImageStreamingResponse to avoid dereferencing and panics:
check that ctx != nil and *ctx != nil before using (*ctx).Value(...), and return
a clear error if missing; keep the existing behavior of not failing the request
flow beyond returning an error. Specifically update
processImageStreamingResponse to validate ctx and *ctx before extracting
requestID using schemas.BifrostContextKeyRequestID.
🟠 Major comments (20)
docs/providers/supported-providers/xai.mdx-124-141 (1)

124-141: Avoid brittle cross-doc anchor & verify schema superset claim.

  • The link to openai#7-image-generation (line 130) depends on section numbering; if the OpenAI docs get reorganized, this link will break. Consider whether your doc system supports stable anchor IDs (e.g., #image-generation-section).
  • "Bifrost's response schema is a superset of OpenAI's format" (line 138) is a strong architectural claim. Verify in this PR that the implementation preserves all fields from xAI's response, especially any provider-specific extensions beyond OpenAI's standard fields.
core/providers/openai/images.go-13-28 (1)

13-28: ToOpenAIImageGenerationRequest: missing fallbacks mapping and asymmetric model handling.

The function ignores the Fallbacks field from bifrostReq, which the reverse function ToBifrostImageGenerationRequest properly handles. Additionally, while the reverse function defensively strips provider prefixes from models using ParseModelString, the forward direction passes the model through unchanged—this should be normalized for consistency.

Proposed fix (model normalization + fallbacks mapping)
 func ToOpenAIImageGenerationRequest(bifrostReq *schemas.BifrostImageGenerationRequest) *OpenAIImageGenerationRequest {
 	if bifrostReq == nil || bifrostReq.Input == nil || bifrostReq.Input.Prompt == "" {
 		return nil
 	}
 
+	_, model := schemas.ParseModelString(bifrostReq.Model, schemas.OpenAI)
+
 	req := &OpenAIImageGenerationRequest{
-		Model:  bifrostReq.Model,
+		Model:  model,
 		Prompt: bifrostReq.Input.Prompt,
 	}
 
 	if bifrostReq.Params != nil {
 		req.ImageGenerationParameters = *bifrostReq.Params
 	}
+
+	if len(bifrostReq.Fallbacks) > 0 {
+		req.Fallbacks = make([]string, 0, len(bifrostReq.Fallbacks))
+		for _, fb := range bifrostReq.Fallbacks {
+			if fb.Provider == "" || fb.Model == "" {
+				continue
+			}
+			req.Fallbacks = append(req.Fallbacks, string(fb.Provider)+"/"+fb.Model)
+		}
+	}
 	return req
 }
plugins/logging/operations.go-41-42 (1)

41-42: Be careful logging image-generation outputs: base64 payloads can explode log size.
If an image response contains b64_json, persisting it into image_generation_output can massively bloat the DB and degrade log search/UI. Consider truncating/omitting base64 fields (or storing only URLs + metadata) when serializing these fields.

Also applies to: 127-134, 287-295
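
A minimal sketch of the truncation idea (the `ImageDatum` type and the 64-byte threshold are illustrative, not Bifrost's actual structs):

```go
package main

import "fmt"

// ImageDatum is a hypothetical stand-in for one image-generation output entry.
type ImageDatum struct {
	URL     string
	B64JSON string
}

// sanitizeForLog replaces large base64 payloads with a short placeholder so
// persisted logs keep only URLs and metadata; the input slice is not mutated.
func sanitizeForLog(data []ImageDatum) []ImageDatum {
	out := make([]ImageDatum, len(data))
	for i, d := range data {
		out[i] = d
		if len(d.B64JSON) > 64 { // threshold is arbitrary for this sketch
			out[i].B64JSON = fmt.Sprintf("<b64 omitted: %d bytes>", len(d.B64JSON))
		}
	}
	return out
}

func main() {
	logged := sanitizeForLog([]ImageDatum{{URL: "https://example/x.png", B64JSON: string(make([]byte, 200))}})
	fmt.Println(logged[0].B64JSON)
}
```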

tests/integrations/python/tests/utils/common.py-2624-2724 (1)

2624-2724: Tighten Google “predictions” validation to check actual base64 bytes, not just attribute presence.
Right now an object with an empty bytes_base64_encoded-like field can still pass.

Proposed fix
@@
         elif (isinstance(response, dict) and "predictions" in response) or hasattr(response, "predictions"):
             # Imagen response
             predictions = response.get("predictions") if isinstance(response, dict) else response.predictions
             assert len(predictions) > 0, "Response should have at least one prediction"
             for i, prediction in enumerate(predictions):
-                has_b64 = (prediction.get("bytesBase64Encoded") if isinstance(prediction, dict) 
-                          else (hasattr(prediction, "bytesBase64Encoded") or hasattr(prediction, "bytes_base64_encoded")))
-                assert has_b64, f"Prediction {i} should have base64 encoded bytes"
+                if isinstance(prediction, dict):
+                    b64 = prediction.get("bytesBase64Encoded") or prediction.get("bytes_base64_encoded")
+                else:
+                    b64 = getattr(prediction, "bytesBase64Encoded", None) or getattr(prediction, "bytes_base64_encoded", None)
+
+                assert b64, f"Prediction {i} should have base64 encoded bytes"
+                # Base64 should not be trivially small
+                assert len(b64) > 100, f"Prediction {i} base64 seems too short"
tests/integrations/python/tests/test_google.py-80-83 (1)

80-83: test_41b_imagen_predict will fail (not skip) when API key is missing.

Unlike 41a/41c, test_41b_imagen_predict is missing @skip_if_no_api_key("google") (or equivalent), and get_api_key(provider) will raise before you reach the current try/except that calls pytest.skip(...).

Proposed fix
@@
     @pytest.mark.parametrize("provider,model", get_cross_provider_params_for_scenario("imagen"))
+    @skip_if_no_api_key("google")
     def test_41b_imagen_predict(self, _test_config, provider, model):

Also applies to: 96-96, 1691-1792

transports/bifrost-http/integrations/router.go-203-210 (1)

203-210: Add nil-guards for new image generation converters to avoid runtime panics.

handleNonStreamingRequest calls config.ImageGenerationResponseConverter(...) (Line 718) and streaming calls config.StreamConfig.ImageGenerationStreamResponseConverter(...) (Line 1286) without checking they’re configured. If an integration route enables image generation request conversion but forgets these, this will panic.

Proposed fix
@@
 	case bifrostReq.ImageGenerationRequest != nil:
+		if config.ImageGenerationResponseConverter == nil {
+			g.sendError(ctx, bifrostCtx, config.ErrorConverter, newBifrostError(nil, "ImageGenerationResponseConverter not configured"))
+			return
+		}
 		imageGenerationResponse, bifrostErr := g.client.ImageGenerationRequest(bifrostCtx, bifrostReq.ImageGenerationRequest)
@@
-		response, err = config.ImageGenerationResponseConverter(bifrostCtx, imageGenerationResponse)
+		response, err = config.ImageGenerationResponseConverter(bifrostCtx, imageGenerationResponse)
@@
 				case chunk.BifrostImageGenerationStreamResponse != nil:
+					if config.StreamConfig.ImageGenerationStreamResponseConverter == nil {
+						log.Printf("ImageGenerationStreamResponseConverter not configured for route: %s", config.Path)
+						continue
+					}
 					eventType, convertedResponse, err = config.StreamConfig.ImageGenerationStreamResponseConverter(bifrostCtx, chunk.BifrostImageGenerationStreamResponse)

Also applies to: 256-264, 277-307, 696-718, 1284-1287

docs/openapi/paths/integrations/openai/images.yaml-1-86 (1)

1-86: Mark Azure api-version as required: true and add concrete SSE event examples for clarity.

Azure OpenAI image generation endpoints require an explicit api-version query parameter per Azure REST API specifications. Currently, api-version is documented but not required (lines 62–67). Update to required: true to enforce compliance with Azure's actual API contract and reduce client integration friction.

Additionally, while the schema in docs/openapi/schemas/integrations/openai/images.yaml defines OpenAIImageStreamResponse event types (image_generation.partial_image, image_generation.completed) and payloads, adding a brief inline SSE example in the endpoint description (showing actual event: and data: format) will help client developers understand the concrete wire protocol without cross-referencing the schema definition.

Note: This api-version pattern affects all Azure OpenAI endpoints in the codebase (audio, embeddings, text, models, responses, chat). Consider applying the same required: true fix consistently across all Azure integration endpoints.
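
An inline SSE example of the kind suggested could look like the following (event names match the schema file; the payload fields shown are abbreviated for illustration):

```text
event: image_generation.partial_image
data: {"type":"image_generation.partial_image","b64_json":"...","partial_image_index":0}

event: image_generation.completed
data: {"type":"image_generation.completed","b64_json":"...","usage":{"total_tokens":123}}
```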

framework/streaming/images.go-14-110 (1)

14-110: Multi-chunk image streaming drops URL-only outputs and uses inconsistent timestamps.

The multi-chunk reconstruction path has two issues:

  1. URL-only outputs are lost: The code only accumulates B64JSON chunks into the images map (line 47). When a provider streams only URL without B64JSON, len(images)==0 triggers return nil (lines 70-72), discarding valid image data. The single-chunk fast path correctly handles URL, but the multi-chunk path ignores it entirely (line 89).

  2. Timestamp inconsistency: The single-chunk path uses chunks[0].Delta.CreatedAt (line 20), but the multi-chunk path uses time.Now().Unix() (line 104). Should prefer the stream's CreatedAt when available for consistency.
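
A simplified sketch of a merge loop that addresses both points, assuming hypothetical `Delta`/`Image` shapes modeled on the fields named above:

```go
package main

import (
	"fmt"
	"time"
)

// Delta and Image are illustrative stand-ins for the streaming chunk shapes.
type Delta struct {
	Index     int
	URL       string
	B64JSON   string
	CreatedAt int64
}

type Image struct {
	URL     string
	B64JSON string
}

// mergeChunks accumulates B64JSON fragments per image index, keeps URL-only
// outputs instead of dropping them, and prefers the stream's CreatedAt over
// time.Now() when the stream carried a timestamp.
func mergeChunks(deltas []Delta) (map[int]Image, int64) {
	images := map[int]Image{}
	var createdAt int64
	for _, d := range deltas {
		img := images[d.Index]
		if d.B64JSON != "" {
			img.B64JSON += d.B64JSON
		}
		if d.URL != "" { // URL-only chunks must not be discarded
			img.URL = d.URL
		}
		images[d.Index] = img
		if createdAt == 0 && d.CreatedAt != 0 {
			createdAt = d.CreatedAt
		}
	}
	if createdAt == 0 { // fall back only when no chunk carried a timestamp
		createdAt = time.Now().Unix()
	}
	return images, createdAt
}

func main() {
	imgs, ts := mergeChunks([]Delta{{Index: 0, URL: "https://example/img.png", CreatedAt: 1700000000}})
	fmt.Println(len(imgs), imgs[0].URL, ts)
}
```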

framework/streaming/images.go-248-259 (1)

248-259: ExtraFields.ChunkIndex > 0 ignores valid 0 index, breaking priority rule.

When the first chunk arrives (ChunkIndex = 0), the condition is false, so the code falls through to use PartialImageIndex instead—violating the stated priority to use ExtraFields.ChunkIndex. Since ExtraFields.ChunkIndex is always set by the provider (line 2814 in openai.go), the condition should not exclude 0. This can cause chunk mis-ordering downstream.

Change the condition to >= 0 or remove the guard entirely and always prioritize ExtraFields.ChunkIndex over the nullable PartialImageIndex.
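
A tiny sketch of the intended priority rule, with parameter names modeled on the fields discussed (pointers are used here only to make "absent" representable):

```go
package main

import "fmt"

// resolveChunkIndex prefers the provider-populated ExtraFields.ChunkIndex,
// treating 0 as valid, and only falls back to the nullable PartialImageIndex.
func resolveChunkIndex(extraChunkIndex *int, partialImageIndex *int) int {
	if extraChunkIndex != nil { // a `> 0` guard here would wrongly skip index 0
		return *extraChunkIndex
	}
	if partialImageIndex != nil {
		return *partialImageIndex
	}
	return 0
}

func main() {
	zero, five := 0, 5
	fmt.Println(resolveChunkIndex(&zero, &five))
}
```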

tests/integrations/python/tests/test_openai.py-1170-1257 (1)

1170-1257: Gate unsupported image parameters to prevent cross-provider test failures.

The quality="low" (line 1228) and size="1024x1536" (line 1252) parameters are specific to OpenAI's gpt-image-1 model. These tests run against all providers supporting the image_generation scenario (including Gemini, which does not support these parameters), causing failures for non-OpenAI providers. Add error handling to skip on unsupported parameter errors, or create separate scenario categories (image_generation_quality, image_generation_sizes) that gate to OpenAI-only.

Separately, the test_config fixture is unused in all four test methods. For consistency with the pattern used in test_google.py (e.g., _test_config), consider renaming to _test_config if the fixture serves only to satisfy test infrastructure requirements.

Suggested approach (parameter handling)
- def test_52c_image_generation_quality(self, test_config, provider, model, vk_enabled):
+ def test_52c_image_generation_quality(self, _test_config, provider, model, vk_enabled):
     ...
+    try:
         response = client.images.generate(
             model=format_provider_model(provider, model),
             prompt=IMAGE_GENERATION_SIMPLE_PROMPT,
             n=1,
             size="1024x1024",
             quality="low",
+        )
+    except Exception as e:
+        if "quality" in str(e).lower() and ("unsupported" in str(e).lower() or "invalid" in str(e).lower()):
+            pytest.skip(f"quality parameter not supported for {provider}/{model}")
+        raise
core/providers/azure/azure.go-1215-1304 (1)

1215-1304: Add nil validation for image generation requests to avoid panics; reuse getModelDeployment for consistency.
Both methods read request.Model without checking request != nil.

Proposed fix
 func (provider *AzureProvider) ImageGeneration(ctx *schemas.BifrostContext, key schemas.Key,
 	request *schemas.BifrostImageGenerationRequest) (*schemas.BifrostImageGenerationResponse, *schemas.BifrostError) {
+	if request == nil {
+		return nil, providerUtils.NewBifrostOperationError("image generation request is required", nil, provider.GetProviderKey())
+	}
 	// Validate api key configs
 	if err := provider.validateKeyConfig(key); err != nil {
 		return nil, err
 	}
 
-	deployment := key.AzureKeyConfig.Deployments[request.Model]
-	if deployment == "" {
-		return nil, providerUtils.NewConfigurationError(fmt.Sprintf("deployment not found for model %s", request.Model), provider.GetProviderKey())
-	}
+	deployment, err := provider.getModelDeployment(key, request.Model)
+	if err != nil {
+		return nil, err
+	}
@@
 func (provider *AzureProvider) ImageGenerationStream(
@@
 	request *schemas.BifrostImageGenerationRequest,
 ) (chan *schemas.BifrostStream, *schemas.BifrostError) {
 
+	if request == nil {
+		return nil, providerUtils.NewBifrostOperationError("image generation request is required", nil, provider.GetProviderKey())
+	}
 	// Validate api key configs
 	if err := provider.validateKeyConfig(key); err != nil {
 		return nil, err
 	}
 
-	//
-	deployment := key.AzureKeyConfig.Deployments[request.Model]
-	if deployment == "" {
-		return nil, providerUtils.NewConfigurationError(fmt.Sprintf("deployment not found for model %s", request.Model), provider.GetProviderKey())
-	}
+	deployment, err := provider.getModelDeployment(key, request.Model)
+	if err != nil {
+		return nil, err
+	}
core/providers/gemini/gemini.go-1673-1749 (1)

1673-1749: Don’t hardcode generativelanguage base URL; normalize model and reuse configured BaseURL/context path.
handleImagenImageGeneration ignores provider.networkConfig.BaseURL and providerUtils.GetPathFromContext, and uses request.Model directly in the URL. This can break custom base URLs and any model normalization the rest of the provider relies on.

Proposed fix
 func (provider *GeminiProvider) handleImagenImageGeneration(ctx *schemas.BifrostContext, key schemas.Key, request *schemas.BifrostImageGenerationRequest) (*schemas.BifrostImageGenerationResponse, *schemas.BifrostError) {
 	providerName := provider.GetProviderKey()
+	if request == nil {
+		return nil, providerUtils.NewBifrostOperationError("image generation request is required", nil, providerName)
+	}
 
@@
-	baseURL := fmt.Sprintf("https://generativelanguage.googleapis.com/v1beta/models/%s:predict", request.Model)
+	// Ensure we use configured BaseURL + request-scoped path rewriting
+	_, model := schemas.ParseModelString(request.Model, schemas.Gemini)
+	baseURL := strings.TrimRight(provider.networkConfig.BaseURL, "/")
+	urlPath := providerUtils.GetPathFromContext(ctx, "/models/"+model+":predict")
 
@@
-	req.SetRequestURI(baseURL)
+	req.SetRequestURI(baseURL + urlPath)
transports/bifrost-http/handlers/inference.go-1414-1486 (1)

1414-1486: Guard cancel() usage when ConvertToBifrostContext fails; also tighten prompt validation.

At Line 1466 you call cancel() even when bifrostCtx == nil. Unless ConvertToBifrostContext guarantees a non-nil cancel func on failure, this can panic. Also, the prompt check should likely use strings.TrimSpace(req.Prompt) == "" so that whitespace-only prompts are rejected.

Proposed diff
 	bifrostCtx, cancel := lib.ConvertToBifrostContext(ctx, h.handlerStore.ShouldAllowDirectKeys(), h.config.GetHeaderFilterConfig())
 	if bifrostCtx == nil {
-		cancel()
+		if cancel != nil {
+			cancel()
+		}
 		SendError(ctx, fasthttp.StatusInternalServerError, "Failed to convert context")
 		return
 	}

-	if req.ImageGenerationInput == nil || req.Prompt == "" {
+	if req.ImageGenerationInput == nil || strings.TrimSpace(req.Prompt) == "" {
 		SendError(ctx, fasthttp.StatusBadRequest, "prompt can not be empty")
 		return
 	}
core/internal/testutil/image_generation.go-58-69 (1)

58-69: Enforce deterministic image decode validation by setting response_format=b64_json and output_format=png.

The test currently accepts URL-only responses (lines 104-106), but dimension validation only executes when b64_json is present (line 109). Without explicitly requesting b64_json, providers may return URLs instead, silently skipping dimension checks. Additionally, some providers may default to webp format, which fails to decode since only PNG and JPEG decoders are imported (lines 9-10), causing flaky test failures.

Proposed diff
 				Params: &schemas.ImageGenerationParameters{
 					Size:    bifrost.Ptr("1024x1024"),
 					Quality: bifrost.Ptr("low"),
 					N:       bifrost.Ptr(1),
+					ResponseFormat: bifrost.Ptr("b64_json"),
+					OutputFormat:   bifrost.Ptr("png"),
 				},
core/providers/openai/openai.go-2534-2874 (1)

2534-2874: Fix image streaming to handle multi-image generation: multiple issues with event indexing, usage fields, and premature stream termination.

The implementation has several critical issues for multi-image streams:

  1. Premature stream termination: The code returns after the first isCompleted event (if isCompleted { return }), but OpenAI's spec indicates "Final event for each image"—meaning N-image requests should receive N completed events. This truncates multi-image streams.

  2. Fragile image indexing: When incompleteImages is empty (no prior partial chunks), completed events default to imageIndex = 0, collapsing multiple images. This occurs when:

    • Provider emits only completed events (no partials)
    • Provider omits partial_image_index
    • Chunks arrive out of expected order

    Consider tracking a monotonic counter for completed events, falling back when partial indices are unavailable.

  3. Incomplete usage field copy: Only InputTokens, OutputTokens, and TotalTokens are copied from response.Usage. The InputTokensDetails field is dropped, which may contain important token breakdown information. Copy all fields or preserve response.Usage as-is.

Event types (image_generation.partial_image, image_generation.completed) correctly match the OpenAI spec.
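
A sketch of the counter-based fallback and termination rule described in points 1 and 2 (the `event` shape and `assignIndexes` helper are hypothetical, not Bifrost's API):

```go
package main

import "fmt"

// event is an illustrative stand-in for a streamed image-generation event.
type event struct {
	typ               string // e.g. "image_generation.completed"
	partialImageIndex *int   // may be nil when the provider omits it
}

// assignIndexes uses the provider index when present, falls back to a
// monotonic counter otherwise, and reports termination only after n
// completed events rather than after the first one.
func assignIndexes(events []event, n int) (indexes []int, terminated bool) {
	completed := 0
	for _, ev := range events {
		if ev.typ != "image_generation.completed" {
			continue
		}
		idx := completed // fallback: one slot per completed image
		if ev.partialImageIndex != nil {
			idx = *ev.partialImageIndex
		}
		indexes = append(indexes, idx)
		completed++
		if completed == n { // stop only once all n images are done
			return indexes, true
		}
	}
	return indexes, false
}

func main() {
	idx, done := assignIndexes([]event{
		{typ: "image_generation.completed"},
		{typ: "image_generation.completed"},
	}, 2)
	fmt.Println(idx, done)
}
```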

core/providers/gemini/images.go-183-251 (1)

183-251: Enforce “non-empty prompt” before building image-generation provider payloads.

Both ToGeminiImageGenerationRequest() and ToImagenImageGenerationRequest() will happily build requests with Prompt == "". Based on learnings, image-gen requests should be rejected early to avoid dispatching null/empty bodies to providers.

Proposed minimal guard (lets CheckContextAndGetRequestBody produce the structured nil-body error)
 func ToGeminiImageGenerationRequest(bifrostReq *schemas.BifrostImageGenerationRequest) *GeminiGenerationRequest {
   if bifrostReq == nil {
     return nil
   }
-  if bifrostReq.Input == nil {
+  if bifrostReq.Input == nil || strings.TrimSpace(bifrostReq.Input.Prompt) == "" {
     return nil
   }
   ...
 }

 func ToImagenImageGenerationRequest(bifrostReq *schemas.BifrostImageGenerationRequest) *GeminiImagenRequest {
-  if bifrostReq == nil || bifrostReq.Input == nil {
+  if bifrostReq == nil || bifrostReq.Input == nil || strings.TrimSpace(bifrostReq.Input.Prompt) == "" {
     return nil
   }
   ...
 }

Also applies to: 253-351

core/providers/gemini/images.go-396-443 (1)

396-443: convertSizeToImagenFormat() only implements 5 of 10 documented aspect ratios.

The type definition in types.go documents support for "1:1", "2:3", "3:2", "3:4", "4:3", "4:5", "5:4", "9:16", "16:9", and "21:9", but the function only handles 1:1, 3:4, 4:3, 9:16, and 16:9. For input sizes matching 2:3, 3:2, 4:5, 5:4, or 21:9, the function returns an empty aspect ratio string, causing data loss in callers.
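
One way to cover all ten ratios is a table-driven lookup keyed on the GCD-reduced dimensions; note that 21:9 reduces to 7:3, so the reduced pair must be the key. This is a sketch under that assumption, not the provider's actual helper:

```go
package main

import "fmt"

func gcd(a, b int) int {
	for b != 0 {
		a, b = b, a%b
	}
	return a
}

// reducedToRatio maps GCD-reduced width:height pairs to the documented
// Imagen aspect-ratio strings.
var reducedToRatio = map[[2]int]string{
	{1, 1}: "1:1", {2, 3}: "2:3", {3, 2}: "3:2", {3, 4}: "3:4", {4, 3}: "4:3",
	{4, 5}: "4:5", {5, 4}: "5:4", {9, 16}: "9:16", {16, 9}: "16:9",
	{7, 3}: "21:9", // 21:9 in lowest terms
}

// sizeToAspectRatio reduces WxH by the GCD and looks up the ratio;
// it returns "" for unsupported or invalid sizes.
func sizeToAspectRatio(w, h int) string {
	if w <= 0 || h <= 0 {
		return ""
	}
	g := gcd(w, h)
	return reducedToRatio[[2]int{w / g, h / g}]
}

func main() {
	fmt.Println(sizeToAspectRatio(1024, 1536))
}
```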

core/providers/gemini/images.go-90-121 (1)

90-121: Missing support for "4K" size and additional aspect ratios, with case-sensitivity issue.

convertImagenFormatToSize() is incomplete and case-sensitive. The type comments document support for "1K", "2K", "4K" and 10 aspect ratios ("1:1", "2:3", "3:2", "3:4", "4:3", "4:5", "5:4", "9:16", "16:9", "21:9"), but the function only handles lowercase "1k"/"2k" (no "4K") and only 5 aspect ratios, missing "2:3", "3:2", "4:5", "5:4", "21:9". This will cause silent failures for valid requests using uppercase sizes or unsupported ratios.

Add case-insensitivity with strings.ToLower(), support "4k", and add all documented aspect ratios to match the type definition.
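
A sketch of a case-insensitive mapping that covers 1K/2K/4K and all ten ratios; the dimension math (longer edge scaled to the base size) is an assumption for illustration, not the provider's confirmed formula:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

var baseSizes = map[string]int{"1k": 1024, "2k": 2048, "4k": 4096}

var ratioDims = map[string][2]int{
	"1:1": {1, 1}, "2:3": {2, 3}, "3:2": {3, 2}, "3:4": {3, 4}, "4:3": {4, 3},
	"4:5": {4, 5}, "5:4": {5, 4}, "9:16": {9, 16}, "16:9": {16, 9}, "21:9": {21, 9},
}

// imagenFormatToSize converts a size token ("1K"/"2k"/...) plus an aspect
// ratio into a WxH string, scaling the longer edge to the base size.
// It returns "" for unknown tokens or ratios.
func imagenFormatToSize(sizeToken, aspectRatio string) string {
	base, ok := baseSizes[strings.ToLower(sizeToken)]
	if !ok {
		return ""
	}
	dims, ok := ratioDims[aspectRatio]
	if !ok {
		return ""
	}
	w, h := dims[0], dims[1]
	if w >= h {
		return strconv.Itoa(base) + "x" + strconv.Itoa(base*h/w)
	}
	return strconv.Itoa(base*w/h) + "x" + strconv.Itoa(base)
}

func main() {
	fmt.Println(imagenFormatToSize("4K", "16:9"))
}
```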

core/providers/huggingface/huggingface.go-882-966 (1)

882-966: Add empty/whitespace prompt validation to converter (match OpenAI/Nebius pattern).

ToHuggingFaceImageGenerationRequest() only validates bifrostReq.Input == nil but not empty or whitespace-only prompts. OpenAI checks bifrostReq.Input.Prompt == "" and Nebius validates strings.TrimSpace(request.Input.Prompt) == "". Add validation for empty/whitespace prompt to match the repo-wide pattern.
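
The repo-wide guard described above amounts to something like this (the `Request`/`Input` types are simplified stand-ins for the Bifrost structs):

```go
package main

import (
	"fmt"
	"strings"
)

type Input struct{ Prompt string }
type Request struct{ Input *Input }

// validPrompt mirrors the Nebius-style check: nil request, nil input, and
// empty or whitespace-only prompts are all rejected.
func validPrompt(req *Request) bool {
	return req != nil && req.Input != nil && strings.TrimSpace(req.Input.Prompt) != ""
}

func main() {
	fmt.Println(validPrompt(&Request{Input: &Input{Prompt: "   "}}))
}
```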

core/providers/huggingface/huggingface.go-1024-1278 (1)

1024-1278: Always set StreamEndIndicator when stream goroutine exits, even with zero chunks.

The handler only sets BifrostContextKeyStreamEndIndicator on error paths or when chunkIndex > 0. If the stream ends cleanly without yielding any image chunks (e.g., stream yields only non-data: lines), consumers won't receive the end indicator, breaking stream completion logic in plugins like telemetry and logging.

Use a defer at the goroutine entry to unconditionally set the indicator:

Proposed fix
 go func() {
   defer providerUtils.ReleaseStreamingResponse(resp)
+  defer ctx.SetValue(schemas.BifrostContextKeyStreamEndIndicator, true)
   defer close(responseChan)
   ...
   if chunkIndex > 0 {
      ...
-     ctx.SetValue(schemas.BifrostContextKeyStreamEndIndicator, true)
      providerUtils.ProcessAndSendResponse(...)
   }
   ...
   if err := scanner.Err(); err != nil {
      ...
-     ctx.SetValue(schemas.BifrostContextKeyStreamEndIndicator, true)
      providerUtils.ProcessAndSendBifrostError(...)
   }
 }()

Regarding SSE parsing: bufio.Scanner's line-based approach is appropriate if fal-ai guarantees single-line JSON payloads (standard SSE format data: {json}\n\n). Codify this assumption in a comment if confirmed.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 18

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (6)
core/providers/utils/utils.go (1)

907-918: Correct propagation, but missing in sibling error-handling functions.

This addition correctly propagates ImageGenerationStreamResponse through the streaming pipeline, following the established pattern for other response types.

However, the same propagation is missing in ProcessAndSendBifrostError (lines 959-966) and ProcessAndSendError (lines 1125-1131). If a post-hook transforms an error into a valid image generation response, that response will be silently dropped.

🔧 Proposed fix for ProcessAndSendBifrostError
 	if processedResponse != nil {
 		streamResponse.BifrostTextCompletionResponse = processedResponse.TextCompletionResponse
 		streamResponse.BifrostChatResponse = processedResponse.ChatResponse
 		streamResponse.BifrostResponsesStreamResponse = processedResponse.ResponsesStreamResponse
 		streamResponse.BifrostSpeechStreamResponse = processedResponse.SpeechStreamResponse
 		streamResponse.BifrostTranscriptionStreamResponse = processedResponse.TranscriptionStreamResponse
+		streamResponse.BifrostImageGenerationStreamResponse = processedResponse.ImageGenerationStreamResponse
 	}
🔧 Proposed fix for ProcessAndSendError
 	if processedResponse != nil {
 		streamResponse.BifrostTextCompletionResponse = processedResponse.TextCompletionResponse
 		streamResponse.BifrostChatResponse = processedResponse.ChatResponse
 		streamResponse.BifrostResponsesStreamResponse = processedResponse.ResponsesStreamResponse
 		streamResponse.BifrostSpeechStreamResponse = processedResponse.SpeechStreamResponse
 		streamResponse.BifrostTranscriptionStreamResponse = processedResponse.TranscriptionStreamResponse
+		streamResponse.BifrostImageGenerationStreamResponse = processedResponse.ImageGenerationStreamResponse
 	}
docs/providers/supported-providers/groq.mdx (1)

34-34: Update the Note to include Image Generation.

The Note lists unsupported operations but doesn't include the newly added "Image Generation" row from the table above.

📝 Suggested fix
-**Unsupported Operations** (❌): Embeddings, Speech, Transcriptions, Files, and Batch are not supported by the upstream Groq API. These return `UnsupportedOperationError`.
+**Unsupported Operations** (❌): Embeddings, Image Generation, Speech, Transcriptions, Files, and Batch are not supported by the upstream Groq API. These return `UnsupportedOperationError`.
plugins/semanticcache/utils.go (1)

379-412: Avoid duplicating / storing full base64 images in semantic-cache metadata (size + retention risk).
metadata["response"] already contains the marshaled response; additionally storing image_b64 can explode metadata size and may exceed vectorstore limits or retention expectations.

Proposed change: keep URLs + revised prompts, drop image_b64 from metadata
 	// image specific metadata
 	if res.ImageGenerationResponse != nil {
 		var imageURLs []string
-		var imageB64 []string
 		var revisedPrompts []string
 
 		for _, img := range res.ImageGenerationResponse.Data {
 			if img.URL != "" {
 				imageURLs = append(imageURLs, img.URL)
 			}
-			if img.B64JSON != "" {
-				imageB64 = append(imageB64, img.B64JSON)
-			}
 			if img.RevisedPrompt != "" {
 				revisedPrompts = append(revisedPrompts, img.RevisedPrompt)
 			}
 		}
 
 		metadata["image_urls"] = imageURLs
-		metadata["image_b64"] = imageB64
 		metadata["revised_prompts"] = revisedPrompts
 	}
core/providers/gemini/speech.go (1)

147-147: Potential panic: unchecked type assertion on context value.

Line 147 performs an unguarded type assertion ctx.Value(BifrostContextKeyResponseFormat).(string). If the context value is missing or not a string, this will panic.

🐛 Proposed fix
-			responseFormat := ctx.Value(BifrostContextKeyResponseFormat).(string)
+			responseFormat, _ := ctx.Value(BifrostContextKeyResponseFormat).(string)
docs/providers/supported-providers/gemini.mdx (1)

19-30: Correct Files endpoint in table to match Google's official API path

The table lists /upload/storage/v1beta/files, but Google's official Generative AI Files API specifies /upload/v1beta/files (without "storage" in the path). Update line 27 to /upload/v1beta/files for accuracy.

core/providers/openai/openai.go (1)

2286-2311: TranscriptionStream: comment lines (':') should be skipped to avoid noisy unmarshal warnings

Line 2290 only checks line == "", but the comment says “Skip empty lines and comments”. Other SSE handlers skip strings.HasPrefix(line, ":") as well; doing so avoids repeatedly trying to unmarshal comment lines.

Proposed fix
-			// Skip empty lines and comments
-			if line == "" {
+			// Skip empty lines and comments
+			if line == "" || strings.HasPrefix(line, ":") {
 				continue
 			}
🤖 Fix all issues with AI agents
In @core/internal/testutil/response_validation.go:
- Around line 240-275: The image validation can panic when
expectations.ProviderSpecific is nil because code accesses
expectations.ProviderSpecific["min_images"] and ["expected_size"] without
guarding; update ValidateImageGenerationResponse/validateImageGenerationFields
(where ProviderSpecific keys are read) to first check if
expectations.ProviderSpecific != nil (or use a safe lookup with value, ok :=
expectations.ProviderSpecific["min_images"]) before indexing, and treat missing
keys as absent/defaults; apply the same nil-guard pattern to the other
occurrences referenced (lines ~1092-1166) to prevent nil-map panics.

In @core/providers/azure/azure.go:
- Around line 1212-1305: Both ImageGeneration and ImageGenerationStream
dereference request.Model before checking for nil; add an early guard to return
a configuration error when request == nil and also validate that request.Model
is not empty, then use the existing getModelDeployment (or
provider.getModelDeployment) helper to resolve the deployment instead of
directly indexing key.AzureKeyConfig.Deployments; update both functions to
perform: if request == nil { return nil,
providerUtils.NewConfigurationError("request is nil", provider.GetProviderKey())
} then if request.Model == "" { return nil,
providerUtils.NewConfigurationError("model not specified",
provider.GetProviderKey()) } and replace the direct deployment lookup with a
call to getModelDeployment to obtain the deployment and handle missing
deployment uniformly.

In @core/providers/gemini/gemini.go:
- Around line 1614-1754: Add the same nil and prompt validation used across
providers to both ImageGeneration and handleImagenImageGeneration: check request
!= nil and that request.Input (or equivalent) contains a non-empty Prompt
(return the appropriate Bifrost error as other providers do) before referencing
request.Model or building the request body. In handleImagenImageGeneration, stop
hardcoding the Google URL; construct the request URL from
provider.networkConfig.BaseURL combined with the path helper (e.g.,
providerUtils.GetPathFromContext or the repo’s path-building utility) and the
model predict suffix instead of using
"https://generativelanguage.googleapis.com/…", so routing/proxy/BaseURL
overrides are respected. Ensure these validations return early and match
existing error creation patterns used elsewhere in the provider code.

In @core/providers/gemini/images.go:
- Around line 90-121: The convertImagenFormatToSize function is missing several
aspect ratio cases documented in GeminiImagenParameters (2:3, 3:2, 4:5, 5:4,
21:9), causing those inputs to fall back to a square size; update
convertImagenFormatToSize to handle these ratios by adding switch branches for
"2:3" (width = baseSize*2/3, height = baseSize), "3:2" (width = baseSize, height
= baseSize*2/3), "4:5" (width = baseSize*4/5, height = baseSize), "5:4" (width =
baseSize, height = baseSize*4/5), and "21:9" (width = baseSize*21/9, height =
baseSize) using strconv.Itoa on the computed integers so the function returns
correct WxH strings consistent with existing cases.

In @core/providers/gemini/transcription.go:
- Around line 36-41: The code silently skips parts when
decodeBase64StringToBytes(part.InlineData.Data) fails, causing inconsistent
error handling with ToBifrostSpeechResponse; update the handler so decoding
failures are propagated (or at minimum logged) rather than silently continued:
replace the current "continue" behavior with returning or bubbling up a wrapped
error (e.g., fmt.Errorf or errors.Wrap) from the enclosing function, or if
returning isn't possible, call the module logger with a clear message including
part identifiers and the decode error before continuing to keep observability
consistent with ToBifrostSpeechResponse.

In @core/providers/openai/openai.go:
- Around line 2382-2491: Add explicit validation to prevent nil/missing-prompt
panics: in ImageGeneration ensure the incoming req is non-nil and contains a
non-empty prompt (or prompts) before calling HandleOpenAIImageGenerationRequest;
in HandleOpenAIImageGenerationRequest check request != nil and that the prompt
field(s) are present/non-empty before calling ToOpenAIImageGenerationRequest or
providerUtils.CheckContextAndGetRequestBody, and return an appropriate
BifrostError (using the existing providerUtils error constructors and
schemas.ImageGenerationRequest/providerName) when validation fails.

In @core/schemas/plugin.go:
- Around line 49-51: Update the top-of-function doc comment to reference the
exported function name CaseInsensitiveLookup (instead of the old unexported
caseInsensitiveLookup) and ensure the comment follows Go's convention of
starting with the function name: e.g., "CaseInsensitiveLookup looks up..." and
then keep the rest of the description unchanged.

In @docs/openapi/paths/integrations/openai/images.yaml:
- Around line 62-66: The OpenAPI spec is missing the required flag for the Azure
OpenAI image generation query parameter named "api-version"; update the
parameter block for name: api-version (in: query) in
docs/openapi/paths/integrations/openai/images.yaml to include required: true so
the spec enforces the mandatory api-version query parameter for the
images/generations endpoint.

In @docs/providers/supported-providers/vertex.mdx:
- Line 314: The header " # 5. Image Generation" in
docs/providers/supported-providers/vertex.mdx is misnumbered; change that header
to "# 4. Image Generation" and update the subsequent "List Models" header to "#
5. List Models" (or renumber any following top-level headers accordingly) so the
section sequence 1, 2, 3, 4, 5 is preserved; locate and edit the literal header
lines "5. Image Generation" and the "List Models" header to their corrected
numeric values.
- Line 402: Update the Markdown heading in the vertex provider doc: change the
heading text from "# 6. List Models" to "# 5. List Models" (the header in the
file that currently reads "6. List Models"); if the "Image Generation" section
is moved to section 4 as noted, ensure this header becomes section 5 so
numbering remains sequential.

In @docs/quickstart/go-sdk/multimodal.mdx:
- Around line 85-88: The image generation example should note that token-based
usage metrics are not provided by image-generation providers, so response.Usage
will often be nil; update the multimodal example around the response.Usage check
to either remove the Usage printing or add a short inline comment clarifying
that for image generation endpoints response.Usage (and Usage.TotalTokens) may
be empty/not populated and thus won’t produce metrics. Ensure the change
references response.Usage and Usage.TotalTokens so readers know which fields are
affected.

In @plugins/semanticcache/plugin_image_generation_test.go:
- Around line 388-402: The branch that handles duration2 < duration1 computes
speedup := float64(duration1)/float64(duration2) which is always >1, so the
subsequent check if speedup < 0.1 is unreachable; remove that inner conditional
and its t.Errorf call (keep the t.Logf). Locate the block that compares
duration1 and duration2 (variables duration1, duration2, speedup) inside the
test and delete the dead-speedup check to avoid dead code.

In @transports/bifrost-http/handlers/inference.go:
- Around line 141-158: The imageParamsKnownFields map is missing keys present on
schemas.ImageGenerationParameters which causes typed fields to be treated as
ExtraParams; update the imageParamsKnownFields variable to include at least
"seed", "negative_prompt", and "num_inference_steps" so those keys are
recognized as known fields and not placed into ExtraParams, ensuring
ImageGenerationParameters handling remains authoritative.

In @transports/bifrost-http/integrations/router.go:
- Around line 203-210: RegisterRoutes currently only validates
RequestConverter/ErrorConverter but can panic when route configs lack other
required converters; update RegisterRoutes (and any route config validation) to
explicitly check for nil on route-specific converter fields such as
ImageGenerationResponseConverter and ImageGenerationStreamResponseConverter (and
any other converters referenced in the blocks around lines ~256-264 and
~277-309), and either return a clear structured error indicating the missing
converter or assign a safe default/no-op converter to avoid panics at runtime;
reference the converter types ImageGenerationResponseConverter and
ImageGenerationStreamResponseConverter when adding these checks.
- Around line 1085-1087: The ImageGenerationStreamResponseConverter on
StreamConfig can be nil and cause a panic during the streaming loop; add a
nil-check guard where you call g.client.ImageGenerationStreamRequest and where
the stream is consumed (same pattern used for other converters), returning a
clear error if StreamConfig.ImageGenerationStreamResponseConverter is nil, or
add a startup validation that ensures ImageGenerationStreamResponseConverter is
set when StreamConfig exists so ImageGenerationStreamRequest and its stream loop
never dereference a nil converter.
- Around line 696-718: Guard the call to ImageGenerationResponseConverter:
before invoking config.ImageGenerationResponseConverter(bifrostCtx,
imageGenerationResponse) check whether config.ImageGenerationResponseConverter
is nil and if so call g.sendError(ctx, bifrostCtx, config.ErrorConverter,
newBifrostError(nil, "missing ImageGenerationResponseConverter for
integration")) and return; ensure this nil-check happens after verifying
imageGenerationResponse is non-nil (and keep existing sendError/newBifrostError
usage and the surrounding post-callback handling intact).
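The guard pattern asked for in these three comments can be sketched as follows; the types here are stand-ins for the router's real converter plumbing (only the field name ImageGenerationResponseConverter and the error message come from the review):

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical stand-ins for the route config and converter types in
// transports/bifrost-http/integrations/router.go.
type response struct{ ID string }
type converter func(*response) (string, error)

type routeConfig struct {
	ImageGenerationResponseConverter converter
}

// convertOrError applies the suggested guard: verify the response is
// non-nil first, then check the converter for nil before invoking it,
// returning a clear error instead of panicking at runtime.
func convertOrError(cfg routeConfig, resp *response) (string, error) {
	if resp == nil {
		return "", errors.New("nil image generation response")
	}
	if cfg.ImageGenerationResponseConverter == nil {
		return "", errors.New("missing ImageGenerationResponseConverter for integration")
	}
	return cfg.ImageGenerationResponseConverter(resp)
}

func main() {
	// A config with no converter now yields a structured error rather
	// than a nil-dereference panic.
	_, err := convertOrError(routeConfig{}, &response{ID: "img_1"})
	fmt.Println(err)
}
```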

In @ui/app/workspace/logs/views/imageView.tsx:
- Around line 27-34: The effect that clamps currentIndex is running
unnecessarily because `images` is recreated each render; update the dependency
array for the useEffect that references `setCurrentIndex` and `totalImages` to
only depend on `totalImages` (i.e., change useEffect([...]) to useEffect(() => {
... }, [totalImages]);); alternatively, if you need the effect to run when the
actual image content changes, memoize `images` using `useMemo` (wrap the
`imageOutput?.data?.filter(...)` expression in useMemo with
`[imageOutput?.data]`) and keep `images` in the dependency array so the effect
only triggers on real changes.
🧹 Nitpick comments (29)
core/schemas/utils.go (1)

1183-1186: Consider aligning case-sensitivity behavior with other model-checking functions.

This function applies strings.ToLower() before matching, whereas the adjacent functions (IsNovaModel, IsAnthropicModel, IsMistralModel, IsGeminiModel) perform case-sensitive matching. This creates inconsistent behavior: IsImagenModel("IMAGEN-3") returns true, but IsGeminiModel("GEMINI-1.5") returns false.

If case-insensitive matching is intentional for Imagen models, the logic is correct as-is. Otherwise, consider aligning with the existing pattern:

🔧 Suggested change for consistency
 // IsImagenModel checks if the model is an Imagen model.
 func IsImagenModel(model string) bool {
-	return strings.Contains(strings.ToLower(model), "imagen")
+	return strings.Contains(model, "imagen")
 }
framework/logstore/migrations.go (1)

684-748: Consider combining both image generation column migrations.

Both migrationAddImageGenerationOutputColumn and migrationAddImageGenerationInputColumn add columns for the same feature. Similar to how migrationAddResponsesOutputColumn (lines 198-259) adds multiple related columns in a single migration, consolidating these would reduce migration overhead and keep related schema changes atomic.

♻️ Suggested consolidation
-func migrationAddImageGenerationOutputColumn(ctx context.Context, db *gorm.DB) error {
+func migrationAddImageGenerationColumns(ctx context.Context, db *gorm.DB) error {
 	opts := *migrator.DefaultOptions
 	opts.UseTransaction = true
 	m := migrator.New(db, &opts, []*migrator.Migration{{
-		ID: "logs_add_image_generation_output_column",
+		ID: "logs_add_image_generation_columns",
 		Migrate: func(tx *gorm.DB) error {
 			tx = tx.WithContext(ctx)
 			migrator := tx.Migrator()
 			if !migrator.HasColumn(&Log{}, "image_generation_output") {
 				if err := migrator.AddColumn(&Log{}, "image_generation_output"); err != nil {
 					return err
 				}
 			}
+			if !migrator.HasColumn(&Log{}, "image_generation_input") {
+				if err := migrator.AddColumn(&Log{}, "image_generation_input"); err != nil {
+					return err
+				}
+			}
 			return nil
 		},
 		Rollback: func(tx *gorm.DB) error {
 			tx = tx.WithContext(ctx)
 			migrator := tx.Migrator()
 			if migrator.HasColumn(&Log{}, "image_generation_output") {
 				if err := migrator.DropColumn(&Log{}, "image_generation_output"); err != nil {
 					return err
 				}
 			}
+			if migrator.HasColumn(&Log{}, "image_generation_input") {
+				if err := migrator.DropColumn(&Log{}, "image_generation_input"); err != nil {
+					return err
+				}
+			}
 			return nil
 		},
 	}})
 	err := m.Migrate()
 	if err != nil {
-		return fmt.Errorf("error while adding image generation output column: %s", err.Error())
+		return fmt.Errorf("error while adding image generation columns: %s", err.Error())
 	}
 	return nil
 }
-
-func migrationAddImageGenerationInputColumn(ctx context.Context, db *gorm.DB) error {
-	opts := *migrator.DefaultOptions
-	opts.UseTransaction = true
-	m := migrator.New(db, &opts, []*migrator.Migration{{
-		ID: "logs_add_image_generation_input_column",
-		Migrate: func(tx *gorm.DB) error {
-			tx = tx.WithContext(ctx)
-			migrator := tx.Migrator()
-			if !migrator.HasColumn(&Log{}, "image_generation_input") {
-				if err := migrator.AddColumn(&Log{}, "image_generation_input"); err != nil {
-					return err
-				}
-			}
-			return nil
-		},
-		Rollback: func(tx *gorm.DB) error {
-			tx = tx.WithContext(ctx)
-			migrator := tx.Migrator()
-			if migrator.HasColumn(&Log{}, "image_generation_input") {
-				if err := migrator.DropColumn(&Log{}, "image_generation_input"); err != nil {
-					return err
-				}
-			}
-			return nil
-		},
-	}})
-	err := m.Migrate()
-	if err != nil {
-		return fmt.Errorf("error while adding image generation input column: %s", err.Error())
-	}
-	return nil
-}

And update triggerMigrations:

-	if err := migrationAddImageGenerationOutputColumn(ctx, db); err != nil {
-		return err
-	}
-	if err := migrationAddImageGenerationInputColumn(ctx, db); err != nil {
+	if err := migrationAddImageGenerationColumns(ctx, db); err != nil {
 		return err
 	}

That said, the current implementation is correct and follows the defensive patterns established in the codebase.

ui/app/workspace/logs/views/imageView.tsx (1)

75-81: Simplify misleading isStreaming prop.

The expression isStreaming && !currentImage is always false here because this code is inside the currentImage && (...) block where currentImage is truthy. This is confusing to read.

Since ImageMessage only uses isStreaming to show a skeleton when there's no usable image data, and currentImage always has usable data (it passed the filter), you can simply omit the prop or explicitly pass false.

Proposed simplification
 								<ImageMessage 
 									image={{
 										...currentImage,
 										output_format: imageOutput?.params?.output_format,
 									}} 
-									isStreaming={isStreaming && !currentImage} 
 								/>
core/providers/huggingface/models.go (1)

67-69: Consider adding tag-based detection for image generation.

The pipeline-based detection for "text-to-image" is correct. However, other capabilities like embeddings (lines 74-77), chat (lines 78-83), speech (lines 84-86), and transcription (lines 87-89) have both pipeline-based AND tag-based detection. For consistency, consider adding tag-based detection for image generation models.

🔧 Suggested addition for tag-based image generation detection
 		case tagLower == "automatic-speech-recognition" ||
 			tagLower == "speech-to-text" || strings.Contains(tagLower, "speech-recognition"):
 			addMethods(schemas.TranscriptionRequest)
+		case tagLower == "text-to-image" || tagLower == "image-generation" ||
+			strings.Contains(tagLower, "text-to-image") || strings.Contains(tagLower, "image-gen"):
+			addMethods(schemas.ImageGenerationRequest)
 		}
docs/providers/supported-providers/openai.mdx (1)

240-246: Minor style suggestion: Vary sentence beginnings.

Per static analysis, three successive bullet points begin with the same structural pattern. Consider varying the phrasing slightly:

📝 Suggested rewording
-OpenAI is the baseline schema for image generation. Parameters are passed through with minimal conversion:
+OpenAI serves as Bifrost's baseline schema for image generation, requiring minimal conversion:

-- **Model & Prompt**: `bifrostReq.Model` → `req.Model`, `bifrostReq.Input.Prompt` → `req.Prompt`
-- **Parameters**: All fields from `bifrostReq.Params` (`ImageGenerationParameters`) are embedded directly into the OpenAI request struct via struct embedding. No field mapping or transformation is performed.
+- **Model & Prompt**: Maps `bifrostReq.Model` → `req.Model` and `bifrostReq.Input.Prompt` → `req.Prompt`
+- **Parameters**: Fields from `bifrostReq.Params` (`ImageGenerationParameters`) embed directly into the OpenAI request struct. No field mapping or transformation is performed.
 - **Streaming**: When streaming is requested, `stream: true` is set in the request body.
core/providers/huggingface/utils.go (1)

170-186: Consider defining explicit behavior for image-gen streaming on nebius/together routes.

At Line 173–176 and Line 180–185, nebius and together only handle schemas.ImageGenerationRequest. If image-gen streaming is intentionally unsupported for these, it may be cleaner to return an UnsupportedOperationError for schemas.ImageGenerationStreamRequest (vs a generic fmt.Errorf(...)), to keep error semantics consistent across providers.

plugins/semanticcache/test_utils.go (1)

356-365: Be careful: t.Skipf can mask regressions if CI expects Weaviate to be present.

If CI/environment should guarantee the vector store, consider gating the skip behind an env like REQUIRE_WEAVIATE=1 (fail in CI, skip locally).

docs/providers/supported-providers/nebius.mdx (1)

135-189: Consider loosening output_format docs to include "jpg" as accepted input (not just "jpeg").

The code path typically normalizes "jpeg" to "jpg", but clients often send "jpg" directly; documenting both reduces confusion.

Also, the LanguageTool nit about repeated sentence starts is safe to ignore unless you’re already editing that block.

docs/providers/supported-providers/azure.mdx (2)

20-31: Potential doc confusion: table endpoint /openai/v1/images/generations vs examples using /v1/images/generations.

If both routes exist (gateway vs provider-prefixed), consider adding a one-liner clarifying which endpoint family the table refers to and which the examples use.


364-429: Verify the documented request JSON matches the actual image-generation request schema exposed by the gateway.

This section mixes OpenAI-style fields (e.g., prompt, size, n) with the internal Bifrost struct shape (input.prompt, params.*) used elsewhere—please align with the OpenAPI spec in this Graphite stack (or document both shapes explicitly, if both are supported).

core/internal/testutil/account.go (1)

1129-1155: Minor: XAI scenario field ordering differs from other providers.

The ImageGeneration and ImageGenerationStream fields are placed after ListModels in the XAI config (lines 1153-1154), whereas other providers place them earlier in the Scenarios struct. This doesn't affect functionality but slightly reduces consistency.

Suggested reordering for consistency
 		Scenarios: TestScenarios{
 			TextCompletion:        false, // Not typical
 			SimpleChat:            true,
 			CompletionStream:      true,
 			MultiTurnConversation: true,
 			ToolCalls:             true,
 			MultipleToolCalls:     true,
 			End2EndToolCalling:    true,
 			AutomaticFunctionCall: true,
 			ImageURL:              true,
 			ImageBase64:           true,
 			MultipleImages:        true,
 			CompleteEnd2End:       true,
 			SpeechSynthesis:       false, // Not supported
 			SpeechSynthesisStream: false, // Not supported
 			Transcription:         false, // Not supported
 			TranscriptionStream:   false, // Not supported
 			Embedding:             false, // Not supported
+			ImageGeneration:       true,
+			ImageGenerationStream: false,
 			ListModels:            true,
-			ImageGeneration:       true,
-			ImageGenerationStream: false,
 		},
tests/integrations/python/tests/utils/common.py (1)

2707-2723: Consider adding type hints for consistency.

The new functions assert_valid_image_generation_response and assert_image_generation_usage lack return type hints, while other similar functions in this file (e.g., assert_valid_embedding_response) have -> None annotations.

Add return type hints
-def assert_valid_image_generation_response(response: Any, library: str = "openai"):
+def assert_valid_image_generation_response(response: Any, library: str = "openai") -> None:
-def assert_image_generation_usage(response: Any, library: str = "openai"):
+def assert_image_generation_usage(response: Any, library: str = "openai") -> None:
plugins/semanticcache/utils.go (1)

956-1008: ImageGeneration parameter-to-metadata mapping looks complete; watch key collisions via ExtraParams.
maps.Copy(metadata, params.ExtraParams) can overwrite typed keys (e.g., "size", "quality"). If that’s intended (provider-specific override), ok; otherwise consider copying ExtraParams under a prefix like "extra_*" to avoid silent overrides.

transports/bifrost-http/handlers/inference.go (1)

1258-1334: Avoid logging entire streaming chunks on marshal failure (can leak large/secret payloads).
logger.Warn(..., chunk: %v) may dump base64 images, attachments, or other sensitive content into logs. Consider logging only the chunk type + request id/correlation fields.

framework/streaming/accumulator.go (1)

293-324: Optional: reduce allocations in imageChunkKey if image streaming is high-throughput.

fmt.Sprintf("%d:%d", ...) allocates; if this ends up in a hot path, consider a small strconv.AppendInt-based builder or a fixed-width encoding.

core/internal/testutil/test_retry_framework.go (1)

1250-1252: Consider separating retry configs for ImageGeneration vs ImageGenerationStream scenarios.

Right now both map to DefaultImageGenerationRetryConfig(). If you want stream-specific behavior (e.g., retry on incomplete stream / stream errors), using StreamingRetryConfig() (or a new image-stream-specific config) for "ImageGenerationStream" would be clearer.

core/providers/openai/types.go (1)

546-584: Add explicit constants for image SSE event types to prevent drift/typos.

The unconstrained ImageGenerationEventType should have typed constants defined. OpenAI's official image generation streaming API uses image_generation.partial_image and image_generation.completed as event types—defining these as constants enforces the contract and prevents typos.

Proposed change
 type ImageGenerationEventType string

+const (
+	ImageGenerationEventPartial   ImageGenerationEventType = "image_generation.partial_image"
+	ImageGenerationEventCompleted ImageGenerationEventType = "image_generation.completed"
+)
core/providers/gemini/utils.go (1)

1454-1469: LGTM! Clean base64 encoding helper.

The encodeBytesToBase64String function correctly handles nil/empty input and uses standard base64 encoding as documented.

One minor observation: the local variable declaration could be simplified, but this is a stylistic preference.

✨ Optional simplification
 func encodeBytesToBase64String(bytes []byte) string {
-	var base64str string
-
-	if len(bytes) > 0 {
-		// Use standard base64 encoding to match external SDK expectations
-		base64str = base64.StdEncoding.EncodeToString(bytes)
-	}
-
-	return base64str
+	if len(bytes) == 0 {
+		return ""
+	}
+	return base64.StdEncoding.EncodeToString(bytes)
 }
core/providers/nebius/nebius_test.go (1)

31-32: LGTM! Image generation test configuration added for Nebius.

The configuration correctly enables image generation testing with the black-forest-labs/flux-schnell model and sets streaming to false.

Minor observation: ImageGenerationStream (line 49) is placed after CompleteEnd2End (line 48), which slightly breaks the logical grouping where related flags are adjacent. Consider placing it immediately after ImageGeneration for consistency with other provider test files.

Also applies to: 47-49

core/providers/xai/xai.go (1)

230-244: LGTM! ImageGeneration method correctly delegates to OpenAI handler.

The implementation properly:

  • Uses the shared HandleOpenAIImageGenerationRequest handler
  • Passes all required configuration (client, URL, headers, raw request/response flags)
  • Uses providerUtils.GetPathFromContext for path resolution
  • Follows the established pattern of other methods in this file

Minor: Line 230 has a grammatical issue: "performs a image" should be "performs an image".

📝 Minor grammar fix
-// ImageGeneration performs a image generation request to the xAI API.
+// ImageGeneration performs an image generation request to the xAI API.
core/schemas/provider.go (1)

352-357: Minor grammar nit in doc comments.

The doc comments say "a image" instead of "an image".

📝 Suggested fix
-	// ImageGeneration performs a image generation request
+	// ImageGeneration performs an image generation request
 	ImageGeneration(ctx *BifrostContext, key Key, request *BifrostImageGenerationRequest) (
 		*BifrostImageGenerationResponse, *BifrostError)
-	// ImageGenerationStream performs a image generation stream request
+	// ImageGenerationStream performs an image generation stream request
 	ImageGenerationStream(ctx *BifrostContext, postHookRunner PostHookRunner, key Key,
 		request *BifrostImageGenerationRequest) (chan *BifrostStream, *BifrostError)
core/providers/nebius/nebius.go (1)

259-259: Consider using provider.GetProviderKey() for consistency.

The hardcoded schemas.Nebius is used here, but other methods in this file (e.g., Speech, Transcription, batch operations) use provider.GetProviderKey(). While Nebius doesn't currently support custom provider configs, using the getter would maintain consistency and future-proof the code.

♻️ Suggested change
 	path := providerUtils.GetPathFromContext(ctx, "/v1/images/generations")
-	providerName := schemas.Nebius
+	providerName := provider.GetProviderKey()
core/providers/bedrock/bedrock.go (1)

1347-1355: Unsupported ImageGeneration endpoints: OK, but provider naming is now inconsistent vs other unsupported ops in this file.
Here you use provider.GetProviderKey() while Speech*/Transcription* use schemas.Bedrock. If custom provider names are supported, consider standardizing all unsupported ops to provider.GetProviderKey() for consistent ExtraFields.Provider.

docs/openapi/paths/inference/images.yaml (1)

1-58: OpenAPI path wiring looks consistent; consider de-risking the hard-coded model list.
The “Supported Providers” section enumerates specific model IDs that will drift; recommend either (a) keeping it provider-level without concrete model IDs, or (b) generating from a central source.

plugins/semanticcache/plugin_image_generation_test.go (1)

38-41: Redundant return after t.Skipf.

t.Skipf marks the test as skipped and returns from the test function via runtime.Goexit(), so the explicit return on line 40 is unnecessary. This pattern repeats throughout the file (lines 139-141, 239-241, 251-253, 265-267, 296-298).

♻️ Suggested cleanup
 	if err1 != nil {
-		t.Skipf("First image generation request failed (may be rate limited): %v", err1)
-		return
+		t.Skipf("First image generation request failed (may be rate limited): %v", err1)
 	}
core/internal/testutil/image_generation.go (1)

154-287: Avoid coupling generic stream testutil to the OpenAI provider for completion detection

Line 233 uses openai.ImageGenerationCompleted to detect stream completion. That couples core/internal/testutil to a specific provider package. If the intent is “canonical event types across all providers”, prefer moving the canonical constants to core/schemas (or core/providers/utils) and referencing them from there.

core/providers/gemini/types.go (1)

1702-1707: Inconsistent struct definition: GeminiImagenRequest.Instances uses anonymous struct instead of ImagenInstance.

GeminiImagenRequest.Instances defines an anonymous struct with Prompt *string, while ImagenInstance (line 91-93) uses Prompt string. Consider using the ImagenInstance type for consistency:

♻️ Suggested refactor for consistency
 type GeminiImagenRequest struct {
-	Instances *[]struct {
-		Prompt *string `json:"prompt"`
-	} `json:"instances"`
+	Instances  *[]ImagenInstance       `json:"instances"`
 	Parameters GeminiImagenParameters `json:"parameters"`
 }

Note: This would require updating ImagenInstance.Prompt to be *string if nil prompts need to be distinguished, or updating callers to handle the non-pointer type.

core/providers/gemini/images.go (2)

379-394: Asymmetric format handling: gif is handled in convertMimeTypeToExtension but not in convertOutputFormatToMimeType.

For consistency in round-trip conversions, consider adding gif support:

♻️ Add gif support for symmetry
 func convertOutputFormatToMimeType(outputFormat string) string {
 	format := strings.ToLower(strings.TrimSpace(outputFormat))
 	switch format {
 	case "png":
 		return "image/png"
 	case "jpg", "jpeg":
 		return "image/jpeg"
 	case "webp":
 		return "image/webp"
+	case "gif":
+		return "image/gif"
 	default:
 		return ""
 	}
 }

425-443: Missing aspect ratio detection in convertSizeToImagenFormat.

Similar to convertImagenFormatToSize, this function only detects a subset of aspect ratios. Consider adding detection for the full set supported by Imagen.

♻️ Add missing aspect ratio detection
 	// Common aspect ratios with tolerance
 	if ratio >= 0.99 && ratio <= 1.01 {
 		aspectRatio = "1:1"
+	} else if ratio >= 0.66 && ratio <= 0.68 {
+		aspectRatio = "2:3"
+	} else if ratio >= 1.49 && ratio <= 1.51 {
+		aspectRatio = "3:2"
 	} else if ratio >= 0.74 && ratio <= 0.76 {
 		aspectRatio = "3:4"
 	} else if ratio >= 1.32 && ratio <= 1.34 {
 		aspectRatio = "4:3"
+	} else if ratio >= 0.79 && ratio <= 0.81 {
+		aspectRatio = "4:5"
+	} else if ratio >= 1.24 && ratio <= 1.26 {
+		aspectRatio = "5:4"
 	} else if ratio >= 0.56 && ratio <= 0.57 {
 		aspectRatio = "9:16"
 	} else if ratio >= 1.77 && ratio <= 1.78 {
 		aspectRatio = "16:9"
+	} else if ratio >= 2.32 && ratio <= 2.34 {
+		aspectRatio = "21:9"
 	}


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 20

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (9)
docs/providers/supported-providers/ollama.mdx (1)

31-35: Update the Note to include Image Generation.

The Note lists unsupported operations but doesn't include Image Generation, which is now shown as ❌ in the table above.

📝 Suggested fix
 <Note>
-**Unsupported Operations** (❌): Speech, Transcriptions, Files, and Batch are not supported by the upstream Ollama API. These return `UnsupportedOperationError`.
+**Unsupported Operations** (❌): Image Generation, Speech, Transcriptions, Files, and Batch are not supported by the upstream Ollama API. These return `UnsupportedOperationError`.

 Ollama is self-hosted. Ensure you have an Ollama instance running and configured with the correct BaseURL (e.g., `http://localhost:11434`).
 </Note>
core/utils.go (1)

50-53: Add ImageGenerationRequest and ImageGenerationStreamRequest to model validation.

The isModelRequired function currently excludes ImageGenerationRequest and ImageGenerationStreamRequest, but the HTTP handler explicitly validates that a model is required for these requests (lines 1427-1429 in inference.go). This validation should be enforced at the core API level in isModelRequired to ensure consistent model requirement checking for all code paths, including direct API calls that bypass the HTTP handler.

Suggested change
 func isModelRequired(reqType schemas.RequestType) bool {
-	return reqType == schemas.TextCompletionRequest || reqType == schemas.TextCompletionStreamRequest || reqType == schemas.ChatCompletionRequest || reqType == schemas.ChatCompletionStreamRequest || reqType == schemas.ResponsesRequest || reqType == schemas.ResponsesStreamRequest || reqType == schemas.SpeechRequest || reqType == schemas.SpeechStreamRequest || reqType == schemas.TranscriptionRequest || reqType == schemas.TranscriptionStreamRequest || reqType == schemas.EmbeddingRequest
+	return reqType == schemas.TextCompletionRequest || reqType == schemas.TextCompletionStreamRequest || reqType == schemas.ChatCompletionRequest || reqType == schemas.ChatCompletionStreamRequest || reqType == schemas.ResponsesRequest || reqType == schemas.ResponsesStreamRequest || reqType == schemas.SpeechRequest || reqType == schemas.SpeechStreamRequest || reqType == schemas.TranscriptionRequest || reqType == schemas.TranscriptionStreamRequest || reqType == schemas.EmbeddingRequest || reqType == schemas.ImageGenerationRequest || reqType == schemas.ImageGenerationStreamRequest
ui/lib/types/schemas.ts (1)

327-345: Add defaults for new required allowed_requests fields to prevent validation failures on existing configs.

When allowed_requests is stored with older field subsets and later re-validated, missing image_generation and image_generation_stream will fail schema parsing. Use .default(false) for these new fields to maintain backward compatibility with existing persisted configurations.

Proposed fix
 export const allowedRequestsSchema = z.object({
   text_completion: z.boolean(),
   text_completion_stream: z.boolean(),
   chat_completion: z.boolean(),
   chat_completion_stream: z.boolean(),
   responses: z.boolean(),
   responses_stream: z.boolean(),
   embedding: z.boolean(),
   speech: z.boolean(),
   speech_stream: z.boolean(),
   transcription: z.boolean(),
   transcription_stream: z.boolean(),
-  image_generation: z.boolean(),
-  image_generation_stream: z.boolean(),
+  image_generation: z.boolean().default(false),
+  image_generation_stream: z.boolean().default(false),
   count_tokens: z.boolean(),
   list_models: z.boolean(),
 });
core/schemas/responses.go (1)

540-575: Correctness gap: error message claims image generation output support that isn't implemented.

ResponsesToolMessageOutputStruct error messages (lines 556, 574) claim support for "image generation call output", but the struct has no corresponding field and no marshal/unmarshal branch to handle it. The struct only contains:

  • ResponsesToolCallOutputStr
  • ResponsesFunctionToolCallOutputBlocks
  • ResponsesComputerToolCallOutput

Image generation outputs will fail to serialize/deserialize with a misleading error message.

Add a ResponsesImageGenerationOutput field (using the appropriate image generation output type) and handle it in both MarshalJSON and UnmarshalJSON, following the same pattern as ResponsesComputerToolCallOutput.

core/providers/gemini/speech.go (1)

135-158: Fix unsafe type assertion on context value that will panic when request.Params is nil.

The code assumes BifrostContextKeyResponseFormat is always set in context, but ctx.SetValue() is only called when request.Params != nil (line in gemini.go). When Params is omitted, the unsafe assertion ctx.Value(BifrostContextKeyResponseFormat).(string) panics with "interface {} is nil, not string".

Use a safe type assertion with a default fallback to "wav":

Fix
-            if len(audioData) > 0 {
-                responseFormat := ctx.Value(BifrostContextKeyResponseFormat).(string)
+            if len(audioData) > 0 {
+                responseFormat, _ := ctx.Value(BifrostContextKeyResponseFormat).(string)
+                if responseFormat == "" {
+                    responseFormat = "wav"
+                }
                 // Gemini returns PCM audio (s16le, 24000 Hz, mono)
                 // Convert to WAV for standard playable output format
                 if responseFormat == "wav" {
framework/modelcatalog/main.go (1)

223-236: Add ImageGenerationRequest to pricing lookup.

ImageGenerationRequest is defined in the request types but missing from the pricing lookup loop. The codebase has full pricing support for image generation (see pricing.go which explicitly handles ImageGenerationResponse and normalizes ImageGenerationStreamRequest), so the method should include schemas.ImageGenerationRequest in the iteration at line 223 to avoid pricing lookup failures for image generation models.

transports/bifrost-http/handlers/inference.go (1)

1261-1336: Use hardcoded "error" for image-generation SSE error events instead of schemas.ResponsesStreamResponseTypeError.

The error event type is currently sourced from a Responses API constant, which creates unnecessary coupling. Since BifrostImageGenerationStreamResponse.Type is a plain string field (distinct from the Responses API enum), and there is no dedicated image-generation error type constant, the error event type should be explicitly hardcoded as "error".

Fix
- } else if chunk.BifrostError != nil {
-     eventType = string(schemas.ResponsesStreamResponseTypeError)
- }
+ } else if chunk.BifrostError != nil {
+     eventType = "error"
+ }
core/providers/gemini/types.go (1)

64-88: Unify Imagen request types; current split risks omitting required prompt.

You now have:

  • ImagenInstance{ Prompt string `json:"prompt,omitempty"` } (omitempty + non-pointer)
  • GeminiImagenRequest.Instances *[]struct{ Prompt *string `json:"prompt"` } (required + pointer + anonymous type)

This makes it easy to accidentally serialize an Imagen request without a prompt (or maintain two converters forever). Suggest: use a single ImagenInstance everywhere, and make prompt required (no omitempty) for the outbound Imagen request type.

Concrete direction
 type ImagenInstance struct {
-    Prompt string `json:"prompt,omitempty"`
+    Prompt string `json:"prompt"`
 }

 type GeminiImagenRequest struct {
-    Instances *[]struct {
-        Prompt *string `json:"prompt"`
-    } `json:"instances"`
-    Parameters GeminiImagenParameters `json:"parameters"`
+    Instances  []ImagenInstance       `json:"instances"`
+    Parameters GeminiImagenParameters `json:"parameters,omitempty"`
 }

Also applies to: 90-93, 1702-1727

core/providers/huggingface/huggingface.go (1)

220-265: Unused parameter isHFInferenceImageRequest in completeRequest.

The parameter isHFInferenceImageRequest is added to the function signature but is never used within the function body. This is dead code that adds unnecessary complexity to the API surface.

Either remove the unused parameter or implement the intended differentiation for image requests.

🐛 If no special handling is needed, remove the parameter
-func (provider *HuggingFaceProvider) completeRequest(ctx *schemas.BifrostContext, jsonData []byte, url string, key string, isHFInferenceAudioRequest bool, isHFInferenceImageRequest bool) ([]byte, time.Duration, *schemas.BifrostError) {
+func (provider *HuggingFaceProvider) completeRequest(ctx *schemas.BifrostContext, jsonData []byte, url string, key string, isHFInferenceAudioRequest bool) ([]byte, time.Duration, *schemas.BifrostError) {

And update all call sites accordingly.

🤖 Fix all issues with AI agents
In @core/internal/testutil/test_retry_framework.go:
- Around line 2939-3095: The top comment block references
WithImageGenerationStreamValidationRetry but the actual function is named
WithImageGenerationStreamRetry; fix the mismatch by either renaming the function
to WithImageGenerationStreamValidationRetry (and update all call sites/tests) or
by updating the comment header and any documentation to say
WithImageGenerationStreamRetry to match the function; ensure the chosen name
follows the existing "*StreamValidationRetry" helper naming convention and
update any related references (comments, tests, or other helpers) to remain
consistent.
- Around line 161-165: DefaultImageGenerationRetryConfig currently returns a
TestRetryConfig with []TestRetryCondition while WithImageGenerationRetry expects
ImageGenerationRetryConfig with []ImageGenerationRetryCondition, so default
conditions are never applied; update DefaultImageGenerationRetryConfig to return
an ImageGenerationRetryConfig (or provide a conversion) that copies retry
timings from GetTestRetryConfigForScenario() and leaves Conditions nil/empty per
testutil convention, or alternatively change WithImageGenerationRetry to accept
TestRetryConfig—ensure the types for Conditions match (TestRetryCondition vs
ImageGenerationRetryCondition) and update any factory/constructor usages (e.g.,
DefaultImageGenerationRetryConfig, WithImageGenerationRetry,
ImageGenerationRetryConfig, TestRetryConfig, GetTestRetryConfigForScenario) so
default conditions are preserved and consistent across the codebase.

In @core/providers/gemini/gemini.go:
- Around line 1673-1749: The handleImagenImageGeneration function currently
hardcodes baseURL to
"https://generativelanguage.googleapis.com/v1beta/models/%s:predict", which
ignores provider.networkConfig.BaseURL; change baseURL construction to prefer
provider.networkConfig.BaseURL (falling back to the hardcoded default) and
append the path "/v1beta/models/{model}:predict" using request.Model so
custom/proxy/regional endpoints are honored; update any tests or callers that
assume the old constant and ensure request URI still uses
req.SetRequestURI(baseURL) as before.
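The base-URL preference can be isolated into a small helper. A sketch under assumed names (`imagenPredictURL` and the default host are illustrative; only the fallback-then-append shape is the point):

```go
package main

import (
	"fmt"
	"strings"
)

const defaultGeminiBaseURL = "https://generativelanguage.googleapis.com"

// imagenPredictURL prefers a configured base URL (custom/proxy/regional
// endpoint) and falls back to the public default, then appends the
// Imagen :predict path for the requested model.
func imagenPredictURL(configuredBaseURL, model string) string {
	base := strings.TrimSuffix(configuredBaseURL, "/")
	if base == "" {
		base = defaultGeminiBaseURL
	}
	return fmt.Sprintf("%s/v1beta/models/%s:predict", base, model)
}

func main() {
	fmt.Println(imagenPredictURL("", "imagen-4.0-generate-001"))
	fmt.Println(imagenPredictURL("https://proxy.internal/gemini/", "imagen-4.0-generate-001"))
}
```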

In @core/providers/gemini/images.go:
- Around line 253-351: In ToImagenImageGenerationRequest, guard against empty
prompts by checking bifrostReq.Input.Prompt before building instances and return
nil when the prompt is empty or all-whitespace; specifically, after the nil
checks for bifrostReq and bifrostReq.Input, retrieve the prompt
(bifrostReq.Input.Prompt) and if strings.TrimSpace(prompt) == "" then return nil
instead of constructing the instances/GeminiImagenRequest, ensuring you import
strings and keep all other mapping logic unchanged.
- Around line 90-121: In convertImagenFormatToSize, normalize the imageSize
input to lower-case and add a "4k" case so values like "1K"/"2K"/"4K" are
handled correctly; update the switch on *imageSize in the
convertImagenFormatToSize function to use strings.ToLower(*imageSize) and
include cases for "1k" => 1024, "2k" => 2048, and "4k" => 4096 so the baseSize
calculation produces correct dimensions for subsequent aspectRatio handling.
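Both images.go fixes are small guards. A sketch of the size normalization (the function name and fallback value are assumptions; the real code feeds baseSize into aspect-ratio math afterward):

```go
package main

import (
	"fmt"
	"strings"
)

// baseSizeFor normalizes the case-insensitive size keyword ("1K"/"2k"/"4K")
// to a pixel base, as suggested for convertImagenFormatToSize.
func baseSizeFor(imageSize string) int {
	switch strings.ToLower(imageSize) {
	case "1k":
		return 1024
	case "2k":
		return 2048
	case "4k": // previously unhandled
		return 4096
	default:
		return 1024 // assumed fallback
	}
}

// validPrompt mirrors the suggested empty-prompt guard for
// ToImagenImageGenerationRequest: reject empty or all-whitespace input.
func validPrompt(prompt string) bool {
	return strings.TrimSpace(prompt) != ""
}

func main() {
	fmt.Println(baseSizeFor("4K"), validPrompt("   "))
}
```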

In @core/providers/huggingface/images.go:
- Around line 14-154: The Nebius mapping has two fixes: change
NebiusImageGenerationRequest.GuidanceScale from *int to *float64 in
core/providers/nebius/types.go, then in ToHuggingFaceImageGenerationRequest
update the extraction call from schemas.SafeExtractIntPointer(...) to
schemas.SafeExtractFloat64Pointer(...) and assign the returned *float64 to
req.GuidanceScale; also add prompt validation in the nebius case by checking
strings.TrimSpace(bifrostReq.Input.Prompt) == "" and returning an error if empty
(keep references to NebiusImageGenerationRequest,
ToHuggingFaceImageGenerationRequest, GuidanceScale, SafeExtractFloat64Pointer,
and bifrostReq.Input.Prompt).

In @core/providers/huggingface/types.go:
- Around line 352-440: The Fal‑AI and Together image structs are incomplete and
need schema alignment and tests: add the missing optional fields to
HuggingFaceFalAIImageGenerationRequest (negative_prompt, guidance_scale,
num_inference_steps, seed, output_format, sync_mode, enable_safety_checker and
any others mirrored from HuggingFaceFalAIImageStreamRequest), update
HuggingFaceFalAIImageGenerationResponse to match the API envelope (e.g.,
top‑level metadata like requestId/status/createdAt and a data array wrapping
Images/FalAIImage) instead of the flattened struct, and modify
HuggingFaceTogetherImageGenerationRequest to accept explicit width/height (or
both width/height and size) and a steps parameter; then add focused
unmarshalling unit tests that feed representative mocked JSON payloads for each
of HuggingFaceFalAIImageGenerationRequest,
HuggingFaceFalAIImageGenerationResponse, HuggingFaceFalAIImageStreamResponse,
and HuggingFaceTogetherImageGenerationRequest to validate decoding and catch
schema drift.

In @core/providers/nebius/types.go:
- Around line 11-24: The GuidanceScale field on the NebiusImageGenerationRequest
struct is typed as *int but must accept decimals per Nebius API; change the
GuidanceScale field declaration in NebiusImageGenerationRequest from *int to
*float64 (retain the json tag `guidance_scale,omitempty`) and update any
construction, unmarshalling, validation, or usages that assume an integer (e.g.,
code that sets or reads GuidanceScale) to handle *float64 values instead.

In @core/providers/openai/images.go:
- Around line 35-39: The filterXAISpecificParameters function currently only
clears Quality, Style, and Size but must also remove all XAI-unsupported image
params so they aren't sent to the Grok API; update filterXAISpecificParameters
to set Background, Moderation, PartialImages, OutputCompression, OutputFormat,
Seed, NegativePrompt, and NumInferenceSteps on req.ImageGenerationParameters to
nil (in addition to the existing Quality, Style, Size) so only model, prompt, n,
and response_format remain.
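A sketch of the expanded filter, using a stand-in struct (field names follow the review's list; the real `schemas.ImageGenerationParameters` shape is assumed):

```go
package main

import "fmt"

// imageGenParams is a stand-in for schemas.ImageGenerationParameters.
type imageGenParams struct {
	Quality, Style, Size             *string
	Background, Moderation           *string
	PartialImages, OutputCompression *int
	OutputFormat, NegativePrompt     *string
	Seed, NumInferenceSteps          *int
}

// filterXAISpecificParameters clears every field the Grok image API does
// not accept, leaving only model/prompt/n/response_format on the request.
func filterXAISpecificParameters(p *imageGenParams) {
	p.Quality, p.Style, p.Size = nil, nil, nil
	p.Background, p.Moderation = nil, nil
	p.PartialImages, p.OutputCompression = nil, nil
	p.OutputFormat, p.NegativePrompt = nil, nil
	p.Seed, p.NumInferenceSteps = nil, nil
}

func main() {
	q := "hd"
	p := &imageGenParams{Quality: &q}
	filterXAISpecificParameters(p)
	fmt.Println(p.Quality == nil)
}
```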

In @core/providers/openai/openai.go:
- Around line 2493-2874: HandleOpenAIImageGenerationStreaming currently returns
on the first ImageGenerationCompleted event, which closes the stream and drops
images when n>1; modify the loop so that when isCompleted is true you only exit
the goroutine after all images have finished (e.g., after removing the completed
index from incompleteImages check that len(incompleteImages) == 0 before
returning), ensuring you continue processing interleaved partial chunks until
every image is completed; update the return logic in the loop (inside
HandleOpenAIImageGenerationStreaming) to only return when no incomplete images
remain rather than on the first completed event.
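The loop shape being asked for can be sketched independently of the SSE plumbing: track the set of incomplete image indices and only exit once it is empty. Event and function names below are illustrative, not the provider's actual types:

```go
package main

import "fmt"

// event is a stand-in for a parsed image-generation stream event.
type event struct {
	index     int
	completed bool
}

// drainUntilAllComplete processes events until every one of the n images
// has completed, rather than returning on the first completion event.
func drainUntilAllComplete(events []event, n int) int {
	incomplete := make(map[int]struct{}, n)
	for i := 0; i < n; i++ {
		incomplete[i] = struct{}{}
	}
	processed := 0
	for _, ev := range events {
		processed++
		if ev.completed {
			delete(incomplete, ev.index)
			if len(incomplete) == 0 {
				break // all images finished; safe to close the stream
			}
		}
	}
	return processed
}

func main() {
	events := []event{
		{0, false}, {1, false},
		{0, true}, // first completion must NOT end the stream when n > 1
		{1, false}, {1, true},
	}
	fmt.Println(drainUntilAllComplete(events, 2))
}
```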

In @core/providers/xai/xai.go:
- Around line 230-244: The doc comment above the ImageGeneration function
contains a grammar issue; update the comment string to read cleanly (e.g.,
"ImageGeneration performs an image generation request to the xAI API.") so it
correctly uses "an image" and matches the function name and purpose; edit the
comment immediately above the ImageGeneration method in xai.go and keep the rest
of the implementation unchanged.

In @docs/openapi/schemas/inference/images.yaml:
- Around line 1-242: The OpenAPI schema enums and token-detail fields are out of
sync with the implementation: add "auto" to the size enum (used by
core/schemas/images.go), add "jpg" as an allowed value alongside "jpeg" for
output_format (providers normalize "jpg"↔"jpeg"), and extend ImageTokenDetails
to include the NImages field (match the implementation's
OutputTokensDetails.NImages naming or use snake_case consistent with other
schemas) so streaming/usage payloads align with
ImageGenerationRequest/ImageGenerationResponse and
ImageGenerationStreamResponse. Ensure the changes touch the size and
output_format definitions in ImageGenerationRequest and ImageGenerationResponse
and update the ImageTokenDetails object to include the NImages token count
field.

In @docs/quickstart/gateway/multimodal.mdx:
- Around line 49-85: In the "Basic Image Generation" section update formatting
by adding a blank line after the closing bash/code block and another blank line
before the "## Audio Understanding" header, and expand the JSON example under
**Response format:** (the Image generation response example) to include the
missing fields: add "output_tokens" in the "usage" object, include optional
ImageGenerationResponseParameters fields ("background", "output_format",
"quality", "size") where applicable, and add an "extra_fields" object at the
top-level response example so the documented response matches the actual
implementation.

In @framework/logstore/tables.go:
- Around line 98-102: Summary: Persisting full base64 image outputs in
ImageGenerationOutput (and similar fields) can be huge and slow queries;
sanitize or separate blobs. Fix: before serializing/storing into the struct
fields ImageGenerationOutput (and other image-related log fields), strip or
redact any b64_json/base64 fields (or replace with metadata/URLs and
revised_prompt) by adding a sanitizer helper (e.g., sanitizeImageGenOutput or
ImageGenerationOutputParsedSanitize) that removes or truncates large base64
blobs; call this helper wherever image outputs are marshaled for DB persistence
(the code paths that populate ImageGenerationOutput, ImageGenerationOutputParsed
and analogous fields around lines noted); alternatively implement storing the
raw blob in a separate blob store with retention controls and save only a
reference/metadata in ImageGenerationOutput. Ensure tests cover that stored JSON
no longer contains b64_json and that existing URL/metadata fields remain.

In @framework/streaming/images.go:
- Around line 18-35: The loop accesses chunks[i].Delta.Type and other Delta
fields without ensuring Delta is non-nil, which can panic; update the condition
to first check chunks[i].Delta != nil (e.g., if chunks[i].Delta != nil &&
(chunks[i].Delta.Type == string(openai.ImageGenerationCompleted) ||
chunks[i].FinishReason != nil)) and only then read Delta.ID, Delta.CreatedAt,
Delta.ExtraFields.ModelRequested, Delta.B64JSON, Delta.URL and
Delta.RevisedPrompt when building the BifrostImageGenerationResponse so all
Delta field accesses are guarded; also ensure the loop iterates correctly over
chunks (use for i := range chunks or a classic index loop) so you’re referencing
valid indices.
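The nil guard can be seen in isolation with stand-in types (only the `Delta != nil` check before any field access is the point; type names are illustrative):

```go
package main

import "fmt"

// delta/chunk are stand-ins for the streaming chunk types.
type delta struct {
	Type    string
	B64JSON string
}

type chunk struct {
	Delta        *delta
	FinishReason *string
}

// firstCompletedPayload checks Delta for nil before reading any of its
// fields, so chunks that carry only a finish reason cannot panic.
func firstCompletedPayload(chunks []chunk) (string, bool) {
	for i := range chunks {
		if chunks[i].Delta != nil &&
			(chunks[i].Delta.Type == "image_generation.completed" || chunks[i].FinishReason != nil) {
			return chunks[i].Delta.B64JSON, true
		}
	}
	return "", false
}

func main() {
	stop := "stop"
	chunks := []chunk{
		{Delta: nil, FinishReason: &stop}, // would panic without the nil guard
		{Delta: &delta{Type: "image_generation.completed", B64JSON: "abc"}},
	}
	payload, ok := firstCompletedPayload(chunks)
	fmt.Println(payload, ok)
}
```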

In @framework/streaming/types.go:
- Around line 14-20: In convertAccResultToProcessedStreamResponse in
plugins/maxim/main.go the code fails to detect ImageGenerationOutput and so
falls back to StreamTypeChat; add an else-if branch checking
accResult.ImageGenerationOutput != nil and set streamType =
streaming.StreamTypeImage, and include accResult.ImageGenerationOutput in the
returned AccumulatedData (alongside AudioOutput and TranscriptionOutput) so the
caller receives the image output field.
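The missing branch is a one-line addition to the stream-type dispatch. A sketch with stand-in types (field and return values are illustrative of the `streaming.StreamType*` constants):

```go
package main

import "fmt"

// accumulated is a stand-in for the accumulator result type.
type accumulated struct {
	AudioOutput           *string
	TranscriptionOutput   *string
	ImageGenerationOutput *string
}

// streamTypeFor adds the previously missing image branch so image
// streams no longer fall back to the chat stream type.
func streamTypeFor(acc accumulated) string {
	switch {
	case acc.AudioOutput != nil:
		return "audio"
	case acc.TranscriptionOutput != nil:
		return "transcription"
	case acc.ImageGenerationOutput != nil:
		return "image" // previously fell through to "chat"
	default:
		return "chat"
	}
}

func main() {
	img := "..."
	fmt.Println(streamTypeFor(accumulated{ImageGenerationOutput: &img}))
}
```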

In @plugins/semanticcache/plugin_image_generation_test.go:
- Around line 388-394: The branch that handles the case where duration2 <
duration1 (cache faster) contains an inverted and redundant failure check:
remove the inner if that tests speedup < 0.1 (which would only be true if the
cache were massively slower) and its t.Errorf call; keep only the t.Logf
reporting the speedup. This change affects the test variables duration1,
duration2 and speedup in the streaming cache timing block so locate that block
in plugin_image_generation_test.go and delete the erroneous conditional and
error emission.

In @tests/integrations/python/tests/test_google.py:
- Around line 1740-1748: The test function test_41b_imagen_predict is missing
the @skip_if_no_api_key("google") decorator which other imagen tests (test_41a,
test_41c) use; add the @skip_if_no_api_key("google") decorator directly above
the def test_41b_imagen_predict(...) declaration so the test is skipped when the
Google API key is not configured, keeping behavior consistent with the other
imagen tests.
🧹 Nitpick comments (43)
docs/providers/supported-providers/bedrock.mdx (1)

41-49: Update the Note to include Image Generation or clarify its unsupported status.

The new row correctly documents that Image Generation is unsupported for Bedrock. However, the Note at lines 45-49 only mentions Speech and Transcriptions as unsupported "by the upstream AWS Bedrock API."

Unlike Speech/STT, AWS Bedrock does support image generation through models like Amazon Titan Image Generator and Stable Diffusion. Consider updating the Note to either:

  1. Include Image Generation in the list and adjust the wording (e.g., "not yet implemented" vs. "not supported upstream"), or
  2. Add a separate clarification that Image Generation support for Bedrock is planned for a future release.

This helps users understand whether the limitation is temporary (Bifrost implementation gap) or permanent (upstream API limitation).

📝 Suggested documentation update
 <Note>
-**Unsupported Operations** (❌): Speech (TTS) and Transcriptions (STT) are not supported by the upstream AWS Bedrock API. These return `UnsupportedOperationError`.
+**Unsupported Operations** (❌): Speech (TTS) and Transcriptions (STT) are not supported by the upstream AWS Bedrock API. Image Generation is not yet implemented for Bedrock in Bifrost. These return `UnsupportedOperationError`.

 **Limitations**: Images must be in base64 or data URI format (remote URLs not supported). Text completion streaming is not supported.
 </Note>
docs/providers/supported-providers/overview.mdx (1)

18-38: Minor table formatting inconsistencies.

The image generation columns are correctly documented. Two small spacing issues in the table:

  • Line 28 (Hugging Face): ✅| ✅ — missing space before the pipe after the Images column.
  • Line 30 (Nebius): |❌ — missing space after the pipe near the Files column.

These won't break rendering but create inconsistency in the raw markdown.

🔧 Suggested fix
-| Hugging Face (`huggingface/<model>`) | ✅ | ❌ | ❌ | ✅ | ✅ | ✅ | ✅ | ✅| ✅ | ✅ | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ |
+| Hugging Face (`huggingface/<model>`) | ✅ | ❌ | ❌ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ |
-| Nebius (`nebius/<model>`) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ |❌ | ❌ | ❌ |
+| Nebius (`nebius/<model>`) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
ui/app/workspace/providers/fragments/allowedRequestsFields.tsx (1)

76-78: LGTM!

The new request type entries follow the existing naming convention and are correctly placed in the array.

Optional: The column split (lines 82-83) now creates a 6 vs 9 imbalance. Consider adjusting to slice(0, 8) for a more balanced 8 vs 7 layout:

-	const leftColumn = REQUEST_TYPES.slice(0, 6);
-	const rightColumn = REQUEST_TYPES.slice(6);
+	const leftColumn = REQUEST_TYPES.slice(0, 8);
+	const rightColumn = REQUEST_TYPES.slice(8);
tests/integrations/python/tests/test_google.py (3)

1691-1696: Inconsistent fixture parameter naming.

The parameter _test_config uses an underscore prefix (indicating unused), whereas all other tests in this file use test_config. If the fixture is intentionally unused, consider removing it from the signature entirely for clarity. If it might be used later, use the consistent name test_config.

Suggested fix

Either remove the unused parameter:

-    def test_41a_image_generation_simple(self, _test_config, provider, model):
+    def test_41a_image_generation_simple(self, provider, model):

Or use consistent naming if it may be needed:

-    def test_41a_image_generation_simple(self, _test_config, provider, model):
+    def test_41a_image_generation_simple(self, test_config, provider, model):

Apply the same change to test_41b_imagen_predict and test_41c_image_generation_with_text.


1702-1703: Remove redundant in-function imports.

get_integration_url and get_api_key are already imported at the module level (lines 107 and 102 respectively). These in-function imports are redundant and add unnecessary noise.

Suggested fix

Remove the in-function imports from all three test methods:

-        from .utils.config_loader import get_integration_url, get_config
-        from .utils.common import get_api_key

Also applies to: 1750-1751, 1804-1805


1841-1857: Consider strengthening the multi-modal response assertion.

The test requests both IMAGE and TEXT modalities but only asserts found_text or found_image. Since the test explicitly requests both modalities, consider asserting that an image is returned (the primary expectation), or logging when only text is returned to help debug potential issues.

Suggested enhancement
         # At least one of text or image should be present
         assert found_text or found_image, "Response should contain text or image"
         
+        # Log what was received for debugging
+        if not found_image:
+            print("⚠ Multi-modal request returned text only (no image)")
         if found_image:
             print("✓ Multi-modal response with image generated successfully")
+        if found_text and found_image:
+            print("✓ Multi-modal response with both text and image generated successfully")
tests/integrations/python/config.yml (1)

490-717: Consider adding model_capabilities entries for the new image generation models.

For consistency with other specialized models (e.g., tts-1, whisper-1, embedding models), consider adding capability entries for the image generation models (gpt-image-1, gemini-2.5-flash-image, imagen-4.0-generate-001). This would document their capabilities (or lack thereof for chat/tools/vision) and any relevant limits.

Example structure
  # OpenAI Image Generation Models
  "gpt-image-1":
    chat: false
    tools: false
    vision: false
    streaming: false
    speech: false
    transcription: false
    embeddings: false
    image_generation: true
    max_tokens: null
    context_window: null

  # Gemini Image Generation Models  
  "gemini-2.5-flash-image":
    chat: false
    tools: false
    vision: false
    streaming: false
    image_generation: true
    max_tokens: null
    context_window: null

  "imagen-4.0-generate-001":
    chat: false
    tools: false
    vision: false
    streaming: false
    image_generation: true
    max_tokens: null
    context_window: null
tests/integrations/python/tests/test_openai.py (2)

91-93: Unused import: IMAGE_GENERATION_DETAILED_PROMPT

IMAGE_GENERATION_DETAILED_PROMPT is imported but never used in the test file. All image generation tests use IMAGE_GENERATION_SIMPLE_PROMPT instead.

Consider either removing the unused import or adding a test case that exercises the detailed prompt (e.g., a test that validates more complex prompt handling).

Option 1: Remove unused import
-    IMAGE_GENERATION_DETAILED_PROMPT,
-    # Image Generation utilities
-    IMAGE_GENERATION_SIMPLE_PROMPT,
+    # Image Generation utilities
+    IMAGE_GENERATION_SIMPLE_PROMPT,
Option 2: Add a test using detailed prompt
@pytest.mark.parametrize(
    "provider,model,vk_enabled", get_cross_provider_params_with_vk_for_scenario("image_generation")
)
def test_52e_image_generation_detailed_prompt(self, test_config, provider, model, vk_enabled):
    """Test Case 52e: Image generation with detailed prompt"""
    if provider == "_no_providers_" or model == "_no_model_":
        pytest.skip("No providers configured for this scenario")

    client = get_provider_openai_client(provider, vk_enabled=vk_enabled)
    response = client.images.generate(
        model=format_provider_model(provider, model),
        prompt=IMAGE_GENERATION_DETAILED_PROMPT,
        n=1,
        size="1024x1024",
    )

    assert_valid_image_generation_response(response, "openai")
    assert len(response.data) == 1

1240-1256: Add image count assertion for consistency with other tests.

Unlike test_52a, test_52b, and test_52c, this test doesn't assert the expected image count. For consistency and completeness, consider adding the assertion.

Suggested addition
         # Validate response structure
         assert_valid_image_generation_response(response, "openai")
+        
+        # Verify we got exactly 1 image
+        assert len(response.data) == 1, f"Expected 1 image, got {len(response.data)}"
ui/lib/types/logs.ts (2)

142-174: Consider tightening the image data shape (URL vs base64) and avoiding redundant types.

Right now BifrostImageGenerationData / ImageMessageData allow both url and b64_json simultaneously and duplicate fields. Consider a discriminated union (or at least a type-level “at least one of url/b64_json”) to prevent impossible states and reduce drift between the two interfaces. Also: base64 fields can be huge—ensure the UI/log views won’t eagerly render them.


309-314: LogEntry image_generation_input may be missing important request context.

image_generation_input?: { prompt: string } likely won’t be enough to reproduce/debug (e.g., size/quality/output_format/background). If those are available in backend logs, consider mirroring them here (or reusing the request type shape).

core/internal/testutil/test_retry_conditions.go (1)

907-956: Streaming retry condition may false-positive on non-payload chunks.

EmptyImageGenerationCondition assumes ImageGenerationStreamResponse always contains B64JSON or URL. If the stream includes “in_progress / generating / partial_image metadata” events where payload is legitimately empty, this will trigger retries incorrectly. Suggest gating the check on the stream event/type (e.g., only validate payload on “partial_image” / “completed/done” events).

core/providers/gemini/transcription.go (1)

36-41: Silent skip on base64 decode failure may hide data corruption.

When decodeBase64StringToBytes fails, the code silently skips the part without logging or propagating the error. This could mask malformed input data or encoding issues.

Consider logging a warning when decode fails to aid debugging:

Proposed improvement
 			if part.InlineData != nil && strings.HasPrefix(strings.ToLower(part.InlineData.MIMEType), "audio/") {
 				decodedData, err := decodeBase64StringToBytes(part.InlineData.Data)
 				if err != nil {
-					// Skip this part if decoding fails
+					// Log and skip this part if decoding fails
+					// Consider adding a logger parameter or using a package-level logger
 					continue
 				}
docs/providers/supported-providers/openai.mdx (1)

218-273: Image Generation documentation is comprehensive and well-structured.

The section covers all key aspects:

  • Request parameters with appropriate types and constraints
  • Request/response conversion details
  • Streaming event types (image_generation.partial_image, image_generation.completed, error) are consistent with the expected SSE format per repository patterns.

One minor stylistic note from static analysis: lines 244-246 have three successive bullet points starting with the same pattern. Consider varying the sentence structure for readability, though this is optional.

core/providers/gemini/utils.go (1)

1454-1469: Consider simplifying the encoding function.

The implementation is correct but slightly verbose. A more idiomatic approach:

♻️ Optional simplification
 func encodeBytesToBase64String(bytes []byte) string {
-	var base64str string
-
-	if len(bytes) > 0 {
-		// Use standard base64 encoding to match external SDK expectations
-		base64str = base64.StdEncoding.EncodeToString(bytes)
+	if len(bytes) == 0 {
+		return ""
 	}
-
-	return base64str
+	return base64.StdEncoding.EncodeToString(bytes)
 }
framework/logstore/migrations.go (1)

684-748: Consider combining into a single migration.

Both migrations follow the correct pattern with proper HasColumn checks and rollback safety. However, since these columns are part of the same feature (image generation), you could consider combining them into a single migration (e.g., logs_add_image_generation_columns) similar to how migrationAddCostAndCacheDebugColumn combines related columns. This reduces migration table entries and keeps related schema changes atomic.

That said, the current approach is functionally correct and keeps rollbacks granular.

♻️ Optional: Combined migration
-func migrationAddImageGenerationOutputColumn(ctx context.Context, db *gorm.DB) error {
+func migrationAddImageGenerationColumns(ctx context.Context, db *gorm.DB) error {
 	opts := *migrator.DefaultOptions
 	opts.UseTransaction = true
 	m := migrator.New(db, &opts, []*migrator.Migration{{
-		ID: "logs_add_image_generation_output_column",
+		ID: "logs_add_image_generation_columns",
 		Migrate: func(tx *gorm.DB) error {
 			tx = tx.WithContext(ctx)
 			migrator := tx.Migrator()
 			if !migrator.HasColumn(&Log{}, "image_generation_output") {
 				if err := migrator.AddColumn(&Log{}, "image_generation_output"); err != nil {
 					return err
 				}
 			}
+			if !migrator.HasColumn(&Log{}, "image_generation_input") {
+				if err := migrator.AddColumn(&Log{}, "image_generation_input"); err != nil {
+					return err
+				}
+			}
 			return nil
 		},
 		Rollback: func(tx *gorm.DB) error {
 			tx = tx.WithContext(ctx)
 			migrator := tx.Migrator()
 			if migrator.HasColumn(&Log{}, "image_generation_output") {
 				if err := migrator.DropColumn(&Log{}, "image_generation_output"); err != nil {
 					return err
 				}
 			}
+			if migrator.HasColumn(&Log{}, "image_generation_input") {
+				if err := migrator.DropColumn(&Log{}, "image_generation_input"); err != nil {
+					return err
+				}
+			}
 			return nil
 		},
 	}})
 	err := m.Migrate()
 	if err != nil {
-		return fmt.Errorf("error while adding image generation output column: %s", err.Error())
+		return fmt.Errorf("error while adding image generation columns: %s", err.Error())
 	}
 	return nil
 }
core/internal/testutil/validation_presets.go (1)

222-237: Remove leftover instruction comment.

Line 222 contains a leftover instruction comment that should be removed.

🧹 Proposed fix
-// In validation_presets.go - add this function
+// ImageGenerationExpectations returns validation expectations for image generation scenarios
 func ImageGenerationExpectations(minImages int, expectedSize string) ResponseExpectations {
core/schemas/provider.go (1)

352-357: Minor: Fix grammar in doc comments.

The interface method signatures are correct and consistent with other methods.

📝 Proposed fix
-	// ImageGeneration performs a image generation request
+	// ImageGeneration performs an image generation request
 	ImageGeneration(ctx *BifrostContext, key Key, request *BifrostImageGenerationRequest) (
 		*BifrostImageGenerationResponse, *BifrostError)
-	// ImageGenerationStream performs a image generation stream request
+	// ImageGenerationStream performs an image generation stream request
 	ImageGenerationStream(ctx *BifrostContext, postHookRunner PostHookRunner, key Key,
 		request *BifrostImageGenerationRequest) (chan *BifrostStream, *BifrostError)
docs/providers/supported-providers/azure.mdx (1)

364-429: LGTM! Comprehensive documentation for Azure Image Generation.

The section covers all essential aspects:

  • Parameter mapping with deployment ID handling
  • Gateway and Go SDK examples
  • Response conversion and streaming behavior

Minor formatting note: Line 429 is missing a newline before the --- separator.

📝 Proposed fix for formatting
 Image generation streaming is supported and uses OpenAI's streaming format with Server-Sent Events (SSE).
+
 ---
core/providers/utils/utils.go (1)

959-966: Consider adding ImageGenerationStreamResponse to ProcessAndSendBifrostError.

The ProcessAndSendResponse function (line 914) was updated to propagate ImageGenerationStreamResponse, but ProcessAndSendBifrostError (lines 959-966) was not updated with the same field. This could lead to inconsistent behavior if an image generation stream encounters an error that needs to include partial response data.

💡 Suggested addition for consistency
 	streamResponse := &schemas.BifrostStream{}
 	if processedResponse != nil {
 		streamResponse.BifrostTextCompletionResponse = processedResponse.TextCompletionResponse
 		streamResponse.BifrostChatResponse = processedResponse.ChatResponse
 		streamResponse.BifrostResponsesStreamResponse = processedResponse.ResponsesStreamResponse
 		streamResponse.BifrostSpeechStreamResponse = processedResponse.SpeechStreamResponse
 		streamResponse.BifrostTranscriptionStreamResponse = processedResponse.TranscriptionStreamResponse
+		streamResponse.BifrostImageGenerationStreamResponse = processedResponse.ImageGenerationStreamResponse
 	}
core/internal/testutil/account.go (1)

360-368: Missing: Nebius provider test config lacks image generation settings.

The Nebius provider block doesn't include ImageGenerationModel or enable ImageGeneration scenario, but core/providers/nebius/nebius.go implements the ImageGeneration method. Consider adding test coverage.

Suggested addition for Nebius provider config

You may need to add a Nebius entry to AllProviderConfigs with image generation settings, or update the existing Nebius key configuration with an ImageGenerationModel and enable the ImageGeneration scenario flag.

core/providers/azure/azure.go (3)

1212-1252: Use getModelDeployment helper for consistency.

Other methods in this file (e.g., TextCompletion, ChatCompletion, Embedding, Speech) use provider.getModelDeployment(key, request.Model) which provides a more descriptive error message. The direct map access here works but is inconsistent with the established pattern.

♻️ Suggested refactor for consistency
 func (provider *AzureProvider) ImageGeneration(ctx *schemas.BifrostContext, key schemas.Key,
 	request *schemas.BifrostImageGenerationRequest) (*schemas.BifrostImageGenerationResponse, *schemas.BifrostError) {
 	// Validate api key configs
 	if err := provider.validateKeyConfig(key); err != nil {
 		return nil, err
 	}

-	deployment := key.AzureKeyConfig.Deployments[request.Model]
-	if deployment == "" {
-		return nil, providerUtils.NewConfigurationError(fmt.Sprintf("deployment not found for model %s", request.Model), provider.GetProviderKey())
+	deployment, err := provider.getModelDeployment(key, request.Model)
+	if err != nil {
+		return nil, err
 	}

1269-1269: Remove stray empty comment.

Line 1269 has an empty comment (//) that appears to be leftover and should be removed.

🧹 Remove empty comment
-	//
 	deployment := key.AzureKeyConfig.Deployments[request.Model]

1257-1304: Use getModelDeployment helper for consistency.

Same issue as in ImageGeneration - use the helper method for consistency with other provider methods.

Also, the method signature has an extra blank line at line 1263 which could be cleaned up.

♻️ Suggested refactor
 func (provider *AzureProvider) ImageGenerationStream(
 	ctx *schemas.BifrostContext,
 	postHookRunner schemas.PostHookRunner,
 	key schemas.Key,
 	request *schemas.BifrostImageGenerationRequest,
 ) (chan *schemas.BifrostStream, *schemas.BifrostError) {
-
 	// Validate api key configs
 	if err := provider.validateKeyConfig(key); err != nil {
 		return nil, err
 	}

-	//
-	deployment := key.AzureKeyConfig.Deployments[request.Model]
-	if deployment == "" {
-		return nil, providerUtils.NewConfigurationError(fmt.Sprintf("deployment not found for model %s", request.Model), provider.GetProviderKey())
+	deployment, err := provider.getModelDeployment(key, request.Model)
+	if err != nil {
+		return nil, err
 	}
core/internal/testutil/response_validation.go (1)

236-238: Minor formatting inconsistency.

The extra blank lines and brace placement at lines 236-238 differ slightly from the other validator functions (e.g., ValidateSpeechResponse ends at line 235 without extra blank lines before the closing brace).

🧹 Clean up formatting
 	logValidationResults(t, result, scenarioName)
-
-	return result
-
+	return result
 }
core/providers/nebius/types.go (1)

26-29: Consider Go naming convention for Url field.

Go convention for acronyms is to use all caps (URL instead of Url). This is a minor style issue.

🧹 Naming convention fix
 type NebiusLora struct {
-	Url   string `json:"url"`
+	URL   string `json:"url"`
 	Scale int    `json:"scale"`
 }
framework/modelcatalog/pricing.go (1)

282-319: Per-image pricing tier check uses zero value for tier determination.

When tokens are all zero (the condition at line 282), imageTotalTokens will be 0, making the isAbove128k check at line 292 always false. This is likely intentional (per-image pricing doesn't use token tiers), but the variable and check are misleading. Consider removing or clarifying.

♻️ Suggested simplification
 		// Use per-image pricing when tokens are nil/zero
 		// Extract number of images from ImageTokenDetails if available
 		numImages := 1
 		if imageUsage.OutputTokensDetails != nil && imageUsage.OutputTokensDetails.NImages > 0 {
 			numImages = imageUsage.OutputTokensDetails.NImages
 		} else if imageUsage.InputTokensDetails != nil && imageUsage.InputTokensDetails.NImages > 0 {
 			numImages = imageUsage.InputTokensDetails.NImages
 		}

-		isAbove128k := imageTotalTokens > TokenTierAbove128K
-
 		var inputPerImageRate, outputPerImageRate *float64
-		if isAbove128k {
-			inputPerImageRate = pricing.InputCostPerImageAbove128kTokens
-			// Note: OutputCostPerImageAbove128kTokens may not exist in TableModelPricing
-			// For now, use regular OutputCostPerImage even above 128k
-		} else {
-			inputPerImageRate = pricing.InputCostPerImage
-		}
+		// Per-image pricing doesn't use token tiers (tokens are zero in this path)
+		inputPerImageRate = pricing.InputCostPerImage
 		// Use OutputCostPerImage if available
 		outputPerImageRate = pricing.OutputCostPerImage
docs/providers/supported-providers/gemini.mdx (1)

560-576: Consider varying sentence structure for readability.

The "Request Conversion" subsections have repeated sentence patterns starting with bold terms. While functional, varying the structure slightly would improve readability.

plugins/semanticcache/utils.go (1)

382-382: Mixed JSON marshaling libraries.

Line 382 uses sonic.Marshal while other places in this file use json.Marshal (e.g., lines 119, 348, 746, 835). Consider using sonic consistently throughout for performance, or document the rationale for mixing.

♻️ Option: Standardize on sonic for consistency

If performance is the goal, consider replacing json.Marshal calls with sonic.Marshal throughout the file for consistency. Alternatively, if json.Marshal is preferred for stability, revert line 382 to use json.Marshal.

plugins/semanticcache/plugin_image_generation_test.go (1)

38-41: Redundant return after t.Skipf.

t.Skipf marks the test as skipped and stops the test goroutine (it calls runtime.Goexit() internally), so the return statement immediately after it is unreachable. The pattern t.Skipf(...); return appears multiple times in the file; removing the redundant return makes the code cleaner.

♻️ Simplified pattern
 	if err1 != nil {
-		t.Skipf("First image generation request failed (may be rate limited): %v", err1)
-		return
+		t.Skipf("First image generation request failed (may be rate limited): %v", err1)
 	}

framework/streaming/images.go (1)

270-274: Remove empty/dead code block.

This comment block with no actionable code appears to be leftover from development.

♻️ Proposed cleanup
-		// Extract usage if available
-		if result.ImageGenerationStreamResponse.Usage != nil {
-			// Note: ImageUsage doesn't directly map to BifrostLLMUsage, but we can store it
-			// The actual usage will be extracted in processAccumulatedImageStreamingChunks
-		}
core/providers/openai/types.go (1)

549-562: Consider making Prompt a *string for parity with other request types (if you need “omitted vs empty”).
Currently Prompt is a required string. If upstream code ever needs to distinguish “missing” from “empty”, this won’t be possible. If not needed, current approach is fine.

core/internal/testutil/image_generation.go (2)

57-130: Avoid image decode flakes: request an explicit output format (or skip dimension checks when format isn’t supported).
Right now the test only registers image/jpeg and image/png, but providers may return webp (or other) in b64_json. Consider forcing OutputFormat (and/or ResponseFormat) in Params, or making the dimension assertion conditional on successful image.DecodeConfig.


228-238: Remove provider-specific constant from shared testutil; use canonical string instead.

The testutil imports core/providers/openai and uses openai.ImageGenerationCompleted at line 232, but this module is used by multiple providers (Azure, Gemini, Vertex, Nebius, XAI). Other providers don't define their own ImageGenerationCompleted constant, creating an avoidable coupling. Since all providers normalize to the canonical "image_generation.completed" string, replace the provider-specific constant:

Proposed diff
-						if imgResp.Type == string(openai.ImageGenerationCompleted) {
+						if imgResp.Type == "image_generation.completed" {

This also allows removing the core/providers/openai import from testutil.

core/providers/huggingface/types.go (1)

151-163: Make HuggingFaceErrorDetail.Loc optional for resilience.
FastAPI error shapes can vary; if loc is absent/null, strict json:"loc" still unmarshals (nil slice), but marking it omitempty (and possibly Msg/Type too) makes this struct more tolerant across providers and versions.

framework/streaming/accumulator.go (2)

293-325: Avoid per-chunk string allocations in image de-dup key (hot path).

imageChunkKey() uses fmt.Sprintf, which will allocate for every chunk; this can get expensive for long streams / many images. Prefer a non-alloc key (e.g., uint64 packing or a small struct key).

Proposed refactor (uint64 key)
- ImageChunksSeen            make(map[string]struct{}),
+ ImageChunksSeen            make(map[uint64]struct{}),

- func imageChunkKey(imageIndex, chunkIndex int) string {
- 	return fmt.Sprintf("%d:%d", imageIndex, chunkIndex)
- }
+ func imageChunkKey(imageIndex, chunkIndex int) uint64 {
+ 	return (uint64(uint32(imageIndex)) << 32) | uint64(uint32(chunkIndex))
+ }

- chunkKey := imageChunkKey(chunk.ImageIndex, chunk.ChunkIndex)
- if _, seen := acc.ImageChunksSeen[chunkKey]; !seen {
+ chunkKey := imageChunkKey(chunk.ImageIndex, chunk.ChunkIndex)
+ if _, seen := acc.ImageChunksSeen[chunkKey]; !seen {
  	acc.ImageChunksSeen[chunkKey] = struct{}{}
  	acc.ImageStreamChunks = append(acc.ImageStreamChunks, chunk)
  }

114-132: Nice: pooling reset wrappers reduce reuse bugs; consider consistency for index fields across chunk types.

putImageStreamChunk() resets ChunkIndex/ImageIndex, but the other put*StreamChunk() helpers don't reset their corresponding index fields. Not strictly required, but aligning the reset policy across chunk types reduces the risk of future code reading stale indices after pool reuse.

Also applies to: 469-486

core/providers/gemini/images.go (2)

183-251: labels extraction is likely too strict (map[string]string); consider accepting map[string]any.

labels.(map[string]string) will usually fail if ExtraParams came from JSON (commonly map[string]any). Consider converting entry-by-entry to map[string]string (or supporting both).
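An entry-by-entry conversion that accepts both shapes could look like this sketch (toStringMap is an illustrative helper, not existing code):

```go
package main

import "fmt"

// toStringMap converts a labels value that may arrive either as a strict
// map[string]string or as a JSON-decoded map[string]any. A direct type
// assertion to map[string]string fails for the JSON-decoded case.
func toStringMap(v any) (map[string]string, bool) {
	switch m := v.(type) {
	case map[string]string: // already the strict type
		return m, true
	case map[string]any:
		out := make(map[string]string, len(m))
		for k, val := range m {
			s, ok := val.(string)
			if !ok {
				return nil, false // or skip non-string entries, per policy
			}
			out[k] = s
		}
		return out, true
	default:
		return nil, false
	}
}

func main() {
	labels, ok := toStringMap(map[string]any{"env": "prod", "team": "ml"})
	fmt.Println(ok, labels["env"], labels["team"]) // true prod ml
}
```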


445-473: Imagen RaiFilteredReason is dropped; consider surfacing as an error or metadata.

If predictions are filtered, returning base64 data may be empty and you’ll return “success” with unusable output. Consider mapping RaiFilteredReason into a Bifrost error (or at least ExtraParams / debug metadata) so clients can distinguish “model refused” vs “transport failed”.

core/internal/testutil/test_retry_framework.go (1)

1249-1252: Good: scenario mapping is wired; ensure stream scenario uses the correct retry wrapper.

You map "ImageGenerationStream" to DefaultImageGenerationRetryConfig() (generic). Please confirm the stream test path uses WithImageGenerationStreamRetry() (generic) rather than the typed non-stream helper, so the intended config type actually matches.

core/providers/huggingface/huggingface.go (3)

133-163: Consider extracting duplicate body modification logic.

The JSON body modification code (parsing, setting model field, re-encoding) is duplicated between the initial request path (lines 155-162) and the retry path (lines 187-194). While functional, this could be extracted to a small helper function to reduce duplication.

♻️ Optional refactor to reduce duplication
// Helper function to update model in JSON body
func updateModelInJSONBody(jsonData []byte, modelName string) []byte {
    var reqBody map[string]interface{}
    if err := sonic.Unmarshal(jsonData, &reqBody); err == nil {
        reqBody["model"] = modelName
        if newJSON, err := sonic.Marshal(reqBody); err == nil {
            return newJSON
        }
    }
    return jsonData
}

Then use in both locations:

-       var reqBody map[string]interface{}
-       if err := sonic.Unmarshal(jsonData, &reqBody); err == nil {
-           // For other providers (embeddings, hf-inference images), use validated model ID
-           reqBody["model"] = modelName
-           if newJSON, err := sonic.Marshal(reqBody); err == nil {
-               updatedJSONData = newJSON
-           }
-       }
+       updatedJSONData = updateModelInJSONBody(jsonData, modelName)

1127-1130: Minor: Variable initialization could benefit from explicit nil handling.

The variables lastB64Data, lastURLData, lastJsonData are initialized as empty strings. When used in the final chunk (line 1252), an empty lastJsonData would set RawResponse to an empty string rather than omitting it. Consider checking for non-empty before setting.

♻️ Optional: Only set RawResponse if data exists
 			if sendBackRawResponse {
-				finalChunk.ExtraFields.RawResponse = lastJsonData
+				if lastJsonData != "" {
+					finalChunk.ExtraFields.RawResponse = lastJsonData
+				}
 			}

1231-1274: Minor: Stream end indicator potentially set twice.

When chunkIndex > 0, the stream end indicator is set at line 1254 (for the completion chunk). If scanner.Err() also returns an error, it's set again at line 1272. This is harmless but redundant. Consider restructuring to only set once.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 17

Note

Due to the large number of review comments, Critical, Major severity comments were prioritized as inline comments.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (6)
docs/providers/supported-providers/anthropic.mdx (1)

26-33: Update the Note to include Image Generation.

The Image Generation row was added to the Supported Operations table showing it's unsupported (❌), but the Note on line 32 still only lists "Embeddings, Speech, and Transcriptions" as unsupported operations. For consistency, Image Generation should be added to the Note.

Suggested fix
 <Note>
-**Unsupported Operations** (❌): Embeddings, Speech, and Transcriptions are not supported by the upstream Anthropic API. These return `UnsupportedOperationError`.
+**Unsupported Operations** (❌): Embeddings, Image Generation, Speech, and Transcriptions are not supported by the upstream Anthropic API. These return `UnsupportedOperationError`.
 </Note>
transports/bifrost-http/integrations/genai.go (1)

36-65: Fix mutual exclusivity of IsEmbedding vs IsImageGeneration (current order misroutes).
Right now a :predict request can end up with both flags true, and RequestConverter will pick EmbeddingRequest first (Line 42-46), even if the client asked for IMAGE modality (Line 426-429).

Proposed fix (ensure flags don’t conflict, and only force embedding when it’s clearly embedding)
 func extractAndSetModelFromURL(ctx *fasthttp.RequestCtx, bifrostCtx *schemas.BifrostContext, req interface{}) error {
@@
-	// Determine if :predict is for image generation (Imagen) or embedding
-	// Imagen models use :predict for image generation
-	isImagenPredict := isPredict && schemas.IsImagenModel(modelStr)
-	if isPredict && !isImagenPredict {
-		// :predict for non-Imagen models is embedding
-		isEmbedding = true
-	}
+	// Determine if :predict is for image generation (Imagen) or embedding
+	// NOTE: Do not force embedding if the request clearly asks for IMAGE output.
+	isImagenPredict := isPredict && schemas.IsImagenModel(modelStr)

@@
 	case *gemini.GeminiGenerationRequest:
 		r.Model = modelStr
 		r.Stream = isStreaming
-		r.IsEmbedding = isEmbedding
 		r.IsCountTokens = isCountTokens
@@
-		// Detect if this is an image generation request
-		// isImagenPredict takes precedence for :predict endpoints
-		r.IsImageGeneration = isImagenPredict || isImageGenerationRequest(r)
+		// Detect image generation intent first
+		r.IsImageGeneration = isImagenPredict || isImageGenerationRequest(r)
+
+		// :predict for non-Imagen defaults to embedding, but only if it's not image generation
+		r.IsEmbedding = isEmbedding || (isPredict && !isImagenPredict && !r.IsImageGeneration)

 		return nil
 	}

Also applies to: 370-434

plugins/logging/operations.go (1)

78-191: Fix raw_response gating precedence (can write "null" when RawResponse is nil).
Current condition effectively becomes disableContentLogging==nil || (... && RawResponse!=nil). Likely intended: “content logging enabled AND raw response present”.

Proposed fix
diff --git a/plugins/logging/operations.go b/plugins/logging/operations.go
@@
-	if p.disableContentLogging == nil || !*p.disableContentLogging && data.RawResponse != nil {
+	if (p.disableContentLogging == nil || !*p.disableContentLogging) && data.RawResponse != nil {
 		rawResponseBytes, err := sonic.Marshal(data.RawResponse)
 		if err != nil {
 			p.logger.Error("failed to marshal raw response: %v", err)
framework/modelcatalog/main.go (1)

218-236: Consider including schemas.ImageGenerationRequest in GetPricingEntryForModel lookup.
With image-generation pricing added, this function may fail to return pricing for image-only models (or models whose pricing is stored under the image-generation mode).

Proposed fix
 	for _, mode := range []schemas.RequestType{
 		schemas.TextCompletionRequest,
 		schemas.ChatCompletionRequest,
 		schemas.ResponsesRequest,
 		schemas.EmbeddingRequest,
 		schemas.SpeechRequest,
 		schemas.TranscriptionRequest,
+		schemas.ImageGenerationRequest,
 	} {
core/providers/gemini/responses.go (1)

1597-1661: Add defensive prefix check to image and audio handlers matching the file handler pattern.

The file handler (line 1652) defensively checks if !strings.HasPrefix(fileDataURL, "data:") before wrapping as data URL, but image (line 1616) and audio (line 1627) handlers do not. While Blob.Data is documented as "Base64-encoded bytes," apply the same defensive check consistently:

For images (line 1616):

if !strings.HasPrefix(blob.Data, "data:") {
    imageURL = fmt.Sprintf("data:%s;base64,%s", mimeType, blob.Data)
} else {
    imageURL = blob.Data
}

For audio (line 1627), either apply the same guard or add a comment explaining why it's safe to skip.

transports/bifrost-http/handlers/inference.go (1)

1261-1333: Fix SSE termination + avoid logging full chunks (can emit wrong [DONE] / leak huge base64).

  • includeEventType/skipDoneMarker are inferred from chunks; if the upstream closes without emitting any chunks, the handler will still send data: [DONE], which is incorrect for Responses/Image streams.
  • Logging chunk: %v on marshal errors can dump large image payloads / sensitive fields and cause log spikes.
Proposed fix (make protocol explicit + truncate logs)
-func (h *CompletionHandler) handleStreamingResponse(ctx *fasthttp.RequestCtx, getStream func() (chan *schemas.BifrostStream, *schemas.BifrostError), cancel context.CancelFunc) {
+type sseMode int
+
+const (
+	sseModeLegacyDone sseMode = iota // data: ... + data: [DONE]
+	sseModeTypedEvents               // event: ... + data: ... (no [DONE])
+)
+
+func (h *CompletionHandler) handleStreamingResponse(
+	ctx *fasthttp.RequestCtx,
+	getStream func() (chan *schemas.BifrostStream, *schemas.BifrostError),
+	cancel context.CancelFunc,
+	mode sseMode,
+) {
   // Set SSE headers
   ctx.SetContentType("text/event-stream")
   ctx.Response.Header.Set("Cache-Control", "no-cache")
   ctx.Response.Header.Set("Connection", "keep-alive")
@@
-	var includeEventType bool
+	includeEventType := mode == sseModeTypedEvents
@@
-		var skipDoneMarker bool
-
 		// Process streaming responses
 		for chunk := range stream {
@@
-			includeEventType = false
-			if chunk.BifrostResponsesStreamResponse != nil ||
-				chunk.BifrostImageGenerationStreamResponse != nil ||
-				(chunk.BifrostError != nil && (chunk.BifrostError.ExtraFields.RequestType == schemas.ResponsesStreamRequest || chunk.BifrostError.ExtraFields.RequestType == schemas.ImageGenerationStreamRequest)) {
-				includeEventType = true
-			}
-
-			// Image generation streams don't use [DONE] marker
-			if chunk.BifrostImageGenerationStreamResponse != nil {
-				skipDoneMarker = true
-			}
+			// In typed-events mode, always use event lines (even for errors).
+			// In legacy mode, never use event lines.
@@
 			chunkJSON, err := sonic.Marshal(chunk)
 			if err != nil {
-				logger.Warn(fmt.Sprintf("Failed to marshal streaming response: %v, chunk: %v", err, chunk))
+				// Avoid logging full chunks (may contain large base64 / sensitive content)
+				logger.Warn(fmt.Sprintf("Failed to marshal streaming response: %v (chunk types: resp=%t chat=%t img=%t err=%t)",
+					err,
+					chunk.BifrostResponsesStreamResponse != nil,
+					chunk.BifrostChatResponse != nil,
+					chunk.BifrostImageGenerationStreamResponse != nil,
+					chunk.BifrostError != nil,
+				))
 				continue
 			}
@@
 		}
-
-		if !includeEventType && !skipDoneMarker {
+		if mode == sseModeLegacyDone {
 			// Send the [DONE] marker to indicate the end of the stream (only for non-responses/image-gen APIs)
 			if _, err := fmt.Fprint(w, "data: [DONE]\n\n"); err != nil {
 				logger.Warn(fmt.Sprintf("Failed to write SSE [DONE] marker: %v", err))
 				cancel() // Client disconnected (write error), cancel upstream stream
 				return
 			}
 		}

And update callers:

 h.handleStreamingResponse(ctx, getStream, cancel)
+// text/chat/speech/transcription: legacy [DONE]
+h.handleStreamingResponse(ctx, getStream, cancel, sseModeLegacyDone)

 // responses/image-gen: typed events, no [DONE]
-h.handleStreamingResponse(ctx, getStream, cancel)
+h.handleStreamingResponse(ctx, getStream, cancel, sseModeTypedEvents)

Based on learnings, OpenAI Responses-style streaming should not rely on a [DONE] sentinel.

🤖 Fix all issues with AI agents
In @core/bifrost.go:
- Around line 2411-2416: The helpers are missing ImageGenerationRequest and
ImageGenerationStreamRequest cases; update isModelRequired to treat both
ImageGenerationRequest and ImageGenerationStreamRequest as model-required types,
add cases in the fallback parsing switch that copy Provider and Model into
fallbackReq.ImageGenerationRequest / ImageGenerationStreamRequest (mirroring the
existing text/audio/chat handling), and extend plugins/mocker.getRequestInput to
handle ImageGenerationRequest and ImageGenerationStreamRequest by returning the
appropriate prompt/input string for image generation instead of the empty
default. Ensure you reference the types ImageGenerationRequest and
ImageGenerationStreamRequest and the symbols isModelRequired,
fallbackReq.ImageGenerationRequest/fallbackReq.ImageGenerationStreamRequest,
handleProviderStreamRequest behavior, and getRequestInput when making these
changes.

In @core/providers/gemini/gemini.go:
- Around line 1614-1671: In ImageGeneration, first guard against a nil request
and validate required fields before calling schemas.IsImagenModel: ensure
request != nil, that request.Model is non-empty, and that request.Prompt (or
request.Prompts/whatever the image payload requires) is present; only then call
schemas.IsImagenModel(request.Model) or delegate to handleImagenImageGeneration.
If validation fails, return a proper *schemas.BifrostError (matching the
existing validation/error pattern used elsewhere for image generation) with
ExtraFields set (Provider, ModelRequested, RequestType) so downstream code
cannot panic or send bad requests.

In @core/providers/gemini/images.go:
- Around line 188-256: In ToGeminiImageGenerationRequest, do not propagate
bifrostReq.Params.N into geminiReq.GenerationConfig.CandidateCount because
Gemini image generation always returns a single image; remove or replace the
assignment that sets GenerationConfig.CandidateCount from Params.N and instead
set CandidateCount = 1 (or leave it unset/zero) for image modality. Locate the
conditional that checks bifrostReq.Params.N and change it so it either does
nothing or explicitly clamps/assigns int32(1) to GenerationConfig.CandidateCount
to avoid using Params.N.

In @core/providers/huggingface/images.go:
- Around line 14-154: In ToHuggingFaceImageGenerationRequest validate that
bifrostReq.Input.Prompt is non-nil/empty (trim whitespace and return an error if
empty) before constructing any provider-specific request (apply this check at
the top of the function before the switch), and in the nebius branch normalize
ResponseExtension by using strings.ToLower(*req.ResponseExtension) when
comparing to "jpeg" (i.e., replace the case-sensitive comparison with a
lowercase comparison) so output_format is handled case-insensitively.

In @core/providers/nebius/images.go:
- Around line 62-68: The extracted guidance_scale from
bifrostReq.Params.ExtraParams is currently assigned directly to
req.GuidanceScale without bounds checking; after calling
schemas.SafeExtractIntPointer to get v, validate and enforce it is within 0–100
(either clamp values below 0 to 0 and above 100 to 100 or return a validation
error), then assign the validated value to req.GuidanceScale; update the
handling around bifrostReq.Params.ExtraParams and schemas.SafeExtractIntPointer
so req.GuidanceScale always receives a guaranteed 0–100 integer.
- Around line 74-95: The function ToBifrostImageResponse currently hard-codes
ExtraFields.Provider to schemas.Nebius; change it to respect custom provider
aliases by either turning ToBifrostImageResponse into a method on NebiusProvider
(e.g., func (p *NebiusProvider) ToBifrostImageResponse(...)) or by adding a
providerName parameter (e.g., providerName schemas.ModelProvider) and using that
value for ExtraFields.Provider; ensure you obtain the remapped name from
customProviderConfig lookup when called and populate
schemas.BifrostImageGenerationResponse.ExtraFields.Provider with the
passed/derived provider name instead of schemas.Nebius.
- Around line 12-20: In ToNebiusImageGenerationRequest validate that
bifrostReq.Input.Prompt is not nil/empty or only whitespace (trim and error if
empty), and avoid taking pointers into caller-owned structs by copying values
into the NebiusImageGenerationRequest (e.g., allocate local variables for model
and prompt and assign their addresses or change NebiusImageGenerationRequest to
hold values), update the construction of NebiusImageGenerationRequest to use
those copies, and return an error if prompt validation fails; reference:
function ToNebiusImageGenerationRequest, types NebiusImageGenerationRequest,
bifrostReq, Model, Prompt.

In @core/providers/openai/openai.go:
- Around line 2756-2786: The code incorrectly defaults imageIndex to 0 and
ignores explicit indices; change logic so both partial and completed chunks use
response.PartialImageIndex when it is non-nil, and only fall back to tracked
state when it is nil: in the else branch for partial chunks, if
response.PartialImageIndex != nil use it and mark
incompleteImages[imageIndex]=true; if nil allocate a new unique index (e.g.,
nextUnused or maxKey+1) instead of leaving 0; in the isCompleted branch, first
check if response.PartialImageIndex != nil and use that index (and delete from
incompleteImages), otherwise pick the oldest incomplete as the fallback and log
as before (using logger.Warn when falling back).

In @core/providers/vertex/vertex.go:
- Around line 1420-1644: Add a nil-request guard at the top of
VertexProvider.ImageGeneration to return a configuration/operation error if
request is nil before accessing request.Model, and ensure requestBody is
nil-safe after the conversion step by checking if requestBody != nil before
calling delete(requestBody, "region"); if requestBody can be nil create/return a
suitable error or initialize an empty map. Update references in ImageGeneration
to use these checks so neither request.Model dereference nor delete(requestBody,
"region") can panic.

In @framework/modelcatalog/pricing.go:
- Around line 276-383: The image-token counts are being priced with the generic
token rates; update the token-rate selection to prefer image-token rate fields
when present: create separate inputImageTokenRate and outputImageTokenRate
(checking Pricing fields InputCostPerImageToken,
InputCostPerImageTokenAbove128kTokens, InputCostPerImageTokenAbove200kTokens and
the corresponding OutputCostPerImageToken variants) and fall back to the generic
inputTokenRate/outputTokenRate if image-specific fields are nil; then use
inputTextTokens*inputTokenRate + inputImageTokens*inputImageTokenRate and
outputTextTokens*outputTokenRate + outputImageTokens*outputImageTokenRate when
computing inputCost/outputCost (modify the rate-selection logic around pricing,
and the final cost calculations where inputCost/outputCost are assigned).

In @plugins/semanticcache/stream.go:
- Around line 121-127: The comparison block in the sort comparator only checks
accumulator.Chunks[i].Response.ImageGenerationStreamResponse for nil but then
reads
accumulator.Chunks[j].Response.ImageGenerationStreamResponse.Index/ChunkIndex,
risking a nil deref; update the comparator in stream.go to also verify
accumulator.Chunks[j].Response.ImageGenerationStreamResponse != nil (same
pattern used in other branches) before accessing its fields and handle the
mixed-type case (e.g., treat nil as less/greater or fall through to other
comparisons) so both i and j are safely validated when comparing Index and
ChunkIndex.

In @transports/bifrost-http/integrations/router.go:
- Around line 1290-1291: The switch branch invocations for streaming converters
(TextStreamResponseConverter, ChatStreamResponseConverter,
ResponsesStreamResponseConverter, SpeechStreamResponseConverter,
TranscriptionStreamResponseConverter, ImageGenerationStreamResponseConverter)
can panic if any converter is nil; update the switch in router.go to nil-check
each converter on config.StreamConfig before calling it (e.g., verify
config.StreamConfig.ImageGenerationStreamResponseConverter != nil) and handle
the missing-converter case gracefully (log an error via the router's logger and
return a proper error/skip the chunk instead of calling the nil func).
Alternatively, add a route initialization validation that ensures all required
StreamConfig converters are non-nil and fail fast; reference the existing switch
handling around chunk.BifrostImageGenerationStreamResponse and the StreamConfig
field names to locate the changes.
🟡 Minor comments (15)
docs/providers/supported-providers/huggingface.mdx-19-19 (1)

19-19: Date reference may need updating.

The table header references "as of December 2025" but the current date is January 2026. Consider updating to reflect the current state.

docs/providers/supported-providers/overview.mdx-28-28 (1)

28-28: Minor formatting inconsistency.

The Hugging Face row has ✅| without a space before the pipe on the Images column, while other rows have consistent spacing. This is a minor visual inconsistency.

-| Hugging Face (`huggingface/<model>`) | ✅ | ❌ | ❌ | ✅ | ✅ | ✅ | ✅ | ✅| ✅ | ✅ | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ |
+| Hugging Face (`huggingface/<model>`) | ✅ | ❌ | ❌ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ |
tests/integrations/python/config.yml-186-187 (1)

186-187: Inconsistent environment variable casing.

The nebius API key uses Nebius_API_KEY which has mixed casing. Environment variables are conventionally all uppercase for consistency with other entries (e.g., OPENAI_API_KEY, GEMINI_API_KEY).

   huggingface: "HUGGINGFACE_API_KEY"
-  nebius: "Nebius_API_KEY"
+  nebius: "NEBIUS_API_KEY"
plugins/semanticcache/utils.go-391-412 (1)

391-412: Potential metadata size concern with base64 image data.

Storing full image_b64 data in metadata (line 402-403, 410) could result in very large entries. Base64-encoded images can be several megabytes, which may exceed metadata size limits in some vector stores (e.g., Pinecone has a 40KB metadata limit per vector).

Consider:

  1. Storing only a hash or truncated prefix of the base64 data for cache key matching
  2. Storing image data separately from vector metadata
  3. Documenting the metadata size implications for users configuring semantic cache
tests/integrations/python/tests/test_openai.py-1173-1173 (1)

1173-1173: Fix Ruff ARG002: unused test_config arg.
Rename to _test_config (or remove it) in the 4 new tests.

Proposed minimal change
-def test_52a_image_generation_simple(self, test_config, provider, model, vk_enabled):
+def test_52a_image_generation_simple(self, _test_config, provider, model, vk_enabled):
@@
-def test_52b_image_generation_multiple(self, test_config, provider, model, vk_enabled):
+def test_52b_image_generation_multiple(self, _test_config, provider, model, vk_enabled):
@@
-def test_52c_image_generation_quality(self, test_config, provider, model, vk_enabled):
+def test_52c_image_generation_quality(self, _test_config, provider, model, vk_enabled):
@@
-def test_52d_image_generation_different_sizes(self, test_config, provider, model, vk_enabled):
+def test_52d_image_generation_different_sizes(self, _test_config, provider, model, vk_enabled):

Also applies to: 1195-1195, 1218-1218, 1241-1241

core/internal/testutil/validation_presets.go-222-237 (1)

222-237: Remove the stray “add this function” comment.
Looks like a local note that shouldn’t ship.

Proposed fix
-// In validation_presets.go - add this function
 func ImageGenerationExpectations(minImages int, expectedSize string) ResponseExpectations {
plugins/semanticcache/plugin_image_generation_test.go-110-223 (1)

110-223: Add consistent integration-test guards (Short mode + missing API key) to all tests.
Only TestImageGenerationCacheBasicFunctionality checks testing.Short() and OPENAI_API_KEY. The other tests will run in -short and/or with missing creds, and may “skip” for the wrong reason (e.g., auth failure).

Proposed diff (pattern)
 func TestImageGenerationSemanticSearch(t *testing.T) {
+	if testing.Short() {
+		t.Skip("skipping integration test in -short mode")
+	}
+	if os.Getenv("OPENAI_API_KEY") == "" {
+		t.Skip("OPENAI_API_KEY not set")
+	}
 	// Initialize test with custom threshold
 	config := &Config{

Also applies to: 224-275, 276-401

plugins/semanticcache/plugin_image_generation_test.go-346-350 (1)

346-350: Streaming cache test: chunk-count equality may be flaky.
Cached streams may legitimately have different chunking/segmentation vs live streams (even if content is equivalent). Consider asserting completion + cache hit + non-empty data instead of strict chunk count equality.
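A self-contained sketch of the suggested assertion style, using a hypothetical `chunk` stand-in rather than the plugin's real stream types:

```go
package main

import "fmt"

// chunk is a simplified stand-in for a streamed image chunk
// (hypothetical type; the real test uses the plugin's stream types).
type chunk struct {
	Data     string
	Complete bool
	CacheHit bool
}

// robustAssertions checks completion, cache hit, and non-empty accumulated
// data instead of comparing chunk counts between live and cached streams.
func robustAssertions(chunks []chunk) error {
	if len(chunks) == 0 {
		return fmt.Errorf("no chunks received")
	}
	last := chunks[len(chunks)-1]
	if !last.Complete {
		return fmt.Errorf("stream did not complete")
	}
	if !last.CacheHit {
		return fmt.Errorf("expected cache hit on second stream")
	}
	var data string
	for _, c := range chunks {
		data += c.Data
	}
	if data == "" {
		return fmt.Errorf("empty accumulated image data")
	}
	return nil
}

func main() {
	// A cached stream with different segmentation still passes.
	cached := []chunk{
		{Data: "abc", CacheHit: true},
		{Data: "", Complete: true, CacheHit: true},
	}
	fmt.Println(robustAssertions(cached)) // nil: chunking differences don't fail the test
}
```

These assertions stay stable even if the cache layer re-segments the stream.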

framework/modelcatalog/pricing.go-109-120 (1)

109-120: Normalize stream request types even when usage is missing.
Today normalization to ImageGenerationRequest is gated on imageUsage != nil; if a provider returns no usage (but you still have requestType=stream), you’ll skip any image pricing lookup keyed by normalized type.

Proposed diff
-		if imageUsage != nil && requestType == schemas.ImageGenerationStreamRequest {
+		if requestType == schemas.ImageGenerationStreamRequest {
 			requestType = schemas.ImageGenerationRequest
 		}
core/internal/testutil/image_generation.go-261-287 (1)

261-287: Avoid double-reporting stream errors on failure.
You fold streamErrors into validationErrors, and later append(validationResult.Errors, validationResult.StreamErrors...), so failures print duplicates.

Proposed diff (keep streamErrors separate)
-				// Stream errors should cause the test to fail - convert them to validation errors
-				if len(streamErrors) > 0 {
-					validationErrors = append(validationErrors, fmt.Sprintf("Stream errors encountered: %s", strings.Join(streamErrors, "; ")))
-				}
+				// Stream errors should cause the test to fail, but keep them separate for reporting.
core/internal/testutil/image_generation.go-22-27 (1)

22-27: Add scenario flag gate to non-streaming image generation test.

RunImageGenerationTest currently runs whenever ImageGenerationModel is configured, ignoring the testConfig.Scenarios.ImageGeneration flag. The streaming variant correctly gates on Scenarios.ImageGenerationStream first (line 155). Add the same gate to the non-streaming path to prevent unintended test execution in stacked PR scenarios where tests are selectively enabled via Scenarios flags.

Proposed diff
 func RunImageGenerationTest(t *testing.T, client *bifrost.Bifrost, ctx context.Context, testConfig ComprehensiveTestConfig) {
+	if !testConfig.Scenarios.ImageGeneration {
+		t.Logf("Image generation not enabled for provider %s", testConfig.Provider)
+		return
+	}
 	if testConfig.ImageGenerationModel == "" {
 		t.Logf("Image generation not configured for provider %s", testConfig.Provider)
 		return
 	}
core/providers/openai/images.go-41-69 (1)

41-69: Populate Fallbacks in reverse conversion to achieve round-trip symmetry.
ToBifrostImageGenerationRequest() parses request.Fallbacks (line 67), but ToOpenAIImageGenerationRequest() never populates req.Fallbacks from bifrostReq.Fallbacks, causing data loss on round-trip conversion. Either set fallbacks in the reverse converter or remove it from the forward converter to maintain consistency.
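A minimal sketch of the symmetric conversion, using simplified stand-in structs (hypothetical field subset; the real types live in core/schemas and core/providers/openai):

```go
package main

import "fmt"

// bifrostImageRequest and openAIImageRequest are reduced stand-ins
// for the real request types (hypothetical).
type bifrostImageRequest struct {
	Model     string
	Prompt    string
	Fallbacks []string
}

type openAIImageRequest struct {
	Model     string
	Prompt    string
	Fallbacks []string
}

// toOpenAI copies Fallbacks forward so the round trip is lossless.
func toOpenAI(r *bifrostImageRequest) *openAIImageRequest {
	return &openAIImageRequest{Model: r.Model, Prompt: r.Prompt, Fallbacks: r.Fallbacks}
}

// toBifrost mirrors the existing reverse conversion, which already
// parses Fallbacks from the request.
func toBifrost(r *openAIImageRequest) *bifrostImageRequest {
	return &bifrostImageRequest{Model: r.Model, Prompt: r.Prompt, Fallbacks: r.Fallbacks}
}

func main() {
	in := &bifrostImageRequest{Model: "gpt-image-1", Prompt: "a cat", Fallbacks: []string{"openai/dall-e-3"}}
	out := toBifrost(toOpenAI(in))
	fmt.Println(len(out.Fallbacks)) // fallbacks survive the round trip
}
```

With both directions copying the field, round-trip conversion no longer drops fallback configuration.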

transports/bifrost-http/handlers/inference.go-1417-1489 (1)

1417-1489: Image generation handler: good validation + ExtraParams extraction; minor grammar correction needed.

Change "prompt can not be empty" → "prompt cannot be empty" for consistency with other error messages in the codebase (e.g., nebius.go, logging.go).

The cancel() call when bifrostCtx == nil is safe—ConvertToBifrostContext always returns a valid cancel function alongside the context.

core/internal/testutil/test_retry_framework.go-2939-3095 (1)

2939-3095: Stream retry helper is consistent; minor naming/doc nit.
WithImageGenerationStreamRetry comment references “ValidationRetry” but the function name doesn’t; consider aligning terminology for grep-ability.

core/providers/huggingface/huggingface.go-943-947 (1)

943-947: Misleading comment - refers to "Nebius converter" but function name is generic.

The comment says "Unmarshal response using Nebius converter" but the function is named UnmarshalHuggingFaceImageGenerationResponse. This appears to be a copy-paste error from another provider's implementation.

📝 Fix the misleading comment
-	// Unmarshal response using Nebius converter
+	// Unmarshal response to Bifrost format
 	bifrostResponse, convErr := UnmarshalHuggingFaceImageGenerationResponse(responseBody, request.Model)
🧹 Nitpick comments (30)
docs/providers/supported-providers/sgl.mdx (1)

25-25: Update the Note and Unsupported Features section for consistency.

The table row correctly documents that SGLang does not support image generation. However, for documentation consistency, consider updating:

  1. The Note on line 32 to include "Image Generation" in the list of unsupported operations
  2. The "Unsupported Features" table (lines 105-110) to add an Image Generation row
📝 Suggested updates

Line 32:

-**Unsupported Operations** (❌): Speech, Transcriptions, Files, and Batch are not supported by the upstream SGL API. These return `UnsupportedOperationError`.
+**Unsupported Operations** (❌): Image Generation, Speech, Transcriptions, Files, and Batch are not supported by the upstream SGL API. These return `UnsupportedOperationError`.

Unsupported Features table:

 | Feature | Reason |
 |---------|--------|
+| Image Generation | Not offered by SGL API |
 | Speech/TTS | Not offered by SGL API |
 | Transcription/STT | Not offered by SGL API |
 | Batch Operations | Not offered by SGL API |
 | File Management | Not offered by SGL API |
docs/providers/supported-providers/gemini.mdx (2)

560-576: Consider simplifying the bullet point formatting for readability.

The bullet points in the "Standard Gemini Format" and "Imagen Format" subsections use a bold key-value pattern. While functional, consider using a table format similar to the Parameter Mapping section above for visual consistency.

This is a minor stylistic suggestion - the current format is acceptable.


600-604: Consider simplifying implementation detail reference.

The mention of schemas.IsImagenModel() exposes internal implementation details that may not be meaningful to end users. Consider rephrasing to focus on observable behavior rather than code references.

📝 Suggested documentation improvement
 ## Endpoint Selection

 The provider automatically selects the endpoint based on model name:
-- **Imagen models** (detected via `schemas.IsImagenModel()`): Uses `/v1beta/models/{model}:predict` endpoint
+- **Imagen models** (e.g., `imagen-4.0-generate-001`): Uses `/v1beta/models/{model}:predict` endpoint
 - **Other models**: Uses `/v1beta/models/{model}:generateContent` endpoint with image response modality
ui/app/workspace/providers/fragments/allowedRequestsFields.tsx (1)

82-83: Consider rebalancing column split.

With the two new request types, REQUEST_TYPES now has 15 items, creating a 6/9 column split. For better visual balance, consider adjusting to a 7/8 or 8/7 split:

-	const leftColumn = REQUEST_TYPES.slice(0, 6);
-	const rightColumn = REQUEST_TYPES.slice(6);
+	const leftColumn = REQUEST_TYPES.slice(0, 8);
+	const rightColumn = REQUEST_TYPES.slice(8);
docs/providers/supported-providers/azure.mdx (1)

429-429: Missing newline before horizontal rule.

The horizontal rule --- should have a blank line before it for proper Markdown rendering to avoid it being interpreted as a setext heading.

 Image generation streaming is supported and uses OpenAI's streaming format with Server-Sent Events (SSE).
+
 ---
core/providers/azure/azure.go (1)

1269-1269: Remove orphan comment.

Line 1269 has an empty comment that appears to be leftover from editing.

-	//
 	deployment := key.AzureKeyConfig.Deployments[request.Model]
tests/integrations/python/tests/test_google.py (1)

1739-1741: Consider narrowing the exception type for better error visibility.

The broad except Exception catch can mask unexpected errors (e.g., test logic bugs, network timeouts) that differ from expected "Imagen not available" scenarios. Consider catching more specific exceptions from the Google GenAI SDK (e.g., google.api_core.exceptions.NotFound or similar) to let genuine failures surface.

♻️ Suggested refinement
-        except Exception as e:
-            # Imagen may not be available in all regions or configurations
-            pytest.skip(f"Imagen generation failed: {e}")
+        except (google.api_core.exceptions.NotFound, 
+                google.api_core.exceptions.PermissionDenied,
+                google.api_core.exceptions.InvalidArgument) as e:
+            # Imagen may not be available in all regions or configurations
+            pytest.skip(f"Imagen generation failed: {e}")
plugins/semanticcache/utils.go (1)

11-11: Consider using sonic.Marshal consistently for response storage.

The addSingleResponse function uses sonic.Marshal (line 382) to serialize the response, while other marshaling operations in this file use json.Marshal (lines 119, 348, 746, 835). Since the project uses sonic as the standard serialization library throughout the codebase, consider aligning this usage for consistency.

Note: This is a stylistic concern. The response string is stored but not hashed or compared, so the choice between sonic and json has no functional impact on cache correctness.

core/providers/bedrock/bedrock.go (1)

1347-1355: Use a consistent provider identifier in unsupported-operation errors.
In this file, other unsupported methods pass schemas.Bedrock but ImageGeneration passes provider.GetProviderKey(). Please standardize (either all schemas.Bedrock or all provider.GetProviderKey()), otherwise clients may see inconsistent “provider” values depending on operation.

Proposed adjustment (pick one convention)
 func (provider *BedrockProvider) ImageGeneration(ctx *schemas.BifrostContext, key schemas.Key, request *schemas.BifrostImageGenerationRequest) (*schemas.BifrostImageGenerationResponse, *schemas.BifrostError) {
-	return nil, providerUtils.NewUnsupportedOperationError(schemas.ImageGenerationRequest, provider.GetProviderKey())
+	return nil, providerUtils.NewUnsupportedOperationError(schemas.ImageGenerationRequest, schemas.Bedrock)
 }

 func (provider *BedrockProvider) ImageGenerationStream(ctx *schemas.BifrostContext, postHookRunner schemas.PostHookRunner, key schemas.Key, request *schemas.BifrostImageGenerationRequest) (chan *schemas.BifrostStream, *schemas.BifrostError) {
-	return nil, providerUtils.NewUnsupportedOperationError(schemas.ImageGenerationStreamRequest, provider.GetProviderKey())
+	return nil, providerUtils.NewUnsupportedOperationError(schemas.ImageGenerationStreamRequest, schemas.Bedrock)
 }
transports/bifrost-http/integrations/genai.go (1)

504-522: isImageGenerationRequest heuristic is reasonable; consider tightening the fallback.
The model-name fallback (IsImagenModel(req.Model)) may cause false positives if non-image models include “imagen” in a path/alias. If you’ve got a more authoritative discriminator available (endpoint suffix, request fields, or provider capability map), prefer that over substring matching.
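A sketch of why substring matching over-triggers, and one stricter alternative (hypothetical helper; the real check lives in schemas.IsImagenModel and may differ):

```go
package main

import (
	"fmt"
	"strings"
)

// isImagenModelLoose mirrors a substring heuristic: any model string
// containing "imagen" matches, including unrelated path segments/aliases.
func isImagenModelLoose(model string) bool {
	return strings.Contains(strings.ToLower(model), "imagen")
}

// isImagenModelStrict checks a family prefix on the final path segment
// instead, avoiding false positives from path/alias substrings.
func isImagenModelStrict(model string) bool {
	base := model
	if i := strings.LastIndex(model, "/"); i >= 0 {
		base = model[i+1:]
	}
	return strings.HasPrefix(strings.ToLower(base), "imagen-")
}

func main() {
	fmt.Println(isImagenModelLoose("teams/imagenation/gemini-pro"))  // true: false positive
	fmt.Println(isImagenModelStrict("teams/imagenation/gemini-pro")) // false
	fmt.Println(isImagenModelStrict("imagen-4.0-generate-001"))      // true
}
```

An endpoint suffix or capability map would still be preferable to any name-based check.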

core/providers/huggingface/models.go (1)

57-91: Also recognize image generation from tags (not just PipelineTag).
Many HF models rely on tags; consider adding text-to-image, image-generation, stable-diffusion, etc. to the tags switch so listing doesn’t under-report capabilities.

core/providers/openai/openai_test.go (1)

25-84: Test config change is clear; consider cost/flakiness gating for image generation.
Since image generation can be slow/expensive, it may be worth guarding these scenarios behind a dedicated env flag (in addition to OPENAI_API_KEY) if CI stability/cost is a concern.

ui/app/workspace/logs/views/logDetailsSheet.tsx (1)

75-78: Consider adding image generation support to copyRequestBody.

The "Copy request body" feature doesn't support image generation requests. If this is intentional, consider updating the error message to explicitly mention image generation as unsupported. Otherwise, this could be a follow-up enhancement to support copying image generation request bodies.

core/providers/huggingface/errors.go (1)

12-65: FastAPI detail.loc parsing: consider handling non-float numeric types to avoid dropping path segments.
Today only (string|float64) are included; if loc elements decode as int/json.Number, the location string becomes incomplete.

Proposed tweak
diff --git a/core/providers/huggingface/errors.go b/core/providers/huggingface/errors.go
@@
 				for _, locPart := range detail.Loc {
 					if locStr, ok := locPart.(string); ok {
 						locParts = append(locParts, locStr)
 					} else if locNum, ok := locPart.(float64); ok {
 						locParts = append(locParts, fmt.Sprintf("%.0f", locNum))
+					} else {
+						// Best-effort fallback (covers int/json.Number/etc.)
+						locParts = append(locParts, fmt.Sprintf("%v", locPart))
 					}
 				}
core/internal/testutil/response_validation.go (1)

240-275: Image-generation validation helpers are reasonable; consider also asserting ExtraFields.RequestType when available.
Would catch provider wiring issues early (e.g., response tagged as the wrong request type).

Also applies to: 1092-1166

core/internal/testutil/validation_presets.go (1)

363-370: Make customParams["min_images"] extraction tolerant to non-int numeric types.
If customParams ever comes from JSON/YAML, numbers often arrive as float64 and the current code will always fall back to defaults.

Possible fix
 	case "ImageGeneration":
-		if minImages, ok := customParams["min_images"].(int); ok {
-			if expectedSize, ok := customParams["expected_size"].(string); ok {
-				return ImageGenerationExpectations(minImages, expectedSize)
-			}
-		}
+		if expectedSize, ok := customParams["expected_size"].(string); ok {
+			if minImages, ok := schemas.SafeExtractInt(customParams["min_images"]); ok {
+				return ImageGenerationExpectations(minImages, expectedSize)
+			}
+		}
 		return ImageGenerationExpectations(1, "1024x1024")
ui/app/workspace/logs/views/imageView.tsx (2)

75-81: The isStreaming prop logic is correct but could be clearer.

The expression isStreaming && !currentImage on line 80 will always evaluate to false here since we're inside the currentImage && block. This is actually the intended behavior (ImageMessage should only show streaming placeholder when there's no image data), but the logic is redundant.

Consider simplifying to isStreaming={false} or removing the prop entirely for clarity:

♻️ Suggested simplification
 <ImageMessage 
   image={{
     ...currentImage,
     output_format: imageOutput?.params?.output_format,
   }} 
-  isStreaming={isStreaming && !currentImage} 
+  isStreaming={false}
 />

9-11: Consider importing ImageGenerationInput from shared types.

This interface duplicates the definition from core/schemas/images.go. If there's a corresponding TypeScript type in the shared types (similar to BifrostImageGenerationOutput in ui/lib/types/logs.ts), consider importing it for consistency.

framework/streaming/images.go (1)

270-274: Remove empty/unused code block.

This if-block has only a comment and no actual code. If usage extraction is intended for future implementation, consider adding a TODO comment or removing the block entirely.

♻️ Suggested cleanup
-		// Extract usage if available
-		if result.ImageGenerationStreamResponse.Usage != nil {
-			// Note: ImageUsage doesn't directly map to BifrostLLMUsage, but we can store it
-			// The actual usage will be extracted in processAccumulatedImageStreamingChunks
-		}
+		// Note: Usage (ImageUsage) is stored in chunk.Delta and extracted in processAccumulatedImageStreamingChunks
core/providers/gemini/images.go (2)

11-93: ToBifrostImageGenerationRequest: parameters are only mapped in the Imagen-instances path.
If a request comes via Contents + Parameters, those params get dropped. Consider applying the same request.Parameters mapping after the Contents prompt extraction too.


128-186: Response conversion looks reasonable; consider erroring on “no candidates / no predictions”.
Right now, ToBifrostImageGenerationResponse can return a success with empty Data. If downstream assumes at least one image, returning a structured error would be safer.

Also applies to: 450-478

core/providers/huggingface/images.go (1)

156-231: Stream request builder should assert fal-ai provider to avoid misuse.
Since this always returns a fal-ai stream request type, it’s safer to verify splitIntoModelProvider(bifrostReq.Model) resolves to falAI and otherwise error.

framework/streaming/accumulator.go (1)

293-324: Image chunk dedup uses string keys; consider a cheaper key type.
fmt.Sprintf("%d:%d", ...) allocates per chunk; a uint64 composite key (or a small struct key) avoids that hot-path overhead.
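A minimal sketch of the allocation-free alternative, assuming both indices fit in 32 bits:

```go
package main

import "fmt"

// chunkKey packs two non-negative 32-bit indices into one uint64,
// avoiding the per-chunk string allocation of fmt.Sprintf("%d:%d", ...).
func chunkKey(streamIdx, chunkIdx uint32) uint64 {
	return uint64(streamIdx)<<32 | uint64(chunkIdx)
}

func main() {
	seen := make(map[uint64]bool)
	seen[chunkKey(1, 2)] = true
	fmt.Println(seen[chunkKey(1, 2)]) // true: same composite key
	fmt.Println(seen[chunkKey(2, 1)]) // false: (2,1) is distinct from (1,2)
}
```

A map keyed by a small comparable struct (`struct{ stream, chunk int }`) works equally well and avoids the width assumption.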

core/providers/gemini/types.go (1)

1702-1707: Redundant struct definition - consider reusing ImagenInstance type.

GeminiImagenRequest.Instances uses an inline anonymous struct with *string for Prompt, while ImagenInstance (line 91-93) already exists with string for Prompt. This creates two different representations for the same concept.

Consider consolidating to a single type for consistency and maintainability:

♻️ Suggested consolidation
 type GeminiImagenRequest struct {
-	Instances *[]struct {
-		Prompt *string `json:"prompt"`
-	} `json:"instances"`
+	Instances  *[]ImagenInstance       `json:"instances"`
 	Parameters GeminiImagenParameters `json:"parameters"`
 }

If the pointer semantics are required for the API, update ImagenInstance:

 type ImagenInstance struct {
-	Prompt string `json:"prompt,omitempty"`
+	Prompt *string `json:"prompt,omitempty"`
 }
core/providers/huggingface/huggingface.go (5)

1086-1103: Error typing for timeout vs cancellation looks correct.

The error handling differentiates between context cancellation, timeout, and other errors, and the context.Canceled check already uses errors.Is, so wrapped errors are handled correctly. No change needed.


1123-1129: Variables lastB64Data, lastURLData, lastJsonData are scoped wider than needed.

These variables are initialized up front but only used when chunkIndex > 0. This is fine, but consider moving their declarations inside the loop or the final chunk block to reduce scope.


1261-1274: Scanner error handling occurs after completion chunk is sent.

If scanner.Err() returns an error after successfully processing chunks, both a completion chunk and an error will be sent to the channel. This could confuse consumers expecting either success or failure.

Consider checking for scanner errors before sending the completion chunk:

♻️ Check scanner error before sending completion
+		// Check for scanner errors first
+		if err := scanner.Err(); err != nil {
+			bifrostErr := providerUtils.NewBifrostOperationError(
+				fmt.Sprintf("Error reading fal-ai stream: %v", err),
+				err,
+				providerName,
+			)
+			bifrostErr.ExtraFields = schemas.BifrostErrorExtraFields{
+				Provider:       providerName,
+				ModelRequested: request.Model,
+				RequestType:    schemas.ImageGenerationStreamRequest,
+			}
+			ctx.SetValue(schemas.BifrostContextKeyStreamEndIndicator, true)
+			providerUtils.ProcessAndSendBifrostError(ctx, postHookRunner, bifrostErr, responseChan, logger)
+			return
+		}
+
 		// Stream closed - send completion chunk
 		if chunkIndex > 0 {
 			// ... completion chunk logic
 		}
-
-		if err := scanner.Err(); err != nil {
-			// ... error handling (remove this block)
-		}

1231-1259: Consider sending the stream end indicator only once.

The stream end indicator (BifrostContextKeyStreamEndIndicator) is set in multiple places: on error (lines 1179 and 1272) and on successful completion (line 1254). While this works, it could be cleaner to set it in a single deferred function or just before closing the channel.


250-253: Rename parseHuggingFaceImageError for clarity—it handles all request types, not just images.

The function name is misleading since it's used for chat completions, embeddings, and other request types beyond image generation. While the underlying error structure (FastAPI validation errors in the Detail field) is consistent across HuggingFace inference endpoints, the function should be renamed to reflect its generic purpose, such as parseHuggingFaceError.

Additionally, completeRequest passes nil metadata (line 252) while completeRequestWithModelAliasCache passes proper metadata—consider making this consistent for better error context.

framework/streaming/types.go (1)

360-393: Missing MaxImageChunkIndex tracking for image streaming consistency.

ImageStreamChunk contains TokenUsage, Cost, and other metadata fields, and images.go extracts these from what should be the final chunk. However, unlike Chat, Audio, and Transcription streaming which track Max*ChunkIndex and use getter methods (e.g., getLastChatChunk()), image streaming directly accesses the last array element via acc.ImageStreamChunks[len(acc.ImageStreamChunks)-1]. This approach is fragile for out-of-order chunk arrival and inconsistent with other stream types.

Adding MaxImageChunkIndex to StreamAccumulator and a corresponding getLastImageChunk() method would improve robustness and maintain consistency across all streaming types.
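A reduced sketch of the suggested tracking, using hypothetical stand-in types (the real StreamAccumulator and ImageStreamChunk carry more fields):

```go
package main

import "fmt"

// imageChunk is a reduced stand-in for ImageStreamChunk (hypothetical).
type imageChunk struct {
	Index int
	Usage string
}

// accumulator tracks the highest chunk index seen, so the "last" chunk
// is well-defined even when chunks arrive out of order.
type accumulator struct {
	chunks       []imageChunk
	maxChunkIdx  int
	maxChunkSeen bool
}

func (a *accumulator) add(c imageChunk) {
	a.chunks = append(a.chunks, c)
	if !a.maxChunkSeen || c.Index > a.maxChunkIdx {
		a.maxChunkIdx = c.Index
		a.maxChunkSeen = true
	}
}

// getLastImageChunk returns the chunk with the highest index, not the
// last array element, mirroring getLastChatChunk in other stream types.
func (a *accumulator) getLastImageChunk() *imageChunk {
	for i := range a.chunks {
		if a.chunks[i].Index == a.maxChunkIdx {
			return &a.chunks[i]
		}
	}
	return nil
}

func main() {
	var a accumulator
	a.add(imageChunk{Index: 1, Usage: "final"})
	a.add(imageChunk{Index: 0}) // arrives late
	fmt.Println(a.getLastImageChunk().Usage) // "final", despite out-of-order arrival
}
```

This keeps usage/cost extraction robust without assuming providers deliver chunks in index order.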

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 20

Note

Due to the large number of review comments, Critical and Major severity comments were prioritized as inline comments.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
core/internal/testutil/account.go (1)

1131-1157: Add missing Fallbacks field to XAI provider configuration.

The XAI provider configuration at lines 1131-1157 is missing the Fallbacks field. Every other provider in the file includes a Fallbacks field pointing to OpenAI with "gpt-4o-mini" as the fallback model. XAI should follow the same pattern for consistency.

core/providers/anthropic/anthropic.go (1)

629-699: Don’t emit the final summary chunk after an in-stream error.

When event.ToBifrostChatCompletionStream() returns bifrostErr != nil, the code sends an error and breaks—then proceeds to emit the final usage/summary chunk after the loop. That can confuse clients (error followed by “final” chunk).

Proposed fix
 			response, bifrostErr, isLastChunk := event.ToBifrostChatCompletionStream()
 			if bifrostErr != nil {
 				bifrostErr.ExtraFields = schemas.BifrostErrorExtraFields{
 					RequestType:    schemas.ChatCompletionStreamRequest,
 					Provider:       providerName,
 					ModelRequested: modelName,
 				}
 				ctx.SetValue(schemas.BifrostContextKeyStreamEndIndicator, true)
 				providerUtils.ProcessAndSendBifrostError(ctx, postHookRunner, bifrostErr, responseChan, logger)
-				break
+				return
 			}
@@
 			if isLastChunk {
 				break
 			}
core/providers/huggingface/huggingface.go (1)

220-265: parseHuggingFaceImageError should not be called with nil metadata—request context is lost for all callers.

Despite the misleading name, parseHuggingFaceImageError is actually generic and safe for all endpoints (chat, embedding, speech, transcription, image generation). However, completeRequest calls it with nil for metadata at line 256, which prevents ExtraFields (Provider, ModelRequested, RequestType) from being populated in error responses. This degrades error reporting across all endpoints using completeRequest.

Pass the appropriate request metadata instead of nil, or refactor to construct metadata from the context and request type information available in completeRequest.

🤖 Fix all issues with AI agents
In @core/internal/testutil/response_validation.go:
- Around line 240-275: ValidateImageGenerationResponse calls
validateImageGenerationFields which assumes expectations.ProviderSpecific is
non-nil; add a nil-guard so we never index into a nil map/slice. Update either
ValidateImageGenerationResponse to ensure expectations.ProviderSpecific is
initialized to an empty map/struct before calling validateImageGenerationFields,
or modify validateImageGenerationFields to check if
expectations.ProviderSpecific == nil and handle it safely (return early or use
safe defaults) before any indexing or range operations; reference the
ValidateImageGenerationResponse and validateImageGenerationFields functions and
the expectations.ProviderSpecific field when making the change.
- Around line 1096-1122: In validateImageGenerationFields, avoid panics by first
checking expectations.ProviderSpecific != nil before accessing ["min_images"],
and make parsing robust by using a type switch to accept int, int64, float64,
json.Number (or string that can be parsed to int) to derive minImages as an int;
if conversion fails, skip the min_images check or record an error. Then compare
len(response.Data) to the parsed minImages and update result.Passed,
result.Errors, and result.MetricsCollected["image_count"] as before.

In @core/providers/gemini/gemini.go:
- Around line 1614-1753: ImageGeneration can panic if request == nil and
handleImagenImageGeneration may dereference a nil conversion result; add a nil
guard at the top of ImageGeneration (mirror nebius/openai pattern) that returns
a missing-parameter error when request is nil, and in
handleImagenImageGeneration verify the result of
imagenResponse.ToBifrostImageGenerationResponse() is non-nil before setting
response.ExtraFields; if nil, return an appropriate Bifrost error (e.g., a
provider response decode/operation error) including providerName and
request.Model to avoid panics.

In @core/providers/huggingface/images.go:
- Around line 14-154: In ToHuggingFaceImageGenerationRequest's falAI case, restore
parity with ToHuggingFaceImageStreamRequest by mapping missing fields from
bifrostReq.Params and bifrostReq.Params.ExtraParams into the
HuggingFaceFalAIImageGenerationRequest: copy OutputFormat -> ResponseFormat,
Seed -> Seed, NegativePrompt -> NegativePrompt, and extract NumInferenceSteps,
GuidanceScale (and other numeric extra params) using
schemas.SafeExtractIntPointer into NumInferenceSteps/GuidanceScale (or the
fal-ai equivalents), and normalize image size/format handling similar to the
stream converter; update the falAI branch in ToHuggingFaceImageGenerationRequest
(type HuggingFaceFalAIImageGenerationRequest) to include these mappings
consistent with ToHuggingFaceImageStreamRequest.

In @core/providers/openai/openai.go:
- Around line 2382-2491: Add the same nil/request validation used by
ImageGenerationStream to the non-streaming path: in
HandleOpenAIImageGenerationRequest (and the ImageGeneration wrapper) first check
request != nil and return a BifrostError if nil, then validate the prompt/input
fields on the request (e.g., ensure request.Prompt or the equivalent image
prompt field is non-empty) and return a clear Bifrost user/operation error when
missing; do this before calling
CheckContextAndGetRequestBody/ToOpenAIImageGenerationRequest so we never marshal
or send a null/invalid body to the provider.

In @core/providers/vertex/vertex_test.go:
- Around line 45-46: The test incorrectly enables streaming image generation by
setting ImageGenerationStream: true while the Vertex provider does not support
it; update the test configuration to set ImageGenerationStream: false so it
matches the provider capability (the code path that returns
UnsupportedOperationError in vertex.go for image streaming should no longer be
exercised). Ensure the test config object with fields ImageGeneration and
ImageGenerationStream is modified accordingly.

In @docs/openapi/schemas/inference/images.yaml:
- Around line 3-100: The OpenAPI ImageGenerationRequest schema is inconsistent
with the core BifrostImageGenerationRequest shape (core/schemas/images.go):
update the docs or the handler so they match; specifically, either modify
docs/openapi/schemas/inference/images.yaml ImageGenerationRequest to
include a top-level provider (string) and an input object that nests prompt and
any per-request params (e.g., n, size, quality, style, response_format,
background, moderation, partial_images, output_compression, output_format, user,
stream, fallbacks) to mirror BifrostImageGenerationRequest.input/params, or
change the handler to accept the flat top-level fields currently in the YAML;
ensure the request field names and nesting (provider, model, input, params)
exactly match the struct names in BifrostImageGenerationRequest so generated
clients and decoding work correctly.

In @framework/modelcatalog/pricing.go:
- Around line 276-426: The tiering is computed from imageTotalTokens (set from
imageUsage.TotalTokens) but that value can be zero when providers omit
total_tokens; update the logic before computing isAbove128k/isAbove200k to
compute a fallback imageTotalTokens by summing available counts from
imageUsage.InputTokens, imageUsage.OutputTokens and the breakdowns
imageUsage.InputTokensDetails.{TextTokens,ImageTokens} and
imageUsage.OutputTokensDetails.{TextTokens,ImageTokens} when TotalTokens==0 so
tier flags (isAbove128k/isAbove200k) use the derived total; adjust the branch
that sets imageTotalTokens (and any places that later read it) to use this
computed fallback to determine the correct tiered rates.
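A minimal sketch of the fallback summation, using a hypothetical reduced subset of the pricing structs:

```go
package main

import "fmt"

// tokenDetails and imageUsage are reduced stand-ins for the pricing
// usage structs (hypothetical field subset).
type tokenDetails struct {
	TextTokens  int
	ImageTokens int
}

type imageUsage struct {
	TotalTokens         int
	InputTokens         int
	OutputTokens        int
	InputTokensDetails  *tokenDetails
	OutputTokensDetails *tokenDetails
}

// deriveTotalTokens falls back to summing available counts when the
// provider omits total_tokens, so tier flags (>128k, >200k) are computed
// from a non-zero total instead of silently defaulting to the base tier.
func deriveTotalTokens(u *imageUsage) int {
	if u == nil {
		return 0
	}
	if u.TotalTokens > 0 {
		return u.TotalTokens
	}
	total := u.InputTokens + u.OutputTokens
	if total == 0 {
		if u.InputTokensDetails != nil {
			total += u.InputTokensDetails.TextTokens + u.InputTokensDetails.ImageTokens
		}
		if u.OutputTokensDetails != nil {
			total += u.OutputTokensDetails.TextTokens + u.OutputTokensDetails.ImageTokens
		}
	}
	return total
}

func main() {
	u := &imageUsage{InputTokens: 100, OutputTokens: 4000}
	fmt.Println(deriveTotalTokens(u)) // 4100, even though TotalTokens is 0
}
```

The derived total then feeds the isAbove128k/isAbove200k checks unchanged.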

In @framework/streaming/images.go:
- Around line 18-35: Loop body dereferences chunks[i].Delta and its ExtraFields
without nil checks. Add a guard at the top of the loop (e.g., if chunks[i].Delta
== nil { continue }) to skip chunks that lack Delta, and also defensively check
Delta.ExtraFields before accessing ModelRequested (e.g., model := ""; if
chunks[i].Delta.ExtraFields != nil { model =
chunks[i].Delta.ExtraFields.ModelRequested }). Then build and return the
BifrostImageGenerationResponse using these safe values (still referencing
chunks, Delta, Delta.ExtraFields, and BifrostImageGenerationResponse).

In @plugins/logging/main.go:
- Around line 34-48: The pool cleanup in putUpdateLogData is missing resets for
ImageGenerationOutput, RawRequest, and RawResponse, which allows
image-generation and raw payloads to leak between reused UpdateLogData
instances; update putUpdateLogData to explicitly set data.ImageGenerationOutput
= nil, data.RawRequest = nil, and data.RawResponse = nil before calling
p.updateDataPool.Put(data) so all pooled fields are cleared; reference the
UpdateLogData struct fields ImageGenerationOutput, RawRequest, RawResponse and
the putUpdateLogData function in plugins/logging/pool.go when making the change.

In @plugins/mocker/main.go:
- Around line 718-722: PreHook currently filters out image generation requests
so the image prompt extraction in the case for schemas.ImageGenerationRequest
and schemas.ImageGenerationStreamRequest is never reached; update PreHook to
allow these types by including schemas.ImageGenerationRequest and
schemas.ImageGenerationStreamRequest in the allowed request types (modify the
condition that now only allows schemas.ChatCompletionRequest and
schemas.ResponsesRequest). After enabling the path, extend
generateSuccessShortCircuit and handleDefaultBehavior to handle image generation
responses by adding branches that construct appropriate image-generation
short-circuit responses (parallel to the existing ChatResponse handling) when
the request type is ImageGenerationRequest or ImageGenerationStreamRequest, and
ensure the code path uses the extracted prompt from the existing
ImageGenerationRequest.Input.Prompt handling.

In @plugins/semanticcache/plugin_image_generation_test.go:
- Around line 109-223: The test TestImageGenerationSemanticSearch should be made
opt-in: at its start check testing.Short() and ENV OPENAI_API_KEY and call
t.Skip with explanatory messages if short mode is set or the key is missing;
update the test initializer in TestImageGenerationSemanticSearch to do these
early guards (before creating Config/NewTestSetupWithConfig) so CI won’t run the
external OpenAI-dependent semantic-cache test when credentials are absent or
when -short is requested.
- Around line 224-401: Both TestImageGenerationDifferentParameters and
TestImageGenerationStreamCaching need the same integration-test guard used in
TestImageGenerationCacheBasicFunctionality to prevent accidental network calls
during unit runs; update each test to early-skip when the integration
flag/condition used by TestImageGenerationCacheBasicFunctionality is false
(i.e., replicate the same guard check and t.Skip behavior), placing the guard at
the top of TestImageGenerationDifferentParameters and
TestImageGenerationStreamCaching so they only run when the integration gate is
enabled.
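The guard can be factored as a pure helper so the skip logic itself is testable. This is a sketch assuming the standard testing.Short()/os.Getenv pattern; skipReason and its messages are invented here:

```go
package main

import (
	"fmt"
	"os"
)

// skipReason reproduces the suggested guard as a pure function: a
// non-empty return means the test should call t.Skip with that message.
// The env var name comes from the review; the helper is hypothetical.
func skipReason(short bool, env func(string) string) string {
	if short {
		return "skipping semantic-cache integration test in -short mode"
	}
	if env("OPENAI_API_KEY") == "" {
		return "OPENAI_API_KEY not set; skipping external OpenAI-dependent test"
	}
	return ""
}

func main() {
	// In a real test: if r := skipReason(testing.Short(), os.Getenv); r != "" { t.Skip(r) }
	fmt.Println(skipReason(true, os.Getenv))
}
```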

In @tests/integrations/python/config.yml:
- Around line 45-46: Replace the incorrectly-cased env var string
"Nebius_API_KEY" with the uppercase "NEBIUS_API_KEY" throughout the test config
so the tests pick up the actual environment variable; update every occurrence
(e.g., the entries currently showing "Nebius_API_KEY" around the noted ranges
such as lines ~45, 104-106, 133-135, 170-175, 186-188, 208-209, 311-313,
353-355, 457-462, 486-487) to "NEBIUS_API_KEY".

In @tests/integrations/python/tests/test_google.py:
- Around line 1687-1787: The tests use get_provider_google_client(provider)
across multiple providers but only have @skip_if_no_api_key("google"), causing
get_api_key(provider) to raise ValueError at runtime; inside each test
(test_41a_image_generation_simple, test_41b_imagen_predict,
test_41c_image_generation_with_text) add an early provider-specific API key
check (call get_api_key(provider) or wrap get_provider_google_client(provider)
in a try/except that skips the test on ValueError) before creating the client,
rename the unused test_config parameter to _test_config in those test
signatures, and replace the broad except Exception in test_41b_imagen_predict
with except ValueError as e (or the specific exception raised by get_api_key) so
missing-key cases are skipped cleanly.

In @tests/integrations/python/tests/test_openai.py:
- Around line 1166-1258: Tests test_52b, test_52c, and test_52d assume
parameters unsupported by some configured models (e.g., dall-e-3); instead of
hardcoded model string checks, query provider/model capabilities before running
assertions: in each test (test_52b_image_generation_multiple,
test_52c_image_generation_quality, test_52d_image_generation_different_sizes)
call the capability helper (or extend
get_provider_openai_client/format_provider_model) to check flags like
supports_multiple_images, supported_quality_values, and supported_sizes and skip
if capability missing, or adapt the request to use a supported value (e.g., use
n=1 when multiple not supported, choose a quality from supported_quality_values,
and pick sizes from supported_sizes) so the tests exercise only valid parameter
combinations for the given provider/model.

In @tests/integrations/python/tests/utils/common.py:
- Around line 2620-2724: The Google branch in
assert_valid_image_generation_response is too permissive and has inconsistent
key casing and weak base64 checks; update assert_valid_image_generation_response
to accept both camelCase and snake_case for inlineData/inline_data and
mimeType/mime_type, validate image data with a real base64 check
(e.g., regex for base64 chars and padding) plus a length threshold, ensure
mime_type/mimeType starts with "image/", and tighten the Imagen predictions
check so that for each prediction in predictions you verify the bytes field
exists AND is non-empty (handle bytesBase64Encoded and bytes_base64_encoded)
rather than only asserting attribute presence; reference these symbols when
changing logic: assert_valid_image_generation_response, the local variables
candidates/parts/inline_data/inlineData, mimeType/mime_type, data, and
predictions/bytesBase64Encoded/bytes_base64_encoded.
🟡 Minor comments (6)
core/providers/bedrock/bedrock.go-691-692 (1)

691-692: Confirm: all GetBifrostResponseForStreamResponse call sites correctly migrated to 6-arg signature.

Verified that all five call sites (lines 691, 937, 951, 1128, 1221) pass exactly 6 arguments with the expected parameter positioning (text completion, chat, responses variants). Signature alignment is sound.

Address inconsistency in unsupported method stubs: ImageGeneration (line 1349) and ImageGenerationStream (line 1354) use provider.GetProviderKey() for the second argument, while TranscriptionStream (line 1344) uses schemas.Bedrock. Standardize to one approach for consistency.

core/internal/testutil/validation_presets.go-222-237 (1)

222-237: Make min_images parsing tolerant (int vs float64) and drop the stray inline comment.

This is testutil, but if customParams is ever JSON-decoded, numbers will come through as float64 and you’ll always fall back to defaults.

Proposed adjustment
-// In validation_presets.go - add this function
 func ImageGenerationExpectations(minImages int, expectedSize string) ResponseExpectations {
@@
 	case "ImageGeneration":
-		if minImages, ok := customParams["min_images"].(int); ok {
-			if expectedSize, ok := customParams["expected_size"].(string); ok {
-				return ImageGenerationExpectations(minImages, expectedSize)
-			}
-		}
+		var minImages int
+		switch v := customParams["min_images"].(type) {
+		case int:
+			minImages = v
+		case float64:
+			minImages = int(v)
+		}
+		if minImages > 0 {
+			if expectedSize, ok := customParams["expected_size"].(string); ok && expectedSize != "" {
+				return ImageGenerationExpectations(minImages, expectedSize)
+			}
+			return ImageGenerationExpectations(minImages, "1024x1024")
+		}
 		return ImageGenerationExpectations(1, "1024x1024")

Also applies to: 363-370

docs/providers/supported-providers/nebius.mdx-135-189 (1)

135-189: Good: guidance_scale documented as int (0-100) and ai_project_id behavior is clear.
This matches the Nebius image-gen typing/range expectation (as per learnings). Only nit: the “Model & Prompt” bullet list could be lightly reworded to avoid repetitive sentence starts.

core/providers/nebius/types.go-56-75 (1)

56-75: UnmarshalJSON does not handle null or empty input.

If data is null (JSON literal) or empty, the current implementation may behave unexpectedly. The strings.TrimSpace on empty data results in an empty string, which passes the len(trimmed) > 0 check as false, then attempts to unmarshal as a string which may fail or produce an empty message.

Suggested improvement
 func (d *NebiusErrorDetail) UnmarshalJSON(data []byte) error {
+	// Handle null JSON value
+	if string(data) == "null" {
+		return nil
+	}
+
 	// First, try to unmarshal as an array (validation errors)
 	trimmed := strings.TrimSpace(string(data))
+	if len(trimmed) == 0 {
+		return nil
+	}
 	if len(trimmed) > 0 && trimmed[0] == '[' {
framework/modelcatalog/pricing.go-164-173 (1)

164-173: Update documentation example for CalculateCostFromUsage method signature.

The example in docs/architecture/framework/model-catalog.mdx (lines 169-180) is outdated. It shows only 8 arguments and is missing the deployment parameter (between model and usage) and the imageUsage parameter (at the end). All code call sites have been correctly updated with the 9 required parameters.

core/providers/huggingface/huggingface.go-968-1022 (1)

968-1022: ImageGenerationStream error construction: avoid using dynamic strings as “operation code”.
providerUtils.NewBifrostOperationError appears to expect a stable error code (you use constants elsewhere). Passing a formatted sentence as the “code” will make error handling/analytics inconsistent. Prefer a dedicated constant (or use NewUnsupportedOperationError if that’s the intended category).

🧹 Nitpick comments (27)
core/schemas/utils.go (1)

1183-1186: Inconsistent case handling compared to sibling functions.

This function uses strings.ToLower(model) before checking for "imagen", but all other model detection functions in this file (IsNovaModel, IsAnthropicModel, IsMistralModel, IsGeminiModel) perform case-sensitive matching directly.

If case-insensitivity is intentional for Imagen models (e.g., providers may report "Imagen" vs "imagen"), please add a comment explaining this. Otherwise, consider aligning with the existing pattern for consistency:

♻️ Suggested fix for consistency
 // IsImagenModel checks if the model is an Imagen model.
 func IsImagenModel(model string) bool {
-	return strings.Contains(strings.ToLower(model), "imagen")
+	return strings.Contains(model, "imagen")
 }

Please verify whether Imagen model identifiers can appear in mixed case across different providers, which would justify the case-insensitive check.

docs/quickstart/gateway/multimodal.mdx (1)

97-98: Minor: Remove extra blank line.

There's a double blank line between the Image Generation section and Audio Understanding section.

📝 Suggested fix
 }

Audio Understanding: Analyzing Audio with AI

docs/providers/supported-providers/xai.mdx (1)

132-134: Consider varying sentence structure (optional).

Per static analysis, three successive bullet points begin with similar patterns. Consider rewording for better readability.

📝 Suggested alternative wording
-**Request Conversion**
-
-xAI uses the same conversion as OpenAI (see [OpenAI Image Generation](/providers/supported-providers/openai#7-image-generation)):
-
-- **Model & Prompt**: `bifrostReq.Model` → `req.Model`, `bifrostReq.Input.Prompt` → `req.Prompt`
-- **Parameters**: All fields from `bifrostReq.Params` (`ImageGenerationParameters`) are embedded directly into the request struct via struct embedding
-- **Endpoint**: `/v1/images/generations`
+**Request Conversion**
+
+xAI uses the same conversion as OpenAI (see [OpenAI Image Generation](/providers/supported-providers/openai#7-image-generation)):
+
+| Field | Mapping |
+|-------|---------|
+| Model & Prompt | `bifrostReq.Model` → `req.Model`, `bifrostReq.Input.Prompt` → `req.Prompt` |
+| Parameters | All fields from `bifrostReq.Params` (`ImageGenerationParameters`) embedded via struct embedding |
+| Endpoint | `/v1/images/generations` |
docs/providers/supported-providers/vertex.mdx (1)

25-25: Consider consistency in streaming indicator.

The Image Generation row shows ✅ for non-streaming and - for streaming. Other unsupported streaming operations in this table use ❌ (e.g., Embeddings, List Models). Consider whether - is intentionally indicating "not applicable" or if it should be ❌ for consistency.

Based on the code (ImageGenerationStream returns UnsupportedOperationError), using ❌ would be more consistent with other providers' documentation.

📝 Suggested change for consistency
-| Image Generation | ✅ | - | `/generateContent` or `/predict` (Imagen) |
+| Image Generation | ✅ | ❌ | `/generateContent` or `/predict` (Imagen) |
tests/integrations/python/tests/test_google.py (1)

2318-2339: New function-call extractor looks fine; consider using it to reduce repeated hasattr(response, "function_calls") checks.

core/providers/bedrock/bedrock.go (1)

1347-1355: Nit: make unsupported-operation provider naming consistent across this file.
These new stubs use provider.GetProviderKey(), while older unsupported methods (e.g., Speech/Transcription) pass schemas.Bedrock, which can produce mismatched provider names when custom provider config is used.

Proposed consistency tweak
 func (provider *BedrockProvider) Speech(ctx *schemas.BifrostContext, key schemas.Key, request *schemas.BifrostSpeechRequest) (*schemas.BifrostSpeechResponse, *schemas.BifrostError) {
-    return nil, providerUtils.NewUnsupportedOperationError(schemas.SpeechRequest, schemas.Bedrock)
+    return nil, providerUtils.NewUnsupportedOperationError(schemas.SpeechRequest, provider.GetProviderKey())
 }
core/providers/gemini/types.go (1)

1702-1707: Consider using a named type for Instances.

The anonymous struct pattern for Instances is unusual and harder to work with:

Instances *[]struct {
    Prompt *string `json:"prompt"`
} `json:"instances"`

Consider using the already-defined ImagenInstance type for consistency:

♻️ Suggested improvement
 type GeminiImagenRequest struct {
-	Instances *[]struct {
-		Prompt *string `json:"prompt"`
-	} `json:"instances"`
+	Instances []ImagenInstance `json:"instances"`
 	Parameters GeminiImagenParameters `json:"parameters"`
 }

Note: ImagenInstance is already defined at lines 91-93 and could be reused here.

core/providers/nebius/errors.go (1)

11-68: Add a nil-guard for bifrostErr and consider pairing validation message + location.

HandleProviderAPIError is assumed non-nil; a guard avoids a panic if that contract ever changes. Also, aggregating locations separately can make debugging harder when multiple validation errors exist.

Proposed tweak
 func parseNebiusImageError(resp *fasthttp.Response, meta *providerUtils.RequestMetadata) *schemas.BifrostError {
 	var nebiusErr NebiusError
 	bifrostErr := providerUtils.HandleProviderAPIError(resp, &nebiusErr)
+	if bifrostErr == nil {
+		return &schemas.BifrostError{
+			IsBifrostError: false,
+			Error: &schemas.ErrorField{
+				Message: "provider API error",
+			},
+		}
+	}

 	if bifrostErr.Error == nil {
 		bifrostErr.Error = &schemas.ErrorField{}
 	}
@@
 		if len(nebiusErr.Detail.ValidationErrors) > 0 {
-			var messages []string
-			var locations []string
+			var messages []string
@@
 			for _, detail := range nebiusErr.Detail.ValidationErrors {
 				if detail.Msg != "" {
-					messages = append(messages, detail.Msg)
-				}
-				if len(detail.Loc) > 0 {
-					locations = append(locations, strings.Join(detail.Loc, "."))
+					if len(detail.Loc) > 0 {
+						messages = append(messages, detail.Msg+" ["+strings.Join(detail.Loc, ".")+"]")
+					} else {
+						messages = append(messages, detail.Msg)
+					}
 				}
 			}
@@
-			if len(locations) > 0 {
-				locationStr := strings.Join(locations, ", ")
-				if message == "" {
-					message = "[" + locationStr + "]"
-				} else {
-					message = message + " [" + locationStr + "]"
-				}
-			}
 		}
 	}
plugins/logging/main.go (1)

574-583: Consider DB/UI bloat from storing full image outputs in logs.

If BifrostImageGenerationResponse.Data contains base64 payloads, logging it can balloon DB size and slow log queries/UI rendering. If this is expected, OK; otherwise consider storing only metadata (counts, sizes, URLs) or gating behind a dedicated config.
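One hedged sketch of the metadata-only alternative (ImageDatum and ImageLogSummary are hypothetical stand-ins for the real response and log-store types):

```go
package main

import "fmt"

// ImageDatum stands in for one entry of
// BifrostImageGenerationResponse.Data (field names assumed).
type ImageDatum struct {
	URL     string
	B64JSON string
}

// ImageLogSummary keeps only cheap metadata for the log store.
type ImageLogSummary struct {
	Count      int
	TotalBytes int
	URLs       []string
}

// summarizeImages drops base64 payloads and records counts, payload
// sizes, and URLs, avoiding DB bloat from full image outputs.
func summarizeImages(data []ImageDatum) ImageLogSummary {
	s := ImageLogSummary{Count: len(data)}
	for _, d := range data {
		s.TotalBytes += len(d.B64JSON)
		if d.URL != "" {
			s.URLs = append(s.URLs, d.URL)
		}
	}
	return s
}

func main() {
	s := summarizeImages([]ImageDatum{
		{B64JSON: "aGVsbG8="},
		{URL: "https://example.com/img.png"},
	})
	fmt.Println(s.Count, s.TotalBytes, len(s.URLs)) // 2 8 1
}
```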

ui/app/workspace/logs/views/imageView.tsx (2)

19-39: LGTM: index clamping avoids out-of-range on output updates; navigation is safe.
Optional: images could be useMemo’d if this view re-renders frequently with large outputs.


56-99: Consider whether ImageMessage should receive isStreaming even when currentImage exists.
Right now you pass isStreaming={isStreaming && !currentImage}, so once the first image arrives you’ll never show the streaming skeleton/progress even if more images are still streaming in. If that’s intended, ignore; if not, wire isStreaming (and maybe progress) differently.

framework/logstore/migrations.go (1)

684-748: LGTM: guarded add/drop column migrations with transactions enabled.
Minor: consider grouping input+output into one migration ID if you want them to be “atomic” as a pair (optional).

docs/openapi/schemas/integrations/openai/images.yaml (1)

98-104: Minor inconsistency in stream event type enum.

The type enum has inconsistent naming: image_generation.partial_image and image_generation.completed follow a namespace pattern, but error does not. Consider using image_generation.error for consistency with the other values.

Suggested fix
     enum:
       - "image_generation.partial_image"
       - "image_generation.completed"
-      - "error"
+      - "image_generation.error"
core/providers/nebius/nebius.go (1)

129-135: Consider escaping ai_project_id in ChatCompletion for consistency.

The ImageGeneration method at line 269 properly uses url.QueryEscape when appending ai_project_id to the query string, but ChatCompletion uses fmt.Sprint(rawID) directly without escaping. This could be a security concern if the value contains special characters.

Suggested fix
 	// Append query parameter if present
 	if rawID, ok := request.Params.ExtraParams["ai_project_id"]; ok && rawID != nil {
+		escapedID := url.QueryEscape(fmt.Sprint(rawID))
 		if strings.Contains(path, "?") {
-			path = path + "&ai_project_id=" + fmt.Sprint(rawID)
+			path = path + "&ai_project_id=" + escapedID
 		} else {
-			path = path + "?ai_project_id=" + fmt.Sprint(rawID)
+			path = path + "?ai_project_id=" + escapedID
 		}
 	}
core/providers/nebius/types.go (1)

26-29: Consider using URL instead of Url for consistency with Go naming conventions.

Go convention typically uses all-caps for acronyms in exported identifiers (e.g., URL, ID, HTTP).

Suggested change
 type NebiusLora struct {
-	Url   string `json:"url"`
+	URL   string `json:"url"`
 	Scale int    `json:"scale"`
 }
core/internal/testutil/image_generation.go (1)

284-287: Slice append may unintentionally modify the original Errors slice.

Using append(validationResult.Errors, validationResult.StreamErrors...) can modify the backing array of validationResult.Errors if it has sufficient capacity. Since this is only used for error reporting and the result is immediately consumed, the practical impact is minimal, but it's worth noting for correctness.

Safer alternative
 	if !validationResult.Passed {
-		allErrors := append(validationResult.Errors, validationResult.StreamErrors...)
+		allErrors := make([]string, 0, len(validationResult.Errors)+len(validationResult.StreamErrors))
+		allErrors = append(allErrors, validationResult.Errors...)
+		allErrors = append(allErrors, validationResult.StreamErrors...)
 		t.Fatalf("❌ Image generation stream validation failed: %s", strings.Join(allErrors, "; "))
 	}
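The aliasing hazard behind this suggestion can be reproduced in isolation:

```go
package main

import "fmt"

func main() {
	// A backing array with spare capacity, as validationResult.Errors might have.
	errs := make([]string, 2, 4)
	errs[0], errs[1] = "e1", "e2"

	// append reuses errs's backing array because capacity allows it.
	all := append(errs, "s1")

	// Later growth of errs writes into the same array that all sees.
	errs = append(errs, "CLOBBERED")
	fmt.Println(all[2]) // CLOBBERED: the merged slice was silently mutated

	// Copying into a fresh slice, as suggested above, avoids the sharing.
	safe := make([]string, 0, 3)
	safe = append(safe, errs[:2]...)
	safe = append(safe, "s1")
	errs = append(errs[:2], "again")
	fmt.Println(safe[2]) // s1
}
```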
core/providers/azure/azure.go (1)

1269-1269: Remove orphaned empty comment.

Line 1269 has an empty comment // which appears to be a leftover. Either remove it or add meaningful documentation.

Suggested fix
-	//
 	deployment := key.AzureKeyConfig.Deployments[request.Model]
framework/modelcatalog/pricing.go (1)

16-120: Normalize image-generation stream request type without depending on imageUsage presence.

Right now, pricing normalization only happens when imageUsage != nil (Line 116). If upstream ever produces an image stream response where ExtraFields.RequestType is image_generation_stream but Usage is absent, you’ll silently skip normalization and risk a pricing miss. Consider normalizing purely based on requestType.
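A sketch of requestType-based normalization, with the string constants assumed from this review rather than taken from the codebase:

```go
package main

import "fmt"

// Request-type strings assumed from the review ("image_generation_stream").
const (
	imageGenerationRequest       = "image_generation"
	imageGenerationStreamRequest = "image_generation_stream"
)

// normalizePricingRequestType maps the streaming variant to its base type
// unconditionally, instead of only when image usage is present.
func normalizePricingRequestType(requestType string) string {
	if requestType == imageGenerationStreamRequest {
		return imageGenerationRequest
	}
	return requestType
}

func main() {
	fmt.Println(normalizePricingRequestType("image_generation_stream")) // image_generation
	fmt.Println(normalizePricingRequestType("chat_completion"))         // chat_completion
}
```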

core/providers/openai/openai.go (1)

2493-2888: Streaming parser: bufio.Scanner max token (10MB) is a likely ceiling for base64-heavy chunks.

If OpenAI emits large b64_json payloads, scanner.Scan() will fail with ErrTooLong and you’ll end up treating it as a stream read error at the end (Lines 2881-2884). Consider switching to a bufio.Reader-based SSE parser, or at least raising the limit defensively.

Minimal mitigation (bump scanner limit)
-		scanner := bufio.NewScanner(resp.BodyStream())
-		buf := make([]byte, 0, 1024*1024)
-		scanner.Buffer(buf, 10*1024*1024)
+		scanner := bufio.NewScanner(resp.BodyStream())
+		buf := make([]byte, 0, 1024*1024)
+		// Image payloads (base64) can be large; keep this comfortably above expected chunk sizes.
+		scanner.Buffer(buf, 64*1024*1024)
transports/bifrost-http/integrations/router.go (1)

1195-1416: Avoid emitting duplicate event: lines when the converter returns a full SSE string.

Right now you write event: <eventType> before checking whether convertedResponse is a preformatted SSE string (Lines 1308-1377). If an integration ever returns both (non-empty eventType + "event: ...\ndata: ...\n\n"), the stream output becomes malformed.

Proposed fix
-				if eventType != "" {
-					// OPENAI RESPONSES FORMAT: Use event: and data: lines for OpenAI responses API compatibility
-					if _, err := fmt.Fprintf(w, "event: %s\n", eventType); err != nil {
-						cancel() // Client disconnected (write error), cancel upstream stream
-						return
-					}
-				}
+				// If the converter returned a *full* SSE string (already contains event/data),
+				// don't prepend a second event line.
+				if s, ok := convertedResponse.(string); !ok || (!strings.HasPrefix(s, "event:") && !strings.HasPrefix(s, "data:")) {
+					if eventType != "" {
+						if _, err := fmt.Fprintf(w, "event: %s\n", eventType); err != nil {
+							cancel()
+							return
+						}
+					}
+				}
framework/streaming/images.go (1)

271-274: Remove empty if block.

This if block contains only a comment explaining that usage will be extracted elsewhere. Consider removing it to reduce code noise.

♻️ Proposed fix
-		// Extract usage if available
-		if result.ImageGenerationStreamResponse.Usage != nil {
-			// Note: ImageUsage doesn't directly map to BifrostLLMUsage, but we can store it
-			// The actual usage will be extracted in processAccumulatedImageStreamingChunks
-		}
+		// Note: ImageUsage extraction is handled in processAccumulatedImageStreamingChunks
framework/streaming/types.go (1)

121-128: Consider adding MaxImageChunkIndex for consistency with other stream types.

Other stream types (Chat, Responses, Transcription, Audio) have corresponding Max*ChunkIndex fields in StreamAccumulator for metadata extraction. The image stream uses a composite string key for de-duplication (ImageChunksSeen map[string]struct{}), which is appropriate for per-image scoping, but there's no MaxImageChunkIndex field.

If image streaming metadata (TokenUsage, Cost, FinishReason) needs to be extracted from the last chunk similar to other stream types, consider adding a tracking mechanism.

docs/openapi/schemas/inference/images.yaml (1)

101-177: Either use or remove ImageGenerationResponseParameters.

ImageGenerationResponseParameters is defined but not referenced by ImageGenerationResponse. If the intent is to mirror the Go embedding (*ImageGenerationResponseParameters), consider composing via allOf, otherwise drop the unused schema to avoid drift.

framework/streaming/accumulator.go (1)

293-325: Dedup key allocs + missing defensive checks (optional).

  • fmt.Sprintf("%d:%d", ...) allocates per chunk; you could use a [2]int key (or a small struct) to reduce GC in high-throughput streams.
  • Consider guarding chunk == nil and negative indices to avoid panics / map blowups from malformed provider output.
Proposed (lower-allocation) key change
- func imageChunkKey(imageIndex, chunkIndex int) string {
- 	return fmt.Sprintf("%d:%d", imageIndex, chunkIndex)
- }
+ type imageChunkKey struct {
+ 	imageIndex int
+ 	chunkIndex int
+ }
- chunkKey := imageChunkKey(chunk.ImageIndex, chunk.ChunkIndex)
+ chunkKey := imageChunkKey{imageIndex: chunk.ImageIndex, chunkIndex: chunk.ChunkIndex}
  if _, seen := acc.ImageChunksSeen[chunkKey]; !seen {
- ImageChunksSeen:            make(map[string]struct{}),
+ ImageChunksSeen:            make(map[imageChunkKey]struct{}),
core/providers/huggingface/images.go (1)

15-18: Add “missing prompt” validation (align with existing repo behavior).
You guard bifrostReq == nil || bifrostReq.Input == nil, but a blank prompt currently passes through. If core-level validation is bypassed (or this helper is used directly), we’ll send invalid requests downstream. Based on learnings, image generation should reject missing prompts before dispatch.

Also applies to: 157-160
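A sketch of the suggested pre-dispatch guard; validatePrompt is a hypothetical helper, not an existing Bifrost function:

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// validatePrompt rejects missing or whitespace-only prompts before any
// provider request is built; the pointer shape mirrors optional inputs.
func validatePrompt(prompt *string) error {
	if prompt == nil || strings.TrimSpace(*prompt) == "" {
		return errors.New("image generation requires a non-empty prompt")
	}
	return nil
}

func main() {
	blank := "   "
	valid := "a red fox in watercolor"
	fmt.Println(validatePrompt(&blank) != nil, validatePrompt(&valid) == nil) // true true
}
```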

core/providers/huggingface/huggingface.go (1)

882-966: ImageGeneration: add “missing prompt” validation (align with repo expectation).
This method relies on ToHuggingFaceImageGenerationRequest, which currently doesn’t reject blank prompts; consider rejecting empty/whitespace prompts here before dispatch (same rationale as other providers per learnings).

core/internal/testutil/test_retry_framework.go (1)

980-994: Consider renaming DefaultImageGenerationRetryConfig() to clarify it returns the generic TestRetryConfig.

The function name suggests it returns ImageGenerationRetryConfig (the type-specific config), but it actually returns the deprecated generic TestRetryConfig. This creates confusion when callers manually convert it to ImageGenerationRetryConfig before passing to WithImageGenerationRetry() (see image_generation.go:48-55).

Either rename to indicate it's the generic stream-oriented config (e.g., DefaultImageGenerationTestRetryConfig()), or add a helper that converts TestRetryConfigImageGenerationRetryConfig to follow the established pattern for domain-specific configs.

Also applies to line 1250-1251 where it's returned in the scenario router.

Comment on lines +2097 to +2260
func WithImageGenerationRetry(
	t *testing.T,
	config ImageGenerationRetryConfig,
	context TestRetryContext,
	expectations ResponseExpectations,
	scenarioName string,
	operation func() (*schemas.BifrostImageGenerationResponse, *schemas.BifrostError),
) (*schemas.BifrostImageGenerationResponse, *schemas.BifrostError) {

	var lastResponse *schemas.BifrostImageGenerationResponse
	var lastError *schemas.BifrostError

	for attempt := 1; attempt <= config.MaxAttempts; attempt++ {
		context.AttemptNumber = attempt

		// Execute the operation
		response, err := operation()
		lastResponse = response
		lastError = err

		// If we have a response, validate it FIRST
		if response != nil {
			validationResult := ValidateImageGenerationResponse(t, response, err, expectations, scenarioName)

			// If validation passes, we're done!
			if validationResult.Passed {
				return response, err
			}

			// Validation failed - ALWAYS retry validation failures for functionality checks
			// Network errors are handled by bifrost core, so these are content/functionality validation errors
			if attempt < config.MaxAttempts {
				// ALWAYS retry on timeout errors - this takes precedence over all other conditions
				if err != nil && isTimeoutError(err) {
					retryReason := fmt.Sprintf("❌ timeout error detected: %s", GetErrorMessage(err))
					if config.OnRetry != nil {
						config.OnRetry(attempt, retryReason, t)
					}

					// Calculate delay with exponential backoff
					delay := calculateRetryDelay(attempt-1, config.BaseDelay, config.MaxDelay)
					time.Sleep(delay)
					continue
				}

				// Check other retry conditions first (for logging/debugging)
				shouldRetryFromConditions, conditionReason := checkImageGenerationRetryConditions(response, err, context, config.Conditions)

				// ALWAYS retry on validation failures - this is the primary purpose of these tests
				// Content validation errors indicate functionality issues that should be retried
				shouldRetry := len(validationResult.Errors) > 0
				var retryReason string

				if shouldRetry {
					// Validation failures are the primary retry reason - ALWAYS prefix with ❌
					retryReason = fmt.Sprintf("❌ validation failure (content/functionality check): %s", strings.Join(validationResult.Errors, "; "))
					// Append condition-based reason if present for additional context
					if shouldRetryFromConditions && conditionReason != "" {
						retryReason += fmt.Sprintf(" | also: %s", conditionReason)
					}
				} else if shouldRetryFromConditions {
					// Fallback to condition-based retry if no validation errors (edge case)
					// Ensure ❌ prefix for consistency with error logging
					shouldRetry = true
					if !strings.Contains(conditionReason, "❌") {
						retryReason = fmt.Sprintf("❌ %s", conditionReason)
					} else {
						retryReason = conditionReason
					}
				}

				if shouldRetry {
					if config.OnRetry != nil {
						config.OnRetry(attempt, retryReason, t)
					}

					// Calculate delay with exponential backoff
					delay := calculateRetryDelay(attempt-1, config.BaseDelay, config.MaxDelay)
					time.Sleep(delay)
					continue
				}
			}

			// All retries failed validation - create a BifrostError to force test failure
			validationErrors := strings.Join(validationResult.Errors, "; ")

			if config.OnFinalFail != nil {
				finalErr := fmt.Errorf("❌ validation failed after %d attempts: %s", attempt, validationErrors)
				config.OnFinalFail(attempt, finalErr, t)
			}

			// Return nil response + BifrostError so calling test fails
			statusCode := 400
			testFailureError := &schemas.BifrostError{
				IsBifrostError: true,
				StatusCode:     &statusCode,
				Error: &schemas.ErrorField{
					Message: fmt.Sprintf("❌ Validation failed after %d attempts: %s", attempt, validationErrors),
				},
			}
			return nil, testFailureError
		}

		// If we have an error without a response, check if we should retry
		if err != nil && attempt < config.MaxAttempts {
			// ALWAYS retry on timeout errors - this takes precedence over other conditions
			if isTimeoutError(err) {
				retryReason := fmt.Sprintf("❌ timeout error detected: %s", GetErrorMessage(err))
				if config.OnRetry != nil {
					config.OnRetry(attempt, retryReason, t)
				}

				// Calculate delay with exponential backoff
				delay := calculateRetryDelay(attempt-1, config.BaseDelay, config.MaxDelay)
				time.Sleep(delay)
				continue
			}

			shouldRetry, retryReason := checkImageGenerationRetryConditions(response, err, context, config.Conditions)

			// ALWAYS retry on non-structural errors (network errors are handled by bifrost core)
			// If no condition matches, still retry on any error as it's likely transient
			if !shouldRetry {
				shouldRetry = true
				errorMsg := GetErrorMessage(err)
				if !strings.Contains(errorMsg, "❌") {
					errorMsg = fmt.Sprintf("❌ %s", errorMsg)
				}
				retryReason = fmt.Sprintf("❌ non-structural error (will retry): %s", errorMsg)
			} else if !strings.Contains(retryReason, "❌") {
				retryReason = fmt.Sprintf("❌ %s", retryReason)
			}

			if shouldRetry {
				if config.OnRetry != nil {
					config.OnRetry(attempt, retryReason, t)
				}

				// Calculate delay with exponential backoff
				delay := calculateRetryDelay(attempt-1, config.BaseDelay, config.MaxDelay)
				time.Sleep(delay)
				continue
			}
		}

		// If we get here, either we got a final error or no more retries
		break
	}

	// Final failure callback
	if config.OnFinalFail != nil && lastError != nil {
		errorMsg := "unknown error"
		if lastError.Error != nil {
			errorMsg = lastError.Error.Message
		}
		// Ensure error message has ❌ prefix if not already present
		if !strings.Contains(errorMsg, "❌") {
			errorMsg = fmt.Sprintf("❌ %s", errorMsg)
		}
		config.OnFinalFail(config.MaxAttempts, fmt.Errorf("❌ final error: %s", errorMsg), t)
	}

	return lastResponse, lastError
}
⚠️ Potential issue | 🟠 Major

Potential false-pass: (nil response, nil err) currently ends retries with success-like return.
If operation() ever returns (nil, nil), this function breaks out and returns (nil, nil) without calling OnFinalFail—tests could incorrectly pass depending on the caller. Recommend treating response == nil as retryable failure (similar to how stream wrappers handle responseChannel == nil).

Minimal defensive fix
 		// Execute the operation
 		response, err := operation()
 		lastResponse = response
 		lastError = err

+		// Treat nil response as a failure (even if err is nil) to avoid false-passing tests
+		if response == nil {
+			if attempt < config.MaxAttempts {
+				retryReason := "❌ image generation response is nil"
+				if config.OnRetry != nil {
+					config.OnRetry(attempt, retryReason, t)
+				}
+				delay := calculateRetryDelay(attempt-1, config.BaseDelay, config.MaxDelay)
+				time.Sleep(delay)
+				continue
+			}
+			statusCode := 400
+			return nil, &schemas.BifrostError{
+				IsBifrostError: true,
+				StatusCode:     &statusCode,
+				Error: &schemas.ErrorField{
+					Message: fmt.Sprintf("❌ image generation response is nil after %d attempts", attempt),
+				},
+			}
+		}
-
-		// If we have a response, validate it FIRST
-		if response != nil {
+		// If we have a response, validate it FIRST
+		if response != nil {

var parsed int64
parsed, parseErr = strconv.ParseInt(v, 10, 64)
if parseErr == nil {
	minImages = int(parsed)

Check failure

Code scanning / CodeQL

Incorrect conversion between integer types High test

Incorrect conversion of a signed 64-bit integer from strconv.ParseInt to a lower bit size type int without an upper bound check.

Copilot Autofix

AI about 1 month ago

In general, to fix this class of problem you must ensure that the bit size used when parsing matches the size of the target type, or you must explicitly check that the parsed value is within the target type’s bounds before converting. Here, the target type is int (potentially 32‑bit), but we parse as 64‑bit and then cast without checks in the json.Number and string cases.

The best fix with minimal functional change is to validate that the parsed int64 lies within the representable range of int on all platforms before doing int(parsed). The math package provides math.MaxInt and math.MinInt since Go 1.17; for older toolchains, Go defines int to be either 32 or 64 bits, so we can compute the appropriate bounds from the platform’s int size. A simple, portable pattern is:

const (
    maxInt = int(^uint(0) >> 1)
    minInt = -maxInt - 1
)

We can define these constants once in this file and then, before assigning minImages = int(parsed), check that parsed is between minInt and maxInt. If it is out of range, we set parseErr to a descriptive error and avoid performing the narrowing conversion. This needs to be applied to all code paths where an int64 from user/config parsing is converted to int — in this snippet, that is the json.Number case (1131) and the string case (1137). No new imports are required.

Concretely:

  • Add two const declarations near the top of core/internal/testutil/response_validation.go to define maxInt and minInt.
  • Replace minImages = int(parsed) in the json.Number case with a bounds check against [minInt, maxInt], setting parseErr if out of range.
  • Do the same in the string case.
  • Leave the rest of the logic unchanged; if parseErr is set, the existing code already logs a warning and skips the min_images check.
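The narrowing guard described above can be sketched as a standalone, testable helper (`parseIntBounded` is an illustrative name, not code from this PR):

```go
package main

import (
	"fmt"
	"strconv"
)

// Platform-independent bounds for int (works on 32- and 64-bit targets).
const (
	maxInt = int(^uint(0) >> 1)
	minInt = -maxInt - 1
)

// parseIntBounded parses s as int64 and narrows to int only when the
// value is representable on the current platform.
func parseIntBounded(s string) (int, error) {
	parsed, err := strconv.ParseInt(s, 10, 64)
	if err != nil {
		return 0, err
	}
	if parsed < int64(minInt) || parsed > int64(maxInt) {
		return 0, fmt.Errorf("value %d out of range for int", parsed)
	}
	return int(parsed), nil
}

func main() {
	n, err := parseIntBounded("42")
	fmt.Println(n, err)
}
```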
Suggested changeset 1
core/internal/testutil/response_validation.go

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/core/internal/testutil/response_validation.go b/core/internal/testutil/response_validation.go
--- a/core/internal/testutil/response_validation.go
+++ b/core/internal/testutil/response_validation.go
@@ -11,6 +11,11 @@
 	"github.com/maximhq/bifrost/core/schemas"
 )
 
+const (
+	maxInt = int(^uint(0) >> 1)
+	minInt = -maxInt - 1
+)
+
 // =============================================================================
 // RESPONSE VALIDATION FRAMEWORK
 // =============================================================================
@@ -1128,13 +1133,21 @@
 				var parsed int64
 				parsed, parseErr = v.Int64()
 				if parseErr == nil {
-					minImages = int(parsed)
+					if parsed < int64(minInt) || parsed > int64(maxInt) {
+						parseErr = fmt.Errorf("min_images value %d out of range for int", parsed)
+					} else {
+						minImages = int(parsed)
+					}
 				}
 			case string:
 				var parsed int64
 				parsed, parseErr = strconv.ParseInt(v, 10, 64)
 				if parseErr == nil {
-					minImages = int(parsed)
+					if parsed < int64(minInt) || parsed > int64(maxInt) {
+						parseErr = fmt.Errorf("min_images value %d out of range for int", parsed)
+					} else {
+						minImages = int(parsed)
+					}
 				}
 			default:
 				parseErr = fmt.Errorf("unsupported type for min_images: %T", v)
EOF
Unable to commit as this autofix suggestion is now outdated
Contributor Author

addressed and resolved.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 10

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (7)
plugins/maxim/main.go (2)

125-138: Add ImageGenerationOutput assignment to AccumulatedData struct initialization.

The ImageGenerationOutput field is checked to determine the stream type but is not being copied to the AccumulatedData struct. Both AccumulatedData and StreamAccumulatorResult have this field, and it should be assigned alongside AudioOutput and TranscriptionOutput for consistency:

ImageGenerationOutput: accResult.ImageGenerationOutput,

This should be added to the struct initialization at lines 125-138.


548-567: Add image generation request type handling to the result logging switch.

ImageGenerationRequest and ImageGenerationStreamRequest types are not handled in this switch statement. Since these types are supported across the codebase (e.g., in mocker and logging plugins) and providers like OpenAI and Nebius support image generation, Maxim should log image generation results consistently. Add:

case schemas.ImageGenerationRequest, schemas.ImageGenerationStreamRequest:
	if streamResponse != nil {
		logger.AddResultToGeneration(generationID, streamResponse.ToBifrostResponse().ImageGenerationResponse)
	} else {
		logger.AddResultToGeneration(generationID, result.ImageGenerationResponse)
	}
core/schemas/responses.go (1)

540-575: Error messages reference "image generation call output" but there's no corresponding field or handling logic.

The error messages at lines 556 and 574 mention "image generation call output" as a valid type, but:

  1. The ResponsesToolMessageOutputStruct struct (lines 540-544) doesn't include an image generation output field
  2. MarshalJSON only handles 3 types: string, array blocks, and computer tool call output
  3. UnmarshalJSON only attempts to unmarshal those same 3 types

This creates a misleading error message. Either:

  • Remove "image generation call output" from the error messages if it's not a valid type for this struct, or
  • Add the missing field and handling logic if image generation outputs should be supported here
🔧 Option A: Remove misleading text from error messages
-	return nil, fmt.Errorf("responses tool message output struct is neither a string nor an array of responses message content blocks nor a computer tool call output data nor an image generation call output")
+	return nil, fmt.Errorf("responses tool message output struct is neither a string nor an array of responses message content blocks nor a computer tool call output data")
-	return fmt.Errorf("responses tool message output struct is neither a string nor an array of responses message content blocks nor a computer tool call output data nor an image generation call output")
+	return fmt.Errorf("responses tool message output struct is neither a string nor an array of responses message content blocks nor a computer tool call output data")
framework/modelcatalog/main.go (1)

223-230: Add schemas.ImageGenerationRequest to the pricing lookup modes.

GetPricingEntryForModel at lines 223-230 uses a hardcoded list of RequestType modes that excludes schemas.ImageGenerationRequest. Since image generation models are supported in this codebase (Gemini image, Imagen, DALL-E, etc.) and pricing.go explicitly handles image generation pricing, omitting ImageGenerationRequest from the mode loop prevents these models' pricing from being retrieved.

Add schemas.ImageGenerationRequest to the mode list to ensure image generation models are discoverable via GetPricingEntryForModel.
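A minimal sketch of the suggested change — the constant names and lookup shape below are illustrative, not the actual framework/modelcatalog code:

```go
package main

import "fmt"

// RequestType mirrors the shape of schemas.RequestType (string-based);
// these constant names are assumptions for illustration.
type RequestType string

const (
	ChatCompletionRequest  RequestType = "chat_completion"
	EmbeddingRequest       RequestType = "embedding"
	ImageGenerationRequest RequestType = "image_generation"
)

// pricingModes is the lookup order a GetPricingEntryForModel-style
// helper might iterate; the fix is to include image generation.
var pricingModes = []RequestType{
	ChatCompletionRequest,
	EmbeddingRequest,
	ImageGenerationRequest, // previously missing from the hardcoded list
}

func hasMode(m RequestType) bool {
	for _, mode := range pricingModes {
		if mode == m {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(hasMode(ImageGenerationRequest))
}
```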

core/providers/gemini/responses.go (2)

459-465: Fix thought-signature extraction for stream output_item.added (bad field check + missing base64 decode).

Current code checks bifrostResp.Item.EncryptedContent (likely non-existent) and then casts the encrypted base64 string to []byte, which corrupts the signature and may not compile depending on the ResponsesMessage struct shape.

Proposed fix
 case schemas.ResponsesStreamResponseTypeOutputItemAdded:
-    if bifrostResp.Item != nil && bifrostResp.Item.ResponsesReasoning != nil && bifrostResp.Item.EncryptedContent != nil {
-        candidate.Content.Parts = append(candidate.Content.Parts, &Part{
-            ThoughtSignature: []byte(*bifrostResp.Item.ResponsesReasoning.EncryptedContent),
-        })
-    }
+    if bifrostResp.Item != nil && bifrostResp.Item.ResponsesReasoning != nil &&
+        bifrostResp.Item.ResponsesReasoning.EncryptedContent != nil {
+        decodedSig, err := base64.StdEncoding.DecodeString(*bifrostResp.Item.ResponsesReasoning.EncryptedContent)
+        if err == nil {
+            candidate.Content.Parts = append(candidate.Content.Parts, &Part{
+                ThoughtSignature: decodedSig,
+            })
+        }
+    }

1609-1618: Apply consistent prefix guard for image data URLs, matching file handling.

The image handling at lines 1609-1618 should include the same prefix check that protects file handling at lines 1649-1652. Currently, if blob.Data unexpectedly contains a data: prefix, the image code will produce malformed URLs like data:image/...;base64,data:image/.... The file path already prevents this with a guard—apply the same pattern to images.

Proposed fix
 if isImageMimeType(mimeType) {
-    // Convert to base64 data URL
-    imageURL := fmt.Sprintf("data:%s;base64,%s", mimeType, blob.Data)
+    imageDataURL := blob.Data
+    if !strings.HasPrefix(imageDataURL, "data:") {
+        imageDataURL = fmt.Sprintf("data:%s;base64,%s", mimeType, imageDataURL)
+    }
     return &schemas.ResponsesMessageContentBlock{
         Type: schemas.ResponsesInputMessageContentBlockTypeImage,
         ResponsesInputMessageContentBlockImage: &schemas.ResponsesInputMessageContentBlockImage{
-            ImageURL: &imageURL,
+            ImageURL: &imageDataURL,
         },
     }
 }

Also applies to: 1649-1652

plugins/semanticcache/utils.go (1)

380-412: Avoid duplicating large image payloads in semantic cache metadata (base64 can explode storage)
metadata["response"] already contains the full response (potentially with b64_json), and you also add metadata["image_b64"]. Consider storing only URLs / revised prompts, or store hashes/lengths for base64 instead of the full strings.
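One way to avoid duplicating the payload is to store a digest and length instead of the raw base64 (the field names below are illustrative, not the plugin's actual metadata schema):

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// imageFingerprint summarizes a base64 image payload for cache metadata
// without storing the payload itself: a SHA-256 digest plus its length.
func imageFingerprint(b64 string) map[string]any {
	sum := sha256.Sum256([]byte(b64))
	return map[string]any{
		"image_sha256": hex.EncodeToString(sum[:]),
		"image_len":    len(b64),
	}
}

func main() {
	meta := imageFingerprint("aGVsbG8=") // "hello" in base64
	fmt.Println(meta["image_len"])
}
```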

🤖 Fix all issues with AI agents
In `@core/internal/testutil/response_validation.go`:
- Around line 1097-1157: The min_images parsing in validateImageGenerationFields
silently records parse errors without failing and performs unchecked narrowing
to int; update the switch in validateImageGenerationFields to (1) validate
numeric ranges when converting int64/float64/json.Number to int by checking
against int64(intMin)/int64(intMax) (for float64 ensure it is an integer value
via math.Trunc and within bounds), (2) treat any parse/overflow error as a hard
validation failure by setting result.Passed = false and appending a clear error
message (instead of only recording a warning), and (3) keep recording
result.MetricsCollected["image_count"] and comparing against minImages only when
conversion succeeded; reference the ProviderSpecific["min_images"] handling and
result.Errors/result.Passed modifications in your patch.

In `@core/providers/azure/azure.go`:
- Around line 1257-1303: The streaming ImageGenerationStream call currently
passes nil converters to openai.HandleOpenAIImageGenerationStreaming so each
streamed BifrostImageGenerationResponse chunk never gets the deployment
metadata; update the openai.HandleOpenAIImageGenerationStreaming invocation in
ImageGenerationStream to supply a postResponseConverter that sets
resp.ExtraFields.ModelDeployment = deployment when resp != nil (i.e., a small
function that mutates and returns the response) so every streamed chunk is
stamped with the ModelDeployment value just like the sync path does.

In `@core/providers/gemini/gemini.go`:
- Around line 1614-1671: Update the incorrect comment in
handleImagenImageGeneration to say it uses the Generative Language API
(x-goog-api-key / /models/{model}:predict), not "Vertex AI endpoint with API key
auth", and add a nil-check after calling
GeminiImagenResponse.ToBifrostImageGenerationResponse() in
handleImagenImageGeneration (or wherever GeminiImagenResponse is converted) to
handle the case where ToBifrostImageGenerationResponse() returns nil before you
dereference response.ExtraFields; if nil, return a BifrostError with ExtraFields
populated (Provider, ModelRequested, RequestType) or a suitable error. Ensure
references to ToBifrostImageGenerationResponse, GeminiImagenResponse, and
handleImagenImageGeneration are used to locate the changes.

In `@core/providers/huggingface/huggingface_test.go`:
- Around line 34-36: The ImageGenerationModel value is using a duplicated
"fal-ai" prefix; update the test data to remove the extra segment so it matches
other fal-ai entries and the URL-building logic in utils.go (which already
prepends "/fal-ai/"); specifically change the ImageGenerationModel in
huggingface_test.go from "fal-ai/fal-ai/flux/dev" to either "fal-ai/flux/dev" or
just "flux/dev" as appropriate, and make the same fix in
core/internal/testutil/account.go where "fal-ai/fal-ai/flux-2" appears so both
locations are consistent with utils.go's model path handling.

In `@core/providers/huggingface/images.go`:
- Around line 329-349: The switch branch handling falAI currently only reads the
legacy flattened falResponse.Images; update it to handle both formats by
checking HuggingFaceFalAIImageGenerationResponse.Data.Images first and using
that if non-empty, otherwise fall back to falResponse.Images (or vice-versa per
types.go), then map whichever slice is present into schemas.ImageData
(preserving URL, B64JSON, Index) before returning the
schemas.BifrostImageGenerationResponse; alternatively, if you choose not to
implement fallback, add a clear comment in the falAI branch stating only the
flattened Images format is expected and why.

In `@core/providers/huggingface/models.go`:
- Around line 67-69: The text-to-image pipeline handling currently only
registers non-streaming requests; update the switch case handling
"text-to-image" to call addMethods with both schemas.ImageGenerationRequest and
schemas.ImageGenerationStreamRequest (so streaming image-generation is
supported), and extend the tag-based capability detection loop (the code that
checks tags for embeddings/chat/speech/transcription) to detect image-generation
tags such as "text-to-image", "image-generation", "image-generation.*" (or
similar patterns) and mark the model accordingly so tag detection matches the
pipeline case.

In `@tests/integrations/python/config.yml`:
- Around line 170-171: The huggingface image_generation model path has a
duplicated namespace; update the value of the image_generation key (currently
"fal-ai/fal-ai/flux/dev") to the correct model ID "fal-ai/flux/dev" so it
follows the namespace/model/variant format used by FAL.
♻️ Duplicate comments (19)
docs/openapi/schemas/inference/images.yaml (1)

195-203: Consider adding n_images field to ImageTokenDetails.

Per the Go implementation, OutputTokensDetails.NImages tracks the number of generated images. Adding this field would align the schema with the actual response shape.

♻️ Suggested addition
 ImageTokenDetails:
   type: object
   properties:
     image_tokens:
       type: integer
       description: Tokens used for images
     text_tokens:
       type: integer
       description: Tokens used for text
+    n_images:
+      type: integer
+      description: Number of images generated
core/providers/openai/openai_test.go (1)

43-83: All test scenarios remain disabled - blocking for merge.

This was previously flagged: all scenarios except ImageGeneration and ImageGenerationStream are set to false. This effectively disables comprehensive OpenAI provider testing.

Before merging to main, please re-enable the appropriate test scenarios to maintain full test coverage. Compare with the HuggingFace test configuration which properly keeps other scenarios enabled alongside the new image generation tests.

core/providers/vertex/vertex.go (2)

1493-1559: API-key authentication is not supported for Vertex AI predict endpoints.

The code still uses API-key-in-query authentication for Imagen (lines 1514-1516), Gemini (lines 1523-1525), and fine-tuned models (lines 1503-1505). Per Vertex AI documentation, these endpoints require OAuth2 Bearer tokens—API keys will fail with "API keys are not supported by this API" errors.

The conditional at lines 1545-1558 skips OAuth2 token fetching when authQuery is set, leaving no fallback.


1420-1431: Add nil-request guard to prevent panic on request.Model dereference.

Line 1427 dereferences request.Model before any nil check. While upstream validation may exist, defensive programming within the method prevents panics if called incorrectly.

Based on learnings, the pattern used in Nebius (and OpenAI) should be applied here.

Proposed fix
 func (provider *VertexProvider) ImageGeneration(ctx *schemas.BifrostContext, key schemas.Key, request *schemas.BifrostImageGenerationRequest) (*schemas.BifrostImageGenerationResponse, *schemas.BifrostError) {
 	providerName := provider.GetProviderKey()
 
+	if request == nil {
+		return nil, providerUtils.NewBifrostOperationError("image generation request is nil", nil, providerName)
+	}
+
+	if request.Input == nil || strings.TrimSpace(request.Input.Prompt) == "" {
+		return nil, providerUtils.NewBifrostOperationError("prompt cannot be empty", nil, providerName)
+	}
+
 	if key.VertexKeyConfig == nil {
 		return nil, providerUtils.NewConfigurationError("vertex key config is not set", providerName)
 	}
docs/providers/supported-providers/nebius.mdx (1)

153-156: Change guidance_scale type from int to float to match Nebius API specification.

The Nebius API specifies guidance_scale as a numeric float (range 0–100), but the documentation shows int. This was flagged in a previous review and requires updates in both documentation and implementation.
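A sketch of the corrected schema fragment (property layout assumed to follow the docs page's existing OpenAPI-style conventions; the 0–100 range comes from the Nebius description above):

```yaml
guidance_scale:
  type: number
  format: float
  minimum: 0
  maximum: 100
  description: Classifier-free guidance scale (float, per the Nebius API)
```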

core/providers/azure/azure.go (1)

1215-1225: Guard nil/empty requests and use getModelDeployment (prevents panics).
This is the same issue already called out in prior review comments: request.Model is read without a nil check, and deployment lookup bypasses getModelDeployment.

Also applies to: 1257-1273

framework/modelcatalog/pricing.go (1)

276-319: Tiering fallback still missing when imageUsage.TotalTokens is zero but other token counts exist.

When providers omit total_tokens, imageTotalTokens remains 0 (Line 279), causing isAbove128k/isAbove200k (Lines 322-324) to always be false. This under-tiers requests that should use higher-tier pricing.

The per-image path (Lines 282-319) correctly short-circuits when all tokens are zero, but the token-based path (Lines 321+) will mis-tier if TotalTokens is zero while InputTokens/OutputTokens are populated.

Proposed fix to compute fallback total
 	// Use imageUsage.TotalTokens for tier determination
 	imageTotalTokens := imageUsage.TotalTokens
+	if imageTotalTokens == 0 {
+		// Compute fallback from available token counts
+		if imageUsage.InputTokensDetails != nil {
+			imageTotalTokens += imageUsage.InputTokensDetails.ImageTokens + imageUsage.InputTokensDetails.TextTokens
+		} else {
+			imageTotalTokens += imageUsage.InputTokens
+		}
+		if imageUsage.OutputTokensDetails != nil {
+			imageTotalTokens += imageUsage.OutputTokensDetails.ImageTokens + imageUsage.OutputTokensDetails.TextTokens
+		} else {
+			imageTotalTokens += imageUsage.OutputTokens
+		}
+	}
 
 	// Check if tokens are zero/nil - if so, use per-image pricing
-	if imageTotalTokens == 0 && imageUsage.InputTokens == 0 && imageUsage.OutputTokens == 0 {
+	if imageTotalTokens == 0 {
framework/streaming/images.go (1)

45-114: Multi-chunk path still drops URL and uses synthetic Created timestamp.

As noted in a prior review, the multi-chunk reconstruction path:

  1. Sets Created: time.Now().Unix() (Line 108) instead of preserving provider timestamps from chunks
  2. Does not track or preserve URL fields (Lines 84-94 only set B64JSON, Index, RevisedPrompt)

If a provider streams URL-only images or includes the timestamp only in certain chunks, this data is lost.

Proposed fix
 	// Reconstruct complete images from chunks
 	images := make(map[int]*strings.Builder)
+	imageURLs := make(map[int]string)
 	var model string
 	var revisedPrompts map[int]string = make(map[int]string)
+	var createdAt int64

 	for _, chunk := range chunks {
 		if chunk.Delta == nil {
 			continue
 		}

 		// Extract metadata
 		if model == "" && chunk.Delta.ExtraFields.ModelRequested != "" {
 			model = chunk.Delta.ExtraFields.ModelRequested
 		}

+		// Track created timestamp (first non-zero wins)
+		if createdAt == 0 && chunk.Delta.CreatedAt != 0 {
+			createdAt = chunk.Delta.CreatedAt
+		}

 		// Store revised prompt if present (usually in first chunk)
 		if chunk.Delta.RevisedPrompt != "" {
 			revisedPrompts[chunk.ImageIndex] = chunk.Delta.RevisedPrompt
 		}

+		// Track URL if present (last non-empty wins)
+		if chunk.Delta.URL != "" {
+			imageURLs[chunk.ImageIndex] = chunk.Delta.URL
+		}

 		// Reconstruct base64 for each image
 		// ...
 	}

 	// In ImageData construction:
 	imageData = append(imageData, schemas.ImageData{
 		B64JSON:       builder.String(),
+		URL:           imageURLs[imageIndex],
 		Index:         imageIndex,
 		RevisedPrompt: revisedPrompts[imageIndex],
 	})

 	// In finalResponse:
 	finalResponse := &schemas.BifrostImageGenerationResponse{
 		ID:      responseID,
-		Created: time.Now().Unix(),
+		Created: func() int64 { if createdAt != 0 { return createdAt }; return time.Now().Unix() }(),
 		Model:   model,
 		Data:    imageData,
 	}
docs/openapi/paths/integrations/openai/images.yaml (1)

62-66: Add required: true for api-version parameter.

Azure OpenAI image generation endpoints require the api-version query parameter. The endpoint pattern per Microsoft docs is POST https://<resource>.openai.azure.com/openai/deployments/<deployment>/images/generations?api-version=<api_version>.

🔧 Suggested fix
       - name: api-version
         in: query
+        required: true
         schema:
           type: string
         description: Azure API version
core/providers/openai/openai.go (1)

2665-2879: Fix image stream termination: handle [DONE] and don’t exit on the first completed image
Today the loop (a) treats data: [DONE] as JSON (warn-loop), and (b) returns on the first image_generation.completed, truncating n>1.

Proposed fix (handle [DONE], track expected images, only end once all are complete)
@@
-		lastChunkTime := startTime
+		lastChunkTime := startTime
+		expectedImages := 1
+		if request != nil && request.Params != nil && request.Params.N != nil && *request.Params.N > 0 {
+			expectedImages = *request.Params.N
+		}
+		completedImages := make(map[int]struct{}, expectedImages)
@@
-			jsonData := strings.TrimSpace(strings.TrimPrefix(line, "data:"))
+			jsonData := strings.TrimSpace(strings.TrimPrefix(line, "data:"))
 			if jsonData == "" {
 				continue
 			}
+			if jsonData == "[DONE]" {
+				break
+			}
@@
-			// Determine if this is the final chunk
+			// Determine if this is a completed chunk (OpenAI emits one "completed" per image)
 			isCompleted := response.Type == ImageGenerationCompleted
@@
-			// Only set PartialImageIndex for partial images, not for completed events
-			if !isCompleted {
-				chunk.PartialImageIndex = response.PartialImageIndex
-			}
+			// Preserve provider-supplied index for both partial and completed chunks
+			chunk.PartialImageIndex = response.PartialImageIndex
@@
-			if isCompleted {
+			if isCompleted {
+				completedImages[imageIndex] = struct{}{}
 				if collectedUsage != nil {
@@
-				ctx.SetValue(schemas.BifrostContextKeyStreamEndIndicator, true)
 			}
@@
-			if isCompleted {
-				return
-			}
+			if isCompleted && len(completedImages) >= expectedImages {
+				ctx.SetValue(schemas.BifrostContextKeyStreamEndIndicator, true)
+				return
+			}
 		}
transports/bifrost-http/integrations/router.go (1)

1279-1295: Streaming can still panic: converters are invoked without nil checks (incl. image generation).

config.StreamConfig.*ResponseConverter(...) is called directly (Line 1281-1291). If an integration forgets to set one converter, this will nil-deref at runtime.

Proposed fix (pattern for each case)
 				switch {
 				case chunk.BifrostTextCompletionResponse != nil:
+					if config.StreamConfig.TextStreamResponseConverter == nil {
+						err = fmt.Errorf("TextStreamResponseConverter not configured")
+						break
+					}
 					eventType, convertedResponse, err = config.StreamConfig.TextStreamResponseConverter(bifrostCtx, chunk.BifrostTextCompletionResponse)
 				case chunk.BifrostChatResponse != nil:
+					if config.StreamConfig.ChatStreamResponseConverter == nil {
+						err = fmt.Errorf("ChatStreamResponseConverter not configured")
+						break
+					}
 					eventType, convertedResponse, err = config.StreamConfig.ChatStreamResponseConverter(bifrostCtx, chunk.BifrostChatResponse)
@@
 				case chunk.BifrostImageGenerationStreamResponse != nil:
+					if config.StreamConfig.ImageGenerationStreamResponseConverter == nil {
+						err = fmt.Errorf("ImageGenerationStreamResponseConverter not configured")
+						break
+					}
 					eventType, convertedResponse, err = config.StreamConfig.ImageGenerationStreamResponseConverter(bifrostCtx, chunk.BifrostImageGenerationStreamResponse)
core/providers/gemini/images.go (3)

79-92: Potential nil-pointer panic when iterating content.Parts (guard part == nil).

If content.Parts is []*Part (typical in this codebase), part.Text (Line 83) can panic.

Proposed fix
 		for _, content := range request.Contents {
 			for _, part := range content.Parts {
-				if part.Text != "" {
+				if part == nil {
+					continue
+				}
+				if part.Text != "" {
 					bifrostReq.Input.Prompt = part.Text
 					break
 				}
 			}

97-128: Imagen imageSize parsing is case-sensitive and missing "4k".

convertImagenFormatToSize() only matches "1k"/"2k" (Line 102-107). If callers send "1K"/"2K"/"4K", this will produce wrong sizes.

Proposed fix
 func convertImagenFormatToSize(imageSize *string, aspectRatio *string) string {
 	// Default size based on imageSize parameter
 	baseSize := 1024
 	if imageSize != nil {
-		switch *imageSize {
-		case "2k":
+		switch strings.ToLower(strings.TrimSpace(*imageSize)) {
+		case "4k":
+			baseSize = 4096
+		case "2k":
 			baseSize = 2048
-		case "1k":
+		case "1k":
 			baseSize = 1024
 		}
 	}

130-188: Nil receiver + “empty Data but success” risk in ToBifrostImageGenerationResponse().

  • If response is nil, response.ResponseID will panic (Line 131-134).
  • If there’s no InlineData, you return bifrostResp with Data: [] and no error (Line 169-175), which is a silent failure mode.
Proposed fix
 func (response *GenerateContentResponse) ToBifrostImageGenerationResponse() (*schemas.BifrostImageGenerationResponse, *schemas.BifrostError) {
+	if response == nil {
+		return nil, &schemas.BifrostError{
+			IsBifrostError: false,
+			Error: &schemas.ErrorField{Message: "nil Gemini response"},
+		}
+	}
 	bifrostResp := &schemas.BifrostImageGenerationResponse{
 		ID:    response.ResponseID,
 		Model: response.ModelVersion,
 		Data:  []schemas.ImageData{},
 	}
@@
 			// Only assign imageData when it has elements
-			if len(imageData) > 0 {
-				bifrostResp.Data = imageData
-				// Only set ImageGenerationResponseParameters when metadata exists
-				if len(imageMetadata) > 0 {
-					bifrostResp.ImageGenerationResponseParameters = &imageMetadata[0]
-				}
-			}
+			if len(imageData) == 0 {
+				return nil, &schemas.BifrostError{
+					IsBifrostError: false,
+					Error: &schemas.ErrorField{Message: "no image data found in Gemini response"},
+				}
+			}
+			bifrostResp.Data = imageData
+			if len(imageMetadata) > 0 {
+				bifrostResp.ImageGenerationResponseParameters = &imageMetadata[0]
+			}
core/internal/testutil/test_retry_framework.go (3)

980-995: Type mismatch: DefaultImageGenerationRetryConfig() returns TestRetryConfig, but image-gen retry uses ImageGenerationRetryConfig conditions.
This is easy to accidentally “wire wrong” (default conditions never evaluated), and it also diverges from the core/internal/testutil convention of building typed configs from GetTestRetryConfigForScenario() and leaving Conditions empty. Based on learnings, this should be consistent across core/internal/testutil.

Suggested verification (confirm how image-gen tests build ImageGenerationRetryConfig and whether conditions are ever applied):

#!/bin/bash
set -euo pipefail

rg -n --type=go '\bDefaultImageGenerationRetryConfig\s*\(' -S
rg -n --type=go '\bWithImageGenerationRetry\s*\(' -S
rg -n --type=go '\bImageGenerationRetryConfig\s*\{' -S
rg -n --type=go '\bGetTestRetryConfigForScenario\s*\(' core/internal/testutil -S

If the intent is “typed configs derive timings from scenario config, conditions empty”, then consider changing DefaultImageGenerationRetryConfig() to return the same baseline timings but no conditions, and ensure the typed config is constructed in the test helper accordingly.

Also applies to: 1250-1252


2097-2260: Avoid false-pass: (nil response, nil err) currently exits retries and returns success-like (nil, nil).
If operation() ever returns (nil, nil), this breaks out without OnFinalFail, and callers may treat it as success.

Proposed fix
@@
 		// Execute the operation
 		response, err := operation()
 		lastResponse = response
 		lastError = err
 
+		// Treat nil response as a failure (even if err is nil) to avoid false-passing tests
+		if response == nil {
+			if attempt < config.MaxAttempts {
+				retryReason := "❌ image generation response is nil"
+				if config.OnRetry != nil {
+					config.OnRetry(attempt, retryReason, t)
+				}
+				delay := calculateRetryDelay(attempt-1, config.BaseDelay, config.MaxDelay)
+				time.Sleep(delay)
+				continue
+			}
+			statusCode := 400
+			return nil, &schemas.BifrostError{
+				IsBifrostError: true,
+				StatusCode:     &statusCode,
+				Error: &schemas.ErrorField{
+					Message: fmt.Sprintf("❌ image generation response is nil after %d attempts", attempt),
+				},
+			}
+		}
-
-		// If we have a response, validate it FIRST
-		if response != nil {
+		// If we have a response, validate it FIRST
+		if response != nil {
 			validationResult := ValidateImageGenerationResponse(t, response, err, expectations, scenarioName)

2947-2955: Comment/name mismatch: “WithImageGenerationStreamValidationRetry” vs WithImageGenerationStreamRetry.
Either rename the function for consistency with other *StreamValidationRetry helpers, or fix the comment header.

tests/integrations/python/tests/test_google.py (1)

1691-1786: Fix cross-provider key gating + Ruff (ARG002/BLE001) in image-generation tests (avoid skipping real failures).
These tests are parametrized across providers, but @skip_if_no_api_key("google") only guards the Google key; get_provider_google_client(provider) can still raise for missing provider keys. Also except Exception will swallow AssertionError and violate Ruff BLE001.

Proposed diff
@@
    @skip_if_no_api_key("google")
    @pytest.mark.parametrize("provider,model", get_cross_provider_params_for_scenario("image_generation"))
-    def test_41a_image_generation_simple(self, test_config, provider, model):
+    def test_41a_image_generation_simple(self, _test_config, provider, model):
@@
-        # Get provider-specific client
-        client = get_provider_google_client(provider)
+        # Get provider-specific client (skip cleanly if provider key is missing)
+        try:
+            client = get_provider_google_client(provider)
+        except ValueError:
+            pytest.skip(f"No API key available for provider {provider}")
@@
    @skip_if_no_api_key("google")
    @pytest.mark.parametrize("provider,model", get_cross_provider_params_for_scenario("imagen"))
-    def test_41b_imagen_predict(self, test_config, provider, model):
+    def test_41b_imagen_predict(self, _test_config, provider, model):
@@
-        # Get provider-specific client
-        client = get_provider_google_client(provider)
+        # Get provider-specific client (skip cleanly if provider key is missing)
+        try:
+            client = get_provider_google_client(provider)
+        except ValueError:
+            pytest.skip(f"No API key available for provider {provider}")
@@
-        try:
-            response = client.models.generate_content(
+        try:
+            response = client.models.generate_content(
                 model=format_provider_model(provider, model),
                 contents=IMAGE_GENERATION_SIMPLE_PROMPT,
                 config=types.GenerateContentConfig()
             )
@@
-        except Exception as e:
+        except Exception as e:  # TODO: narrow to google-genai specific exceptions once confirmed
             # Imagen may not be available in all regions or configurations
             pytest.skip(f"Imagen generation failed: {e}")
@@
    @skip_if_no_api_key("google")
    @pytest.mark.parametrize("provider,model", get_cross_provider_params_for_scenario("image_generation"))
-    def test_41c_image_generation_with_text(self, test_config, provider, model):
+    def test_41c_image_generation_with_text(self, _test_config, provider, model):
@@
-        # Get provider-specific client
-        client = get_provider_google_client(provider)
+        # Get provider-specific client (skip cleanly if provider key is missing)
+        try:
+            client = get_provider_google_client(provider)
+        except ValueError:
+            pytest.skip(f"No API key available for provider {provider}")
@@
-        for candidate in response.candidates:
-            if hasattr(candidate, "content") and candidate.content:
-                if hasattr(candidate.content, "parts") and candidate.content.parts:
-                    for part in candidate.content.parts:
-                        if hasattr(part, "text") and part.text:
-                            found_text = True
-                        if hasattr(part, "inline_data") and part.inline_data:
-                            found_image = True
+        for candidate in (getattr(response, "candidates", None) or []):
+            content = getattr(candidate, "content", None)
+            for part in (getattr(content, "parts", None) or []):
+                if getattr(part, "text", None):
+                    found_text = True
+                if getattr(part, "inline_data", None):
+                    found_image = True

To fully resolve BLE001 without risking false skips, please confirm which exception types google.genai raises for unavailable Imagen / region restrictions, and replace the Exception catch accordingly.

tests/integrations/python/tests/test_openai.py (1)

1166-1258: Tighten cross-provider capability gating for image-gen params + fix Ruff ARG002.
As written, n=2, quality="low", and size="1024x1536" can be unsupported for some provider/model combos; also test_config is unused.

Proposed diff
@@
    @pytest.mark.parametrize(
         "provider,model,vk_enabled", get_cross_provider_params_with_vk_for_scenario("image_generation")
     )
-    def test_52a_image_generation_simple(self, test_config, provider, model, vk_enabled):
+    def test_52a_image_generation_simple(self, _test_config, provider, model, vk_enabled):
@@
    @pytest.mark.parametrize(
         "provider,model,vk_enabled", get_cross_provider_params_with_vk_for_scenario("image_generation")
     )
-    def test_52b_image_generation_multiple(self, test_config, provider, model, vk_enabled):
+    def test_52b_image_generation_multiple(self, _test_config, provider, model, vk_enabled):
@@
-        if model == "gemini-2.5-flash-image":
+        if provider == "gemini" and model == "gemini-2.5-flash-image":
             pytest.skip("Gemini 2.5 flash image does not support multiple images")
@@
-        response = client.images.generate(
-            model=format_provider_model(provider, model),
-            prompt=IMAGE_GENERATION_SIMPLE_PROMPT,
-            n=2,
-            size="1024x1024",
-        )
+        try:
+            response = client.images.generate(
+                model=format_provider_model(provider, model),
+                prompt=IMAGE_GENERATION_SIMPLE_PROMPT,
+                n=2,
+                size="1024x1024",
+            )
+        except Exception as e:
+            pytest.skip(f"{provider}/{model} does not support n=2 image generation: {e}")
@@
    @pytest.mark.parametrize(
         "provider,model,vk_enabled", get_cross_provider_params_with_vk_for_scenario("image_generation")
     )
-    def test_52c_image_generation_quality(self, test_config, provider, model, vk_enabled):
+    def test_52c_image_generation_quality(self, _test_config, provider, model, vk_enabled):
@@
-        response = client.images.generate(
-            model=format_provider_model(provider, model),
-            prompt=IMAGE_GENERATION_SIMPLE_PROMPT,
-            n=1,
-            size="1024x1024",
-            quality="low",  # gpt-image-1 supports quality parameter
-        )
+        try:
+            response = client.images.generate(
+                model=format_provider_model(provider, model),
+                prompt=IMAGE_GENERATION_SIMPLE_PROMPT,
+                n=1,
+                size="1024x1024",
+                quality="low",  # intended for gpt-image-1
+            )
+        except Exception as e:
+            pytest.skip(f"{provider}/{model} does not support quality param: {e}")
@@
    @pytest.mark.parametrize(
         "provider,model,vk_enabled", get_cross_provider_params_with_vk_for_scenario("image_generation")
     )
-    def test_52d_image_generation_different_sizes(self, test_config, provider, model, vk_enabled):
+    def test_52d_image_generation_different_sizes(self, _test_config, provider, model, vk_enabled):
@@
-        response = client.images.generate(
-            model=format_provider_model(provider, model),
-            prompt=IMAGE_GENERATION_SIMPLE_PROMPT,
-            n=1,
-            size="1024x1536",  # Portrait orientation
-        )
+        try:
+            response = client.images.generate(
+                model=format_provider_model(provider, model),
+                prompt=IMAGE_GENERATION_SIMPLE_PROMPT,
+                n=1,
+                size="1024x1536",  # Portrait orientation
+            )
+        except Exception as e:
+            pytest.skip(f"{provider}/{model} does not support size=1024x1536: {e}")

Based on learnings, please also verify the stack updates the integration config so providers.openai.image_generation is set to gpt-image-1 (and ideally document that in the test config to avoid DALLE-3 constraints leaking back in).

🧹 Nitpick comments (28)
core/schemas/utils.go (1)

1183-1186: Inconsistent case handling compared to peer functions.

This function uses strings.ToLower() for case-insensitive matching, while peer functions like IsNovaModel, IsGeminiModel, IsMistralModel, and IsAnthropicModel do not apply case normalization. Consider aligning with the existing pattern unless there's a specific reason Imagen model identifiers may vary in case.

Suggested change for consistency
 // IsImagenModel checks if the model is an Imagen model.
 func IsImagenModel(model string) bool {
-	return strings.Contains(strings.ToLower(model), "imagen")
+	return strings.Contains(model, "imagen")
 }

If case-insensitivity is intentional (e.g., provider APIs return mixed-case model names), please disregard this suggestion and consider adding a brief comment explaining the rationale.

ui/components/chat/ImageMessage.tsx (2)

40-53: Move pure helper function outside component scope.

getMimeType is a pure function with no dependencies on component state or props. Defining it inside the component causes it to be recreated on every render.

♻️ Suggested refactor
+// Convert output_format to MIME type for data URLs
+const getMimeType = (format?: string): string => {
+  switch (format?.toLowerCase()) {
+    case 'png':
+      return 'image/png';
+    case 'jpeg':
+    case 'jpg':
+      return 'image/jpeg';
+    case 'webp':
+      return 'image/webp';
+    default:
+      return 'image/png';
+  }
+};
+
 export const ImageMessage: React.FC<ImageMessageProps> = ({
   image,
   isStreaming,
   streamProgress,
 }) => {
   // Streaming placeholder...
   
   // No usable image data...
   
-  // Convert output_format to MIME type for data URLs
-  const getMimeType = (format?: string): string => {
-    switch (format?.toLowerCase()) {
-      case 'png':
-        return 'image/png';
-      case 'jpeg':
-      case 'jpg':
-        return 'image/jpeg';
-      case 'webp':
-        return 'image/webp';
-      default:
-        // Default to PNG for backward compatibility
-        return 'image/png';
-    }
-  };
-
   const dataUrl = image.url 

63-68: Consider enhanced alt text and image error handling.

Two suggestions for improved UX and accessibility:

  1. Alt text: ImageMessageData includes revised_prompt which often better describes AI-generated images. Consider using it when prompt is unavailable.

  2. Error handling: No onError handler means broken URLs or invalid base64 will show the browser's default broken image icon without user feedback.

♻️ Suggested improvements
+  const altText = image.prompt || image.revised_prompt || `Generated image ${(image.index ?? 0) + 1}`;
+
   return (
     <div className="my-4">
       <Card className="p-0">
         <div className="border border-border overflow-auto">
           <img
             src={dataUrl}
-            alt={image.prompt || `image-${image.index ?? 0}`}
+            alt={altText}
             className="w-auto h-auto"
             loading="lazy"
+            onError={(e) => {
+              e.currentTarget.style.display = 'none';
+            }}
           />
         </div>
       </Card>
     </div>
   );
docs/quickstart/gateway/multimodal.mdx (1)

49-98: LGTM — Image generation section is well-documented.

The new Image Generation section follows the established documentation patterns in this file, with a clear curl example and comprehensive response format including usage metrics and extra_fields.

Optional nit: Lines 97-98 introduce a double blank line before the next section, while other sections use a single blank line. Consider removing one for consistency.

Optional formatting fix
     }
 }

Audio Understanding: Analyzing Audio with AI


tests/integrations/python/config.yml (1)

456-461: Consider adding explicit false values for unsupported scenarios.
While the minimal configuration works (undefined scenarios likely default to false), other providers explicitly list all scenarios. For consistency and clarity, consider explicitly defining unsupported scenarios as false.

This is a minor consistency suggestion; the current configuration is functional.

docs/providers/supported-providers/groq.mdx (1)

25-25: Image Generation entry missing from Unsupported Features table.
The Image Generation row was added to the Supported Operations table (line 25), but unlike the Cerebras and Parasail documentation, it is not added to the Unsupported Features table (lines 126-135). For consistency across provider documentation, consider adding an entry.

📝 Suggested addition to Unsupported Features table
Add after line 131:
 | Embedding | Not offered by Groq API |
+| Image Generation | Not offered by Groq API |
 | Speech/TTS | Not offered by Groq API |
core/providers/nebius/errors.go (1)

22-51: Consider preserving Detail.Message when validation errors are also present.

When both Detail.Message and ValidationErrors exist, the current logic overwrites the original Detail.Message (line 24) with the aggregated validation error messages (line 41). This may lose valuable context from the original error message.

Consider combining both messages instead:

💡 Suggested change to preserve both messages
 	if nebiusErr.Detail != nil {
+		var originalMessage string
 		if nebiusErr.Detail.Message != nil {
-			message = *nebiusErr.Detail.Message
+			originalMessage = *nebiusErr.Detail.Message
 		}

 		if len(nebiusErr.Detail.ValidationErrors) > 0 {
 			var messages []string
 			var locations []string

 			for _, detail := range nebiusErr.Detail.ValidationErrors {
 				if detail.Msg != "" {
 					messages = append(messages, detail.Msg)
 				}
 				if len(detail.Loc) > 0 {
 					locations = append(locations, strings.Join(detail.Loc, "."))
 				}
 			}

 			if len(messages) > 0 {
 				message = strings.Join(messages, "; ")
 			}
 			if len(locations) > 0 {
 				locationStr := strings.Join(locations, ", ")
 				if message == "" {
 					message = "[" + locationStr + "]"
 				} else {
 					message = message + " [" + locationStr + "]"
 				}
 			}
+			// Prepend original message if present
+			if originalMessage != "" && message != "" {
+				message = originalMessage + ": " + message
+			} else if originalMessage != "" {
+				message = originalMessage
+			}
+		} else {
+			message = originalMessage
 		}
 	}
core/providers/openai/types.go (1)

546-546: Define constants for ImageGenerationEventType values.

ImageGenerationEventType is declared as a type alias but no constants are defined in this file. Per learnings, the expected values are "image_generation.partial_image", "image_generation.completed", and "error". Consider defining these as constants for type safety and discoverability.

♻️ Suggested constants
const (
	ImageGenerationEventTypePartialImage ImageGenerationEventType = "image_generation.partial_image"
	ImageGenerationEventTypeCompleted    ImageGenerationEventType = "image_generation.completed"
	ImageGenerationEventTypeError        ImageGenerationEventType = "error"
)
docs/openapi/schemas/inference/images.yaml (2)

80-86: Add "jpg" as an accepted value for output_format.

The schema only lists png, webp, and jpeg, but providers may normalize jpg → jpeg. For API completeness and to avoid client confusion, consider adding jpg to the enum.

♻️ Suggested change
         output_format:
           type: string
           enum:
             - "png"
             - "webp"
             - "jpeg"
+            - "jpg"
           description: Output image format

Also applies to: 122-128, 252-258


166-177: Remove orphaned ImageGenerationResponseParameters schema.

The schema is not referenced anywhere in the codebase and duplicates properties already defined inline in ImageGenerationResponse (lines 119–143). Removing it will eliminate schema bloat without impact.

docs/providers/supported-providers/openai.mdx (1)

218-273: Comprehensive documentation for Image Generation API.

The documentation thoroughly covers:

  • Request parameters with types and notes
  • Bifrost-to-OpenAI conversion details
  • Streaming response handling with SSE event types
  • Per-chunk metadata and tracking

One minor style suggestion from static analysis: Lines 244-246 have three consecutive bullet points starting with the same pattern. Consider varying the sentence structure for readability.

📝 Optional: Vary sentence beginnings
-**Request Conversion (Bifrost → OpenAI)**
-
-OpenAI is the baseline schema for image generation. Parameters are passed through with minimal conversion:
-
-- **Model & Prompt**: `bifrostReq.Model` → `req.Model`, `bifrostReq.Input.Prompt` → `req.Prompt`
-- **Parameters**: All fields from `bifrostReq.Params` (`ImageGenerationParameters`) are embedded directly into the OpenAI request struct via struct embedding. No field mapping or transformation is performed.
-- **Streaming**: When streaming is requested, `stream: true` is set in the request body.
+**Request Conversion (Bifrost → OpenAI)**
+
+OpenAI is the baseline schema for image generation. Parameters are passed through with minimal conversion:
+
+- **Model & Prompt**: `bifrostReq.Model` → `req.Model`, `bifrostReq.Input.Prompt` → `req.Prompt`
+- **Parameters**: All fields from `bifrostReq.Params` (`ImageGenerationParameters`) are embedded directly into the OpenAI request struct via struct embedding. No field mapping or transformation is performed.
+- For **Streaming** requests, `stream: true` is set in the request body.
core/internal/testutil/validation_presets.go (1)

222-237: Consider removing the inline comment on line 222.

The comment // In validation_presets.go - add this function appears to be a leftover from development notes. It's redundant since the function is already in the file.

Proposed fix
-// In validation_presets.go - add this function
+// ImageGenerationExpectations returns validation expectations for image generation scenarios
 func ImageGenerationExpectations(minImages int, expectedSize string) ResponseExpectations {
docs/providers/supported-providers/azure.mdx (1)

429-429: Minor formatting: Add newline before section separator.

Line 429 has the --- separator immediately after the "Streaming" section content without a blank line, which differs from the formatting used elsewhere in this file.

📝 Suggested fix
 Image generation streaming is supported and uses OpenAI's streaming format with Server-Sent Events (SSE).
+
 ---
framework/logstore/migrations.go (1)

684-748: Consider consolidating migrations for related columns.

The two migrations (migrationAddImageGenerationOutputColumn and migrationAddImageGenerationInputColumn) follow the established pattern and are correct. However, since both columns are part of the same feature (image generation support), they could be consolidated into a single migration function, similar to how migrationAddCostAndCacheDebugColumn handles multiple related columns.

That said, keeping them separate provides finer-grained rollback control. This is a minor suggestion and can be deferred.

♻️ Optional: Consolidated migration approach
func migrationAddImageGenerationColumns(ctx context.Context, db *gorm.DB) error {
	opts := *migrator.DefaultOptions
	opts.UseTransaction = true
	m := migrator.New(db, &opts, []*migrator.Migration{{
		ID: "logs_add_image_generation_columns",
		Migrate: func(tx *gorm.DB) error {
			tx = tx.WithContext(ctx)
			migrator := tx.Migrator()
			if !migrator.HasColumn(&Log{}, "image_generation_input") {
				if err := migrator.AddColumn(&Log{}, "image_generation_input"); err != nil {
					return err
				}
			}
			if !migrator.HasColumn(&Log{}, "image_generation_output") {
				if err := migrator.AddColumn(&Log{}, "image_generation_output"); err != nil {
					return err
				}
			}
			return nil
		},
		Rollback: func(tx *gorm.DB) error {
			tx = tx.WithContext(ctx)
			migrator := tx.Migrator()
			if migrator.HasColumn(&Log{}, "image_generation_input") {
				if err := migrator.DropColumn(&Log{}, "image_generation_input"); err != nil {
					return err
				}
			}
			if migrator.HasColumn(&Log{}, "image_generation_output") {
				if err := migrator.DropColumn(&Log{}, "image_generation_output"); err != nil {
					return err
				}
			}
			return nil
		},
	}})
	err := m.Migrate()
	if err != nil {
		return fmt.Errorf("error while adding image generation columns: %s", err.Error())
	}
	return nil
}
core/providers/nebius/types.go (2)

10-24: Avoid emitting null for required request fields (model, prompt).
Since these are pointers without omitempty, a nil value serializes as null (easy to hit if a converter returns nil). Consider making them non-pointer string, or ensure converters/dispatch validate non-nil/non-empty before marshaling (same pattern used elsewhere for image requests).


55-75: NebiusErrorDetail.UnmarshalJSON: consider supporting object-shaped errors too (not just string/array).
Today, an object payload (e.g. {"message":"..."}) will fail to parse.

Possible hardening
 func (d *NebiusErrorDetail) UnmarshalJSON(data []byte) error {
 	// First, try to unmarshal as an array (validation errors)
 	trimmed := strings.TrimSpace(string(data))
 	if len(trimmed) > 0 && trimmed[0] == '[' {
@@
 	}
 
 	// If not an array, try to unmarshal as a string
 	var message string
 	if err := sonic.Unmarshal(data, &message); err != nil {
-		return err
+		// Fallback: object payloads like {"message":"..."}
+		var obj struct {
+			Message *string `json:"message"`
+			Detail  *string `json:"detail"`
+			Error   *string `json:"error"`
+		}
+		if err2 := sonic.Unmarshal(data, &obj); err2 != nil {
+			return err
+		}
+		if obj.Message != nil {
+			d.Message = obj.Message
+			return nil
+		}
+		if obj.Detail != nil {
+			d.Message = obj.Detail
+			return nil
+		}
+		if obj.Error != nil {
+			d.Message = obj.Error
+			return nil
+		}
+		// Unknown object shape; keep original error
+		return err
 	}
 	d.Message = &message
 	return nil
 }
core/providers/bedrock/bedrock.go (1)

1347-1355: Unsupported image ops: good to return provider.GetProviderKey(); consider aligning older unsupported ops too.
Right now Speech/SpeechStream/... pass schemas.Bedrock while image uses the computed provider key (helps custom providers).
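For illustration, the suggested pattern looks roughly like this (the error type is a simplified stand-in for schemas.BifrostError, and the helper name is hypothetical):

```go
package main

import "fmt"

// bifrostError is a simplified stand-in for schemas.BifrostError.
type bifrostError struct{ Message string }

// newUnsupportedOperationError mirrors returning the computed provider key
// (provider.GetProviderKey()) instead of the hardcoded schemas.Bedrock
// constant, so custom providers built on the Bedrock implementation report
// their own name in the error.
func newUnsupportedOperationError(op, providerKey string) *bifrostError {
	return &bifrostError{Message: fmt.Sprintf("%s is not supported by provider %q", op, providerKey)}
}

func main() {
	// A custom provider wrapping Bedrock reports its own key, not "bedrock".
	fmt.Println(newUnsupportedOperationError("speech", "my-custom-bedrock").Message)
}
```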

core/providers/gemini/responses.go (1)

1873-1900: Deduplicate inline-data-to-content-block logic (and apply the same data-URL prefix guard).
convertGeminiCandidatesToResponsesOutput re-implements inline data handling that overlaps with convertGeminiInlineDataToContentBlock, and currently hardcodes data:<mime>;base64,<data> without guarding for pre-prefixed data URLs (same risk as above).

core/internal/testutil/image_generation.go (1)

170-204: Align stream retry config construction with non-stream pattern in this directory.

The non-stream test explicitly constructs a typed ImageGenerationRetryConfig and sets Conditions: []ImageGenerationRetryCondition{}, while the stream test passes retryConfig (generic TestRetryConfig from the helper) directly to WithImageGenerationStreamRetry. Since the stream function expects TestRetryConfig but the non-stream expects ImageGenerationRetryConfig, consider either:

  • Having the stream test construct a typed ImageGenerationRetryConfig to match the non-stream pattern
  • Creating a dedicated ImageGenerationStreamRetryConfig type for consistency

This aligns retry config construction across similar test patterns in the directory.

framework/modelcatalog/pricing.go (1)

379-419: Consider consolidating repetitive image token rate logic.

The three tier branches (Lines 379-392, 393-406, 407-419) are identical since tiered image token pricing fields don't exist yet. This can be simplified to a single block.

Suggested simplification
-	// Determine image-specific token rates, with tiered pricing support
-	// Check for image token pricing fields and fall back to generic rates if not available
-	if isAbove200k {
-		// Prefer tiered image token pricing above 200k, fall back to base image token rate, then generic rate
-		// Note: InputCostPerImageTokenAbove200kTokens and OutputCostPerImageTokenAbove200kTokens
-		// may not exist in TableModelPricing yet, so we check base image token rate as fallback
-		if pricing.InputCostPerImageToken != nil {
-			inputImageTokenRate = *pricing.InputCostPerImageToken
-		} else {
-			inputImageTokenRate = inputTokenRate
-		}
-		if pricing.OutputCostPerImageToken != nil {
-			outputImageTokenRate = *pricing.OutputCostPerImageToken
-		} else {
-			outputImageTokenRate = outputTokenRate
-		}
-	} else if isAbove128k {
-		// ... same logic ...
-	} else {
-		// ... same logic ...
-	}
+	// Determine image-specific token rates
+	// TODO: Add tiered image token pricing when fields are added to TableModelPricing
+	if pricing.InputCostPerImageToken != nil {
+		inputImageTokenRate = *pricing.InputCostPerImageToken
+	} else {
+		inputImageTokenRate = inputTokenRate
+	}
+	if pricing.OutputCostPerImageToken != nil {
+		outputImageTokenRate = *pricing.OutputCostPerImageToken
+	} else {
+		outputImageTokenRate = outputTokenRate
+	}
core/bifrost.go (1)

1008-1059: Minor grammar nit in doc comments.

Lines 1008 and 1061 use "a image" which should be "an image" for correct grammar.

📝 Suggested fix
-// ImageGenerationRequest sends a image generation request to the specified provider.
+// ImageGenerationRequest sends an image generation request to the specified provider.
-// ImageGenerationStreamRequest sends a image generation stream request to the specified provider.
+// ImageGenerationStreamRequest sends an image generation stream request to the specified provider.
docs/openapi/paths/integrations/openai/images.yaml (1)

46-86: Consider adding streaming documentation to Azure endpoint.

The Azure endpoint description is minimal compared to the OpenAI endpoint. Since the response section includes text/event-stream, consider adding a note about SSE streaming support similar to the main endpoint.

core/schemas/images.go (2)

81-102: Streaming schema likely needs an externally-visible per-image identifier
Right now Index is not serialized (json:"-"). If you intend to support n>1 image streaming for clients, consider emitting either:

  • index (int) on the stream response, or
  • always populate partial_image_index on completed chunks too (so it’s present on every chunk).

22-38: Include ExtraParams in direct cache hash for image generation requests

ExtraParams is json:"-" and thus excluded from the request hash used for direct cache matching. Since providers use ExtraParams to modify output (verified in OpenAI, Gemini, Hugging Face, and other provider implementations), requests with identical parameters but different ExtraParams would incorrectly cache hit. For direct-only cache mode, this creates silent correctness issues. While ExtraParams appears in metadata for semantic caching, the hash-based direct cache should explicitly include ExtraParams to ensure cache key uniqueness.
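A minimal sketch of folding ExtraParams into the direct-cache key, assuming the field stays json:"-" on the wire (struct and field names here are stand-ins, not the actual schema):

```go
package main

import (
	"crypto/sha256"
	"encoding/json"
	"fmt"
)

// imageRequest is a stand-in for the real request type; ExtraParams is
// excluded from wire JSON via json:"-" but folded into the cache key below.
type imageRequest struct {
	Model       string         `json:"model"`
	Prompt      string         `json:"prompt"`
	ExtraParams map[string]any `json:"-"`
}

// cacheKey hashes the wire JSON plus a separate marshal of ExtraParams, so
// requests differing only in ExtraParams no longer collide. encoding/json
// sorts map keys, keeping the key deterministic.
func cacheKey(r imageRequest) string {
	base, _ := json.Marshal(r) // ExtraParams omitted here because of json:"-"
	extra, _ := json.Marshal(r.ExtraParams)
	sum := sha256.Sum256(append(base, extra...))
	return fmt.Sprintf("%x", sum)
}

func main() {
	a := imageRequest{Model: "m", Prompt: "p", ExtraParams: map[string]any{"seed": 1}}
	b := imageRequest{Model: "m", Prompt: "p", ExtraParams: map[string]any{"seed": 2}}
	fmt.Println(cacheKey(a) != cacheKey(b)) // true: differing ExtraParams now produce different keys
}
```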

core/internal/testutil/account.go (1)

47-48: Clarify fallback source of truth for image generation (generic Fallbacks vs ImageGenerationFallbacks).

ComprehensiveTestConfig.Fallbacks is commented as being used “for chat, responses, image and reasoning tests” (Line 78), but you also added ImageGenerationFallbacks (Line 85). This is ambiguous and can easily lead to the harness using the wrong list.

Proposed fix
-    Fallbacks                []schemas.Fallback     // for chat, responses, image and reasoning tests
+    Fallbacks                []schemas.Fallback     // for chat, responses, and reasoning tests
     TextCompletionFallbacks  []schemas.Fallback     // for text completion tests
     TranscriptionFallbacks   []schemas.Fallback     // for transcription tests
     SpeechSynthesisFallbacks []schemas.Fallback     // for speech synthesis tests
     EmbeddingFallbacks       []schemas.Fallback     // for embedding tests
     SkipReason               string                 // Reason to skip certain tests
     ImageGenerationModel     string                 // Model for image generation
     ImageGenerationFallbacks []schemas.Fallback     // Fallbacks for image generation

Also applies to: 84-90

framework/streaming/accumulator.go (1)

293-325: Potential hot-path allocs: image de-dup key uses fmt.Sprintf("%d:%d", ...).
If image streaming is high-volume, consider a non-allocating key (e.g., uint64(imageIndex)<<32 | uint64(uint32(chunkIndex))) with map[uint64]struct{}.
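A minimal sketch of the packed-key approach (function and variable names are hypothetical, not the accumulator's actual fields):

```go
package main

import "fmt"

// packKey folds imageIndex and chunkIndex into one uint64, avoiding the
// per-chunk string allocation of fmt.Sprintf("%d:%d", ...).
func packKey(imageIndex, chunkIndex int) uint64 {
	return uint64(uint32(imageIndex))<<32 | uint64(uint32(chunkIndex))
}

func main() {
	seen := make(map[uint64]struct{}) // replaces map[string]struct{}
	key := packKey(1, 2)
	if _, dup := seen[key]; !dup {
		seen[key] = struct{}{}
	}
	fmt.Println(len(seen))                      // 1
	fmt.Println(packKey(1, 2) == packKey(2, 1)) // false: swapped indices don't collide
}
```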

core/providers/huggingface/images.go (1)

120-135: Size parsing is tolerant (silent failure) — verify this is intentional.

Unlike the Nebius case which returns errors for invalid size format, the fal-ai case silently skips setting ImageSize when parsing fails. This means malformed size strings like "invalid" or "100" won't produce an error. If this is intentional (relying on fal-ai API validation), consider adding a brief comment explaining the design choice for maintainability.
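If strictness is preferred instead, a Nebius-style parser could look like this (hypothetical helper, not the provider's actual code):

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseSize is a strict variant: it rejects malformed size strings instead of
// silently skipping them.
func parseSize(size string) (w, h int, err error) {
	parts := strings.Split(size, "x")
	if len(parts) != 2 {
		return 0, 0, fmt.Errorf("invalid size %q: expected WIDTHxHEIGHT", size)
	}
	if w, err = strconv.Atoi(parts[0]); err != nil {
		return 0, 0, fmt.Errorf("invalid width in %q: %w", size, err)
	}
	if h, err = strconv.Atoi(parts[1]); err != nil {
		return 0, 0, fmt.Errorf("invalid height in %q: %w", size, err)
	}
	return w, h, nil
}

func main() {
	w, h, err := parseSize("1024x1536")
	fmt.Println(w, h, err) // 1024 1536 <nil>
	_, _, err = parseSize("invalid")
	fmt.Println(err != nil) // true: malformed input is rejected, not skipped
}
```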

core/providers/openai/images.go (1)

41-69: Avoid params aliasing + mirror prompt validation in reverse conversion.
Params: &request.ImageGenerationParameters couples the returned Bifrost request to the OpenAI request struct; also Prompt can be empty and still produce a non-nil Bifrost request.

Proposed diff
 func (request *OpenAIImageGenerationRequest) ToBifrostImageGenerationRequest() *schemas.BifrostImageGenerationRequest {
 	if request == nil {
 		return nil
 	}
+	if request.Prompt == "" {
+		return nil
+	}
@@
 	var params *schemas.ImageGenerationParameters
 	if request.N != nil || request.Background != nil || request.Moderation != nil ||
@@
-		params = &request.ImageGenerationParameters
+		paramsCopy := request.ImageGenerationParameters
+		params = &paramsCopy
 	}

Comment on lines +259 to +263
if result.ImageGenerationStreamResponse.Index >= 0 {
chunk.ImageIndex = result.ImageGenerationStreamResponse.Index
} else {
chunk.ImageIndex = result.ImageGenerationStreamResponse.SequenceNumber
}

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Scripts executed (outputs elided):

# Examine the file structure around the flagged lines
cd framework/streaming && head -n 300 images.go | tail -n 100

# Check the ImageGenerationStreamResponse type definition
rg "type.*ImageGenerationStreamResponse" --type go -A 15

# Look for how Index is typically set in providers
rg "\.Index\s*=" --type go -B 2 -A 2 | head -40

# Check whether commit b136412 appears in the file's history
git log --oneline framework/streaming/images.go | head -20

# Inspect the commit, if present
git show b136412 --stat

Fix the unreachable fallback logic for ImageIndex assignment.

The condition Index >= 0 (Line 259) effectively always evaluates to true: Index is declared as int (not *int), so an unset field defaults to 0, which satisfies the condition and is indistinguishable from an explicitly set 0. This renders the fallback to SequenceNumber unreachable.

The previous review marked this as addressed in commit b136412, but that commit does not exist in the repository. The issue persists: you need to either:

  1. Change Index to *int (pointer) so nil indicates "not set"
  2. Use a sentinel value (e.g., -1) to represent "unset"
  3. Add an explicit flag to track whether Index was explicitly set

Without fixing this, the SequenceNumber fallback will never execute, potentially causing image indexing failures for providers that don't explicitly set the Index field.
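Option 1 can be sketched as follows (the type is reduced to the two relevant fields; names follow the review but the struct is simplified for illustration):

```go
package main

import "fmt"

// ImageGenerationStreamResponse is a simplified stand-in for the real type:
// Index becomes *int so nil unambiguously means "not set by the provider".
type ImageGenerationStreamResponse struct {
	Index          *int
	SequenceNumber int
}

// resolveImageIndex falls back to SequenceNumber only when Index is truly unset.
func resolveImageIndex(r ImageGenerationStreamResponse) int {
	if r.Index != nil {
		return *r.Index
	}
	return r.SequenceNumber
}

func main() {
	zero := 0
	fmt.Println(resolveImageIndex(ImageGenerationStreamResponse{Index: &zero, SequenceNumber: 7})) // 0: explicit zero preserved
	fmt.Println(resolveImageIndex(ImageGenerationStreamResponse{SequenceNumber: 7}))               // 7: fallback now reachable
}
```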

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 9

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (8)
docs/providers/supported-providers/anthropic.mdx (1)

26-33: Update the Note to include Image Generation in the list of unsupported operations.

The new Image Generation row correctly shows it as unsupported (❌), but the Note on line 32 still only mentions "Embeddings, Speech, and Transcriptions" as unsupported operations. For consistency, Image Generation should be added to this list.

📝 Proposed fix
 <Note>
-**Unsupported Operations** (❌): Embeddings, Speech, and Transcriptions are not supported by the upstream Anthropic API. These return `UnsupportedOperationError`.
+**Unsupported Operations** (❌): Embeddings, Speech, Transcriptions, and Image Generation are not supported by the upstream Anthropic API. These return `UnsupportedOperationError`.
 </Note>
docs/providers/supported-providers/ollama.mdx (1)

25-32: Update the Note to include Image Generation in the unsupported operations list.

The table correctly shows Image Generation as unsupported (❌), but the Note on line 32 lists "Speech, Transcriptions, Files, and Batch" without mentioning Image Generation. For consistency, update the Note to include Image Generation.

📝 Suggested fix
 <Note>
-**Unsupported Operations** (❌): Speech, Transcriptions, Files, and Batch are not supported by the upstream Ollama API. These return `UnsupportedOperationError`.
+**Unsupported Operations** (❌): Image Generation, Speech, Transcriptions, Files, and Batch are not supported by the upstream Ollama API. These return `UnsupportedOperationError`.
core/internal/testutil/account.go (1)

217-253: Azure image-generation model/deployment mismatch will likely break tests.

Azure has a deployment mapping for "gpt-image-1", but the configured ImageGenerationModel is "gpt-image-1-mini". If Azure routing requires the deployment map (common), image-gen requests will fail to resolve the deployment.

Proposed diff (pick one)
@@
 					Deployments: map[string]string{
@@
-						"gpt-image-1":            "gpt-image-1",
+						"gpt-image-1":            "gpt-image-1",
+						"gpt-image-1-mini":       "gpt-image-1-mini",
@@
 					},

or change the Azure test config to use the mapped model:

-		ImageGenerationModel: "gpt-image-1-mini",
+		ImageGenerationModel: "gpt-image-1",

Also applies to: 825-866

plugins/semanticcache/utils.go (2)

88-116: Fix possible nil deref for ImageGenerationRequest in hashing/input extraction.
Both generateRequestHash and getInputForCaching assume req.ImageGenerationRequest is non-nil for image request types. A malformed request (or a future refactor) would panic here; other cases (e.g., Speech) already guard.

Proposed fix (nil-safe access)
 func (plugin *Plugin) generateRequestHash(req *schemas.BifrostRequest) (string, error) {
@@
 	switch req.RequestType {
@@
 	case schemas.TranscriptionRequest, schemas.TranscriptionStreamRequest:
 		hashInput.Params = req.TranscriptionRequest.Params
 	case schemas.ImageGenerationRequest, schemas.ImageGenerationStreamRequest:
-		hashInput.Params = req.ImageGenerationRequest.Params
+		if req.ImageGenerationRequest != nil {
+			hashInput.Params = req.ImageGenerationRequest.Params
+		}
 	}
@@
 }

 func (plugin *Plugin) getInputForCaching(req *schemas.BifrostRequest) interface{} {
 	switch req.RequestType {
@@
 	case schemas.ImageGenerationRequest, schemas.ImageGenerationStreamRequest:
-		return req.ImageGenerationRequest.Input
+		if req.ImageGenerationRequest == nil {
+			return nil
+		}
+		return req.ImageGenerationRequest.Input
 	default:
 		return nil
 	}
 }

Also applies to: 477-516


11-16: Don't store full base64 image data in metadata—vector DB limits will be exceeded.

addSingleResponse stores the entire marshaled response in metadata, and additionally stores the full image_b64 arrays. For image generation, base64 data can easily be MBs per image, which will exceed vector DB metadata limits and consume unnecessary memory/network resources.

Mitigation: For image-gen responses, avoid storing image_b64 in metadata; store only image_urls and/or persist base64 in an external blob store with references.

Suggested fix (remove b64 from metadata)
 	// image specific metadata
 	if res.ImageGenerationResponse != nil {
 		var imageURLs []string
-		var imageB64 []string
 		var revisedPrompts []string

 		for _, img := range res.ImageGenerationResponse.Data {
 			if img.URL != "" {
 				imageURLs = append(imageURLs, img.URL)
 			}
-			if img.B64JSON != "" {
-				imageB64 = append(imageB64, img.B64JSON)
-			}
 			if img.RevisedPrompt != "" {
 				revisedPrompts = append(revisedPrompts, img.RevisedPrompt)
 			}
 		}

 		metadata["image_urls"] = imageURLs
-		metadata["image_b64"] = imageB64
 		metadata["revised_prompts"] = revisedPrompts
 	}
core/providers/huggingface/types.go (1)

32-52: Avoid “successful” unmarshal on non-model payloads (error objects, etc.).

Right now, the object fallback (`var obj struct{ Models []HuggingFaceModel }`) will "succeed" even when the JSON is `{ "error": ... }`, resulting in `Models == nil` without an error (lines 46-49). Prefer explicitly checking that the `models` key exists before accepting the object form.

Proposed fix
 func (r *HuggingFaceListModelsResponse) UnmarshalJSON(data []byte) error {
 	// Try unmarshaling as an array first (most common for /api/models)
 	var arr []HuggingFaceModel
 	if err := sonic.Unmarshal(data, &arr); err == nil {
 		r.Models = arr
 		return nil
 	}
 
-	// Fallback: try object with a `models` field
-	var obj struct {
-		Models []HuggingFaceModel `json:"models"`
-	}
-	if err := sonic.Unmarshal(data, &obj); err == nil {
-		r.Models = obj.Models
-		return nil
-	}
+	// Fallback: object with a `models` field (only accept if key exists)
+	var raw map[string]json.RawMessage
+	if err := sonic.Unmarshal(data, &raw); err == nil {
+		if modelsRaw, ok := raw["models"]; ok {
+			var models []HuggingFaceModel
+			if err := sonic.Unmarshal(modelsRaw, &models); err != nil {
+				return fmt.Errorf("failed to unmarshal HuggingFaceListModelsResponse.models: %w", err)
+			}
+			r.Models = models
+			return nil
+		}
+	}
 
 	return fmt.Errorf("failed to unmarshal HuggingFaceListModelsResponse: unexpected JSON structure")
 }
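The false positive can be reproduced with the standard library. The sketch below uses `encoding/json` rather than sonic (assuming sonic's drop-in compatibility with `encoding/json` struct-tag semantics) to show that an error payload unmarshals "successfully" into the struct form, while the key-existence check from the proposed fix rejects it:

```go
package main

import (
	"encoding/json"
	"fmt"
)

type model struct {
	ID string `json:"id"`
}

// hasModelsKey mirrors the proposed fix: accept the object form only
// when the "models" key is actually present in the payload.
func hasModelsKey(data []byte) bool {
	var raw map[string]json.RawMessage
	if err := json.Unmarshal(data, &raw); err != nil {
		return false
	}
	_, ok := raw["models"]
	return ok
}

func main() {
	errPayload := []byte(`{"error": "rate limited"}`)

	// The struct fallback "succeeds" with Models == nil: this is the bug.
	var obj struct {
		Models []model `json:"models"`
	}
	err := json.Unmarshal(errPayload, &obj)
	fmt.Println(err == nil, obj.Models == nil) // true true

	// The key-existence check rejects the same payload.
	fmt.Println(hasModelsKey(errPayload))                         // false
	fmt.Println(hasModelsKey([]byte(`{"models":[{"id":"m1"}]}`))) // true
}
```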
framework/streaming/types.go (1)

345-393: Regression: StreamTypeTranscription no longer sets RawResponse/CacheDebug.

StreamTypeChat/Responses/Audio set ExtraFields.RawResponse and ExtraFields.CacheDebug (when present), but StreamTypeTranscription now stops after setting RawRequest (Line 357-360). This looks like an accidental omission when inserting the new StreamTypeImage case, and breaks parity/debuggability for transcription streams.

Proposed fix
 	case StreamTypeTranscription:
 		transcriptionResp := p.Data.TranscriptionOutput
 		if transcriptionResp == nil {
 			transcriptionResp = &schemas.BifrostTranscriptionResponse{}
 		}
 		resp.TranscriptionResponse = transcriptionResp
 		resp.TranscriptionResponse.ExtraFields = schemas.BifrostResponseExtraFields{
 			RequestType:    schemas.TranscriptionRequest,
 			Provider:       p.Provider,
 			ModelRequested: p.Model,
 			Latency:        p.Data.Latency,
 		}
 		if p.RawRequest != nil {
 			resp.TranscriptionResponse.ExtraFields.RawRequest = p.RawRequest
 		}
+		if p.Data.RawResponse != nil {
+			resp.TranscriptionResponse.ExtraFields.RawResponse = *p.Data.RawResponse
+		}
+		if p.Data.CacheDebug != nil {
+			resp.TranscriptionResponse.ExtraFields.CacheDebug = p.Data.CacheDebug
+		}
 	case StreamTypeImage:
core/schemas/provider.go (1)

322-378: Fix grammar in ImageGeneration method comments: use "an" instead of "a" (since "image" starts with a vowel sound).

All 19+ provider implementations (BedrockProvider, XAIProvider, OpenAIProvider, NebiusProvider, SGLProvider, VertexProvider, PerplexityProvider, OllamaProvider, MistralProvider, HuggingFaceProvider, GroqProvider, OpenRouterProvider, CohereProvider, AnthropicProvider, CerebrasProvider, AzureProvider, ElevenlabsProvider, GeminiProvider, ParasailProvider) have already been updated with the ImageGeneration and ImageGenerationStream methods, so no build breaker exists.

🤖 Fix all issues with AI agents
In `@core/internal/testutil/test_retry_framework.go`:
- Around line 1250-1252: The non-stream ImageGeneration path is dropping retry
conditions when converting to ImageGenerationRetryConfig; update the branch
handling "ImageGeneration" so it matches the stream behavior by either returning
the TestRetryConfig directly (like ImageGenerationStream does) or by preserving
the default conditions instead of setting Conditions:
[]ImageGenerationRetryCondition{}; ensure the returned config includes
EmptyImageGenerationCondition and GenericResponseCondition as provided by
GetTestRetryConfigForScenario() and use DefaultImageGenerationRetryConfig() or
copy its Conditions into the ImageGenerationRetryConfig conversion.

In `@core/internal/testutil/validation_presets.go`:
- Around line 222-237: Remove the leftover development comment above the
function definition in validation_presets.go; specifically delete the line "//
In validation_presets.go - add this function" so only the
ImageGenerationExpectations function and its body remain (referencing the
ImageGenerationExpectations symbol to locate the block).

In `@core/providers/gemini/gemini.go`:
- Around line 1614-1671: In ImageGeneration, avoid a nil-deref when
geminiResponse.ToBifrostImageGenerationResponse() returns (nil, nil): check if
response == nil and bifrostErr == nil and in that case create and return a
proper *schemas.BifrostError (or populate bifrostErr) before accessing
bifrostErr.ExtraFields; similarly ensure bifrostErr is non-nil before assigning
bifrostErr.ExtraFields and only set response.ExtraFields after confirming
response != nil (references: ImageGeneration,
geminiResponse.ToBifrostImageGenerationResponse, variables response and
bifrostErr).

In `@core/providers/openai/types.go`:
- Around line 546-584: Move the ImageGeneration event constants from images.go
into types.go next to the ImageGenerationEventType declaration: define
ImageGenerationPartial, ImageGenerationCompleted, and ImageGenerationError as
constants of type ImageGenerationEventType (e.g. const ImageGenerationPartial
ImageGenerationEventType = "partial", etc.), remove their definitions from
images.go, and update any references/imports if needed so code uses the
relocated constants; keep the constants in types.go immediately adjacent to the
ImageGenerationEventType type definition for consistency with other providers.

In `@core/providers/vertex/vertex.go`:
- Around line 1493-1559: The IsAllDigitsASCII (custom fine-tuned) and
IsImagenModel branches currently set authQuery from key.Value which enables
API-key auth; update those branches to stop populating authQuery and instead
always obtain an OAuth2 token via getAuthTokenSource(key) and set
req.Header.Set("Authorization", "Bearer "+token.AccessToken) (same approach
already used later), keeping the existing projectNumber/URL construction logic
via completeURL; leave the IsGeminiModel branch behavior unchanged if you want
to continue supporting API key for Gemini. Ensure authQuery remains empty for
Imagen and fine-tuned paths so the later "if authQuery != \"\" { completeURL =
fmt.Sprintf(...)} else { tokenSource... }" flow uses OAuth2 for those models.

In `@docs/openapi/schemas/integrations/openai/images.yaml`:
- Around line 21-63: Update the size enum in the images schema to match OpenAI's
supported values: remove "256x256", "512x512", "1792x1024", and "1024x1792" and
add "auto", leaving "1024x1024", "1536x1024", and "1024x1536"; ensure the
default remains or is set to "1024x1024" as documented. Keep the existing
quality and style enums if this schema is DALL·E 3-specific, but add a short
note in the schema description for the quality/style fields (quality, style)
clarifying they apply to DALL·E 3 only or expand allowed values if the schema is
meant for multiple OpenAI models. Also verify the stream field description and
the response_format/streaming contract (response_format, stream) against
OpenAI's official streaming event shape (partial_images/SSE) and update the
OpenAIImageStreamResponse types elsewhere if needed.

In `@docs/providers/supported-providers/gemini.mdx`:
- Around line 489-610: Add a brief clarifying sentence above the "Parameter
Mapping" table stating that the Gateway accepts OpenAI-compatible flat fields
(e.g., prompt, size, n, output_format) while the SDK uses the
BifrostImageGenerationRequest with nested Input and Params objects
(Input.Prompt, Params.Size, Params.N, Params.OutputFormat); keep the sentence
short and then note that the table shows how those Gateway flat fields and SDK
nested fields map internally.
♻️ Duplicate comments (30)
docs/providers/supported-providers/nebius.mdx (1)

153-156: guidance_scale type should be float per Nebius API specification.

Per a previous review comment, the Nebius API specifies guidance_scale as a numeric float (range 0–100), but the documentation shows it as int. This should be updated to float for accuracy.

📝 Suggested fix
 | Parameter | Type | Notes |
 |-----------|------|-------|
-| `guidance_scale` | int | Guidance scale (0-100) |
+| `guidance_scale` | float | Guidance scale (0-100) |
 | `ai_project_id` | string | Nebius project ID (added as query parameter) |
core/providers/nebius/images.go (3)

12-20: Validate empty prompt and avoid pointer aliasing to caller-owned memory.

The current check allows empty or whitespace-only prompts. Additionally, taking pointers to bifrostReq.Model and bifrostReq.Input.Prompt couples output to caller-owned memory which can lead to subtle bugs if the caller mutates the original request. Based on learnings, validate nil/empty prompts before dispatch.

📝 Proposed fix
 func (provider *NebiusProvider) ToNebiusImageGenerationRequest(bifrostReq *schemas.BifrostImageGenerationRequest) (*NebiusImageGenerationRequest, error) {
 	if bifrostReq == nil || bifrostReq.Input == nil {
 		return nil, fmt.Errorf("bifrost request is nil or input is nil")
 	}
+	if strings.TrimSpace(bifrostReq.Input.Prompt) == "" {
+		return nil, fmt.Errorf("prompt is required")
+	}

+	model := bifrostReq.Model
+	prompt := bifrostReq.Input.Prompt
 	req := &NebiusImageGenerationRequest{
-		Model:  &bifrostReq.Model,
-		Prompt: &bifrostReq.Input.Prompt,
+		Model:  &model,
+		Prompt: &prompt,
 	}

28-46: Harden size parsing: trim whitespace, accept uppercase 'X', validate positive dimensions.

The current parsing doesn't handle whitespace, is case-sensitive for the delimiter, and doesn't validate that dimensions are positive.

📝 Proposed fix
 		if bifrostReq.Params.Size != nil {
-			size := strings.Split(*bifrostReq.Params.Size, "x")
+			rawSize := strings.TrimSpace(*bifrostReq.Params.Size)
+			size := strings.Split(strings.ToLower(rawSize), "x")
 			if len(size) != 2 {
-				return nil, fmt.Errorf("invalid size format: expected 'WIDTHxHEIGHT', got %q", *bifrostReq.Params.Size)
+				return nil, fmt.Errorf("invalid size format: expected 'WIDTHxHEIGHT', got %q", rawSize)
 			}

-			width, err := strconv.Atoi(size[0])
+			width, err := strconv.Atoi(strings.TrimSpace(size[0]))
 			if err != nil {
-				return nil, fmt.Errorf("invalid width in size %q: %w", *bifrostReq.Params.Size, err)
+				return nil, fmt.Errorf("invalid width in size %q: %w", rawSize, err)
 			}

-			height, err := strconv.Atoi(size[1])
+			height, err := strconv.Atoi(strings.TrimSpace(size[1]))
 			if err != nil {
-				return nil, fmt.Errorf("invalid height in size %q: %w", *bifrostReq.Params.Size, err)
+				return nil, fmt.Errorf("invalid height in size %q: %w", rawSize, err)
 			}
+			if width <= 0 || height <= 0 {
+				return nil, fmt.Errorf("invalid size %q: width and height must be > 0", rawSize)
+			}

 			req.Width = &width
 			req.Height = &height
 		}
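The proposed hardening can also be expressed as a standalone helper. This is a sketch rather than the actual Nebius converter code; it applies the same three rules (trim whitespace, accept `x` or `X` as the delimiter, reject non-positive dimensions):

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseSize parses a "WIDTHxHEIGHT" string with the hardening the
// review proposes: trims whitespace, accepts 'x' or 'X', and rejects
// non-positive dimensions.
func parseSize(s string) (width, height int, err error) {
	raw := strings.TrimSpace(s)
	parts := strings.Split(strings.ToLower(raw), "x")
	if len(parts) != 2 {
		return 0, 0, fmt.Errorf("invalid size format: expected 'WIDTHxHEIGHT', got %q", raw)
	}
	width, err = strconv.Atoi(strings.TrimSpace(parts[0]))
	if err != nil {
		return 0, 0, fmt.Errorf("invalid width in size %q: %w", raw, err)
	}
	height, err = strconv.Atoi(strings.TrimSpace(parts[1]))
	if err != nil {
		return 0, 0, fmt.Errorf("invalid height in size %q: %w", raw, err)
	}
	if width <= 0 || height <= 0 {
		return 0, 0, fmt.Errorf("invalid size %q: width and height must be > 0", raw)
	}
	return width, height, nil
}

func main() {
	for _, s := range []string{" 1024X768 ", "1024x1024", "0x512", "square"} {
		w, h, err := parseSize(s)
		fmt.Println(s, w, h, err)
	}
}
```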

62-68: Enforce guidance_scale bounds (0–100) per Nebius API specification.

The Nebius API expects guidance_scale as a numeric value in range 0–100, but the current code accepts any integer without validation.

📝 Proposed fix
 		if bifrostReq.Params.ExtraParams != nil {
 			// Map guidance_scale
 			if v, ok := schemas.SafeExtractIntPointer(bifrostReq.Params.ExtraParams["guidance_scale"]); ok {
+				if v != nil && (*v < 0 || *v > 100) {
+					return nil, fmt.Errorf("invalid guidance_scale %d: must be between 0 and 100", *v)
+				}
 				req.GuidanceScale = v
 			}
 		}
core/providers/gemini/transcription.go (1)

36-41: Base64 decoding correctly added for Gemini inline audio data.

The change properly decodes base64-encoded InlineData.Data to raw bytes before aggregating audio data. This aligns with Gemini's API format where inline data is base64-encoded on the wire.

Note: The silent continue on decode failure was flagged in a previous review and escalated to maintainers for evaluation.

core/providers/openai/openai_test.go (1)

43-82: All other test scenarios are disabled - please re-enable before merging.

This configuration disables comprehensive OpenAI provider testing (chat, text completion, tool calls, embeddings, transcription, etc.) and only runs image generation tests. While this may be intentional for focused development, it should be reverted to maintain full test coverage before merging to main.

Compare with the HuggingFace test in this same PR where most scenarios remain enabled alongside the new image generation scenarios.

🔧 Re-enable scenarios before merge
 		Scenarios: testutil.TestScenarios{
-			TextCompletion:        false,
-			TextCompletionStream:  false,
-			SimpleChat:            false,
-			CompletionStream:      false,
-			MultiTurnConversation: false,
-			ToolCalls:             false,
-			ToolCallsStreaming:    false,
-			MultipleToolCalls:     false,
-			End2EndToolCalling:    false,
-			AutomaticFunctionCall: false,
-			ImageURL:              false,
-			ImageBase64:           false,
-			MultipleImages:        false,
-			FileBase64:            false,
-			FileURL:               false,
-			CompleteEnd2End:       false,
-			SpeechSynthesis:       false,
-			SpeechSynthesisStream: false,
-			Transcription:         false,
-			TranscriptionStream:   false,
-			Embedding:             false,
-			Reasoning:             false,
-			ListModels:            false,
+			TextCompletion:        true,
+			TextCompletionStream:  true,
+			SimpleChat:            true,
+			CompletionStream:      true,
+			MultiTurnConversation: true,
+			ToolCalls:             true,
+			ToolCallsStreaming:    true,
+			MultipleToolCalls:     true,
+			End2EndToolCalling:    true,
+			AutomaticFunctionCall: true,
+			ImageURL:              true,
+			ImageBase64:           true,
+			MultipleImages:        true,
+			FileBase64:            true,
+			FileURL:               true,
+			CompleteEnd2End:       true,
+			SpeechSynthesis:       true,
+			SpeechSynthesisStream: true,
+			Transcription:         true,
+			TranscriptionStream:   true,
+			Embedding:             true,
+			Reasoning:             true,
+			ListModels:            true,
 			ImageGeneration:       true,
 			ImageGenerationStream: true,
-			BatchCreate:           false,
-			BatchList:             false,
-			BatchRetrieve:         false,
-			BatchCancel:           false,
-			BatchResults:          false,
-			FileUpload:            false,
-			FileList:              false,
-			FileRetrieve:          false,
-			FileDelete:            false,
-			FileContent:           false,
-			FileBatchInput:        false,
-			CountTokens:           false,
-			ChatAudio:             false,
-			StructuredOutputs:     false,
+			BatchCreate:           true,
+			BatchList:             true,
+			BatchRetrieve:         true,
+			BatchCancel:           true,
+			BatchResults:          true,
+			FileUpload:            true,
+			FileList:              true,
+			FileRetrieve:          true,
+			FileDelete:            true,
+			FileContent:           true,
+			FileBatchInput:        true,
+			CountTokens:           true,
+			ChatAudio:             true,
+			StructuredOutputs:     true,
 		},
core/providers/vertex/vertex.go (1)

1420-1431: Add nil-request guard to prevent panic on request.Model access.

The method accesses request.Model at line 1427 without first checking if request is nil. Other providers in the codebase (e.g., OpenAI) include this guard. Based on learnings, validation should occur before dispatch.

Proposed fix
 func (provider *VertexProvider) ImageGeneration(ctx *schemas.BifrostContext, key schemas.Key, request *schemas.BifrostImageGenerationRequest) (*schemas.BifrostImageGenerationResponse, *schemas.BifrostError) {
 	providerName := provider.GetProviderKey()
 
+	if request == nil {
+		return nil, providerUtils.NewBifrostOperationError("image generation request is nil", nil, providerName)
+	}
+
 	if key.VertexKeyConfig == nil {
 		return nil, providerUtils.NewConfigurationError("vertex key config is not set", providerName)
 	}
 
 	deployment := provider.getModelDeployment(key, request.Model)
core/internal/testutil/response_validation.go (1)

1096-1148: Fix min_images conversion: avoid int overflow + don’t record “warnings” in Errors.
This still does unchecked int64/float64 -> int narrowing and appends parse issues to result.Errors while “skipping check” (and without failing), which is inconsistent and may re-trigger CodeQL.

Proposed fix (range-check + consistent warning path)
 func validateImageGenerationFields(t *testing.T, response *schemas.BifrostImageGenerationResponse, expectations ResponseExpectations, result *ValidationResult) {
@@
 	if expectations.ProviderSpecific != nil {
 		if minImagesVal, ok := expectations.ProviderSpecific["min_images"]; ok {
 			var minImages int
 			var parseErr error
+			maxInt := int64(^uint(0) >> 1)
+			minInt := -maxInt - 1
 
 			// Use type switch to handle various numeric types
 			switch v := minImagesVal.(type) {
 			case int:
 				minImages = v
 			case int64:
-				minImages = int(v)
+				if v > maxInt || v < minInt {
+					parseErr = fmt.Errorf("min_images out of int range: %d", v)
+				} else {
+					minImages = int(v)
+				}
 			case float64:
-				minImages = int(v)
+				if v > float64(maxInt) || v < float64(minInt) || v != float64(int64(v)) {
+					parseErr = fmt.Errorf("min_images must be an integer within int range: %v", v)
+				} else {
+					minImages = int(v)
+				}
 			case json.Number:
 				var parsed int64
 				parsed, parseErr = v.Int64()
 				if parseErr == nil {
-					minImages = int(parsed)
+					if parsed > maxInt || parsed < minInt {
+						parseErr = fmt.Errorf("min_images out of int range: %d", parsed)
+					} else {
+						minImages = int(parsed)
+					}
 				}
 			default:
 				parseErr = fmt.Errorf("unsupported type for min_images: %T", v)
 			}
 
 			if parseErr != nil {
-				// Skip the min_images check if conversion fails, but record a warning
-				result.Errors = append(result.Errors,
-					fmt.Sprintf("Failed to parse min_images: %v (skipping check)", parseErr))
-			} else {
+				// Skip the min_images check if conversion fails, but record a warning
+				result.Warnings = append(result.Warnings,
+					fmt.Sprintf("Failed to parse min_images: %v (skipping check)", parseErr))
+				return
+			}
+
+			if minImages > 0 {
 				actualCount := len(response.Data)
 				result.MetricsCollected["image_count"] = actualCount
 				if actualCount < minImages {
 					result.Passed = false
 					result.Errors = append(result.Errors,
 						fmt.Sprintf("Too few images: got %d, expected at least %d", actualCount, minImages))
 				}
 			}
 		}
 	}
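The range-checked narrowing in the diff can be factored into a small helper. The sketch below covers the int/int64/float64 arms only (the real fix above also handles json.Number):

```go
package main

import "fmt"

// toInt is a sketch of the range-checked narrowing the review proposes:
// int64 and float64 values are accepted only when they fit in int and
// (for floats) are whole numbers.
func toInt(v interface{}) (int, error) {
	maxInt := int64(^uint(0) >> 1)
	minInt := -maxInt - 1
	switch n := v.(type) {
	case int:
		return n, nil
	case int64:
		if n > maxInt || n < minInt {
			return 0, fmt.Errorf("out of int range: %d", n)
		}
		return int(n), nil
	case float64:
		if n > float64(maxInt) || n < float64(minInt) || n != float64(int64(n)) {
			return 0, fmt.Errorf("not an integer within int range: %v", n)
		}
		return int(n), nil
	default:
		return 0, fmt.Errorf("unsupported type: %T", v)
	}
}

func main() {
	fmt.Println(toInt(3))          // 3 <nil>
	fmt.Println(toInt(float64(2))) // 2 <nil>
	fmt.Println(toInt(2.5))        // 0 + error
	fmt.Println(toInt("3"))        // 0 + error
}
```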
tests/integrations/python/tests/test_google.py (1)

1687-1787: Fix cross-provider skipping, narrow exception handling, and make response inspection format-safe.

This block reintroduces issues already called out in prior reviews: unused test_config (ARG002), broad except Exception (BLE001), and provider-specific API key gating (decorator checks only "google"). Also, test_41c can blow up if response is dict-shaped (your own comment says validation supports dict/object).

Proposed diff (consolidated)
@@
-    def test_41a_image_generation_simple(self, test_config, provider, model):
+    def test_41a_image_generation_simple(self, _test_config, provider, model):
@@
-        client = get_provider_google_client(provider)
+        # Skip cleanly if this provider isn't configured in env for this run
+        try:
+            get_api_key(provider)
+        except ValueError:
+            pytest.skip(f"No API key available for provider {provider}")
+
+        client = get_provider_google_client(provider)
@@
-    def test_41b_imagen_predict(self, test_config, provider, model):
+    def test_41b_imagen_predict(self, _test_config, provider, model):
@@
-        client = get_provider_google_client(provider)
+        try:
+            get_api_key(provider)
+        except ValueError:
+            pytest.skip(f"No API key available for provider {provider}")
+
+        client = get_provider_google_client(provider)
@@
-        except Exception as e:
+        except requests.exceptions.RequestException as e:
             # Imagen may not be available in all regions or configurations
             pytest.skip(f"Imagen generation failed: {e}")
+        except (json.JSONDecodeError, ValueError) as e:
+            pytest.skip(f"Imagen generation returned an invalid response: {e}")
@@
-    def test_41c_image_generation_with_text(self, test_config, provider, model):
+    def test_41c_image_generation_with_text(self, _test_config, provider, model):
@@
-        client = get_provider_google_client(provider)
+        try:
+            get_api_key(provider)
+        except ValueError:
+            pytest.skip(f"No API key available for provider {provider}")
+
+        client = get_provider_google_client(provider)
@@
-        for candidate in response.candidates:
-            if hasattr(candidate, "content") and candidate.content:
-                if hasattr(candidate.content, "parts") and candidate.content.parts:
-                    for part in candidate.content.parts:
-                        if hasattr(part, "text") and part.text:
-                            found_text = True
-                        if hasattr(part, "inline_data") and part.inline_data:
-                            found_image = True
+        candidates = []
+        if isinstance(response, dict):
+            candidates = response.get("candidates") or []
+        else:
+            candidates = getattr(response, "candidates", None) or []
+
+        for candidate in candidates:
+            content = candidate.get("content") if isinstance(candidate, dict) else getattr(candidate, "content", None)
+            parts = (content.get("parts") if isinstance(content, dict) else getattr(content, "parts", None)) or []
+            for part in parts:
+                text = part.get("text") if isinstance(part, dict) else getattr(part, "text", None)
+                inline_data = part.get("inline_data") if isinstance(part, dict) else getattr(part, "inline_data", None)
+                if text:
+                    found_text = True
+                if inline_data:
+                    found_image = True

Also, based on learnings: for OpenAI image-generation scenarios under tests/integrations/python, prefer gpt-image-1 via providers.openai.image_generation for determinism and to avoid DALL·E-3 parameter constraints; please ensure the config/fixtures for this scenario follow that.

Please verify against the Google GenAI Python SDK docs: is `types.GenerateContentConfig(response_modalities=["IMAGE"|"TEXT"])` the correct way to request image generation, and does the response expose image bytes as `inline_data` or `inlineData`?
framework/streaming/images.go (2)

259-263: Unreachable fallback: Index >= 0 is always true for int type.

The condition on line 259 will always be true since Index is declared as int (not *int), making its zero value (0) indistinguishable from an explicitly set 0. The fallback to SequenceNumber on line 262 is effectively dead code.

This was flagged in a previous review but appears unresolved. Consider changing Index to *int in the schema or using a sentinel value like -1 to indicate "not set."


84-95: Multi-chunk reconstruction drops URL and uses time.Now() instead of provider timestamp.

In the multi-chunk fallback path:

  1. URL is lost: Lines 90-94 only populate B64JSON, Index, and RevisedPrompt — if a provider streams URL data, it's discarded.
  2. Timestamp is fabricated: Line 108 sets Created: time.Now().Unix() instead of preserving the provider's CreatedAt from chunks.

The single-chunk path (lines 22, 27) correctly preserves both values. This was flagged in a previous review.

Suggested approach

Track URL and CreatedAt while iterating chunks, similar to how revisedPrompts is tracked:

 images := make(map[int]*strings.Builder)
+imageURLs := make(map[int]string)
 var model string
 var revisedPrompts map[int]string = make(map[int]string)
+var createdAt int64

 for _, chunk := range chunks {
     // ... existing code ...
+    if createdAt == 0 && chunk.Delta.CreatedAt != 0 {
+        createdAt = chunk.Delta.CreatedAt
+    }
+    if chunk.Delta.URL != "" {
+        imageURLs[chunk.ImageIndex] = chunk.Delta.URL
+    }
 }

 // In imageData construction:
 imageData = append(imageData, schemas.ImageData{
     B64JSON:       builder.String(),
+    URL:           imageURLs[imageIndex],
     Index:         imageIndex,
     RevisedPrompt: revisedPrompts[imageIndex],
 })

 // For Created:
-Created: time.Now().Unix(),
+Created: func() int64 { if createdAt != 0 { return createdAt }; return time.Now().Unix() }(),

Also applies to: 106-111

core/providers/azure/azure.go (2)

1212-1252: Add nil guard for request before accessing request.Model.

Line 1222 accesses request.Model before validating that request is non-nil. If a nil request is passed, this will panic. This was flagged in a previous review.

While upstream validation in core/bifrost.go may catch this in normal flows, defensive programming at the provider level prevents crashes from direct provider calls.

Suggested fix
 func (provider *AzureProvider) ImageGeneration(ctx *schemas.BifrostContext, key schemas.Key,
 	request *schemas.BifrostImageGenerationRequest) (*schemas.BifrostImageGenerationResponse, *schemas.BifrostError) {
+	if request == nil {
+		return nil, providerUtils.NewBifrostOperationError("image generation request is nil", nil, provider.GetProviderKey())
+	}
 	// Validate api key configs
 	if err := provider.validateKeyConfig(key); err != nil {
 		return nil, err
 	}

1257-1312: Add nil guard for request in streaming method and clean up stray comment.

  1. Nil request: Line 1270 accesses request.Model without checking if request is nil, same issue as the sync method.

  2. Stray comment: Line 1269 has an orphaned // comment that should be removed.

  3. postResponseConverter: Good addition (lines 1280-1285) — this addresses the previous review about stamping ModelDeployment on streamed chunks.

Suggested fix
 func (provider *AzureProvider) ImageGenerationStream(
 	ctx *schemas.BifrostContext,
 	postHookRunner schemas.PostHookRunner,
 	key schemas.Key,
 	request *schemas.BifrostImageGenerationRequest,
 ) (chan *schemas.BifrostStream, *schemas.BifrostError) {
+	if request == nil {
+		return nil, providerUtils.NewBifrostOperationError("image generation request is nil", nil, provider.GetProviderKey())
+	}

 	// Validate api key configs
 	if err := provider.validateKeyConfig(key); err != nil {
 		return nil, err
 	}

-	//
 	deployment := key.AzureKeyConfig.Deployments[request.Model]
core/providers/openai/openai.go (1)

2876-2878: Early return on first completed event drops remaining images for n > 1 requests.

The handler returns immediately upon the first ImageGenerationCompleted event (lines 2876-2878), which closes the stream and drops subsequent images when n > 1. The incompleteImages tracking map is correctly maintained (line 2764 removes completed images, line 2767 tracks started images) but isn't used to determine when all images have finished.

Proposed fix
 			if isCompleted {
-				return
+				// Only exit when all images have completed
+				if len(incompleteImages) == 0 {
+					return
+				}
+				// More images still in progress, continue processing
 			}
tests/integrations/python/tests/test_openai.py (2)

1170-1258: Cross-provider gating still looks brittle for n>1 / quality / size.

The new cases run against every (provider, model) in the image_generation scenario, but only test_52b has a skip—and it’s keyed only on model (Line 1199-1201), not provider+model. Also, quality="low" (Line 1229) and size="1024x1536" (Line 1253-1254) may not be universally supported across all providers included in that scenario.

This overlaps with prior review feedback on capability gating.

Worth verifying: does the OpenAI Images API (gpt-image-1) support quality="low" and size="1024x1536", and do the other providers behind an OpenAI-compatible images.generate endpoint accept the same parameters?
Minimal robustness tweak for the Gemini skip
-        if model == "gemini-2.5-flash-image":
+        if provider == "gemini" and model == "gemini-2.5-flash-image":
             pytest.skip("Gemini 2.5 flash image does not support multiple images")

1173-1254: Silence Ruff ARG002 for the newly added tests.

Ruff flags the unused test_config param on the 52a–52d methods. If you don’t need it, rename it to _test_config for these tests.

Proposed fix
-    def test_52a_image_generation_simple(self, test_config, provider, model, vk_enabled):
+    def test_52a_image_generation_simple(self, _test_config, provider, model, vk_enabled):
 ...
-    def test_52b_image_generation_multiple(self, test_config, provider, model, vk_enabled):
+    def test_52b_image_generation_multiple(self, _test_config, provider, model, vk_enabled):
 ...
-    def test_52c_image_generation_quality(self, test_config, provider, model, vk_enabled):
+    def test_52c_image_generation_quality(self, _test_config, provider, model, vk_enabled):
 ...
-    def test_52d_image_generation_different_sizes(self, test_config, provider, model, vk_enabled):
+    def test_52d_image_generation_different_sizes(self, _test_config, provider, model, vk_enabled):
core/providers/gemini/gemini.go (2)

1673-1748: Add nil-guard after imagenResponse.ToBifrostImageGenerationResponse() before setting ExtraFields.
Line 1733-1737 dereferences response unconditionally; if the converter returns nil, this will panic.

Proposed fix
 	// Convert to Bifrost format
 	response := imagenResponse.ToBifrostImageGenerationResponse()
+	if response == nil {
+		err := providerUtils.NewBifrostOperationError(
+			schemas.ErrProviderResponseDecode,
+			fmt.Errorf("failed to convert Imagen response to Bifrost format"),
+			providerName,
+		)
+		err.ExtraFields = schemas.BifrostErrorExtraFields{
+			Provider:       providerName,
+			ModelRequested: request.Model,
+			RequestType:    schemas.ImageGenerationRequest,
+		}
+		return nil, err
+	}
 	response.ExtraFields.Provider = providerName

1673-1688: Nit: comment says “Vertex AI endpoint with API key auth”, but the default BaseURL is Generative Language API.
If this path truly targets Generative Language .../v1beta/models/{model}:predict with x-goog-api-key, consider rewording the comment to avoid confusion.

core/providers/gemini/images.go (3)

12-94: Guard against nil part when extracting prompt from content.Parts.
Line 81-83 can panic if content.Parts contains nil.

Proposed fix
-			for _, part := range content.Parts {
-				if part.Text != "" {
+			for _, part := range content.Parts {
+				if part == nil {
+					continue
+				}
+				if part.Text != "" {
 					bifrostReq.Input.Prompt = part.Text
 					break
 				}
 			}

96-127: Normalize imageSize and support “4k” (case-insensitive).
Right now this only matches "1k"/"2k" exactly; "1K", "2K", "4K" will fall back to 1024.

Proposed fix
 func convertImagenFormatToSize(imageSize *string, aspectRatio *string) string {
 	// Default size based on imageSize parameter
 	baseSize := 1024
 	if imageSize != nil {
-		switch *imageSize {
-		case "2k":
+		switch strings.ToLower(strings.TrimSpace(*imageSize)) {
+		case "4k":
+			baseSize = 4096
+		case "2k":
 			baseSize = 2048
 	case "1k":
 			baseSize = 1024
 		}
 	}

129-187: Don’t silently succeed with empty Data; add nil receiver + candidate nil guards.
Today this can return a “success” response with Data: [] if the provider returns no inline image parts (or if response/candidate is nil). That’s hard to distinguish from a real success.

Proposed fix (minimal hardening)
 func (response *GenerateContentResponse) ToBifrostImageGenerationResponse() (*schemas.BifrostImageGenerationResponse, *schemas.BifrostError) {
+	if response == nil {
+		return nil, &schemas.BifrostError{
+			IsBifrostError: false,
+			Error: &schemas.ErrorField{Message: "nil Gemini response"},
+		}
+	}
 	bifrostResp := &schemas.BifrostImageGenerationResponse{
 		ID:    response.ResponseID,
 		Model: response.ModelVersion,
 		Data:  []schemas.ImageData{},
 	}
@@
 	if len(response.Candidates) > 0 {
-		candidate := response.Candidates[0]
+		candidate := response.Candidates[0]
+		if candidate == nil {
+			return nil, &schemas.BifrostError{
+				IsBifrostError: false,
+				Error:          &schemas.ErrorField{Message: "empty Gemini candidate"},
+			}
+		}
 		if candidate.Content != nil && len(candidate.Content.Parts) > 0 {
@@
-				if part != nil && part.InlineData != nil {
+				if part != nil && part.InlineData != nil && part.InlineData.Data != "" {
 					imageData = append(imageData, schemas.ImageData{
 						B64JSON: string(part.InlineData.Data),
 					})
@@
 			// Only assign imageData when it has elements
 			if len(imageData) > 0 {
 				bifrostResp.Data = imageData
@@
+			} else {
+				return nil, &schemas.BifrostError{
+					IsBifrostError: false,
+					Error:          &schemas.ErrorField{Message: "no image data found in Gemini response"},
+				}
 			}
docs/openapi/schemas/inference/images.yaml (3)

1-100: Verify request contract vs actual handler shape (flat model/prompt vs provider/input/params).
This schema documents a flat request with model (as provider/model) + prompt. Please confirm that the HTTP inference endpoint actually accepts this shape (or update either the handler or the schema to match).

#!/bin/bash
set -euo pipefail

# Find the images inference route handler and see what it unmarshals into.
rg -n --type=go 'images/generations|ImageGenerationRequest|BifrostImageGenerationRequest' -S transports core
# Compare core schema shape.
rg -n --type=go 'type\s+BifrostImageGenerationRequest\b' -S core/schemas

178-204: Usage detail fields: confirm whether ImageTokenDetails needs additional counters (e.g., n_images).
Right now only image_tokens and text_tokens are documented; if the implementation returns more (or uses different casing), align the schema.

#!/bin/bash
set -euo pipefail

rg -n --type=go 'type\s+Image(TokenDetails|Usage)\b' -S core/schemas
rg -n --type=go 'OutputTokensDetails|InputTokensDetails|NImages|n_images' -S core

80-86: Enum drift risk: output_format likely needs "jpg" in addition to "jpeg".
If providers accept or normalize "jpg" to "jpeg", document both to avoid client-side validation failures.

Also applies to: 122-128, 252-258
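For instance, a hedged enum entry could document both spellings with a note on normalization (field names and values here are illustrative of the pattern, not the actual schema contents):

```yaml
output_format:
  type: string
  description: >
    Output image format. "jpg" is accepted as an alias and normalized
    to "jpeg" by some providers.
  enum: [png, jpeg, jpg, webp]
  default: png
```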

core/internal/testutil/test_retry_framework.go (3)

980-995: DefaultImageGenerationRetryConfig returns TestRetryConfig/TestRetryCondition, but image-gen retry uses ImageGenerationRetryConfig/ImageGenerationRetryCondition.
As-is, “default” image-generation conditions here won’t flow into WithImageGenerationRetry(...) unless there’s an explicit conversion step elsewhere. This is easy to silently misconfigure (conditions never evaluated).

Proposed direction (pick one)
  1. Make the default return the typed config (and add a separate DefaultImageGenerationTestRetryConfig if you still need the generic one), or
  2. Keep returning TestRetryConfig, but add an explicit adapter that copies timing/hook fields into ImageGenerationRetryConfig (and either converts conditions or intentionally leaves them empty per local convention).
-// DefaultImageGenerationRetryConfig creates a retry config for image tests
-func DefaultImageGenerationRetryConfig() TestRetryConfig {
-    return TestRetryConfig{ ... }
-}
+// DefaultImageGenerationRetryConfig creates a retry config for image generation tests
+func DefaultImageGenerationRetryConfig() ImageGenerationRetryConfig {
+    return ImageGenerationRetryConfig{
+        MaxAttempts: 10,
+        BaseDelay:   2000 * time.Millisecond,
+        MaxDelay:    10 * time.Second,
+        Conditions:  []ImageGenerationRetryCondition{
+            &EmptyImageGenerationCondition{},
+            &GenericImageGenerationCondition{},
+        },
+        OnRetry: func(attempt int, reason string, t *testing.T) {
+            t.Logf("🔄 Retrying image generation test (attempt %d): %s", attempt, reason)
+        },
+    }
+}

Based on learnings, typed retry configs in core/internal/testutil are usually constructed from GetTestRetryConfigForScenario() by copying timing/hook fields and leaving Conditions empty.


2097-2260: Fix false-pass: (nil response, nil err) currently exits retries and returns (nil, nil).
If operation() ever returns (nil, nil), this can incorrectly look like success to callers (and bypass OnFinalFail).

Minimal defensive fix
 		// Execute the operation
 		response, err := operation()
 		lastResponse = response
 		lastError = err

+		// Treat nil response as failure (even if err is nil) to avoid false-pass.
+		if response == nil {
+			if attempt < config.MaxAttempts {
+				retryReason := "❌ image generation response is nil"
+				if config.OnRetry != nil {
+					config.OnRetry(attempt, retryReason, t)
+				}
+				delay := calculateRetryDelay(attempt-1, config.BaseDelay, config.MaxDelay)
+				time.Sleep(delay)
+				continue
+			}
+			statusCode := 400
+			return nil, &schemas.BifrostError{
+				IsBifrostError: true,
+				StatusCode:     &statusCode,
+				Error: &schemas.ErrorField{
+					Message: fmt.Sprintf("❌ image generation response is nil after %d attempts", attempt),
+				},
+			}
+		}
 
 		// If we have a response, validate it FIRST
 		if response != nil {

2947-3095: Naming/doc mismatch: comment says WithImageGenerationStreamValidationRetry, function is WithImageGenerationStreamRetry.
Either rename the function to match the “*StreamValidationRetry” naming used elsewhere, or update the comment header.

Smallest fix (doc-only)
-// WithImageGenerationStreamValidationRetry wraps an image generation streaming operation with retry logic that includes stream content validation
+// WithImageGenerationStreamRetry wraps an image generation streaming operation with retry logic that includes stream content validation
transports/bifrost-http/integrations/router.go (1)

1290-1291: Potential nil-pointer panic: ImageGenerationStreamResponseConverter is invoked unconditionally.
If StreamConfig exists but ImageGenerationStreamResponseConverter is nil, this will panic mid-stream. Prefer validating at startup or guarding before invocation.

Minimal guard (local)
 				case chunk.BifrostImageGenerationStreamResponse != nil:
+					if config.StreamConfig.ImageGenerationStreamResponseConverter == nil {
+						log.Printf("missing ImageGenerationStreamResponseConverter for route %s", config.Path)
+						continue
+					}
 					eventType, convertedResponse, err = config.StreamConfig.ImageGenerationStreamResponseConverter(bifrostCtx, chunk.BifrostImageGenerationStreamResponse)
tests/integrations/python/tests/utils/common.py (1)

2629-2705: LGTM (leaving current validation strictness as-is).
This is a solid cross-provider “shape” validator. There are known edge cases around Google casing/base64 strictness; since we already decided not to tighten those checks right now, I’d keep it unchanged.

plugins/logging/main.go (1)

34-48: Pooled UpdateLogData: ensure new fields are zeroed on putUpdateLogData to avoid cross-request leakage.
ImageGenerationOutput, RawRequest, and RawResponse are request-specific and must be cleared when returning to the pool.

#!/bin/bash
set -euo pipefail

# Find pool put/reset logic and confirm new fields are cleared.
rg -n --hidden --glob '!**/vendor/**' 'putUpdateLogData\(' -n -C 3

echo "----"

# Show the full reset block(s) for UpdateLogData.
rg -n --hidden --glob '!**/vendor/**' 'data\.(Status|TokenUsage|Cost|ChatOutput|ResponsesOutput|EmbeddingOutput|ErrorDetails|SpeechOutput|TranscriptionOutput|ImageGenerationOutput|RawRequest|RawResponse)\s*=' -n -C 2
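If any of these fields turn out to be missing from the reset path, the fix is to zero them alongside the existing fields before returning the struct to the pool. A sketch of the pattern (the struct is trimmed and its field set is assumed from this PR):

```go
package main

import (
	"fmt"
	"sync"
)

// Trimmed stand-in for the plugin's pooled struct (field set assumed).
type UpdateLogData struct {
	Status                string
	ImageGenerationOutput []byte
	RawRequest            []byte
	RawResponse           []byte
}

var updateLogDataPool = sync.Pool{New: func() any { return &UpdateLogData{} }}

// putUpdateLogData zeroes every request-specific field so a recycled
// struct can never leak data from a previous request.
func putUpdateLogData(d *UpdateLogData) {
	d.Status = ""
	d.ImageGenerationOutput = nil
	d.RawRequest = nil
	d.RawResponse = nil
	updateLogDataPool.Put(d)
}

func main() {
	d := updateLogDataPool.Get().(*UpdateLogData)
	d.RawRequest = []byte("request body with PII")
	putUpdateLogData(d)
	fmt.Println(d.RawRequest == nil)
}
```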
🧹 Nitpick comments (15)
core/providers/bedrock/bedrock.go (1)

1347-1355: LGTM!

The image generation stubs correctly return UnsupportedOperationError with appropriate request types. Using provider.GetProviderKey() is the correct approach as it properly handles custom provider configurations.

Minor observation: The existing Speech/Transcription stubs (lines 1328-1345) use schemas.Bedrock directly while these new methods use provider.GetProviderKey(). The new approach is more correct. Consider updating the existing stubs in a follow-up for consistency:

 // Speech is not supported by the Bedrock provider.
 func (provider *BedrockProvider) Speech(ctx *schemas.BifrostContext, key schemas.Key, request *schemas.BifrostSpeechRequest) (*schemas.BifrostSpeechResponse, *schemas.BifrostError) {
-	return nil, providerUtils.NewUnsupportedOperationError(schemas.SpeechRequest, schemas.Bedrock)
+	return nil, providerUtils.NewUnsupportedOperationError(schemas.SpeechRequest, provider.GetProviderKey())
 }
framework/modelcatalog/pricing.go (1)

379-419: Consider extracting the tiered rate selection logic.

The isAbove200k, isAbove128k, and base-tier branches for image token rates are nearly identical (all fall back from InputCostPerImageToken to inputTokenRate). This pattern is repeated and could be simplified.

♻️ Optional refactor to reduce duplication
// Helper to get image token rate with fallback
getImageTokenRate := func(imageRate *float64, fallback float64) float64 {
    if imageRate != nil {
        return *imageRate
    }
    return fallback
}

inputImageTokenRate = getImageTokenRate(pricing.InputCostPerImageToken, inputTokenRate)
outputImageTokenRate = getImageTokenRate(pricing.OutputCostPerImageToken, outputTokenRate)

This removes the three near-identical branches since tiered image token rates (InputCostPerImageTokenAbove128kTokens, etc.) don't exist yet in TableModelPricing.

core/providers/gemini/responses.go (1)

1598-1661: Add defensive guards for data: prefixes in inline image encoding for consistency.

The codebase documents Blob.Data as base64-encoded bytes (without data: prefix), and all creation paths confirm this contract. However, the file handler (line 1649–1651) includes a guard against malformed data, while the image paths (lines 1612 and 1887) don't. This inconsistency creates maintenance risk if specifications change or edge cases emerge.

Apply the same guard to images and candidates inline-data blocks to match the file handler pattern:

Image handler guard
 if isImageMimeType(mimeType) {
-	imageURL := fmt.Sprintf("data:%s;base64,%s", mimeType, blob.Data)
+	imageURL := blob.Data
+	if !strings.HasPrefix(imageURL, "data:") {
+		imageURL = fmt.Sprintf("data:%s;base64,%s", mimeType, imageURL)
+	}
 	return &schemas.ResponsesMessageContentBlock{
Candidate inline-image handler guard
 if strings.HasPrefix(part.InlineData.MIMEType, "image/") {
 	return &schemas.ResponsesInputMessageContentBlockImage{
-		ImageURL: schemas.Ptr("data:" + part.InlineData.MIMEType + ";base64," + part.InlineData.Data),
+		ImageURL: func() *string {
+			data := part.InlineData.Data
+			if strings.HasPrefix(data, "data:") {
+				return schemas.Ptr(data)
+			}
+			return schemas.Ptr("data:" + part.InlineData.MIMEType + ";base64," + data)
+		}(),

Also applies to: 1872–1904

core/internal/testutil/image_generation.go (2)

21-151: Non-stream image validation may be too strict on format/dimensions across providers.

Right now you hard-require successful image.DecodeConfig + exact 1024x1024 when B64JSON is present. That can become flaky if a provider returns a format you don’t have a decoder for (or resizes/crops slightly). Consider making the decode/dimension check conditional (log + continue) while still enforcing “non-empty image bytes”.


153-296: Align stream retry config with the directory’s typed-config convention (and keep event typing consistent).

You already do the typed ImageGenerationRetryConfig{ Conditions: []...{} } pattern for non-stream. For consistency, consider doing the same for stream retry config (if WithImageGenerationStreamRetry supports it), instead of passing the generic config through. Based on learnings, this convention is expected in core/internal/testutil/*.

Also: nice job keying completion off openai.ImageGenerationCompleted — that helps keep the stack consistent on event types.

core/providers/openai/openai.go (2)

2684-2692: Add explicit [DONE] marker handling for clean stream termination.

Unlike all other streaming handlers in this file (chat completion at line 947, text completion at line 487, responses at line 1433, speech at line 1913, transcription at line 2295), this handler does not explicitly check for the data: [DONE] stream termination marker. Currently, [DONE] would fail sonic.Unmarshal and log a warning rather than cleanly terminating.

♻️ Proposed fix
 			jsonData := strings.TrimSpace(strings.TrimPrefix(line, "data:"))
 			if jsonData == "" {
 				continue
 			}
+
+			// Check for end of stream
+			if jsonData == "[DONE]" {
+				break
+			}

2881-2884: Add context error check before processing scanner errors (consistency with other handlers).

Other streaming handlers (e.g., transcription at lines 2368-2372, speech at lines 1991-1995) check ctx.Err() != nil and return early to let the defer block handle cancellation/timeout. This handler processes scanner errors without that check, potentially leading to duplicate error handling.

♻️ Proposed fix for consistency
 		if err := scanner.Err(); err != nil {
+			// If context was cancelled/timed out, let defer handle it
+			if ctx.Err() != nil {
+				return
+			}
+			ctx.SetValue(schemas.BifrostContextKeyStreamEndIndicator, true)
 			logger.Warn(fmt.Sprintf("Error reading stream: %v", err))
 			providerUtils.ProcessAndSendError(ctx, postHookRunner, err, responseChan, schemas.ImageGenerationStreamRequest, providerName, request.Model, logger)
 		}
core/bifrost.go (3)

1008-1059: ImageGenerationRequest: good wiring; fix comment typo + consider consistent ctx fallback.

Implementation matches the established pattern (populate pooled BifrostRequest, then handleRequest, then unwrap the typed response). Minor: docstring should be “an image generation request”. Optionally mirror other public APIs by defaulting ctx to bifrost.ctx early (today it relies on handleRequest doing so).

Proposed small polish
-// ImageGenerationRequest sends a image generation request to the specified provider.
+// ImageGenerationRequest sends an image generation request to the specified provider.
 func (bifrost *Bifrost) ImageGenerationRequest(ctx *schemas.BifrostContext,
 	req *schemas.BifrostImageGenerationRequest) (*schemas.BifrostImageGenerationResponse, *schemas.BifrostError) {
+	if ctx == nil {
+		ctx = bifrost.ctx
+	}
 	if req == nil {
 		return nil, &schemas.BifrostError{

1061-1094: ImageGenerationStreamRequest: consistent with stream pipeline; fix comment typo.
Same as other stream entrypoints (sets RequestType and reuses ImageGenerationRequest payload). Minor: docstring should be “an image generation stream request”.


2858-2873: Streaming short-circuit: image-gen mapping looks correct; add nil-guard symmetry.
The new mapping into/out of BifrostResponse.ImageGenerationStreamResponse is correct. To match the other fields, consider setting streamResponse.BifrostImageGenerationStreamResponse only when non-nil (helps keep payload minimal + consistent).

Small consistency tweak
 					if processedResponse != nil {
 						streamResponse.BifrostTextCompletionResponse = processedResponse.TextCompletionResponse
 						streamResponse.BifrostChatResponse = processedResponse.ChatResponse
 						streamResponse.BifrostResponsesStreamResponse = processedResponse.ResponsesStreamResponse
 						streamResponse.BifrostSpeechStreamResponse = processedResponse.SpeechStreamResponse
 						streamResponse.BifrostTranscriptionStreamResponse = processedResponse.TranscriptionStreamResponse
-						streamResponse.BifrostImageGenerationStreamResponse = processedResponse.ImageGenerationStreamResponse
+						if processedResponse.ImageGenerationStreamResponse != nil {
+							streamResponse.BifrostImageGenerationStreamResponse = processedResponse.ImageGenerationStreamResponse
+						}
 					}
core/providers/gemini/images.go (1)

189-256: Label extraction likely won’t work with map[string]any (common after JSON unmarshal).
labels.(map[string]string) will usually fail unless the map was constructed with that exact type. Consider accepting map[string]any and stringifying values.
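A tolerant extraction could accept map[string]any and stringify each value; a sketch (the helper name is an assumption, not existing code):

```go
package main

import "fmt"

// coerceLabels accepts the map shape JSON unmarshaling actually produces
// (map[string]any) and stringifies each value; a pre-typed
// map[string]string passes through unchanged.
func coerceLabels(v any) map[string]string {
	out := map[string]string{}
	switch m := v.(type) {
	case map[string]string:
		return m
	case map[string]any:
		for k, val := range m {
			out[k] = fmt.Sprintf("%v", val)
		}
	}
	return out
}

func main() {
	// After json.Unmarshal into `any`, JSON objects arrive as map[string]any.
	labels := any(map[string]any{"env": "prod", "tier": 1})
	fmt.Println(coerceLabels(labels)["env"], coerceLabels(labels)["tier"])
}
```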

core/internal/testutil/test_retry_framework.go (1)

247-255: ImageGenerationRetryConfig shape looks fine and matches the other typed configs.

Consider whether you want a shared “base retry config” struct embedded in all typed configs to reduce repetition (optional).

transports/bifrost-http/integrations/router.go (1)

278-309: Consider adding fail-fast validation for route-type-specific converters in RegisterRoutes.

The ImageGenerationResponseConverter now has a runtime guard (line 717), but other converters marked "SHOULD NOT BE NIL" (ListModelsResponseConverter, TextResponseConverter, ChatResponseConverter, ResponsesResponseConverter, EmbeddingResponseConverter, TranscriptionResponseConverter) are called without nil checks. While ImageGenerationResponseConverter and CountTokensResponseConverter have runtime guards, adding upfront validation in RegisterRoutes for all required response converters would catch configuration errors at startup rather than during request handling—especially valuable in a multi-PR stack where routes and configs may be split across PRs.
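A startup check could iterate route configs and reject nil required converters before serving traffic. A sketch of the idea (the struct is trimmed, and field names beyond the converters listed above are assumptions):

```go
package main

import "fmt"

// Trimmed stand-in for the integration route config (field names assumed).
type RouteConfig struct {
	Path                             string
	ChatResponseConverter            func() error
	ImageGenerationResponseConverter func() error
}

// validateRouteConfig fails fast at registration time instead of
// panicking mid-request when a required converter is nil.
func validateRouteConfig(c RouteConfig) error {
	required := map[string]bool{
		"ChatResponseConverter":            c.ChatResponseConverter != nil,
		"ImageGenerationResponseConverter": c.ImageGenerationResponseConverter != nil,
	}
	for name, present := range required {
		if !present {
			return fmt.Errorf("route %s: missing required %s", c.Path, name)
		}
	}
	return nil
}

func main() {
	err := validateRouteConfig(RouteConfig{Path: "/v1/images/generations"})
	fmt.Println(err != nil)
}
```

RegisterRoutes could call such a validator per route and refuse to start on failure, which is cheaper to debug than a mid-stream panic.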

tests/integrations/python/tests/utils/common.py (1)

2624-2627: Add a short note that OpenAI image-gen tests should prefer gpt-image-1 for determinism.
These prompts are fine; adding a 1–2 line comment near them (or in the tests that consume them) to standardize on the config key providers.openai.image_generation = gpt-image-1 will help avoid DALL·E 3 parameter limitations in integration tests. Based on learnings, prefer gpt-image-1 for image-generation scenarios.

plugins/logging/main.go (1)

524-533: Simplify image-generation token usage: TotalTokens is guaranteed populated here.
You can directly use result.ImageGenerationResponse.Usage.TotalTokens without recomputing/fallback. Based on learnings, this invariant should hold for ImageUsage.

Proposed diff
 				case result.ImageGenerationResponse != nil && result.ImageGenerationResponse.Usage != nil:
 					usage = &schemas.BifrostLLMUsage{}
 					usage.PromptTokens = result.ImageGenerationResponse.Usage.InputTokens
 					usage.CompletionTokens = result.ImageGenerationResponse.Usage.OutputTokens
-					if result.ImageGenerationResponse.Usage.TotalTokens > 0 {
-						usage.TotalTokens = result.ImageGenerationResponse.Usage.TotalTokens
-					} else {
-						usage.TotalTokens = usage.PromptTokens + usage.CompletionTokens
-					}
+					usage.TotalTokens = result.ImageGenerationResponse.Usage.TotalTokens

Comment on lines +1250 to 1252
case "ImageGeneration", "ImageGenerationStream":
return DefaultImageGenerationRetryConfig()
default:

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

rg -n --type=go '\bRunImageGeneration(Stream)?Test\b' core/internal/testutil
rg -n --type=go '\bWithImageGenerationRetry\s*\(' core/internal/testutil
rg -n --type=go '\bWithImageGenerationStreamRetry\s*\(' core/internal/testutil
rg -n --type=go '\bGetTestRetryConfigForScenario\s*\(' core/internal/testutil

Repository: maximhq/bifrost

Length of output: 6799


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Check the GetTestRetryConfigForScenario implementation, especially the switch case
sed -n '1218,1350p' core/internal/testutil/test_retry_framework.go

# Check the specific case lines mentioned in the review
sed -n '1245,1260p' core/internal/testutil/test_retry_framework.go

Repository: maximhq/bifrost

Length of output: 6460


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Check DefaultImageGenerationRetryConfig definition
rg -n --type=go 'func DefaultImageGenerationRetryConfig' core/internal/testutil

# Check WithImageGenerationRetry signature
sed -n '2097,2110p' core/internal/testutil/test_retry_framework.go

# Check WithImageGenerationStreamRetry signature
sed -n '2949,2962p' core/internal/testutil/test_retry_framework.go

# Check how it's used in image_generation.go
sed -n '30,45p' core/internal/testutil/image_generation.go
sed -n '165,180p' core/internal/testutil/image_generation.go

Repository: maximhq/bifrost

Length of output: 2248


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Check the full signature and type def for DefaultImageGenerationRetryConfig
sed -n '981,1050p' core/internal/testutil/test_retry_framework.go

# Check ImageGenerationRetryConfig type definition
rg -n --type=go 'type ImageGenerationRetryConfig' core/internal/testutil

# Check if there's a type assertion or conversion in image_generation.go
sed -n '30,100p' core/internal/testutil/image_generation.go

# Check around line 86 in image_generation.go where WithImageGenerationRetry is called
sed -n '80,95p' core/internal/testutil/image_generation.go

Repository: maximhq/bifrost

Length of output: 5352


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Check the ImageGenerationStream handling around line 170 in image_generation.go
sed -n '165,220p' core/internal/testutil/image_generation.go

# Check if there's any conversion for the stream version
rg -n --type=go -A 20 'GetTestRetryConfigForScenario.*ImageGenerationStream' core/internal/testutil/image_generation.go

Repository: maximhq/bifrost

Length of output: 2668


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Check ImageGenerationRetryConfig type definition
sed -n '248,280p' core/internal/testutil/test_retry_framework.go

# Verify TestRetryConfig has OnFinalFail field that's being used in the conversion
sed -n '200,240p' core/internal/testutil/test_retry_framework.go

Repository: maximhq/bifrost

Length of output: 5305


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Check TestRetryConfig definition to see if it has the same fields
sed -n '190,240p' core/internal/testutil/test_retry_framework.go

Repository: maximhq/bifrost

Length of output: 3638


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Find TestRetryConfig definition
rg -n --type=go 'type TestRetryConfig struct' core/internal/testutil/test_retry_framework.go

# Check if there's an ImageGenerationStreamRetryConfig type
rg -n --type=go 'type ImageGenerationStreamRetryConfig' core/internal/testutil

# Check the complete signature of WithImageGenerationStreamRetry once more
sed -n '2949,2960p' core/internal/testutil/test_retry_framework.go

Repository: maximhq/bifrost

Length of output: 92



🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Get the full TestRetryConfig definition
sed -n '188,196p' core/internal/testutil/test_retry_framework.go

# Now check the actual call to WithImageGenerationStreamRetry with the config being passed
sed -n '195,210p' core/internal/testutil/image_generation.go

Repository: maximhq/bifrost

Length of output: 1207


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Check if conditions are being used in WithImageGenerationRetry
sed -n '2097,2150p' core/internal/testutil/test_retry_framework.go | head -60

# Check if conditions are being used in WithImageGenerationStreamRetry
sed -n '2949,3000p' core/internal/testutil/test_retry_framework.go | head -60

Repository: maximhq/bifrost

Length of output: 4001


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Check what's in config.Conditions for WithImageGenerationRetry
sed -n '2115,2130p' core/internal/testutil/test_retry_framework.go

# See if checkImageGenerationRetryConditions is actually used
sed -n '2120,2165p' core/internal/testutil/test_retry_framework.go

Repository: maximhq/bifrost

Length of output: 2767


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Verify the full conversion code in image_generation.go
sed -n '45,70p' core/internal/testutil/image_generation.go

# Double check what DefaultImageGenerationRetryConfig actually includes
sed -n '981,998p' core/internal/testutil/test_retry_framework.go

Repository: maximhq/bifrost

Length of output: 1636


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Check if there's a conversion for stream version
sed -n '165,200p' core/internal/testutil/image_generation.go | grep -A 30 "ImageGenerationStream"

Repository: maximhq/bifrost

Length of output: 1169


Fix inconsistent retry condition handling between ImageGeneration and ImageGenerationStreamRetry.

The non-stream ImageGeneration path (image_generation.go, line 48-54) explicitly discards retry conditions during conversion to ImageGenerationRetryConfig by setting Conditions: []ImageGenerationRetryCondition{}, while ImageGenerationStreamRetry preserves the conditions from DefaultImageGenerationRetryConfig(). Both should consistently include the EmptyImageGenerationCondition and GenericResponseCondition from GetTestRetryConfigForScenario().

Convert the non-stream path to either:

  1. Use TestRetryConfig directly like the stream version, or
  2. Preserve the conditions during the struct conversion instead of dropping them to an empty slice
🤖 Prompt for AI Agents
In `@core/internal/testutil/test_retry_framework.go` around lines 1250 - 1252, The
non-stream ImageGeneration path is dropping retry conditions when converting to
ImageGenerationRetryConfig; update the branch handling "ImageGeneration" so it
matches the stream behavior by either returning the TestRetryConfig directly
(like ImageGenerationStream does) or by preserving the default conditions
instead of setting Conditions: []ImageGenerationRetryCondition{}; ensure the
returned config includes EmptyImageGenerationCondition and
GenericResponseCondition as provided by GetTestRetryConfigForScenario() and use
DefaultImageGenerationRetryConfig() or copy its Conditions into the
ImageGenerationRetryConfig conversion.

Comment on lines +222 to +237
// In validation_presets.go - add this function
func ImageGenerationExpectations(minImages int, expectedSize string) ResponseExpectations {
return ResponseExpectations{
ShouldHaveContent: false, // Image responses don't have text content
ExpectedChoiceCount: 0, // Image responses don't have choices
ShouldHaveUsageStats: true,
ShouldHaveTimestamps: true,
ShouldHaveModel: true,
ShouldHaveLatency: true, // Global expectation: latency should always be present
ProviderSpecific: map[string]interface{}{
"min_images": minImages,
"expected_size": expectedSize,
"response_type": "image_generation",
},
}
}

⚠️ Potential issue | 🟡 Minor

Remove leftover development comment.

Line 222 contains a comment // In validation_presets.go - add this function that appears to be a leftover instruction from development. The function implementation itself follows the established pattern correctly.

Proposed fix
-// In validation_presets.go - add this function
+// ImageGenerationExpectations returns validation expectations for image generation scenarios
 func ImageGenerationExpectations(minImages int, expectedSize string) ResponseExpectations {
🤖 Prompt for AI Agents
In `@core/internal/testutil/validation_presets.go` around lines 222-237: remove
the leftover development comment above the function definition in
validation_presets.go; specifically delete the line "// In validation_presets.go
- add this function" so only the ImageGenerationExpectations function and its
body remain (referencing the ImageGenerationExpectations symbol to locate the
block).

Comment on lines +1614 to +1671
// ImageGeneration performs an image generation request to the Gemini API.
func (provider *GeminiProvider) ImageGeneration(ctx *schemas.BifrostContext, key schemas.Key, request *schemas.BifrostImageGenerationRequest) (*schemas.BifrostImageGenerationResponse, *schemas.BifrostError) {
	// Check if image gen is allowed for this provider
	if err := providerUtils.CheckOperationAllowed(schemas.Gemini, provider.customProviderConfig, schemas.ImageGenerationRequest); err != nil {
		return nil, err
	}

	// check for imagen models
	if schemas.IsImagenModel(request.Model) {
		return provider.handleImagenImageGeneration(ctx, key, request)
	}
	// Prepare body
	jsonData, bifrostErr := providerUtils.CheckContextAndGetRequestBody(
		ctx,
		request,
		func() (any, error) { return ToGeminiImageGenerationRequest(request), nil },
		provider.GetProviderKey())
	if bifrostErr != nil {
		return nil, bifrostErr
	}

	// Use common request function
	geminiResponse, rawResponse, latency, bifrostErr := provider.completeRequest(ctx, request.Model, key, jsonData, ":generateContent", &providerUtils.RequestMetadata{
		Provider:    provider.GetProviderKey(),
		Model:       request.Model,
		RequestType: schemas.ImageGenerationRequest,
	})
	if bifrostErr != nil {
		return nil, bifrostErr
	}

	response, bifrostErr := geminiResponse.ToBifrostImageGenerationResponse()
	if response == nil || bifrostErr != nil {
		// Add extra fields to error
		bifrostErr.ExtraFields = schemas.BifrostErrorExtraFields{
			Provider:       provider.GetProviderKey(),
			ModelRequested: request.Model,
			RequestType:    schemas.ImageGenerationRequest,
		}
		return nil, bifrostErr
	}

	// Set ExtraFields
	response.ExtraFields.Provider = provider.GetProviderKey()
	response.ExtraFields.ModelRequested = request.Model
	response.ExtraFields.RequestType = schemas.ImageGenerationRequest
	response.ExtraFields.Latency = latency.Milliseconds()

	if providerUtils.ShouldSendBackRawRequest(ctx, provider.sendBackRawRequest) {
		providerUtils.ParseAndSetRawRequest(&response.ExtraFields, jsonData)
	}

	if providerUtils.ShouldSendBackRawResponse(ctx, provider.sendBackRawResponse) {
		response.ExtraFields.RawResponse = rawResponse
	}

	return response, nil
}
Contributor
@coderabbitai coderabbitai bot Jan 14, 2026

⚠️ Potential issue | 🔴 Critical

Fix a possible nil dereference when the conversion returns a nil response with a nil error.
At Line 1646-1653, if response == nil and bifrostErr == nil, the code dereferences bifrostErr.ExtraFields and will panic.

Proposed fix
-	response, bifrostErr := geminiResponse.ToBifrostImageGenerationResponse()
-	if response == nil || bifrostErr != nil {
-		// Add extra fields to error
-		bifrostErr.ExtraFields = schemas.BifrostErrorExtraFields{
-			Provider:       provider.GetProviderKey(),
-			ModelRequested: request.Model,
-			RequestType:    schemas.ImageGenerationRequest,
-		}
-		return nil, bifrostErr
-	}
+	response, convErr := geminiResponse.ToBifrostImageGenerationResponse()
+	if convErr != nil {
+		convErr.ExtraFields = schemas.BifrostErrorExtraFields{
+			Provider:       provider.GetProviderKey(),
+			ModelRequested: request.Model,
+			RequestType:    schemas.ImageGenerationRequest,
+		}
+		return nil, convErr
+	}
+	if response == nil {
+		convErr := providerUtils.NewBifrostOperationError(
+			schemas.ErrProviderResponseDecode,
+			fmt.Errorf("empty image generation response"),
+			provider.GetProviderKey(),
+		)
+		convErr.ExtraFields = schemas.BifrostErrorExtraFields{
+			Provider:       provider.GetProviderKey(),
+			ModelRequested: request.Model,
+			RequestType:    schemas.ImageGenerationRequest,
+		}
+		return nil, convErr
+	}
🤖 Prompt for AI Agents
In `@core/providers/gemini/gemini.go` around lines 1614-1671, in
ImageGeneration, avoid a nil-deref when
geminiResponse.ToBifrostImageGenerationResponse() returns (nil, nil): check if
response == nil and bifrostErr == nil and in that case create and return a
proper *schemas.BifrostError (or populate bifrostErr) before accessing
bifrostErr.ExtraFields; similarly ensure bifrostErr is non-nil before assigning
bifrostErr.ExtraFields and only set response.ExtraFields after confirming
response != nil (references: ImageGeneration,
geminiResponse.ToBifrostImageGenerationResponse, variables response and
bifrostErr).

Contributor Author

no changes here.

Contributor

@Radheshg04 Understood.

Quick note on the specific edge: at Line 1646–1653, if ToBifrostImageGenerationResponse() ever returns (nil, nil), the next line dereferences bifrostErr and will panic. If that converter guarantees “never (nil, nil)”, we’re fine. If not, a tiny guard avoids the nil-deref without changing behavior:

resp, convErr := geminiResponse.ToBifrostImageGenerationResponse()
if convErr != nil {
	convErr.ExtraFields = schemas.BifrostErrorExtraFields{
		Provider:       provider.GetProviderKey(),
		ModelRequested: request.Model,
		RequestType:    schemas.ImageGenerationRequest,
	}
	return nil, convErr
}
if resp == nil {
	convErr := providerUtils.NewBifrostOperationError(
		schemas.ErrProviderResponseDecode,
		fmt.Errorf("empty image generation response"),
		provider.GetProviderKey(),
	)
	convErr.ExtraFields = schemas.BifrostErrorExtraFields{
		Provider:       provider.GetProviderKey(),
		ModelRequested: request.Model,
		RequestType:    schemas.ImageGenerationRequest,
	}
	return nil, convErr
}

If you’d like, I can open a follow-up issue to track this as a non-blocking hardening item.

Comment on lines +41 to +69
// ToBifrostImageGenerationRequest converts an OpenAI image generation request to Bifrost format
func (request *OpenAIImageGenerationRequest) ToBifrostImageGenerationRequest() *schemas.BifrostImageGenerationRequest {
	if request == nil {
		return nil
	}

	provider, model := schemas.ParseModelString(request.Model, schemas.OpenAI)

	// Only set Params if the embedded struct is non-empty to avoid always emitting empty params
	var params *schemas.ImageGenerationParameters
	if request.N != nil || request.Background != nil || request.Moderation != nil ||
		request.PartialImages != nil || request.Size != nil || request.Quality != nil ||
		request.OutputCompression != nil || request.OutputFormat != nil || request.Style != nil ||
		request.ResponseFormat != nil || request.Seed != nil || request.NegativePrompt != nil ||
		request.NumInferenceSteps != nil || request.User != nil ||
		len(request.ExtraParams) > 0 {
		params = &request.ImageGenerationParameters
	}

	return &schemas.BifrostImageGenerationRequest{
		Provider: provider,
		Model:    model,
		Input: &schemas.ImageGenerationInput{
			Prompt: request.Prompt,
		},
		Params:    params,
		Fallbacks: schemas.ParseFallbacks(request.Fallbacks),
	}
}
Contributor

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

(Condensed: repository verification scripts traced ParseModelString, the OpenAI integration's RequestConverter flow, and route registration; both the inference handler and the OpenAI integration register POST /v1/images/generations, and only the inference handler validates the model format before converting.)

Fix the unvalidated model string handling in the OpenAI integration's image generation route.

The integration handler for /v1/images/generations (registered first) bypasses model format validation and directly calls ToBifrostImageGenerationRequest(), which defaults to schemas.OpenAI when the model string lacks a provider prefix. This creates a routing bug: unprefixed model strings (e.g., gpt-4o) will always be assigned the OpenAI provider, even when the request should be routed to another provider based on gateway configuration.

Add model format validation in the OpenAI integration's image generation route (either in a PreCallback or RequestParser) to enforce that model strings follow the provider/model format, matching the validation already present in the inference handler.

Comment on lines +546 to +584
type ImageGenerationEventType string

// OpenAIImageGenerationRequest is the struct for Image Generation requests by OpenAI.
type OpenAIImageGenerationRequest struct {
	Model  string `json:"model"`
	Prompt string `json:"prompt"`

	schemas.ImageGenerationParameters

	Stream    *bool    `json:"stream,omitempty"`
	Fallbacks []string `json:"fallbacks,omitempty"`
}

// IsStreamingRequested implements the StreamingRequest interface
func (r *OpenAIImageGenerationRequest) IsStreamingRequested() bool {
	return r.Stream != nil && *r.Stream
}

// OpenAIImageStreamResponse is the struct for Image Generation streaming responses by OpenAI.
type OpenAIImageStreamResponse struct {
	Type              ImageGenerationEventType `json:"type,omitempty"`
	SequenceNumber    *int                     `json:"sequence_number,omitempty"`
	B64JSON           *string                  `json:"b64_json,omitempty"`
	PartialImageIndex *int                     `json:"partial_image_index,omitempty"`
	CreatedAt         int64                    `json:"created_at,omitempty"`
	Size              string                   `json:"size,omitempty"`
	Quality           string                   `json:"quality,omitempty"`
	Background        string                   `json:"background,omitempty"`
	OutputFormat      string                   `json:"output_format,omitempty"`
	RawSSE            string                   `json:"-"` // For internal use
	Usage             *schemas.ImageUsage      `json:"usage,omitempty"`
	// Error fields for error events
	Error *struct {
		Code    *string `json:"code,omitempty"`
		Message string  `json:"message,omitempty"`
		Param   *string `json:"param,omitempty"`
		Type    *string `json:"type,omitempty"`
	} `json:"error,omitempty"`
}
Contributor

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

(Condensed: repository searches confirmed ImageGenerationEventType is declared in types.go while its image_generation.* event constants are defined in images.go, and that other providers co-locate event types with their constants.)

Move image generation event type constants to types.go alongside the type definition.

The constants (ImageGenerationPartial, ImageGenerationCompleted, ImageGenerationError) currently defined in images.go should be moved to types.go where ImageGenerationEventType is declared. This follows the repo convention established by other providers (Mistral, Anthropic, Cohere), where event type definitions and their constants are co-located for better organization and discoverability.

🤖 Prompt for AI Agents
In `@core/providers/openai/types.go` around lines 546-584: move the
ImageGeneration event constants from images.go into types.go next to the
ImageGenerationEventType declaration: define ImageGenerationPartial,
ImageGenerationCompleted, and ImageGenerationError as constants of type
ImageGenerationEventType (e.g. const ImageGenerationPartial
ImageGenerationEventType = "partial", etc.), remove their definitions from
images.go, and update any references/imports if needed so code uses the
relocated constants; keep the constants in types.go immediately adjacent to the
ImageGenerationEventType type definition for consistency with other providers.

Comment on lines +1493 to +1559
// Auth query is used for fine-tuned models to pass the API key in the query string
authQuery := ""
// Determine the URL based on model type
var completeURL string
if schemas.IsAllDigitsASCII(deployment) {
	// Custom Fine-tuned models use OpenAPI endpoint
	projectNumber := key.VertexKeyConfig.ProjectNumber
	if projectNumber == "" {
		return nil, providerUtils.NewConfigurationError("project number is not set for fine-tuned models", providerName)
	}
	if key.Value != "" {
		authQuery = fmt.Sprintf("key=%s", url.QueryEscape(key.Value))
	}
	if region == "global" {
		completeURL = fmt.Sprintf("https://aiplatform.googleapis.com/v1beta1/projects/%s/locations/global/endpoints/%s:generateContent", projectNumber, deployment)
	} else {
		completeURL = fmt.Sprintf("https://%s-aiplatform.googleapis.com/v1beta1/projects/%s/locations/%s/endpoints/%s:generateContent", region, projectNumber, region, deployment)
	}

} else if schemas.IsImagenModel(deployment) {
	// Imagen models are published models, use publishers/google/models path
	if key.Value != "" {
		authQuery = fmt.Sprintf("key=%s", url.QueryEscape(key.Value))
	}
	if region == "global" {
		completeURL = fmt.Sprintf("https://aiplatform.googleapis.com/v1/projects/%s/locations/global/publishers/google/models/%s:predict", projectID, deployment)
	} else {
		completeURL = fmt.Sprintf("https://%s-aiplatform.googleapis.com/v1/projects/%s/locations/%s/publishers/google/models/%s:predict", region, projectID, region, deployment)
	}
} else if schemas.IsGeminiModel(deployment) {
	if key.Value != "" {
		authQuery = fmt.Sprintf("key=%s", url.QueryEscape(key.Value))
	}
	if region == "global" {
		completeURL = fmt.Sprintf("https://aiplatform.googleapis.com/v1/projects/%s/locations/global/publishers/google/models/%s:generateContent", projectID, deployment)
	} else {
		completeURL = fmt.Sprintf("https://%s-aiplatform.googleapis.com/v1/projects/%s/locations/%s/publishers/google/models/%s:generateContent", region, projectID, region, deployment)
	}
}

// Create HTTP request for streaming
req := fasthttp.AcquireRequest()
resp := fasthttp.AcquireResponse()
defer fasthttp.ReleaseRequest(req)
defer fasthttp.ReleaseResponse(resp)

req.Header.SetMethod(http.MethodPost)
req.Header.SetContentType("application/json")
providerUtils.SetExtraHeaders(ctx, req, provider.networkConfig.ExtraHeaders, nil)

// If auth query is set, add it to the URL
// Otherwise, get the oauth2 token and set the Authorization header
if authQuery != "" {
	completeURL = fmt.Sprintf("%s?%s", completeURL, authQuery)
} else {
	// Getting oauth2 token
	tokenSource, err := getAuthTokenSource(key)
	if err != nil {
		return nil, providerUtils.NewBifrostOperationError("error creating auth token source", err, schemas.Vertex)
	}
	token, err := tokenSource.Token()
	if err != nil {
		return nil, providerUtils.NewBifrostOperationError("error getting token", err, schemas.Vertex)
	}
	req.Header.Set("Authorization", "Bearer "+token.AccessToken)
}

Contributor

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🌐 Web query:

Vertex AI Imagen predict endpoint API key authentication support

💡 Result:

Short answer: No — the Vertex AI Imagen predict endpoint does not accept API key authentication. Use OAuth2 (service-account or user access tokens / Application Default Credentials) and send a Bearer token in the Authorization header. See the Imagen predict examples that use "Authorization: Bearer $(gcloud auth print-access-token)". [1][2][3]

Example (curl):
curl -X POST -H "Authorization: Bearer $(gcloud auth print-access-token)" -H "Content-Type: application/json" "https://{LOCATION}-aiplatform.googleapis.com/v1/projects/{PROJECT}/locations/{LOCATION}/publishers/google/models/{MODEL}:predict" -d '{ "instances":[{ "prompt":"..." }] }'

References:

  • Imagen API docs (predict examples use Bearer token). [1]
  • models.predict REST reference. [2]
  • Issue/discussion reporting that Vertex AI rejects API keys ("API keys are not supported by this API"). [3]

🌐 Web query:

Vertex AI custom fine-tuned models endpoint authentication requirements

💡 Result:

Short answer — Vertex AI endpoints require Google OAuth 2.0 bearer tokens (service-account or user identity via Application Default Credentials / gcloud). API keys are not supported. You can call endpoints with an access/identity token in the Authorization: Bearer header, and for deployed custom models you may configure a custom service account (prediction containers otherwise run with a Vertex-managed service account). [1][3][4]

Sources:

  • How to authenticate API requests to Vertex AI (use gcloud/ADC/service account tokens). [1][4]
  • Vertex AI deployed prediction containers and custom service account behavior. [3]

🌐 Web query:

Vertex AI Gemini generateContent endpoint API key vs OAuth2 authentication

💡 Result:

Short answer — both work, but use API keys for quick tests and OAuth2/service‑account (Application Default Credentials) for production.

Key differences

  • How to send:
    • API key: supply x-goog-api-key header (or ?key=) when calling generateContent. [1][3]
    • OAuth2 / service account: obtain an access token and send Authorization: Bearer <token>. Vertex APIs accept Google Cloud credentials/ADC. [2]
  • Security & scope:
    • API keys are simple but bearer-like: less granular, riskier if leaked, and should be restricted (HTTP referrer/IP/app restrictions). Not recommended for client apps. [1]
    • OAuth2/service accounts use IAM, scoped OAuth tokens, fine-grained permissions, audit logs, and are recommended for production. [1][2]
  • Use cases:
    • API key: rapid testing, prototypes, demos. [1]
    • OAuth2/service account (ADC): production servers, pipelines, backend services, CI/CD, and cases needing IAM/auditing. [1][2]
  • Quotas, billing, and IAM:
    • Both consume project quota/billing, but only OAuth2 ties calls to a service account / IAM policy for fine-grained access control and auditing. [2]

Recommendation

  • Development/test: API key with restrictions. [1]
  • Production: use OAuth2 via service accounts / Application Default Credentials (ADC) for secure, auditable access. [1][2]

Sources
[1] Google Cloud — Get a Google Cloud API key (Vertex AI / Gemini).
[2] Google Cloud — Vertex AI Generative AI REST reference (generateContent endpoints).
[3] Google — Gemini API authentication (x-goog-api-key example).


API-key authentication will not work for Vertex AI Imagen and custom fine-tuned model endpoints.

The code uses ?key=... query parameter authentication for custom fine-tuned models (lines 1503–1509), Imagen (lines 1511–1520), and Gemini (lines 1522–1530). However, Vertex AI Imagen predict endpoints and custom fine-tuned model endpoints do not accept API key authentication and will return "API keys are not supported by this API" errors. These endpoints require OAuth2 Bearer tokens.

Gemini's generateContent endpoint does support API key auth, so that path is acceptable. However, for consistency and to support all model types reliably, switch to OAuth2 (Bearer tokens via ADC or service-account credentials) for all three branches, or remove API key support from Imagen and custom fine-tuned model paths.

🤖 Prompt for AI Agents
In `@core/providers/vertex/vertex.go` around lines 1493 - 1559, The
IsAllDigitsASCII (custom fine-tuned) and IsImagenModel branches currently set
authQuery from key.Value which enables API-key auth; update those branches to
stop populating authQuery and instead always obtain an OAuth2 token via
getAuthTokenSource(key) and set req.Header.Set("Authorization", "Bearer
"+token.AccessToken) (same approach already used later), keeping the existing
projectNumber/URL construction logic via completeURL; leave the IsGeminiModel
branch behavior unchanged if you want to continue supporting API key for Gemini.
Ensure authQuery remains empty for Imagen and fine-tuned paths so the later "if
authQuery != \"\" { completeURL = fmt.Sprintf(...)} else { tokenSource... }"
flow uses OAuth2 for those models.

Comment on lines +21 to +63
size:
  type: string
  enum:
    - "256x256"
    - "512x512"
    - "1024x1024"
    - "1792x1024"
    - "1024x1792"
    - "1536x1024"
    - "1024x1536"
  description: Size of the generated image
quality:
  type: string
  enum:
    - "standard"
    - "hd"
  description: Quality of the generated image
style:
  type: string
  enum:
    - "natural"
    - "vivid"
  description: Style of the generated image
response_format:
  type: string
  enum:
    - "url"
    - "b64_json"
  default: "url"
  description: Format of the response. This parameter is not supported for streaming requests.
user:
  type: string
  description: User identifier for tracking
stream:
  type: boolean
  default: false
  description: |
    Whether to stream the response. When true, images are sent as base64 chunks via SSE.
fallbacks:
  type: array
  items:
    type: string
  description: Fallback models to try if primary model fails

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🌐 Web query:

OpenAI Images API supported size values 1536x1024 1024x1536 quality style

💡 Result:

Yes — the Images API supports those sizes and quality options.

  • Supported sizes: 1024x1024, 1536x1024 (landscape), 1024x1536 (portrait), plus "auto". [1]
  • Quality options: low, medium, high, and "auto" (affects token usage/cost). [1]
  • Cost/token differences by size × quality are documented per image model (e.g., gpt-image-1/gpt-image-1.5). [2][3]
  • There is no separate "style" parameter in the Images API docs — control style via your prompt (or model choice); other configurable fields include format, compression, background, and action. [1][3]

Sources: OpenAI Images guide and model docs. [1][2][3]

References
[1] OpenAI — Image generation (Customize output: size, quality, auto).
[2] OpenAI — GPT Image 1 model pricing (sizes/quality).
[3] OpenAI — GPT Image 1.5 / chatgpt-image-latest model pages (sizes/quality).


🌐 Web query:

OpenAI Images API official documentation parameters size quality style

💡 Result:

Short summary of the Images API parameters you asked about:

  • size — Image dimensions (examples: "1024x1024", "1536x1024", "1024x1536"); also supports "auto" (model picks). Default 1024x1024. (platform.openai.com)
  • quality — Rendering quality options: "low", "medium", "high" (and "auto"); for DALL·E‑3 there is also "standard" (default) and "hd" (higher‑quality/higher‑cost). (platform.openai.com)
  • style — Optional control for overall look; current valid values include "vivid" (default for some endpoints) and "natural". (help.openai.com)

For full details (model-specific allowed sizes, defaults, compression/output format, and token/cost effects) see the official Image Generation docs and DALL·E 3 help article. (platform.openai.com)


🌐 Web query:

OpenAI Images API streaming SSE support image generation

💡 Result:

Yes — OpenAI’s Images/Responses APIs support streaming image generation (Server‑Sent Events style). You can call the API with stream=true and use the partial_images parameter (0–3) to receive partial image frames as they are generated; SDK examples and event types are shown in the docs. [1][2][4]


Correct the size enum — remove unsupported OpenAI values and add "auto".

The size enum includes values not supported by OpenAI's official Images API:

  • "256x256", "512x512", "1792x1024", "1024x1792" do not exist in OpenAI's API
  • The supported OpenAI sizes are: "1024x1024", "1536x1024" (landscape), "1024x1536" (portrait), and "auto" (model-selected; default is 1024x1024)
  • Adjust to match the official OpenAI documentation

The quality enum ("standard", "hd") and style enum ("natural", "vivid") are accurate for DALL·E 3, but note that other OpenAI models support different quality values ("low", "medium", "high", "auto"). If this schema targets DALL·E 3 exclusively, document that; otherwise, clarify which models these parameters apply to.

Streaming is supported by OpenAI's Images API (with stream=true and partial_images parameter), but verify the exact event schema/types for OpenAIImageStreamResponse match OpenAI's official streaming format before finalizing.

Also applies to: 92-129

🤖 Prompt for AI Agents
In `@docs/openapi/schemas/integrations/openai/images.yaml` around lines 21 - 63,
Update the size enum in the images schema to match OpenAI's supported values:
remove "256x256", "512x512", "1792x1024", and "1024x1792" and add "auto",
leaving "1024x1024", "1536x1024", and "1024x1536"; ensure the default remains or
is set to "1024x1024" as documented. Keep the existing quality and style enums
if this schema is DALL·E 3-specific, but add a short note in the schema
description for the quality/style fields (quality, style) clarifying they apply
to DALL·E 3 only or expand allowed values if the schema is meant for multiple
OpenAI models. Also verify the stream field description and the
response_format/streaming contract (response_format, stream) against OpenAI's
official streaming event shape (partial_images/SSE) and update the
OpenAIImageStreamResponse types elsewhere if needed.
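
The corrected enum can be captured in a small validation sketch. This assumes the fix proposed above (gpt-image sizes plus "auto"); `openAIImageSizes` and `validateSize` are illustrative names, not Bifrost code.

```go
package main

import "fmt"

// openAIImageSizes lists the size values OpenAI's current Images API
// documents for gpt-image models: three fixed sizes plus "auto".
// Quality "standard"/"hd" and style "natural"/"vivid" remain DALL·E 3-only.
var openAIImageSizes = map[string]bool{
	"1024x1024": true,
	"1536x1024": true, // landscape
	"1024x1536": true, // portrait
	"auto":      true,
}

// validateSize rejects values outside the documented enum; an empty
// value is allowed because OpenAI defaults to 1024x1024.
func validateSize(size string) error {
	if size == "" {
		return nil
	}
	if !openAIImageSizes[size] {
		return fmt.Errorf("unsupported size %q", size)
	}
	return nil
}

func main() {
	fmt.Println(validateSize("512x512")) // unsupported size "512x512"
	fmt.Println(validateSize("auto"))    // <nil>
}
```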

Comment on lines +489 to +610
# 8. Image Generation

Gemini supports two image generation formats depending on the model:

1. **Standard Gemini Format**: Uses the `/v1beta/models/{model}:generateContent` endpoint
2. **Imagen Format**: Uses the `/v1beta/models/{model}:predict` endpoint for Imagen models (detected automatically)

## Request Parameters

### Parameter Mapping

| Parameter | Transformation |
|-----------|----------------|
| `input.prompt` | Text description of the image to generate |
| `params.n` | Number of images (mapped to `sampleCount` for Imagen, `candidateCount` for Gemini) |
| `params.size` | Image size in WxH format (e.g., `"1024x1024"`). Converted to Imagen's `imageSize` + `aspectRatio` format |
| `params.output_format` | Output format: `"png"`, `"jpeg"`, `"webp"`. Converted to MIME type for Imagen |
| `params.seed` | Seed for reproducible generation (passed directly) |
| `params.negative_prompt` | Negative prompt (passed directly) |

### Extra Parameters

Use `extra_params` (SDK) or pass directly in request body (Gateway) for Gemini-specific fields:

| Parameter | Type | Notes |
|-----------|------|-------|
| `personGeneration` | string | Person generation setting (Imagen only) |
| `language` | string | Language code (Imagen only) |
| `enhancePrompt` | bool | Prompt enhancement flag (Imagen only) |
| `safetySettings` / `safety_settings` | string/array | Safety settings configuration |
| `cachedContent` / `cached_content` | string | Cached content ID |
| `labels` | object | Custom labels map |

<Tabs>
<Tab title="Gateway">

```bash
curl -X POST http://localhost:8080/v1/images/generations \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini/imagen-4.0-generate-001",
    "prompt": "A sunset over the mountains",
    "size": "1024x1024",
    "n": 2,
    "output_format": "png"
  }'
```

</Tab>
<Tab title="Go SDK">

```go
resp, err := client.ImageGenerationRequest(ctx, &schemas.BifrostImageGenerationRequest{
	Provider: schemas.Gemini,
	Model:    "imagen-4.0-generate-001",
	Input: &schemas.ImageGenerationInput{
		Prompt: "A sunset over the mountains",
	},
	Params: &schemas.ImageGenerationParameters{
		Size:         schemas.Ptr("1024x1024"),
		N:            schemas.Ptr(2),
		OutputFormat: schemas.Ptr("png"),
	},
})
```

</Tab>
</Tabs>

## Request Conversion

### Standard Gemini Format

- **Model & Prompt**: `bifrostReq.Model` → `req.Model`, `bifrostReq.Input.Prompt` → `req.Contents[0].Parts[0].Text`
- **Response Modality**: Sets `generationConfig.responseModalities = ["IMAGE"]` to indicate image generation
- **Number of Images**: `params.n` → `generationConfig.candidateCount`
- **Extra Parameters**: `safetySettings`, `cachedContent`, `labels` mapped directly

### Imagen Format

- **Prompt**: `bifrostReq.Input.Prompt` → `req.Instances[0].Prompt`
- **Number of Images**: `params.n` → `req.Parameters.SampleCount`
- **Size Conversion**: `params.size` (WxH format) converted to:
  - `imageSize`: `"1k"` (if dimensions ≤ 1024) or `"2k"` (if dimensions ≤ 2048)
  - `aspectRatio`: `"1:1"`, `"3:4"`, `"4:3"`, `"9:16"`, or `"16:9"` (based on width/height ratio)
- **Output Format**: `params.output_format` (`"png"`, `"jpeg"`) → `parameters.outputOptions.mimeType` (`"image/png"`, `"image/jpeg"`)
- **Seed & Negative Prompt**: Passed directly to `parameters.seed` and `parameters.negativePrompt`
- **Extra Parameters**: `personGeneration`, `language`, `enhancePrompt`, `safetySettings` mapped to parameters

## Response Conversion

### Standard Gemini Format

- **Image Data**: Extracts `InlineData` from `candidates[0].content.parts[]` with MIME type `image/*`
- **Output Format**: Converts MIME type (`image/png`, `image/jpeg`, `image/webp`) → file extension (`png`, `jpeg`, `webp`)
- **Usage**: Extracts token usage from `usageMetadata`
- **Multiple Images**: Each image part becomes an `ImageData` entry in the response array

### Imagen Format

- **Image Data**: Each `prediction` in `response.predictions[]` → `ImageData` with `b64_json` from `bytesBase64Encoded`
- **Output Format**: Converts `prediction.mimeType` → file extension for `outputFormat` field
- **Index**: Each prediction gets an `index` (0, 1, 2, ...) in the response array

## Size Conversion

For Imagen format, size is converted between formats:
- **Bifrost → Imagen**: `"1024x1024"` → `imageSize: "1k"`, `aspectRatio: "1:1"`
- **Bifrost → Imagen**: `"768x1024"` → `imageSize: "1k"`, `aspectRatio: "3:4"`
- **Imagen → Bifrost**: `imageSize: "1k"`, `aspectRatio: "1:1"` → `"1024x1024"`
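
The Bifrost → Imagen direction can be sketched as follows. This is a minimal illustration of the mapping above, not the provider's actual code; `toImagenSize` is a hypothetical name, and the nearest-ratio snapping is an assumption about how off-grid sizes are handled.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// toImagenSize converts a Bifrost "WxH" size string into Imagen's
// imageSize + aspectRatio pair.
func toImagenSize(size string) (imageSize, aspectRatio string, err error) {
	parts := strings.Split(size, "x")
	if len(parts) != 2 {
		return "", "", fmt.Errorf("size must be WxH, got %q", size)
	}
	w, errW := strconv.Atoi(parts[0])
	h, errH := strconv.Atoi(parts[1])
	if errW != nil || errH != nil || w <= 0 || h <= 0 {
		return "", "", fmt.Errorf("invalid dimensions in %q", size)
	}
	// Resolution bucket: "1k" up to 1024 px, "2k" up to 2048 px.
	longest := w
	if h > longest {
		longest = h
	}
	switch {
	case longest <= 1024:
		imageSize = "1k"
	case longest <= 2048:
		imageSize = "2k"
	default:
		return "", "", fmt.Errorf("unsupported resolution %q", size)
	}
	// Snap the W:H ratio to the nearest supported aspect ratio.
	ratios := map[string]float64{
		"1:1": 1, "4:3": 4.0 / 3, "3:4": 3.0 / 4,
		"16:9": 16.0 / 9, "9:16": 9.0 / 16,
	}
	r := float64(w) / float64(h)
	best, bestDiff := "", -1.0
	for name, v := range ratios {
		d := r - v
		if d < 0 {
			d = -d
		}
		if bestDiff < 0 || d < bestDiff {
			best, bestDiff = name, d
		}
	}
	return imageSize, best, nil
}

func main() {
	s, a, _ := toImagenSize("1024x1024")
	fmt.Println(s, a) // 1k 1:1
	s, a, _ = toImagenSize("768x1024")
	fmt.Println(s, a) // 1k 3:4
}
```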

## Endpoint Selection

The provider automatically selects the endpoint based on model name:
- **Imagen models** (detected via `schemas.IsImagenModel()`): Uses `/v1beta/models/{model}:predict` endpoint
- **Other models**: Uses `/v1beta/models/{model}:generateContent` endpoint with image response modality
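
The selection rule can be sketched as a one-branch helper. A minimal illustration: `isImagenModel` below is a simplified stand-in for `schemas.IsImagenModel`, whose real detection logic may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// isImagenModel is a simplified stand-in for schemas.IsImagenModel,
// assumed here to match on the "imagen-" model-name prefix.
func isImagenModel(model string) bool {
	return strings.HasPrefix(model, "imagen-")
}

// endpointFor mirrors the rule above: Imagen models go to :predict,
// everything else to :generateContent with image response modality.
func endpointFor(model string) string {
	if isImagenModel(model) {
		return fmt.Sprintf("/v1beta/models/%s:predict", model)
	}
	return fmt.Sprintf("/v1beta/models/%s:generateContent", model)
}

func main() {
	fmt.Println(endpointFor("imagen-4.0-generate-001"))
	fmt.Println(endpointFor("gemini-2.0-flash"))
}
```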

## Streaming

Image generation streaming is not supported by Gemini.

---

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Search for /v1/images/generations endpoint definition
echo "=== Searching for /v1/images/generations endpoint ==="
rg -n --hidden --glob '!**/vendor/**' '/v1/images/generations|images/generations' -C 3

echo ""
echo "=== Searching for request DTO/schema fields ==="
rg -n --hidden --glob '!**/vendor/**' 'ImageGeneration.*(Request|Parameters)|output_format|response_format' -C 2

echo ""
echo "=== Searching for OpenAPI/Gateway schema files ==="
fd -e yaml -e yml -e json | head -20 | xargs grep -l 'images/generations' 2>/dev/null || echo "No matches in yaml/json files"

Repository: maximhq/bifrost

Length of output: 50374


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Look for BifrostImageGenerationRequest struct definition
echo "=== Searching for BifrostImageGenerationRequest ==="
rg -n --hidden --glob '!**/vendor/**' 'BifrostImageGenerationRequest' -A 10

echo ""
echo "=== Searching for gateway request handling (images endpoint) ==="
rg -n --hidden --glob '!**/vendor/**' 'generateContent.*endpoint|predict.*endpoint|HandleImageGeneration' -C 3

Repository: maximhq/bifrost

Length of output: 50371


🏁 Script executed:

#!/bin/bash
# Check if there's PR stack information
git log --oneline -20 | head -10
git branch -a | grep -E 'stack|parent|base' || echo "No stack branches found"

Repository: maximhq/bifrost

Length of output: 302


Add explicit clarification for Gateway vs SDK request schemas.

The mapping table describes parameter transformations but doesn't state that the Gateway endpoint accepts OpenAI-compatible flat fields (prompt, size, n, output_format) while the SDK uses BifrostImageGenerationRequest with nested Input/Params structure. Add a brief note above the mapping table clarifying: "Gateway uses OpenAI Images API schema (flat fields); SDK uses BifrostImageGenerationRequest with Input and Params objects. The table below shows how these map internally."

🧰 Tools
🪛 LanguageTool

[style] ~562-~562: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...Model & Prompt*: bifrostReq.Modelreq.Model, bifrostReq.Input.Prompt → `req.Cont...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

🤖 Prompt for AI Agents
In `@docs/providers/supported-providers/gemini.mdx` around lines 489 - 610, Add a
brief clarifying sentence above the "Parameter Mapping" table stating that the
Gateway accepts OpenAI-compatible flat fields (e.g., prompt, size, n,
output_format) while the SDK uses the BifrostImageGenerationRequest with nested
Input and Params objects (Input.Prompt, Params.Size, Params.N,
Params.OutputFormat); keep the sentence short and then note that the table shows
how those Gateway flat fields and SDK nested fields map internally.

@Radheshg04 Radheshg04 closed this Jan 16, 2026