feat: add GCP Vertex AI image generation support #1905

Open
altale wants to merge 10 commits into envoyproxy:main from altale:vertexai-image-gen-support

Conversation

@altale altale commented Feb 28, 2026

Description

This PR adds OpenAI-compatible image generation translation for GCP Vertex AI with two backend paths:

  • Imagen models (imagen-*) -> Vertex predict endpoint.
  • Gemini image models (e.g. gemini-2.5-flash-image) -> Vertex generateContent endpoint.
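
As a rough illustration of the routing rule above, the following sketch picks the Vertex AI method from the model name. The prefix check mirrors the `strings.HasPrefix(..., "imagen")` test used in the PR; the helper names `vertexMethodFor` and `modelPathSuffix`, the constants, and the example Imagen model name are hypothetical and are not the PR's identifiers.

```go
package main

import (
	"fmt"
	"strings"
)

// Illustrative constants; the PR uses its own identifiers for these methods.
const (
	methodPredict         = "predict"
	methodGenerateContent = "generateContent"
)

// vertexMethodFor returns "predict" for models whose name starts with
// "imagen" and "generateContent" for everything else.
func vertexMethodFor(model string) string {
	if strings.HasPrefix(model, "imagen") {
		return methodPredict
	}
	return methodGenerateContent
}

// modelPathSuffix is a hypothetical helper that builds the
// publisher-model path suffix for the chosen method.
func modelPathSuffix(model string) string {
	return fmt.Sprintf("publishers/google/models/%s:%s", model, vertexMethodFor(model))
}

func main() {
	// The Imagen model name here is only an example value.
	for _, m := range []string{"imagen-4.0-generate-001", "gemini-2.5-flash-image"} {
		fmt.Println(modelPathSuffix(m))
	}
}
```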

Related Issues/PRs (if applicable)

#1806

Special notes for reviewers (if applicable)

Behavior Summary

  • size handling:
    • 1024x1024 -> aspectRatio: "1:1" + sampleImageSize: "1K".
    • "" or "auto" -> omit explicit size fields to preserve backend default.
    • any other size -> ErrInvalidRequestBody.
  • quality is not mapped and is rejected (ErrInvalidRequestBody).
  • output_format supports only png and jpeg. Unsupported formats (e.g. webp) are rejected.
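
The rules above could be expressed roughly as follows. This is a sketch, not the PR's code: `errInvalidRequestBody` stands in for `ErrInvalidRequestBody`, the function and type names are hypothetical, and two details are assumptions: that an empty `quality` or `output_format` means "not set", and that `png`/`jpeg` map to the `image/png`/`image/jpeg` MIME types.

```go
package main

import (
	"errors"
	"fmt"
)

// errInvalidRequestBody stands in for the PR's ErrInvalidRequestBody sentinel.
var errInvalidRequestBody = errors.New("invalid request body")

// imagenSize holds the Imagen size fields; empty strings mean "omit the field".
type imagenSize struct {
	AspectRatio     string
	SampleImageSize string
}

// mapSize applies the size rules: "" and "auto" keep the backend default,
// "1024x1024" maps to 1:1 at 1K, and anything else is rejected.
func mapSize(size string) (imagenSize, error) {
	switch size {
	case "", "auto":
		return imagenSize{}, nil
	case "1024x1024":
		return imagenSize{AspectRatio: "1:1", SampleImageSize: "1K"}, nil
	default:
		return imagenSize{}, fmt.Errorf("%w: unsupported size %q", errInvalidRequestBody, size)
	}
}

// rejectQuality fails whenever quality is set (assumption: empty means unset).
func rejectQuality(quality string) error {
	if quality != "" {
		return fmt.Errorf("%w: quality is not supported", errInvalidRequestBody)
	}
	return nil
}

// mapOutputFormat accepts only png and jpeg and returns an assumed MIME type.
func mapOutputFormat(format string) (string, error) {
	switch format {
	case "":
		return "", nil
	case "png":
		return "image/png", nil
	case "jpeg":
		return "image/jpeg", nil
	default:
		return "", fmt.Errorf("%w: unsupported output_format %q", errInvalidRequestBody, format)
	}
}

func main() {
	for _, s := range []string{"1024x1024", "auto", "512x512"} {
		got, err := mapSize(s)
		fmt.Printf("size=%q -> %+v, err=%v\n", s, got, err)
	}
	_, err := mapOutputFormat("webp")
	fmt.Println(errors.Is(err, errInvalidRequestBody))
}
```

Wrapping the sentinel with `%w` keeps `errors.Is` checks working while still carrying a specific message.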

@altale altale requested a review from a team as a code owner February 28, 2026 14:09
@dosubot dosubot bot added the size:XL This PR changes 500-999 lines, ignoring generated files. label Feb 28, 2026
@altale altale marked this pull request as draft February 28, 2026 14:09
@altale altale force-pushed the vertexai-image-gen-support branch from 4d1cd25 to 4e806ce Compare February 28, 2026 14:17
@altale altale marked this pull request as ready for review February 28, 2026 14:18
@altale altale force-pushed the vertexai-image-gen-support branch from f7bfead to 509b0bc Compare March 2, 2026 03:47
@altale altale marked this pull request as draft March 2, 2026 05:46
altale and others added 2 commits March 4, 2026 01:16
Add support for translating OpenAI /v1/images/generations requests to
GCP Vertex AI image generation APIs with two translation paths:

- Imagen models (imagen-*): uses predict endpoint with parameter mapping
  for aspect ratio, image size, output format, and compression quality.
- Gemini image models: uses generateContent endpoint.

Includes 19 unit tests and 3 end-to-end integration tests.

Supersedes envoyproxy#1806

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: zhihao.zhao <zhihao.zhao@lycorp.co.jp>
- preserve explicit output_compression=0 for Imagen requests

- enforce cross-provider size/quality/output_format validation

- keep backend default behavior for size="" and size="auto"

- fix Gemini image usage metrics (input/output/total token usage)

- update and deduplicate translator tests and data-plane fixture

Signed-off-by: zhihao.zhao <zhihao.zhao@lycorp.co.jp>
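
One common Go pattern for preserving an explicit `output_compression=0` is a pointer field with `omitempty`, so that "unset" and "zero" encode differently. The sketch below shows only that pattern; the `outputOptions` type and its JSON field names are assumptions for illustration, and whether the PR implements it this way is not confirmed by this thread.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// outputOptions is a hypothetical stand-in for the Imagen output options.
// A *int distinguishes "not provided" (nil) from an explicit 0.
type outputOptions struct {
	MimeType           string `json:"mimeType,omitempty"`
	CompressionQuality *int   `json:"compressionQuality,omitempty"`
}

// encode marshals the options and returns the JSON as a string,
// or an empty string if marshaling fails.
func encode(o outputOptions) string {
	b, err := json.Marshal(o)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	zero := 0
	// Unset: the field is omitted, so the backend default applies.
	fmt.Println(encode(outputOptions{MimeType: "image/jpeg"}))
	// Explicit zero: the field is emitted with value 0.
	fmt.Println(encode(outputOptions{MimeType: "image/jpeg", CompressionQuality: &zero}))
}
```

With a plain `int` and `omitempty`, both cases would drop the field, which is the behavior the commit note describes avoiding.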
@altale altale force-pushed the vertexai-image-gen-support branch from b2a6670 to dfb946e Compare March 3, 2026 16:17
altale added 2 commits March 4, 2026 01:23
Signed-off-by: zhihao.zhao <zhihao.zhao@lycorp.co.jp>
Signed-off-by: zhihao.zhao <zhihao.zhao@lycorp.co.jp>
@altale altale marked this pull request as ready for review March 3, 2026 16:28
@dosubot dosubot bot added size:XXL This PR changes 1000+ lines, ignoring generated files. and removed size:XL This PR changes 500-999 lines, ignoring generated files. labels Mar 3, 2026
@nacx
Member

nacx commented Mar 16, 2026

/gemini summary

@gemini-code-assist

Summary of Changes

This pull request introduces a new feature that enables the AI Gateway to translate OpenAI-compatible image generation requests to GCP Vertex AI, supporting both Imagen and Gemini models. It includes request and response translation logic, size and format handling, and error management to ensure compatibility and proper functionality with the Vertex AI service.

Highlights

  • New Feature: Adds support for GCP Vertex AI image generation, compatible with OpenAI's image generation API.
  • Backend Support: Supports two Vertex AI backend paths: Imagen models (via predict endpoint) and Gemini image models (via generateContent endpoint).
  • Size Handling: Handles image size conversions, mapping '1024x1024' to specific aspect ratios and sizes for the Imagen backend, while also supporting backend defaults.
  • Error Handling: Rejects unsupported parameters like 'quality' and certain 'output_format' values, providing specific error messages.

Changelog

  • internal/apischema/gcp/gcp.go
    • Defines the data structures for interacting with the GCP Vertex AI Imagen predict endpoint.
  • internal/endpointspec/endpointspec.go
    • Adds GCP Vertex AI as a supported API schema for image generation, linking it to the new translator.
  • internal/translator/imagegeneration_openai_gcpvertexai.go
    • Implements the translation logic between OpenAI's image generation API and GCP Vertex AI's Imagen and Gemini endpoints, handling request formatting, response parsing, and error conversion.
  • internal/translator/imagegeneration_openai_gcpvertexai_test.go
    • Adds comprehensive unit tests for the new translator, covering request and response transformations, parameter handling, and error cases.
  • site/docs/capabilities/llm-integrations/supported-endpoints.md
    • Documents the new GCP Vertex AI image generation support, providing examples for both Imagen and Gemini models.
  • tests/data-plane/testupstream_test.go
    • Adds integration tests to verify the end-to-end functionality of the GCP Vertex AI image generation integration, including request translation, response parsing, and error handling.
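
To make the response-parsing step concrete, here is a minimal sketch of turning an Imagen predict response into an OpenAI-style image response. The struct shapes (`predictions[].bytesBase64Encoded` on the Vertex side, `data[].b64_json` on the OpenAI side) are assumptions based on the public API formats, and all type and function names are hypothetical rather than taken from `internal/apischema/gcp/gcp.go`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// imagenPrediction is an assumed shape for one predict response entry.
type imagenPrediction struct {
	BytesBase64Encoded string `json:"bytesBase64Encoded"`
	MimeType           string `json:"mimeType"`
}

// imagenResponse is an assumed shape for the predict response body.
type imagenResponse struct {
	Predictions []imagenPrediction `json:"predictions"`
}

// openAIImage and openAIImagesResponse model the OpenAI-style output.
type openAIImage struct {
	B64JSON string `json:"b64_json"`
}

type openAIImagesResponse struct {
	Created int64         `json:"created"`
	Data    []openAIImage `json:"data"`
}

// toOpenAI decodes a predict response and copies each base64 image
// into the OpenAI-style data array.
func toOpenAI(body []byte, created int64) (*openAIImagesResponse, error) {
	var in imagenResponse
	if err := json.Unmarshal(body, &in); err != nil {
		return nil, fmt.Errorf("failed to decode predict response: %w", err)
	}
	out := &openAIImagesResponse{
		Created: created,
		Data:    make([]openAIImage, 0, len(in.Predictions)),
	}
	for _, p := range in.Predictions {
		out.Data = append(out.Data, openAIImage{B64JSON: p.BytesBase64Encoded})
	}
	return out, nil
}

func main() {
	body := []byte(`{"predictions":[{"bytesBase64Encoded":"aGVsbG8=","mimeType":"image/png"}]}`)
	resp, err := toOpenAI(body, 1700000000)
	if err != nil {
		panic(err)
	}
	encoded, _ := json.Marshal(resp)
	fmt.Println(string(encoded))
}
```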

Activity

  • The pull request introduces a new translator for GCP Vertex AI image generation.
  • It includes comprehensive tests to ensure the translator functions correctly.
  • The documentation has been updated to reflect the new functionality.
  • The pull request was summarized by /gemini.

@nacx
Member

nacx commented Mar 16, 2026

/gemini review

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request adds support for GCP Vertex AI image generation, including both Imagen and Gemini models. The changes are well-structured, with a new translator to handle the conversion from OpenAI's API format to GCP's. The implementation correctly distinguishes between the two backend paths (predict for Imagen, generateContent for Gemini) and handles parameter mapping and response translation. The addition of comprehensive unit and end-to-end tests ensures the new functionality is robust. I have one suggestion to refactor a part of the code for better readability.

Comment on lines +52 to +74
	o.isImagenModel = strings.HasPrefix(o.requestModel, "imagen")
	var path string
	if o.isImagenModel {
		var imgGenReq *gcp.ImagePredictRequest
		imgGenReq, err = openAIToImagenRequest(openAIReq)
		if err != nil {
			return
		}
		newBody, err = json.Marshal(imgGenReq)
		path = buildGCPModelPathSuffix(gcpModelPublisherGoogle, string(o.requestModel), gcpMethodPredict)
	} else {
		var geminiReq *gcp.GenerateContentRequest
		geminiReq, err = openAIToGeminiRequest(openAIReq)
		if err != nil {
			return
		}
		newBody, err = json.Marshal(geminiReq)
		path = buildGCPModelPathSuffix(gcpModelPublisherGoogle, string(o.requestModel), gcpMethodGenerateContent)
	}
	if err != nil {
		err = fmt.Errorf("failed to encode request: %w", err)
		return
	}

medium

This block can be refactored to improve readability and separate the logic for creating the backend request from the JSON marshaling. By using an any type for the backend request, you can avoid duplicating the marshaling and error handling logic.

	o.isImagenModel = strings.HasPrefix(o.requestModel, "imagen")
	var path string
	var backendReq any
	if o.isImagenModel {
		backendReq, err = openAIToImagenRequest(openAIReq)
		if err != nil {
			return
		}
		path = buildGCPModelPathSuffix(gcpModelPublisherGoogle, string(o.requestModel), gcpMethodPredict)
	} else {
		backendReq, err = openAIToGeminiRequest(openAIReq)
		if err != nil {
			return
		}
		path = buildGCPModelPathSuffix(gcpModelPublisherGoogle, string(o.requestModel), gcpMethodGenerateContent)
	}

	newBody, err = json.Marshal(backendReq)
	if err != nil {
		err = fmt.Errorf("failed to encode request: %w", err)
		return
	}

Author

I prefer to keep the explicit typed branches here (ImagePredictRequest vs GenerateContentRequest) for clarity and type safety.
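
For comparison, one possible middle ground is sketched here: a small generic helper that keeps each branch statically typed while sharing the marshaling and error wrapping. This is only an illustration of the trade-off discussed in this thread, not code from the PR; `encodeRequest`, `buildBody`, and the two request structs are hypothetical stand-ins for `gcp.ImagePredictRequest` and `gcp.GenerateContentRequest`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Hypothetical stand-ins for the two typed backend requests.
type imagePredictRequest struct {
	Prompt string `json:"prompt"`
}

type generateContentRequest struct {
	Text string `json:"text"`
}

// encodeRequest marshals a typed request and wraps any failure once,
// so callers never need to hold the request in an `any` variable.
func encodeRequest[T any](req *T) ([]byte, error) {
	b, err := json.Marshal(req)
	if err != nil {
		return nil, fmt.Errorf("failed to encode request: %w", err)
	}
	return b, nil
}

// buildBody keeps one explicitly typed branch per backend.
func buildBody(isImagen bool, prompt string) ([]byte, error) {
	if isImagen {
		return encodeRequest(&imagePredictRequest{Prompt: prompt})
	}
	return encodeRequest(&generateContentRequest{Text: prompt})
}

func main() {
	for _, isImagen := range []bool{true, false} {
		body, err := buildBody(isImagen, "a cat")
		if err != nil {
			panic(err)
		}
		fmt.Println(string(body))
	}
}
```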

@codecov-commenter

codecov-commenter commented Mar 16, 2026

Codecov Report

❌ Patch coverage is 94.51220% with 9 lines in your changes missing coverage. Please review.
✅ Project coverage is 84.32%. Comparing base (7f77c8a) to head (1dcef4c).
⚠️ Report is 1 commit behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...l/translator/imagegeneration_openai_gcpvertexai.go | 95.67% | 4 Missing and 3 partials ⚠️ |
| internal/endpointspec/endpointspec.go | 0.00% | 2 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1905      +/-   ##
==========================================
+ Coverage   84.22%   84.32%   +0.09%     
==========================================
  Files         128      129       +1     
  Lines       17828    17992     +164     
==========================================
+ Hits        15016    15171     +155     
- Misses       1868     1874       +6     
- Partials      944      947       +3     

@altale altale force-pushed the vertexai-image-gen-support branch from d177345 to 36d3ed0 Compare March 17, 2026 17:24
@altale
Author

altale commented Mar 22, 2026

Hi @nacx. Update on the earlier CI failure in E2E Test for Inference Extensions. I re-ran the same changes in my own fork CI, and that job passed successfully:

Also, the scope of this PR is unrelated to the inference-extension e2e logic (it mainly touches Vertex AI image generation translation paths and related tests/docs, not the inference-extension controller/e2e framework itself).
This looks more like an upstream CI environment/timing issue rather than a functional regression.

Could you please help re-run the failed upstream job for confirmation? Thank you so much.

Labels

size:XXL This PR changes 1000+ lines, ignoring generated files.

3 participants