
feat: support Gemini native client endpoints (/v1beta/models/{model}:generateContent) #1960

Summary

Gemini CLI and the Google AI SDK hard-code the native Gemini API path format:

```
POST /v1beta/models/{model}:generateContent
POST /v1beta/models/{model}:streamGenerateContent
```

The model name is embedded in the URL, not the request body. The current extproc server uses an exact-path hash map (processorFactories), so these paths fall through with a 404 and cannot be routed.

This issue tracks adding first-class support for Gemini native client endpoints so that tools like Gemini CLI can be pointed at the gateway without modification.

Proposed Changes

1. Prefix-based path dispatch in Server (internal/extproc/server.go)

Add RegisterPrefix(prefix string, factory ProcessorFactory) alongside the existing Register. When an exact match is not found, fall back to a longest-prefix match over all registered prefix factories. This is a general mechanism — not Gemini-specific.
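A minimal sketch of the proposed dispatch, with `ProcessorFactory` simplified to a plain function for illustration (the real type lives in internal/extproc):

```go
package main

import (
	"fmt"
	"strings"
)

// ProcessorFactory is simplified to a plain function here for illustration.
type ProcessorFactory func(path string) string

// Server dispatches first by exact path, then by longest registered prefix.
type Server struct {
	exact    map[string]ProcessorFactory
	prefixes map[string]ProcessorFactory
}

func NewServer() *Server {
	return &Server{
		exact:    map[string]ProcessorFactory{},
		prefixes: map[string]ProcessorFactory{},
	}
}

func (s *Server) Register(p string, f ProcessorFactory)       { s.exact[p] = f }
func (s *Server) RegisterPrefix(p string, f ProcessorFactory) { s.prefixes[p] = f }

// processorForPath prefers an exact match; otherwise it returns the factory
// registered under the longest matching prefix, or nil if none match.
func (s *Server) processorForPath(path string) ProcessorFactory {
	if f, ok := s.exact[path]; ok {
		return f
	}
	var best string
	var bestF ProcessorFactory
	for p, f := range s.prefixes {
		if strings.HasPrefix(path, p) && len(p) > len(best) {
			best, bestF = p, f
		}
	}
	return bestF
}

func main() {
	s := NewServer()
	s.RegisterPrefix("/v1beta/", func(string) string { return "generic" })
	s.RegisterPrefix("/v1beta/models/", func(string) string { return "gemini" })
	fmt.Println(s.processorForPath("/v1beta/models/gemini-2.0-flash:generateContent")("")) // gemini
}
```

Iterating over all prefixes is O(n) per lookup, which is fine for a handful of registrations; a radix tree would only be worth it with many prefixes.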

2. NewGeminiProcessorFactory (internal/extproc/processor_impl.go)

A ProcessorFactory that:

  • Parses the model name and streaming flag from the URL path (extractGeminiModelFromPath)
  • Passes them as a pre-populated GenerateContentEndpointSpec to newRouterProcessor (router branch) or newUpstreamProcessor (upstream branch)
  • Uses NoopTracer since Gemini passthrough does not need tracing
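The path parsing could look roughly like this (a sketch of the behavior described for `extractGeminiModelFromPath`, not the actual implementation; query-string stripping, e.g. `?alt=sse` on streaming calls, is an assumption):

```go
package main

import (
	"fmt"
	"strings"
)

// extractGeminiModelFromPath pulls the model name and streaming flag out of a
// native Gemini path such as /v1beta/models/gemini-2.0-flash:streamGenerateContent.
func extractGeminiModelFromPath(p string) (model string, streaming bool, err error) {
	// Drop any query string before parsing (assumed; streaming calls often carry ?alt=sse).
	if q := strings.IndexByte(p, '?'); q >= 0 {
		p = p[:q]
	}
	const marker = "/models/"
	i := strings.LastIndex(p, marker)
	if i < 0 {
		return "", false, fmt.Errorf("no %s segment in %q", marker, p)
	}
	model, method, ok := strings.Cut(p[i+len(marker):], ":")
	if !ok || model == "" {
		return "", false, fmt.Errorf("expected model:method in %q", p)
	}
	switch method {
	case "generateContent":
		return model, false, nil
	case "streamGenerateContent":
		return model, true, nil
	}
	return "", false, fmt.Errorf("unsupported method %q", method)
}

func main() {
	m, s, _ := extractGeminiModelFromPath("/v1beta/models/gemini-2.0-flash:streamGenerateContent?alt=sse")
	fmt.Println(m, s) // gemini-2.0-flash true
}
```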

3. GenerateContentEndpointSpec (internal/endpointspec/gemini_generatecontent.go)

Endpoint spec carrying ModelFromPath string and Streaming bool. The existing GeminiToGCPVertexAI translator reads these to build the correct Vertex AI path suffix.
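As a sketch, the spec and the suffix a translator could derive from it (the `vertexAIPathSuffix` helper is illustrative, not a real function in the codebase):

```go
package main

import "fmt"

// GenerateContentEndpointSpec carries what the translator needs, per the
// description above.
type GenerateContentEndpointSpec struct {
	ModelFromPath string // parsed from the URL, e.g. "gemini-2.0-flash"
	Streaming     bool   // true for :streamGenerateContent
}

// vertexAIPathSuffix shows how the two fields select the Vertex AI method
// (illustrative helper).
func (s GenerateContentEndpointSpec) vertexAIPathSuffix() string {
	method := "generateContent"
	if s.Streaming {
		method = "streamGenerateContent"
	}
	return fmt.Sprintf("models/%s:%s", s.ModelFromPath, method)
}

func main() {
	spec := GenerateContentEndpointSpec{ModelFromPath: "gemini-2.0-flash", Streaming: true}
	fmt.Println(spec.vertexAIPathSuffix()) // models/gemini-2.0-flash:streamGenerateContent
}
```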

4. GeminiToGCPVertexAITranslator (internal/translator/gemini_gcpvertexai.go)

A near-passthrough translator: rewrites :path to the full Vertex AI generateContent / streamGenerateContent URL. Also strips FunctionResponse.ID from all parts before forwarding — the Google AI SDK (used by Gemini CLI in gemini-api auth mode) populates this field, but Vertex AI /v1 rejects it as an unknown field. Vertex AI matches function_response to function_call by name, not id, so stripping is safe.

5. EndpointPrefixes.Gemini (internal/internalapi/internalapi.go)

Adds a gemini key to ParseEndpointPrefixes, defaulting to /v1beta, so operators can remap the prefix if needed.

6. Wire-up in cmd/extproc/mainlib/main.go

```go
geminiModelPrefix := strings.TrimRight(path.Join(flags.rootPrefix, endpointPrefixes.Gemini), "/") + "/models/"
server.RegisterPrefix(geminiModelPrefix, extproc.NewGeminiProcessorFactory(generateContentMetricsFactory))
```
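For illustration, the prefix this computes under the defaults (`buildGeminiModelPrefix` is a hypothetical wrapper around the expression above):

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// buildGeminiModelPrefix mirrors the wire-up expression above; the function
// name is ours, for illustration.
func buildGeminiModelPrefix(rootPrefix, geminiPrefix string) string {
	return strings.TrimRight(path.Join(rootPrefix, geminiPrefix), "/") + "/models/"
}

func main() {
	// With an empty root prefix and the default /v1beta, every path under
	// /v1beta/models/ is dispatched to the Gemini factory.
	fmt.Println(buildGeminiModelPrefix("/", "/v1beta")) // /v1beta/models/
}
```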

Tests

  • TestServer_ProcessorForPath_PrefixMatch — exact match wins over prefix, longest prefix wins, unknown paths return errNoProcessor
  • TestExtractGeminiModelFromPath — table-driven: standard, streaming, with root prefix, missing colon, empty string
  • TestNewGeminiProcessorFactory — router (non-streaming), router (streaming), upstream branch

Notes

  • No changes to filterapi, controller, or CRD schema — the GCPVertexAI backend schema is reused as-is
  • The translator is a passthrough for response bodies; token usage extraction is not implemented (Gemini native responses use a different token field structure — can be a follow-up)
  • Streaming detection uses the :streamGenerateContent method suffix in the URL path, not a body field

cc @mathetake @yuzisun — would love your thoughts on the prefix dispatch approach and whether RegisterPrefix belongs on Server or should be a separate routing layer.
