When using the @google/genai SDK with vertexai: true, the serviceTier field in GenerateContentConfig has no effect. The SDK places serviceTier in the JSON request body, but the Vertex AI API does not read flex/priority tier configuration from the request body. Instead, it reads the tier from HTTP headers (X-Vertex-AI-LLM-Request-Type and X-Vertex-AI-LLM-Shared-Request-Type).
As a result, serviceTier: ServiceTier.FLEX silently does nothing on Vertex AI. No error is raised: the request succeeds but is billed at standard rates.
Steps to reproduce:
import { GoogleGenAI, ServiceTier } from "@google/genai";

const client = new GoogleGenAI({
  vertexai: true,
  project: "my-project",
  location: "global",
});

const response = await client.models.generateContent({
  model: "gemini-3-flash-preview",
  contents: [{ role: "user", parts: [{ text: "Hello" }] }],
  config: {
    serviceTier: ServiceTier.FLEX,
  },
});

console.log(response.usageMetadata?.trafficType);
// Expected: "ON_DEMAND_FLEX"
// Actual: "ON_DEMAND"
Workaround:
Pass the Vertex AI headers manually via httpOptions:
config: {
  httpOptions: {
    headers: {
      "X-Vertex-AI-LLM-Request-Type": "shared",
      "X-Vertex-AI-LLM-Shared-Request-Type": "flex",
    },
  },
},
This correctly returns trafficType: "ON_DEMAND_FLEX".
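For anyone applying the workaround in several call sites, here is a minimal sketch of a helper that builds the httpOptions object. The name vertexFlexHttpOptions is hypothetical and not part of @google/genai; the header names and values are the ones from the workaround above.

```typescript
// Header names and values copied from the workaround in this issue.
const VERTEX_FLEX_HEADERS: Record<string, string> = {
  "X-Vertex-AI-LLM-Request-Type": "shared",
  "X-Vertex-AI-LLM-Shared-Request-Type": "flex",
};

// Hypothetical helper (not an SDK API): returns an httpOptions object whose
// headers contain the flex headers, merged over any caller-supplied headers.
function vertexFlexHttpOptions(
  extraHeaders: Record<string, string> = {},
): { headers: Record<string, string> } {
  // Flex headers are spread last so they win over conflicting extra headers.
  return { headers: { ...extraHeaders, ...VERTEX_FLEX_HEADERS } };
}

// Usage with the client from the reproduction (not executed here):
// const response = await client.models.generateContent({
//   model: "gemini-3-flash-preview",
//   contents: [{ role: "user", parts: [{ text: "Hello" }] }],
//   config: { httpOptions: vertexFlexHttpOptions() },
// });
```

Spreading the flex headers last is a deliberate choice so that a stray caller-supplied value cannot silently downgrade the request back to the standard tier.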
Expected behavior:
When vertexai: true, the SDK should translate serviceTier: ServiceTier.FLEX into the appropriate X-Vertex-AI-LLM-* HTTP headers, so users don't need to know about the underlying API differences.
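To illustrate the requested behavior, here is a rough sketch of what such a translation step might look like. This is not the SDK's actual code: RequestParts, applyServiceTierForVertex, and the "flex" string standing in for ServiceTier.FLEX are all assumptions made for the example, and only the flex mapping confirmed in this issue is handled.

```typescript
// Hypothetical shape of an outgoing request just before it is sent.
interface RequestParts {
  headers: Record<string, string>;
  body: Record<string, unknown>;
}

// Hypothetical translation step. When targeting Vertex AI and the body asks
// for the flex tier, move that intent into the headers Vertex AI reads.
// "flex" is an assumed stand-in for the value of ServiceTier.FLEX.
function applyServiceTierForVertex(
  req: RequestParts,
  vertexai: boolean,
): RequestParts {
  const { serviceTier, ...rest } = req.body;
  // Leave Gemini Developer API requests and non-flex tiers untouched.
  if (!vertexai || serviceTier !== "flex") {
    return req;
  }
  return {
    headers: {
      ...req.headers,
      "X-Vertex-AI-LLM-Request-Type": "shared",
      "X-Vertex-AI-LLM-Shared-Request-Type": "flex",
    },
    // Dropping serviceTier from the body is a design choice in this sketch,
    // since Vertex AI does not read it from there.
    body: rest,
  };
}
```

Other tiers are passed through unchanged here because this issue only verifies the header values for flex.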
Additional context:
@google/genai@1.48.0
serviceTier field works correctly on the Gemini Developer API (vertexai: false)