Description
What happened?
Summary
When using the AI Gateway endpoint (ai-gateway.helicone.ai) with my own Gemini API key, Google AI Studio requests hit a free-tier quota limit (20 req/day), even though my Gemini account is on a paid plan (Tier 3). After investigating, I confirmed that the AI Gateway is not using my provided API key for upstream requests to Google; it appears to use Helicone's own free-tier Google AI Studio keys regardless of which key is provided.
Setup
I was using the OpenAI SDK with the AI Gateway endpoint, passing my Gemini API key:
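import OpenAI from "openai";

// The Gemini key is passed as the SDK API key; the Helicone key goes in Helicone-Auth.
const openai = new OpenAI({
  apiKey: process.env.GEMINI_API_KEY,
  baseURL: "https://ai-gateway.helicone.ai/v1",
  defaultHeaders: {
    "Helicone-Auth": `Bearer ${process.env.HELICONE_API_KEY}`,
  },
});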
Model format: gemini-2.5-flash-lite/google-ai-studio
Error
Intermittent 429 errors from Google with this detail:
quotaMetric: generativelanguage.googleapis.com/generate_content_free_tier_requests
quotaId: GenerateRequestsPerDayPerProjectPerModel-FreeTier
quotaValue: 20
Key indicator: the quota metric is generate_content_free_tier_requests, which is the free-tier quota, not a paid-plan rate limit.
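For anyone trying to spot the same symptom, here is a minimal sketch of the check I applied by eye. It assumes the quota details have already been parsed out of the 429 body into an object with the three fields quoted above; the `QuotaViolation` type and `isFreeTierQuota` helper are names I made up for illustration, and the substring match is only a heuristic based on the values in this report.

```typescript
// Hypothetical shape holding the quota fields quoted in the error above.
interface QuotaViolation {
  quotaMetric?: string;
  quotaId?: string;
  quotaValue?: string | number;
}

// Returns true when either identifier looks like a free-tier quota.
// Heuristic only: it matches the "free_tier" / "FreeTier" substrings seen in this report.
function isFreeTierQuota(v: QuotaViolation): boolean {
  const haystack = `${v.quotaMetric ?? ""} ${v.quotaId ?? ""}`.toLowerCase();
  return haystack.includes("free_tier") || haystack.includes("freetier");
}

// The values from the 429 response in this issue.
const observed: QuotaViolation = {
  quotaMetric: "generativelanguage.googleapis.com/generate_content_free_tier_requests",
  quotaId: "GenerateRequestsPerDayPerProjectPerModel-FreeTier",
  quotaValue: 20,
};

console.log(isFreeTierQuota(observed)); // true
```

If this returns true for a key that belongs to a paid project, the request was most likely not made with that key.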
Testing I did to confirm this is a Helicone-side issue
1. With my Gemini key in AI Gateway → Free-tier 429 errors (20 req/day limit)
2. Removed my Gemini key from Helicone and from my environment entirely → Same free-tier 429 errors, same behavior. Requests still went through, which means the AI Gateway has its own keys.
3. Bypassed Helicone entirely, called Google directly with my Gemini key at generativelanguage.googleapis.com/v1beta/openai/ → Works perfectly, no 429 errors, no free-tier limits.
The fact that removing my API key changed nothing (step 2) proves the AI Gateway was never using it for the upstream Google request. My key appears to have been ignored entirely.
Additional observation
During test 2, I noticed in the Helicone dashboard that every 429 error had a corresponding successful request at the same timestamp. This suggests the gateway may be rotating through a pool of free-tier Google API keys: when one hits the 20 req/day limit, it retries with another. This eventually succeeds but adds unnecessary latency.
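To make the hypothesis concrete, here is a small simulation of the behavior I am guessing at. This is not Helicone's code, and I have not seen the gateway's implementation; `simulateKeyRotation`, the key names, and the "exhausted" list are all invented for illustration. It only shows why a rotate-on-429 strategy would produce a 429 log entry paired with a success at the same timestamp.

```typescript
// One logged upstream attempt: which key was tried and what status came back.
type Attempt = { key: string; status: number };

// Tries keys in order; a key listed in `exhausted` is treated as over its daily quota (429).
// Stops at the first key that is not rate limited.
function simulateKeyRotation(keys: string[], exhausted: string[]): Attempt[] {
  const log: Attempt[] = [];
  for (const key of keys) {
    const status = exhausted.indexOf(key) >= 0 ? 429 : 200;
    log.push({ key, status });
    if (status !== 429) break; // success, no further retries
  }
  return log;
}

// Hypothetical pool where the first key has already used its 20 requests for the day.
const attempts = simulateKeyRotation(["pool-key-a", "pool-key-b"], ["pool-key-a"]);
console.log(attempts.map((a) => a.status)); // [ 429, 200 ]
```

Each extra entry before the 200 is one wasted round trip to Google, which is where the added latency would come from.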
Current workaround
I switched to using the generic proxy (gateway.helicone.ai) with Helicone-Target-URL instead, which correctly passes my API key through to Google:
import OpenAI from "openai";

// Generic proxy: Helicone forwards the request, including my API key, to the target URL.
const openai = new OpenAI({
  apiKey: process.env.GEMINI_API_KEY,
  baseURL: "https://gateway.helicone.ai/v1beta/openai/",
  defaultHeaders: {
    "Helicone-Auth": `Bearer ${process.env.HELICONE_API_KEY}`,
    "Helicone-Target-URL": "https://generativelanguage.googleapis.com",
  },
});
This works, but cost tracking shows "unsupported" for Google AI Studio models. It would be great if the cost registry supported generativelanguage.googleapis.com as a provider with Gemini model pricing.
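In the meantime, cost can be estimated client-side from the token counts in the response. The sketch below assumes the OpenAI-compatible response includes a `usage` object with `prompt_tokens` and `completion_tokens`; `estimateCostUsd` and the price values are hypothetical placeholders, not real Gemini pricing, so substitute the current rates for your model.

```typescript
// Token counts as reported in the `usage` field of an OpenAI-compatible response.
interface Usage {
  prompt_tokens: number;
  completion_tokens: number;
}

// Prices in USD per one million tokens, supplied by the caller.
interface PricePerMillion {
  inputUsd: number;
  outputUsd: number;
}

// Estimated cost of one request: tokens times per-million price, summed over input and output.
function estimateCostUsd(usage: Usage, price: PricePerMillion): number {
  const total =
    usage.prompt_tokens * price.inputUsd +
    usage.completion_tokens * price.outputUsd;
  return total / 1e6;
}

// Placeholder rates for illustration only; these are NOT real Gemini prices.
const placeholderPrice: PricePerMillion = { inputUsd: 1, outputUsd: 2 };

const exampleCost = estimateCostUsd(
  { prompt_tokens: 1000, completion_tokens: 500 },
  placeholderPrice,
);
console.log(exampleCost);
```

This ignores cached-token discounts and any per-request fees, so it is only a rough stand-in for proper registry support.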