[Bug]: AI Gateway (ai-gateway.helicone.ai) ignores provided Gemini API key, uses free-tier keys instead #5561

@kkcoms

What happened?

Summary

When using the AI Gateway endpoint (ai-gateway.helicone.ai) with my own Gemini API key, all Google AI Studio requests hit a free-tier quota limit (20 req/day), even though my Gemini account is on a paid plan (Tier 3). After investigating, I confirmed that the AI Gateway is not using my provided API key for upstream requests to Google; it appears to use Helicone's own free-tier Google AI Studio keys regardless of which key is provided.

Setup

I was using the OpenAI SDK with the AI Gateway endpoint, passing my Gemini API key:


const openai = new OpenAI({
  apiKey: process.env.GEMINI_API_KEY,
  baseURL: "https://ai-gateway.helicone.ai/v1",
  defaultHeaders: {
    "Helicone-Auth": `Bearer ${process.env.HELICONE_API_KEY}`,
  }
});

Model format: gemini-2.5-flash-lite/google-ai-studio

Error

Intermittent 429 errors from Google with this detail:


quotaMetric: generativelanguage.googleapis.com/generate_content_free_tier_requests
quotaId: GenerateRequestsPerDayPerProjectPerModel-FreeTier
quotaValue: 20

Key indicator: the quota metric is generate_content_free_tier_requests, which is the free-tier quota, not any paid-plan rate limit.
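For anyone trying to reproduce this, a minimal sketch of how the free-tier marker could be checked programmatically. The helper name `isFreeTierQuotaError` and the plain text matching are my own assumptions, not part of any Helicone or Google SDK; it only looks for the `free_tier` / `FreeTier` strings seen in the error above.

```typescript
// Hypothetical helper (not from any SDK): flags a 429 whose body mentions
// the free-tier quota markers seen in the error details above.
function isFreeTierQuotaError(status: number, body: string): boolean {
  if (status !== 429) return false;
  // "free_tier" appears in quotaMetric, "FreeTier" appears in quotaId.
  return /free_tier|FreeTier/.test(body);
}

// The error detail lines copied from the report above.
const sample = [
  "quotaMetric: generativelanguage.googleapis.com/generate_content_free_tier_requests",
  "quotaId: GenerateRequestsPerDayPerProjectPerModel-FreeTier",
  "quotaValue: 20",
].join("\n");

console.log(isFreeTierQuotaError(429, sample)); // true
```

A check like this makes it easy to tell a free-tier quota hit apart from an ordinary paid-plan rate limit when logging errors.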

Testing I did to confirm this is a Helicone-side issue

  1. With my Gemini key in AI Gateway → Free-tier 429 errors (20 req/day limit)
  2. Removed my Gemini key from Helicone and from my environment entirely → Same free-tier 429 errors, same behavior. Requests still went through, which means the AI Gateway has its own keys.
  3. Bypassed Helicone entirely, called Google directly with my Gemini key at generativelanguage.googleapis.com/v1beta/openai/ → Works perfectly, no 429 errors, no free-tier limits.

The fact that removing my API key changed nothing (step 2) proves the AI Gateway was never using it for the upstream Google request. My key appears to have been ignored entirely.
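To make the comparison between steps 1 and 3 easy to repeat, here is a rough sketch that builds the client options for both routes from the same key. `buildClientOptions` and the `Route` type are hypothetical names I made up for illustration; the URLs and the `Helicone-Auth` header are the ones used in this report.

```typescript
// Hypothetical types and helper for illustration only.
type Route = "ai-gateway" | "direct";

interface ClientOptions {
  apiKey: string;
  baseURL: string;
  defaultHeaders?: Record<string, string>;
}

// Returns options for either the AI Gateway (step 1) or a direct call
// to Google's OpenAI-compatible endpoint (step 3), using the same Gemini key.
function buildClientOptions(
  route: Route,
  geminiKey: string,
  heliconeKey: string
): ClientOptions {
  if (route === "ai-gateway") {
    return {
      apiKey: geminiKey,
      baseURL: "https://ai-gateway.helicone.ai/v1",
      defaultHeaders: { "Helicone-Auth": `Bearer ${heliconeKey}` },
    };
  }
  // Direct route: no Helicone headers, Google receives the key as-is.
  return {
    apiKey: geminiKey,
    baseURL: "https://generativelanguage.googleapis.com/v1beta/openai/",
  };
}

// Placeholder keys, for demonstration only.
const viaGateway = buildClientOptions("ai-gateway", "GEMINI_KEY", "HELICONE_KEY");
const direct = buildClientOptions("direct", "GEMINI_KEY", "HELICONE_KEY");
console.log(viaGateway.baseURL, direct.baseURL);
```

Each result can be passed to `new OpenAI(...)`. If the same key only hits the free-tier quota through the gateway route, the difference is on the gateway side.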

Additional observation

During step 2, the Helicone dashboard showed a corresponding successful request at the same timestamp for every 429 error. This suggests the gateway may be rotating through a pool of free-tier Google API keys: when one hits the 20 req/day limit, it retries with another. The request eventually succeeds, but the retry adds unnecessary latency.
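A small sketch of how the 429/success pairing could be counted from exported request logs. The `RequestLog` shape and the sample rows are made up for illustration; I have not checked what fields a real Helicone export contains, so treat the field names as assumptions.

```typescript
// Hypothetical log record shape; real exports may use different fields.
interface RequestLog {
  timestamp: string;
  status: number;
}

// Counts 429 responses that share a timestamp with a successful (2xx) response,
// which is the pattern a retry with a different key would produce.
function countRetriedRateLimits(logs: RequestLog[]): number {
  const byTimestamp = new Map<string, { limited: number; ok: number }>();
  for (const log of logs) {
    const bucket = byTimestamp.get(log.timestamp) ?? { limited: 0, ok: 0 };
    if (log.status === 429) bucket.limited += 1;
    else if (log.status >= 200 && log.status < 300) bucket.ok += 1;
    byTimestamp.set(log.timestamp, bucket);
  }
  let paired = 0;
  byTimestamp.forEach((bucket) => {
    paired += Math.min(bucket.limited, bucket.ok);
  });
  return paired;
}

// Made-up sample rows, not real dashboard data.
const sampleLogs: RequestLog[] = [
  { timestamp: "t1", status: 429 },
  { timestamp: "t1", status: 200 },
  { timestamp: "t2", status: 429 },
  { timestamp: "t2", status: 200 },
  { timestamp: "t3", status: 200 },
];

console.log(countRetriedRateLimits(sampleLogs)); // 2
```

If the paired count equals the total number of 429s, that is consistent with the retry-through-a-key-pool theory, though it does not prove it.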

Current workaround

I switched to using the generic proxy (gateway.helicone.ai) with Helicone-Target-URL instead, which correctly passes my API key through to Google:

const openai = new OpenAI({
  apiKey: process.env.GEMINI_API_KEY,
  baseURL: "https://gateway.helicone.ai/v1beta/openai/",
  defaultHeaders: {
    "Helicone-Auth": `Bearer ${process.env.HELICONE_API_KEY}`,
    "Helicone-Target-URL": "https://generativelanguage.googleapis.com",
  }
});

This works, but cost tracking shows "unsupported" for Google AI Studio models. It would be great if the cost registry supported generativelanguage.googleapis.com as a provider with Gemini model pricing.
