feat: meter Gemini thinking tokens and grounding requests#3178

Open
Aaryan-Dadu wants to merge 1 commit into
HeyPuter:mainfrom
Aaryan-Dadu:feat/3132

@Aaryan-Dadu Aaryan-Dadu commented May 28, 2026

Summary

  • Thinking tokens: Split out of the standard completion token count so that they are billed at the correct model-specific rate.
  • Grounding requests: Added flat-fee metering for Google Search by tracking grounding_metadata across both streaming and non-streaming responses.
  • Pricing updates: Corrected stale rates for Gemini 2.5 Flash output, cached tokens, thinking tokens, and grounding requests.
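
The first two points can be illustrated with a minimal sketch. It is not the code from this PR: the `RawUsage`, `ResponseChunk`, and `MeteredUsage` shapes and the `meterChunks` helper are hypothetical, and it assumes that the reported completion count includes thinking tokens and that a response carrying `grounding_metadata` counts as one grounding request. A non-streaming response is treated as a single-chunk array.

```typescript
// Hypothetical shapes for illustration; names are not taken from the PR.
interface RawUsage {
  promptTokens: number;
  completionTokens: number; // assumed to include thinking tokens
  thinkingTokens?: number;
}

interface ResponseChunk {
  usage?: RawUsage;
  grounding_metadata?: unknown;
}

interface MeteredUsage {
  prompt_tokens: number;
  completion_tokens: number;
  thinking_tokens: number;
  grounding_requests: number;
}

// Walk all chunks of one response and build a single usage record.
function meterChunks(chunks: ResponseChunk[]): MeteredUsage {
  let usage: RawUsage | undefined;
  let grounded = false;

  for (const chunk of chunks) {
    // Keep the most recently reported usage block.
    if (chunk.usage) usage = chunk.usage;
    // Any chunk with grounding metadata marks the response as grounded.
    if (chunk.grounding_metadata != null) grounded = true;
  }

  const total = usage?.completionTokens ?? 0;
  // Clamp so the split can never produce a negative completion count.
  const thinking = Math.min(usage?.thinkingTokens ?? 0, total);

  return {
    prompt_tokens: usage?.promptTokens ?? 0,
    completion_tokens: total - thinking,
    thinking_tokens: thinking,
    // Assumption: a flat fee applies once per grounded response.
    grounding_requests: grounded ? 1 : 0,
  };
}
```

Counting the grounding fee once per response, rather than once per chunk, avoids charging a streamed response several times if more than one chunk carries the metadata.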

Closes #3132

Test

  • All pre-existing tests pass.
  • 5 unit tests covering these changes have been added.
  • 4 pre-existing test assertions have been updated to include thinking_tokens: 0 and grounding_requests: 0 in the expected usage shapes.
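
As a rough illustration of why the existing assertions needed updating: if every usage record now carries both counters, a response with no thinking and no grounding still reports them as zero. The `Usage` interface and `normalizeUsage` helper below are hypothetical, and only the `thinking_tokens` and `grounding_requests` field names come from this PR.

```typescript
// Hypothetical usage record; only thinking_tokens and grounding_requests
// are names taken from the PR description.
interface Usage {
  prompt_tokens: number;
  completion_tokens: number;
  thinking_tokens: number;
  grounding_requests: number;
}

// Fill in every counter so the record has the same shape for every response.
function normalizeUsage(partial: Partial<Usage>): Usage {
  return {
    prompt_tokens: partial.prompt_tokens ?? 0,
    completion_tokens: partial.completion_tokens ?? 0,
    thinking_tokens: partial.thinking_tokens ?? 0,
    grounding_requests: partial.grounding_requests ?? 0,
  };
}

// A response without thinking or grounding still reports both counters.
const actual = normalizeUsage({ prompt_tokens: 12, completion_tokens: 34 });

const expected: Usage = {
  prompt_tokens: 12,
  completion_tokens: 34,
  thinking_tokens: 0,
  grounding_requests: 0,
};

// Both objects list their keys in the same order, so a string comparison
// is enough for this small check.
const sameShape = JSON.stringify(actual) === JSON.stringify(expected);
console.log(sameShape);
```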

[Screenshot from 2026-05-28 15-12-15]


CLAassistant commented May 28, 2026

CLA assistant check
All committers have signed the CLA.

Development

Successfully merging this pull request may close these issues.

Investigate & possible fix metering for gemini models search and caching

3 participants