feat(google): support OpenAI-compatible reasoning_effort for Gemini thinking models #160
Open
Tushar49 wants to merge 5 commits into
…asoning_content from thought parts
Adds an OpenAI-compatible `reasoning_effort` parameter for the Google provider, mapping to Gemini's `thinkingConfig` so callers can opt into Gemini's thinking budget without leaving the OpenAI request shape.

## What this PR does
- `shared/types.ts` + `server/src/routes/proxy.ts` accept `reasoning_effort: "minimal" | "low" | "medium" | "high"` on `ChatCompletionRequest`. Omitting it preserves existing behavior exactly (no `thinkingConfig` is sent).
- Effort mapping (`server/src/providers/google.ts`):
  - Gemini 3.x models use `thinkingConfig.thinkingLevel` (`MINIMAL` / `LOW` / `MEDIUM` / `HIGH`). `gemini-3.1-pro` does not accept `MINIMAL`, so it falls back to `LOW`.
  - Budget-based models use an integer `thinkingConfig.thinkingBudget`. Canonical mapping: `minimal` → 512 (Pro: 128), `low` → 1024, `medium` → 8192, `high` → 24576. Clamped per model: `gemini-2.5-pro` min 128 / max 24576, Flash 0–24576, Flash-Lite 512–24576.
  - `includeThoughts: true` is set when `reasoning_effort` is provided.
- Response parsing reads `candidates[0].content.parts[]`, where parts may carry `thought: true`. Non-thought parts go to `message.content` (unchanged); thought parts are aggregated into `message.reasoning_content` (an OpenAI extension used by opencode and other clients).
- Usage: `usageMetadata.thoughtsTokenCount` → `usage.completion_tokens_details.reasoning_tokens`.
- Streaming: thought deltas are emitted as `delta.reasoning_content`, normal text deltas as `delta.content`.

## Why
OpenAI clients (opencode, Continue, Cursor, etc.) already use `reasoning_effort` to drive reasoning models. Today they have no way to dial Gemini's thinking budget through freellmapi without bypassing the proxy. Google's own OpenAI-compatible endpoint maps the same field to the same `thinkingConfig`; this PR brings freellmapi to parity.

## Verification
From `server/`:

- `bun run build` (tsc): clean
- `bun run test src/__tests__/providers/google-reasoning.test.ts`: 8 / 8 pass (1 file)
- `bun run test src/__tests__/providers/google.test.ts src/__tests__/providers/google-schema.test.ts`: 19 / 19 pass (no regressions to existing Google tests or to the tests from PR #105, "fix(google): strip every JSON Schema key not in Google's Schema proto (examples/const/readOnly/writeOnly/uniqueItems/not/allOf/oneOf/...)")

New tests cover: effort → budget mapping per model class, the Gemini 3.x `thinkingLevel` path, the `gemini-3.1-pro` `MINIMAL` → `LOW` fallback, `includeThoughts` on/off semantics, thought-part parsing into `reasoning_content`, `reasoning_tokens` in usage, and streaming `delta.reasoning_content`.

## Style / scope
- Commit sequence: `feat(api):` → `feat(google):` → `feat(proxy):` → `test(google):`.
- When `reasoning_effort` is omitted, `generationConfig.thinkingConfig` is not added to the outgoing request, so existing callers see byte-for-byte the same Gemini payload.

Happy to iterate on naming, default budgets, or split into smaller PRs if preferred.
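For reviewers, the effort mapping described in this PR can be sketched roughly as follows. This is an illustrative sketch, not the code in `server/src/providers/google.ts`: the function name `buildThinkingConfig`, the helper `budgetRange`, and the substring-based model detection are all assumptions made for the example. Only the budget values, clamp ranges, and the `gemini-3.1-pro` fallback come from the description above.

```typescript
type ReasoningEffort = "minimal" | "low" | "medium" | "high";
type ThinkingLevel = Uppercase<ReasoningEffort>;

interface ThinkingConfig {
  thinkingLevel?: ThinkingLevel;
  thinkingBudget?: number;
  includeThoughts: boolean;
}

// Canonical effort -> token budget mapping from the PR description.
const CANONICAL_BUDGET: Record<ReasoningEffort, number> = {
  minimal: 512,
  low: 1024,
  medium: 8192,
  high: 24576,
};

// Hypothetical helper: picks the clamp range by substring match on the model id.
// Treating every non-Flash model like gemini-2.5-pro is an assumption.
function budgetRange(model: string): [number, number] {
  if (model.includes("flash-lite")) return [512, 24576];
  if (model.includes("flash")) return [0, 24576];
  return [128, 24576];
}

// Hypothetical name; returns undefined when reasoning_effort is omitted,
// so no thinkingConfig is added to the outgoing request.
function buildThinkingConfig(
  model: string,
  effort?: ReasoningEffort,
): ThinkingConfig | undefined {
  if (effort === undefined) return undefined;

  // Assumption: Gemini 3.x models are recognised by their id prefix.
  if (model.startsWith("gemini-3")) {
    let level: ThinkingLevel = effort.toUpperCase() as ThinkingLevel;
    // gemini-3.1-pro does not accept MINIMAL, so fall back to LOW.
    if (level === "MINIMAL" && model.startsWith("gemini-3.1-pro")) level = "LOW";
    return { thinkingLevel: level, includeThoughts: true };
  }

  // Budget path: Pro uses 128 for "minimal", then clamp to the model's range.
  const isPro = model.includes("pro");
  const canonical = effort === "minimal" && isPro ? 128 : CANONICAL_BUDGET[effort];
  const [min, max] = budgetRange(model);
  const thinkingBudget = Math.min(max, Math.max(min, canonical));
  return { thinkingBudget, includeThoughts: true };
}
```

Returning `undefined` rather than an empty object keeps the no-`reasoning_effort` path trivially identical to the existing payload, which is the property the "Style / scope" section calls out.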
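The thought-part handling on the response side can be illustrated the same way. Again a minimal sketch under stated assumptions: `parseThoughtParts`, the interface names, and joining parts by plain concatenation are hypothetical choices for the example; only the field names (`candidates[0].content.parts[]`, `thought`, `usageMetadata.thoughtsTokenCount`, `reasoning_content`, `reasoning_tokens`) are taken from the description above.

```typescript
interface GeminiPart {
  text?: string;
  thought?: boolean;
}

interface GeminiResponse {
  candidates?: { content?: { parts?: GeminiPart[] } }[];
  usageMetadata?: { thoughtsTokenCount?: number };
}

interface ParsedMessage {
  content: string;
  reasoning_content?: string;
  reasoning_tokens?: number;
}

// Hypothetical helper: splits the first candidate's parts into normal text
// and thought text. Concatenating without a separator is an assumption.
function parseThoughtParts(resp: GeminiResponse): ParsedMessage {
  const parts = resp.candidates?.[0]?.content?.parts ?? [];
  let content = "";
  let reasoning = "";

  for (const part of parts) {
    if (typeof part.text !== "string") continue; // skip non-text parts
    if (part.thought === true) {
      reasoning += part.text;
    } else {
      content += part.text;
    }
  }

  const parsed: ParsedMessage = { content };
  // Only attach reasoning fields when the response actually carried them.
  if (reasoning.length > 0) parsed.reasoning_content = reasoning;
  const thoughtTokens = resp.usageMetadata?.thoughtsTokenCount;
  if (typeof thoughtTokens === "number") parsed.reasoning_tokens = thoughtTokens;
  return parsed;
}
```

In the proxy's response shape, `reasoning_tokens` would sit under `usage.completion_tokens_details`; it is flattened onto the result here only to keep the example short.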