translator: gemini non-stream reasoning as string and thinking_blocks #1995
Open
Flgado wants to merge 1 commit into envoyproxy:main
Signed-off-by: Joao Folgado <jfolgado94@gmail.com>
Description
Non-streaming Vertex / Gemini chat completions now follow the LiteLLM-style split already discussed in the community: `reasoning_content` is always a plain string (the visible thinking summary text), and optional `thinking_blocks` carry structured metadata (e.g. signatures) that OpenAI's core schema does not define. This matches the direction agreed on in Slack (link) and avoids exposing a Bedrock-shaped nested object in `reasoning_content`, which breaks common OpenAI-compatible clients.

Implementation highlights:
- Extend the OpenAI-shaped `ChatCompletionResponseChoiceMessage` schema with `ThinkingBlock`.
- In the Gemini helper, map thought parts to the string union for `reasoning_content` and populate `thinking_blocks`. When the model attaches a thought signature to the first function-call part (parallel tools), or only there, merge or attach that signature into `thinking_blocks` so clients can round-trip history together with `tool_calls` and assistant content parts of type thinking + signature.
- Add unit tests in `internal/translator/gemini_helper_test.go` for `geminiCandidatesToOpenAIChoices` and for signature extraction in `extractTextAndThoughtSummaryFromGeminiParts`.

This commit improves interoperability with LiteLLM, LangChain, and other clients that expect `reasoning_content` to be a string while still preserving provider-specific seals for advanced use cases.

Fixes #1974
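For reviewers who want a concrete picture of the target shape, here is a minimal, self-contained sketch of the mapping described above. It is not the code in this PR: the `Part`, `Message`, and `ThinkingBlock` types, the `BuildMessage` and `attachSignature` helpers, and the JSON field names inside a thinking block (`type`, `thinking`, `signature`) are all simplified assumptions made for illustration. The real Gemini part types and the gateway's schema are richer.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// ThinkingBlock is a hypothetical structured reasoning entry.
// The field names are assumed, not taken from the PR diff.
type ThinkingBlock struct {
	Type      string `json:"type"`
	Thinking  string `json:"thinking,omitempty"`
	Signature string `json:"signature,omitempty"`
}

// Message is a simplified stand-in for the assistant message of a choice.
type Message struct {
	Role             string          `json:"role"`
	Content          string          `json:"content,omitempty"`
	ReasoningContent string          `json:"reasoning_content,omitempty"` // always a plain string
	ThinkingBlocks   []ThinkingBlock `json:"thinking_blocks,omitempty"`
}

// Part is a simplified stand-in for one Gemini content part.
type Part struct {
	Text             string
	Thought          bool   // true when the part is a thought summary
	ThoughtSignature string // opaque signature, may be empty
	FunctionCall     string // function name; empty when not a call
}

// attachSignature merges sig into the last thinking block that has no
// signature yet, or appends a signature-only block when none qualifies.
func attachSignature(blocks []ThinkingBlock, sig string) []ThinkingBlock {
	for i := len(blocks) - 1; i >= 0; i-- {
		if blocks[i].Signature == "" {
			blocks[i].Signature = sig
			return blocks
		}
	}
	return append(blocks, ThinkingBlock{Type: "thinking", Signature: sig})
}

// BuildMessage maps parts to a message with string reasoning_content and
// optional thinking_blocks. Tool calls themselves are omitted for brevity.
func BuildMessage(parts []Part) Message {
	msg := Message{Role: "assistant"}
	var content, reasoning strings.Builder
	for _, p := range parts {
		switch {
		case p.Thought:
			reasoning.WriteString(p.Text)
			msg.ThinkingBlocks = append(msg.ThinkingBlocks, ThinkingBlock{
				Type: "thinking", Thinking: p.Text, Signature: p.ThoughtSignature,
			})
		case p.FunctionCall != "" && p.ThoughtSignature != "":
			// Signature carried on a function-call part: merge or attach it.
			msg.ThinkingBlocks = attachSignature(msg.ThinkingBlocks, p.ThoughtSignature)
		case p.FunctionCall == "":
			content.WriteString(p.Text)
		}
	}
	msg.Content = content.String()
	msg.ReasoningContent = reasoning.String()
	return msg
}

func main() {
	msg := BuildMessage([]Part{
		{Text: "Look up the forecast first.", Thought: true},
		{FunctionCall: "get_weather", ThoughtSignature: "sig-abc"},
	})
	out, err := json.MarshalIndent(msg, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

In this sketch a client that only reads `reasoning_content` sees a plain string, while a client that needs to replay history can pick the signature out of `thinking_blocks`. When a signature arrives only on a function-call part, the sketch emits a signature-only block, which is one possible reading of "merge or attach" and may differ from the PR's actual behavior.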
Special notes for reviewers
Testing: