
Vertex thinking changes to disable thinking mode for 2.5 models #1090


Open
wants to merge 3 commits into base: main

Conversation

narengogi (Collaborator)

@narengogi narengogi commented May 12, 2025

Makes changes to disable thinking mode in 2.5 models; also returns thinking tokens in the response per the OpenAI format.

Example request body to disable thinking:

{
    "model": "gemini-2.5-flash-preview-04-17",
    "max_tokens": 2000,
    "stream": true,
    "messages": [
        {
            "role": "user",
            "content": "What is the meaning of life, the universe and everything"
        }
    ],
    "thinking": {
        "type": "disabled",
        "budget_tokens": 0
    }
}

NOTE: users are required to explicitly send "type": "disabled" and "budget_tokens": 0
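The request-side mapping can be sketched as follows. This is a hypothetical illustration, not the PR's actual code: the function name, the Vertex field names (thinking_config, include_thoughts, thinking_budget), and the exact disable condition are all assumptions for the sake of the example.

```typescript
// Hypothetical sketch: map the OpenAI-style `thinking` block onto a Vertex-style
// thinking config. All names here are illustrative assumptions, not the PR's code.
interface ThinkingParam {
  type: 'enabled' | 'disabled';
  budget_tokens: number;
}

function toVertexThinkingConfig(thinking?: ThinkingParam) {
  if (!thinking) return {};
  // Thinking counts as off only when both fields say so explicitly,
  // matching the note above that users must send both values.
  const disabled =
    thinking.type === 'disabled' && thinking.budget_tokens === 0;
  return {
    thinking_config: {
      include_thoughts: !disabled,
      thinking_budget: thinking.budget_tokens,
    },
  };
}
```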


Important

PR Review Skipped

PR review skipped as per the configuration setting. Run a manual review by commenting /matter review

💡Tips to use Matter AI

Command List

  • /matter summary: Generate AI Summary for the PR
  • /matter review: Generate AI Reviews for the latest commit in the PR
  • /matter review-full: Generate AI Reviews for the complete PR
  • /matter release-notes: Generate AI release-notes for the PR
  • /matter <ask-question>: Chat with your PR with Matter AI Agent
  • /matter remember <recommendation>: Generate AI memories for the PR
  • /matter explain: Get an explanation of the PR
  • /matter help: Show the list of available commands and documentation

@narengogi narengogi requested review from csgulati09 and VisargD May 12, 2025 08:42

Code Quality type: new feature

Summary By MatterAI

🔄 What Changed

  • Added thoughtsTokenCount tracking for Vertex AI and Google providers
  • Implemented conditional thinking mode configuration
  • Enhanced token usage metadata tracking

🔍 Impact of the Change

  • More granular token usage reporting
  • Flexible thinking mode configuration
  • Improved token tracking for AI models

📁 Total Files Changed

  • 4 files modified:
    1. src/providers/google-vertex-ai/chatComplete.ts
    2. src/providers/google-vertex-ai/transformGenerationConfig.ts
    3. src/providers/google-vertex-ai/types.ts
    4. src/providers/google/chatComplete.ts

🧪 Test Added

  • No explicit new tests added
  • Existing test coverage assumed

🔒 Security Vulnerabilities

  • No direct security vulnerabilities detected
  • Improved configuration validation

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • My changes generate no new warnings
  • New and existing unit tests pass locally with my changes

Sequence Diagram

sequenceDiagram
participant GoogleVertexAI as Vertex
participant TransformConfig as Config
participant ChatComplete as Chat

Vertex->>Config: transformGenerationConfig()
Config-->>Vertex: Configure thinking mode

Vertex->>Chat: chatComplete()
Chat-->>Vertex: Process token usage

Note over Vertex, Chat: Token Tracking
Vertex->Chat: Extract thoughtsTokenCount
Chat-->Vertex: Return token details


Code Quality new feature

Summary By MatterAI

🔄 What Changed

  • Added support for granular thinking mode configuration in Google Vertex AI
  • Introduced thoughtsTokenCount tracking
  • Implemented conditional thinking mode enablement

🔍 Impact of the Change

  • Provides more precise control over AI model's reasoning process
  • Enables token-level budget management for thinking mode
  • Enhances token usage reporting

📁 Total Files Changed

  • 2 files modified:
    1. src/providers/google-vertex-ai/chatComplete.ts
    2. src/providers/google/chatComplete.ts

🧪 Test Added

  • N/A (No explicit test cases provided in PR)

🔒 Security Vulnerabilities

  • No direct security vulnerabilities detected

Type of Change

  • New feature (non-breaking change which adds functionality)

Checklist

  • Code follows project style guidelines
  • Self-review performed
  • Comments added for complex logic
  • Documentation updated
  • No new warnings generated
  • Tests added to prove feature works
  • Unit tests pass locally

Sequence Diagram

sequenceDiagram
participant GoogleVertexAI as Vertex
participant ChatComplete as ChatAPI

Vertex->>ChatAPI: Request with thinking parameters
Note over Vertex: thinking.type = 'enabled'
Note over Vertex: thinking.budget_tokens configured
ChatAPI-->>Vertex: Response with thoughtsTokenCount
Vertex->>ChatAPI: Parse token details
Note over ChatAPI: Add completion_tokens_details
ChatAPI-->>Vertex: Return reasoning_tokens
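The response-side mapping in the diagram above can be sketched as follows. Per the PR description, Vertex reports thinking tokens as thoughtsTokenCount, and the OpenAI format exposes them under usage.completion_tokens_details.reasoning_tokens; the interface and function names below are illustrative assumptions, not the PR's code.

```typescript
// Hedged sketch: translate Vertex-style usage metadata into OpenAI-style usage.
// Field names on the Vertex side follow the PR description; shapes are illustrative.
interface VertexUsageMetadata {
  promptTokenCount: number;
  candidatesTokenCount: number;
  totalTokenCount: number;
  thoughtsTokenCount?: number; // present when thinking mode produced tokens
}

function toOpenAIUsage(u: VertexUsageMetadata) {
  const usage: Record<string, unknown> = {
    prompt_tokens: u.promptTokenCount,
    completion_tokens: u.candidatesTokenCount,
    total_tokens: u.totalTokenCount,
  };
  // Only attach the details object when the provider actually reported
  // thinking tokens, so non-thinking responses keep the familiar shape.
  if (u.thoughtsTokenCount !== undefined) {
    usage.completion_tokens_details = {
      reasoning_tokens: u.thoughtsTokenCount,
    };
  }
  return usage;
}
```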

@@ -52,7 +52,8 @@ export function transformGenerationConfig(params: Params) {
  if (params?.thinking) {
    const thinkingConfig: Record<string, any> = {};
-   thinkingConfig['include_thoughts'] = true;
+   thinkingConfig['include_thoughts'] =
Collaborator

Based on this vertex doc: https://cloud.google.com/vertex-ai/generative-ai/docs/thinking#budget

In order to disable thinking, you need to set the thinking budget to 0. I do not see any parameter named include_thoughts available.

Collaborator Author

@narengogi narengogi May 14, 2025


I found this parameter by looking inside the SDK; I'll find it and link it here
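Per the Vertex doc linked above, disabling thinking is done by setting the thinking budget to 0. A minimal payload sketch is below; the field names follow that doc, but the overall request shape here is illustrative, not taken from this PR.

```typescript
// Minimal sketch of a Vertex generateContent-style payload that disables
// thinking by setting thinkingBudget to 0, per the doc linked above.
// The request shape is illustrative only.
const requestBody = {
  contents: [{ role: 'user', parts: [{ text: 'Hello' }] }],
  generationConfig: {
    thinkingConfig: {
      thinkingBudget: 0, // 0 disables thinking on supported 2.5 models
    },
  },
};
```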

Labels
None yet
Projects
None yet

2 participants