---
title: Extended Thinking Configuration
description: Enable extended thinking and reasoning modes for AI models that support deeper reasoning capabilities
keywords:
---
Enable extended thinking/reasoning modes for AI models that support deeper reasoning capabilities. This feature allows models to "think through" complex problems before providing a response.

NeuroLink supports extended thinking configuration for models that provide this capability. Extended thinking enables more thorough reasoning, which is particularly useful for complex tasks such as mathematical proofs, coding problems, and multi-step analysis.
- `gemini-3.1-pro` - Full thinking support with high token budgets (up to 100,000)
- `gemini-3-flash-preview` - Fast thinking with support for "minimal" level (up to 50,000)
- `gemini-2.5-pro` - Supports thinking configuration (up to 32,000 tokens)
- `gemini-2.5-flash` - Supports thinking configuration (up to 32,000 tokens)
All Claude 4.0+ models support extended thinking via budget tokens:

- `claude-sonnet-4-20250514` (Claude Sonnet 4)
- `claude-opus-4-20250514` (Claude Opus 4)
- `claude-opus-4-1-20250805` (Claude Opus 4.1)
- `claude-sonnet-4-5-20250929` (Claude Sonnet 4.5)
- `claude-opus-4-5-20251101` (Claude Opus 4.5)
- `claude-haiku-4-5-20251001` (Claude Haiku 4.5)
- `claude-sonnet-4-6` (Claude Sonnet 4.6)
- `claude-opus-4-6` (Claude Opus 4.6)
```typescript
import { NeuroLink } from "@juspay/neurolink";

const neurolink = new NeuroLink();

const result = await neurolink.generate({
  input: { text: "Prove that the square root of 2 is irrational" },
  provider: "google-ai",
  model: "gemini-2.5-flash",
  thinkingConfig: { thinkingLevel: "high" },
});

console.log(result.content);
```

For Gemini 3 models, use `thinkingLevel` to control reasoning depth:
```typescript
const response = await neurolink.generate({
  input: { text: "Prove that the square root of 2 is irrational" },
  provider: "vertex",
  model: "gemini-3-flash-preview",
  thinkingConfig: {
    thinkingLevel: "high", // 'minimal' | 'low' | 'medium' | 'high'
  },
});
```

| Level | Description | Best For |
|---|---|---|
| `minimal` | Near-zero thinking (Flash models only) | Simple queries requiring speed |
| `low` | Fast reasoning for simple tasks | Quick analysis, summaries |
| `medium` | Balanced reasoning/latency trade-off | General-purpose tasks |
| `high` | Maximum reasoning depth | Complex reasoning, math, coding |
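The level-to-latency trade-off in the table above can be captured in a small dispatch helper. The sketch below is illustrative only: `pickThinkingLevel` and its task categories are hypothetical names, not part of the NeuroLink API.

```typescript
type ThinkingLevel = "minimal" | "low" | "medium" | "high";

// Hypothetical helper (not part of NeuroLink): map a rough task category
// to a Gemini 3 thinking level, following the table above.
function pickThinkingLevel(
  task: "lookup" | "summary" | "general" | "reasoning",
): ThinkingLevel {
  switch (task) {
    case "lookup":
      return "minimal"; // speed-critical simple queries (Flash models only)
    case "summary":
      return "low"; // quick analysis and summaries
    case "general":
      return "medium"; // balanced reasoning/latency trade-off
    case "reasoning":
      return "high"; // complex math, coding, multi-step analysis
  }
}

console.log(pickThinkingLevel("reasoning")); // "high"
```

The returned value can be passed directly as `thinkingConfig.thinkingLevel` in a `generate()` call.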
| Model | Max Thinking Budget |
|---|---|
| `gemini-3-pro-*` | 100,000 tokens |
| `gemini-3-flash-*` | 50,000 tokens |
| `gemini-2.5-*` | 32,000 tokens |
| `claude-opus-4-6` | 100,000 tokens |
| `claude-sonnet-4-6` | 100,000 tokens |
| `claude-opus-4-5-*` | 100,000 tokens |
| `claude-sonnet-4-5-*` | 100,000 tokens |
| `claude-haiku-4-5-*` | 100,000 tokens |
| `claude-opus-4-1-*` | 100,000 tokens |
| `claude-opus-4-*` | 100,000 tokens |
| `claude-sonnet-4-*` | 100,000 tokens |
For Claude models, use `budgetTokens` to set the thinking token budget:

```typescript
const response = await neurolink.generate({
  input: { text: "Solve this complex math problem step by step..." },
  provider: "anthropic",
  model: "claude-sonnet-4-6",
  thinkingConfig: {
    enabled: true,
    budgetTokens: 10000, // Range: 5000-100000
  },
});
```

- Minimum: 5,000 tokens
- Maximum: 100,000 tokens
- Recommended for simple tasks: 5,000-10,000 tokens
- Recommended for complex reasoning: 20,000-50,000 tokens
- Maximum depth: 50,000-100,000 tokens
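Since out-of-range budgets are a common source of request errors, it can help to clamp a requested budget to the documented 5,000-100,000 range before building the config. The helper below is a sketch, not a NeuroLink function:

```typescript
// Illustrative helper (not part of NeuroLink): clamp a requested thinking
// budget to the documented Anthropic range of 5,000-100,000 tokens.
const MIN_BUDGET = 5_000;
const MAX_BUDGET = 100_000;

function clampThinkingBudget(requested: number): number {
  return Math.min(MAX_BUDGET, Math.max(MIN_BUDGET, Math.floor(requested)));
}

console.log(clampThinkingBudget(1_000)); // 5000  (raised to the minimum)
console.log(clampThinkingBudget(20_000)); // 20000 (within range, unchanged)
console.log(clampThinkingBudget(250_000)); // 100000 (capped at the maximum)
```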
The `thinkingConfig` object supports the following options:

```typescript
thinkingConfig: {
  enabled?: boolean; // Enable/disable thinking
  type?: "enabled" | "disabled"; // Alternative enable/disable
  budgetTokens?: number; // Token budget (Anthropic models)
  thinkingLevel?: "minimal" | "low" | "medium" | "high"; // Thinking level (Gemini models)
}
```

Extended thinking is also available via the CLI:
```bash
# Enable thinking with default settings
neurolink generate "Solve this problem" --thinking

# Set thinking budget for Anthropic
neurolink generate "Complex problem" --provider anthropic --thinking --thinkingBudget 20000

# Set thinking level for Gemini 3
neurolink generate "Complex problem" --provider vertex --model gemini-3-pro-preview --thinkingLevel high
```

| Option | Description | Default |
|---|---|---|
| `--thinking` | Enable extended thinking | `false` |
| `--thinkingBudget` | Token budget (Anthropic: 5000-100000) | `10000` |
| `--thinkingLevel` | Thinking level (Gemini 3: `minimal`, `low`, `medium`, `high`) | `medium` |
Extended thinking is well suited to:

- Complex mathematical proofs and calculations
- Multi-step coding problems and debugging
- Detailed analysis requiring multiple considerations
- Tasks where accuracy is more important than speed

Avoid extended thinking for:

- Simple queries where speed matters
- Straightforward information retrieval
- Quick summaries and formatting tasks
- High-volume, latency-sensitive applications
- **Start with medium**: Use `medium` as your default and adjust based on results
- **Match model to task**: Use Pro models for complex tasks, Flash for speed
- **Monitor token usage**: Higher thinking levels consume more tokens
- **Test performance**: Compare response quality vs. latency for your use case
```typescript
import { NeuroLink } from "@juspay/neurolink";

const neurolink = new NeuroLink();

// Complex coding problem with high reasoning
const result = await neurolink.generate({
  input: {
    text: `
      Design an optimal algorithm to find the longest palindromic subsequence
      in a string. Explain your approach, prove its correctness, and analyze
      the time and space complexity.
    `,
  },
  provider: "vertex",
  model: "gemini-3-pro-preview",
  thinkingConfig: {
    thinkingLevel: "high",
  },
  maxTokens: 4000,
});

console.log(result.content);
```

NeuroLink provides utilities to check thinking support:
```typescript
import {
  supportsThinkingConfig,
  getMaxThinkingBudgetTokens,
} from "@juspay/neurolink";

// Check if a model supports thinking
const supports = supportsThinkingConfig("gemini-3-pro-preview"); // true

// Get maximum budget for a model
const maxBudget = getMaxThinkingBudgetTokens("gemini-3-flash-preview"); // 50000
```

- **Provider compatibility**: Thinking configuration is provider-specific. Gemini uses `thinkingLevel`, Claude uses `budgetTokens`
- **Token consumption**: Extended thinking uses additional tokens beyond the response
- **Latency impact**: Higher thinking levels increase response time
- **Not all models support thinking**: Check `supportsThinkingConfig()` before enabling
- **Streaming support**: Thinking configuration works with both `generate()` and `stream()`
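The support-check pattern above can be combined into a single guard that only builds a `thinkingConfig` for models known to support thinking. The sketch below is self-contained for illustration: `buildThinkingConfig` is a hypothetical helper, and its local lookup table restates a few rows from the budget table on this page (in real code, `supportsThinkingConfig()` and `getMaxThinkingBudgetTokens()` would replace the table).

```typescript
// Illustrative guard (not NeuroLink API): a local budget table restating a
// few rows from the table above; the real helpers supportsThinkingConfig()
// and getMaxThinkingBudgetTokens() would replace it in practice.
const MAX_THINKING_BUDGET: Record<string, number> = {
  "gemini-3-pro-preview": 100_000,
  "gemini-3-flash-preview": 50_000,
  "gemini-2.5-pro": 32_000,
  "claude-sonnet-4-6": 100_000,
};

function buildThinkingConfig(model: string, budgetTokens: number) {
  const max = MAX_THINKING_BUDGET[model];
  if (max === undefined) return undefined; // model does not support thinking
  return { enabled: true, budgetTokens: Math.min(budgetTokens, max) };
}

console.log(buildThinkingConfig("claude-sonnet-4-6", 150_000)); // { enabled: true, budgetTokens: 100000 }
console.log(buildThinkingConfig("some-unknown-model", 10_000)); // undefined
```

When the guard returns `undefined`, the request can simply omit `thinkingConfig` rather than fail at the provider.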