
feat: Add reasoning effort configuration for reasoning models #6095

@louis-menlo

Description


Jan should provide users with an option to configure the reasoning effort level when using reasoning models (such as OpenAI's o1-series or Claude models with extended thinking). This would let users control the trade-off between response quality and speed/cost.

Current Behavior

  • Users can select different models but have no control over reasoning effort/depth
  • Reasoning models use default effort levels without user customization

Proposed Feature

  • Add a "Reasoning Effort" slider/dropdown in the model configuration UI
  • Support effort levels like "Low", "Medium", "High", or a numerical scale (1-5)
  • Persist the user's effort preference per model
  • Apply the effort setting when making API calls to reasoning models (a minimal sketch follows this list)
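
A rough sketch of how the setting could be stored per model and attached to a request. The type and function names here (`ReasoningEffort`, `ModelSettings`, `buildChatRequest`) are hypothetical, not Jan's actual code; OpenAI's `reasoning_effort` request parameter is real for its reasoning models:

```typescript
// Hypothetical setting shape -- names are illustrative, not Jan's actual types.
type ReasoningEffort = "low" | "medium" | "high";

interface ModelSettings {
  modelId: string;
  reasoningEffort?: ReasoningEffort; // persisted per model
}

// Sketch of attaching the setting to an OpenAI-compatible chat request body.
// Other providers would need their own mapping (see "Technical Implementation Areas").
function buildChatRequest(settings: ModelSettings, messages: unknown[]) {
  const body: Record<string, unknown> = {
    model: settings.modelId,
    messages,
  };
  if (settings.reasoningEffort) {
    body.reasoning_effort = settings.reasoningEffort;
  }
  return body;
}
```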

Use Cases

  • Users want faster responses for simple queries (low effort)
  • Users need thorough analysis for complex problems (high effort)
  • Cost-conscious users want to optimize token usage
  • Power users need fine-grained control over model behavior

Technical Implementation Areas

  • Model configuration interface
  • API request parameter handling
  • Settings persistence
  • Provider-specific reasoning effort mapping (see the sketch after this list)
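
One possible shape for the provider mapping, assuming OpenAI's `reasoning_effort` parameter and Anthropic's extended-thinking `budget_tokens` field; the token budgets and the function itself are illustrative assumptions, not a definitive implementation:

```typescript
// Hypothetical mapping layer -- the OpenAI and Anthropic parameter names are
// real, but the surrounding structure and numbers are assumptions.
type ReasoningEffort = "low" | "medium" | "high";

// Anthropic's extended thinking takes a token budget rather than a named
// level, so the UI level has to be translated (budgets below are assumed).
const anthropicBudget: Record<ReasoningEffort, number> = {
  low: 1024,
  medium: 8192,
  high: 32768,
};

function mapEffortToProviderParams(
  provider: "openai" | "anthropic",
  effort: ReasoningEffort
): Record<string, unknown> {
  switch (provider) {
    case "openai":
      // OpenAI reasoning models accept the named level directly.
      return { reasoning_effort: effort };
    case "anthropic":
      return {
        thinking: { type: "enabled", budget_tokens: anthropicBudget[effort] },
      };
  }
}
```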

Related llama.cpp pull requests:

ggml-org/llama.cpp#15266
ggml-org/llama.cpp#15130
