
feat: Add reasoning effort configuration for reasoning models #6095

@louis-menlo

Description


Jan should provide users with an option to configure the reasoning effort level when using reasoning models (such as OpenAI's o1-series or Claude models with extended thinking). This would let users control the trade-off between response quality and speed/cost.

Current Behavior

  • Users can select different models but have no control over reasoning effort/depth
  • Reasoning models use default effort levels without user customization

Proposed Feature

  • Add a "Reasoning Effort" slider/dropdown in the model configuration UI
  • Support effort levels like "Low", "Medium", "High", or a numerical scale (1-5)
  • Persist the user's effort preference per model
  • Apply the effort setting when making API calls to reasoning models (a minimal sketch follows this list)
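
A rough sketch of how the setting could be stored per model and attached to a request. The type and function names here (`ReasoningEffort`, `ModelSettings`, `buildChatRequest`) are hypothetical, not Jan's actual code; OpenAI's `reasoning_effort` request parameter is real for its reasoning models:

```typescript
// Hypothetical setting shape -- names are illustrative, not Jan's actual types.
type ReasoningEffort = "low" | "medium" | "high";

interface ModelSettings {
  modelId: string;
  reasoningEffort?: ReasoningEffort; // persisted per model
}

// Sketch of attaching the setting to an OpenAI-compatible chat request body.
// Other providers would need their own mapping (see "Technical Implementation Areas").
function buildChatRequest(settings: ModelSettings, messages: unknown[]) {
  const body: Record<string, unknown> = {
    model: settings.modelId,
    messages,
  };
  if (settings.reasoningEffort) {
    body.reasoning_effort = settings.reasoningEffort;
  }
  return body;
}
```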

Use Cases

  • Users want faster responses for simple queries (low effort)
  • Users need thorough analysis for complex problems (high effort)
  • Cost-conscious users want to optimize token usage
  • Power users need fine-grained control over model behavior

Technical Implementation Areas

  • Model configuration interface
  • API request parameter handling
  • Settings persistence
  • Provider-specific reasoning effort mapping (see the sketch after this list)
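
One possible shape for the provider mapping, assuming OpenAI's `reasoning_effort` parameter and Anthropic's extended-thinking `budget_tokens` field; the token budgets and the function itself are illustrative assumptions, not a definitive implementation:

```typescript
// Hypothetical mapping layer -- the OpenAI and Anthropic parameter names are
// real, but the surrounding structure and numbers are assumptions.
type ReasoningEffort = "low" | "medium" | "high";

// Anthropic's extended thinking takes a token budget rather than a named
// level, so the UI level has to be translated (budgets below are assumed).
const anthropicBudget: Record<ReasoningEffort, number> = {
  low: 1024,
  medium: 8192,
  high: 32768,
};

function mapEffortToProviderParams(
  provider: "openai" | "anthropic",
  effort: ReasoningEffort
): Record<string, unknown> {
  switch (provider) {
    case "openai":
      // OpenAI reasoning models accept the named level directly.
      return { reasoning_effort: effort };
    case "anthropic":
      return {
        thinking: { type: "enabled", budget_tokens: anthropicBudget[effort] },
      };
  }
}
```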

Related llama.cpp pull requests:

ggml-org/llama.cpp#15266
ggml-org/llama.cpp#15130
