bug: reasoning_effort wrapped in chat_template_kwargs breaks real OpenAI API backends #1901

@Notumlo

Describe the bug

Since PR #1116, the reasoning_effort family type unconditionally wraps the effort value inside chat_template_kwargs, regardless of whether the backend is a vLLM-hosted model or api.openai.com. The real OpenAI API does not recognise chat_template_kwargs and rejects the request with an HTTP 400.

To Reproduce

  1. Configure a model with reasoning_family: gpt (type: reasoning_effort) pointing to an api.openai.com backend
  2. Route a request that triggers a decision with use_reasoning: true
  3. Observe a 400 from OpenAI: "Unknown parameter: 'chat_template_kwargs'"

Expected behavior

For backends targeting api.openai.com, reasoning_effort should be sent as a plain top-level field:
{ "reasoning_effort": "high" }

The current behaviour (correct for vLLM-hosted models) sends:
{ "chat_template_kwargs": { "reasoning_effort": "high" } }
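
To make the expected behaviour concrete, here is a minimal sketch of the branching described above. It is not taken from the project's code: the helper names (`is_openai_backend`, `apply_reasoning_effort`), the `base_url` argument, and the idea of detecting the backend by hostname are all assumptions made for illustration. The real fix may well key off a config field instead.

```python
from urllib.parse import urlparse

# Hypothetical: hosts treated as "real OpenAI API" backends.
OPENAI_HOSTS = {"api.openai.com"}


def is_openai_backend(base_url: str) -> bool:
    """Return True if base_url points at api.openai.com (assumed detection rule).

    A URL without a scheme has no parsed hostname and is treated as non-OpenAI.
    """
    host = urlparse(base_url).hostname or ""
    return host.lower() in OPENAI_HOSTS


def apply_reasoning_effort(body: dict, effort: str, base_url: str) -> dict:
    """Return a copy of the request body with the effort placed where the backend expects it."""
    out = dict(body)
    if is_openai_backend(base_url):
        # OpenAI API: plain top-level field.
        out["reasoning_effort"] = effort
    else:
        # vLLM-hosted model: nested under chat_template_kwargs,
        # preserving any kwargs already present.
        kwargs = dict(out.get("chat_template_kwargs") or {})
        kwargs["reasoning_effort"] = effort
        out["chat_template_kwargs"] = kwargs
    return out
```

With this sketch, `apply_reasoning_effort({"model": "m"}, "high", "https://api.openai.com/v1")` yields a top-level `reasoning_effort`, while any other base URL yields the nested `chat_template_kwargs` form.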

Affected layer

None

Additional context

No response

Labels: bug

Status: In progress