Describe the bug
Since PR #1116, the reasoning_effort family type unconditionally wraps the effort value inside chat_template_kwargs, regardless of whether the backend is a vLLM-hosted model or api.openai.com. The real OpenAI API does not recognise chat_template_kwargs and returns a 400.
To Reproduce
- Configure a model with reasoning_family: gpt (type: reasoning_effort) pointing to an api.openai.com backend
- Route a request that triggers a decision with use_reasoning: true
- Observe a 400 from OpenAI: "Unknown parameter: 'chat_template_kwargs'"
Expected behavior
For backends targeting api.openai.com, reasoning_effort should be sent as a plain top-level field:
{ "reasoning_effort": "high" }
The current behaviour (correct for vLLM-hosted models) sends:
{ "chat_template_kwargs": { "reasoning_effort": "high" } }
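A minimal sketch of the expected branching, for illustration only. The function name build_reasoning_fields, the OPENAI_HOSTS set, and the idea of deciding by the backend's hostname are all assumptions made for this example. They are not taken from the project's code, which may prefer an explicit per-backend config flag instead.

```python
from urllib.parse import urlparse

# Hypothetical set of hosts that expect the plain top-level field.
OPENAI_HOSTS = {"api.openai.com"}


def build_reasoning_fields(base_url: str, effort: str) -> dict:
    """Return the request-body fields that carry the reasoning effort.

    Hypothetical helper: api.openai.com gets a top-level reasoning_effort,
    every other backend (e.g. vLLM) gets it wrapped in chat_template_kwargs.
    """
    host = (urlparse(base_url).hostname or "").lower()
    if host in OPENAI_HOSTS:
        return {"reasoning_effort": effort}
    return {"chat_template_kwargs": {"reasoning_effort": effort}}


if __name__ == "__main__":
    # Example endpoints; the vLLM address is a placeholder.
    print(build_reasoning_fields("https://api.openai.com/v1", "high"))
    print(build_reasoning_fields("http://localhost:8000/v1", "high"))
```

Matching on the hostname keeps the sketch short, but it would misclassify OpenAI-compatible proxies, so a configuration option on the backend is probably the more robust choice.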
Affected layer
None
Additional context
No response