Fix: llm health check failing for thinking models #709

Open
jfromentrudder wants to merge 1 commit into srbhr:main from jfromentrudder:main

Conversation

@jfromentrudder
Contributor

@jfromentrudder jfromentrudder commented Mar 10, 2026

Closes #707 ([Bug]: Health checks do not support models with thinking capabilities, such as deepseek-r1).

Pull Request Title

Fix LLM health check failing for thinking models

Related Issue

Closes #707

Description

Added a check for reasoning output in the health check response. If the main content is empty but the message carries reasoning, the check passes; if no reasoning is present either, the existing empty-content handling applies.

Type

  • [x] Bug Fix
  • [ ] Feature Enhancement
  • [ ] Documentation Update
  • [ ] Code Refactoring
  • [ ] Other (please specify):

Proposed Changes

  • update check_llm_health in llm.py

Screenshots / Code Snippets (if applicable)

How to Test

  1. Configure a reasoning LLM such as DeepSeek-R1 and save the settings. The health check that runs on save should now pass instead of reporting empty content.

Checklist

  • [x] The code compiles successfully without any errors or warnings
  • [x] The changes have been tested and verified
  • [x] The documentation has been updated (if applicable)
  • [x] The changes follow the project's coding guidelines and best practices
  • [x] The commit messages are descriptive and follow the project's guidelines
  • [x] All tests (if applicable) pass successfully
  • [x] This pull request has been linked to the related issue (if applicable)

Additional Information


Summary by cubic

Fixes LLM health checks for reasoning/thinking models by recognizing reasoning output even when the text field is empty. Prevents false negatives for models like deepseek-r1.

  • Bug Fixes
    • Updated check_llm_health to accept reasoning_content/thinking as a valid response when content is empty.
    • Still marks truly empty responses as unhealthy with error_code: empty_content.
    • Minor logging and formatting tweaks; no API changes.

Written for commit cbac3a0. Summary will update on new commits.
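
The behavior summarized above can be sketched in isolation. This is a minimal illustration, not the code from the PR: the helper name `classify_health` and the `healthy` key are assumptions made for the example, and `SimpleNamespace` objects stand in for the message a real provider would return. Only the attribute names `reasoning_content` and `thinking` and the `error_code: empty_content` value come from the discussion.

```python
from types import SimpleNamespace


def classify_health(message):
    """Hypothetical helper: decide whether one chat message counts as healthy."""
    content = getattr(message, "content", None)
    # Thinking models may leave `content` empty and put their output here.
    reasoning = getattr(message, "reasoning_content", None) or getattr(
        message, "thinking", None
    )
    if content or reasoning:
        return {"healthy": True}
    # Neither visible text nor reasoning: still treated as unhealthy.
    return {"healthy": False, "error_code": "empty_content"}


# Stand-ins for provider responses (assumed shapes, for illustration only).
thinking_only = SimpleNamespace(content="", reasoning_content="Let me think...")
truly_empty = SimpleNamespace(content="", reasoning_content=None)

print(classify_health(thinking_only))
print(classify_health(truly_empty))
```

Using `getattr` with a default keeps the check safe for providers whose message objects do not define either reasoning attribute.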

@kilo-code-bot
Contributor

kilo-code-bot bot commented Mar 10, 2026

Code Review Summary

Status: No Issues Found | Recommendation: Merge

Overview

This PR contains changes to apps/backend/app/llm.py with:

  • Formatting changes: Line breaks added for code style consistency (no functional impact)
  • Functional improvement: Added reasoning content check in check_llm_health() to handle models that respond with reasoning_content or thinking attributes but empty main content

Analysis

Reasoning Content Check (lines 361-365)

The new logic correctly handles the edge case where a reasoning model (such as deepseek-r1) returns empty content but populates the reasoning_content or thinking attribute. Previously, such a response was incorrectly marked as unhealthy.

message = response.choices[0].message
has_reasoning = getattr(message, "reasoning_content", None) or getattr(
    message, "thinking", None)
if not has_reasoning:
    # Mark as unhealthy only if no reasoning content either

This is a valid improvement that prevents false-negative health checks for reasoning-capable models.

API Compatibility: No breaking changes detected. The health check response schema remains unchanged.

LiteLLM Configuration: The changes are compatible with existing LiteLLM patterns used in the codebase.

Other Observations (not in diff)

Minor Enhancement Opportunity: When a model responds with reasoning content but empty main content, the health check passes but model_output (line 392) still shows <empty>. Consider including reasoning content in the debug output for better observability. This is not a bug, just a potential UX improvement for debugging.
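
One possible shape for that enhancement is sketched below. It is only an illustration of the suggestion, not code from the repository: the helper name `describe_model_output`, the `<reasoning only>` label, and the truncation limit are all assumptions, and `SimpleNamespace` stands in for a real response message.

```python
from types import SimpleNamespace


def describe_model_output(message, limit=80):
    """Hypothetical helper: build a short debug string for a health check."""
    content = getattr(message, "content", None)
    if content:
        return content[:limit]
    reasoning = getattr(message, "reasoning_content", None) or getattr(
        message, "thinking", None
    )
    if reasoning:
        # Surface a truncated preview so the debug output is not just "<empty>".
        return "<reasoning only> " + str(reasoning)[:limit]
    return "<empty>"


print(describe_model_output(SimpleNamespace(content="", thinking="abc")))
print(describe_model_output(SimpleNamespace(content=None)))
```

Truncating the preview keeps long reasoning traces from flooding the debug output.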

Files Reviewed (1 file)
  • apps/backend/app/llm.py - 0 issues

Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment


No issues found across 1 file


srbhr pushed a commit that referenced this pull request Mar 12, 2026
Health checks now recognize reasoning_content/thinking attributes
as valid responses for thinking models like deepseek-r1, preventing
false-negative health check failures.

Closes #707

Co-Authored-By: jfromentrudder <jfromentrudder@gmail.com>