
Conversation

@aantn (Collaborator) commented on Dec 21, 2025

Summary

  • add a dedicated AI reasoning stream event for chat SSE payloads
  • emit reasoning content from streaming LLM responses before returning chat results
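
As an illustration, a client consuming the chat SSE stream might see a payload shaped like the sketch below. The "ai_reasoning" event name is the one this PR adds; the data keys are assumptions, not the confirmed wire format.

# Hypothetical payload for the new event; only the event name is
# confirmed by this PR, the data keys are illustrative.
reasoning_event = {
    "event": "ai_reasoning",
    "data": {"reasoning": "Checking recent pod events before answering."},
}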

Testing

  • Not run (not requested)

Codex Task

Summary by CodeRabbit

  • New Features
    • Streaming now exposes the AI's intermediate reasoning in real time alongside final messages, so users can observe the model's thought process as responses generate.
    • Streaming order improved so reasoning appears earlier and is included with final answers for clearer, more coherent output during live responses.


@coderabbitai bot (Contributor) commented on Dec 21, 2025

Walkthrough

Adds extraction and streaming emission of LLM "reasoning" content: a new AI_REASONING stream event is introduced and call_stream now yields reasoning alongside or before AI messages and tool-call events, using extracted message content when emitting ANSWER_END.

Changes

  • Stream event type addition (holmes/utils/stream.py): Added AI_REASONING = "ai_reasoning" enum member to StreamEvents.
  • Reasoning extraction & streaming (holmes/core/tool_calling_llm.py): Updated call_stream to extract reasoning_content, emit the new AI_REASONING stream event with metadata, emit AI_MESSAGE including both content and reasoning when present, and use the extracted message for ANSWER_END when no tool calls occur.
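
For concreteness, the enum addition could be as small as the sketch below. Only the AI_REASONING member and its "ai_reasoning" value are confirmed by this summary; the base classes and the other member names are illustrative placeholders.

from enum import Enum

class StreamEvents(str, Enum):  # base classes assumed; see holmes/utils/stream.py
    AI_MESSAGE = "ai_message"      # placeholder for an existing member
    ANSWER_END = "answer_end"      # placeholder for an existing member
    AI_REASONING = "ai_reasoning"  # the new member added in this PR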

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant ToolCallingLLM
  participant LLM
  participant StreamEmitter
  participant ToolRunner

  Client->>ToolCallingLLM: request call_stream(...)
  ToolCallingLLM->>LLM: send prompt / await streaming response
  LLM-->>ToolCallingLLM: stream chunks (content, reasoning_content, tool_call hints)
  alt reasoning_content present
    ToolCallingLLM->>StreamEmitter: emit AI_REASONING(reasoning_content, metadata)
  end
  alt content present
    ToolCallingLLM->>StreamEmitter: emit AI_MESSAGE(content + reasoning, metadata)
  end
  alt tool call required
    ToolCallingLLM->>StreamEmitter: emit TOOL_CALL(...)
    ToolCallingLLM->>ToolRunner: invoke tool(...)
    ToolRunner-->>ToolCallingLLM: tool result
    ToolCallingLLM->>StreamEmitter: emit TOOL_RESPONSE(...)
  else no tool call
    ToolCallingLLM->>StreamEmitter: emit ANSWER_END(final_message)
  end
  StreamEmitter-->>Client: forward stream events

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested reviewers

  • moshemorad
  • arikalon1

Pre-merge checks

✅ Passed checks (3 passed)

  • Description Check: ✅ Passed. Check skipped because CodeRabbit’s high-level summary is enabled.
  • Title Check: ✅ Passed. The PR title 'Add reasoning streaming event' directly and clearly summarizes the main change: adding a new AI_REASONING stream event for reasoning content in streaming LLM responses.
  • Docstring Coverage: ✅ Passed. Docstring coverage is 100.00%, which meets the required threshold of 80.00%.


@aantn marked this pull request as draft on December 25, 2025 at 05:33

@coderabbitai bot (Contributor) left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
holmes/core/tool_calling_llm.py (1)

938-946: LGTM: AI_MESSAGE event properly handles reasoning and content together.

The event correctly includes both content and reasoning fields, allowing the frontend to display them together when tool calls are present. The condition if reasoning or message: appropriately prevents emitting empty events.

Optional: Consider consistency with non-streaming version

The non-streaming version (Line 448) checks text_response and text_response.strip() to avoid logging whitespace-only content. For consistency, consider:

-if reasoning or message:
+if reasoning or (message and message.strip()):

This prevents emitting AI_MESSAGE events when message contains only whitespace. However, this is a minor edge case and the current implementation is acceptable.
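
The distinction is easy to check in a REPL: a whitespace-only string is truthy in Python, so only the stricter guard skips it.

message = "   "
bool(message)          # True: any non-empty string is truthy, even whitespace
bool(message.strip())  # False: strip() reduces it to "", which is falsy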

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between commits bd8ee21 and 425c09e.

📒 Files selected for processing (1)
  • holmes/core/tool_calling_llm.py
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Use Ruff for formatting and linting (configured in pyproject.toml)
Type hints required (mypy configuration in pyproject.toml)
ALWAYS place Python imports at the top of the file, not inside functions or methods

Files:

  • holmes/core/tool_calling_llm.py
holmes/core/tool_calling_llm.py

📄 CodeRabbit inference engine (CLAUDE.md)

holmes/core/tool_calling_llm.py: Use LiteLLM for multi-provider support (OpenAI, Anthropic, Azure, etc.) in LLM integration
Implement structured tool calling with automatic retry and error handling

Files:

  • holmes/core/tool_calling_llm.py
🧠 Learnings (2)
📚 Learning: 2025-12-21T13:17:48.366Z
Learnt from: CR
Repo: HolmesGPT/holmesgpt PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-21T13:17:48.366Z
Learning: Applies to holmes/core/tool_calling_llm.py : Use LiteLLM for multi-provider support (OpenAI, Anthropic, Azure, etc.) in LLM integration

Applied to files:

  • holmes/core/tool_calling_llm.py
📚 Learning: 2025-12-21T13:17:48.366Z
Learnt from: CR
Repo: HolmesGPT/holmesgpt PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-21T13:17:48.366Z
Learning: Applies to holmes/core/tool_calling_llm.py : Implement structured tool calling with automatic retry and error handling

Applied to files:

  • holmes/core/tool_calling_llm.py
🧬 Code graph analysis (1)
holmes/core/tool_calling_llm.py (1)
holmes/utils/stream.py (2)
  • StreamMessage (28-30)
  • StreamEvents (16-25)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: build (3.12)
  • GitHub Check: build (3.11)
  • GitHub Check: build (3.10)
  • GitHub Check: llm_evals
  • GitHub Check: build
🔇 Additional comments (3)
holmes/core/tool_calling_llm.py (3)

915-916: LGTM: Clear variable extraction for reasoning and message content.

Extracting reasoning_content and message content into local variables improves readability and prepares them for use in multiple stream events. The use of getattr with a default value safely handles cases where reasoning_content may not be present.
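
A minimal sketch of that pattern, where response_message is the streamed response's message object from the surrounding code and the variable names follow this review:

# Defensive read: returns None instead of raising AttributeError when a
# provider's response message has no reasoning_content attribute.
reasoning = getattr(response_message, "reasoning_content", None)
message = response_message.content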


927-936: LGTM: Proper use of extracted message in ANSWER_END event.

Using the extracted message variable instead of accessing response_message.content directly improves code readability and is consistent with the non-streaming implementation.
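
Inside call_stream, the branch described here would look roughly like the sketch below; StreamMessage's exact constructor and the tool_calls attribute name are assumptions, as neither is shown in this diff.

if not response_message.tool_calls:  # attribute name assumed
    yield StreamMessage(
        event=StreamEvents.ANSWER_END,
        data={"content": message},  # the extracted message, not response_message.content
    )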


918-925: LGTM: Correctly implements AI reasoning stream event with cross-provider safety.

The AI_REASONING event emission uses defensive attribute access via getattr(response_message, "reasoning_content", None), ensuring compatibility across all LiteLLM providers. The event is emitted only when reasoning exists, gracefully degrading for providers that don't support reasoning content. This pattern aligns with the non-streaming implementation and requires no additional provider-specific handling.
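
Combined with the extraction sketched earlier, the guarded emission inside call_stream could amount to something like this (the data key is an assumption):

if reasoning:  # falsy for providers that return no reasoning content
    yield StreamMessage(
        event=StreamEvents.AI_REASONING,
        data={"reasoning": reasoning},
    )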

@github-actions (Contributor) commented:

Results of HolmesGPT evals

  • ask_holmes: 26/37 test cases were successful, 2 regressions, 6 setup failures, 3 mock failures
Test suite Test case Status
ask 01_how_many_pods 🚧
ask 02_what_is_wrong_with_pod 🚧
ask 04_related_k8s_events
ask 05_image_version
ask 09_crashpod
ask 10_image_pull_backoff
ask 110_k8s_events_image_pull
ask 11_init_containers
ask 13a_pending_node_selector_basic
ask 14_pending_resources
ask 15_failed_readiness_probe
ask 163_compaction_follow_up
ask 17_oom_kill 🚧
ask 18_oom_kill_from_issues_history
ask 19_detect_missing_app_details
ask 20_long_log_file_search
ask 24_misconfigured_pvc
ask 24a_misconfigured_pvc_basic
ask 28_permissions_error 🚧
ask 39_failed_toolset
ask 41_setup_argo
ask 42_dns_issues_steps_new_tools
ask 43_current_datetime_from_prompt
ask 45_fetch_deployment_logs_simple
ask 51_logs_summarize_errors
ask 53_logs_find_term
ask 54_not_truncated_when_getting_pods
ask 59_label_based_counting 🚧
ask 60_count_less_than
ask 61_exact_match_counting 🚧
ask 63_fetch_error_logs_no_errors
ask 79_configmap_mount_issue
ask 83_secret_not_found
ask 86_configmap_like_but_secret
ask 93_calling_datadog[0] 🔧
ask 93_calling_datadog[1] 🔧
ask 93_calling_datadog[2] 🔧

Legend

  • ✅ the test was successful
  • ➖ the test was skipped
  • ⚠️ the test failed but is known to be flaky or known to fail
  • 🚧 the test had a setup failure (not a code regression)
  • 🔧 the test failed due to mock data issues (not a code regression)
  • 🚫 the test was throttled by API rate limits/overload
  • ❌ the test failed and should be fixed before merging the PR

@github-actions (Contributor) commented:

Dev Docker images are ready for this commit. Use tag 425c09e to pull the image for testing.

⚠️ Temporary images are deleted after 30 days. Copy to a permanent registry before using them:

gcloud auth configure-docker us-central1-docker.pkg.dev
docker pull us-central1-docker.pkg.dev/robusta-development/temporary-builds/holmes:425c09e
docker tag us-central1-docker.pkg.dev/robusta-development/temporary-builds/holmes:425c09e me-west1-docker.pkg.dev/robusta-development/development/holmes-dev:425c09e
docker push me-west1-docker.pkg.dev/robusta-development/development/holmes-dev:425c09e

Patch Helm values in one line (choose the chart you use):

  • HolmesGPT chart:
helm upgrade --install holmesgpt ./helm/holmes \
  --set registry=me-west1-docker.pkg.dev/robusta-development/development \
  --set image=holmes-dev:425c09e
  • Robusta wrapper chart:
helm upgrade --install robusta robusta/robusta \
  --reuse-values \
  --set holmes.registry=me-west1-docker.pkg.dev/robusta-development/development \
  --set holmes.image=holmes-dev:425c09e
