Skip to content

Conversation

@aantn
Copy link
Collaborator

@aantn aantn commented Dec 25, 2025

Summary

  • log /api/chat SSE responses to stdout while streaming, including intermediate events
  • log final non-streaming /api/chat responses for better observability

Testing

  • not run (not requested)

Codex Task

Summary by CodeRabbit

  • Refactor

    • Consolidated streaming event handling to use a unified message accumulation and yield pattern.
    • Improved error handling for rate limit scenarios with dedicated error message construction.
  • Chores

    • Enhanced logging of streamed chat events for better monitoring.
    • Added explicit logging of API response data.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Dec 25, 2025

Walkthrough

Both files are enhanced with logging instrumentation. holmes/utils/stream.py consolidates SSE event handling into a unified yield path with logging for streamed events and reworked exception handling. server.py adds explicit logging of chat responses before returning them.

Changes

Cohort / File(s) Summary
Streaming response logging
holmes/utils/stream.py
Introduces local sse_message variable to consolidate SSE formatting across all branches; adds logging.info calls for each streamed event; refactors RateLimitError and generic exception handling to construct SSE messages, log them, and yield via unified path instead of multiple yield points.
Chat API response logging
server.py
Assigns ChatResponse to local variable response, logs the serialized response, and returns the variable instead of directly constructing and returning inline.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested reviewers

  • moshemorad
  • arikalon1

Pre-merge checks

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Log /api/chat streaming output' directly and specifically describes the primary change: adding logging to the /api/chat endpoint's streaming responses, which aligns with the main objectives of logging SSE responses and intermediate events.

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
holmes/utils/stream.py (1)

137-150: Consider using logging.exception for better error diagnostics.

The exception handling correctly creates and logs SSE error messages. However, line 138 uses logging.error(e) which doesn't include the stack trace. Using logging.exception would provide better debugging information.

🔎 Suggested improvement
     except Exception as e:
-        logging.error(e)
+        logging.exception("Error in stream_chat_formatter")
         if "Model is getting throttled" in str(e):  # happens for bedrock
             sse_message = create_rate_limit_error_message(str(e))

As per static analysis hints, this change would provide stack traces for unexpected errors, making debugging easier.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7ac2ae8 and 3af9d54.

📒 Files selected for processing (2)
  • holmes/utils/stream.py
  • server.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: ALWAYS place Python imports at the top of the file, not inside functions or methods
Use Ruff for formatting and linting code
Type hints are required for all Python code

Files:

  • holmes/utils/stream.py
  • server.py
🧬 Code graph analysis (1)
server.py (1)
holmes/core/models.py (1)
  • ChatResponse (243-249)
🪛 Ruff (0.14.10)
holmes/utils/stream.py

137-137: Do not catch blind exception: Exception

(BLE001)


138-138: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: llm_evals
  • GitHub Check: build (3.10)
  • GitHub Check: build (3.12)
  • GitHub Check: build (3.11)
  • GitHub Check: build
🔇 Additional comments (4)
server.py (1)

403-411: Good addition of response logging for observability.

The logging implementation aligns with the PR objectives. However, be aware that response.model_dump() will serialize the entire ChatResponse, including conversation_history, tool_calls, and metadata, which could produce verbose logs.

Please verify:

  1. The log volume is acceptable for production environments, especially with lengthy conversation histories.
  2. No sensitive data (PII, credentials, etc.) flows through conversation_history or metadata that shouldn't be logged.
holmes/utils/stream.py (3)

93-104: Excellent refactoring to consolidate the yield path.

The introduction of the sse_message variable and single yield point improves code maintainability and enables consistent logging for all event types. This structural change makes the streaming logic cleaner and more predictable.

Also applies to: 117-121, 128-128


123-128: Great addition for streaming observability.

The logging successfully implements the PR objective to log SSE responses while streaming. Each event is logged with its type and full message content, providing comprehensive visibility into the streaming flow.


130-136: Proper logging added for rate limit errors.

The exception handler correctly creates the SSE error message, logs it, and yields it, maintaining consistency with the normal event flow.

@github-actions
Copy link
Contributor

Results of HolmesGPT evals

  • ask_holmes: 7/7 test cases were successful, 0 regressions
Test suite Test case Status
ask 09_crashpod
ask 101_loki_historical_logs_pod_deleted
ask 12_job_crashing
ask 162_get_runbooks
ask 176_network_policy_blocking_traffic_no_runbooks
ask 43_current_datetime_from_prompt
ask 61_exact_match_counting

Legend

  • ✅ the test was successful
  • :minus: the test was skipped
  • ⚠️ the test failed but is known to be flaky or known to fail
  • 🚧 the test had a setup failure (not a code regression)
  • 🔧 the test failed due to mock data issues (not a code regression)
  • 🚫 the test was throttled by API rate limits/overload
  • ❌ the test failed and should be fixed before merging the PR

@github-actions
Copy link
Contributor

github-actions bot commented Dec 25, 2025

Dev Docker images are ready for this commit:

Use this tag to pull the image for testing.

⚠️ Temporary images are deleted after 30 days. Copy to a permanent registry before using them:

gcloud auth configure-docker us-central1-docker.pkg.dev
docker pull us-central1-docker.pkg.dev/robusta-development/temporary-builds/holmes:3af9d54
docker tag us-central1-docker.pkg.dev/robusta-development/temporary-builds/holmes:3af9d54 me-west1-docker.pkg.dev/robusta-development/development/holmes-dev:3af9d54
docker push me-west1-docker.pkg.dev/robusta-development/development/holmes-dev:3af9d54

Patch Helm values in one line (choose the chart you use):

  • HolmesGPT chart:
helm upgrade --install holmesgpt ./helm/holmes \
  --set registry=me-west1-docker.pkg.dev/robusta-development/development \
  --set image=holmes-dev:3af9d54
  • Robusta wrapper chart:
helm upgrade --install robusta robusta/robusta \
  --reuse-values \
  --set holmes.registry=me-west1-docker.pkg.dev/robusta-development/development \
  --set holmes.image=holmes-dev:3af9d54

@aantn aantn marked this pull request as draft December 28, 2025 09:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants