[LEADS-415] Responses streaming support #255

Merged: asamal4 merged 2 commits into lightspeed-core:main from xmican10:LEADS-415-responses-streaming-support on Jun 23, 2026

Conversation

@xmican10

@xmican10 xmican10 commented Jun 12, 2026

Collaborator

Description

Follow-up PR that adds support for the stream=True parameter on /responses requests.

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Unit tests improvement

Tools used to create PR

Identify any AI code assistants used in this PR (for transparency and review context)

  • Assisted-by: (e.g., Claude, CodeRabbit, Ollama, etc., N/A if not used)
  • Generated by: (e.g., tool name and version; N/A if not used)

Related Tickets & Documents

  • Related Issue #
  • Closes #

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

  • Please provide detailed steps to perform tests related to this code change.
  • How were the fix/results from this change verified? Please provide relevant screenshots or results.

Summary by CodeRabbit

  • New Features
    • Added OpenAI-style streaming support to the /responses endpoint for real-time delivery
    • Improved streaming parsing, including tool-call extraction and file-search chunk mapping into the response payload
    • Enhanced token timing metrics for better generation performance tracking
  • Bug Fixes
    • Improved handling of tool-call argument shapes and stricter validation for missing/invalid fields
  • Tests
    • Extended unit tests to verify correct streaming vs non-streaming request behavior and end-to-end streaming parsing (including error and [DONE] edge cases)
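The token timing metric listed above can be illustrated with a small sketch. The walkthrough further down states that tokens-per-second is calculated excluding TTFT (time to first token); everything else here is an assumption. The class name `TimingSketch` and its fields are hypothetical and are not the PR's `_PerformanceTracker`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TimingSketch:
    """Hypothetical timing record; not the PR's _PerformanceTracker."""

    start: float                     # when the request was sent (seconds)
    first_token_at: Optional[float]  # when the first content delta arrived
    end: float                       # when the stream finished

    @property
    def ttft(self) -> Optional[float]:
        """Time to first token, or None if no content ever arrived."""
        if self.first_token_at is None:
            return None
        return self.first_token_at - self.start

    def tokens_per_second(self, output_tokens: int) -> Optional[float]:
        """Generation rate over the window that starts at the first token."""
        if self.first_token_at is None or output_tokens <= 0:
            return None
        window = self.end - self.first_token_at  # TTFT is excluded here
        if window <= 0:
            return None
        return output_tokens / window


timing = TimingSketch(start=0.0, first_token_at=0.5, end=2.5)
```

With these numbers, 100 output tokens over the 2.0 second window after the first token give 50 tokens per second. Dividing by the full 2.5 seconds would give 40, which mixes queueing and prompt-processing latency into the generation rate.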

@coderabbitai

coderabbitai Bot commented Jun 12, 2026

Contributor

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 222153c9-e025-49c1-9cf1-494fe9f662a3

📥 Commits

Reviewing files that changed from the base of the PR and between 9e9df01 and 5449b8f.

📒 Files selected for processing (4)
  • src/lightspeed_evaluation/core/api/client.py
  • src/lightspeed_evaluation/core/api/streaming_parser.py
  • tests/unit/core/api/test_client_responses.py
  • tests/unit/core/api/test_streaming_parser.py
🚧 Files skipped from review as they are similar to previous changes (3)
  • tests/unit/core/api/test_client_responses.py
  • src/lightspeed_evaluation/core/api/client.py
  • src/lightspeed_evaluation/core/api/streaming_parser.py

Walkthrough

The streaming parser is refactored from Pydantic-based models to dataclasses, with the monolithic event switch replaced by handler-dispatch tables (_STREAMING_EVENT_HANDLERS, _RESPONSES_EVENT_HANDLERS). A new parse_responses_streaming function and ResponsesStreamingContext dataclass handle the /responses SSE protocol. APIClient._responses_query gains a stream-gated branch using httpx.Client.stream. Unit tests cover both the streaming dispatch in APIClient and the new parser entrypoint.
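As a rough illustration of the handler-dispatch idea, the sketch below maps event names to small handler functions instead of branching in one large switch. It is not the PR's code: `ContextSketch`, `HANDLERS`, and `dispatch` are hypothetical names, and the handler bodies are simplified guesses. Only the two event names are taken from the change summary.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class ContextSketch:
    """Hypothetical parser state; stands in for the PR's streaming context."""

    text_parts: list[str] = field(default_factory=list)
    final_response: str = ""


def _on_delta(ctx: ContextSketch, data: dict[str, Any]) -> None:
    # Accumulate one text fragment per delta event.
    ctx.text_parts.append(data.get("delta", ""))


def _on_completed(ctx: ContextSketch, _data: dict[str, Any]) -> None:
    # Join the fragments once the stream reports completion.
    ctx.final_response = "".join(ctx.text_parts)


Handler = Callable[[ContextSketch, dict[str, Any]], None]

# Event name -> handler. Adding an event means adding one entry here.
HANDLERS: dict[str, Handler] = {
    "response.output_text.delta": _on_delta,
    "response.completed": _on_completed,
}


def dispatch(ctx: ContextSketch, event: str, data: dict[str, Any]) -> None:
    """Route an event to its handler; unknown events are ignored."""
    handler = HANDLERS.get(event)
    if handler is not None:
        handler(ctx, data)
```

A table like this keeps each handler independently testable, and silently skipping unknown event names is one possible policy; a stricter parser could log or raise instead.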

Changes

Responses Endpoint Streaming Support

| Layer / File(s) | Summary |
| --- | --- |
| Streaming parser: dataclass state and handler dispatch / src/lightspeed_evaluation/core/api/streaming_parser.py | Replaces Pydantic with dataclasses, converts CONTENT_EVENTS to frozenset, refactors _PerformanceTracker and StreamingContext to dataclasses with token-per-second calculation excluding TTFT, replaces the centralized event switch with _STREAMING_EVENT_HANDLERS per-event dispatch for /streaming protocol, updates _parse_tool_call to support both name/args and legacy tool_name/arguments shapes with an optional error field, and splits validation into _validate_streaming_response and _validate_responses_response. |
| parse_responses_streaming and /responses SSE loop / src/lightspeed_evaluation/core/api/streaming_parser.py | Adds ResponsesStreamingContext dataclass and _RESPONSES_EVENT_HANDLERS dispatch for response.created, response.output_text.delta (TTFT capture), response.output_item.done (MCP normalization and file-search-to-rag_chunks extraction), and response.completed (final response and token counts); implements parse_responses_streaming with [DONE]-terminated SSE loop; updates parse_streaming_response to use handler dispatch and raises ValueError on error events with tokens. |
| APIClient streaming dispatch for /responses / src/lightspeed_evaluation/core/api/client.py | Imports parse_responses_streaming; adds a stream-gated branch in _responses_query that uses httpx.Client.stream when responses_request.get("stream") is truthy, calls _handle_response_errors, invokes parse_responses_streaming, and returns APIResponse from the parsed result; non-streaming path via httpx.Client.post remains unchanged. |
| Unit tests: client streaming dispatch and responses SSE parser / tests/unit/core/api/test_client_responses.py, tests/unit/core/api/test_streaming_parser.py | Adds test_responses_streaming_dispatches_to_parse_responses_streaming and test_responses_non_streaming_does_not_use_streaming_path asserting correct dispatch in TestResponsesEndpoint; adds TestNormalizeMcpItem for JSON-string argument decoding; adds TestParseResponsesStreaming covering basic response parsing, usage tokens and TTFT timing, MCP tool-call extraction with argument decoding and error field capture, file-search-to-rag_chunks mapping with synthetic file_search tool call, missing final-response and conversation_id errors, and [DONE] sentinel ordering validation. |
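
The [DONE]-terminated SSE loop described in the table can be sketched roughly as follows. This is a simplified illustration, not the PR's `parse_responses_streaming`: it assumes each event arrives as a `data: {json}` line whose payload carries a `type` field, and the function names `iter_sse_events` and `parse_sse_sketch` are hypothetical. The real wire format handled by the PR is not shown on this page.

```python
import json
from typing import Any, Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict[str, Any]]:
    """Yield decoded JSON payloads from 'data:' lines until [DONE]."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank lines, comments, and 'event:' lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return  # sentinel reached: anything after it is ignored
        yield json.loads(payload)


def parse_sse_sketch(lines: Iterable[str]) -> str:
    """Collect text deltas and require a completed event (assumed shape)."""
    parts: list[str] = []
    completed = False
    for event in iter_sse_events(lines):
        kind = event.get("type")
        if kind == "response.output_text.delta":
            parts.append(event.get("delta", ""))
        elif kind == "response.completed":
            completed = True
    if not completed:
        # Assumption: a stream without a final response is treated as an error.
        raise ValueError("stream ended without a response.completed event")
    return "".join(parts)
```

In a client, the `lines` iterable could plausibly come from `response.iter_lines()` on an `httpx` streaming response, since the summary says the streaming branch uses `httpx.Client.stream`; that wiring is not reproduced here.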

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • lightspeed-core/lightspeed-evaluation#254: Introduces the /responses endpoint support in APIClient and the initial request/response mapping that this PR extends with streaming dispatch and parse_responses_streaming.

Suggested reviewers

  • asamal4
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and specifically describes the main objective: adding streaming support for the responses endpoint, which aligns with the core changes across client.py, streaming_parser.py, and tests. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 98.15% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


xmican10 force-pushed the LEADS-415-responses-streaming-support branch 3 times, most recently from e8a6dd9 to 5bb24c3, on June 12, 2026 14:55
@asamal4

asamal4 commented Jun 17, 2026

Collaborator

@coderabbitai full review

@coderabbitai

coderabbitai Bot commented Jun 17, 2026

Contributor
✅ Action performed

Full review finished.

@coderabbitai coderabbitai Bot left a comment

Contributor


Actionable comments posted: 1

🧹 Nitpick comments (1)
src/lightspeed_evaluation/core/api/streaming_parser.py (1)

356-358: 💤 Low value

Unreachable code: arguments can never be None after line 350.

Line 350 uses or {} as a fallback, so arguments will always be at least an empty dict. The condition if arguments is None on line 356 will never be true.

Consider removing this unreachable block:

Proposed fix
         if not tool_name:
             logger.debug("Tool call missing name/tool_name field")
             return None
 
-        if arguments is None:
-            logger.debug("Tool call missing args/arguments field for %s", tool_name)
-            return None
-
         tool_call: dict[str, Any] = {"tool_name": tool_name, "arguments": arguments}
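
The reasoning behind this nitpick can be checked with a small sketch. It assumes the lookup has the shape `data.get("args") or data.get("arguments") or {}`; the actual expression on line 350 is not shown on this page, and `parse_tool_call_sketch` is a hypothetical stand-in for `_parse_tool_call`.

```python
from typing import Any, Optional


def parse_tool_call_sketch(data: dict[str, Any]) -> Optional[dict[str, Any]]:
    """Accept name/args or legacy tool_name/arguments (assumed key order)."""
    tool_name = data.get("name") or data.get("tool_name")
    # The trailing `or {}` turns a missing key, None, or any other falsy
    # value into an empty dict, so `arguments` is never None below.
    arguments = data.get("args") or data.get("arguments") or {}

    if not tool_name:
        return None

    # No `if arguments is None` check is needed here: it could never fire.
    return {"tool_name": tool_name, "arguments": arguments}
```

One consequence of this fallback is that a caller can no longer tell "no arguments supplied" apart from "empty arguments supplied"; both come back as `{}`.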
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lightspeed_evaluation/core/api/streaming_parser.py` around lines 356 -
358, Remove the unreachable `if arguments is None:` condition and its associated
debug logging block. Since the arguments variable is assigned with an `or {}`
fallback pattern earlier in the code, it will always contain at least an empty
dictionary and can never be None, making this entire conditional block dead code
that should be deleted.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/unit/core/api/test_client_responses.py`:
- Line 1: Remove the module-level pylint disable comment at the top of
test_client_responses.py that suppresses protected-access and duplicate-code
warnings. Instead of disabling these checks globally, identify the specific code
locations that trigger these warnings (likely places where protected members are
accessed with underscore prefixes or where code is duplicated) and either
refactor the code to avoid the issue, or if necessary, apply targeted localized
lint suppressions directly to those specific lines. This ensures code quality
standards are maintained and warnings are addressed rather than hidden.

---

Nitpick comments:
In `@src/lightspeed_evaluation/core/api/streaming_parser.py`:
- Around line 356-358: Remove the unreachable `if arguments is None:` condition
and its associated debug logging block. Since the arguments variable is assigned
with an `or {}` fallback pattern earlier in the code, it will always contain at
least an empty dictionary and can never be None, making this entire conditional
block dead code that should be deleted.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 00d2cc99-273a-4e6c-8365-056cd39844d9

📥 Commits

Reviewing files that changed from the base of the PR and between 454d871 and 5bb24c3.

📒 Files selected for processing (19)
  • config/system.yaml
  • examples/01_getting_started/basic_setup/README.md
  • examples/02_metrics/context_quality/README.md
  • examples/02_metrics/conversation_quality/README.md
  • examples/02_metrics/keywords_evaluation/README.md
  • examples/02_metrics/nlp_metrics/README.md
  • examples/02_metrics/response_quality/README.md
  • examples/02_metrics/tool_evaluation/README.md
  • examples/03_endpoints/responses/README.md
  • examples/03_endpoints/responses/eval_data.yaml
  • examples/03_endpoints/responses/system.yaml
  • src/lightspeed_evaluation/core/api/client.py
  • src/lightspeed_evaluation/core/api/streaming_parser.py
  • src/lightspeed_evaluation/core/constants.py
  • src/lightspeed_evaluation/core/metrics/custom/tool_eval.py
  • tests/unit/core/api/conftest.py
  • tests/unit/core/api/test_client_responses.py
  • tests/unit/core/api/test_streaming_parser.py
  • tests/unit/core/metrics/custom/test_tool_eval.py

Comment thread tests/unit/core/api/test_client_responses.py
@xmican10

Collaborator Author

I need to rebase.

xmican10 force-pushed the LEADS-415-responses-streaming-support branch 2 times, most recently from 8c03464 to 54e60be, on June 17, 2026 07:49
@xmican10

Collaborator Author

@coderabbitai full review

@asamal4 asamal4 left a comment

Collaborator


Thanks! Some minor comments, applicable in multiple places.

Comment thread src/lightspeed_evaluation/core/api/streaming_parser.py Outdated
Comment thread src/lightspeed_evaluation/core/api/streaming_parser.py
Comment thread tests/unit/core/api/test_streaming_parser.py Outdated
xmican10 force-pushed the LEADS-415-responses-streaming-support branch from 54e60be to 9e9df01 on June 22, 2026 11:34
xmican10 force-pushed the LEADS-415-responses-streaming-support branch from 9e9df01 to 5449b8f on June 22, 2026 12:51
@asamal4 asamal4 merged commit a625cab into lightspeed-core:main Jun 23, 2026
17 checks passed

2 participants