
feat: add websocket_mode for responses api #990

Open
praneeth999 wants to merge 7 commits into main from Web_socket_mode

Conversation


@praneeth999 praneeth999 commented Mar 10, 2026

Description

Add WebSocket mode for the Responses API.

Type of change

  • New feature (feat:) - Non-breaking change which adds functionality

Checklist

  • I have run pre-commit on my changed files and all checks pass
  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
How to Test

Add a test method for this PR.

Test CLI Command

Write down the test bash command. If there are prerequisites, please emphasize them.

Summary by CodeRabbit

  • New Features

    • WebSocket mode with adaptive streaming; transport manages stream/background and reconnects for real-time responses.
  • Configs

    • Added example WebSocket-ready provider configs for multiple models.
  • Bug Fixes

    • Transport-control fields excluded from API payloads.
    • Preserve messages that include an identifier to avoid unwanted reconstruction.
  • Tests

    • Extensive regression tests covering websocket mode, fallback behavior, mid-stream errors, and transport parity.
  • Chores

    • Added websockets dependency and validation for websocket_mode.
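The boolean validation mentioned in the last bullet might look like the following minimal sketch; the function name and error-message format here are assumptions for illustration, not the actual `config_validator.py` API.

```python
def validate_backend_config(cfg: dict) -> list[str]:
    """Illustrative sketch: reject non-boolean websocket_mode values.

    The PR adds boolean validation for websocket_mode; this hypothetical
    helper shows the shape of such a check, returning a list of errors.
    """
    errors = []
    if "websocket_mode" in cfg and not isinstance(cfg["websocket_mode"], bool):
        errors.append("websocket_mode must be a boolean")
    return errors
```

A config like `websocket_mode: "yes"` would then fail validation instead of being silently treated as truthy.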


coderabbitai bot commented Mar 10, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Adds WebSocket-mode support for the Responses backend: a new persistent WebSocket transport, websocket-aware streaming in ResponseBackend, websocket_mode config/validation and API-parameter exclusion, new WebSocket configs, extensive tests, and a small formatter tweak preserving messages with an "id".

Changes

Cohort / File(s) Summary
WebSocket transport & backend logic
massgen/backend/_websocket_transport.py, massgen/backend/response.py
New WebSocketResponseTransport and WebSocketConnectionError; ResponseBackend extended to optionally use WebSocket transport, added _WSEvent adapter, websocket-aware stream selection, fallback-to-HTTP, connection lifecycle and mid-stream error handling.
API params & config validation
massgen/api_params_handler/_api_params_handler_base.py, massgen/api_params_handler/_response_api_params_handler.py, massgen/backend/base.py, massgen/config_validator.py
Added websocket_mode to exclusion lists and to backend config boolean validation; Response API params builder now excludes websocket_mode, base_url, organization and conditions stream behavior on websocket_mode.
Configs / manifests
pyproject.toml, massgen/configs/providers/openai/*_websocket.yaml
Added websockets>=14.0 dependency and three WebSocket-mode YAML provider configs for different models.
Tests
massgen/tests/test_websocket_mode.py
Large new test suite covering websocket-mode validation, param handling, transport behavior, ResponseBackend WebSocket/HTTP selection and fallback, mid-stream disconnects, and event wrapping.
Formatting behavior
massgen/formatter/_response_formatter.py
Preserve messages that include an "id" by appending them unchanged to converted_messages, skipping reconstruction logic.
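The formatter tweak in the last row can be sketched as follows; the function and field names are illustrative only, not the actual `_response_formatter.py` code.

```python
def convert_messages(messages: list[dict]) -> list[dict]:
    """Illustrative sketch of the id-preservation behavior described above.

    Messages that already carry an "id" are appended to the converted list
    unchanged; everything else goes through normal reconstruction (here
    reduced to a trivial role/content rebuild for demonstration).
    """
    converted = []
    for msg in messages:
        if "id" in msg:
            # Preserve as-is: reconstruction would drop the identifier
            converted.append(msg)
            continue
        converted.append({"role": msg.get("role", "user"),
                          "content": msg.get("content", "")})
    return converted
```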

Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant ResponseBackend
    participant WS_Transport as WebSocketResponseTransport
    participant OpenAI_WS as OpenAI WebSocket Server

    Client->>ResponseBackend: request.create (websocket_mode=true, api_params)
    ResponseBackend->>WS_Transport: connect()
    WS_Transport->>OpenAI_WS: WebSocket handshake (Authorization, org)
    OpenAI_WS-->>WS_Transport: connected
    ResponseBackend->>WS_Transport: send response.create event (payload)
    WS_Transport->>OpenAI_WS: send event
    OpenAI_WS-->>WS_Transport: event stream (partial/complete/error)
    WS_Transport-->>ResponseBackend: yield parsed events (_WSEvent)
    ResponseBackend-->>Client: SDK stream chunks (converted from events)
    Note over WS_Transport,ResponseBackend: on transport failure -> ResponseBackend falls back to HTTP streaming
```
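The HTTP fallback noted at the bottom of the diagram can be sketched roughly like this; all names here (transport methods, the stream callable) are assumptions for illustration, not the real ResponseBackend wiring.

```python
import asyncio


class WebSocketConnectionError(Exception):
    """Raised when the WebSocket transport cannot connect or drops."""


async def stream_events(api_params, ws_transport, http_stream):
    """Hedged sketch: try WebSocket first, fall back to HTTP streaming.

    ws_transport is assumed to expose connect() and send_and_receive();
    http_stream is assumed to be an async generator of SSE-style chunks.
    """
    try:
        await ws_transport.connect()
    except WebSocketConnectionError:
        # WebSocket unavailable: fall back to plain HTTP streaming
        async for chunk in http_stream(api_params):
            yield chunk
        return
    async for event in ws_transport.send_and_receive(api_params):
        yield event
```

The caller consumes a single async stream either way, so the fallback is invisible above this layer.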

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related issues

Possibly related PRs

Suggested reviewers

  • ncrispino
  • a5507203
🚥 Pre-merge checks | ✅ 3 | ❌ 3

❌ Failed checks (2 warnings, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 40.00%, which is insufficient; the required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Documentation Updated | ⚠️ Warning | The pull request introduces WebSocket mode for the Response API but lacks critical user-facing documentation, design documents, and YAML parameter documentation required for new features. | Add documentation to yaml_schema.rst, create a user guide explaining WebSocket mode, add a design document in docs/dev_notes/, and update backends.rst with WebSocket capabilities. |
| Description check | ❓ Inconclusive | The description covers the type of change and includes a completed checklist, but lacks detail: missing How to Test section, incomplete pre-commit status, and no explanation of what the feature does or why it was added. | Expand the description with details on what WebSocket mode enables and why it was needed, and provide actual test commands and expected results in the How to Test section. |
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly describes the main feature: adding WebSocket mode support for the responses API, matching the primary objective of the changeset. |
| Capabilities Registry Check | ✅ Passed | WebSocket mode is a transport-layer capability addition using existing models, not a backend or model change, so custom check instructions do not apply. |
| Config Parameter Sync | ✅ Passed | Both required files have been properly updated with the new websocket_mode parameter in their respective exclusion sets with consistent comments. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch Web_socket_mode
📝 Coding Plan
  • Generate coding plan for human review comments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.


Comment @coderabbitai help to get the list of available commands and usage tips.

@praneeth999 praneeth999 changed the title feat(api): add websocket_mode for responses api feat: add websocket_mode for responses api Mar 10, 2026

@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (1)
massgen/api_params_handler/_response_api_params_handler.py (1)

74-77: Consider aligning ChatCompletionsAPIParamsHandler with websocket_mode handling.

Only ResponseAPIParamsHandler implements the websocket_mode gate on streaming. The ChatCompletionsAPIParamsHandler (per context snippet 3) unconditionally sets stream: True. If WebSocket mode should apply to both backends, the sibling handler may need similar logic.
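A hedged sketch of what that alignment could look like; the function and parameter names below are assumed for illustration, not taken from the actual ChatCompletionsAPIParamsHandler.

```python
def build_chat_params(all_params: dict, messages: list) -> dict:
    """Illustrative sketch of the suggested gate.

    Mirrors the ResponseAPIParamsHandler behavior: only enable HTTP
    streaming when websocket_mode is off, so the WebSocket transport
    owns streaming when that mode is active.
    """
    websocket_mode = all_params.get("websocket_mode", False)
    api_params = {"messages": messages}
    if not websocket_mode:
        api_params["stream"] = True
    return api_params
```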

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@massgen/api_params_handler/_response_api_params_handler.py` around lines 74 -
77, The ChatCompletionsAPIParamsHandler unconditionally sets
api_params["stream"] = True while ResponseAPIParamsHandler gates streaming with
websocket_mode; update ChatCompletionsAPIParamsHandler to read websocket_mode =
all_params.get("websocket_mode", False) (same as in ResponseAPIParamsHandler)
and only set api_params["stream"] = True when websocket_mode is False so
WebSocket mode disables backend streaming consistently; look for the
websocket_mode variable usage and the api_params["stream"] assignment in
ChatCompletionsAPIParamsHandler to apply the change.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 03601812-a7a7-4b93-a471-258aafbbe7cd

📥 Commits

Reviewing files that changed from the base of the PR and between 68dc8df and 3d6824c.

📒 Files selected for processing (5)
  • massgen/api_params_handler/_api_params_handler_base.py
  • massgen/api_params_handler/_response_api_params_handler.py
  • massgen/backend/base.py
  • massgen/config_validator.py
  • pyproject.toml


@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (5)
massgen/backend/_websocket_transport.py (3)

96-137: Add return type hint and handle unexpected disconnection.

The send_and_receive method is missing a return type annotation. Additionally, if the WebSocket connection drops unexpectedly during iteration (line 116), the exception will propagate unhandled. Consider adding explicit handling for connection loss.

♻️ Suggested improvements

```diff
+from collections.abc import AsyncGenerator
+from websockets.exceptions import ConnectionClosed
+
     async def send_and_receive(
         self,
         api_params: dict[str, Any],
-    ):
+    ) -> AsyncGenerator[dict[str, Any], None]:
         """Send a response.create event and yield parsed response events.

         Args:
             api_params: The API params dict (same as HTTP body, minus stream/background).

         Yields:
             Parsed event dicts with a "type" field matching the HTTP SSE event types
             (e.g. "response.output_text.delta", "response.completed").
+
+        Raises:
+            WebSocketConnectionError: If not connected or connection lost during receive.
         """
         if self._ws is None:
             raise WebSocketConnectionError("Not connected. Call connect() first.")

         message = self._build_response_create_event(api_params)
         await self._ws.send(message)
         logger.debug("[WebSocket] Sent response.create event")

-        async for raw_message in self._ws:
+        try:
+            async for raw_message in self._ws:
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@massgen/backend/_websocket_transport.py` around lines 96 - 137, Add an
explicit async return type to send_and_receive (e.g., AsyncIterator[dict[str,
Any]] or AsyncGenerator[dict[str, Any], None]) and wrap the async iteration over
self._ws in a try/except that catches connection-related exceptions (e.g.,
websockets ConnectionClosed, ConnectionResetError, asyncio.CancelledError or a
broad Exception as a fallback) and re-raises them as WebSocketConnectionError
with a clear message; update any necessary typing imports
(typing.AsyncIterator/AsyncGenerator) and reference send_and_receive, self._ws,
and WebSocketConnectionError when making the change.

41-50: Add docstring for __init__ method.

Per coding guidelines, new functions should include Google-style docstrings.

📝 Suggested docstring

```diff
     def __init__(
         self,
         api_key: str,
         url: str = DEFAULT_WS_URL,
         organization: str | None = None,
     ):
+        """Initialize the WebSocket transport.
+
+        Args:
+            api_key: OpenAI API key for authentication.
+            url: WebSocket endpoint URL.
+            organization: Optional OpenAI organization ID.
+        """
         self.api_key = api_key
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@massgen/backend/_websocket_transport.py` around lines 41 - 50, Add a
Google-style docstring to the __init__ method of the WebSocket transport class
(the __init__ that accepts api_key, url, and organization and sets self._ws),
documenting the purpose of the constructor, each parameter (api_key: str, url:
str = DEFAULT_WS_URL, organization: Optional[str]) and the attributes
initialized (self.api_key, self.url, self.organization, self._ws), and include
types and brief behavior notes (e.g., default URL and that _ws is initialized to
None).

83-83: Consider catching more specific exceptions for connection retries.

The blind except Exception catches all exceptions including KeyboardInterrupt and SystemExit. For WebSocket connection attempts, catching websockets library exceptions and standard connection errors would be more appropriate.

♻️ Suggested improvement

```diff
+from websockets.exceptions import WebSocketException
+
-            except Exception as e:
+            except (OSError, WebSocketException) as e:
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@massgen/backend/_websocket_transport.py` at line 83, The catch-all "except
Exception as e:" in the connection/retry logic should be replaced with specific
exception types so we don't swallow critical signals like
KeyboardInterrupt/SystemExit; update the except block in the websocket
connection routine (the block containing "except Exception as e:") to catch
websocket and network-related errors only (for example
websockets.exceptions.ConnectionClosedError,
websockets.exceptions.InvalidHandshake/InvalidURI, asyncio.TimeoutError,
OSError) and handle/retry those, while letting other exceptions propagate.
Ensure the variable "e" is preserved in the new except clauses for logging and
that any generic cleanup still runs outside these specific exception handlers.
massgen/configs/providers/openai/gpt5_2_websocket.yaml (1)

1-13: Add "What happens" comment explaining execution flow.

Per coding guidelines, YAML configs should include comments explaining execution flow to help users understand the configuration behavior.

📝 Suggested improvement

```diff
 # GPT-5.2 with WebSocket Mode
 # Single agent using the latest GPT-5.2 model over persistent WebSocket.
+#
+# What happens:
+# 1. A persistent WebSocket connection is established to OpenAI's responses API
+# 2. All requests/responses flow over this single connection (lower latency)
+# 3. Web search and code interpreter tools are available for the agent
 agents:
   - id: "gpt-5-2-ws"
```
As per coding guidelines: "Include 'What happens' comments explaining execution flow"

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@massgen/configs/providers/openai/gpt5_2_websocket.yaml` around lines 1 - 13,
Add a "What happens" comment at the top of the YAML describing execution flow
for the agent with id "gpt-5-2-ws": explain that the backend type "openai" will
use model "gpt-5.2" in websocket_mode (persistent connection), note that
enable_web_search and enable_code_interpreter toggle browsing and code execution
capabilities, and state how UI settings (display_type "textual_terminal" and
logging_enabled true) affect runtime output and logs; keep the comment concise
and placed above the agents block so users immediately see the behavior.
massgen/configs/providers/openai/multi_model_websocket.yaml (1)

1-18: Consider adding orchestrator and system_message for multi-agent setup.

This config defines two agents without an orchestrator section or system_message fields. Per coding guidelines, multi-agent setups should have identical system messages for all agents, and coordination behavior should be explicit.

For a minimal example config this may be acceptable, but users might be confused about how the agents coordinate.

📝 Suggested additions for clarity

```diff
 # Multi-Model WebSocket Mode
 # Two agents with different models, both using WebSocket transport.
+#
+# What happens:
+# 1. Both agents establish persistent WebSocket connections
+# 2. Default voting coordination is used to reconcile agent outputs
 agents:
   - id: "gpt-5-2"
+    system_message: "You are a helpful AI assistant."
     backend:
       type: "openai"
       model: "gpt-5.2"
       websocket_mode: true
       enable_code_interpreter: true
   - id: "gpt-5-nano"
+    system_message: "You are a helpful AI assistant."
     backend:
       type: "openai"
       model: "gpt-5-nano"
       websocket_mode: true
       enable_code_interpreter: true
+orchestrator:
+  voting_sensitivity: "balanced"
 ui:
   display_type: "textual_terminal"
   logging_enabled: true
```

As per coding guidelines: "All agents should have identical system_message for multi-agent setups"

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@massgen/configs/providers/openai/multi_model_websocket.yaml` around lines 1 -
18, Add an explicit orchestrator section and identical system_message fields for
both agents to clarify coordination: update the config to include an
"orchestrator" entry describing coordination mode (e.g., turn-taking or
leader-election) and add the same "system_message" text to the agent blocks for
"gpt-5-2" and "gpt-5-nano" so both agents share identical system instructions
and the orchestration behavior is explicit.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 47e0b97a-50c3-4682-816b-cc47a51e3cfa

📥 Commits

Reviewing files that changed from the base of the PR and between 3d6824c and 66ca5db.

📒 Files selected for processing (9)
  • massgen/api_params_handler/_api_params_handler_base.py
  • massgen/api_params_handler/_response_api_params_handler.py
  • massgen/backend/_websocket_transport.py
  • massgen/backend/base.py
  • massgen/config_validator.py
  • massgen/configs/providers/openai/gpt5_2_websocket.yaml
  • massgen/configs/providers/openai/gpt5_nano_websocket.yaml
  • massgen/configs/providers/openai/multi_model_websocket.yaml
  • pyproject.toml
🚧 Files skipped from review as they are similar to previous changes (3)
  • massgen/api_params_handler/_response_api_params_handler.py
  • pyproject.toml
  • massgen/backend/base.py


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (2)
massgen/backend/_websocket_transport.py (1)

96-137: Add return type hint for the async generator.

The method is missing a return type annotation. For async generators, use AsyncGenerator from typing or collections.abc.

📝 Suggested improvement

```diff
+from collections.abc import AsyncGenerator
+
 ...

     async def send_and_receive(
         self,
         api_params: dict[str, Any],
-    ):
+    ) -> AsyncGenerator[dict[str, Any], None]:
         """Send a response.create event and yield parsed response events.
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@massgen/backend/_websocket_transport.py` around lines 96 - 137, The async
generator method send_and_receive lacks a return type annotation; update its
signature to return an AsyncGenerator of event dicts (e.g.
AsyncGenerator[dict[str, Any], None]) and add the corresponding import (from
typing or collections.abc import AsyncGenerator) at the top of the module;
ensure the symbol names to edit are send_and_receive and the module imports so
tooling and type checkers recognize the async generator return type.
massgen/configs/providers/openai/multi_model_websocket.yaml (1)

1-18: Consider adding a "What happens" comment explaining the execution flow.

Per coding guidelines for YAML configs, include comments explaining what happens when this configuration is used. This helps users understand the WebSocket transport behavior.

📝 Suggested improvement

```diff
 # Multi-Model WebSocket Mode
 # Two agents with different models, both using WebSocket transport.
+#
+# What happens:
+# - Both agents use persistent WebSocket connections instead of HTTP streaming
+# - The transport layer handles connection lifecycle and message streaming
+# - Code interpreter is enabled for both agents
 agents:
   - id: "gpt-5-2"
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@massgen/configs/providers/openai/multi_model_websocket.yaml` around lines 1 -
18, Add a short "What happens" comment at the top of this YAML (above
"Multi-Model WebSocket Mode") describing the execution flow: that the two agents
listed under agents (ids "gpt-5-2" and "gpt-5-nano") will run concurrently using
OpenAI backends with websocket_mode: true so their messages are streamed over
WebSocket, that enable_code_interpreter: true enables the code-execution
extension for each agent, and that ui.display_type: "textual_terminal" with
logging_enabled: true will present streamed output in the terminal and record
logs; place this explanatory comment near the file header so users immediately
see the runtime behavior.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: aadbdc46-c52e-4aa2-a8ae-eda50b4f5acd

📥 Commits

Reviewing files that changed from the base of the PR and between 3e56d15 and 7026172.

📒 Files selected for processing (11)
  • massgen/api_params_handler/_api_params_handler_base.py
  • massgen/api_params_handler/_response_api_params_handler.py
  • massgen/backend/_websocket_transport.py
  • massgen/backend/base.py
  • massgen/config_validator.py
  • massgen/configs/providers/openai/gpt5_2_websocket.yaml
  • massgen/configs/providers/openai/gpt5_nano_websocket.yaml
  • massgen/configs/providers/openai/multi_model_websocket.yaml
  • massgen/formatter/_response_formatter.py
  • massgen/tests/test_websocket_mode.py
  • pyproject.toml
🚧 Files skipped from review as they are similar to previous changes (7)
  • massgen/config_validator.py
  • massgen/configs/providers/openai/gpt5_2_websocket.yaml
  • massgen/backend/base.py
  • pyproject.toml
  • massgen/configs/providers/openai/gpt5_nano_websocket.yaml
  • massgen/tests/test_websocket_mode.py
  • massgen/api_params_handler/_api_params_handler_base.py

Comment on lines +77 to +82

```python
websocket_mode = all_params.get("websocket_mode", False)  # In WebSocket mode, stream/background are not used (transport handles streaming)
api_params = {"input": converted_messages}
if not websocket_mode:
    api_params["stream"] = True
else:
    all_params.pop("background", None)
```

⚠️ Potential issue | 🟡 Minor

Mutating all_params may cause unintended side effects.

The all_params.pop("background", None) call modifies the input dictionary in place. If the caller reuses all_params after this call, the background key will be unexpectedly missing. Consider operating on a copy or documenting this mutation.

🛡️ Suggested fix to avoid mutation

```diff
         websocket_mode = all_params.get("websocket_mode", False)  # In WebSocket mode, stream/background are not used (transport handles streaming)
         api_params = {"input": converted_messages}
         if not websocket_mode:
             api_params["stream"] = True
         else:
-            all_params.pop("background", None)
+            # Don't include 'background' in api_params for WebSocket mode
+            # (handled below via excluded params or explicit skip)
```

Alternatively, if background must be removed to prevent it from being added later in the loop, ensure this mutation is documented in the docstring.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@massgen/api_params_handler/_response_api_params_handler.py` around lines 77 -
82, The code mutates the input dict by calling all_params.pop("background",
None), which can cause unexpected side effects; change the logic to operate on a
shallow copy of all_params (e.g., work with a new dict variable before
modifying) or explicitly build a new params dict instead of mutating all_params
so websocket_mode, api_params, and the rest of the flow remain the same; update
the code paths that reference websocket_mode, api_params, and any later use of
all_params to use the new copy or constructed dict, or document the mutation in
the function docstring if mutation is intentional.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
massgen/backend/response.py (1)

40-81: Add Google-style docstrings to the new websocket helpers.

_WSEvent’s new helper methods plus _create_response_stream(), _ws_event_stream(), and _create_client() still use minimal or non-Google docstrings. Please add Google-style Args/Returns/Yields sections so the new websocket surface matches the repo standard.

As per coding guidelines, **/*.py: For new or changed functions, include Google-style docstrings

Also applies to: 1724-1744

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@massgen/backend/response.py` around lines 40 - 81, Add Google-style
docstrings to the websocket helper classes and functions so they match the repo
standard: for class _WSEvent add a class docstring and per-method docstrings
(including __init__, _wrap, __getattr__, __getitem__, get, model_dump, etc.)
with Args and Returns sections where relevant; for the functions
_create_response_stream, _ws_event_stream, and _create_client add docstrings
that describe purpose, include Args, Returns and Yields (for generators/streams)
as appropriate and mention any raised exceptions; ensure the text is concise and
follows the existing Google-style format used elsewhere in the codebase.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@massgen/backend/response.py`:
- Around line 565-584: The code uses a single current_function_call slot which
gets overwritten when multiple concurrent function calls stream; replace
current_function_call with a dict (e.g., in_flight_calls) keyed by the unique
item id/call id (use getattr(chunk.item, "call_id") or chunk.item.item_id if
present) and update the three handlers: in response.output_item.added create an
entry in in_flight_calls[item_id] with name/arguments/call_id, in
response.function_call_arguments.delta append deltas to
in_flight_calls[item_id]["arguments"] only when the chunk's item_id matches an
existing key, and in response.output_item.done move in_flight_calls[item_id]
into captured_function_calls and delete that key; ensure all references to
current_function_call are replaced accordingly so concurrent streams are tracked
independently.

---

Nitpick comments:
In `@massgen/backend/response.py`:
- Around line 40-81: Add Google-style docstrings to the websocket helper classes
and functions so they match the repo standard: for class _WSEvent add a class
docstring and per-method docstrings (including __init__, _wrap, __getattr__,
__getitem__, get, model_dump, etc.) with Args and Returns sections where
relevant; for the functions _create_response_stream, _ws_event_stream, and
_create_client add docstrings that describe purpose, include Args, Returns and
Yields (for generators/streams) as appropriate and mention any raised
exceptions; ensure the text is concise and follows the existing Google-style
format used elsewhere in the codebase.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 7d86f495-c2d8-4226-a339-483f92b80251

📥 Commits

Reviewing files that changed from the base of the PR and between 7026172 and 879091c.

📒 Files selected for processing (1)
  • massgen/backend/response.py

Comment on lines +565 to +584
```python
# Detect function call start
if chunk.type == "response.output_item.added" and hasattr(chunk, "item") and chunk.item and getattr(chunk.item, "type", None) == "function_call":
    current_function_call = {
        "call_id": getattr(chunk.item, "call_id", ""),
        "name": getattr(chunk.item, "name", ""),
        "arguments": "",
    }
    logger.info(
        f"Function call detected: {current_function_call['name']}",
    )

# Accumulate function arguments
elif chunk.type == "response.function_call_arguments.delta" and current_function_call is not None:
    delta = getattr(chunk, "delta", "")
    current_function_call["arguments"] += delta

# Function call completed
elif chunk.type == "response.output_item.done" and current_function_call is not None:
    captured_function_calls.append(current_function_call)
    current_function_call = None

# Handle other streaming events (reasoning, provider tools, etc.)
else:
    result = self._process_stream_chunk(chunk, agent_id)
    yield result

    # Response completed
    if chunk.type in ["response.completed", "response.incomplete"]:
        response_completed = True
        # Note: Usage tracking is handled in _process_stream_chunk() above
        # Capture response ID and ALL output items for reasoning continuity
        if hasattr(chunk, "response") and chunk.response:
            response_id = getattr(chunk.response, "id", None)
            if response_id:
                logger.debug(f"Captured response ID for reasoning continuity: {response_id}")
            # CRITICAL: Capture ALL output items (reasoning, function_call, message)
            # These must be included in the next request for reasoning models
            output = getattr(chunk.response, "output", [])
            if output:
                for item in output:
                    # Convert to dict format for the API
                    item_dict = self._convert_to_dict(item) if hasattr(item, "model_dump") or hasattr(item, "dict") else item
                    if isinstance(item_dict, dict):
                        response_output_items.append(item_dict)
                logger.debug(f"Captured {len(response_output_items)} output items for reasoning continuity")
        if captured_function_calls:
            # Execute captured function calls and recurse
            self.end_api_call_timing(success=True)
            break  # Exit chunk loop to execute functions
```

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cat -n massgen/backend/response.py | head -600 | tail -100

Repository: massgen/MassGen

Length of output: 5628


🏁 Script executed:

cat -n massgen/backend/response.py | sed -n '520,600p'

Repository: massgen/MassGen

Length of output: 4564


🏁 Script executed:

rg -n "output_index|item_id" massgen/

Repository: massgen/MassGen

Length of output: 30636


🏁 Script executed:

rg -n "function_call|function call" massgen/backend/response.py -A 2 -B 2

Repository: massgen/MassGen

Length of output: 7726


🏁 Script executed:

sed -n '620,640p' massgen/backend/response.py

Repository: massgen/MassGen

Length of output: 1499


🏁 Script executed:

rg -n "captured_function_calls|response_output_items" massgen/backend/response.py | head -20

Repository: massgen/MassGen

Length of output: 1340


🏁 Script executed:

fd -e py -path "*/test*" | xargs rg -l "function_call.*function_call\|multi.*tool\|multiple.*call" 2>/dev/null || echo "No matches found"

Repository: massgen/MassGen

Length of output: 261


🏁 Script executed:

sed -n '657,680p' massgen/backend/response.py

Repository: massgen/MassGen

Length of output: 1305


🏁 Script executed:

sed -n '576,585p' massgen/backend/response.py

Repository: massgen/MassGen

Length of output: 642


🏁 Script executed:

sed -n '576,600p' massgen/backend/response.py

Repository: massgen/MassGen

Length of output: 1464


🏁 Script executed:

sed -n '1077,1086p' massgen/backend/docs/Function\ calling\ openai\ responses.md

Repository: massgen/MassGen

Length of output: 1586


🏁 Script executed:

grep -A 10 "response.function_call_arguments.delta" massgen/backend/docs/Function\ calling\ openai\ responses.md | head -20

Repository: massgen/MassGen

Length of output: 1932


Track streamed tool calls by item_id.

The Responses API can emit multiple function calls in a single response, with each concurrent call receiving deltas keyed by item_id. The current implementation uses a single current_function_call slot that gets overwritten when a second response.output_item.added arrives before the first call's arguments are finalized. The delta handler at lines 577-579 checks only current_function_call is not None without verifying the item_id matches, causing argument streams for earlier calls to accumulate into later calls' slots and resulting in corrupted or dropped tool calls. Use a dict keyed by item_id to track in-flight calls separately.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@massgen/backend/response.py` around lines 565 - 584, The code uses a single
current_function_call slot which gets overwritten when multiple concurrent
function calls stream; replace current_function_call with a dict (e.g.,
in_flight_calls) keyed by the unique item id/call id (use getattr(chunk.item,
"call_id") or chunk.item.item_id if present) and update the three handlers: in
response.output_item.added create an entry in in_flight_calls[item_id] with
name/arguments/call_id, in response.function_call_arguments.delta append deltas
to in_flight_calls[item_id]["arguments"] only when the chunk's item_id matches
an existing key, and in response.output_item.done move in_flight_calls[item_id]
into captured_function_calls and delete that key; ensure all references to
current_function_call are replaced accordingly so concurrent streams are tracked
independently.
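A minimal sketch of the dict-keyed approach the prompt describes, with synthetic chunks standing in for real Responses API events (the loop body and field names mirror the excerpt above; `track_calls` itself is a hypothetical standalone helper, not code from response.py):

```python
from types import SimpleNamespace


def track_calls(chunks):
    """Accumulate streamed function calls keyed by item id.

    Args:
        chunks: Iterable of event-like objects mimicking Responses API
            streaming chunks.

    Returns:
        List of completed call dicts, each with independent arguments.
    """
    in_flight = {}  # item id -> partial call; replaces the single slot
    captured = []
    for chunk in chunks:
        if chunk.type == "response.output_item.added" and getattr(chunk, "item", None) and getattr(chunk.item, "type", None) == "function_call":
            in_flight[chunk.item.id] = {
                "call_id": getattr(chunk.item, "call_id", ""),
                "name": getattr(chunk.item, "name", ""),
                "arguments": "",
            }
        elif chunk.type == "response.function_call_arguments.delta" and getattr(chunk, "item_id", None) in in_flight:
            # Deltas only accumulate into the matching in-flight call
            in_flight[chunk.item_id]["arguments"] += chunk.delta
        elif chunk.type == "response.output_item.done" and getattr(getattr(chunk, "item", None), "id", None) in in_flight:
            captured.append(in_flight.pop(chunk.item.id))
    return captured
```

With interleaved deltas for two concurrent calls, each call's argument string stays intact, which is exactly the property the single-slot version loses.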
