fix(client): harden chat and agent SSE reliability #1740

Open
coconut-yc wants to merge 1 commit into Tencent:main from coconut-yc:contrib/fix-sse-streaming

@coconut-yc coconut-yc commented Jun 20, 2026

Description

Fixes the streaming and response-fidelity defects reported in #1738 against
the main base ae9038732ad2. This PR is intentionally limited to the SDK
readers and the existing CLI/MCP accumulators; it does not change the CLI output
contract.

  • Default timeout no longer severs SSE. Client.doRequestStream excludes the ordinary 30-second default from KnowledgeQAStream, ContinueStream, and AgentQAStreamWithRequest. Stream lifetime is governed by context cancellation. The public WithTimeout contract is preserved: when callers explicitly configure it, the same upper bound still applies to streaming calls.
  • Large SSE lines parse. All three readers raise the bufio.Scanner per-line cap from 64 KiB to 4 MiB, covering reference events of hundreds of KiB while retaining a bounded allocation.
  • Terminal error frames terminate. A response_type=error, done=true frame is delivered to the callback and then returned as an SDK error, even when the server leaves the connection open without complete or EOF.
  • Agent accumulation waits for the run sentinel. AgentAccumulator now terminates only on response_type=complete via the new AgentResponseTypeComplete constant. Intermediate thinking/reflection/answer done:true markers no longer discard later tool events or the final answer.
  • Reference identity survives unmarshal. SearchResult.KnowledgeBaseID, ParentChunkID, and SubChunkID preserve knowledge_base_id, parent_chunk_id, and sub_chunk_id from reference events.

The follow-up agent-output PR, #1741, is stacked on this fix and separately changes the CLI/MCP output contract. It adds bounded defaults, --reference, --verbose, human-readable text, and an explicit raw NDJSON mode.

Scope Boundary

The fix commit contains only:

  • client/client.go, client/agent.go, client/session.go, client/knowledgebase.go, client/streaming_test.go
  • the completion-sentinel hunks in cli/internal/sse/agent_accumulator.go and its test
  • the corresponding complete-event fixtures in cli/cmd/session/ask_test.go and cli/internal/mcp/tools_test.go
  • the streaming fixes under cli/CHANGELOG.md [Unreleased]

It does not contain projected events, buffered envelopes, --reference,
--verbose, or other output-contract changes.

Type of Change

  • 🐛 Bug fix
  • 📚 Documentation update (cli/CHANGELOG.md)
  • 🧪 Test
  • ✨ New feature
  • 💥 Breaking change

Related Issue

Fixes #1738

Testing

New regression coverage verifies:

  • the agent reader accepts a 256 KiB payload and one just below the 4 MiB cap, while an over-cap payload fails cleanly;
  • KnowledgeQAStream and ContinueStream each accept a 256 KiB reference event;
  • the ordinary default timeout does not cut SSE, while an explicit WithTimeout remains enforced;
  • terminal agent and knowledge-stream error frames return promptly even if the HTTP connection stays open;
  • knowledge_base_id, parent_chunk_id, and sub_chunk_id survive JSON decoding;
  • AgentAccumulator ignores intermediate done:true and finalizes on response_type=complete.

Validated locally:

(cd client && go test ./... && go vet ./...)
(cd cli && go test ./... && go vet ./...)
git diff --check

Checklist

  • Relevant Go files are formatted and git diff --check passes
  • Client and CLI tests/vet pass locally
  • Self-reviewed the code
  • Added regression tests for the changed behavior
  • Updated the unreleased changelog
  • No breaking output change; explicit WithTimeout behavior remains compatible

Screenshots / Recordings

N/A — SDK / CLI internals only.

coconut-yc force-pushed the contrib/fix-sse-streaming branch from 6ff0f7b to 1898dd9 on June 22, 2026 01:06
Development

Successfully merging this pull request may close these issues.

[Bug]: Go SDK SSE streams time out, truncate large events, and return incomplete agent results