
refactor(ws): reuse marshaled payload JSON for write and prewarm#1642

Open
penwyp wants to merge 3 commits into Wei-Shaw:main from penwyp:feat/ws

Conversation


@penwyp penwyp commented Apr 14, 2026

Summary

forwardOpenAIWSV2 can end up marshaling the same request payload twice: once to resolve payload_bytes for WS schema logging and again when the payload is written to the websocket. performOpenAIWSGeneratePrewarm had the same pattern: it built JSON bytes for logging but still passed the map to WriteJSONWithContextTimeout, forcing another marshal on write.

This PR narrows the WS optimization work to one focused change: marshal the payload once, reuse the same JSON bytes for the actual WS write/prewarm write, and fail closed if marshaling fails.

Changes

  • backend/internal/service/openai_ws_forwarder.go
    • cache the marshaled request payload in forwardOpenAIWSV2 and reuse it for websocket writes via json.RawMessage
    • marshal the prewarm payload once in performOpenAIWSGeneratePrewarm, log the byte size, and write the same JSON upstream
    • return explicit fallback errors when write/prewarm marshaling fails instead of attempting to write a partial request
  • backend/internal/service/openai_ws_forwarder_success_test.go
    • add regression coverage that the forwarded WS payload JSON is equivalent to direct json.Marshal(payload)
    • add fail-closed coverage for write-path marshal failures
    • add regression coverage that the prewarm payload JSON is equivalent to direct json.Marshal(prewarmPayload)
    • add fail-closed coverage for prewarm marshal failures
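The wire-equivalence regression tests can be sketched roughly as follows (a hypothetical standalone check, not the actual test file): capture the bytes handed to the writer and compare them byte-for-byte against a direct `json.Marshal` of the same payload.

```go
package main

import (
	"bytes"
	"encoding/json"
)

// wireEquivalent captures what the (stubbed) websocket writer receives
// and verifies it equals a direct json.Marshal of the same payload.
func wireEquivalent(payload map[string]any) bool {
	var written []byte
	write := func(m json.RawMessage) error {
		written = append([]byte(nil), m...) // copy the bytes that hit the wire
		return nil
	}

	raw, err := json.Marshal(payload)
	if err != nil {
		return false // fail closed, mirroring the write path
	}
	if err := write(json.RawMessage(raw)); err != nil {
		return false
	}

	want, _ := json.Marshal(payload)
	return bytes.Equal(written, want)
}
```

Because `json.Marshal` emits map keys in sorted order, the comparison is deterministic for map-shaped payloads.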

Test plan

  • cd backend && go test ./internal/service/...
  • cd backend && go test ./...

penwyp added 3 commits April 14, 2026 15:37
The branch originally mixed JSON reuse on the WS write path with
payload summary micro-optimizations, benchmark churn, and a repo-local
ignore rule. This follow-up narrows the PR to one reviewable theme:
reuse the marshaled payload JSON for WS write and prewarm paths while
keeping the focused regression coverage that proves wire equivalence and
fail-closed behavior.

Constraint: The upstream repo favors small, focused PRs with explicit validation
Rejected: Keep payload summary micro-optimizations | broadened the PR without proving value for this submission
Rejected: Keep .omx ignore change | unrelated to gateway behavior
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep future WS perf PRs scoped to one hot path unless the same benchmark justifies bundled changes
Tested: /Users/penwyp/.local/share/mise/installs/go/1.26.2/bin/go test ./internal/service/...
Tested: /Users/penwyp/.local/share/mise/installs/go/1.26.2/bin/go test ./...
Not-tested: Benchmark delta on production-like websocket traffic
@penwyp penwyp changed the title refactor(ws): optimize OpenAI WebSocket payload handling and size est… refactor(ws): reuse marshaled payload JSON for write and prewarm Apr 14, 2026
