perf: add DirectEncoder to eliminate nested MarshalJSON allocation chain #7
Merged
johnstcn merged 9 commits into coder_2_33 on Apr 15, 2026
…isition Benchmarks MarshalJSON on MessageNewParams with 1, 10, 100, and 1000 message pairs to quantify per-call allocation amplification from the nested MarshalJSON → Marshal → copy chain. Current results show ~8x byte amplification over output JSON size, which compounds quadratically when chatloop re-serializes growing conversation history on every agentic step.
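To make the quadratic compounding concrete, here is a back-of-the-envelope sketch. It assumes each agentic step appends one fixed-size message and re-serializes the whole history. The function name, the 1600-byte message size and the flat 8x factor are illustrative assumptions, not measurements from chatloop.

```go
package main

import "fmt"

// totalSerializedBytes estimates the bytes allocated across a whole run when
// the history grows by one message of msgBytes per step and is re-serialized
// in full on every step. amplification is the assumed ratio of allocated
// bytes to JSON output bytes for a single call.
func totalSerializedBytes(steps, msgBytes, amplification int64) int64 {
	var total int64
	for step := int64(1); step <= steps; step++ {
		historyBytes := step * msgBytes // JSON size of the history at this step
		total += historyBytes * amplification
	}
	return total
}

func main() {
	// Hypothetical inputs: 1600-byte messages, 8x amplification per call.
	for _, steps := range []int64{1, 10, 100, 1000} {
		fmt.Printf("steps=%d total=%d bytes\n", steps, totalSerializedBytes(steps, 1600, 8))
	}
}
```

The sum is amplification × msgBytes × steps × (steps + 1) / 2, so halving the per-call amplification halves the total, while the quadratic growth in the number of steps remains.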
These tests need either a Stainless mock server on localhost:4010 or VCR cassettes recorded against api.anthropic.com that don't replay correctly after fork changes. Skip them so we have a green baseline for the DirectEncoder work.
Adds a DirectEncoder interface to the shimmed encoding/json package. When a type implements EncodeDirect() (any, bool), the encoder writes the returned value directly into the parent buffer instead of going through MarshalJSON → Marshal → []byte copy → Buffer.Write.

Implements EncodeDirect on 9 hot-path SDK param types: MessageNewParams, MessageParam, TextBlockParam, ToolUseBlockParam, ToolResultBlockParam, ImageBlockParam, Base64ImageSourceParam, ThinkingBlockParam, RedactedThinkingBlockParam.

Benchmark results (pairs=1000, 2001 messages, 3.2MB JSON):
Before: ~28 MB/op, 49035 allocs/op
After: ~20 MB/op, 56036 allocs/op (-30% bytes, +14% allocs)

The byte reduction matters more than the alloc count increase: the OOM is driven by buffer sizing, not alloc count. The alloc increase comes from interface conversions in the EncodeDirect check path, which is a known tradeoff.
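The dispatch order described above can be sketched with a toy encoder. Only the EncodeDirect() (any, bool) method shape comes from this PR. TextBlock, textWire, list, encode and encodeToString are hypothetical names, and the toy hands leaf values to the standard encoding/json package, so it shows the order of checks rather than the allocation savings of the shimmed encoder.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// DirectEncoder has the method shape described in the PR: it returns a value
// to encode in place of the receiver, plus whether the fast path applies.
type DirectEncoder interface {
	EncodeDirect() (any, bool)
}

// list is a toy container that the toy encoder walks element by element.
type list []any

// TextBlock is a hypothetical stand-in for an SDK param type.
type TextBlock struct{ Text string }

// textWire is a plain struct with no custom methods carrying the JSON shape.
type textWire struct {
	Text string `json:"text"`
	Type string `json:"type"`
}

// MarshalJSON is the slow path: it builds a standalone []byte that the
// caller then has to copy into its own buffer.
func (b TextBlock) MarshalJSON() ([]byte, error) {
	return json.Marshal(textWire{Text: b.Text, Type: "text"})
}

// EncodeDirect is the fast path: it hands back the plain value so the
// encoder can keep writing into the buffer it already holds.
func (b TextBlock) EncodeDirect() (any, bool) {
	return textWire{Text: b.Text, Type: "text"}, true
}

// encode appends the JSON for v to buf, checking DirectEncoder first.
func encode(buf *bytes.Buffer, v any) error {
	if d, ok := v.(DirectEncoder); ok {
		if inner, ok := d.EncodeDirect(); ok {
			return encode(buf, inner)
		}
	}
	if l, ok := v.(list); ok {
		buf.WriteByte('[')
		for i, item := range l {
			if i > 0 {
				buf.WriteByte(',')
			}
			if err := encode(buf, item); err != nil {
				return err
			}
		}
		buf.WriteByte(']')
		return nil
	}
	// Leaf values: delegate to the standard library in this toy version.
	b, err := json.Marshal(v)
	if err != nil {
		return err
	}
	buf.Write(b)
	return nil
}

// encodeToString is a small convenience wrapper around encode.
func encodeToString(v any) (string, error) {
	var buf bytes.Buffer
	err := encode(&buf, v)
	return buf.String(), err
}

func main() {
	out, err := encodeToString(list{TextBlock{Text: "hi"}, TextBlock{Text: "yo"}})
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

Returning false from EncodeDirect leaves the value on the existing MarshalJSON path, which is why the second return value is part of the method shape.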
param.IsNull is a generic function that does any(v).(ParamNullable), which allocates on every call for value types. Adding an exported IsNull() method on metadata lets EncodeDirect call r.IsNull() directly, eliminating ~6000 extra allocs at the 1000-pair benchmark level.

Before: 56036 allocs/op (pairs=1000)
After: 50035 allocs/op (pairs=1000), back in line with baseline
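The two call shapes can be compared side by side in a minimal sketch. The names nullable, isNullGeneric, metadata and TextParam are hypothetical stand-ins rather than the SDK's definitions, and whether the interface conversion actually allocates depends on the compiler's escape analysis and the size of the value.

```go
package main

import "fmt"

// nullable is a hypothetical stand-in for the interface the generic helper
// asserts against.
type nullable interface{ IsNull() bool }

// isNullGeneric mimics the described pattern: the value is boxed into an
// interface with any(v) and then type-asserted. Boxing a struct value may
// require a heap allocation.
func isNullGeneric[T any](v T) bool {
	n, ok := any(v).(nullable)
	return ok && n.IsNull()
}

// metadata is a hypothetical embedded struct carrying the null flag.
type metadata struct{ null bool }

// IsNull is an ordinary method, so callers that know the concrete type can
// invoke it without any interface conversion.
func (m metadata) IsNull() bool { return m.null }

// TextParam embeds metadata, which promotes IsNull onto TextParam.
type TextParam struct {
	Text string
	metadata
}

func main() {
	p := TextParam{Text: "hello"}
	q := TextParam{metadata: metadata{null: true}}

	// Same answers, different call shapes.
	fmt.Println(isNullGeneric(p), p.IsNull())
	fmt.Println(isNullGeneric(q), q.IsNull())
}
```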
Implements EncodeDirect on every union type in message.go. Each method finds the single non-nil variant pointer and returns it directly, bypassing MarshalUnion → shimjson.Marshal → []byte copy.

Combined with the struct type changes, benchmark results (pairs=1000):
Baseline: 28 MB/op, 49035 allocs/op
After: 12 MB/op, 38034 allocs/op (-57% bytes, -22% allocs)

Byte amplification ratio dropped from ~8x to ~3.7x of JSON output size.
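A union-type EncodeDirect along these lines might look like the following sketch. ContentUnion, TextBlock, ImageBlock and their fields are hypothetical and much smaller than the real unions in message.go. The sketch also assumes that returning false when no variant is set sends the value back to the existing MarshalJSON path.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// TextBlock and ImageBlock are hypothetical variant types.
type TextBlock struct {
	Text string `json:"text"`
}

type ImageBlock struct {
	URL string `json:"url"`
}

// ContentUnion is a hypothetical union: at most one pointer is expected to
// be non-nil at a time.
type ContentUnion struct {
	OfText  *TextBlock
	OfImage *ImageBlock
}

// EncodeDirect returns the first non-nil variant pointer so the encoder can
// serialize the variant itself instead of marshalling the union wrapper into
// a temporary []byte. With no variant set it reports false.
func (u ContentUnion) EncodeDirect() (any, bool) {
	switch {
	case u.OfText != nil:
		return u.OfText, true
	case u.OfImage != nil:
		return u.OfImage, true
	}
	return nil, false
}

func main() {
	u := ContentUnion{OfText: &TextBlock{Text: "hi"}}
	if v, ok := u.EncodeDirect(); ok {
		out, err := json.Marshal(v)
		if err != nil {
			panic(err)
		}
		fmt.Println(string(out))
	}
}
```

Each case checks the pointer for nil before returning it, so a nil pointer is never wrapped in a non-nil interface value.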
Address Copilot review feedback: mock-server tests now only skip when TEST_API_BASE_URL is not set, and VCR tests only skip when ANTHROPIC_LIVE!=1. Tests will run when the appropriate env vars are configured.
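The skip conditions could be factored into small helpers like the ones below. The two environment variable names come from the commit message. The helper names and the injected getenv parameter are assumptions made here so the logic can be exercised without touching the real environment.

```go
package main

import (
	"fmt"
	"os"
)

// mockServerSkipReason returns a non-empty reason when mock-server tests
// should be skipped, that is, when TEST_API_BASE_URL is not set.
func mockServerSkipReason(getenv func(string) string) string {
	if getenv("TEST_API_BASE_URL") == "" {
		return "TEST_API_BASE_URL is not set"
	}
	return ""
}

// vcrSkipReason returns a non-empty reason when VCR tests should be skipped,
// that is, when ANTHROPIC_LIVE is anything other than "1".
func vcrSkipReason(getenv func(string) string) string {
	if getenv("ANTHROPIC_LIVE") != "1" {
		return "ANTHROPIC_LIVE is not 1"
	}
	return ""
}

func main() {
	// Inside a test the same check would read:
	//   if r := mockServerSkipReason(os.Getenv); r != "" { t.Skip(r) }
	fmt.Printf("mock-server skip reason: %q\n", mockServerSkipReason(os.Getenv))
	fmt.Printf("VCR skip reason: %q\n", vcrSkipReason(os.Getenv))
}
```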
kylecarbs approved these changes on Apr 15, 2026
mafredri approved these changes on Apr 15, 2026
mafredri left a comment:
The change is brittle, but logically sound. I verified some types in message.go but by far not all.
I would recommend some type of lint or build-time enforcement that makes sure all fields are handled. Introduction of new fields would immediately risk breaking this.
Extends DirectEncoder to every param struct type in message.go, not just the hot-path ones. Adds a coverage test that asserts every type implementing MarshalJSON also implements EncodeDirect — any new type added without the fast path will fail CI.
Extends DirectEncoder to every type across betamessage.go, completion.go, messagebatch.go, and betamessagebatch.go. Replaces the manual type list in the coverage test with an AST parser that scans all non-test .go files for MarshalJSON methods and asserts a matching EncodeDirect exists. New types can no longer slip through without the fast path — no manual list to maintain.
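An AST-based check of this kind can be sketched with the standard go/parser and go/ast packages. This is not the PR's actual test: the function names and the inline sample source are hypothetical. The sketch parses a single source string instead of walking all non-test .go files, and it ignores generic receivers.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"sort"
)

// receiverName returns the receiver type name of a method declaration,
// stripping a leading pointer, or "" for plain functions.
func receiverName(fn *ast.FuncDecl) string {
	if fn.Recv == nil || len(fn.Recv.List) == 0 {
		return ""
	}
	expr := fn.Recv.List[0].Type
	if star, ok := expr.(*ast.StarExpr); ok {
		expr = star.X
	}
	if ident, ok := expr.(*ast.Ident); ok {
		return ident.Name
	}
	return ""
}

// missingEncodeDirect parses src and returns, in sorted order, the types that
// declare MarshalJSON but have no EncodeDirect method.
func missingEncodeDirect(src string) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "input.go", src, 0)
	if err != nil {
		return nil, err
	}
	hasMarshal := map[string]bool{}
	hasDirect := map[string]bool{}
	for _, decl := range file.Decls {
		fn, ok := decl.(*ast.FuncDecl)
		if !ok {
			continue
		}
		name := receiverName(fn)
		if name == "" {
			continue
		}
		switch fn.Name.Name {
		case "MarshalJSON":
			hasMarshal[name] = true
		case "EncodeDirect":
			hasDirect[name] = true
		}
	}
	var missing []string
	for name := range hasMarshal {
		if !hasDirect[name] {
			missing = append(missing, name)
		}
	}
	sort.Strings(missing)
	return missing, nil
}

// sample is a hypothetical source file: A has both methods, B lacks the
// fast path.
const sample = `package sdk

type A struct{}

func (a A) MarshalJSON() ([]byte, error) { return nil, nil }
func (a A) EncodeDirect() (any, bool)    { return a, true }

type B struct{}

func (b *B) MarshalJSON() ([]byte, error) { return nil, nil }
`

func main() {
	missing, err := missingEncodeDirect(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println("types missing EncodeDirect:", missing)
}
```

In a real test, a non-empty result would be reported with t.Errorf so that CI fails and names the offending types.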
Cycles through thinking, redacted thinking, server tool use, images, search results, multi-tool calls, and text-only turns instead of just text + tool_use + tool_result. Gives a more realistic picture of serialization cost with varied content block types.
johnstcn added a commit to coder/fantasy that referenced this pull request on Apr 15, 2026
Updates coder/anthropic-sdk-go to a31d7d0e7067 which adds the DirectEncoder interface, cutting marshal allocation overhead by ~66% for nested SDK types. See: coder/anthropic-sdk-go#7
johnstcn added a commit to coder/coder that referenced this pull request on Apr 15, 2026
… over-allocations (#24390) Updates go.mod to reference our internal fork of anthropic-sdk-go. See: coder/anthropic-sdk-go#7 Relates to CODAGT-167 --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Adds a DirectEncoder interface to the shimmed encoding/json package. When a type implements EncodeDirect() (any, bool), the encoder writes the returned value directly into the parent buffer, bypassing the MarshalJSON → Marshal → []byte copy round-trip that causes ~9.6x byte amplification on nested SDK types.

- DirectEncoder interface in internal/encoding/json/encode.go with checks in marshalerEncoder and addrMarshalerEncoder
- EncodeDirect on all 246 types with MarshalJSON across message.go, betamessage.go, completion.go, messagebatch.go, betamessagebatch.go
- IsNull() on metadata to avoid param.IsNull generic function allocation overhead
- Coverage test that fails on any type with MarshalJSON without EncodeDirect
- Benchmark cycling through ContentBlockParamUnion variants (text, thinking, redacted thinking, images, search results, server tool use, multi-tool calls)

Benchmark (pairs=1000, 2001 messages, 3.3MB JSON output)
Full benchmark table
Context: why this matters
Prod pprof shows chatd.processChat at 37% cumulative heap, with bytes.growSlice at 2.1 GB driven by the nested MarshalJSON chain in MessageNewParams serialization. The chatd agentic loop re-serializes the entire conversation history on every step, so per-call amplification compounds across steps. This change cuts the per-call overhead by ~66%, reducing GC pressure and OOM risk.