core/types: reduce header RLP decode allocations #35030

Open: Sahil-4555 wants to merge 2 commits into ethereum:master from Sahil-4555:optimize-header-rlp-decode

Sahil-4555 commented May 22, 2026


This change adds a concrete header decode path and uses it when decoding inbound eth protocol BlockHeaders responses. The new path decodes directly into a types.Header instead of going through the generic RLP reflection decoder, while keeping the existing behavior of rejecting trailing input.
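The trailing-input rule can be illustrated with a toy list splitter. This is only a sketch: `splitList` and `decodeExact` are hypothetical names written for this example, not the functions added by the PR, and the splitter skips the canonical-encoding checks a real RLP decoder performs.

```go
package main

import (
	"errors"
	"fmt"
)

var (
	errNotList  = errors.New("expected an RLP list")
	errShort    = errors.New("input shorter than declared length")
	errTrailing = errors.New("trailing input after list")
)

// splitList reads one RLP list prefix and returns the list payload plus any
// bytes that follow the list. Toy version: it does not enforce canonical
// length encoding.
func splitList(b []byte) (payload, rest []byte, err error) {
	if len(b) == 0 || b[0] < 0xc0 {
		return nil, nil, errNotList
	}
	var size uint64
	body := b[1:]
	if b[0] > 0xf7 {
		// Long list: the tag says how many big-endian length bytes follow.
		n := int(b[0] - 0xf7)
		if len(body) < n {
			return nil, nil, errShort
		}
		for _, c := range body[:n] {
			size = size<<8 | uint64(c)
		}
		body = body[n:]
	} else {
		// Short list: the tag itself carries the payload length (0..55).
		size = uint64(b[0] - 0xc0)
	}
	if uint64(len(body)) < size {
		return nil, nil, errShort
	}
	return body[:size], body[size:], nil
}

// decodeExact accepts exactly one list and rejects anything after it.
func decodeExact(b []byte) ([]byte, error) {
	payload, rest, err := splitList(b)
	if err != nil {
		return nil, err
	}
	if len(rest) != 0 {
		return nil, errTrailing
	}
	return payload, nil
}

func main() {
	// One two-byte list followed by a stray 0xff byte.
	_, err := decodeExact([]byte{0xc2, 0x01, 0x02, 0xff})
	fmt.Println(err) // prints "trailing input after list"
}
```

A concrete decoder would walk the returned payload field by field instead of using reflection. The final length check is what preserves the reject-trailing-input behavior.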

The header decoder is careful about reuse: variable-size fields reuse their backing storage, optional pointer fields are reused when present, and fields that are absent in shorter legacy headers are cleared so stale values cannot survive across decodes.
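The reuse rules in the paragraph above can be sketched on a reduced struct. Everything here is an assumption for illustration: `toyHeader`, `fields`, and `decodeInto` are hypothetical stand-ins, not `types.Header` or the PR's decoder.

```go
package main

import (
	"fmt"
	"math/big"
)

// toyHeader is a hypothetical stand-in for types.Header with one
// variable-size field and one optional field.
type toyHeader struct {
	Extra   []byte
	BaseFee *big.Int // nil for legacy (pre-London) headers
}

// fields is a hypothetical, already-parsed wire form of one header.
type fields struct {
	extra   []byte
	baseFee []byte // nil when the field is absent from the encoding
}

// decodeInto overwrites h with f, reusing h's storage where possible.
func decodeInto(h *toyHeader, f fields) {
	// Variable-size field: truncate and append to reuse the backing array.
	h.Extra = append(h.Extra[:0], f.extra...)

	if f.baseFee == nil {
		// Absent optional field: clear it so a stale value cannot survive.
		h.BaseFee = nil
		return
	}
	// Present optional field: reuse the existing big.Int if there is one.
	if h.BaseFee == nil {
		h.BaseFee = new(big.Int)
	}
	h.BaseFee.SetBytes(f.baseFee)
}

func main() {
	h := new(toyHeader)
	decodeInto(h, fields{extra: []byte("london"), baseFee: []byte{0x07}})
	fmt.Println(string(h.Extra), h.BaseFee)

	// Decoding a legacy header into the same struct must drop BaseFee.
	decodeInto(h, fields{extra: []byte("legacy")})
	fmt.Println(string(h.Extra), h.BaseFee == nil)
}
```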

The BlockHeaders handler still allocates a distinct Header for every decoded item, so response consumers keep the same ownership semantics as before. This avoids unsafe pointer reuse while reducing the overhead of the per-header decode path.
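A small sketch of why the handler keeps one allocation per item, using a hypothetical `toyHeader` and two hypothetical decode loops. The shared-scratch variant is included only to show the aliasing hazard being avoided.

```go
package main

import "fmt"

// toyHeader is a hypothetical stand-in for types.Header.
type toyHeader struct{ Number uint64 }

// decodeAll returns one freshly allocated header per item, so a caller may
// retain any element without a later decode overwriting it.
func decodeAll(numbers []uint64) []*toyHeader {
	out := make([]*toyHeader, 0, len(numbers))
	for _, n := range numbers {
		h := new(toyHeader) // distinct allocation per item
		h.Number = n
		out = append(out, h)
	}
	return out
}

// decodeAllShared is the unsafe variant: every element aliases one scratch
// header, so all entries end up showing the last decoded value.
func decodeAllShared(numbers []uint64) []*toyHeader {
	scratch := new(toyHeader)
	out := make([]*toyHeader, 0, len(numbers))
	for _, n := range numbers {
		scratch.Number = n
		out = append(out, scratch)
	}
	return out
}

func main() {
	in := []uint64{1, 2, 3}
	fmt.Println(decodeAll(in)[0].Number)       // prints 1
	fmt.Println(decodeAllShared(in)[0].Number) // prints 3
}
```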

goos: linux
goarch: amd64
pkg: github.com/ethereum/go-ethereum/core/types
cpu: Intel(R) Core(TM) i7-8565U CPU @ 1.80GHz
                        │ old_bench.txt │            new_bench.txt            │
                        │    sec/op     │   sec/op     vs base                │
DecodeRLP/legacy-header    1491.5n ± 3%   580.1n ± 3%  -61.10% (p=0.000 n=10)
DecodeRLP/london-header    1193.5n ± 5%   640.8n ± 2%  -46.31% (p=0.000 n=10)
geomean                     1.334µ        609.7n       -54.30%

                        │ old_bench.txt │             new_bench.txt             │
                        │      B/s      │     B/s       vs base                 │
DecodeRLP/legacy-header    347.2Mi ± 3%   892.6Mi ± 3%  +157.10% (p=0.000 n=10)
DecodeRLP/london-header    438.6Mi ± 5%   817.1Mi ± 2%   +86.28% (p=0.000 n=10)
geomean                    390.2Mi        854.0Mi       +118.84%

                        │ old_bench.txt │           new_bench.txt            │
                        │     B/op      │    B/op     vs base                │
DecodeRLP/legacy-header      88.00 ± 0%   24.00 ± 0%  -72.73% (p=0.000 n=10)
DecodeRLP/london-header      88.00 ± 0%   24.00 ± 0%  -72.73% (p=0.000 n=10)
geomean                      88.00        24.00       -72.73%

                        │ old_bench.txt │           new_bench.txt            │
                        │   allocs/op   │ allocs/op   vs base                │
DecodeRLP/legacy-header      3.000 ± 0%   1.000 ± 0%  -66.67% (p=0.000 n=10)
DecodeRLP/london-header      3.000 ± 0%   1.000 ± 0%  -66.67% (p=0.000 n=10)
geomean                      3.000        1.000       -66.67%

fjl commented May 22, 2026

Contributor

I understand this is faster, but does it really matter? Decoding headers is already very fast; we can decode them at 400MB/s even without this change.

@Sahil-4555

Author

Speed-wise, yes, it's already fast. The main benefit here is on the allocation side. During skeleton sync, each batch pulls 512 headers from a peer, and right now every header needs 3 separate heap allocations: the Header struct, one big.Int for Difficulty, and one for Number. Per batch, that's about 1,500 allocations (512 × 3 = 1,536) for the GC to track. With the headerAlloc packing we bring that down to about 500 (512 × 1) by putting all three in one struct: same data, one trip to the allocator instead of three. The stream reuse also avoids 512 pool get/put cycles per batch. Each saving is small, but during sync this happens continuously across multiple peers, so it adds up in GC pressure.
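A minimal sketch of the packing idea, assuming a reduced `toyHeader` with only the two `big.Int` fields. The `headerAlloc` name comes from the comment above, but its layout here is a guess rather than the PR's actual definition, and the helper functions are hypothetical.

```go
package main

import (
	"fmt"
	"math/big"
	"testing"
)

// toyHeader is a hypothetical stand-in for types.Header.
type toyHeader struct {
	Difficulty *big.Int
	Number     *big.Int
}

// headerAlloc packs the header and both big.Int values into one heap object.
// The layout is an assumption made for this sketch.
type headerAlloc struct {
	header     toyHeader
	difficulty big.Int
	number     big.Int
}

// newSeparate allocates the header and each big.Int on its own.
func newSeparate() *toyHeader {
	return &toyHeader{Difficulty: new(big.Int), Number: new(big.Int)}
}

// newPacked allocates once and points the header at its sibling fields.
func newPacked() *toyHeader {
	a := new(headerAlloc)
	a.header.Difficulty = &a.difficulty
	a.header.Number = &a.number
	return &a.header
}

// sink forces results to escape to the heap so allocations are not elided.
var sink *toyHeader

// allocsPer reports the average number of heap allocations per call to f.
func allocsPer(f func() *toyHeader) float64 {
	return testing.AllocsPerRun(100, func() { sink = f() })
}

func main() {
	const batch = 512 // headers per skeleton-sync batch, per the comment
	sep, packed := allocsPer(newSeparate), allocsPer(newPacked)
	fmt.Printf("allocs per header: separate=%v packed=%v\n", sep, packed)
	fmt.Printf("allocs per batch:  separate=%v packed=%v\n", batch*sep, batch*packed)
}
```

One trade-off of this layout: any retained pointer to `Difficulty` or `Number` keeps the whole packed object alive, since Go's garbage collector treats interior pointers as references to the enclosing allocation.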

Sahil-4555 force-pushed the optimize-header-rlp-decode branch from 17ecac6 to 4c8150a on May 29, 2026, 11:19


2 participants