feat: delayed p2p message decoding via rlp.RawList#65
Conversation
…ops (ethereum#33245) The list iterator previously returned true on parse errors without advancing the input, which could lead to non-advancing infinite loops for callers that do not check Err() inside the loop; to make iteration safe while preserving error visibility, Next() now marks the iterator as finished when readKind fails, returning true for the error step so existing users that check Err() can handle it, and then false on subsequent calls, and the function comment was updated to document this behavior and the need to check Err().
This adds a new type wrapper that decodes as a list, but does not actually decode the contents of the list. The type parameter exists as a marker, and enables decoding the elements lazily. RawList can also be used for building a list incrementally.
Most uses of the iterator are like this:
it, _ := rlp.NewListIterator(data)
for it.Next() {
do(it.Value())
}
This doesn't require the iterator to be a pointer and it's better to
have it stack-allocated. AFAIK the compiler cannot prove it is OK to
stack-allocate when it is returned as a pointer because the methods of
`Iterator` use pointer receiver and also mutate the object.
The iterator type was not exported until very recently, so I think it is
still OK to change this API.
This is helpful when building a list from already-encoded elements.
This changes `RawList` to ensure the count of items is always valid. Lists with invalid structure, i.e. ones where an element exceeds the size of the container, are now detected during decoding of the `RawList` and thus cannot exist. Also remove `RawList.Empty` since it is now fully redundant, and `Iterator.Count` since it returns incorrect results in the presence of invalid input. There are no callers of these methods (yet).
I removed `Iterator.Count` in ethereum#33840, because it appeared to be unused and did not provide the documented invariant: the returned count should always be an upper bound on the number of iterations allowed by `Next`. In order to make `Count` work, the semantics of `CountValues` has to change to return the number of items up and including the invalid one. I have reviewed all callsites of `CountValues` to assess if changing this is safe. There aren't that many, and the only call that doesn't check the error and return is in the trie node parser, `trie.decodeNodeUnsafe`. There, we distinguish the node type based on the number of items, and it previously returned an error for item count zero. In order to avoid any potential issue that could result from this change, I'm adding an error check in that function, though it isn't necessary.
This changes the p2p protocol handlers to delay message decoding. For responses, we delay the decoding until it is confirmed that the response matches an active request and does not exceed its limits. In order to make this work, all messages have been changed to use rlp.RawList instead of a slice of the decoded item type. For block bodies specifically, the decoding has been delayed all the way until after verification of the response hash. The role of p2p/tracker.Tracker changes significantly in this PR. The Tracker's original purpose was to maintain metrics about requests and responses in the peer-to-peer protocols. Each protocol maintained a single global Tracker instance. As of this change, the Tracker is now always active (regardless of metrics collection), and there is a separate instance of it for each peer. Whenever a response arrives, it is first verified that a request exists for it in the tracker. The tracker is also the place where limits are kept. Adapted for stablenet: ETH68 only, no ETH69 support.
colinkim
left a comment
There was a problem hiding this comment.
Non-blocking: The comment states that the connection is closed if the sidecar does not match the transaction's version hash, but the implementation only checks if sidecar != nil. Since StableNet currently has CancunTime=nil, this appears to be a discrepancy between the comment and the intended behavior rather than a bug. Should we consider updating the comment or restoring ValidateBlobCommitmentHashes(...) to maintain consistency with upstream?
fix : 7b164e8 |
Summary
This PR introduces delayed RLP decoding for P2P Ethereum protocol messages, porting upstream go-ethereum #33835 to stablenet. Instead of decoding incoming wire messages immediately on receipt, message payloads are stored as
rlp.RawList[T]and decoded only when actually consumed. This reduces unnecessary CPU work and memory allocations when received data is ultimately discarded (e.g., due to hash mismatch or request cancellation).Upstream References
Changes
rlp/— RawListIntroduces
rlp.RawList[T], a generic list type that stores raw RLP bytes and decodes elements on demand via.Items(). Wire encoding is identical to[]T, so no protocol compatibility changes are needed. Several iterator improvements accompany this type (Count(), non-pointer return, parse-error finalization).p2p/tracker/— Per-peer request trackerRewrites
tracker.Trackerfrom a global singleton to a per-peer instance. The sharedtracker.goshims ineth/andsnap/are removed.tracker.Fulfil(id)is added to signal request completion.eth/protocols/eth/— Core delayed decodingTransactionsPacket,BlockHeadersPacket,BlockBodiesPacket,ReceiptsPacket,BlockBody) converted to userlp.RawListfields.hashBodyParts()+derivableRawList: compute MPT hashes directly from raw bytes without full deserialization.maxTransactionAnnouncements = 5000: DoS defense against oversized transaction messages.markTransactionexported asMarkTransaction; transaction marking moved to the application layer (eth/handler_eth.go).eth/protocols/snap/andeth/upper layerssnap.Peermigrated to per-peer tracker (same pattern as eth).downloader,fetcher, andhandler_eth.goupdated to work with the new packet type fields.Security
maxTransactionAnnouncements = 5000prevents memory exhaustion from oversized transaction lists.hashBodyParts()validates block body hashes before full decode.