fix: incremental frontierRootHash to avoid O(N²) on pre-Byzantium blocks #10220
Open
diega wants to merge 3 commits into besu-eth:main
…or hierarchy

Replace the two-phase copy pattern (construct empty + cloneFromUpdater) with a proper copy constructor chain. Each level copies its own fields with compile-time type safety, eliminating the need for a separate public cloneFromUpdater method that could be called without copying subclass-specific state.

Signed-off-by: Diego López León <dieguitoll@gmail.com>
On pre-Byzantium blocks, FrontierTransactionReceiptFactory calls frontierRootHash() for every transaction to include intermediate state roots in receipts. The previous implementation copied the entire accumulator and rebuilt the Merkle trie from scratch on each call, giving O(N) cost per call and O(N²) total for a block with N transactions. On the 2016 DoS attack blocks (~2.3M on mainnet), this causes full sync to stall indefinitely.

The fix introduces FrontierRootHashTracker, which caches the account trie between calls and tracks dirty addresses per transaction via a single Set<Address> in the accumulator. Each frontierRootHash() call now only processes accounts changed by the latest transaction, reducing per-call cost from O(N) to O(k), where k is the number of accounts touched by that transaction.

Safety:
- Dirty addresses are cleared only after successful computation.
- The cache is reset on persist() at block boundaries.
- Uses StoredMerklePatriciaTrie (not ParallelStoredMerklePatriciaTrie) to avoid ForkJoinPool overhead on per-transaction calls.
- Only affects pre-Byzantium code path; post-Byzantium is unchanged.

Fixes besu-eth#10155

Signed-off-by: Diego López León <dieguitoll@gmail.com>
…head in frontier path

Adds a Usage-based trie construction policy that selects the appropriate MerkleTrie implementation based on execution context:
- BLOCK_COMPUTATION: throughput-oriented, respects parallelStateRootComputationEnabled
- FRONTIER_INCREMENTAL: latency-sensitive, always uses StoredMerklePatriciaTrie

This eliminates the ParallelStoredMerklePatriciaTrie ForkJoinPool overhead from per-transaction frontierRootHash() calls. The factory replaces the inline createTrie() logic and is used by both the normal block computation path (unchanged behavior) and the incremental frontier path via FrontierRootHashTracker (now guaranteed sequential for both account and storage tries).

Signed-off-by: Diego López León <dieguitoll@gmail.com>
PR description
Bonsai full sync hangs indefinitely at the 2016 DoS attack blocks (~2.3M) on mainnet because frontierRootHash() copies the entire accumulator and rebuilds the Merkle trie from scratch on every per-transaction call, giving O(N²) total cost per block.

Commit 1: copy constructor refactor
Replaces the two-phase cloneFromUpdater copy pattern in PathBasedWorldStateUpdateAccumulator with a proper copy constructor chain. Each level copies its own fields with compile-time type safety. This is a prerequisite for commit 2, which adds a subclass-specific field (frontierDirtyAddresses) that needs to be copied correctly.

Commit 2: incremental frontierRootHash
Introduces FrontierRootHashTracker, which makes frontierRootHash() incremental instead of rebuilding from scratch on every call.

How it works: The accumulator's commit() override captures the addresses from getUpdatedAccounts() and getDeletedAccountAddresses() into a frontierDirtyAddresses set. This happens after super.commit() populates accountsToUpdate, so every dirty address is guaranteed to have a corresponding entry. The dirty set accumulates across transactions within a block.

When frontierRootHash() is called, the tracker applies only the accounts in the dirty set to its cached account trie, and clears the dirty set only after the root has been computed successfully.

At the block boundary, persist() calls tracker.reset() to discard the cached trie before the next block.

Why it's safe to operate on the live accumulator instead of copying it: the previous accumulator.copy() was introduced to prevent setStorageRoot() mutations from corrupting the live state. But setStorageRoot() sets deterministic values computed from the trie: the same values that persist() will compute at end-of-block. The no-op storage updater used for frontier ensures no trie nodes are written to storage. The accumulator is only read, and mutated only in the same way that persist() would mutate it later.

Commit 3: BonsaiTrieFactory
Commit 2 makes the frontier account trie sequential via createFrontierTrie() (which always returns StoredMerklePatriciaTrie). But the frontier storage tries for each dirty account still go through the normal createTrie() path, which returns ParallelStoredMerklePatriciaTrie when parallel computation is enabled. This adds ForkJoinPool scheduling overhead on every per-transaction call without benefit: the frontier path is inherently sequential because each receipt depends on the prior transaction's state root.

This commit introduces BonsaiTrieFactory with a TrieMode enum (ALWAYS_SEQUENTIAL vs PARALLELIZE_ALLOWED) that centralizes the trie implementation decision. The frontier path passes ALWAYS_SEQUENTIAL for both account and storage tries, eliminating the parallel overhead. The normal persist() / block computation path continues to use PARALLELIZE_ALLOWED.

Post-Byzantium receipts use a status code instead of a state root, so the frontier path is only active for pre-Byzantium blocks. None of the three commits changes behavior for post-Byzantium blocks or the normal block computation path.
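The mode-based selection described for commit 3 can be sketched as follows. This is a minimal illustration, not the PR's code: only the TrieMode values come from the description, while TrieFactorySketch, SketchTrie, and the two trie stand-ins are hypothetical names, and the assumption is that PARALLELIZE_ALLOWED yields a parallel trie only when the parallel state root flag is on.

```java
// Hypothetical sketch: none of these types are Besu's actual classes.
final class TrieFactorySketch {
  // Assumed to mirror the parallelStateRootComputationEnabled setting.
  private final boolean parallelEnabled;

  TrieFactorySketch(final boolean parallelEnabled) {
    this.parallelEnabled = parallelEnabled;
  }

  // Single decision point: the caller states its mode, the factory picks the implementation.
  SketchTrie create(final TrieMode mode) {
    if (mode == TrieMode.PARALLELIZE_ALLOWED && parallelEnabled) {
      return new ParallelSketchTrie();
    }
    // ALWAYS_SEQUENTIAL ignores the flag, so the frontier path never pays scheduling overhead.
    return new SequentialSketchTrie();
  }

  public static void main(final String[] args) {
    final TrieFactorySketch factory = new TrieFactorySketch(true);
    System.out.println(factory.create(TrieMode.ALWAYS_SEQUENTIAL).kind());
    System.out.println(factory.create(TrieMode.PARALLELIZE_ALLOWED).kind());
  }
}

// Enum values as named in the PR description.
enum TrieMode {
  ALWAYS_SEQUENTIAL,
  PARALLELIZE_ALLOWED
}

// Stand-in for a Merkle trie; it only reports which implementation was chosen.
interface SketchTrie {
  String kind();
}

final class SequentialSketchTrie implements SketchTrie {
  @Override
  public String kind() {
    return "sequential";
  }
}

final class ParallelSketchTrie implements SketchTrie {
  @Override
  public String kind() {
    return "parallel";
  }
}
```

Keeping the choice in one factory means a call site cannot pick up the parallel implementation by accident; it has to ask for PARALLELIZE_ALLOWED explicitly.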
Verifying the performance improvement locally
The O(N²) scaling can be reproduced by measuring individual frontierRootHash() call times across a simulated block (500 accounts × 20 storage slots, no persist() between calls):

Before this PR: the last call takes 8–11x longer than the first.
After this PR: the last call is the same speed or faster than the first.
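Why the per-call time grows in one case and stays flat in the other can be seen in a toy cost model. The sketch below does not touch Besu or a real trie: it only counts how many accounts each strategy would visit, assuming a hypothetical block in which every transaction touches one new account. FrontierCostModel and its methods are illustrative names.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model: counts account visits instead of hashing a real trie.
final class FrontierCostModel {

  // Old behaviour: every root hash call revisits all accounts updated so far in the block.
  static long fullRebuildVisits(final List<Set<String>> transactions) {
    final Set<String> updatedSoFar = new HashSet<>();
    long visits = 0;
    for (final Set<String> touched : transactions) {
      updatedSoFar.addAll(touched);
      visits += updatedSoFar.size();
    }
    return visits;
  }

  // New behaviour: every root hash call visits only the addresses dirtied since the last call.
  static long incrementalVisits(final List<Set<String>> transactions) {
    final Set<String> dirty = new HashSet<>();
    long visits = 0;
    for (final Set<String> touched : transactions) {
      dirty.addAll(touched); // commit(): record the touched addresses
      visits += dirty.size(); // root hash call: process the dirty set only
      dirty.clear(); // cleared once the root has been computed
    }
    return visits;
  }

  // Assumed workload: n transactions, each touching one distinct account.
  static List<Set<String>> oneNewAccountPerTransaction(final int n) {
    final List<Set<String>> transactions = new ArrayList<>();
    for (int i = 0; i < n; i++) {
      transactions.add(Set.of("account-" + i));
    }
    return transactions;
  }

  public static void main(final String[] args) {
    final List<Set<String>> block = oneNewAccountPerTransaction(500);
    System.out.println("full rebuild visits: " + fullRebuildVisits(block)); // 125250
    System.out.println("incremental visits:  " + incrementalVisits(block)); // 500
  }
}
```

Under this assumption the full rebuild visits 1 + 2 + … + N = N(N+1)/2 accounts per block, while the incremental strategy visits N. The model says nothing about absolute timings, which depend on trie depth, storage slots, and the backing store.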
The parallel trie overhead can be verified by comparing frontierRootHash() wall time with parallelStateRootComputationEnabled=true vs false using RocksDB-backed storage. Before this PR: ~1.9x overhead. After: ~1.0x.

Fixed Issue(s)
fixes #10155
Thanks for sending a pull request! Have you done the following?
doc-change-required label to this PR if updates are required.

Locally, you can run these tests to catch failures early:
- ./gradlew spotlessApply
- ./gradlew build
- ./gradlew acceptanceTest
- ./gradlew integrationTest
- ./gradlew ethereum:referenceTests:referenceTests