perf(miner): adaptive root-reserve + per-attempt sparse-trie sink (ported to develop-hardfork-pasteur) #418
Merged
constwz merged 6 commits intoJun 24, 2026
When the sparse-trie precomputed root is unavailable, the build falls back to the synchronous full-trie state_root_with_updates. Under a deep miner overlay this walk takes ~700ms, far past the block period: it produces a candidate the miner has already abandoned and pins a CPU core into the next slot, shrinking the next block's budget and cascading more empty blocks.

Abort the candidate when already at/over the state-root deadline (end_mining_timestamp_ms - STATE_ROOT_WAIT_MARGIN_MS) so the miner ships the best already-completed candidate on time instead of over-running the slot.

Adds counter bsc_builder_sync_root_deadline_abort_total.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
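A minimal sketch of the deadline check described in this commit, not the actual implementation. The margin value, the `SyncRootDecision` enum, and the function name are hypothetical; only `end_mining_timestamp_ms` and `STATE_ROOT_WAIT_MARGIN_MS` are named in the commit message, and its value is not given there.

```rust
/// Hypothetical value: the commit names this constant but does not state it.
const STATE_ROOT_WAIT_MARGIN_MS: u64 = 50;

/// Outcome of the check made before starting the synchronous fallback.
#[derive(Debug, PartialEq, Eq)]
pub enum SyncRootDecision {
    /// Still before the deadline; `budget_ms` is the time left until it.
    Proceed { budget_ms: u64 },
    /// At or past the deadline: drop this candidate and ship an earlier one.
    Abort,
}

/// Decide whether the synchronous full-trie walk may still start.
pub fn check_sync_root_deadline(now_ms: u64, end_mining_timestamp_ms: u64) -> SyncRootDecision {
    // saturating_sub avoids underflow if the slot end is smaller than the margin.
    let deadline_ms = end_mining_timestamp_ms.saturating_sub(STATE_ROOT_WAIT_MARGIN_MS);
    if now_ms >= deadline_ms {
        SyncRootDecision::Abort
    } else {
        SyncRootDecision::Proceed { budget_ms: deadline_ms - now_ms }
    }
}
```

In the real builder the `Abort` branch would presumably also increment bsc_builder_sync_root_deadline_abort_total; the sketch leaves metrics out.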
The sparse-trie precomputed-root sink was a single Arc<Mutex<>> shared by every build attempt in a job, including the deadline-spawned empty-fallback build. Inside finish_with_difflayer the write (after the sparse-trie wait) and the read-back (sink.take()) are separated by merge_transitions + hashed_post_state over all txs (~hundreds of ms for a full block). A concurrent second finish (the empty build, which has no trie_handle and jumps straight to the take()) steals the root the full build deposited, forcing the full build onto the slow synchronous state_root_with_updates (~700ms) and risking sealing a foreign root onto the empty block.

Give each build_payload attempt its own fresh sink so write->read is strictly intra-attempt; the full build reads back its OWN precomputed root. The empty build now passes state_root_precomputed_sink: None and computes its own cheap root.

Logs state_root on the 'delivered' and 'Using precomputed' lines so the routing is verifiable.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
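The ownership change can be illustrated with a small sketch using only the standard library. The `Root` and `RootSink` aliases and the three helper functions are hypothetical stand-ins, not the PR's types; the point is only that a sink created per attempt cannot be drained by another attempt, and that an attempt given no sink never touches one.

```rust
use std::sync::{Arc, Mutex};

/// Stand-in for a 32-byte state root (assumption: the real code uses a hash type).
type Root = [u8; 32];
/// One deposit slot, written by the sparse-trie task and read back by finish().
type RootSink = Arc<Mutex<Option<Root>>>;

/// Create a fresh sink for a single build attempt.
pub fn new_attempt_sink() -> RootSink {
    Arc::new(Mutex::new(None))
}

/// Deposit the precomputed root once the sparse-trie wait completes.
pub fn deposit(sink: &RootSink, root: Root) {
    *sink.lock().unwrap() = Some(root);
}

/// Read back the precomputed root. `None` for the sink models the empty
/// build, which has no sink and must compute its own root.
pub fn take_precomputed(sink: Option<&RootSink>) -> Option<Root> {
    sink.and_then(|s| s.lock().unwrap().take())
}
```

With a single shared sink, whichever attempt calls `take_precomputed` first gets the root; with one sink per attempt, the empty build passes `None` and the full build's deposit stays where it was written.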
The occasional slow miner state root (100-280ms vs ~5ms median) is not tx-count
driven; it tracks overlay reconstruction / proof-fetch cost. Add per-block
metrics on the miner sparse-trie spawn path to pin the cause:
- bsc_miner_overlay_depth: in-memory overlay depth (head - on-disk tip ~= persist
lag) the proof workers reconstruct over. Correlate with the existing
bsc_builder_state_root_wait_duration_seconds tail.
- bsc_miner_sparse_trie_anchor_{inmemory,persisted,nocim}_total: overlay anchor
kind per block.
- bsc_miner_sparse_trie_spawn_duration_seconds: cost of spawn_state_root
(proof-worker pool creation + overlay setup) — flags per-block worker churn.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…spawns

The miner built OverlayBuilder with ChangesetCache::default() on every spawn_state_root call, i.e. a fresh, empty cache per block. Any overlay that needed trie reverts (anchor below db_tip, the common case under a deep/finality-lagging overlay) therefore missed and recomputed changesets from the DB every block. That DB recompute over a 40-50 deep overlay is the dominant cause of the occasional 100-280ms state root that tips blocks past the build budget -> empty.

Reuse a single ChangesetCache across all miner spawns (clone shares the inner Arc<RwLock>): the first block computes the reverts once, later blocks reuse them. Evict below finalized-256 each spawn to keep the working set hot while bounding growth over long runs.

Note: this is the miner's own warm cache, not the engine tree's import-populated cache (that lives in reth's engine service and isn't reachable from here without an upstream change).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
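The sharing and eviction pattern can be sketched as follows. `SharedChangesetCache` is a hypothetical stand-in written for illustration, not reth's ChangesetCache; the value type (`Vec<u8>`) and method names are assumptions. It shows a clone sharing the inner Arc<RwLock>, so a value computed through one handle is a hit through another, plus eviction of entries below finalized-256.

```rust
use std::collections::BTreeMap;
use std::sync::{Arc, RwLock};

/// Retention window below the finalized block, as stated in the commit.
const RETAIN_BELOW_FINALIZED: u64 = 256;

/// Hypothetical stand-in: block number -> changeset bytes.
/// Cloning copies the Arc, so every clone sees the same map.
#[derive(Clone, Default)]
pub struct SharedChangesetCache {
    inner: Arc<RwLock<BTreeMap<u64, Arc<Vec<u8>>>>>,
}

impl SharedChangesetCache {
    /// Return the cached changeset, computing it only on a miss.
    pub fn get_or_compute(&self, block: u64, compute: impl FnOnce() -> Vec<u8>) -> Arc<Vec<u8>> {
        // The read guard is dropped at the end of this statement.
        let cached = self.inner.read().unwrap().get(&block).cloned();
        if let Some(hit) = cached {
            return hit;
        }
        let value = Arc::new(compute());
        let mut map = self.inner.write().unwrap();
        // If another handle inserted meanwhile, keep the existing entry.
        Arc::clone(map.entry(block).or_insert(value))
    }

    /// Drop every entry with block number below `finalized - 256`.
    pub fn evict_below_finalized(&self, finalized: u64) {
        let floor = finalized.saturating_sub(RETAIN_BELOW_FINALIZED);
        let mut map = self.inner.write().unwrap();
        // split_off keeps keys >= floor in the returned map.
        let kept = map.split_off(&floor);
        *map = kept;
    }

    /// Number of cached blocks.
    pub fn len(&self) -> usize {
        self.inner.read().unwrap().len()
    }
}
```

A per-spawn `SharedChangesetCache::default()` would reproduce the original problem: every spawn starts empty and recomputes. Holding one instance and handing each spawn a `clone()` is what makes the second block's lookup a hit.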
Empty blocks under load cluster at the peaks of the in-memory overlay depth (head - finalized): there the background sparse-trie root can't finalize within the default 120ms reserve, finish() waits past the slot deadline, and the block degrades to empty-fallback. ~97% of blocks at normal depth compute the root in ~20ms and are unaffected.

Make DELAY_LEFT_OVER adaptive: at overlay_depth <= 15 keep the default 120ms; at >= 40 reserve up to 280ms; linear in between. A larger reserve stops fill earlier (exec ends earlier -> the finalize tail gets a bigger window) and fills fewer txs (smaller finalize tail), turning a would-be empty block into an on-time smaller block.

overlay_depth = (parent+1) - finalized, via canonical_in_memory_state; depth 0 (finalized unavailable) keeps the default.

Adds histogram bsc_miner_effective_reserve_ms. See docs/design-adaptive-overlay-depth.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
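The reserve curve described above can be written out as a short sketch. The numeric values (120ms, 15, 40, 280ms) and the depth formula come from the commit message; the function names, the integer rounding, and the constant names other than DELAY_LEFT_OVER are assumptions.

```rust
const DELAY_LEFT_OVER_MS: u64 = 120; // default reserve
const DEPTH_LOW: u64 = 15; // at or below this depth: default reserve
const DEPTH_HIGH: u64 = 40; // at or above this depth: maximum reserve
const RESERVE_MAX_MS: u64 = 280;

/// overlay_depth = (parent + 1) - finalized; 0 when finalized is unavailable.
pub fn overlay_depth(parent_number: u64, finalized: Option<u64>) -> u64 {
    match finalized {
        Some(f) => (parent_number + 1).saturating_sub(f),
        None => 0,
    }
}

/// Reserve in milliseconds for a given overlay depth.
pub fn effective_reserve_ms(depth: u64) -> u64 {
    // Checking the low bound first, then the high bound, means the division
    // below only runs when DEPTH_LOW < depth < DEPTH_HIGH, so the span is > 0.
    if depth <= DEPTH_LOW {
        return DELAY_LEFT_OVER_MS;
    }
    if depth >= DEPTH_HIGH {
        return RESERVE_MAX_MS;
    }
    let span = DEPTH_HIGH - DEPTH_LOW;
    // Linear interpolation with integer (truncating) division.
    DELAY_LEFT_OVER_MS + (RESERVE_MAX_MS - DELAY_LEFT_OVER_MS) * (depth - DEPTH_LOW) / span
}
```

For example, at depth 20 the sketch yields 120 + 160 * 5 / 25 = 152ms, and a parent at height 99 with finalized at 60 gives depth 40, which maps to the full 280ms.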
Tune without recompiling during perf testing. Env vars (read once at startup, cached; defaults match the previous hardcoded consts):
- BSC_MINING_ADAPTIVE_RESERVE: on/off (default on); off = fixed DELAY_LEFT_OVER
- BSC_MINING_ROOT_RESERVE_DEPTH_LOW: default 15
- BSC_MINING_ROOT_RESERVE_DEPTH_HIGH: default 40
- BSC_MINING_ROOT_RESERVE_MAX_MS: default 280

Branch ordering makes depth_high<=depth_low degrade to a step at depth_low (no divide-by-zero). Logs the resolved config at startup.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
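One way to read such knobs once and cache them is sketched below with `std::sync::OnceLock`. The env var names and defaults are from the commit message; `ReserveConfig`, `from_lookup`, `resolved_config`, and the parsing rules (only the literal "off" disables, unparsable numbers fall back to the default) are assumptions, not the PR's code.

```rust
use std::sync::OnceLock;

/// Hypothetical container for the resolved knobs.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ReserveConfig {
    pub adaptive: bool,
    pub depth_low: u64,
    pub depth_high: u64,
    pub max_ms: u64,
}

impl ReserveConfig {
    /// Build the config from any key -> value lookup, which keeps the
    /// parsing testable without touching the process environment.
    pub fn from_lookup(get: impl Fn(&str) -> Option<String>) -> Self {
        // Missing or unparsable values fall back to the default (assumption).
        let num = |key: &str, default: u64| {
            get(key)
                .and_then(|v| v.trim().parse::<u64>().ok())
                .unwrap_or(default)
        };
        Self {
            // Assumption: only "off" (case-insensitive) disables the feature.
            adaptive: get("BSC_MINING_ADAPTIVE_RESERVE")
                .map_or(true, |v| !v.trim().eq_ignore_ascii_case("off")),
            depth_low: num("BSC_MINING_ROOT_RESERVE_DEPTH_LOW", 15),
            depth_high: num("BSC_MINING_ROOT_RESERVE_DEPTH_HIGH", 40),
            max_ms: num("BSC_MINING_ROOT_RESERVE_MAX_MS", 280),
        }
    }
}

/// Read the environment on first use, then return the cached value.
pub fn resolved_config() -> &'static ReserveConfig {
    static CONFIG: OnceLock<ReserveConfig> = OnceLock::new();
    CONFIG.get_or_init(|| ReserveConfig::from_lookup(|key| std::env::var(key).ok()))
}
```

Because the value is cached after the first call, changing an env var in a running process has no effect, which matches the "read once at startup" behavior described in the commit.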
Pull Request Review
This PR ports miner performance and reliability improvements onto
Sensitive Content
No sensitive content detected.
Security Issues
No serious security issues detected.
Generated by Hashdit Bot. This tool can absolutely NOT replace manual audits.
What
Ports the miner adaptive root-reserve work from fix/miner-adaptive-root-reserve onto develop-hardfork-pasteur.

A straight rebase was not viable: the original branch diverged long ago (most lower commits are already squash-merged into develop) and, critically, develop removed the triedb/pathdb state backend (#371) while the original branch was written against it. So this is a clean port of the 6 genuinely-new miner commits, with the triedb coupling stripped.
Commits
- fix(miner): bound synchronous state-root fallback by slot deadline: abort the sync full-trie fallback when already past the slot deadline (avoids ~700ms over-run cascading empty blocks). Adapted to develop's 2-tuple state-root return.
- fix(miner): per-attempt sparse-trie root sink to stop cross-build theft: fresh per-attempt Arc<Mutex<>> sink so a concurrent empty-fallback build can't steal this attempt's precomputed root. triedb-decoupled: dropped finish_with_difflayer/difflayer, uses develop's finish(state, None).
- feat(miner): instrument sparse-trie overlay depth + spawn duration
- perf(miner): share one long-lived changeset cache across sparse-trie spawns
- perf(miner): adaptive end-of-slot reserve by overlay depth
- perf(miner): make adaptive root-reserve knobs env-configurable

The last 4 cherry-picked with zero conflicts.
Notes
- src/node/{engine.rs, evm/builder.rs, miner/payload.rs} (+185/-11). No triedb/difflayer references remain.
- cargo check is clean (0 errors, 0 warnings) against develop's pinned reth rev 0dea17d2.

🤖 Generated with Claude Code