- Remove the inert `EnvWarmupParallelProcess` warmup from
`ParallelPatriciaHashed.Process` (never fed — `WarmKey` runs only on the
sequential `HashSort` path), plus the dead flag.
- Refresh `docs/design/parallel-patricia-hashed.md` for the erigontech#21945 deep
storage fold.
 Holds a configuration/base `template *HexPatriciaHashed`, a `sync.Pool` of worker
-tries, a `TrieContextFactory`, `numWorkers`, the published `rootHash`, and — for
-the deferred path — a `leaveDeferredForCaller` flag with a `deferredForCaller`
-hand-off slice. The `template` doubles as the **mount base** during `Process`: it
-is unfolded to the root branch, the workers' folded cells are dropped into its row
-0, and it folds the merged root. (Outside `Process` it exposes
-ctx/cache/metrics/trace configuration only.)
+tries, a `TrieContextFactory`, the `cfg TrieConfig` and `accountKeyLen` used to mint
+pooled workers, `numWorkers`, the published `rootHash`, and — for the deferred path
+— a `leaveDeferredForCaller` flag with a `deferredForCaller` hand-off slice. An
+optional `streaming *StreamingCommitter`: when set, `Process` delegates to it
+(`processStreaming`, §10) and the mount path below is not used. The `template`
+doubles as the **mount base** during `Process`: it is unfolded to the root branch,
+the workers' folded cells are dropped into its row 0, and it folds the merged root.
+(Outside `Process` it exposes ctx/cache/metrics/trace configuration only.)

 ## 4. Pipeline

 | phase | site | action |
 | --- | --- | --- |
 | 1. Touch | `Updates.TouchPlainKey` (ModeParallel) | insert each hashed key into the prefix trie, carrying its `plainKey`/`update` on the terminating node; no ETL collectors are used |
-| 2. Mount + fold | `processMounted`, concurrent | unfold the base to the root branch; mount a worker per touched root nibble; each folds its child subtree into a cell; drop the cells back into the base row and fold the merged root |
+| 2. Mount + fold | `processMounted`, concurrent | unfold the base to the root branch; mount a worker per touched root nibble; each folds its child subtree into a cell (a big-storage account's storage folds concurrently, §4.1.1); drop the cells back into the base row and fold the merged root |
 | 3. Commit | `Process` end | apply (or hand off) the merged deferred branch updates; publish the root |

-### 4.1 Phase 2 — Mount and fold (`processMounted`, `dfsSubtree`)
+### 4.1 Phase 2 — Mount and fold (`processMounted`, `dfsSubtreeDeep`)

 1. **Unfold the base.** `processMounted` unfolds `template` down to the root
 branch (`needUnfolding`/`unfold` on the zero prefix), loading the on-disk root
 `*HexPatriciaHashed`, calls `mountTo(base, nibble)` — inheriting a copy of the
 base's unfolded grid and sharing the base root cell read-only — and binds it to
 a fresh factory `PatriciaContext` with deferred branch writes enabled.
-3. **Build.** `dfsSubtree(child, [nibble]+child.ext)` walks the nibble's subtree in
-nibble-ascending order. At each terminating node it reconstructs the full hashed
-key, reads the `plainKey`/`update` off the node, and calls
+3. **Build.** `dfsSubtreeDeep(child, [nibble]+child.ext)` walks the nibble's subtree
+in nibble-ascending order. At each terminating node it reconstructs the full
+hashed key, reads the `plainKey`/`update` off the node, and calls
 `followAndUpdate`. A node MUST emit its own key **before** descending to its
 children, so an account at depth 64 precedes its storage keys — the sorted order
 the fold state machine requires (I4). A terminating node with a nil `plainKey`
 and no children is unsupported and MUST raise an error rather than be skipped.
-4. **Fold the mount.** `foldMounted(nibble)` folds the worker's subtree into a
-single cell at the nibble's child depth, stopping before it would absorb the
-shared base root row. The worker's deferred branch updates are appended to the
-shared accumulator and the worker is returned to the pool.
+When a node is a *big-storage account* (`isDeepStorageAccount`: depth 64, plain
+key set, ≥ 2 first-storage nibbles, `subtreeCount > deepStorageThreshold`) the
+walk does **not** descend into its storage children: it computes the storage root
+via the deep storage fold (§4.1.1) and injects it into the account leaf
+(`setAccountStorageRoot`).
+4. **Fold the mount.** `foldMounted(nibble)` folds the worker's subtree upward,
+stopping when it reaches `mountWall` — the depth `mountTo` records as
+`split-depth + 1`, so the top-level mount stops at depth 1, before it would absorb
+the shared base root row — and returns `grid[0][nibble]`. The worker's deferred
+branch updates are appended to the shared accumulator and the worker is returned
+to the pool.
 5. **Merge and fold root.** On the main goroutine, after `errgroup.Wait`, each
 folded cell is dropped into `base.grid[0][nibble]` (stripping the leading nibble
 a hash-only sub-branch carries in its extension) and the touch/after maps are
@@ -132,6 +150,39 @@ the unfolded grid; only the main goroutine mutates the base after `Wait`. There
 no fold-time barrier and no cross-worker synchronisation beyond the deferred-update
 mutex.
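
The synchronisation model of §4.1 can be illustrated with a minimal sketch. It is not the Erigon implementation: `cell`, `foldSubtree` and `processMountedSketch` are hypothetical names, SHA-256 stands in for the real cell hash, and `sync.WaitGroup` stands in for the `errgroup` the design uses. What it shows is the shape of the fan-out: one worker per touched root nibble, a single mutex around the deferred-update accumulator, and the base row mutated only after the wait returns.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sync"
)

// cell is a hypothetical stand-in for a folded trie cell: just a 32-byte hash.
type cell [32]byte

// foldSubtree is a placeholder for the per-nibble fold. It hashes the
// worker's keys in the order given (the real fold is far more involved).
func foldSubtree(nibble int, keys []string) cell {
	h := sha256.New()
	h.Write([]byte{byte(nibble)})
	for _, k := range keys {
		h.Write([]byte(k))
	}
	var c cell
	copy(c[:], h.Sum(nil))
	return c
}

// processMountedSketch folds each touched root nibble on its own goroutine,
// then merges the results into the row on the calling goroutine.
func processMountedSketch(touched map[int][]string) (row [16]*cell, deferred []string) {
	var (
		wg     sync.WaitGroup
		mu     sync.Mutex // the only cross-worker lock: guards `deferred`
		folded [16]*cell  // each worker writes only its own index
	)
	for nibble, keys := range touched {
		wg.Add(1)
		go func(nibble int, keys []string) {
			defer wg.Done()
			c := foldSubtree(nibble, keys)
			folded[nibble] = &c // disjoint slots, so no lock is needed here

			mu.Lock()
			deferred = append(deferred, fmt.Sprintf("branch-update:%x", nibble))
			mu.Unlock()
		}(nibble, keys)
	}
	wg.Wait()

	// Merge: only the calling goroutine touches the row, and only after Wait.
	for n, c := range folded {
		if c != nil {
			row[n] = c
		}
	}
	return row, deferred
}

func main() {
	row, deferred := processMountedSketch(map[int][]string{
		0x3: {"key-a", "key-b"},
		0xc: {"key-z"},
	})
	for n, c := range row {
		if c != nil {
			fmt.Printf("nibble %x -> %x\n", n, c[:4])
		}
	}
	fmt.Println(len(deferred), "deferred updates")
}
```

Keeping the merge on the calling goroutine is what lets the workers read shared state without a barrier; the mutex protects only the append to the accumulator.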

+### 4.1.1 Deep storage fold (`foldStorageRoot`, `streaming_deep_fold.go`)
+
+A big-storage account's storage subtree (a "whale") would otherwise fold serially on
+its top-nibble worker. `foldStorageRoot` folds it concurrently, applying the §4.1
+mount/fold model one level down at depth 64. It runs the same primitives
+(`mountTo`/`foldMounted`/`followAndUpdate`) and is shared verbatim by the streaming
+variant (§10).
+
+1. **Unfold the storage-root branch.** `unfoldStorageBase(base, accHash[:64])` seeds a
+base worker by reading the account's on-disk storage-root branch
+(`branchFromCacheOrDB` + `decodeBranchIntoRow` — the same decode the account unfold
+`unfoldBranchNode` uses, entered manually at depth 64 instead of by recursive
+descent). This is I2 applied at depth 64:
+untouched on-disk first-storage-nibble siblings MUST be present before the storage
+root is folded, or they are dropped and the storage root — hence the state root —
+diverges (see I2).
+2. **Fold per first-storage nibble.** One errgroup worker per touched first-storage
+nibble: `foldStorageLeaf` mounts the shared base at that nibble (`mountWall = 65`),
+streams the nibble's sorted slots, and `foldMounted` returns the depth-65 child
+cell. Workers defer their branch writes into the shared accumulator. They own
+disjoint storage prefixes, so concurrent reads of the shared base are race-free.
+3. **Aggregate.** `aggregateMountedStorageRoot` overlays the folded child cells onto
+the unfolded base row (setting/clearing each touched present bit, leaving untouched
+on-disk siblings intact) and folds once to the account's storage-root cell.
+4. **Inject.** `setAccountStorageRoot` writes that hash into the account leaf
+(`cell.hash`, `hashLen = 32`); `computeCellHash` uses it as the storageRoot for an
+account whose storage cell was not streamed, so the leaf hashes identically to the
+serial path. The DFS then skips the account's storage children.
+
+Below `deepStorageThreshold`, or with storage in a single first nibble, the account
+streams inline as in §4.1 — the per-account setup cost (a pooled worker, a fresh
+context, the storage-root unfold) only pays off for genuinely large storage.
+
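
The walk order and the big-storage cut-off described in §4.1 and §4.1.1 can be sketched as follows. This is an illustration under stated assumptions, not the code in `streaming_deep_fold.go`: the `node` type, its field names, `childCount`, `walk` and `isDeepStorageAccountSketch` are all hypothetical, and only the four listed conditions and the threshold value of 1000 come from the text. The sketch emits a node's own key before its children in nibble-ascending order, and hands a qualifying account to a deep-fold callback instead of descending.

```go
package main

import (
	"errors"
	"fmt"
)

// deepStorageThreshold uses the value quoted in the configuration table.
const deepStorageThreshold = 1000

// node is a hypothetical prefix-trie node; the field names are illustrative.
type node struct {
	depth        int       // nibble depth of this node
	plainKey     []byte    // nil unless a key terminates here
	children     [16]*node // indexed by next nibble
	subtreeCount int       // touched keys at or below this node
}

func childCount(n *node) int {
	count := 0
	for _, c := range n.children {
		if c != nil {
			count++
		}
	}
	return count
}

// isDeepStorageAccountSketch mirrors the four conditions listed in the text:
// depth 64, plain key set, at least two first-storage nibbles, and a touched
// count strictly above the threshold.
func isDeepStorageAccountSketch(n *node) bool {
	return n.depth == 64 && n.plainKey != nil &&
		childCount(n) >= 2 && n.subtreeCount > deepStorageThreshold
}

// walk visits the subtree in pre-order, nibble-ascending. A node's own key is
// emitted before any child, so an account precedes its storage keys.
func walk(n *node, emit func(key []byte), deepFold func(acct *node)) error {
	if n.plainKey == nil && childCount(n) == 0 {
		// Unsupported shape: report it rather than skip it silently.
		return errors.New("terminating node with nil plainKey and no children")
	}
	if n.plainKey != nil {
		emit(n.plainKey)
	}
	if isDeepStorageAccountSketch(n) {
		deepFold(n) // storage root is computed elsewhere; do not descend
		return nil
	}
	for _, c := range n.children {
		if c == nil {
			continue
		}
		if err := walk(c, emit, deepFold); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	acct := &node{depth: 64, plainKey: []byte("account"), subtreeCount: 1001}
	acct.children[0x1] = &node{depth: 65, plainKey: []byte("slot-1")}
	acct.children[0x7] = &node{depth: 65, plainKey: []byte("slot-7")}

	var trace []string
	err := walk(acct,
		func(k []byte) { trace = append(trace, string(k)) },
		func(a *node) { trace = append(trace, "deep-fold:"+string(a.plainKey)) },
	)
	fmt.Println(trace, err)
}
```

The comparison against the threshold is strict in this sketch, matching `subtreeCount > deepStorageThreshold` as written above.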
 ### 4.2 Phase 3 — Commit and root publication

 Workers accumulate `DeferredBranchUpdate`s rather than writing branches. After the
@@ -155,9 +206,11 @@ primary enforcement of I1.
 - **I1 — Equal root.** The published root equals the sequential root for every
 input (R1).
 - **I2 — Untouched-nibble preservation.** Because a branch hash mixes all present
-nibbles, the shared root row MUST be unfolded from `ctx.Branch` before mounting,
-so untouched on-disk siblings are present in `grid[0]` when the merged root is
-folded; workers write only their own touched subtree.
+nibbles, every branch row a fold collapses MUST first be unfolded from `ctx.Branch`
+so untouched on-disk siblings are present. This holds at two depths: the shared
+root row before mounting (`processMounted`), and each big-storage account's
+storage-root branch before the deep fold (`unfoldStorageBase`, §4.1.1). Dropping
+either unfold drops untouched siblings and diverges the root.
 - **I3 — `plainKey` follows the split.** `prefixTrie.Insert` MUST route a
 terminator's `plainKey` to the correct node across path-compression splits (§3.1).
 A misroute is a wrong DB read and a diverged root.
@@ -172,6 +225,11 @@ primary enforcement of I1.
 base root cell read-only; the merged base row is folded only on the main
 goroutine after `errgroup.Wait`, so concurrent structure/`plainKey` reads and the
 final fold are race-free.
+- **I7 — Deep fold equals inline stream.** For a big-storage account, the storage
+root from `foldStorageRoot` injected via `setAccountStorageRoot` MUST equal the
+root the serial inline stream would produce. Its per-first-nibble workers own
+disjoint storage prefixes, share the unfolded storage base read-only, and each
+defers its own branch writes (I5).
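
Invariant I2 can be made concrete with a small sketch. It assumes a deliberately simplified branch hash (SHA-256 over a presence bitmap followed by every present child), which is not the real cell encoding; `row`, `branchHash` and `foldWithUpdate` are hypothetical names. The point is only that the hash depends on every present nibble, so folding an update onto a row that was never unfolded from disk yields a different root.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
)

// row is a hypothetical 16-way branch row; a nil entry means "nibble absent".
type row [16][]byte

// branchHash is a simplified stand-in for a branch hash: it mixes a presence
// bitmap with every present child, so each sibling affects the result.
func branchHash(r row) [32]byte {
	var bitmap uint16
	var children bytes.Buffer
	for i, c := range r {
		if c != nil {
			bitmap |= 1 << uint(i)
			children.Write(c)
		}
	}
	h := sha256.New()
	h.Write([]byte{byte(bitmap >> 8), byte(bitmap)})
	h.Write(children.Bytes())
	var out [32]byte
	copy(out[:], h.Sum(nil))
	return out
}

// foldWithUpdate overlays the touched cells on a base row and hashes it.
// The base is passed by value, so the caller's row is not modified.
func foldWithUpdate(base row, touched map[int][]byte) [32]byte {
	for nibble, c := range touched {
		base[nibble] = c
	}
	return branchHash(base)
}

func main() {
	// On-disk state: nibbles 3 and 9 are present; the update touches only 3.
	var onDisk row
	onDisk[0x3] = []byte("old-3")
	onDisk[0x9] = []byte("untouched-9")
	update := map[int][]byte{0x3: []byte("new-3")}

	unfolded := foldWithUpdate(onDisk, update) // base unfolded first (I2 holds)
	skipped := foldWithUpdate(row{}, update)   // unfold skipped: sibling 9 lost

	fmt.Printf("unfolded base: %x\n", unfolded[:4])
	fmt.Printf("empty base:    %x\n", skipped[:4])
	fmt.Println("roots equal:", unfolded == skipped)
}
```

The same reasoning applies one level down: in this model, skipping the storage-root unfold before the deep fold loses untouched first-storage-nibble siblings in exactly the same way.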

 ## 6. Integration contract

@@ -212,16 +270,15 @@ substitution of the as-of reader is validated at runtime by the block-root check
 | --- | --- | --- |
 | `--experimental.parallel-commitment` | off | selects `VariantParallelHexPatricia` (`execctx.PickTrieVariant`) |
 | `--experimental.streaming-commitment` | off | selects `VariantStreamingHexPatricia` (`StreamingCommitter`); takes precedence over `--experimental.parallel-commitment` |
-| `ERIGON_WARMUP_PARALLEL_PROCESS` | off (env) | opt-in branch-cache prefetch inside the parallel/streaming `Process`; intended for measurement |
-| `deepStorageThreshold` | 1000 | touched-slot count above which an account's storage subtree folds concurrently (split at the first storage nibble); mitigates the whale bottleneck of §11 |
+| `deepStorageThreshold` | 1000 | compile-time const (not a runtime flag): per-account touched-storage-key count above which the storage subtree folds concurrently (§4.1.1); mitigates the whale bottleneck of §11 |
 | `numWorkers` | `runtime.NumCPU()` | worker-pool size and errgroup limit; override via `SetNumWorkers` |

 ## 8. Failure modes

 | condition | behaviour |
 | --- | --- |
 | empty update set | return the template's existing root (matches the sequential no-op) |
-| base root carries an extension (`root.ext != 0`) | return an error — not yet supported by the single-level mount |
+| base root carries an extension (`root.ext != 0`) | return an error — not yet supported by the mount path |
 | terminating node with nil `plainKey` and no children | return an error (only reachable via a hashed-only `TouchHashedKey`; that path is not wired for the parallel trie) |
 | deferred apply failure (inline path) | discard the staged root; never surface an unpersisted root |
 | worker error mid-fold | cancel the group; return pooled deferred entries |
@@ -245,8 +302,8 @@ in scheduling.
 | --- | --- | --- |
 | flag | (default) | `--experimental.parallel-commitment` |
| `execution/commitment/parallel_patricia_hashed.go` | `ParallelPatriciaHashed`, `Process` (routes to `processStreaming` when a committer is set), `dfsSubtree`, deferred apply and hand-off |
| `execution/commitment/streaming_deep_fold.go` | the deep storage fold shared by the parallel and streaming paths: `dfsSubtreeDeep`, `isDeepStorageAccount`, `foldStorageRoot`, `unfoldStorageBase`, `foldStorageLeaf`, `aggregateMountedStorageRoot` |
+| `execution/commitment/hex_patricia_hashed.go` | sequential engine; `foldMounted` and the `mountWall` stop used by both fold levels |