New payload with witness ssz#12048
Conversation
WitnessCapturingWorldStateProxy implemented only the explicit IWorldState methods, leaving IsNonZeroAccount, IsStorageEmpty, HasCode, GetNonce, and IsDelegatedCode to fall through to the default interface implementations. Those defaults route via TryGetAccount and see only the committed AccountStruct, while the real WorldState overrides them to consult the in-flight persistent storage provider. The mismatch broke EIP-7610 checks on SELFDESTRUCT-then-CREATE within a single block, causing the Pyspec test_recreate fixtures on Paris and Shanghai to fail with a header gas-used mismatch. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Restore codeHash-first lookup and the parallel-BAL comment. The proxy already captures the address via GetCodeHash(Address) earlier in InternalGetCodeInfo, so the swap is not required for witness capture. Also drop the verbose comment on the new default-method forwarders and update the misleading note on GetCode(in ValueHash256). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Spec MAX_REQUEST_BODY_SIZE per execution-apis#764 is 16 MiB; revert the undocumented bump to 64 MiB. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Witness owns ArrayPoolList buffers. On the JSON-RPC success path the result is wrapped in ResultWrapper -> JsonRpcSuccessResponse, whose Dispose() calls Result.TryDispose(). Implementing IDisposable here routes that into Witness.Dispose() so the pool buffers are returned instead of GC'd. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When two engine_newPayloadWithWitness calls land for the same blockHash, ArmCapture cancels the first TCS. The first caller's await captureTask then throws OperationCanceledException. The block itself executed successfully, so return VALID with a null witness rather than letting the cancellation propagate as a 500. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The INVALID_BLOCK_HASH constant and its SSZ byte mapping (4) were added but no code path produces them; Nethermind returns INVALID with a descriptive error for block-hash mismatches. Drop both until/unless we wire the status through the actual mismatch path. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
BranchProcessor tracked the witness-capture lifecycle with a pair of booleans and a manual finally block to disarm on exception. Replace with a `using` session struct that arms in its factory, drains on success, and disarms on Dispose if not drained. Removes the inner try/finally and the ArmWitnessCapture / DrainWitnessCapture helpers; restores the minor cosmetic edits that crept into the file. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces the Func<...> indirection the handler used to break the construction cycle. Now it takes `Lazy<IEngineRpcModule>` directly and calls `Value.engine_newPayloadV5(...)`, which simplifies the DI wiring to a plain type registration. Tests substitute IEngineRpcModule on the builder. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
newPayloadWithWitness already has both a JSON-RPC and SSZ-REST surface gated on the same fork, so route it through the same Configure(...) helper the other Amsterdam entries use rather than two parallel dict writes. Trim the comment in SszWireTypes that explains why NewPayloadWithWitnessResponseV1 is hand-rolled in SszCodec — keep the relevant why, drop the prose. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Restore the blank line after the opening brace and the inline body of engine_getClientVersionV1 — neither was needed for the PR's witness work. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three near-identical tests differing only in the engine module factory collapse into a single [TestCaseSource]-driven case. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
WitnessGeneratingWorldState.GetWitness and WitnessCapturingWorldStateProxy.BuildWitness both ran the same per-(address, slots) AccountProofCollector tree-walk loop. Pull the loop into WitnessProofCollector.CollectAccountProofs so both call sites share it. Also: - Swap Arm() order in the proxy to allocate the tracking dictionaries before flipping the armed flag; the prior order was safe only on the single-threaded ProcessOne path but inverted is a free correctness improvement. - Update WitnessCaptureRegistry xmldoc; the registry tolerates multiple concurrent armed entries for distinct block hashes (and cancels-and-replaces on duplicates). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…dispatch SszMiddleware.InvokeAsync (versioned) and DispatchWitnessAsync (non-versioned witness path) duplicated the entire post-route pipeline: PipeReader body read, request-bytes metric, handler invocation, status-code bucketing, and the 413/400/500 error mapping. Pull the shared work into DispatchAsync(handler, version, extra) — the witness path now adds only the path-specific method/content-type/handler-not-null checks. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Use Utf8JsonReader(ReadOnlySequence<byte>) in NewPayloadWithWitnessSszHandler instead of `body.ToArray()` on multi-segment bodies — saves an LOH allocation on every request. - Promote the encode-length Debug.Assert in EncodeNewPayloadWithWitnessResponse to a runtime check so a future header-size bug fails fast in Release rather than emitting silently-wrong bytes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ain processing modules in MainProcessingContext
This reverts commit 65073e0.
|
Claude finished @hudem1's task in 10m 33s —— View job PR Review: New payload with witness SSZTasks
SummaryThis PR implements EIP-7928 No critical issues found. I found two High findings that block merge, four Medium findings that should be addressed, and a handful of Low items. FindingsHighH1 – DI anti-pattern in The clean H2 –
MediumM1 – LINQ on hot-path:
M2 – Fragile cast from
M3 – Fragile The route-building loop excludes the witness handler by concrete type. Introduce an M4 – The trie and header recorders are written before the CAS, not after it. The LowL1 – L2 – L3 – No timeout before Test coverageThe new The Verdict: Not mergeable until H1 and H2 are resolved. |
| ctx.Resolve<IWithdrawalProcessorFactory>(), | ||
| ctx.ResolveOptional<PrewarmerEnvFactory>(), | ||
| ctx.ResolveOptional<PreBlockCaches>(), | ||
| ctx.ResolveOptional<IReadOnlyTxProcessingEnvFactory>(), |
There was a problem hiding this comment.
High – DI anti-pattern (violates di-patterns.md)
The previous one-liner .AddScoped<IBlockAccessListManager, BlockAccessListManager>() has been replaced with a manual lambda that explicitly lists every constructor argument. The rule file (di-patterns.md) calls this exact pattern an anti-pattern:
"Using the form
.Add<IFoo>(ctx => new Foo(ctx.Resolve<Dep1>(), ctx.Resolve<Dep2>()))is an anti-pattern. It will cause changes to the wiring whenFooadds new dependencies, which increases review load."
Preferred fix: have BlockAccessListManager accept WitnessExecutionPredicate? directly as a constructor parameter, then re-register it with the simple AddScoped<IBlockAccessListManager, BlockAccessListManager>(). The container can resolve the optional WitnessExecutionPredicate? automatically if it is registered in witness-capable scopes and omitted elsewhere.
// In BlockProcessingModule, back to the clean form:
.AddScoped<IBlockAccessListManager, BlockAccessListManager>()// In BlockAccessListManager constructor:
public partial class BlockAccessListManager(
IWorldState stateProvider,
ISpecProvider specProvider,
IBlockhashProvider blockHashProvider,
ILogManager logManager,
IBlocksConfig blocksConfig,
IWithdrawalProcessorFactory withdrawalProcessorFactory,
PrewarmerEnvFactory? prewarmerEnvFactory = null,
PreBlockCaches? preBlockCaches = null,
IReadOnlyTxProcessingEnvFactory? readOnlyTxProcessingEnvFactory = null,
WitnessExecutionPredicate? witnessExecutionPredicate = null) // ← add thisThen derive isWitnessExecution from witnessExecutionPredicate?.IsActive in the class body.
| // no patricia machinery on a flat node. Live trie-node capture during execution happens | ||
| // elsewhere — the main-world ITrieStore decorator on patricia, the trie-read observer on | ||
| // flat — both feeding the same recorder; this instance backs only GetWitness's rarely-hit | ||
| // root-resolution fallback. |
There was a problem hiding this comment.
Medium – fallbackTrieStore is never disposed
CreateReadOnlyTrieStore() on the flat backend returns a FlatReadOnlyTrieStore that holds a disposable SnapshotBundle. WitnessCapturingTrieStore.Dispose() forwards to baseStore.Dispose(), but fallbackTrieStore is never disposed — it is neither wrapped in a using nor disposed in the finally block.
In practice the impact is low because FlatReadOnlyTrieStore._bundle is nulled out by the ScopeCleanup returned from BeginScope(parentHeader) (the using in GetWitness), so the actual dispose would be a no-op. However, the code violates dispose discipline: any future change to the store that adds resources beyond _bundle would silently leak.
// Fix: dispose in the finally block alongside session.Disarm()
WitnessCapturingTrieStore fallbackTrieStore = new(worldStateManager.CreateReadOnlyTrieStore(), session);
...
finally
{
session.Disarm();
fallbackTrieStore.Dispose(); // ← add this
}| /// the allocate-then-discard of <c>node.FullRlp.ToArray()</c>. | ||
| /// </summary> | ||
| public void Record(Hash256 hash, TrieNode node) | ||
| => _rlpCollector.GetOrAdd(hash, static (_, n) => n.FullRlp.ToArray()!, node); |
There was a problem hiding this comment.
Medium – LINQ on hot-path property (violates coding-style rule)
_rlpCollector.Select(...) is a lazy LINQ query. Every consumer pays the delegate allocation and iterator overhead. The rule file explicitly prohibits LINQ when a simple loop works.
WitnessGeneratingWorldState.GetWitness also calls .Any() on this property — two LINQ allocations on what can be a per-block path.
// Preferred: expose counts directly to avoid the Any() call at the use site
public bool IsEmpty => _rlpCollector.IsEmpty; // ConcurrentDictionary.IsEmpty is O(1), no alloc
// And for the iteration:
public IEnumerable<byte[]> TouchedNodesRlp => _rlpCollector.Values; // no LINQ, no extra allocationThen in WitnessGeneratingWorldState.GetWitness:
// Replace: if (!trieRecorder.TouchedNodesRlp.Any())
if (trieRecorder.IsEmpty)| // Expose the same proxy instance as a typed singleton so the block-processor decorator can | ||
| // take it directly. Cast through IWorldState because Autofac doesn't model decorator chains | ||
| // as typed singletons. | ||
| builder.AddSingleton<WitnessCapturingWorldStateProxy>(ctx => |
There was a problem hiding this comment.
Medium – Fragile cast from IWorldState to concrete decorator type
builder.AddSingleton<WitnessCapturingWorldStateProxy>(ctx =>
(WitnessCapturingWorldStateProxy)ctx.Resolve<IWorldState>());This cast assumes IWorldState resolves to exactly WitnessCapturingWorldStateProxy — i.e., no other decorator sits on top of it in this scope. If any future plugin adds a decorator on IWorldState after this module loads, ctx.Resolve<IWorldState>() returns the outer decorator and the cast throws InvalidCastException at startup.
The same issue applies to the WitnessCapturingHeaderFinder bridge on line 47.
The underlying need — giving WitnessCapturingBlockProcessor direct access to the proxy — suggests using Autofac.IRegistrationSource or a keyed service. A simpler short-term fix is to register the proxy both ways in a single lambda:
builder.Register(ctx =>
{
IWorldState inner = ctx.Resolve<IWorldState>();
return new WitnessCapturingWorldStateProxy(inner, ctx.Resolve<WitnessCaptureSession>());
})
.As<IWorldState>()
.As<WitnessCapturingWorldStateProxy>()
.InstancePerLifetimeScope();This avoids the cast entirely; both IWorldState and WitnessCapturingWorldStateProxy resolve the same instance.
| @@ -76,6 +85,9 @@ private static (FrozenDictionary<string, List<ISszEndpointHandler>> post, | |||
|
|
|||
| foreach (ISszEndpointHandler h in handlers) | |||
| { | |||
There was a problem hiding this comment.
Medium – Fragile is type-check to exclude a handler from route-building
if (h is NewPayloadWithWitnessSszHandler) continue;This couples the route-building loop to a specific concrete type. If NewPayloadWithWitnessSszHandler is subclassed, or a second "non-routable" handler is introduced, this check silently breaks.
A cleaner design: introduce an opt-out interface or property on ISszEndpointHandler:
// In ISszEndpointHandler:
/// <summary>False for handlers that manage their own dispatch path (e.g. the witness fast-path).</summary>
bool RegisterInRoutingTable => true;Then the loop becomes:
if (!h.RegisterInRoutingTable) continue;Alternatively, don't register NewPayloadWithWitnessSszHandler as ISszEndpointHandler at all (it's already registered as a concrete singleton in SszMiddlewareConfigurer), removing the need to exclude it here.
|
|
||
| /// <summary> | ||
| /// Atomically installs the three recorders for a single capture pass. Returns <c>false</c> | ||
| /// when a capture is already in progress on this session. |
There was a problem hiding this comment.
Medium – TryArm memory-ordering comment is incorrect
The <remarks> says:
"The world-state recorder is the primary slot — the CAS on it gates the operation; the other two are written under the post-CAS happens-before..."
But the code writes _trieRecorder and _headerRecorder before the CAS, not after it. What the comment should say is that the writes happen-before the CAS release (i.e., sequentially consistent via Volatile.Write), so any reader who observes _worldStateRecorder != null via Volatile.Read is guaranteed to also observe the earlier writes. The current phrasing inverts the ordering and is misleading.
More critically: if two threads race on TryArm, thread B can overwrite _trieRecorder and _headerRecorder between thread A's volatile writes and thread A's CAS. Thread A then wins the CAS but holds the world-state recorder from A paired with trie/header recorders from B — an inconsistent state. The comment notes that "the main processing pipeline drives blocks serially so contention is theoretical", which is accurate, but the comment should reflect the actual invariant (serial use) rather than claiming a memory-ordering guarantee that doesn't hold under true concurrent use.
| throw new InvalidOperationException($"NewPayloadWithWitnessResponseV1 encode length mismatch: wrote {pos} bytes but expected {totalLen}"); | ||
| writer.Advance(totalLen); | ||
| return totalLen; | ||
| } |
There was a problem hiding this comment.
Low – TruncateUtf8 silently returns an empty slice if the first character is multi-byte
If ValidationError begins with a multi-byte character that extends past maxBytes, the loop backs all the way to i = 0 and returns utf8[..0] — an empty byte array. This means a ValidationError that starts with a non-ASCII character can be silently dropped in the response rather than truncated at a character boundary. The encoded response would include hasError = true but errorLen = 1 + 0 = 1 (just the 0x01 selector with an empty body), which is a valid "Some("")" but may confuse a decoder expecting at least one character.
For resilience, consider returning utf8[..maxBytes] (potentially invalid UTF-8) when i reaches 0, or logging the truncation. The current behaviour is silent and may produce an unexpected empty error string.
| // capture, avoiding the allocate-then-discard on every cache hit (hot in SLOAD loops touching | ||
| // the same branch). | ||
| if (node.NodeType != NodeType.Unknown && session.TrieRecorder is { } recorder) | ||
| recorder.Record(node.Keccak, node); |
There was a problem hiding this comment.
Low – node.Keccak can be null for dirty in-memory nodes
node.Keccak is Hash256?. When a dirty (not-yet-persisted) node is returned from cache and its Keccak hasn't been computed yet, passing it to recorder.Record(node.Keccak, node) passes null as the Hash256 key, which will throw NullReferenceException inside ConcurrentDictionary<Hash256AsKey, byte[]>.
In practice this path should not fire: FindCachedOrUnknown is keyed by hash, so a cached hit at that hash should have Keccak == hash. However, the write-through (readOnly: false) path decorates the live store, and dirty nodes can be in the cache with a non-Unknown NodeType but a null Keccak.
Add a null guard:
if (node.NodeType != NodeType.Unknown
&& node.Keccak is { } nodeHash // ← guard against dirty nodes with no hash yet
&& session.TrieRecorder is { } recorder)
recorder.Record(nodeHash, node);|
|
||
| PayloadStatusV1 payloadStatus = statusResult.Data!; | ||
| Witness? witness = null; | ||
|
|
There was a problem hiding this comment.
Low – Missing timeout guard on captureTask
The handler calls CancelWitnessRequest if the task is not completed, but this check happens only once, synchronously after engine_newPayloadV5 returns. If the block processor is processing on a background thread and engine_newPayloadV5 returns before processing finishes (e.g. it's queued but not yet executed), captureTask.IsCompleted would be false even for a VALID result, and the witness would be silently dropped.
The current comment acknowledges this with "BlockProcessor normally completes the TCS synchronously inside ProcessOne", and the test BlockProcessor_completes_rendezvous_task_synchronously_inside_newPayloadV5 asserts it. But there is no fallback timeout — if the assumption ever breaks (e.g. due to a future refactor that makes block processing async), CancelWitnessRequest fires and the witness is silently lost with only a WARN log.
Consider a short Task.WhenAny(captureTask, Task.Delay(TimeSpan.Zero)) (yield and re-check) or a configurable deadline before cancelling, to catch the common "already completed before we check" case more robustly. At minimum, the log message should be elevated to WARN (it already is) and include the processing time so operators can detect the broken-assumption case.
This reverts commit fe7a6a7.
Add `PatriciaTrieWitnessGenerator`, a mutation-free, single-trie witness collector that walks the pre-state trie driven by a list of read/written key paths and reports the nodes a stateless verifier needs. It replaces the trie-read-interception approach, which forced execution off the flat-DB fast path (the flat snapshot layer and TrieNodeCache hand back decoded TrieNodes without materializing RLP). The walk mirrors `PatriciaTree.BulkSet` (partial-sort by nibble, bucketize into 16, recurse) and reuses its sort helpers. A recursion returns true when its whole subtree is deleted, which drives the lone-child collapse: when deletions reduce a branch to a single remaining child, the surviving sibling is reported even though it was never on a touched path — the node a plain read-path visitor misses. An optional top-level 16-way parallel mode fans the root subtrees out (children pre-resolved single-threaded so the root is never mutated concurrently); the sink must then be thread-safe. Output goes to a small `IWitnessNodeSink`. Tests assert, across fuzz seeds, BulkSet-derived shapes, and targeted collapse/extension/absent/mixed cases: - the generated node set (sequential and parallel) equals the ground truth obtained by replaying the reads and deletes through a read-capturing store; - the witness alone serves every read and recomputes the post-state root with no missing node. A benchmark compares the new generator against the old capture-during-mutation path; the new path is ~0.3-0.7x of the old time, with the biggest win on delete-heavy workloads. Scope: standalone class + tests + benchmark; not yet wired into the witness pipeline. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Replace the read-interception witness mechanism with a post-execution walk that drives PatriciaTrieWitnessGenerator over the pre-state trie. Intercepting every trie read forced execution onto the slow trie path and defeated the flat DB. - The env factory no longer wraps execution in WitnessCapturingTrieStore, so execution runs on the plain read path. - WitnessGeneratingWorldState.GetWitness walks the pre-state state trie once for the touched accounts, then each touched account's pre-state storage trie for its touched slots, collecting the witness nodes (including the deletion-collapse siblings the generator reasons about). Read vs Delete is read off the committed post-state: an account that no longer exists, or a slot now zero, was removed. - Removes the now-dead WitnessCapturingTrieStore and MultiAccountProofCollector. The generator runs with parallelism off (the collecting sink is a plain dictionary). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- Use short padded hex prefixes for the absent-key test instead of an over-length literal that was silently truncated. - Document why an account with no individually-touched slots skips the storage-trie walk (its state-trie leaf removal already covers it). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…er and changes to DI for those
Follow up of PR #11623 as this was from a forked repo
What types of changes does your code introduce?
Testing
Requires testing
If yes, did you write tests?
Notes on testing
Optional. Remove if not applicable.
Documentation
Requires documentation update
If yes, link the PR to the docs update or the issue with the details labeled
docs. Remove if not applicable.Requires explanation in Release Notes
If yes, fill in the details here. Remove if not applicable.
Remarks
Optional. Remove if not applicable.