Stream debug_traceBlock* responses directly to avoid OOM on large blocks #9848
daniellehrner merged 31 commits into besu-eth:main
@daniellehrner is this PR ready for review?
   * StringBuilder is cleared before use but retains its internal buffer.
   */
  public static void toCompactHex(
      final Bytes abytes, final boolean prefix, final StringBuilder buf) {
Interesting 👍. Not very important or urgent, but it would be great to see JMH benchmarks for both methods.
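The buffer-reuse idea in the toCompactHex overload quoted above can be sketched as follows. This is a minimal illustration, not the PR's code: it assumes a plain byte[] in place of the Bytes type, the class name CompactHexSketch is hypothetical, and rendering an all-zero or empty input as a single "0" digit is an assumption. A real comparison of the two methods would still need the JMH benchmark the reviewer asks for.

```java
// Sketch only: a byte[]-based stand-in for the Bytes-based overload under review.
final class CompactHexSketch {
  private static final char[] HEX = "0123456789abcdef".toCharArray();

  /** Writes value as hex without leading zero digits into buf, reusing buf's storage. */
  static void toCompactHex(final byte[] value, final boolean prefix, final StringBuilder buf) {
    buf.setLength(0); // drops the old content but keeps the allocated capacity
    if (prefix) {
      buf.append("0x");
    }
    boolean leading = true;
    for (final byte b : value) {
      final int hi = (b >> 4) & 0xF;
      final int lo = b & 0xF;
      if (!leading || hi != 0) {
        buf.append(HEX[hi]);
        leading = false;
      }
      if (!leading || lo != 0) {
        buf.append(HEX[lo]);
        leading = false;
      }
    }
    if (leading) {
      buf.append('0'); // assumption: empty or all-zero input renders as a single 0
    }
  }

  public static void main(final String[] args) {
    final StringBuilder buf = new StringBuilder(66); // allocated once, reused per call
    toCompactHex(new byte[] {0x00, 0x01, 0x2f}, true, buf);
    System.out.println(buf); // 0x12f
    toCompactHex(new byte[] {0, 0}, false, buf);
    System.out.println(buf); // 0
  }
}
```

The caller owns the StringBuilder, so a tracer that formats one value per opcode allocates the character buffer once rather than once per call.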
ahamlat left a comment
I think it makes sense to have two different DebugOperationTracer implementations, one that streams and one that does not. It is a bit misleading to keep the traceFrames object in the streaming version, where it is no longer needed.
Also, please fix the unit tests; some of them are failing because of this PR.
Signed-off-by: daniellehrner <daniel.lehrner@consensys.net>
…to accumulation in memory, adddress pr comments Signed-off-by: daniellehrner <daniel.lehrner@consensys.net>
Pull request overview
This PR adds first-class support for streaming JSON-RPC responses, then migrates debug_traceBlock* (especially opcode tracing) to stream structLogs as they are produced to avoid OOM on large blocks.
Changes:
- Introduces a streaming execution path (StreamingJsonRpcMethod, executor/processor support) for HTTP and WebSocket handlers.
- Adds streaming-capable debug tracers and a DebugTraceBlockStreamer that writes opcode struct logs during EVM execution.
- Updates debug trace block golden files and rewrites debug trace tests to validate the streamed JSON output.
Reviewed changes
Copilot reviewed 30 out of 30 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| ethereum/core/src/main/java/org/hyperledger/besu/ethereum/vm/StreamingDebugOperationTracer.java | New operation tracer that emits per-opcode data via a callback instead of accumulating frames. |
| ethereum/core/src/main/java/org/hyperledger/besu/ethereum/vm/DebugOperationTracer.java | Refactors to share logic via AbstractDebugOperationTracer and optionally stream frames via a consumer. |
| ethereum/core/src/main/java/org/hyperledger/besu/ethereum/vm/AbstractDebugOperationTracer.java | New base class for shared opcode filtering, stack capture, and gas-cost computation. |
| ethereum/api/src/test/resources/org/hyperledger/besu/ethereum/api/jsonrpc/trace/chain-data/genesis-osaka.json | Adds a genesis config for updated trace tests. |
| ethereum/api/src/test/resources/org/hyperledger/besu/ethereum/api/jsonrpc/debug/trace-block/debug_traceBlock_disableStorage.json | Updates expected JSON ordering for streamed results. |
| ethereum/api/src/test/resources/org/hyperledger/besu/ethereum/api/jsonrpc/debug/trace-block/debug_traceBlock_disableStack.json | Updates expected JSON ordering for streamed results. |
| ethereum/api/src/test/resources/org/hyperledger/besu/ethereum/api/jsonrpc/debug/trace-block/debug_traceBlock_disableMemory.json | Updates expected JSON ordering for streamed results. |
| ethereum/api/src/test/resources/org/hyperledger/besu/ethereum/api/jsonrpc/debug/trace-block/debug_traceBlock_default.json | Updates expected JSON ordering for streamed results. |
| ethereum/api/src/test/java/org/hyperledger/besu/ethereum/api/jsonrpc/internal/methods/DebugTraceBlockTest.java | Rewrites tests to invoke streamResponse and validate streamed JSON content. |
| ethereum/api/src/test/java/org/hyperledger/besu/ethereum/api/jsonrpc/internal/methods/DebugTraceBlockByNumberTest.java | Rewrites tests to invoke streamResponse and validate streamed JSON content. |
| ethereum/api/src/test/java/org/hyperledger/besu/ethereum/api/jsonrpc/internal/methods/DebugTraceBlockByHashTest.java | Rewrites tests to invoke streamResponse and validate streamed JSON content. |
| ethereum/api/src/main/java/org/hyperledger/besu/ethereum/api/jsonrpc/websocket/WebSocketMessageHandler.java | Detects streaming methods and writes directly to the WebSocket output stream. |
| ethereum/api/src/main/java/org/hyperledger/besu/ethereum/api/jsonrpc/methods/JsonRpcMethodsFactory.java | Updates debug method construction after removing EthScheduler dependency. |
| ethereum/api/src/main/java/org/hyperledger/besu/ethereum/api/jsonrpc/methods/DebugJsonRpcMethods.java | Removes EthScheduler usage and wires new streaming debug trace methods. |
| ethereum/api/src/main/java/org/hyperledger/besu/ethereum/api/jsonrpc/internal/results/StructLog.java | Adds toCompactHex(..., StringBuilder) overload to reduce allocations. |
| ethereum/api/src/main/java/org/hyperledger/besu/ethereum/api/jsonrpc/internal/methods/StreamingJsonRpcMethod.java | New interface for methods that stream directly to an OutputStream. |
| ethereum/api/src/main/java/org/hyperledger/besu/ethereum/api/jsonrpc/internal/methods/JsonRpcMethod.java | Adds an isStreaming() marker method. |
| ethereum/api/src/main/java/org/hyperledger/besu/ethereum/api/jsonrpc/internal/methods/DebugTraceBlockStreamer.java | New streaming engine that writes opcode structLogs while executing transactions. |
| ethereum/api/src/main/java/org/hyperledger/besu/ethereum/api/jsonrpc/internal/methods/DebugTraceBlockByNumber.java | Implements streaming for debug_traceBlockByNumber with a fallback batch path. |
| ethereum/api/src/main/java/org/hyperledger/besu/ethereum/api/jsonrpc/internal/methods/DebugTraceBlockByHash.java | Implements streaming for debug_traceBlockByHash. |
| ethereum/api/src/main/java/org/hyperledger/besu/ethereum/api/jsonrpc/internal/methods/DebugTraceBlock.java | Implements streaming for debug_traceBlock and adjusts invalid/parent-missing handling. |
| ethereum/api/src/main/java/org/hyperledger/besu/ethereum/api/jsonrpc/internal/methods/AbstractDebugTraceBlock.java | Shared streaming/batch logic for debug trace block methods and response writing. |
| ethereum/api/src/main/java/org/hyperledger/besu/ethereum/api/jsonrpc/execution/TracedJsonRpcProcessor.java | Adds streamProcess() support while preserving metrics/tracing behavior. |
| ethereum/api/src/main/java/org/hyperledger/besu/ethereum/api/jsonrpc/execution/TimedJsonRpcProcessor.java | Adds timing support for streamed execution. |
| ethereum/api/src/main/java/org/hyperledger/besu/ethereum/api/jsonrpc/execution/JsonRpcProcessor.java | Adds streamProcess() hook to processor chain. |
| ethereum/api/src/main/java/org/hyperledger/besu/ethereum/api/jsonrpc/execution/JsonRpcExecutor.java | Adds streaming dispatch and refactors shared execution preparation. |
| ethereum/api/src/main/java/org/hyperledger/besu/ethereum/api/jsonrpc/execution/BaseJsonRpcProcessor.java | Implements streamed execution including error mapping for invalid params/runtime errors. |
| ethereum/api/src/main/java/org/hyperledger/besu/ethereum/api/jsonrpc/execution/AuthenticatedJsonRpcProcessor.java | Adds auth checks for streamed execution. |
| ethereum/api/src/main/java/org/hyperledger/besu/ethereum/api/handlers/JsonRpcObjectExecutor.java | Routes streaming methods to executeStreaming() and writes directly to response stream. |
| ethereum/api/src/main/java/org/hyperledger/besu/ethereum/api/handlers/AbstractJsonRpcExecutor.java | Makes SPAN_CONTEXT available to subclasses for streaming execution. |
    gen.writeEndArray();
    gen.flush();
  }

  @Override
  public JsonRpcResponse response(final JsonRpcRequestContext request) {
    final Optional<Block> maybeBlock = findBlock(request);
    if (maybeBlock.isEmpty()) {
      return new JsonRpcSuccessResponse(request.getRequest().getId(), null);
    }
    final TraceOptions traceOptions = getTraceOptions(request);
    final DebugTraceBlockStreamer streamer = createStreamer(traceOptions, maybeBlock);
    return new JsonRpcSuccessResponse(request.getRequest().getId(), streamer.accumulateAll());
  }

  /**
   * Streaming methods do not support the synchronous response path (used by batch requests).
   * Returns an error response instead of throwing, so batch requests degrade gracefully.
   */
  @Override
  default JsonRpcResponse response(final JsonRpcRequestContext request) {
    return new JsonRpcErrorResponse(
        new JsonRpcRequestId(request.getRequest().getId()), RpcErrorType.INVALID_REQUEST);
  }
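The split between a streaming path and a synchronous response() path shown in the snippets above can be sketched in isolation. Everything here is a simplified assumption: RpcMethodSketch, StreamingRpcMethodSketch and StreamingDispatchSketch are hypothetical names, responses are plain JSON strings rather than Besu's response types, and the request id is assumed to be a raw numeric literal.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical, simplified stand-ins; these are not Besu's real interfaces.
interface RpcMethodSketch {
  String response(String requestId);

  default boolean isStreaming() {
    return false;
  }
}

interface StreamingRpcMethodSketch extends RpcMethodSketch {
  void streamResponse(String requestId, OutputStream out) throws IOException;

  @Override
  default boolean isStreaming() {
    return true;
  }

  // Default for the synchronous path: report an error instead of throwing.
  @Override
  default String response(final String requestId) {
    return "{\"jsonrpc\":\"2.0\",\"id\":" + requestId
        + ",\"error\":{\"code\":-32600,\"message\":\"Invalid request\"}}";
  }
}

final class StreamingDispatchSketch {
  /** Example streaming method that writes an empty trace array. */
  static final StreamingRpcMethodSketch TRACE_BLOCK =
      (id, out) ->
          out.write(
              ("{\"jsonrpc\":\"2.0\",\"id\":" + id + ",\"result\":[]}")
                  .getBytes(StandardCharsets.UTF_8));

  /** Streams single requests; batch members go through the synchronous response() path. */
  static String dispatch(final RpcMethodSketch method, final String id, final boolean inBatch) {
    if (method.isStreaming() && !inBatch) {
      final ByteArrayOutputStream out = new ByteArrayOutputStream(); // stands in for the socket
      try {
        ((StreamingRpcMethodSketch) method).streamResponse(id, out);
      } catch (final IOException e) {
        throw new UncheckedIOException(e);
      }
      return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
    return method.response(id);
  }

  public static void main(final String[] args) {
    System.out.println(dispatch(TRACE_BLOCK, "1", false));
    System.out.println(dispatch(TRACE_BLOCK, "2", true));
  }
}
```

A method that overrides response(), as the first response() snippet above does with accumulateAll(), replaces the error default with an in-memory fallback for batch requests.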
        .apply(transactionTrace);
  }

  private void writeStreamingStructLog(
writeStreamingStructLog creates new TreeMap<>(updatedStorage) on every traced opcode. The class already has private final TreeMap<UInt256, UInt256> sortedStorage — that field should be cleared and re-used (sortedStorage.clear(); sortedStorage.putAll(updatedStorage)) instead of allocating a new one each time. For a block with thousands of transactions and complex storage access this dominates GC pressure.
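The clear-and-reuse pattern suggested in this comment could look roughly like the sketch below. It assumes String keys and values instead of UInt256, and the class SortedStorageReuseSketch and its writeStorage method are hypothetical; only the sortedStorage.clear() and putAll(updatedStorage) calls come from the comment itself.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

final class SortedStorageReuseSketch {
  // Allocated once per tracer instance and reused for every traced opcode.
  private final TreeMap<String, String> sortedStorage = new TreeMap<>();

  /** Renders the updated storage as a JSON object with keys in sorted order. */
  String writeStorage(final Map<String, String> updatedStorage) {
    sortedStorage.clear(); // instead of new TreeMap<>(updatedStorage)
    sortedStorage.putAll(updatedStorage);
    final StringBuilder json = new StringBuilder("{");
    for (final Map.Entry<String, String> entry : sortedStorage.entrySet()) {
      if (json.length() > 1) {
        json.append(',');
      }
      json.append('"').append(entry.getKey()).append("\":\"").append(entry.getValue()).append('"');
    }
    return json.append('}').toString();
  }

  public static void main(final String[] args) {
    final SortedStorageReuseSketch tracer = new SortedStorageReuseSketch();
    final Map<String, String> storage = new HashMap<>();
    storage.put("02", "b");
    storage.put("01", "a");
    System.out.println(tracer.writeStorage(storage)); // {"01":"a","02":"b"}
    storage.remove("02");
    System.out.println(tracer.writeStorage(storage)); // {"01":"a"}
  }
}
```

The second call shows why clear() is needed before putAll(): without it, entries from the previous opcode would leak into the next struct log.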
Signed-off-by: Ameziane H. <ameziane.hamlat@consensys.net>
… size to work better with netty's default buffer size Signed-off-by: daniellehrner <daniel.lehrner@consensys.net>
… is available, allows to send the proper error codes during setup Signed-off-by: daniellehrner <daniel.lehrner@consensys.net>
…corrected error format, reason encoding, returnValue prefix, and precompile gasCost, with equivalence tests between both Signed-off-by: daniellehrner <daniel.lehrner@consensys.net>
    if (prepared == null) {
      return Optional.of(new JsonRpcNoResponse());
    }
    if (prepared instanceof JsonRpcResponse response) {
My understanding is that this happens only when there are errors: a JsonRpcErrorResponse with Invalid request. When reading the code, it seems odd to me that preparing a request returns a response instead of the PreparedRequest.
I would make prepareExecution return only PreparedRequest and throw IllegalArgumentException, to be handled by executeStreaming, which would then return Optional.of(JsonRpcErrorResponse(id, INVALID_REQUEST)).
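The shape proposed in this comment might look like the following sketch. All types are hypothetical stand-ins: PreparedRequest is reduced to a two-field record, responses are JSON strings, the validation rule (a non-blank method name) is an assumption, and the no-response case for null from the snippet above is left out.

```java
import java.util.Optional;

final class PrepareExecutionSketch {
  /** Hypothetical stand-in for the PreparedRequest type mentioned in the review. */
  record PreparedRequest(String id, String method) {}

  /** Returns only a PreparedRequest; invalid input is signalled by an exception. */
  static PreparedRequest prepareExecution(final String id, final String method) {
    if (method == null || method.isBlank()) {
      throw new IllegalArgumentException("request has no method name");
    }
    return new PreparedRequest(id, method);
  }

  /** Streams into out on success; returns an error response when preparation fails. */
  static Optional<String> executeStreaming(
      final String id, final String method, final StringBuilder out) {
    final PreparedRequest prepared;
    try {
      prepared = prepareExecution(id, method);
    } catch (final IllegalArgumentException e) {
      return Optional.of(
          "{\"jsonrpc\":\"2.0\",\"id\":" + id
              + ",\"error\":{\"code\":-32600,\"message\":\"Invalid request\"}}");
    }
    // Placeholder for the real streaming work.
    out.append("{\"jsonrpc\":\"2.0\",\"id\":").append(prepared.id()).append(",\"result\":[]}");
    return Optional.empty();
  }

  public static void main(final String[] args) {
    final StringBuilder out = new StringBuilder();
    System.out.println(executeStreaming("1", "debug_traceBlockByNumber", out) + " " + out);
    System.out.println(executeStreaming("2", "", new StringBuilder()));
  }
}
```

With this shape, the caller no longer needs an instanceof check on the result of the preparation step.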
  // Sized to stay below Netty's default high watermark (64 KB). This ensures each
  // flush does not overshoot the backpressure threshold in JsonResponseStreamer, allowing
  // the drain/resume cycle to regulate direct-memory usage smoothly.
  private static final int BUF_SIZE = 32 * 1024;
Testing a bigger buffer size (256 KiB) with a 1 MiB max queue size showed slightly better latency numbers but was less stable than this configuration. I suggest keeping the existing configuration and re-evaluating later if needed.
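The effect of a fixed buffer size on what reaches the socket can be illustrated with the standard library alone. This sketch does not model Netty's watermarks or the drain/resume cycle in JsonResponseStreamer; it only shows, under the assumption that individual writes are smaller than the buffer, that each chunk handed downstream stays at or below BUF_SIZE. RecordingSink and writeThroughBuffer are hypothetical names.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

final class BoundedFlushSketch {
  // Same value as the BUF_SIZE constant quoted above.
  static final int BUF_SIZE = 32 * 1024;

  /** Hypothetical sink standing in for the socket; records the size of each chunk. */
  static final class RecordingSink extends OutputStream {
    final List<Integer> chunkSizes = new ArrayList<>();

    @Override
    public void write(final int b) {
      chunkSizes.add(1);
    }

    @Override
    public void write(final byte[] b, final int off, final int len) {
      chunkSizes.add(len);
    }
  }

  /** Writes totalBytes in small pieces through a BUF_SIZE buffer; returns the chunk sizes. */
  static List<Integer> writeThroughBuffer(final int totalBytes, final int pieceSize) {
    final RecordingSink sink = new RecordingSink();
    final byte[] piece = new byte[pieceSize];
    try (OutputStream out = new BufferedOutputStream(sink, BUF_SIZE)) {
      for (int written = 0; written < totalBytes; written += pieceSize) {
        out.write(piece, 0, Math.min(pieceSize, totalBytes - written));
      }
    } catch (final IOException e) {
      throw new UncheckedIOException(e);
    }
    return sink.chunkSizes; // complete here: closing the stream flushed the last chunk
  }

  public static void main(final String[] args) {
    System.out.println("chunk sizes: " + writeThroughBuffer(100_000, 1_000));
  }
}
```

BufferedOutputStream passes a single write that is at least as large as its buffer straight through, so the bound only holds when individual writes are small, as per-opcode struct logs typically are.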
ahamlat left a comment
LGTM. I tested different blocks on mainnet and noticed big performance improvements. The most interesting cases are those where existing Besu nodes (before this PR) could not reply because of OOM errors, whereas this PR replies with very small memory overhead.
There are a few small comments, but nothing blocking. We can re-evaluate in future PRs.
* Add SHL, SHR and SAR shift operations for EVM v2 (#10154)
* Add SHL, SHR and SAR implementations and benchmarks for EVM v2
Signed-off-by: Ameziane H. <ameziane.hamlat@consensys.net>
* Upgrade RocksDB version from 9.7.3 to 10.6.2 (#9767)
* Upgrade RocksDB version from 9.7.3 to 10.6.2
* Fix JNI SIGSEGV crashes
Signed-off-by: Ameziane H. <ameziane.hamlat@consensys.net>
Co-authored-by: Sally MacFarlane <macfarla.github@gmail.com>
* Add missing verification metadata (#10198)
Signed-off-by: Fabio Di Fabio <fabio.difabio@consensys.net>
* Stream debug_traceBlock* responses directly to avoid OOM on large blocks (#9848)
* stream block traces on op code level
Signed-off-by: daniellehrner <daniel.lehrner@consensys.net>
* correctly parse default setting for memory tracing
Signed-off-by: daniellehrner <daniel.lehrner@consensys.net>
* fix initcode capture for failed create op codes
Signed-off-by: daniellehrner <daniel.lehrner@consensys.net>
* created separate streaming debug tracer, for batch request fall back to accumulation in memory, adddress pr comments
Signed-off-by: daniellehrner <daniel.lehrner@consensys.net>
* execute tests from genesis and verify full trace
Signed-off-by: daniellehrner <daniel.lehrner@consensys.net>
* addressed pr comments
Signed-off-by: daniellehrner <daniel.lehrner@consensys.net>
* spotless
Signed-off-by: daniellehrner <daniel.lehrner@consensys.net>
* optimize trace streaming and struct log handling
Signed-off-by: Ameziane H. <ameziane.hamlat@consensys.net>
* spotless
Signed-off-by: Ameziane H. <ameziane.hamlat@consensys.net>
* Fix remaining issues and add unit tests
Signed-off-by: Ameziane H. <ameziane.hamlat@consensys.net>
* added back pressure when writing to the socket and reduced the buffer size to work better with netty's default buffer size
Signed-off-by: daniellehrner <daniel.lehrner@consensys.net>
* improve error handling by deferring to send the header only when data is available, allows to send the proper error codes during setup
Signed-off-by: daniellehrner <daniel.lehrner@consensys.net>
* compactHex candidate comparison
Signed-off-by: daniellehrner <daniel.lehrner@consensys.net>
* wire in more performant hex writer
Signed-off-by: daniellehrner <daniel.lehrner@consensys.net>
* introduce separate timeout for streaming calls, defaults to 10 minutes
Signed-off-by: daniellehrner <daniel.lehrner@consensys.net>
* spotless
Signed-off-by: daniellehrner <daniel.lehrner@consensys.net>
* Fix streamin/accumulating output parity, added missing refund field, corrected error format, reason encoding, returnValue prefix, and precompile gasCost, with equivalence tests between both
Signed-off-by: daniellehrner <daniel.lehrner@consensys.net>
* revert accidental removal of 0x prefix
Signed-off-by: daniellehrner <daniel.lehrner@consensys.net>
* pad memory bytes to 32 bytes
Signed-off-by: daniellehrner <daniel.lehrner@consensys.net>
---------
Signed-off-by: daniellehrner <daniel.lehrner@consensys.net>
Signed-off-by: Ameziane H. <ameziane.hamlat@consensys.net>
Co-authored-by: Ameziane H. <ameziane.hamlat@consensys.net>
---------
Signed-off-by: Ameziane H. <ameziane.hamlat@consensys.net>
Signed-off-by: Fabio Di Fabio <fabio.difabio@consensys.net>
Signed-off-by: daniellehrner <daniel.lehrner@consensys.net>
Co-authored-by: ahamlat <ameziane.hamlat@consensys.net>
Co-authored-by: Sally MacFarlane <macfarla.github@gmail.com>
Co-authored-by: Fabio Di Fabio <fabio.difabio@consensys.net>
PR description
Converts debug_traceBlockByNumber, debug_traceBlockByHash, and debug_traceBlock from accumulate-then-serialize to stream-as-you-go. Previously, these methods built the entire JSON response in memory (via TransactionTrace + DebugTraceTransactionResult), which OOMs on blocks with many transactions or complex traces. Now, structLogs are written directly to the HTTP/WebSocket output stream during EVM execution.
Infrastructure changes:
Trace-specific changes
opcode step
Breaking changes
JSON field ordering changed for debug_traceBlock* with OPCODE_TRACER (default tracer):
Before:
After:
gas, failed, and returnValue now appear after structLogs because they are only known after execution completes. JSON-RPC clients that parse by key name (standard) are unaffected. Clients that depend on field ordering will break.
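The reason for the ordering change can be seen in a small sketch of a stream-as-you-go writer. This is an illustration under simplifying assumptions, not the PR's DebugTraceBlockStreamer: the class StreamedTraceOrderSketch is hypothetical, each struct log carries only an op field, and the summary values are passed in as parameters where the real code would learn them from execution.

```java
final class StreamedTraceOrderSketch {
  /**
   * Emits struct logs first, one per executed opcode, and the summary fields last,
   * because gas, failed and returnValue are only known once execution has finished.
   */
  static String streamTrace(
      final String[] executedOps,
      final long gasUsed,
      final boolean failed,
      final String returnValue) {
    final StringBuilder out = new StringBuilder("{\"structLogs\":[");
    for (int i = 0; i < executedOps.length; i++) {
      if (i > 0) {
        out.append(',');
      }
      // In a real streamer this entry would be written while the opcode executes.
      out.append("{\"op\":\"").append(executedOps[i]).append("\"}");
    }
    out.append("],\"gas\":").append(gasUsed);
    out.append(",\"failed\":").append(failed);
    out.append(",\"returnValue\":\"").append(returnValue).append("\"}");
    return out.toString();
  }

  public static void main(final String[] args) {
    System.out.println(streamTrace(new String[] {"PUSH1", "STOP"}, 21003L, false, "0x"));
  }
}
```

A client that looks fields up by key reads the same values from either ordering; only a parser that relies on position is affected.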
Batch requests containing streaming methods now return {"error": {"code": -32600, "message": "Invalid request"}} for those methods instead of crashing the batch. This is new behavior, but I am unsure whether tracing more than one block in parallel would have been possible without OOM anyway.
Batch requests fall back to the normal OperationTracer, which accumulates the results in memory.
Performance tests
I ran the following script to trace 10 recent blocks in a row to compare the current implementation against this PR. The script was:
On the feature node, which includes this PR, we got:
On the control node, which is main, we got:
The control node crashed during the execution, as can be seen from it returning 0 bytes for the bottom 3 blocks.
TTFB (time to first byte) is only a few ms on the feature node and several seconds on the control node, showing that the streaming works correctly and starts to send data almost immediately.
The total response time varies between the two without a clear winner.
During the tests we saw the following memory consumption:
We see the expected spikes on the control node in GC time and general memory consumption. As the memory consumption increases by several GBs, we eventually run into an OOM error, which crashes the node.
On the feature node, GC time increases a bit, but the general memory consumption stays relatively flat. This is expected, because streaming keeps very little data in memory before writing it to the socket and discarding it right away.