evmtool: add engine-test subcommand and parallel workers for test runners #10184

Draft
spencer-tb wants to merge 5 commits into besu-eth:main from spencer-tb:feat/evm-enginetest

@spencer-tb spencer-tb commented Apr 4, 2026

Summary

Add evm engine-test subcommand for direct engine fixture execution through the real Engine API path, and add --workers, --run, and directory support to block-test and state-test.

evm engine-test

Routes engine fixtures through the real AbstractEngineNewPayload.syncResponse() and AbstractEngineForkchoiceUpdated.syncResponse(), exercising the same validation and execution logic as consume engine via Hive. No core Besu code is modified; the subcommand only imports existing engine API classes.

Code path (matching consume engine):

  1. Initial engine_forkchoiceUpdatedVX to genesis
  2. For each payload: engine_newPayloadVX via AbstractEngineNewPayload.syncResponse()
  3. For each valid payload: engine_forkchoiceUpdatedVX via AbstractEngineForkchoiceUpdated.syncResponse()

This is a different code path from block-test, which uses BlockImporter.importBlock().
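
The three-step sequence above can be sketched in a few lines. This is only an illustration: `Engine`, `fakeEngine`, and `run` are hypothetical names, not Besu classes, and the sketch treats a payload as valid when the stand-in engine returns `VALID`, which may differ from how the subcommand decides validity.

```java
import java.util.ArrayList;
import java.util.List;

public class EngineFlowSketch {
  /** Hypothetical stand-in for the engine API; not Besu's real types. */
  interface Engine {
    String forkchoiceUpdated(String headHash);

    String newPayload(String blockHash);
  }

  /** Fake engine for illustration: hashes starting with "bad" are rejected. */
  static Engine fakeEngine() {
    return new Engine() {
      @Override
      public String forkchoiceUpdated(String headHash) {
        return "VALID";
      }

      @Override
      public String newPayload(String blockHash) {
        return blockHash.startsWith("bad") ? "INVALID" : "VALID";
      }
    };
  }

  /** Runs the three-step sequence and returns the calls made, in order. */
  static List<String> run(Engine engine, String genesisHash, List<String> payloadHashes) {
    List<String> calls = new ArrayList<>();
    // Step 1: initial forkchoice update pointing at genesis.
    engine.forkchoiceUpdated(genesisHash);
    calls.add("fcu:" + genesisHash);
    for (String hash : payloadHashes) {
      // Step 2: every payload is submitted.
      String status = engine.newPayload(hash);
      calls.add("newPayload:" + hash + ":" + status);
      // Step 3: only valid payloads become the new head.
      if ("VALID".equals(status)) {
        engine.forkchoiceUpdated(hash);
        calls.add("fcu:" + hash);
      }
    }
    return calls;
  }

  public static void main(String[] args) {
    System.out.println(run(fakeEngine(), "0xgenesis", List.of("0xa", "bad0")));
  }
}
```

Note that the invalid payload gets no forkchoice update, so the head stays at the last valid block.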

block-test / state-test improvements

  • --run alias for --test-name (matching geth/erigon/nethermind naming)
  • --workers N — parallel file processing via ExecutorService thread pool
  • Directory input — recursively walks directories for .json files
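
For readers unfamiliar with the pattern, directory input plus a worker pool could look roughly like the sketch below. It is not the PR's code: `findFixtures`, `runAll`, and the `runFile` callback are hypothetical names, and the callback simply stands in for whatever executes a single fixture file.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class ParallelFixtureRunner {
  /** Recursively collects *.json files under root, sorted for stable ordering. */
  static List<Path> findFixtures(Path root) throws IOException {
    try (Stream<Path> paths = Files.walk(root)) {
      return paths
          .filter(Files::isRegularFile)
          .filter(p -> p.getFileName().toString().endsWith(".json"))
          .sorted()
          .collect(Collectors.toList());
    }
  }

  /** Runs runFile on each file using a fixed-size pool; returns how many files passed. */
  static int runAll(List<Path> files, int workers, Function<Path, Boolean> runFile)
      throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, workers));
    try {
      List<Future<Boolean>> futures = new ArrayList<>();
      for (Path file : files) {
        futures.add(pool.submit(() -> runFile.apply(file)));
      }
      int passed = 0;
      for (Future<Boolean> future : futures) {
        if (future.get()) {
          passed++;
        }
      }
      return passed;
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) throws Exception {
    Path root = Path.of(args.length > 0 ? args[0] : ".");
    List<Path> files = findFixtures(root);
    // Placeholder check: a real runner would parse and execute each fixture here.
    int passed = runAll(files, 8, Files::isReadable);
    System.out.println(passed + " / " + files.size() + " fixture files readable");
  }
}
```

Parallelising per file keeps each worker's state isolated, but any result collection shared between workers still needs to be thread-safe.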

Benchmarks

Tested against EEST v5.3.0 stable fixtures on Apple M-series.

For reference, Hive runs of the same test suite on the same Besu version:

  • consume engine: 1d 2h 37m
  • consume rlp: >36h

evm engine-test — exercises the same engine code paths as consume engine (40,523 tests, 47,882 payloads):

| Workers | Time | Speedup vs serial | vs Hive consume engine |
|---|---|---|---|
| 1 | 2m13s | 1x | ~720x |
| 8 | 1m18s | 1.7x | ~1,230x |

100% pass rate — 40,523 / 40,523 (0 failures). Full Hive parity.
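
The engine-test ratios can be re-derived from the stated wall-clock times. The snippet below is just that arithmetic, using the Hive consume engine figure of 1d 2h 37m; the class and method names are illustrative.

```java
import java.time.Duration;
import java.util.Locale;

public class SpeedupCheck {
  /** Returns how many times faster `measured` is than `baseline`. */
  static double ratio(Duration baseline, Duration measured) {
    return (double) baseline.toMillis() / measured.toMillis();
  }

  public static void main(String[] args) {
    Duration hive = Duration.ofDays(1).plusHours(2).plusMinutes(37); // 95,820 s
    Duration serial = Duration.ofMinutes(2).plusSeconds(13); // 133 s
    Duration eightWorkers = Duration.ofMinutes(1).plusSeconds(18); // 78 s

    // 133 / 78 is about 1.7
    System.out.printf(Locale.ROOT, "8 workers vs serial: %.1fx%n", ratio(serial, eightWorkers));
    // 95,820 / 133 is about 720
    System.out.printf(Locale.ROOT, "serial vs Hive: %.0fx%n", ratio(hive, serial));
    // 95,820 / 78 is about 1,228, reported above as ~1,230x
    System.out.printf(Locale.ROOT, "8 workers vs Hive: %.0fx%n", ratio(hive, eightWorkers));
  }
}
```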

evm block-test — exercises the same block import paths as consume rlp (43,795 tests):

| Workers | Time | Speedup vs serial | vs Hive consume rlp |
|---|---|---|---|
| 1 | 1m01s | 1x | >2,100x |
| 8 | 16.9s | 3.6x | >7,600x |

evm state-test (40,551 tests):

| Workers | Time | Speedup vs serial |
|---|---|---|
| 1 | 42.6s | 1x |
| 8 | 25.2s | 1.7x |

Usage

# Engine test with parallel workers
evm engine-test --workers 8 /path/to/blockchain_tests_engine/

# Block test with directory support and workers
evm block-test --workers 8 /path/to/blockchain_tests/

# State test with regex filter
evm state-test --run "eip4844" /path/to/state_tests/
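
As a rough illustration of regex-based test selection, the sketch below keeps every test whose name contains a match for the pattern. Whether `--run` matches a substring or requires the whole name to match is not stated in this description, so the use of `find()` here is an assumption, as are the class name, method name, and sample test names.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class TestNameFilter {
  /** Keeps names containing a match for the regex (substring semantics assumed). */
  static List<String> select(List<String> testNames, String regex) {
    Pattern pattern = Pattern.compile(regex);
    return testNames.stream()
        .filter(name -> pattern.matcher(name).find())
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    // Hypothetical test names, for illustration only.
    List<String> names =
        List.of("cancun/eip4844_blobs/test_a", "cancun/eip1153_tstore/test_b");
    System.out.println(select(names, "eip4844"));
  }
}
```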

Related: ethereum/go-ethereum#34650, erigontech/erigon#20315, NethermindEth/nethermind#11035, ethereum/execution-spec-tests#2319

…ate-test

Signed-off-by: spencer-tb <spencer.tb@ethereum.org>
spencer-tb force-pushed the feat/evm-enginetest branch from 2c43f7b to 383df79 on April 4, 2026 20:38
- Cache ReferenceTestProtocolSchedules across all tests (was rebuilding
  30+ protocol schedules per test)
- Share ObjectMapper, PostMergeContext, ServiceManager, EngineCallListener
  as static singletons (all stateless/thread-safe after init)
- Lazy engine method creation (9 objects per test → only 1-2 used)
- Thread-safety fix: synchronize failures map in BlockchainTestSubCommand
- Add --json-array flag to engine-test and block-test subcommands
- Output includes: name, pass, fork, lastBlockHash, lastPayloadStatus, error
- Report validation error in `error` even on pass for negative tests
- Delete EngineXTestSubCommand (unused, had compilation errors)
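
Two of the items above, caching expensive objects across tests and making the failures map safe for concurrent writers, follow a common pattern. The sketch below shows one way to do both with the standard library. It assumes a cache keyed by fork name and uses placeholder strings instead of real protocol schedules; the commit may structure this differently.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class SharedStateSketch {
  /** Counts how often the expensive build step actually runs. */
  static final AtomicInteger BUILDS = new AtomicInteger();

  /** Shared cache: each key is built at most once, even under concurrent access. */
  private static final Map<String, String> SCHEDULES = new ConcurrentHashMap<>();

  /** Failures map wrapped so that concurrent put() calls from workers are safe. */
  static final Map<String, String> FAILURES =
      Collections.synchronizedMap(new LinkedHashMap<>());

  static String scheduleFor(String fork) {
    return SCHEDULES.computeIfAbsent(
        fork,
        f -> {
          BUILDS.incrementAndGet(); // stands in for an expensive construction
          return "schedule(" + f + ")";
        });
  }

  static void recordFailure(String testName, String reason) {
    FAILURES.put(testName, reason);
  }

  public static void main(String[] args) throws InterruptedException {
    Thread[] workers = new Thread[4];
    for (int i = 0; i < workers.length; i++) {
      final int id = i;
      workers[i] =
          new Thread(
              () -> {
                for (int t = 0; t < 1000; t++) {
                  scheduleFor(t % 2 == 0 ? "Cancun" : "Prague");
                }
                recordFailure("worker-" + id, "example failure");
              });
      workers[i].start();
    }
    for (Thread worker : workers) {
      worker.join();
    }
    System.out.println("builds=" + BUILDS.get() + " failures=" + FAILURES.size());
  }
}
```

With per-key caching the build count equals the number of distinct keys rather than the number of tests, which is where the saving over rebuilding per test comes from.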
spencer-tb added a commit to spencer-tb/execution-specs that referenced this pull request Apr 8, 2026
New `validate` CLI command for running EEST fixtures directly against
client EVM binaries, replacing Hive for execution correctness testing.

Usage:
  validate health                    # health check all clients
  validate engine --client geth      # engine tests
  validate state --client besu       # state tests
  validate block --client nethermind # block tests

Features:
- 7 clients: geth, besu, nethermind, erigon, reth, ethrex, nimbus
- Per-type Pydantic result models: StateTestResult, BlockTestResult,
  EngineTestResult with type-specific fields
- Exception matching: maps client error strings to EEST exception
  types via ExceptionMapper, verifies correct exception for every
  invalid test (--no-exception-check to disable)
- Cross-validation: lastBlockHash against fixture, lastPayloadStatus
  (VALID/INVALID) for engine tests
- validate.toml config for client binary paths with per-type overrides
  (state-bin, block-bin, engine-bin)
- Auto bin-workers and xdist tuning per client
- Bundled Frontier sanity fixtures for health checks
- Shared validate_helpers.py for validation logic
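
The cross-validation rule listed above can be stated compactly. The commit implements it in Python (validate_helpers.py); the Java sketch below only restates the rule as described, with hypothetical parameter names, and assumes hashes compare case-insensitively.

```java
public class CrossValidationSketch {
  /**
   * Returns true when a client's reported result agrees with the fixture: the last block hash
   * matches and the last payload status is the one the fixture expects.
   */
  static boolean consistent(
      String reportedLastBlockHash,
      String fixtureLastBlockHash,
      String reportedLastPayloadStatus,
      boolean fixtureExpectsInvalid) {
    boolean hashMatches = fixtureLastBlockHash.equalsIgnoreCase(reportedLastBlockHash);
    String expectedStatus = fixtureExpectsInvalid ? "INVALID" : "VALID";
    return hashMatches && expectedStatus.equals(reportedLastPayloadStatus);
  }

  public static void main(String[] args) {
    System.out.println(consistent("0xABC", "0xabc", "VALID", false));
    System.out.println(consistent("0xabc", "0xabc", "VALID", true));
  }
}
```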

Client binary PRs:
- geth: ethereum/go-ethereum#34650
- erigon: erigontech/erigon#20315
- besu: besu-eth/besu#10184
- nethermind: NethermindEth/nethermind#11035
- reth: paradigmxyz/reth#23361
- ethrex: lambdaclass/ethrex#6445
- nimbus: status-im/nimbus-eth1#4101
- revm: bluealloy/revm#3544

Tracking issue: ethereum#2319