feat(l1): ef_tests: add standalone statetest, blocktest, and enginetest CLI runners#6445

Draft
spencer-tb wants to merge 7 commits into lambdaclass:main from spencer-tb:main

@spencer-tb

Summary

Add three standalone CLI runners that execute EEST fixtures directly, enabling `consume direct` integration without Hive. Each runner accepts the `--workers N`, `--run <regex>`, `--json`, and `--path <dir>` flags.
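As a rough illustration of the shared flag surface, the sketch below parses the four flags with the standard library only. The `RunnerOpts` struct, the `parse_opts` function, and the default of one worker are hypothetical and are not taken from this PR; the actual runners may well use an argument-parsing crate, and the regex is left as an uncompiled string here.

```rust
use std::path::PathBuf;

/// Hypothetical options struct; the fields mirror the four flags listed above.
#[derive(Debug, PartialEq)]
struct RunnerOpts {
    workers: usize,       // --workers N (default of 1 is an assumption)
    run: Option<String>,  // --run <regex>, kept as an uncompiled pattern
    json: bool,           // --json
    path: Option<PathBuf>, // --path <dir>
}

/// Parse the flags from an iterator of arguments (program name already removed).
fn parse_opts<I: IntoIterator<Item = String>>(args: I) -> Result<RunnerOpts, String> {
    let mut opts = RunnerOpts { workers: 1, run: None, json: false, path: None };
    let mut it = args.into_iter();
    while let Some(arg) = it.next() {
        match arg.as_str() {
            "--workers" => {
                let v = it.next().ok_or("--workers needs a value")?;
                opts.workers = v
                    .parse()
                    .map_err(|_| format!("invalid worker count: {v}"))?;
            }
            "--run" => opts.run = Some(it.next().ok_or("--run needs a value")?),
            "--json" => opts.json = true,
            "--path" => {
                let dir = it.next().ok_or("--path needs a value")?;
                opts.path = Some(PathBuf::from(dir));
            }
            other => return Err(format!("unknown flag: {other}")),
        }
    }
    Ok(opts)
}
```

A binary would call `parse_opts(std::env::args().skip(1))` at startup and exit with a non-zero status on `Err`.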

Why new CLIs?

The existing test infrastructure uses either cargo test harnesses (datatest-stable for blockchain tests) or the state_v2 binary. Neither supports the flags needed for consume direct:

  • datatest-stable harness — no --run regex filter, no --json output, no --workers control, hardcoded fixture paths. Can't be called as a standalone binary from EELS.
  • state_v2 runner — creates a full Store (with background threads, trie cache, channels) per test case, making it 89x slower than necessary. State tests only need a HashMap → VM → check result.

The new runners are standalone binaries that EELS consume direct can invoke directly.

`ef_tests-statetest`

Fast state test runner using LEVM directly with in-memory HashMap state. No Store, no trie, no background threads — just loads pre-state into memory, executes one transaction, and verifies post-state accounts directly.
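The load, execute, verify flow described above can be sketched with plain standard-library types. Everything in this block is a simplified stand-in: `Account`, `Transfer`, `execute`, and `verify_post` are hypothetical names, and the toy value transfer takes the place of the real LEVM call, whose API is not shown in this PR description.

```rust
use std::collections::HashMap;

type Address = [u8; 20];

/// Minimal account model; real fixtures also carry code and storage.
#[derive(Clone, Debug, PartialEq, Default)]
struct Account {
    balance: u128,
    nonce: u64,
}

/// Toy stand-in for a transaction: a plain value transfer.
struct Transfer {
    from: Address,
    to: Address,
    value: u128,
}

/// Placeholder for the VM call: mutates the in-memory state directly.
fn execute(state: &mut HashMap<Address, Account>, tx: &Transfer) -> Result<(), String> {
    let sender = state.get_mut(&tx.from).ok_or("unknown sender")?;
    if sender.balance < tx.value {
        return Err("insufficient balance".into());
    }
    sender.balance -= tx.value;
    sender.nonce += 1;
    state.entry(tx.to).or_default().balance += tx.value;
    Ok(())
}

/// Return the addresses whose actual account differs from the expected post-state.
fn verify_post(
    state: &HashMap<Address, Account>,
    expected: &HashMap<Address, Account>,
) -> Vec<Address> {
    expected
        .iter()
        .filter(|(addr, want)| state.get(*addr) != Some(*want))
        .map(|(addr, _)| *addr)
        .collect()
}
```

Because the state is an ordinary `HashMap` owned by the test, each case can be built and dropped without touching a database or spawning threads, which is the property the runner relies on.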

`ef_tests-blocktest`

Thin CLI wrapper around the existing `ef_tests-blockchain` crate. Calls `parse_and_execute()` per fixture file with rayon parallelism.
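To show the shape of per-file parallelism with a fixed worker count, here is a minimal sketch that uses scoped standard-library threads in place of rayon. The `run_parallel` helper is hypothetical, and the closure passed to it stands in for a call such as `parse_and_execute()`, whose signature is not given in this description.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

/// Run `job` over every item using up to `workers` threads.
/// Results are returned in the same order as the input items.
fn run_parallel<T, R, F>(items: &[T], workers: usize, job: F) -> Vec<R>
where
    T: Sync,
    R: Send,
    F: Fn(&T) -> R + Sync,
{
    // Shared cursor: each worker claims the next unprocessed index.
    let next = AtomicUsize::new(0);
    let mut indexed: Vec<(usize, R)> = thread::scope(|s| {
        let mut handles = Vec::new();
        for _ in 0..workers.max(1) {
            handles.push(s.spawn(|| {
                let mut local = Vec::new();
                loop {
                    let i = next.fetch_add(1, Ordering::Relaxed);
                    if i >= items.len() {
                        break;
                    }
                    local.push((i, job(&items[i])));
                }
                local
            }));
        }
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("worker panicked"))
            .collect()
    });
    // Restore input order so reports are deterministic.
    indexed.sort_by_key(|(i, _)| *i);
    indexed.into_iter().map(|(_, r)| r).collect()
}
```

With rayon the same effect is usually obtained by configuring a thread pool with the requested number of threads and iterating over the file list in parallel; the manual cursor here only exists to keep the sketch dependency-free.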

`ef_tests-enginetest`

Engine test runner for `blockchain_test_engine` fixtures. Routes payloads through the real Engine API RPC handler functions: `NewPayloadV1-V5Request.parse()` + `.handle(context)` and `ForkChoiceUpdatedV1-V4.handle(context)`. Same code path as `consume engine` via Hive, minus the HTTP transport layer.

Uses a shared `SyncManager`/`PeerHandler` via `OnceCell` to avoid thread exhaustion. Per-test `RpcApiContext` with isolated `block_worker_channel`.
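The sharing pattern can be illustrated as follows: one lazily initialised, process-wide instance for the expensive components, and a fresh context with its own channel for every test. This sketch substitutes `std::sync::OnceLock` for `OnceCell` and uses hypothetical placeholder types (`SharedServices`, `TestContext`); the real `SyncManager`, `PeerHandler`, and `RpcApiContext` types are not reproduced here.

```rust
use std::sync::{mpsc, Arc, OnceLock};

/// Hypothetical stand-in for the expensive, thread-owning components.
struct SharedServices {
    name: &'static str,
}

static SHARED: OnceLock<Arc<SharedServices>> = OnceLock::new();

/// Initialised on first use; every later call returns the same instance.
fn shared_services() -> Arc<SharedServices> {
    SHARED
        .get_or_init(|| Arc::new(SharedServices { name: "sync+peers" }))
        .clone()
}

/// Hypothetical per-test context: shared services plus an isolated channel.
struct TestContext {
    shared: Arc<SharedServices>,
    block_tx: mpsc::Sender<u64>,
    block_rx: mpsc::Receiver<u64>,
}

/// Build a context for one test; only the channel is created per call.
fn new_test_context() -> TestContext {
    let (block_tx, block_rx) = mpsc::channel();
    TestContext { shared: shared_services(), block_tx, block_rx }
}
```

Since the shared instance is created once, the number of background threads it owns stays constant regardless of how many tests run, while messages sent on one test's channel never reach another test.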

Benchmarks

Tested against EEST v5.3.0 stable fixtures on Apple M-series.

`ef_tests-statetest` (36,220 tests):

| Workers | Time | vs state_v2 |
|---|---|---|
| 8 | 3.8s | 89x faster |

`ef_tests-blocktest` (2,776 files, ~40k test cases):

| Workers | Time | vs cargo test |
|---|---|---|
| 8 | 28.7s | ~same (now with CLI flags) |

`ef_tests-enginetest` — exercises the same engine code paths as `consume engine` (40,521 tests):

| Workers | Time |
|---|---|
| 8 | 32.7s |

All tests pass across the three runners, apart from 3 known statetest failures caused by invalid CREATE blob transactions.

Usage

    # State tests
    ef_tests-statetest --workers 8 --path /path/to/state_tests/

    # Block tests
    ef_tests-blocktest --workers 8 --path /path/to/blockchain_tests/

    # Engine tests
    ef_tests-enginetest --workers 8 --path /path/to/blockchain_tests_engine/

    # Filter by regex
    ef_tests-enginetest --run "eip7702" --path /path/to/fixtures/

    # JSON output for consume direct
    ef_tests-statetest --json --workers 8 --path /path/to/state_tests/

Related: ethereum/go-ethereum#34650, erigontech/erigon#20315, NethermindEth/nethermind#11035, besu-eth/besu#10184, ethereum/execution-spec-tests#2319

Fast state test runner using LEVM directly with in-memory HashMap
state. No Store, no trie, no background threads.

36,220 tests in 3.8s (w=8). 89x faster than state_v2 runner.
3 known failures (invalid CREATE blob transactions).

Flags: --workers N, --run <regex>, --json, --path <dir>

Thin wrapper around ef_tests-blockchain with CLI flags.
2,776 files in 28.7s (w=8). Same results as cargo test harness.

Flags: --workers N, --run <regex>, --json, --path <dir>

Engine test runner for blockchain_test_engine fixtures. Validates
engine API version-specific parameters (V1-V5), converts payload to
block, executes via Blockchain::add_block_pipeline(), applies
fork choice, and verifies post-state.

40,521 tests in 29.5s (w=8). 0 failures on v5.3.0 stable.

Flags: --workers N, --run <regex>, --json, --path <dir>

Routes payloads through the real Engine API RPC handler functions:
NewPayloadV1-V5Request.parse() + .handle(context) and
ForkChoiceUpdatedV1-V4.handle(context). Same code path as
consume engine via Hive, minus HTTP transport.

Uses shared SyncManager/PeerHandler via OnceCell to avoid thread
exhaustion. Per-test RpcApiContext with isolated block_worker_channel.

40,521 tests in 32.7s (w=8). 0 failures on v5.3.0 stable.

@edg-l changed the title from "ef_tests: add standalone statetest, blocktest, and enginetest CLI runners" to "feat(l1): ef_tests: add standalone statetest, blocktest, and enginetest CLI runners" on Apr 7, 2026

spencer-tb added a commit to spencer-tb/execution-specs that referenced this pull request Apr 8, 2026
New `validate` CLI command for running EEST fixtures directly against
client EVM binaries, replacing Hive for execution correctness testing.

Usage:
  validate health                    # health check all clients
  validate engine --client geth      # engine tests
  validate state --client besu       # state tests
  validate block --client nethermind # block tests

Features:
- 7 clients: geth, besu, nethermind, erigon, reth, ethrex, nimbus
- Per-type Pydantic result models: StateTestResult, BlockTestResult,
  EngineTestResult with type-specific fields
- Exception matching: maps client error strings to EEST exception
  types via ExceptionMapper, verifies correct exception for every
  invalid test (--no-exception-check to disable)
- Cross-validation: lastBlockHash against fixture, lastPayloadStatus
  (VALID/INVALID) for engine tests
- validate.toml config for client binary paths with per-type overrides
  (state-bin, block-bin, engine-bin)
- Auto bin-workers and xdist tuning per client
- Bundled Frontier sanity fixtures for health checks
- Shared validate_helpers.py for validation logic

Client binary PRs:
- geth: ethereum/go-ethereum#34650
- erigon: erigontech/erigon#20315
- besu: besu-eth/besu#10184
- nethermind: NethermindEth/nethermind#11035
- reth: paradigmxyz/reth#23361
- ethrex: lambdaclass/ethrex#6445
- nimbus: status-im/nimbus-eth1#4101
- revm: bluealloy/revm#3544

Tracking issue: ethereum#2319