
feat(validate): new validate command for direct client binary testing #2622

Draft
spencer-tb wants to merge 1 commit into ethereum:forks/amsterdam from spencer-tb:feat/consume-direct-engine

Conversation

@spencer-tb
Contributor

@spencer-tb spencer-tb commented Apr 5, 2026

🗒️ Description

New validate (props to @raxhvl for the naming convention) CLI command for running EEST fixtures directly against client EVM binaries. Replaces consume direct with a cleaner UX — type is the subcommand, --client is required, no --bin or --type flags.

validate health                    # health check all clients
validate engine --client geth      # engine tests
validate state --client besu       # state tests
validate block --client nethermind # block tests

Features

  • 7 clients: geth, besu, nethermind, erigon, reth, ethrex, nimbus
  • Per-type Pydantic models: StateTestResult (stateRoot), BlockTestResult (lastBlockHash), EngineTestResult (lastPayloadStatus)
  • Exception matching: maps client error strings to EEST exception types, verifies correct exception for every invalid test (--no-exception-check to disable)
  • Cross-validation: lastBlockHash against fixture, lastPayloadStatus (VALID/INVALID) for engine tests
  • validate.toml config for client binary paths with per-type overrides (state-bin, block-bin, engine-bin)
  • Auto-tuning: bin-workers and xdist settings per client
  • Health checks: version detection + sanity fixture per client per type
  • Aliases: go-ethereum resolves to geth
  • Fully standalone: no dependency on consume plugin
  • Removes consume direct: replaced entirely by validate

Pydantic Result Models

All client adapters return structured results via a shared model hierarchy in cli_types.py:

FixtureTestResult          # base — name, pass, fork, error
├── StateTestResult        # + stateRoot
└── BlockTestResult        # + lastBlockHash
    └── EngineTestResult   # + lastPayloadStatus

| Model | Extra Fields | Used By |
| --- | --- | --- |
| FixtureTestResult | name, pass, fork, error | base class |
| StateTestResult | stateRoot | validate state |
| BlockTestResult | lastBlockHash | validate block |
| EngineTestResult | lastBlockHash, lastPayloadStatus | validate engine |
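
The hierarchy above could look roughly like the sketch below. It uses standard-library dataclasses as a stand-in for the Pydantic models in cli_types.py so that it runs without dependencies; the snake_case attribute names, the `KEY_MAP` table, the `parse_result` helper, and the sample JSON are all assumptions for illustration, not the PR's actual code.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class FixtureTestResult:
    name: str
    passed: bool  # JSON key "pass"; renamed because `pass` is a Python keyword
    fork: str
    error: Optional[str] = None


@dataclass
class StateTestResult(FixtureTestResult):
    state_root: Optional[str] = None  # JSON key "stateRoot"


@dataclass
class BlockTestResult(FixtureTestResult):
    last_block_hash: Optional[str] = None  # JSON key "lastBlockHash"


@dataclass
class EngineTestResult(BlockTestResult):
    last_payload_status: Optional[str] = None  # JSON key "lastPayloadStatus"


# Hypothetical mapping from JSON keys to attribute names.
KEY_MAP = {
    "pass": "passed",
    "stateRoot": "state_root",
    "lastBlockHash": "last_block_hash",
    "lastPayloadStatus": "last_payload_status",
}


def parse_result(model, raw: str):
    """Build a result object of the given model from one JSON object."""
    data = json.loads(raw)
    kwargs = {KEY_MAP.get(key, key): value for key, value in data.items()}
    return model(**kwargs)


# Made-up client output for a single engine test.
raw = (
    '{"name": "example_test", "pass": true, "fork": "Prague", '
    '"lastBlockHash": "0xabc", "lastPayloadStatus": "VALID"}'
)
result = parse_result(EngineTestResult, raw)
print(result.passed, result.last_block_hash, result.last_payload_status)
```

Because EngineTestResult derives from BlockTestResult, code that only checks lastBlockHash can accept either type.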

Each client binary outputs JSON matching these schemas. The shared validate_helpers.py module validates results against fixture expectations:

  • stateRoot compared to fixture's postState hash
  • lastBlockHash compared to fixture's lastblockhash
  • lastPayloadStatus checked as VALID (positive) or INVALID (negative test)
  • error matched through ExceptionMapper against fixture's expectException / validationError
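
The checks above can be sketched as follows for an engine test. This is a simplified illustration, not the code in validate_helpers.py: the function names, the substring-based matching, and the pattern table are assumptions, and the exception names are used only as illustrative strings.

```python
from typing import Optional

# Hypothetical substring -> exception-name table; real mappings are per client.
ERROR_PATTERNS = {
    "insufficient funds": "TransactionException.INSUFFICIENT_ACCOUNT_FUNDS",
    "intrinsic gas too low": "TransactionException.INTRINSIC_GAS_TOO_LOW",
}


def map_exception(client_error: str) -> Optional[str]:
    """Map a client error string to an exception name, or None if unknown."""
    lowered = client_error.lower()
    for pattern, exception in ERROR_PATTERNS.items():
        if pattern in lowered:
            return exception
    return None


def check_engine_result(
    result: dict,
    expected_hash: str,
    expected_exception: Optional[str],
    check_exceptions: bool = True,
) -> list[str]:
    """Return mismatch descriptions; an empty list means all checks passed."""
    problems = []
    negative = expected_exception is not None

    # Negative tests must end INVALID, positive tests VALID.
    want_status = "INVALID" if negative else "VALID"
    got_status = result.get("lastPayloadStatus")
    if got_status != want_status:
        problems.append(f"lastPayloadStatus: want {want_status}, got {got_status}")

    got_hash = result.get("lastBlockHash")
    if got_hash != expected_hash:
        problems.append(f"lastBlockHash: want {expected_hash}, got {got_hash}")

    # Skipped when the caller disables exception checking.
    if negative and check_exceptions:
        got_exception = map_exception(result.get("error") or "")
        if got_exception != expected_exception:
            problems.append(
                f"exception: want {expected_exception}, got {got_exception}"
            )
    return problems


ok = check_engine_result(
    {
        "lastPayloadStatus": "INVALID",
        "lastBlockHash": "0x01",
        "error": "Insufficient funds for gas * price + value",
    },
    expected_hash="0x01",
    expected_exception="TransactionException.INSUFFICIENT_ACCOUNT_FUNDS",
)
bad = check_engine_result(
    {"lastPayloadStatus": "VALID", "lastBlockHash": "0x02"},
    expected_hash="0x01",
    expected_exception=None,
)
print(ok, bad)
```

Returning a list of mismatches instead of raising on the first one lets a single run report every field that disagrees with the fixture.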

Results (v5.3.0 fixtures, Apple M-series)

| Client | Engine Time | Engine Pass % | Block Time | Block Pass % | State Time | State Pass % | Exc Check (state/block/engine) | Default Flags |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| geth | 58s | 99.98% | 64s | 99.96% | pending | pending | ✓ / ✓ / ✓ | --bin-workers 8 (auto) |
| ethrex | 66s | 100% | 66s | 100% | pending | pending | ✗ / ✗ / ✗ | --bin-workers 8 (auto) |
| nimbus | 71s | 100% | 67s | 100% | pending | pending | ✗ / ✗ / ✗ | --fast mode, sequential |
| reth | 111s | 99.10% | broken | | pending | pending | ✗ / — / ✗ | -n 2 (auto) |
| besu | 174s | 99.99% | 65s | 100% | pending | pending | ✓ / ✗ / ✓ | --bin-workers 8 (auto) |
| nethermind | 176s | 99.98% | 107s | 99.97% | pending | pending | ✓ / ✓ / ✓ | --bin-workers 4 (auto) |
| erigon | 26m | 100% | pending | pending | pending | pending | ✗ / ✓ / ✗ | --bin-workers 8 (auto) |

Exception check: the client reports the validation error for negative tests even when the test passes, which enables exception verification on the EELS side.

Client PRs (adding statetest/blocktest/enginetest runners)

🔗 Related Issues or PRs

Fixes #2319

✅ Checklist

  • All: Ran fast static checks to avoid unnecessary CI fails, see also Code Standards and Enabling Pre-commit Checks:
    just static
  • All: PR title adheres to the repo standard - it will be used as the squash commit message and should start type(scope):.
  • All: Considered updating the online docs in the ./docs/ directory.
  • All: Set appropriate labels for the changes (only maintainers can apply labels).

@spencer-tb spencer-tb added the C-feat (Category: an improvement or new feature), A-test-consume (Area: execution_testing.cli.pytest_commands.plugins.consume), and A-test-client-clis (Area: execution_testing.client_clis) labels Apr 5, 2026
@spencer-tb spencer-tb changed the title from "feat(consume): consume direct with per-type result models and exception matching" to "feat(test-consume): direct with per type result models and exception matching" Apr 5, 2026
@spencer-tb spencer-tb force-pushed the feat/consume-direct-engine branch from d67f961 to 8e48ce6 April 8, 2026 17:02
@spencer-tb spencer-tb changed the title from "feat(test-consume): direct with per type result models and exception matching" to "feat(validate): new validate command for direct client binary testing" Apr 8, 2026
@spencer-tb spencer-tb force-pushed the feat/consume-direct-engine branch 7 times, most recently from 2c76c12 to 2658023 April 8, 2026 19:40
New `validate` CLI command for running EEST fixtures directly against
client EVM binaries, replacing Hive for execution correctness testing.

Usage:
  validate health                    # health check all clients
  validate engine --client geth      # engine tests
  validate state --client besu       # state tests
  validate block --client nethermind # block tests

Features:
- 7 clients: geth, besu, nethermind, erigon, reth, ethrex, nimbus
- Per-type Pydantic result models: StateTestResult, BlockTestResult,
  EngineTestResult with type-specific fields
- Exception matching: maps client error strings to EEST exception
  types via ExceptionMapper, verifies correct exception for every
  invalid test (--no-exception-check to disable)
- Cross-validation: lastBlockHash against fixture, lastPayloadStatus
  (VALID/INVALID) for engine tests
- validate.toml config for client binary paths with per-type overrides
  (state-bin, block-bin, engine-bin)
- Auto bin-workers and xdist tuning per client
- Bundled Frontier sanity fixtures for health checks
- Shared validate_helpers.py for validation logic

Client binary PRs:
- geth: ethereum/go-ethereum#34650
- erigon: erigontech/erigon#20315
- besu: besu-eth/besu#10184
- nethermind: NethermindEth/nethermind#11035
- reth: paradigmxyz/reth#23361
- ethrex: lambdaclass/ethrex#6445
- nimbus: status-im/nimbus-eth1#4101
- revm: bluealloy/revm#3544

Tracking issue: ethereum#2319
@spencer-tb spencer-tb force-pushed the feat/consume-direct-engine branch from 2658023 to 88d72b6 April 8, 2026 19:46