cmd/evm: add enginetest command for direct engine fixture execution #34650

Draft
spencer-tb wants to merge 9 commits into ethereum:master from spencer-tb:feat/evm-enginetest

@spencer-tb spencer-tb commented Apr 3, 2026

Summary

Add evm enginetest command that runs blockchain_test_engine fixtures directly against a lightweight Engine API handler, without requiring Hive or full client startup. Also add --workers flag to all three test runners (enginetest, blocktest, statetest) for parallel fixture file processing.

evm enginetest

A new direct runner for Engine API test fixtures. Implements a lightweight engine handler that mirrors the core logic of eth/catalyst.ConsensusAPI:

  • Version-specific NewPayloadV1-V5 parameter validation
  • ExecutableDataToBlock payload conversion and block insertion via InsertBlockWithoutSetHead
  • ForkchoiceUpdated with SetCanonical head management (including initial FCU to genesis)
  • Invalid block ancestor tracking with proper PayloadStatusV1 responses
  • EngineAPIError code validation against fixture expectations (errorCode, validationError)

This exercises the actual engine code path (two-phase insert-then-canonicalize via forkchoice), not just InsertChain like blocktest.
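The two-phase flow described above can be illustrated with a toy model. This is only a sketch: `toyChain`, `block`, `newPayload`, and `forkchoiceUpdated` are hypothetical names invented for illustration and are not the types or methods used in this PR or in geth. The point is that inserting a block and moving the canonical head are separate steps.

```go
package main

import (
	"errors"
	"fmt"
)

// block is a hypothetical, stripped-down stand-in for a real block.
type block struct {
	hash, parent string
}

// toyChain is a hypothetical model: it stores known blocks and a separate
// canonical head, so insertion and canonicalization are distinct steps.
type toyChain struct {
	blocks map[string]block
	head   string
}

func newToyChain(genesis string) *toyChain {
	return &toyChain{
		blocks: map[string]block{genesis: {hash: genesis}},
		head:   genesis,
	}
}

// newPayload models phase one: store the block if its parent is known,
// without touching the canonical head.
func (c *toyChain) newPayload(b block) error {
	if _, ok := c.blocks[b.parent]; !ok {
		return errors.New("unknown parent")
	}
	c.blocks[b.hash] = b
	return nil
}

// forkchoiceUpdated models phase two: move the canonical head to a block
// that was inserted earlier.
func (c *toyChain) forkchoiceUpdated(head string) error {
	if _, ok := c.blocks[head]; !ok {
		return errors.New("unknown head")
	}
	c.head = head
	return nil
}

func main() {
	c := newToyChain("g")
	if err := c.newPayload(block{hash: "a", parent: "g"}); err != nil {
		fmt.Println("insert failed:", err)
		return
	}
	fmt.Println("head after insert:", c.head) // insertion alone does not move the head
	if err := c.forkchoiceUpdated("a"); err != nil {
		fmt.Println("forkchoice failed:", err)
		return
	}
	fmt.Println("head after forkchoice:", c.head)
}
```

A runner that only calls a single "insert and set head" routine would never observe the intermediate state in which a block is known but not yet canonical.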

Benchmarks

Tested against EEST v5.3.0 stable fixtures on Apple M-series.

For reference, Hive runs of the same test suite on the same geth version:

evm enginetest — exercises the same engine code paths as consume engine (40,523 tests):

| Workers | Time | Speedup vs serial | vs Hive consume engine |
|---------|------|-------------------|------------------------|
| 1 | 1m02s | 1x | ~162x |
| 8 | 12.0s | 5.2x | ~840x |

evm blocktest — exercises the same execution paths as consume rlp (43,924 tests):

| Workers | Time | Speedup vs serial |
|---------|------|-------------------|
| 1 | 1m06s | 1x |
| 8 | 12.7s | 5.2x |

evm statetest (40,553 tests):

| Workers | Time | Speedup |
|---------|------|---------|
| 1 | 21.8s | 1x |
| 8 | 4.4s | 4.9x |

Hive parity

Tested against v5.3.0 stable release — exact same 4 failures as Hive consume engine on geth master:

  • eip7002/test_system_contract_deployment[CancunToPragueAtTime15k-deploy_after_fork-nonzero_balance]
  • eip7002/test_system_contract_deployment[CancunToPragueAtTime15k-deploy_after_fork-zero_balance]
  • eip7251/test_system_contract_deployment[CancunToPragueAtTime15k-deploy_after_fork-nonzero_balance]
  • eip7251/test_system_contract_deployment[CancunToPragueAtTime15k-deploy_after_fork-zero_balance]

Usage

# Run engine fixtures
evm enginetest /path/to/blockchain_tests_engine/

# With parallel workers
evm enginetest --workers 8 /path/to/blockchain_tests_engine/

# Filter by regex
evm enginetest --run "eip4844" /path/to/fixtures/

# Human-readable output
evm enginetest --human /path/to/fixtures/

# Same --workers flag on blocktest and statetest
evm blocktest --workers 8 /path/to/blockchain_tests/
evm statetest --workers 8 /path/to/state_tests/

@spencer-tb spencer-tb force-pushed the feat/evm-enginetest branch 5 times, most recently from 29e3f4c to 00957ea on April 3, 2026 16:47
return engine.PayloadStatusV1{Status: engine.INVALID}, engineParamsErr("nil versionedHashes post-cancun")
case p.BeaconRoot == nil:
return engine.PayloadStatusV1{Status: engine.INVALID}, engineParamsErr("nil beaconRoot post-cancun")
case !h.checkFork(params.Timestamp, forks.Cancun, forks.Prague, forks.Osaka, forks.BPO1, forks.BPO2, forks.BPO3, forks.BPO4, forks.BPO5):
Contributor

Will need to double check this for my own sanity -- for some reason I was expecting it to just say forks.Cancun

Contributor Author

@spencer-tb spencer-tb Apr 4, 2026

This matches the real ConsensusAPI.NewPayloadV3 at api.go:204, which allows V3 for Cancun through BPO5. So maybe it can be changed there too, but I'm not 100% certain :)

Contributor

Ah I see, seems I was remembering this:

case !api.checkFork(params.Timestamp, forks.Cancun):
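
To make the difference between the single-fork and multi-fork checks concrete, here is a toy model of a variadic fork check. It is an illustration under assumptions, not geth's implementation: the `fork` enum, the `activation` schedule, the timestamps, and both helper functions are hypothetical.

```go
package main

import "fmt"

// fork is a hypothetical enum; the real code uses named fork identifiers.
type fork int

const (
	shanghai fork = iota
	cancun
	prague
)

// activation pairs a fork with its activation timestamp (values are made up).
type activation struct {
	f    fork
	time uint64
}

// latestFork returns the most recent fork active at ts.
// It assumes the schedule is non-empty and sorted by ascending activation time.
func latestFork(schedule []activation, ts uint64) fork {
	current := schedule[0].f
	for _, a := range schedule {
		if ts >= a.time {
			current = a.f
		}
	}
	return current
}

// checkFork reports whether the fork active at ts is one of the allowed forks.
func checkFork(schedule []activation, ts uint64, allowed ...fork) bool {
	active := latestFork(schedule, ts)
	for _, f := range allowed {
		if f == active {
			return true
		}
	}
	return false
}

func main() {
	schedule := []activation{{shanghai, 0}, {cancun, 100}, {prague, 200}}
	// At timestamp 250 Prague is active in this toy schedule.
	fmt.Println(checkFork(schedule, 250, cancun))         // only Cancun allowed
	fmt.Println(checkFork(schedule, 250, cancun, prague)) // Cancun or Prague allowed
}
```

In this model the first call prints `false` and the second prints `true`: a check that lists only Cancun rejects a Prague-era payload, while a check that lists later forks as well accepts it.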

Comment on lines +202 to +204
if postCheck != nil {
defer postCheck(result, chain)
}
Contributor

Minor nit: I think we may want to use a closure here, because `result` is evaluated when the defer statement executes, not when the deferred call actually runs.

Contributor

Small self-contained example to elaborate:

package main

import "fmt"

func runBuggy(postCheck func(error, string)) (result error) {
	chain := "some-chain"
	// `result` is evaluated when the defer statement executes (it is still nil), not when the deferred call fires.
	if postCheck != nil {
		defer postCheck(result, chain)
	}
	result = fmt.Errorf("payload 3: expected VALID, got INVALID")
	return result
}

func runFixed(postCheck func(error, string)) (result error) {
	chain := "some-chain"
	// closure captures `result` by reference, so it reads the final value when the defer fires.
	if postCheck != nil {
		defer func() { postCheck(result, chain) }()
	}
	result = fmt.Errorf("payload 3: expected VALID, got INVALID")
	return result
}
func main() {
	check := func(res error, chain string) {
		fmt.Println("  postCheck got error:", res)
	}
	fmt.Println("buggy:")
	runBuggy(check)
	fmt.Println("fixed:")
	runFixed(check)
}

var tests map[string]*tests.BlockTest
if err = json.Unmarshal(src, &tests); err != nil {
return nil, err
return nil, nil // Skip non-fixture JSON files
Contributor

Just want to confirm that this also skips errors from malformed fixture files?

- Add lastBlockHash to blocktest/enginetest, lastPayloadStatus to enginetest
- Remove stateRoot from blocktest/enginetest (only statetest has it)
- Report validation/rejection error in `error` even when test passes,
  for negative tests (expected exceptions)
- Enables EELS consume direct to map errors through ExceptionMapper
  and verify correct exception for every invalid test
@MariusVanDerWijden
Member

I like the idea. I don't know if we need --workers though; we could just default to runtime.NumCPU()
