@LouisTsai-Csie
Collaborator

@LouisTsai-Csie LouisTsai-Csie commented Jul 11, 2025

🗒️ Description

This PR introduces a new fill option, --gas-benchmark-values. Supply a comma-separated list of gas amounts (in millions) to set the values used during benchmarking.
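As a rough illustration of the option's semantics, the sketch below converts a comma-separated string of gas amounts in millions into absolute gas values. The function name `parse_gas_benchmark_values` is hypothetical and not taken from the PR; the actual plugin may parse and validate its input differently.

```python
def parse_gas_benchmark_values(raw: str) -> list[int]:
    """Convert e.g. "1,10,30" (millions of gas) into absolute gas amounts.

    Hypothetical helper for illustration only; not the PR's implementation.
    """
    values = []
    for item in raw.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate trailing commas or stray whitespace
        millions = int(item)  # raises ValueError on non-integer input
        if millions <= 0:
            raise ValueError(f"gas value must be positive, got {millions}")
        values.append(millions * 1_000_000)
    return values


if __name__ == "__main__":
    print(parse_gas_benchmark_values("1,10,30,60,90,120"))
```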

The PR also adds two example tests in tests/benchmark/test_worst_blocks.py. To generate their fixtures, run:

uv run fill -v tests/benchmark/test_worst_blocks.py::test_block_full_data \
  --fork Prague \
  --gas-benchmark-values 1,10,30,60,90,120 \
  --generate-pre-alloc-groups \
  --clean

The --generate-pre-alloc-groups flag is required for the enginex fixture format.

The command creates two directories:

  • fixtures/blockchain_tests_engine_x/benchmark/worst_blocks
  • fixtures/blockchain_tests_engine_x/pre_alloc

Because only one preAllocGroup is produced, this process generates a single genesis file.

To generate the genesis file, follow the documentation to run hive locally, then run the extract_config command.

For example: uv run extract_config --fixture fixtures/blockchain_tests_engine_x/pre_alloc/0x10763c36b27696c5.json
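If more than one pre-alloc group were produced, the same command would presumably be run once per file. The sketch below only builds those command strings from the directory contents; the helper name `extract_config_commands` is hypothetical, and the command shape is copied from the example above rather than from the tool's documentation.

```python
from pathlib import Path


def extract_config_commands(pre_alloc_dir: Path) -> list[str]:
    """Return one `extract_config` invocation per pre-alloc group file.

    Hypothetical helper: it only formats command strings, it runs nothing.
    """
    return [
        f"uv run extract_config --fixture {fixture.as_posix()}"
        for fixture in sorted(pre_alloc_dir.glob("*.json"))
    ]


if __name__ == "__main__":
    for command in extract_config_commands(
        Path("fixtures/blockchain_tests_engine_x/pre_alloc")
    ):
        print(command)
```

With a single preAllocGroup, as in this PR, the list contains exactly one command.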

I would prefer to refactor the benchmark test in a separate PR; this task has been added to the issue.

I’ve reviewed the Filling Test section, and I see that the command and flag descriptions are generated by this script. However, I’m happy to contribute additional documentation if needed.

I added three pytest plugin test cases:
Case 1: Verify that the --gas-benchmark-values flag is added.
Case 2: Verify that the flag works as expected when provided.
Case 3: Verify that non-benchmark tests are not affected.

Run them with the following command:

python -m pytest src/pytest_plugins/filler/tests/test_benchmarking.py -v
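The behavior these three cases check could look roughly like the plugin sketch below. It is a minimal illustration that relies only on standard pytest hook APIs (`parser.addoption`, `metafunc.fixturenames`, `metafunc.parametrize`). The fixture name `gas_benchmark_value` appears later in this thread; the hook bodies and the test ID format are assumptions, not the PR's code.

```python
"""Hypothetical sketch of a gas-benchmark plugin; not the PR's actual code."""


def pytest_addoption(parser):
    """Case 1: register the flag so it is available on the command line."""
    parser.addoption(
        "--gas-benchmark-values",
        action="store",
        dest="gas_benchmark_values",
        default=None,
        help="Comma-separated gas amounts, in millions of gas.",
    )


def pytest_generate_tests(metafunc):
    """Cases 2 and 3: parametrize only tests that request the fixture."""
    if "gas_benchmark_value" not in metafunc.fixturenames:
        return  # non-benchmark tests are left untouched
    raw = metafunc.config.getoption("gas_benchmark_values")
    if raw is None:
        return  # flag not supplied; a real plugin would need a fallback here
    millions = [int(item) for item in raw.split(",")]
    metafunc.parametrize(
        "gas_benchmark_value",
        [m * 1_000_000 for m in millions],
        ids=[f"{m}M" for m in millions],  # assumed ID format
    )
```

Each benchmark test that requests `gas_benchmark_value` is then generated once per supplied gas amount, while tests that do not request it are collected unchanged.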

🔗 Related Issues or PRs

Issue #1891

✅ Checklist

  • All: Ran fast tox checks to avoid unnecessary CI fails, see also Code Standards and Enabling Pre-commit Checks:
    uvx --with=tox-uv tox -e lint,typecheck,spellcheck,markdownlint
  • All: PR title adheres to the repo standard - it will be used as the squash commit message and should start type(scope):.
  • All: Considered adding an entry to CHANGELOG.md.
  • All: Considered updating the online docs in the ./docs/ directory.
  • All: Set appropriate labels for the changes (only maintainers can apply labels).
  • Tests: Ran mkdocs serve locally and verified the auto-generated docs for new tests in the Test Case Reference are correctly formatted.
  • Tests: For PRs implementing a missed test case, update the post-mortem document to add an entry to the list.
  • Ported Tests: All converted JSON/YML tests from ethereum/tests or tests/static have been assigned @ported_from marker.

@LouisTsai-Csie LouisTsai-Csie self-assigned this Jul 11, 2025
@LouisTsai-Csie LouisTsai-Csie force-pushed the fill-benchmark-command branch from 8675c6c to d0413c8 Compare July 11, 2025 15:22
Member

@marioevz marioevz left a comment

Looks great! I think this is a great change for maintainability, and it makes it easier for us to generate more vectors as they are required.

A small downside is that we have to remove Environment().gas_limit from most of the tests, but I would say the earlier we do it, the better.

cc @jsign for some feedback on my comments.

Thanks!

@jsign
Collaborator

jsign commented Jul 11, 2025

Nice! @LouisTsai-Csie @marioevz, is this compatible with supporting all the test formats too? (i.e. #1778).

Mostly asking since I think this is coming from the fact of simplifying the single genesis for perfnets, but wondering if it should still be fine for the other formats that we need for zkVMs.

@marioevz
Member

> Nice! @LouisTsai-Csie @marioevz, is this compatible with supporting all the test formats too? (i.e. #1778).
>
> Mostly asking since I think this is coming from the fact of simplifying the single genesis for perfnets, but wondering if it should still be fine for the other formats that we need for zkVMs.

Should be compatible out of the box, but I'll give that another look and raise it if there are any concerns.

@LouisTsai-Csie LouisTsai-Csie marked this pull request as ready for review July 14, 2025 02:13
Member

@danceratopz danceratopz left a comment

Thanks, this looks great @LouisTsai-Csie!

It's a shame this didn't occur to me up front in #1891, but I'd suggest that we move this code to a new plugin that gets activated with fill by default. This should work well due to the composability of pytest plugins.

This means, we:

  1. Add these changes (and other benchmarking-related pytest config, if any) to a separate pytest plugin; I'd suggest src/pytest_plugins/filler/benchmarking.py.
  2. Enable this plugin using -p via the fill command's pytest ini:
    addopts =
    -p pytest_plugins.concurrency
    -p pytest_plugins.filler.pre_alloc
    -p pytest_plugins.filler.filler
    -p pytest_plugins.filler.ported_tests
    -p pytest_plugins.filler.static_filler
    -p pytest_plugins.shared.execute_fill
    -p pytest_plugins.forks.forks
    -p pytest_plugins.eels_resolver
    -p pytest_plugins.help.help
    --tb short
    --ignore tests/cancun/eip4844_blobs/point_evaluation_vectors/

All benchmarking-related plugin customizations (e.g. pytest_addoption, pytest_generate_tests, etc.) currently in filler/filler.py can be moved directly to filler/benchmarking.py. This keeps the benchmarking logic self-contained. Pytest hooks from both modules should compose as expected.

To cleanly handle options/values that are specific to benchmarking, I'd suggest the following approach. If you agree with it or like it, feel free to go for it!

1. Define a filling mode enum in filler/filler.py:

from enum import StrEnum, unique

@unique
class FillMode(StrEnum):
    CONSENSUS = "consensus"
    BENCHMARKING = "benchmarking"

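2. In the filler plugin (filler/filler.py), set the default:

from _pytest.config import Config

# FillMode is defined in this same module (step 1), so no import is needed.

def pytest_configure(config: Config) -> None:
    # Only set the default if another plugin has not already chosen a mode.
    if not hasattr(config, "fill_mode"):
        config.fill_mode = FillMode.CONSENSUS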

3. In the benchmarking plugin (filler/benchmarking.py), override only if --gas-benchmark-values is set:

from _pytest.config import Config
from .filler import FillMode

def pytest_configure(config: Config) -> None:
    # Switch to benchmarking mode only when the flag was supplied.
    if config.getoption("--gas-benchmark-values") is not None:
        config.fill_mode = FillMode.BENCHMARKING

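4. Example usage in filler logic, wrapped in a fixture:

import pytest
from ethereum_test_tools import Environment  # assumed import path
from .filler import FillMode

GIGA_GAS = 1_000_000_000

@pytest.fixture
def env(request: pytest.FixtureRequest) -> Environment:  # noqa: D103
    # Benchmark fills use a very high block gas limit; other fills keep the default.
    if request.config.fill_mode == FillMode.BENCHMARKING:
        return Environment(gas_limit=GIGA_GAS)
    return Environment()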
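To see why the `hasattr` guard matters, the following standalone sketch replays the two `pytest_configure` hooks against a fake config object in both possible orders. The function names, the `gas_benchmark_values` attribute, and the use of `SimpleNamespace` in place of a real pytest `Config` are illustrative assumptions; `str, Enum` stands in for `StrEnum` so the sketch also runs on Python versions older than 3.11.

```python
from enum import Enum, unique
from types import SimpleNamespace


@unique
class FillMode(str, Enum):  # stand-in for StrEnum (Python 3.11+)
    CONSENSUS = "consensus"
    BENCHMARKING = "benchmarking"


def filler_configure(config) -> None:
    """Mimics the filler plugin: set the default only if no mode is set yet."""
    if not hasattr(config, "fill_mode"):
        config.fill_mode = FillMode.CONSENSUS


def benchmarking_configure(config) -> None:
    """Mimics the benchmarking plugin: override when the flag was supplied."""
    if config.gas_benchmark_values is not None:
        config.fill_mode = FillMode.BENCHMARKING


def resolve_fill_mode(gas_benchmark_values, hooks) -> FillMode:
    """Run the given configure hooks in order against a fake config object."""
    config = SimpleNamespace(gas_benchmark_values=gas_benchmark_values)
    for hook in hooks:
        hook(config)
    return config.fill_mode


if __name__ == "__main__":
    for hooks in (
        (filler_configure, benchmarking_configure),
        (benchmarking_configure, filler_configure),
    ):
        print(resolve_fill_mode("1,10", hooks), resolve_fill_mode(None, hooks))
```

Because the default is applied only when `fill_mode` is still unset, the resulting mode does not depend on which plugin's hook runs first.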

@LouisTsai-Csie LouisTsai-Csie force-pushed the fill-benchmark-command branch from d0413c8 to 946de75 Compare July 15, 2025 09:09
Member

@danceratopz danceratopz left a comment

Thanks! This looks great to me!

One comment below.

@LouisTsai-Csie
Collaborator Author

@danceratopz Thank you for the review. I have two questions:

  • Should I add test cases under src/cli/tests/? I noticed your recent PR included tests there.
  • Should I add documentation for the new flag? I’m happy to do that, but I ran into some issues building the docs locally with mkdocs (related to cairosvg and missing libcairo on macOS). Let me know if there’s a preferred workaround or if I should just update the markdown and let CI verify the build (not a good idea).

@danceratopz
Member

> @danceratopz Thank you for the review. I have two questions:
>
> * Should I add test cases under `src/cli/tests/`? I noticed your recent [PR](https://github.com/ethereum/execution-spec-tests/pull/1855/files#diff-5c3633f8cbee135e20eb35f9537277edaf7ff69714db9f5c0993431a312ca5f5) included tests there.

I don't think it's strictly necessary for the PR, but some sanity check that the flag works is nice, of course. Recently, I've been pointing Claude at unit testing tasks.

> * Should I add documentation for the new flag? I’m happy to do that, but I ran into some [issues](https://github.com/ethereum/execution-spec-tests/issues/1908) building the docs locally with `mkdocs` (related to cairosvg and missing `libcairo` on `macOS`). Let me know if there’s a preferred workaround or if I should just update the markdown and let CI verify the build (not a good idea).

Does this work?

uvx --with=tox-uv tox -e mkdocs

If so, it's because of the macOS trick found in these lines (you can then set the env var locally):

# Required for `cairosvg` so tox can find `libcairo-2`.
# https://squidfunk.github.io/mkdocs-material/plugins/requirements/image-processing/?h=cairo#cairo-library-was-not-found
DYLD_FALLBACK_LIBRARY_PATH = /opt/homebrew/lib

Member

@marioevz marioevz left a comment

Looks great! I applied my suggestions locally and execute is working with the new flag! 🎉

@LouisTsai-Csie LouisTsai-Csie force-pushed the fill-benchmark-command branch from fa71d8d to 7e5f501 Compare July 18, 2025 15:21
Member

@marioevz marioevz left a comment

Awesome work! I would like to see if we could rebase and use gas_benchmark_value in all benchmark tests, so that we can prepare for the next benchmark release if possible.

@LouisTsai-Csie LouisTsai-Csie force-pushed the fill-benchmark-command branch from 7e5f501 to ca18d04 Compare July 22, 2025 08:53
Member

@marioevz marioevz left a comment

LGTM, thanks!

@marioevz marioevz merged commit 323b2b3 into ethereum:main Jul 22, 2025
14 checks passed
kclowes pushed a commit to kclowes/execution-spec-tests that referenced this pull request Oct 20, 2025
…nesis file (ethereum#1895)

* feat(fill): add benchmark gas value command to support single genesis file

* refactor(tests): update benchmark test for supported command

* refactor(benchmark): consolidate benchmark configurations into a single entry

* doc(fill): update command description and changelog

* chore(fill): remove legacy gas benchmark values command

* refactor(fill): create gas benchmark value pytest plugin

* test(fill): add pytest plugin test and update state test

* refactor(fill): add env fixture for benchmarking with gas limit configuration

* refactor: support both fill and execute mode

* fix: update ci flag and test command