Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(consume): allow geth to validate eof test vectors via direct (#1232) #1246

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

felix314159
Copy link
Collaborator

🗒️ Description

Geth supports eofparse and can be used to validate a given eof test (code valid yes/no without running the code). The command I used for fill was uv run fill -vv -m "not slow" -m eof_test --fork=osaka --evm-bin=$HOME/Documents/evmone/build/bin/evmone-t8n and the consume command I tested the code with was uv run consume direct --input=./fixtures/eof_tests --bin=$HOME/Documents/go-ethereum/build/bin/evm -s (I added -s to see the output of some print statements I put in test_via_direct.py to see more detailed info about what is going on). It shows all 3895 tests passed in less than a minute, lmk what can be improved. Next I would try to do the same for consume direct via evmone.

🔗 Related Issues

✅ Checklist

  • All: Set appropriate labels for the changes.
  • All: Considered squashing commits to improve commit history.
  • All: Added an entry to CHANGELOG.md.
  • All: Considered updating the online docs in the ./docs/ directory.
  • Tests: All converted JSON/YML tests from ethereum/tests have been added to converted-ethereum-tests.txt.
  • Tests: A PR with removal of converted JSON/YML tests from ethereum/tests have been opened.
  • Tests: Included the type and version of evm t8n tool used to locally execute test cases: e.g., ref with commit hash or geth 1.13.1-stable-3f40e65.
  • Tests: Ran mkdocs serve locally and verified the auto-generated docs for new tests in the Test Case Reference are correctly formatted.

@felix314159 felix314159 changed the title feat(consume): allow geth to validate eof test vectors via direct feat(consume): allow geth to validate eof test vectors via direct (#1232) Feb 24, 2025
Copy link
Member

@danceratopz danceratopz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding this @felix314159!

This needs to be tested against some test fixtures that are expected to fail, as

INFO [02-25|10:26:56.473] Executed tests                           passed=3 "total executed"=4

still leads to all fixtures in the file passing the test.

Two (obvious) comments:

  1. We're forced into hacky methods as the eofparse tool doesn't provide any structured output. Perhaps we should add this to geth's eofparse instead of writing brittle code in EEST. I looked at Besu's tool yesterday and they do return structured output. We could adopt this:
{
    "fileName": "/home/dtopz/.cache/ethereum-execution-spec-tests/cached_downloads/v4.0.0/fixtures_eip7692/fixtures/eof_tests/osaka/eip7692_eof_v1/eip3540_eof_v1/eof_example/eof_example_custom_fields.json",
    "group": "tests/osaka/eip7692_eof_v1/eip3540_eof_v1/test_eof_example.py::test_eof_example_custom_fields[fork_Osaka-eof_test]",
    "name": "0",
    "fork": "Osaka",
    "pass": true,
    "expectedError": null,
    "actualError": null
}
  1. Regardless of whether we improve eofparse or not, better testing of consume direct is a must. We can use pytester-level tests (which can test the behavior of entire pytest test sessions) to catch fails like the one above. I can start adding some in a separate PR.

Let's discuss 1. in our meeting today.

):
"""Geth's implementation of the fixture consumer."""

# ------------------------------------ state test ---------------------------------------------
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to remove these separators. Perhaps we need to organize the code into more logical groups if this class is getting out of hand.

Suggested change
# ------------------------------------ state test ---------------------------------------------

):
"""Geth's implementation of the fixture consumer."""

# ------------------------------------ state test ---------------------------------------------
@cache # noqa: B019
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding the code (B019) here. Did you see the pop-up (from the ruff extension)? Very helpful:
image

As we only have one instance of GethFixtureConsumer per consume run, I think we can leave it as-is. Open to other thoughts though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the memory leaks should not be a problem for consume since we have a single instance per run as you mention, and also consume is not a long-running process anyway, so it should be ok for the sake of code simplicity.

@cache # noqa
def consume_state_test_file(
# ------------------------------------ eof test -----------------------------------------------
def extract_int_from_subprocess_string(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about?

Suggested change
def extract_int_from_subprocess_string(
def extract_test_count_from_eofparse_summary(

def extract_int_from_subprocess_string(
self, s: str, substring_of_interest_prefix: str
) -> int | None:
"""Hacky method to extract relevant substring from string returned by subprocess execution.""" # noqa: E501
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typically one-liners should be concise and fit within our ruff-line-length (99). If clarification is needed, then it should be added as a separate block below. I think this is easy to fix in this case :)

Suggested change
"""Hacky method to extract relevant substring from string returned by subprocess execution.""" # noqa: E501
"""Hacky method to extract test counts from `eofparse` console output."""

) -> int | None:
"""Hacky method to extract relevant substring from string returned by subprocess execution.""" # noqa: E501
# get everything after this substring occurrence, then remove everything not in [0,9]
relevant_substring = re.sub(r"\D", "", s.split('total executed"=', 1)[1])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to raise an exception if the Executed tests line isn't found in the eofparse output.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this function could be made more explicit by re-writing it to match 2 regex groups, one for the "passed" and one for the "total" count and return them both in a single call.

all_tests_passed: bool = False
if total_tests_amount is not None and passed_tests_amount is not None:
if total_tests_amount == passed_tests_amount:
all_tests_passed = True
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this will result in fixture file level pass/fail resolution and not give a detailed per test case (test fixture) resolution?

Copy link
Member

@marioevz marioevz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple of comments, but looking good overall 👍

):
"""Geth's implementation of the fixture consumer."""

# ------------------------------------ state test ---------------------------------------------
@cache # noqa: B019
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the memory leaks should not be a problem for consume since we have a single instance per run as you mention, and also consume is not a long-running process anyway, so it should be ok for the sake of code simplicity.

) -> int | None:
"""Hacky method to extract relevant substring from string returned by subprocess execution.""" # noqa: E501
# get everything after this substring occurrence, then remove everything not in [0,9]
relevant_substring = re.sub(r"\D", "", s.split('total executed"=', 1)[1])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should be:

Suggested change
relevant_substring = re.sub(r"\D", "", s.split('total executed"=', 1)[1])
relevant_substring = re.sub(r"\D", "", s.split(substring_of_interest_prefix, 1)[1])

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is failing after the change for me:
For s equal to 'INFO [03-17|17:33:22.741] Executed tests passed=4 "total executed"=4', relevant_substring becomes 44 when substring_of_interest_prefix=='tests passed='.

I think this function can be replaced with:

m = re.search(r'tests passed=(\d+) "total executed"=(\d+)', s)

and then tests passed is int(m.group(1)) and tests executed is int(m.group(2)).

Comment on lines +372 to +374
def extract_int_from_subprocess_string(
self, s: str, substring_of_interest_prefix: str
) -> int | None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def extract_int_from_subprocess_string(
self, s: str, substring_of_interest_prefix: str
) -> int | None:
@staticmethod
def extract_int_from_subprocess_string(
s: str, substring_of_interest_prefix: str
) -> int | None:

This should be static since it doesn't access any instance properties, then you can call it with GethFixtureConsumer.extract_int_from_subprocess_string.

Comment on lines +383 to +385
def eofparse_subprocess_output_to_dict(
self, process_result: str, fixture_name: str | None, fork_name: str
) -> Dict:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def eofparse_subprocess_output_to_dict(
self, process_result: str, fixture_name: str | None, fork_name: str
) -> Dict:
@staticmethod
def eofparse_subprocess_output_to_dict(
process_result: str, fixture_name: str | None, fork_name: str
) -> Dict:

Also can be an static method, because self.extract_int_from_subprocess_string can become GethFixtureConsumer.extract_int_from_subprocess_string if that method is also static, and then to call this one we would do GethFixtureConsumer.eofparse_subprocess_output_to_dict.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
scope:consume Scope: Consume command suite type:feat type: Feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants