feat(specs,tests): Update all benchmark tests to use `gas_benchmark_value` #1935
OK, pushed it for today :)
Updated the branch with the following changes:
No need to restart clients anymore to perform every single benchmark we have 😉
jsign
left a comment
Very nice change! Left some opinions for your consideration.
Lateral question: is there a way to compare the generated fixtures of this branch against master, to confirm that the only differences are in the tests that were expected to change because of bug fixes or improvements? (E.g., not sure if there's some "hash" produced for each fixture output that could be compared to double-check.)
I'm a paranoid person, so for such a big refactoring I'm always fearing something falling through the cracks.
If it sounds like a rabbit hole, dismiss.
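One way to approach the hash comparison asked about above is sketched here. It assumes each fixture set has already been loaded into a plain dict mapping fixture name to fixture content; the helper names `fixture_digest` and `changed_fixtures` are hypothetical and not part of the repository, and the real fixture files may carry their own hashes.

```python
import hashlib
import json


def fixture_digest(fixture: dict) -> str:
    """Hash a fixture via canonical JSON so that key order does not matter."""
    canonical = json.dumps(fixture, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def changed_fixtures(before: dict, after: dict) -> dict:
    """Map each differing fixture name to 'added', 'removed', or 'changed'."""
    report = {}
    for name in sorted(before.keys() | after.keys()):
        if name not in after:
            report[name] = "removed"
        elif name not in before:
            report[name] = "added"
        elif fixture_digest(before[name]) != fixture_digest(after[name]):
            report[name] = "changed"
    return report
```

Fixtures that were renamed would show up as one "removed" and one "added" entry, so a rename map would be needed before the report is meaningful for this PR.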
I think I've got your solution with this new evmone feature: ipsilon/evmone#1274. I could give it a try to (1) test it out for pawel and (2) verify that the tests produce similar opcode counts before and after this PR using the same gas limits.
Makes sense, since most "compute" tests might not have a visible effect on block header fields such as the state root (gas used is a red herring), so the opcode count sounds like a better way to achieve that. For the "stateful" ones, checking the block header might be enough.
Generated the fixtures from master and from this PR with the opcode counts produced by ipsilon/evmone#1274: benchmark-fixtures-with-opcode-count.tar.gz
Each fixture in every fixture file now contains a …
I have not made the comparison because the name changes make it tricky, but will do it next week.
I gpt'd a script to get the differences (which could definitely be included in the PR to support opcode counts!) and got the results: https://gist.github.com/marioevz/b745761e0882549c59ab9195b6865386
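For readers who want the shape of such a comparison without opening the gist, here is a minimal sketch. It is not the script from the gist. It assumes the opcode counts of one fixture have already been extracted into a dict mapping opcode name to count; the function name `diff_opcode_counts` is hypothetical, and how the counts are stored inside the fixture files is not shown here.

```python
def diff_opcode_counts(before: dict, after: dict) -> dict:
    """Return {opcode: (count_before, count_after)} for opcodes whose counts differ.

    Opcodes missing from one side are treated as having a count of 0.
    """
    diffs = {}
    for opcode in sorted(before.keys() | after.keys()):
        old, new = before.get(opcode, 0), after.get(opcode, 0)
        if old != new:
            diffs[opcode] = (old, new)
    return diffs
```

An empty result for a fixture means the opcode mix was unchanged between master and the PR branch at the same gas limit.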
Looks awesome!
Merging since I believe @LouisTsai-Csie's comments were addressed 👍
…alue` (ethereum#1935)
* feat(specs): Check gas used by benchmark tests matches expectation
* feat(tests): Update all benchmark tests to use `gas_benchmark_value`
* fix: tox
* fix(plugins/benchmarking): Increase GIGA_GAS
* fix(tests): Environment usages
* docs: Changelog
* fix(tox): tests-deployed-benchmark command line
* fix: Remove unused parameter
* fix: `test_worst_storage_access_cold` and add more cases
* fix: `test_worst_bytecode_single_opcode` incorrect env overwrite
* fix(plugins): Bump `BENCHMARKING_MAX_GAS` so all benchmarking tests can run from a single genesis
* feat(plugins): Pass `env` fixture to the test spec builder by default if not specified
* fix(benchmark): Remove `env` from tests where it's unused
* fix(benchmark): tests referencing the `fee_recipient` from the environment
* fix(plugins): Set benchmark environment based on test mark
* fix: minor fix
* fix(plugins): Defaults for `env` and `genesis_environment`
🗒️ Description
Add check for gas used in blockchain and state tests
`blockchain_test` and `state_test` now have a flag called `expected_benchmark_gas_used` which can be used to specify the expected gas used by the last payload (the benchmark payload).
If the value is not specified, the test is expected to use the value specified by `--gas-benchmark-values` entirely.
Modified all benchmark tests to use `gas_benchmark_value`
All benchmark tests now use `gas_benchmark_value`, and also `expected_benchmark_gas_used` if the test is not expected to use the entire gas (e.g. when the tx is not expected to run out-of-gas and instead consume a specific amount of gas).
🔗 Related Issues or PRs
N/A.
✅ Checklist
`tox` checks to avoid unnecessary CI fails, see also Code Standards and Enabling Pre-commit Checks: `uvx --with=tox-uv tox -e lint,typecheck,spellcheck,markdownlint`
`type(scope):`.
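The gas check described in the Description can be pictured with the sketch below. This is an illustration under stated assumptions, not the code added by the PR: the function names are hypothetical, and treating the check as an exact equality is an assumption based on the wording that the test is "expected to use the value ... entirely".

```python
from typing import Optional


def resolve_expected_gas(
    gas_benchmark_value: int,
    expected_benchmark_gas_used: Optional[int] = None,
) -> int:
    """Fall back to the full benchmark gas value when no expectation is given."""
    if expected_benchmark_gas_used is None:
        return gas_benchmark_value
    return expected_benchmark_gas_used


def check_benchmark_gas_used(
    gas_used: int,
    gas_benchmark_value: int,
    expected_benchmark_gas_used: Optional[int] = None,
) -> None:
    """Raise if the benchmark payload's gas used differs from the expectation.

    Assumption: the comparison is exact; the real check may differ.
    """
    expected = resolve_expected_gas(gas_benchmark_value, expected_benchmark_gas_used)
    if gas_used != expected:
        raise ValueError(
            f"benchmark payload used {gas_used} gas, expected {expected}"
        )
```

Under this sketch, a test that burns the whole budget passes no expectation, while a test whose transaction finishes early passes the specific amount it should consume.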