-
Notifications
You must be signed in to change notification settings - Fork 1.1k
Add to epoch processing tests generation states before and after full epoch processing #4155
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: dev
Are you sure you want to change the base?
Add to epoch processing tests generation states before and after full epoch processing #4155
Conversation
… epoch processing
Hey @leolara, thanks for the PR! This is an interesting one. You're right, this is a shortcoming of that test format. We need to be very careful when making changes to test formats though. I'm glad to see it's backwards compatible, that's how it should be. It might take me a few days to properly review this. |
@jtraglia no rush :-) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this PR!
Overall, I'm in a favor of having those changes. But before we merge it, would it be possible to have an example of a test (as stated on #4025) that would be caught thanks to this refactoring?
Additionally, I think I'd like to have @mkalinin's input to make sure those changes address the point he was making initially. I'd also like to have a client's team input to know how difficult it would be for clients to handle the extra outputs. I've sent a message in Teku's discord regarding this.
I took a quick look at our implementation in Teku and I don't think it would be too hard to adapt our tests to consume this updated format, maintaining it backward compatible. I am not sure if you are planning to add a config for the test as part of its outputs so we can check the "type" of the test, and properly consume both pre and post epoch states. Otherwise we'd need to rely on the absence of |
Co-authored-by: Gabriel Astieres <[email protected]>
Co-authored-by: Gabriel Astieres <[email protected]>
Re what @GabrielAstieres says about a specific test, perhaps @mkalinin that made the issue have specific examples. For me is clear how this change will catch the bugs that the original unit test finds while will also catch bugs related to the data dependencies between sub-transitions of process_epoch, that currently were not being covered. But I have not specific examples as I have not been working on the consensus clients recently. |
Re what @lucassaldanha says about how to consuming the vectors. My idea is that the current consumer will keep working, and it can be substituted once this change is accepted by simpler code that simply loads pre_epoch, runs process_epoch and then asserts post_epoch. After this current change is accepted the four states should be present always. I don't think there is a way that the new ones are absent after they are produced with this new version of the generation. If you mean to make the consumer work also with older versions of the vectors, then yes you would have to check for the presence of the new states and fallback to the old version of the consumer if they are not present. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks @leolara!
@jtraglia should we run this with a consumer to test that it doesn't break it? What do you think it is the simplest client to run that run the consumer for epoch_processing? Or @lucassaldanha has run this already with Teku? Should I run it with teku? |
One of examples where this change would be useful is #4024. These tests use custom A thought, what if we add I think making all epoch processing tests yielding full epoch transition test vectors might not have the desired effect on catching bugs. The reason is that each sub-transition test populates the state with data relevant to the sub-transition it is covering. And the class of tests that covers cases on the edge of dependency between different sub-transitions requires relevant data in the state. And this is great if test format will support this type of tests which this PR addresses 👍 |
Afaik, all epoch processing tests, use
I think this is not necessary, for what I said earlier, but it could be a transition mechanism to a new format when there are only full transition tests. Moving all of them to the new sub-format. I can implement it this way if everyone thinks it is better. Then basically all the other sub-formats will become empty. |
I tested with Teku. But not with the whole test vector, just a few examples. However, it is mostly adding the extra ssz files to the test so I'd assume for any client running the tests, unless they actually consume the files it won't do anything. |
Do we have an estimate of the size impact of having those extra 2x states per test? Just wondering if this is something we should consider. Also I'll try to run some benchmarks to see if tests are taking too much longer for a full run. |
A rough estimate would be an additional 226 MiB in total; double the current size.
This is noticeable but it shouldn't be an issue. The current size of everything is 2.7 GiB.
|
Ok! I tested with a subset of the tests and the time increase was negligible. I think we won't even feel the impact of these extra state checks in a full run |
There are a few tests that use I am not against converting all tests into a new format. My point was not changing existing tests but rather adding a full epoch transition and use it in some already written tests that checks cases where an execution of one sub-transition may depend on the result of executing the others. But I agree that among already written tests there can be cases in which switching to the proposed format can increase the coverage. |
I see there are 7 tests that use
I can search for already existing tests, where the sub-transition data depends on a previous sub-transition. |
These are the tests that I am aware of: |
I have changed run_pending_deposit_applying to the new format, for the pending deposit epoch sub-transition tests. I have removed the test test_apply_pending_deposit_top_up__zero_balance that was the only that didn't pass. My idea was that a validator with zero balance is going to exit and it is not required how this method would behave with it in a situation where it is not exiting. If I am wrong and we need to test this let me know. |
Currently, I have all the tests changed to the new format except What do you suggest? Do I change the test completely so it does not have to do this manipulation between sub-transitions? Or I leave it as the only one with pre_epoch and post_epoch? |
What are you suggesting, changing these tests to be epoch_processing format? Could we do that in another PR? |
In the existing protocol design top-ups are happening regardless of the validator state, this is what is checked by this state as 0 balance is one of the edge case conditions. I think we should not drop this test
Why can’t this test yield pre and post epoch transition states too?
Doing it in a separate PR is fine |
from eth2spec.test.helpers.epoch_processing import run_epoch_processing_from, run_epoch_processing_to, \ | ||
run_process_slots_up_to_epoch_boundary |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from eth2spec.test.helpers.epoch_processing import run_epoch_processing_from, run_epoch_processing_to, \ | |
run_process_slots_up_to_epoch_boundary | |
from eth2spec.test.helpers.epoch_processing import ( | |
run_epoch_processing_from, | |
run_epoch_processing_to, | |
run_process_slots_up_to_epoch_boundary, | |
) |
""" | ||
Processes to the next epoch transition, up to, but not including, the sub-transition named ``process_name`` | ||
""" | ||
slot = state.slot + (spec.SLOTS_PER_EPOCH - state.slot % spec.SLOTS_PER_EPOCH) | ||
if not disable_slots_processing: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be better to put it this way:
if not disable_slots_processing: | |
if enable_slot_processing: |
|
||
## Condition (alternative) | ||
|
||
Instead of having a different handler for each sub-transition, a single handler for all cases should load `pre_full` state, call `process_epoch` and then assert that the result state should match `post_full` state. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why it should be instead and couldn't be in addition to the pre
and post
state checks?
According to #4025 epoch processing format test vectors cannot detect a class of bugs related to data dependencies between epoch processing sub-transitions.
Also, the current epoch processing format is incompatible with single pass optimised epoch processing like https://ethresear.ch/t/formally-verified-optimised-epoch-processing/17359
This PR, adds a backward compatible addition to the epoch processing format, that allows to run the full epoch processing transition for each of the test vectors of this format.
This has the advantages:
As a disadvantage this condition takes more resources to compute, but just a constant amount per test vector. And the previously test runner will still keep running as expected.