test: nuke int; move state transition tests to state transition module#166
test: nuke int; move state transition tests to state transition module#166wemeetagain merged 6 commits intomainfrom
Conversation
Summary of ChangesHello @spiral-ladder, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly refactors the project's test suite by reorganizing how state transition tests are structured and executed. The primary goal is to improve test clarity, modularity, and efficiency. By moving state transition tests into their dedicated module and renaming the integration test directory, the changes enable more focused testing and better reflect the nature of different test types within the codebase. Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request does a great job of reorganizing the tests to improve locality and modularity, which is a significant improvement for the project's structure. The addition of sanity tests for many of the state transition functions is also a positive step.
My main feedback is that most of the new sanity tests only check that the function doesn't error, without asserting that the state has been modified correctly. The project's style guide emphasizes a high density of assertions to ensure correctness. While this applies to production code, the principle is equally valuable for tests to make them more robust and effective at catching regressions. I've added several specific suggestions to add post-condition assertions to the new tests. Applying this pattern to the other new tests as well would greatly increase their value.
| test "process block header - sanity" { | ||
| const allocator = std.testing.allocator; | ||
|
|
||
| var test_state = try TestCachedBeaconStateAllForks.init(allocator, 256); | ||
| const slot = config.mainnet.chain_config.ELECTRA_FORK_EPOCH * preset.SLOTS_PER_EPOCH + 2025 * preset.SLOTS_PER_EPOCH - 1; | ||
| defer test_state.deinit(); | ||
|
|
||
| const proposers = test_state.cached_state.getEpochCache().proposers; | ||
|
|
||
| var message: types.electra.BeaconBlock.Type = types.electra.BeaconBlock.default_value; | ||
| const proposer_index = proposers[slot % preset.SLOTS_PER_EPOCH]; | ||
|
|
||
| var header_parent_root: [32]u8 = undefined; | ||
| try types.phase0.BeaconBlockHeader.hashTreeRoot(test_state.cached_state.state.latestBlockHeader(), &header_parent_root); | ||
|
|
||
| message.slot = slot; | ||
| message.proposer_index = proposer_index; | ||
| message.parent_root = header_parent_root; | ||
|
|
||
| const beacon_block = BeaconBlock{ .electra = &message }; | ||
|
|
||
| const block = Block{ .regular = beacon_block }; | ||
| try processBlockHeader(allocator, test_state.cached_state, block); | ||
| } |
There was a problem hiding this comment.
This sanity test is a good start, but it only verifies that processBlockHeader doesn't return an error. To make the test more robust, it should also assert that the function has the intended side effects on the state.
Specifically, processBlockHeader updates the latest_block_header in the state. You could add assertions to verify that the header has been updated correctly with the new block's data.
Additionally, the style guide (line 296) recommends adding a comment at the top of tests to explain their goal and methodology. Adding a brief description would improve clarity for future readers.
test "process block header - sanity" {
// Verifies that processBlockHeader correctly updates the state's latest_block_header
// with data from a valid block, without returning an error.
const allocator = std.testing.allocator;
var test_state = try TestCachedBeaconStateAllForks.init(allocator, 256);
const slot = config.mainnet.chain_config.ELECTRA_FORK_EPOCH * preset.SLOTS_PER_EPOCH + 2025 * preset.SLOTS_PER_EPOCH - 1;
defer test_state.deinit();
const proposers = test_state.cached_state.getEpochCache().proposers;
var message: types.electra.BeaconBlock.Type = types.electra.BeaconBlock.default_value;
const proposer_index = proposers[slot % preset.SLOTS_PER_EPOCH];
var header_parent_root: [32]u8 = undefined;
try types.phase0.BeaconBlockHeader.hashTreeRoot(test_state.cached_state.state.latestBlockHeader(), &header_parent_root);
message.slot = slot;
message.proposer_index = proposer_index;
message.parent_root = header_parent_root;
const beacon_block = BeaconBlock{ .electra = &message };
const block = Block{ .regular = beacon_block };
try processBlockHeader(allocator, test_state.cached_state, block);
// Assert that the latest block header has been updated.
const new_header = test_state.cached_state.state.latestBlockHeader();
try std.testing.expectEqual(slot, new_header.slot);
try std.testing.expectEqual(proposer_index, new_header.proposer_index);
try std.testing.expectEqualSlices(u8, &header_parent_root, &new_header.parent_root);
}
References
- The style guide recommends adding a description at the top of a test to explain its goal and methodology. This helps readers understand the test's purpose without having to read the implementation details. (link)
- The style guide emphasizes using assertions to detect programmer errors and ensure correctness. While this test checks for errors, it doesn't assert that the function's post-conditions are met, which is a key part of verifying its behavior. (link)
| test "process eth1 data - sanity" { | ||
| const allocator = std.testing.allocator; | ||
|
|
||
| var test_state = try TestCachedBeaconStateAllForks.init(allocator, 256); | ||
| defer test_state.deinit(); | ||
|
|
||
| const block = types.electra.BeaconBlock.default_value; | ||
| try processEth1Data(allocator, test_state.cached_state, &block.body.eth1_data); | ||
| } |
There was a problem hiding this comment.
Similar to other new sanity tests, this one could be strengthened by asserting post-conditions. The processEth1Data function appends to eth1_data_votes. The test should verify that a vote has been added. This ensures the function is behaving as expected, not just that it runs without crashing.
test "process eth1 data - sanity" {
const allocator = std.testing.allocator;
var test_state = try TestCachedBeaconStateAllForks.init(allocator, 256);
defer test_state.deinit();
const block = types.electra.BeaconBlock.default_value;
const votes_before = test_state.cached_state.state.eth1DataVotes().items.len;
try processEth1Data(allocator, test_state.cached_state, &block.body.eth1_data);
const votes_after = test_state.cached_state.state.eth1DataVotes().items.len;
try std.testing.expect(votes_after == votes_before + 1);
}
References
- The style guide emphasizes using assertions to detect programmer errors and ensure correctness. While this test checks for errors, it doesn't assert that the function's post-conditions are met, which is a key part of verifying its behavior. (link)
| test "process execution payload - sanity" { | ||
| const allocator = std.testing.allocator; | ||
|
|
||
| var test_state = try TestCachedBeaconStateAllForks.init(allocator, 256); | ||
| defer test_state.deinit(); | ||
|
|
||
| var execution_payload: types.electra.ExecutionPayload.Type = types.electra.ExecutionPayload.default_value; | ||
| execution_payload.timestamp = test_state.cached_state.state.genesisTime() + test_state.cached_state.state.slot() * config.mainnet.chain_config.SECONDS_PER_SLOT; | ||
| var body: types.electra.BeaconBlockBody.Type = types.electra.BeaconBlockBody.default_value; | ||
| body.execution_payload = execution_payload; | ||
|
|
||
| var message: types.electra.BeaconBlock.Type = types.electra.BeaconBlock.default_value; | ||
| message.body = body; | ||
|
|
||
| const beacon_block = BeaconBlock{ .electra = &message }; | ||
| const block = Block{ .regular = beacon_block }; | ||
|
|
||
| try processExecutionPayload( | ||
| allocator, | ||
| test_state.cached_state, | ||
| block.beaconBlockBody(), | ||
| .{ .execution_payload_status = .valid, .data_availability_status = .available }, | ||
| ); | ||
| } |
There was a problem hiding this comment.
This test correctly sets up a valid timestamp to pass the check within processExecutionPayload. To make it more comprehensive, you should add an assertion to verify that the state was updated correctly. processExecutionPayload calls state.setLatestExecutionPayloadHeader. A good assertion would be to check that the latest_execution_payload_header in the state has been updated and is no longer the default value.
test "process execution payload - sanity" {
const allocator = std.testing.allocator;
var test_state = try TestCachedBeaconStateAllForks.init(allocator, 256);
defer test_state.deinit();
var execution_payload: types.electra.ExecutionPayload.Type = types.electra.ExecutionPayload.default_value;
execution_payload.timestamp = test_state.cached_state.state.genesisTime() + test_state.cached_state.state.slot() * config.mainnet.chain_config.SECONDS_PER_SLOT;
// Use a non-default block hash to make the assertion meaningful.
execution_payload.block_hash = [_]u8{1} ** 32;
var body: types.electra.BeaconBlockBody.Type = types.electra.BeaconBlockBody.default_value;
body.execution_payload = execution_payload;
var message: types.electra.BeaconBlock.Type = types.electra.BeaconBlock.default_value;
message.body = body;
const beacon_block = BeaconBlock{ .electra = &message };
const block = Block{ .regular = beacon_block };
try processExecutionPayload(
allocator,
test_state.cached_state,
block.beaconBlockBody(),
.{ .execution_payload_status = .valid, .data_availability_status = .available },
);
const new_header = test_state.cached_state.state.latestExecutionPayloadHeader();
try std.testing.expectEqualSlices(u8, &execution_payload.block_hash, &new_header.getBlockHash());
}
References
- The style guide emphasizes using assertions to detect programmer errors and ensure correctness. While this test checks for errors, it doesn't assert that the function's post-conditions are met, which is a key part of verifying its behavior. (link)
| test "process randao - sanity" { | ||
| const allocator = std.testing.allocator; | ||
|
|
||
| var test_state = try TestCachedBeaconStateAllForks.init(allocator, 256); | ||
| const slot = config.mainnet.chain_config.ELECTRA_FORK_EPOCH * preset.SLOTS_PER_EPOCH + 2025 * preset.SLOTS_PER_EPOCH - 1; | ||
| defer test_state.deinit(); | ||
|
|
||
| const proposers = test_state.cached_state.getEpochCache().proposers; | ||
|
|
||
| var message: types.electra.BeaconBlock.Type = types.electra.BeaconBlock.default_value; | ||
| const proposer_index = proposers[slot % preset.SLOTS_PER_EPOCH]; | ||
|
|
||
| var header_parent_root: [32]u8 = undefined; | ||
| try types.phase0.BeaconBlockHeader.hashTreeRoot(test_state.cached_state.state.latestBlockHeader(), &header_parent_root); | ||
|
|
||
| message.slot = slot; | ||
| message.proposer_index = proposer_index; | ||
| message.parent_root = header_parent_root; | ||
|
|
||
| const beacon_block = BeaconBlock{ .electra = &message }; | ||
| const block = Block{ .regular = beacon_block }; | ||
| try processRandao(test_state.cached_state, block.beaconBlockBody(), block.proposerIndex(), false); | ||
| } |
There was a problem hiding this comment.
This sanity test is good for ensuring the function runs without errors on a valid state. To improve it, you could add an assertion to check that the randao_mixes in the state is actually updated by processRandao. You can store the old mix, call the function, and then assert that the new mix is different.
test "process randao - sanity" {
const allocator = std.testing.allocator;
var test_state = try TestCachedBeaconStateAllForks.init(allocator, 256);
const slot = config.mainnet.chain_config.ELECTRA_FORK_EPOCH * preset.SLOTS_PER_EPOCH + 2025 * preset.SLOTS_PER_EPOCH - 1;
defer test_state.deinit();
const epoch = test_state.cached_state.getEpochCache().epoch;
const old_mix = test_state.cached_state.state.randaoMixes()[epoch % preset.EPOCHS_PER_HISTORICAL_VECTOR];
const proposers = test_state.cached_state.getEpochCache().proposers;
var message: types.electra.BeaconBlock.Type = types.electra.BeaconBlock.default_value;
const proposer_index = proposers[slot % preset.SLOTS_PER_EPOCH];
var header_parent_root: [32]u8 = undefined;
try types.phase0.BeaconBlockHeader.hashTreeRoot(test_state.cached_state.state.latestBlockHeader(), &header_parent_root);
message.slot = slot;
message.proposer_index = proposer_index;
message.parent_root = header_parent_root;
const beacon_block = BeaconBlock{ .electra = &message };
const block = Block{ .regular = beacon_block };
try processRandao(test_state.cached_state, block.beaconBlockBody(), block.proposerIndex(), false);
const new_mix = test_state.cached_state.state.randaoMixes()[epoch % preset.EPOCHS_PER_HISTORICAL_VECTOR];
try std.testing.expect(!std.mem.eql(u8, &old_mix, &new_mix));
}
References
- The style guide emphasizes using assertions to detect programmer errors and ensure correctness. While this test checks for errors, it doesn't assert that the function's post-conditions are met, which is a key part of verifying its behavior. (link)
| test "processEffectiveBalanceUpdates - sanity" { | ||
| try @import("../test_utils/test_runner.zig").TestRunner( | ||
| processEffectiveBalanceUpdates, | ||
| .{ | ||
| .alloc = false, | ||
| .err_return = true, | ||
| .void_return = false, | ||
| }, | ||
| ).testProcessEpochFn(); | ||
| defer @import("../state_transition.zig").deinitStateTransition(); | ||
| } |
There was a problem hiding this comment.
Using a TestRunner for a basic sanity check is convenient, but for a function like processEffectiveBalanceUpdates, a more specific test case would be more valuable. The current test runs on a default state where no effective balances are likely to be updated.
Consider adding a dedicated test that sets up a validator with a balance that would trigger the hysteresis update, then call processEffectiveBalanceUpdates and assert that the validator's effective_balance has been updated as expected. This would provide a much stronger guarantee of correctness.
References
- The style guide emphasizes using assertions to detect programmer errors and ensure correctness. While this test checks for errors, it doesn't assert that the function's post-conditions are met, which is a key part of verifying its behavior. (link)
| const res = stateTransition(allocator, test_state.cached_state, signed_block, tc.transition_opt); | ||
| if (tc.expect_error) { | ||
| if (res) |_| { | ||
| try testing.expect(false); | ||
| } else |_| {} | ||
| } else { | ||
| if (res) |post_state| { | ||
| defer { | ||
| post_state.deinit(); | ||
| allocator.destroy(post_state); | ||
| } | ||
| } else |_| { | ||
| try testing.expect(false); | ||
| } | ||
| } |
There was a problem hiding this comment.
This is a great test case that covers multiple scenarios for the state transition function. To make the success case (expect_error = false) even stronger, you could add assertions on the post_state. For example, you could verify that the state has advanced to the new block's slot.
const res = stateTransition(allocator, test_state.cached_state, signed_block, tc.transition_opt);
if (tc.expect_error) {
if (res) |_| {
try testing.expect(false);
} else |_| {}
} else {
if (res) |post_state| {
defer {
post_state.deinit();
allocator.destroy(post_state);
}
try testing.expectEqual(signed_block.message().slot(), post_state.state.slot());
} else |_| {
try testing.expect(false);
}
}
References
- The style guide emphasizes using assertions to detect programmer errors and ensure correctness. While this test checks for errors, it doesn't assert that the function's post-conditions are met, which is a key part of verifying its behavior. (link)
|
Most of the Gemini reviews seem to be good suggestions for a more comprehensive test cleanup |
|
we may want to defer merging this until the TreeView PRs are in to make the lives of those working on that easier |
wemeetagain
left a comment
There was a problem hiding this comment.
yes lets please wait to merge until we have merged #168
In a similar vein with moving the `ssz` tests to the ssz module, we move the state transition tests to `state_transition` because they aren't actually integration tests, and with that we nuke `int` entirely. We rename `int_slow` to `int` - technically our era tests are the only 'integration' tests anyway! Why this re-organization: - `int` was a ambiguous test directory name to begin with - there were some tests spread between their own module and within the `int` directory, despite the tests actually not being integration in nature. This was clearer in the `ssz` tests refactored in #162. - we can now test each module independently. Previously, if we made changes to `ssz`, we're forced to do both `zig build test:ssz` and `zig build test:int` because our tests were scattered. It also unnecessarily runs the `state_transition` tests since they're both put in `int`. With this structure we can test only the module we are working on and made changes in. - test locality - it's obvious right away if a certain key procedure does not have tests, if the tests are located in proximity with the implementation.
b167eea
b2ff97d to
b167eea
Compare
7169378 to
c17ec5e
Compare
4f85260 to
4c5fbfd
Compare
4c5fbfd to
f180d8d
Compare
#166) In a similar vein with moving the `ssz` tests to the ssz module, we move the state transition tests to `state_transition` because they aren't actually integration tests, and with that we nuke `int` entirely. We rename `int_slow` to `int` - technically our era tests are the only 'integration' tests anyway! Why this re-organization: - `int` was a ambiguous test directory name to begin with - there were some tests spread between their own module and within the `int` directory, despite the tests actually not being integration in nature. This was more obvious to see in the `ssz` tests refactored in #162. - we can now test each module independently. Previously, if we made changes to `ssz`, we're forced to do both `zig build test:ssz` and `zig build test:int` because our tests were scattered. It also unnecessarily runs the `state_transition` tests since they're both put in `int`. With this structure we can test only the module we are working on and made changes in. - test locality - it's obvious right away if a certain key procedure does not have tests, if the tests are located in proximity with the implementation. This is a big PR in terms of files changed but really, it's all just file deletions and moving code around.
In a similar vein with moving the
ssztests to the ssz module, we move the state transition tests tostate_transitionbecause they aren't actually integration tests, and with that we nukeintentirely.We rename
int_slowtoint- technically our era tests are the only 'integration' tests anyway!Why this re-organization:
intwas a ambiguous test directory name to begin with - there were some tests spread between their own module and within theintdirectory, despite the tests actually not being integration in nature. This was more obvious to see in thessztests refactored in move ssz int tests to ssz module #162.we can now test each module independently. Previously, if we made changes to
ssz, we're forced to do bothzig build test:sszandzig build test:intbecause our tests were scattered. It also unnecessarily runs thestate_transitiontests since they're both put inint. With this structure we can test only the module we are working on and made changes in.test locality - it's obvious right away if a certain key procedure does not have tests, if the tests are located in proximity with the implementation.
This is a big PR in terms of files changed but really, it's all just file deletions and moving code around.