Skip to content

fix root updates bug in hash_builder#25

Open
flame4 wants to merge 18 commits into
alloy-rs:mainfrom
soonlabs:main
Open

fix root updates bug in hash_builder#25
flame4 wants to merge 18 commits into
alloy-rs:mainfrom
soonlabs:main

Conversation

@flame4
Copy link
Copy Markdown

@flame4 flame4 commented Aug 5, 2024

Hi, Thanks for implementing such an amazing crate to help merkle particia tree calculation.

During using, we found some weird updating case and made the smallest reproducible prototype. we add a test_updates_root test case to track updating, but nothing generated. and then we add a test_top_branch_logic to track every branch in codebase update. And we think the store_branch_node has some logic bug.

This bug will leads to that some branch nodes are not pushed into updates currently, but it doesn't affect the root calculation progress since it has no matters with the stack logic.

To reproduce the error, maybe you can just call test_updates_root directly. The origin call will get an empty updates, which is not reasonable.

We also refer the origin design doc on erigon, seems nothing clear about add_branch logic mentioned.

So, maybe we don't understand the logic currently (like some branch that did not generate updates were intentionally skipped), so can u plz offer some metarial to explain? Or hope this fix helps, we are willing to contribute to this repo constantly.

thanks,

@ferrell-code ferrell-code requested a review from gakonst as a code owner October 8, 2024 13:37
ferrell-code and others added 2 commits October 9, 2024 12:01
Signed-off-by: Charles Ferrell <charlie@manta.network>
zerosnacks added a commit that referenced this pull request Jan 25, 2026
Add proptest coverage improvements to prevent vacuous test passes:

1. arbitrary_branch_updates_complete (ignored):
   - New test that verifies branch updates are complete for multi-leaf tries
   - Marked #[ignore] because it exposes the store_branch_node bug (PR #25)
   - Uses prop_assume!(len >= 2) to focus on meaningful cases

2. Prevent empty/trivial input issues in existing proptests:
   - arbitrary_hashed_root: require non-empty state
   - arbitrary_trie_root_raw_keys: require >= 2 entries
   - arbitrary_trie_root_with_updates: require >= 2 entries
   - arbitrary_deterministic_root: require >= 2 entries
   - arbitrary_common_prefix_stress: require >= 2 entries
   - arbitrary_add_leaf_unchecked_equivalence: require >= 2 entries

These changes ensure proptests exercise meaningful code paths (branching,
updates machinery) rather than passing vacuously on empty/singleton inputs.
Empty trie behavior is already covered by the dedicated empty() unit test.

Co-authored-by: Amp <amp@ampcode.com>
Amp-Thread-ID: https://ampcode.com/threads/T-019bf62a-fcc3-770a-b813-cf3151d34bf1
zerosnacks added a commit that referenced this pull request Jan 25, 2026
Previously, branch nodes were only stored in `updated_branch_nodes` if they
had children that were themselves branches (non-empty tree_masks/hash_masks).
This caused branch nodes with only leaf children to be silently omitted from
updates, breaking incremental trie sync for:
- Small contracts with few storage slots
- Sparse trie regions where all children are leaves
- Fresh tries built without existing database data

The fix adds a third condition: store the branch if its parent references it
as a hashed child (which we just set in the parent's hash_mask). This ensures
all branch nodes that would be looked up during trie traversal are present.

Changes:
- Root node (len == 0) is always stored
- Non-root nodes are stored if they have branch children OR if the parent
  references this child in its hash_mask
- Updated test expectations for tree_mask (now correctly marks stored children)
- Added regression tests for the bug scenarios

Closes: #25
Co-authored-by: Amp <amp@ampcode.com>
Amp-Thread-ID: https://ampcode.com/threads/T-019bf607-2e34-74e0-b2d6-df3285ef82df
zerosnacks added a commit that referenced this pull request Jan 25, 2026
Previously, branch nodes were only stored in `updated_branch_nodes` if they
had children that were themselves branches (non-empty tree_masks/hash_masks).
This caused branch nodes with only leaf children to be silently omitted from
updates, breaking incremental trie sync for:
- Small contracts with few storage slots
- Sparse trie regions where all children are leaves
- Fresh tries built without existing database data

The fix adds a third condition: store the branch if its parent references it
as a hashed child (which we just set in the parent's hash_mask). This ensures
all branch nodes that would be looked up during trie traversal are present.

Changes:
- Root node (len == 0) is always stored
- Non-root nodes are stored if they have branch children OR if the parent
  references this child in its hash_mask
- Updated test expectations for tree_mask (now correctly marks stored children)
- Added regression tests for the bug scenarios

Closes: #25
Co-authored-by: Amp <amp@ampcode.com>
Amp-Thread-ID: https://ampcode.com/threads/T-019bf607-2e34-74e0-b2d6-df3285ef82df
zerosnacks added a commit that referenced this pull request Jan 25, 2026
Previously, branch nodes were only stored in `updated_branch_nodes` if they
had children that were themselves branches (non-empty tree_masks/hash_masks).
This caused branch nodes with only leaf children to be silently omitted from
updates, breaking incremental trie sync for:
- Small contracts with few storage slots
- Sparse trie regions where all children are leaves
- Fresh tries built without existing database data

The fix adds a third condition: store the branch if its parent references it
as a hashed child (which we just set in the parent's hash_mask). This ensures
all branch nodes that would be looked up during trie traversal are present.

Changes:
- Root node (len == 0) is always stored
- Non-root nodes are stored if they have branch children OR if the parent
  references this child in its hash_mask
- Updated test expectations for tree_mask (now correctly marks stored children)
- Added regression tests for the bug scenarios

Closes: #25
Co-authored-by: Amp <amp@ampcode.com>
Amp-Thread-ID: https://ampcode.com/threads/T-019bf62a-fcc3-770a-b813-cf3151d34bf1
zerosnacks added a commit that referenced this pull request Jan 25, 2026
Previously, branch nodes were only stored in `updated_branch_nodes` if they
had children that were themselves branches (non-empty tree_masks/hash_masks).
This caused branch nodes with only leaf children to be silently omitted from
updates, breaking incremental trie sync for:
- Small contracts with few storage slots
- Sparse trie regions where all children are leaves
- Fresh tries built without existing database data

The fix adds a third condition: store the branch if its parent references it
as a hashed child (which we just set in the parent's hash_mask). This ensures
all branch nodes that would be looked up during trie traversal are present.

Changes:
- Root node (len == 0) is always stored
- Non-root nodes are stored if they have branch children OR if the parent
  references this child in its hash_mask
- Updated test expectations for tree_mask (now correctly marks stored children)
- Added regression tests for the bug scenarios
- Un-ignored tests that now pass with the fix

Closes: #25
Amp-Thread-ID: https://ampcode.com/threads/T-019bf62a-fcc3-770a-b813-cf3151d34bf1
Co-authored-by: Amp <amp@ampcode.com>
zerosnacks added a commit that referenced this pull request Jan 25, 2026
Previously, branch nodes were only stored in `updated_branch_nodes` if they
had children that were themselves branches (non-empty tree_masks/hash_masks).
This caused branch nodes with only leaf children to be silently omitted from
updates, breaking incremental trie sync for:
- Small contracts with few storage slots
- Sparse trie regions where all children are leaves
- Fresh tries built without existing database data

The fix adds a third condition: store the branch if its parent references it
as a hashed child (which we just set in the parent's hash_mask). This ensures
all branch nodes that would be looked up during trie traversal are present.

Changes:
- Root node (len == 0) is always stored
- Non-root nodes are stored if they have branch children OR if the parent
  references this child in its hash_mask
- Updated test expectations for tree_mask (now correctly marks stored children)
- Added regression tests for the bug scenarios
- Un-ignored tests that now pass with the fix

Closes: #25
Amp-Thread-ID: https://ampcode.com/threads/T-019bf62a-fcc3-770a-b813-cf3151d34bf1
Co-authored-by: Amp <amp@ampcode.com>
zerosnacks added a commit that referenced this pull request Jan 25, 2026
Previously, branch nodes were only stored in `updated_branch_nodes` if they
had children that were themselves branches (non-empty tree_masks/hash_masks).
This caused branch nodes with only leaf children to be silently omitted from
updates, breaking incremental trie sync for:
- Small contracts with few storage slots
- Sparse trie regions where all children are leaves
- Fresh tries built without existing database data

The fix adds a third condition: store the branch if its parent references it
as a hashed child (which we just set in the parent's hash_mask). This ensures
all branch nodes that would be looked up during trie traversal are present.

Changes:
- Root node (len == 0) is always stored
- Non-root nodes are stored if they have branch children OR if the parent
  references this child in its hash_mask
- Updated test expectations for tree_mask (now correctly marks stored children)
- Added regression tests for the bug scenarios
- Un-ignored tests that now pass with the fix

Closes: #25
Co-authored-by: Amp <amp@ampcode.com>
Amp-Thread-ID: https://ampcode.com/threads/T-019bf64d-c82f-7682-9933-a37c563468a6
@zerosnacks
Copy link
Copy Markdown
Contributor

Hey @flame4, thanks for reporting this issue and providing the initial fix attempt!

I've implemented a fix in #124 that addresses the root cause I believe. The issue was that store_branch_node wasn't storing branch nodes whose children were all leaves (no sub-branches).

The fix adds a third condition: store the branch if its parent references it as a hashed child. This ensures all branch nodes get stored, including the root and any branch with only leaf children.

I've included your test cases from this PR as regression tests. Would you mind taking a look at #124 to verify it solves the issue you originally encountered?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants