[SharovBot] fix(commitment): count first-nibble branches in CanDoConcurrentNext to fix flaky test#19568
Open
Giulio2002 wants to merge 1 commit intomainfrom
Open
[SharovBot] fix(commitment): count first-nibble branches in CanDoConcurrentNext to fix flaky test#19568Giulio2002 wants to merge 1 commit intomainfrom
Giulio2002 wants to merge 1 commit intomainfrom
Conversation
…o fix flaky test Test_Trie_CorrectSwitchForConcurrentAndSequential fails intermittently (CI run 22561824327, job 65349926766) because CanDoConcurrentNext() only checks if nibble-0 has a branch node. The test uses a globally-shared rand.NewSource(42) protected by a mutex, so different parallel test runs consume the random stream at different offsets — occasionally producing a 150-key set where zero (or one) keys land at nibble 0 after keccak256 hashing. In that case the function returns false even though the trie is wide enough to parallelize, causing the require.True at line 280 to fail. Root cause: arbitrary choice of nibble 0 as the single probe. P(zero keys at nibble 0 with 150 random keccak hashes) ≈ (15/16)^150 ≈ 5.6e-5 per run — invisible in dev but visible across thousands of CI runs. Fix: instead of probing a single nibble, count how many of the 16 first-nibble sub-tries have stored branch data. Require at least 4 to justify spawning 16 parallel goroutines. - 150 random keys: ~15.997 of 16 nibbles filled (P(< 4 filled) ≈ 0) → true ✓ - 2 remaining keys: at most 2 nibbles filled → false ✓ This makes the decision independent of any particular nibble's occupancy, eliminating the flakiness while keeping the correct semantic (small key sets → sequential).
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
[SharovBot]
Problem
Test_Trie_CorrectSwitchForConcurrentAndSequentialfails intermittently on CI:Failing job: https://github.com/erigontech/erigon/actions/runs/22561824327/job/65349926766
Root Cause
CanDoConcurrentNext()only checks if nibble 0 has a branch node:The test generates 150 keys using a globally shared
rand.New(rand.NewSource(42))protected by a mutex. Because multiple tests run in parallel they consume the random stream at different offsets each run. Occasionally, the resulting 150 keys hash (keccak256) such that zero or one keys have first nibble 0 — the nibble-0 branch is empty, so the function returnsfalseeven though the trie is perfectly wide enough to parallelize.P(zero keys at nibble 0 with 150 random keccak hashes) ≈ (15/16)^150 ≈ 5.6×10⁻⁵ per run. Invisible in dev, but observable across thousands of CI runs.
Fix
Instead of probing a single arbitrary nibble, count how many of the 16 first-nibble sub-tries have stored branch data. Return
trueonly if at least 4 nibbles are populated.true✓false✓The threshold of 4 makes the decision independent of any particular nibble's occupancy and preserves the correct semantic: small key sets are processed sequentially.
Testing
go test ./execution/commitment/... -run Test_Trie_CorrectSwitchForConcurrentAndSequential -count=5→ 5/5 PASS./execution/commitment/...suite → all pass