Merged
Conversation
JkLondon
approved these changes
Mar 4, 2026
sudeepdino008
approved these changes
Mar 4, 2026
sudeepdino008
pushed a commit
that referenced
this pull request
Mar 4, 2026
---
1. heap.Fix instead of heap.Pop + heap.Push
The k-way merge uses a min-heap to track the current position in each
segment file. The original code for each key:
top := heap.Pop(&hi.h) // remove root, sift-down: O(log n)
// ... advance reader ...
heap.Push(&hi.h, top) // re-insert, sift-up: O(log n)
We can instead just update the root in-place:
top := hi.h[0] // peek — no removal
// ... advance reader ...
heap.Fix(&hi.h, 0) // only sift-down: O(log n)
Fix is cheaper because it skips the sift-up step that Push always does.
Works because the updated key is always ≥ the old one (files
are sorted ascending).
(-10%)
---
2. File index cached in ReconItem
For every key that passes the filter, the original code searched for the
corresponding history file:
historyItem, ok := hi.hc.getFileDeprecated(top.startTxNum, top.endTxNum)
// linear scan: for i := range ht.files { if files[i].startTxNum == ...
}
With 60 files, that's 60 comparisons per key. Since each ReconItem
already knows its file range (startTxNum/endTxNum), we look up the
index once at construction time and store it:
heap.Push(&h, &ReconItem{..., histFileIdx: j}) // done once per file
// ...
historyItem := hi.hc.files[top.histFileIdx] // O(1) array access per key
---
3. Eliminated SegReaderWrapper
The II segment files store key→txNum-sequence pairs. To read them, the
original code wrapped *seg.Reader in a SegReaderWrapper to
satisfy a stream.KV interface:
advance() → stream.KV interface → SegReaderWrapper → seg.ReaderI
interface → *seg.Reader
Two interface dispatch layers per call. SegReaderWrapper.Next() also had
a redundant HasNext() guard (advance already checked it). Now
ReconItem.g holds *seg.Reader directly:
advance() → *seg.Reader (direct call)
This is why allocations dropped 19% — the SegReaderWrapper heap
allocations are gone.
(-5%)
`erigon seg check-commitment-hist-at-blk-range` got faster: `4m58.038s
-> 4m5.294s`
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
The k-way merge uses a min-heap to track the current position in each segment file. The original code for each key:
top := heap.Pop(&hi.h) // remove root, sift-down: O(log n)
// ... advance reader ...
heap.Push(&hi.h, top) // re-insert, sift-up: O(log n)
We can instead just update the root in-place:
top := hi.h[0] // peek — no removal
// ... advance reader ...
heap.Fix(&hi.h, 0) // only sift-down: O(log n)
Fix is cheaper because it skips the sift-up step that Push always does. Works because the updated key is always ≥ the old one (files
are sorted ascending).
(-10%)
For every key that passes the filter, the original code searched for the corresponding history file:
historyItem, ok := hi.hc.getFileDeprecated(top.startTxNum, top.endTxNum)
// linear scan: for i := range ht.files { if files[i].startTxNum == ... }
With 60 files, that's 60 comparisons per key. Since each ReconItem already knows its file range (startTxNum/endTxNum), we look up the
index once at construction time and store it:
heap.Push(&h, &ReconItem{..., histFileIdx: j}) // done once per file
// ...
historyItem := hi.hc.files[top.histFileIdx] // O(1) array access per key
The II segment files store key→txNum-sequence pairs. To read them, the original code wrapped *seg.Reader in a SegReaderWrapper to
satisfy a stream.KV interface:
advance() → stream.KV interface → SegReaderWrapper → seg.ReaderI interface → *seg.Reader
Two interface dispatch layers per call. SegReaderWrapper.Next() also had a redundant HasNext() guard (advance already checked it). Now
ReconItem.g holds *seg.Reader directly:
advance() → *seg.Reader (direct call)
This is why allocations dropped 19% — the SegReaderWrapper heap allocations are gone.
(-5%)
erigon seg check-commitment-hist-at-blk-rangegot faster:4m58.038s -> 4m5.294s