
HistoryRange: perf #19605

Merged
AskAlexSharov merged 5 commits into main from alex/hist_range_34 on Mar 4, 2026
AskAlexSharov (Collaborator) commented Mar 4, 2026


  1. heap.Fix instead of heap.Pop + heap.Push

The k-way merge uses a min-heap to track the current position in each segment file. For each key, the original code did:

    top := heap.Pop(&hi.h) // remove root, sift-down: O(log n)
    // ... advance reader ...
    heap.Push(&hi.h, top) // re-insert, sift-up: O(log n)

Instead, we can update the root in place:

    top := hi.h[0] // peek, no removal
    // ... advance reader ...
    heap.Fix(&hi.h, 0) // only sift-down: O(log n)

Fix is cheaper because it does a single sift-down, where Pop + Push did a sift-down followed by a sift-up. The root has nothing above it to sift up to, and because each file is sorted ascending, the updated key is always ≥ the old one, so the root can only need to move down.
(-10%)
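
To make the pattern concrete, here is a minimal, self-contained sketch of a k-way merge that advances the heap root in place. It is not the Erigon code: `cursor`, `cursorHeap`, and `mergeWithFix` are hypothetical names, and the inputs are plain string slices rather than segment files. It also shows one detail the snippet above elides: when an input is exhausted, the root still has to be removed with `heap.Pop`.

```go
package main

import (
	"container/heap"
	"fmt"
)

// cursor is a hypothetical stand-in for ReconItem: it tracks the current
// position inside one sorted input.
type cursor struct {
	keys []string // sorted ascending
	pos  int
}

func (c *cursor) key() string { return c.keys[c.pos] }

// cursorHeap is a min-heap of cursors ordered by their current key.
type cursorHeap []*cursor

func (h cursorHeap) Len() int           { return len(h) }
func (h cursorHeap) Less(i, j int) bool { return h[i].key() < h[j].key() }
func (h cursorHeap) Swap(i, j int)      { h[i], h[j] = h[j], h[i] }
func (h *cursorHeap) Push(x any)        { *h = append(*h, x.(*cursor)) }
func (h *cursorHeap) Pop() any {
	old := *h
	n := len(old)
	x := old[n-1]
	old[n-1] = nil
	*h = old[:n-1]
	return x
}

// mergeWithFix merges sorted inputs into one sorted slice. Duplicates
// across inputs are kept; this sketch does no de-duplication.
func mergeWithFix(inputs [][]string) []string {
	h := make(cursorHeap, 0, len(inputs))
	for _, in := range inputs {
		if len(in) > 0 {
			h = append(h, &cursor{keys: in})
		}
	}
	heap.Init(&h)

	var out []string
	for h.Len() > 0 {
		top := h[0] // peek at the smallest key without removing it
		out = append(out, top.key())
		top.pos++
		if top.pos < len(top.keys) {
			heap.Fix(&h, 0) // root key grew: one sift-down restores heap order
		} else {
			heap.Pop(&h) // input exhausted: drop its cursor from the heap
		}
	}
	return out
}

func main() {
	fmt.Println(mergeWithFix([][]string{{"a", "d"}, {"b", "c"}, {"e"}}))
	// prints: [a b c d e]
}
```

The `Pop` + `Push` variant would produce the same output; the difference is only in how much heap maintenance happens per key.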


  2. File index cached in ReconItem

For every key that passes the filter, the original code searched for the corresponding history file:

    historyItem, ok := hi.hc.getFileDeprecated(top.startTxNum, top.endTxNum)
    // linear scan: for i := range ht.files { if files[i].startTxNum == ... }

With 60 files, that is up to 60 comparisons per key. Since each ReconItem already knows its file range (startTxNum/endTxNum), we now look up the index once at construction time and store it:

    heap.Push(&h, &ReconItem{..., histFileIdx: j}) // done once per file
    // ...
    historyItem := hi.hc.files[top.histFileIdx] // O(1) array access per key
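
The same idea in isolation, as a hedged sketch: resolve the file index once when the item is built, then index directly on every key. The types `histFile` and `item` and the helpers `findFile` and `newItem` are hypothetical stand-ins, not the actual Erigon definitions.

```go
package main

import "fmt"

// histFile is a hypothetical stand-in for a history file that covers
// the txNum range [startTxNum, endTxNum).
type histFile struct {
	startTxNum, endTxNum uint64
	name                 string
}

// findFile is the linear scan that the original code ran for every key.
func findFile(files []histFile, start, end uint64) (int, bool) {
	for i := range files {
		if files[i].startTxNum == start && files[i].endTxNum == end {
			return i, true
		}
	}
	return -1, false
}

// item is a hypothetical stand-in for ReconItem.
type item struct {
	startTxNum, endTxNum uint64
	histFileIdx          int // resolved once, at construction time
}

// newItem pays for the linear scan a single time per file.
func newItem(files []histFile, start, end uint64) (*item, bool) {
	idx, ok := findFile(files, start, end)
	if !ok {
		return nil, false
	}
	return &item{startTxNum: start, endTxNum: end, histFileIdx: idx}, true
}

func main() {
	files := []histFile{
		{startTxNum: 0, endTxNum: 100, name: "v1-0-100"},
		{startTxNum: 100, endTxNum: 200, name: "v1-100-200"},
	}
	it, ok := newItem(files, 100, 200)
	if !ok {
		panic("no file for range")
	}
	// Per-key hot path: a direct index, no scan.
	for k := 0; k < 3; k++ {
		fmt.Println(files[it.histFileIdx].name)
	}
}
```

The cached index stays valid only while the file list is not reordered or rebuilt during iteration; that is an assumption of this sketch.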


  3. Eliminated SegReaderWrapper

The inverted index (II) segment files store key→txNum-sequence pairs. To read them, the original code wrapped *seg.Reader in a SegReaderWrapper to satisfy a stream.KV interface:

    advance() → stream.KV interface → SegReaderWrapper → seg.ReaderI interface → *seg.Reader

That is two interface dispatch layers per call. SegReaderWrapper.Next() also had a redundant HasNext() guard (advance already checked it). Now ReconItem.g holds *seg.Reader directly:

    advance() → *seg.Reader (direct call)

This is why allocations dropped 19%: the SegReaderWrapper heap allocations are gone.

(-5%)

erigon seg check-commitment-hist-at-blk-range got faster: 4m58.038s -> 4m5.294s (about 18% less wall-clock time)
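
A small sketch of the before and after shapes, under stated assumptions: `reader`, `wordReader`, `stream`, and `wrapper` are hypothetical stand-ins for `*seg.Reader`, `seg.ReaderI`, `stream.KV`, and `SegReaderWrapper`, and their method sets are simplified. Both paths return the same data; the direct path just removes the adapter object and the two interface hops.

```go
package main

import (
	"errors"
	"fmt"
)

// reader is a hypothetical stand-in for *seg.Reader.
type reader struct {
	words []string
	pos   int
}

func (r *reader) HasNext() bool { return r.pos < len(r.words) }
func (r *reader) Next() string {
	w := r.words[r.pos]
	r.pos++
	return w
}

// wordReader and stream are hypothetical interfaces standing in for
// seg.ReaderI and stream.KV.
type wordReader interface {
	HasNext() bool
	Next() string
}

type stream interface {
	HasNext() bool
	Next() (string, error)
}

// wrapper adapts a wordReader to the stream interface.
type wrapper struct{ r wordReader }

func (w *wrapper) HasNext() bool { return w.r.HasNext() }
func (w *wrapper) Next() (string, error) {
	if !w.r.HasNext() { // redundant when the caller already checked HasNext
		return "", errors.New("no more items")
	}
	return w.r.Next(), nil
}

// drainViaStream goes through two interface layers on every call.
func drainViaStream(s stream) []string {
	var out []string
	for s.HasNext() {
		w, err := s.Next()
		if err != nil {
			break
		}
		out = append(out, w)
	}
	return out
}

// drainDirect calls the concrete type, so the compiler can resolve
// (and possibly inline) each call statically.
func drainDirect(r *reader) []string {
	var out []string
	for r.HasNext() {
		out = append(out, r.Next())
	}
	return out
}

func main() {
	words := []string{"key1", "val1", "key2", "val2"}
	fmt.Println(drainViaStream(&wrapper{r: &reader{words: words}}))
	fmt.Println(drainDirect(&reader{words: words}))
}
```

Whether a given wrapper value is actually heap-allocated depends on Go's escape analysis, so the allocation savings should be confirmed with a benchmark or profile rather than assumed from the shape of the code.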

AskAlexSharov marked this pull request as draft March 4, 2026 10:30
AskAlexSharov changed the title from "Alex/hist range 34" to "hist:" Mar 4, 2026
AskAlexSharov changed the title from "hist:" to "HistoryRange: perf" Mar 4, 2026
AskAlexSharov marked this pull request as ready for review March 4, 2026 10:45
AskAlexSharov merged commit ccf360c into main Mar 4, 2026
33 of 38 checks passed
AskAlexSharov deleted the alex/hist_range_34 branch March 4, 2026 11:36
sudeepdino008 pushed a commit that referenced this pull request Mar 4, 2026

3 participants