
etl: enable Append/AppendDup for certain state tables during etl loading #19956

Closed
sudeepdino008 wants to merge 5 commits into main from etl-optimization

Conversation

@sudeepdino008
Member

@sudeepdino008 sudeepdino008 commented Mar 17, 2026

Summary

Enable AppendDup for DupSort Keys tables (ii/history keys) by sorting ETL entries by (key, value) at flush time. Keys tables use txNum as the key, and since txNums are always increasing during execution, new data is always past the end of the table — making Append/AppendDup safe. With value sorting, AppendDup works correctly for these DupSort tables.

  • For tables with random/scattered keys (e.g. AccountIdx keyed by address, StorageHistoryVals keyed by address+slot, LogTopicsIdx keyed by topic hash), the first ETL key is almost always ≤ the last key in the table, so canUseAppend is false for the entire load and everything falls back to Put.
  • A dynamic canUseAppend approach (switching between Append and Put mid-load based on key comparison) was tried but did not yield better results — when keys are randomly distributed, the vast majority of entries require Put anyway, and the overhead of per-entry switching negates any benefit from the rare Append.

Changes

  • Sort ETL entries by (key, value) at flush time for Keys tables, enabling AppendDup — all records use AppendDup instead of Put
  • Selective IdentityLoadFunc migration: only indexKeys collectors and non-DupSort history vals use it; everything else keeps loadFunc (forces Put) to preserve insertion-order-dependent upsert logic (domain non-largeVals)
  • Deduplicate consecutive identical (key, value) pairs in the AppendDup path — retried transactions can produce duplicate entries that Put handles idempotently but AppendDup rejects with MDBX_EKEYMISMATCH
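The deduplication step above can be sketched like this — a minimal version with a hypothetical `entry` type, assuming entries are already sorted by (key, value) so exact duplicates sit adjacent:

```go
package main

import (
	"bytes"
	"fmt"
)

type entry struct {
	key, value []byte
}

// dedupConsecutive drops adjacent identical (key, value) pairs. Put absorbs
// such duplicates idempotently, but AppendDup on a DupSort table rejects an
// exact re-insert with MDBX_EKEYMISMATCH, so they must be skipped up front.
func dedupConsecutive(entries []entry) []entry {
	out := make([]entry, 0, len(entries))
	for _, e := range entries {
		if n := len(out); n > 0 &&
			bytes.Equal(e.key, out[n-1].key) &&
			bytes.Equal(e.value, out[n-1].value) {
			continue // exact duplicate, e.g. produced by a retried transaction
		}
		out = append(out, e)
	}
	return out
}

func main() {
	es := []entry{
		{[]byte{1}, []byte{5}},
		{[]byte{1}, []byte{5}}, // duplicate from a retry
		{[]byte{1}, []byte{6}},
	}
	fmt.Println(len(dedupConsecutive(es))) // prints 2
}
```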

Per-table behavior

| Collector | Sort values? | IdentityLoadFunc? | Observed result |
| --- | --- | --- | --- |
| ii/history keys (all DupSort) | Yes | Yes | All records via AppendDup |
| ii values / Idx (all DupSort) | No | No (loadFunc) | All Put — entity keys (addresses, topic hashes) are scattered |
| history vals — Account, Storage, Commitment, Receipt (DupSort) | No | No (loadFunc) | All Put — entity keys are scattered |
| history vals — RCache (non-DupSort) | No | Yes | Allows Append when canUseAppend is true (1st commit only in practice — key includes address prefix → scattered after) |
| history vals — Code (non-DupSort) | No | Yes | Allows Append when canUseAppend is true (1st commit only in practice — key=address+txNum → scattered after) |
| domain vals (all) | No | No (custom loadFunc) | All Put — unchanged from main |

Safety constraint

The heap's DupSort value comparison is gated behind haveSortingGuaranties && isDupSort && c.sortValues — never applied to domain non-largeVals flush which uses a custom loadFunc with SeekBothRange/DeleteCurrent/Put that depends on insertion order.
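The gating described above can be sketched as a heap ordering function. This is an illustrative version with a hypothetical `heapElem` type, not the actual `db/etl` heap code:

```go
package main

import (
	"bytes"
	"fmt"
)

// heapElem is a hypothetical stand-in for one element in the merge heap.
type heapElem struct {
	key, value     []byte
	insertionOrder int64
}

// less compares keys first; the value comparison fires only when all three
// gate conditions hold, so the domain non-largeVals flush (whose custom
// loadFunc relies on insertion order) is never affected. Ties otherwise
// fall back to insertion order for a stable merge.
func less(a, b heapElem, haveSortingGuaranties, isDupSort, sortValues bool) bool {
	if c := bytes.Compare(a.key, b.key); c != 0 {
		return c < 0
	}
	if haveSortingGuaranties && isDupSort && sortValues {
		if c := bytes.Compare(a.value, b.value); c != 0 {
			return c < 0
		}
	}
	return a.insertionOrder < b.insertionOrder
}

func main() {
	a := heapElem{[]byte{1}, []byte{9}, 0}
	b := heapElem{[]byte{1}, []byte{2}, 1}
	fmt.Println(less(a, b, true, true, true))  // false: value gate orders 2 before 9
	fmt.Println(less(a, b, true, true, false)) // true: insertion order wins with gate off
}
```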

Future: Receipt history vals

ReceiptHistoryVals is DupSort (HistoryLargeValues: false) and currently uses loadFunc (all Put). However, Receipt entity keys may be sequential enough for canUseAppend to hold, meaning AppendDup could also work here. This would require adding IdentityLoadFunc and potentially .SortValues(true) for the Receipt history vals collector. Left as a follow-up to validate separately.

Sepolia validation

Ran stage_exec --batchSize=10mb for 10 minutes on identical Sepolia state (2 step ranges removed):

| Metric | main (baseline) | Feature branch |
| --- | --- | --- |
| Gas mismatches | 0 | 0 |
| MDBX errors | 0 | 0 |
| Commit cycles | 25 | 21 |
| Keys tables | all Put | all AppendDup |

Test plan

  • go test ./db/etl/ -count=1 — all pass (includes 6 new tests + 1 benchmark)
  • go test ./db/state/ -count=1 -short — all pass
  • make erigon integration — builds clean
  • make lint — 0 issues (2 runs)
  • Sepolia stage_exec — 0 gas mismatches, 0 errors across 21 commit cycles

🤖 Generated with Claude Code

Sort ETL entries by (key, value) at flush time for ii/history Keys tables,
enabling AppendDup where keys (txNums) are always ascending. This replaces
Put with AppendDup for Keys tables, achieving 100% append ratio.

Changes:
- Add SortByKeyAndValue() to sortableBuffer for (key, value, insertionOrder) sort
- Add sortValues flag to Collector with chainable SortValues(bool) setter
- Pass sortValues through FlushToDisk/FlushToDiskAsync/sortAndFlush
- Add sortValues field to Heap for DupSort-aware merge (gated behind
  haveSortingGuaranties && isDupSort && sortValues)
- indexKeys collectors: .SortValues(true) + IdentityLoadFunc (AppendDup)
- index collectors: loadFunc (Put, unchanged)
- DupSort history vals: loadFunc (Put, unchanged)
- Non-DupSort history vals: IdentityLoadFunc (enables Append)
- Domain vals: unchanged (custom loadFunc preserved)
- Deduplicate consecutive identical (key, value) pairs in AppendDup path
  to handle parallel executor retries
- Add per-Load metrics: put/append/appendDup/dupSkip counts with appendRatio

Validated on Sepolia stage_exec: 0 gas mismatches, 0 MDBX errors,
Keys tables at 100% AppendDup across 21 commit cycles.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@sudeepdino008 sudeepdino008 marked this pull request as draft March 17, 2026 11:31
sudeepdino008 and others added 2 commits March 17, 2026 12:43
…story annotation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@sudeepdino008 sudeepdino008 changed the title etl, state: enable AppendDup for DupSort Keys tables via value sorting etl: enable Append/AppendDup for certain state tables during etl loading Mar 17, 2026
@sudeepdino008
Member Author

sudeepdino008 commented Mar 17, 2026

| Metric | main | Feature |
| --- | --- | --- |
| Blocks in ~10 min | 3,963 | 3,499 (~12% fewer) |
| Commit cycles | 25 | 21 |
| Avg bdur (flush) | ~29ms | ~55ms |

The feature branch commits are taking roughly 2x longer (bdur). This is likely because SortByKeyAndValue() is ~10x slower than Sort() (as the benchmark showed), and this cost is paid at every flush for the Keys table collectors.

TODO:

  • try with chaintip mode
  • try with larger BatchSize
  • any way to make SortByKeyAndValue faster? (it shouldn't be that slow)

Updates:

  • same perf in chaintip mode; so the 10x slowdown only shows up with larger batchSizes
  • trying an optimized impl of SortByKeyAndValue now

@AskAlexSharov
Collaborator

look at new etl sort in main: it does:
return int(a.insertionOrder - b.insertionOrder)
to do "stable sort" by calling slices.SortFunc

But there is one more trick which I decided not to release yet: cast the first 8-byte prefix of the value to a u64 and use prefix1 < prefix2 instead of bytes.Compare(val1, val2). In my head: it will produce "almost sorted" values fast, and then we can call .Put instead of .AppendDup.
Here: #19886

Only 1 problem: high chance that values are already sorted?
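The prefix trick described in the comment above could look roughly like this — a sketch with a hypothetical helper name, not the actual #19886 implementation:

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// prefix8 reads the first 8 bytes of v as a big-endian u64, zero-padding
// values shorter than 8 bytes. Comparing these uints agrees with
// bytes.Compare on the leading 8-byte prefix at a fraction of the cost,
// yielding "almost sorted" values: exact ordering except when values
// share their first 8 bytes (the padding makes short prefixes tie).
func prefix8(v []byte) uint64 {
	var buf [8]byte
	copy(buf[:], v)
	return binary.BigEndian.Uint64(buf[:])
}

func main() {
	a := []byte{0x00, 0x01}
	b := []byte{0x02}
	fmt.Println(prefix8(a) < prefix8(b)) // true
	fmt.Println(bytes.Compare(a, b) < 0) // true: agrees with full comparison here
}
```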

@AskAlexSharov
Collaborator

AskAlexSharov commented Mar 18, 2026

also now we have stepSize/4 - maybe just enable commitment.history.largeValues=true and commitment.largeValues=true - then commitment will not use a DupSort table - and we will not need to sort its values. I created PR: #19966 - will run this experiment

@sudeepdino008
Member Author

things I learned:

  • dynamic switching of canUseAppend -- not worth it; we have two kinds of data -- either it's coming in ascending order (receipts/rcache or the historyKeys table), so no dynamic switching is needed there; OR we have random keys (history values or ii values tables), where the db will probably already contain some lexicographically bigger keys (0xff..), in which case dynamic switching kicks in too late.

  • chaintip perf has remained the same; so this will specifically be about etl.Load timings (and we need to measure against SortByKeysAndValues, which is slower than SortByKeys)

  • stage_exec (batchSize=512mb) improves slightly, but results are inconclusive.

  • don't want to use Append/AppendDup for the history values tables - I think it's not worth it.

  • historyKeys tables show about a 25% improvement in loading, but the value-sorting delay offsets it.
