
Commit 193d04a

Authored by mh0lt (Mark Holt) and claude
State Cache Consolidation (PR erigontech#1 of the perf stack) (erigontech#21380)
This is **PR erigontech#1 of a 3-PR perf stack**. It consolidates state caching across all modes (sync, tip-tracking, integration and testing) so the state cache is either **on and completely trustworthy**, or **off to measure**, never off because it unexpectedly breaks things.

The caches (the account/storage `StateCache` and the commitment `BranchCache`) are an **internal implementation detail of `SharedDomains`**. No external entity accesses or mutates them directly: callers drive state through `Flush` / `Commit` / `GetLatest` / `DomainPut`, and the full cache lifecycle (population, invalidation, commit-gating) is owned inside `SharedDomains`.

## What this PR contains

### `BranchCache`: single aggregator-scope commitment cache

- Aggregator-lifetime cache: pinned root slot + bounded LRU tail, behind the `sd.mem` chain so unwinds and fork-validations see consistent state.
- Wired into the trie read + encoder write paths.
- Tx-precise unwind invalidation: entries are stamped with their per-key write `txNum`; `sd.Unwind` evicts everything above the unwind watermark (`BranchCache.UnwindTo`).

### One switch for all caches

The `BranchCache` is a *type of* state cache, so it rides the existing `USE_STATE_CACHE` toggle rather than getting its own env. One operator switch turns **all** caching off. That is the relevant operation when bisecting a state-root mismatch, where an operator shouldn't have to reason about the interaction of several independent caches. (This is a deliberate deviation from the review's "add a separate `BranchCache` kill-switch" suggestion, flagged for confirmation.)

### The BranchCache reflects only committed state, by construction

The BranchCache can never hold a value a failed commit rolled back: `SharedDomains.Commit` flushes the in-memory batch into the tx, commits, and **only then** applies the flushed commitment branches to the cache (the flush is implicit in committing; a failed commit applies nothing). Plain `Flush` (used by callers that own their own commit, e.g. offline tools) never touches the cache; read-through populates from committed files. The caches are an internal detail of `SharedDomains`; nothing else writes to them.

> **StateCache no-poisoning is erigontech#21386, not this PR.** Unlike the BranchCache, the account/storage StateCache is an *in-flight, cross-transaction* cache: it holds prior txs' not-yet-committed writes within a batch, and later txs read them from it. So it can't be made commit-safe by simply invalidating on write (that breaks cross-tx reads in serial exec). Its no-poisoning is the txNum/epoch rework in erigontech#21386; this PR keeps its existing ValidateAndPrepare/unwind invalidation.

### BUG erigontech#21138: parallel-exec from-0 wrong trie root

`ResetExec` wipes the commitment DB table; the aggregator's in-memory `BranchCache` could still reference the just-deleted trie nodes, so a from-0 re-exec served stale entries when computing block 0's commitment → wrong root, dropping genesis-allocated balances no later block touched (mainnet block 46147, `0xA1E4380A3B1f749673E270229993eE55F35663b4`).

Fix: `ResetExec` clears the aggregator's `BranchCache`. `TestFromZero_GenesisAllocPreservedAfterResetReExec` passes on current `main`; the test's value here is keeping *this PR's* cache safe across reset, not fixing a live `main` bug.

## Follow-ups (the rest of the stack)

- **erigontech#21386 (PR erigontech#2 of the stack), StateCache LRU + Mode rework:** consistency + no performance drop-off at a 1 GB cache for long-running nodes; re-adds the warm StateCache repopulation deferred above, under its `txNum`/`epoch` model.
- **Pinning** (the stack's third step): **no PR yet**; in progress and under test on branch [`mh/branch-cache-trunk-pin`](https://github.com/erigontech/erigon/tree/mh/branch-cache-trunk-pin), to be **re-benchmarked before merge**.
- **erigontech#21739**, interface-unification follow-up: collapse the duck-typed `GetLatest` variants into a single metered, `txNum`-returning `GetLatest`.

## Testing

Behaves identically across parallel and serial exec, confirmed in CI across both exec modes. Unit coverage for the BranchCache (tiers, tx-precise `UnwindTo`, commit-gated population) plus the engine/exec-module FCU commit paths.

Given today's changes (the commit-gating of both caches), we will do another A/B performance run before merging.

---------

Co-authored-by: Mark Holt <erigon@dev-bm-e3-ethmainnet-n4.erigon.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 18ad562 commit 193d04a

39 files changed

Lines changed: 1467 additions & 1175 deletions

cmd/evm/staterunner.go

Lines changed: 12 additions & 2 deletions
@@ -35,6 +35,7 @@ import (
 	"github.com/erigontech/erigon/common/log/v3"
 	"github.com/erigontech/erigon/db/datadir"
 	"github.com/erigontech/erigon/db/kv/temporal/temporaltest"
+	"github.com/erigontech/erigon/db/state/execctx"
 	"github.com/erigontech/erigon/execution/tests/testutil"
 	"github.com/erigontech/erigon/execution/tracing/tracers/logger"
 	"github.com/erigontech/erigon/execution/vm"
@@ -224,7 +225,15 @@ func runStateTest(ctx *cli.Context, cfg vm.Config, fname string) ([]testResult,
 	}
 	defer tx.Rollback()
 
-	statedb, root, err := test.Run(nil, tx, st, cfg, dirs)
+	// Per-subtest SD: closed without Flush so its writes never enter the branch cache.
+	sd, err := execctx.NewSharedDomains(context.Background(), tx, log.New())
+	if err != nil {
+		result.Pass, result.Error = false, err.Error()
+		return
+	}
+	defer sd.Close()
+
+	statedb, root, err := test.Run(nil, sd, tx, st, cfg, dirs)
 	if err != nil {
 		result.Pass, result.Error = false, err.Error()
 	}
@@ -238,8 +247,9 @@ func runStateTest(ctx *cli.Context, cfg vm.Config, fname string) ([]testResult,
 		}
 	}
 	if bench {
+		// Reuse the subtest's tx+sd: a second concurrent rwtx on the same env would deadlock.
 		_, stats, _ := timedExec(true, func() ([]byte, uint64, error) {
-			_, _, gasUsed, _ := test.RunNoVerify(nil, tx, st, cfg, dirs)
+			_, _, gasUsed, _ := test.RunNoVerify(nil, sd, tx, st, cfg, dirs)
 			return nil, gasUsed, nil
 		})
 		result.Stats = &stats

cmd/integration/commands/stages.go

Lines changed: 12 additions & 10 deletions
@@ -758,13 +758,10 @@ func stageExec(db kv.TemporalRwDB, ctx context.Context, logger log.Logger) error
 				return err
 			}
 		}
-		if err := doms.Flush(ctx, tx); err != nil {
-			return err
-		}
-		doms.ClearRam(true)
-		if err := tx.Commit(); err != nil {
+		if err := doms.Commit(ctx, tx); err != nil {
 			return err
 		}
+		doms.Close()
 		if tx, err = db.BeginTemporalRw(ctx); err != nil {
 			return err
 		}
@@ -780,15 +777,20 @@ func stageExec(db kv.TemporalRwDB, ctx context.Context, logger log.Logger) error
 			if err := tx.Commit(); err != nil {
 				return err
 			}
+			if bn+1 >= block { // last block committed; nothing more to run
+				break
+			}
 			if tx, err = db.BeginTemporalRw(ctx); err != nil {
 				return err
 			}
+			// Fresh SD for the next block: a committed SD is never reused.
+			if doms, err = execctx.NewSharedDomains(ctx, tx, logger); err != nil {
+				return err
+			}
+			doms.SetInMemHistoryReads(false)
 		}
-		if err := doms.Flush(ctx, tx); err != nil {
-			return err
-		}
-		doms.ClearRam(true)
-		return tx.Commit()
+		doms.Close()
+		return nil
 	}
 	agg := (db.(dbstate.HasAgg).Agg()).(*dbstate.Aggregator)
 	blockSnapBuildSema := semaphore.NewWeighted(int64(runtime.NumCPU()))
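
The lifecycle this hunk enforces ("a committed SD is never reused") can be sketched with a toy type. Everything here is an assumption for illustration: `session`, `newSession` and `runBlocks` are hypothetical names, and `Close` is assumed to be idempotent, which the diff itself does not confirm for `SharedDomains`.

```go
package main

import "fmt"

// session is a hypothetical stand-in for a SharedDomains-like object that
// must be either committed or closed, and never reused after a commit.
type session struct {
	id        int
	committed bool
	closed    bool
}

func newSession(id int) *session { return &session{id: id} }

// Commit fails if the session was already committed or closed.
func (s *session) Commit() error {
	if s.committed || s.closed {
		return fmt.Errorf("session %d: already finished", s.id)
	}
	s.committed = true
	return nil
}

// Close is idempotent in this toy model.
func (s *session) Close() { s.closed = true }

// runBlocks processes blocks [from, to) using one fresh session per block
// and returns how many sessions were created.
func runBlocks(from, to int) (created int, err error) {
	s := newSession(from)
	created = 1
	for bn := from; bn < to; bn++ {
		// ... execute block bn against s ...
		if err := s.Commit(); err != nil {
			return created, err
		}
		s.Close()
		if bn+1 >= to { // last block committed; nothing more to run
			break
		}
		s = newSession(bn + 1) // fresh session: the committed one is never reused
		created++
	}
	s.Close() // covers the empty-range case where the loop never ran
	return created, nil
}

func main() {
	created, err := runBlocks(0, 3)
	fmt.Println(created, err)
}
```

The early `break` mirrors the added `bn+1 >= block` check: without it the loop would open a transaction and a session for a block that is never executed.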

cmd/integration/commands/state_stages.go

Lines changed: 8 additions & 6 deletions
@@ -169,7 +169,7 @@ func syncBySmallSteps(db kv.TemporalRwDB, builderConfig buildercfg.BuilderConfig
 	if err != nil {
 		return err
 	}
-	defer sd.Close()
+	defer func() { sd.Close() }() // closes whichever SD is current after the commit loop swaps it
 	sd.SetInMemHistoryReads(false)
 
 	var batchSize datasize.ByteSize
@@ -283,18 +283,20 @@ func syncBySmallSteps(db kv.TemporalRwDB, builderConfig buildercfg.BuilderConfig
 				return err
 			}
 
-			if err = sd.Flush(ctx, tx); err != nil {
-				return err
-			}
-			sd.ClearRam(true)
-			if err = tx.Commit(); err != nil {
+			if err = sd.Commit(ctx, tx); err != nil {
 				return err
 			}
+			sd.Close()
 
 			if tx, err = db.BeginTemporalRw(ctx); err != nil {
 				return err
 			}
 			defer tx.Rollback()
+			// Fresh SD: a committed SD is never reused.
+			if sd, err = execctx.NewSharedDomains(ctx, tx, logger1); err != nil {
+				return err
+			}
+			sd.SetInMemHistoryReads(false)
 		}
 
 		//receiptsInDB := rawdb.ReadReceiptsByNumber(tx, progress(tx, stages.Execution)+1)
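
The switch from `defer sd.Close()` to `defer func() { sd.Close() }()` matters because Go evaluates a deferred call's receiver when the `defer` statement executes, not when the function returns. A minimal sketch of the difference, using a hypothetical `res` type rather than `SharedDomains`:

```go
package main

import "fmt"

// res is a hypothetical closable resource.
type res struct {
	name   string
	closed bool
}

func (r *res) Close() { r.closed = true }

// directDefer defers cur.Close() and then swaps cur. The receiver is
// evaluated at the defer statement, so the FIRST resource is closed.
func directDefer() (first, second *res) {
	first = &res{name: "first"}
	second = &res{name: "second"}
	cur := first
	func() {
		defer cur.Close()
		cur = second
	}()
	return first, second
}

// closureDefer defers a closure that reads cur when it runs, so the
// resource that is current at function exit (the SECOND) is closed.
func closureDefer() (first, second *res) {
	first = &res{name: "first"}
	second = &res{name: "second"}
	cur := first
	func() {
		defer func() { cur.Close() }()
		cur = second
	}()
	return first, second
}

func main() {
	a, b := directDefer()
	fmt.Println(a.closed, b.closed) // true false
	c, d := closureDefer()
	fmt.Println(c.closed, d.closed) // false true
}
```

In the hunk above the swapped-out SD is closed explicitly right after `Commit`, so the closure form only has to take care of whichever SD is current when the function returns.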

common/maphash/maphash.go

Lines changed: 23 additions & 0 deletions
@@ -163,3 +163,26 @@ func (l *LRU[V]) SetByHash(hash uint64, value V) {
 func (l *LRU[V]) ContainsByHash(hash uint64) bool {
 	return l.cache.Contains(hash)
 }
+
+// DeleteByHash removes the entry under the pre-computed hash. Use
+// alongside Range when the byte-key is unknown.
+func (l *LRU[V]) DeleteByHash(hash uint64) {
+	l.cache.Remove(hash)
+}
+
+// Range iterates over every (hash, value) pair without affecting LRU
+// recency (uses Peek under the hood). Iteration order is unspecified.
+// Return false from fn to stop early.
+//
+// The original byte-key is not recoverable; use the hash as identity (same key → same hash, collisions aside).
+func (l *LRU[V]) Range(fn func(hash uint64, v V) bool) {
+	for _, h := range l.cache.Keys() {
+		v, ok := l.cache.Peek(h)
+		if !ok {
+			continue
+		}
+		if !fn(h, v) {
+			return
+		}
+	}
+}
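
One plausible use of `Range` together with `DeleteByHash` is a watermark sweep: walk every entry and drop the ones stamped above a given `txNum`. The sketch below is an assumption about how the two could be combined, not code from this commit; `hashCache`, `stamped` and `evictAbove` are hypothetical, and the toy cache is a plain map with no eviction policy.

```go
package main

import "fmt"

// stamped is a hypothetical cache value tagged with its write txNum.
type stamped struct {
	value []byte
	txNum uint64
}

// hashCache is a toy, map-backed stand-in for a hash-keyed LRU.
type hashCache[V any] struct {
	m map[uint64]V
}

func newHashCache[V any]() *hashCache[V] {
	return &hashCache[V]{m: map[uint64]V{}}
}

func (c *hashCache[V]) SetByHash(h uint64, v V) { c.m[h] = v }

func (c *hashCache[V]) DeleteByHash(h uint64) { delete(c.m, h) }

func (c *hashCache[V]) Len() int { return len(c.m) }

// Range visits a snapshot of the keys, so fn may delete entries while
// iterating. Entries removed after the snapshot are skipped.
func (c *hashCache[V]) Range(fn func(h uint64, v V) bool) {
	keys := make([]uint64, 0, len(c.m))
	for h := range c.m {
		keys = append(keys, h)
	}
	for _, h := range keys {
		v, ok := c.m[h]
		if !ok {
			continue
		}
		if !fn(h, v) {
			return
		}
	}
}

// evictAbove removes entries stamped above watermark and returns how many
// were removed.
func evictAbove(c *hashCache[stamped], watermark uint64) int {
	removed := 0
	c.Range(func(h uint64, e stamped) bool {
		if e.txNum > watermark {
			c.DeleteByHash(h)
			removed++
		}
		return true
	})
	return removed
}

func main() {
	c := newHashCache[stamped]()
	c.SetByHash(1, stamped{txNum: 5})
	c.SetByHash(2, stamped{txNum: 50})
	c.SetByHash(3, stamped{txNum: 500})
	fmt.Println(evictAbove(c, 10), c.Len()) // 2 1
}
```

Taking the key snapshot before calling `fn` is what makes deletion during iteration safe here, and it is also why the real `Range` tolerates a failed `Peek` by skipping the key.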

db/integrity/commitment_integrity.go

Lines changed: 9 additions & 7 deletions
@@ -955,7 +955,6 @@ func checkCommitmentHistValBucket(ctx context.Context, tx kv.TemporalTx, br serv
 // checkCommitmentHistAtBlkWithIdx checks commitment for blockNum using the pre-built
 // per-domain key index from ChangedKeysPerBlockIdx.
 func checkCommitmentHistAtBlkWithIdx(ctx context.Context, tx kv.TemporalTx, sd *execctx.SharedDomains, br services.FullBlockReader, blockNum uint64, idx *ChangedKeysPerBlockIdx, lvl log.Lvl, logger log.Logger) error {
-	sd.ClearRam(true)
 	logger.Log(lvl, "checking commitment hist at block", "blockNum", blockNum)
 	header, err := br.HeaderByNumber(ctx, tx, blockNum)
 	if err != nil {
@@ -1131,19 +1130,22 @@ func CheckCommitmentHistAtBlkRange(ctx context.Context, sc SamplerCfg, db kv.Tem
 				return err
 			}
 			defer tx.Rollback()
-			sd, err := execctx.NewSharedDomains(wCtx, tx, logger, execctx.WithoutDeferredBranchUpdates())
-			if err != nil {
-				return err
-			}
-			defer sd.Close()
 			idx, err := NewChangedKeysPerBlockIdx(wCtx, tx, br, windowStart, windowEnd, logger)
 			if err != nil {
 				return fmt.Errorf("CheckCommitmentHistAtBlkRange: build index window=[%d,%d): %w", windowStart, windowEnd, err)
 			}
 			// Each goroutine needs its own Sampler — the RNG is not goroutine-safe.
 			sampler := sc.NewWindowSampler(windowStart)
 			for blockNum := range sampler.BlockNums(windowStart, windowEnd) {
-				if err := checkCommitmentHistAtBlkWithIdx(wCtx, tx, sd, br, blockNum, idx, log.LvlTrace, logger); err != nil {
+				// Fresh SharedDomains per block: an SD is committed-or-closed,
+				// never reset in place.
+				sd, err := execctx.NewSharedDomains(wCtx, tx, logger, execctx.WithoutDeferredBranchUpdates())
+				if err != nil {
+					return err
+				}
+				err = checkCommitmentHistAtBlkWithIdx(wCtx, tx, sd, br, blockNum, idx, log.LvlTrace, logger)
+				sd.Close()
+				if err != nil {
 					return fmt.Errorf("checkCommitmentHistAtBlk: %d, %w", blockNum, err)
 				}
 				checked.Add(1)

db/state/aggregator.go

Lines changed: 40 additions & 6 deletions
@@ -342,6 +342,13 @@ func (a *Aggregator) ConfigureDomains() error {
 	}
 	a.configured = true
 
+	// Attach the aggregator-lifetime BranchCache to the commitment domain; gated by USE_STATE_CACHE, nil = disabled.
+	if dbg.UseStateCache {
+		if cd := a.d[kv.CommitmentDomain]; cd != nil && cd.branchCache == nil {
+			cd.branchCache = commitment.NewBranchCache(commitment.DefaultBranchCacheTailCapacity)
+		}
+	}
+
 	if a.disableFsync {
 		for _, d := range a.d {
 			if d != nil {
@@ -2353,6 +2360,14 @@ func (a *Aggregator) BeginFilesRo() *AggregatorRoTx {
 	return ac
 }
 
+// BranchCache attached to the commitment domain (implements commitment.BranchCacheProvider).
+func (at *AggregatorRoTx) BranchCache() *commitment.BranchCache {
+	if at.d[kv.CommitmentDomain] == nil {
+		return nil
+	}
+	return at.d[kv.CommitmentDomain].d.branchCache
+}
+
 func (at *AggregatorRoTx) Dirs() datadir.Dirs { return at.a.dirs }
 func (at *AggregatorRoTx) standaloneIIs() []*InvertedIndexRoTx { return at.iis[:at.iisCount] }
 
@@ -2401,31 +2416,50 @@ func (at *AggregatorRoTx) MeteredGetLatest(domain kv.Domain, k []byte, tx kv.Tx,
 	return at.getLatest(domain, k, tx, maxStep, metrics, start)
 }
 
+// MeteredGetLatestWithTxN returns the high-water txN alongside (value,
+// step) for tagging BranchCache entries so UnwindTo can evict by
+// watermark. Non-CommitmentDomain reads return txN=0.
+func (at *AggregatorRoTx) MeteredGetLatestWithTxN(domain kv.Domain, k []byte, tx kv.Tx, maxStep kv.Step, metrics *changeset.DomainMetrics, start time.Time) (v []byte, step kv.Step, txN uint64, ok bool, err error) {
+	return at.getLatestWithTxN(domain, k, tx, maxStep, metrics, start)
+}
+
 func (at *AggregatorRoTx) getLatest(domain kv.Domain, k []byte, tx kv.Tx, maxStep kv.Step, metrics *changeset.DomainMetrics, start time.Time) (v []byte, step kv.Step, ok bool, err error) {
+	v, step, _, ok, err = at.getLatestWithTxN(domain, k, tx, maxStep, metrics, start)
+	return v, step, ok, err
+}
+
+func (at *AggregatorRoTx) getLatestWithTxN(domain kv.Domain, k []byte, tx kv.Tx, maxStep kv.Step, metrics *changeset.DomainMetrics, start time.Time) (v []byte, step kv.Step, txN uint64, ok bool, err error) {
 	if domain != kv.CommitmentDomain {
-		return at.d[domain].getLatest(k, tx, maxStep, metrics, start)
+		v, step, ok, err = at.d[domain].getLatest(k, tx, maxStep, metrics, start)
+		return v, step, 0, ok, err
 	}
 
 	v, step, ok, err = at.d[domain].getLatestFromDb(k, tx)
 	if err != nil {
-		return nil, kv.Step(0), false, err
+		return nil, kv.Step(0), 0, false, err
 	}
 	if ok && step <= maxStep {
 		if metrics != nil && dbg.KVReadLevelledMetrics {
 			metrics.UpdateDbReads(domain, start)
 		}
-		return v, step, true, nil
+		// DB-sourced: tag with the step's high-water; the exact write
+		// txN isn't recoverable from the step-keyed record.
+		return v, step, lastTxNumOfStep(step, at.StepSize()), true, nil
 	}
 
 	v, found, fileStartTxNum, fileEndTxNum, err := at.d[domain].getLatestFromFiles(k, 0)
 	if !found {
-		return nil, kv.Step(0), false, err
+		return nil, kv.Step(0), 0, false, err
 	}
 	if metrics != nil && dbg.KVReadLevelledMetrics {
-		metrics.UpdateFileReads(domain, start)
+		// UpdateFileReadsUnique tracks total + distinct prefixes; the
+		// ratio is the read amplification factor (hot prefixes re-read).
+		metrics.UpdateFileReadsUnique(domain, k, start)
 	}
 	v, err = at.replaceShortenedKeysInBranch(k, commitment.BranchData(v), fileStartTxNum, fileEndTxNum)
-	return v, kv.Step(fileEndTxNum / at.StepSize()), found, err
+	// File-sourced: tag with fileEndTxNum; snapshots are immutable and
+	// unwind can't cross them, so this is always <= any legal watermark.
+	return v, kv.Step(fileEndTxNum / at.StepSize()), fileEndTxNum, found, err
 }
 
 func (at *AggregatorRoTx) DebugGetLatestFromDB(domain kv.Domain, key []byte, tx kv.Tx) ([]byte, kv.Step, bool, error) {
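
The body of `lastTxNumOfStep` is not part of the lines shown here. The sketch below is a hypothetical reconstruction that assumes step `s` covers txNums `[s*stepSize, (s+1)*stepSize)`; under that assumption it shows why tagging a DB-sourced entry with the step's high-water mark is conservative: an unwind may evict more than strictly necessary, but never less. `survivesUnwind` is likewise an illustrative helper, not an erigon function.

```go
package main

import "fmt"

// lastTxNumOfStep is a hypothetical reconstruction. It assumes step s
// covers txNums [s*stepSize, (s+1)*stepSize) and returns the last one.
func lastTxNumOfStep(step, stepSize uint64) uint64 {
	return (step+1)*stepSize - 1
}

// survivesUnwind reports whether an entry tagged with txN stays cached
// after unwinding to watermark (entries above the watermark are evicted).
func survivesUnwind(txN, watermark uint64) bool {
	return txN <= watermark
}

func main() {
	const stepSize = 1000

	// A record actually written at txNum 2100 lives in step 2, so a
	// DB-sourced read tags it with the step's high-water mark.
	tag := lastTxNumOfStep(2100/stepSize, stepSize)
	fmt.Println(tag) // 2999

	// Unwinding to 2500 evicts the entry even though the real write (2100)
	// is below the watermark: over-eviction, which is safe.
	fmt.Println(survivesUnwind(tag, 2500)) // false

	// Only a watermark at or past the end of the step keeps it cached.
	fmt.Println(survivesUnwind(tag, 2999)) // true
}
```

Because the tag is never smaller than the true write `txNum` within the step, an entry that should be evicted is always evicted; the cost is an occasional extra read-through for entries that were in fact still valid.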
