Commit d593fd3
db/state: restore empty tombstones on unwind and honor them in getLatestFromDb (#20627)
## Summary
Re-lands the narrow core of #20483 (reverted by #20509), addressing the
DB-layer siblings of the post-unwind stale-read bug. Complements #20625,
which addressed the in-memory overlay side.
Two linked bugs in the DB-level domain unwind and read paths caused
stale data resurrection after an unwind that reverted a first-time write
or a deletion inside the unwound range.
### Symptom observed on mainnet
Post-Fusaka mainnet catch-up sync with a chaindata/ wipe (snapshots/
retained). On the first re-executed block after a forkchoice-driven
unwind, execution reported less gas used than the block header. Observed
diffs: `-34200` (block 24,898,955), `-73829` (block 24,899,403), `-118872`
(block 24,899,594). The diffs break down into multiples of `SSTORE_SET -
SSTORE_RESET = 17100` plus cold-access flips of `2600`.
The earlier PR #20625 fixed the first two by pruning the in-memory
overlay on Unwind. Block 24,899,594 still failed because the overlay had
already been flushed to the DB by the time of the Unwind, so the remaining
stale-read path was purely DB-layer. That path is addressed here.
## Fixes
### 1. `unwind()` must restore empty tombstones —
`db/state/domain.go:1317` (both DupSort and LargeValues paths)
`DomainEntryDiff.Value` has three shapes, documented in
`db/kv/helpers.go:247`:
- `nil` — "different step": prev value lives at another step, skip
restore (legacy V0 changeset shape)
- `[]byte{}` — "no previous value": key was absent before this step;
write an empty tombstone so the key appears absent again after the
unwind completes
- non-empty — restore the actual previous value
The old guard `if len(value) > 0` skipped *both* `nil` and `[]byte{}`,
leaving no tombstone after unwinding a first-time write. Corrected to
`if value != nil`.
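The difference between the two guards comes down to Go's distinction between a nil slice and an empty, non-nil slice. The following is a minimal standalone sketch, not the code in `db/state/domain.go`; the helper names `classify`, `oldGuard`, and `newGuard` and the action labels are hypothetical and exist only for illustration.

```go
package main

import "fmt"

// classify names the action unwind should take for one previous-value shape.
// The string labels are illustrative, not identifiers from the codebase.
func classify(value []byte) string {
	switch {
	case value == nil:
		return "skip" // prev value lives at another step
	case len(value) == 0:
		return "tombstone" // key was absent before this step
	default:
		return "restore" // write the previous value back
	}
}

// oldGuard is the pre-fix condition: it cannot tell nil from []byte{}.
func oldGuard(value []byte) bool { return len(value) > 0 }

// newGuard is the corrected condition: only nil is skipped.
func newGuard(value []byte) bool { return value != nil }

func main() {
	shapes := []struct {
		name  string
		value []byte
	}{
		{"nil", nil},
		{"empty", []byte{}},
		{"non-empty", []byte{0x01}},
	}
	for _, s := range shapes {
		fmt.Printf("%-9s action=%-9s oldGuard=%-5v newGuard=%v\n",
			s.name, classify(s.value), oldGuard(s.value), newGuard(s.value))
	}
}
```

Only the `empty` row differs between the two guards, which is exactly the first-time-write case that previously left no tombstone behind.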
### 2. `getLatestFromDb` must treat empty values as authoritative —
`db/state/domain.go:1665`
Empty-value entries are deletion tombstones. The step-age guard
previously discarded them when their step fell within the frozen file
range, causing the caller to fall through to `getLatestFromFiles`.
Frozen files encode deletions as absence, so the file returns the
pre-deletion value — the exact resurrection the deletion was meant to
prevent. Empty entries are now returned as `found=true` regardless of
step age; the step-age guard still applies to non-empty entries.
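The read-side rule can be illustrated with a toy model of the lookup order (DB first, then frozen files). This is a sketch under simplifying assumptions, not the real `getLatestFromDb`: the `store` type, its fields, the `frozenSteps` boundary, and the `honorTombstones` switch are all hypothetical.

```go
package main

import "fmt"

// dbEntry is a hypothetical DB record: a value plus the step it was written at.
type dbEntry struct {
	value []byte
	step  uint64
}

// store is a toy model: recent writes live in db, older state in frozen files.
// Steps below frozenSteps are assumed to be covered by the files.
type store struct {
	db          map[string]dbEntry
	files       map[string][]byte
	frozenSteps uint64
}

// getLatest reads the DB first and falls back to files.
// honorTombstones=false models the pre-fix behaviour.
func (s *store) getLatest(key string, honorTombstones bool) ([]byte, bool) {
	if e, ok := s.db[key]; ok {
		if honorTombstones && len(e.value) == 0 {
			return nil, true // tombstone is authoritative regardless of step age
		}
		if e.step >= s.frozenSteps {
			return e.value, true // entry is newer than the files
		}
		// Step-age guard: the entry falls within the frozen range, defer to files.
	}
	v, ok := s.files[key]
	return v, ok
}

func main() {
	s := &store{
		db:          map[string]dbEntry{"slot": {value: []byte{}, step: 1}},
		files:       map[string][]byte{"slot": []byte("pre-deletion")},
		frozenSteps: 2,
	}
	before, _ := s.getLatest("slot", false)
	after, found := s.getLatest("slot", true)
	fmt.Printf("pre-fix:  %q\n", before)
	fmt.Printf("post-fix: %q found=%v\n", after, found)
}
```

In this model the pre-fix read discards the tombstone and returns the file's pre-deletion value, while the post-fix read reports the key as found with an empty value.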
## Relationship to #20483 / #20509
This is a deliberate re-land of the narrow core of #20483. Key
differences:
- **Excluded**: the LargeValues cross-check in `getLatest` (PR #20483
lines 1697–1737). That handled an interrupted-`PruneSmallBatches` edge
case specific to `CodeDomain` / `RCacheDomain` and was the most likely
source of the regressions that motivated #20509's revert. If it proves
necessary, it can be added later as a separate PR with its own dedicated
test.
- **Included**: both matched fixes (write-side tombstone + read-side
authoritativeness). They are a pair — neither is useful alone; staging
them separately risks merging half and shipping a version that's still
broken.
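Why the two fixes only work as a pair can be checked on a toy model that toggles each fix independently. This is an illustrative sketch, not the production code path: the `toy` type, its `unwind` and `get` methods, and the `resurrected` helper are hypothetical, and the scenario assumes the frozen file already holds a stale value for the key.

```go
package main

import "fmt"

type entry struct {
	value []byte
	step  uint64
}

// toy models a domain: recent entries in db, older state in frozen files.
type toy struct {
	db          map[string]entry
	files       map[string][]byte
	frozenSteps uint64
}

// unwind reverts one write. fixedWrite selects `prev != nil` over `len(prev) > 0`.
func (t *toy) unwind(key string, prev []byte, step uint64, fixedWrite bool) {
	delete(t.db, key) // drop the unwound write
	if (fixedWrite && prev != nil) || len(prev) > 0 {
		t.db[key] = entry{value: prev, step: step}
	}
}

// get reads the DB first, then files. fixedRead makes empty entries authoritative.
func (t *toy) get(key string, fixedRead bool) ([]byte, bool) {
	if e, ok := t.db[key]; ok {
		if fixedRead && len(e.value) == 0 {
			return nil, true
		}
		if e.step >= t.frozenSteps {
			return e.value, true
		}
	}
	v, ok := t.files[key]
	return v, ok
}

// resurrected reports whether the stale file value is visible after the unwind.
func resurrected(fixedWrite, fixedRead bool) bool {
	t := &toy{
		db:          map[string]entry{"k": {value: []byte("new"), step: 1}},
		files:       map[string][]byte{"k": []byte("stale")},
		frozenSteps: 2,
	}
	t.unwind("k", []byte{}, 1, fixedWrite) // prev is "no previous value"
	v, _ := t.get("k", fixedRead)
	return string(v) == "stale"
}

func main() {
	for _, w := range []bool{false, true} {
		for _, r := range []bool{false, true} {
			fmt.Printf("write-fix=%-5v read-fix=%-5v resurrected=%v\n", w, r, resurrected(w, r))
		}
	}
}
```

In this model the stale value disappears only when both fixes are enabled: without the write-side fix there is no tombstone to honor, and without the read-side fix the tombstone is discarded by the step-age guard.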
## Tests
- `TestDomain_UnwindRestoresDeletionMarker` (DupSort + LargeValues
subtests) — writes a key, deletes it, re-writes within the same step,
builds files, then unwinds the re-write. **Fails on pre-fix code**
(after the unwind, getLatest returns the stale `value2` from the frozen
file); passes with the fix. Exercises the write-side bug directly.
- `TestDomain_DeletedKeyNotResurrectedByFiles` (DupSort + LargeValues
subtests) — documents the read-side contract by writing a key and
deleting it at a step that falls within file range. Passes on current
`main` even without the fix (the file-build + prune semantics have evolved
since #20483 and no longer hit the specific stale read in this exact
test scenario), but is retained as a forward regression guard and as
documentation of the invariant.
## Test plan
- [x] `go test -short ./db/state/...` — all pass
- [x] `make lint` — 0 issues
- [x] `make erigon` — builds clean
- [x] Manual repro of the production symptom (mainnet sync from
snapshots-only) in combination with #20625 — sync progresses past the
catch-up / first-forkchoice-unwind window without a gas mismatch.
(Re-verification run in progress alongside this PR.)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

1 parent fe4698c, commit d593fd3
2 files changed, +199 -5 lines changed