db/snaptype: add cache-busting hash particle to snapshot filenames by anacrolix · Pull Request #19150 · erigontech/erigon

anacrolix · 2026-02-13T04:23:32Z

Summary

Adds an optional cache-busting hash particle to snapshot filenames, placed as a dot-separated component before the file extension (e.g. v1.0-000000-001000-headers.abc123def0.seg)
Computes a truncated SHA256 hash of .seg file content after compression and renames the file to include it, ensuring snapshot filenames change when content changes and preventing stale BitTorrent downloads
Fully backward compatible: filenames without the particle parse identically to before
Includes FileInfo.Hash field, WithHash()/As() hash preservation, construction helpers, and hash-tolerant glob masks
Applies content hash after Compress() in all snapshot generation paths: dumpRange, ExtractRange, merge, caplin beacon/blob/state dumps
Updates DirtySegment to carry the hash through FileName() and FileInfo()

anacrolix · 2026-03-04T05:13:42Z

#19150

anacrolix · 2026-03-06T01:19:32Z

Manually dispatched two additional workflow runs to validate the snapshot filename changes against real-world data:

Manifest Check — builds the downloader binary and runs manifest-verify against all 6 production webseed servers (mainnet, bor-mainnet, gnosis, chiado, sepolia, amoy), directly parsing real snapshot filenames from the listings. This verifies the filename parsing changes are compatible with existing production filenames.
QA Snapshot Download — runs Erigon against mainnet and exercises the full snapshot pipeline: filename discovery, torrent download, file opening, and indexing with real .seg files.

These don't trigger automatically on this PR since it targets a non-release branch and go.mod wasn't changed.

anacrolix · 2026-03-06T01:35:18Z

The 3 failing checks (gnosis-rpc-integ-tests, mainnet-rpc-integ-tests, mainnet-rpc-integ-tests-remote) are unrelated to this PR. They are RPC response comparison tests running against a live reference node and are failing intermittently across multiple unrelated PRs right now (e.g. alex/etl_mmap_34, alex/histoy_table_format_change_34). Re-running until green.

…enames

Support an optional hash particle in snapshot filenames for cache busting. The particle sits as a dot-separated component before the file extension:

V2: v1.0-000000-001000-headers.abc123def0.seg
V3: v12.13-accounts.100-164.abc123def0.efi

Filenames without the particle parse identically to before. The existing extension-stripping loop in ParseFileName already handled extra dot components; this change captures the first one as FileInfo.Hash.

Adds WithHash/As hash preservation, construction helpers, and hash-tolerant glob masks.

…fter compression

Compute a truncated SHA256 hash of .seg file content after compression and rename the file to include it as a cache-busting particle. This ensures snapshot filenames change when content changes, preventing stale BitTorrent downloads.

Changes:
- Add ApplyContentHash/computeFileHash helpers in db/snaptype/files.go
- Add hash field to DirtySegment, update FileName() and FileInfo()
- Call ApplyContentHash after Compress() in all snapshot generation paths: dumpRange, ExtractRange, merge, caplin beacon/blob/state dumps
- Update merge() to return updated FileInfo for correct error cleanup
- Fix FileInfo.As() to strip hash (content-specific per type)
- Fix ReplaceVersionWithMask to match optional hash in glob patterns

Copilot

Pull request overview

This PR adds an optional cache-busting content-hash particle to snapshot filenames and propagates it through snapshot generation and snapshot-sync code paths to avoid stale BitTorrent artifacts when snapshot content changes.

Changes:

Added FileInfo.Hash, WithHash(), and helpers/masks to construct and match hashed snapshot filenames.
Applied content hashing after Compress() across multiple snapshot generation paths (ExtractRange, dumpRange, merge, Caplin dumps) and carried the hash through DirtySegment.
Updated tests to validate parsing/round-tripping of hashed filenames and to tolerate hashed outputs in merge tests.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 4 comments.


File	Description
db/version/file_version.go	Expands version-masked patterns to also match optional hash particles.
db/snaptype/type.go	Applies content hash after compression in `ExtractRange` before index building.
db/snaptype/files.go	Adds hash-aware filename helpers, parsing into `FileInfo.Hash`, and file hashing/rename logic.
db/snaptype/files_test.go	Adds unit tests for hash parsing, naming helpers, masks, and `ApplyContentHash`.
db/snapshotsync/snapshots.go	Extends `DirtySegment` to carry and emit hashed snapshot filenames.
db/snapshotsync/snapshots_test.go	Updates merge test to locate merged output via a hash-tolerant glob mask.
db/snapshotsync/merger.go	Applies content hash after merge compression and propagates into `DirtySegment`.
db/snapshotsync/freezeblocks/caplin_snapshots.go	Applies content hash after compression for Caplin block/blob dumps.
db/snapshotsync/freezeblocks/block_snapshots.go	Applies content hash after compression for execution-layer snapshot dumps.
db/snapshotsync/caplin_state_snapshots.go	Applies content hash after compression for Caplin state dumps.



Copilot

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.


Copilot · 2026-04-29T01:57:41Z

 	for ext := filepath.Ext(croppedFileName); ext != "" && !strings.Contains(ext, "-"); ext = filepath.Ext(croppedFileName) {
 		croppedFileName = strings.TrimSuffix(croppedFileName, ext)
+		if res.Hash == "" {
+			res.Hash = ext[1:] // strip leading dot
+		}
 	}


The hash-extraction loop sets res.Hash to the first stripped dotted suffix unconditionally. This will mis-classify non-hash suffixes (e.g. .torrent in ...ef.torrent.tmp..., or .seg/.idx when parsing .torrent* wrappers) as the content hash. A more robust approach is to only accept a hash particle when the suffix looks like a hex digest (and/or when it is not a known snapshot extension like .seg/.idx/.efi).
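
One way to read the suggestion above is sketched below: keep the same suffix-stripping loop, but only record a suffix as the hash when it looks like a truncated hex digest. The names splitHash and isHexParticle are hypothetical, and the fixed 10-character length is an assumption based on the example particle abc123def0; this is not the code that was eventually committed.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// hashParticleLen is an assumed particle length (hex characters).
const hashParticleLen = 10

// isHexParticle reports whether s is exactly hashParticleLen lowercase
// hex characters. Suffixes such as "seg", "idx", "torrent" or "tmp" fail
// this check and are therefore never mistaken for a hash.
func isHexParticle(s string) bool {
	if len(s) != hashParticleLen {
		return false
	}
	for _, c := range s {
		isDigit := c >= '0' && c <= '9'
		isLowerHex := c >= 'a' && c <= 'f'
		if !isDigit && !isLowerHex {
			return false
		}
	}
	return true
}

// splitHash strips trailing dot-separated suffixes that contain no "-"
// (the same stop condition as the loop under review) and returns the
// remaining stem plus the first suffix that passes isHexParticle.
func splitHash(name string) (stem, hash string) {
	stem = name
	for ext := filepath.Ext(stem); ext != "" && !strings.Contains(ext, "-"); ext = filepath.Ext(stem) {
		stem = strings.TrimSuffix(stem, ext)
		if hash == "" && isHexParticle(ext[1:]) {
			hash = ext[1:] // strip leading dot
		}
	}
	return stem, hash
}

func main() {
	for _, name := range []string{
		"v1.0-000000-001000-headers.abc123def0.seg",
		"v1.0-000000-001000-headers.seg.torrent",
		"v12.13-accounts.100-164.abc123def0.efi",
	} {
		stem, hash := splitHash(name)
		fmt.Printf("%s -> stem=%q hash=%q\n", name, stem, hash)
	}
}
```

A shape check like this keeps hash-less filenames parsing as before, at the cost of tying the parser to whatever particle length and alphabet the generator uses.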

Copilot · 2026-04-29T01:57:41Z

+	f, err = ApplyContentHash(f)
+	if err != nil {
+		return lastKeyValue, err
+	}
+


Applying the content hash to the segment filename here means the .seg name changes when content changes, but the index-building code still writes .idx files using IdxFileName(...) (which does not incorporate FileInfo.Hash). If .idx files/torrents are distributed (they are treated as seedable extensions elsewhere), this can still allow stale .idx/.idx.torrent downloads and even mismatched index+segment pairs. Consider propagating the same hash particle into index filenames as well (or otherwise tying index identity to the segment hash).
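
As a rough illustration of the suggestion above, the sketch below builds both the segment and index names from one value that carries the hash, so the two cannot drift apart. The snapshotName type and its withExt method are hypothetical and are not Erigon's FileInfo or IdxFileName; the layout follows the V2 example filename quoted in this PR.

```go
package main

import "fmt"

// snapshotName is a hypothetical stand-in for the fields that make up a
// V2 snapshot filename. From and To are the numbers as they appear in
// the name.
type snapshotName struct {
	Version  string
	From, To uint64
	Type     string
	Hash     string // optional content-hash particle; empty means none
}

// withExt renders the filename for the given extension, inserting the
// hash particle (when present) just before the extension.
func (n snapshotName) withExt(ext string) string {
	base := fmt.Sprintf("%s-%06d-%06d-%s", n.Version, n.From, n.To, n.Type)
	if n.Hash != "" {
		base += "." + n.Hash
	}
	return base + ext
}

func main() {
	n := snapshotName{Version: "v1.0", From: 0, To: 1000, Type: "headers", Hash: "abc123def0"}
	fmt.Println(n.withExt(".seg"))
	fmt.Println(n.withExt(".idx"))

	// Without a hash the names fall back to the existing layout.
	n.Hash = ""
	fmt.Println(n.withExt(".seg"))
}
```

Whether an index should reuse the segment's hash or carry a hash of its own content is a design question this sketch does not settle.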

anacrolix self-assigned this Mar 4, 2026

anacrolix mentioned this pull request Mar 6, 2026

add cache buster hash to snapshot names #18510

Open

anacrolix force-pushed the anacrolix/cache-busting-snapshots branch from 9a059b8 to 5468163 Compare March 6, 2026 00:44

anacrolix marked this pull request as ready for review March 6, 2026 01:49

anacrolix requested review from AskAlexSharov and sudeepdino008 as code owners March 6, 2026 01:49

Giulio2002 added the ErigonDB label Mar 10, 2026

anacrolix added 2 commits March 27, 2026 15:37

anacrolix force-pushed the anacrolix/cache-busting-snapshots branch from 89a48cc to 7c21bf8 Compare March 27, 2026 04:47

anacrolix added 2 commits March 28, 2026 10:08

Merge branch 'main' into anacrolix/cache-busting-snapshots

466a1e7

Merge branch 'main' into anacrolix/cache-busting-snapshots

b357012

AskAlexSharov requested a review from Copilot April 8, 2026 04:42

Copilot started reviewing on behalf of AskAlexSharov April 8, 2026 04:43 View session

Copilot AI reviewed Apr 8, 2026


Comment thread db/snaptype/files.go

Comment thread db/snapshotsync/merger.go

Comment thread db/version/file_version.go Outdated

Comment thread db/snaptype/files.go

Merge branch 'main' into anacrolix/cache-busting-snapshots

3b61195

Copilot started work on behalf of anacrolix April 17, 2026 02:29 View session

anacrolix and others added 3 commits April 17, 2026 12:30

Update db/version/file_version.go

edf349b

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

db/snaptype: avoid parsing wrapper ext as hash in ParseFileName

9fb92c8

Agent-Logs-Url: https://github.com/erigontech/erigon/sessions/a0810048-9378-4978-b03b-4f118e34968b Co-authored-by: anacrolix <988750+anacrolix@users.noreply.github.com>

db/snaptype: parse .torrent wrappers without treating inner ext as hash

e71bd05

Agent-Logs-Url: https://github.com/erigontech/erigon/sessions/a0810048-9378-4978-b03b-4f118e34968b Co-authored-by: anacrolix <988750+anacrolix@users.noreply.github.com>

Copilot finished work on behalf of anacrolix April 17, 2026 03:23

Merge branch 'main' into anacrolix/cache-busting-snapshots

bd32045

AskAlexSharov requested a review from Copilot April 29, 2026 01:49

Copilot started reviewing on behalf of AskAlexSharov April 29, 2026 01:50 View session

Copilot AI reviewed Apr 29, 2026


anacrolix wants to merge 9 commits into main from anacrolix/cache-busting-snapshots


5 participants
