Commit 047fad3

wmitsuda and claude authored
Detect duplicate version ranges and version downgrades in snapshot PRs (#1084)
Follow-up to #1083. Adds two new critical checks to the summarize-changes skill:

- Version conflict detection: scans the full toml file to find multiple versions of the same file type coexisting for the same range (e.g., v1.1 and v1.2 of accounts.0-32.vi). These cause download conflicts.
- Version downgrade detection: identifies files replaced with a LOWER version than before (e.g., v1.2 -> v1.1), indicating a regression.

Both are flagged as critical and recommend not merging until investigated.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
1 parent dce4040 commit 047fad3
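
As a rough illustration of the version conflict check described in the commit message, the sketch below groups snapshot filenames by type, extension, and range, then reports any group holding more than one version. The filename layout (`accessor/v1.2-accounts.0-32.vi`), the `NAME_RE` pattern, and the `find_version_conflicts` name are assumptions made for this example; the script's own `parse_filename` is not shown in this commit and may behave differently.

```python
import re
from collections import defaultdict

# Assumed layout: "<subdir>/<version>-<type>.<start>-<end>.<ext>".
# This pattern is a guess for illustration, not the script's parse_filename.
NAME_RE = re.compile(
    r"^(?:(?P<subdir>[^/]+)/)?(?P<ver>v\d+(?:\.\d+)*)-(?P<dt>[A-Za-z_]+)"
    r"\.(?P<s>\d+)-(?P<e>\d+)\.(?P<ext>\w+)$"
)


def find_version_conflicts(filenames):
    """Map (subdir, type, ext, start, end) to its versions when 2+ coexist."""
    groups = defaultdict(set)
    for name in filenames:
        m = NAME_RE.match(name)
        if m is None:
            continue  # skip names that do not fit the assumed layout
        key = (m["subdir"], m["dt"], m["ext"], int(m["s"]), int(m["e"]))
        groups[key].add(m["ver"])
    # Versions are sorted as plain strings here, which is enough for display.
    return {k: sorted(v) for k, v in groups.items() if len(v) >= 2}


names = [
    "accessor/v1.1-accounts.0-32.vi",
    "accessor/v1.2-accounts.0-32.vi",
    "accessor/v1.2-accounts.32-64.vi",
]
print(find_version_conflicts(names))
```

Only the 0-32 range is reported, because it is the only one listed under two versions.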

2 files changed: 135 additions, 17 deletions.

.claude/skills/summarize-changes/SKILL.md

Lines changed: 64 additions & 6 deletions
@@ -15,9 +15,20 @@ If $PR_URL_OR_NUMBER is a full URL, extract the PR number from it. If it is just
 Run these commands using the Bash tool:
 
 1. `gh pr view <number> --repo erigontech/erigon-snapshot` to get PR title, description, and metadata
-2. Save the diff to a temp file and run the analysis script:
+2. Save the diff to a temp file:
 ```
-gh pr diff <number> --repo erigontech/erigon-snapshot > /tmp/pr_diff.txt && python3 "$(git rev-parse --show-toplevel)/.claude/skills/summarize-changes/analyze_diff.py" /tmp/pr_diff.txt
+gh pr diff <number> --repo erigontech/erigon-snapshot > /tmp/pr_diff.txt
+```
+3. Fetch the full toml file from the PR's head branch:
+```
+HEAD_REF=$(gh pr view <number> --repo erigontech/erigon-snapshot --json headRefName -q .headRefName)
+TOML_FILE=$(gh pr diff <number> --repo erigontech/erigon-snapshot --name-only | head -1)
+gh api "repos/erigontech/erigon-snapshot/contents/$TOML_FILE" \
+-H "Accept: application/vnd.github.raw" -F ref="$HEAD_REF" > /tmp/pr_toml.txt
+```
+4. Run the analysis script with both files:
+```
+python3 "$(git rev-parse --show-toplevel)/.claude/skills/summarize-changes/analyze_diff.py" /tmp/pr_diff.txt /tmp/pr_toml.txt
 ```
 
 The script (`analyze_diff.py` in the skill directory) parses the diff, classifies all changes, and outputs structured sections. Use its output to build the final report.
@@ -27,8 +38,10 @@ The script (`analyze_diff.py` in the skill directory) parses the diff, classifie
 - **Hash Changes**: same filename in both removed and added sets with different hash (CRITICAL)
 - **Range Merges**: multiple smaller removed ranges replaced by a single larger added range
 - **Version Upgrades**: removed entries at one version replaced by added entries at a newer version
+- **Version Downgrades**: removed entries at one version replaced by added entries at a LOWER version (CRITICAL — indicates regression)
 - **New Data Pruned from MDBX**: added entries beyond the previously highest block number
 - **Unexpected Deletions**: removed entries not covered by any of the above (CRITICAL)
+- **Version Conflicts**: multiple versions of the same file type and range coexisting in the final toml (CRITICAL)
 
 ## Step 4: Generate Report

@@ -93,6 +106,28 @@ If NO unexpected deletions, use ✅ emoji:
 
 ---
 
+### VERSION CONFLICTS
+
+If version conflicts exist, use 🚨 emoji:
+
+### 🚨🚨🚨 VERSION CONFLICTS — DO NOT MERGE
+
+> **Multiple versions of the same file for the same range must not coexist. This will cause download conflicts for nodes.**
+
+Group by State Snapshots / CL Snapshots / EL Block Snapshots. Show a table:
+
+| Type | Ext | Range | Versions |
+|------|-----|-------|----------|
+| accounts | .vi | 0-32 | v1.1, v1.2 |
+
+If NO version conflicts, use ✅ emoji:
+
+### ✅ Version Conflicts
+
+**No version conflicts detected.** Each file type has exactly one version per range.
+
+---
+
 ### Merged Ranges
 
 Present merges in a table format, with one table per high-level group (State Snapshots / CL Snapshots / EL Block Snapshots). Sort rows by subdir (accessor, domain, history, idx, caplin, or root) then by snapshot type (datatype).
@@ -139,6 +174,28 @@ Use continuation rows (empty Subdir cell) for subsequent files in the same subdi
 
 ---
 
+### VERSION DOWNGRADES
+
+If version downgrades exist, use 🚨 emoji:
+
+### 🚨🚨🚨 VERSION DOWNGRADES — DO NOT MERGE
+
+> **Files were replaced with a LOWER version than they had before. This is a regression that needs investigation.**
+
+Group by State Snapshots / CL Snapshots / EL Block Snapshots. Show a table:
+
+| Subdir | Type | Ext | Old Version | New Version | Ranges |
+|--------|------|-----|-------------|-------------|--------|
+| accessor | accounts | .vi | v1.2 | v1.1 | 0-256, 256-512, 512-768 |
+
+If NO version downgrades, use ✅ emoji:
+
+### ✅ Version Downgrades
+
+**No version downgrades detected.** All version changes go to equal or higher versions.
+
+---
+
 ### Version Upgrades
 
 Group by State Snapshots / CL Snapshots / EL Block Snapshots. List version transitions (e.g., v1.1 -> v2.0) by category and datatype.
@@ -153,20 +210,21 @@ Any changes to Other Files (salt files, etc.) or anything not fitting the above
 
 ### Reviewer Recommendation
 
-Based on the two critical signals (hash changes and unexpected deletions), add a final recommendation section:
+Based on the four critical signals (hash changes, unexpected deletions, version conflicts, and version downgrades), add a final recommendation section:
 
-If NO hash changes AND NO unexpected deletions:
+If NO hash changes AND NO unexpected deletions AND NO version conflicts AND NO version downgrades:
 
 ### ✅ Recommendation: Safe to Approve
 
 This PR contains only routine changes: range merges, version upgrades, and new data pruned from MDBX. No anomalies detected.
 
-If hash changes OR unexpected deletions exist:
+If hash changes OR unexpected deletions OR version conflicts OR version downgrades exist:
 
 ### 🚨 Recommendation: Investigation Required
 
 This PR contains changes that need manual review before approval:
-- (list each concern: N hash change(s), N unexpected deletion(s), with brief context from the sections above)
+- (list each concern: N hash change(s), N unexpected deletion(s), N version conflict(s), N version downgrade(s), with brief context from the sections above)
+- Version conflicts and version downgrades specifically mean "do not merge" — they will cause download conflicts or regressions for nodes
 
 ## Step 5: Offer to Post as PR Comment

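
The recommendation rule in the updated SKILL.md can be read as a small decision function over the four critical counts. The sketch below is only an illustration of that rule: the `recommend` name, its signature, and the wording of the "do not merge" note are hypothetical and are not part of the skill or the script.

```python
def recommend(hash_changes, unexpected_deletions, version_conflicts, version_downgrades):
    """Turn the four critical counts into a verdict plus a list of concerns."""
    signals = [
        ("hash change(s)", hash_changes),
        ("unexpected deletion(s)", unexpected_deletions),
        ("version conflict(s)", version_conflicts),
        ("version downgrade(s)", version_downgrades),
    ]
    # Keep only the signals that actually fired.
    concerns = [f"{count} {label}" for label, count in signals if count > 0]
    if not concerns:
        return "Safe to Approve", []
    # Conflicts and downgrades carry the stronger "do not merge" message.
    if version_conflicts > 0 or version_downgrades > 0:
        concerns.append("do not merge: version conflicts or downgrades present")
    return "Investigation Required", concerns


print(recommend(0, 0, 0, 0))
print(recommend(1, 0, 2, 0))
```

Any single non-zero count is enough to move the verdict to "Investigation Required", matching the OR condition in the skill text.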
.claude/skills/summarize-changes/analyze_diff.py

Lines changed: 71 additions & 11 deletions
@@ -1,11 +1,13 @@
 #!/usr/bin/env python3
 """Analyze a snapshot PR diff file and classify all changes.
 
-Usage: python3 analyze_diff.py <diff_file>
+Usage: python3 analyze_diff.py <diff_file> [toml_file]
 
 The diff file should be the raw output of `gh pr diff`.
+The optional toml_file is the full toml from the PR branch, used to detect
+version conflicts (multiple versions of the same file type coexisting).
 Outputs structured text sections for hash changes, merges, version upgrades,
-new data pruned from MDBX, and unexpected deletions.
+new data pruned from MDBX, unexpected deletions, and version conflicts.
 """
 
 import re
@@ -57,6 +59,35 @@ def hgroup(cat):
     return "other"
 
 
+def parse_ver(v):
+    return tuple(int(x) for x in v.lstrip("v").split("."))
+
+
+def parse_toml(path):
+    filenames = []
+    with open(path) as f:
+        for line in f:
+            m = re.match(r"^'([^']+)'\s*=\s*'[a-f0-9]+'", line.strip())
+            if m:
+                filenames.append(m.group(1))
+    return filenames
+
+
+def detect_version_conflicts(filenames):
+    groups = defaultdict(set)
+    for fname in filenames:
+        info = parse_filename(fname)
+        if info["cat"] in ("other", "unknown"):
+            continue
+        key = (info["cat"], info["dt"], info["ext"], info["s"], info["e"])
+        groups[key].add(info["ver"])
+    conflicts = {}
+    for key, versions in groups.items():
+        if len(versions) >= 2:
+            conflicts[key] = sorted(versions)
+    return conflicts
+
+
 def classify(removed, added):
     # 1. Hash changes
     hash_changes = []
@@ -126,7 +157,7 @@ def classify(removed, added):
     return hash_changes, merges, version_upgrades_list, frontier, unexpected
 
 
-def print_report(removed, added, hash_changes, merges, version_upgrades_list, frontier, unexpected):
+def print_report(removed, added, hash_changes, merges, version_upgrades_list, frontier, unexpected, version_conflicts=None):
     # Hash changes
     print("=== HASH CHANGES ===")
     for f, oh, nh in hash_changes:
@@ -140,6 +171,14 @@ def print_report(removed, added, hash_changes, merges, version_upgrades_list, fr
         print(f" [{hgroup(u['cat'])}] {u['fname']}")
     print(f" count={len(unexpected)}")
 
+    # Version conflicts
+    if version_conflicts is not None:
+        print("=== VERSION CONFLICTS ===")
+        for (cat, dt, ext, s, e), versions in sorted(version_conflicts.items()):
+            ver_str = ", ".join(f"{v} (.{ext})" for v in versions)
+            print(f" [{hgroup(cat)}] {cat}/{dt} range {s}-{e}: {ver_str}")
+        print(f" count={len(version_conflicts)}")
+
     # Merges table
     print("=== MERGES TABLE ===")
     mp = defaultdict(list)
@@ -162,14 +201,28 @@ def print_report(removed, added, hash_changes, merges, version_upgrades_list, fr
         types_str = ", ".join(sorted(set(items)))
         print(f" [{hg}] | {cat} | {types_str} | {old_r} | {ar[0]}-{ar[1]} [{nv}] | {note}")
 
-    # Version upgrades
-    print("=== VERSION UPGRADES ===")
-    vup = defaultdict(list)
+    # Version upgrades and downgrades
+    upgrades = defaultdict(list)
+    downgrades = defaultdict(list)
     for vu in version_upgrades_list:
         vk = (hgroup(vu["cat"]), vu["cat"], tuple(sorted(vu["old_vers"])), vu["new_ver"])
         rr = [(s, e) for s, e, v in vu["rem_ranges"]]
-        vup[vk].append(f"{vu['dt']} (.{vu['ext']}): {', '.join(f'{s}-{e}' for s, e in rr)} -> {vu['add_range'][0]}-{vu['add_range'][1]}")
-    for (hg, cat, ov, nv), items in sorted(vup.items()):
+        detail = f"{vu['dt']} (.{vu['ext']}): {', '.join(f'{s}-{e}' for s, e in rr)} -> {vu['add_range'][0]}-{vu['add_range'][1]}"
+        max_old = max(parse_ver(v) for v in vu["old_vers"])
+        if parse_ver(vu["new_ver"]) < max_old:
+            downgrades[vk].append(detail)
+        else:
+            upgrades[vk].append(detail)
+
+    print("=== VERSION DOWNGRADES ===")
+    for (hg, cat, ov, nv), items in sorted(downgrades.items()):
+        print(f" [{hg}] {cat}: {','.join(ov)} -> {nv}")
+        for i in sorted(items):
+            print(f" {i}")
+    print(f" count={sum(len(v) for v in downgrades.values())}")
+
+    print("=== VERSION UPGRADES ===")
+    for (hg, cat, ov, nv), items in sorted(upgrades.items()):
         print(f" [{hg}] {cat}: {','.join(ov)} -> {nv}")
         for i in sorted(items):
             print(f" {i}")
@@ -190,14 +243,21 @@ def print_report(removed, added, hash_changes, merges, version_upgrades_list, fr
 
 
 def main():
-    if len(sys.argv) != 2:
-        print(f"Usage: {sys.argv[0]} <diff_file>", file=sys.stderr)
+    if len(sys.argv) < 2 or len(sys.argv) > 3:
+        print(f"Usage: {sys.argv[0]} <diff_file> [toml_file]", file=sys.stderr)
         sys.exit(1)
 
     diff_file = sys.argv[1]
     removed, added = parse_diff(diff_file)
     hash_changes, merges, version_upgrades_list, frontier, unexpected = classify(removed, added)
-    print_report(removed, added, hash_changes, merges, version_upgrades_list, frontier, unexpected)
+
+    version_conflicts = None
+    if len(sys.argv) == 3:
+        toml_file = sys.argv[2]
+        filenames = parse_toml(toml_file)
+        version_conflicts = detect_version_conflicts(filenames)
+
+    print_report(removed, added, hash_changes, merges, version_upgrades_list, frontier, unexpected, version_conflicts)
 
 
 if __name__ == "__main__":

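
The downgrade check in this diff relies on `parse_ver` turning a version string into a tuple of integers before comparing. A short sketch of why that matters, assuming versions always look like `v<major>.<minor>`: `parse_ver` is copied from the diff, while the `is_downgrade` helper is a hypothetical wrapper added only for this example.

```python
def parse_ver(v):
    # "v1.10" -> (1, 10), so each component is compared numerically
    return tuple(int(x) for x in v.lstrip("v").split("."))


def is_downgrade(old_vers, new_ver):
    """True when new_ver is lower than the highest version it replaces."""
    return parse_ver(new_ver) < max(parse_ver(v) for v in old_vers)


print(is_downgrade(["v1.2"], "v1.1"))   # new version is lower
print(is_downgrade(["v1.9"], "v1.10"))  # numerically higher, not a downgrade
print("v1.10" < "v1.9")                 # plain string comparison gets this wrong
```

Comparing the raw strings would order `v1.10` before `v1.9`, so a legitimate upgrade could be misreported as a regression; the tuple form avoids that.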