This repository was archived by the owner on May 31, 2026. It is now read-only.
Commit 1d7ec02
v0.4.24: auto-isolation — definitive REPLACE answer without pulling DIMMs
When the post-test verdict detects errors distributed across 2+ DIMMs
in a block-mapped Type 20 layout (typical Intel consumer board with
disjoint DIMM ranges), the user no longer needs to physically pull
sticks one-by-one to identify which is bad. Press [I] on the verdict
screen and the program automatically:
1. Picks the kernel that found the most errors (e.g. AVX2 Sustained
   for the Habr-user case where 24 errors all came from AVX2).
2. For each of the affected DIMMs in turn:
   a. Frees the whole-RAM buffer
   b. Sets TestOnlyDimm=N internally
   c. Allocates a 256 MB buffer pinned to that DIMM's physical
      address range via UEFI AllocatePages (AllocateAddress type)
      + SMBIOS Type 20
   d. Runs the kernel 3 times, snapshots error count delta
   e. Truncates isolation records from g_err_records so the
      original verdict data is preserved
   f. Frees the per-DIMM buffer
3. Re-allocates the whole-RAM buffer.
4. Shows a definitive verdict screen with per-DIMM error counts:
     DDR4-A2  ✓ ЧИСТАЯ (CLEAN) · 0 errors in 3 passes
     DDR4-B2  ✗ НЕИСПРАВНА (FAULTY) · 8 errors in 3 passes
     ▶ ТОЧНО (DEFINITE): REPLACE DDR4-B2 (HIGH confidence, confirmed by isolation)
Total time: ~5 min for a 2-DIMM isolation.
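Steps 1 and 2d-2e above can be sketched roughly as follows. This is a minimal
illustration, not the code from this commit: kernel_stat_t, iso_result_t,
pick_worst_kernel, isolate_dimm and fake_pass are hypothetical names, and the
error log is reduced to a plain record counter standing in for g_err_records.

```c
#include <stddef.h>
#include <stdint.h>

#define ISO_PASSES 3  /* passes per DIMM, as described above */

/* Hypothetical per-kernel statistics from the main test run. */
typedef struct {
    const char *name;
    uint32_t    errors;
} kernel_stat_t;

/* Hypothetical per-DIMM isolation result. */
typedef struct {
    uint32_t errors;  /* errors seen during isolation only */
    uint32_t passes;  /* passes completed */
} iso_result_t;

/* Stand-in for "run the kernel once on the pinned buffer":
 * takes the current record count and returns the new one. */
typedef size_t (*run_pass_fn)(void *ctx, size_t dimm, size_t record_count);

/* Step 1: index of the kernel with the most errors.
 * The first kernel wins on ties; returns -1 for an empty list. */
static int pick_worst_kernel(const kernel_stat_t *k, size_t n)
{
    int best = -1;
    for (size_t i = 0; i < n; i++)
        if (best < 0 || k[i].errors > k[(size_t)best].errors)
            best = (int)i;
    return best;
}

/* Steps 2d-2e: run the passes, accumulate the error delta of each pass,
 * then truncate the log back to its pre-isolation length so the
 * original verdict data is untouched. */
static void isolate_dimm(size_t dimm, run_pass_fn run_pass, void *ctx,
                         size_t *record_count, iso_result_t *out)
{
    const size_t saved = *record_count;  /* records behind the original verdict */

    out->errors = 0;
    out->passes = 0;
    for (int p = 0; p < ISO_PASSES; p++) {
        size_t before = *record_count;
        *record_count = run_pass(ctx, dimm, before);
        out->errors += (uint32_t)(*record_count - before);
        out->passes++;
    }
    *record_count = saved;               /* drop isolation-only records */
}

/* Demo stub (assumption): DIMM 1 logs two new errors per pass, others none. */
static size_t fake_pass(void *ctx, size_t dimm, size_t record_count)
{
    (void)ctx;
    return record_count + (dimm == 1 ? 2u : 0u);
}
```

Restoring the saved record count, rather than clearing the log, is what lets the
[D] key return to an unchanged original verdict after isolation.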
Why this is gated on block-mapped Type 20: with real cache-line
interleave (overlapping Type 20 ranges), TestOnlyDimm doesn't
physically isolate a DIMM, because the iMC (integrated memory
controller) keeps alternating cache lines between channels regardless
of which "logical" DIMM range we allocate into. Overlap is detected via
type20_has_overlapping_ranges(): if any DIMM ranges overlap, the [I]
offer is suppressed and the user gets the existing manual-isolation
instructions instead.
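The overlap gate can be illustrated with a pairwise interval test. This is a
sketch under assumptions, not the body of type20_has_overlapping_ranges():
dimm_range_t and ranges_overlap_any are hypothetical names, and each Type 20
entry is reduced to one inclusive physical address range per DIMM.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of one SMBIOS Type 20 entry. */
typedef struct {
    uint64_t start;  /* first byte address of the range */
    uint64_t end;    /* last byte address, inclusive */
} dimm_range_t;

/* Returns true if any two ranges share at least one address.
 * Two inclusive intervals intersect exactly when each one starts
 * no later than the other one ends. */
static bool ranges_overlap_any(const dimm_range_t *r, size_t n)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = i + 1; j < n; j++)
            if (r[i].start <= r[j].end && r[j].start <= r[i].end)
                return true;
    return false;
}
```

With disjoint ranges such as 0x0..0x1FFFFFFFF and 0x200000000..0x3FFFFFFFF the
check returns false and [I] can be offered; two DIMMs both reported as
0x0..0x3FFFFFFFF return true and the offer is suppressed. The quadratic loop is
adequate here because a board exposes only a handful of DIMM ranges.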
UI flow:
- Verdict footer gains [I] when applicable.
- Pressing [I] shows a live "ИЗОЛЯЦИЯ" (ISOLATION) panel with the
  current DIMM, pass counter, elapsed time, and errors so far.
- ESC during isolation aborts and returns to the verdict.
- After isolation, the result screen is shown; [D] takes the user back
  to the original verdict, [M] to the menu, [ESC] reboots.
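The key handling described in the UI flow amounts to a small state machine. The
sketch below is illustrative only: screen_t, ui_key_t and next_screen are
hypothetical names, and the move from the isolation panel to the result screen
is assumed to happen on completion rather than on a key press.

```c
#include <stdbool.h>

/* Hypothetical screen and key identifiers. */
typedef enum {
    SCR_VERDICT, SCR_ISOLATING, SCR_ISO_RESULT, SCR_MENU, SCR_REBOOT
} screen_t;

typedef enum { UK_I, UK_D, UK_M, UK_ESC, UK_OTHER } ui_key_t;

/* Returns the screen to show after `key` is pressed on `cur`.
 * `iso_offered` is true only when the verdict footer shows [I]
 * (block-mapped Type 20, errors on 2+ DIMMs). Unhandled keys
 * leave the screen unchanged. */
static screen_t next_screen(screen_t cur, ui_key_t key, bool iso_offered)
{
    switch (cur) {
    case SCR_VERDICT:
        if (key == UK_I && iso_offered) return SCR_ISOLATING;
        break;
    case SCR_ISOLATING:
        if (key == UK_ESC) return SCR_VERDICT;   /* abort isolation */
        break;
    case SCR_ISO_RESULT:
        if (key == UK_D)   return SCR_VERDICT;   /* back to original verdict */
        if (key == UK_M)   return SCR_MENU;
        if (key == UK_ESC) return SCR_REBOOT;
        break;
    default:
        break;
    }
    return cur;
}
```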
Logging:
[ISO] === auto-isolation started ===
[ISO] target kernel = AVX2 Sustained; 2 DIMM(s) to test
[ISO] DDR4-A2: alloc OK at 0x140000000, 65536 pages — running ...
[ISO] pass 1/3: errors=0
[ISO] pass 2/3: errors=0
[ISO] pass 3/3: errors=0
[ISO] result: DDR4-A2 status=2 errors=0 passes=3
... same for B2 ...
[ISO] === auto-isolation finished ===
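The page count in the "alloc OK" line follows from the buffer size: UEFI pages
are 4 KiB, so 256 MB is 65536 pages. A small sketch of that arithmetic and of
producing the start of the log line is shown here; bytes_to_pages and
format_alloc_line are hypothetical helpers, not functions from this commit, and
the trailing "running ..." part of the real line is omitted.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define ISO_PAGE_SIZE 4096u  /* UEFI page size in bytes */

/* Number of 4 KiB pages needed to hold `bytes`, rounded up. */
static uint64_t bytes_to_pages(uint64_t bytes)
{
    return (bytes + ISO_PAGE_SIZE - 1) / ISO_PAGE_SIZE;
}

/* Formats the beginning of an "[ISO] ... alloc OK" line into buf.
 * Returns the snprintf result (length that would have been written). */
static int format_alloc_line(char *buf, size_t len, const char *label,
                             uint64_t base, uint64_t bytes)
{
    return snprintf(buf, len,
                    "[ISO] %s: alloc OK at 0x%" PRIx64 ", %" PRIu64 " pages",
                    label, base, bytes_to_pages(bytes));
}
```

For example, a 256 MB buffer for DDR4-A2 at base 0x140000000 formats as
"[ISO] DDR4-A2: alloc OK at 0x140000000, 65536 pages".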
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
1 parent 508dcc5
3 files changed
Lines changed: 458 additions & 45 deletions