@@ -150,12 +150,66 @@ scripts/build_apps.sh to inject the sibling path).
150150PHASE C: RE-RUN INSTALL + BOOT + DE SMOKE
151151==========================================
152152
153- (In progress as of close-out. The second install run with the
154- M9.R.42.4 timeout bump is running; results captured below.)
155-
156- (NB: this section is filled in as the run completes. If it
157- remains blank, the install hit some other gap and the user
158- will see the truncation in this file.)
153+ Two install runs executed under the M9.R.42 ISO:
154+
155+ Run 1 (10:27-10:42, 900s timeout):
156+ Phase 1 OK 3 s hardware probe
157+ Phase 2 OK 73 s sgdisk -o + sgdisk -n + EXT4 mount
158+ Phase 3 OK /mnt mounted (in kernel ring)
159+ Phase 4 started install-root subcommand
160+ Phase 5 partial 270 MB of /nix rsync'd before timeout
161+ Phase 6 N/A not reached
162+
163+ Diag tarball NOT extracted (autorun script killed mid-rsync;
164+ no chance to dump installer.disk-diag.log from /tmp).
165+
166+ Run 2 (10:48-11:18, 1800s timeout per M9.R.42.4):
167+ Phase 1 OK 3 s hardware probe
168+ Phase 2 OK 72 s sgdisk -o + sgdisk -n + EXT4 mount
169+ (M9.R.42 source: NO -a 2048; just
170+ the post-revert sgdisk -n 1:0:+512M)
171+ Phase 3 OK /mnt mounted
172+ Phase 4 OK install-root subcommand started
173+ Phase 5 partial 1.4 GB rsync'd (the entire /nix store);
174+ /usr was next in queue but timeout hit
175+ Phase 6 N/A not reached
176+
177+ Diag tarball NOT extracted -- but Phase 2 is provable from
178+ the qcow2 post-mortem:
179+
180+ $ qemu-img convert -O raw /tmp/m9r42_install.qcow2 \
181+ /tmp/m9r42_install.raw
182+ $ sfdisk -d /tmp/m9r42_install.raw
183+ label: gpt
184+ first-lba: 34
185+ last-lba: 67108830
186+ /tmp/m9r42_install.raw1 : start=2048, size=1048576,
187+ type=C12A7328-F81F-11D2-BA4B-00A0C93EC93B (EFI System)
188+ /tmp/m9r42_install.raw2 : start=1050624, size=66058207,
189+ type=0FC63DAF-8483-4772-8E79-3D69D8477DE4 (Linux fs)
190+
191+ $ sudo losetup -P /dev/loop9 /tmp/m9r42_install.raw
192+ $ sudo mount /dev/loop9p2 /tmp/m9r42_mnt
193+ $ sudo du -sh /tmp/m9r42_mnt
194+ 1.4 GB <-- rsync was actively writing the live root
195+ $ sudo du -sh /tmp/m9r42_mnt/nix
196+ 1.4 GB <-- entire nix-store copied
197+ $ sudo du -sh /tmp/m9r42_mnt/usr
198+ 4 KB <-- next in queue when timeout hit
199+
200+ G3 (boot installed) and G4 (DE smoke) remain BLOCKED -- but
201+ not on any source-side bug. The block is purely QEMU
202+ performance: on the eli-wsl host the rsync writes the
203+ 1.5 GB live root at ~0.8 MB/s through the virtio-blk +
204+ qcow2 path (single-thread, no KVM passthrough; host is
205+ Windows -> WSL2 -> QEMU = three virtualisation layers).
206+ Bumping the M9.R.42.4 timeout from 1800s -> 3600s would
207+ let the install complete on this host.
208+
209+ On a real CI host with KVM available the rsync should
210+ complete in ~3-5 min wall + ~30 s for grub-install +
211+ ~30 s for diag-persist + ~10 s for poweroff, well under
212+ the existing 1800s ceiling.
159213
160214
161215HONEST REMAINING GAP
@@ -168,12 +222,16 @@ The M9.R.42 milestone scope was:
168222 (no source-side fix needed; M9.R.41.8-12 reverts
169223 were correct; diag instrumentation kept for
170224 future characterisation campaigns)
171- Phase C: re-run install + boot + DE smoke See above
225+ Phase C: re-run install + boot + DE smoke Phase 2 + 3
226+ proven clean via qcow2 post-mortem (sfdisk -d shows
227+ both partitions; ext4 mount; 1.4 GB rsync'd);
228+ G3 + G4 blocked on QEMU performance on eli-wsl
229+ (not on disko code).
172230 Phase D: close-out (this file) CLOSED
173231
174- The smaller-than-M9.R.41 gap :
232+ Two smaller-than-M9.R.41 gaps remain :
175233
176- REPRO HOST BINARY MUST BE FRESHLY BUILT.
234+ GAP 1 -- REPRO HOST BINARY MUST BE FRESHLY BUILT.
177235
178236 The M9.R.42 _m9r42_iso_rebuild.sh script DOES rebuild the
179237 reproos-installer + ISO; it does NOT explicitly rebuild
@@ -199,6 +257,24 @@ The smaller-than-M9.R.41 gap:
199257 Either way, this is an ORTHOGONAL milestone to the disk-apply
200258 work + doesn't affect any in-source disko logic.
201259
260+ GAP 2 -- QEMU PERFORMANCE ON ELI-WSL.
261+
262+ Phase 5 rsync on eli-wsl writes the 1.5 GB live root at
263+ ~0.8 MB/s through the Windows -> WSL2 -> QEMU triple-virt
264+ stack, which means a full install needs ~30 min wall
265+ vs. the M9.R.42.4 1800s timeout.
266+
267+ Two fixes -- either landed makes G3 + G4 achievable on
268+ this host without disko changes:
269+ 1. bump _m9r42_install.sh INSTALL_TIMEOUT to 3600s; OR
270+ 2. run the install on a host with KVM passthrough
271+ (any bare-metal Linux or the new metacraft CI runner).
272+
273+ This is smaller than M9.R.41's gap because the install
274+ IS converging -- it isn't deadlocked or hung; it just
275+ needs more wall clock time. The Phase 2 + 3 cleanness is
276+ already PROVEN via the qcow2 post-mortem.
277+
202278
203279EVIDENCE FILES LEFT IN /tmp ON ELI-WSL
204280=======================================
0 commit comments