file-backed block device could be hardened for NFS-backed files in the face of live migration #396

Open
@jordanhendricks

While testing live migrations, I've been using a file-backed disk for my guest image, where the backing file lives on an NFS server. I've run a lot of migrations against these guests, some ending less gracefully than others. Over time, these disks have degraded, and ultimately the guests report being very sad about data corruption.

I haven't looked into this much, but I suspect there are things we could do to make the file-backed block device work better with NFS (like making sure the file is fsynced at appropriate times). This obviously isn't a priority for the product, but it is a helpful development shortcut: it enables inter-machine migrations without having to set up Crucible or another network-based disk.
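To make "fsynced at appropriate times" a bit more concrete, here is a minimal sketch of the idea, not the actual file backend code. The helper names (`write_block`, `handle_guest_flush`, `flush_before_handoff`) are hypothetical, and it assumes a Unix-like host. Whether flushing at these two points is enough to avoid the corruption shown below is untested.

```rust
use std::fs::File;
use std::io;
use std::os::unix::fs::FileExt; // positioned reads/writes on Unix-like hosts

/// Hypothetical helper: write one guest block at a byte offset in the backing file.
/// After this returns, the data may still sit in the host (or NFS client) cache.
fn write_block(file: &File, offset: u64, data: &[u8]) -> io::Result<()> {
    file.write_all_at(data, offset)
}

/// Hypothetical helper: service a guest flush request by pushing file data
/// to stable storage (fdatasync semantics).
fn handle_guest_flush(file: &File) -> io::Result<()> {
    file.sync_data()
}

/// Hypothetical helper: run on the migration source before the destination
/// starts issuing I/O, so that no dirty data is left behind in the source's
/// cache (fsync semantics, data plus metadata).
fn flush_before_handoff(file: &File) -> io::Result<()> {
    file.sync_all()
}
```

The intent is that the source calls `flush_before_handoff` after the guest's devices are paused and before the destination opens or resumes using the same file, so the destination never reads stale blocks from the NFS server.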

Example sad helios guest:

BdsDxe: starting Boot0001 "UEFI Misc Device" from PciRoot(0x0)/Pci(0x4,0x0)
ZFS: i/o error - all block copies unavailable
ZFS: i/o error - all block copies unavailable
ZFS: i/o error - all block copies unavailable
ZFS: i/o error - all block copies unavailable
can't open '/boot/forth/brand.4th': unknown error (78)

ZFS: i/o error - all block copies unavailable
ZFS: i/o error - all block copies unavailable
Loading /boot/defaults/loader.conf 
ZFS: i/o error - all block copies unavailable
ZFS: i/o error - all block copies unavailable
Loading unix...
Loading /platform/i86pc/amd64/boot_archive...
ZFS: i/o error - all block copies unavailable
ZFS: i/o error - all block copies unavailable
Loading /platform/i86pc/amd64/boot_archive.hash...
Hit [Enter] to boot immediately, or any other key for command prompt.
Booting [/platform/i86pc/kernel/amd64/unix] in 8 seconds...
Booting [/platform/i86pc/kernel/amd64/unix]...
No rootfs module provided, aborting

Type '?' for a list of commands, 'help' for more detailed help


Example sad debian11 guest:

Loading, please wait...
Starting version 247.3-7+deb11u1
[    1.148531] piix4_smbus 0000:00:01.3: SMBus base address uninitialized - upgrade BIOS or use force_addr=0xaddr
[    1.152395] PCI Interrupt Link [LNKA] enabled at IRQ 11
[    1.153256] virtio-pci 0000:00:04.0: virtio_pci: leaving for legacy driver
[    1.163160] virtio_blk virtio0: [vda] 4194304 512-byte logical blocks (2.15 GB/2.00 GiB)
[    1.164231] vda: detected capacity change from 0 to 2147483648
[    1.179888]  vda: vda1 vda14 vda15
[    1.972097] tsc: Refined TSC clocksource calibration: 2649.950 MHz
[    1.974207] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x26328e53751, max_idle_ns: 440795320526 ns
[    1.977986] clocksource: Switched to clocksource tsc
[    4.404081] floppy0: no floppy controllers found
Begin: Loading essential drivers ... done.
Begin: Running /scripts/init-premount ... done.
Begin: Mounting root file system ... Begin: Running /scripts/local-top ... done.
Begin: Running /scripts/local-premount ... done.
Begin: Will now check root file system ... fsck from util-linux 2.36.1
[/sbin/fsck.ext4 (1) -- /dev/vda1] fsck.ext4 -a -C0 /dev/vda1 
/dev/vda1: recovering journal
/dev/vda1 contains a file system with errors, check forced.
Entry 'status' in /var/lib/logrotate (10258) has deleted/unused inode 22848.  CLEARED.    
Unattached inode 22788                                                                    


/dev/vda1: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
        (i.e., without -a or -p options)
fsck exited with status code 4
done.
Failure: File system check of the root filesystem failed

Labels: storage (Related to storage devices/backends.)
