LXD copying lxd-agent binary into VM's config drive eventually causes the config drive quota to be reached #16694

@tomponline

Please confirm

  • I have searched existing issues to check if an issue already exists for the bug I encountered.

Distribution

ubuntu

Distribution version

2404

Output of "snap list --all lxd core20 core22 core24 snapd"

N/A

Output of "lxc info" or system info if it fails

N/A

Issue description

Reported by https://discourse.ubuntu.com/t/windows-vm-error-no-space-left-on-device/68869/13

When I start my Windows VM I get the following error:

Error: open /var/snap/lxd/common/lxd/virtual-machines/win11d/config/server.crt: no space left on device

I install some software inside Windows, which again works well. After a few sessions, I run into this error when running lxc start win11d.

I believe this is caused by the interplay between two things. First, LXD copies the lxd-agent binary into the VM's config drive (the default/virtual-machines/win11d volume) each time the VM is started, if the binary has changed (i.e. after a snap refresh). Second, snapshots are taken, so the old version of the lxd-agent is retained in each snapshot, which increases the CoW storage accounted to the volume. After enough snap refreshes, snapshots and VM restarts, the config drive is full.

Setting size.state increases the size of the VM's config drive, allowing this pattern to continue for longer.

I think we need to look into using bind-mounts to get the lxd-agent into the VM's config drive mount without needing to use space in the config drive volume itself.

Steps to reproduce

  1. Launch a VM
  2. Touch the lxd-agent binary (simulating a snap refresh that updates the binary).
  3. Create a snapshot
  4. Restart VM

Repeat steps 2 to 4 until the VM's config drive fills up and the VM refuses to start.
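
The cycle above could be scripted roughly as follows. This is only a sketch under stated assumptions: the VM name, the image, and especially `AGENT_BIN` (the path of the lxd-agent binary that LXD copies from) are placeholders that are not given in this report and depend on how LXD is installed. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Dry-run sketch of the reproduction cycle described above.
# VM, IMAGE and AGENT_BIN are illustrative assumptions, not values from the report.
VM="${VM:-repro-vm}"
IMAGE="${IMAGE:-ubuntu:24.04}"
AGENT_BIN="${AGENT_BIN:-/path/to/lxd-agent}"  # placeholder: the binary LXD copies from
CYCLES="${CYCLES:-3}"
RUN="${RUN-echo}"  # prints each command by default; export RUN= to really run them

repro_cycle() {
    # $1 is the cycle number, used to name the snapshot.
    $RUN touch "$AGENT_BIN"             # step 2: simulate a refreshed agent binary
    $RUN lxc snapshot "$VM" "snap$1"    # step 3: keep the old copy in a snapshot
    $RUN lxc restart "$VM"              # step 4: restart so the agent is copied again
}

$RUN lxc launch "$IMAGE" "$VM" --vm     # step 1: launch a VM
i=1
while [ "$i" -le "$CYCLES" ]; do
    repro_cycle "$i"
    i=$((i + 1))
done
```

Running it with `RUN=` set to an empty value and a larger `CYCLES` executes the commands for real. How many cycles are needed depends on the size of the config drive and of the agent binary.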

When using ZFS, the available space for the VM's config drive should show as 0B, e.g.:

    default/virtual-machines/win11d 7.87M 0B 7.68M legacy
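
To check this from a script rather than by eye, something like the following might work. It is a sketch that assumes the dataset name from the example above and that the `zfs` command is available on the host. The `is_full` helper is a hypothetical name introduced here for illustration.

```shell
#!/bin/sh
# Sketch: report whether the config drive dataset has run out of space.
# DATASET defaults to the name shown in the example above.
DATASET="${DATASET:-default/virtual-machines/win11d}"

# Succeeds when the byte count passed as $1 is zero.
is_full() {
    [ "$1" -eq 0 ]
}

if command -v zfs >/dev/null 2>&1; then
    # -H drops the header line, -p prints exact byte counts instead of 7.87M-style values.
    if avail="$(zfs list -H -p -o available "$DATASET" 2>/dev/null)"; then
        if is_full "$avail"; then
            echo "$DATASET: no space left"
        else
            echo "$DATASET: $avail bytes available"
        fi
    fi
fi
```

The script prints nothing when `zfs` is missing or the dataset does not exist.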

Information to attach

  • Any relevant kernel output (dmesg)
  • Instance log (lxc info NAME --show-log)
  • Instance configuration (lxc config show NAME --expanded)
  • Main daemon log (at /var/log/lxd/lxd.log or /var/snap/lxd/common/lxd/logs/lxd.log)
  • Output of the client with --debug
  • Output of the daemon with --debug (or use lxc monitor while reproducing the issue)
