fix: nspawn_host_dev - restore /dev/pts and /dev/shm after devtmpfs remount #1749

Open
nikromen wants to merge 1 commit into rpm-software-management:main from nikromen:kiwi

Conversation

Collaborator

@nikromen nikromen commented Apr 27, 2026

systemd-nspawn sets up /dev as tmpfs, so dynamically created device
nodes (e.g. /dev/loop0p1) never appear inside the container.
This is a known kernel limitation. Block devices are not virtualized
for containers (systemd/systemd#21987, Poettering on systemd-devel:
https://lists.freedesktop.org/archives/systemd-devel/2017-August/039453.html).

The nspawn_host_dev option works around this by remounting /dev as
devtmpfs inside the container. Since devtmpfs is a single kernel-wide
instance, all device nodes (including those created dynamically)
become visible. osbuild/image-builder uses the same approach
(see osbuild/image-builder-cli pkg/setup/setup.go and the #1554 discussion).

When nspawn_host_dev is enabled:

  • CAP_SYS_ADMIN is granted and DevicePolicy is set to auto.
  • /dev is remounted as devtmpfs before each command, then /dev/pts
    and /dev/shm are restored (the remount masks these submounts).
  • The remount is skipped in bootstrap chroots where CAP_SYS_ADMIN
    is not granted.
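
As a rough illustration of the gating described in the list above (not the code in this PR), the sketch below returns the extra nspawn arguments only for non-bootstrap chroots and derives the remount decision from them. The function and parameter names are hypothetical; only the two argument strings are taken from the diff under review.

```python
def host_dev_nspawn_args(nspawn_host_dev, is_bootstrap):
    """Return extra systemd-nspawn arguments for the host /dev workaround.

    Both parameters are hypothetical stand-ins: one for the config option,
    one for "this chroot is a bootstrap chroot".
    """
    if not nspawn_host_dev or is_bootstrap:
        # Bootstrap chroots are not granted CAP_SYS_ADMIN, so no extra args.
        return []
    return [
        '--capability=cap_sys_admin',
        '--property=DevicePolicy=auto',
    ]


def needs_dev_remount(nspawn_args):
    """The remount is only attempted when CAP_SYS_ADMIN was granted."""
    return '--capability=cap_sys_admin' in nspawn_args


if __name__ == '__main__':
    print(needs_dev_remount(host_dev_nspawn_args(True, False)))
    print(needs_dev_remount(host_dev_nspawn_args(True, True)))
```

Deriving the remount decision from the generated arguments keeps the bootstrap case from attempting a mount it has no privileges for.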

Fixes: #1554

Comment thread mock/py/mockbuild/util.py Fixed
@nikromen nikromen changed the title wip Fix kiwi builds fail with nspawn due to broken loop device Apr 27, 2026

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces the nspawn_host_dev configuration option, allowing Mock to remount /dev as devtmpfs within systemd-nspawn containers to support image-building tools like kiwi and osbuild. Review feedback indicates that the current implementation of remounting /dev will mask essential submounts such as /dev/pts and /dev/shm, which must be manually restored to maintain PTY and shared memory functionality. Furthermore, an inconsistency was noted where the mount command is attempted in bootstrap chroots even when the necessary privileges are not granted, suggesting a need for better synchronization between the configuration and the execution logic.

Comment thread mock/py/mockbuild/util.py Outdated
Comment thread mock/py/mockbuild/buildroot.py
@nikromen nikromen marked this pull request as ready for review May 5, 2026 22:14
@@ -0,0 +1,17 @@
#!/bin/sh

. ${TESTDIR}/functions

Collaborator Author

nikromen commented May 5, 2026

/gemini review

@nikromen nikromen changed the title Fix kiwi builds fail with nspawn due to broken loop device fix: nspawn_host_dev - restore /dev/pts and /dev/shm after devtmpfs remount May 5, 2026
Collaborator Author

nikromen commented May 5, 2026

@praiskup @Conan-Kudo This follows the same approach as osbuild/image-builder
(see osbuild/image-builder-cli pkg/setup/setup.go and @supakeen's comment in #1554).

Does this approach work for you?

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces the nspawn_host_dev configuration option, which allows systemd-nspawn containers to access dynamically created loop partition devices by remounting /dev as devtmpfs. While the changes include documentation and integration tests, the review identified several critical issues: the remount logic will fail in non-privileged builds because it attempts to run mount commands after dropping root privileges; the check for the cap_sys_admin capability is fragile and may miss valid configurations; and an early return in the loop device setup could skip necessary host-side device creation.

Comment thread mock/py/mockbuild/util.py Outdated
Comment on lines +804 to +809
cmd = ['/bin/sh', '-c',
       'mount -t devtmpfs devtmpfs /dev'
       ' && mount -t devpts devpts /dev/pts -o gid=5,mode=620'
       ' && mount -t tmpfs tmpfs /dev/shm -o mode=1777;'
       ' exec "$@"',
       '--'] + cmd

high

The mount commands will fail when running a non-privileged build (the default for Mock). When systemd-nspawn is called with the -u option (lines 762-764), it switches to the specified user before executing the command. A non-root user cannot perform mount operations, even if the container has CAP_SYS_ADMIN, because the capability is not in their effective set. Since most Mock builds run as the mockbuild user, this feature will fail. To fix this, the remount must happen as root before dropping privileges. Alternatively, consider using systemd-nspawn --bind=/dev which performs the bind mount as root before switching the user.

Collaborator Author

Good catch on the user context. For --shell and --chroot the
command runs as root, so the mount works. For rpmbuild it runs as
chrootuser, but kiwi/osbuild builds use --chroot, not rpmbuild, so the
primary use case is covered. This is worth noting, so I should document it.

Regarding the suggested alternatives: moving the remount before the user switch
would require hooking into nspawn's own startup sequence, which would add significant complexity.
And --bind=/dev exposes all host devices and masks /dev/pts and /dev/shm
the same way, so it doesn't actually help.

'--capability=cap_sys_admin',
'--property=DevicePolicy=auto',
])
return

medium

By returning early here, the code skips the pre-creation of loop devices on the host (lines 877-883). While modern kernels create loop devices on demand, Mock explicitly manages dev_loop_count to ensure a specific number of devices are available. If nspawn_host_dev is enabled on a system that doesn't automatically create enough loop devices, this might lead to failures. Consider ensuring the host-side device creation still runs even when nspawn_host_dev is enabled.

Collaborator Author

With nspawn_host_dev the whole point is that /dev is devtmpfs, which shows all
host devices including loop nodes created on demand via /dev/loop-control.
I think pre-creating them is unnecessary.
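
For readers unfamiliar with on-demand allocation, here is a minimal sketch of asking the kernel for a free loop device through /dev/loop-control. It is an illustration only, not code from Mock: the helper names are hypothetical, the ioctl number is the LOOP_CTL_GET_FREE value from linux/loop.h, and calling it normally requires root.

```python
import fcntl
import os

# LOOP_CTL_GET_FREE as defined in linux/loop.h.
LOOP_CTL_GET_FREE = 0x4C82


def loop_device_path(index):
    """Map a loop device index to its node name (hypothetical helper)."""
    return '/dev/loop{}'.format(index)


def get_free_loop_device(control_path='/dev/loop-control'):
    """Ask the kernel for an unused loop device, allocating one if needed.

    Opening the control node typically requires root privileges.
    """
    fd = os.open(control_path, os.O_RDWR)
    try:
        # The ioctl returns the index of a free loop device.
        index = fcntl.ioctl(fd, LOOP_CTL_GET_FREE)
    finally:
        os.close(fd)
    return loop_device_path(index)
```

With /dev mounted as devtmpfs, a node allocated this way should show up without any pre-creation step, which is the argument being made in this reply.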

Comment thread mock/py/mockbuild/util.py Outdated
if isinstance(cmd, str):
    cmd = ['/bin/sh', '-c', cmd]

if USE_NSPAWN_HOST_DEV and nspawn_args and '--capability=cap_sys_admin' in nspawn_args:

medium

The check for cap_sys_admin in nspawn_args is fragile because it only matches the exact string '--capability=cap_sys_admin'. If a user provides this capability in a different format (e.g., ['--capability', 'cap_sys_admin'] or as part of a comma-separated list), the remount logic will be skipped. Consider using a more robust check, such as any('cap_sys_admin' in arg for arg in nspawn_args if arg.startswith('--capability')).
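
For illustration, a more tolerant check along the lines suggested here might look like the sketch below. The helper name is hypothetical and it is not part of the PR; it accepts the `--capability=...` form (including comma-separated lists) and the two-argument `--capability VALUE` form.

```python
def has_capability(nspawn_args, capability):
    """Return True if `capability` is requested via --capability.

    Hypothetical helper; handles '--capability=a,b' and
    ['--capability', 'a,b'] but nothing else.
    """
    args = list(nspawn_args or [])
    for index, arg in enumerate(args):
        if arg.startswith('--capability='):
            value = arg.split('=', 1)[1]
        elif arg == '--capability' and index + 1 < len(args):
            value = args[index + 1]
        else:
            continue
        # Compare whole names so 'cap_sys_admin' does not match a substring.
        if capability in (item.strip() for item in value.split(',')):
            return True
    return False


if __name__ == '__main__':
    print(has_capability(['--capability', 'cap_sys_admin'], 'cap_sys_admin'))
```

Comparing whole comma-separated items avoids accidental substring matches and ignores unrelated options such as `--drop-capability=`.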

Collaborator Author

The capability string is generated by our own code in _setup_nspawn_loop_devices(), not taken from user input, so it will always be exactly --capability=cap_sys_admin.

Member

If we check for USE_NSPAWN_HOST_DEV, the option must already be in nspawn_args, right? No need to check for it twice?

supakeen commented May 6, 2026

This approach looks OK at a first read. I don't know when I'll have time to try this out but hopefully next week.

Very cool if we can make this work 🙂

@Conan-Kudo
Member

Anyone can verify if this fixes the problem by testing the reproducer in the ticket this PR claims to fix.

@praiskup
Member

Anyone can verify if this fixes the problem by testing the reproducer in the ticket this PR claims to fix.

Failed for me:

[ DEBUG   ]: 09:33:50 | EXEC: [rm -f /builddir/kiwi-build/build/image-root/etc/hosts]
[ DEBUG   ]: 09:33:50 | Looking for rm in /usr/bin:/bin:/usr/sbin:/sbin
[ DEBUG   ]: 09:33:50 | EXEC: [rm -f /builddir/kiwi-build/build/image-root/etc/resolv.conf.kiwi /builddir/kiwi-build/build/image-root/etc/resolv.conf.sha /builddir/kiwi-build/build/image-root/etc/hosts.kiwi /builddir/kiwi-build/build/image-root/etc/hosts.sha]
[ ERROR   ]: 09:33:50 | KiwiCommandError: chroot: stderr: setfiles: Could not set context for /usr/lib/systemd/system-generators/anaconda-generator:  Invalid argument
setfiles: Could not set context for /usr/lib/systemd/system-generators/kdump-dep-generator.sh:  Invalid argument
/ 100.0%: 

Comment thread mock/py/mockbuild/util.py Outdated
'mount -t devtmpfs devtmpfs /dev'
' && mount -t devpts devpts /dev/pts -o gid=5,mode=620'
' && mount -t tmpfs tmpfs /dev/shm -o mode=1777;'
' exec "$@"',
Member

Hmm, this shell -> exec & set of mounts seems pretty dangerous. How do we umount?

Member

I don't think this is the way to go (we should be using some builtin mechanism for handling mountpoints 🤷). Some hints anyway:

  • && should be used instead of ; before exec
  • please use set -x
  • perhaps set -e instead of &&
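
A sketch of how the hints above could be applied to the wrapper, assuming the same three mount commands as in the diff: `set -e` aborts before `exec` if any mount fails, and `set -x` traces each command. The function name is hypothetical, and the sketch does not address the unmount question raised earlier in this thread.

```python
def wrap_with_dev_remount(cmd):
    """Prefix `cmd` with a shell prologue that remounts /dev.

    Hypothetical helper: with `set -e`, a failing mount stops the script,
    so the wrapped command only runs when every mount succeeded.
    """
    script = (
        'set -e -x\n'
        'mount -t devtmpfs devtmpfs /dev\n'
        'mount -t devpts devpts /dev/pts -o gid=5,mode=620\n'
        'mount -t tmpfs tmpfs /dev/shm -o mode=1777\n'
        'exec "$@"'
    )
    # With `sh -c SCRIPT -- ARGS`, '--' becomes $0 and ARGS become "$@".
    return ['/bin/sh', '-c', script, '--'] + list(cmd)


if __name__ == '__main__':
    for part in wrap_with_dev_remount(['kiwi-ng', '--version']):
        print(repr(part))
```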

Collaborator Author

nikromen commented May 25, 2026

@Conan-Kudo The reproducer in this issue runs a full kiwi image build, which is really broad and can fail for many unrelated reasons (e.g. praiskup hit an SELinux setfiles error). Could you provide a minimal reproducer that only tests the specific failure, loop partition device visibility, so I can verify against it?

@Conan-Kudo
Member

@Conan-Kudo The reproducer in this issue runs a full kiwi image build, which is really broad and can fail for many unrelated reasons (e.g. praiskup hit an SELinux setfiles error). Could you provide a minimal reproducer that only tests the specific failure, loop partition device visibility, so I can verify against it?

I could, but it would be disingenuous, because this bug needs to fix that use case. It isn't useful if we can't build a Fedora image on a Fedora host with it.

If you cannot build an image, then this cannot be declared fixed.

Anyone can verify if this fixes the problem by testing the reproducer in the ticket this PR claims to fix.

Failed for me:

[ DEBUG   ]: 09:33:50 | EXEC: [rm -f /builddir/kiwi-build/build/image-root/etc/hosts]
[ DEBUG   ]: 09:33:50 | Looking for rm in /usr/bin:/bin:/usr/sbin:/sbin
[ DEBUG   ]: 09:33:50 | EXEC: [rm -f /builddir/kiwi-build/build/image-root/etc/resolv.conf.kiwi /builddir/kiwi-build/build/image-root/etc/resolv.conf.sha /builddir/kiwi-build/build/image-root/etc/hosts.kiwi /builddir/kiwi-build/build/image-root/etc/hosts.sha]
[ ERROR   ]: 09:33:50 | KiwiCommandError: chroot: stderr: setfiles: Could not set context for /usr/lib/systemd/system-generators/anaconda-generator:  Invalid argument
setfiles: Could not set context for /usr/lib/systemd/system-generators/kdump-dep-generator.sh:  Invalid argument
/ 100.0%: 

Are you building on a Fedora host? Is selinux-policy in sync between the host and the image environment? If they are too out of sync, you will see failures like this since the host policy affects SELinux tools. Labels it doesn't recognize will fail with "Invalid argument".
