Description
When running a container with a user namespace (e.g., an OCI spec whose linux.namespaces list includes an entry of type user),
runc create fails with:
unable to start container process: error during container init: error preparing rootfs: remount-private
dst=/run/containerd/.../rootfs, flags=MS_PRIVATE: operation not permitted
Root Cause
rootfsParentMountPrivate() in libcontainer/rootfs_linux.go calls mount("", path, "", MS_PRIVATE, "") on the rootfs
parent mount. In a user namespace, mounts inherited from a more privileged mount namespace are "locked" by the kernel,
and any propagation type change is rejected with EPERM.
Why It Is Safe to Skip
prepareRoot() has already called mount("", "/", "", MS_SLAVE|MS_REC, "") before rootfsParentMountPrivate().
MS_SLAVE is sufficient:
- pivot_root() succeeds (parent mount is not shared)
- Mount events do not propagate from the container to the parent namespace
- MS_PRIVATE is defense-in-depth on top of MS_SLAVE, redundant in this context
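
To check the "parent mount is not shared" condition on a given host, the propagation type of a mount can be read from the optional fields of its /proc/self/mountinfo entry (a shared:N tag means shared, master:N means slave, neither means private). The following is a minimal sketch of that check, not code from runc; the function name propagation and the sample lines are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// propagation reports the propagation type encoded in the optional fields
// of one /proc/self/mountinfo line: "shared", "slave", "shared+slave",
// "unbindable", or "private" when no propagation tag is present.
// Hypothetical helper for illustration; it is not part of runc.
func propagation(line string) string {
	fields := strings.Fields(line)
	shared, slave, unbindable := false, false, false
	// Optional fields start at index 6 and end at the "-" separator.
	for i := 6; i < len(fields) && fields[i] != "-"; i++ {
		switch {
		case strings.HasPrefix(fields[i], "shared:"):
			shared = true
		case strings.HasPrefix(fields[i], "master:"):
			slave = true
		case fields[i] == "unbindable":
			unbindable = true
		}
	}
	switch {
	case shared && slave:
		return "shared+slave"
	case shared:
		return "shared"
	case slave:
		return "slave"
	case unbindable:
		return "unbindable"
	default:
		return "private"
	}
}

func main() {
	// Made-up sample lines, not captured from a real host.
	samples := []string{
		"36 35 98:0 / /run/rootfs rw,noatime shared:1 - ext4 /dev/sda1 rw",
		"36 35 98:0 / /run/rootfs rw,noatime master:1 - ext4 /dev/sda1 rw",
		"36 35 98:0 / /run/rootfs rw,noatime - ext4 /dev/sda1 rw",
	}
	for _, s := range samples {
		fmt.Println(propagation(s))
	}
}
```

If the rootfs parent mount reports "slave" after prepareRoot() has run, that matches the state this issue argues is sufficient for pivot_root().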
Proposed Fix
Skip EPERM in rootfsParentMountPrivate() when running inside a user namespace:
if err == unix.EPERM && userns.RunningInUserNS() {
	return nil
}
Steps to reproduce the issue
Requires: root, kernel 4.x+, runc built from main branch.
# 1. Create an OCI bundle with busybox rootfs
mkdir -p /tmp/userns-test/rootfs
cd /tmp/userns-test
docker export $(docker create busybox) | tar -C rootfs -xf -
runc spec
# 2. Add user namespace and UID/GID mappings to config.json
jq '.linux.namespaces += [{"type": "user"}]
| .linux.uidMappings = [{"containerID": 0, "hostID": 100000, "size": 65536}]
| .linux.gidMappings = [{"containerID": 0, "hostID": 100000, "size": 65536}]
| .linux.rootfsPropagation = "rslave"
| .linux.devices = []
| .process.args = ["id"]
| .process.terminal = false' config.json > config.json.tmp && mv config.json.tmp config.json
# 3. Remap rootfs ownership to mapped UID range
chown -R 100000:100000 rootfs/
# 4. Make the parent mount shared (simulates containerd environment)
mount --make-rshared /
# 5. Run the container — this will fail with EPERM
runc run test-userns
Expected: container runs and prints uid=0(root) gid=0(root)
Describe the results you received and expected
runc create failed: unable to start container process: error during container init:
error preparing rootfs: remount-private dst=/run/containerd/.../rootfs,
flags=MS_PRIVATE: operation not permitted
What version of runc are you using?
runc: v1.5.0
Host OS information
$ cat /etc/os-release
PRETTY_NAME="Ubuntu 22.04.5 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.5 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy
Host kernel information
Kernel: 5.15 (also reproducible on 4.18+)