Description
On cgroup v1, runc exec --cgroup <controller>:<subpath> is intended to place the exec process into an existing sub-cgroup below the container's cgroup for the selected controller.
For per-controller cgroup v1 subpaths, addIntoCgroupV1() resolves the target path with path.Join(base, sub) and then checks containment with strings.HasPrefix(cgPath, base). This is not a safe path containment check because it does not check path-component boundaries.
For example, if the container's freezer cgroup is:
/sys/fs/cgroup/freezer/runc-poc-cgroup/foo
and the requested exec cgroup is:
freezer:../foo2
then path.Join(base, sub) resolves the target to:
/sys/fs/cgroup/freezer/runc-poc-cgroup/foo2
This is a sibling of foo, not a child of foo. However, it still has the string prefix:
/sys/fs/cgroup/freezer/runc-poc-cgroup/foo
so strings.HasPrefix(cgPath, base) accepts it.
As a result, runc exec --cgroup freezer:../foo2 can place the exec process into an existing prefix-colliding sibling cgroup instead of rejecting the subpath as outside the container cgroup subtree.
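The prefix collision can be checked in isolation with a small standalone program. This is only a sketch: prefixCheck is a hypothetical helper that mirrors the join-then-HasPrefix pattern described above, and it is not runc's actual function.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// prefixCheck is a hypothetical stand-in for the check described in this
// report: join base and sub, then compare raw string prefixes.
// It returns the resolved path and whether the check accepts it.
func prefixCheck(base, sub string) (string, bool) {
	cgPath := path.Join(base, sub)
	return cgPath, strings.HasPrefix(cgPath, base)
}

func main() {
	base := "/sys/fs/cgroup/freezer/runc-poc-cgroup/foo"
	// "child" is a real descendant, "../foo2" is the prefix-colliding
	// sibling, and "../bar" is a sibling that shares no string prefix.
	for _, sub := range []string{"child", "../foo2", "../bar"} {
		cgPath, ok := prefixCheck(base, sub)
		fmt.Printf("sub=%q resolved=%s accepted=%v\n", sub, cgPath, ok)
	}
}
```

The child and ../foo2 are both accepted, while ../bar is rejected. The check therefore only blocks siblings whose names do not start with the container cgroup's name.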
Relevant code path:
libcontainer/process_linux.go
setnsProcess.addIntoCgroupV1()
Relevant code pattern:
cgPath := path.Join(base, sub)
if !strings.HasPrefix(cgPath, base) {
	return fmt.Errorf("bad sub cgroup path: %s", sub)
}
paths[ctrl] = cgPath
Steps to reproduce the issue
Prerequisites:
- Run as root.
- The repro needs runc to see a cgroup v1 layout with the freezer and devices controllers mounted.
- If the host already uses a cgroup v1 /sys/fs/cgroup layout, skip step 1.
1. On a cgroup v2 host, create a temporary cgroup v1 view
Simply mounting freezer and devices under the existing cgroup v2 root is not enough, because runc will still detect and use its cgroup v2 manager. Use a private mount namespace and rebuild /sys/fs/cgroup there.
unshare --mount --propagation private bash
Run the following commands inside the new root shell:
mount --make-rprivate /
umount -l /sys/fs/cgroup/freezer 2>/dev/null || true
umount -l /sys/fs/cgroup/devices 2>/dev/null || true
umount -l /sys/fs/cgroup 2>/dev/null || true
mount -t tmpfs cgroup_root /sys/fs/cgroup
mkdir -p /sys/fs/cgroup/freezer /sys/fs/cgroup/devices
mount -t cgroup -o freezer none /sys/fs/cgroup/freezer
mount -t cgroup -o devices none /sys/fs/cgroup/devices
findmnt -R /sys/fs/cgroup -o TARGET,FSTYPE,OPTIONS
Expected shape:
TARGET FSTYPE OPTIONS
/sys/fs/cgroup tmpfs ...
├─/sys/fs/cgroup/freezer cgroup ... freezer
└─/sys/fs/cgroup/devices cgroup ... devices
2. Prepare variables
export RUNC=/path/to/runc
export LAB=/tmp/runc-cgroup-prefix
export CID=runc-cgroup-prefix-demo
export BUNDLE="$LAB/bundle"
export ROOT="$LAB/runc-root"
rm -rf "$LAB"
mkdir -p "$BUNDLE/rootfs" "$ROOT"
3. Create a minimal OCI bundle
mount --bind / "$BUNDLE/rootfs"
cat > "$BUNDLE/config.json" <<'JSON'
{
  "ociVersion": "1.0.2",
  "process": {
    "terminal": false,
    "user": {"uid": 0, "gid": 0},
    "args": ["/bin/sleep", "600"],
    "env": ["PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"],
    "cwd": "/"
  },
  "root": {"path": "rootfs", "readonly": false},
  "mounts": [
    {"destination": "/proc", "type": "proc", "source": "proc", "options": ["nosuid", "noexec", "nodev"]},
    {"destination": "/dev", "type": "tmpfs", "source": "tmpfs", "options": ["nosuid", "strictatime", "mode=755", "size=65536k"]}
  ],
  "linux": {
    "cgroupsPath": "runc-poc-cgroup/foo",
    "namespaces": [
      {"type": "pid"},
      {"type": "ipc"},
      {"type": "uts"},
      {"type": "mount"}
    ],
    "maskedPaths": [],
    "readonlyPaths": []
  }
}
JSON
The /dev tmpfs is only to avoid inheriting host /dev/fd symlinks from the bind-mounted rootfs. It is not part of the cgroup issue.
4. Start the container in cgroup foo
cd "$BUNDLE"
"$RUNC" --root "$ROOT" run -d "$CID"
Confirm that the container is in the v1 freezer cgroup foo:
"$RUNC" --root "$ROOT" exec "$CID" /bin/sh -c 'cat /proc/self/cgroup'
The hierarchy numbers in the output may differ, but the paths should include:
devices:/runc-poc-cgroup/foo
freezer:/runc-poc-cgroup/foo
5. Create a prefix-colliding sibling cgroup
Create foo2 next to foo. It is not under foo.
mkdir -p /sys/fs/cgroup/freezer/runc-poc-cgroup/foo2
find /sys/fs/cgroup/freezer/runc-poc-cgroup -maxdepth 1 -type d -print
Expected shape:
/sys/fs/cgroup/freezer/runc-poc-cgroup
/sys/fs/cgroup/freezer/runc-poc-cgroup/foo
/sys/fs/cgroup/freezer/runc-poc-cgroup/foo2
6. Execute with an escaped per-controller sub-cgroup value
"$RUNC" --root "$ROOT" exec --cgroup freezer:../foo2 "$CID" /bin/sh -c \
'cat /proc/self/cgroup; grep -q "freezer:.*/runc-poc-cgroup/foo2" /proc/self/cgroup && echo "REPRODUCED"'
Describe the results you received and expected
Actual result:
devices:/runc-poc-cgroup/foo
freezer:/runc-poc-cgroup/foo2
REPRODUCED
The hierarchy numbers may differ, but the important part is that the exec process joined /runc-poc-cgroup/foo2 for the freezer controller.
Expected result:
runc exec --cgroup freezer:../foo2 ... should be rejected because foo2 is outside the container's freezer cgroup subtree:
/sys/fs/cgroup/freezer/runc-poc-cgroup/foo
The exec process should only be allowed to join an existing sub-cgroup below the container cgroup, not a sibling cgroup whose path merely shares the same string prefix.
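One way to get the expected behavior is to compare against the base path plus a trailing separator, so that a match can only end on a path-component boundary. The sketch below is an assumption about what a stricter check could look like. resolveSubCgroup is a hypothetical name, and this is not the upstream fix or runc's API. It is also purely lexical and does not resolve symlinks.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// resolveSubCgroup is a hypothetical helper (not part of runc). It joins
// base and sub, then accepts the result only if it is base itself or lies
// below base + "/".
func resolveSubCgroup(base, sub string) (string, error) {
	base = path.Clean(base)
	cgPath := path.Join(base, sub)
	// TrimSuffix handles base == "/", where adding "/" would give "//".
	prefix := strings.TrimSuffix(base, "/") + "/"
	if cgPath != base && !strings.HasPrefix(cgPath, prefix) {
		return "", fmt.Errorf("bad sub cgroup path: %s", sub)
	}
	return cgPath, nil
}

func main() {
	base := "/sys/fs/cgroup/freezer/runc-poc-cgroup/foo"
	for _, sub := range []string{"child", "../foo2", "child/../../foo2"} {
		cgPath, err := resolveSubCgroup(base, sub)
		fmt.Printf("sub=%q path=%q err=%v\n", sub, cgPath, err)
	}
}
```

With this check, child resolves below foo and is accepted, while ../foo2 and child/../../foo2 both resolve to the sibling foo2 and are rejected.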
What version of runc are you using?
runc version 1.5.0-rc.1+dev
commit: v1.5.0-rc.1-140-g122fb7a6
Host OS information
PRETTY_NAME="Ubuntu 24.04.4 LTS"
NAME="Ubuntu"
VERSION_ID="24.04"
VERSION="24.04.4 LTS (Noble Numbat)"
VERSION_CODENAME=noble
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=noble
LOGO=ubuntu-logo
Host kernel information
Linux ubuntu2404 6.17.0-35-generic #35~24.04.1-Ubuntu SMP PREEMPT_DYNAMIC Tue May 26 19:30:42 UTC 2 x86_64 x86_64 x86_64 GNU/Linux