runc exec --cgroup can join a prefix-colliding sibling cgroup on cgroup v1 #5351

@Ftres

Description

On cgroup v1, runc exec --cgroup <controller>:<subpath> is intended to place the exec process into an existing sub-cgroup below the container's cgroup for the selected controller.

For per-controller cgroup v1 subpaths, addIntoCgroupV1() resolves the target path with path.Join(base, sub) and then checks containment with strings.HasPrefix(cgPath, base). This is not a safe path containment check because it does not check path-component boundaries.

For example, if the container's freezer cgroup is:

/sys/fs/cgroup/freezer/runc-poc-cgroup/foo

and the requested exec cgroup is:

freezer:../foo2

then path.Join(base, sub) resolves the target to:

/sys/fs/cgroup/freezer/runc-poc-cgroup/foo2

This is a sibling of foo, not a child of foo. However, it still has the string prefix:

/sys/fs/cgroup/freezer/runc-poc-cgroup/foo

so strings.HasPrefix(cgPath, base) accepts it.

As a result, runc exec --cgroup freezer:../foo2 can place the exec process into an existing prefix-colliding sibling cgroup instead of rejecting the subpath as outside the container cgroup subtree.

Relevant code path:

libcontainer/process_linux.go
setnsProcess.addIntoCgroupV1()

Relevant code pattern:

cgPath := path.Join(base, sub)
if !strings.HasPrefix(cgPath, base) {
        return fmt.Errorf("bad sub cgroup path: %s", sub)
}
paths[ctrl] = cgPath
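
The pattern above can be exercised outside runc. The following is a minimal sketch, not runc code: prefixCheck is a hypothetical helper that only mirrors the join-then-HasPrefix logic quoted above, using the example paths from this report.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// prefixCheck is a hypothetical stand-in for the quoted pattern:
// join base and sub, then compare raw string prefixes.
// It returns the resolved path and whether the check accepts it.
func prefixCheck(base, sub string) (string, bool) {
	cgPath := path.Join(base, sub)
	return cgPath, strings.HasPrefix(cgPath, base)
}

func main() {
	base := "/sys/fs/cgroup/freezer/runc-poc-cgroup/foo"
	// "bar" is a real child, "../foo2" is the prefix-colliding sibling,
	// "../other" is a sibling that does not share the string prefix.
	for _, sub := range []string{"bar", "../foo2", "../other"} {
		resolved, accepted := prefixCheck(base, sub)
		fmt.Printf("sub=%q resolved=%s accepted=%v\n", sub, resolved, accepted)
	}
}
```

With these inputs, the sibling ../other is rejected while ../foo2 is accepted, which shows that the outcome depends on the sibling's name rather than on where it sits in the hierarchy.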

Steps to reproduce the issue

Prerequisites:

  • Run as root.
  • The repro needs runc to see a cgroup v1 layout with the freezer and devices controllers mounted.
  • If the host already uses a cgroup v1 /sys/fs/cgroup layout, skip step 1.

1. On a cgroup v2 host, create a temporary cgroup v1 view

Simply mounting freezer and devices under the existing cgroup v2 root is not enough, because runc will still detect and use its cgroup v2 manager. Use a private mount namespace and rebuild /sys/fs/cgroup there.

unshare --mount --propagation private bash

Run the following commands inside the new root shell:

mount --make-rprivate /

umount -l /sys/fs/cgroup/freezer 2>/dev/null || true
umount -l /sys/fs/cgroup/devices 2>/dev/null || true
umount -l /sys/fs/cgroup 2>/dev/null || true

mount -t tmpfs cgroup_root /sys/fs/cgroup
mkdir -p /sys/fs/cgroup/freezer /sys/fs/cgroup/devices
mount -t cgroup -o freezer none /sys/fs/cgroup/freezer
mount -t cgroup -o devices none /sys/fs/cgroup/devices

findmnt -R /sys/fs/cgroup -o TARGET,FSTYPE,OPTIONS

Expected shape:

TARGET                   FSTYPE OPTIONS
/sys/fs/cgroup           tmpfs  ...
├─/sys/fs/cgroup/freezer cgroup ... freezer
└─/sys/fs/cgroup/devices cgroup ... devices

2. Prepare variables

export RUNC=/path/to/runc
export LAB=/tmp/runc-cgroup-prefix
export CID=runc-cgroup-prefix-demo
export BUNDLE="$LAB/bundle"
export ROOT="$LAB/runc-root"

rm -rf "$LAB"
mkdir -p "$BUNDLE/rootfs" "$ROOT"

3. Create a minimal OCI bundle

mount --bind / "$BUNDLE/rootfs"

cat > "$BUNDLE/config.json" <<'JSON'
{
  "ociVersion": "1.0.2",
  "process": {
    "terminal": false,
    "user": {"uid": 0, "gid": 0},
    "args": ["/bin/sleep", "600"],
    "env": ["PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"],
    "cwd": "/"
  },
  "root": {"path": "rootfs", "readonly": false},
  "mounts": [
    {"destination": "/proc", "type": "proc", "source": "proc", "options": ["nosuid", "noexec", "nodev"]},
    {"destination": "/dev", "type": "tmpfs", "source": "tmpfs", "options": ["nosuid", "strictatime", "mode=755", "size=65536k"]}
  ],
  "linux": {
    "cgroupsPath": "runc-poc-cgroup/foo",
    "namespaces": [
      {"type": "pid"},
      {"type": "ipc"},
      {"type": "uts"},
      {"type": "mount"}
    ],
    "maskedPaths": [],
    "readonlyPaths": []
  }
}
JSON

The /dev tmpfs is only there to avoid inheriting host /dev/fd symlinks from the bind-mounted rootfs; it is not part of the cgroup issue.

4. Start the container in cgroup foo

cd "$BUNDLE"
"$RUNC" --root "$ROOT" run -d "$CID"

Confirm that the container is in the v1 freezer cgroup foo:

"$RUNC" --root "$ROOT" exec "$CID" /bin/sh -c 'cat /proc/self/cgroup'

The hierarchy numbers in the output may differ, but the paths should include:

devices:/runc-poc-cgroup/foo
freezer:/runc-poc-cgroup/foo

5. Create a prefix-colliding sibling cgroup

Create foo2 next to foo. It is not under foo.

mkdir -p /sys/fs/cgroup/freezer/runc-poc-cgroup/foo2
find /sys/fs/cgroup/freezer/runc-poc-cgroup -maxdepth 1 -type d -print

Expected shape:

/sys/fs/cgroup/freezer/runc-poc-cgroup
/sys/fs/cgroup/freezer/runc-poc-cgroup/foo
/sys/fs/cgroup/freezer/runc-poc-cgroup/foo2

6. Execute with a per-controller sub-cgroup value that escapes the container cgroup

"$RUNC" --root "$ROOT" exec --cgroup freezer:../foo2 "$CID" /bin/sh -c \
  'cat /proc/self/cgroup; grep -q "freezer:.*/runc-poc-cgroup/foo2" /proc/self/cgroup && echo "REPRODUCED"'

Describe the results you received and expected

Actual result:

devices:/runc-poc-cgroup/foo
freezer:/runc-poc-cgroup/foo2
REPRODUCED

The hierarchy numbers may differ, but the important part is that the exec process joined:

/runc-poc-cgroup/foo2

for the freezer controller.
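
Because the hierarchy numbers vary between hosts, it can help to compare only the controller and path fields. The sketch below assumes the cgroup v1 line format hierarchy-ID:controller-list:cgroup-path; controllerPath is a hypothetical helper, and the hierarchy IDs in the sample are invented for illustration.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// controllerPath scans /proc/<pid>/cgroup style content and returns the
// cgroup path recorded for the named v1 controller, if any.
func controllerPath(content, ctrl string) (string, bool) {
	sc := bufio.NewScanner(strings.NewReader(content))
	for sc.Scan() {
		// Split into at most three fields so that a path containing ':'
		// stays intact in the last field.
		parts := strings.SplitN(sc.Text(), ":", 3)
		if len(parts) != 3 {
			continue
		}
		// Co-mounted controllers share one line, separated by commas.
		for _, c := range strings.Split(parts[1], ",") {
			if c == ctrl {
				return parts[2], true
			}
		}
	}
	return "", false
}

func main() {
	// Sample content only; the IDs 3 and 2 are made up.
	sample := "3:devices:/runc-poc-cgroup/foo\n2:freezer:/runc-poc-cgroup/foo2\n"
	for _, ctrl := range []string{"devices", "freezer"} {
		p, ok := controllerPath(sample, ctrl)
		fmt.Printf("%s: path=%s found=%v\n", ctrl, p, ok)
	}
}
```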

Expected result:

runc exec --cgroup freezer:../foo2 ... should be rejected because foo2 is outside the container's freezer cgroup subtree:

/sys/fs/cgroup/freezer/runc-poc-cgroup/foo

The exec process should only be allowed to join an existing sub-cgroup below the container cgroup, not a sibling cgroup whose path merely shares the same string prefix.
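
One possible shape for a component-aware check is sketched below. This is an illustration under stated assumptions, not a proposed patch: isWithin and resolveSub are hypothetical names, and the sketch still accepts the base path itself, as the existing prefix check does.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// isWithin reports whether target is base itself or lies below base,
// comparing on path-component boundaries rather than raw string prefixes.
func isWithin(base, target string) bool {
	base = path.Clean(base)
	target = path.Clean(target)
	if target == base {
		return true
	}
	// Appending "/" makes "foo2" fail to match the prefix "foo/".
	return strings.HasPrefix(target, strings.TrimSuffix(base, "/")+"/")
}

// resolveSub joins base and sub and rejects results outside base.
func resolveSub(base, sub string) (string, error) {
	cgPath := path.Join(base, sub)
	if !isWithin(base, cgPath) {
		return "", fmt.Errorf("bad sub cgroup path: %s", sub)
	}
	return cgPath, nil
}

func main() {
	base := "/sys/fs/cgroup/freezer/runc-poc-cgroup/foo"
	for _, sub := range []string{"bar", "../foo/bar", "../foo2", "../.."} {
		p, err := resolveSub(base, sub)
		fmt.Printf("sub=%q path=%q err=%v\n", sub, p, err)
	}
}
```

Under this check, ../foo2 is rejected while bar and ../foo/bar both resolve to the same child of foo. A lexical check of this kind does not account for symlinks or for whether the target cgroup actually exists.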

What version of runc are you using?

runc version 1.5.0-rc.1+dev
commit: v1.5.0-rc.1-140-g122fb7a6

Host OS information

PRETTY_NAME="Ubuntu 24.04.4 LTS"
NAME="Ubuntu"
VERSION_ID="24.04"
VERSION="24.04.4 LTS (Noble Numbat)"
VERSION_CODENAME=noble
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=noble
LOGO=ubuntu-logo

Host kernel information

Linux ubuntu2404 6.17.0-35-generic #35~24.04.1-Ubuntu SMP PREEMPT_DYNAMIC Tue May 26 19:30:42 UTC 2 x86_64 x86_64 x86_64 GNU/Linux
