cgroup-parent puts BuildKit builds in the wrong cgroup #2903
Open
Contributing guidelines
- I've read the contributing guidelines and wholeheartedly agree
I've found a bug and checked that ...
- ... the documentation does not mention anything about my problem
- ... there are no open or closed issues that are related to my problem
Description
When cgroup-parent is set in /etc/docker/daemon.json, BuildKit builds are placed in the wrong parent cgroup, so limits configured on the intended parent do not take effect.
Expected behaviour
BuildKit builds should get put in the same parent cgroup as ordinary containers and non-BuildKit builds.
Actual behaviour
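BuildKit builds are placed under system.slice, in a cgroup whose name merely embeds the configured parent (see Additional info), instead of under docker_limit.slice like ordinary containers and non-BuildKit builds.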
Buildx version
github.com/docker/buildx v0.19.3 48d6a39
Docker info
Client: Docker Engine - Community
 Version: v27.4.1
 Context: default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
   Version: v0.19.3
   Path: /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
   Version: v2.32.1
   Path: /usr/libexec/docker/cli-plugins/docker-compose
Server:
 Containers: 0
  Running: 0
  Paused: 0
  Stopped: 0
 Images: 0
 Server Version: v27.4.1
 Storage Driver: overlay2
  Backing Filesystem: xfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 88bf19b2105c8b17560993bee28a01ddc2f97182
 runc version: v1.2.2-0-g7cb3632
 init version: de40ad0
 Security Options:
  seccomp
   Profile: builtin
  cgroupns
 Kernel Version: 5.14.0-503.21.1.el9_5.ppc64le
 Operating System: Red Hat Enterprise Linux 9.5 (Plow)
 OSType: linux
 Architecture: ppc64le
 CPUs: 144
 Total Memory: 123.6GiB
 Name: <redacted>
 ID: 28e41ca5-866a-47aa-a1ee-e4b2e5bf109e
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false
Builders list
NAME/NODE      DRIVER/ENDPOINT   STATUS    BUILDKIT   PLATFORMS
default*       docker
 \_ default     \_ default       running   v0.17.3    linux/ppc64le
Configuration
/etc/systemd/system/docker_limit.slice:
[Unit]
Description=Slice that limits docker resources
Before=slices.target
[Slice]
MemoryAccounting=true
MemoryLimit=64G
/etc/docker/daemon.json:
{
  "cgroup-parent": "docker_limit.slice"
}
Dockerfile:
FROM gcc
COPY ./use100gb.c .
RUN gcc use100gb.c -o use100gb && ./use100gb
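use100gb.c:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Parenthesized and typed as size_t so the macro expands safely in any expression. */
#define GB ((size_t)1024 * 1024 * 1024)

int main(void) {
    /* Allocate and touch 1 GiB at a time, up to 100 GiB in total. */
    for (int i = 1; i <= 100; ++i) {
        void *buf = malloc(GB);
        if (!buf) break;
        memset(buf, 'x', GB);  /* write to every page so the memory is really committed */
        printf("%d\n", i);
    }
    puts("Press Ctrl+C to exit");
    /* Keep the memory held so the cgroup of the process can be inspected. */
    for (;;) sleep(1);
}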
$ DOCKER_BUILDKIT=0 docker build . # gets OOM-killed, correctly
$ docker build . # takes up 100GB of memory, incorrectly
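To see whether the 64G limit from docker_limit.slice actually covers a given build process, one option is to walk up from that process's cgroup and report the tightest memory.max along the way. The following is only a sketch: effective_memory_max is a hypothetical helper name, and it assumes cgroup v2 mounted at /sys/fs/cgroup (passed in as the first argument so it can also be pointed at a test directory).

```shell
# effective_memory_max ROOT CGPATH
# Walk from ROOT/CGPATH up towards ROOT and print the smallest numeric
# memory.max found, or "max" if no cgroup on the way sets a limit.
# (Hypothetical helper, not part of Docker or BuildKit.)
effective_memory_max() {
    root=$1
    path=$2
    best=max
    while [ -n "$path" ] && [ "$path" != "/" ]; do
        file="$root$path/memory.max"
        if [ -r "$file" ]; then
            val=$(cat "$file")
            if [ "$val" != "max" ]; then
                if [ "$best" = "max" ] || [ "$val" -lt "$best" ]; then
                    best=$val
                fi
            fi
        fi
        path=${path%/*}   # drop the last path component
    done
    printf '%s\n' "$best"
}
```

Example use against a running build step, where PID is the process ID of the RUN command:

    effective_memory_max /sys/fs/cgroup "$(sed -n 's/^0:://p' /proc/PID/cgroup)"

A result of max would mean that no memory limit applies on that path.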
Build logs
Additional info
For ordinary containers and non-BuildKit builds, /proc/PID/cgroup contains something like this (correct):
0::/docker_limit.slice/docker-b9ba399c60c1a2407001ee90ee3307ee3104e2f1c1db25ec6337ab298fe7518f.scope
For BuildKit builds, /proc/PID/cgroup contains something like this (incorrect):
0::/system.slice/docker_limit.slice:docker:k4xghul4df75u16fk09bfyizr
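The difference between the two paths can be checked mechanically. The sketch below is illustrative only: cgroup_path and under_slice are hypothetical helper names, and the check assumes the cgroup v2 single-line "0::/path" format shown above.

```shell
# Print the cgroup v2 path from a /proc/PID/cgroup line such as "0::/a/b".
# (Hypothetical helper names, for illustration only.)
cgroup_path() {
    # The unified (v2) hierarchy entry starts with "0::".
    printf '%s\n' "$1" | sed -n 's/^0:://p'
}

# Succeed if the cgroup path lies below the given top-level slice,
# e.g. under_slice "$line" docker_limit.slice
under_slice() {
    case "$(cgroup_path "$1")" in
        "/$2"/*) return 0 ;;
        *)       return 1 ;;
    esac
}
```

Example use, where PID is the process to inspect:

    under_slice "$(grep '^0::' /proc/PID/cgroup)" docker_limit.slice && echo "limit applies"

With the two paths above, the check passes for the ordinary container and fails for the BuildKit build, because in the latter docker_limit.slice only appears as part of a cgroup name under system.slice.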