cgroup-parent puts BuildKit builds in the wrong cgroup #2903

Open
@josephcsible

Description

Contributing guidelines

I've found a bug and checked that ...

  • ... the documentation does not mention anything about my problem
  • ... there are no open or closed issues that are related to my problem

Description

When cgroup-parent is set in /etc/docker/daemon.json, BuildKit builds are placed in the wrong parent cgroup, so limits configured on the intended parent cgroup don't apply to them.

Expected behaviour

BuildKit builds should be placed in the same parent cgroup as ordinary containers and non-BuildKit builds.

Actual behaviour

BuildKit builds are placed under /system.slice, in a cgroup named docker_limit.slice:docker:<id>, rather than under /docker_limit.slice like ordinary containers and non-BuildKit builds.

Buildx version

github.com/docker/buildx v0.19.3 48d6a39

Docker info

Client: Docker Engine - Community
 Version:    v27.4.1
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.19.3
    Path:     /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.32.1
    Path:     /usr/libexec/docker/cli-plugins/docker-compose

Server:
 Containers: 0
  Running: 0
  Paused: 0
  Stopped: 0
 Images: 0
 Server Version: v27.4.1
 Storage Driver: overlay2
  Backing Filesystem: xfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 88bf19b2105c8b17560993bee28a01ddc2f97182
 runc version: v1.2.2-0-g7cb3632
 init version: de40ad0
 Security Options:
  seccomp
   Profile: builtin
  cgroupns
 Kernel Version: 5.14.0-503.21.1.el9_5.ppc64le
 Operating System: Red Hat Enterprise Linux 9.5 (Plow)
 OSType: linux
 Architecture: ppc64le
 CPUs: 144
 Total Memory: 123.6GiB
 Name: <redacted>
 ID: 28e41ca5-866a-47aa-a1ee-e4b2e5bf109e
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

Builders list

NAME/NODE     DRIVER/ENDPOINT   STATUS    BUILDKIT   PLATFORMS
default*      docker                                 
 \_ default    \_ default       running   v0.17.3    linux/ppc64le

Configuration

/etc/systemd/system/docker_limit.slice:

[Unit]
Description=Slice that limits docker resources
Before=slices.target

[Slice]
MemoryAccounting=true
MemoryLimit=64G

/etc/docker/daemon.json:

{
    "cgroup-parent": "docker_limit.slice"
}

Dockerfile:

FROM gcc
COPY ./use100gb.c .
RUN gcc use100gb.c -o use100gb && ./use100gb

use100gb.c:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define GB (1024UL * 1024 * 1024)

int main(void) {
    for(int i = 1; i <= 100; ++i) {
        void *buf = malloc(GB);
        if(!buf) break;
        memset(buf, 'x', GB);
        printf("%d\n", i);
    }
    puts("Press Ctrl+C to exit");
    for(;;) sleep(1);
}

$ DOCKER_BUILDKIT=0 docker build .  # gets OOM-killed, correctly
$ docker build .                    # takes up 100GB of memory, incorrectly
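
To confirm which memory limit actually applies to a build process, one option is to walk up the cgroup v2 tree from the process's cgroup and take the smallest memory.max along the way. The following is only a sketch: effective_memory_max and the CGROUP_ROOT override are hypothetical names introduced here for illustration, and it assumes a cgroup v2 hierarchy mounted at /sys/fs/cgroup.

```shell
# Hypothetical helper: print the smallest memory.max found between a cgroup
# and the top of the cgroup v2 hierarchy, or "max" if no level sets a limit.
# CGROUP_ROOT is an assumption for illustration; it defaults to the usual
# cgroup v2 mount point.
effective_memory_max() {
    local root="${CGROUP_ROOT:-/sys/fs/cgroup}"
    local path="$1"
    local best="max"
    local value
    while [ -n "$path" ] && [ "$path" != "/" ]; do
        if [ -r "$root$path/memory.max" ]; then
            value="$(cat "$root$path/memory.max")"
            if [ "$value" != "max" ]; then
                # Keep the tightest numeric limit seen so far.
                if [ "$best" = "max" ] || [ "$value" -lt "$best" ]; then
                    best="$value"
                fi
            fi
        fi
        path="${path%/*}"   # move up one level in the hierarchy
    done
    printf '%s\n' "$best"
}
```

It takes the path portion of the process's /proc/PID/cgroup entry, for example effective_memory_max /docker_limit.slice/docker-<id>.scope. If the slice limit is being applied, the result should be a number rather than "max".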

Build logs


Additional info

For ordinary containers and non-BuildKit builds, /proc/PID/cgroup contains something like this (correct):

0::/docker_limit.slice/docker-b9ba399c60c1a2407001ee90ee3307ee3104e2f1c1db25ec6337ab298fe7518f.scope

For BuildKit builds, /proc/PID/cgroup contains something like this (incorrect):

0::/system.slice/docker_limit.slice:docker:k4xghul4df75u16fk09bfyizr
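
The difference between the two entries above can be checked mechanically. This is a minimal sketch, not part of Docker or BuildKit: cgroup_parent and in_expected_parent are hypothetical helper names, and the code assumes a single cgroup v2 entry of the form 0::/path as shown above.

```shell
# Hypothetical helper: print the parent cgroup of a cgroup v2 entry
# as it appears in /proc/PID/cgroup.
cgroup_parent() {
    local entry="$1"
    local path="${entry#0::}"   # strip the cgroup v2 "0::" prefix
    local parent="${path%/*}"   # drop the last path component (the leaf cgroup)
    printf '%s\n' "${parent:-/}"
}

# Hypothetical helper: succeed only if the entry sits directly under the
# expected parent cgroup (for example "/docker_limit.slice").
in_expected_parent() {
    [ "$(cgroup_parent "$1")" = "$2" ]
}
```

For the two entries above, cgroup_parent prints /docker_limit.slice for the ordinary container and /system.slice for the BuildKit build, so in_expected_parent "$entry" /docker_limit.slice succeeds only for the former.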
