
Occasional slow startup times during Booting builder step #283

Open
@niodice

Description

Contributing guidelines

I've found a bug, and:

  • The documentation does not mention anything about my problem
  • There are no open or closed issues that are related to my problem

Description

About 25% of the time, the Booting builder step takes a long time to execute. I am running on Amazon EKS with self-hosted runners deployed using https://github.com/actions/actions-runner-controller. Before my runner becomes available to pick up a workflow, I can see that the Docker daemon has started successfully:

time="2023-10-25T21:08:30.983564648Z" level=info msg="Daemon has completed initialization"
time="2023-10-25T21:08:31.036285627Z" level=info msg="API listen on /var/run/docker.sock"

Expected behaviour

Consistent startup time

Actual behaviour

Startup is very slow at times.

Repository URL

No response

Workflow run URL

No response

YAML workflow

The relevant parts of the workflow are:


jobs:
  publish-docker:
    name: publish docker image
    runs-on: self-hosted
    steps:
      - uses: actions/checkout@v3
      - name: setup docker buildx
        uses: docker/setup-buildx-action@v3

Workflow logs

Wed, 25 Oct 2023 21:08:49 GMT
  /usr/local/bin/docker buildx inspect --bootstrap --builder builder-9ff3bc3c-4aa4-43d7-b3b0-9566551cacb9
Wed, 25 Oct 2023 21:08:49 GMT
  #1 [internal] booting buildkit
Wed, 25 Oct 2023 21:08:49 GMT
  #1 pulling image moby/buildkit:buildx-stable-1
Wed, 25 Oct 2023 21:08:57 GMT
  #1 pulling image moby/buildkit:buildx-stable-1 7.5s done
Wed, 25 Oct 2023 21:08:57 GMT
  #1 creating container buildx_buildkit_builder-9ff3bc3c-4aa4-43d7-b3b0-9566551cacb90
Wed, 25 Oct 2023 21:11:12 GMT
  #1 143.0 error: failed to list workers: Unavailable: connection error: desc = "transport: error while dialing: dial unix /run/buildkit/buildkitd.sock: connect: no such file or directory"
Wed, 25 Oct 2023 21:11:12 GMT
  #1 creating container buildx_buildkit_builder-9ff3bc3c-4aa4-43d7-b3b0-9566551cacb90 135.5s done
Wed, 25 Oct 2023 21:11:12 GMT
  #1 ERROR: exit code 1
Wed, 25 Oct 2023 21:11:12 GMT
  ------
Wed, 25 Oct 2023 21:11:12 GMT
   > [internal] booting buildkit:
Wed, 25 Oct 2023 21:11:12 GMT
  143.0 error: failed to list workers: Unavailable: connection error: desc = "transport: error while dialing: dial unix /run/buildkit/buildkitd.sock: connect: no such file or directory"
Wed, 25 Oct 2023 21:11:12 GMT
  ------
Wed, 25 Oct 2023 21:11:12 GMT
  Name:          builder-9ff3bc3c-4aa4-43d7-b3b0-9566551cacb9
Wed, 25 Oct 2023 21:11:12 GMT
  Driver:        docker-container
Wed, 25 Oct 2023 21:11:12 GMT
  Last Activity: 2023-10-25 21:08:49 +0000 UTC
Wed, 25 Oct 2023 21:11:12 GMT
  
Wed, 25 Oct 2023 21:11:12 GMT
  Nodes:
Wed, 25 Oct 2023 21:11:12 GMT
  Name:     builder-9ff3bc3c-4aa4-43d7-b3b0-9566551cacb90
Wed, 25 Oct 2023 21:11:12 GMT
  Endpoint: unix:///var/run/docker.sock
Wed, 25 Oct 2023 21:11:12 GMT
  Error:    Get "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/containers/buildx_buildkit_builder-9ff3bc3c-4aa4-43d7-b3b0-9566551cacb90/json": context deadline exceeded

The delay seems to be in the step that produces this log line:

  Error:    Get "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/containers/buildx_buildkit_builder-9ff3bc3c-4aa4-43d7-b3b0-9566551cacb90/json": context deadline exceeded
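
One way to check where the time goes is to compute the gaps between consecutive timestamps in the workflow log. The following is only a minimal sketch: it assumes the log strictly alternates between a timestamp line and a message line, as in the excerpt above, and the helper names `parse_entries` and `slowest_gap` are made up for illustration.

```python
from datetime import datetime

# Excerpt copied from the workflow log above.
# Assumption: every timestamp line is followed by exactly one message line.
LOG = """\
Wed, 25 Oct 2023 21:08:49 GMT
  #1 pulling image moby/buildkit:buildx-stable-1
Wed, 25 Oct 2023 21:08:57 GMT
  #1 pulling image moby/buildkit:buildx-stable-1 7.5s done
Wed, 25 Oct 2023 21:08:57 GMT
  #1 creating container buildx_buildkit_builder-9ff3bc3c-4aa4-43d7-b3b0-9566551cacb90
Wed, 25 Oct 2023 21:11:12 GMT
  #1 143.0 error: failed to list workers: Unavailable: connection error: desc = "transport: error while dialing: dial unix /run/buildkit/buildkitd.sock: connect: no such file or directory"
"""

TS_FORMAT = "%a, %d %b %Y %H:%M:%S GMT"


def parse_entries(log):
    """Pair each timestamp line with the message line that follows it."""
    lines = log.splitlines()
    return [
        (datetime.strptime(ts.strip(), TS_FORMAT), msg.strip())
        for ts, msg in zip(lines[0::2], lines[1::2])
    ]


def slowest_gap(entries):
    """Return (seconds, message_before, message_after) for the largest gap."""
    gaps = [
        ((later[0] - earlier[0]).total_seconds(), earlier[1], later[1])
        for earlier, later in zip(entries, entries[1:])
    ]
    return max(gaps, key=lambda gap: gap[0])


if __name__ == "__main__":
    seconds, before, after = slowest_gap(parse_entries(LOG))
    print(f"{seconds:.0f}s between:\n  {before}\n  {after}")
```

On this excerpt the largest gap is 135 seconds, between the `creating container` line and the first `failed to list workers` error, which matches the `135.5s done` reported for container creation.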

BuildKit logs

No response

Additional info

No response
