fix(ci): ensure Docker daemon is ready on Windows 2022 runners#2124

Open
nddq wants to merge 1 commit into main from fix/windows-docker-ci-flake

nddq (Member) commented Mar 20, 2026

Description

The "Build Agent Image - Windows 2022" job intermittently fails with:

failed to connect to the docker API at npipe:////./pipe/docker_engine;
open //./pipe/docker_engine: The system cannot find the file specified.

Unlike Linux runners, where Docker is managed by systemd and is reliably available, Windows runners start Docker Engine as a Windows Service, which can be slow to initialize and is sometimes not running at all when the job starts.

Add a Docker daemon readiness step that polls docker info with a 2-minute timeout, attempting to start the service if it's not running.
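
For readers who want the polling logic spelled out, here is a minimal Python sketch of the readiness check described above. It is an illustration only, not the step added in this PR. The function names, the 5-second polling interval, the per-command 30-second timeout, and the service name "docker" passed to sc.exe are all assumptions; only the docker info probe, the 2-minute overall timeout, and the attempt to start the service come from the description.

```python
import subprocess
import time
from typing import Callable


def docker_is_ready() -> bool:
    """Return True if `docker info` exits with status 0."""
    try:
        result = subprocess.run(
            ["docker", "info"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=30,  # assumed per-probe limit so one hung call cannot stall the loop
        )
    except (OSError, subprocess.TimeoutExpired):
        # docker CLI missing, or the probe itself hung
        return False
    return result.returncode == 0


def start_docker_service() -> None:
    """Best-effort start of the Windows service (service name "docker" is an assumption)."""
    try:
        subprocess.run(
            ["sc.exe", "start", "docker"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        # Ignore failures here; the polling loop decides the final outcome.
        pass


def wait_for_docker(
    check: Callable[[], bool] = docker_is_ready,
    start: Callable[[], None] = start_docker_service,
    timeout_s: float = 120.0,   # 2-minute overall timeout, per the description
    interval_s: float = 5.0,    # assumed polling interval
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `check` until it succeeds or `timeout_s` elapses."""
    deadline = clock() + timeout_s
    if check():
        return True
    # Daemon not reachable yet: try once to start the service, then poll.
    start()
    while clock() < deadline:
        sleep(interval_s)
        if check():
            return True
    return False
```

A CI step built on this sketch would call wait_for_docker() before any Docker command and fail the job when it returns False. The check, start, clock, and sleep parameters are injectable so that the loop can be exercised without a Docker installation.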

Related Issue

N/A — fixing flaky CI observed across multiple recent runs.

Checklist

  • I have read the contributing documentation.
  • I signed and signed-off the commits (git commit -S -s ...). See this documentation on signing commits.
  • I have correctly attributed the author(s) of the code.
  • I have tested the changes locally.
  • I have followed the project's style guidelines.
  • I have updated the documentation, if necessary.
  • I have added tests, if applicable.

Screenshots (if applicable) or Testing Completed

CI will validate — the Docker readiness step itself will be exercised on the next windows-2022 run.

Recent failures with the same root cause:

Additional Notes

N/A

nddq requested a review from a team as a code owner on March 20, 2026 01:05
nddq requested review from camrynl and snguyen64 on March 20, 2026 01:05

The "Build Agent Image - Windows 2022" job occasionally fails because
the Docker daemon is not ready when the job starts on GitHub-hosted
windows-2022 runners. Unlike Linux runners where Docker is managed by
systemd and reliably available, Windows runners start Docker Engine as
a Windows Service that can be slow to initialize.

Add a Docker daemon readiness step that polls for Docker availability
(with a 2-minute timeout) before any Docker commands run. The step
attempts to start the Docker service if it is not running, then polls
docker info until it succeeds.

Signed-off-by: Quang Nguyen <nguyenquang@microsoft.com>

nddq force-pushed the fix/windows-docker-ci-flake branch from 7da0b2b to 781e288 on March 20, 2026 01:11
nddq changed the title from "fix(ci): ensure Docker daemon is ready and switch to buildx on Windows runners" to "fix(ci): ensure Docker daemon is ready on Windows 2022 runners" on Mar 20, 2026

@github-actions

Retina Code Coverage Report

Total coverage decreased from 33.3% to 33.2%

Increased diff

Impacted Files Coverage
pkg/controllers/daemon/namespace/namespace_controller.go 76.24% ... 78.46% (2.22%) ⬆️

Decreased diff

Impacted Files Coverage
pkg/controllers/operator/retinaendpoint/retinaendpoint_controller.go 83.28% ... 82.25% (-1.03%) ⬇️

nddq requested a review from matmerr on March 20, 2026 22:22