Commit 4add51d

2.3.3: CI reliability: retry transient network failures in the JS image build and remove a SIGPIPE false-negative from the dind example tests (issue #104 / PR #105).

The JS image build (ubuntu/24.04/js/install.sh, COPY'd into every dind/language image) occasionally died on a single transient third-party error, with no retry: the lean/language build hit a flaky npm registry response during npm install -g npm@latest, and the dind-swift build hit playwright install … msedge … getting an invalid GPG key body from packages.microsoft.com ("gpg: no valid OpenPGP data found" → "Failed to install msedge"). Every network-bound build step (the npm self-update, the Playwright/Puppeteer CLI install, and the Playwright browser-binary download) now goes through a run_with_retry wrapper that retries with exponential backoff (mirroring apt_update_with_retry in common.sh, with the same overridable retry budget so it stays unit-testable). playwright install skips already-present browsers, so a retry only re-attempts the one that blipped. This is build-time resilience only: the resulting image is unchanged on success.

Separately, the dind example suite asserted on container logs with docker logs … | grep -q "needle". Under set -o pipefail, grep -q closes the pipe the instant it matches, which can deliver SIGPIPE to the still-streaming docker logs; pipefail then propagates that 141 and a present message reads as absent, failing the test spuriously (observed on the preload test even though the expected line was right there in the logs). tests/dind/lib.sh now provides a pipe-free logs_contain helper (capture once, match with a case glob) and all example assertions use it. Covered by new unit tests experiments/test-issue104-build-retry.sh and experiments/test-issue104-logs-contain.sh.

dind-box: warn when the nested daemon runs on the vfs storage driver (issue #104).

When the inner dockerd ends up on vfs, either pinned explicitly via DIND_STORAGE_DRIVER=vfs (e.g. for overlay-on-overlay compatibility) or reached as the last-resort auto-detect fallback, large images could fail to pull/run with a cryptic failed to register layer: no space left on device and **no hint** that the storage driver was the cause. vfs performs no copy-on-write: it stores every image layer as a full, independent copy, so a multi-GB image's on-disk footprint becomes the *sum* of all cumulative layer sizes (many times the image size), and a >30 GB image can overflow a disk with far more than 30 GB free (link-assistant/hive-mind#1914). This is observability, not a default change: vfs stays the safe fallback.

The entrypoint now emits a single, actionable warning right after the daemon becomes ready whenever the active driver is vfs, explaining the copy-on-write/disk implication and naming the DIND_STORAGE_DRIVER=fuse-overlayfs remediation (copy-on-write, works overlay-on-overlay, already shipped in the image). The remediation line adapts to whether /dev/fuse is present, so when it is missing it points at --privileged / --device /dev/fuse first. The DIND_STORAGE_DRIVER doc comment now spells out the vfs disk amplification too. Covered by a new unit test (experiments/test-issue104-vfs-warning.sh) and a new assertion in the CI-run tests/dind/example-storage-driver-vfs.sh; documented in docs/dind/USAGE.md.
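The retry behaviour described for the network-bound build steps could look roughly like the sketch below. It is not the repository's run_with_retry: the variable names RETRY_MAX_ATTEMPTS and RETRY_BASE_DELAY are hypothetical stand-ins for the "overridable retry budget" the message mentions, and the log wording is illustrative only.

```shell
# Minimal sketch of a retry wrapper with exponential backoff.
# ASSUMPTION: RETRY_MAX_ATTEMPTS and RETRY_BASE_DELAY are illustrative names,
# not the variables used by the real run_with_retry / apt_update_with_retry.
run_with_retry() {
  local max_attempts="${RETRY_MAX_ATTEMPTS:-5}"
  local delay="${RETRY_BASE_DELAY:-2}"
  local attempt=1

  while true; do
    # Run the wrapped command exactly as given; stop on first success.
    if "$@"; then
      return 0
    fi

    # Give up once the retry budget is exhausted.
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "run_with_retry: '$*' failed after ${attempt} attempt(s)" >&2
      return 1
    fi

    echo "run_with_retry: attempt ${attempt}/${max_attempts} failed; retrying in ${delay}s" >&2
    sleep "$delay"

    # Exponential backoff: double the wait before the next attempt.
    delay=$(( delay * 2 ))
    attempt=$(( attempt + 1 ))
  done
}
```

A build step would then be wrapped as, for example, `run_with_retry npm install -g npm@latest`. Making the budget overridable lets a unit test set the base delay to 0 and the attempt count to a small number, so the retry path runs without real waiting.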
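For the SIGPIPE false-negative, the "capture once, match with a case glob" approach can be sketched as follows. The function name logs_contain comes from the commit message, but its real signature and error handling in tests/dind/lib.sh are not shown here, so the two-argument form and the exit code 2 on a failed docker logs call are assumptions.

```shell
# Minimal sketch of a pipe-free log assertion.
# ASSUMPTION: the argument order (container, needle) and the return code 2
# for a failed `docker logs` call are illustrative, not the real helper's.
logs_contain() {
  local container="$1"
  local needle="$2"
  local logs

  # Capture the full log output once. There is no pipeline here, so no
  # reader can exit early and cause SIGPIPE (exit status 141) upstream.
  logs="$(docker logs "$container" 2>&1)" || return 2

  # Quoting "$needle" inside the pattern makes it match literally;
  # only the surrounding * act as wildcards.
  case "$logs" in
    *"$needle"*) return 0 ;;
    *) return 1 ;;
  esac
}
```

A test would call it as `logs_contain "$container" "needle" || fail "..."`, where `fail` stands for whatever failure reporter the suite uses. The trade-off is that the whole log is held in a shell variable, which is acceptable for short-lived example containers.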
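The vfs warning logic can be illustrated with a small function that takes the active driver name and adapts its remediation hint to whether the FUSE device exists. This is a sketch under assumptions: the function name vfs_warning, the FUSE_DEVICE override, and the message wording are all hypothetical and do not reproduce the entrypoint's actual text.

```shell
# Minimal sketch of a storage-driver warning.
# ASSUMPTION: vfs_warning, FUSE_DEVICE and the message text are illustrative;
# the real entrypoint's wording and structure may differ.
vfs_warning() {
  local driver="$1"
  # Overridable device path so the branch can be exercised in a unit test.
  local fuse_dev="${FUSE_DEVICE:-/dev/fuse}"

  # Any driver other than vfs needs no warning.
  [ "$driver" = "vfs" ] || return 0

  echo "WARNING: nested dockerd is using the vfs storage driver." >&2
  echo "  vfs has no copy-on-write: each layer is stored as a full copy, so large images can exhaust the disk." >&2

  if [ -e "$fuse_dev" ]; then
    echo "  Remedy: set DIND_STORAGE_DRIVER=fuse-overlayfs." >&2
  else
    echo "  Remedy: run with --privileged or --device /dev/fuse, then set DIND_STORAGE_DRIVER=fuse-overlayfs." >&2
  fi
}

# Usage once the daemon is ready (needs a running dockerd, so left commented):
#   vfs_warning "$(docker info --format '{{.Driver}}')"
```

Passing the driver name in as an argument keeps the function free of any daemon dependency, which is one way a check like this can be unit-tested without starting dockerd.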
1 parent 4f67188 commit 4add51d

3 files changed

Lines changed: 1 addition & 62 deletions

.changeset/issue-104-ci-reliability.md

Lines changed: 0 additions & 32 deletions
This file was deleted.

.changeset/issue-104-vfs-storage-driver-warning.md

Lines changed: 0 additions & 29 deletions
This file was deleted.

VERSION

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-2.3.2
+2.3.3