fix: Windows Docker build improvements #4

robobun · 2026-01-17T09:03:33Z

Summary

Split into two separate jobs: bootstrap and prebuilt
- Bootstrap job checks if image exists and skips rebuild if cached (completes in ~20s when cached)
- Prebuilt job depends on bootstrap and builds remaining stages
Pre-install node_modules in base stage BEFORE cmake runs
- Fixes "Cannot find module '@lezer/lr'" race condition
- The codegen scripts (cppbind.ts) now have all dependencies available when they run
Add Windows Defender exclusions and disable real-time monitoring
- Exclusions for C:\ProgramData\docker and D:\a
- Exclusions for docker.exe and dockerd.exe processes
Increase bootstrap timeout to 120 minutes
Use environment variables for registry and image name

Test Results

✅ Workflow run: https://github.com/oven-sh/bun-development-docker-image/actions/runs/21090665080

build-bootstrap: 20s (cached)
build-prebuilt: 1h37m25s
Total: ~1h38m (down from >6h timeout failures)

Key Fixes

Race Condition Fix

The main issue was that CMake's parallel build was running cppbind.ts (step 15) before bun install completed (step 17), causing:

error: Cannot find module '@lezer/lr' from 'C:\workspace\bun\node_modules\.bun\@[email protected]\node_modules\@lezer\cpp\dist\index.js'

Fixed by adding an explicit bun install step in the Dockerfile's base stage, ensuring dependencies are installed before the build runs.

Performance Optimizations

Disabling Windows Defender real-time monitoring
Adding exclusion paths for Docker directories
Docker system prune before builds

🤖 Generated with Claude Code

- Split into two separate jobs: bootstrap and prebuilt - Bootstrap job checks if image exists and skips rebuild if cached - Prebuilt job depends on bootstrap and builds remaining stages - Pre-install node_modules in base stage BEFORE cmake runs - Fixes "Cannot find module '@lezer/lr'" race condition - cppbind.ts codegen now has dependencies available - Add Windows Defender exclusions and disable real-time monitoring - Exclusions for C:\ProgramData\docker and D:\a - Exclusions for docker.exe and dockerd.exe processes - Increase bootstrap timeout to 120 minutes - Use environment variables for registry and image name Co-Authored-By: Claude Opus 4.5 <[email protected]>

coderabbitai · 2026-01-17T09:10:13Z

Walkthrough

Restructures the Windows build workflow to split jobs with explicit dependencies, introduces hash and date-driven image caching, implements bootstrap image caching logic, and improves space management and resilience. Updates Dockerfile.windows to pre-install node_modules in the base stage before the build step.

Changes

Cohort / File(s)	Summary
Workflow job orchestration and caching `.github/workflows/daily-build-windows.yml`	Splits build into `build-bootstrap` and `build-prebuilt` jobs with explicit dependencies; adds job outputs for `bootstrap_hash` and `date`. Introduces `REGISTRY` and `IMAGE_NAME` environment variables. Replaces hard-coded repository references with `IMAGE_NAME`-based tags across all build stages (bootstrap, base, prebuilt, artifacts, run). Adds bootstrap image caching logic with conditional skip branches. Implements dynamic hash/date-based tagging strategy. Extends timeouts and retry policies for increased resilience. Adds space pruning, Defender monitoring toggles, and cache cleanup steps. Introduces new test section validating prebuilt image with dynamically-built test Dockerfile.
Dockerfile dependencies `Dockerfile.windows`	Adds pre-installation step in base stage to run `bun install` with frozen lockfile for node_modules before cmake build, ensuring codegen dependencies are available. Updates build comment to reflect pre-installed node_modules.

Possibly related PRs

oven-sh/bun-development-docker-image#1 — Introduces the initial Windows workflow and Dockerfile that this PR refactors with job splitting, caching strategy, and dependency updates.

🚥 Pre-merge checks | ✅ 2

✅ Passed checks (2 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title 'fix: Windows Docker build improvements' directly relates to the main changes—improving Docker build reliability and performance on Windows by splitting jobs, fixing race conditions, and optimizing Defender settings.
Description check	✅ Passed	The description thoroughly explains the changeset: job splitting, node_modules pre-installation to fix race condition, Defender exclusions, timeout increases, and environment variables. It includes test results demonstrating significant performance improvements.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

.github/workflows/daily-build-windows.yml (1)
424-433: Good smoke test, consider cleanup of test artifacts.

The inline test Dockerfile and version check is a solid smoke test. Consider adding cleanup to remove the test image and Dockerfile.test after the test:
          docker build --isolation=process -f Dockerfile.test -t test-bun .
          docker run --isolation=process --rm test-bun
+          Remove-Item -Force Dockerfile.test -ErrorAction SilentlyContinue
+          docker rmi test-bun -f 2>$null
        shell: pwsh

🤖 Fix all issues with AI agents

In @.github/workflows/daily-build-windows.yml:
- Around line 220-263: The "Clear disk space" step is duplicated between the
build-bootstrap and build-prebuilt jobs; extract it into a reusable composite
action (e.g., .github/actions/cleanup-windows) that runs PowerShell (shell:
pwsh) and accepts an input list of paths and a flag for running docker prune so
each job can pass its specific paths; update the build-bootstrap and
build-prebuilt jobs to call this composite action (preserving behavior such as
Get-WmiObject queries, Directory::Delete with try/catch, Java* deletion, and
docker system prune) so the logic is centralized but still configurable per-job.
- Around line 50-59: The PowerShell step with id "check-bootstrap" incorrectly
assumes docker manifest inspect non-zero exits throw into the try/catch; update
the logic to run docker manifest inspect and then check $LASTEXITCODE (or $?) to
determine success instead of relying on try/catch, set $exists accordingly, and
then write EXISTS to GITHUB_OUTPUT; reference the step id "check-bootstrap" and
the command "docker manifest inspect" when making the change.
- Around line 203-211: The step "Get Bun repo commit SHA" (id: bun-sha)
currently assumes the gh api call returns a value; add error handling to
validate $sha is non-empty and that the gh command succeeded before computing
$shortSha and writing to $GITHUB_OUTPUT. Wrap the gh api call in a try/catch or
check its exit status, emit a clear error and exit non-zero if the API call
fails or $sha is empty, or optionally implement a retry/backoff and/or fallback
tag, and only echo "SHA=$shortSha" when $shortSha is a valid 7-character commit
SHA to avoid producing empty image tags.

coderabbitai · 2026-01-17T09:10:16Z

.github/workflows/daily-build-windows.yml

+      - name: Check if bootstrap image exists
+        id: check-bootstrap
+        run: |
+          $exists = $false
+          try {
+            docker manifest inspect ghcr.io/${{ env.IMAGE_NAME }}:bootstrap-amd64-${{ steps.bootstrap-hash.outputs.HASH }} 2>$null
+            $exists = $true
+          } catch { }
+          echo "EXISTS=$exists" >> $env:GITHUB_OUTPUT
+        shell: pwsh


⚠️ Potential issue | 🟡 Minor

The docker manifest inspect error handling may not work as expected.

In PowerShell, docker manifest inspect returning a non-zero exit code doesn't automatically throw an exception. The try/catch block will only catch .NET exceptions, not process exit codes.

🔧 Proposed fix using $LASTEXITCODE

- name: Check if bootstrap image exists id: check-bootstrap run: | $exists = $false - try { - docker manifest inspect ghcr.io/${{ env.IMAGE_NAME }}:bootstrap-amd64-${{ steps.bootstrap-hash.outputs.HASH }} 2>$null - $exists = $true - } catch { } + docker manifest inspect ghcr.io/${{ env.IMAGE_NAME }}:bootstrap-amd64-${{ steps.bootstrap-hash.outputs.HASH }} 2>$null + if ($LASTEXITCODE -eq 0) { + $exists = $true + } echo "EXISTS=$exists" >> $env:GITHUB_OUTPUT shell: pwsh

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

- name: Check if bootstrap image exists

id: check-bootstrap

run: |

$exists = $false

try {

docker manifest inspect ghcr.io/${{ env.IMAGE_NAME }}:bootstrap-amd64-${{ steps.bootstrap-hash.outputs.HASH }} 2>$null

$exists = $true

} catch { }

echo "EXISTS=$exists" >> $env:GITHUB_OUTPUT

shell: pwsh

- name: Check if bootstrap image exists

id: check-bootstrap

run: |

$exists = $false

docker manifest inspect ghcr.io/${{ env.IMAGE_NAME }}:bootstrap-amd64-${{ steps.bootstrap-hash.outputs.HASH }} 2>$null

if ($LASTEXITCODE -eq 0) {

$exists = $true

}

echo "EXISTS=$exists" >> $env:GITHUB_OUTPUT

shell: pwsh

🤖 Prompt for AI Agents

In @.github/workflows/daily-build-windows.yml around lines 50 - 59, The PowerShell step with id "check-bootstrap" incorrectly assumes docker manifest inspect non-zero exits throw into the try/catch; update the logic to run docker manifest inspect and then check $LASTEXITCODE (or $?) to determine success instead of relying on try/catch, set $exists accordingly, and then write EXISTS to GITHUB_OUTPUT; reference the step id "check-bootstrap" and the command "docker manifest inspect" when making the change.

coderabbitai · 2026-01-17T09:10:16Z

.github/workflows/daily-build-windows.yml

+      - name: Get Bun repo commit SHA
+        id: bun-sha
+        run: |
+          $sha = gh api repos/oven-sh/bun/commits/main --jq .sha
+          $shortSha = $sha.Substring(0, 7)
+          echo "SHA=$shortSha" >> $env:GITHUB_OUTPUT
+        shell: pwsh
+        env:
+          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}


⚠️ Potential issue | 🟡 Minor

Add error handling for the GitHub API call.

If the gh api call fails (rate limiting, network issues), $sha will be empty, causing downstream issues with image tags like base-amd64- (no SHA suffix).

🔧 Proposed fix with validation

- name: Get Bun repo commit SHA id: bun-sha run: | $sha = gh api repos/oven-sh/bun/commits/main --jq .sha + if (-not $sha) { + Write-Error "Failed to fetch Bun commit SHA" + exit 1 + } $shortSha = $sha.Substring(0, 7) echo "SHA=$shortSha" >> $env:GITHUB_OUTPUT shell: pwsh env: GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}

🤖 Prompt for AI Agents

In @.github/workflows/daily-build-windows.yml around lines 203 - 211, The step "Get Bun repo commit SHA" (id: bun-sha) currently assumes the gh api call returns a value; add error handling to validate $sha is non-empty and that the gh command succeeded before computing $shortSha and writing to $GITHUB_OUTPUT. Wrap the gh api call in a try/catch or check its exit status, emit a clear error and exit non-zero if the API call fails or $sha is empty, or optionally implement a retry/backoff and/or fallback tag, and only echo "SHA=$shortSha" when $shortSha is a valid 7-character commit SHA to avoid producing empty image tags.

coderabbitai · 2026-01-17T09:10:16Z

.github/workflows/daily-build-windows.yml

+      - name: Clear disk space
+        run: |
+          $disk = Get-WmiObject -Class Win32_LogicalDisk -Filter "DeviceID='C:'"
+          $totalGB = [math]::Round($disk.Size / 1GB, 2)
+          $usedGB = [math]::Round(($disk.Size - $disk.FreeSpace) / 1GB, 2)
+          Write-Host "Disk space before cleanup: $usedGB GB used / $totalGB GB total" -ForegroundColor Yellow
+
+          $paths = @(
+            "C:\Program Files\dotnet",
+            "C:\Program Files (x86)\dotnet",
+            "C:\Android",
+            "C:\Program Files (x86)\Android",
+            "$Env:AGENT_TOOLSDIRECTORY\CodeQL",
+            "$Env:AGENT_TOOLSDIRECTORY\Ruby",
+            "$Env:AGENT_TOOLSDIRECTORY\go",
+            "$Env:AGENT_TOOLSDIRECTORY\Python",
+            "$Env:AGENT_TOOLSDIRECTORY\PyPy",
+            "C:\Program Files\MySQL",
+            "C:\Program Files\PostgreSQL",
+            "C:\Rust",
+            "C:\vcpkg",
+            "C:\msys64",
+            "C:\Strawberry",
+            "C:\Program Files\Mozilla Firefox",
+            "C:\Julia",
+            "C:\Miniconda",
+            "C:\Program Files\R",
+            "C:\Program Files\Microsoft SQL Server"
+          )
+          foreach ($path in $paths) {
+            if (Test-Path $path) {
+              try { [System.IO.Directory]::Delete($path, $true) } catch { }
+            }
+          }
+          Get-ChildItem "$Env:AGENT_TOOLSDIRECTORY" -Filter "Java*" -Directory -ErrorAction SilentlyContinue | ForEach-Object {
+            try { [System.IO.Directory]::Delete($_.FullName, $true) } catch { }
+          }
+
+          docker system prune -af 2>$null
+
+          $diskAfter = Get-WmiObject -Class Win32_LogicalDisk -Filter "DeviceID='C:'"
+          $usedGBAfter = [math]::Round(($diskAfter.Size - $diskAfter.FreeSpace) / 1GB, 2)
+          Write-Host "Reclaimed $([math]::Round(($usedGB - $usedGBAfter), 2)) GB" -ForegroundColor Green
+        shell: pwsh


🧹 Nitpick | 🔵 Trivial

Consider extracting disk cleanup into a reusable composite action.

The cleanup logic is duplicated between build-bootstrap (lines 62-139) and build-prebuilt (lines 220-263). While the lists differ slightly, extracting this into a composite action would improve maintainability.

This is low priority since:

Each job runs on a separate ephemeral runner (duplication is necessary)

The lists are intentionally different (bootstrap needs more aggressive cleanup for VS install)

🤖 Prompt for AI Agents

In @.github/workflows/daily-build-windows.yml around lines 220 - 263, The "Clear disk space" step is duplicated between the build-bootstrap and build-prebuilt jobs; extract it into a reusable composite action (e.g., .github/actions/cleanup-windows) that runs PowerShell (shell: pwsh) and accepts an input list of paths and a flag for running docker prune so each job can pass its specific paths; update the build-bootstrap and build-prebuilt jobs to call this composite action (preserving behavior such as Get-WmiObject queries, Directory::Delete with try/catch, Java* deletion, and docker system prune) so the logic is centralized but still configurable per-job.

ScottBrenner · 2026-01-26T00:57:59Z

.github/workflows/daily-build-windows.yml

+
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v4


Suggested change

uses: actions/checkout@v4

uses: actions/checkout@v6

https://github.com/actions/checkout?tab=readme-ov-file#checkout-v6

coderabbitai bot reviewed Jan 17, 2026

View reviewed changes

ScottBrenner reviewed Jan 26, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: Windows Docker build improvements #4

fix: Windows Docker build improvements #4

Uh oh!

robobun commented Jan 17, 2026

Uh oh!

coderabbitai bot commented Jan 17, 2026

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Jan 17, 2026

Uh oh!

coderabbitai bot Jan 17, 2026

Uh oh!

coderabbitai bot Jan 17, 2026

Uh oh!

ScottBrenner Jan 26, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

fix: Windows Docker build improvements #4

Are you sure you want to change the base?

fix: Windows Docker build improvements #4

Uh oh!

Conversation

robobun commented Jan 17, 2026

Summary

Test Results

Key Fixes

Race Condition Fix

Performance Optimizations

Uh oh!

coderabbitai bot commented Jan 17, 2026

Walkthrough

Changes

Possibly related PRs

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 17, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 17, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 17, 2026

Choose a reason for hiding this comment

Uh oh!

ScottBrenner Jan 26, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants