-
-
Notifications
You must be signed in to change notification settings - Fork 512
image import: fallback to crictl pull when ctr import fails with missing digest #1634
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
image import: fallback to crictl pull when ctr import fails with missing digest #1634
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR adds a fallback mechanism to handle image import failures when using Docker Desktop with containerd image store. When ctr images import fails due to missing digest blobs, the system can fall back to using crictl pull to pull images directly into cluster nodes.
Key Changes:
- Adds a new
ImageImportModePullmode that bypasses tar export/import and directly pulls images into nodes - Enables automatic fallback to
crictl pullin auto-detect mode when tar import fails with missing digest errors - Adds improved error collection and reporting for both copy and import operations
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| pkg/types/types.go | Adds the ImageImportModePull constant and FallbackPull field to support the new pull-based import mode |
| pkg/client/tools.go | Implements fallback logic to detect missing digest errors and retry with crictl pull, plus enhanced error collection |
| pkg/client/pull.go | New file implementing direct pull-based image import functionality for all cluster nodes concurrently |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| ImportModeAutoDetect ImportMode = "auto" | ||
| ImportModeDirect ImportMode = "direct" | ||
| ImportModeToolsNode ImportMode = "tools-node" | ||
| ImageImportModePull ImportMode = "pull" |
Copilot
AI
Dec 14, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The constant name ImageImportModePull is inconsistent with the existing naming pattern. The other ImportMode constants follow the pattern ImportMode* (e.g., ImportModeAutoDetect, ImportModeDirect, ImportModeToolsNode), but this one is named ImageImportModePull. Consider renaming it to ImportModePull to maintain consistency with the existing constants in this type.
| if opts.FallbackPull && isMissingDigestErr(err) && len(imagesFromRuntime) > 0 { | ||
| l.Log().Warnf("tar import failed in node '%s' due to missing digest; falling back to pull", node.Name) | ||
|
|
||
| // pull each runtime image into this node | ||
| for _, img := range imagesFromRuntime { | ||
| pullCmd := []string{"crictl", "pull", img} | ||
| if pullErr := runtime.ExecInNode(ctx, node, pullCmd); pullErr != nil { | ||
| errCh <- fmt.Errorf("node %s: import failed (%v) and pull fallback failed for %s: %w", | ||
| node.Name, err, img, pullErr) | ||
| return | ||
| } | ||
| } | ||
| // fallback succeeded, so treat as success | ||
| return | ||
| } |
Copilot
AI
Dec 14, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fallback pull logic returns early without calling wg.Done(), which will cause the importWaitgroup.Wait() at line 185 to hang indefinitely. When the fallback succeeds and returns at line 177, or when it fails and returns at line 173, the WaitGroup counter is never decremented. This will cause a deadlock. The wg.Done() call should be executed before any return statement, or deferred at the beginning of the goroutine.
| if len(allErrs) > 0 { | ||
| // Go 1.20+: return errors.Join(allErrs...) | ||
| // Compatible fallback: | ||
| return fmt.Errorf("image import encountered %d error(s): %v", len(allErrs), allErrs) |
Copilot
AI
Dec 14, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error message format "image import encountered %d error(s): %v" will print the entire error slice in Go's default format (e.g., [error1 error2]), which may not be user-friendly. Consider formatting each error on a separate line or using a more structured approach to present multiple errors, similar to how errors.Join would format them.
| return fmt.Errorf("image import encountered %d error(s): %v", len(allErrs), allErrs) | |
| var sb strings.Builder | |
| for i, err := range allErrs { | |
| fmt.Fprintf(&sb, " %d. %v\n", i+1, err) | |
| } | |
| return fmt.Errorf("image import encountered %d error(s):\n%s", len(allErrs), sb.String()) |
| errs = append(errs, e) | ||
| } | ||
| if len(errs) > 0 { | ||
| return fmt.Errorf("pull-based import had %d error(s): %v", len(errs), errs) |
Copilot
AI
Dec 14, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error message format "pull-based import had %d error(s): %v" will print the entire error slice in Go's default format (e.g., [error1 error2]), which may not be user-friendly. Consider formatting each error on a separate line or using a more structured approach to present multiple errors.
| if err := runtime.ExecInNode(ctx, node, []string{"ctr", "image", "import", "--all-platforms", tarPath}); err != nil { | ||
| l.Log().Errorf("failed to import images in node '%s': %v", node.Name, err) |
Copilot
AI
Dec 14, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When ctr image import fails but the error is not a missing digest error (or when FallbackPull is false, or when there are no runtime images), the error is logged but wg.Done() is never called. This will cause the importWaitgroup.Wait() at line 185 to hang indefinitely. The wg.Done() should be called in all error paths, not just the success path. Consider using defer wg.Done() at the beginning of the goroutine to ensure it's always called.
| go func(n *k3d.Node, imageRef string) { | ||
| defer wg.Done() | ||
|
|
||
| // Use ctr; fully-qualified refs are safest |
Copilot
AI
Dec 14, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment says "Use ctr" but the code actually uses "crictl". The comment should be updated to match the implementation, or vice versa. Note that crictl and ctr are different tools with different behaviors - crictl is the CRI (Container Runtime Interface) CLI, while ctr is the containerd CLI.
| // Use ctr; fully-qualified refs are safest | |
| // Use crictl; fully-qualified refs are safest |
| func importWithPull(ctx context.Context, runtime runtimes.Runtime, cluster *k3d.Cluster, images []string) error { | ||
| var wg sync.WaitGroup | ||
| errCh := make(chan error, len(images)*len(cluster.Nodes)) | ||
|
|
||
| for _, node := range cluster.Nodes { | ||
| if node.Role != k3d.ServerRole && node.Role != k3d.AgentRole { | ||
| continue | ||
| } | ||
| for _, img := range images { | ||
| wg.Add(1) | ||
| go func(n *k3d.Node, imageRef string) { | ||
| defer wg.Done() | ||
|
|
||
| // Use ctr; fully-qualified refs are safest | ||
| cmd := []string{"crictl", "pull", imageRef} | ||
| if err := runtime.ExecInNode(ctx, n, cmd); err != nil { | ||
| errCh <- fmt.Errorf("node %s: pull %s failed: %w", n.Name, imageRef, err) | ||
| } | ||
| }(node, img) | ||
| } | ||
| } | ||
|
|
||
| wg.Wait() | ||
| close(errCh) | ||
|
|
||
| var errs []error | ||
| for e := range errCh { | ||
| errs = append(errs, e) | ||
| } | ||
| if len(errs) > 0 { | ||
| return fmt.Errorf("pull-based import had %d error(s): %v", len(errs), errs) | ||
| } | ||
| return nil | ||
| } |
Copilot
AI
Dec 14, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new importWithPull function lacks test coverage. Given that the repository has test coverage for similar functions in tools_test.go (e.g., Test_findImages, Test_findRuntimeImage), this new function should also have corresponding tests to verify its behavior, especially error handling and the concurrent pull operations across multiple nodes.
| func isMissingDigestErr(err error) bool { | ||
| if err == nil { | ||
| return false | ||
| } | ||
| s := err.Error() | ||
| return strings.Contains(s, "content digest sha256:") && strings.Contains(s, "not found") | ||
| } |
Copilot
AI
Dec 14, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new isMissingDigestErr helper function lacks test coverage. Given that the repository has test coverage for functions in tools_test.go, this function should have tests to verify it correctly identifies missing digest errors and handles edge cases (e.g., nil errors, errors with partial matches to the pattern, etc.).
| var importErrs []error | ||
| for e := range errCh { | ||
| importErrs = append(importErrs, e) | ||
| } | ||
| allErrs := append(copyErrs, importErrs...) | ||
| if len(allErrs) > 0 { | ||
| // Go 1.20+: return errors.Join(allErrs...) | ||
| // Compatible fallback: | ||
| return fmt.Errorf("image import encountered %d error(s): %v", len(allErrs), allErrs) |
Copilot
AI
Dec 14, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error collection pattern (creating a buffered channel, collecting errors in goroutines, waiting, closing the channel, and draining it into a slice) is duplicated between this function and importWithPull in pull.go. Consider extracting this pattern into a shared helper function to reduce code duplication and improve maintainability.
What
Adds an optional fallback to
crictl pullwhenctr images importfailswith
content digest sha256:... not found.Why
When Docker Desktop uses the containerd image store,
k3d image importmay export tarballs referencing missing blobs, causing import failures
(see #1538).
This change:
--mode=pullScope
Related issues
Fixes / mitigates #1538