Add a new config option parallel_pull_as_fallback under
[pull_modes.parallel_pull_unpack] that enables parallel-pull as an
automatic fallback when lazy-load is the primary mode but no SOCI
index is found for an image.
Today, lazy-load and parallel-pull are mutually exclusive at the daemon
level. When lazy-load is enabled and no SOCI index exists, the
snapshotter defers to the container runtime's sequential pull, which is
5-40% slower than Docker. This forces operators to choose between
optimal performance for indexed images (lazy-load) or non-indexed
images (parallel-pull), but not both.
With parallel_pull_as_fallback = true (and enable = false), the
snapshotter first attempts lazy-load for every image. Only when no
SOCI index is found does it fall back to parallel-pull instead of the
slow sequential path. This gives optimal performance for both cases:
- Images WITH a SOCI index: lazy-load (83-96% faster than Docker)
- Images WITHOUT a SOCI index: parallel-pull (14-50% faster than Docker)
The option defaults to false, preserving existing behavior for all
current users. When enable = true, the fallback is a no-op since
parallel-pull is already the primary mode.
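
The selection rule described above can be sketched in Go. This is only an illustration, not the snapshotter's actual code: the type `ParallelPullConfig`, its field names, the `PullMode` constants, and the function `selectPullMode` are all hypothetical. The two booleans stand in for `enable` and `parallel_pull_as_fallback` (named `experimental_parallel_pull_as_fallback` in the docs changed by this commit).

```go
package main

import "fmt"

// PullMode is a hypothetical label for the path an image pull takes.
type PullMode string

const (
	LazyLoad       PullMode = "lazy-load"
	ParallelPull   PullMode = "parallel-pull"
	SequentialPull PullMode = "sequential-pull"
)

// ParallelPullConfig mirrors the two booleans from the commit message.
// The Go names are assumptions made for this sketch.
type ParallelPullConfig struct {
	Enable                 bool // enable
	ParallelPullAsFallback bool // parallel_pull_as_fallback
}

// selectPullMode returns the mode an image would use, given the config
// and whether a SOCI index was found for the image.
func selectPullMode(cfg ParallelPullConfig, hasSOCIIndex bool) PullMode {
	if cfg.Enable {
		// Parallel-pull is the primary mode; the fallback flag is a no-op.
		return ParallelPull
	}
	if hasSOCIIndex {
		return LazyLoad
	}
	if cfg.ParallelPullAsFallback {
		// No index found: use parallel-pull instead of the sequential path.
		return ParallelPull
	}
	// Existing default: defer to the container runtime's sequential pull.
	return SequentialPull
}

func main() {
	cfg := ParallelPullConfig{Enable: false, ParallelPullAsFallback: true}
	for _, indexed := range []bool{true, false} {
		fmt.Printf("index=%v -> %s\n", indexed, selectPullMode(cfg, indexed))
	}
}
```

With both flags false the sketch reproduces the pre-existing behavior, which is why the option can default to false without affecting current users.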
Tested on AL2 and AL2023 instances with both small (nginx) and large
(20GB+) container images. All existing unit tests pass.
Signed-off-by: deeppcs <deeppcs@amazon.com>
docs/config.md: 18 additions & 0 deletions
@@ -97,6 +97,24 @@ This set of variables must be at the top of your TOML file due to not belonging
 
 ## config/service.go
 
+## config/pull_modes.go
+
+### [pull_modes.soci_v1]
+- `enable` (bool) — Enables SOCI v1 index discovery via the OCI Referrers API. Default: false.
+
+### [pull_modes.soci_v2]
+- `enable` (bool) — Enables SOCI v2 index discovery via image manifest annotations. Default: true.
+
+### [pull_modes.parallel_pull_unpack]
+- `enable` (bool) — Enables parallel pull and unpack as the primary pull mode. When true, lazy-load is skipped entirely. Default: false.
+- `experimental_parallel_pull_as_fallback` (bool) — **[EXPERIMENTAL]** When true (and `enable` is false), uses parallel-pull as an automatic fallback when lazy-load is the primary mode but no SOCI index is found for an image. Requires containerd content store (unless `discard_unpacked_layers = true`). Lazy-load with the containerd content store may have garbage collection edge cases. See [#1843](https://github.com/awslabs/soci-snapshotter/issues/1843). Ignored when `enable` is true. Default: false.
+- `max_concurrent_downloads` (int) — Max concurrent downloads across all images. -1 for unlimited. Default: -1.
+- `max_concurrent_downloads_per_image` (int) — Max concurrent downloads per image. Default: 3.
+- `concurrent_download_chunk_size` (string) — Size of each download chunk (e.g. "8mb", "16mb"). Empty means full layer. Default: "".
+- `max_concurrent_unpacks` (int) — Max concurrent unpacks across all images. -1 for unlimited. Default: -1.
+- `max_concurrent_unpacks_per_image` (int) — Max concurrent unpacks per image. Default: 1.
+- `discard_unpacked_layers` (bool) — Discard layer blobs after unpacking to save disk space. Default: false.
+
 ### [snapshotter]
 - `min_layer_size` (int) — Sets the minimum threshold for lazy loading a layer. Any layer smaller than this value will ignore the zTOC for the layer and pull the entire layer ahead of time. We generally recommend setting it to 10MiB (10000000). Default: 0.
 - `allow_invalid_mounts_on_restart` (bool) — Allows the snapshotter to start even if preexisting snapshots cannot connect to their data source on startup. Useful on unexpected daemon crashes/corruption. Default: false.
docs/parallel-mode.md: 26 additions & 0 deletions
@@ -94,6 +94,32 @@ If you have any questions or need further assistance, please don't hesitate to r
 
 ## Known Limitations
 
+### Parallel Pull as Fallback for Lazy-Load
+
+When using lazy-load as the primary pull mode (`soci_v2.enable = true` or `soci_v1.enable = true`), images without a SOCI index normally fall back to the container runtime's sequential pull, which can be slower than a standard image pull. To avoid this behavior, you can enable `experimental_parallel_pull_as_fallback`:
+
+```toml
+[pull_modes.soci_v2]
+enable = true
+
+[pull_modes.parallel_pull_unpack]
+enable = false
+experimental_parallel_pull_as_fallback = true
+max_concurrent_downloads_per_image = 10
+concurrent_download_chunk_size = "16mb"
+max_concurrent_unpacks_per_image = 10
+discard_unpacked_layers = true
+```
+
+With this configuration:
+- Images **with** a SOCI index use lazy-load
+- Images **without** a SOCI index use parallel-pull
+- No image falls through to the slow sequential containerd pull
+
+> **EXPERIMENTAL**: This option requires the containerd content store (`type = "containerd"` under `[content_store]`) for both lazy-load and parallel-pull. Lazy-load with the containerd content store may have garbage collection edge cases and does not carry the same stability guarantees as using either mode independently. See [#1843](https://github.com/awslabs/soci-snapshotter/issues/1843) for details.
+
+Note: `experimental_parallel_pull_as_fallback` is ignored when `enable = true`, since parallel-pull is already the primary mode in that case.
+
 ### Registries
 
 Any registry that supports ranged GET requests and has sufficient request limits should work with parallel pull mode. If a registry is rate limiting image pull requests, users can attempt to lower `max_concurrent_downloads` or `max_concurrent_downloads_per_image` and see if it alleviates the issue, however this will result in less of a performance benefit compared to regular pulling.
-	return nil, errors.New("parallel_pull_unpack mode requires containerd content store (type=\"containerd\" under [content_store])")
+	return nil, errors.New("parallel_pull_unpack mode requires containerd content store (type=\"containerd\" under [content_store] or discard_unpacked_layers = true)")
+	return nil, errors.New("experimental_parallel_pull_as_fallback requires containerd content store (type=\"containerd\" under [content_store] or discard_unpacked_layers = true)")
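
The surrounding conditions for these error returns are not visible in this excerpt. The following is a minimal sketch of one way such validation could look, assuming the check passes when the content store type is "containerd" or `discard_unpacked_layers` is true, as the messages and docs/config.md suggest. The `pullConfig` struct, its field names, the `validate` function, and the "soci" store type used in the usage example are assumptions for illustration, not the project's real types.

```go
package main

import (
	"errors"
	"fmt"
)

// pullConfig is a hypothetical, flattened view of the relevant settings.
type pullConfig struct {
	ContentStoreType       string // [content_store] type
	Enable                 bool   // parallel_pull_unpack.enable
	ParallelPullAsFallback bool   // experimental_parallel_pull_as_fallback
	DiscardUnpackedLayers  bool   // discard_unpacked_layers
}

// validate rejects configurations where a parallel-pull path is requested
// but neither condition named in the error messages holds.
func validate(c pullConfig) error {
	storeOK := c.ContentStoreType == "containerd" || c.DiscardUnpackedLayers
	if c.Enable && !storeOK {
		return errors.New("parallel_pull_unpack mode requires containerd content store (type=\"containerd\" under [content_store] or discard_unpacked_layers = true)")
	}
	// The fallback flag is ignored when enable is true, so only check it otherwise.
	if !c.Enable && c.ParallelPullAsFallback && !storeOK {
		return errors.New("experimental_parallel_pull_as_fallback requires containerd content store (type=\"containerd\" under [content_store] or discard_unpacked_layers = true)")
	}
	return nil
}

func main() {
	// Assumed non-containerd store type "soci", fallback on, layers kept: rejected.
	bad := pullConfig{ContentStoreType: "soci", ParallelPullAsFallback: true}
	fmt.Println(validate(bad) != nil)

	// Same store type, but unpacked layers are discarded: accepted.
	ok := pullConfig{ContentStoreType: "soci", ParallelPullAsFallback: true, DiscardUnpackedLayers: true}
	fmt.Println(validate(ok) == nil)
}
```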