Various features I need for bootc-delta#75
alexlarsson wants to merge 6 commits into containers:main
Codecov Report ❌
@@            Coverage Diff             @@
##             main      #75      +/-   ##
==========================================
+ Coverage   59.51%   69.01%   +9.49%
==========================================
  Files          10       10
  Lines        1035     1107      +72
==========================================
+ Hits          616      764     +148
+ Misses        305      231      -74
+ Partials      114      112       -2
Code Review
This pull request introduces support for Docker/OCI-style whiteout files and adds functionality to ignore specific source path prefixes during delta analysis. It refactors the analysis logic into a reusable SourceAnalysis structure, facilitating concurrent operations and pre-computation. A new flag for a custom temporary directory was also added. Feedback was provided regarding an inefficiency in how opaque whiteouts are processed, suggesting a single-pass approach to improve performance when handling multiple whiteouts.
We want the default to not be /tmp, because that is often a tmpfs and delta files can be large. NOTE: On win32, we can't use the hardcoded value.

Signed-off-by: Alexander Larsson <alexl@redhat.com>
We can't just use strings.HasPrefix(), because we want an ignored prefix of /foo to ignore /foo/bar, but not /foobar.

Signed-off-by: Alexander Larsson <alexl@redhat.com>
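A minimal sketch of the component-wise prefix check described above. The helper name `hasPathPrefix` and its signature are hypothetical, chosen for illustration; they are not taken from the PR's code.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// hasPathPrefix reports whether name is prefix itself or lies somewhere
// below it, comparing whole path components rather than raw bytes.
// Hypothetical helper for illustration; not the PR's actual function.
func hasPathPrefix(name, prefix string) bool {
	// Normalize both sides so "foo", "/foo" and "/foo/" compare equal.
	name = path.Clean("/" + name)
	prefix = path.Clean("/" + prefix)
	if prefix == "/" {
		return true // the root contains everything
	}
	if name == prefix {
		return true
	}
	// Appending "/" is what keeps /foobar from matching /foo.
	return strings.HasPrefix(name, prefix+"/")
}

func main() {
	for _, name := range []string{"/foo", "/foo/bar", "/foobar"} {
		fmt.Printf("%-8s ignored by /foo: %v\n", name, hasPathPrefix(name, "/foo"))
	}
}
```

Normalizing with `path.Clean` first means a trailing slash on the configured prefix does not change the result.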
This adds a new AnalyzeSources() function that gathers all the information for a set of sources, and a DiffWithSources() function that does the diff against those sources. With this setup you can diff multiple tar files against the same set of sources without repeating the source analysis. This is useful when diffing multi-layer OCI images. The old Diff() API is still there, and calls AnalyzeSources() internally.

Signed-off-by: Alexander Larsson <alexl@redhat.com>
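The analyze-once, diff-many pattern can be illustrated with a toy model. Everything below is an assumption made for the example: `sourceIndex`, `analyzeSources` and `reusable` are hypothetical stand-ins, not the signatures of AnalyzeSources() or DiffWithSources(), and exact-digest matching is a simplification of real delta matching.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// sourceIndex is a toy stand-in for a reusable source analysis: it maps a
// content digest to one source path holding that content. (Hypothetical.)
type sourceIndex struct {
	byDigest map[string]string
}

func digest(data []byte) string {
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:])
}

// analyzeSources does the expensive pass over the sources exactly once.
func analyzeSources(files map[string][]byte) *sourceIndex {
	paths := make([]string, 0, len(files))
	for p := range files {
		paths = append(paths, p)
	}
	sort.Strings(paths) // deterministic choice when contents repeat

	idx := &sourceIndex{byDigest: make(map[string]string, len(files))}
	for _, p := range paths {
		d := digest(files[p])
		if _, seen := idx.byDigest[d]; !seen {
			idx.byDigest[d] = p
		}
	}
	return idx
}

// reusable counts the entries of one layer whose content already exists
// in the sources. It only reads the index, so it can be called per layer.
func (idx *sourceIndex) reusable(layer map[string][]byte) int {
	n := 0
	for _, data := range layer {
		if _, ok := idx.byDigest[digest(data)]; ok {
			n++
		}
	}
	return n
}

func main() {
	sources := map[string][]byte{
		"/usr/bin/a": []byte("aaa"),
		"/usr/lib/b": []byte("bbb"),
	}
	idx := analyzeSources(sources) // computed once, shared by all layers

	layers := []map[string][]byte{
		{"/usr/bin/a": []byte("aaa"), "/etc/new": []byte("new")},
		{"/opt/b-copy": []byte("bbb")},
	}
	for i, layer := range layers {
		fmt.Printf("layer %d: %d of %d entries found in sources\n",
			i, idx.reusable(layer), len(layer))
	}
}
```

Because the per-layer step only reads the index, the cost of scanning the sources is paid once no matter how many layers are diffed.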
If this is specified, any files with this prefix will not be used as a delta source, which is useful when the set of source files is only partially available at apply time. In particular, in my work with bootc deltas, this is useful for layer files inside /sysroot/ostree, as they are not all stored on the bootc system.

Signed-off-by: Alexander Larsson <alexl@redhat.com>
If the tar files passed in are docker/oci style layers, then it makes sense to allow applying the whiteouts in them. That way, if a file is removed in a layer, and we have only the complete image mounted as a source of delta info (which is what we have in the bootc case), then we won't use this removed file as a delta source.

Signed-off-by: Alexander Larsson <alexl@redhat.com>
Force-pushed from e0b6e6b to 8436733.
This creates multi-file diffs, including some with whiteouts, and tries them both with and without whiteouts.

Signed-off-by: Alexander Larsson <alexl@redhat.com>
Force-pushed from 348c824 to e8b8330.
I got some feedback on my bootc-delta work and had to change the approach a bit (see alexlarsson/bootc-delta#1). Now I diff against the full container image, not just the ostree objects.
This means we now have to filter out /sysroot/ostree (whereas before we looked only at that).
Also, since we're diffing against the resulting image rather than the individual layers, we need to handle docker layer whiteouts.
Also, for better performance with many layers, I split the API up so that we can share the source analysis between different layers.