Skip to content

Make estargz compression-algorithm-agnostic and support zstd (a.k.a. zstd:chunked) #293

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Aug 30, 2021

Conversation

ktock
Copy link
Member

@ktock ktock commented Mar 26, 2021

This is the subset of #281 but re-designed.

Initially, eStargz is based on gzip compression. But, through zstd:chunked work, it turned out that eStargz is not limited to gzip compression and the same chunking & verifying & prefetching method can be applied to other compression algorithms as well (e.g. zstd).

This commit makes estargz pkg configurable and agnostic about compression algorithms. For supporting non-gzip compression, the user must implement estargz.Decompressor and estargz.Compressor interfaces and must plug them to estargz tools (e.g. estargz.Open and estargz.NewWriterWithCompression). estargz also provides test suite that is usable for testing these non-gzip eStargz implementations.

This commit comes with zstdchunked pkg that support zstd compression for eStargz (a.k.a. zstd:chunked), based on the above extensibility. zstdchunked pkg contains zstdchunked.Decompressor and zstdchunked.Compressor that allows estargz pkg to use zstd compression (i.e. zstd:chunked) instead of gzip.

Layer converter and filesystem now support zstd:chunked leveraging zstdchunked pkg. ctr-remote image optimize and ctr-remote image convert support --zstdchunked option that omits zstd-based eStargz and filesystem supports zstd-based eStargz layers by default.

In the future, we can support other compression algorithms (e.g. lz4 used by nydus?) as well.


cc @giuseppe @AkihiroSuda

@ktock
Copy link
Member Author

ktock commented Mar 26, 2021

@giuseppe WDYT? I know zstd:chunked is still under discussion (containers/storage#775) and that spec may change in the future. But this implementation will help to make that discussion forward?

@ktock ktock force-pushed the genericesgz branch 2 times, most recently from 3dfabd0 to 3aa157d Compare March 26, 2021 11:00
@AkihiroSuda
Copy link
Member

@giuseppe Could you take a look? 🙏

@giuseppe
Copy link

giuseppe commented Apr 8, 2021

sorry, I've missed the previous ping.

@ktock sure! And we can try to make it as close as possible to the estargz TOC to simplify using it in stargz-snaphshotter.

The only difference I am aware of is how the chunking is represented. Does the current metadata for files chunking work for you?

@ktock
Copy link
Member Author

ktock commented Apr 8, 2021

@giuseppe Metadata (TOC) implemented in containers/storage#775 looks good to me.
Could you add the following annotation to the result image for enabling chunk verification by stargz snapshotter?

  • containerd.io/snapshot/stargz/toc.digest = uncompressed digest of manifest (TOC) (sha256:deadbeef...)

@giuseppe
Copy link

giuseppe commented Apr 8, 2021

containerd.io/snapshot/stargz/toc.digest = uncompressed digest of manifest (TOC) (sha256:deadbeef...)

is it fine if it is named io.containers.zstd-chunked.manifest-checksum as it is currently?

@ktock
Copy link
Member Author

ktock commented Apr 8, 2021

Containerd currently doesn't allow snapshotter to handle annotations that are not prefixed by containerd.io/snapshot/. Additionally, io.containers.zstd-chunked.manifest-checksum seems stores "compressed" digest of TOC JSON but Stargz Snapshotter currently requires "uncompressed" digest of TOC JSON.

I also think we need standard spec for lazy-pulling-related annotations (currently discussed in opencontainers/image-spec#815).

)

var GetTOCDigestCommand = cli.Command{
Name: "gettocdigest",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: “get-toc-digest” might be more readable , but no strong opinion

@ktock ktock requested a review from AkihiroSuda August 30, 2021 02:13
@ktock ktock force-pushed the genericesgz branch 2 times, most recently from eddb783 to f7e1437 Compare August 30, 2021 03:24
return nil, err
}
if uncompressedDesc == nil {
return nil, errors.Errorf("unexpectedly got the same blob aftr compression (%s, %q)", desc.Digest, desc.MediaType)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo

newDesc := desc
if uncompress.IsUncompressedType(newDesc.MediaType) {
if images.IsDockerType(newDesc.MediaType) {
newDesc.MediaType += ".zstd"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really want to support zstd for DockerType?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, we don't. Fixed it to always convert to oci type.

…zstd:chunked)

Initially, eStargz is based on gzip compression. But, through zstd:chunked work,
it turned out that eStargz is not limited to gzip compression and the same
chunking & verifying & prefetching method can be applied to other compression
algorithms as well (e.g. zstd).

This commit makes `estargz` pkg configurable and agnostic about compression
algorithms. For supporting non-gzip compression, the user must implement
`estargz.Decompressor` and `estargz.Compressor` interfaces and must plug them to
`estargz` tools (e.g. `estargz.Open` and
`estargz.NewWriterWithCompression`). `estargz` also provides test suite that is
usable for testing these non-gzip eStargz implementations.

This commit comes with `zstdchunked` pkg that support zstd compression for
eStargz (a.k.a. zstd:chunked), based on the above extensibility. `zstdchunked`
pkg contains `zstdchunked.Decompressor` and `zstdchunked.Compressor` that allows
`estargz` pkg to use zstd compression (i.e. zstd:chunked) instead of gzip.

Layer converter and filesystem now support zstd:chunked leveraging `zstdchunked`
pkg. `ctr-remote image optimize` and `ctr-remote image convert` support
`--zstdchunked` option that omits zstd-based eStargz and filesystem supports
zstd-based eStargz layers by default.

Signed-off-by: Kohei Tokunaga <[email protected]>
@AkihiroSuda AkihiroSuda merged commit 8480338 into containerd:main Aug 30, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants