Add ZstdCompressor #180
Pull Request Test Coverage Report for Build 14317383070 (Coveralls) |
very much looking forward to both Zstd and the multithreading it brings. are there tests we could add to this PR to ensure it is thread safe? |
#181 would add some basic round trip tests, which should cover all the code in this PR. I'm not sure how to test if this is thread-safe, but in ChunkCodecLibZstd, there is no global state being mutated, and the underlying C library is supposed to be safe to use in multiple threads. |
Since this requires Julia 1.11 anyways, could we make this into a package extension and an optional dependency instead of a hard dependency? The main advantage for the merge strategy here is that we do not make Zarr.jl require Julia 1.11. I would at most be more comfortable making it require Julia 1.10. |
I'm happy to accept a PR to ChunkCodecLibZstd.jl to support Julia 1.10. Currently, the only 1.11 feature I am using is the public keyword. But is there a need to install the latest version of an in-development package on an old version of Julia? |
If the in-development package is "Zarr.jl", then yes. Julia 1.10 is the current long-term-support release, and I would expect upcoming releases of Zarr.jl to support Julia 1.10 for some time. Making "ChunkCodecLibZstd.jl" a mandatory dependency of Zarr.jl would prevent that. I am less concerned about support for Julia versions prior to Julia 1.10. For "ChunkCodecLibZstd.jl", dependence on Julia 1.11 is less of an issue as long as it is only an optional dependency of Zarr.jl. Compat.jl could be used to address the Julia version dependency. However, I still prefer codecs as optional dependencies when possible. If a convenience package, ZarrUniverse.jl for example, is needed that loads Zarr.jl and all optional dependencies, that would not be hard to accommodate. I will send a pull request. |
if it's just |
PR for using Compat.jl for |
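For readers following the Julia 1.10 discussion: one common way to keep a `public` declaration while still loading on older Julia versions is to gate it on `VERSION`. The sketch below is only an illustration of that idea, not the change made in the linked PR. The module name `HypotheticalCodec` and the function `roundtrip` are made up for the example, and the version bound is the one generally quoted for when the `public` keyword was introduced.

```julia
# Hypothetical module, used only to illustrate version-gating `public`.
module HypotheticalCodec

# `public` is only parsed on Julia 1.11 and later, so the statement is
# built from a string and evaluated only when the running Julia supports it.
# On Julia 1.10 this branch is skipped and the module still loads.
@static if VERSION >= v"1.11.0-DEV.469"
    eval(Meta.parse("public roundtrip"))
end

# Placeholder function standing in for a real codec entry point.
roundtrip(data::Vector{UInt8}) = copy(data)

end # module
```

Compat.jl offers a macro for the same purpose, which avoids writing the `VERSION` check by hand.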
I started to test the ZarrUniverse idea here:
using Pkg
Pkg.add(url="https://github.com/mkitti/Zarr.jl", rev="mkitti-zarr-universe", subdir="lib/ZarrUniverse")
or
|
I've updated the PR. It should work with Julia 1.10 now. Also, the new |
Looks good to me. I would be very interested to test whether this works in multithreaded settings, but that is definitely not a merge-stopper, since most other compressors break there as well.
This should be thread safe because it creates a new context for each compression and decompression call. I understand that @bjarthur has tested these changes under a multithreaded context. It would be great to see a test for this in the test suite. |
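A test along these lines could look roughly like the sketch below. It is an assumption about the shape of such a test, not code from this PR: `concurrent_roundtrip`, `standin_encode`, and `standin_decode` are hypothetical names, and the stand-in codec just reverses the bytes so the block runs without any compression package installed. In the real test suite the two function arguments would be replaced by the Zstd compress and decompress calls.

```julia
using Test
using Base.Threads: @spawn

# Stand-in "codec" (hypothetical): reversing twice restores the input.
# Swap these for the real compress/decompress functions under test.
standin_encode(data::Vector{UInt8}) = reverse(data)
standin_decode(data::Vector{UInt8}) = reverse(data)

# Spawn `ntasks` tasks that each round-trip their own random buffer.
# Returns true only if every task got its original bytes back.
function concurrent_roundtrip(encode, decode; ntasks::Int=16, nbytes::Int=4096)
    tasks = map(1:ntasks) do _
        @spawn begin
            data = rand(UInt8, nbytes)
            decode(encode(data)) == data
        end
    end
    return all(fetch, tasks)
end

@testset "concurrent round trip" begin
    @test concurrent_roundtrip(standin_encode, standin_decode)
end
```

The test only exercises real parallelism when Julia is started with more than one thread, for example `julia -t 4`. Passing does not prove thread safety, but a data race in shared state would likely show up as a mismatch or a crash under repeated runs.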
Looks good to me.
Minor points:
- I am wondering about the need for ChunkCodecCore to be a direct dependency here
- Moving codecs out to package extensions can be done earlier. What Fabian discussed was perhaps making Zarr.jl the all-encompassing package and then creating a core package that has optional dependencies.
Alternative to #149
This implementation supports multithreaded compression and decompression, and also supports the checksum option.
ChunkCodecLibZstd is being added as a direct dependency instead of a package extension, because Zarr.jl already depends on zstd through blosc.
One thing to note is that ChunkCodecLibZstd needs Julia 1.10 (originally 1.11), and the ChunkCodec API is still experimental. Any suggestions for improving the API would be helpful.