Write OME-Zarr 0.5 #310
cc @bentaculum
aliddell
left a comment
A question for you, but looks good.
@talonchandler @JoOkuma Here's a daxi2 volume converted so that it has 90x smaller chunks but 10x fewer files:
@ziw-liu thanks for your careful work on this. @aliddell Is the acquire-zarr API a drop-in replacement for the parts of the code that execute the file I/O? I haven't had a chance to read the code closely, but I am curious whether the Zarr layout generation and metadata creation are sufficiently separated from file I/O that we could improve the performance for sharded arrays using acquire-zarr.
@mattersoflight acquire-zarr is not a drop-in replacement, because it doesn't provide random-access writing. If iohub had a "streaming mode," it could fit there, but if you treat your Zarrs as in-memory arrays, it won't.
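To make the distinction in the comment above concrete, here is a toy sketch of the two write patterns. Both classes are hypothetical and written only for illustration; neither reflects the actual acquire-zarr or iohub API.

```python
class AppendOnlyWriter:
    """Hypothetical streaming writer: frames can only be added in arrival order."""

    def __init__(self):
        self.frames = []

    def append(self, frame):
        # The only write operation: push the next frame onto the end.
        self.frames.append(frame)


class RandomAccessStore:
    """Hypothetical array-like store: any index can be written at any time."""

    def __init__(self, n_frames):
        self.frames = [None] * n_frames

    def __setitem__(self, index, frame):
        # Writes may target any position, in any order.
        self.frames[index] = frame


# Streaming pattern: data arrives once, in order.
stream = AppendOnlyWriter()
for t in range(3):
    stream.append(f"frame-{t}")

# In-memory-array pattern: out-of-order writes to arbitrary positions.
store = RandomAccessStore(3)
store[2] = "frame-2"
store[0] = "frame-0"
```

Code that assigns to arbitrary indices, as in the second pattern, has no equivalent operation on an append-only writer, which is why such a writer would only fit a dedicated streaming code path.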
since it cannot be installed on Bruno
Merging into the staging branch (#301). We will handle performance in a future PR.
Enable writing OME-Zarr 0.5 (zarr v3). Tested output with:
Also includes miscellaneous typing improvements.
Breaking changes:
Performance
Warning
Writing sharded arrays with zarr-python is extremely slow. Writing a 128x2000x4000 array with sharding takes more than 20 minutes! Using tensorstore or zarrs-python is >100x faster.
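As a rough illustration of what sharding changes for an array of this size, the sketch below counts inner chunks and shards on a regular grid. With the Zarr v3 sharding codec each shard is stored as a single object, so the shard count approximates the number of data files. The array shape is the one from the warning; the chunk and shard shapes, and the helper `grid_counts`, are assumptions chosen only for illustration, not the values used in this PR.

```python
import math


def grid_counts(array_shape, chunk_shape, shard_shape):
    """Return (n_chunks, n_shards, chunks_per_shard) for a regular grid."""
    # A shard must hold a whole number of chunks along every axis.
    if any(s % c for s, c in zip(shard_shape, chunk_shape)):
        raise ValueError("shard shape must be a multiple of chunk shape")
    n_chunks = math.prod(math.ceil(a / c) for a, c in zip(array_shape, chunk_shape))
    n_shards = math.prod(math.ceil(a / s) for a, s in zip(array_shape, shard_shape))
    chunks_per_shard = math.prod(s // c for s, c in zip(shard_shape, chunk_shape))
    return n_chunks, n_shards, chunks_per_shard


# Array shape from the warning; chunk and shard shapes are hypothetical.
array_shape = (128, 2000, 4000)
chunk_shape = (16, 250, 250)
shard_shape = (128, 1000, 1000)

n_chunks, n_shards, chunks_per_shard = grid_counts(array_shape, chunk_shape, shard_shape)
print(n_chunks, n_shards, chunks_per_shard)  # 1024 8 128
```

Under these assumed shapes, an unsharded layout would produce 1024 chunk files, while the sharded layout produces 8 shard files of 128 chunks each. The writer then has to assemble many small chunks into each shard, which is one place where implementations can differ widely in speed.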
Other observations on the performance hit when switching to sharded arrays with zarr-python 3:
Other known issues:
dask.array.to_zarr does not work with shards: Support for sharding when storing dask arrays to zarr (dask/dask#11778)