Initial Zarr v3 support #148

Merged
sbesson merged 16 commits into glencoesoftware:master from melissalinkert:zarr-v3
Mar 9, 2026

@melissalinkert
Member

Pairs with glencoesoftware/bioformats2raw#290.

This still needs some polishing, HCS support, and tests, so it is a draft for now. In its current state, running CMU-1.svs through `bioformats2raw --v3 --compress-inner-chunk` and then converting the v3 output with `raw2ometiff --rgb` seems to work as expected.

Expect v3 tests to fail right now as there is no released version of
bioformats2raw that includes the `--v3` option.

This also brings the JUnit version into alignment with what
bioformats2raw uses, so we don't need to keep track of two test APIs.
@melissalinkert
Member Author

Last two commits here cover everything I had initially planned. The build failures are expected as noted in the commit message for 8577ea3.

Expected next steps are:

  • wait for zarr-java 0.0.5 to be deployed (cc @joshmoore)
  • update to zarr-java 0.0.5 in glencoesoftware/bioformats2raw#290 (Initial support for Zarr v3)
  • tag an RC of bioformats2raw that includes v3 support
  • update to zarr-java 0.0.5 and the bioformats2raw RC here (at this point tests should pass)
  • take this PR out of draft status and assign reviewers

@melissalinkert melissalinkert added this to the 0.10.0 milestone Dec 3, 2025
@melissalinkert
Member Author

98f04a2 updates to use the new RC of bioformats2raw which includes the --ngff-version option. A little extra work on label image support was needed to address some test failures. There are still two failing tests that need further investigation; glencoesoftware/bioformats2raw#296 may be related.

@melissalinkert
Member Author

Down to just one failing test here; see zarr-developers/zarr-java#47 for a minimal example that reproduces the failure with zarr-java alone.

@melissalinkert melissalinkert marked this pull request as ready for review February 3, 2026 01:43
@melissalinkert
Member Author

Build now passing, so marking as ready for review. This does not switch over to zarr-java for v2 reading; I can do that here if preferred, or as a separate PR after this is merged.

Member

@sbesson sbesson left a comment

Functionally tested using the same public samples as those described in glencoesoftware/bioformats2raw#290 (review), which cover brightfield whole-slide imaging, multiplexed fluorescence whole-slide imaging, and high-content screening.

Several OME-Zarr datasets were generated from these samples using bioformats2raw with the following options:

  • --compact
  • --ngff-version 0.5
  • --ngff-version 0.5 --compact

All of these datasets were converted back to OME-TIFF using this utility. File sizes are consistent regardless of the OME-Zarr source:

3.8G	0.4_compact/Leica-1.ome.tiff
2.1G	0.4_compact/LuCa-7color_Scan1_XYC.ome.tiff
2.7G	0.4_compact/NIRHTa+001.ome.tiff
3.8G	0.5/Leica-1.ome.tiff
2.1G	0.5/LuCa-7color_Scan1.ome.tiff
2.7G	0.5/NIRHTa+001.ome.tiff
3.8G	0.5_compact/Leica-1.ome.tiff
2.1G	0.5_compact/LuCa-7color_Scan1.ome.tiff
2.7G	0.5_compact/NIRHTa+001.ome.tiff
26G	total
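
For anyone wanting to repeat this matrix, the round trip can be sketched roughly as below. This is a dry run that only prints commands. The directory names mirror the listing above, while the helper name `print_roundtrip_commands`, the `.zarr` suffix, and the output layout are assumptions for illustration, not the exact commands used in this review.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints the round-trip commands for one source file.
# Nothing is executed. The function name, the .zarr suffix, and the
# directory layout are hypothetical; only the option sets come from the
# list above.

print_roundtrip_commands() {
  local src="$1"
  local name
  name="$(basename "${src%.*}")"   # strip the extension, e.g. CMU-1.svs -> CMU-1

  local variant opts
  for variant in 0.4_compact 0.5 0.5_compact; do
    case "$variant" in
      0.4_compact) opts="--compact" ;;
      0.5)         opts="--ngff-version 0.5" ;;
      0.5_compact) opts="--ngff-version 0.5 --compact" ;;
    esac
    # Step 1: source image -> OME-Zarr
    echo "bioformats2raw $opts $src $variant/$name.zarr"
    # Step 2: OME-Zarr -> OME-TIFF
    echo "raw2ometiff $variant/$name.zarr $variant/$name.ome.tiff"
  done
}
```

Calling `print_roundtrip_commands data/CMU-1.svs` prints six lines, one bioformats2raw and one raw2ometiff invocation per variant, which can be inspected before running them for real.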

The data was loaded into OMERO Plus for visual assessment, which confirmed that the binary data is identical regardless of the OME-Zarr source. Conversion failed for OME-Zarr v3 datasets generated with sharding options, but these are issues with the source data that are already captured in glencoesoftware/bioformats2raw#295. As we address them upstream, we might need to retest the conversion via raw2ometiff.

The code changes are fairly significant but mostly revolve around (1) importing the relevant utilities from zarr-java, (2) isolating the v2-specific calls into dedicated methods, (3) adding v3 counterparts to these methods, and (4) adding v2/v3 switches wherever necessary. A possible next step is to look into using zarr-java as the single library for converting both Zarr v2 and v3; this might reduce the complexity of this code. Unless it would be functionally advantageous to make the transition in one go, I am happy to get this in (and possibly tagged as a release candidate) and look into using zarr-java for all input datasets as a follow-up.
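
As a rough illustration of what a v2/v3 switch has to decide, the sketch below tells the two layouts apart by the metadata file at the dataset root: Zarr v3 stores `zarr.json`, while Zarr v2 stores `.zgroup` or `.zarray`. The function name `detect_zarr_version` is hypothetical, and this is not necessarily how the code in this PR makes the decision.

```shell
#!/usr/bin/env bash
# Hypothetical helper: guess the Zarr format version of a directory from
# the metadata file present at its root. Not taken from raw2ometiff.

detect_zarr_version() {
  local root="$1"
  if [ -f "$root/zarr.json" ]; then
    echo 3            # Zarr v3 keeps all node metadata in zarr.json
  elif [ -f "$root/.zgroup" ] || [ -f "$root/.zarray" ]; then
    echo 2            # Zarr v2 uses .zgroup / .zarray (plus .zattrs)
  else
    echo unknown      # neither layout recognised
    return 1
  fi
}
```

A caller could branch on the printed value, for example `case "$(detect_zarr_version "$dir")" in 3) ... ;; 2) ... ;; esac`, and keep the v2 and v3 code paths in separate functions.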

While reviewing the v2/v3 switches, I suspect there will still be a need to handle backwards-incompatible changes in the metadata of Zarr v3 datasets. This might be something we need to tackle as support is introduced for future (currently unreleased) versions of the OME-NGFF specification.

Member

@erindiel erindiel left a comment

I converted the test data generated via glencoesoftware/bioformats2raw#302 using --ngff-version 0.5 --compression null with this build of raw2ometiff.

This was successful using default options, --compression JPEG, and --rgb options. The output of --split was also as expected.

Output OME-TIFFs were validated with visualization in FIJI.

@sbesson sbesson merged commit 0ae187f into glencoesoftware:master Mar 9, 2026
5 checks passed
3 participants