drop ome-zarr dependency #123

Open
will-moore wants to merge 31 commits into ome:main from will-moore:investigate_ome_zarr_py_alternative

@will-moore
Member

@will-moore will-moore commented Dec 13, 2024

Investigating a lighter-weight alternative to ome-zarr-py.
Uses zarr v3.

This includes the functionality from un-merged Plate Labels Fix (ome/ome-zarr-py#207 with #54).

Also includes the handling of bioformats2raw, similar to behaviour from un-merged ome/ome-zarr-py#174

Pros:

  • Mapping from OME-Zarr metadata to napari data happens in one step, instead of being split between ome-zarr-py and napari-ome-zarr.
  • This allows us to more easily support additional complexity as in the RFC-5 spec.

Cons:

  • Some small duplication of logic between napari-ome-zarr and ome-zarr-py

The PR handles bioformats2raw, channels metadata, labels, plates and plates with labels. Testing with:

$ napari --plugin napari-ome-zarr https://uk1s3.embassy.ebi.ac.uk/idr/zarr/v0.5/idr0062A/6001240_labels.zarr

$ napari --plugin napari-ome-zarr https://uk1s3.embassy.ebi.ac.uk/idr/zarr/v0.5/idr0066/ExpD_chicken_embryo_MIP.ome.zarr

$ napari --plugin napari-ome-zarr https://uk1s3.embassy.ebi.ac.uk/idr/zarr/v0.4/idr0048A/9846152.zarr

# bioformats2raw.layout (single image)
$ napari --plugin napari-ome-zarr https://uk1s3.embassy.ebi.ac.uk/idr/zarr/v0.4/idr0048A/9846151.zarr

# bioformats2raw.layout (3 images in series)
$ napari --plugin napari-ome-zarr https://storage.googleapis.com/jax-public-ngff/public/397.zarr

# v0.5 plate (see screenshot)
$ napari --plugin napari-ome-zarr https://uk1s3.embassy.ebi.ac.uk/idr/zarr/v0.5/idr0090/190129.zarr

# Plate with labels (NB: need to scroll to first Z-plane to see labels) (screenshot below)
$ napari --plugin napari-ome-zarr https://uk1s3.embassy.ebi.ac.uk/idr/zarr/v0.4/idr0001A/2551.zarr

# Nine images in bioformats2raw layout, translated into a 3 x 3 grid (screenshot below)
$ napari --plugin napari-ome-zarr https://uk1s3.embassy.ebi.ac.uk/idr/zarr/v0.5/idr0033A/BR00109990_C2.zarr

# labels colors, and properties (screenshot below, comparing with biongff-viewer) - mouseover labels to see properties
$ napari --plugin napari-ome-zarr https://uk1s3.embassy.ebi.ac.uk/idr/zarr/v0.4/idr0079A/idr0079_images.zarr/2
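
For the 3 x 3 grid case above, the per-image offsets could be computed along these lines. This is a minimal sketch, not the layout code from this PR: `grid_translations` is a hypothetical helper, and it assumes all images share the same height and width and that a near-square grid is wanted. The resulting offsets are the kind of values that could be passed to a napari layer's `translate` parameter.

```python
import math


def grid_translations(n_images, height, width):
    """Return a (y, x) offset per image, filling a near-square grid row by row."""
    # Assumption: enough columns to make the grid roughly square
    n_cols = math.ceil(math.sqrt(n_images))
    return [
        ((i // n_cols) * height, (i % n_cols) * width)
        for i in range(n_images)
    ]


# Nine images of 1080 x 1080 pixels end up in three rows of three
offsets = grid_translations(9, height=1080, width=1080)
print(offsets[0], offsets[4], offsets[8])
```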

TODO:

  • handle labels colors
  • coordinateTransformations
  • pre v0.4 (missing/minimal axes etc)

Screenshot 2025-05-22 at 17 08 31

Screenshot 2025-09-22 at 13 48 59 Screenshot 2025-09-22 at 14 04 37 Screenshot 2025-09-22 at 14 52 13

@will-moore
Member Author

As discussed with @joshmoore this morning, I looked into whether https://github.com/BioImageTools/ome-zarr-models-py could perform some of the graph traversal logic in this PR, e.g. multiscales -> labels or bioformats2raw -> multiscales (not yet done in this PR). But I don't see any of that functionality in ome-zarr-models-py?
cc @dstansby

@dstansby
Contributor

graph traversal logic in this PR

I had a quick look at the diff, but didn't quite understand what the required logic is. Could you explain a bit more? (or maybe add some short docstrings to the new classes/methods?)

We are generally 👍 on adding helpful functionality to ome-zarr-models-py, and I'm going to be sprinting to a first release tomorrow actually, so now is a good time to request stuff 😄

def matches(group: Group) -> bool:
    return "multiscales" in Spec.get_attrs(group)

def children(self):
Member Author

@dstansby By "graph traversal" logic, I mean, if I start with multiscales group e.g. group = zarr.open("https://uk1s3.embassy.ebi.ac.uk/idr/zarr/v0.4/idr0062A/6001240.zarr") I then want to get the labels (if they exist). Here this is implemented in the children() method, where we know to look in a child "labels" group and check attrs for "labels": ["labels1.zarr", "labels2.zarr"] then return objects for those child labels so that the arrays (and metadata) can be added to the layers that are passed to napari.

I don't see that ome-zarr-models-py includes that kind of logic for traversing the graph between these objects?
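
The traversal described above can be sketched roughly as follows. This is an illustration only, not the code from this PR: `ome_attrs`, `label_paths` and the `DictGroup` stand-in are hypothetical names. The function assumes nothing more than a group object with `.attrs` and `[]` access that raises `KeyError` for a missing child, which is how a `zarr.Group` behaves. It also unwraps the `"ome"` key under which v0.5 nests its metadata.

```python
from typing import Any, List, Mapping, Optional


def ome_attrs(group: Any) -> dict:
    """Return OME-Zarr attrs, unwrapping the "ome" key used by v0.5."""
    attrs = dict(group.attrs)
    return attrs.get("ome", attrs)


def label_paths(image_group: Any) -> List[str]:
    """Return paths (relative to the image group) of its label images."""
    try:
        labels_group = image_group["labels"]
    except KeyError:
        return []  # no special "labels" child group
    names = ome_attrs(labels_group).get("labels", [])
    return [f"labels/{name}" for name in names]


class DictGroup:
    """Hypothetical stand-in for zarr.Group, backed by nested dicts."""

    def __init__(self, attrs: Mapping, children: Optional[Mapping] = None):
        self.attrs = attrs
        self._children = dict(children or {})

    def __getitem__(self, key: str) -> "DictGroup":
        return self._children[key]


image = DictGroup(
    {"multiscales": [{"version": "0.4"}]},
    {"labels": DictGroup({"labels": ["0"]})},
)
print(label_paths(image))
```

Each returned path could then be opened as its own multiscales image and added as a labels layer.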

Contributor

You can get the list of labels paths from Image.attributes.labels. But the labels part of the spec just says these point to "labels objects", which I don't think are more specifically defined anywhere else?

If the OME-Zarr spec was more prescriptive about what these "labels objects" were (are they meant to be groups with image-label metadata ??) then we could certainly do more, but I don't think the spec allows us to make those assumptions unfortunately.

Contributor

I am only now reading version 0.5 of OME-Zarr, and see that the labels section is much improved over 0.4 😄 . It's definitely within scope of ome-zarr-models-py to provide logic for getting from an Image dataset to the labels dataset if it's in the metadata. Tracking issue at ome-zarr-models/ome-zarr-models-py#92

Member Author

Currently in the tutorial at https://github.com/BioImageTools/ome-zarr-models-py/blob/main/docs/tutorial.py
If I add:

print(ome_zarr_image.attributes.labels)

I get None (even though that image does have labels).
I don't see any population of the labels in https://github.com/BioImageTools/ome-zarr-models-py/blob/7659a114a2428fe9d8acbd06aa7bc1c9d32624bb/src/ome_zarr_models/v04/image.py#L85 ?

Contributor

even though that image does have labels

There's a labels group, but looking at that dataset in the validator the top level group is missing the labels metadata, which is why .labels is giving None.

Contributor

To be precise, if you look at the image group in the validator, it should have a "labels" key at the same level as the "multiscales" and "omero" keys. If that was there, the paths under the "labels" key would be in the .labels attribute in ome-zarr-models-py

Contributor

Oh hold on, am I just reading the spec wrong? Does:

The special group "labels" found under an image Zarr

Really mean:

The special Zarr group "labels" found inside an image Zarr group

?

If so then we should definitely implement that in ome-zarr-models-py!

Member Author

Yes, image.zarr/labels/ group.
This is shown a bit more clearly in the layout at https://ngff.openmicroscopy.org/0.4/index.html#image-layout

Contributor

Gotcha, I always thought the "labels" group was an arbitrary name and the example was just an example 🤦 - thanks for explaining, and I'll let you know once I've implemented this in ome-zarr-models-py 😄

@will-moore
Member Author

One issue I'm having with supporting bioformats2raw.layout here is how to load the /OME/METADATA.ome.xml (which could be local or remote etc).

E.g. https://uk1s3.embassy.ebi.ac.uk/idr/zarr/v0.4/idr0048A/9846151.zarr/OME/METADATA.ome.xml

I'm trying something like:

import zarr
from zarr.core.buffer import default_buffer_prototype

url = "https://uk1s3.embassy.ebi.ac.uk/idr/zarr/v0.4/idr0048A/9846151.zarr"
group = zarr.open(url)
# store.get() is a coroutine, so this line only works inside an async function
xml_data = await group.store.get("OME/METADATA.ome.xml", prototype=default_buffer_prototype())

but I want to avoid using async await if I can.

In https://github.com/ome/ome-zarr-py/pull/174/files it looks like this only handles local OME.xml files with root = ET.parse(filename)

@joshmoore
Member

In ome/ome-zarr-py#174 (files) it looks like this only handles local OME.xml files with root = ET.parse(filename)

Then that would have been a bug. I assume a method like get_json() (get_text or get_contents) would have been needed to slurp the XML.
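
Once the XML has been fetched as bytes (from a local or remote store), it can be parsed without a filename. The following is a minimal sketch using only the standard library: `image_names` is a hypothetical helper, the sample document is made up for illustration, and the namespace assumes the 2016-06 OME-XML schema (other schema versions use a different namespace).

```python
import xml.etree.ElementTree as ET

# Namespace of the 2016-06 OME-XML schema (assumed; other versions differ)
OME_NS = {"ome": "http://www.openmicroscopy.org/Schemas/OME/2016-06"}


def image_names(xml_bytes: bytes) -> list:
    """Parse OME-XML from bytes and list the Name of each Image element."""
    # fromstring() takes the document content, unlike parse(), which takes a file
    root = ET.fromstring(xml_bytes)
    return [img.get("Name") for img in root.findall("ome:Image", OME_NS)]


# Made-up sample standing in for the contents of OME/METADATA.ome.xml
sample = b"""<?xml version="1.0" encoding="UTF-8"?>
<OME xmlns="http://www.openmicroscopy.org/Schemas/OME/2016-06">
  <Image ID="Image:0" Name="series_0"/>
  <Image ID="Image:1" Name="series_1"/>
</OME>"""
print(image_names(sample))
```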

@will-moore
Member Author

Seems this works for getting XML, but need to check if there's a better way that doesn't use testing classes...

import zarr
from zarr.core.buffer import default_buffer_prototype
from zarr.testing.stateful import SyncStoreWrapper
url = "https://uk1s3.embassy.ebi.ac.uk/idr/zarr/v0.4/idr0048A/9846151.zarr"
group = zarr.open(url)
store = group.store
wrapper = SyncStoreWrapper(store)
xml_data = wrapper.get("OME/METADATA.ome.xml", prototype=default_buffer_prototype())
print("xml_data", xml_data.to_bytes())

@joshmoore
Member

@will-moore
Member Author

Thanks, this is working...

import zarr
from zarr.core.buffer import default_buffer_prototype
from zarr.core.sync import SyncMixin

url = "https://uk1s3.embassy.ebi.ac.uk/idr/zarr/v0.4/idr0048A/9846151.zarr"
group = zarr.open(url)
store = group.store
# _sync() runs the store.get() coroutine to completion, so no async/await is needed here
mx = SyncMixin()
xml_data = mx._sync(store.get("OME/METADATA.ome.xml", prototype=default_buffer_prototype()))
print("xml_data", xml_data.to_bytes())
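
The underlying pattern, running a coroutine to completion from synchronous code, can be illustrated with the standard library alone. This sketch is not zarr's implementation: `run_sync` and `FakeAsyncStore` are hypothetical names used only for illustration. For a real zarr store it is probably safer to rely on zarr's own synchronisation helpers, since store coroutines may be tied to the event loop that zarr manages.

```python
import asyncio
import threading

# A dedicated event loop running in a background daemon thread
_loop = asyncio.new_event_loop()
threading.Thread(target=_loop.run_forever, daemon=True).start()


def run_sync(coro, timeout=None):
    """Submit a coroutine to the background loop and block until it finishes."""
    future = asyncio.run_coroutine_threadsafe(coro, _loop)
    return future.result(timeout)


class FakeAsyncStore:
    """Hypothetical stand-in for a store with an async get() method."""

    def __init__(self, files):
        self._files = files

    async def get(self, key):
        await asyncio.sleep(0)  # pretend to do some I/O
        return self._files.get(key)


store = FakeAsyncStore({"OME/METADATA.ome.xml": b"<OME/>"})
print(run_sync(store.get("OME/METADATA.ome.xml")))
```

Because the coroutine runs on a separate thread, the caller does not need to be inside an async function or own an event loop itself.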

@will-moore will-moore force-pushed the investigate_ome_zarr_py_alternative branch from bc3d71a to e4cda75 Compare January 6, 2025 15:07
@will-moore
Member Author

If we wanted to use ome-zarr-models-py to handle some of the graph traversal (e.g. labels), this would now look like this:

import zarr

from ome_zarr_models.v04.image import Image
from ome_zarr_models.v04.labels import Labels, LabelsAttrs



group = zarr.open_group(
    "https://uk1s3.embassy.ebi.ac.uk/idr/zarr/v0.4/idr0062A/6001240.zarr", mode="r"
)
image = Image.from_zarr(group)
print("image.labels", image.labels)
assert image.labels == Labels(
    zarr_version=2, attributes=LabelsAttrs(labels=["0"]), members={}
)

for label in image.labels.attributes.labels:
    print("label", label)
    label_group = group[f"labels/{label}"]
    print("label_group", label_group)
    label_image = Image.from_zarr(label_group)
    print("label_image", label_image)

    first_dataset_path = label_image.attributes.multiscales[0].datasets[0].path
    zarr_arr = label_group[first_dataset_path]
    print("zarr_arr", zarr_arr)

See ome-zarr-models/ome-zarr-models-py#96

@will-moore
Member Author

@joshmoore I'll stop working on this for now, as I think there's enough here to evaluate this approach.
I think this is a viable option to provide OME-Zarr support to napari without using ome-zarr-py, but would appreciate feedback & discussion

@dstansby
Contributor

To chip in from ome-zarr-models-py, this is exactly the use case we'd love to support, so if there's anything else we can add to make this easier please let us know!

@will-moore will-moore force-pushed the investigate_ome_zarr_py_alternative branch from 920d9cf to dd63922 Compare May 22, 2025 16:16
@will-moore will-moore force-pushed the investigate_ome_zarr_py_alternative branch from e908d7a to a92c2a1 Compare June 9, 2025 10:14
@will-moore will-moore force-pushed the investigate_ome_zarr_py_alternative branch from b8758ad to e338190 Compare September 1, 2025 11:21
@will-moore will-moore force-pushed the investigate_ome_zarr_py_alternative branch from a8ffede to e34c881 Compare September 1, 2025 11:54
@will-moore will-moore marked this pull request as ready for review September 22, 2025 14:53
@will-moore will-moore changed the title investigate dropping ome-zarr dependency propose dropping ome-zarr dependency Sep 22, 2025
@joshmoore joshmoore requested a review from jo-mueller October 24, 2025 12:58
@will-moore
Member Author

@jluethi - This PR fixes the Plate Labels support for napari if you'd like to give it a try (as an alternative approach to #54)?

@will-moore
Member Author

@jluethi - Apologies: I just tested the Plate with Labels example above and found a minor bug introduced since the previous screenshot - looking to fix it now...

Screenshot 2025-10-29 at 13 29 00

@will-moore
Member Author

OK @jluethi - PlateLabels fixed now, at least for the test data I have.

@jluethi

jluethi commented Oct 30, 2025

Oh, that's exciting! Thanks for the tag Will, will try to test this next week!

@@ -0,0 +1,409 @@
# zarr v3
Member

No header?

Member Author

No, I don't see any other files in this repo having a header, same for ome-zarr-py.

@will-moore will-moore changed the title propose dropping ome-zarr dependency drop ome-zarr dependency Jan 14, 2026
@jo-mueller

jo-mueller commented Jan 14, 2026

Hi @will-moore, just went through this in a superficial manner. What's implemented here is exactly the kind of API that I (personally) think is missing over at ome-zarr-py, i.e. exposing the data from ome.zarr files through some sort of object-oriented logic. In this case it makes it easy for napari to grab the data to be viewed from Multiscales.data, but I would really love to be able to do the same from code, too!

Do you think it would be possible to move this upstream into ome-zarr-py? This would also allow ome-zarr-py to expose some of the cool tools of ome-zarr-models-py through its own API and keep napari-ome-zarr really lightweight.

Edit: typo

@will-moore
Member Author

@jo-mueller Those Spec classes already exist in ome-zarr-py - see https://github.com/ome/ome-zarr-py/blob/7627a54b624247ec45415bc34457ae6d8ef5990c/ome_zarr/reader.py#L158 but we haven't documented their use as part of the API. They are a fair bit more complex at ome-zarr-py (e.g. use a ZarrLocation class to do the zarr access) and I struggle to understand them each time I need to work on them, fix bugs etc. So part of my motivation for implementing a simpler version of them here was to understand them better.

What we do with them at ome-zarr-py can be discussed next week

@jo-mueller

jo-mueller commented Jan 14, 2026

@will-moore thanks for the pointer, definitely helpful to get on the same page until then.

I think what's confusing (at least to me) about the Spec and Node instances is that they are object-oriented classes for metadata that also own an attribute giving access to the data. I.e., Multiscales(Spec) gives you access to multiscale_object.node.zarr, if I see it correctly.

I guess my idea (for ome-zarr-py) but also for the changes here would be:

  • Introduce the Spec and derived classes from this PR in ome-zarr-py and replace the implementations there. Especially the data, children and metadata attributes are useful.
  • Replace the functionality inside def metadata -> Dict with ome-zarr-models-py. I figure the latter (correct me if I'm wrong @dstansby) lets you simply ingest a nested directory of ome-zarr metadata into corresponding metadata objects and spit out a dict object via metadata.model_dump() if required.
    This will save a lot of type checking, validation, etc. The only things you'd really have to check here would be the version field to invoke the correct metadata model from ome-zarr-models-py.
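
The version check mentioned in the second point could look something like the sketch below. It is only an illustration under stated assumptions: `ome_zarr_version`, `pick_model` and the `MODELS` registry are hypothetical names, and the registry values are placeholder strings where real model classes would go. It relies on v0.5 nesting its metadata under an `"ome"` key, while v0.4 stores `"version"` inside blocks such as `"multiscales"` or `"plate"`.

```python
from typing import Optional


def ome_zarr_version(attrs: dict) -> Optional[str]:
    """Best-effort lookup of the OME-Zarr version from group attributes."""
    # v0.5 nests all metadata (including "version") under an "ome" key
    if "ome" in attrs:
        return attrs["ome"].get("version")
    # v0.4 stores "version" inside each top-level block
    for key in ("multiscales", "plate", "well"):
        block = attrs.get(key)
        if isinstance(block, list) and block:
            block = block[0]  # "multiscales" is a list of dicts
        if isinstance(block, dict) and "version" in block:
            return block["version"]
    return None


# Hypothetical registry: in practice the values would be model classes
MODELS = {"0.4": "v04 model", "0.5": "v05 model"}


def pick_model(attrs: dict) -> str:
    """Choose the metadata model matching the detected version."""
    version = ome_zarr_version(attrs)
    if version not in MODELS:
        raise ValueError(f"Unsupported OME-Zarr version: {version!r}")
    return MODELS[version]


print(pick_model({"multiscales": [{"version": "0.4", "datasets": []}]}))
print(pick_model({"ome": {"version": "0.5", "multiscales": []}}))
```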

I'd have some other ideas particularly for the Plate structure (maybe a getter à la __get__(self, row, col) would be cool?) but the bigger decision to make here is probably the relationship between ome-zarr-py and napari-ome-zarr.
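
A (row, col) getter for plates might be sketched as below. Note that in Python, index-style access such as `plate[row, col]` is implemented with `__getitem__` (`__get__` belongs to the descriptor protocol). `PlateIndex` is a hypothetical class, not part of this PR or of ome-zarr-py, and it assumes v0.4-style plate metadata in which each well carries `path`, `rowIndex` and `columnIndex`.

```python
class PlateIndex:
    """Hypothetical helper: look up a well path by (row, column) index."""

    def __init__(self, plate_attrs: dict):
        # v0.4 wells carry "path", "rowIndex" and "columnIndex"
        self._wells = {
            (w["rowIndex"], w["columnIndex"]): w["path"]
            for w in plate_attrs["wells"]
        }

    def __getitem__(self, key):
        row, col = key
        # Raises KeyError if the plate has no well at this position
        return self._wells[(row, col)]


plate = PlateIndex(
    {
        "rows": [{"name": "A"}, {"name": "B"}],
        "columns": [{"name": "1"}, {"name": "2"}],
        "wells": [
            {"path": "A/1", "rowIndex": 0, "columnIndex": 0},
            {"path": "B/2", "rowIndex": 1, "columnIndex": 1},
        ],
    }
)
print(plate[1, 1])
```

Sparse plates are handled naturally here, since only wells listed in the metadata are indexed.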

@will-moore
Member Author

Thanks @jo-mueller - All good points to discuss next week...

@jo-mueller jo-mueller mentioned this pull request Jan 21, 2026
2 tasks
@will-moore will-moore force-pushed the investigate_ome_zarr_py_alternative branch from adf6093 to ba7e50f Compare February 9, 2026 17:09
@will-moore
Member Author

@jo-mueller I guess we need to revive this discussion...

This PR fixes a bunch of issues and I'm adding all the RFC5 / v0.6 stuff on top of it, so it would be nice to get it merged.
I don't know if it's worth holding off until ome-zarr-py is in a state where it can support everything we need here.

Maybe discuss at next week's Tuesday call?

@jo-mueller

Hey @will-moore thanks for the ping. Totally agree! Personally I was hoping to

  • use the Image class I was developing over at ome-zarr-py
  • make stronger use of either yaozarrs or ome-zarr-models-py to outsource the model handling (if we find ourselves in conflict with development decisions there I guess we can always overload 🙊)

I'm relatively confident that I can get the development at ome/ome-zarr-py#515 into a presentable state by then 🤞

@joshmoore
Member

(I assume the title of this PR can be updated now? 😄 )
