Add image class#515

Open
jo-mueller wants to merge 71 commits into ome:master from jo-mueller:introduce-image-class
Conversation

@jo-mueller
Collaborator

@jo-mueller jo-mueller commented Jan 21, 2026

Hi @will-moore ,

this is an implementation of my idea for a class-based, user-facing API for writing and reading.

Key features

  • Implemented classes: Adds the NgffImage and NgffMultiscales classes (similar to the implementation over at ngff-zarr), which serve as the primary entrypoints for writing. NgffImage accepts the data to be written as an array and coerces it to dask internally. It requires the dims (e.g., "tczyx") to be passed at instantiation. It also accepts kwargs for pixel sizes (scale), axes units (axes_units) and the name of the image (name), which are later serialized in the ome-zarr metadata.
    The NgffMultiscales then constructs a pyramid using the already existing methods (_build_pyramid) that were implemented in #516 (deprecate scaler class).
  • Ome-zarr-models-py: Uses the metadata classes from ome-zarr-models-py internally for simple serialization and de-serialization in the write/read process. Primarily, the coordinate transformation classes and the Multiscales metadata classes are used.
    Importantly, all metadata is internally coerced to ozmp.v05.Multiscales. Only on writing the metadata class is converted to whatever ome-zarr version is desired.
  • Writing: Writing happens through the NgffMultiscales.to_ome_zarr() method. This method makes use of the already existing writing API (_write_pyramid_to_zarr) from #531 (Streamline writing). It then converts the metadata to the chosen version and uses pydantic's model_dump() to create the metadata dictionary. Importantly, the version conversion is only implemented in ome-zarr-models/ome-zarr-models-py#398 (implement version converters), so this is currently blocked by that.
  • Reading: The implemented NgffMultiscales class also has an attached from_ome_zarr(...) classmethod. The argument is simply the path/group of the ome-zarr image. The function then reads the metadata and the multiscales as dask arrays and returns an instance of NgffMultiscales. The version is automatically detected and again coerced to ozmp.v05.Multiscales internally.
  • Labels: Writing labels can be done by converting them to instances of NgffMultiscales and passing them as a single image or as a dict[str, NgffMultiscales] to the to_ome_zarr writer function. I have yet to implement the same functionality for reading.
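The "one internal representation, convert only on write" idea from the list above can be sketched with plain dictionaries. This is a minimal illustration, not the PR's implementation: build_multiscales, to_attrs and the AXIS_TYPES mapping are hypothetical names, and the real code uses the ome-zarr-models-py classes rather than dicts. The scale&lt;idx&gt; path naming follows the commit messages in this PR; the attribute layouts follow the OME-Zarr 0.4 and 0.5 specs.

```python
# Assumption of this sketch: single-letter dims map to these spec axis types.
AXIS_TYPES = {"t": "time", "c": "channel", "z": "space", "y": "space", "x": "space"}


def build_multiscales(dims, level_scales, axes_units=None, name=None):
    """Build one version-agnostic multiscales entry as a plain dict."""
    axes_units = axes_units or [None] * len(dims)
    axes = []
    for dim, unit in zip(dims, axes_units):
        axis = {"name": dim, "type": AXIS_TYPES[dim]}
        if unit is not None:  # drop empty fields instead of writing null
            axis["unit"] = unit
        axes.append(axis)
    datasets = [
        {
            "path": f"scale{idx}",
            "coordinateTransformations": [{"type": "scale", "scale": list(scale)}],
        }
        for idx, scale in enumerate(level_scales)
    ]
    entry = {"axes": axes, "datasets": datasets}
    if name is not None:
        entry["name"] = name
    return entry


def to_attrs(entry, version):
    """Wrap the internal entry in the attribute layout of the requested version."""
    if version == "0.4":
        # v0.4: "multiscales" is a top-level key and each entry carries its version
        return {"multiscales": [{**entry, "version": "0.4"}]}
    if version == "0.5":
        # v0.5: everything is nested under "ome", which carries the version once
        return {"ome": {"version": "0.5", "multiscales": [entry]}}
    raise ValueError(f"unsupported version: {version!r}")
```

Keeping the version out of the internal entry means a reader can normalize any supported input once, and only the final wrapping step needs to know about the target version.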

All in all, I think especially the to_ome_zarr and from_ome_zarr methods are super convenient. I have written a follow-up implementation of the scene metadata from 0.6 and making use of the same API there makes a lot of sense. We could think of similar entrypoints to writing HCS layouts.

Further considerations

  • Endorse ome-zarr-models-py here or implement own model after all?
  • Keep the current API? Since the currently existing endpoints now pretty much route into the same functionality - most of the magic is happening in _build_pyramid and write_pyramid_to_zarr - I see no harm in keeping the currently existing API around if it works for people. Maybe we'd have to update the write_image function so that it would also accept an instance of NgffImage, otherwise it may be confusing? I'm still undecided about the reading side, though.

TODOs:

@codecov

codecov bot commented Jan 21, 2026

Codecov Report

❌ Patch coverage is 93.36100% with 16 lines in your changes missing coverage. Please review.
✅ Project coverage is 86.58%. Comparing base (cade24e) to head (d70e55b).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ome_zarr/image.py | 93.46% | 13 Missing ⚠️ |
| ome_zarr/axes.py | 80.00% | 2 Missing ⚠️ |
| ome_zarr/writer.py | 96.77% | 1 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #515      +/-   ##
==========================================
+ Coverage   86.05%   86.58%   +0.53%     
==========================================
  Files          14       15       +1     
  Lines        1857     2073     +216     
==========================================
+ Hits         1598     1795     +197     
- Misses        259      278      +19     


@will-moore
Member

@jo-mueller Thanks for that - This looks like a nice approach to separate the metadata creation and manipulation from the writing to zarr.

Comparing APIs:

  • This PR:
i = Image(data=my_array, dims=["c", "z", "y", "x"], scale_factors=[2, 4], scale=[1, 0.5, 0.3, 0.3], axes_units=[None, "micrometer", "micrometer", "micrometer"], name="my image")
i.to_ome_zarr("my_image.zarr", version="0.4")
  • ngff-zarr
image = nz.to_ngff_image(data, dims=['y', 'x'], scale={'y': 1.0, 'x': 1.0}, translation={'y': 0.0, 'x': 0.0})
multiscales = nz.to_multiscales(image, scale_factors=[2,4], chunks=64)
nz.to_ngff_zarr('lightsheet.ome.zarr', multiscales, chunks_per_shard=2, version="0.4")

Various comments, questions. I realise some of this is just not implemented yet... And I haven't tried the code (which might answer some of these)...

  • Spec(ABC) class isn't used/needed
  • to_zarr() should take an existing root OR string
  • No code for reading ome-zarrs yet
  • The current Scaler doesn't downsample in z in most cases (existing issue). Might be best to calculate the scales for each Dataset from the shape of each new_image
  • axes_units isn't used, nor is labels
  • metadata writing to_zarr is missing currently
  • "version" is ignored. How do we handle versions / converting e.g. read v0.4 and write v0.5?
  • No translation added. See code to add this at 492
  • da.to_zarr() needs to specify dimension_names for zarr v3 arrays, e.g. fixed in #511 (fix recursion error if __store.fs.protocol is a tuple)
  • Where do we specify chunks, shards, compressors and other options to pass down to zarr-python?
  • support for omero metadata? Yaozarrs has _omero but it's not exposed very much
  • I think we could drop support for writing v0.1, v0.2 and v0.3, but what about reading them?
  • Yaozarrs is slightly less verbose than ome-zarr-models-py, but generally similar
  • If your data is a 5D array shape = (50, 3, 100, 1024, 1024) and your dims/scale were 4D, e.g. dims = ["c", "z", "y", "x"] would you get any validation errors?
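Two of the points above (the 5D data with 4D dims question, and deriving each dataset's scale from the actual level shapes) can be illustrated with a short sketch. The function names validate_dims and scales_from_shapes are hypothetical and not part of ome-zarr-py; this only shows the kind of check and arithmetic being discussed.

```python
def validate_dims(shape, dims, scale=None):
    """Raise ValueError if dims or scale do not match the data's dimensionality."""
    if len(dims) != len(shape):
        raise ValueError(
            f"got {len(dims)} dims {list(dims)} for "
            f"{len(shape)}-dimensional data of shape {tuple(shape)}"
        )
    if scale is not None and len(scale) != len(dims):
        raise ValueError(
            f"got {len(scale)} scale values for {len(dims)} dims {list(dims)}"
        )


def scales_from_shapes(base_scale, level_shapes):
    """Derive each level's scale from its real shape relative to level 0.

    An axis that was not downsampled (same size as in level 0) keeps its
    original scale, which covers the case where z is left untouched.
    """
    base_shape = level_shapes[0]
    return [
        [s * full / size for s, full, size in zip(base_scale, base_shape, shape)]
        for shape in level_shapes
    ]
```

With shapes (3, 100, 1024, 1024) and (3, 100, 512, 512), only the y and x scales double at the second level, while c and z keep their level-0 values.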

@jo-mueller
Collaborator Author

@will-moore thanks for the breakdown. I was aware of some of these points (not all) but decided to send it anyway to not go too far in the wrong direction in case there were strong objections to the approach.

Actually, what you wrote is an excellent to-do list :)

@jo-mueller
Collaborator Author

@will-moore I think this is taking a bit more shape towards how I'd expect it. But before this can continue, I think there is value in first discussing the current duplication of functionality between these two functions:

  • write_image: calls _build_pyramid under the hood and uses _write_dask_image to serialize the data and metadata to disk
  • write_multiscales: Requires an up-front call to _build_pyramid and then serializes data to disk, uses write_multiscales_metadata to write metadata to disk

There's a lot of overlap between these two functions which I think can be condensed so we'd have a single place where

  • metadata is created
  • format parsing is happening

Anyway, I just tried the Image.to_ome_zarr with a large image (3 x 15k x 15k) and I'm getting very decent writing speeds!

@imagesc-bot

This pull request has been mentioned on Image.sc Forum. There might be relevant details there:

https://forum.image.sc/t/separate-tiles-to-ome-zarr/109071/55

@will-moore
Member

@jo-mueller Could you update the description to reflect where this is heading now?

Are we planning to keep all the existing write methods (any API changes)?

@jo-mueller jo-mueller force-pushed the introduce-image-class branch from 79fdff7 to 8e1ffdf Compare March 5, 2026 08:22
fix: pass name to multiscales class

feat: add reading/writing

fix: do not use v06 yet

test: add testcases for class-based writer

[pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

refactor: write scales to zarr group scale<idx>

add image class

[pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

remove unused ABC

fix multiscale creation

[pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

use dask utils for rescaling

[pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

more general input coercion

ensure `scaleX` naming for levels

rely on refactoring changes for pyramid creation & write

raise validation error of ndims != data shape

[pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

docstring

pass storage options on to writer function

[pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

move down group check

improve metadata handling

change multiscale path formatting

split into two different classes

compute shapes off actual shapes rather than ideal factors

[pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

[pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

use ome-zarr-models-py v06

[pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

add version to export args

correct naming in docstring

code style

remove dead code

[pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

add docstrings

[pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

[pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

use CoordinateSystem and Axis from ozmp and pop empty fields

pass name to multiscales class

correct arg passage

Create scene.py

add support for passing additional transformations through class interface

add reading/writing

[pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

remove class conversion methods

add fallback for scales

remove v06 objects

[pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

[pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Delete scene.py

fix: make precommit happy

[pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

fix: typesafety

fix: no self-conversion

test: add testcases for class-based writer

[pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

refactor: write scales to zarr group scale<idx>

[pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

refactor: write scales to zarr group scale<idx>

[pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci
@jo-mueller jo-mueller force-pushed the introduce-image-class branch from 2b38492 to c67569d Compare March 5, 2026 09:00
@will-moore
Member

It would be nice to support writing of "omero" metadata. I think that's covered by ome-zarr-models-py too.
Could be a follow-up PR?

@jo-mueller
Collaborator Author

jo-mueller commented Mar 6, 2026

@will-moore

> Could be a follow-up PR?

Agree.

> where this is heading now?
> Are we planning to keep all the existing write methods (any API changes)?

THAT is a good question, and I'm not entirely sure of the answer myself. I guess if we want to go further down this route, we would ultimately deprecate write_image and its siblings in favor of the class-based API. As an intermediate step, we could make these functions do something like this under the hood:

def write_image(*args, **kwargs):
    image = NgffImage(*args, **kwargs)
    multiscales = NgffMultiscales(image, ...)
    multiscales.to_ome_zarr(...)

which would at least reduce the amount of code to maintain and make sure that everything we do on the class-based API side is covered well by the already existing tests.
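If the functional API is eventually deprecated, the wrapper could also emit a warning while still delegating to the class. The following is only a sketch of that general pattern under stated assumptions: NgffImageSketch is a hypothetical stand-in that records its inputs instead of writing to disk, and the write_image signature shown here is simplified and not the library's actual one.

```python
import warnings


class NgffImageSketch:
    """Hypothetical stand-in for the PR's image class; it only records inputs."""

    def __init__(self, data, dims):
        self.data = data
        self.dims = dims

    def to_ome_zarr(self, path):
        # A real implementation would write arrays and metadata to `path`;
        # this stand-in just returns a record of what would be written.
        return {"path": path, "dims": self.dims}


def write_image(data, path, dims):
    """Legacy entrypoint kept as a thin, deprecated wrapper around the class."""
    warnings.warn(
        "write_image is deprecated; use the class-based API instead",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller, not this wrapper
    )
    return NgffImageSketch(data, dims).to_ome_zarr(path)
```

Because the old function only forwards its arguments, the existing tests for write_image would keep exercising the class-based code path during the transition.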

What's missing here

The only thing that makes tests fail here currently is this one: ome-zarr-models/ome-zarr-models-py#398. Locally, all tests are passing.

Also, note that this branch has been rebased on #544, so that'll have to go in first, too.

@jo-mueller jo-mueller added the enhancement New feature or request label Mar 19, 2026
pre-commit-ci bot and others added 20 commits March 19, 2026 09:45
[pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci
[pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

chore: remove unused variable
[pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci