Commit 311febb

Merge pull request #3 from lorenzocerrone/rfc-8
2 parents 3f653f9 + ec4edc4 commit 311febb

1 file changed

Lines changed: 219 additions & 44 deletions

rfc/8/index.md

@@ -119,10 +119,10 @@ In a way, coordinate transformations and systems simply become a subset of the m
119119

120120

121121
#### 5. High Content Screening (HCS) plates
122-
(hcs-plates-collection)=
123-
OME-Zarr high content screening plates are a current example of a very narrowly defined type of collection. They allow to group OME-Zarr images in multiple hierarchy levels: A plate contains wells, which are organized as row folders with column subfolders in each. Each well folder can contain a number of images. There is defined metadata about which wells are in a plate and about which images are in a well at the different hierarchy levels, typically with some additional optional metadata like the acquisitions that exist in a plate and which image belongs to which acquisition.
122+
123+
OME-Zarr high content screening plates are a current example of a narrowly defined type of collection. They allow OME-Zarr images to be grouped at multiple hierarchy levels: a plate contains wells, which are organized as row folders with column subfolders in each. Each well folder can contain a number of images. There is defined metadata about which wells are in a plate and about which images are in a well at the different hierarchy levels, typically with some additional optional metadata like the acquisitions that exist in a plate and which image belongs to which acquisition.
124124
This hierarchy is very useful for typical experiments where researchers imaged a multi-well plate. Multiple viewers like MoBIE, napari & ViZarr support displaying the different wells arranged in the plate format given the OME-Zarr HCS metadata, thus avoiding the need for tool-specific metadata and showing the benefits of such collection concepts.
125-
The current HCS spec also has its limitations: It has a strict definition of potential metadata fields at the plate and well level. There are multiple areas where it would be interesting to extend this spec. There are [ongoing discussions](https://github.com/ome/ngff/pull/137) about whether individual microscope fields of view (ie. well) should be stored as individual OME-Zarr images or as a single OME-Zarr image. In that context, it is unclear how one would provide metadata about the individual images in a well and what a viewer should do with them. For example, depending on whether an OME-Zarr image in a well is an individual field of view of a given acquisition, a second acquisition of the same region in a plate or an image derived from a given processing operation, the optimal viewer default on whether to show or not show multiple images at once will vary. A flexible metadata field like `attributes` would allow us to better define such image metadata. A more flexible HCS collection system could also allow to provide advanced metadata on well positions [when wells have different sizes](https://github.com/ome/ome-zarr-py/issues/240) or address other edge-cases in the current HCS configuration.
125+
The current HCS spec also has its limitations: It has a strict definition of potential metadata fields at the plate and well level. There are multiple areas where it would be interesting to extend this spec. There are [ongoing discussions](https://github.com/ome/ngff/pull/137) about whether the individual microscope fields of view in a well should be stored as individual OME-Zarr images or as a single OME-Zarr image, and about how one would represent [different processing intermediates in a plate](https://forum.image.sc/t/how-to-build-hcs-zarrs-with-multiple-image-types-per-fov/119329). In these contexts, the current HCS spec lacks the flexibility to express additional metadata about how images in a well are related and what a viewer should do with them. For example, depending on whether an OME-Zarr image in a well is an individual field of view of a given acquisition, a second acquisition of the same region in a plate, or an image derived from a given processing operation, the optimal viewer default for whether to show multiple images at once will vary. A flexible metadata field like `attributes` would allow us to better define such image metadata. A more flexible HCS collection system could also allow providing advanced metadata on well positions [when wells have different sizes](https://github.com/ome/ome-zarr-py/issues/240) or address other edge cases in the current HCS configuration.
126126

127127

128128
#### 6. Image Archive
@@ -346,7 +346,7 @@ Unprefixed attribute keys that are defined as part of this RFC are:
346346
- `coordinateSystems`
347347
- `coordinateTransformations`
348348
- `labels`, as well as `label-value` and `color` in label attributes
349-
- `plate` and `well`
349+
- `plate`, `well`, `acquisition` for HCS metadata
350350

351351
### Extensibility
352352

@@ -655,70 +655,245 @@ Additional keys MAY be added, following the key naming rules.
655655

656656
### HCS metadata
657657

658-
High content screening data is commonly composed of multiple multiscale images ("well") that are arranged in a grid on a "plate".
659-
Additional metadata for organizing the wells on a plate is introduced here.
658+
High-content screening data is typically organized as a grid of wells on a plate, where each well contains one or more multiscale images from one or more acquisition rounds.
659+
This section introduces additional metadata for organizing wells on a plate.
660660

661-
TODO:
662-
Open questions Joel:
663-
How do we relate derived data to existing data best in the HCS context without becoming a nesting nightmare?
661+
#### `plate` attribute
664662

665-
We have a well with X images. All of the images can have labels and tables. And maybe one would use the collection spec to allow for labels that apply to multiple images in the same well or tables that apply to multiple images.
663+
A `collection` node representing a plate MUST have a `plate` attribute with the following fields:
666664

667-
How do we represent images in wells that can optionally be related to labels and optionally be related to tables? Does the well always contain a nested collection (we called that “the OME-Zarr container”, e.g. the object that knows about the image data, label data and related table data like ROI tables in our work so far)? Or is it sometimes nested, sometimes not?
665+
| Field | Type | Required? | Notes |
666+
| - | - | - | - |
667+
| `"acquisitions"` | array of objects | no | List of acquisitions performed on the plate. |
668+
| `"columns"` | array of objects | yes | List of columns in the plate. Each object MUST have a `"name"` string field. |
669+
| `"rows"` | array of objects | yes | List of rows in the plate. Each object MUST have a `"name"` string field. |
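The required fields of the `plate` attribute can be checked mechanically. The following is a minimal sketch, not part of the RFC: the function name `validate_plate` and the wording of its messages are hypothetical, and it only covers the `columns` and `rows` rules from the table above plus the fact that `acquisitions` is optional.

```python
def validate_plate(plate: dict) -> list[str]:
    """Return a list of problems found in a `plate` attribute (empty if none).

    Hypothetical helper: checks only the rules stated in the table above.
    """
    problems = []
    # `columns` and `rows` are required arrays of objects with a string `name`.
    for key in ("columns", "rows"):
        entries = plate.get(key)
        if not isinstance(entries, list):
            problems.append(f"'{key}' is required and must be an array")
            continue
        for i, entry in enumerate(entries):
            if not isinstance(entry, dict) or not isinstance(entry.get("name"), str):
                problems.append(f"{key}[{i}] must be an object with a string 'name'")
    # `acquisitions` is optional, but must be an array when present.
    if not isinstance(plate.get("acquisitions", []), list):
        problems.append("'acquisitions', if present, must be an array")
    return problems


# A plate with one row and one column and no acquisitions is valid.
print(validate_plate({"columns": [{"name": "1"}], "rows": [{"name": "A"}]}))
```

Returning a list of problems rather than raising on the first one lets a caller report every issue in a plate at once; a real implementation might instead use a JSON Schema.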
668670

669-
#### Example
670-
```jsonc
671+
Each object in `acquisitions` has the following fields:
672+
673+
| Field | Type | Required? | Notes |
674+
| - | - | - | - |
675+
| `"id"` | number | yes | A unique integer identifier for the acquisition. |
676+
| `"name"` | string | no | A human-readable name for the acquisition. |
677+
678+
#### `well` attribute
679+
680+
A `collection` node representing a well MUST have a `well` attribute with the following fields:
681+
682+
| Field | Type | Required? | Notes |
683+
| - | - | - | - |
684+
| `"column"` | string | yes | The column name of the well in the plate. |
685+
| `"row"` | string | yes | The row name of the well in the plate. |
686+
687+
#### `acquisition` attribute
688+
689+
The `acquisition` attribute is a number whose value MUST match the `id` of one of the acquisitions listed in the `plate` attribute.
690+
It MAY be set on individual `multiscale` nodes within a well or on a `collection` sub-node grouping all images from a single acquisition.
691+
692+
We suggest two possible layouts for HCS data, which are not mutually exclusive and can be used in combination: a "wide" layout where all images are direct children of the well collection and a "tall" layout where images are grouped in sub-collections by acquisition.
693+
694+
#### Wide example (acquisitions flat in the well)
695+
696+
In this layout, all multiscale nodes are direct children of the well collection.
697+
Each node carries an `acquisition` attribute.
698+
Derived images such as label maps are siblings of their source image and can still be linked via the `source` reference in their `labels` attribute (or similarly via third-party attributes such as `ngio:source`). This layout is more compact but can become cluttered when there are multiple acquisitions and derived nodes.
699+
700+
```json
671701
{
672702
"ome": {
673703
"version": "0.x",
674704
"type": "collection",
675705
"name": "hcs-plate-001",
676706
"attributes": {
677707
"plate": {
678-
"acquisitions": [...],
679-
"columns": [...],
680-
"rows": [...],
708+
"acquisitions": [
709+
{
710+
"id": 0,
711+
"name": "Acquisition Round 1"
712+
}
713+
],
714+
"columns": [
715+
{
716+
"name": "1"
717+
}
718+
],
719+
"rows": [
720+
{
721+
"name": "A"
722+
}
723+
]
681724
}
682-
}
725+
},
683726
"nodes": [
684727
{
685-
"type": "collection",
686-
"name": "well A01",
687-
"attributes": {
688-
"well": {
689-
"column": 1,
690-
"row": "A",
691-
"acquisition": 0
692-
}
693-
}
694-
"nodes": [
695-
{
696-
"type": "multiscale",
697-
"name": "well-001-001",
698-
"path": {
699-
"type": "zarr",
700-
"path": "./A/01/001.img.zarr"
728+
"type": "collection",
729+
"name": "well A01",
730+
"attributes": {
731+
"well": {
732+
"column": "1",
733+
"row": "A"
701734
}
702735
},
703-
{
704-
"type": "multiscale",
705-
"name": "A01_0_nuclei",
706-
"path": {
707-
"type": "zarr",
708-
"path": "/full/path/A/01/nuclei.img.zarr"
736+
"nodes": [
737+
{
738+
"type": "multiscale",
739+
"name": "A01_0",
740+
"path": {
741+
"type": "zarr",
742+
"path": "./A/01/001.img"
743+
},
744+
"attributes": {
745+
"acquisition": 0
746+
}
747+
},
748+
{
749+
"type": "multiscale",
750+
"name": "A01_0_ill_corrected",
751+
"path": {
752+
"type": "zarr",
753+
"path": "./A/01/001_ill_corrected.img"
754+
},
755+
"attributes": {
756+
"acquisition": 0,
757+
"ngio:source": ["A01_0"]
758+
}
709759
},
710-
"attributes": {
711-
"labels": {}
760+
{
761+
"type": "multiscale",
762+
"name": "A01_0_nuclei",
763+
"path": {
764+
"type": "zarr",
765+
"path": "./A/01/001_nuclei.img"
766+
},
767+
"attributes": {
768+
"acquisition": 0,
769+
"labels": {
770+
"source": ["A01_0"]
771+
}
772+
}
773+
}
774+
]
775+
}
776+
]
777+
}
778+
}
779+
```
780+
781+
#### Tall example (acquisitions as sub-collections)
782+
783+
In this layout, each acquisition is wrapped in a sub-collection inside the well.
784+
The `acquisition` attribute is set on the sub-collection rather than on individual nodes.
785+
This also illustrates that wells can contain collections, not just multiscales.
786+
787+
```json
788+
{
789+
"ome": {
790+
"version": "0.x",
791+
"type": "collection",
792+
"name": "hcs-plate-001",
793+
"attributes": {
794+
"plate": {
795+
"acquisitions": [
796+
{
797+
"id": 0,
798+
"name": "Acquisition Round 1"
799+
},
800+
{
801+
"id": 1,
802+
"name": "Acquisition Round 2"
803+
}
804+
],
805+
"columns": [
806+
{
807+
"name": "1"
808+
}
809+
],
810+
"rows": [
811+
{
812+
"name": "A"
813+
}
814+
]
815+
}
816+
},
817+
"nodes": [
818+
{
819+
"type": "collection",
820+
"name": "well A01",
821+
"attributes": {
822+
"well": {
823+
"column": "1",
824+
"row": "A"
712825
}
713826
},
714-
...
827+
"nodes": [
828+
{
829+
"type": "collection",
830+
"name": "A01_acq0",
831+
"attributes": {
832+
"acquisition": 0
833+
},
834+
"nodes": [
835+
{
836+
"type": "multiscale",
837+
"name": "A01_0",
838+
"path": {
839+
"type": "zarr",
840+
"path": "./A/01/001.img"
841+
}
842+
},
843+
{
844+
"type": "multiscale",
845+
"name": "A01_0_nuclei",
846+
"path": {
847+
"type": "zarr",
848+
"path": "./A/01/001_nuclei.img"
849+
},
850+
"attributes": {
851+
"labels": {
852+
"source": ["A01_0"]
853+
}
854+
}
855+
}
856+
]
857+
},
858+
{
859+
"type": "collection",
860+
"name": "A01_acq1",
861+
"attributes": {
862+
"acquisition": 1
863+
},
864+
"nodes": [
865+
{
866+
"type": "multiscale",
867+
"name": "A01_1",
868+
"path": {
869+
"type": "zarr",
870+
"path": "./A/01/002.img"
871+
}
872+
},
873+
{
874+
"type": "multiscale",
875+
"name": "A01_1_nuclei",
876+
"path": {
877+
"type": "zarr",
878+
"path": "./A/01/002_nuclei.img"
879+
},
880+
"attributes": {
881+
"labels": {
882+
"source": ["A01_1"]
883+
}
884+
}
885+
}
886+
]
887+
}
715888
]
716-
},
717-
...]
889+
}
890+
]
718891
}
719892
}
720893
```
721894

895+
While the plate collections above are shown inlined for simplicity, an on-disk plate collection could instead refer to a separate on-disk well collection for each well.
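Because `acquisition` may sit either on a `multiscale` node (wide layout) or on an enclosing sub-collection (tall layout), a reader has to walk the tree to resolve it. The sketch below shows one way to do that for an inlined plate collection and to check the rule that every `acquisition` value matches an `id` declared in `plate`. The function names are hypothetical, and treating a sub-collection's `acquisition` as inherited by its children is an assumption about how the RFC is meant to be read, not something it states explicitly.

```python
def acquisition_refs(node, inherited=None):
    """Yield (name, acquisition) for every multiscale node under `node`.

    Assumption: an `acquisition` set on a collection applies to its
    descendants unless a descendant sets its own value.
    """
    acq = node.get("attributes", {}).get("acquisition", inherited)
    if node.get("type") == "multiscale":
        yield node.get("name"), acq
    for child in node.get("nodes", []):
        yield from acquisition_refs(child, acq)


def undeclared_acquisitions(plate_node):
    """Return names of multiscale nodes whose acquisition id is not listed in `plate`."""
    plate = plate_node["attributes"]["plate"]
    declared = {a["id"] for a in plate.get("acquisitions", [])}
    return [
        name
        for name, acq in acquisition_refs(plate_node)
        if acq is not None and acq not in declared
    ]


# `plate_node` is the object stored under the top-level "ome" key.
plate_node = {
    "type": "collection",
    "attributes": {"plate": {"acquisitions": [{"id": 0}],
                             "columns": [{"name": "1"}], "rows": [{"name": "A"}]}},
    "nodes": [{
        "type": "collection", "name": "well A01",
        "attributes": {"well": {"column": "1", "row": "A"}},
        "nodes": [
            # Wide style: acquisition set directly on the image.
            {"type": "multiscale", "name": "A01_0", "attributes": {"acquisition": 0}},
            # Tall style: acquisition set on a sub-collection (id 1 is not declared).
            {"type": "collection", "name": "A01_acq1", "attributes": {"acquisition": 1},
             "nodes": [{"type": "multiscale", "name": "A01_1"}]},
        ],
    }],
}
print(undeclared_acquisitions(plate_node))
```

The same traversal handles both layouts and mixtures of them, which matches the statement that the two layouts are not mutually exclusive.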
896+
722897
### bioformats2raw.layout metadata
723898

724899
The bioformats2raw.layout metadata is replaced by this proposal.
