docs/index.md
20 additions & 7 deletions
@@ -52,6 +52,11 @@ print(spec.model_dump())

 More examples can be found in the [usage guide](usage_zarr_v2.md).

+## Installation
+
+`pip install -U pydantic-zarr`
+
+
 ### Limitations

 #### No array data operations
@@ -61,11 +66,6 @@ This library only provides tools to represent the *layout* of Zarr groups and ar

 This library supports [version 2](https://zarr.readthedocs.io/en/stable/spec/v2.html) of the Zarr format, with partial support for [Zarr v3](https://zarr-specs.readthedocs.io/en/latest/v3/core/v3.0.html). Progress towards complete support for Zarr v3 is tracked by [this issue](https://github.com/d-v-b/pydantic-zarr/issues/3).

-
-## Installation
-
-`pip install -U pydantic-zarr`
-
 ## Design

 A Zarr group can be modeled as an object with two properties:
@@ -79,12 +79,25 @@ Note the use of the term "modeled": Zarr arrays are useful because they store N-

 In `pydantic-zarr`, Zarr groups are modeled by the `GroupSpec` class, which is a [`Pydantic model`](https://docs.pydantic.dev/latest/concepts/models/) with two fields:

-- `GroupSpec.attributes`: either a `Mapping` or a `pydantic.BaseModel`.
-- `GroupSpec.members`: a mapping with string keys and values that must be `GroupSpec` or `ArraySpec` instances.
+- `attributes`: either a `Mapping` or a `pydantic.BaseModel`.
+- `members`: either a mapping with string keys and values that must be `GroupSpec` or `ArraySpec` instances, or the value `None`. The use of nullability is explained in its own [section](#nullable-members).

 Zarr arrays are represented by the `ArraySpec` class, which has a similar `attributes` field, as well as fields for all the Zarr array properties (`dtype`, `shape`, `chunks`, etc).

 `GroupSpec` and `ArraySpec` are both [generic models](https://docs.pydantic.dev/1.10/usage/models/#generic-models). `GroupSpec` takes two type parameters, the first specializing the type of `GroupSpec.attributes`, and the second specializing the type of the *values* of `GroupSpec.members` (the keys of `GroupSpec.members` are always strings). `ArraySpec` only takes one type parameter, which specializes the type of `ArraySpec.attributes`.

 Examples using this generic typing functionality can be found in the [usage guide](usage_zarr_v2.md#using-generic-types).

+### Nullable `members`
+
+When a Zarr group has no members, a `GroupSpec` model of that Zarr group will have its `members` attribute set to the empty dict `{}`. But there are scenarios where the members of a Zarr group are unknown:
+
+- Some Zarr storage backends do not support directory listing, in which case it is possible to access a Zarr group and inspect its attributes, but impossible to discover its members. So the members of such a Zarr group are unknown.
+- Traversing a deeply nested, large Zarr group on high-latency storage can be slow. This can be mitigated by only partially traversing the hierarchy, e.g. only inspecting the root group and N subgroups. This defines a sub-hierarchy of the full hierarchy; the leaf groups of this sub-hierarchy by definition did not have their members checked, and so their members are unknown.
+- A Zarr hierarchy can be represented as a mapping `M` from paths to nodes (array or group). In this case, if `M["key"]` is a model of a Zarr group `G`, then `M["key/subkey"]` would encode a member of `G`. Since the key structure of the mapping `M` is doing the work of encoding the members of `G`, there is no value in `G` having a members attribute that claims anything about the members of `G`, and so `G.members` should be modeled as unknown.
+
+To handle these cases, `pydantic-zarr` allows the `members` attribute of a `GroupSpec` to be `None`.
+
+## Standardization
+
+The Zarr specifications do not define a model of the Zarr hierarchy. `pydantic-zarr` is an implementation of a particular model that can be found formalized in this [specification document](https://github.com/d-v-b/zeps/blob/zom/draft/ZEP0006.md), which has been proposed for inclusion in the Zarr specifications. You can find the discussion of that proposal in [this pull request](https://github.com/zarr-developers/zeps/pull/46).
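
The third scenario in the "Nullable `members`" section above (a hierarchy stored as a flat mapping `M` from paths to nodes) can be sketched roughly as follows. `GroupNode`, `ArrayNode`, and `to_flat` are hypothetical stand-ins built on the standard library only; they are not part of the `pydantic-zarr` API, and the choice of `""` as the root path is an assumption made for this illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any, Mapping, Optional, Union


@dataclass(frozen=True)
class ArrayNode:
    """Hypothetical stand-in for an array model: layout only, no data."""
    shape: tuple[int, ...]
    dtype: str
    attributes: Mapping[str, Any] = field(default_factory=dict)


@dataclass(frozen=True)
class GroupNode:
    """Hypothetical stand-in for a group model."""
    attributes: Mapping[str, Any] = field(default_factory=dict)
    # {} means "known to have no members"; None means "members unknown".
    members: Optional[Mapping[str, Union[GroupNode, ArrayNode]]] = None


def to_flat(node: Union[GroupNode, ArrayNode], path: str = "") -> dict:
    """Flatten a nested tree into a mapping M from paths to nodes."""
    if isinstance(node, ArrayNode):
        return {path: node}
    # The keys of M encode membership, so each group drops its own members.
    flat = {path: GroupNode(attributes=node.attributes, members=None)}
    for name, child in (node.members or {}).items():
        child_path = f"{path}/{name}" if path else name
        flat.update(to_flat(child, child_path))
    return flat


tree = GroupNode(
    attributes={"name": "root"},
    members={
        "g": GroupNode(members={"arr": ArrayNode(shape=(10,), dtype="uint8")}),
        "empty": GroupNode(members={}),
    },
)
flat = to_flat(tree)
```

In `flat`, the entry `flat["g/arr"]` is what records that `arr` belongs to `g`, so `flat["g"].members` is left as `None` rather than duplicating that information.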
docs/usage_zarr_v2.md
1 addition & 1 deletion
@@ -6,7 +6,7 @@

 The `GroupSpec` and `ArraySpec` classes represent Zarr v2 groups and arrays, respectively. To create an instance of a `GroupSpec` or `ArraySpec` from an existing Zarr group or array, pass the Zarr group / array to the `.from_zarr` method defined on the `GroupSpec` / `ArraySpec` classes. This will result in a `pydantic-zarr` model of the Zarr object.

-Note that `GroupSpec.from_zarr(zarr_group)` will traverse the entire hierarchy under `zarr_group`. Future versions of this library may introduce a limit on the depth of this traversal: see [#2](https://github.com/d-v-b/pydantic-zarr/issues/2).
+> By default, `GroupSpec.from_zarr(zarr_group)` will traverse the entire hierarchy under `zarr_group`. This can be extremely slow if used on an extensive Zarr group on high-latency storage. To limit the traversal depth, use the `depth` keyword argument, e.g. `GroupSpec.from_zarr(zarr_group, depth=1)`.

 Note that `from_zarr` will *not* read the data inside an array.
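
To illustrate how a depth limit interacts with unknown members, here is a minimal standard-library sketch. It does not call `pydantic-zarr`: `model_group` is a hypothetical helper, the nested dict stands in for a Zarr hierarchy, and the depth convention used here (`-1` for unlimited, `0` for "do not list members") is an assumption of the sketch, not a statement about how `from_zarr` interprets `depth`.

```python
from __future__ import annotations


def model_group(group: dict, depth: int = -1) -> dict:
    """Summarize a nested dict "group", descending at most `depth` levels.

    Assumed convention for this sketch: depth=-1 traverses everything,
    depth=0 returns the group with its members marked unknown (None).
    """
    if depth == 0:
        # Traversal budget exhausted: members were never listed.
        return {"attributes": group["attributes"], "members": None}
    members = {}
    for name, child in group["members"].items():
        if "members" in child:
            # Subgroup: recurse with one less level of budget.
            members[name] = model_group(child, depth - 1)
        else:
            # Array: record layout metadata only, never the data.
            members[name] = {
                "attributes": child["attributes"],
                "shape": child["shape"],
            }
    return {"attributes": group["attributes"], "members": members}


# A toy hierarchy: root -> group "a" -> empty group "b", plus array "x".
store = {
    "attributes": {},
    "members": {
        "a": {
            "attributes": {},
            "members": {"b": {"attributes": {}, "members": {}}},
        },
        "x": {"attributes": {}, "shape": (4,)},
    },
}

shallow = model_group(store, depth=1)  # children listed, grandchildren not
full = model_group(store)              # unlimited traversal
```

With `depth=1`, the subgroup `a` appears among the root's members but its own `members` is `None`, whereas the full traversal reaches `b` and records its `members` as the empty dict `{}`.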