xpublish-community · mpiannucci · May 26, 2026 · May 19, 2026
diff --git a/README.md b/README.md
@@ -1,6 +1,6 @@
 # Xpublish
 
-Publish Xarray Datasets to the web
+Publish Xarray Datasets and DataTrees to the web
 
 <!-- badges-start -->
 
@@ -18,12 +18,14 @@ Publish Xarray Datasets to the web
 
 ## A quick example
 
-**Serverside: Publish a Xarray Dataset through a rest API**
+**Serverside: Publish an Xarray Dataset or DataTree through a REST API**
 
 <!-- server-example-start -->
 
 ```python
 ds.rest.serve(host="0.0.0.0", port=9000)
+# or, for a hierarchical DataTree, the API is identical:
+dt.rest.serve(host="0.0.0.0", port=9000)
 ```
 
 <!-- server-example-end -->
@@ -55,9 +57,9 @@ Or to explore other access methods, open [http://0.0.0.0:9000/docs](http://0.0.0
 
 ## Why?
 
-Xpublish lets you serve/share/publish Xarray Datasets via a web application.
+Xpublish lets you serve/share/publish Xarray Datasets and DataTrees via a web application.
 
-The data and/or metadata in the Xarray Datasets can be exposed in various forms through [pluggable REST API endpoints](https://xpublish.readthedocs.io/en/latest/user-guide/plugins.html).
+The data and/or metadata can be exposed in various forms through [pluggable REST API endpoints](https://xpublish.readthedocs.io/en/latest/user-guide/plugins.html). Hierarchical data is supported natively — bare Datasets are wrapped in a single-node DataTree internally so the same routes, accessors, and plugins work whether you're serving a flat dataset or a deeply nested tree.
 Efficient, on-demand delivery of large datasets may be enabled with Dask on the server-side.
 
 Xpublish's [plugin ecosystem](https://xpublish.readthedocs.io/en/latest/ecosystem/index.html#plugins) has capabilities including:

diff --git a/docs/source/api/included_plugins.md b/docs/source/api/included_plugins.md
@@ -8,7 +8,21 @@ Xpublish includes a set of built in plugins with associated endpoints.
 
 ## Dataset Info
 
-The dataset info plugin provides a handful of default ways to display datasets and their metadata.
+The dataset info plugin provides a handful of default ways to display datasets
+and their metadata. Endpoints come in two flavors:
+
+- **Root endpoints** — `/`, `/keys`, `/dict`, `/info` — operate on the root node
+  of the underlying {py:class}`xarray.DataTree`. For a flat dataset this is just
+  the dataset itself; for a DataTree it is the root group.
+- **Group-aware endpoints** — `/groups/{group_path:path}/`,
+  `/groups/{group_path:path}/keys`, `/groups/{group_path:path}/dict`, and
+  `/groups/{group_path:path}/info` — return the same information for the node at
+  the given group path in the tree.
+
+In addition, two tree-shaped endpoints expose the DataTree directly:
+
+- `/tree` — HTML representation of the full DataTree.
+- `/groups` — JSON list of every group path in the tree (e.g. `["/", "/a", "/a/b"]`).
 
 ```{eval-rst}
 .. autosummary::
@@ -22,6 +36,12 @@ The dataset info plugin provides a handful of default ways to display datasets a
     /datasets/{dataset_id}/keys
     /datasets/{dataset_id}/dict
     /datasets/{dataset_id}/info
+    /datasets/{dataset_id}/tree
+    /datasets/{dataset_id}/groups
+    /datasets/{dataset_id}/groups/{group_path}
+    /datasets/{dataset_id}/groups/{group_path}/keys
+    /datasets/{dataset_id}/groups/{group_path}/dict
+    /datasets/{dataset_id}/groups/{group_path}/info
 ```
 
 ## Module Version
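
Note on the group-aware endpoints listed above: a client can derive every path from a dataset id, an optional group path, and an endpoint name. The helper below is a minimal sketch of that mapping, assuming only the URL layout shown in this diff. `group_endpoint` is a hypothetical client-side function, not part of Xpublish.

```python
from urllib.parse import quote


def group_endpoint(dataset_id: str, group_path: str = "", endpoint: str = "") -> str:
    """Build the URL path for a (possibly group-scoped) dataset info endpoint.

    Hypothetical client-side helper; it only mirrors the route layout in the docs.
    """
    parts = ["datasets", quote(dataset_id, safe="")]
    group = group_path.strip("/")
    if group:
        # Keep the slashes: ``{group_path:path}`` matches across them.
        parts += ["groups", quote(group, safe="/")]
    path = "/" + "/".join(parts)
    # With no endpoint name this yields the trailing-slash form, e.g. /datasets/tree/
    return f"{path}/{endpoint}"


root_keys = group_endpoint("tree", endpoint="keys")
nested_info = group_endpoint("tree", "/a/b", "info")
root_html = group_endpoint("tree")
```

An empty group path falls back to the root endpoints, which matches the description of root endpoints operating on the root node of the tree.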

diff --git a/docs/source/api/index.md b/docs/source/api/index.md
@@ -16,7 +16,9 @@ plugins
 ## Top-level Rest class
 
 The {class}`~xpublish.Rest` class can be used for publishing a
-{class}`xarray.Dataset` object or a collection of Dataset objects.
+{class}`xarray.Dataset` or {class}`xarray.DataTree` object, or a collection of either.
+A bare Dataset is wrapped in a single-node DataTree internally so the rest of the
+library operates uniformly on hierarchical data.
 
 The main interfaces to Xpublish that many users may use.
 
@@ -44,6 +46,7 @@ by plugin dependencies.
    Rest.setup_datasets
    Rest.get_datasets_from_plugins
    Rest.get_dataset_from_plugins
+   Rest.get_datatree_from_plugins
    Rest.setup_plugins
    Rest.init_cache_kwargs
    Rest.init_app_kwargs
@@ -120,6 +123,50 @@ dataset. Proper use of this accessor should be like:
    Dataset.rest.serve
 ```
 
+## DataTree.rest (xarray accessor)
+
+The same accessor is registered on {py:class}`xarray.DataTree`, exposing the
+same interface for publishing a single hierarchical tree:
+
+```
+>>> import xarray as xr
+>>> import xpublish
+>>> dt = xr.DataTree()          # or load one with xr.open_datatree(...)
+>>> dt.rest(...)                # configure (optional)
+>>> dt.rest.serve()             # serve the tree
+```
+
+**Calling the accessor**
+
+```{eval-rst}
+.. autosummary::
+   :toctree: generated/
+   :template: autosummary/accessor_callable.rst
+
+   DataTree.rest
+```
+
+**Properties**
+
+```{eval-rst}
+.. autosummary::
+   :toctree: generated/
+   :template: autosummary/accessor_attribute.rst
+
+   DataTree.rest.app
+   DataTree.rest.cache
+```
+
+**Methods**
+
+```{eval-rst}
+.. autosummary::
+   :toctree: generated/
+   :template: autosummary/accessor_method.rst
+
+   DataTree.rest.serve
+```
+
 ## FastAPI dependencies
 
 The functions below are defined in module `xpublish.dependencies` and can
@@ -139,7 +186,13 @@ passed in to the `Plugin.app_router` or `Plugin.dataset_router` method.
 
    get_dataset_ids
    get_dataset
+   get_datatree
    get_cache
    get_plugins
    get_plugin_manager
 ```
+
+When a route declares a `{group_path:path}` segment, `get_dataset` returns
+the Dataset at that node of the underlying DataTree (or the root dataset if no
+`group_path` is present). `get_datatree` returns the subtree rooted at the
+requested group.
diff --git a/docs/source/conf.py b/docs/source/conf.py
@@ -68,6 +68,14 @@
 # https://autodoc-pydantic.readthedocs.io/en/stable/users/configuration.html#show-schema-json-error-strategy
 autodoc_pydantic_model_show_json_error_strategy = 'coerce'
 
+# Skip rendering the JSON schema collapsible block for pydantic models.
+# Several xpublish models (e.g. Dependencies) hold Callable fields that aren't
+# JSON-serializable; on autodoc_pydantic 2.2.0 + pydantic 2.13 the second
+# invocation of the "coerce" sanitized-model fallback for the same model
+# returns a sibling whose core schema is still a MockCoreSchema and raises
+# `PydanticUserError: <Model> is not fully defined`.
+autodoc_pydantic_model_show_json = False
+
 myst_enable_extensions = []
 myst_heading_anchors = 6
 

diff --git a/docs/source/getting-started/tutorial/dataset-provider-plugin.md b/docs/source/getting-started/tutorial/dataset-provider-plugin.md
@@ -9,6 +9,12 @@ This also allows organizations to quickly be able to adapt Xpublish to work in t
 
 With this plugin, Xpublish can serve the same datasets as we explicitly defined and loaded in [serving multiple datasets](./serving-multiple-datasets.md), as well as any others supported by [`xr.tutorial`](https://github.com/pydata/xarray/blob/main/xarray/tutorial.py)
 
+The plugin implements {py:meth}`xpublish.plugins.hooks.PluginSpec.get_datatree` —
+the modern provider hook. The older `get_dataset` hook is still honored for
+backwards compatibility (with a {py:class}`DeprecationWarning`) but new plugins
+should always implement `get_datatree`. See the [DataTrees tutorial](./datatrees.md)
+for the lazy-by-group pattern used by Zarr/Icechunk-backed providers.
+
 ```{note}
 For more details on building dataset provider plugins, please see the [plugin user guide](../../user-guide/plugins.md#dataset-provider-plugins)
 ```
diff --git a/docs/source/getting-started/tutorial/dataset-provider-plugin.py b/docs/source/getting-started/tutorial/dataset-provider-plugin.py
@@ -12,11 +12,17 @@ def get_datasets(self):
         return list(xr.tutorial.file_formats)
 
     @hookimpl
-    def get_dataset(self, dataset_id: str):
+    def get_datatree(self, dataset_id: str, group: str):
+        # The xarray tutorial datasets are flat, so we only serve the root.
+        # Note: ``group`` must be a positional parameter (no default) — pluggy
+        # will not forward arguments that have defaults to the hookimpl.
+        if group:
+            return None
         try:
-            return xr.tutorial.open_dataset(dataset_id)
+            ds = xr.tutorial.open_dataset(dataset_id)
         except HTTPError:
             return None
+        return xr.DataTree(dataset=ds)
 
 
 rest = Rest({})
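
Note on the `group` comment in the hookimpl above: the behavior it warns about can be illustrated without pluggy. The sketch below is a simplified stand-in for name-based hook dispatch, assuming (as the comment states) that only parameters declared without a default are forwarded. `forwarded_names` and `call_hook` are hypothetical helpers written for this illustration and are not pluggy's API.

```python
import inspect


def forwarded_names(func):
    """Names of parameters declared without a default value."""
    params = inspect.signature(func).parameters.values()
    return [p.name for p in params if p.default is inspect.Parameter.empty]


def call_hook(func, **caller_kwargs):
    """Forward only the caller's arguments that match default-less parameters."""
    return func(**{name: caller_kwargs[name] for name in forwarded_names(func)})


def impl_with_default(dataset_id, group=""):
    # ``group`` has a default, so the dispatcher never forwards it.
    return group


def impl_positional(dataset_id, group):
    # ``group`` has no default, so the caller's value arrives intact.
    return group


seen_with_default = call_hook(impl_with_default, dataset_id="air", group="/a/b")
seen_positional = call_hook(impl_positional, dataset_id="air", group="/a/b")
```

Under this model the implementation with `group=""` always sees the empty string, which is why the tutorial plugin declares `group` without a default.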

diff --git a/docs/source/getting-started/tutorial/datatrees.md b/docs/source/getting-started/tutorial/datatrees.md
@@ -0,0 +1,130 @@
+# Serving DataTrees
+
+Xpublish treats {py:class}`xarray.DataTree` as its core data primitive. A bare
+{py:class}`xarray.Dataset` is just a one-node tree under the hood, so everything
+you've learned so far about serving Datasets applies unchanged when you switch
+to trees.
+
+## Serving a single DataTree
+
+You can publish a `DataTree` directly with {class}`~xpublish.SingleDatasetRest`
+or the `.rest` accessor — the API is identical to the Dataset case:
+
+```python
+import xarray as xr
+import xpublish
+
+dt = xr.DataTree(name="root")
+dt["a"] = xr.DataTree(dataset=xr.Dataset({"x": ("i", [1, 2, 3])}))
+dt["a/b"] = xr.DataTree(dataset=xr.Dataset({"y": ("j", [10.0, 20.0])}))
+
+xpublish.SingleDatasetRest(dt).serve()
+# or, equivalently:
+dt.rest.serve()
+```
+
+## Serving a collection of trees (and datasets)
+
+{class}`~xpublish.Rest` accepts a mapping whose values can be either
+`Dataset` or `DataTree` objects in any combination:
+
+```python
+rest = xpublish.Rest(
+    {
+        "flat": xr.Dataset({"var": ("x", [1, 2, 3])}),
+        "tree": dt,
+    }
+)
+rest.serve()
+```
+
+The flat dataset is wrapped in a single-node tree internally, so it shows up
+in the `/groups` listing as just `["/"]`.
+
+## Navigating groups via the URL
+
+Per-dataset routes can include an optional `{group_path:path}` segment to
+navigate into a node of the tree. The included `dataset_info` plugin uses this
+convention to expose group-aware variants of its endpoints:
+
+| URL                              | What it returns                      |
+| -------------------------------- | ------------------------------------ |
+| `/datasets/tree/`                | HTML repr of the root node           |
+| `/datasets/tree/keys`            | Variable keys at the root            |
+| `/datasets/tree/groups`          | List of every group path in the tree |
+| `/datasets/tree/tree`            | HTML repr of the full DataTree       |
+| `/datasets/tree/groups/a/keys`   | Variable keys at the `/a` node       |
+| `/datasets/tree/groups/a/b/info` | Schema info at the `/a/b` node       |
+
+Group paths can be arbitrarily nested — the `{group_path:path}` parameter
+matches across slashes. An unknown group returns a `404`.
+
+## Dataset provider plugins for trees
+
+The provider hook for plugins is
+{py:meth}`xpublish.plugins.hooks.PluginSpec.get_datatree`. It receives both the
+`dataset_id` and the requested `group` path, and returns the
+{py:class}`xarray.DataTree` rooted at that group (or `None` to pass to the next
+plugin). The returned tree's root corresponds to the requested group.
+
+```{important}
+``group`` must be declared as a **positional** parameter (no default) on your
+hookimpl. [Pluggy](https://pluggy.readthedocs.io/) does not forward arguments
+that have defaults, so a signature like ``def get_datatree(self, dataset_id, group="")``
+will silently receive an empty string regardless of the URL. See the
+[plugin user guide](../../user-guide/plugins.md#dataset-provider-plugins) for
+details.
+```
+
+### The lazy-by-group pattern
+
+For backends where loading the whole tree is expensive (Zarr v3, Icechunk,
+remote object stores), implement `get_datatree` so it opens *only* the
+requested group and wraps it in a single-node tree:
+
+```python
+import xarray as xr
+from xpublish import Plugin, hookimpl
+
+
+class IcechunkProvider(Plugin):
+    name: str = "icechunk"
+
+    @hookimpl
+    def get_datasets(self):
+        return list(self._known_repos)
+
+    @hookimpl
+    def get_datatree(self, dataset_id: str, group: str):
+        store = self._store_for(dataset_id)
+        if store is None:
+            return None
+        ds = xr.open_zarr(store, group=group or None, consolidated=False)
+        return xr.DataTree(dataset=ds)
+```
+
+Each request opens just the one group being viewed, so cost stays proportional
+to what's actually queried.
+
+## Migrating from `get_dataset`
+
+The older {py:meth}`xpublish.plugins.hooks.PluginSpec.get_dataset` hook is still
+honored but emits a {py:class}`DeprecationWarning`. The Dataset it returns is
+wrapped in a single-node DataTree, so only the root group is reachable through
+it. Migrate to `get_datatree` to expose hierarchical data. The change is
+mechanical: add a `group` parameter and wrap the result in a `DataTree`:
+
+```python
+# Before
+@hookimpl
+def get_dataset(self, dataset_id: str):
+    return xr.tutorial.open_dataset(dataset_id)
+
+
+# After
+@hookimpl
+def get_datatree(self, dataset_id: str, group: str):
+    if group:
+        return None  # we only serve a flat dataset
+    return xr.DataTree(dataset=xr.tutorial.open_dataset(dataset_id))
+```
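
Note on the group semantics described in the DataTrees tutorial above: the `/groups` listing, the `{group_path:path}` lookup, and the `404` for an unknown group can be sketched with plain dictionaries. This is an illustration under stated assumptions, not Xpublish's implementation. `TREE`, `list_groups`, and `resolve_group` are hypothetical names, and a `LookupError` stands in for the HTTP `404`.

```python
from typing import Any

# Hypothetical stand-in for a DataTree: each node has variables and child groups.
TREE: dict[str, Any] = {
    "vars": [],
    "children": {
        "a": {"vars": ["x"], "children": {"b": {"vars": ["y"], "children": {}}}},
    },
}


def list_groups(node: dict[str, Any], prefix: str = "") -> list[str]:
    """Return every group path, root first, in depth-first order."""
    paths = [prefix or "/"]
    for name, child in node["children"].items():
        paths += list_groups(child, f"{prefix}/{name}")
    return paths


def resolve_group(node: dict[str, Any], group_path: str) -> dict[str, Any]:
    """Walk ``group_path`` from the root; raise LookupError for an unknown group."""
    for name in filter(None, group_path.split("/")):
        try:
            node = node["children"][name]
        except KeyError:
            raise LookupError(f"unknown group: {group_path!r}") from None
    return node


groups = list_groups(TREE)
keys_at_ab = resolve_group(TREE, "a/b")["vars"]
```

An empty group path resolves to the root node, mirroring how the root endpoints behave when no `group_path` segment is present.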
diff --git a/docs/source/getting-started/tutorial/index.md b/docs/source/getting-started/tutorial/index.md
@@ -11,6 +11,7 @@ hidden:
 introduction
 dataset-router
 serving-multiple-datasets
+datatrees
 using-plugins
 dataset-router-plugin
 dataset-provider-plugin

diff --git a/docs/source/getting-started/tutorial/introduction.md b/docs/source/getting-started/tutorial/introduction.md
@@ -32,6 +32,10 @@ for more convenience:
 ds.rest
 ```
 
+The same accessor is registered on {py:class}`xarray.DataTree` — `dt.rest`
+works exactly like `ds.rest`. See the [DataTrees tutorial](./datatrees.md) for
+how hierarchical data is served and navigated.
+
 Optional customization of the underlying [FastAPI application](https://fastapi.tiangolo.com) or the server-side [cache](https://github.com/dask/cachey) is possible, e.g.,
 
 ```python

diff --git a/docs/source/getting-started/why-xpublish.md b/docs/source/getting-started/why-xpublish.md
@@ -4,7 +4,7 @@ Xarray provides an intuitive API on top of a foundational data model, labeled ar
 This API and data model has formed the basis for a large and growing ecosystem of tools.
 
 Xpublish stands on the shoulders of Xarray and the greater PyData ecosystem enabling both new and old users, interactions, and clients.
-Xpublish does this by using Xarray datasets as the core data interchange format within the server, and surrounding that with an ecosystem of plugins.
+Xpublish does this by using Xarray datasets and DataTrees as the core data interchange format within the server, and surrounding that with an ecosystem of plugins.
 
 ```{warning} Hold on to your hats, we're about to say Xpublish a lot
 <div style='position:relative; padding-bottom:calc(75.00% + 44px)'><iframe src='https://gfycat.com/ifr/ShadowyHoarseInganue' frameborder='0' scrolling='no' width='100%' height='100%' style='position:absolute;top:0;left:0;' allowfullscreen></iframe></div><p> <a href="https://gfycat.com/shadowyhoarseinganue">via Gfycat</a></p>