Remove zarr plugin and decouple core from zarr #333

Open
mpiannucci wants to merge 2 commits into main from claude/compassionate-lovelace-00fca6

@mpiannucci
Contributor

@mpiannucci mpiannucci commented May 20, 2026

Summary

The zarr plugin has been ported to its own project (xpublish-community/xpublish-zarr#1). This PR removes the built-in ZarrPlugin, the xpublish.utils.zarr module, and the zarr / numcodecs / dask dependencies from xpublish core.

What changed

Removed

  • xpublish/plugins/included/zarr.py (the plugin) and xpublish/utils/zarr.py (the zmetadata/chunk helpers)
  • The zarr entry point, zarr/numcodecs/dask dependencies, and zarr keyword from pyproject.toml
  • zarr and numcodecs from the modules reported by ModuleVersionPlugin
  • tests/test_zarr_compat.py, tests/test_fsspec_compat.py, and the now-unused tests/utils.py (TestStore/create_dataset were only consumed by those files)
  • Zarr-only test cases / cache-key assertions in tests/test_rest_api.py, tests/test_core.py, and tests/test_plugin_management.py
  • The zarr-focused examples/open_dataset.ipynb notebook and zarr/numcodecs from .binder/environment.yml / .binder/test.py
  • Stale "Disable Zarr upstream tests" comment in the CI workflow

Decoupled /info
The dataset_info plugin's /info endpoint previously ran the dataset through the zarr consolidated-metadata pipeline to read attrs/dims/dtypes. It now reads them directly from the xarray.Dataset, with a small _jsonable helper that converts numpy scalars / arrays into JSON-serializable types. The response shape is unchanged for the test fixtures.
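
A minimal sketch of what such a helper might look like, assuming it relies on the `.tolist()` method that numpy scalars and arrays both provide. The name `jsonable` and its exact behavior are illustrative assumptions; the PR's actual `_jsonable` implementation is not shown in this description.

```python
from typing import Any


def jsonable(value: Any) -> Any:
    """Hypothetical stand-in for the PR's _jsonable helper.

    Recursively converts array-like values and containers into plain
    Python types that json.dumps can serialize.
    """
    # numpy scalars (np.generic) and arrays (np.ndarray) both expose
    # .tolist(), which returns plain Python scalars or nested lists.
    if hasattr(value, "tolist"):
        return jsonable(value.tolist())
    # Attribute dicts may nest further numpy values; JSON keys must be strings.
    if isinstance(value, dict):
        return {str(key): jsonable(item) for key, item in value.items()}
    # Tuples become lists, since JSON has no tuple type.
    if isinstance(value, (list, tuple)):
        return [jsonable(item) for item in value]
    # Strings, ints, floats, bools, and None pass through unchanged.
    return value
```

With numpy installed, `jsonable(np.float32(1.5))` would return a Python `float` and `jsonable(np.arange(3))` a list of `int`s, so dataset and variable attrs could be passed through it before being placed in the response.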

Docs

  • README, the getting-started tutorial, and docs/source/api/included_plugins.md no longer describe built-in zarr endpoints; they point at xpublish-zarr instead.

🤖 Generated with Claude Code

mpiannucci and others added 2 commits May 20, 2026 13:34
The zarr plugin has been ported to its own project
(xpublish-community/xpublish-zarr#1). This drops the built-in
ZarrPlugin, the xpublish.utils.zarr module, and the zarr/numcodecs
dependencies from xpublish core.

- Delete xpublish/plugins/included/zarr.py and xpublish/utils/zarr.py
- Rewrite dataset_info's /info endpoint to read dims/dtype/attrs
  directly from the xarray Dataset (with a small helper to make numpy
  scalars JSON-serializable), removing the zmetadata round-trip
- Drop zarr/numcodecs from module_version's reported modules
- Remove zarr, numcodecs, and dask dependencies, the zarr entry point,
  and the zarr keyword from pyproject.toml
- Update README, tutorial, and included-plugins docs to point at
  xpublish-zarr instead of describing built-in zarr endpoints
- Remove zarr-only tests and prune zarr cases from the remaining suite
- Drop the zarr-focused example notebook and zarr/numcodecs from the
  binder environment

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Drop dask from the upstream dev-build install list since it's no
  longer a dependency; installing it with --no-deps left cloudpickle
  missing in dev builds, which broke xarray decoding
- Reorder xpublish/plugins/hooks.py so Plugin is defined before
  Dependencies, eliminating the 'Plugin' forward reference in
  Dependencies.plugins that tripped autodoc-pydantic during the docs
  build
- Disable autodoc_pydantic_model_show_json globally; the Dependencies
  model is all Callable fields and pydantic can't generate a JSON
  schema for it (the coerce strategy didn't fully cover this case)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@mpiannucci mpiannucci requested review from abkfenris and jhamman May 20, 2026 18:23
@mpiannucci mpiannucci marked this pull request as ready for review May 21, 2026 19:49
Member

@abkfenris abkfenris left a comment

I think the main thing is making it clearer that Zarr was ripped out of core and now needs to be installed separately to get the previous functionality.

Comment on lines -59 to -70
The Zarr plugin provides consolidated Zarr v2 access to the loaded datasets.

```{eval-rst}
.. autosummary::
   :toctree: generated/

   plugins.included.zarr.ZarrPlugin

.. openapi:: ./openapi.json
   :include:
      /datasets/{dataset_id}/zarr/*
```
Member

Let's put a notice here that Zarr has migrated to its own library as well.

Comment on lines -55 to +68
-from ...utils.zarr import attrs_key, get_zmetadata, get_zvariables  # type: ignore

-zvariables = get_zvariables(dataset, cache)
-zmetadata = get_zmetadata(dataset, cache, zvariables)

-info = {}
+info: dict = {}
Member

Is there a slowdown from the lack of caching here? Could we cache info?
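
One way caching could be reintroduced, sketched under assumptions: a plain in-process dict keyed by dataset id, with the info payload built only on the first request. The names `cached_info` and `_info_cache` and the key format are hypothetical, and this does not reflect xpublish's actual cache dependency or the approach the PR will take.

```python
from typing import Callable, Dict

# Hypothetical module-level cache; a real plugin would more likely use
# the cache object that xpublish already injects as a dependency.
_info_cache: Dict[str, dict] = {}


def cached_info(dataset_id: str, build: Callable[[], dict]) -> dict:
    """Return the cached info payload for dataset_id, building it on a miss."""
    key = f"{dataset_id}/info"  # assumed key format, for illustration only
    if key not in _info_cache:
        # Only pay the cost of walking the dataset on the first request.
        _info_cache[key] = build()
    return _info_cache[key]
```

Whether this is worthwhile depends on how expensive reading dims, dtypes, and attrs from the `xarray.Dataset` turns out to be, and a cache like this would also need invalidation if a dataset can change while the server is running.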

Comment thread xpublish/plugins/hooks.py
return d


class Dependencies(BaseModel):
Member

Moved just because?

Comment on lines +74 to +75
Additional data access endpoints (such as Zarr-compatible access via
[xpublish-zarr]) can be added by installing or writing plugins.
Member

This may be worth an admonition to be more explicit that Zarr was ripped out into a standalone library for the next version and now has to be independently installed.
