264 changes: 264 additions & 0 deletions .github/release-notes/v0.7.md
@@ -0,0 +1,264 @@
# v0.7.0

**zarrita** aims to be a delightful little library for working with Zarr in
TypeScript. Since the v0.5 rewrite, real-world data and use cases have moved
faster than the library, and our initial abstractions began forcing users to
reach past the public API to get their work done
([#349](https://github.com/manzt/zarrita.js/issues/349),
[#310](https://github.com/manzt/zarrita.js/issues/310),
[#325](https://github.com/manzt/zarrita.js/issues/325),
[#352](https://github.com/manzt/zarrita.js/issues/352)). You filed issues. It
took us some time to work out the right shape for the fixes. Thank you for your
patience.

**v0.7 focuses on extensibility and correctness.** **Extensibility**, because
most real use cases weren't about writing new stores (you still can, and
`FetchStore` covers the common case). They were about *layering behavior*
over an existing store: caching, request batching, auth, virtual-format
translation. People were already doing that through subclassing and
hand-rolled proxies. v0.7 makes layering simple and enjoyable.
**Correctness**, because the other gaps were places where the library
silently got the wrong answer on real data.

> ⚠️ **Heads up**: v0.7 has one hard break (`zarr.create` options are now
> camelCase) and one deprecation (`FetchStore`'s `overrides` option, in
> favor of `fetch`). Everything else is additive. Full details in the
> [migration guide](https://zarrita.dev/migration/v0.7).

## Extensions

Before v0.7, zarrita had blessed extension points for custom codecs and custom
stores (implementing `AsyncReadable` from scratch), but nothing for **layering
behavior on an existing store**. Adding these kinds of features to a
`FetchStore`, for example, meant subclassing it, wrapping it in a hand-rolled
`Proxy`, or smuggling per-call state through the `AsyncReadable<Options>`
generic. Intercepting at the *chunk* layer (instead of the byte layer) meant
replacing `zarr.Array` with a bare `Proxy` entirely; there was no extension
point there.

v0.7 introduces two symmetric extension points, one per layer:

| Layer | Intercepts | Define | Compose |
| --------- | ------------------------- | --------------------------- | ------------------ |
| Transport | `store.get(key, range)` | `zarr.defineStoreExtension` | `zarr.extendStore` |
| Data | `array.getChunk(coords)` | `zarr.defineArrayExtension` | `zarr.extendArray` |

Store extensions handle paths and bytes. Array extensions handle chunk
coordinates. A factory receives the inner value and user options and returns an
object of method overrides; everything else is delegated through a `Proxy`, so
consumers never notice the wrapper. Compose extensions in a pipeline:

```ts
let store = await zarr.extendStore(
  new zarr.FetchStore("https://example.com/data.zarr"),
  zarr.withConsolidatedMetadata,
  (s) => zarr.withRangeCoalescing(s, { coalesceSize: 32_768 }),
  (s) => zarr.withByteCaching(s),
);

let arr = await zarr.open(store, { kind: "array" });
```
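
To make the "method overrides plus `Proxy` delegation" idea concrete, here is a minimal, library-independent sketch. It is not zarrita's implementation: `Readable`, `wrapStore`, `base`, and `counted` are hypothetical names chosen for illustration, and the real `zarr.defineStoreExtension` may work differently.

```ts
// Hypothetical minimal store shape, for illustration only (not zarrita's interface).
interface Readable {
  get(key: string): Promise<Uint8Array | undefined>;
}

// Return a wrapper where properties present in `overrides` win and every
// other property access falls through to `inner`.
function wrapStore<S extends Readable>(inner: S, overrides: Partial<Readable>): S {
  return new Proxy(inner, {
    get(target, prop) {
      if (Object.prototype.hasOwnProperty.call(overrides, prop)) {
        return Reflect.get(overrides, prop);
      }
      const value = Reflect.get(target, prop, target);
      // Bind delegated methods to the inner object so its `this` stays intact.
      return typeof value === "function" ? value.bind(target) : value;
    },
  });
}

// A toy in-memory store with one extra field the wrapper knows nothing about.
const base = {
  name: "memory",
  async get(key: string) {
    return new TextEncoder().encode(key);
  },
};

// Layer a call counter over `get`; `name` is still reachable through the proxy.
let calls = 0;
const counted = wrapStore(base, {
  async get(key) {
    calls += 1;
    return base.get(key);
  },
});
```

Because untouched properties pass straight through, code holding `counted` can use it wherever it would have used `base`.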

Three store extensions ship in the box, built on the new primitive rather
than bolted alongside it:

- **`zarr.withConsolidatedMetadata`**: short-circuits metadata reads from a
  pre-fetched consolidated blob. Now reads v3 `consolidated_metadata`
  from the root `zarr.json`, matching zarr-python. A `format` option
  picks v2, v3, or a fallback order; auto-detects by default. *v3
  consolidated metadata is not yet part of the official spec; treat
  it as experimental.*
- **`zarr.withRangeCoalescing`**: microtask-tick range batcher. Concurrent
  `getRange` calls within a microtask are grouped by path, coalesced
  across a byte-gap threshold, and issued as a single request per group.
  Big win for many-small-chunk workloads.
- **`zarr.withByteCaching`**: byte cache over `get` and `getRange`, with an
  optional `keyFor` for cache-policy narrowing. The cache container is
  any object implementing `has`/`get`/`set`; a plain `Map` is the default.
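
The "coalesced across a byte-gap threshold" step can be sketched as a small pure function. This is an illustration of the general technique, not zarrita's code: `ByteRange`, `coalesceRanges`, and `maxGap` are hypothetical names, and how the real `coalesceSize` option maps onto this logic is not something the sketch assumes.

```ts
// Hypothetical range shape: `length` bytes starting at byte `offset`.
interface ByteRange {
  offset: number;
  length: number;
}

// Merge ranges whose gap to the previous range is at most `maxGap` bytes,
// so each merged span can be fetched with one request.
function coalesceRanges(ranges: ByteRange[], maxGap: number): ByteRange[] {
  const sorted = [...ranges].sort((a, b) => a.offset - b.offset);
  const merged: ByteRange[] = [];
  for (const range of sorted) {
    const last = merged[merged.length - 1];
    if (last !== undefined && range.offset - (last.offset + last.length) <= maxGap) {
      // Close enough: extend the previous span to cover this range too.
      const end = Math.max(last.offset + last.length, range.offset + range.length);
      last.length = end - last.offset;
    } else {
      merged.push({ ...range });
    }
  }
  return merged;
}

// Two nearby chunks collapse into one span; the distant one stays separate.
const groups = coalesceRanges(
  [
    { offset: 0, length: 100 },
    { offset: 120, length: 80 },
    { offset: 10_000, length: 50 },
  ],
  64,
);
// groups -> [{ offset: 0, length: 200 }, { offset: 10000, length: 50 }]
```

The trade-off is a few wasted bytes in the gaps in exchange for fewer round trips, which is usually favorable when chunks are small and latency dominates.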

### Virtual-format adapters

A store extension can also declare an `arrayExtensions` field. `zarr.open`
wraps every array it pulls out of that store with those extensions
automatically. This is the primitive behind **virtual-format adapters**:
projects that synthesize Zarr metadata at the transport layer and supply
decoded chunks at the data layer from a single factory with shared closure
state:

```ts
const hdf5VirtualZarr = zarr.defineStoreExtension(
  (inner, extOptions: { root: string }) => {
    let parsed = parseHdf5(extOptions.root); // shared between layers
    return {
      async get(key, options) {
        if (isVirtualMetadataKey(key, parsed)) {
          return synthesizeJson(key, parsed);
        }
        return inner.get(key, options);
      },
      arrayExtensions: [
        zarr.defineArrayExtension((_inner) => ({
          async getChunk(coords) {
            return parsed.readChunk(coords);
          },
        })),
      ],
    };
  },
);
```

Full API reference: [store extensions docs](https://zarrita.dev/store-extensions).

## Correctness

### Structured errors

Zarrita now throws a small set of tagged error types from a single hierarchy,
replacing the ad-hoc mix of plain `Error`s and one-off classes
(`NodeNotFoundError`, `JsonDecodeError`, `KeyError`, `IndexError`). Six tags
cover every failure mode reachable from `zarr.open`, `zarr.get`, and
`zarr.set`:

| Tag | Thrown when | Extra fields |
| ----------------------- | --------------------------------------------------------------------------------------------------------- | ----------------------------- |
| `NotFoundError` | store returned nothing, or `open({ kind })` found a different node kind | `path`, `found` |
| `InvalidMetadataError` | JSON decode failure, unknown dtype or chunk-key encoding, codec config rejected at load time | |
| `UnknownCodecError` | codec name not in the registry | `codec` |
| `CodecPipelineError` | codec encode/decode threw at runtime | `direction`, `codec`, `cause` |
| `InvalidSelectionError` | bad rank, out-of-bounds, zero step, dimension-name mismatch, scalar-shape mismatch | |
| `UnsupportedError` | capability limit (sharded set, unimplemented codec encode paths, missing `DataView.prototype.getFloat16`) | |

```ts
try {
  await zarr.open(store, { kind: "array" });
} catch (e) {
  if (zarr.isZarritaError(e, "NotFoundError")) {
    // e.path, e.found available
  } else if (zarr.isZarritaError(e, "UnknownCodecError")) {
    // e.codec is the unregistered codec name; register and retry
  }
}
```

Classes are exported, but the docs steer callers toward `zarr.isZarritaError`
so `instanceof` checks aren't load-bearing. The class hierarchy can change
later without breaking tag-based call sites.
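
To illustrate why tag-based checks decouple call sites from the class hierarchy, here is a minimal sketch of a tagged error plus a type guard. It is not zarrita's implementation: `TaggedError`, `isTagged`, and the `tag` field are hypothetical stand-ins, and the real `zarr.isZarritaError` may inspect something else entirely.

```ts
// Hypothetical subset of tags, for illustration only.
type Tag = "NotFoundError" | "UnknownCodecError";

class TaggedError<T extends Tag> extends Error {
  readonly tag: T;
  constructor(tag: T, message: string) {
    super(message);
    this.tag = tag;
    this.name = tag;
  }
}

// The guard reads a string tag instead of using `instanceof`, so it keeps
// working if classes are renamed, re-parented, or duplicated across bundles.
function isTagged<T extends Tag>(e: unknown, tag: T): e is TaggedError<T> {
  if (typeof e !== "object" || e === null) return false;
  return (e as { tag?: unknown }).tag === tag;
}

const err: unknown = new TaggedError("NotFoundError", "no node at /missing");
const matches = isTagged(err, "NotFoundError"); // true
const mismatch = isTagged(err, "UnknownCodecError"); // false
```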

### Quantized data: `cast_value`, `scale_offset`, `fixedscaleoffset`

v0.7 adds the
[`cast_value`](https://github.com/zarr-developers/zarr-extensions/tree/main/codecs/cast_value)
and
[`scale_offset`](https://github.com/zarr-developers/zarr-extensions/tree/main/codecs/scale_offset)
codecs from zarr-extensions, which together read arrays that store quantized
values on disk (e.g. `int16` scaled into a `float32` range). Zarrita also
translates the v2 `fixedscaleoffset` numcodecs filter into a `scale_offset` +
`cast_value` pair on read, so v2 arrays produced by
`numcodecs.FixedScaleOffset` (common in climate and imaging data) open
transparently. The array is exposed with its logical (decoded) data type even
though the bytes on disk are quantized.
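
As a worked example of what "quantized on disk, logical type in memory" means, the sketch below applies the decode rule used by `numcodecs.FixedScaleOffset`, which is `stored / scale + offset`. The function name `decodeFixedScaleOffset` and the sample values are hypothetical, and this is not zarrita's codec code; it only shows the arithmetic.

```ts
// Decode int16 values written with FixedScaleOffset-style quantization.
// The encoder stored round((x - offset) * scale); decoding inverts that.
function decodeFixedScaleOffset(
  stored: Int16Array,
  scale: number,
  offset: number,
): Float32Array {
  return Float32Array.from(stored, (value) => value / scale + offset);
}

// Illustrative values: pressures near 1000 stored with one decimal of precision.
const decoded = decodeFixedScaleOffset(new Int16Array([0, 5, -20]), 10, 1000);
// decoded -> Float32Array [1000, 1000.5, 998]
```

Quantization is lossy on write: anything finer than `1 / scale` is rounded away before the bytes reach disk.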

### Fill values, scalars, and browser autodetect

`NaN`, `Infinity`, and `-Infinity` now round-trip correctly as fill values per
the Zarr v3 spec (previously they silently serialized as `null`, and missing
chunks filled with `0`). `get` and `set` work for scalar arrays (`shape=[]`).
`zarr.open`'s version autodetection no longer fails in browsers when a server
returns a non-JSON response for a v2 metadata key. Each was a
silent-wrong-answer bug on real data. Not anymore.
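
The `null` problem comes from JSON itself: `JSON.stringify` has no representation for non-finite numbers. The Zarr v3 spec encodes them as the strings `"NaN"`, `"Infinity"`, and `"-Infinity"`. The sketch below shows that mapping; `encodeFloatFillValue` and `decodeFloatFillValue` are hypothetical helper names, not zarrita exports, and the spec's hexadecimal string form is left out for brevity.

```ts
// Plain JSON serialization silently loses non-finite floats.
const naive = JSON.stringify({ fill_value: NaN }); // '{"fill_value":null}'

// Map non-finite floats to the string forms used in Zarr v3 metadata.
function encodeFloatFillValue(value: number): number | string {
  if (Number.isNaN(value)) return "NaN";
  if (value === Infinity) return "Infinity";
  if (value === -Infinity) return "-Infinity";
  return value;
}

// Inverse mapping; unknown strings are rejected rather than guessed at.
function decodeFloatFillValue(json: number | string): number {
  if (typeof json === "number") return json;
  switch (json) {
    case "NaN":
      return NaN;
    case "Infinity":
      return Infinity;
    case "-Infinity":
      return -Infinity;
    default:
      throw new TypeError(`unsupported fill value: ${json}`);
  }
}

const roundTripped = decodeFloatFillValue(encodeFloatFillValue(-Infinity)); // -Infinity
```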

## Ergonomics

### Custom `fetch` on `FetchStore`

`FetchStore` now accepts a [WinterTC](https://wintertc.org/)-style `fetch`
handler (a function from `Request` to `Promise<Response>`) for the long tail of
fetch-level concerns: auth, presigning, header injection, response remapping,
retries. Configured once on the store, it runs on every outgoing request, so
the rest of your code doesn't have to know:

```ts
const store = new FetchStore("https://example.com/data.zarr", {
  async fetch(request) {
    const token = await getAccessToken();
    request.headers.set("Authorization", `Bearer ${token}`);
    return fetch(request);
  },
});
```

The previous `overrides` option only supported static `RequestInit`, so
anything dynamic (like refreshing a token) had to be threaded through every
`zarr.get` call site. `overrides` is deprecated in favor of `fetch` and will be
removed in a future major release. See the [migration
guide](https://zarrita.dev/migration/v0.7#fetchstore-s-overrides-option-is-deprecated-in-favor-of-fetch).
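
Because the handler is just a function from `Request` to `Promise<Response>`, concerns like retries can be written as ordinary higher-order functions. The sketch below is one possible shape, not part of zarrita: `FetchHandler` and `withRetry` are hypothetical names, and the policy (retry on 5xx with a fixed delay) is an arbitrary choice for illustration.

```ts
type FetchHandler = (request: Request) => Promise<Response>;

// Wrap a handler so that 5xx responses are retried up to `maxAttempts` times
// in total, waiting `delayMs` between attempts.
function withRetry(
  inner: FetchHandler,
  maxAttempts: number,
  delayMs: number,
): FetchHandler {
  return async (request) => {
    // Clone per attempt so the original request stays reusable.
    let response = await inner(request.clone());
    for (let attempt = 1; attempt < maxAttempts && response.status >= 500; attempt++) {
      await new Promise((resolve) => setTimeout(resolve, delayMs));
      response = await inner(request.clone());
    }
    return response;
  };
}
```

Assuming the handler signature described above, such a wrapper could be supplied as the store's `fetch` option, for example `withRetry(fetch, 3, 200)`.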

### Cancellation with `AbortSignal`

`zarr.open`, `zarr.get`, and `zarr.set` now accept an
[`AbortSignal`](https://developer.mozilla.org/en-US/docs/Web/API/AbortSignal).
Signals are forwarded to the store and checked between async steps, so
in-flight work cancels cleanly:

```ts
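const controller = new AbortController();
// Calling controller.abort() elsewhere (e.g. when the user navigates away)
// cancels the in-flight read.
await zarr.get(arr, null, { signal: controller.signal });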
```

Removing the `Options` generic on `AsyncReadable` (the same change that
unblocked array extensions) let `signal` become a plain field on `GetOptions`
instead of opaque per-call state threaded through the store.
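
"Checked between async steps" follows a common pattern that can be sketched independently of zarrita. In this illustration `runSteps` is a hypothetical helper; the only platform API it relies on is the standard `AbortSignal.throwIfAborted()`.

```ts
// Run async steps in order, bailing out before the next step starts
// if the signal has been aborted in the meantime.
async function runSteps<T>(
  steps: Array<() => Promise<T>>,
  signal?: AbortSignal,
): Promise<T[]> {
  const results: T[] = [];
  for (const step of steps) {
    signal?.throwIfAborted(); // throws signal.reason once aborted
    results.push(await step());
  }
  return results;
}
```

A step that is already running is not interrupted by this check alone; that is why the signal is also forwarded to the store, where the underlying request can be cancelled.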

### Named-dimension selection

`zarr.Array` now exposes a `dimensionNames` getter that returns the same answer
for v2 (`_ARRAY_DIMENSIONS` in `.zattrs`) and v3 (`dimension_names` in
`zarr.json`). You don't need to know which. The new `zarr.select` helper
converts a record of dimension names into the positional selection array
`zarr.get` and `zarr.set` accept:

```ts
// arr.dimensionNames -> ["time", "lat", "lon"]
let selection = zarr.select(arr, { lat: zarr.slice(100, 200), time: 0 });
// -> [0, slice(100, 200), null]
let result = await zarr.get(arr, selection);
```

Unknown dimension names throw an `InvalidSelectionError` (see above).
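
The name-to-position conversion can be pictured with a small stand-alone sketch. It is not zarrita's implementation: `selectByName` and the simplified `Sel` type are hypothetical, and it throws a plain `RangeError` where zarrita throws `InvalidSelectionError`.

```ts
// Simplified selection element: an index, a slice, or null for "everything".
type Sel = number | null | { start: number; stop: number };

// Place each named pick at the position of its dimension; unnamed
// dimensions default to null (select the full axis).
function selectByName(dimensionNames: string[], picks: Record<string, Sel>): Sel[] {
  for (const name of Object.keys(picks)) {
    if (!dimensionNames.includes(name)) {
      throw new RangeError(`unknown dimension name: ${name}`);
    }
  }
  const byName = new Map(Object.entries(picks));
  return dimensionNames.map((name) => byName.get(name) ?? null);
}

const positional = selectByName(["time", "lat", "lon"], {
  lat: { start: 100, stop: 200 },
  time: 0,
});
// positional -> [0, { start: 100, stop: 200 }, null]
```

The order of keys in the record does not matter; only the array's dimension order decides where each pick lands.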

## Also in this release

- **camelCase `zarr.create` options**: `chunkShape`, `fillValue`,
  `chunkSeparator`, `dimensionNames`, `dtype` (was `data_type`). The
  one hard break in v0.7; TypeScript will flag every call site.
  Migration in
  [the guide](https://zarrita.dev/migration/v0.7#zarr-create-options-are-camelcase).
- **`numcodecs.*` namespace** for the v2 codec registry, matching
  [zarr-python's convention](https://numcodecs.readthedocs.io/).
  Built-in `numcodecs.shuffle` and `numcodecs.delta` (pure JS, no WASM).
- **`bigint` in `slice()`** for addressing large dimensions.
- **`attrs` option in top-level `open()`** to skip `.zattrs` loading
  for v2 stores.
- **Source maps to TypeScript** (`declarationMap`) so "go to
  definition" resolves to `.ts` source instead of `.d.ts`.
- **`unzipit`** upgraded from 1.4.3 to 2.0.0.
- **Bug fixes**: `NarrowDataType` narrows the `"boolean"` query to
  `Bool`, plus smaller correctness fixes.

For the full per-PR changelog, see the `CHANGELOG.md` for
[`zarrita`](https://github.com/manzt/zarrita.js/blob/main/packages/zarrita/CHANGELOG.md)
and
[`@zarrita/storage`](https://github.com/manzt/zarrita.js/blob/main/packages/@zarrita-storage/CHANGELOG.md).

## Thank you

Thank you to everyone who contributed to v0.7:
[@infinite-tsukuyomi](https://github.com/infinite-tsukuyomi),
[@d-v-b](https://github.com/d-v-b),
[@kylebarron](https://github.com/kylebarron),
[@thewtex](https://github.com/thewtex),
and [@Wietze](https://github.com/Wietze).
A special thanks to the downstream maintainers (at vole-core, vizarr,
and the growing collection of virtual-zarr adapters) whose real
workloads shaped the extension API long before it existed as code.