Implement fixedscaleoffset codec by kleinschmidt · Pull Request #312 · manzt/zarrita.js

kleinschmidt · 2025-10-31T21:37:18Z

This implementation is my (i.e. a TypeScript non-knower's) attempt to pattern match the
other codecs here along with @manzt's suggestion in
manzt/numcodecs.js#49 (comment).

One thing I noticed is that the ArrayArrayCodec type has a single type
parameter (I think?). Does that imply that all array-array codecs must output
the same type that they accept as input? If so, I think that's probably too
restrictive; the python fixedscaleoffset codec explicitly supports encoding to a
different type than the input uses, and my intended use case is to decode
int16-quantized floats.
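For reference, a minimal sketch of the transform under discussion, assuming the numcodecs definition (encode as `round((x - offset) * scale)`, decode as `stored / scale + offset`). The function names are hypothetical and are not part of zarrita. The signatures show why input and output element types need to differ:

```typescript
// Hypothetical helpers, not zarrita API: float32 in memory, int16 on disk.
function encodeFixedScaleOffset(
  values: Float32Array,
  offset: number,
  scale: number,
): Int16Array {
  const out = new Int16Array(values.length);
  for (let i = 0; i < values.length; i++) {
    // Note: Math.round resolves exact .5 ties differently from numpy's
    // round-half-to-even, and Int16Array wraps out-of-range values.
    out[i] = Math.round((values[i] - offset) * scale);
  }
  return out;
}

function decodeFixedScaleOffset(
  stored: Int16Array,
  offset: number,
  scale: number,
): Float32Array {
  const out = new Float32Array(stored.length);
  for (let i = 0; i < stored.length; i++) {
    out[i] = stored[i] / scale + offset;
  }
  return out;
}

// Round trip with offset 1000 and scale 4 (quarter-unit precision).
const original = new Float32Array([1000.25, 1000.5, 1001.75]);
const stored = encodeFixedScaleOffset(original, 1000, 4);
const decoded = decodeFixedScaleOffset(stored, 1000, 4);
```

An array-array codec type with a single type parameter cannot describe this pair of functions, since the encoded side is `Int16Array` and the decoded side is `Float32Array`.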

To see the specific tasks where the Asana app for GitHub is being used, see below:
- https://app.asana.com/0/0/1211799105283496

changeset-bot · 2025-10-31T21:37:22Z

⚠️ No Changeset found

Latest commit: 6e0129b

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types


d-v-b · 2025-10-31T21:40:49Z

packages/zarrita/src/codecs/fixedscaleoffset.ts

+    dtype: string;
+    astype?: string;


zarr v3 data types can be a string or a JSON object with type {name: string, configuration: object}. See an example here

this one i think needs to be restricted to number types; can numeric types also have that format? is that how endianness is stored?

ah I see in the spec:

Each data type is associated with an identifier, which can be used in metadata documents to refer to the data type. For the data types defined in this specification, the identifier is a simple ASCII string. However, extensions may use any JSON value to identify a data type.

and endianness is specified as a bytes codec, I'm inferring.

yeah the bytes codec sets the endianness of the encoded data, but for decoded data, endianness is up to the implementation
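A rough sketch of what accepting both identifier forms could look like, restricted to numeric types as suggested above. The type and function names are hypothetical, and the set of numeric names is an illustrative subset rather than the full list from the spec:

```typescript
// A v3 data type identifier: a plain string, or an object with a name
// and a configuration (the form extensions may use).
type DataTypeId =
  | string
  | { name: string; configuration?: Record<string, unknown> };

// Illustrative subset of numeric data type names (an assumption for this sketch).
const NUMERIC_TYPES = new Set([
  "int8", "int16", "int32", "int64",
  "uint8", "uint16", "uint32", "uint64",
  "float32", "float64",
]);

// Extract the name regardless of which form the identifier takes.
function dataTypeName(id: DataTypeId): string {
  return typeof id === "string" ? id : id.name;
}

// Reject anything that is not a numeric type before building the codec.
function assertNumeric(id: DataTypeId): string {
  const name = dataTypeName(id);
  if (!NUMERIC_TYPES.has(name)) {
    throw new Error(`fixedscaleoffset needs a numeric data type, got "${name}"`);
  }
  return name;
}
```

Endianness does not appear here because, as noted above, it belongs to the bytes codec rather than to the data type identifier.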

kleinschmidt · 2025-11-01T19:49:05Z

packages/zarrita/src/codecs/fixedscaleoffset.ts

+    #TypedArrayOut: TypedArrayConstructor<A>
+
+    constructor(configuration: FixedScaleOffsetConfig) {
+        const { data_type } = coerce_dtype(configuration.dtype);


testing on v3 has made me realize that coerce_dtype is really only meant for v2 data type strings, so this will need to be adjusted.

kleinschmidt · 2025-11-03T21:17:17Z

I think resolving the v2 vs. v3 issues may be trickier than I'm prepared to take on right now. Would you be willing to accept v2-only support @manzt ?

The numcodecs.zarr3 python functions generate a different ID (numcodecs.fixedscaleoffset vs. fixedscaleoffset in v2) so leaving this as-is will simply error.
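To illustrate the mismatch, here is a sketch of a lookup that only knows the v2 identifier. The metadata shapes follow the usual v2 filter form (`id` plus flat fields) and v3 codec form (`name` plus `configuration`); the helper names and the example values are hypothetical:

```typescript
// Loose shape covering both a v2 filter entry and a v3 codec entry.
interface CodecMetaSketch {
  id?: string;
  name?: string;
  configuration?: Record<string, unknown>;
  [key: string]: unknown;
}

// v2 filters are keyed by `id`, v3 codecs by `name`.
function codecKey(meta: CodecMetaSketch): string {
  const key = meta.id ?? meta.name;
  if (key === undefined) {
    throw new Error("codec metadata has neither `id` nor `name`");
  }
  return key;
}

// Only the v2 identifier is registered in this sketch.
const registeredKeys = new Set(["fixedscaleoffset"]);

function isSupported(meta: CodecMetaSketch): boolean {
  return registeredKeys.has(codecKey(meta));
}

// Illustrative values only.
const v2Filter: CodecMetaSketch = {
  id: "fixedscaleoffset", scale: 4, offset: 1000, dtype: "<f4", astype: "<i2",
};
const v3Codec: CodecMetaSketch = {
  name: "numcodecs.fixedscaleoffset",
  configuration: { scale: 4, offset: 1000 }, // remaining fields omitted
};

const v2Ok = isSupported(v2Filter); // found under the v2 id
const v3Ok = isSupported(v3Codec);  // not found, so a reader would report an unknown codec
```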

emmanuelmathot · 2026-01-12T07:04:05Z

Very interested in this codec. Also for V3. Any progress on merging this one?
cc @ahocevar

d-v-b · 2026-04-01T11:11:46Z

the fixedscaleoffset codec should not be used for zarr v3 data because it does not define how values from one data type should be cast to values of another data type. the old zarr v2 fixedscaleoffset codec basically just used numpy's casting behavior, but that's not really workable in zarr v3 because zarr v3 has a formal data type model, and we don't want to be dependent on the runtime behavior of numpy for something that spans multiple programming languages.

thanks to a lot of community involvement we made a new codec in zarr-extensions called cast_value that is narrowly scoped to managing casting between ints and floats. the tl;dr is that the codec configuration contains a declaration of:

- the target data type
- a rounding mode
- a mode for handling out-of-range values
- an explicit input scalar : output scalar lookup table (necessary for non-numeric values like NaN)

The codec must halt if any value isn't covered by rounding, the out-of-range mode, or the lookup table. This, combined with the explicit lookup table, can complicate performance in interpreted languages.
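A rough pure-JS sketch of how those four pieces could fit together when casting float64 to int16. The option names, the mode names, and the order of checks are assumptions made for illustration and are not taken from the cast_value spec:

```typescript
// Hypothetical option names; the real codec configuration may differ.
type Rounding = "nearest-even" | "towards-zero";
type OutOfRange = "clamp" | "error";

interface CastSketchOptions {
  rounding: Rounding;
  outOfRange: OutOfRange;
  // Explicit input scalar -> output scalar pairs, e.g. for NaN.
  scalarMap: Array<[number, number]>;
}

const INT16_MIN = -32768;
const INT16_MAX = 32767;

// Round to nearest, resolving exact .5 ties to the even neighbour.
function roundHalfEven(x: number): number {
  const floor = Math.floor(x);
  const diff = x - floor;
  if (diff < 0.5) return floor;
  if (diff > 0.5) return floor + 1;
  return floor % 2 === 0 ? floor : floor + 1;
}

function castToInt16(values: Float64Array, opts: CastSketchOptions): Int16Array {
  const out = new Int16Array(values.length);
  for (let i = 0; i < values.length; i++) {
    const x = values[i];
    // Lookup table first; Object.is is used so that NaN matches NaN.
    const mapped = opts.scalarMap.find(([input]) => Object.is(input, x));
    if (mapped !== undefined) {
      out[i] = mapped[1];
      continue;
    }
    // In this sketch, a non-finite value without a table entry is uncovered: halt.
    if (!Number.isFinite(x)) {
      throw new Error(`no cast rule covers value ${x} at index ${i}`);
    }
    let rounded = opts.rounding === "nearest-even" ? roundHalfEven(x) : Math.trunc(x);
    if (rounded < INT16_MIN || rounded > INT16_MAX) {
      if (opts.outOfRange === "error") {
        throw new Error(`value ${x} at index ${i} is out of range for int16`);
      }
      rounded = Math.min(INT16_MAX, Math.max(INT16_MIN, rounded));
    }
    out[i] = rounded;
  }
  return out;
}
```

The per-element table scan and the branch on every value are where the performance concern comes from: none of it can be expressed as a single typed-array conversion.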

@manzt what effort is needed to get this into zarrita? the scale-offset part is really easy but the casting part might warrant some performance considerations. I have a rust implementation here that could be compiled to wasm, but I don't know the rust + js interop story at all, so no clue if that work is worth the performance benefit.

casting is the tricky part of the scale-offset transformation. the actual scaling + offsetting is simple (but we also have a codec for that).

d-v-b · 2026-04-12T14:43:27Z

@kleinschmidt are you interested in working on a zarr v3 implementation of the scale offset functionality?

manzt · 2026-04-12T14:46:34Z

@manzt what effort is needed to get this into zarrita? the scale-offset part is really easy but the casting part might warrant some performance considerations. I have a rust implementation here that could be compiled to wasm, but I don't know the rust + js interop story at all, so no clue if that work is worth the performance benefit.

I'd be up for adding a pure JS version in zarrita proper (at least initially). Folks can dynamically swap in codecs with zarr.registry.set, which I think satisfies the case where one wants a faster implementation (at the cost of a much larger bundle size with WASM).
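The swap-in pattern could look roughly like the sketch below. It uses a local `Map` as a stand-in for `zarr.registry`, on the assumption that the registry maps a codec name to a lazy loader; the codec shape, the `scale_offset` key, and both loaders are hypothetical:

```typescript
// Hypothetical minimal codec shape for this sketch (not zarrita's codec type).
interface CodecSketch {
  decode(data: Int16Array): Float32Array;
}
type CodecLoader = () => Promise<CodecSketch>;

// Stand-in for zarr.registry, assumed to be Map-like.
const registry = new Map<string, CodecLoader>();

// Default: a pure JS implementation bundled with the library.
const loadPureJs: CodecLoader = async () => ({
  decode: (data) => Float32Array.from(data, (x) => x / 4 + 1000),
});

// Opt-in: a faster implementation. In practice this loader would
// dynamically import a WASM build, so the extra bundle size is only
// paid by users who register it.
const loadFast: CodecLoader = async () => ({
  decode: (data) => {
    const out = new Float32Array(data.length);
    for (let i = 0; i < data.length; i++) out[i] = data[i] / 4 + 1000;
    return out;
  },
});

registry.set("scale_offset", loadPureJs);
// A user who wants the faster codec overrides the same key.
registry.set("scale_offset", loadFast);
```

Because the later `set` replaces the earlier entry under the same key, readers pick up the replacement without any other change.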

kleinschmidt · 2026-04-12T14:52:56Z

@kleinschmidt are you interested in working on a zarr v3 implementation of the scale offset functionality?

It's not a priority for us at the moment, so not for the foreseeable future.

d-v-b · 2026-04-12T14:53:48Z

sounds good, I'm happy to take this up

kleinschmidt added 2 commits October 31, 2025 17:28

First pass implementation of fixed scale offset decoder

97ecd61

rename, add to default registry

53a0e3b

d-v-b reviewed Oct 31, 2025


kleinschmidt added 2 commits November 1, 2025 14:35

fixes

2d3b27d

eff it lets encode, too

757a90e

kleinschmidt changed the title ~~Implement decode-only fixedscaleoffset codec~~ Implement fixedscaleoffset codec Nov 1, 2025

basic test and fixtures

6e0129b

kleinschmidt commented Nov 1, 2025


emmanuelmathot mentioned this pull request Jan 12, 2026

Request for Standardized FixedScaleOffset Codec Extension zarr-developers/zarr-extensions#42

Closed

manzt force-pushed the main branch 2 times, most recently from 02099c7 to 1facac0 Compare January 23, 2026 20:15

emmanuelmathot mentioned this pull request Mar 16, 2026

Implement scale/offset codec in the EOPF data model EOPF-Explorer/data-model#134

Open

4 tasks

emmanuelmathot mentioned this pull request Apr 2, 2026

FixedScaleOffset codec not preserved during zarr v2 to v3 conversion EOPF-Explorer/data-model#106

Open
