Releases: ecmwf/tensogram

0.19.0

25 Apr 12:37

Minor release adding two platform integrations and the cross-codec preallocation hardening that has been incubating in [Unreleased].

Highlights

tensogram-earthkit — first-class earthkit-data plugin. A new pip package registers Tensogram as both an earthkit.data source (earthkit.data.sources.tensogram) and an encoder (earthkit.data.encoders.tensogram), so .tgm content flows through the same surface that already handles GRIB, NetCDF, BUFR, and Zarr. MARS-tagged tensograms surface as a FieldList; non-MARS tensograms go straight to xarray.

Bidirectional remote scan walker. Remote .tgm scans now walk from both ends of the file simultaneously via paired HTTP Range fetches, halving cold-scan latency on multi-message remote files and giving forward-canonical recovery when one direction hits locally-corrupt data.
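
The two-ended walk can be illustrated on a toy layout. The sketch below is not the Tensogram wire format: it assumes a hypothetical framing in which every message stores its total length in both its first and last 8 bytes, and it scans an in-memory buffer where a remote reader would issue one forward and one backward Range fetch per step. The names `frame`, `read_len`, and `bidirectional_scan` are made up for this example.

```rust
/// Toy framing (an assumption, not the real wire format): 8-byte
/// little-endian total length, payload, then the same length mirrored.
fn frame(payload: &[u8]) -> Vec<u8> {
    let total = (16 + payload.len()) as u64;
    let mut out = Vec::with_capacity(16 + payload.len());
    out.extend_from_slice(&total.to_le_bytes());
    out.extend_from_slice(payload);
    out.extend_from_slice(&total.to_le_bytes());
    out
}

/// Read the u64 length stored at `pos`, if the buffer is long enough.
fn read_len(buf: &[u8], pos: usize) -> Option<usize> {
    let end = pos.checked_add(8)?;
    let mut bytes = [0u8; 8];
    bytes.copy_from_slice(buf.get(pos..end)?);
    let len = u64::from_le_bytes(bytes);
    if len > usize::MAX as u64 { None } else { Some(len as usize) }
}

/// Walk from both ends at once. Returns (offset, length) per message in
/// file order, or None when a length does not fit between the cursors.
fn bidirectional_scan(buf: &[u8]) -> Option<Vec<(usize, usize)>> {
    let (mut front, mut back) = (0usize, buf.len());
    let (mut head, mut tail) = (Vec::new(), Vec::new());
    while front < back {
        // Forward step: read the length prefix at `front`.
        let len = read_len(buf, front)?;
        if len < 16 || len > back - front {
            return None;
        }
        head.push((front, len));
        front += len;
        if front >= back {
            break;
        }
        // Backward step: read the mirrored length just before `back`.
        let len = read_len(buf, back - 8)?;
        if len < 16 || len > back - front {
            return None;
        }
        back -= len;
        tail.push((back, len));
    }
    tail.reverse();
    head.extend(tail);
    Some(head)
}
```

With four framed messages the walk finishes in two paired steps instead of four sequential ones. A length that does not fit between the two cursors makes the scan return `None` rather than guess; the recovery policy described above is not modelled here.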

simple_packing ergonomics. The encoder now auto-computes sp_reference_value and sp_binary_scale_factor from the input data when they are absent, so the simplest usable descriptor is now {"encoding": "simple_packing", "sp_bits_per_value": 16}. Wire-format keys are also renamed to the codebase-wide <codec>_<key> convention (reference_value → sp_reference_value, etc.).
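
As a rough illustration of what deriving the two parameters involves, the sketch below uses the conventional GRIB-style relation `value = reference + packed * 2^scale` with a decimal scale factor of zero. This is an assumption about the general technique, not the Tensogram encoder's actual code. The struct fields reuse the sp_* key names from the notes; the function names are hypothetical.

```rust
/// Field names mirror the descriptor keys; the struct itself is illustrative.
#[derive(Debug, PartialEq)]
struct SimplePackingParams {
    sp_reference_value: f64,
    sp_binary_scale_factor: i32,
    sp_bits_per_value: u32,
}

/// Derive reference value (the minimum) and the smallest binary scale
/// factor that lets the full data range fit in `bits` bits.
fn compute_params(values: &[f64], bits: u32) -> Option<SimplePackingParams> {
    let usable = !values.is_empty() && bits >= 1 && bits <= 32;
    if !usable || values.iter().any(|v| !v.is_finite()) {
        return None;
    }
    let lo = values.iter().cloned().fold(f64::INFINITY, f64::min);
    let hi = values.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let max_int = ((1u64 << bits) - 1) as f64;
    let mut scale = 0i32;
    if hi > lo {
        scale = ((hi - lo) / max_int).log2().ceil() as i32;
        // Guard against rounding in log2 leaving the range one step too wide.
        while (hi - lo) / 2f64.powi(scale) > max_int {
            scale += 1;
        }
    }
    Some(SimplePackingParams {
        sp_reference_value: lo,
        sp_binary_scale_factor: scale,
        sp_bits_per_value: bits,
    })
}

/// Quantise each value to an unsigned integer step count.
fn pack(values: &[f64], p: &SimplePackingParams) -> Vec<u64> {
    let step = 2f64.powi(p.sp_binary_scale_factor);
    values
        .iter()
        .map(|v| ((v - p.sp_reference_value) / step).round() as u64)
        .collect()
}

/// Invert `pack`, up to half a quantisation step of error.
fn unpack(packed: &[u64], p: &SimplePackingParams) -> Vec<f64> {
    let step = 2f64.powi(p.sp_binary_scale_factor);
    packed
        .iter()
        .map(|&x| p.sp_reference_value + x as f64 * step)
        .collect()
}
```

For sample values spanning 250.0 to 300.0 at 16 bits, this derivation picks a binary scale factor of -10, i.e. a quantisation step of 2^-10, and the round trip stays within half a step of the input.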

Added

  • tensogram-earthkit source + encoder plugins (local files, remote URLs http(s)/s3/gs/az, bytes, byte streams; array-namespace interop for numpy/torch/cupy/jax via earthkit-utils).
  • Bidirectional remote scan walker via paired forward/backward Range fetches; backward state treated as the suspect on disagreement, with named recovery reasons (gap-below-min-message-size, forward-exceeds-backward-bound).

Changed

  • simple_packing descriptor keys renamed to sp_* (sp_reference_value, sp_binary_scale_factor, sp_decimal_scale_factor, sp_bits_per_value). Pre-rename v3 messages with unprefixed keys are no longer readable.
  • Encoder auto-computes the two derived params when absent; explicit sp_reference_value+sp_binary_scale_factor are still trusted verbatim for advanced workflows.
  • Saturating arithmetic across the remote scan-state guards; async stale-dispatch race closed; gap-too-small fallback no longer drops collected backward layout.

Security

  • Cross-codec preallocation hardening — every descriptor-derived allocation on the decode path is now fallible (try_reserve_exact), and every size-arithmetic step that could wrap usize on hostile input is guarded by checked_mul / u128 promotion. Covers szip (FFI + pure-Rust), simple_packing, bitmask decoders, zfp, sz3, blosc2, shuffle, and the bytes_to_f64 / f64_to_bytes pipeline helpers.
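
The pattern named above can be sketched with standard-library calls only. The helper names, the error enum, and the shape-times-element-size input are assumptions for illustration; the notes do not show the actual decoder code.

```rust
use std::collections::TryReserveError;

/// Hypothetical error type for this sketch.
#[derive(Debug)]
enum AllocError {
    SizeOverflow,
    OutOfMemory(TryReserveError),
}

/// Multiply out a descriptor-supplied shape without ever wrapping.
fn checked_len(shape: &[u64], bytes_per_value: u64) -> Option<usize> {
    let mut total = bytes_per_value;
    for &dim in shape {
        total = total.checked_mul(dim)?;
    }
    if total > usize::MAX as u64 { None } else { Some(total as usize) }
}

/// Allocate the decode buffer fallibly instead of aborting on failure.
fn alloc_decode_buffer(shape: &[u64], bytes_per_value: u64) -> Result<Vec<u8>, AllocError> {
    let len = checked_len(shape, bytes_per_value).ok_or(AllocError::SizeOverflow)?;
    let mut buf: Vec<u8> = Vec::new();
    buf.try_reserve_exact(len).map_err(AllocError::OutOfMemory)?;
    buf.resize(len, 0);
    Ok(buf)
}
```

A hostile descriptor claiming a shape such as `[u64::MAX, 2]` is rejected at the arithmetic step, and a size that fits `usize` but cannot be reserved surfaces as an error value the caller can report.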

Fixed

  • regular_ll longitude convention — tensogram-grib now emits canonical mars.area = [N, W, S, E] from the four geography-namespace corner points; Tensoscope reads mars.area when present and warns with a named DEFAULT_REGULAR_LL_AREA fallback when it is not. Eliminates the 180° render offset on ECMWF open-data GRIB-derived files.

Stats

| Suite | Count |
|---|---|
| Rust workspace | 1546 passed, 5 ignored |
| tensogram remote,async features | 863 passed, 2 ignored (+23 walker tests) |
| tensogram-grib | 59 passed |
| tensogram-netcdf | 69 passed |
| Python tensogram | 541 passed, 46 skipped |
| Python tensogram-xarray | 242 passed |
| Python tensogram-zarr | 235 passed |
| Python tensogram-earthkit | 140 passed, 1 skipped (new) |

`cargo fmt --check`, `cargo clippy --workspace --all-targets -- -D warnings`, and `mdbook build docs/` all clean.

Install

```bash
# Rust
cargo add tensogram@0.19.0

# Python
pip install tensogram==0.19.0
pip install tensogram-xarray==0.19.0
pip install tensogram-zarr==0.19.0
pip install tensogram-anemoi==0.19.0
pip install tensogram-earthkit==0.19.0  # new

# TypeScript / WASM
npm install @ecmwf.int/tensogram@0.19.0
```

Full details in CHANGELOG.md.

0.18.1

23 Apr 11:58

Patch release focused on Tensoscope rendering of real ECMWF GRIB-derived .tgm files, plus CI and publish-workflow improvements. No wire-format, CLI, or core-library changes.

Highlights

Tensoscope now opens GRIB-derived files end-to-end — files produced by tensogram convert-grib with mars.grid = "regular_ll" render directly in the browser, with no Python-side coordinate preprocessing required.

  • Auto-expand 1-D lat/lon axes into per-point meshgrid for the common [nLat, nLon] layout.
  • Infer lat/lon axes from mars.grid + mars.area when no explicit coordinate objects exist.
  • Per-message coordinate cache + atomic field/coords commit so heterogeneous multi-message files render the right mesh for every message.
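
The first two bullets can be pictured with a small sketch. It assumes that mars.area is ordered [N, W, S, E] (as stated in the 0.19.0 notes), that both corners are inclusive grid points, and that the axes are evenly spaced; Tensoscope's real inference logic is not shown in these notes, and the function names are hypothetical.

```rust
/// Evenly spaced axis from `first` to `last`, both inclusive (assumed).
fn linspace(first: f64, last: f64, n: usize) -> Vec<f64> {
    if n <= 1 {
        return vec![first; n];
    }
    let step = (last - first) / (n - 1) as f64;
    (0..n).map(|i| first + step * i as f64).collect()
}

/// Infer 1-D lat/lon axes from an area given as [N, W, S, E].
fn infer_axes(area: [f64; 4], n_lat: usize, n_lon: usize) -> (Vec<f64>, Vec<f64>) {
    let lats = linspace(area[0], area[2], n_lat); // north to south
    let lons = linspace(area[1], area[3], n_lon); // west to east
    (lats, lons)
}

/// Expand 1-D axes into per-point coordinates for a row-major
/// [nLat, nLon] field: point k has (lat_pts[k], lon_pts[k]).
fn meshgrid(lats: &[f64], lons: &[f64]) -> (Vec<f64>, Vec<f64>) {
    let mut lat_pts = Vec::with_capacity(lats.len() * lons.len());
    let mut lon_pts = Vec::with_capacity(lats.len() * lons.len());
    for &lat in lats {
        for &lon in lons {
            lat_pts.push(lat);
            lon_pts.push(lon);
        }
    }
    (lat_pts, lon_pts)
}
```

Under these assumptions an area of [90, 0, -90, 270] on a 3 x 4 grid yields latitudes 90, 0, -90 and longitudes 0, 90, 180, 270, expanded to twelve coordinate pairs.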

Fixed

  • Integer mars.param codes (ECMWF GRIB: 167, 130, …) no longer crash the sidebar or field selection; bigint values are safely coerced.
  • Slicing on truly 2-D gridded data (e.g. [721, 1440] meshed) no longer produces a "donut in the north" — decideSliceDim correctly returns -1 when the data is already fully spatial.
  • Slicing on packed-level 3-D fields [N_lev, nLat, nLon] now slices on dim 0 as expected via integer-multiple detection.
  • useAppStore.selectField no longer uses stale msg-0 coords for other messages; coords are fetched per-message and committed atomically with fieldData.
  • mars.area bigint values that would overflow to Infinity are rejected by toNumber() and fall back to safe defaults.
  • Visible map flash / camera reset on file open eliminated (#84).

CI / Infrastructure

  • ci.yml runs make ts-install + make ts-build before Tensoscope's vitest, so pure-helper tests can resolve @ecmwf.int/tensogram through file:../typescript.
  • publish-pypi.yml now builds Linux aarch64 wheels in addition to x86_64 (#83).

Stats

| Suite | Count |
|---|---|
| Rust workspace tests | 1505 passed, 5 ignored |
| tensogram remote,async features | 824 passed, 2 ignored |
| tensogram-grib | 36 passed |
| tensogram-netcdf | 69 passed |
| Python tensogram | 535 passed, 40 skipped |
| Python tensogram-xarray | 242 passed |
| Python tensogram-zarr | 235 passed |
| Tensoscope (vitest) | 54 passed / 7 files |

cargo fmt --check, cargo clippy --workspace --all-targets --all-features -- -D warnings, and mdbook build docs/ all clean.

Install

  • Rust: cargo add tensogram@0.18.1
  • Python: pip install tensogram==0.18.1
  • Python (all extras): pip install 'tensogram[all]'==0.18.1
  • CLI: cargo install tensogram-cli --version 0.18.1 (add --features grib,netcdf for converters)
  • TypeScript / WASM: npm install @ecmwf.int/tensogram@0.18.1

Full changelog

See the full 0.18.1 CHANGELOG entry.

0.18.0

23 Apr 07:18

Highlights

Two major themes land in this release:

🧹 Breaking — free-form CBOR metadata (#80)

The CBOR metadata frame is now fully free-form. The
library-interpreted top-level keys are just base, _reserved_, and
_extra_; anything else a caller supplies flows into _extra_ on
decode. The wire-format version lives exclusively in the preamble
and is never written to CBOR.

  • GlobalMetadata.version field removed from the public Rust
    struct. tensogram.encode({}, ...) is now valid input. A stray
    "version" top-level key from a legacy producer round-trips via
    _extra_.
  • New WIRE_VERSION constants in Rust / Python / TypeScript expose
    the preamble-sourced version for ergonomic access.
  • Zarr/xarray attribute renamed: _tensogram_version →
    _tensogram_wire_version; tensogram_version →
    tensogram_wire_version.
  • Golden fixtures regenerated under the new schema.

🌐 Browser-usable remote + async parity for TS / WASM / Tensoscope (#81)

@ecmwf.int/tensogram now matches the Rust core's object_store
integration over HTTP(S) and AWS-signed HTTPS. Browser consumers no
longer download whole messages — they fetch only the bytes they ask
for.

  • Layout-aware per-object access on TensogramFile.fromUrl:
    messageMetadata(i) (header chunk only), messageDescriptors(i)
    (CBOR-prefix optimisation), messageObject(i, j) (one Range GET),
    messageObjectRange(i, j, ranges), plus *Batch variants and
    prefetchLayouts.
  • Bounded-concurrency pool (default 6) tuned to browser per-host
    limits, with independent outer/inner limiters so nested fan-out
    never deadlocks.
  • AWS SigV4 helpers: signAwsV4Request (pure signer,
    byte-for-byte against AWS test suite vectors) + createAwsSigV4Fetch
    (fetch wrapper pluggable into FromUrlOptions.fetch).
  • Tensoscope uses prefetchLayouts + messageObject so opening
    large remote files costs one Range per message header instead of
    N full downloads.

Plus bug fixes for the messageDescriptors eager path, handle
lifecycle in wrapWbgDecodedMessage, AWS SigV4 header merge
semantics on Request inputs, and the descriptor-fetch concurrency
cap.

Stats

  • Rust workspace: 1505 tests passing (824 with tensogram --features remote,async). Excluded-crate suites: 36 across tensogram-grib + tensogram-netcdf.
  • Python: 530 + 242 + 235 tests across python/tests/,
    python/tensogram-xarray/tests/, and python/tensogram-zarr/tests/.
  • WASM: 172 tests. TypeScript: 376 tests across 27 files.
  • Tensoscope: 12 tests.

0.17.0

22 Apr 19:41

0.17.0 — Wire format v3

A major release. Clean break from wire format v2 (no backward-compat shim), opt-in NaN / ±Inf preservation via companion bitmask frames, default-reject policy for non-finite values at encode time, and retirement of the deprecated tensogram-core redirect crate.

Highlights

🚨 BREAKING

  • Wire format v3. v2 messages are rejected at preamble read with a clear re-encode error. Postamble grew 16 → 24 B with mirrored total_length (enables bidirectional scan). Per-frame hash slots are now inline (12-byte common tail; 20-byte data-object footer). DataObject wire type 4 replaced by NTensorFrame / NTensorMaskedFrame (wire type 9). Spec at plans/WIRE_FORMAT.md.
  • Non-finite floats now error by default at encode. Pre-0.17 the library passed NaN / ±Inf through encoding="none" verbatim, with opt-in reject_nan / reject_inf flags. 0.17+ inverts: rejection is the default on every encode path; the opt-in flags are removed from Rust, Python, TypeScript, C FFI, C++, and CLI. Callers who intentionally shipped NaN-bearing data must either pre-process or opt in to the new allow_nan / allow_inf bitmask companion path.
  • tensogram-core redirect crate removed. Three minor versions after the 0.15 rename tensogram-core → tensogram, the redirect retires. Use cargo add tensogram directly. Previously published tensogram-core versions (0.14.0–0.16.1, ~50 total downloads) remain on crates.io as frozen re-exports.
  • tensogram convert-grib / convert-netcdf hard-fail when --encoding simple_packing meets NaN / Inf data; previously silently downgraded to encoding="none" and hid real data-quality problems.
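
The default-reject policy and the opt-in mask path can be pictured as follows. This is a minimal sketch, not the library's implementation: the `allow_nan` / `allow_inf` names come from the notes, while the error enum, the function names, and the choice of 0.0 as the placeholder are assumptions. Only the NaN mask is shown; ±Inf would get analogous masks.

```rust
/// Hypothetical error type; carries the index of the offending element.
#[derive(Debug, PartialEq)]
enum NonFinite {
    Nan(usize),
    Inf(usize),
}

/// Default-reject scan: fail on the first NaN or ±Inf unless allowed.
fn check_finite(values: &[f64], allow_nan: bool, allow_inf: bool) -> Result<(), NonFinite> {
    for (i, v) in values.iter().enumerate() {
        if v.is_nan() && !allow_nan {
            return Err(NonFinite::Nan(i));
        }
        if v.is_infinite() && !allow_inf {
            return Err(NonFinite::Inf(i));
        }
    }
    Ok(())
}

/// Opt-in path: swap NaNs for a finite placeholder and record a mask.
fn split_nan_mask(values: &[f64]) -> (Vec<f64>, Vec<bool>) {
    let mask: Vec<bool> = values.iter().map(|v| v.is_nan()).collect();
    let finite: Vec<f64> = values
        .iter()
        .map(|v| if v.is_nan() { 0.0 } else { *v })
        .collect();
    (finite, mask)
}

/// Decode side: put NaN back wherever the mask is set.
fn restore_nan(finite: &[f64], mask: &[bool]) -> Vec<f64> {
    finite
        .iter()
        .zip(mask)
        .map(|(v, m)| if *m { f64::NAN } else { *v })
        .collect()
}
```

Keeping the mask separate from the payload means a lossy codec only ever sees finite numbers, which is why the data can still be packed once the caller has opted in.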

Added

  • NaN / ±Inf bitmask companion frame with per-kind compression (nan_mask_method / pos_inf_mask_method / neg_inf_mask_method, accepting none / rle / roaring / blosc2 / zstd / lz4). Decode-side restore_non_finite flag (default true) and the advanced decode_with_masks API in Rust + Python. CLI parity via --allow-nan / --allow-inf flags and TENSOGRAM_ALLOW_NAN / TENSOGRAM_ALLOW_INF env vars. Full guide at docs/src/guide/nan-inf-handling.md.
  • tensogram-xarray: dim_names hint convention. Producers may embed base[i]["dim_names"] (axis-ordered list) per object; backend validates and uses it in the resolution chain (user kwarg > coord match > per-object hint > _extra_ hint > generic fallback). (#66)
  • tensogram-anemoi: the anemoi-inference output plugin now writes per-object dim_names lists on each data field entry alongside the existing message-level _extra_["dim_names"] hint. (Plugin itself was added in #63 during 0.17 development.)
  • rle and roaring codecs promoted from mask-companion-only to first-class DataObjectDescriptor.compression values (bitmask dtype only).
  • simple_packing standalone-API safety net — validates SimplePackingParams on every encode path to catch hand-crafted or mutated params that would otherwise produce silently-wrong output.

Fixed

  • tensogram-zarr: TensogramStore falls back to descriptor-level keys (name, param, shortName, …) for variable naming when meta.base[i] doesn't supply one — matches the long-standing xarray backend behaviour. (#67)
  • tensogram-xarray: no more conflicting sizes for dimension 'dim_0' errors on mixed-rank multi-object messages without CF-recognised coordinate names. (#66)
  • simple_packing::encode correctly handles i32::MIN (was silently corrupting via i32::MIN.abs(); now uses saturating_abs()).
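
The i32::MIN case is easy to reproduce with the standard library alone. In the sketch below, `magnitude` is a hypothetical stand-in for wherever the encoder takes an absolute value; the integer methods are the real std ones.

```rust
/// Absolute value that cannot overflow: i32::MIN clamps to i32::MAX.
fn magnitude(v: i32) -> i32 {
    v.saturating_abs()
}

/// Exact magnitude when an unsigned result is acceptable.
fn exact_magnitude(v: i32) -> u32 {
    v.unsigned_abs()
}
```

Plain `abs()` has no representable answer for i32::MIN: it panics in debug builds and wraps back to i32::MIN in release builds, which is how a silently wrong value can slip through.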

Stats

  • Rust workspace: 1487 tests passing (1593 with remote + async feature coverage). Excluded-crate suites: 387 tests across tensogram-grib, tensogram-netcdf, and tensogram-cli built with grib + netcdf.
  • Python: 526 + 242 + 235 tests across python/tests/, python/tensogram-xarray/tests/, and python/tensogram-zarr/tests/.
  • C++: 143 tests. WASM: 161 tests. TypeScript (vitest): 319 tests.

Available on

  • crates.io — 10 crates: tensogram-szip, tensogram-sz3-sys, tensogram-sz3, tensogram-encodings, tensogram, tensogram-grib, tensogram-netcdf, tensogram-ffi, tensogram-cli, tensogram-wasm
  • PyPI: tensogram, tensogram-xarray, tensogram-zarr, tensogram-anemoi
  • npm: @ecmwf.int/tensogram

Full release notes: CHANGELOG.md

0.16.1

18 Apr 23:53

Patch release fixing Jupyter notebook CI failures caused by a stale version pin.

Fixed

  • Jupyter notebook CI failures: examples/jupyter/pyproject.toml pinned tensogram>=0.15.0,<0.16 but the repo had moved to 0.16.0, causing uv pip install to silently replace the locally-built bindings with an older PyPI wheel lacking grib/netcdf features. Bumped all version references to >=0.16.0,<0.17.
  • make-release command — added examples/jupyter/pyproject.toml to the version bump checklist so future releases won't miss it.

Stats

  • Rust tests: 513 passed
  • Python tests: 513 passed, 40 skipped
  • xarray tests: 201 passed
  • zarr tests: 224 passed

Full changelog: CHANGELOG.md

0.16.0

18 Apr 22:05

Highlights

  • Strict-finite encode checks — new EncodeOptions flags reject_nan and reject_inf scan float payloads before the encoding pipeline runs and bail out on the first NaN / Inf. Exposed across every language surface (Rust, Python, TS, C FFI, C++), the CLI (--reject-nan / --reject-inf), and env vars (TENSOGRAM_REJECT_NAN / TENSOGRAM_REJECT_INF). Closes the silent-corruption gotcha where simple_packing::compute_params accepted Inf input and decoded to NaN everywhere.

  • TypeScript wrapper — Scope C.1 + C.2 shipped. New APIs: decodeRange, computeHash, simplePackingComputeParams, encodePreEncoded, validate / validateBuffer / validateFile, StreamingEncoder (with optional per-chunk onBytes callback sink — no internal buffering), TensogramFile#append, and a lazy HTTP Range backend on TensogramFile.fromUrl. First-class float16 / bfloat16 / complex64 / complex128 view classes including a TC39-Stage-3-accurate Float16Array polyfill.

  • Python compute_hash parity + tensogram.convert_grib / convert_netcdf as first-class PyO3 bindings (replacing the CLI-subprocess pattern).

  • Domain-agnostic repositioning. MOTIVATION, README, introduction, and concept pages now present Tensogram as a general-purpose N-tensor message format for scientific data at scale, with ECMWF weather-forecasting workloads as one well-validated use case. tensogram-grib / tensogram-netcdf renamed "converters" → "importers". New docs/src/guide/vocabularies.md plus matching 02b_generic_metadata.* examples in all four languages.

  • Rust core: reject ±Infinity in simple_packing alongside NaN (new PackingError::InfiniteValue(usize) variant). Defence-in-depth alongside the pipeline-independent reject_inf encode flag.

  • Jupyter notebook walk-through — five end-to-end notebooks under examples/jupyter/ executed on CI with nbval-lax.

Breaking changes

  • TS: TensogramFile#rawMessage(index) is now async (returns Promise<Uint8Array>) to support the lazy HTTP Range backend. Call sites add await.
  • TS: typedArrayFor for half-precision (float16 / bfloat16) and complex dtypes returns view classes, not raw Uint16Array / interleaved Float32Array. Raw bits via .bits, complex storage via .data.
  • C FFI: tgm_encode, tgm_file_append, and tgm_streaming_encoder_create gained two bool parameters (reject_nan, reject_inf). Pre-0.16 C callers will see a compile error from the regenerated header; pass false, false to preserve previous behaviour. tgm_encode_pre_encoded intentionally does not — pre-encoded bytes are opaque to the library.

Installation

# Rust
cargo add tensogram@0.16.0

# Python
pip install tensogram==0.16.0
pip install "tensogram-xarray[dask]==0.16.0"
pip install tensogram-zarr==0.16.0

# Pure Rust leaf codecs (no C deps)
cargo add tensogram-szip@0.16.0

# CLI
cargo install tensogram-cli@0.16.0

Stats

  • Rust workspace: 1324 tests + 17 (tensogram-grib) + 44 (tensogram-netcdf)
  • Python: 538 core tests + 201 (xarray) + 224 (zarr)
  • C++: 154 tests
  • WASM: 161 tests (wasm-bindgen-test via wasm-pack)
  • TypeScript: 326 vitest tests across 23 files
  • All 17 Rust + 19 Python + 7 C++ + 13 TypeScript examples build and run clean.
  • cargo fmt, cargo clippy --workspace -- -D warnings, ruff check, ruff format --check, mdbook build — all green.
  • Preflight workflow (release-preflight.yml, run 24614264513): 15/15 steps green in 6m46s including version consistency, 10 crate tarball lists, cargo publish --dry-run on leaf crates, maturin release wheel, and twine check on all three PyPI packages.

Release verification

  • crates.io — all 10 publishable crates published + cargo add tensogram@0.16.0 && cargo build from a fresh project succeeds.
  • PyPI: pip install tensogram==0.16.0 + numpy roundtrip succeeds on Linux x86_64 and macOS arm64 wheels for Python 3.9–3.14 + free-threaded 3.13t/3.14t.
  • PyPI extras: pip install tensogram-xarray==0.16.0 tensogram-zarr==0.16.0 imports clean.

Full changelog

See CHANGELOG.md for the complete entry.

0.15.0

18 Apr 01:29

The primary library crate is now tensogram (renamed from tensogram-core).

cargo add tensogram
pip install tensogram

What's new

  • Crate renamed: cargo add tensogram instead of cargo add tensogram-core. The old name still works via a redirect crate that re-exports everything with all features forwarded.

  • Hash-while-encoding — the xxh3-64 integrity hash is now computed inline during the encode pipeline. No extra pass over the data, no performance cost.

  • Python extras — install companion packages in one command:

    pip install tensogram[all]       # xarray + zarr backends
    pip install tensogram[xarray]    # xarray only
    pip install tensogram[zarr]      # zarr only
    
  • CLI: cargo install tensogram-cli

Migrating from 0.14.0

Update your dependency and imports:

# Cargo.toml
- tensogram-core = "0.14"
+ tensogram = "0.15"
# Rust source
- use tensogram_core::{encode, decode};
+ use tensogram::{encode, decode};

Or just bump to tensogram-core = "0.15" — the redirect handles it.

0.14.0

17 Apr 15:28

First public release

First version published to crates.io and PyPI.

Note: The primary crate was published as tensogram-core in this release. It was renamed to tensogram in 0.15.0.

Added

  • crates.io publishing — 10 Rust crates published with full package metadata (license, description, repository, homepage, documentation, readme, keywords, categories, authors, rust-version)
  • PyPI publishing — 14 Python wheels (Linux x86_64, macOS arm64, Python 3.9–3.14 including free-threaded 3.13t and 3.14t)
  • Publish workflows: publish-crates.yml (sequential with index polling), publish-pypi-tensogram.yml (maturin), publish-pypi-extras.yml (xarray + zarr), release-preflight.yml (validation)
  • Per-crate READMEs for all 10 Rust crates and the Python bindings
  • Composite LICENSES.md for tensogram-sz3-sys (Apache-2.0 + Argonne BSD + Boost-1.0)
  • Python extras: pip install tensogram[xarray], pip install tensogram[zarr], pip install tensogram[all]
  • Make-release command extended with registry publishing steps

Changed

  • Edition 2024 — workspace migrated from Rust 2021 to 2024 (resolver 3)
  • MSRV 1.87: rust-version set across all publishable crates
  • thiserror v2: tensogram-sz3 migrated from v1 to v2 via workspace inheritance
  • Inter-crate version pins — all path dependencies carry exact version pins for cargo publish
  • Multi-threaded coding pipeline — caller-controlled threads on encode/decode (from 0.13.0)

Fixed

  • FFI: narrowed unsafe blocks, fail hard on missing cbindgen output
  • sz3-sys: correct OpenMP runtime linking
  • Core: gate async-only lock_state behind cfg(feature = "async")
  • Apache 2.0 license headers on all source files

Install

cargo add tensogram-core    # renamed to tensogram in 0.15.0
pip install tensogram

0.13.0

16 Apr 23:34

Highlights

  • Multi-threaded coding pipeline. Caller-controlled threads: u32 on EncodeOptions / DecodeOptions (default 0 = sequential, identical to 0.12.0). A new tensogram_core::parallel module wraps a scoped rayon pool and dispatches along two axes:
    • Axis B (preferred) — intra-codec parallelism for blosc2 (CParams/DParams::nthreads), zstd (NbWorkers), byte-aligned and non-byte-aligned simple_packing (lcm(8, bpv) chunks), and shuffle / unshuffle.
    • Axis A (fallback) — rayon par_iter across objects, only when no object uses an axis-B-friendly codec, so the total thread count never exceeds the caller's budget.
  • Byte-identity / lossless guarantees. Transparent codecs (none, lz4, szip, zfp, sz3, simple_packing, shuffle) produce byte-identical encoded payloads across all threads values. Opaque codecs (blosc2, zstd with workers) round-trip losslessly but may reorder compressed blocks by completion order.
  • TypeScript wrapper (@ecmwf/tensogram, typescript/). Ergonomic TypeScript bindings over the existing WASM crate with a full Vitest suite (smoke, init, encode, decode, metadata, streaming, errors, dtype, file, property-based, cross-language golden parity).
  • Docs reorganisation. ARCHITECTURE.md moved under plans/; plans/DONE.md and plans/TEST.md reshaped to be version-agnostic; plans/BRAINSTORMING.md added for exploratory future directions.
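
The lcm(8, bpv) chunking mentioned for non-byte-aligned simple_packing follows from simple arithmetic: a group of lcm(8, bpv) bits is the smallest unit that both holds a whole number of values and ends on a byte boundary, so worker threads can split the stream at group boundaries without sharing a byte. The sketch below works through that arithmetic; the function names and the even-split policy are assumptions, not the library's scheduler.

```rust
fn gcd(a: u32, b: u32) -> u32 {
    if b == 0 { a } else { gcd(b, a % b) }
}

fn lcm(a: u32, b: u32) -> u32 {
    a / gcd(a, b) * b
}

/// Smallest byte-aligned group: (values per group, bytes per group).
fn aligned_group(bits_per_value: u32) -> (u32, u32) {
    assert!(bits_per_value > 0, "bits_per_value must be at least 1");
    let bits = lcm(8, bits_per_value);
    (bits / bits_per_value, bits / 8)
}

/// Split `num_values` into at most `threads` value ranges whose starts
/// all fall on byte boundaries. `threads == 0` is treated as sequential.
fn chunk_ranges(num_values: usize, bits_per_value: u32, threads: usize) -> Vec<(usize, usize)> {
    let group = aligned_group(bits_per_value).0 as usize;
    let workers = threads.max(1);
    let groups = (num_values + group - 1) / group; // ceiling division
    let per_worker = ((groups + workers - 1) / workers).max(1);
    let step = per_worker * group;
    let mut out = Vec::new();
    let mut start = 0;
    while start < num_values {
        let end = (start + step).min(num_values);
        out.push((start, end));
        start = end;
    }
    out
}
```

At 12 bits per value the aligned group is 2 values in 3 bytes, so ten values split across two workers become the ranges 0..6 and 6..10, with the second range starting at byte offset 9.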

Added

  • threads parameters on every Rust/Python/FFI/C++ encode/decode entry point plus TensogramFile.decode_message, async variants, batch variants, and StreamingEncoder.
  • CLI global --threads N flag with TENSOGRAM_THREADS env fallback; decode-heavy subcommands honour it, metadata-only subcommands ignore it.
  • threads-scaling benchmark binary sweeping representative codec combinations across a configurable thread budget.
  • New docs page docs/src/guide/multi-threaded-pipeline.md covering option semantics, axis-A/B policy, the determinism contract, env-var precedence, and tuning recommendations. Benchmark results page extended with a Threading Scaling section.
  • Determinism tests at every layer: Rust integration suite, Python test module, per-codec unit tests, and C++ GoogleTest coverage.
  • TensogramError::Remote error class with a dedicated string in the C FFI formatter and a remote_error exception class in the C++ wrapper.
  • README documentation badge and online docs link.

Changed

  • New cargo feature threads (default-on native, off on wasm32) on both tensogram-core and tensogram-encodings.
  • Workspace zstd gains the zstdmt feature so libzstd is built with thread support.
  • Workspace clap gains the env feature for automatic TENSOGRAM_THREADS reading.
  • PipelineConfig gains an intra_codec_threads: u32 field.
  • FFI signatures gain a threads parameter (ABI break vs. 0.12.0; defaults match previous behaviour).
  • plans/DONE.md rewritten as a version-agnostic implementation-path log with explicit instructions that agents must not add version numbers or fixed test counts to that file.
  • plans/TEST.md replaced with a shape-over-counts coverage description.
  • plans/IDEAS.md cleaned of already-shipped items.
  • Apache 2.0 licence metadata tightened across the repository.
  • Docs fact-check pass corrected multiple API signatures (EncodeOptions, DecodeOptions, decode_range, simple_packing, shuffle, CLI usage) against source.

Stats

| Component | Result |
|---|---|
| cargo test --workspace | 1226 passed, 0 failed |
| cargo test -p tensogram-core --features remote,async | all passed |
| pytest python/tests/ | 436 passed, 1 skipped |
| pytest python/tensogram-xarray/tests/ | 201 passed |
| pytest python/tensogram-zarr/tests/ | 224 passed |
| cargo clippy --workspace --all-targets --all-features | clean |
| mdbook build docs/ | clean |

Full changelog

See CHANGELOG.md#0130---2026-04-17.

0.12.0

15 Apr 22:40

Highlights

This release restructures the repository for open sourcing, replaces the GPL sz3-sys dependency with a clean-room Apache-2.0 alternative, adds producer metadata dimension hints to the xarray backend, and includes a systematic edge case audit with 122 new tests.

Added

  • Producer metadata dimension hints — xarray backend resolves dimension names from _extra_["dim_names"] (list or dict format) embedded by writers, before falling back to dim_N. (PR #34, @HCookie)
  • Open source preparation — Apache 2.0 licence headers on all 203 source files, THIRD_PARTY_LICENSES audit (166/166 compatible), CODE_OF_CONDUCT.md, SECURITY.md, PR template with CLA, branch protection. (PR #35)
  • Clean-room tensogram-sz3-sys (Apache-2.0 OR MIT) — replaces GPL-licensed sz3-sys. Zero GPL code in the dependency tree.
  • Docker CI image — parallel lint/test/python/C++ jobs. (PR #36)
  • Top-level Makefile: make rust-test, make python-test, make cpp-test, make lint
  • 122 new tests — edge cases (30), code coverage (92): NaN/Inf round-trip, bitmask validation, 100-object stress, unicode metadata, mixed streaming+buffered, and more.

Changed

  • Repository restructured: rust/, python/, cpp/ language-grouped layout. Workspace Cargo.toml stays at root. (PR #37)
  • Copyright headers 2024- → 2026-; 2 library panics replaced with Result propagation.

Fixed

  • Bitmask data length validated at encode time. compute_strides overflow guard. FFI tgm_scan_entry OOB returns error. FFI null pointer safety. SZ3 GCC compatibility.

Stats

  • 1,848+ total tests (920 Rust + 423 Python + 201 xarray + 224 zarr + 80+ C++) — all green
  • 0 clippy warnings, 0 ruff issues
  • 166/166 third-party dependencies Apache-2.0 compatible

Full changelog

See CHANGELOG.md