94 changes: 94 additions & 0 deletions CLAUDE.md
@@ -0,0 +1,94 @@
# Claude and Other Agents

# Guidelines

- NOTE: When the user's request matches an available skill:
  - ALWAYS invoke it using the Skill tool as your FIRST action.
  - Do NOT answer directly, do NOT use other tools first.
  - The skill has specialized workflows that produce better results than ad-hoc answers.

- CRITICAL: Always prefer the LSP tool over Grep/Read for code navigation.
  - Use it to find definitions, references, and workspace symbols.


- IMPORTANT: when planning and before you do any work:
  - ALWAYS mention how you would verify and validate that work is correct
  - include TDD tests in your plan
  - take a behaviour driven approach
  - you are very much ENCOURAGED to ask questions to get the design correct
  - ALWAYS seek clarifications to sort out ambiguities
  - ALWAYS provide a summary of the Design and implementation Plan


- NOTE: When the user asks for "second pass", "third pass" or "N-th pass", review for:
  - simplification opportunities,
  - naming/comments/docs quality,
  - edge cases and logical regressions,
  - in C/C++, NEVER produce undefined behavior, and never segfault or stop executing without returning an error or throwing an exception,
  - all documentation up-to-date with changes,
  - required formatter/lint/tests run

- NOTE: when user asks for 'error handling' checks:
  - verify no panic in rust code
  - verify how errors are handled across the code base, in all languages
  - ensure all errors are handled and reported correctly, with enough information reaching users

- NOTE: when user asks for 'edge cases':
  - look specifically for edge cases
  - look for undefined behaviour or ambiguities
  - if necessary, ask the user to clarify

- NOTE: when user asks for 'code coverage':
  - explore all the code base looking for code that isn't yet tested.
  - Look specifically for testing edge cases.
  - Aim to have at least 95% test coverage.

- NOTE: When user asks for 'final prep' make:
  - final check everything builds, all languages and all tests pass
  - all examples in all languages compile and run
  - all docs build
  - if successful, carefully:
    - select files and contributions to git add
    - ignore the build files and artifacts, don't add hidden directories
    - if not in a branch, create a new properly named branch
    - git commit
    - make a pull request to upstream GitHub project

- NOTE: when user asks to do 'pr reply' or 'pull request reply':
  - check GitHub pull request reviews
  - consider them with respect to the philosophy and aims of this software
  - if in doubt seek user clarifications
  - fix code and address the raised issues
  - update the docs/
  - make a summary and push your changes to update the PR
  - poll to wait for the CI to finish running
  - continue iterating until all recommendations and issues have been addressed

- NOTE: When user asks for 'make release' execute:
  - check all changes are committed and pushed upstream
  - final check everything builds, all languages and all tests pass
  - all examples in all languages compile and run
  - all docs build
  - if any of the above fails STOP and prompt the user for action
  - otherwise, proceed by checking the latest version upstream and in the VERSION file
  - if needed, bump the version in the VERSION file, commit and tag, then push to upstream
Comment on lines +67 to +74

Copilot AI Apr 13, 2026

Typo: commited → committed.

  - make a GitHub release

# Design & Purpose

- README.md -- entry level generic information
- docs/ -- full documentation in RST format

# Build / lint / test (required before marking done)

## Languages
This project contains C, C++, Fortran and Python code

# Version control
- Git project in github.com/ecmwf/multio
- Use this repository's own versioning and release processes.
- REMEMBER on releases:
  - check all is committed and pushed upstream, otherwise STOP and warn user
  - update the VERSION file
  - git tag with version
  - push and create release in GitHub
7 changes: 7 additions & 0 deletions CMakeLists.txt
@@ -87,6 +87,13 @@ if( HAVE_FORTRAN )
endif()


### Tensogram encoding

ecbuild_add_option( FEATURE TENSOGRAM
                    DEFAULT OFF
                    DESCRIPTION "Encode data using Tensogram format"
                    REQUIRED_PACKAGES "NAME tensogram" )

### Maestro plugin

ecbuild_add_option( FEATURE MAESTRO
120 changes: 120 additions & 0 deletions cmake/FindTensogram.cmake
@@ -0,0 +1,120 @@
# (C) Copyright 2025- ECMWF.
#
# This software is licensed under the terms of the Apache Licence Version 2.0
# which can be obtained at http://www.apache.org/licenses/LICENSE-2.0.
# In applying this licence, ECMWF does not waive the privileges and immunities
# granted to it by virtue of its status as an intergovernmental organisation
# nor does it submit to any jurisdiction.

# Try to find the Tensogram library (N-dimensional tensor message format)
#
# Tensogram is a Rust-core library with a C FFI layer and a C++ header-only wrapper.
# This module locates the pre-built static library and the required headers.
#
# The following paths will be searched, both in the environment and as CMake variables:
#
# TENSOGRAM_ROOT
# TENSOGRAM_DIR
# TENSOGRAM_PATH
#
# If found, the tensogram::tensogram imported target will be created.
#
# Output variables:
# tensogram_FOUND - True if tensogram was found
# TENSOGRAM_INCLUDE_DIRS - Include directories (C++ wrapper + C FFI header)
# TENSOGRAM_LIBRARIES - Libraries to link against

# --- Locate the C++ header-only wrapper: tensogram.hpp ---

find_path(TENSOGRAM_CPP_INCLUDE_DIR
    NAMES tensogram.hpp
    HINTS
        ${TENSOGRAM_ROOT}
        ${TENSOGRAM_DIR}
        ${TENSOGRAM_PATH}
        ENV TENSOGRAM_ROOT
        ENV TENSOGRAM_DIR
        ENV TENSOGRAM_PATH
    PATH_SUFFIXES include
)

# --- Locate the C FFI header: tensogram.h ---
# This is typically in a different include path (crates/tensogram-ffi/)

find_path(TENSOGRAM_FFI_INCLUDE_DIR
    NAMES tensogram.h
    HINTS
        ${TENSOGRAM_ROOT}
        ${TENSOGRAM_DIR}
        ${TENSOGRAM_PATH}
        ENV TENSOGRAM_ROOT
        ENV TENSOGRAM_DIR
        ENV TENSOGRAM_PATH
    PATH_SUFFIXES include crates/tensogram-ffi
)

# --- Locate the Rust static library: libtensogram_ffi.a ---

find_library(TENSOGRAM_LIBRARY
    NAMES tensogram_ffi
    HINTS
        ${TENSOGRAM_ROOT}
        ${TENSOGRAM_DIR}
        ${TENSOGRAM_PATH}
        ENV TENSOGRAM_ROOT
        ENV TENSOGRAM_DIR
        ENV TENSOGRAM_PATH
    PATH_SUFFIXES lib lib64 target/release
)

# --- Aggregate results ---

set(TENSOGRAM_INCLUDE_DIRS ${TENSOGRAM_CPP_INCLUDE_DIR} ${TENSOGRAM_FFI_INCLUDE_DIR})
set(TENSOGRAM_LIBRARIES ${TENSOGRAM_LIBRARY})

include(FindPackageHandleStandardArgs)
find_package_handle_standard_args(tensogram
    DEFAULT_MSG
    TENSOGRAM_LIBRARY
    TENSOGRAM_CPP_INCLUDE_DIR
    TENSOGRAM_FFI_INCLUDE_DIR
)

mark_as_advanced(TENSOGRAM_CPP_INCLUDE_DIR TENSOGRAM_FFI_INCLUDE_DIR TENSOGRAM_LIBRARY)

# --- Create imported target ---

if(tensogram_FOUND AND NOT TARGET tensogram::tensogram)

    # The Rust static library (imported)
    add_library(tensogram_ffi STATIC IMPORTED GLOBAL)
    set_target_properties(tensogram_ffi PROPERTIES
        IMPORTED_LOCATION "${TENSOGRAM_LIBRARY}"
    )

    # Header-only C++ wrapper (INTERFACE) linking the Rust lib + platform libs
    add_library(tensogram::tensogram INTERFACE IMPORTED GLOBAL)
    set_target_properties(tensogram::tensogram PROPERTIES
        INTERFACE_INCLUDE_DIRECTORIES "${TENSOGRAM_CPP_INCLUDE_DIR};${TENSOGRAM_FFI_INCLUDE_DIR}"
    )
    target_link_libraries(tensogram::tensogram INTERFACE tensogram_ffi)

    # Platform-specific system libraries required by the Rust static library
    if(APPLE)
        target_link_libraries(tensogram::tensogram INTERFACE
            "-framework CoreFoundation"
            "-framework Security"
            "-framework SystemConfiguration"
            "-lc++"
            "-lm"
        )
    elseif(UNIX)
        target_link_libraries(tensogram::tensogram INTERFACE
            dl
            pthread
            m
            stdc++
        )
    endif()

endif()
119 changes: 119 additions & 0 deletions docs/content/processing-pipelines.rst
@@ -198,6 +198,125 @@ the template, so what GRIB template to use will depend on the types of data bein
unstructured-grid-type : eORCA025


Encode-Tensogram
~~~~~~~~~~~~~~~~

This action encodes raw field data into the `Tensogram`_ N-dimensional tensor message format,
producing self-describing binary messages that preserve MARS metadata. It is particularly
useful for producing compact, portable output that external analysis tools can process.

MARS metadata from the input message is preserved in two ways:

* On the output Message itself (for downstream routing within multio pipelines)
* Embedded in the Tensogram payload under ``base[0].mars`` (for external tool interoperability)

The action supports configurable encoding (``simple_packing`` for lossy quantization), several
compression algorithms (``szip``, ``zstd``, ``lz4``), an optional pre-compression filter
(``shuffle``), and integrity verification (``xxh3`` hashing).
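
As an illustration of what a ``shuffle`` filter typically does, the sketch below implements the
generic byte-shuffle technique in plain Python: the bytes of fixed-width elements are regrouped so
that all first bytes come first, then all second bytes, and so on, which often helps the
compressor that runs afterwards. The helper names ``shuffle`` and ``unshuffle`` are hypothetical,
and this is not Tensogram's implementation, whose details may differ.

.. code-block:: python

    import struct


    def shuffle(buf: bytes, itemsize: int) -> bytes:
        """Group byte 0 of every element, then byte 1, and so on."""
        if len(buf) % itemsize != 0:
            raise ValueError("buffer length must be a multiple of itemsize")
        # buf[k::itemsize] picks the k-th byte of every element.
        return b"".join(buf[k::itemsize] for k in range(itemsize))


    def unshuffle(buf: bytes, itemsize: int) -> bytes:
        """Inverse of shuffle: interleave the byte planes back into elements."""
        if len(buf) % itemsize != 0:
            raise ValueError("buffer length must be a multiple of itemsize")
        count = len(buf) // itemsize
        # buf[i::count] picks byte 0, 1, ... of element i from each byte plane.
        return b"".join(buf[i::count] for i in range(count))


    # Illustrative round trip on four little-endian float64 values.
    raw = struct.pack("<4d", 1.0, 1.5, 2.0, 2.5)
    assert unshuffle(shuffle(raw, 8), 8) == raw

The filter is lossless and only reorders bytes, so any gain comes from the compressor seeing
longer runs of similar bytes.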

**Note:** This action requires tensogram support to be enabled at build time with
``-DENABLE_TENSOGRAM=ON``. The tensogram library must be installed where CMake can find it,
for example by setting ``TENSOGRAM_ROOT`` (see `github.com/ecmwf/tensogram`_).

Configuration options:

======================== ======================== ==================== ============================================
Key                      Allowed Values           Default              Description
======================== ======================== ==================== ============================================
``encoding``             ``none``,                ``simple_packing``   Encoding method: ``none`` (raw) or
                         ``simple_packing``                            ``simple_packing`` (quantized integers)
``compression``          ``none``, ``szip``,      ``szip``             Compression algorithm applied after
                         ``zstd``, ``lz4``                             encoding
``filter``               ``none``, ``shuffle``    ``none``             Pre-compression filter
``hash``                 ``xxh3``, ``""``         ``xxh3``             Hash algorithm for integrity checking
``bits-per-value``       Integer (1--64)          ``16``               Bits per value for simple_packing
``decimal-scale-factor`` Integer                  ``0``                Decimal scale factor for simple_packing
======================== ======================== ==================== ============================================
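
To make the roles of ``bits-per-value`` and ``decimal-scale-factor`` concrete, the sketch below
quantizes a few values in plain Python. It assumes that ``simple_packing`` follows the GRIB simple
packing relation ``Y * 10**D = R + X * 2**E`` (``R`` reference value, ``E`` binary scale, ``D``
decimal scale, ``X`` an unsigned integer of ``bits-per-value`` bits); the function names and the
way ``E`` is chosen are illustrative assumptions, not Tensogram's actual code.

.. code-block:: python

    import math


    def simple_pack(values, bits_per_value=16, decimal_scale_factor=0):
        """Quantize floats to unsigned integers of at most bits_per_value bits."""
        # ASSUMPTION: GRIB-style relation  Y * 10**D = R + X * 2**E
        scaled = [v * 10.0 ** decimal_scale_factor for v in values]
        reference = min(scaled)                  # R
        span = max(scaled) - reference
        max_int = (1 << bits_per_value) - 1      # largest representable X
        # Smallest power-of-two step for which every X fits into max_int.
        binary_scale = math.ceil(math.log2(span / max_int)) if span > 0 else 0
        step = 2.0 ** binary_scale
        packed = [min(max_int, round((s - reference) / step)) for s in scaled]
        return packed, reference, binary_scale


    def simple_unpack(packed, reference, binary_scale, decimal_scale_factor=0):
        """Reconstruct approximate floats from the packed integers."""
        step = 2.0 ** binary_scale
        return [(reference + x * step) / 10.0 ** decimal_scale_factor for x in packed]


    # Illustrative temperatures in kelvin, packed with 12 bits per value.
    values = [273.15, 280.4, 291.0, 300.25]
    packed, ref, e = simple_pack(values, bits_per_value=12)
    restored = simple_unpack(packed, ref, e)
    print(max(abs(a - b) for a, b in zip(values, restored)))

Under these assumptions the reconstruction error is bounded by about half a quantization step
(``2**E / 2``), so fewer bits per value give smaller output at the cost of precision.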

Example configurations:

**High compression** (suitable for visual analysis):

.. code-block:: yaml

    - type : encode-tensogram
      encoding : simple_packing
      compression : szip
      bits-per-value : 12
      filter : shuffle

**Lossless** (raw float64, no packing):

.. code-block:: yaml

    - type : encode-tensogram
      encoding : none
      compression : zstd

**Balanced** (default settings):

.. code-block:: yaml

    - type : encode-tensogram
      # Uses defaults: simple_packing, 16 bits, szip compression

**Complete pipeline example** (select → encode → sink):

.. code-block:: yaml

    - name : surface-to-tensogram
      actions :
        - type : select
          match :
            - levtype : [sfc]

        - type : encode-tensogram
          encoding : simple_packing
          compression : szip
          bits-per-value : 16
          hash : xxh3

        - type : sink
          sinks :
            - type : file
              append : true
              path : output.tgm

Output files can be validated and inspected using the tensogram command-line tools:

.. code-block:: bash

    # Validate message integrity
    tensogram validate output.tgm

    # Display metadata
    tensogram info output.tgm

    # List all messages
    tensogram ls output.tgm

    # Dump message contents
    tensogram dump output.tgm

Output files can also be processed in Python using the tensogram package:

.. code-block:: python

    import tensogram

    with tensogram.TensogramFile.open("output.tgm") as f:
        for msg in f:
            meta, objects = msg
            # Access MARS metadata
            mars = meta.base[0].get('mars', {})
            # Access data arrays
            desc, data = objects[0]
            print(f"Shape: {data.shape}, dtype: {data.dtype}")
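
Because the MARS keys travel inside each message, decoded messages can be filtered without any
external index. The sketch below shows one possible helper, assuming only what the loop above
relies on: each message unpacks to ``(meta, objects)`` and ``meta.base[0]`` is dict-like. The
name ``select_by_mars`` and the stand-in messages are hypothetical and are not part of the
tensogram package.

.. code-block:: python

    from types import SimpleNamespace


    def select_by_mars(messages, **criteria):
        """Yield (meta, objects) pairs whose MARS metadata matches every criterion."""
        for meta, objects in messages:
            # ASSUMPTION: meta.base is a list of dict-like entries, as in the loop above.
            mars = meta.base[0].get("mars", {}) if meta.base else {}
            if all(mars.get(key) == value for key, value in criteria.items()):
                yield meta, objects


    # Stand-in messages so the sketch runs without the tensogram package installed.
    fake_messages = [
        (SimpleNamespace(base=[{"mars": {"levtype": "sfc", "param": "2t"}}]), []),
        (SimpleNamespace(base=[{"mars": {"levtype": "pl", "param": "t"}}]), []),
    ]
    surface = list(select_by_mars(fake_messages, levtype="sfc"))
    print(len(surface))

With a real file, the same helper could be given the opened file object in place of
``fake_messages``, provided iteration behaves as in the example above.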

.. _`Tensogram`: https://github.com/ecmwf/tensogram
.. _`github.com/ecmwf/tensogram`: https://github.com/ecmwf/tensogram


Sink
~~~~

1 change: 1 addition & 0 deletions src/multio/action/CMakeLists.txt
@@ -8,6 +8,7 @@ add_subdirectory(transport)
add_subdirectory(sink)
add_subdirectory(encode)
add_subdirectory(encode-mtg2)
add_subdirectory(encode-tensogram)
add_subdirectory(single-field-sink)
add_subdirectory(print)
add_subdirectory(mask)
17 changes: 17 additions & 0 deletions src/multio/action/encode-tensogram/CMakeLists.txt
@@ -0,0 +1,17 @@
if( HAVE_TENSOGRAM )

    ecbuild_add_library(
        TARGET multio-action-encode-tensogram

        TYPE SHARED # Due to reliance on factory self-registration this library cannot be static

        SOURCES
            EncodeTensogram.cc
            EncodeTensogram.h

        PUBLIC_LIBS
            multio
            tensogram::tensogram
    )

endif()