Merged
53 changes: 53 additions & 0 deletions CITATION.cff
Original file line number Diff line number Diff line change
@@ -0,0 +1,53 @@
cff-version: 1.2.0
title: Anscombe Transform Codec
message: If you use this software, please cite it.
version: 1.0.0
repository-code: https://github.com/datajoint/anscombe-transform
abstract: >
Codec implementation of the Anscombe transform for compressing photon-limited
microscopy, radiography, and astronomy image recordings, published by DataJoint.
authors:
- family-names: Yatsenko
given-names: Dimitri
email: [email protected]
- family-names: Lecoq
given-names: Jerome
email: [email protected]
- family-names: Bennett
given-names: Davis
email: [email protected]
contact:
- family-names: Yatsenko
given-names: Dimitri
email: [email protected]
license: MIT
preferred-citation:
type: article
title: Standardized measurements for monitoring and comparing multiphoton microscope systems
authors:
- family-names: Lees
given-names: Robert M.
- family-names: Bianco
given-names: Isaac H.
- family-names: Campbell
given-names: Robert A. A.
- family-names: Orlova
given-names: Natalia
- family-names: Peterka
given-names: Darcy S.
- family-names: Pichler
given-names: Bruno
- family-names: Smith
given-names: Spencer LaVere
- family-names: Yatsenko
given-names: Dimitri
- family-names: Yu
given-names: Che-Hang
- family-names: Packer
given-names: Adam M.
journal: Nature Protocols
volume: 20
pages: 1-38
date-published: 2025-03-17
doi: 10.1038/s41596-024-01120-w
url: https://doi.org/10.1038/s41596-024-01120-w
11 changes: 7 additions & 4 deletions README.md
@@ -1,15 +1,18 @@
[![PyPI version](https://badge.fury.io/py/anscombe-transform.svg)](https://badge.fury.io/py/anscombe-transform) ![tests](https://github.com/datajoint/anscombe-transform/actions/workflows/tests.yaml/badge.svg)
[![PyPI version](https://badge.fury.io/py/anscombe-transform.svg)](https://badge.fury.io/py/anscombe-transform) ![tests](https://github.com/datajoint/anscombe-transform/actions/workflows/test.yml/badge.svg)

# Anscombe transform

This codec is designed for compressing image recordings with Poisson noise, which are produced by photon-limited modalities such multiphoton microscopy, radiography, and astronomy.
This codec is designed for compressing image recordings with Poisson noise, such as those acquired in microscopy, radiography, and astronomy.

The codec assumes that the video is linearly encoded with a potential offset (`zero_level`) and that the `photon_sensitivity` (the average increase in intensity per photon) is either already known or can be accurately estimated from the data.
**Status:** This is the official Anscombe transform codec for Zarr/Numcodecs, actively maintained by DataJoint.
It originated as a fork of [AllenNeuralDynamics/poisson-numcodecs](https://github.com/AllenNeuralDynamics/poisson-numcodecs) and earlier developments at https://github.com/datajoint/compress-multiphoton, and has since diverged significantly. New users should rely on this repository as the canonical source.

The codec assumes that the video is linearly encoded with a potential offset (`zero_level`) and that the `conversion_gain` (also called `photon_sensitivity`)—the average increase in intensity per photon—is either already known or can be accurately estimated from the data.

The codec re-quantizes the grayscale efficiently with a square-root-like transformation to equalize the noise variance across the grayscale levels: the [Anscombe Transform](https://en.wikipedia.org/wiki/Anscombe_transform).
This results in a smaller number of unique grayscale levels and significant improvements in the compressibility of the data without sacrificing signal accuracy.
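The transform itself fits in a few lines. The sketch below is a minimal illustration of the textbook Anscombe transform only; the actual codec additionally applies the `zero_level`/`conversion_gain` scaling and re-quantizes through precomputed lookup tables:

```python
import numpy as np

def anscombe(x):
    # Variance-stabilizing transform: Poisson counts x (variance = mean)
    # map to values with approximately unit variance.
    return 2.0 * np.sqrt(np.asarray(x, dtype=float) + 3.0 / 8.0)

def inverse_anscombe(y):
    # Algebraic inverse; for very small counts an unbiased inverse differs slightly.
    return (np.asarray(y, dtype=float) / 2.0) ** 2 - 3.0 / 8.0

counts = np.random.poisson(lam=50, size=100_000)
stabilized = anscombe(counts)
print(stabilized.std())  # approximately 1.0 once the rate exceeds a few photons
```

Because the stabilized values have roughly unit variance everywhere, they can be quantized on a uniform grid without losing signal relative to the noise floor.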

To use the codec, one must supply two pieces of information: `zero_level` (the input value corresponding to the absence of light) and `photon_sensitivity` (levels/photon).
To use the codec, one must supply two pieces of information: `zero_level` (the input value corresponding to the absence of light) and `conversion_gain` (levels/photon).

The codec is used in Zarr as a filter prior to compression.

30 changes: 16 additions & 14 deletions docs/examples/workbook.md
@@ -29,17 +29,19 @@ First, let's create synthetic data that mimics photon-limited imaging:

```python
# Parameters for synthetic data
n_frames = 100
height, width = 512, 512
mean_photons = 50 # Average photons per pixel
zero_level = 100 # Camera baseline
conversion_gain = 2.5 # ADU per photon
n_frames = 30
height, width = 120, 120
mean_photon_rate = 5.0 # Average photons per pixel (exponential distribution of rates)
zero_level = -10.0 # camera baseline
conversion_gain = 2.5 # levels per photon

# Generate Poisson-distributed photon counts
photon_counts = np.random.poisson(lam=mean_photons, size=(n_frames, height, width))
photon_rate = np.random.exponential(scale=mean_photon_rate, size=(1, height, width))
photon_counts = np.random.poisson(lam=np.tile(photon_rate, (n_frames, 1, 1)))
measured_signal = photon_counts + np.random.randn(n_frames, height, width) * 0.2  # add Gaussian read noise

# Convert to camera signal (ADU)
camera_signal = (photon_counts * conversion_gain + zero_level).astype('int16')
# Convert to camera signal
camera_signal = (measured_signal * conversion_gain + zero_level).astype('int16')

print(f"Data shape: {camera_signal.shape}")
print(f"Data range: [{camera_signal.min()}, {camera_signal.max()}]")
@@ -58,16 +60,16 @@ estimated_gain = result['sensitivity']
estimated_zero = result['zero_level']

print(f"\nTrue parameters:")
print(f" Conversion gain: {conversion_gain:.3f} ADU/photon")
print(f" Zero level: {zero_level:.1f} ADU")
print(f" Conversion gain: {conversion_gain:.3f} units/photon")
print(f" Zero level: {zero_level:.1f} units")

print(f"\nEstimated parameters:")
print(f" Conversion gain: {estimated_gain:.3f} ADU/photon")
print(f" Zero level: {estimated_zero:.1f} ADU")
print(f" Conversion gain: {estimated_gain:.3f} units/photon")
print(f" Zero level: {estimated_zero:.1f} units")

print(f"\nEstimation error:")
print(f" Gain error: {abs(estimated_gain - conversion_gain):.3f} ADU/photon")
print(f" Zero level error: {abs(estimated_zero - zero_level):.1f} ADU")
print(f" Gain error: {abs(estimated_gain - conversion_gain):.3f} units/photon")
print(f" Zero level error: {abs(estimated_zero - zero_level):.1f} units")
```

## Visualize Noise Model
33 changes: 20 additions & 13 deletions docs/getting-started/quick-start.md
@@ -9,18 +9,25 @@ import zarr
import numpy as np
from anscombe_transform import AnscombeTransformV3

# Generate sample data with Poisson noise
# Simulating photon-limited imaging data
data = np.random.poisson(lam=50, size=(100, 512, 512)).astype('int16')
# Synthesize data with Poisson noise - simulating photon-limited microscopy data
n_frames, height, width = 50, 256, 256
mean_photon_rate = 5.0 # Average photons per pixel (exponential distribution of rates)
zero_level = 20.0 # camera baseline
conversion_gain = 30.0 # levels per photon
photon_rate = np.random.exponential(scale=mean_photon_rate, size=(1, height, width))
photon_counts = np.random.poisson(np.tile(photon_rate, (n_frames, 1, 1)))
measured_signal = photon_counts + np.random.randn(n_frames, height, width) * 0.2  # add Gaussian read noise

data = (zero_level + conversion_gain * measured_signal).astype('int16')

# Create a Zarr array with the Anscombe codec
store = zarr.storage.MemoryStore()
arr = zarr.create(
store=store,
shape=data.shape,
chunks=(10, 512, 512),
chunks=(12, 160, 80),
dtype='int16',
filters=[AnscombeTransformV3(zero_level=100, conversion_gain=2.5)],
filters=[AnscombeTransformV3(zero_level=zero_level, conversion_gain=conversion_gain)],
zarr_format=3
)

@@ -44,10 +51,10 @@ import zarr
arr = zarr.open_array(
'data.zarr',
mode='w',
shape=(100, 512, 512),
chunks=(10, 512, 512),
shape=data.shape,
chunks=(12, 160, 80),
dtype='int16',
compressor=AnscombeTransformV2(zero_level=100, conversion_gain=2.5)
compressor=AnscombeTransformV2(zero_level=zero_level, conversion_gain=conversion_gain)
)

# Write and read data
@@ -64,10 +71,10 @@ from anscombe_transform import compute_conversion_gain
import numpy as np

# Load your movie data as (time, height, width)
movie = np.random.poisson(lam=50, size=(100, 512, 512))
movie = data

# Estimate parameters
result = compute_conversion_gain(movie)
result = compute_conversion_gain(movie)

print(f"Estimated conversion gain: {result['sensitivity']:.3f}")
print(f"Estimated zero level: {result['zero_level']:.3f}")
@@ -90,10 +97,10 @@ from anscombe_transform import AnscombeTransformV3

# For Zarr V3, use filters + codecs
arr = zarr.create(
shape=(100, 512, 512),
shape=data.shape,
chunks=(12, 160, 80),
dtype='int16',
filters=[AnscombeTransformV3(zero_level=100, conversion_gain=2.5)],
filters=[AnscombeTransformV3(zero_level=zero_level, conversion_gain=conversion_gain)],
compressor={'id': 'blosc', 'cname': 'zstd', 'clevel': 5},
zarr_format=3
)
@@ -102,7 +109,7 @@ arr = zarr.create(
## Key Parameters

- **`zero_level`**: The signal value when no photons are detected. This is the baseline offset in your camera sensor.
- **`conversion_gain`** (also called `photon_sensitivity`): How many signal units correspond to one photon. For example, if your camera reports 2.5 ADU per photon, use `conversion_gain=2.5`.
- **`conversion_gain`** (also called `photon_sensitivity`): How many signal units correspond to one photon. For example, if your camera's signal increases by 2.5 levels per detected photon, use `conversion_gain=2.5`.
- **`encoded_dtype`**: The data type for encoded values (default: `uint8`). Use `uint8` for maximum compression.
- **`decoded_dtype`**: The data type for decoded values (default: inferred from data).
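Given the first two parameters, recovering photon counts from raw pixel values is a simple inversion of the camera's linear encoding. The numbers below are illustrative, not from a real sensor:

```python
zero_level = 100.0       # signal value with no photons (camera baseline)
conversion_gain = 2.5    # signal levels per detected photon

def to_photons(signal):
    # Invert the linear encoding: photons = (signal - baseline) / gain
    return (signal - zero_level) / conversion_gain

print(to_photons(350.0))  # -> 100.0 photons
```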

2 changes: 1 addition & 1 deletion docs/user-guide/overview.md
@@ -74,7 +74,7 @@ These tables are computed once during codec initialization and reused for all en
### Compression Ratios

Typical compression ratios (Anscombe + Blosc/Zstd):
- **4-8x** for typical multiphoton microscopy data
- **3-8x** for typical multiphoton microscopy data
- **6-10x** for astronomy data
- **3-6x** for radiography data
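Exact ratios depend on the data, but the effect is easy to sanity-check with standard-library tools. This rough sketch uses `zlib` as a stand-in for Blosc/Zstd and a plain Anscombe re-quantization in place of the codec's lookup tables:

```python
import zlib
import numpy as np

rng = np.random.default_rng(0)
# Simulated photon-limited frames: int16 Poisson counts around 50 photons/pixel.
raw = rng.poisson(50, size=(10, 128, 128)).astype('int16')

# Anscombe-transform and re-quantize to uint8 (coarse stand-in for the codec).
encoded = (2.0 * np.sqrt(raw + 3.0 / 8.0)).astype('uint8')

raw_ratio = raw.nbytes / len(zlib.compress(raw.tobytes(), 5))
enc_ratio = raw.nbytes / len(zlib.compress(encoded.tobytes(), 5))
print(f"plain: {raw_ratio:.1f}x  anscombe+uint8: {enc_ratio:.1f}x")
```

The gain comes both from narrowing int16 to uint8 and from the sharply reduced number of distinct grayscale levels after variance stabilization.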

4 changes: 2 additions & 2 deletions examples/workbook.ipynb
@@ -392,7 +392,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "microns_trippy",
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
@@ -406,7 +406,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.16"
"version": "3.11.11"
}
},
"nbformat": 4,
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -7,8 +7,8 @@ name = "anscombe_transform"
version = "1.0.0"

authors = [
{ name = "Jerome Lecoq", email = "[email protected]" },
{ name = "Dimitri Yatsenko", email = "[email protected]" },
{ name = "Jerome Lecoq", email = "[email protected]" },
{ name = "Davis Bennett", email = "[email protected]" },
]
maintainers = [
38 changes: 29 additions & 9 deletions tests/test_zarr.py
@@ -10,35 +10,56 @@
def test_zarr_v2_roundtrip() -> None:
decoded_dtype = "int16"
encoded_dtype = "uint8"
data = np.random.poisson(100, size=(20, 20)).astype(decoded_dtype)

# generate fake data
np.random.seed(42)
size = (20, 20)
sensitivity = 100.0
zero_level = -5.0
true_rate = np.random.exponential(scale=5, size=size)
data = (
zero_level
+ sensitivity * (np.random.poisson(true_rate) + np.random.randn(*size) * 0.25)
).astype(decoded_dtype)

# construct codec
codec = AnscombeTransformV2(
conversion_gain=sensitivity,
zero_level=0,
zero_level=zero_level,
encoded_dtype=encoded_dtype,
decoded_dtype=decoded_dtype,
)
data_encoded = codec.encode(data)
data_rt = codec.decode(data_encoded).reshape(data.shape)

store = {}

# write data
store = {}
_ = create_array(store=store, data=data, zarr_format=2, compressors=codec)

# read data
z_arr_r = open_array(store=store)
assert z_arr_r.dtype == decoded_dtype
assert nearly_equal(z_arr_r, data_rt, sensitivity / 2)
assert nearly_equal(z_arr_r, data_rt, sensitivity * 0.5)


def test_zarr_v3_roundtrip() -> None:
decoded_dtype = "int16"
encoded_dtype = "uint8"
data = np.random.poisson(100, size=(20, 20)).astype(decoded_dtype)

# generate fake data
np.random.seed(42)
size = (20, 20)
sensitivity = 100.0
zero_level = -5.0
true_rate = np.random.exponential(scale=5, size=size)
data = (
zero_level
+ sensitivity * (np.random.poisson(true_rate) + np.random.randn(*size) * 0.25)
).astype(decoded_dtype)

codec = AnscombeTransformV3(
conversion_gain=sensitivity,
zero_level=0,
zero_level=zero_level,
encoded_dtype=encoded_dtype,
decoded_dtype=decoded_dtype,
)
@@ -47,9 +68,8 @@ def test_zarr_v3_roundtrip() -> None:
data_encoded = codec._encode(data)
data_rt = codec._decode(data_encoded).reshape(data.shape)

store = {}

# write data
store = {}
_ = create_array(store=store, data=data, zarr_format=3, filters=[codec])
# read data
z_arr_r = open_array(store=store)