Commit a7aea3a

Merge pull request #1 from datajoint/chore-canon
Chore canon
2 parents d31340e + 0230a51 commit a7aea3a

File tree

8 files changed: +129 −44 lines changed

CITATION.cff

Lines changed: 53 additions & 0 deletions
@@ -0,0 +1,53 @@
+cff-version: 1.2.0
+title: Anscombe Transform Codec
+message: If you use this software, please cite it.
+version: 1.0.0
+repository-code: https://github.com/datajoint/anscombe-transform
+abstract: >
+  Codec implementation of the Anscombe transform for compressing photon-limited
+  microscopy, radiography, and astronomy image recordings, published by DataJoint.
+authors:
+  - family-names: Yatsenko
+    given-names: Dimitri
+
+  - family-names: Lecoq
+    given-names: Jerome
+
+  - family-names: Bennett
+    given-names: Davis
+
+contact:
+  - family-names: Yatsenko
+    given-names: Dimitri
+
+license: MIT
+preferred-citation:
+  type: article
+  title: Standardized measurements for monitoring and comparing multiphoton microscope systems
+  authors:
+    - family-names: Lees
+      given-names: Robert M.
+    - family-names: Bianco
+      given-names: Isaac H.
+    - family-names: Campbell
+      given-names: Robert A. A.
+    - family-names: Orlova
+      given-names: Natalia
+    - family-names: Peterka
+      given-names: Darcy S.
+    - family-names: Pichler
+      given-names: Bruno
+    - family-names: Smith
+      given-names: Spencer LaVere
+    - family-names: Yatsenko
+      given-names: Dimitri
+    - family-names: Yu
+      given-names: Che-Hang
+    - family-names: Packer
+      given-names: Adam M.
+  journal: Nature Protocols
+  volume: '20'
+  pages: 1-38
+  date-published: 2025-03-17
+  doi: 10.1038/s41596-024-01120-w
+  url: https://doi.org/10.1038/s41596-024-01120-w

README.md

Lines changed: 7 additions & 4 deletions
@@ -1,15 +1,18 @@
-[![PyPI version](https://badge.fury.io/py/anscombe-transform.svg)](https://badge.fury.io/py/anscombe-transform) ![tests](https://github.com/datajoint/anscombe-transform/actions/workflows/tests.yaml/badge.svg)
+[![PyPI version](https://badge.fury.io/py/anscombe-transform.svg)](https://badge.fury.io/py/anscombe-transform) ![tests](https://github.com/datajoint/anscombe-transform/actions/workflows/test.yml/badge.svg)
 
 # Anscombe transform
 
-This codec is designed for compressing image recordings with Poisson noise, which are produced by photon-limited modalities such multiphoton microscopy, radiography, and astronomy.
+This codec is designed for compressing image recordings with Poisson noise such as in microscopy, radiography, and astronomy.
 
-The codec assumes that the video is linearly encoded with a potential offset (`zero_level`) and that the `photon_sensitivity` (the average increase in intensity per photon) is either already known or can be accurately estimated from the data.
+**Status:** This is the official and actively maintained Anscombe transform codec for Zarr/Numcodecs, maintained by DataJoint.
+It originated as a fork of [AllenNeuralDynamics/poisson-numcodecs](https://github.com/AllenNeuralDynamics/poisson-numcodecs) and earlier developments at https://github.com/datajoint/compress-multiphoton. It has since diverged significantly. New users should rely on this repository as the canonical source.
+
+The codec assumes that the video is linearly encoded with a potential offset (`zero_level`) and that the `conversion_gain` (also called `photon_sensitivity`)—the average increase in intensity per photon—is either already known or can be accurately estimated from the data.
 
 The codec re-quantizes the grayscale efficiently with a square-root-like transformation to equalize the noise variance across the grayscale levels: the [Anscombe Transform](https://en.wikipedia.org/wiki/Anscombe_transform).
 This results in a smaller number of unique grayscale levels and significant improvements in the compressibility of the data without sacrificing signal accuracy.
 
-To use the codec, one must supply two pieces of information: `zero_level` (the input value corresponding to the absence of light) and `photon_sensitivity` (levels/photon).
+To use the codec, one must supply two pieces of information: `zero_level` (the input value corresponding to the absence of light) and `conversion_gain` (levels/photon).
 
 The codec is used in Zarr as a filter prior to compression.
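As a quick illustration of the variance equalization described above (a plain NumPy sketch, not the codec's internal implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

def anscombe(x):
    # Variance-stabilizing transform for Poisson-distributed data
    return 2.0 * np.sqrt(x + 3.0 / 8.0)

# Poisson noise variance grows with the mean rate...
for lam in [2.0, 20.0, 200.0]:
    x = rng.poisson(lam, size=100_000).astype(float)
    # ...but after the transform the noise variance is close to 1 at every rate,
    # so a fixed quantization step discards roughly the same amount of noise everywhere
    print(f"rate={lam:6.1f}  var(x)={x.var():8.2f}  var(anscombe(x))={anscombe(x).var():.3f}")
```

This is why re-quantizing the transformed values to integer steps preserves signal accuracy uniformly across the grayscale range while collapsing many raw levels.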

docs/examples/workbook.md

Lines changed: 16 additions & 14 deletions
@@ -29,17 +29,19 @@ First, let's create synthetic data that mimics photon-limited imaging:
 
 ```python
 # Parameters for synthetic data
-n_frames = 100
-height, width = 512, 512
-mean_photons = 50  # Average photons per pixel
-zero_level = 100  # Camera baseline
-conversion_gain = 2.5  # ADU per photon
+n_frames = 30
+height, width = 120, 120
+mean_photon_rate = 5.0  # Average photons per pixel (exponential distribution of rates)
+zero_level = -10.0  # camera baseline
+conversion_gain = 2.5  # levels per photon
 
 # Generate Poisson-distributed photon counts
-photon_counts = np.random.poisson(lam=mean_photons, size=(n_frames, height, width))
+photon_rate = np.random.exponential(scale=mean_photon_rate, size=(1, height, width))
+photon_counts = np.random.poisson(lam=np.tile(photon_rate, (n_frames, 1, 1)))
+measured_signal = photon_counts + np.random.randn(n_frames, height, width) * 0.2
 
-# Convert to camera signal (ADU)
-camera_signal = (photon_counts * conversion_gain + zero_level).astype('int16')
+# Convert to camera signal
+camera_signal = (measured_signal * conversion_gain + zero_level).astype('int16')
 
 print(f"Data shape: {camera_signal.shape}")
 print(f"Data range: [{camera_signal.min()}, {camera_signal.max()}]")
@@ -58,16 +60,16 @@ estimated_gain = result['sensitivity']
 estimated_zero = result['zero_level']
 
 print(f"\nTrue parameters:")
-print(f"  Conversion gain: {conversion_gain:.3f} ADU/photon")
-print(f"  Zero level: {zero_level:.1f} ADU")
+print(f"  Conversion gain: {conversion_gain:.3f} units/photon")
+print(f"  Zero level: {zero_level:.1f}")
 
 print(f"\nEstimated parameters:")
-print(f"  Conversion gain: {estimated_gain:.3f} ADU/photon")
-print(f"  Zero level: {estimated_zero:.1f} ADU")
+print(f"  Conversion gain: {estimated_gain:.3f} units/photon")
+print(f"  Zero level: {estimated_zero:.1f}")
 
 print(f"\nEstimation error:")
-print(f"  Gain error: {abs(estimated_gain - conversion_gain):.3f} ADU/photon")
-print(f"  Zero level error: {abs(estimated_zero - zero_level):.1f} ADU")
+print(f"  Gain error: {abs(estimated_gain - conversion_gain):.3f} units/photon")
+print(f"  Zero level error: {abs(estimated_zero - zero_level):.1f} units")
 
 ```
 
 ## Visualize Noise Model
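The parameter estimation exercised in this workbook rests on a simple property of Poisson data: each pixel's temporal variance is proportional to its temporal mean (the photon-transfer curve). A self-contained sketch of this idea, using a plain line fit (not necessarily the exact algorithm behind `compute_conversion_gain`):

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic photon-limited movie: fixed per-pixel rates, Poisson noise over time
n_frames, height, width = 200, 64, 64
zero_level, gain = -10.0, 2.5
rate = rng.exponential(scale=5.0, size=(1, height, width))
counts = rng.poisson(np.broadcast_to(rate, (n_frames, height, width)))
movie = counts * gain + zero_level

# Photon-transfer curve: for Poisson data, var = gain * (mean - zero_level),
# so a line fit of temporal variance vs. temporal mean recovers
# slope = gain and x-intercept = zero_level.
m = movie.mean(axis=0).ravel()
v = movie.var(axis=0).ravel()
slope, intercept = np.polyfit(m, v, 1)
est_gain = slope
est_zero = -intercept / slope

print(f"estimated gain ≈ {est_gain:.3f} (true {gain})")
print(f"estimated zero level ≈ {est_zero:.2f} (true {zero_level})")
```

With enough frames the fit recovers both parameters to within a few percent; shorter recordings give noisier per-pixel variances and correspondingly noisier estimates.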

docs/getting-started/quick-start.md

Lines changed: 20 additions & 13 deletions
@@ -9,18 +9,25 @@ import zarr
 import numpy as np
 from anscombe_transform import AnscombeTransformV3
 
-# Generate sample data with Poisson noise
-# Simulating photon-limited imaging data
-data = np.random.poisson(lam=50, size=(100, 512, 512)).astype('int16')
+# Synthesize data with Poisson noise - simulating photon-limited microscopy data
+n_frames, height, width = 50, 256, 256
+mean_photon_rate = 5.0  # Average photons per pixel (exponential distribution of rates)
+zero_level = 20.0  # camera baseline
+conversion_gain = 30.0  # levels per photon
+photon_rate = np.random.exponential(scale=mean_photon_rate, size=(1, height, width))
+photon_counts = np.random.poisson(np.tile(photon_rate, (n_frames, 1, 1)))
+measured_signal = photon_counts + np.random.randn(n_frames, height, width) * 0.2
+
+data = (zero_level + conversion_gain * measured_signal).astype('int16')
 
 # Create a Zarr array with the Anscombe codec
 store = zarr.storage.MemoryStore()
 arr = zarr.create(
     store=store,
     shape=data.shape,
-    chunks=(10, 512, 512),
+    chunks=(12, 160, 80),
     dtype='int16',
-    filters=[AnscombeTransformV3(zero_level=100, conversion_gain=2.5)],
+    filters=[AnscombeTransformV3(zero_level=zero_level, conversion_gain=conversion_gain)],
     zarr_format=3
 )
 
@@ -44,10 +51,10 @@ import zarr
 arr = zarr.open_array(
     'data.zarr',
     mode='w',
-    shape=(100, 512, 512),
-    chunks=(10, 512, 512),
+    shape=data.shape,
+    chunks=(12, 160, 80),
     dtype='int16',
-    compressor=AnscombeTransformV2(zero_level=100, conversion_gain=2.5)
+    compressor=AnscombeTransformV2(zero_level=zero_level, conversion_gain=conversion_gain)
 )
 
 # Write and read data
@@ -64,10 +71,10 @@ from anscombe_transform import compute_conversion_gain
 import numpy as np
 
 # Load your movie data as (time, height, width)
-movie = np.random.poisson(lam=50, size=(100, 512, 512))
+movie = data
 
 # Estimate parameters
-result = compute_conversion_gain(movie)
+result = compute_conversion_gain(data)
 
 print(f"Estimated conversion gain: {result['sensitivity']:.3f}")
 print(f"Estimated zero level: {result['zero_level']:.3f}")
@@ -90,10 +97,10 @@ from anscombe_transform import AnscombeTransformV3
 
 # For Zarr V3, use filters + codecs
 arr = zarr.create(
-    shape=(100, 512, 512),
+    shape=data.shape,
     chunks=(10, 512, 512),
     dtype='int16',
-    filters=[AnscombeTransformV3(zero_level=100, conversion_gain=2.5)],
+    filters=[AnscombeTransformV3(zero_level=zero_level, conversion_gain=conversion_gain)],
     compressor={'id': 'blosc', 'cname': 'zstd', 'clevel': 5},
     zarr_format=3
 )
@@ -102,7 +109,7 @@ arr = zarr.create(
 ## Key Parameters
 
 - **`zero_level`**: The signal value when no photons are detected. This is the baseline offset in your camera sensor.
-- **`conversion_gain`** (also called `photon_sensitivity`): How many signal units correspond to one photon. For example, if your camera reports 2.5 ADU per photon, use `conversion_gain=2.5`.
+- **`conversion_gain`** (also called `photon_sensitivity`): How many signal units correspond to one photon. For example, if your camera reports 2.5 levels increase in signal per photon, use `conversion_gain=2.5`.
 - **`encoded_dtype`**: The data type for encoded values (default: `uint8`). Use `uint8` for maximum compression.
 - **`decoded_dtype`**: The data type for decoded values (default: inferred from data).
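Together, the two key parameters define the linear camera model the codec assumes. Inverting that model is a one-liner (illustrative parameter values):

```python
import numpy as np

# Linear camera model assumed by the codec (illustrative parameter values):
#     signal = zero_level + conversion_gain * photons
zero_level = 100.0
conversion_gain = 2.5  # signal levels per photon

signal = np.array([100.0, 102.5, 125.0, 350.0])

# Inverting the model recovers approximate photon counts from raw signal
photons = (signal - zero_level) / conversion_gain
print(photons.tolist())  # → [0.0, 1.0, 10.0, 100.0]
```

Getting `zero_level` wrong shifts every recovered photon count by a constant, while an incorrect `conversion_gain` scales them, so both directly affect how well the transform stabilizes the noise.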

docs/user-guide/overview.md

Lines changed: 1 addition & 1 deletion
@@ -74,7 +74,7 @@ These tables are computed once during codec initialization and reused for all en
 ### Compression Ratios
 
 Typical compression ratios (Anscombe + Blosc/Zstd):
-- **4-8x** for typical multiphoton microscopy data
+- **3-8x** for typical multiphoton microscopy data
 - **6-10x** for astronomy data
 - **3-6x** for radiography data
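Actual ratios depend on photon rates and compressor settings. A rough way to see the effect on synthetic data (using zlib and a plain Anscombe re-quantization as stand-ins for the codec's lookup tables plus Blosc/Zstd):

```python
import zlib
import numpy as np

rng = np.random.default_rng(0)

# Photon-limited frame under a linear camera model (illustrative values)
gain, zero = 2.5, 100.0
counts = rng.poisson(lam=rng.exponential(5.0, size=(256, 256)))
raw = (counts * gain + zero).astype('int16')

# Anscombe re-quantization: many int16 levels collapse onto a few uint8 codes
photons = (raw.astype('float64') - zero) / gain
encoded = np.round(2.0 * np.sqrt(photons + 3.0 / 8.0)).astype('uint8')

raw_c = len(zlib.compress(raw.tobytes(), 6))
enc_c = len(zlib.compress(encoded.tobytes(), 6))
print(f"raw int16:      {raw.nbytes} -> {raw_c} bytes compressed")
print(f"anscombe uint8: {raw.nbytes} -> {enc_c} bytes compressed "
      f"({raw.nbytes / enc_c:.1f}x overall)")
```

The halved element size and the reduced number of distinct codes both contribute to the gain; a real deployment would use the codec as a Zarr filter with Blosc/Zstd behind it.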

examples/workbook.ipynb

Lines changed: 2 additions & 2 deletions
@@ -392,7 +392,7 @@
   ],
   "metadata": {
     "kernelspec": {
-      "display_name": "microns_trippy",
+      "display_name": "Python 3",
       "language": "python",
       "name": "python3"
     },
@@ -406,7 +406,7 @@
       "name": "python",
       "nbconvert_exporter": "python",
       "pygments_lexer": "ipython3",
-      "version": "3.10.16"
+      "version": "3.11.11"
     }
   },
   "nbformat": 4,

pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -7,8 +7,8 @@ name = "anscombe_transform"
 version = "1.0.0"
 
 authors = [
-    { name = "Jerome Lecoq", email = "[email protected]" },
     { name = "Dimitri Yatsenko", email = "[email protected]" },
+    { name = "Jerome Lecoq", email = "[email protected]" },
     { name = "Davis Bennett", email = "[email protected]" },
 ]
 maintainers = [

tests/test_zarr.py

Lines changed: 29 additions & 9 deletions
@@ -10,35 +10,56 @@
 def test_zarr_v2_roundtrip() -> None:
     decoded_dtype = "int16"
     encoded_dtype = "uint8"
-    data = np.random.poisson(100, size=(20, 20)).astype(decoded_dtype)
+
+    # generate fake data
+    np.random.seed(42)
+    size = (20, 20)
     sensitivity = 100.0
+    zero_level = -5.0
+    true_rate = np.random.exponential(scale=5, size=size)
+    data = (
+        zero_level
+        + sensitivity * (np.random.poisson(true_rate) + np.random.randn(*size) * 0.25)
+    ).astype(decoded_dtype)
+
+    # construct codec
     codec = AnscombeTransformV2(
         conversion_gain=sensitivity,
-        zero_level=0,
+        zero_level=zero_level,
         encoded_dtype=encoded_dtype,
         decoded_dtype=decoded_dtype,
     )
     data_encoded = codec.encode(data)
     data_rt = codec.decode(data_encoded).reshape(data.shape)
 
-    store = {}
-
     # write data
+    store = {}
     _ = create_array(store=store, data=data, zarr_format=2, compressors=codec)
+
     # read data
     z_arr_r = open_array(store=store)
     assert z_arr_r.dtype == decoded_dtype
-    assert nearly_equal(z_arr_r, data_rt, sensitivity / 2)
+    assert nearly_equal(z_arr_r, data_rt, sensitivity * 0.5)
 
 
 def test_zarr_v3_roundtrip() -> None:
     decoded_dtype = "int16"
     encoded_dtype = "uint8"
-    data = np.random.poisson(100, size=(20, 20)).astype(decoded_dtype)
+
+    # generate fake data
+    np.random.seed(42)
+    size = (20, 20)
     sensitivity = 100.0
+    zero_level = -5.0
+    true_rate = np.random.exponential(scale=5, size=size)
+    data = (
+        zero_level
+        + sensitivity * (np.random.poisson(true_rate) + np.random.randn(*size) * 0.25)
+    ).astype(decoded_dtype)
+
     codec = AnscombeTransformV3(
         conversion_gain=sensitivity,
-        zero_level=0,
+        zero_level=zero_level,
        encoded_dtype=encoded_dtype,
         decoded_dtype=decoded_dtype,
     )
@@ -47,9 +68,8 @@ def test_zarr_v3_roundtrip() -> None:
     data_encoded = codec._encode(data)
     data_rt = codec._decode(data_encoded).reshape(data.shape)
 
-    store = {}
-
     # write data
+    store = {}
     _ = create_array(store=store, data=data, zarr_format=3, filters=[codec])
     # read data
     z_arr_r = open_array(store=store)