
Commit 3c5e7c5

Docs: Add patch filtering tutorial (#953)
## Description

> [!NOTE]
> **tldr**: Add a tutorial on how to use patch filtering in CAREamics.

Describes how to look at the patch filter maps to choose an appropriate threshold and how to create the configuration for training. Uses snippets, but the matplotlib figures are saved as pngs added to the docs repo in CAREamics/careamics.github.io#65.

---

**Please ensure your PR meets the following requirements:**

- [x] Code builds and passes tests locally, including doctests
- [ ] New tests have been added (for bug fixes/features)
- [x] Documentation has been updated
- [x] Pre-commit passes

---------

Co-authored-by: Joran Deschamps <6367888+jdeschamps@users.noreply.github.com>
1 parent e6e08cc commit 3c5e7c5

5 files changed

Lines changed: 205 additions & 5 deletions


docs/current/careamist_training.md

Lines changed: 6 additions & 3 deletions
@@ -130,9 +130,12 @@ defined and instantiated, they can be passed to the `CAREamist` to train on cust
130130

131131
### Masking
132132

133-
CAREamics supports providing a mask of the training data to define from which region
134-
should the training patches be sampled. This can be useful to exclude certain regions
135-
from training, for example areas with no signal or with zero values.
133+
Masking can be used to exclude certain regions from training, for example areas with no signal or with zero values.
134+
135+
CAREamics supports two methods of masking data during training:
136+
137+
- providing a mask of the training data that defines the regions from which training patches are sampled, or
138+
- using one of the built-in filtering functions; see the full [tutorial](../tutorials/patch_filtering.md#filtering-functions).
136139

137140
=== "Noise2Void"
138141

docs/nav.toml

Lines changed: 3 additions & 2 deletions
@@ -13,8 +13,9 @@ nav = [
1313
]},
1414
{"Tutorials" = [
1515
"content/guides/tutorials/index.md",
16-
{"Custom format" = "content/guides/tutorials/custom_data.md"},
17-
{"Implementing an Image Stack" = "content/guides/tutorials/implementing_an_image_stack.md"}
16+
{"Custom Data Formats" = "content/guides/tutorials/custom_data.md"},
17+
{"Implementing an Image Stack" = "content/guides/tutorials/implementing_an_image_stack.md"},
18+
{"Patch Filtering" = "content/guides/tutorials/patch_filtering.md"}
1819
]},
1920
{"Legacy (v0.1)" = [
2021
"content/guides/v0.1/index.md",

docs/tutorials/checking_patches_and_bg.md

Whitespace-only changes.

docs/tutorials/patch_filtering.md

Lines changed: 126 additions & 0 deletions
@@ -0,0 +1,126 @@
1+
# Patch Filtering
2+
3+
Patch filtering is useful if your data contains large areas with no signal. These areas can be filtered out of the training process, which can speed up model convergence.
4+
5+
## How does it work?
6+
7+
Before training starts, CAREamics performs a first pass through all the data to determine which regions are background and which contain signal. Background regions are not completely excluded from training; instead, their probability of being selected during an epoch is reduced.
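
The idea of reducing the selection probability of background regions, rather than removing them, can be illustrated with a small NumPy sketch. This is not the CAREamics implementation: the labels in `is_background` and the value of `background_weight` are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-patch labels: True where a patch was flagged as background.
is_background = np.array([False, False, True, True, True, False])

# Assumed value: relative chance of drawing a background patch vs. a signal patch.
background_weight = 0.1

# Signal patches get weight 1, background patches get the reduced weight.
weights = np.where(is_background, background_weight, 1.0)
probabilities = weights / weights.sum()

# Draw patch indices for one epoch. Background patches can still appear,
# but less often than signal patches.
epoch_indices = rng.choice(len(weights), size=1000, p=probabilities)
background_fraction = is_background[epoch_indices].mean()
print(f"Fraction of background patches drawn: {background_fraction:.2f}")
```

With these illustrative weights, half of the patches are background but they make up only a small share of the draws.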
8+
9+
There are two options for the filtering function, either:
10+
11+
- [pre-computed masks](#pre-computed-masks) can be provided, or
12+
- one of the [built-in filtering functions](#filtering-functions) can be selected and parametrized.
13+
14+
## Pre-computed masks
15+
16+
Using pre-computed masks is relatively simple: masks in the same format as the data, either as arrays or saved in files, can be provided during training.
17+
18+
=== "Noise2Void"
19+
20+
```python title="Specifying a mask for Noise2Void training"
21+
--8<-- "current/careamist_training.py:train_n2v_mask"
22+
```
23+
24+
1. The mask is passed alongside the data.
25+
26+
=== "CARE/N2N"
27+
28+
```python title="Specifying a mask for CARE training"
29+
--8<-- "current/careamist_training.py:train_care_mask"
30+
```
31+
32+
1. The mask is passed alongside the data.
33+
34+
!!! note "What is masked?"
35+
36+
The mask is a set of binary images with the same size as the training data and
37+
should have value `1` for pixels that should be included in the training and `0`
38+
for pixels that should be excluded.
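
As a rough illustration of the expected mask layout, the sketch below builds a binary mask for a synthetic array by thresholding. The data and the rule "value above zero means signal" are assumptions chosen for this example; a real mask would come from your own annotation or segmentation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for training data with axes SYX (2 samples of 128 x 128).
train_data = rng.normal(loc=100.0, scale=5.0, size=(2, 128, 128))
train_data[:, :, 64:] = 0.0  # pretend the right half contains no signal

# Assumed rule for this example only: pixels with a value above zero are signal.
# 1 marks pixels to include in training, 0 marks pixels to exclude.
mask = (train_data > 0).astype(np.uint8)

print(mask.shape, np.unique(mask))
```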
39+
40+
## Filtering functions
41+
42+
CAREamics has three built-in filtering functions, which filter out patches by applying thresholds to different metrics:
43+
44+
- [`MaxPatchFilter`][careamics.dataset.patch_filter.MaxPatchFilter]: filters the data based on the maximum value of each region.
45+
- [`MeanStdPatchFilter`][careamics.dataset.patch_filter.MeanStdPatchFilter]: filters the data based on the mean, and optionally the standard deviation, of each region.
46+
- [`ShannonPatchFilter`][careamics.dataset.patch_filter.ShannonPatchFilter]: filters the data based on the Shannon entropy of each region.
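
To make the three metrics concrete, here is a minimal NumPy sketch that computes a per-region maximum, mean, standard deviation and Shannon entropy on non-overlapping tiles. It only illustrates the quantities involved; the helpers `tile` and `shannon_entropy` are hypothetical, and the CAREamics classes may define regions, histogram bins and normalization differently.

```python
import numpy as np


def tile(image: np.ndarray, patch: int) -> np.ndarray:
    """Split a 2D image into non-overlapping (patch x patch) tiles."""
    rows, cols = image.shape[0] // patch, image.shape[1] // patch
    cropped = image[: rows * patch, : cols * patch]
    # Resulting shape: (rows, cols, patch, patch)
    return cropped.reshape(rows, patch, cols, patch).swapaxes(1, 2)


def shannon_entropy(values: np.ndarray, bins: int = 256) -> float:
    """Shannon entropy (in bits) of the intensity histogram of `values`."""
    counts, _ = np.histogram(values, bins=bins)
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log2(p)).sum())


rng = np.random.default_rng(0)
image = np.zeros((128, 128))
image[:64, :64] = rng.integers(0, 255, size=(64, 64))  # one "signal" tile

tiles = tile(image, 64)
max_map = tiles.max(axis=(2, 3))
mean_map = tiles.mean(axis=(2, 3))
std_map = tiles.std(axis=(2, 3))
entropy_map = np.array([[shannon_entropy(t) for t in row] for row in tiles])
```

In this toy image the textured tile scores high on every metric, while the constant tiles have zero maximum, zero standard deviation and zero entropy, which is why thresholding any of these maps separates them.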
47+
48+
!!! note "Multi-channel data"
49+
50+
For multi-channel data the filtering function is only applied to a single channel of your choosing.
51+
52+
### Finding appropriate thresholds
53+
54+
Finding appropriate thresholds requires manually inspecting some examples. The patch filter classes provide the `filter_map` and `plot_filter_map` methods, which can be used to visualize the threshold at which a region will be considered background.
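
Alongside visual inspection, it can help to see how the share of retained regions changes with the threshold. The sketch below does this for a made-up map of metric values. Both the `fraction_kept` helper and the rule that a region counts as signal when its value is at or above the threshold are assumptions of this example, not documented CAREamics behaviour.

```python
import numpy as np


def fraction_kept(filter_map: np.ndarray, threshold: float) -> float:
    """Fraction of regions whose metric is at or above `threshold`."""
    return float((filter_map >= threshold).mean())


rng = np.random.default_rng(0)

# Hypothetical filter map: one metric value per region (here a 16 x 16 grid).
filter_map = rng.uniform(0.0, 10.0, size=(16, 16))

for threshold in (2.5, 5.0, 7.5):
    kept = fraction_kept(filter_map, threshold)
    print(f"threshold={threshold}: {kept:.1%} of regions kept")
```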
55+
56+
---
57+
58+
For demonstration purposes we will use the Hagen dataset, which is also used in other CAREamics examples. Note, however, that it does not contain enough background area to actually require patch filtering.
59+
60+
```python title="Download the data"
61+
--8<-- "tutorials/patch_filtering.py:download-data"
62+
```
63+
64+
---
65+
66+
Now we inspect the filter maps to decide on a patch filtering function and threshold. For data with multiple samples, it is generally a good idea to inspect the filter maps of a few different samples, and for 3D data one should look at multiple z-slices.
67+
68+
```python title="Plot Filter Maps"
69+
--8<-- "tutorials/patch_filtering.py:filter-maps"
70+
```
71+
72+
!!! info "3D data"
73+
74+
For 3D data `plot_filter_map` has the `z_idx` argument to control which z-slice is displayed.
75+
76+
![Max filter map](../../../images/tutorials/patch_filtering/max_filter_map.png)
77+
78+
![Shannon filter map](../../../images/tutorials/patch_filtering/shannon_filter_map.png)
79+
80+
![Mean-Std filter map](../../../images/tutorials/patch_filtering/mean_std_filter_map.png)
81+
82+
---
83+
84+
We will choose the Shannon patch filter with a threshold of 7.5. To confirm that this is a good choice, we look at the resulting mask using the `ShannonPatchFilter.apply_filter` method.
85+
86+
```python title="Plot Filter Mask"
87+
--8<-- "tutorials/patch_filtering.py:mask"
88+
```
89+
90+
![Filter mask](../../../images/tutorials/patch_filtering/filter_mask.png)
91+
92+
### Training
93+
94+
Next, we have to build the configuration.
95+
96+
Each of the patch filter classes has a corresponding configuration class, where the threshold parameters can be set:
97+
98+
- [`MaxPatchFilterConfig`][careamics.config.MaxPatchFilterConfig]
99+
- [`MeanStdPatchFilterConfig`][careamics.config.MeanStdPatchFilterConfig]
100+
- [`ShannonPatchFilterConfig`][careamics.config.ShannonPatchFilterConfig]
101+
102+
We will create the configuration using `create_advanced_n2v_config`, passing a `ShannonPatchFilterConfig` with our selected threshold to the `patch_filter_config` argument.
103+
104+
105+
```python title="Create Config and Train"
106+
--8<-- "tutorials/patch_filtering.py:config"
107+
```
108+
109+
1. Using Shannon filtering with a threshold of 7.5.
110+
111+
!!! info "Multi-channel data"
112+
113+
For multi-channel data, set the `ref_channel` parameter in the patch filter configs to the index of your desired channel.
114+
115+
!!! info "Other algorithms"
116+
117+
The configuration factory functions for other algorithms, such as CARE and N2N, also have a `patch_filter_config` argument.
118+
119+
!!! success "Success"
120+
121+
If patch filtering was correctly applied during training, you should see a log similar to:
122+
123+
```
124+
Filtering background patches with filtering function shannon: 100%|██████████| 79/79 [00:06<00:00, 12.79it/s]
125+
Found 6345 background regions. Number of patches has been reduced to 14553 from 20224.
126+
```
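
The totals in this log can be checked against the data shape noted in the tutorial script (79 images of 1024 x 1024) and the 64 x 64 patch size. The short calculation below assumes non-overlapping patches on a regular grid, which is an assumption of this sketch but does reproduce the 20224 patches reported.

```python
n_samples, height, width = 79, 1024, 1024  # data shape from the tutorial script
patch_h, patch_w = 64, 64                  # patch_size used in the config

# Assuming non-overlapping patches on a regular grid.
patches_per_image = (height // patch_h) * (width // patch_w)
total_patches = n_samples * patches_per_image

remaining = 14553  # value reported in the log
removed = total_patches - remaining
print(total_patches, removed, f"{removed / total_patches:.1%}")
```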

docs/tutorials/patch_filtering.py

Lines changed: 70 additions & 0 deletions
@@ -0,0 +1,70 @@
1+
# --8<-- [start:download-data]
2+
from pathlib import Path
3+
4+
import matplotlib.pyplot as plt
5+
import pooch
6+
import tifffile
7+
8+
from careamics.dataset.patch_filter import (
9+
MaxPatchFilter,
10+
MeanStdPatchFilter,
11+
ShannonPatchFilter,
12+
)
13+
14+
# --- download the data
15+
# folder in which to save all the data
16+
root = Path("hagen")
17+
18+
# download the data using pooch
19+
data_root = root / "data"
20+
dataset_url = "https://zenodo.org/records/10925855/files/noisy.tiff?download=1"
21+
22+
file = pooch.retrieve(
23+
url=dataset_url,
24+
known_hash="ff12ee5566f443d58976757c037ecef8bf53a00794fa822fe7bcd0dd776a9c0f",
25+
path=data_root,
26+
)
27+
28+
# Shape: (79, 1024, 1024), axes: SYX
29+
img = tifffile.imread(file)
30+
# --8<-- [end:download-data]
31+
32+
33+
# --8<-- [start:filter-maps]
34+
sample_idx = 4
35+
36+
max_filter_map = MaxPatchFilter.filter_map(img[sample_idx], (64, 64))
37+
MaxPatchFilter.plot_filter_map(img[sample_idx], max_filter_map)
38+
39+
shannon_filter_map = ShannonPatchFilter.filter_map(img[sample_idx], (64, 64))
40+
ShannonPatchFilter.plot_filter_map(img[sample_idx], shannon_filter_map)
41+
42+
meanstd_filter_map = MeanStdPatchFilter.filter_map(img[sample_idx], (64, 64))
43+
MeanStdPatchFilter.plot_filter_map(img[sample_idx], meanstd_filter_map)
44+
# --8<-- [end:filter-maps]
45+
46+
47+
# --8<-- [start:mask]
48+
plt.figure(constrained_layout=True)
49+
plt.imshow(ShannonPatchFilter.apply_filter(shannon_filter_map, threshold=7.5))
50+
plt.title("Filter mask")
51+
# --8<-- [end:mask]
52+
53+
54+
# --8<-- [start:config]
55+
from careamics import CAREamist
56+
from careamics.config import ShannonPatchFilterConfig, create_advanced_n2v_config
57+
58+
config = create_advanced_n2v_config(
59+
"hagen-shannon-filtering",
60+
data_type="array",
61+
axes="SYX",
62+
patch_size=(64, 64),
63+
batch_size=64,
64+
num_epochs=10,
65+
patch_filter_config=ShannonPatchFilterConfig(threshold=7.5), # (1)!
66+
)
67+
68+
careamist = CAREamist(config=config)
69+
careamist.train(train_data=img)
70+
# --8<-- [end:config]
