Fix per-atlas overwrite in single_session_metrics #344

Merged
nx10 merged 1 commit into main from lib/single-session-metrics-per-atlas-outdir
May 23, 2026
Conversation

@nx10
Contributor

@nx10 nx10 commented May 23, 2026

compute_timeseries derives its output filename from the BOLD stem, so the atlas loop in single_session_metrics was writing every atlas to the same file. The dict on MetricsOutputs.timeseries ended up with every label pointing at the last-processed atlas's data - meaning the orchestration pipeline's per-atlas timeseries and Pearson correlations all got the data of whichever atlas ran last, regardless of label.
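The overwrite can be reproduced without the real library. The following is a minimal sketch, not the RBC code: `compute_timeseries_stub`, the atlas labels, and the `_timeseries.tsv` suffix are hypothetical stand-ins that only mimic the one property that matters here, namely that the output filename depends on the BOLD stem and nothing else.

```python
import tempfile
from pathlib import Path


def compute_timeseries_stub(bold_path: Path, n_rois: int, out_dir: Path) -> Path:
    """Hypothetical stand-in: the filename is derived from the BOLD stem only."""
    out_dir.mkdir(parents=True, exist_ok=True)
    stem = bold_path.name.split(".")[0]
    out_file = out_dir / f"{stem}_timeseries.tsv"
    # One header column per ROI, so the column count identifies the atlas.
    out_file.write_text("\t".join(f"roi{i}" for i in range(n_rois)) + "\n")
    return out_file


def column_count(path: Path) -> int:
    return len(path.read_text().splitlines()[0].split("\t"))


atlases = {"atlas_a": 3, "atlas_b": 5}  # label -> ROI count (made-up values)
bold = Path("sub-01_task-rest_bold.nii.gz")

with tempfile.TemporaryDirectory() as tmp:
    work_dir = Path(tmp)
    # Buggy pattern: every atlas iteration shares one out_dir.
    shared = {
        label: compute_timeseries_stub(bold, n_rois, work_dir)
        for label, n_rois in atlases.items()
    }
    shared_counts = {label: column_count(path) for label, path in shared.items()}

print(shared["atlas_a"] == shared["atlas_b"])  # True: both labels map to one file
print(shared_counts)  # {'atlas_a': 5, 'atlas_b': 5}: the last atlas wins
```

Both labels resolve to the same path, and the file holds only what the last iteration wrote, which matches the symptom of every atlas reporting the same ROI count.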

Surfaced while validating the TR-fix script (#343) on real release data: every atlas timeseries came out shaped (T, 17) even though only Yeo17/Yeo17liberal have 17 ROIs. The script's regen is wrong as a result, but the underlying bug is in the library - it affects RBC's normal orchestration too.

Fix: give each atlas its own out_dir under the metrics work dir.
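As a rough sketch of that idea, assuming the per-atlas directory is simply named after the atlas label (the actual layout in the PR may differ), and again using a hypothetical `compute_timeseries_stub` in place of the real function:

```python
import tempfile
from pathlib import Path


def compute_timeseries_stub(bold_path: Path, n_rois: int, out_dir: Path) -> Path:
    """Hypothetical stand-in: the filename is derived from the BOLD stem only."""
    out_dir.mkdir(parents=True, exist_ok=True)
    stem = bold_path.name.split(".")[0]
    out_file = out_dir / f"{stem}_timeseries.tsv"
    out_file.write_text("\t".join(f"roi{i}" for i in range(n_rois)) + "\n")
    return out_file


atlases = {"atlas_a": 3, "atlas_b": 5}  # label -> ROI count (made-up values)
bold = Path("sub-01_task-rest_bold.nii.gz")

with tempfile.TemporaryDirectory() as tmp:
    work_dir = Path(tmp)
    per_atlas = {}
    per_atlas_counts = {}
    for label, n_rois in atlases.items():
        # Assumed layout: one subdirectory per atlas label under the work dir.
        path = compute_timeseries_stub(bold, n_rois, work_dir / label)
        per_atlas[label] = path
        per_atlas_counts[label] = len(path.read_text().splitlines()[0].split("\t"))

print(per_atlas["atlas_a"] != per_atlas["atlas_b"])  # True: distinct files
print(per_atlas_counts)  # {'atlas_a': 3, 'atlas_b': 5}
```

The filename can stay stem-based because the parent directory now disambiguates the atlas, so the writer function itself needs no signature change.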

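While in there, also switch compute_timeseries from get_fdata().astype(int) to np.asarray(atlas_img.dataobj).astype(int) for the atlas labels. get_fdata always returns float64 and applies scl_slope/scl_inter; for an integer atlas mistakenly shipped with non-trivial scaling, that would scale small labels into garbage floats. Reading through dataobj keeps the on-disk integer dtype when no scaling is set, which is what we want for label data. Note that nibabel's array proxy still applies scl_slope/scl_inter when they are set; dataobj.get_unscaled() is the call that bypasses them entirely.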
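To see why scaling is harmful for label data, here is a library-free sketch of the NIfTI scaling arithmetic (`stored * scl_slope + scl_inter`) followed by an integer cast. The label values and the slope of 0.5 are made up for illustration; no nibabel call is exercised here.

```python
def apply_scaling(stored, slope, inter):
    """Mimic the NIfTI scaling step: value = stored * scl_slope + scl_inter."""
    return [value * slope + inter for value in stored]


stored_labels = [0, 1, 2, 3, 17]  # integer ROI labels as stored on disk
slope, inter = 0.5, 0.0           # hypothetical scaling shipped by mistake

# Casting scaled floats to int truncates toward zero, as astype(int) does.
via_scaling = [int(value) for value in apply_scaling(stored_labels, slope, inter)]
via_raw = [int(value) for value in stored_labels]

print(via_scaling)  # [0, 0, 1, 1, 8]: labels 1 and 3 collide with others
print(via_raw)      # [0, 1, 2, 3, 17]: labels preserved
```

With scaling applied, distinct ROIs merge (1 becomes background, 3 becomes 1) and label 17 turns into 8, so any per-ROI averaging would silently mix regions.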

Regression test in tests/unit/workflows/test_metrics.py builds two atlases with 3 and 5 ROIs respectively, runs single_session_metrics, and asserts each atlas's timeseries lands in its own file with its own ROI count.

`compute_timeseries` derives its output filename from the BOLD stem, so
every atlas iteration in `single_session_metrics` was writing to the same
file. Every `MetricsOutputs.timeseries[label]` then pointed at the
last-processed atlas's data, silently corrupting the regular RBC pipeline's
atlas outputs (timeseries + Pearson correlations). Give each atlas its own
`out_dir`.

Also switch `compute_timeseries` from `get_fdata().astype(int)` to
`np.asarray(atlas_img.dataobj).astype(int)` so integer atlas labels survive
verbatim. `get_fdata` would apply `scl_slope`/`scl_inter` and scale small
labels into garbage floats if an atlas mistakenly ships with non-trivial
scaling.

Regression test in `tests/unit/workflows/test_metrics.py` builds two
atlases with different ROI counts (3 and 5) and asserts each is preserved
in `MetricsOutputs`, with distinct file paths.
@github-actions

Coverage

| Tests | Skipped | Failures | Errors | Time |
| --- | --- | --- | --- | --- |
| 792 | 0 💤 | 0 ❌ | 0 🔥 | 10.723s ⏱️ |

@nx10 nx10 merged commit 7cfa758 into main May 23, 2026
10 checks passed
1 participant