Fix per-atlas overwrite in single_session_metrics#344
Merged
Conversation
`compute_timeseries` derives its output filename from the BOLD stem, so every atlas iteration in `single_session_metrics` was writing to the same file. Every `MetricsOutputs.timeseries[label]` then pointed at the last-processed atlas's data, silently corrupting the regular RBC pipeline's atlas outputs (timeseries + Pearson correlations). Give each atlas its own `out_dir`. Also switch `compute_timeseries` from `get_fdata().astype(int)` to `np.asarray(atlas_img.dataobj).astype(int)` so integer atlas labels survive verbatim. `get_fdata` would apply `scl_slope`/`scl_inter` and scale small labels into garbage floats if an atlas mistakenly ships with non-trivial scaling. Regression test in `tests/unit/workflows/test_metrics.py` builds two atlases with different ROI counts (3 and 5) and asserts each is preserved in `MetricsOutputs`, with distinct file paths.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
compute_timeseriesderives its output filename from the BOLD stem, so the atlas loop insingle_session_metricswas writing every atlas to the same file. The dict onMetricsOutputs.timeseriesended up with every label pointing at the last-processed atlas's data - meaning the orchestration pipeline's per-atlas timeseries and Pearson correlations all got the data of whichever atlas ran last, regardless of label.Surfaced while validating the TR-fix script (#343) on real release data: every atlas timeseries came out shaped
(T, 17)even though only Yeo17/Yeo17liberal have 17 ROIs. The script's regen is wrong as a result, but the underlying bug is in the library - it affects RBC's normal orchestration too.Fix: give each atlas its own
out_dirunder the metrics work dir.While in there, also switch
compute_timeseriesfromget_fdata().astype(int)tonp.asarray(atlas_img.dataobj).astype(int)for the atlas labels.get_fdataappliesscl_slope/scl_interand returns float64; for an integer atlas mistakenly shipped with non-trivial scaling, that would scale small labels into garbage floats.dataobjreads raw on-disk values, which is what we want for label data.Regression test in
tests/unit/workflows/test_metrics.pybuilds two atlases with 3 and 5 ROIs respectively, runssingle_session_metrics, and asserts each atlas's timeseries lands in its own file with its own ROI count.