Log raw input image metadata at workflow start by nx10 · Pull Request #334 · childmindresearch/rbc

nx10 · 2026-05-11T23:26:09Z

We weren't logging much about the raw images going into the pipeline (just the file path for anat; TR + SliceTiming for func). This adds a header-only log_image_summary() in rbc.core.nifti and calls it from the anatomical and functional process_session orchestrators, so each run records exactly what entered: array shape, on-disk dtype, data size, voxel size, orientation, sform/qform spaces, and for 4D images volume count, slice axis/count/order, and header TR. No data is loaded, just the NIfTI header. It's best-effort: an unreadable header logs a warning rather than aborting the run.

Sample output on tests/data/ds000001:

Anatomical T1w: tests/data/ds000001/sub-01/anat/sub-01_T1w.nii.gz
Anatomical T1w: shape=(160, 192, 192), dtype=int16, size=11.2 MiB, voxel size=1 x 1.33 x 1.33 mm
Anatomical T1w: orientation=RAS, sform=SCANNER, qform=SCANNER
Functional BOLD: tests/data/ds000001/sub-01/func/sub-01_task-balloonanalogrisktask_run-01_bold.nii.gz
Functional BOLD: shape=(64, 64, 33, 300), dtype=int16, size=77.3 MiB, voxel size=3.12 x 3.12 x 4 mm
Functional BOLD: orientation=LAS, sform=SCANNER, qform=SCANNER
Functional BOLD: volumes=300, slices=33 along axis 2 (assumed; no dim_info), slice order=unknown, header TR=2 s

(size is the array footprint shape x dtype itemsize; the on-disk .nii.gz is much smaller. slice order=unknown is normal for BIDS data, where SliceTiming lives in the JSON sidecar - which FunctionalMetadata.load already logs separately.)
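The footprint arithmetic behind the `size=` field can be checked by hand. Below is a minimal sketch, not the PR's code: `array_footprint` and `format_binary_size` are hypothetical names, and the one-decimal formatting is an assumption chosen to match the sample output above.

```python
from math import prod


def array_footprint(shape, itemsize):
    """Bytes needed to hold the array at its on-disk dtype: prod(shape) * itemsize."""
    return prod(shape) * itemsize


def format_binary_size(n_bytes):
    """Format a byte count with binary units (B/KiB/MiB/GiB).

    One decimal place is an assumption made to match the sample log lines.
    """
    size = float(n_bytes)
    for unit in ("B", "KiB", "MiB", "GiB"):
        # Stop at the first unit where the value fits, or at the largest unit.
        if size < 1024 or unit == "GiB":
            return f"{int(size)} B" if unit == "B" else f"{size:.1f} {unit}"
        size /= 1024


# int16 has an itemsize of 2 bytes.
print(format_binary_size(array_footprint((160, 192, 192), 2)))     # T1w sample
print(format_binary_size(array_footprint((64, 64, 33, 300), 2)))   # BOLD sample
```

For the T1w sample this is 160 x 192 x 192 x 2 = 11,796,480 bytes, i.e. 11.25 MiB, which is consistent with the logged `11.2 MiB`.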

Functional runs still get the existing TR-source / SliceTiming logging from FunctionalMetadata.load. The `all` pipeline picks this up for free since it reuses these process_session functions. Longitudinal / metrics / QC are left alone since they consume derivatives, not raw inputs.

Add `log_image_summary()` to `rbc.core.nifti`: a header-only (no voxel data loaded) helper that logs an INFO summary of a raw NIfTI input - array shape, on-disk dtype, voxel size, axis orientation, sform/qform coordinate spaces, and for 4D images the volume count, slice count, and header TR. Call it from the anatomical and functional `process_session` orchestrators in place of the bare "Anatomical: <path>" / "Functional: <path>" lines, so each run records exactly what entered the pipeline. Functional runs still get TR-source and SliceTiming logging from `FunctionalMetadata.load`.
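To illustrate what "header-only" means, here is a stdlib-only sketch that reads the fixed 348-byte NIfTI-1 header and never touches voxel data. It is not the implementation in `rbc.core.nifti`, which is not shown in this PR text; `parse_nifti1_header`, `read_nifti1_header`, and the uppercase xform labels are hypothetical, and NIfTI-2 files are deliberately rejected to keep the sketch short. The field offsets follow the NIfTI-1 header layout.

```python
import gzip
import struct

# NIfTI-1 datatype codes mapped to NumPy-style names (subset only).
_DTYPES = {2: "uint8", 4: "int16", 8: "int32", 16: "float32",
           64: "float64", 256: "int8", 512: "uint16", 768: "uint32"}
# qform/sform codes; the uppercase labels are an assumption for illustration.
_XFORMS = {0: "UNKNOWN", 1: "SCANNER", 2: "ALIGNED", 3: "TALAIRACH", 4: "MNI"}


def parse_nifti1_header(raw):
    """Decode selected fields from the first 348 bytes of a NIfTI-1 file."""
    if len(raw) < 348:
        raise ValueError("truncated NIfTI-1 header")
    # sizeof_hdr (offset 0) must equal 348; this also reveals the byte order.
    for endian in ("<", ">"):
        if struct.unpack_from(endian + "i", raw, 0)[0] == 348:
            break
    else:
        raise ValueError("sizeof_hdr is not 348; not a NIfTI-1 header")
    dim = struct.unpack_from(endian + "8h", raw, 40)       # dim[0] = ndim
    ndim = dim[0]
    if not 1 <= ndim <= 7:
        raise ValueError(f"invalid dim[0]={ndim}")
    (datatype,) = struct.unpack_from(endian + "h", raw, 70)
    pixdim = struct.unpack_from(endian + "8f", raw, 76)    # pixdim[4] = TR
    qform_code, sform_code = struct.unpack_from(endian + "2h", raw, 252)
    return {
        "shape": tuple(dim[1:1 + ndim]),
        "dtype": _DTYPES.get(datatype, f"code {datatype}"),
        "voxel_size": pixdim[1:4],
        "tr": pixdim[4] if ndim >= 4 else None,
        "qform": _XFORMS.get(qform_code, "UNKNOWN"),
        "sform": _XFORMS.get(sform_code, "UNKNOWN"),
    }


def read_nifti1_header(path):
    """Read only the header bytes; gzip streams, so the image is not inflated."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rb") as f:
        return parse_nifti1_header(f.read(348))
```

Because `gzip.open` decompresses lazily, requesting 348 bytes from a `.nii.gz` inflates only the start of the stream, which is why a header summary stays cheap even for large 4D runs.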

github-actions · 2026-05-11T23:26:59Z

| Tests | Skipped | Failures | Errors | Time |
| --- | --- | --- | --- | --- |
| 791 | 0 💤 | 0 ❌ | 0 🔥 | 11.334s ⏱️ |

- Best-effort: a header that can't be read logs a warning instead of aborting the run; the real failure surfaces later when processing touches the file.
- 5D+ images report the trailing dims as `extra dims=...` instead of mislabeling them.
- Voxel size notes `(units unknown)` when xyzt_units is unset.
- Rename `uncompressed size` -> `size` (the loaded array is upcast to float64 anyway, so the longer name overpromised).
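The first three behaviours above could look roughly like the following. This is a sketch under stated assumptions rather than the PR's code: `log_best_effort`, `format_voxel_size`, and `format_extra_dims` are hypothetical helpers, the logger name is made up, and the `.3g` number formatting is a guess that happens to reproduce the sample output. The spatial unit codes come from the low three bits of the NIfTI-1 `xyzt_units` field.

```python
import logging

logger = logging.getLogger("input_summary_example")  # hypothetical logger name

# NIfTI-1 spatial unit codes (low 3 bits of xyzt_units); 0 means unset.
_SPATIAL_UNITS = {1: "m", 2: "mm", 3: "um"}


def format_voxel_size(zooms, xyzt_units):
    """Render voxel sizes, flagging the case where the header sets no units."""
    sizes = " x ".join(f"{z:.3g}" for z in zooms)
    unit = _SPATIAL_UNITS.get(xyzt_units & 0x07)
    return f"{sizes} {unit}" if unit else f"{sizes} (units unknown)"


def format_extra_dims(shape):
    """Report dims past the 4th verbatim instead of guessing their meaning."""
    return f"extra dims={tuple(shape[4:])}" if len(shape) > 4 else None


def log_best_effort(label, path, summarize):
    """Log a summary, downgrading any header-read failure to a warning.

    `summarize` is any callable taking a path and returning a string.
    Returns True if the summary was logged, False if a warning was logged.
    """
    try:
        summary = summarize(path)
    except Exception as exc:  # deliberately broad: logging must never abort a run
        logger.warning("%s: could not read image header for %s (%s)", label, path, exc)
        return False
    logger.info("%s: %s", label, summary)
    return True
```

Catching broadly here is a deliberate trade-off: the summary is diagnostic only, so a bad file is left to fail later at the step that actually needs its data.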

The 4D summary line now names the slice axis (from the header's dim_info, or "axis 2 (assumed; no dim_info)" when unset) and the slice acquisition order from slice_code ("unknown" when unset, which is the BIDS norm since SliceTiming lives in the JSON sidecar — already logged by FunctionalMetadata).
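For reference, the slice axis and acquisition order are both packed into single header bytes. The sketch below decodes them the way the NIfTI-1 layout defines them (slice dimension in bits 4-5 of `dim_info`, order in `slice_code`); `slice_axis` and `describe_slices` are hypothetical names, and the human-readable order labels are assumptions, since the PR text only shows `unknown`.

```python
# NIfTI-1 slice_code values; the label wording is illustrative.
_SLICE_ORDER = {
    0: "unknown",
    1: "sequential increasing",
    2: "sequential decreasing",
    3: "alternating increasing",
    4: "alternating decreasing",
    5: "alternating increasing #2",
    6: "alternating decreasing #2",
}


def slice_axis(dim_info):
    """Return (axis, known) from the header's dim_info byte.

    Bits 4-5 hold the slice dimension as 1, 2, or 3 (0 = unset). When unset,
    fall back to axis 2 and report that it is an assumption.
    """
    slice_dim = (dim_info >> 4) & 0x03
    if slice_dim == 0:
        return 2, False
    return slice_dim - 1, True


def describe_slices(shape, dim_info, slice_code):
    """Build the slice portion of the 4D summary line."""
    axis, known = slice_axis(dim_info)
    where = f"axis {axis}" if known else f"axis {axis} (assumed; no dim_info)"
    order = _SLICE_ORDER.get(slice_code, "unknown")
    return f"slices={shape[axis]} along {where}, slice order={order}"


# Header with neither dim_info nor slice_code set, as in the BOLD sample.
print(describe_slices((64, 64, 33, 300), dim_info=0, slice_code=0))
```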

kaitj

lgtm 🚀 (for the cross sectional)

I haven't taken a close look at if / how to integrate this into the longitudinal, but any reason to not also include it there for the anatomical and functional workflows?

nx10 · 2026-05-12T15:38:02Z

Adding it to any raw/user provided images in longitudinal would be sensible - any images we create ourselves generally should not need it

kaitj · 2026-05-12T16:26:03Z

> Adding it to any raw/user provided images in longitudinal would be sensible - any images we create ourselves generally should not need it

I don't think we currently have a way to check for the creator? If the information is available, I would assume it's in the metadata, but I think right now as long as a dataset can be queried by b2t and the file name matches expected entities, it would run. If this is the case, probably safer to handle the majority (or all) of the inputs? Typing this, I think the same is potentially true for all the individual workflows (qc, metrics, etc.), though I think those are less likely since more files are required.

Add uncompressed in-memory size to input image summary

4323be4

prod(shape) * dtype.itemsize, formatted with binary units (B/KiB/MiB/GiB). The on-disk .nii.gz size hides this; it's the number that matters when the pipeline loads the array.

nx10 requested a review from kaitj May 11, 2026 23:33

nx10 added 2 commits May 11, 2026 19:40

kaitj reviewed May 12, 2026

nx10 merged commit 0ae9776 into main May 12, 2026
8 checks passed

nx10 deleted the log-input-image-metadata branch May 12, 2026 15:36

nx10 merged 4 commits into main from log-input-image-metadata
