Summary
After the BIDS consolidation (#260, #263, #264, #265), the architecture has clean layers for BIDS naming (bids/), computation (core/ + workflows/), and CLI arg parsing (cli/). However, the CLI modules still contain orchestration logic that doesn't belong there:
- Applying BIDS table filters (datatype, space, desc for raw vs derivative data)
- Sub/ses grouping and iteration loops
- Creating RunContext per subject/session
- Loading sessions
- Calling discover -> compute -> export per run
- Progress reporting (tqdm)
- Reading TR from NIfTI headers (_read_header_tr in metrics CLI)
The CLI should only parse args and hand off to an orchestration layer.
Proposed approach
Add an orchestration layer that owns the full pipeline loop. The key design challenge is how run_all composes per-session stages (passing anat outputs in-memory to functional, functional outputs to metrics/QC) while standalone workflows run independently.
Possible structure:
src/rbc/
    bids/             # BIDS naming contracts (discover, resolve, export)
    core/             # Processing primitives (ANTs, FSL wrappers)
    workflows/        # Processing step chains (single_session_preprocess, etc.)
    orchestration/    # Full pipeline loops (filter -> group -> discover -> process -> export)
        anatomical.py
        functional.py
        metrics.py
        qc.py
        all.py        # Composes per-session stages with in-memory output passing
    cli/              # Arg parsing only, delegates to orchestration
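To make the composition question concrete, here is a minimal sketch of how all.py might chain per-session stages while passing outputs in memory. Everything in it is hypothetical: the Stage signature, run_session, and the stand-in stage functions are illustrative names, not existing rbc code, and real stages would return richer objects than dicts of strings.

```python
from typing import Any, Callable, Dict, Iterable, List, Optional, Tuple

# Hypothetical types: a session key is (subject, session); a stage receives the
# key plus the outputs of every stage that already ran for that session.
SessionKey = Tuple[str, Optional[str]]
Stage = Callable[[SessionKey, Dict[str, Any]], Any]


def run_session(key: SessionKey, stages: List[Tuple[str, Stage]]) -> Dict[str, Any]:
    """Run the named stages in order for one session, keeping outputs in memory."""
    outputs: Dict[str, Any] = {}
    for name, stage in stages:
        # Each stage can read upstream results (e.g. functional reads anatomical).
        outputs[name] = stage(key, outputs)
    return outputs


def run_all(
    keys: Iterable[SessionKey], stages: List[Tuple[str, Stage]]
) -> Dict[SessionKey, Dict[str, Any]]:
    """Run the full stage chain for every session key."""
    return {key: run_session(key, stages) for key in keys}


# Stand-in stages that only build strings, so the sketch runs without real data.
def anatomical(key: SessionKey, upstream: Dict[str, Any]) -> Dict[str, str]:
    return {"t1w_brain": f"sub-{key[0]}_T1w_brain"}


def functional(key: SessionKey, upstream: Dict[str, Any]) -> Dict[str, str]:
    return {
        "bold_preproc": f"sub-{key[0]}_bold_preproc",
        "anat_ref": upstream["anatomical"]["t1w_brain"],
    }


def metrics(key: SessionKey, upstream: Dict[str, Any]) -> Dict[str, str]:
    return {"source": upstream["functional"]["bold_preproc"]}


PIPELINE: List[Tuple[str, Stage]] = [
    ("anatomical", anatomical),
    ("functional", functional),
    ("metrics", metrics),
]
```

Under this shape, a standalone workflow is the same loop with a one-element stage list, which is one possible answer to whether all.py should reuse the other orchestration modules.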
Each orchestration function would take:
- input_dir / output_dir
- A filter specification (participant labels, session labels, task, etc.) rather than a pre-filtered DataFrame
- Workflow-specific params (regressors, atlases, fwhm, etc.)
- Runner/verbose config
The filter specification could be a simple dataclass or TypedDict, keeping it decoupled from argparse.
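As a rough illustration of the dataclass option, the sketch below defines a filter specification and applies it to plain dict rows before grouping by subject/session. The class name FilterSpec, its field names, the row keys ("sub", "ses", "task"), and group_by_session are all assumptions made for the example; the real BIDS table is presumably a DataFrame with its own column names.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional, Tuple

Row = Dict[str, Optional[str]]


@dataclass(frozen=True)
class FilterSpec:
    """Hypothetical filter specification; None means 'do not filter on this'."""

    participant_labels: Optional[Tuple[str, ...]] = None
    session_labels: Optional[Tuple[str, ...]] = None
    task: Optional[str] = None

    def matches(self, row: Row) -> bool:
        """Return True if the row passes every filter that is set."""
        if self.participant_labels is not None:
            if row.get("sub") not in self.participant_labels:
                return False
        if self.session_labels is not None:
            if row.get("ses") not in self.session_labels:
                return False
        if self.task is not None and row.get("task") != self.task:
            return False
        return True


def group_by_session(
    rows: Iterable[Row], spec: FilterSpec
) -> Dict[Tuple[Optional[str], Optional[str]], List[Row]]:
    """Filter rows with the spec, then group them by (subject, session)."""
    groups: Dict[Tuple[Optional[str], Optional[str]], List[Row]] = {}
    for row in rows:
        if spec.matches(row):
            groups.setdefault((row.get("sub"), row.get("ses")), []).append(row)
    return groups
```

Because the spec carries no argparse types, the same object could be built from the CLI, from a test, or from a notebook.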
After this, CLI modules become ~20 lines: args dataclass + register_command + a thin main() that constructs filters and calls orchestration.run_*().
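A thin CLI module along these lines might look like the sketch below. It is only an illustration: MetricsArgs, the flag names, the "rbc" program name, and the run_metrics callable (standing in for orchestration.run_metrics) are assumptions, and the orchestration entry point is injected as a parameter so the example runs on its own.

```python
import argparse
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional, Sequence


@dataclass
class MetricsArgs:
    """Hypothetical args dataclass mirroring the parsed command line."""

    input_dir: str
    output_dir: str
    participant_label: Optional[List[str]] = None
    session_label: Optional[List[str]] = None
    task: Optional[str] = None


def register_command(subparsers: Any) -> None:
    """Attach the (illustrative) 'metrics' subcommand to a subparsers object."""
    parser = subparsers.add_parser("metrics", help="Compute metrics")
    parser.add_argument("input_dir")
    parser.add_argument("output_dir")
    parser.add_argument("--participant-label", nargs="+")
    parser.add_argument("--session-label", nargs="+")
    parser.add_argument("--task")


def main(
    argv: Sequence[str],
    run_metrics: Callable[[str, str, Dict[str, Any]], None],
) -> int:
    """Parse args, build the filter spec, and hand off; no pipeline logic here."""
    parser = argparse.ArgumentParser(prog="rbc")
    subparsers = parser.add_subparsers(dest="command", required=True)
    register_command(subparsers)

    parsed = dict(vars(parser.parse_args(argv)))
    parsed.pop("command")
    args = MetricsArgs(**parsed)

    # The filter spec is a plain dict here; it could equally be a dataclass.
    filters = {
        "participant_labels": args.participant_label,
        "session_labels": args.session_label,
        "task": args.task,
    }
    run_metrics(args.input_dir, args.output_dir, filters)
    return 0
```

In the real module, main() would call the orchestration function directly instead of receiving it as an argument.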
Open questions
- Should the filter spec be a shared type, or just kwargs?
- Should orchestration/all.py compose per-session functions from the other orchestration modules, or have its own implementation?
- Should progress reporting (tqdm) live in orchestration or be passed as a callback?
- Where does _read_header_tr belong? It's used by metrics orchestration but it's really a NIfTI utility.
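On the progress-reporting question, the callback option could be sketched as follows. The names run_sessions and ProgressCallback and the (done, total, label) callback signature are hypothetical; the point is only that orchestration would report progress without importing tqdm itself.

```python
from typing import Callable, Iterable, List, Optional, TypeVar

T = TypeVar("T")
R = TypeVar("R")

# Hypothetical callback signature: (items done, total items, label of last item).
ProgressCallback = Callable[[int, int, str], None]


def run_sessions(
    keys: Iterable[T],
    process: Callable[[T], R],
    on_progress: Optional[ProgressCallback] = None,
) -> List[R]:
    """Process each key in order, notifying the optional callback after each one."""
    items = list(keys)  # materialize so the total is known up front
    results: List[R] = []
    for done, key in enumerate(items, start=1):
        results.append(process(key))
        if on_progress is not None:
            on_progress(done, len(items), str(key))
    return results


# A CLI could adapt this to tqdm without orchestration depending on it, e.g.:
#     bar = tqdm(total=n_sessions)
#     run_sessions(keys, process, on_progress=lambda done, total, label: bar.update(1))
```

The trade-off is that the caller must know the total up front (or accept it from the first callback) to size the bar, whereas keeping tqdm inside orchestration is simpler but ties the library layer to a terminal UI.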
Context
This follows the PR stack: