Add orchestration layer between CLI and workflows #266

@nx10

Summary

After the BIDS consolidation (#260, #263, #264, #265), the architecture has clean layers for BIDS naming (bids/), computation (core/ + workflows/), and CLI arg parsing (cli/). However, the CLI modules still contain orchestration logic that doesn't belong there:

  • Applying BIDS table filters (datatype, space, desc for raw vs derivative data)
  • Sub/ses grouping and iteration loops
  • Creating RunContext per subject/session
  • Loading sessions
  • Calling discover -> compute -> export per run
  • Progress reporting (tqdm)
  • Reading TR from NIfTI headers (_read_header_tr in metrics CLI)

The CLI should only parse args and hand off to an orchestration layer.

Proposed approach

Add an orchestration layer that owns the full pipeline loop. The key design challenge is how run_all composes per-session stages (passing anat outputs in-memory to functional, functional outputs to metrics/QC) while standalone workflows run independently.

Possible structure:

src/rbc/
  bids/           # BIDS naming contracts (discover, resolve, export)
  core/           # Processing primitives (ANTs, FSL wrappers)
  workflows/      # Processing step chains (single_session_preprocess, etc.)
  orchestration/  # Full pipeline loops (filter -> group -> discover -> process -> export)
    anatomical.py
    functional.py
    metrics.py
    qc.py
    all.py        # Composes per-session stages with in-memory output passing
  cli/            # Arg parsing only, delegates to orchestration
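As a rough illustration of the in-memory composition described above, here is a minimal sketch. Every name in it (`Session`, `run_anatomical_session`, `run_functional_session`, `run_metrics_session`, `run_all`) is hypothetical, and the stage bodies are placeholders rather than the real workflows. It only shows one way `all.py` might chain per-session stages without going back to disk between them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Session:
    """Hypothetical subject/session key used to group runs."""
    sub: str
    ses: str


def run_anatomical_session(session: Session) -> dict:
    # Placeholder for the anatomical workflow; returns its outputs in memory.
    return {"brain_mask": f"mask[{session.sub}/{session.ses}]"}


def run_functional_session(session: Session, anat: dict) -> dict:
    # Receives the anatomical outputs directly instead of re-discovering them.
    return {"bold_preproc": f"bold[{session.sub}/{session.ses}]", "anat": anat}


def run_metrics_session(session: Session, func: dict) -> dict:
    # Receives the functional outputs directly.
    return {"metrics_from": func["bold_preproc"]}


def run_all(sessions: list[Session]) -> dict[Session, dict]:
    """Chain the per-session stages, passing each stage's outputs to the next."""
    results = {}
    for session in sessions:
        anat = run_anatomical_session(session)
        func = run_functional_session(session, anat)
        results[session] = run_metrics_session(session, func)
    return results
```

In this shape the standalone modules would each wrap their own per-session function in a discover/export loop, while `run_all` reuses the same per-session functions and skips the intermediate discovery step.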

Each orchestration function would take:

  • input_dir / output_dir
  • A filter specification (participant labels, session labels, task, etc.) rather than a pre-filtered DataFrame
  • Workflow-specific params (regressors, atlases, fwhm, etc.)
  • Runner/verbose config

The filter specification could be a simple dataclass or TypedDict, keeping it decoupled from argparse.
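A minimal sketch of what such a filter specification might look like as a dataclass. The name `BidsFilter`, its fields, the `matches` method, and the row keys (`sub`, `ses`, `task`) are all assumptions for illustration, not the existing API. Plain dicts stand in for rows of the BIDS table.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class BidsFilter:
    """Hypothetical filter spec; None means 'do not filter on this entity'."""
    participant_labels: Optional[Sequence[str]] = None
    session_labels: Optional[Sequence[str]] = None
    task: Optional[str] = None

    def matches(self, row: dict) -> bool:
        # Each check is skipped when the corresponding field is None.
        if self.participant_labels is not None and row.get("sub") not in self.participant_labels:
            return False
        if self.session_labels is not None and row.get("ses") not in self.session_labels:
            return False
        if self.task is not None and row.get("task") != self.task:
            return False
        return True


def select_rows(rows: list, filters: BidsFilter) -> list:
    """Apply the filter inside orchestration instead of in the CLI."""
    return [row for row in rows if filters.matches(row)]
```

Because the spec carries no reference to argparse, the same object could be built from CLI args, from tests, or from a Python caller.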

After this, CLI modules become ~20 lines: args dataclass + register_command + a thin main() that constructs filters and calls orchestration.run_*().
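For a sense of scale, a thin CLI module along these lines might look roughly like the sketch below. The `register_command` signature, the argument names, the `rbc` program name, and the `run_functional` stub (standing in for a hypothetical `orchestration.run_functional()`) are assumptions, not the current code.

```python
import argparse
from dataclasses import dataclass
from typing import Optional


@dataclass
class FunctionalArgs:
    """Typed view of the parsed CLI arguments (hypothetical fields)."""
    input_dir: str
    output_dir: str
    participant_label: Optional[list]
    verbose: bool


def run_functional(input_dir, output_dir, filters, verbose=False):
    # Stand-in for the orchestration entry point; echoes what it received.
    return {"input_dir": input_dir, "output_dir": output_dir,
            "filters": filters, "verbose": verbose}


def register_command(subparsers) -> None:
    parser = subparsers.add_parser("functional", help="Run functional preprocessing")
    parser.add_argument("input_dir")
    parser.add_argument("output_dir")
    parser.add_argument("--participant-label", nargs="+", default=None)
    parser.add_argument("--verbose", action="store_true")
    parser.set_defaults(func=main)


def main(ns: argparse.Namespace):
    # Only translate args into a filter spec and delegate; no loops, no I/O.
    args = FunctionalArgs(ns.input_dir, ns.output_dir, ns.participant_label, ns.verbose)
    filters = {"participant_labels": args.participant_label}
    return run_functional(args.input_dir, args.output_dir, filters, verbose=args.verbose)


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="rbc")
    register_command(parser.add_subparsers(dest="command", required=True))
    return parser
```

Usage would then be `ns = build_parser().parse_args(argv); ns.func(ns)`, with all grouping, discovery, and export happening behind the orchestration call.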

Open questions

  • Should the filter spec be a shared type, or just kwargs?
  • Should orchestration/all.py compose per-session functions from the other orchestration modules, or have its own implementation?
  • Should progress reporting (tqdm) live in orchestration or be passed as a callback?
  • Where does _read_header_tr belong? It's used by metrics orchestration but it's really a NIfTI utility.
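To make the callback option for progress reporting concrete, here is a small sketch. `ProgressCallback`, `run_sessions`, and the `(done, total, label)` signature are hypothetical names chosen for illustration. The idea is that orchestration reports progress through a plain callable and never imports tqdm itself.

```python
from typing import Callable, Iterable, Optional

# Hypothetical callback type: (completed count, total count, current label).
ProgressCallback = Callable[[int, int, str], None]


def run_sessions(
    sessions: Iterable[str],
    process: Callable[[str], object],
    on_progress: Optional[ProgressCallback] = None,
) -> list:
    """Process each session and report progress through an optional callback."""
    sessions = list(sessions)  # materialize so the total is known up front
    results = []
    for index, label in enumerate(sessions, start=1):
        results.append(process(label))
        if on_progress is not None:
            on_progress(index, len(sessions), label)
    return results
```

Under this design the CLI could pass a callback that advances a tqdm bar via its `update(1)` method, while tests and library callers pass nothing or a list-appending stub. Keeping tqdm inside orchestration would be simpler but ties the layer to one progress UI.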

Context

This follows the PR stack:
