|
| 1 | +# AGENTS.md -- XCP-D |
| 2 | + |
| 3 | +This file provides instructions for AI coding agents and human maintainers working on **XCP-D**, a BIDS App for robust postprocessing of fMRI data. |
| 4 | + |
| 5 | +--- |
| 6 | + |
| 7 | +## Shared Instructions (All PennLINC BIDS Apps) |
| 8 | + |
| 9 | +The following conventions apply equally to **qsiprep**, **qsirecon**, **xcp_d**, and **aslprep**. All four are PennLINC BIDS Apps built on the NiPreps stack. |
| 10 | + |
| 11 | +### Ecosystem Context |
| 12 | + |
| 13 | +- These projects belong to the [NiPreps](https://www.nipreps.org/) ecosystem and follow its community guidelines. |
| 14 | +- Core dependencies include **nipype** (workflow engine), **niworkflows** (reusable workflow components), **nireports** (visual reports), **pybids** (BIDS dataset querying), and **nibabel** (neuroimaging I/O). |
| 15 | +- All four apps are containerized via Docker and distributed on Docker Hub under the `pennlinc/` namespace. |
| 16 | +- Contributions follow the [NiPreps contributing guidelines](https://www.nipreps.org/community/CONTRIBUTING/). |
| 17 | + |
| 18 | +### Architecture Overview |
| 19 | + |
| 20 | +Every PennLINC BIDS App follows this execution flow: |
| 21 | + |
| 22 | +``` |
| 23 | +CLI (parser.py / run.py) |
| 24 | + -> config singleton (config.py, serialized as TOML) |
| 25 | + -> workflow graph construction (workflows/*.py) |
| 26 | + -> Nipype interfaces (interfaces/*.py) |
| 27 | + -> BIDS-compliant derivative outputs |
| 28 | +``` |
| 29 | + |
| 30 | +- **CLI** (`<pkg>/cli/`): `parser.py` defines argparse arguments; `run.py` is the entry point; `workflow.py` builds the execution graph; `version.py` handles `--version`. |
| 31 | +- **Config** (`<pkg>/config.py`): A singleton module with class-based sections (`environment`, `execution`, `workflow`, `nipype`, `seeds`). Config is serialized to TOML and passed across processes via the filesystem. Access settings as `config.section.setting`. |
| 32 | +- **Workflows** (`<pkg>/workflows/`): Built using `nipype.pipeline.engine` (`pe.Workflow`, `pe.Node`, `pe.MapNode`). Use `LiterateWorkflow` from `niworkflows.engine.workflows` for auto-documentation. Every workflow factory function must be named `init_<descriptive_name>_wf`. |
| 33 | +- **Interfaces** (`<pkg>/interfaces/`): Custom Nipype interfaces wrapping external tools or Python logic. Follow standard Nipype patterns: define `_InputSpec` / `_OutputSpec` with `BaseInterfaceInputSpec` / `TraitedSpec`, implement `_run_interface()`. |
| 34 | +- **Utilities** (`<pkg>/utils/`): Shared helper functions. BIDS-specific helpers live in `utils/bids.py`. |
| 35 | +- **Reports** (`<pkg>/reports/`): HTML report generation using nireports. |
| 36 | +- **Data** (`<pkg>/data/`): Static package data (config files, templates, atlases). Accessed via `importlib.resources` or the `acres` package. |
| 37 | +- **Tests** (`<pkg>/tests/`): Pytest-based. Unit tests run without external data. Integration tests are gated behind pytest markers and are skipped by default. |
| 38 | + |
| 39 | +### Workflow Authoring Rules |
| 40 | + |
| 41 | +1. Every workflow factory function must be named `init_<name>_wf` and return a `Workflow` object. |
| 42 | +2. Use `LiterateWorkflow` (from `niworkflows.engine.workflows`) to enable automatic workflow graph documentation. |
| 43 | +3. Define `inputnode` and `outputnode` as `niu.IdentityInterface` nodes to declare the workflow's external API. |
| 44 | +4. Connect nodes using `workflow.connect([(source, dest, [('out_field', 'in_field')])])` syntax. |
| 45 | +5. Add `# fmt:skip` after multi-line `workflow.connect()` calls to prevent ruff from reformatting them. |
| 46 | +6. Include a docstring with `Workflow Graph` and `.. workflow::` Sphinx directive for auto-generated documentation. |
| 47 | +7. Use `config` module values (not function parameters) for global settings inside workflow builders. |
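The naming rule in item 1 can be checked mechanically. The following is a minimal sketch, not part of any PennLINC repository: the helper name `find_misnamed_factories` and the heuristic that any top-level function starting with `init_` is a workflow factory are assumptions made for illustration.

```python
import ast
import re

# Pattern for the naming rule: init_<name>_wf
WF_NAME = re.compile(r'^init_[a-z0-9_]+_wf$')


def find_misnamed_factories(source):
    """Return top-level functions that start with ``init_`` but break the rule.

    Hypothetical helper: it assumes every top-level ``init_*`` function
    is meant to be a workflow factory.
    """
    tree = ast.parse(source)
    return [
        node.name
        for node in tree.body
        if isinstance(node, ast.FunctionDef)
        and node.name.startswith('init_')
        and not WF_NAME.match(node.name)
    ]


example = '''
def init_denoise_bold_wf():
    pass

def init_denoise():
    pass
'''
print(find_misnamed_factories(example))  # ['init_denoise']
```

A check like this could run as a unit test over the modules in `<pkg>/workflows/`, since it needs no neuroimaging data.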
| 48 | + |
| 49 | +### Interface Conventions |
| 50 | + |
| 51 | +1. Input/output specs use Nipype traits (`File`, `traits.Bool`, `traits.Int`, etc.). |
| 52 | +2. Set `mandatory=True` on required inputs; provide `desc=` for all traits. |
| 53 | +3. Implement `_run_interface(self, runtime)` -- never `run()`. |
| 54 | +4. Return `runtime` from `_run_interface`. |
| 55 | +5. Set outputs via `self._results['field'] = value` (the `_results` dictionary is available when the interface subclasses Nipype's `SimpleInterface`). |
| 56 | + |
| 57 | +### Config Module Usage |
| 58 | + |
| 59 | +```python |
| 60 | +from <pkg> import config  # <pkg> is a placeholder for the package name, e.g. xcp_d |
| 61 | + |
| 62 | +# Read a setting |
| 63 | +work_dir = config.execution.work_dir |
| 64 | + |
| 65 | +# Serialize to disk |
| 66 | +config.to_filename(path) |
| 67 | + |
| 68 | +# Load from disk (in a subprocess) |
| 69 | +config.load(path) |
| 70 | +``` |
| 71 | + |
| 72 | +The config module is the single source of truth for runtime parameters. Never pass global settings as function arguments when they are available via config. |
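To make the "class-based sections" idea concrete, here is a minimal sketch of how such a section might behave. It is not the actual `config.py` of any of the four apps: the `_Section` base class, its `load`/`get` methods, and the two settings shown are illustrative assumptions, and TOML serialization is left out.

```python
class _Section:
    """Base class for a config section; settings live as class attributes."""

    def __init__(self):
        # Sections act as namespaces, so instantiation is blocked.
        raise RuntimeError('Config sections are not meant to be instantiated.')

    @classmethod
    def load(cls, settings):
        """Update known settings from a dictionary, ignoring unknown keys."""
        for key, value in settings.items():
            if hasattr(cls, key):
                setattr(cls, key, value)

    @classmethod
    def get(cls):
        """Return the section's public settings as a plain dictionary."""
        return {k: v for k, v in vars(cls).items() if not k.startswith('_')}


class execution(_Section):
    """Hypothetical section with two illustrative settings."""

    work_dir = None
    output_dir = None


execution.load({'work_dir': '/tmp/work', 'not_a_setting': 1})
print(execution.work_dir)  # /tmp/work
```

Because the settings are class attributes, any module that imports the section sees the same values, which is what lets workflow builders read `config.section.setting` instead of taking global settings as arguments.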
| 73 | + |
| 74 | +### Testing Conventions |
| 75 | + |
| 76 | +- **Unit tests**: Files named `test_*.py` in `<pkg>/tests/`. Must not require external neuroimaging data or network access. |
| 77 | +- **Integration tests**: Decorated with `@pytest.mark.<marker_name>`. Excluded by default via `addopts` in `pyproject.toml`. Require Docker or pre-downloaded test datasets. |
| 78 | +- **Fixtures**: Defined in `conftest.py`. Common fixtures include `data_dir`, `working_dir`, `output_dir`, and `datasets`. |
| 79 | +- **Coverage**: Configured in `pyproject.toml` under `[tool.coverage.run]` and `[tool.coverage.report]`. |
| 80 | + |
| 81 | +### Documentation |
| 82 | + |
| 83 | +- Built with Sphinx using `sphinx_rtd_theme`. |
| 84 | +- Source files in `docs/`. |
| 85 | +- Workflow graphs are auto-rendered via `.. workflow::` directives that call `init_*_wf` functions. |
| 86 | +- API docs generated via `sphinxcontrib-apidoc`. |
| 87 | +- Bibliography managed with `sphinxcontrib-bibtex` and `boilerplate.bib`. |
| 88 | + |
| 89 | +### Docker |
| 90 | + |
| 91 | +- Each app has a custom base image: `pennlinc/<pkg>_build:<version>`. |
| 92 | +- The `Dockerfile` installs the app via `pip install` into the base image. |
| 93 | +- Entrypoint is the CLI command (e.g., `/usr/local/miniconda/bin/<pkg>`). |
| 94 | +- Labels follow the `org.label-schema` convention. |
| 95 | + |
| 96 | +### Release Process |
| 97 | + |
| 98 | +- Versions are derived from git tags via `hatch-vcs` (VCS-based versioning). |
| 99 | +- GitHub Releases use auto-generated changelogs configured in `.github/release.yml`. |
| 100 | +- Release categories: Breaking Changes, New Features, Deprecations, Bug Fixes, Other. |
| 101 | +- Docker images are built and pushed via CI on tagged releases. |
| 102 | + |
| 103 | +### Code Style |
| 104 | + |
| 105 | +- **Formatter**: `ruff format` (target: all four repos). |
| 106 | +- **Linter**: `ruff check` with an extended rule set (F, E, W, I, UP, YTT, S, BLE, B, A, C4, DTZ, T10, EXE, FA, ISC, ICN, PT, Q). |
| 107 | +- **Import sorting**: Handled by ruff's `I` rule (isort-compatible). |
| 108 | +- **Pre-commit**: Uses `ruff-pre-commit` hooks for both linting and formatting. |
| 109 | +- **Black is disabled**: `[tool.black] exclude = ".*"` in repos that have migrated to ruff. |
| 110 | + |
| 111 | +### BIDS Compliance |
| 112 | + |
| 113 | +- All outputs must conform to the [BIDS Derivatives](https://bids-specification.readthedocs.io/en/stable/derivatives/introduction.html) specification. |
| 114 | +- Use pybids' `BIDSLayout` (imported as `from bids import BIDSLayout`) for querying input datasets. |
| 115 | +- Use `DerivativesDataSink` (from the project's interfaces or niworkflows) for writing BIDS-compliant output files. |
| 116 | +- Entity names, suffixes, and extensions must match the BIDS specification. |
| 117 | + |
| 118 | +--- |
| 119 | + |
| 120 | +## XCP-D-Specific Instructions |
| 121 | + |
| 122 | +### Project Overview |
| 123 | + |
| 124 | +XCP-D is a BIDS App for postprocessing fMRI data that has been preprocessed by fMRIPrep, nibabies, or similar pipelines. It handles: |
| 125 | +- Confound regression and nuisance signal removal |
| 126 | +- Temporal filtering (band-pass, motion filtering) |
| 127 | +- Framewise displacement-based censoring (scrubbing) |
| 128 | +- Despiking |
| 129 | +- Smoothing |
| 130 | +- Parcellation and connectivity matrix generation |
| 131 | +- Surface processing (CIFTI/GIFTI workflows) |
| 132 | +- Concatenation of runs within task entity sets |
| 133 | +- Executive summary HTML report generation |
| 134 | +- Data ingression from non-BIDS formats (ABCD-BIDS, HCP-YA, UK Biobank) |
| 135 | + |
| 136 | +### Repository Details |
| 137 | + |
| 138 | +| Item | Value | |
| 139 | +|------|-------| |
| 140 | +| Package name | `xcp_d` | |
| 141 | +| Default branch | `main` | |
| 142 | +| Entry point | `xcp_d.cli.run:main` | |
| 143 | +| Python requirement | `>=3.10` | |
| 144 | +| Build backend | hatchling + hatch-vcs | |
| 145 | +| Linter | ruff ~= 0.15.0 | |
| 146 | +| Pre-commit | Yes (ruff v0.6.2) | |
| 147 | +| Tox | Yes | |
| 148 | +| Docker base | `pennlinc/xcp_d_build:<ver>` | |
| 149 | +| Dockerfile | Simple COPY + pip install | |
| 150 | + |
| 151 | +### Key Directories |
| 152 | + |
| 153 | +- `xcp_d/workflows/bold/`: BOLD postprocessing workflows for NIfTI and CIFTI data, plus run concatenation |
| 154 | +- `xcp_d/workflows/anatomical/`: Anatomical postprocessing (surfaces, volumes, parcellation) |
| 155 | +- `xcp_d/workflows/parcellation.py`: Atlas loading and parcellation workflow |
| 156 | +- `xcp_d/workflows/plotting.py`: QC plot generation workflows |
| 157 | +- `xcp_d/interfaces/`: Nipype interfaces for censoring, connectivity, concatenation, ANTs, nilearn, workbench, executive summary, plotting |
| 158 | +- `xcp_d/ingression/`: Modules for ingressing data from non-BIDS formats (ABCD-BIDS, HCP-YA, UK Biobank) |
| 159 | +- `xcp_d/data/atlases/`: Bundled brain atlases (Glasser, Gordon, HCP, Tian, MIDB, MyersLabonte) |
| 160 | +- `xcp_d/data/nuisance/`: YAML configs for nuisance regression strategies |
| 161 | +- `xcp_d/data/executive_summary_templates/`: Jinja2 HTML templates for the executive summary report |
| 162 | + |
| 163 | +### Version Management |
| 164 | + |
| 165 | +XCP-D uses `__about__.py` for version metadata: |
| 166 | +```python |
| 167 | +from xcp_d.__about__ import __copyright__, __credits__, __packagename__, __version__ |
| 168 | +``` |
| 169 | +This differs from qsiprep/qsirecon, which import `__version__` directly from `_version.py`. Both patterns work; harmonization is a roadmap item. |
| 170 | + |
| 171 | +### Linting Status (Reference for Other Repos) |
| 172 | + |
| 173 | +XCP-D has the **cleanest ruff configuration** of all four repos, with only 3 suppressed rules: |
| 174 | +- `S105`: Hardcoded password detection (false positives) |
| 175 | +- `S311`: Random not for crypto (intentional) |
| 176 | +- `S603`: Subprocess call without `shell=True` that may execute untrusted input (only trusted commands are run) |
| 177 | + |
| 178 | +This minimal ignore set is the target for the other three repos. |
| 179 | + |
| 180 | +### Ingression Modules |
| 181 | + |
| 182 | +`xcp_d/ingression/` contains adapters for non-BIDS input formats: |
| 183 | +- `abcdbids.py`: ABCD-BIDS format |
| 184 | +- `hcpya.py`: HCP Young Adult format |
| 185 | +- `ukbiobank.py`: UK Biobank format |
| 186 | +- `utils.py`: Shared ingression utilities |
| 187 | + |
| 188 | +When adding support for new input formats, create a new module in this directory following the existing patterns. |
| 189 | + |
| 190 | +### Executive Summary |
| 191 | + |
| 192 | +XCP-D generates rich HTML executive summary reports using Jinja2 templates in `xcp_d/data/executive_summary_templates/`. These include: |
| 193 | +- BrainSprite interactive brain viewers |
| 194 | +- Anatomical registration quality plots |
| 195 | +- Task-specific static QC plots |
| 196 | + |
| 197 | +The `xcp_d/interfaces/execsummary.py` interface and `xcp_d/utils/execsummary.py` utilities handle generation. |
| 198 | + |
| 199 | +### `fill_doc` Decorator |
| 200 | + |
| 201 | +XCP-D uses a `@fill_doc` decorator (from `xcp_d.utils.doc`) to inject shared parameter documentation into function docstrings. Use it on any workflow init function that accepts standard parameters documented in the shared parameter dictionary. |
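The general technique can be sketched as follows. This is a simplified illustration, not the code in `xcp_d.utils.doc`: it assumes `%(name)s`-style placeholders, uses a made-up `docdict` entry and a made-up workflow name, and skips the re-indentation a real implementation would likely perform.

```python
# Hypothetical shared parameter dictionary; the entry below is illustrative only.
docdict = {
    'mem_gb': 'mem_gb : float\n    Memory limit, in gigabytes.',
}


def fill_doc(func):
    """Replace ``%(name)s`` placeholders in a docstring with shared entries."""
    # Docstrings can be None (for example under ``python -OO``), so guard first.
    if func.__doc__:
        func.__doc__ = func.__doc__ % docdict
    return func


@fill_doc
def init_example_wf(mem_gb):
    """Build an example workflow.

    Parameters
    ----------
    %(mem_gb)s
    """


print(init_example_wf.__doc__)
```

Keeping the parameter text in one dictionary means a description is corrected once and every decorated function picks up the change.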
| 202 | + |
| 203 | +### Config Hashing |
| 204 | + |
| 205 | +XCP-D implements config hashing (`config.hash_config()`) to create unique identifiers for pipeline configurations. This is used to detect when outputs need to be regenerated due to parameter changes. |
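The idea behind hashing a configuration can be sketched with the standard library. This is an assumption-laden illustration, not the implementation of `config.hash_config()`: the function name `hash_settings`, the choice of SHA-256, the JSON serialization, and the parameter names in the usage lines are all hypothetical.

```python
import hashlib
import json


def hash_settings(settings, length=8):
    """Return a short, deterministic hash of a settings dictionary."""
    # sort_keys makes the serialization independent of insertion order;
    # default=str covers values such as paths that JSON cannot encode directly.
    payload = json.dumps(settings, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode('utf-8')).hexdigest()[:length]


a = hash_settings({'fd_thresh': 0.3, 'smoothing': 6})
b = hash_settings({'smoothing': 6, 'fd_thresh': 0.3})
c = hash_settings({'fd_thresh': 0.5, 'smoothing': 6})
print(a == b, a == c)  # True False
```

Identical parameters always map to the same identifier regardless of key order, while any changed value yields a different one, which is the property needed to decide whether outputs are stale.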
| 206 | + |
| 207 | +--- |
| 208 | + |
| 209 | +## Cross-Project Development Roadmap |
| 210 | + |
| 211 | +This roadmap covers harmonization work across all four PennLINC BIDS Apps (qsiprep, qsirecon, xcp_d, aslprep) to reduce maintenance burden. |
| 212 | + |
| 213 | +### Phase 1: Bring qsirecon to parity |
| 214 | + |
| 215 | +1. **Migrate qsirecon from flake8+black+isort to ruff** -- copy the `[tool.ruff]` config from xcp_d's `pyproject.toml` and remove `[tool.black]`, `[tool.isort]`, `[tool.flake8]` sections. |
| 216 | +2. **Add `.pre-commit-config.yaml` to qsirecon** -- identical to the config used by qsiprep, xcp_d, and aslprep. |
| 217 | +3. **Add `tox.ini` to qsirecon** -- copy from qsiprep or xcp_d (they are identical). |
| 218 | +4. **Add `.github/dependabot.yml` to qsirecon**. |
| 219 | +5. **Reformat qsirecon codebase** -- run `ruff format` to switch from double quotes to single quotes. |
| 220 | + |
| 221 | +### Phase 2: Standardize across all four repos |
| 222 | + |
| 223 | +6. **Rename qsiprep default branch** from `master` to `main` and update `.github/workflows/lint.yml`. |
| 224 | +7. **Rename aslprep test extras** from `test` to `tests` for consistency with the other three repos. |
| 225 | +8. **Converge on version management** -- recommend the simpler `_version.py` direct-import pattern (used by qsiprep/qsirecon). Migrate xcp_d and aslprep away from `__about__.py`. |
| 226 | +9. **Pin the same ruff version** in all four repos' dev dependencies and `.pre-commit-config.yaml`. |
| 227 | +10. **Harmonize ruff ignore lists** -- adopt xcp_d's minimal set (`S105`, `S311`, `S603`) as the target; fix suppressed rules in qsiprep and aslprep incrementally. |
| 228 | + |
| 229 | +### Phase 3: Shared infrastructure |
| 230 | + |
| 231 | +11. **Extract a reusable GitHub Actions workflow** for lint + codespell + build checks, hosted in a shared repo (e.g., `PennLINC/.github`). |
| 232 | +12. **Standardize Dockerfile patterns** -- adopt multi-stage wheel builds (as qsiprep does) across all four repos. |
| 233 | +13. **Create a shared `pennlinc-style` package or cookiecutter template** providing `pyproject.toml` lint/test config, `.pre-commit-config.yaml`, `tox.ini`, and CI workflows. |
| 234 | +14. **Evaluate `nipreps-versions` calver** -- the `raw-options = { version_scheme = "nipreps-calver" }` line is commented out in all four repos. Decide whether to adopt it. |
| 235 | + |