Metadata-level wiring tests against OpenNeuro BIDS index #318

## Motivation

The Stage 6 longitudinal PR (#317) exposed three wiring bugs. Each one passed unit tests but failed in the full pipeline, costing a 15-30 minute CI round trip per bug:

1. The anat groupby filtered mask rows out of the resolve DataFrame
2. Metrics orchestration queried a `reg` column that doesn't exist in the BIDS table (extra entities live in `extra_entities`)
3. The session filter propagated to template discovery, hiding the multi-session view needed for longitudinal templates

These are all plumbing bugs between layers (CLI -> orchestration -> BIDS resolve -> DataFrame queries), not logic bugs. Unit tests mock too much to catch them; full_pipeline tests take 20+ minutes and only exercise one dataset.
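To make bug 2 concrete, here is a minimal sketch using plain dicts in place of table rows. The row shape, the `get_entity` helper, and the value `"affine"` are all hypothetical; the only detail taken from above is that non-standard entities sit under `extra_entities` rather than in their own column.

```python
# Hypothetical row shape: first-class entities are top-level keys,
# anything else is nested under "extra_entities".
row = {
    "sub": "01",
    "ses": "01",
    "suffix": "T1w",
    "extra_entities": {"reg": "affine"},  # placeholder value for illustration
}


def get_entity(row: dict, name: str):
    """Look up an entity, falling back to the extra_entities mapping."""
    if name in row:
        return row[name]
    return row.get("extra_entities", {}).get(name)


# Querying "reg" as if it were a column fails against a realistic row.
# A unit-test mock that exposes "reg" at the top level would hide this.
try:
    row["reg"]
    direct_lookup_ok = True
except KeyError:
    direct_lookup_ok = False

reg_value = get_entity(row, "reg")
```

A wiring test that feeds realistically shaped rows into the query code would hit the `KeyError` immediately, without running any workflow.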

## Proposal

Build a lightweight "wiring" test tier that runs orchestration logic against **real BIDS metadata from OpenNeuro** without downloading any imaging data or running containers.

### How it works

1. **Pre-compute a bids2table index** for OpenNeuro datasets (just file paths + sidecar metadata, no volumes). Cache as Parquet -- probably a few GB total for the full catalog.
2. **For each dataset**, run through the orchestration wiring with stubbed workflows:
   - `load_table` / `Filters.apply()` / `iter_sessions_with_template`
   - `resolve_*` functions (do the `Bids.expect()` / `Bids.find()` queries succeed?)
   - `export_*` naming (dry-run `Bids.save` that validates the output path without copying)
   - `discover_template_inputs` / `discover_derivative_runs` groupby logic
   - Workflow functions are stubbed to return fake `NamedTuple`s with placeholder paths
3. **Collect failures** as a compatibility matrix: which datasets break which resolve/export paths, and why.
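The three steps above could be sketched roughly as follows. This is an illustration under stated assumptions, not the project's API: `FakeRegistrationOutputs`, `stub_registration`, `run_wiring_checks`, the toy index, and both checks are hypothetical stand-ins for the real orchestration functions and the bids2table index.

```python
from collections import defaultdict
from pathlib import Path
from typing import Callable, NamedTuple


class FakeRegistrationOutputs(NamedTuple):
    """Stand-in for a real workflow result; the paths are never created."""
    warped: Path
    transform: Path


def stub_registration(moving: Path, fixed: Path) -> FakeRegistrationOutputs:
    # No container runs; only the shape of the result matters for wiring.
    return FakeRegistrationOutputs(
        warped=Path("/placeholder/warped.nii.gz"),
        transform=Path("/placeholder/xfm.h5"),
    )


class Failure(NamedTuple):
    dataset: str
    check: str
    reason: str


def run_wiring_checks(
    index: dict[str, list[dict]],
    checks: dict[str, Callable[[list[dict]], None]],
) -> list[Failure]:
    """Run every check against every dataset and record what raised."""
    failures = []
    for dataset, rows in index.items():
        for name, check in checks.items():
            try:
                check(rows)
            except Exception as exc:
                failures.append(Failure(dataset, name, f"{type(exc).__name__}: {exc}"))
    return failures


def as_matrix(failures: list[Failure]) -> dict[str, dict[str, str]]:
    """Pivot failures into {dataset: {check: reason}}."""
    matrix = defaultdict(dict)
    for failure in failures:
        matrix[failure.dataset][failure.check] = failure.reason
    return dict(matrix)


def check_single_t1w(rows: list[dict]) -> None:
    t1w = [r for r in rows if r["suffix"] == "T1w"]
    if len(t1w) != 1:
        raise LookupError(f"expected 1 T1w, found {len(t1w)}")
    stub_registration(Path(t1w[0]["path"]), Path("/placeholder/template.nii.gz"))


def check_has_mask(rows: list[dict]) -> None:
    if not any(r["suffix"] == "mask" for r in rows):
        raise LookupError("no mask row")


# Toy metadata-only index: file paths and entities, no imaging data.
toy_index = {
    "ds-a": [
        {"suffix": "T1w", "path": "sub-01/anat/sub-01_T1w.nii.gz"},
        {"suffix": "mask", "path": "sub-01/anat/sub-01_desc-brain_mask.nii.gz"},
    ],
    "ds-b": [
        {"suffix": "T1w", "path": "sub-01/anat/sub-01_run-1_T1w.nii.gz"},
        {"suffix": "T1w", "path": "sub-01/anat/sub-01_run-2_T1w.nii.gz"},
    ],
}

checks = {"single_t1w": check_single_t1w, "has_mask": check_has_mask}
matrix = as_matrix(run_wiring_checks(toy_index, checks))
```

Here `matrix` contains an entry only for `ds-b`, with one reason per failing check. Catching every exception per check is deliberate: one unexpected dataset should add a cell to the matrix instead of aborting the sweep.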

### What it catches

- Entity combinations nobody anticipated (multi-echo + multi-run + session gaps)
- Single-session subjects mixed into multi-session datasets
- Missing sidecars, unusual suffix/desc combos, non-standard naming
- `groupby` logic failing on unexpected null patterns
- `Bids.expect()` queries that work on ds000001/ds000114 but fail on the long tail
- Filter propagation bugs (session/task filters reaching stages that need the full view)
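As an illustration of the last item, the sketch below reproduces a session filter leaking into a stage that needs the full table. It uses plain dicts, not a DataFrame, and the helpers `apply_session_filter` and `longitudinal_subjects` are hypothetical names, not the pipeline's functions.

```python
from collections import defaultdict

rows = [
    {"sub": "01", "ses": "01", "suffix": "T1w"},
    {"sub": "01", "ses": "02", "suffix": "T1w"},
    # Single-session subject mixed into a multi-session dataset: no ses entity.
    {"sub": "02", "ses": None, "suffix": "T1w"},
]


def apply_session_filter(rows: list[dict], ses: str) -> list[dict]:
    """Keep only rows from one session, as a CLI session filter might."""
    return [r for r in rows if r["ses"] == ses]


def longitudinal_subjects(rows: list[dict]) -> list[str]:
    """Subjects with more than one distinct session in the given rows."""
    sessions = defaultdict(set)
    for r in rows:
        if r["ses"] is not None:
            sessions[r["sub"]].add(r["ses"])
    return sorted(sub for sub, seen in sessions.items() if len(seen) > 1)


filtered = apply_session_filter(rows, "01")

# Bug: template discovery receives the filtered view and finds nothing.
discovered_from_filtered = longitudinal_subjects(filtered)

# Intended wiring: template discovery sees the unfiltered table.
discovered_from_full = longitudinal_subjects(rows)
```

With the filtered view no subject qualifies for a longitudinal template; with the full table subject `01` does. Neither function is wrong in isolation, which is why only a test that exercises the call chain would notice.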

### Scope

- Runs in seconds per dataset (no containers, no NIfTI I/O)
- Could cover thousands of datasets in a single CI job
- Cross-sectional and longitudinal wiring paths
- Separate from the existing test pyramid (unit/integration/full_pipeline)
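The "no NIfTI I/O" constraint applies to exports as well: a dry-run export only needs to build and validate the output path. The sketch below is one possible shape for that check. `dry_run_export_path` is a hypothetical helper, not `Bids.save`, and `ENTITY_ORDER` is a deliberately small subset of BIDS entities chosen for illustration.

```python
import re
from pathlib import PurePosixPath

# Simplified subset of BIDS entities, in filename order (assumption for this sketch).
ENTITY_ORDER = ("sub", "ses", "task", "acq", "run", "space", "desc")

# BIDS labels are alphanumeric: no underscores, hyphens, or dots.
LABEL = re.compile(r"[A-Za-z0-9]+")


def dry_run_export_path(
    entities: dict[str, str], suffix: str, ext: str, datatype: str
) -> PurePosixPath:
    """Build the output path and validate it without touching the filesystem."""
    unknown = set(entities) - set(ENTITY_ORDER)
    if unknown:
        raise ValueError(f"unsupported entities: {sorted(unknown)}")
    if "sub" not in entities:
        raise ValueError("missing required entity: sub")
    for key, value in entities.items():
        if not LABEL.fullmatch(str(value)):
            raise ValueError(f"invalid label for {key}: {value!r}")

    parts = [f"{key}-{entities[key]}" for key in ENTITY_ORDER if key in entities]
    name = "_".join(parts + [suffix]) + ext

    folder = PurePosixPath(f"sub-{entities['sub']}")
    if "ses" in entities:
        folder /= f"ses-{entities['ses']}"
    return folder / datatype / name


path = dry_run_export_path(
    {"sub": "01", "ses": "02", "desc": "brain"},
    suffix="mask",
    ext=".nii.gz",
    datatype="anat",
)

try:
    dry_run_export_path({"sub": "01", "desc": "skull_stripped"}, "mask", ".nii.gz", "anat")
    bad_label_rejected = False
except ValueError:
    bad_label_rejected = True
```

Using `PurePosixPath` keeps the check free of filesystem access, so it costs almost nothing per dataset.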

### Open questions

- Where to host the pre-computed index (artifact in the repo? separate cache?)
- How to handle datasets that are *intentionally* unsupported (e.g., DWI-only, PET)
- Whether to gate PRs on this or run it as a nightly/weekly audit
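For the second question, one option is to classify datasets before running any checks, so that intentionally unsupported ones show up as expected skips and not as failures. The sketch below is only a starting point for discussion: `UNSUPPORTED_DATATYPES` and `classify_dataset` are hypothetical, and the set contents simply mirror the DWI and PET examples above.

```python
# Hypothetical: datatypes the pipeline intentionally does not handle.
UNSUPPORTED_DATATYPES = frozenset({"dwi", "pet"})


def classify_dataset(datatypes: list[str]) -> str:
    """Return "expected-skip" when a dataset holds only unsupported datatypes."""
    present = set(datatypes)
    if present and present <= UNSUPPORTED_DATATYPES:
        return "expected-skip"
    # Mixed or empty datasets stay in scope so that surprises are still reported.
    return "in-scope"


labels = {
    "dwi-only": classify_dataset(["dwi"]),
    "pet-only": classify_dataset(["pet"]),
    "anat+dwi": classify_dataset(["anat", "dwi"]),
}
```

Recording the skip reason in the compatibility matrix would keep "unsupported by design" visibly distinct from "broken".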
