Additional "output" mode for bids-validator -- "bids-derivative"

ATM there are two principal modes of output of bids-validator:

- text mode -- human oriented, shortened for consumption unless --verbose etc
- json -- machine oriented

A new mode could also, in principle, help change the "mental picture" of what bids-validator output is. A few months back I started to "ship" bids-validator output under derivatives/bids-validator, e.g. https://github.com/OpenNeuroDatasets/ds005256/tree/main/derivatives/bids-validator (work with @jungheejung), and the idea came up a few times last week on random occasions, e.g. in a dialog between @jbpoline's and my groups (hence attn @asmacdo and @michellewang @nikhil153).

Moreover, I really think that `bids-validator` output is nearly **the only derivative for any BIDS (raw or not) dataset worth shipping under that dataset's `derivatives/` folder** (as opposed to outside of that dataset, e.g. while composing into a `study` type layout or following YODA). That is because

- those outputs are typically small (a large one is a sign of worry!!!), smaller than MRIQC output etc.
- they are highly relevant for describing the state of this particular BIDS dataset: ideally the output should be just a summary statement that no errors are present, based on which publishers could make claims of "BIDS compliance"

Cons: there is the problem of ensuring that it is "up to date" with the dataset it is contained in, but that is a common issue for any derivative / raw relation, so it should be ok.

NB: the `bids-validator` folder name under `derivatives/` is suboptimal since it includes `-`, and BIDS recommends `{pipeline}-{flavor}` naming, so to a machine it would read as the `validator` flavor of `bids`, which is not entirely incorrect. But maybe we should indeed store it under `bids-validator-{version}` and adjust the description in BIDS so that `flavor` is the part which must not contain `-`?

But with that in mind, and given the major shortcomings of the hard-to-handle "text mode" output for humans, I think it is very well worth making the output of `bids-validator` into a BIDS derivative dataset itself!!! Then we could rely on having more human-accessible summaries within .tsv files etc., potentially with some even more convenient renderers on top, or we might just let humans run tools like visidata. E.g. here is a sample on ds000221 files:

<img width="1790" height="312" alt="Image" src="https://github.com/user-attachments/assets/5891db2f-60fd-4ed6-ba79-61bfeaa8a465" />

<img width="1780" height="510" alt="Image" src="https://github.com/user-attachments/assets/3581b04a-0c11-461d-b071-eb2638f7b80e" />

TODOs

- [ ] seek feedback
- [ ] provide prototype and examples
  - [x] crude prototype by claude and yours truly: https://github.com/yarikoptic/bids-validator-derivative (see [code/bids_validator_relayout.py](https://github.com/yarikoptic/bids-validator-derivative/blob/master/code/bids_validator_relayout.py))
  - [x] example outputs on openneuro datasets (ran for now with `--ignoreNiftiHeaders` ... will do later via datalad-fuse): find under https://github.com/yarikoptic/bids-validator-derivatives/tree/master/openneuro
  - [ ] share repo with openneuro validator outputs
    - difficulty: it can't go into a single git repo. So far it totals 1.1 million files across openneuro datasets, so we are doomed to create an organization for that etc.
  - [ ] finish cooking for all openneuro datasets
  - [ ] rerun without `--ignoreNiftiHeaders`
- Potential improvements which came to mind
  - since many if not most records just repeat across paths, instead of storing them per file, we could identify "unique collections" of issues by removing paths from records. Then create `issue_collections.tsv` and .json (with exact detailed records), and each per-file .json would just link or point to those specific collection ids. That would be much more compact and more handy IMHO!
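To make the "unique collections" idea a bit more concrete, here is a minimal sketch, not what the prototype currently does. It assumes each issue record is a flat dict whose path sits under a hypothetical `location` key; the other field names (`code`, `severity`), the issue codes, and the 8-character hash used as a collection id are likewise just illustrative assumptions.

```python
import hashlib
import json
from collections import defaultdict


def build_issue_collections(issues, path_key="location"):
    """Group issues per file, drop the path, and deduplicate the resulting sets.

    `path_key` is a hypothetical field name; adjust to the actual record layout.
    Returns (collections, file_to_collection).
    """
    per_file = defaultdict(list)
    for issue in issues:
        # Keep everything except the path, so identical issues on
        # different files become identical records.
        record = {k: v for k, v in issue.items() if k != path_key}
        per_file[issue[path_key]].append(record)

    collections = {}         # collection id -> list of path-free records
    file_to_collection = {}  # file path -> collection id
    for path, records in per_file.items():
        # Canonical serialization, so the same set of issues compares equal
        # regardless of key order or the order in which issues were reported.
        canonical = sorted(json.dumps(r, sort_keys=True) for r in records)
        cid = hashlib.sha1("\n".join(canonical).encode()).hexdigest()[:8]
        collections.setdefault(cid, [json.loads(c) for c in canonical])
        file_to_collection[path] = cid
    return collections, file_to_collection


# Made-up records, for illustration only.
issues = [
    {"location": "/sub-01/anat/sub-01_T1w.nii.gz", "code": "CODE_A", "severity": "warning"},
    {"location": "/sub-02/anat/sub-02_T1w.nii.gz", "code": "CODE_A", "severity": "warning"},
    {"location": "/sub-02/func/sub-02_task-rest_bold.nii.gz", "code": "CODE_B", "severity": "error"},
]
collections, file_to_collection = build_issue_collections(issues)
```

Here `collections` would be what gets dumped into `issue_collections.json` (and summarized in the .tsv), while `file_to_collection` gives the per-file pointers. In this toy input the two T1w files share one collection id and the bold file gets its own.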
