# A `heudiconv` example for BIDS-Prov

This example shows provenance traces for a DICOM to NIfTI conversion performed by [`heudiconv`](https://heudiconv.readthedocs.io/en/latest/) on a Linux-based (Fedora) operating system.

## `heudiconv` installation

```shell
pip install heudiconv==1.3.2
```

## Source dataset

We get raw data from https://github.com/psychoinformatics-de/hirni-demo.git, a demo DataLad dataset containing DICOM files.

With this setup, we are ready to convert the DICOM files to NIfTI files using `heudiconv`.

> [!NOTE]
> We use an already existing heuristic file (`sourcedata/hirni-demo/code/hirni-toolbox/converters/heudiconv/hirni_heuristic.py`). This file needs the `HIRNI_STUDY_SPEC` environment variable to be set (see the following command lines).

We check that the BIDS dataset has been created and that it contains the NIfTI files.

```shell
ls -1
CHANGES
dataset_description.json
participants.json
participants.tsv
README
scans.json
sourcedata/
sub-001/
task-oneback_bold.json

tree sub-001/
sub-001/
├── anat
│   ├── sub-001_run-1_T1w.json
│   └── sub-001_run-1_T1w.nii.gz
├── func
│   ├── sub-001_task-oneback_run-01_bold.json
│   ├── sub-001_task-oneback_run-01_bold.nii.gz
│   └── sub-001_task-oneback_run-01_events.tsv
└── sub-001_scans.tsv
```
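
The same check can be scripted. The following is a minimal sketch in Python, assuming it is run from the root of the converted BIDS dataset. The `find_nifti_without_sidecar` helper is hypothetical (it is not part of `heudiconv`), and it only looks for a sidecar next to each NIfTI file, so it ignores the BIDS inheritance principle.

```python
from pathlib import Path


def find_nifti_without_sidecar(bids_root):
    """Return (nifti_files, missing) for the dataset at `bids_root`.

    `nifti_files` lists every *.nii.gz file found under the sub-* directories;
    `missing` lists those that have no JSON sidecar with the same stem
    in the same directory.
    """
    root = Path(bids_root)
    nifti_files = sorted(root.glob("sub-*/**/*.nii.gz"))
    missing = [
        nii
        for nii in nifti_files
        # e.g. "sub-001_run-1_T1w.nii.gz" -> "sub-001_run-1_T1w.json"
        if not nii.with_name(nii.name[: -len(".nii.gz")] + ".json").exists()
    ]
    return nifti_files, missing


if __name__ == "__main__":
    nifti_files, missing = find_nifti_without_sidecar(".")
    print(f"{len(nifti_files)} NIfTI file(s) found")
    for nii in missing:
        print(f"no JSON sidecar next to {nii}")
```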

## Associated provenance

In order to describe provenance records using BIDS Prov, we use:

* the `GeneratedBy` field of JSON sidecars, already existing in the BIDS specification;
* modality-agnostic files inside the `prov/` directory as follows:

```
.
├── prov
│   ├── prov-dcm2niix_act.prov.json
│   ├── prov-dcm2niix_base.prov.json
│   ├── prov-dcm2niix_ent.prov.json
│   ├── prov-dcm2niix_env.prov.json
│   └── prov-dcm2niix_soft.prov.json
└── sub-001
    ├── anat
    │   └── sub-001_run-1_T1w.json
    └── func
        └── sub-001_task-oneback_run-01_bold.json
```

## New features for BIDS / BIDS Prov

We introduce the following BIDS entity, which does not currently exist:

* `prov`
  * Full name: Provenance traces
  * Format: `prov-<label>`
  * Definition: A grouping of provenance traces. Defining multiple groups of provenance traces is appropriate when several processing steps have been performed on the data.

We introduce the following BIDS suffixes, which do not currently exist:

* `act`: the file describes BIDS Prov Activities for the group of provenance traces
* `soft`: the file describes BIDS Prov Software for the group of provenance traces
* `env`: the file describes BIDS Prov Environments for the group of provenance traces
* `ent`: the file describes BIDS Prov Entities for the group of provenance traces
* `base`: the file describes common BIDS Prov parameters for the group of provenance traces (version and context for BIDS Prov)

We use the `GeneratedBy` field of JSON sidecars to link to the Activities that created the file the sidecar refers to.
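
To illustrate the `prov-<label>` entity and the suffixes proposed in this section, here is a minimal sketch in Python that splits a provenance file name into its label and suffix. The `parse_prov_filename` helper and the regular expression are assumptions made for illustration (in particular, the label is assumed to be alphanumeric, as for other BIDS labels); they are not part of BIDS or BIDS Prov.

```python
import re

# Suffixes proposed in this example for provenance files.
PROV_SUFFIXES = ("act", "soft", "env", "ent", "base")

# Assumed pattern: prov-<label>_<suffix>.prov.json, with an alphanumeric label.
PROV_FILENAME = re.compile(
    r"^prov-(?P<label>[A-Za-z0-9]+)_(?P<suffix>"
    + "|".join(PROV_SUFFIXES)
    + r")\.prov\.json$"
)


def parse_prov_filename(filename):
    """Return (label, suffix), or None if the name does not follow the pattern."""
    match = PROV_FILENAME.match(filename)
    if match is None:
        return None
    return match.group("label"), match.group("suffix")


print(parse_prov_filename("prov-dcm2niix_act.prov.json"))  # ('dcm2niix', 'act')
print(parse_prov_filename("sub-001_run-1_T1w.json"))       # None
```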

## Merging JSON in a JSON-LD file and plotting graph

The Python script `code/merge_prov.py` merges all these provenance records into one JSON-LD graph.

```shell
mkdir prov/merged/
python code/merge_prov.py
```

This generates the JSON-LD graph `prov/merged/prov-heudiconv.prov.jsonld`. We then plotted the graph as a PNG file using this command:

In this example, we rely on the fact that nodes defined in the `prov/*.prov.jsonld` files have `bids::prov/` as base IRIs.

The `code/merge_prov.py` code is responsible for:
* merging the JSON provenance traces into the base JSON-LD graph;
* creating an `Entity` and linking it to the `Activity` described by the `GeneratedBy` field in the case of JSON sidecars.
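
The actual `code/merge_prov.py` is not reproduced here. As a rough sketch of the two responsibilities listed above, the Python code below works on in-memory dictionaries. Everything about the record layout is an assumption made for illustration: the `Records` key, the `Activities` / `Software` / `Entities` groups, the `Id` and `Label` fields, the `bids::` prefix for data files, and the helper names `merge_records` and `add_sidecar_entity`.

```python
import copy
import json

# Assumed mapping from file suffix to the record group in the merged graph.
# Environments are stored as Entities, as noted in the Limitations section.
SUFFIX_TO_RECORD_TYPE = {
    "act": "Activities",
    "soft": "Software",
    "ent": "Entities",
    "env": "Entities",
}


def merge_records(base, records_by_suffix):
    """Return a copy of `base` with every record appended under "Records"."""
    graph = copy.deepcopy(base)
    records = graph.setdefault("Records", {})
    for suffix, items in records_by_suffix.items():
        records.setdefault(SUFFIX_TO_RECORD_TYPE[suffix], []).extend(items)
    return graph


def add_sidecar_entity(graph, data_file, sidecar):
    """Add an Entity for `data_file`, linked to the Activity named in the
    sidecar's `GeneratedBy` field, and return that Entity."""
    entity = {
        "Id": "bids::" + data_file,  # assumed identifier scheme
        "Label": data_file.rsplit("/", 1)[-1],
        "GeneratedBy": sidecar["GeneratedBy"],
    }
    graph.setdefault("Records", {}).setdefault("Entities", []).append(entity)
    return entity


# Placeholder values: the real context and version come from the `base` file.
base = {"@context": "<BIDS Prov context>", "BIDSProvVersion": "<version>", "Records": {}}
activity = {"Id": "bids::prov/conversion-1", "Label": "Conversion"}  # hypothetical

graph = merge_records(base, {"act": [activity]})
add_sidecar_entity(
    graph,
    "sub-001/anat/sub-001_run-1_T1w.nii.gz",
    {"GeneratedBy": "bids::prov/conversion-1"},
)
print(json.dumps(graph, indent=2))
```

Working on a deep copy keeps the base graph untouched, so the same base can be reused for several groups of provenance traces.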

### Limitations

1. The `Environments` term is not defined in the current BIDS Prov context, hence we define environments as `Entities`.

2. Listing all the DICOM files used by the dcm2niix conversion steps would lower the readability of the JSON-LD provenance files. Therefore, we only listed the following directories as `Entities`:
   although the current version of the BIDS Prov specification does not allow directories as `Entities`.

3. In this example, the provenance of the JSON sidecar files is not described.

4. For now, what happens inside `heudiconv` is described as a single activity. We might want to describe the fact that it uses `dcm2niix` as conversion software.