Commit 7597be4

Adding todos
1 parent a150714 commit 7597be4

1 file changed: +72 -48 lines changed

bep028spec.md

Lines changed: 72 additions & 48 deletions
@@ -37,7 +37,8 @@ Provenance comes up in many different contexts in BIDS. This specification focus
 1. The raw conversion from DICOM images or other instrument native formats to BIDS layout, details of stimulus presentation and cognitive paradigms, and clinical and neuropsychiatric assessments, each come with their own details of provenance.
 2. In BIDS derivatives, the consideration of outputs requires knowledge of which inputs from the BIDS dataset were used together with what software was run in what environment and with what parameters.
 
-TODO: those above should be covered with their own example
+> [!CAUTION]
+> TODO: those above should be covered with their own example
 
 But provenance comes up in other contexts as well, which might be addressed at a later stage:

@@ -127,15 +128,15 @@ A skeleton for a BIDS-Prov JSON-LD file looks like this:
 <tr>
 <td><code>Records</code>
 </td>
-<td>REQUIRED. A list of provenance records (Activity, Entity, Agent, Environement), describing the provenance (see the <a href="#2-provenance-records">Provenance records</a> section below).
+<td>REQUIRED. A list of provenance records (Activity, Entity, Agent, Environement), describing the provenance (see the <a href="#2-provenance-records">2. Provenance records</a> section below).
 </td>
 </tr>
 </table>
 
 BIDS-Prov allows this skeleton to be splitted into several *JSON* files. This is described in sections [3.1.3 Suffixes](#3-1-3-suffixes)
 and [3.2 Provenance description levels](#3-2-provenance-description-levels).
 
-Using tools provided by BIDS-Prov ([5 Tools](#5-tools)), these JSON contents can be merged back to a structured JSON-LD as described above.
+Using tools provided by BIDS-Prov ([5. Tools](#5-tools)), these JSON contents can be merged back to a structured JSON-LD as described above.
 
 > [!NOTE]
 > Since the JSON-LD documents are graph objects, they can be aggregated using RDF tools without the need to apply the inheritance principle.
@@ -158,6 +159,11 @@ Activities represent the transformations that have been applied to the data. Eac
 ### 2.1 Activity
 Each Activity record is a JSON Object with the following fields:
 
+> [!CAUTION]
+> TODO: AssociatedWith and Used can also entirely describe the Agent (resp. Entity)
+> TODO: AssociatedWith and Used can be lists
+> TODO: Can an Activity represent a group of command lines ? If so, Command can be a list
+
 <table>
 <tr>
 <td><strong>Key name</strong>
@@ -198,7 +204,7 @@ Each Activity record is a JSON Object with the following fields:
 <tr>
 <td><code>Type</code>
 </td>
-<td>OPTIONAL. URI. A term from a controlled vocabulary that more specifically describes the activity.
+<td>OPTIONAL. URI. A term from a controlled vocabulary that more specifically describes the Activity.
 </td>
 </tr>
 <tr>
@@ -235,6 +241,10 @@ Here is an example of an Activity record:
 ### 2.2 Entity
 Each Entity record is a JSON Object with the following fields:
 
+> [!CAUTION]
+> TODO: GeneratedBy can also entirely describe the Activity
+> TODO: GeneratedBy can be a list
+
 <table>
 <tr>
 <td><strong>Key name</strong>
@@ -269,7 +279,7 @@ Each Entity record is a JSON Object with the following fields:
 <tr>
 <td><code>Type</code>
 </td>
-<td>OPTIONAL. URI. A term from a controlled vocabulary that more specifically describes the entity.
+<td>OPTIONAL. URI. A term from a controlled vocabulary that more specifically describes the Entity.
 </td>
 </tr>
 <tr>
@@ -297,6 +307,10 @@ Here is an example of an Entity record:
 ### 2.3 Agent (Optional)
 Agent records are OPTIONAL. If included, each Agent record is a JSON Object with the following fields:
 
+> [!CAUTION]
+> TODO: do we need a Type field for Agent?
+> TODO: shall we use `Software`, `Agent`, `SoftwareAgent` ?
+
 <table>
 <tr>
 <td><strong>Key name</strong>
@@ -343,6 +357,10 @@ Here is an example of an Agent record:
 ### 2.4 Environment (Optional)
 Environment records are OPTIONAL. If included, each Environment record is a JSON Object with the following fields:
 
+> [!CAUTION]
+> TODO: do we need a Type field for Environment?
+> TODO: Environment not currently defined in the BIDS-Prov context
+
 <table>
 <tr>
 <td><strong>Key name</strong>
@@ -545,7 +563,8 @@ If the `SidecarGenearatedBy` field is not defined, BIDS-Prov assumes that the si

 No other field is allowed to describe provenance inside sidecar JSONs.
 
-TODO: where are the @context and BIDSProvVersion ?
+> [!CAUTION]
+> TODO: where are the @context and BIDSProvVersion ?
 
 #### 3.2.2 Subdirectories level provenance

@@ -581,7 +600,8 @@ Here is an example dataset tree:
 └─ dataset_description.json
 ```
 
-TODO: where are the @context and BIDSProvVersion ?
+> [!CAUTION]
+> TODO: where are the @context and BIDSProvVersion ?
 
 #### 3.2.3 Dataset level provenance - `prov/` directory

@@ -648,8 +668,8 @@ Here is an example of a `GeneratedByProv` field containing the IRI of an `Entity
 "GeneratedByProv": "bids::#conversion-00f3a18f"
 }
 ```
-
-TODO: where are the @context and BIDSProvVersion ?
+> [!CAUTION]
+> TODO: where are the @context and BIDSProvVersion ?
 
 ## 4. Examples

@@ -755,37 +775,41 @@ Simple answer : NO. BIDS-prov has been designed for provenance records to be sha
 If you have Activity 1 and Entity 1 defined in a provenance file called init.json, this file can look like the following
 
 ```JSON
-"prov:Activity": [
+"Activity": [
 {
-"@id": "niiri:init",
-"label": "Do some init",
-"command": "python -m my_module.init --weights '[0, 1]'",
-"parameters": {
+"Id": "niiri:init",
+"Label": "Do some init",
+"Command": "python -m my_module.init --weights '[0, 1]'",
+"Parameters": {
 "weights" : [0, 1]
 },
-"startedAtTime": "2020-10-10T10:00:00",
-"used": "niiri:bids_data1"
-},
+"StartedAtTime": "2020-10-10T10:00:00",
+"Used": "niiri:bids_data1"
+}
 ],
 
-"prov:Entity": [
-{"@id": "niiri:bids_data1", "label": "Bids dataset 1", "prov:atLocation": "data/bids_root"}
+"Entity": [
+{
+"Id": "niiri:bids_data1",
+"Label": "Bids dataset 1",
+"AtLocation": "data/bids_root"
+}
 ]
 ```
 
 Now if we want `Entity 2` defined in `preproc.json` to also have a "wasGeneratedBy" field referencing "Activity 1" from `init.json`, we can simply write the following
 
 ```JSON
-"prov:Entity": [
+"Entity": [
 {
-"@id": "niiri:bids_data1",
-"label": "Bids dataset 1",
-"wasGeneratedBy": "niiri:init"
+"Id": "niiri:bids_data1",
+"Label": "Bids dataset 1",
+"GeneratedBy": "niiri:init"
 }
 ]
 ```
 
-Needless to say, both `init.json` and `preproc.json` must have the reference the same context file (in a "@context" field at the very top)
+Needless to say, both `init.json` and `preproc.json` must have the reference the same context file (in a `"@context"` field at the very top)
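
Reviewer sketch (not part of the commit): the cross-file reference above only resolves if both files declare the same `"@context"`. The Python below is a minimal illustration of that idea, not the BIDS-Prov merge tooling. It assumes each file is a JSON object whose record lists sit under keys such as `"Activity"` and `"Entity"`, as in the fragments above; the function name `merge_records` and the context URL are hypothetical.

```python
import json


def merge_records(*documents):
    """Merge provenance documents that declare the same "@context".

    Records of the same type that share an "Id" are combined into one
    record; on conflicting keys the later document wins.
    """
    contexts = {json.dumps(doc.get("@context"), sort_keys=True) for doc in documents}
    if len(contexts) > 1:
        raise ValueError("all documents must reference the same @context")

    merged = {}
    for doc in documents:
        for record_type, records in doc.items():
            # Skip the context itself and any other non-list field.
            if record_type == "@context" or not isinstance(records, list):
                continue
            by_id = merged.setdefault(record_type, {})
            for record in records:
                by_id.setdefault(record["Id"], {}).update(record)

    result = {}
    if documents:
        result["@context"] = documents[0].get("@context")
    for record_type, by_id in merged.items():
        result[record_type] = list(by_id.values())
    return result


# Hypothetical contents of init.json and preproc.json, wrapped in JSON objects.
CONTEXT = "https://example.org/bids-prov-context.json"  # placeholder URL (assumption)
init = {
    "@context": CONTEXT,
    "Activity": [{"Id": "niiri:init", "Label": "Do some init", "Used": "niiri:bids_data1"}],
    "Entity": [{"Id": "niiri:bids_data1", "Label": "Bids dataset 1", "AtLocation": "data/bids_root"}],
}
preproc = {
    "@context": CONTEXT,
    "Entity": [{"Id": "niiri:bids_data1", "Label": "Bids dataset 1", "GeneratedBy": "niiri:init"}],
}

merged = merge_records(init, preproc)
```

After the merge, the single `niiri:bids_data1` record carries both `AtLocation` (from `init`) and `GeneratedBy` (from `preproc`). Documents with differing contexts are rejected rather than merged.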

 #### I want to track provenance for subject-level analysis, should I declare a prov file per subject ?

@@ -794,16 +818,16 @@ You can create a single prov file for every subject. Yet another option is to us
 Files for different subjects usually share common prefixes and extensions.
 
 ```JSON
-"prov:Entity": [
+"Entity": [
 {
-"@id": "niiri:hfdhbfd",
-"label": "anat raw files",
-"prov:atLocation": "sub-*/anat/sub-*_T1w.nii.gz"
+"Id": "niiri:hfdhbfd",
+"Label": "anat raw files",
+"AtLocation": "sub-*/anat/sub-*_T1w.nii.gz"
 },
 {
-"@id": "niiri:fdhbfd",
-"label": "func raw files",
-"prov:atLocation": "sub-*/func/sub-*_task-tonecounting_bold.nii.gz"
+"Id": "niiri:fdhbfd",
+"Label": "func raw files",
+"AtLocation": "sub-*/func/sub-*_task-tonecounting_bold.nii.gz"
 }
 ]
 ```
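
Reviewer sketch (not part of the commit): the spec text shown here does not say how the `*` in `AtLocation` is to be interpreted. Assuming shell-style wildcards, a reader could resolve a concrete file path to the matching Entity records as follows. The helper name `entities_for` is hypothetical, and note that `fnmatchcase` lets `*` match `/` as well, which may be looser than intended.

```python
from fnmatch import fnmatchcase

# Entity records copied from the example above (Label omitted for brevity).
entities = [
    {"Id": "niiri:hfdhbfd", "AtLocation": "sub-*/anat/sub-*_T1w.nii.gz"},
    {"Id": "niiri:fdhbfd", "AtLocation": "sub-*/func/sub-*_task-tonecounting_bold.nii.gz"},
]


def entities_for(path, records):
    """Return the Ids of the records whose AtLocation pattern matches path."""
    return [r["Id"] for r in records if fnmatchcase(path, r["AtLocation"])]


anat_ids = entities_for("sub-01/anat/sub-01_T1w.nii.gz", entities)
func_ids = entities_for("sub-02/func/sub-02_task-tonecounting_bold.nii.gz", entities)
```

Here `anat_ids` contains only `niiri:hfdhbfd` and `func_ids` only `niiri:fdhbfd`; a path that fits neither pattern yields an empty list.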
@@ -812,41 +836,41 @@ Files for different subjects usually share common prefixes and extensions.

 An example of this can be [fMRIPrep](https://fmriprep.org/en/stable/index.html), which can be launched as a docker container.
 
-The most simplistic way you can think of is to have this container "black-boxed" in your workflow. You basically record the calling of this container (`command` section) and the output (see the outputs section from fMRIPrep)
+The most simplistic way you can think of is to have this container "black-boxed" in your workflow. You basically record the calling of this container (`Command` section) and the output (see the outputs section from fMRIPrep)
 
 ```JSON
-"prov:Activity": [
+"Activity": [
 {
-"@id": "niiri:fMRIPrep1",
-"label": "fMRIPrep step",
-"command": "fmriprep data/bids_root/ out/ participant -w work/",
-"parameters": {
+"Id": "niiri:fMRIPrep1",
+"Label": "fMRIPrep step",
+"Command": "fmriprep data/bids_root/ out/ participant -w work/",
+"Parameters": {
 "bids_dir" : "data/bids_root",
 "output_dir" : "out/",
 "anaysis_level" : "participant"
 },
-"used": "niiri:bids_data1"
+"Used": "niiri:bids_data1"
 }
 ],
-"prov:Entity": [
+"Entity": [
 {
-"@id": "niiri:bids_data1",
-"label": "Bids dataset 1",
-"prov:atLocation": "data/bids_root"
+"Id": "niiri:bids_data1",
+"Label": "Bids dataset 1",
+"AtLocation": "data/bids_root"
 },
 {
-"@id": "niiri:fmri_prep_output1",
-"label": "FMRI prep output 1",
-"prov:atLocation": "out/",
-"generatedAt": "2019-10-10T10:00:00",
-"wasGeneratedBy": "niiri:fMRIPrep1"
+"Id": "niiri:fmri_prep_output1",
+"Label": "FMRI prep output 1",
+"AtLocation": "out/",
+"GeneratedAt": "2019-10-10T10:00:00",
+"GeneratedBy": "niiri:fMRIPrep1"
 }
 ]
 ```
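
Reviewer sketch (not part of the commit): in the black-box example above, `Used` and `GeneratedBy` are plain string references to another record's `Id`. The hypothetical helper below lists references that point at no declared record. It assumes single string values only; the TODOs added in this commit suggest these fields may later also be lists or embedded records, which this sketch does not handle.

```python
def dangling_references(document):
    """Return (record Id, field, target) triples whose target Id is not declared."""
    declared = {
        record["Id"]
        for key in ("Activity", "Entity")
        for record in document.get(key, [])
    }
    links = (("Activity", "Used"), ("Entity", "GeneratedBy"))
    missing = []
    for key, field in links:
        for record in document.get(key, []):
            target = record.get(field)
            if target is not None and target not in declared:
                missing.append((record["Id"], field, target))
    return missing


# Ids and links taken from the fMRIPrep example above (other fields omitted).
records = {
    "Activity": [{"Id": "niiri:fMRIPrep1", "Used": "niiri:bids_data1"}],
    "Entity": [
        {"Id": "niiri:bids_data1"},
        {"Id": "niiri:fmri_prep_output1", "GeneratedBy": "niiri:fMRIPrep1"},
    ],
}

problems = dangling_references(records)
```

For the records as written, `problems` is empty. Dropping the `Activity` list would instead report the `GeneratedBy` link of `niiri:fmri_prep_output1` as unresolved.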

 #### What if I have a group of tasks, belonging to a subgroup of tasks ?
 
-You can the `prov-O` isPartOf relationship to add an extra link to you activity
+You can use the PROV-O `isPartOf` relationship to add an extra link to you activity
 
 ```JSON
 "prov:Activity": [
