Commit 5a357dd

Merge pull request #2162 from effigies/feat/column-unification
[ENH] Deprecate 89+ string for default age column, increase expressiveness of column definitions in sidecar files
2 parents 631ab88 + 1162a5b commit 5a357dd

4 files changed: +76 -12 lines changed

src/common-principles.md

Lines changed: 8 additions & 0 deletions
@@ -545,16 +545,21 @@ and a guide for using macros can be found at
          "RECOMMENDED",
          "The description of the column.",
       ),
+      "Format": "OPTIONAL",
       "Levels": "RECOMMENDED",
       "Units": "RECOMMENDED",
       "Delimiter": "OPTIONAL",
       "TermURL": "RECOMMENDED",
       "HED": "OPTIONAL",
+      "Maximum": "OPTIONAL",
+      "Minimum": "OPTIONAL",
    }
 ) }}
 
 Please note that while both `Units` and `Levels` are RECOMMENDED, typically only one
 of these two fields would be specified for describing a single TSV file column.
+In the absence of `Format`, `Units` implies the column contains numeric values,
+and `Levels` implies the column contains strings.
 
 Example:
 
@@ -563,6 +568,7 @@ Example:
     "test": {
         "LongName": "Education level",
         "Description": "Education level, self-rated by participant",
+        "Format": "integer",
         "Levels": {
             "1": "Finished primary school",
             "2": "Finished secondary school",
@@ -572,7 +578,9 @@ Example:
     },
     "bmi": {
         "LongName": "Body mass index",
+        "Format": "number",
         "Units": "kg/m^2",
+        "Minimum": 0,
         "TermURL": "https://purl.bioontology.org/ontology/SNOMEDCT/60621009"
     }
 }
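
The inference rule added to `common-principles.md` above can be sketched in Python. This is a minimal illustration, not validator code: the function name `effective_format` is hypothetical, mapping "numeric values" to the `number` format and "strings" to the `string` format is an assumption, and so is checking `Units` before `Levels` when both are present.

```python
from typing import Optional


def effective_format(definition: dict) -> Optional[str]:
    """Return the declared or implied value format of a sidecar column definition.

    Hypothetical helper: an explicit "Format" wins; otherwise "Units" is taken
    to imply numeric values and "Levels" to imply strings (assumed mapping to
    the "number" and "string" formats).
    """
    if "Format" in definition:
        return definition["Format"]
    if "Units" in definition:
        return "number"
    if "Levels" in definition:
        return "string"
    return None  # nothing declared, nothing implied


# Abbreviated column definitions in the style of the example sidecar.
sidecar = {
    "test": {"Format": "integer", "Levels": {"1": "Finished primary school"}},
    "bmi": {"Units": "kg/m^2"},
    "handedness": {"Levels": {"left": "Left-handed"}},
}
formats = {name: effective_format(defn) for name, defn in sidecar.items()}
print(formats)
```

The `test` column shows why an explicit `Format` matters: its `Levels` keys look like strings, but the column is declared as `integer`.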

src/schema/objects/columns.yaml

Lines changed: 18 additions & 12 deletions
@@ -38,12 +38,14 @@ age:
   description: |
     Numeric value in years (float or integer value).
 
-    It is recommended to tag participant ages that are 89 or higher as 89+,
-    for privacy purposes.
+    For privacy purposes, participant ages should be capped at 89.
+    Using "89+" for ages above 88 is DEPRECATED.
   definition: {
     "LongName": "Subject age",
     "Description": "Subject age in postnatal years",
+    "Format": "number",
     "Units": "year",
+    "Maximum": 89,
   }
 cardiac:
   name: cardiac
@@ -52,6 +54,7 @@ cardiac:
     continuous pulse measurement
   definition: {
     "Description": "continuous pulse measurement",
+    "Format": "number",
     "Units": "mV"
   }
 channel:
@@ -101,8 +104,7 @@ detector_type:
   display_name: Detector Type
   description: |
     The type of detector. Only to be used if the field `DetectorType` in `*_nirs.json` is set to `mixed`.
-  anyOf:
-    - type: string
+  type: string
 derived_from:
   name: derived_from
   display_name: Derived from
@@ -176,15 +178,14 @@ group__channel:
     Which group of channels (grid/strip/seeg/depth) this channel belongs to.
     This is relevant because one group has one cable-bundle and noise can be shared.
     This can be a name or number.
-  anyOf:
-    - type: string
-    - type: number
+  type: string
 handedness:
   name: handedness
   display_name: Subject handedness
   definition: {
     "LongName": "Subject handedness",
     "Description": "String value indicating one of \"left\", \"right\", or \"ambidextrous\".",
+    "Format": "string",
     "Levels": {
       "left": "Left-handed",
       "l": "Left-handed",
@@ -362,7 +363,8 @@ pathology:
     The pathology may be specified in either `samples.tsv` or
     `sessions.tsv`, depending on whether the pathology changes over time.
   definition: {
-    "Description": "Description of the pathology of the sample or type of control."
+    "Description": "Description of the pathology of the sample or type of control.",
+    "Format": "string",
   }
 participant_id:
   name: participant_id
@@ -418,6 +420,7 @@ respiratory:
     continuous breathing measurement
   definition: {
     "Description": "continuous measurements by respiration belt",
+    "Format": "number",
     "Units": "mV"
   }
 response_time:
@@ -483,6 +486,7 @@ sex:
   definition: {
     "LongName": "sex",
     "Description": "String value indicating phenotypical sex.",
+    "Format": "string",
     "Levels": {
       "F": "Female",
       "FEMALE": "Female",
@@ -537,8 +541,7 @@ source__optodes:
   display_name: Source type
   description: |
     The type of source. Only to be used if the field `SourceType` in `*_nirs.json` is set to `mixed`.
-  anyOf:
-    - type: string
+  type: string
 species:
   name: species
   display_name: Species
@@ -550,7 +553,8 @@ species:
     `homo sapiens`.
   definition: {
     "Description":
-      "binomial species name from the NCBI Taxonomy (https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi)"
+      "Latin binomial species name from the NCBI Taxonomy (https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi).",
+    "Format": "string",
   }
 status:
   name: status
@@ -592,7 +596,8 @@ strain:
     For species different from `homo sapiens`, string value indicating
     the strain of the species, for example: `C57BL/6J`.
   definition: {
-    "Description": "name of the strain of the species"
+    "Description": "Name of the strain of the species.",
+    "Format": "string",
   }
 strain_rrid:
   name: strain_rrid
@@ -635,6 +640,7 @@ trigger:
     continuous measurement of the scanner trigger signal
   definition: {
     "Description": "continuous measurement of the scanner trigger signal",
+    "Format": "number",
     "Units": "arbitrary"
   }
 # type column in channels.tsv files
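
For datasets that still carry the deprecated `89+` string, the new `age` wording suggests a simple migration. The sketch below is one possible approach under stated assumptions: `normalize_age` is a hypothetical helper, entries are handled as raw TSV strings, `n/a` is left untouched as the missing-value marker, and any numeric age at or above the cap is rewritten to the cap itself.

```python
def normalize_age(raw: str, cap: int = 89) -> str:
    """Rewrite one age entry so it satisfies the capped, numeric definition.

    Hypothetical helper. The default cap mirrors the "Maximum": 89 entry
    in the age column definition.
    """
    entry = raw.strip()
    if entry == "n/a":        # missing value, nothing to cap
        return entry
    if entry == "89+":        # deprecated privacy tag becomes the numeric cap
        return str(cap)
    if float(entry) >= cap:   # raises ValueError on non-numeric entries
        return str(cap)
    return entry              # already within bounds, keep as written


ages = ["34.5", "89+", "93", "n/a"]
migrated = [normalize_age(a) for a in ages]
print(migrated)
```

Entries below the cap are returned unchanged, so the original precision of a value such as `34.5` is preserved.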

src/schema/objects/metadata.yaml

Lines changed: 45 additions & 0 deletions
@@ -1199,6 +1199,39 @@ FlipAngle:
   unit: degree
   exclusiveMinimum: 0
   maximum: 360
+Format:
+  name: Format
+  display_name: Value format
+  description: |
+    Permitted formats for values in the described column.
+  type: string
+  # Keep synced with objects.formats
+  enum:
+    # Type formats
+    - string
+    - number
+    - integer
+    - boolean
+    # Numeric/alphanumeric strings
+    - index
+    - label
+    # Dates/times
+    - date
+    - datetime
+    - time
+    # Units
+    - unit
+    # URIs
+    - uri
+    - rrid
+    - bids_uri
+    # Paths
+    - dataset_relative
+    - file_relative
+    - participant_relative
+    - stimuli_relative
+    # Miscellaneous
+    - hed_version
 FrameAcquisitionDuration:
   name: FrameAcquisitionDuration
   display_name: Frame Acquisition Duration
@@ -2153,6 +2186,12 @@ MaxMovement:
     as measured by the head localization coils (for example, `4.8`).
   type: number
   unit: mm
+Maximum:
+  name: Maximum
+  display_name: Maximum value
+  description: |
+    Maximum value a column entry is permitted to have.
+  type: number
 MeasurementToolMetadata:
   name: MeasurementToolMetadata
   display_name: Measurement Tool Metadata
@@ -2191,6 +2230,12 @@ MetaboliteRecoveryCorrectionApplied:
     If `true`, the `hplc_recovery_fractions` column MUST be present in the
     corresponding `*_blood.tsv` file.
   type: boolean
+Minimum:
+  name: Minimum
+  display_name: Minimum value
+  description: |
+    Minimum value a column entry is permitted to have.
+  type: number
 MiscChannelCount:
   name: MiscChannelCount
   display_name: Misc Channel Count
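
The new `Minimum` and `Maximum` fields describe bounds on column entries. A consumer might apply them roughly as sketched below. The helper name `out_of_range` is hypothetical, the bounds are assumed to be inclusive (consistent with "value a column entry is permitted to have"), and `n/a` entries are skipped as missing values.

```python
def out_of_range(values, definition):
    """Return the entries that violate the definition's Minimum/Maximum.

    Hypothetical helper: a missing bound is treated as unbounded, and both
    bounds are assumed to be inclusive.
    """
    lower = definition.get("Minimum", float("-inf"))
    upper = definition.get("Maximum", float("inf"))
    return [
        v for v in values
        if v != "n/a" and not (lower <= float(v) <= upper)
    ]


# Column definitions taken from the examples in this commit.
age_definition = {"Format": "number", "Units": "year", "Maximum": 89}
bmi_definition = {"Format": "number", "Units": "kg/m^2", "Minimum": 0}

bad_ages = out_of_range(["34.5", "89", "93", "n/a"], age_definition)
bad_bmis = out_of_range(["21.3", "-1", "0"], bmi_definition)
print(bad_ages, bad_bmis)
```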

tools/schemacode/src/bidsschematools/tests/test_schema.py

Lines changed: 5 additions & 0 deletions
@@ -182,6 +182,11 @@ def test_formats(schema_obj):
     )
 
 
+def test_format_consistency(schema_obj):
+    """Test that the "Format" field is consistent with objects.formats."""
+    assert set(schema_obj.objects.metadata.Format.enum) == schema_obj.objects.formats.keys()
+
+
 def test_dereferencing():
     orig = {
         "ReferencedObject": {
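
The comparison in `test_format_consistency` relies on Python treating a dictionary keys view as set-like, so a `set` can be compared to it directly. The standalone sketch below uses plain stand-in data rather than the real schema object: `formats` and `format_enum` are hypothetical, abbreviated placeholders for `objects.formats` and `objects.metadata.Format.enum`.

```python
# Stand-ins for the schema objects (hypothetical, abbreviated).
formats = {"string": {}, "number": {}, "integer": {}}
format_enum = ["string", "number", "integer"]

# A set compares equal to a keys view holding the same elements.
in_sync = set(format_enum) == formats.keys()

# Simulate drift: a format is defined but never added to the enum.
formats["boolean"] = {}
still_in_sync = set(format_enum) == formats.keys()

# Keys views support set difference, which pinpoints the missing entry.
missing_from_enum = formats.keys() - set(format_enum)
print(in_sync, still_in_sync, missing_from_enum)
```

Because the check is a set equality, it fails when either side gains an entry the other lacks, which is the behaviour the "Keep synced with objects.formats" comment asks for.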
