docs/explanations/curator_data_model.md: 53 additions & 37 deletions
@@ -15,8 +15,8 @@ The CSV data model described in this tutorial formalizes this structure:
 
 Here is the Patient described above represented as a CSV data model:
 
-| Attribute | DependsOn |
-|---|---|
+| Attribute | DependsOn |
+|-----------|---------------------|
 | Patient | "Age, Gender, Name" |
 | Age ||
 | Gender ||
@@ -48,9 +48,20 @@ The end goal is to create a JSON Schema that can be used in Curator. A JSON Sche
 
 Note: Individual columns are covered later on this page.
 
+These columns must be present in your CSV data model:
+
+- `Attribute`
+- `DependsOn`
+- `Description`
+- `Valid Values`
+- `Required`
+- `Parent`
+- `Validation Rules`
+
 Defining data types:
 
 - Put a unique data type name in the `Attribute` column.
+- Put the value `DataType` in the `Parent` column.
 - List at least one attribute in the `DependsOn` column (comma-separated).
 - Optionally add a description to the `Description` column.
 
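The required-column list added in the hunk above can be checked mechanically before a data model is handed to any tooling. The following is a minimal sketch, not part of Curator itself: the helper name `missing_columns` is hypothetical, and the sketch assumes the data model is a comma-delimited CSV whose first row is the header.

```python
import csv
import io

# Column names copied from the list added in the hunk above.
REQUIRED_COLUMNS = [
    "Attribute",
    "DependsOn",
    "Description",
    "Valid Values",
    "Required",
    "Parent",
    "Validation Rules",
]


def missing_columns(csv_text: str) -> list[str]:
    """Return the required columns that are absent from the CSV header row.

    Hypothetical helper for illustration; it only inspects the first row.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # an empty file yields an empty header
    present = {name.strip() for name in header}
    return [name for name in REQUIRED_COLUMNS if name not in present]


# A data model that only has the first three columns.
example = 'Attribute,DependsOn,Description\nPatient,"Age, Gender, Name",\n'
print(missing_columns(example))
# prints ['Valid Values', 'Required', 'Parent', 'Validation Rules']
```

Returning the missing names in the documented order, rather than raising on the first one, lets a caller report every problem in a single pass.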
@@ -79,8 +90,8 @@ Set of possible values for the current attribute. This attribute will be an enum
 Data Model:
 
 | Attribute | DependsOn | Valid Values |
-|---|---|---|
-| Patient | "Gender" ||
+|-----------|-----------|-----------------------|
+| Patient | "Gender" ||
 | Gender || "Female, Male, Other" |
 
 JSON Schema output:
@@ -107,8 +118,8 @@ Note: Leaving this empty is the equivalent of `False`.
 Data Model:
 
 | Attribute | DependsOn | Required |
-|---|---|---|
-| Patient | "Gender, Age" ||
+|-----------|----------------|----------|
+| Patient | "Gender, Age" ||
 | Gender || True |
 | Age || False |
 
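The note in the hunk header above says that an empty `Required` cell is equivalent to `False`. A small sketch of that normalization follows. The function name `parse_required` is hypothetical, and treating the cell text case-insensitively and rejecting any other value are assumptions of this sketch, not documented Curator behavior.

```python
def parse_required(cell: str) -> bool:
    """Interpret a `Required` cell; an empty cell counts as False.

    Hypothetical helper. Case-insensitive matching and the ValueError for
    unexpected text are assumptions made for this illustration.
    """
    value = cell.strip().lower()
    if value in ("", "false"):
        return False
    if value == "true":
        return True
    raise ValueError(f"Unexpected value in Required column: {cell!r}")


# The three rows from the table above: Patient (empty), Gender, Age.
rows = {"Patient": "", "Gender": "True", "Age": "False"}
required_attributes = [name for name, cell in rows.items() if parse_required(cell)]
print(required_attributes)
# prints ['Gender']
```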
@@ -131,6 +142,10 @@ JSON Schema output:
 }
 ```
 
+### Parent
+
+This is mostly a remnant of the Schematic data model. It is currently used to find all the data types in the data model. Put the value `DataType` in this column if this row is a data type. Other values are currently ignored.
+
 ### columnType
 
 The data type of this attribute. See [type](https://json-schema.org/understanding-json-schema/reference/type).
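The `Parent` section added above describes how data types are located: rows whose `Parent` cell is `DataType` are data types, and other values are ignored. A minimal sketch of that lookup is shown below. The helper name `find_data_types` is hypothetical, and the exact-match comparison after stripping whitespace is an assumption; the actual matching rules used by Curator are not stated in this diff.

```python
import csv
import io


def find_data_types(csv_text: str) -> list[str]:
    """Return the `Attribute` names of rows marked as data types.

    Hypothetical helper: a row counts as a data type when its `Parent`
    cell equals "DataType" (whitespace stripped); any other value is ignored.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row["Attribute"]
        for row in reader
        # DictReader fills missing trailing cells with None, hence `or ""`.
        if (row.get("Parent") or "").strip() == "DataType"
    ]


example = (
    "Attribute,DependsOn,Parent\n"
    'Patient,"Age, Gender",DataType\n'
    "Age,,\n"
    "Gender,,SomethingElse\n"
)
print(find_data_types(example))
# prints ['Patient']
```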
+| Patient | "Age, Weight, Health Score" |||| DataType|
+| Age || integer | 0 | 120||
+| Weight || number | 0.0 |||
+| Health Score || number | 0.0 | 1.0||
 
 JSON Schema output:
 
@@ -301,9 +316,9 @@ JSON Schema output:
 
 ### Validation Rules (deprecated)
 
-This is a remnant from Schematic. It is still used (for now) to translate certain validation rules to other JSON Schema keywords.
+This is a remnant from Schematic. The column is still required, and it is still used (for now) to translate certain validation rules to other JSON Schema keywords.
 
-If you are starting a new data model, DO NOT use this column.
+If you are starting a new data model, DO NOT fill out this column; leave it blank.
 
 If you have an existing data model using any of the following validation rules, follow these instructions to update it:
 
@@ -315,26 +330,27 @@ If you have an existing data model using any of the following validation rules,
 
 ## Conditional dependencies
 
-The `DependsOn` and `Valid Values` columns can be used together to flexibly define conditional logic for determining the relevant attributes for a data type.
+The `DependsOn`, `Valid Values` and `Parent` columns can be used together to flexibly define conditional logic for determining the relevant attributes for a data type.
 
 In this example we have the `Patient` data type. The `Patient` can be diagnosed as healthy or with cancer. For Patients with cancer we also want to collect info about their cancer type, and any cancers in their family history.