Sage-Bionetworks · andrewelamb · Jan 22, 2026
@@ -55,13 +55,13 @@ These columns must be present in your CSV data model:
 - [Description](#description)
 - [Valid Values](#valid-values)
 - [Required](#required)
-- [Parent](#parent)
 - [Validation Rules](#validation-rules)
+- [IsTemplate](#istemplate)
 
 Defining data types:
 
 - Put a unique data type name in the `Attribute` column.
-- Put the value `DataType` in the `Parent` column.
+- Put the value `True` in the `IsTemplate` column.
 - List at least one attribute in the `DependsOn` column (comma-separated).
 - Optionally add a description to the `Description` column.
 
@@ -142,9 +142,9 @@ JSON Schema output:
 }
 ```
 
-### Parent
+### IsTemplate
 
-Put the value `DataType` in this column if this row is a data type. Other values are currently ignored. It is currently used to find all the data types in the data model.
+Put the value `True` (case-insensitive) in this column if this row is a data type (template). Empty cells and any other text are treated as `False`. This column is used to find all the data types in the data model.
 
 ### columnType
 
@@ -164,11 +164,11 @@ Must be one of:
 
 Data Model:
 
-| Attribute | DependsOn         | columnType  | Parent   |
-|-----------|-------------------|-------------|----------|
-| Patient   | "Gender, Hobbies" |             | DataType |
-| Gender    |                   | string      |          |
-| Hobbies   |                   | string_list |          |
+| Attribute | DependsOn         | columnType  | IsTemplate |
+|-----------|-------------------|-------------|------------|
+| Patient   | "Gender, Hobbies" |             | True       |
+| Gender    |                   | string      |            |
+| Hobbies   |                   | string_list |            |
 
 JSON Schema output:
 
@@ -213,11 +213,11 @@ The format of this attribute. See [format](https://json-schema.org/understanding
 
 Data Model:
 
-| Attribute       | DependsOn            | columnType  | Format | Parent   |
-|-----------------|----------------------|-------------|--------|----------|
-| Patient         | "Gender, Birth Date" |             |        | DataType |
-| Gender          |                      | string      |        |          |
-| Birth Date      |                      | string      | date   |          |
+| Attribute       | DependsOn            | columnType  | Format | IsTemplate |
+|-----------------|----------------------|-------------|--------|------------|
+| Patient         | "Gender, Birth Date" |             |        | True       |
+| Gender          |                      | string      |        |            |
+| Birth Date      |                      | string      | date   |            |
 
 JSON Schema output:
 
@@ -246,11 +246,11 @@ The regex pattern this attribute must match. The type of this attribute must be
 
 Data Model:
 
-| Attribute | DependsOn     | columnType  | Pattern | Parent   |
-|-----------|---------------|-------------|---------|----------|
-| Patient   | "Gender, ID"  |             |         | DataType |
-| Gender    |               | string      |         |          |
-| ID        |               | string      | [a-f]   |          |
+| Attribute | DependsOn     | columnType  | Pattern | IsTemplate |
+|-----------|---------------|-------------|---------|------------|
+| Patient   | "Gender, ID"  |             |         | True       |
+| Gender    |               | string      |         |            |
+| ID        |               | string      | [a-f]   |            |
 
 JSON Schema output:
 
@@ -279,12 +279,12 @@ The range that this attribute's numeric values must fall within. The type of thi
 
 Data Model:
 
-| Attribute    | DependsOn                   | columnType  | Minimum | Maximum | Parent   |
-|--------------|-----------------------------|-------------|---------|---------|----------|
-| Patient      | "Age, Weight, Health Score" |             |         |         | DataType |
-| Age          |                             | integer     | 0       | 120     |          |
-| Weight       |                             | number      | 0.0     |         |          |
-| Health Score |                             | number      | 0.0     | 1.0     |          |
+| Attribute    | DependsOn                   | columnType  | Minimum | Maximum | IsTemplate |
+|--------------|-----------------------------|-------------|---------|---------|------------|
+| Patient      | "Age, Weight, Health Score" |             |         |         | True       |
+| Age          |                             | integer     | 0       | 120     |            |
+| Weight       |                             | number      | 0.0     |         |            |
+| Health Score |                             | number      | 0.0     | 1.0     |            |
 
 JSON Schema output:
 
@@ -334,23 +334,23 @@ If you have an existing data model using any of the following validation rules,
 
 ## Conditional dependencies
 
-The `DependsOn`, `Valid Values` and `Parent` columns can be used together to flexibly define conditional logic for determining the relevant attributes for a data type.
+The `DependsOn`, `Valid Values` and `IsTemplate` columns can be used together to flexibly define conditional logic for determining the relevant attributes for a data type.
 
 In this example we have the `Patient` data type. The `Patient` can be diagnosed as healthy or with cancer. For Patients with cancer we also want to collect info about their cancer type, and any cancers in their family history.
 
 Data Model:
 
-| Attribute      | DependsOn                     | Valid Values        | Required | columnType  | Parent   |
-|----------------|-------------------------------|---------------------|----------|-------------|----------|
-| Patient        | "Diagnosis"                   |                     |          |             | DataType |
-| Diagnosis      |                               | "Healthy, Cancer"   | True     | string      |          |
-| Cancer         | "Cancer Type, Family History" |                     |          |             |          |
-| Cancer Type    |                               | "Brain, Lung, Skin" | True     | string      |          |
-| Family History |                               | "Brain, Lung, Skin" | True     | string_list |          |
+| Attribute      | DependsOn                     | Valid Values        | Required | columnType  | IsTemplate |
+|----------------|-------------------------------|---------------------|----------|-------------|------------|
+| Patient        | "Diagnosis"                   |                     |          |             | True       |
+| Diagnosis      |                               | "Healthy, Cancer"   | True     | string      |            |
+| Cancer         | "Cancer Type, Family History" |                     |          |             |            |
+| Cancer Type    |                               | "Brain, Lung, Skin" |          | string      |            |
+| Family History |                               | "Brain, Lung, Skin" |          | string_list |            |
 
 To demonstrate this, see the above example with the `Patient` data type and the `Cancer` attribute:
 
-- `Patient` is a data type, but `Cancer` is not, as defined by the `Parent` column.
+- `Patient` is a data type, but `Cancer` is not, as defined by the `IsTemplate` column.
 - `Diagnosis` is an attribute of `Patient`.
 - `Diagnosis` has `Valid Values` of `Healthy` and `Cancer`.
 - `Cancer` is also an attribute with its own `DependsOn` list.
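
As a rough illustration of how the `IsTemplate` column separates data types from ordinary attributes, the sketch below reads the example model with the standard library and keeps the rows whose `IsTemplate` cell is `True`. The inline CSV and the `find_templates` helper are hypothetical and only mirror the table above; they are not part of the library.

```python
import csv
import io

# Hypothetical CSV data model mirroring the Patient/Cancer example above.
CSV_MODEL = '''Attribute,DependsOn,Valid Values,Required,columnType,IsTemplate
Patient,"Diagnosis",,,,True
Diagnosis,,"Healthy, Cancer",True,string,
Cancer,"Cancer Type, Family History",,,,
Cancer Type,,"Brain, Lung, Skin",,string,
Family History,,"Brain, Lung, Skin",,string_list,
'''


def find_templates(csv_text: str) -> list[str]:
    """Return attribute names whose IsTemplate cell is 'true' (case-insensitive)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row["Attribute"]
        for row in reader
        # Empty cells are read as "" and therefore never match.
        if (row.get("IsTemplate") or "").lower() == "true"
    ]


templates = find_templates(CSV_MODEL)
print(templates)
```

Only `Patient` is selected here; `Cancer` has a `DependsOn` list but an empty `IsTemplate` cell, so it is treated as a regular attribute.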

@@ -684,6 +684,10 @@ def gather_csv_attributes_relationships(
                     attr_rel_dictionary[attribute_name]["Relationships"].update(
                         {relationship: parsed_rel_entry}
                     )
+            is_template_dict = self.parse_is_template(attr)
+            attr_rel_dictionary[attribute_name]["Relationships"].update(
+                is_template_dict
+            )
             if model_includes_column_type:
                 column_type_dict = self.parse_column_type(attr)
                 attr_rel_dictionary[attribute_name]["Relationships"].update(
@@ -710,6 +714,7 @@ def gather_csv_attributes_relationships(
                 attr_rel_dictionary[attribute_name]["Relationships"].update(
                     pattern_dict
                 )
+
         return attr_rel_dictionary
 
     def parse_column_type(self, attr: dict) -> dict:
@@ -851,6 +856,40 @@ def parse_csv_model(
 
         return model_dict
 
+    def parse_is_template(self, attribute_dict: dict) -> dict[str, bool]:
+        """Parse the IsTemplate value for a given attribute.
+
+        Args:
+            attribute_dict: The attribute dictionary.
+
+        Returns:
+            dict: A dictionary of the form ``{"IsTemplate": <bool>}``. Missing
+                values and strings other than "true" (case-insensitive) map
+                to False.
+
+        Raises:
+            ValueError: If the IsTemplate value cannot be converted to a boolean.
+        """
+        from pandas import isna
+
+        is_template_value = attribute_dict.get("IsTemplate")
+
+        if isna(is_template_value):
+            template_value = False
+        elif isinstance(is_template_value, str):
+            # Only the string "true" (case-insensitive) marks a template.
+            template_value = is_template_value.lower() == "true"
+        else:
+            try:
+                template_value = bool(is_template_value)
+            except ValueError as exception:
+                raise ValueError(
+                    f"The IsTemplate value: {is_template_value} is not boolean, "
+                    "please correct this value in the data model."
+                ) from exception
+
+        return {"IsTemplate": template_value}
+
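
The parsing rules in `parse_is_template` can be tried in isolation with the sketch below. It is a simplified, standalone approximation: `is_missing` is a hypothetical stand-in for `pandas.isna` that only handles `None` and float NaN, and the function is written as a free function rather than a parser method.

```python
import math
from typing import Any


def is_missing(value: Any) -> bool:
    """Assumed stand-in for pandas.isna on scalars: None or a float NaN."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def parse_is_template(attribute_dict: dict) -> dict[str, bool]:
    """Map a row's IsTemplate cell to a boolean, following the rules in the diff."""
    value = attribute_dict.get("IsTemplate")
    if is_missing(value):
        # Empty cells default to "not a template".
        template_value = False
    elif isinstance(value, str):
        # Only the string "true" (case-insensitive) marks a template.
        template_value = value.lower() == "true"
    else:
        # Values that are already boolean-like are coerced with bool().
        template_value = bool(value)
    return {"IsTemplate": template_value}


print(parse_is_template({"IsTemplate": "TRUE"}))
print(parse_is_template({"IsTemplate": "yes"}))
print(parse_is_template({}))
```

Note that a string such as `"yes"` is silently treated as `False` rather than rejected, which matches the branch for non-`"true"` strings in the method above.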
 
 class DataModelJSONLDParser:
     """DataModelJSONLDParser"""
@@ -1118,7 +1157,6 @@ def gather_jsonld_attributes_relationships(self, model_jsonld: list[dict]) -> di
                             attr_rel_dictionary[attr_key]["Relationships"].update(
                                 {rel_csv_header: parsed_rel_entry}
                             )
-
                 elif (
                     rel_vals["jsonld_key"] in entry.keys()
                     and not rel_vals["csv_header"]
@@ -1935,6 +1973,22 @@ def _get_node_label(
             return self.get_node_label(node_display_name)
         raise ValueError("Either 'node_label' or 'node_display_name' must be provided.")
 
+    def get_node_is_template(
+        self, node_label: Optional[str] = None, node_display_name: Optional[str] = None
+    ) -> bool:
+        """Check if a given node is a template or not
+
+        Args:
+            node_label: Label of the node to look up.
+            node_display_name: Display name of the node to look up.
+
+        Returns:
+            True if the given node is a template, otherwise False.
+        """
+        node_label = self._get_node_label(node_label, node_display_name)
+        rel_node_label = self.dmr.get_relationship_value("IsTemplate", "node_label")
+        node_is_template = self.graph.nodes[node_label][rel_node_label]
+        return node_is_template
+
 
 @dataclass_json
 @dataclass
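
For readers who want to see the lookup pattern of `get_node_is_template` without the full graph machinery, here is a minimal sketch. `TinyGraphExplorer`, its `nodes` mapping (standing in for `graph.nodes`), and `display_to_label` are all hypothetical; unlike the method in the diff, this version falls back to `False` when the attribute is missing instead of raising `KeyError`.

```python
from typing import Optional


class TinyGraphExplorer:
    """Hypothetical stand-in for the explorer class; `nodes` mimics graph.nodes."""

    def __init__(self, nodes: dict[str, dict], display_to_label: dict[str, str]):
        self.nodes = nodes
        self.display_to_label = display_to_label

    def _get_node_label(
        self, node_label: Optional[str], node_display_name: Optional[str]
    ) -> str:
        # Prefer an explicit label, otherwise resolve the display name.
        if node_label is not None:
            return node_label
        if node_display_name is not None:
            return self.display_to_label[node_display_name]
        raise ValueError("Either 'node_label' or 'node_display_name' must be provided.")

    def get_node_is_template(
        self, node_label: Optional[str] = None, node_display_name: Optional[str] = None
    ) -> bool:
        label = self._get_node_label(node_label, node_display_name)
        # Assumption: a missing attribute means "not a template".
        return bool(self.nodes[label].get("IsTemplate", False))


explorer = TinyGraphExplorer(
    nodes={"Patient": {"IsTemplate": True}, "BirthDate": {"IsTemplate": False}},
    display_to_label={"Birth Date": "BirthDate"},
)
print(explorer.get_node_is_template(node_label="Patient"))
print(explorer.get_node_is_template(node_display_name="Birth Date"))
```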
@@ -2048,7 +2102,6 @@ def __init__(self, graph: MULTI_GRAPH_TYPE, logger: Logger, output_path: str = "
 
         class_template = ClassTemplate()
         self.class_template = json.loads(class_template.to_json())
-        self.logger = logger
 
     def get_edges_associated_with_node(
         self, node: str
@@ -2279,15 +2332,15 @@ def add_contexts_to_entries(self, template: dict) -> dict:
             if rel_key:
                 rel_key = rel_key[0]
                 # If the current relationship can be defined with a 'node_attr_dict'
-                if "node_attr_dict" in self.rel_dict[rel_key].keys():
+                if "node_attr_dict" in self.rel_dict[rel_key]:
                     try:
                         # if possible pull standard function to get node information
                         rel_func = self.rel_dict[rel_key]["node_attr_dict"]["standard"]
                     except Exception:  # pylint:disable=bare-except
                         # if not pull default function to get node information
                         rel_func = self.rel_dict[rel_key]["node_attr_dict"]["default"]
 
-                    # Add appropritae contexts that have been removed in previous steps
+                    # Add appropriate contexts that have been removed in previous steps
                     # (for JSONLD) or did not exist to begin with (csv)
                     if (
                         rel_key == "id"
@@ -2296,7 +2349,7 @@ def add_contexts_to_entries(self, template: dict) -> dict:
                     ):
                         template[jsonld_key] = "bts:" + template[jsonld_key]
                     elif (
-                        rel_key == "required"
+                        self.rel_dict[rel_key].get("type") is bool
                         and rel_func == convert_bool_to_str
                         and "sms" not in str(template[jsonld_key]).lower()
                     ):
@@ -2971,6 +3024,19 @@ def define_data_model_relationships(self) -> dict:
                     "standard": convert_bool_to_str,
                 },
             },
+            "IsTemplate": {
+                "jsonld_key": "sms:IsTemplate",
+                "csv_header": "IsTemplate",
+                "node_label": "IsTemplate",
+                "type": bool,
+                "jsonld_default": "sms:false",
+                "required_header": False,
+                "edge_rel": False,
+                "node_attr_dict": {
+                    "default": False,
+                    "standard": convert_bool_to_str,
+                },
+            },
             "subClassOf": {
                 "jsonld_key": "rdfs:subClassOf",
                 "csv_header": "Parent",
@@ -2980,7 +3046,7 @@ def define_data_model_relationships(self) -> dict:
                 "jsonld_default": [{"@id": "bts:Thing"}],
                 "type": list,
                 "edge_rel": True,
-                "required_header": True,
+                "required_header": False,
             },
             "validationRules": {
                 "jsonld_key": "sms:validationRules",
@@ -5630,12 +5696,7 @@ def generate_jsonschema(
     # Gets all data types if none are specified
     if data_types is None or len(data_types) == 0:
         data_types = [
-            dmge.get_node_label(node[0])
-            for node in [
-                (k, v)
-                for k, v in parsed_data_model.items()
-                if v["Relationships"].get("Parent") == ["DataType"]
-            ]
+            node for node in dmge.find_classes() if dmge.get_node_is_template(node)
         ]
 
     if len(data_types) != 1 and output is not None and output.endswith(".json"):
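
The change to data type discovery in `generate_jsonschema` can be approximated as follows. This is only a sketch over a plain dictionary: the real code asks the graph explorer (`dmge.find_classes()` and `dmge.get_node_is_template`), whereas the hypothetical `parsed_data_model` and both helper functions below are illustrative stand-ins.

```python
# Hypothetical parsed data model; names and structure are illustrative only.
parsed_data_model = {
    "Patient": {"Relationships": {"Parent": ["DataType"], "IsTemplate": True}},
    "Study": {"Relationships": {"IsTemplate": True}},
    "Gender": {"Relationships": {"IsTemplate": False}},
}


def data_types_by_parent(model: dict) -> list[str]:
    """Old rule from the removed lines: Parent must equal ["DataType"]."""
    return [
        name
        for name, entry in model.items()
        if entry["Relationships"].get("Parent") == ["DataType"]
    ]


def data_types_by_is_template(model: dict) -> list[str]:
    """New rule, approximated: keep entries whose IsTemplate flag is truthy."""
    return [
        name
        for name, entry in model.items()
        if entry["Relationships"].get("IsTemplate", False)
    ]


print(data_types_by_parent(parsed_data_model))
print(data_types_by_is_template(parsed_data_model))
```

In this sketch `Study` has no `Parent` entry, so only the `IsTemplate` rule picks it up, which is consistent with `Parent` no longer being a required header.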