Align schemas with those provided by PDBP #103

@philerooski

All fields and their values need to be mapped from the clinical data and the Fox Insight survey data to the fields and values provided by PDBP.

  1. Create separate mapping files of fields and values for the clinical surveys and the Fox Insight surveys.
  • The clinical mapping file will be a JSON file with the following structure:
{
    "type": "array",
    "items": {
        "type": "object",
        "properties": {
            "target_field": {
                "type": "object",
                "properties": {
                    "name": {
                        "type": "string"
                    },
                    "source_fields": {
                        "type": "array",
                        "items": {
                            "type": "object",
                            "properties": {
                                "source_field": {
                                    "type": "object",
                                    "properties": {
                                        "name": {
                                            "type": "string"
                                        },
                                        "cohort": {
                                            "type": "string"
                                        }
                                    }
                                }
                            }
                        }
                    },
                    "source_values": {
                        "type": "array",
                        "items": {
                            "type": "object",
                            "properties": {
                                "source_value": {
                                    "type": "object",
                                    "properties": {
                                        "name": {
                                            "type": "string"
                                        },
                                        "target_value": {
                                            "type": "string"
                                        }
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}

That is, a target (PDBP) field has a name and maps to one or more source (clinical) fields. Each source field has a name, a cohort, and one or more visits at which the field was collected. The cohort is not only specific to the source field name; it also identifies which GUIDs we expect to have provided data for this source/target field. The visits field is an array of predefined strings identifying the visits that are expected to provide that value. Each visit corresponds to a distinct record in the target schema. Although the source field names vary depending on the cohort we are mapping from, the source fields of a given target field share the same value mapping. Each value mapping is a JSON object with a name (the literal source value) and a target value (the literal target value). In summary, we map a target field to its potential source fields, then for each potential source field we map its values to the target values.

As a list in R, a clinical mapping with one target field looks like this:

[[1]]
[[1]]$name
[1] "target_field"

[[1]]$source_fields
[[1]]$source_fields[[1]]
[[1]]$source_fields[[1]]$name
[1] "source_field_one"

[[1]]$source_fields[[1]]$cohort
[1] "at-home-pd"


[[1]]$source_fields[[2]]
[[1]]$source_fields[[2]]$name
[1] "source_field_two"

[[1]]$source_fields[[2]]$cohort
[1] "super-pd"


[[1]]$source_values
[[1]]$source_values[[1]]
[[1]]$source_values[[1]]$name
[1] "source_value_one"

[[1]]$source_values[[1]]$target_value
[1] "target_value_one"


[[1]]$source_values[[2]]
[[1]]$source_values[[2]]$name
[1] "source_value_two"

[[1]]$source_values[[2]]$target_value
[1] "target_value_two"
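
The same structure can be sketched in Python to show how the mapping would be consulted. This is a minimal illustration, not the project's implementation: it assumes the JSON file follows the flat shape of the R list above (`name`, `source_fields`, `source_values`), reuses the placeholder names from that list, and introduces a hypothetical helper `build_clinical_lookup` that indexes the mapping by cohort and source field name.

```python
import json

# Hypothetical file content, mirroring the placeholder names in the R list.
CLINICAL_MAPPING_JSON = """
[
  {
    "name": "target_field",
    "source_fields": [
      {"name": "source_field_one", "cohort": "at-home-pd"},
      {"name": "source_field_two", "cohort": "super-pd"}
    ],
    "source_values": [
      {"name": "source_value_one", "target_value": "target_value_one"},
      {"name": "source_value_two", "target_value": "target_value_two"}
    ]
  }
]
"""


def build_clinical_lookup(mapping):
    """Index the mapping by (cohort, source field name).

    Each entry holds the target field name and the value mapping that all
    source fields of that target field share.
    """
    lookup = {}
    for target in mapping:
        value_map = {v["name"]: v["target_value"] for v in target["source_values"]}
        for source in target["source_fields"]:
            lookup[(source["cohort"], source["name"])] = (target["name"], value_map)
    return lookup


clinical_lookup = build_clinical_lookup(json.loads(CLINICAL_MAPPING_JSON))
target_name, value_map = clinical_lookup[("super-pd", "source_field_two")]
print(target_name, value_map["source_value_two"])  # target_field target_value_two
```

Keying on the cohort as well as the field name reflects the point that source field names differ between cohorts while the value mapping is shared.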
  • The Fox Insight mapping file will be a JSON file with a simpler structure:
{
    "type": "array",
    "items": {
        "type": "object",
        "properties": {
            "target_field": {
                "type": "object",
                "properties": {
                    "name": {
                        "type": "string"
                    },
                    "source_field": {
                        "type": "string"
                    },
                    "source_values": {
                        "type": "array",
                        "items": {
                            "type": "object",
                            "properties": {
                                "source_value": {
                                    "type": "object",
                                    "properties": {
                                        "name": {
                                            "type": "string"
                                        },
                                        "target_value": {
                                            "type": "string"
                                        }
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}

In this case, each target field will map to a single source field. That source field will have a name and a variable number of possible source values, each of which maps to a single target value.

As a list in R, a Fox Insight mapping with one target field looks like this:

[[1]]
[[1]]$name
[1] "target_field_one"

[[1]]$source_field
[1] "source_field"

[[1]]$source_values
[[1]]$source_values[[1]]
[[1]]$source_values[[1]]$name
[1] "source_value_one"

[[1]]$source_values[[1]]$target_value
[1] "target_value_one"


[[1]]$source_values[[2]]
[[1]]$source_values[[2]]$name
[1] "source_value_two"

[[1]]$source_values[[2]]$target_value
[1] "target_value_two"
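
Applying the Fox Insight mapping to a single survey record might look like the sketch below. It assumes the flat shape of the R list above, and `map_fox_record` is a hypothetical helper. Passing unmapped values through unchanged is also an assumption; the issue does not say how such values should be handled.

```python
# Hypothetical Fox Insight mapping, using the placeholder names from the R list.
fox_mapping = [
    {
        "name": "target_field_one",
        "source_field": "source_field",
        "source_values": [
            {"name": "source_value_one", "target_value": "target_value_one"},
            {"name": "source_value_two", "target_value": "target_value_two"},
        ],
    }
]


def map_fox_record(record, mapping):
    """Rename one record's fields and recode its values per the mapping."""
    mapped = {}
    for target in mapping:
        source_name = target["source_field"]
        if source_name not in record:
            continue  # this record does not contain the source field
        value_map = {v["name"]: v["target_value"] for v in target["source_values"]}
        raw = record[source_name]
        # Assumption: a value with no mapping entry is kept as-is.
        mapped[target["name"]] = value_map.get(raw, raw)
    return mapped


fox_record = {"source_field": "source_value_two", "unrelated_field": "ignored"}
mapped_record = map_fox_record(fox_record, fox_mapping)
print(mapped_record)  # {'target_field_one': 'target_value_two'}
```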
  2. Implement the logic for reading in the clinical and Fox Insight surveys as well as their mappings, then produce a PDBP-compliant dataset by mapping the appropriate fields to their new names and values.
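
A rough end-to-end sketch of this step for one clinical cohort, using only the Python standard library. Everything here is illustrative: the `guid` column name, the in-memory stand-ins for the mapping and survey files, and the helper `map_clinical_rows` are assumptions, and visits are left out because their representation in the mapping file is not shown above.

```python
import csv
import io
import json


def map_clinical_rows(rows, mapping, cohort, id_field="guid"):
    """Rename source fields and recode values for the rows of one cohort."""
    # Collect only the source fields registered for this cohort.
    plan = {}
    for target in mapping:
        values = {v["name"]: v["target_value"] for v in target["source_values"]}
        for source in target["source_fields"]:
            if source["cohort"] == cohort:
                plan[source["name"]] = (target["name"], values)

    mapped_rows = []
    for row in rows:
        mapped = {id_field: row[id_field]}  # carry the identifier through
        for field, raw in row.items():
            if field in plan:
                target_name, values = plan[field]
                # Assumption: unmapped values pass through unchanged.
                mapped[target_name] = values.get(raw, raw)
        mapped_rows.append(mapped)
    return mapped_rows


# In-memory stand-ins for the mapping file and a clinical survey export.
mapping_file = io.StringIO(json.dumps([
    {
        "name": "target_field",
        "source_fields": [
            {"name": "source_field_one", "cohort": "at-home-pd"},
            {"name": "source_field_two", "cohort": "super-pd"},
        ],
        "source_values": [
            {"name": "source_value_one", "target_value": "target_value_one"},
            {"name": "source_value_two", "target_value": "target_value_two"},
        ],
    }
]))
survey_file = io.StringIO(
    "guid,source_field_one\n"
    "GUID-1,source_value_one\n"
    "GUID-2,source_value_two\n"
)

mapping = json.load(mapping_file)
rows = list(csv.DictReader(survey_file))
pdbp_rows = map_clinical_rows(rows, mapping, cohort="at-home-pd")
print(pdbp_rows)
```

Filtering the plan by cohort first keeps a field that belongs to another cohort from being renamed by accident when two cohorts reuse a column name.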
