A Python preprocessing package for working with the Healthy Brain Network clinician consensus diagnostic data.
The raw HBN dataset records each participant's final clinician consensus diagnoses across ten numbered diagnosis columns. The order of these diagnoses does not indicate severity, chronology, or importance, so the data must be reshaped before it is useful for analysis. This package transforms the data to a wider format, organized by specific diagnoses or categories rather than by diagnosis number. It can also filter by diagnostic certainty or time of diagnosis, and it creates a visualization of the diagnostic data. The package can either be run interactively from the command line (recommended if you are not familiar with the dataset) or be installed and used as a Python package.
| DX_01 | DX_01_Cat | DX_01_Sub | DX_01_Time | DX_01_Confirmed | DX_01_Presum | DX_01_RC | DX_01_RuleOut | DX_02 | DX_02_Cat | DX_02_Sub | DX_02_Time | DX_02_Confirmed | DX_02_Presum | DX_02_RC | DX_02_RuleOut | ... |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ADHD - Hyperactive Type | Neurodevelopmental Disorders | ADHD | 1 | 1 | 0 | 0 | 0 | | | | | | | | | ... |
| Selective Mutism | Anxiety Disorders | Anxiety Disorders | 1 | 1 | 0 | 0 | 0 | Autism Spectrum Disorder | Neurodevelopmental Disorders | Autism Spectrum Disorder | 1 | 0 | 0 | 0 | 1 | ... |
↓
| Neurodevelopmental_Disorders_CategoryPresent | ADHD_Hyperactive_Type_DiagnosisPresent | ADHD_Hyperactive_Type_Time | ADHD_Hyperactive_Type_Certainty | Autism_Spectrum_DisorderPresent | Autism_Spectrum_Time | Autism_Spectrum_Certainty | Anxiety_Disorders_CategoryPresent | Selective_Mutism_DisorderPresent | Selective_Mutism_Time | Selective_Mutism_Certainty |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | Current | Confirmed | 0 | | | 0 | 0 | | |
| 1 | 0 | | | 1 | Current | Rule-Out | 1 | 1 | Past | Confirmed |
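The reshaping shown in the two tables above can be sketched in plain Python. This is an illustrative sketch only, not the package's implementation: the helper names (`slug`, `is_set`, `widen_row`, `widen`) are hypothetical, the output column suffixes are modeled on the example table, and the labels for `Presum` and `RC` are assumptions. Time codes are passed through undecoded.

```python
# Illustrative sketch only; not the package's implementation.
CERTAINTY_FLAGS = {
    "Confirmed": "Confirmed",
    "Presum": "Presumptive",        # label assumed, not shown in this README
    "RC": "Requires Confirmation",  # label assumed, not shown in this README
    "RuleOut": "Rule-Out",
}


def slug(name):
    """Turn 'ADHD - Hyperactive Type' into 'ADHD_Hyperactive_Type'."""
    return "_".join(name.replace("-", " ").split())


def is_set(value):
    """True for flag values such as 1, 1.0, or '1' (CSV readers yield strings)."""
    return str(value).strip() in {"1", "1.0"}


def widen_row(rec, n_slots=10):
    """Reshape one participant's numbered DX_NN columns into named columns."""
    out = {}
    for i in range(1, n_slots + 1):
        prefix = f"DX_{i:02d}"
        dx = rec.get(prefix)
        if not dx:  # empty slot: fewer than n_slots diagnoses were given
            continue
        category = rec.get(f"{prefix}_Cat")
        if category:
            out[f"{slug(category)}_CategoryPresent"] = 1
        name = slug(dx)
        out[f"{name}_DiagnosisPresent"] = 1
        out[f"{name}_Time"] = rec.get(f"{prefix}_Time")  # raw code, left undecoded
        for flag, label in CERTAINTY_FLAGS.items():
            if is_set(rec.get(f"{prefix}_{flag}")):
                out[f"{name}_Certainty"] = label
    return out


def widen(records, n_slots=10):
    """Widen every record and give all rows the same set of columns."""
    rows = [widen_row(rec, n_slots) for rec in records]
    columns = sorted({col for row in rows for col in row})
    # Presence flags default to 0; Time and Certainty stay empty when absent.
    return [
        {col: row.get(col, 0 if col.endswith("Present") else None) for col in columns}
        for row in rows
    ]


# Sample records modeled on the example rows above.
records = [
    {"DX_01": "ADHD - Hyperactive Type", "DX_01_Cat": "Neurodevelopmental Disorders",
     "DX_01_Time": 1, "DX_01_Confirmed": 1, "DX_01_RuleOut": 0},
    {"DX_01": "Selective Mutism", "DX_01_Cat": "Anxiety Disorders",
     "DX_01_Time": 1, "DX_01_Confirmed": 1, "DX_01_RuleOut": 0,
     "DX_02": "Autism Spectrum Disorder", "DX_02_Cat": "Neurodevelopmental Disorders",
     "DX_02_Time": 1, "DX_02_Confirmed": 0, "DX_02_RuleOut": 1},
]
wide = widen(records)
```

Each diagnosis slot contributes a category flag, a diagnosis flag, a time value, and a certainty label, so the slot number no longer matters once the row is widened.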
For more information on the HBN data, please see the HBN Data Portal.
Install this package via pip:
pip install git+https://github.com/childmindresearch/hbn-ddp.git
hbnddp
from hbnddp import HBNData
processed = HBNData.process(
    input_path="path/to/data.csv",
    output_path="path/to/output.csv",
)
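Once the wide file has been written, it can be inspected with standard tools. The sketch below assumes the output is a CSV with a `Selective_Mutism_Certainty` column, as in the example table; the `participant` column, the sample values, and the helper `rows_with_certainty` are hypothetical and are not part of the package. An in-memory string stands in for the file at `output_path`.

```python
import csv
import io

# Stand-in for the file written to output_path. The participant column and
# the values are made up; real output columns may differ.
sample_output = io.StringIO(
    "participant,Selective_Mutism_Certainty\n"
    "A,\n"
    "B,Confirmed\n"
    "C,Rule-Out\n"
)


def rows_with_certainty(rows, diagnosis, keep=("Confirmed",)):
    """Keep rows whose certainty label for `diagnosis` is one of `keep`."""
    column = f"{diagnosis}_Certainty"
    return [row for row in rows if row.get(column) in keep]


rows = list(csv.DictReader(sample_output))
confirmed = rows_with_certainty(rows, "Selective_Mutism")
```

Passing a different `keep` tuple, for example `("Confirmed", "Rule-Out")`, widens the selection without changing the helper.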