noxtoby/goldilocks-DPM

Goldilocks DPM

A Disease Progression Modelling (DPM) implementation of the Goldilocks framework for data-driven model configuration.

Goldilocks (conference paper: Oxtoby, AAIC 2024) is a framework that helps users check that their data-driven model of choice is configured "just right" for the available data. Conceptually, Goldilocks guides feature selection and hyperparameter tuning according to the signal present in the data. This complements work in the field of explainable AI.

Goldilocks Zone for synthetic ADNI-like data

Contents

  1. goldilocks_dpm: Python module for implementing the Goldilocks framework within data-driven DPM.
    1. demos: folder containing demo implementations of various DPMs.
      1. README: describes the goldilocks-dpm workflow
      2. goldilocks-pysustain.py: Subtype and Stage Inference
      3. plot_SuStaIn_model_arbitrarycolours.py: called by goldilocks-pysustain.py; demonstrates how to use arbitrary colours in SuStaIn model plotting (thanks to Alex Young for help with this).
    2. TODO. goldilocks_ebm.py: Event-Based Model (GMM, KDE, Ordinal Scored Events)
  2. ADNIMERGE2023_synthetic.csv: Data mimicking ADNI data based on ADNIMERGE.csv downloaded in May 2023.
    1. synthetic_data.ipynb: Jupyter notebook for generating the above, FYI (don't try to run it unless you have an ADNIMERGE CSV to feed into it).

Workflow

See goldilocks-pysustain.py for a worked example using ZscoreSustain.
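
The workflow expects four inputs to exist already: `X`, `y`, `direction_abnormal` and `biomarkers`. The sketch below shows one way they might be built, using a tiny inline table in place of ADNIMERGE2023_synthetic.csv. Everything here is an assumption for illustration: the column names (`DX`, `Hippocampus`, `ADAS13`) follow ADNIMERGE conventions but should be checked against the actual CSV, the values are invented, and the +1/-1 dictionary is only a guess at the format that `direction_abnormal` takes. See goldilocks-pysustain.py for what the module really expects.

```python
import csv
import io

# Toy stand-in for a few rows of an ADNI-like table (values invented for illustration).
toy_csv = """DX,Hippocampus,ADAS13
CN,7500,8
CN,7200,10
Dementia,5600,31
MCI,6800,15
Dementia,5900,28
"""

# Biomarker columns to model (hypothetical selection).
biomarkers = ["Hippocampus", "ADAS13"]

# Assumed encoding: +1 if higher values are more abnormal, -1 if lower values are.
direction_abnormal = {"Hippocampus": -1, "ADAS13": 1}

# Map diagnoses onto the ctrl_label=0 / case_label=1 convention used in the workflow.
label_map = {"CN": 0, "Dementia": 1}

X, y = [], []
for row in csv.DictReader(io.StringIO(toy_csv)):
    if row["DX"] not in label_map:
        continue  # simplification: keep only controls and cases
    X.append([float(row[b]) for b in biomarkers])
    y.append(label_map[row["DX"]])

print(X)
print(y)
```

To read the real file, `io.StringIO(toy_csv)` can be replaced with an open file handle. The module may also require NumPy arrays or a pandas DataFrame rather than plain lists, so convert as needed.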

## 1. Prepare your data
## (X, y, direction_abnormal and biomarkers must be defined before step 2)

## 2. Create a Goldilocks DPM object and run the framework

## ZScoreSuStaIn
from goldilocks_dpm import goldilocks_ZscoreSustain

gdpm = goldilocks_ZscoreSustain(
    classes=y,
    dpmData=X,
    output_folder="path/to/output_folder",
    robust_zscores=False,
    case_label=1,
    ctrl_label=0,
    direction_abnormal=direction_abnormal,
    biomarker_labels=biomarkers,
)

gdpm.run_goldilocks()

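## 3. Interrogate the resulting output.
## In pySuStaIn's ZscoreSustain, Z_vals holds the z-score thresholds per biomarker
## and Z_max the maximum z-score per biomarker; these attributes presumably hold
## the values chosen by Goldilocks.
print(gdpm.Z_vals)
print(gdpm.Z_max)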
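
The `robust_zscores`, `ctrl_label` and `direction_abnormal` arguments suggest that biomarkers are z-scored against the control group before modelling. The sketch below shows what standard and robust z-scores usually mean. It is not the module's implementation: the function name `zscores_vs_controls` is hypothetical, and the robust variant (median and scaled median absolute deviation) is one common definition that may differ from the one used in goldilocks_dpm.

```python
import statistics


def zscores_vs_controls(values, controls, robust=False, direction=1):
    """Z-score `values` against `controls` (illustrative only).

    direction=+1 means higher is more abnormal; -1 flips the sign so that
    larger z-scores always mean "more abnormal".
    """
    if robust:
        # Robust variant: median and MAD, scaled so MAD matches the
        # standard deviation for normally distributed data.
        centre = statistics.median(controls)
        mad = statistics.median([abs(c - centre) for c in controls])
        scale = 1.4826 * mad
    else:
        # Standard variant: mean and sample standard deviation.
        centre = statistics.mean(controls)
        scale = statistics.stdev(controls)
    return [direction * (v - centre) / scale for v in values]


controls = [1.0, 2.0, 3.0, 4.0, 5.0]
print(zscores_vs_controls([6.0], controls))
print(zscores_vs_controls([6.0], controls, robust=True))
print(zscores_vs_controls([0.0], controls, direction=-1))
```

The robust variant is less sensitive to outliers in the control group, which is the usual reason for offering such a switch.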

Licence

MIT.

About

Goldilocks framework for data-driven model configuration: finding the sweet spot between model and data
