NPW: Measuring Model Performance in the Presence of an Intervention

🔍 Overview

  • Real-world AI models are often used in the presence of an intervention (e.g., a hospital uses AI to predict patients' risk of readmission and, at the same time, applies post-discharge phone check-ins as an intervention to reduce that risk).

  • Evaluating a model's ability to predict the outcome without intervention often requires randomized controlled trial (RCT) data, which are expensive to collect.

  • Standard evaluation can use only the control group data in an RCT, while naïvely using both treatment and control group data leads to biased evaluation.

  • We propose Nuisance Parameter Weighting (NPW), an unbiased model evaluation approach that uses all RCT data.

  • We validate that NPW improves AUROC estimation, model ranking, and model selection across a wide range of synthetic and real-world datasets.

  • This repository contains reproducible code for experimental results in "Measuring Model Performance in the Presence of an Intervention" (AAAI 2026).
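
To make the bias mentioned above concrete, here is a toy simulation that compares AUROC computed on the control arm only with AUROC computed naïvely on both arms. It is not the NPW estimator and not code from this repository. The function names and the data-generating process are assumptions chosen for illustration: risk without intervention is a logistic function of a single score, treatment is assigned by a fair coin flip, and the intervention is assumed to cut risk by 80% for patients with a positive score.

```python
import math
import random


def auroc(scores, labels):
    """Rank-based (Mann-Whitney) AUROC; assumes there are no tied scores."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    # Sum of the 1-based ranks held by the positive examples.
    rank_sum = sum(rank for rank, i in enumerate(order, start=1) if labels[i] == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def simulate(n=20000, seed=0):
    """Simulate a toy RCT and return control-only and naive pooled AUROC."""
    rng = random.Random(seed)
    scores, treated, outcomes = [], [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)                  # model score (hypothetical)
        t = rng.random() < 0.5                   # randomized treatment assignment
        risk = 1.0 / (1.0 + math.exp(-2.0 * x))  # risk without intervention
        if t and x > 0:
            risk *= 0.2                          # assumed heterogeneous treatment effect
        scores.append(x)
        treated.append(t)
        outcomes.append(1 if rng.random() < risk else 0)

    control = [i for i in range(n) if not treated[i]]
    return {
        "control_only": auroc([scores[i] for i in control],
                              [outcomes[i] for i in control]),
        "naive_pooled": auroc(scores, outcomes),
    }


results = simulate()
print(f"control-only AUROC: {results['control_only']:.3f}")
print(f"naive pooled AUROC: {results['naive_pooled']:.3f}")
```

In this setup the pooled estimate falls below the control-only estimate, because outcomes in the treatment arm no longer reflect risk without intervention. The control-only estimate avoids that bias but discards roughly half of the data, which is the gap NPW is described as addressing.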


▶️ Quick Start

1: Obtain required datasets

  • Synthetic datasets are generated automatically in the steps below.
  • The AMR-UTI dataset must be obtained from PhysioNet. Place the all_prescriptions.csv, all_uti_features.csv, and all_uti_resist_labels.csv files in the data/AMR-UTI/raw folder.
  • The readmission dataset was collected at Michigan Medicine and is not publicly available to protect patient privacy. However, we provide the checkpoints of the model evaluation results for analysis. If you are interested in working with this dataset, please contact the authors for more information.
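
Before running the AMR-UTI scripts, it can help to confirm that the raw files are where the instructions above say they should be. The following is a minimal sketch, assuming it is run from the repository root; the helper name `missing_amr_uti_files` is hypothetical and not part of this repository.

```python
from pathlib import Path

# Folder and file names are taken from the dataset instructions above.
AMR_UTI_RAW_DIR = Path("data/AMR-UTI/raw")
AMR_UTI_FILES = (
    "all_prescriptions.csv",
    "all_uti_features.csv",
    "all_uti_resist_labels.csv",
)


def missing_amr_uti_files(repo_root="."):
    """Return the names of expected AMR-UTI raw files that are not present."""
    raw_dir = Path(repo_root) / AMR_UTI_RAW_DIR
    return [name for name in AMR_UTI_FILES if not (raw_dir / name).is_file()]


missing = missing_amr_uti_files()
if missing:
    print(f"Missing from {AMR_UTI_RAW_DIR}: {', '.join(missing)}")
else:
    print("All AMR-UTI raw files found.")
```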

2: Install required packages

pip install -r requirements.txt

3: Reproducing synthetic experiments

| Path | Description |
| --- | --- |
| src/run_gen_sim.sh | Generate synthetic datasets |
| src/run_eval_models_sim.sh | Run model evaluation with standard/naïve/NPW |
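
The scripts are meant to run in the order listed: data generation first, then evaluation. As a rough sketch, a small Python driver could enforce that order. It assumes the scripts are invoked with `bash` from the repository root and take no arguments, which this README does not confirm; the names `PIPELINES`, `build_commands`, and `run_pipeline` are hypothetical. The AMR-UTI scripts listed in this README follow the same two-stage pattern, so they are included as a second entry.

```python
import subprocess

# Script paths come from this README; invoking them via `bash` without
# arguments is an assumption, so check each script before relying on this.
PIPELINES = {
    "synthetic": ["src/run_gen_sim.sh", "src/run_eval_models_sim.sh"],
    "amr-uti": ["src/run_gen_amruti.sh", "src/run_eval_models_amruti.sh"],
}


def build_commands(experiment):
    """Return the commands for one experiment in the order they should run."""
    return [["bash", script] for script in PIPELINES[experiment]]


def run_pipeline(experiment, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in build_commands(experiment):
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the pipeline if a stage exits with an error.
            subprocess.run(cmd, check=True)


run_pipeline("synthetic")  # dry run: prints the commands without executing them
```

Calling `run_pipeline("synthetic", dry_run=False)` would execute the stages, stopping at the first failure so that evaluation never runs on missing or partial data.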

4: Reproducing AMR-UTI experiments

| Path | Description |
| --- | --- |
| src/run_gen_amruti.sh | Preprocess AMR-UTI datasets |
| src/run_eval_models_amruti.sh | Run model evaluation with standard/naïve/NPW |

5: Visualizing results

| Path | Description |
| --- | --- |
| notebooks/plot_sim_eval_results.ipynb | Visualize synthetic experiment results |
| notebooks/plot_real_eval_results.ipynb | Visualize AMR-UTI and readmission experiment results |

📝 Citation

If you use NPW in your research, please cite:

Chen W., Sjoding M., Wiens J., Measuring Model Performance in the Presence of an Intervention. The 40th Annual AAAI Conference on Artificial Intelligence (2026).


🛠️ License

This project is licensed under the Apache 2.0 License.
