aleqrt/mit-brown-datathon

Welcome to the Brown University Datathon 2024!

This repository is based on the repository created by João Matos.

Objectives and Tracks

The objective is to investigate how data issues in electronic health records affect downstream clinical prediction tasks. We will examine the effect of a faulty pulse oximeter reading, the effect of a missing serum lactate level, and the effect of the two combined on in-hospital mortality prediction. In addition to the original WiDS dataset, we will create 3 "altered" datasets:

  1. A dataset where the SpO2 of the Black patients will be increased by 10%

  2. A dataset where we drop the serum lactate measurements of Black patients

  3. A dataset where the SpO2 of the Black patients will be increased by 10% and their serum lactate measurements are dropped
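The three alterations above can be sketched with pandas as follows. This is only an illustration, not the code in Notebook 0: the helper names `inflate_spo2` and `drop_lactate` are hypothetical, and the column names (`ethnicity`, `d1_spo2_min`, `d1_lactate_max`) and the label `"African American"` are assumptions about how the WiDS file encodes these fields, so check them against the Data Dictionary. The sketch also does not cap SpO2 at 100%, since the text above only says "increased by 10%".

```python
import numpy as np
import pandas as pd


def inflate_spo2(df, group_mask, spo2_cols, factor=1.10):
    """Return a copy of df with spo2_cols multiplied by factor for rows in group_mask."""
    out = df.copy()
    # Cast to float first so scaled values are not forced into integer columns.
    out[spo2_cols] = out[spo2_cols].astype(float)
    out.loc[group_mask, spo2_cols] = out.loc[group_mask, spo2_cols] * factor
    return out


def drop_lactate(df, group_mask, lactate_cols):
    """Return a copy of df with lactate_cols set to missing for rows in group_mask."""
    out = df.copy()
    out[lactate_cols] = out[lactate_cols].astype(float)
    out.loc[group_mask, lactate_cols] = np.nan
    return out


# Toy rows standing in for the real data (values are made up for illustration).
toy = pd.DataFrame(
    {
        "ethnicity": ["African American", "Caucasian"],  # assumed labels
        "d1_spo2_min": [90, 95],                         # assumed column name
        "d1_lactate_max": [2.0, 1.5],                    # assumed column name
    }
)
is_black = toy["ethnicity"] == "African American"

spo2_up = inflate_spo2(toy, is_black, ["d1_spo2_min"])             # dataset 1
lactate_dropped = drop_lactate(toy, is_black, ["d1_lactate_max"])  # dataset 2
both = drop_lactate(spo2_up, is_black, ["d1_lactate_max"])         # dataset 3
```

Each helper returns a modified copy, so the original frame stays available as the unaltered baseline. On the real data you would pass every SpO2 or lactate column you want to alter in the column list.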

We exaggerate these data issues to get a sense of their impact on machine learning, which, surprisingly, has not been sufficiently explored by the machine learning community.

Notebook 0 executes these operations. Please run it before the Datathon, to make sure that your environment works!

Schedule

  • Notebook 1 - First hour: Data visualization and Table 1 of the WiDS dataset.

  • Notebook 2 - Second hour: Build a mortality prediction model using the WiDS dataset. Evaluate performance across race-ethnicities in the test set.

  • Notebook 3 - Third hour: Build a mortality prediction model using one of three altered datasets. Use the same test set as above, but with the new features.

  • Fourth hour: Compare the two models and prepare a presentation for Day 2.
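One possible way to evaluate performance across race-ethnicities, and later to compare the two models, is to compute a metric such as AUROC separately for each group in the test set. The sketch below assumes scikit-learn is installed. The helper `auc_by_group` is a hypothetical name, and AUROC is only one reasonable choice of metric; the notebooks may use something else.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import roc_auc_score


def auc_by_group(y_true, y_score, groups):
    """Compute AUROC separately for each group; NaN where a group has a single class."""
    frame = pd.DataFrame(
        {
            "y": np.asarray(y_true),
            "score": np.asarray(y_score),
            "group": np.asarray(groups),
        }
    )
    results = {}
    for name, part in frame.groupby("group"):
        if part["y"].nunique() < 2:
            # AUROC is undefined when only one outcome class is present.
            results[name] = float("nan")
        else:
            results[name] = roc_auc_score(part["y"], part["score"])
    return pd.Series(results, name="auroc")


# Toy labels and predicted risks (made up for illustration).
y_true = [0, 1, 0, 1, 0, 1, 1, 0]
y_score = [0.1, 0.9, 0.2, 0.8, 0.6, 0.4, 0.7, 0.3]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

per_group = auc_by_group(y_true, y_score, groups)
```

Calling the helper once with the predictions of the original model and once with those of the model trained on an altered dataset, using the same test labels and groups, gives two series that can be compared group by group.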

Materials (online)

  • WiDS dataset - please download the data ("training_v2.csv") from here, place it under the "data" folder, and run Notebook 0 before the datathon to create the train and test subsets with the modified features.

  • Data Dictionary - to understand what the variables mean

Happy coding!
