Commit f66be60

Day 3: Validated NEWS2 scoring and designed ML feature pipelines
- Fixed bug in score_vital: LOC scoring was skipped due to early NaN return.
- Validated GCS/LOC scoring against NHS NEWS2 rules (edge cases now correct).
- Finalised missing data strategies:
  • Timestamp-level: LOCF + missingness flags.
  • Patient-level: median imputation + % missingness.
- Designed pipeline for make_timestamp_features.py (LOCF, flags, rolling windows, staleness, ordinal encoding).
- Designed pipeline for make_patient_features.py (aggregates, median imputation, % missingness, escalation summaries).
- Selected LightGBM as primary ML model (handles NaNs, interpretable, best fit for tabular ICU data).
- Documented all findings, rationales, and lessons in notes.md.
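The code change behind the first bullet is not part of this diff (only README.md and notes.md changed), so the following is only one plausible shape of the bug, not the repository's actual `score_vital`. The function signatures, the `"level_of_consciousness"` key, and the `gcs_total` argument are all hypothetical. The mapping of GCS 15 to Alert is an assumption; NEWS2 itself scores Alert as 0 and any other level of consciousness as 3.

```python
import math


def _is_missing(x):
    """True for None or a float NaN."""
    return x is None or (isinstance(x, float) and math.isnan(x))


def score_vital_buggy(name, value, gcs_total=None):
    # Hypothetical buggy version: the generic NaN guard runs first,
    # so the LOC branch below is never reached when `value` is NaN.
    if _is_missing(value):
        return math.nan
    if name == "level_of_consciousness":
        return 0 if gcs_total >= 15 else 3
    return 0  # placeholder for the other vitals


def score_vital_fixed(name, value, gcs_total=None):
    # Hypothetical fix: handle LOC before the generic NaN guard,
    # because LOC is derived from GCS rather than from `value`.
    if name == "level_of_consciousness":
        if _is_missing(gcs_total):
            return math.nan
        return 0 if gcs_total >= 15 else 3  # assumed: GCS 15 = Alert
    if _is_missing(value):
        return math.nan
    return 0  # placeholder for the other vitals


buggy = score_vital_buggy("level_of_consciousness", math.nan, gcs_total=13)
fixed = score_vital_fixed("level_of_consciousness", math.nan, gcs_total=13)
```

With a GCS of 13 and no direct LOC value, the buggy version returns NaN while the fixed version returns a score of 3.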
1 parent 3b4f528 commit f66be60

2 files changed

Lines changed: 17 additions & 14 deletions

README.md

Lines changed: 3 additions & 2 deletions
@@ -1,7 +1,7 @@
 # EWS-Predictive-Dashboard
 Python-based ICU Early Warning System (EWS) predictive dashboard. Uses time-series vitals to detect patient deterioration, benchmarked against NEWS2, with real-time alerts, trend visualisations, and a dual CLI/FastAPI interface. Portfolio-ready, deployed, and clinically-informed.
 
-
+```text
 Raw ICU Vitals (long format, MIMIC-style)
 └─> compute_news2.py
 ├─ Input: raw vitals CSV
@@ -36,4 +36,5 @@ ML Model (LightGBM)
 │ ├─ Train predictive model for deterioration / escalation
 │ ├─ Use timestamp trends + missingness flags
 │ └─ Evaluate performance (AUROC, precision-recall, etc.)
-└─ Output: predictions, feature importances, evaluation metrics
+└─ Output: predictions, feature importances, evaluation metrics
+```

notes.md

Lines changed: 14 additions & 12 deletions
@@ -6,25 +6,21 @@
 
 ### Pipeline Overview
 
+```text
 Raw CSVs (chartevents.csv, etc.)
-
-
+
 extract_news2_vitals.py
-
-
+
 news2_vitals.csv
-
-
+
 check_co2_retainers.py
-
-
+
 news2_vitals_with_co2.csv + co2_retainer_details.csv
-
-
+
 compute_news2.py
-
-
+
 Final NEWS2 scores per patient
+```
 
 ### Goals
 - Extract relevant vital signs from PhysioNet.org MIMIC-IV Clinical Database Demo synthetic dataset for NEWS2 scoring.
@@ -217,18 +213,24 @@ Final NEWS2 scores per patient
 
 ### Overview
 **For timestamp-level ML features (news2_features_timestamp.csv)**:
+
+```text
 raw long vitals (from MIMIC/ICU)
 ↓ compute_news2.py
 news2_scores.csv ← "clinical truth" (all vitals + NEWS2 + escalation labels)
 ↓ make_timestamp_features.py
 news2_features_timestamp.csv ← "ML ready" (numeric features, missingness flags, encodings)
+```
 
 **For patient-level summary features (news2_features_patient.csv)**:
+
+```text
 raw long vitals
 ↓ compute_news2.py
 news2_scores.csv ← news2_patient_summary.csv not needed
 ↓ make_patient_features.py
 news2_features_patient.csv ← ML ready (patient-level aggregates, imputed medians, missingness %)
+```
 
 **The difference**:
 - Timestamp pipeline → preserves row-by-row dynamics (LOCF, staleness, rolling windows).
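
The timestamp-level steps named in the diff (LOCF, missingness flags, staleness, rolling windows) can be sketched for a single vital as follows. This is a dependency-free illustration under assumptions, not the contents of make_timestamp_features.py: the function name `add_timestamp_features`, the `heart_rate` and `charttime` keys, the output column names, and the 3-row window are all hypothetical.

```python
from datetime import datetime, timedelta


def add_timestamp_features(rows, vital="heart_rate", window=3):
    """Add LOCF, missingness flag, staleness and rolling mean for one vital.

    `rows` is a list of dicts for ONE patient, already sorted by 'charttime'.
    """
    out = []
    last_value, last_time = None, None
    history = []  # carried-forward values seen so far, for the rolling mean
    for row in rows:
        value = row.get(vital)
        missing = value is None
        if not missing:
            last_value, last_time = value, row["charttime"]
        # Staleness: hours since the vital was last actually measured.
        staleness_h = (
            None
            if last_time is None
            else (row["charttime"] - last_time).total_seconds() / 3600
        )
        if last_value is not None:
            history.append(last_value)
        recent = history[-window:]
        roll_mean = sum(recent) / len(recent) if recent else None
        out.append({
            **row,
            f"{vital}_locf": last_value,        # last observation carried forward
            f"{vital}_missing": int(missing),   # 1 if no measurement at this row
            f"{vital}_staleness_h": staleness_h,
            f"{vital}_roll_mean": roll_mean,    # mean over the last `window` rows
        })
    return out


t0 = datetime(2024, 1, 1)
demo = [
    {"charttime": t0 + timedelta(hours=i), "heart_rate": hr}
    for i, hr in enumerate([80, None, None, 90])
]
features = add_timestamp_features(demo)
```

Keeping the missingness flag and staleness alongside the carried-forward value lets a downstream model distinguish a fresh measurement from one that was imputed several hours after the last reading.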
