Commit 4cddd61

Update Day 47-48 Start README.md
1 parent b6e7d02 commit 4cddd61

3 files changed

Lines changed: 5375 additions & 154 deletions

README.md

Lines changed: 98 additions & 153 deletions
@@ -1,153 +1,98 @@
1-
# Time-Series ICU Patient Deterioration Predictor 📉⏳
2-
3-
Early warning system predicting ICU patient deterioration on MIMIC-IV Clinical Demo v2.2 dataset (100 patients), framework comparing LightGBM vs Temporal Convolutional Network across 3 targets: peak deterioration (max_risk), typical risk (median_risk), and proportion of admission in high-risk states (pct_time_high).
4-
5-
Temporal and aggregated feature engineering with clinical-validity-aware missing data handling, using custom NEWS2-derived ground-truth values (GCS/LOC mapping, supplemental O₂ and CO₂ retainer custom logic).
6-
7-
TCN trained on 171 temporal-features x 96 timestamps, 24hr rolling windows across 8 vital parameters, with 3-layer TemporalBlock stack, kernel=3, dropout=0.2, head_hidden=64, batch=32, 50 epochs, early stopping at epoch 10;
8-
9-
LightGBM trained on 40 aggregated patient-features with 5-fold stratified CV, with hyperparameter tuning.
10-
11-
TCN greater sensivity on max_risk (AUC +9.3%, AP +1.25%), LightGBM greater reliability and calibration on median_risk (AUC +17%, Brier ↓68%, ECE ↓63%) and more precise pct_time_high (RMSE ↓32%, R² +44%, residual SD ↓42%).
12-
13-
evaluating interpretability (SHAP) versus saliency-based explanations for clinical adoption.
14-
15-
Deployed, reproducible auditable pipeline with deployment-lite, and full documentation for clinical validation.
16-
17-
TCN_refined captures short-term acute events with rapid early detection, whereas LightGBM provides robust, calibrated estimates of sustained deterioration exposure; supports ICU triage, continuous monitoring, and escalation decisions with quantified, actionable confidence.
18-
TCN excels at short-term acute events with rapid detection while LightGBM provides reliable long-term estimates of sustained deterioration exposure, suggesting ensemble approach for production deployment.
19-
20-
Portfolio-ready, deployed, and clinically-informed.
21-
22-
**Tech stack**: python, pandas, NumpPy, LightGBM, PyTorch
23-
24-
**Pipeline**
25-
```text
26-
Raw ICU Vitals (long format, MIMIC-style)
27-
└─> compute_news2.py
28-
├─ Input: raw vitals CSV
29-
├─ Action: compute NEWS2 scores per timestamp
30-
└─ Output: news2_scores.csv (wide format with vitals, NEWS2 score, escalation labels), news2_patient_summary.csv (patient-level summary)
31-
32-
news2_scores.csv
33-
└─> make_timestamp_features.py
34-
├─ Action:
35-
│ ├─ Aggregate per patient
36-
│ ├─ Add missingness flags
37-
│ ├─ Apply LOCF per vital
38-
│ ├─ Compute carried-forward flags
39-
│ ├─ Compute rolling window stats (1h/4h/24h)
40-
│ ├─ Compute time-since-last-observation
41-
│ └─ Encode risk/escalation as ordinal numeric
42-
└─ Output: news2_features_timestamp.csv
43-
(ML-ready timestamp-level features)
44-
45-
news2_scores.csv
46-
└─> make_patient_features.py
47-
├─ Action:
48-
│ ├─ Aggregate per patient
49-
│ ├─ Compute median, mean, min, max per vital
50-
│ └─ Include % missingness per vital
51-
└─ Output: news2_features_patient.csv
52-
(ML-ready patient-level summary features)
53-
```
54-
55-
# Timestamp features rationale
56-
- We compute rolling window features over 1h, 4h, and 24h intervals.
57-
- Mean, min, max capture the magnitude and variability of vitals.
58-
- Slope gives the trend — whether the vital is rising or falling and how fast.
59-
- AUC measures cumulative exposure, i.e., how much and for how long a patient has experienced abnormal values.
60-
- These features provide temporal context for the ML model, so it doesn’t just see isolated values but also their trajectory over time.
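The rolling-window features described above can be sketched in plain Python. This is only an illustration and not the contents of `make_timestamp_features.py`: the function name `rolling_features`, the inclusive trailing window, the least-squares slope, and the trapezoidal AUC over raw values are all assumptions made for the example.

```python
def rolling_features(times_h, values, window_h):
    """Summarise one vital over the trailing window ending at the last sample.

    times_h  : observation times in hours, sorted ascending (assumed)
    values   : vital-sign readings aligned with times_h
    window_h : window length in hours (e.g. 1, 4 or 24)
    """
    if not times_h or len(times_h) != len(values):
        raise ValueError("need equal-length, non-empty inputs")

    t_end = times_h[-1]
    # Keep samples inside the window (inclusive boundary is an assumption).
    pts = [(t, v) for t, v in zip(times_h, values) if t_end - t <= window_h]
    ts = [t for t, _ in pts]
    vs = [v for _, v in pts]

    mean_t = sum(ts) / len(ts)
    mean_v = sum(vs) / len(vs)

    # Least-squares slope: units of the vital per hour.
    denom = sum((t - mean_t) ** 2 for t in ts)
    slope = sum((t - mean_t) * (v - mean_v) for t, v in pts) / denom if denom else 0.0

    # Trapezoidal area under the curve as a cumulative-exposure proxy.
    auc = sum((t1 - t0) * (v0 + v1) / 2 for (t0, v0), (t1, v1) in zip(pts, pts[1:]))

    return {"mean": mean_v, "min": min(vs), "max": max(vs), "slope": slope, "auc": auc}


# Toy respiratory-rate series rising by 2 breaths/min every hour.
features_4h = rolling_features([0, 1, 2, 3, 4], [16, 18, 20, 22, 24], window_h=4)
features_1h = rolling_features([0, 1, 2, 3, 4], [16, 18, 20, 22, 24], window_h=1)
```

On this toy series both windows report the same slope of 2 per hour, while the mean and AUC differ with window length, which is the extra context the trajectory features are meant to supply.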
61-
62-
63-
# LightGBM vs Neural Network (TCN) Pipeline
64-
```text
65-
ML Model (LightGBM)
66-
├─ Input: news2_features_patient.csv
67-
│ ├─ Median, mean, min, max per vital
68-
│ ├─ Impute missing values
69-
│ ├─ % missing per vital
70-
│ └─ Risk summary stats (max, median, % time at high risk)
71-
├─ Action:
72-
│ ├─ Train predictive model for deterioration / escalation
73-
│ ├─ Use timestamp trends + missingness flags
74-
│ └─ Evaluate performance (AUROC, precision-recall, etc.)
75-
└─ Output: predictions, feature importances, evaluation metrics
76-
77-
ML Model (Neural Network, TCN)
78-
├─ Input: news2_features_timestamp.csv
79-
│ ├─ Timestamp-level vitals & rolling features (mean, min, max, std, slopes, AUC)
80-
│ ├─ Missingness flags
81-
│ ├─ Carried-forward flags
82-
│ └─ Time since last observation
83-
├─ Action:
84-
│ ├─ Train predictive model for deterioration / escalation
85-
│ ├─ Learn temporal patterns, trends, and interactions
86-
│ ├─ Can handle sequences of variable length per patient
87-
│ └─ Evaluate performance (AUROC, precision-recall, calibration)
88-
└─ Output:
89-
├─ Predictions per timestamp or per patient
90-
├─ Learned feature embeddings / attention weights (if applicable)
91-
└─ Evaluation metrics
92-
```
93-
94-
# LightGBM vs Neural Network (TCN) Pipeline Visualisation
95-
```text
96-
Raw EHR Data (vitals, observations, lab results)
97-
98-
99-
Timestamp Feature Engineering (news2_scores.csv)
100-
- Rolling statistics (mean, min, max, std)
101-
- Slopes, AUC, time since last observation
102-
- Imputation & missingness flags
103-
104-
├─────────────► TCN Neural Network Model (v2)
105-
│ - Input: full time-series per patient
106-
│ - Can learn temporal patterns, trends, dynamics
107-
108-
109-
Patient-Level Feature Aggregation (make_patient_features.py → news2_features_patient.csv)
110-
- Median, mean, min, max per vital
111-
- % missing per vital
112-
- Risk summary stats (max, median, % time at high risk)
113-
- Ordinal encoding for risk/escalation
114-
115-
116-
LightGBM Model (v1)
117-
- Input: one row per patient (fixed-length vector)
118-
- Uses aggregated statistics only
119-
- Cannot handle sequences or variable-length time series
120-
```
121-
122-
# Model Comparison: LightGBM vs Neural Network (V1 & V2)
123-
124-
| Aspect | LightGBM (V1) | Temporal Convolutional Network (TCN) (V2) |
125-
|--------|-------------------|-------------------|
126-
| **ML Model Name / Type** | LightGBM (Gradient Boosted Decision Trees) | Temporal Convolutional Network (TCN)(Neural network) |
127-
| **V1 / V2** | V1: uses patient-level features, baseline interpretable patient summary (classic tabular ML) | V2: uses timestamp-level features, advanced sequence modeling (modern deep learning) |
128-
| **Input Datasets** | `news2_features_patient.csv` (patient-level summaries) | `news2_features_timestamp.csv` (time series of vitals, missingness flags) |
129-
| **Optional Inputs** | Timestamp features could be added later for hybrid model | Patient-level summary features from `news2_features_patient.csv` can be appended but not mandatory |
130-
| **Reason for this input choice** | LightGBM is a tree-based model: handles static features and aggregates well; does not naturally model temporal sequences | Neural networks (LSTM/TCN) can model temporal trends, sequences, and interactions over time; need full timestamp features to exploit sequential information |
131-
| **Why two different models** | LightGBM: fast, interpretable (feature importance), strong baseline. | Neural network: captures temporal dynamics, can potentially improve predictive performance on time-series deterioration<br>Complements LightGBM; addresses potential limitations of static patient summaries by using sequential information in timestamp features |
132-
| **Strengths** | - Handles missing values gracefully.<br>- Fast training and inference.<br>- Provides feature importances.<br>- Works well with tabular summary features. | - Models temporal trends and interactions.<br>- Can capture subtle patterns in sequences of vitals.<br>- Potentially better performance on real-time deterioration prediction. |
133-
| **Weaknesses / Limitations** | - Ignores sequence and timing of events.<br>- May lose some granularity of patient trajectory.<br>- Cannot capture interactions over time. | - Requires more computation and tuning.<br>- Harder to interpret.<br>- Sensitive to missing data; requires careful imputation or masking. |
134-
| **Output** | Predictions per patient, feature importances, evaluation metrics (AUROC, PR-AUC, etc.) | Predictions per timestamp or per patient trajectory, evaluation metrics (AUROC, PR-AUC, potentially time-dependent metrics) |
135-
| **Use case / Deployment** | Baseline model; interpretable; fast deployment; can be used for early warning systems using summary features | Advanced model for final deployment or v2 experimentation; may be integrated in real-time monitoring dashboards for continuous deterioration prediction |
136-
137-
138-
Portfolio narrative framing (objective and honest)
139-
140-
Here’s how you can present this:
141-
1. State the limitation upfront:
142-
• “Synthetic dataset contains very few high-risk events; patient-level deterioration classification targets were largely zero. Standard classification tasks were infeasible.”
143-
2. Pivot your narrative to learnable outcomes:
144-
• LightGBM: Predict patient-level NEWS2 / continuous risk burden, analyze feature importances to show clinical insights.
145-
• TCN: Predict timestamp-level NEWS2 trends to capture dynamic risk evolution.
146-
3. Metrics and comparison:
147-
• Report regression metrics (RMSE, R², MAE).
148-
• Compare to simple baselines (mean NEWS2, last observation carried forward) to show your model improves predictive performance.
149-
• Highlight trend detection and feature influence, which is a clinically relevant skill.
150-
4. Why this is still strong for a portfolio:
151-
• Demonstrates data wrangling, preprocessing, CV, feature engineering, ML pipeline, model selection, hyperparameter tuning, and neural networks.
152-
• Shows clinical insight (feature importance, temporal trends).
153-
• Recruiters and technical reviewers care about how you solved real-world limitations, not just “predicted rare events.”
1+
# Time-Series ICU Patient Deterioration Predictor
2+
3+
## *Hybrid Machine Learning System for Early Warning in Critical Care*
4+
5+
---
6+
7+
## Executive Summary
8+
9+
**Tech stack:** *Python, pandas, NumPy, LightGBM, PyTorch, Scikit-learn*
10+
11+
This project implements a dual-architecture early warning system comparing gradient-boosted decision trees (LightGBM) against temporal convolutional networks (TCN) for predicting ICU patient deterioration across three risk targets (maximum risk attained, median sustained risk, % time spent in high risk). Built on the MIMIC-IV Clinical Demo v2.2 dataset (100 patients), the system processes 171 temporal features across 24-hour windows and 40 aggregated patient-level features to support continuous monitoring and escalation decisions.
12+
13+
The hybrid approach reveals complementary strengths: LightGBM achieves superior calibration and regression fidelity (68% Brier reduction, +17% AUC, +44% R²) for sustained risk assessment, while TCN demonstrates stronger acute event discrimination (+9.3% AUC, superior sensitivity) for detecting rapid deterioration.
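For readers less familiar with the calibration metrics quoted above, the following is a minimal sketch of the Brier score and expected calibration error (ECE) for a binary target. It is not the project's evaluation code: the function names, the equal-width binning, and the toy probabilities are assumptions chosen for illustration.

```python
def brier_score(probs, labels):
    """Mean squared difference between predicted probability and outcome (0 or 1)."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(probs)


def expected_calibration_error(probs, labels, n_bins=10):
    """Weighted mean gap between average confidence and observed rate per bin.

    Uses equal-width probability bins, which is one common convention (assumed here).
    """
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((p, y))

    n = len(probs)
    ece = 0.0
    for members in bins:
        if members:
            confidence = sum(p for p, _ in members) / len(members)
            observed = sum(y for _, y in members) / len(members)
            ece += len(members) / n * abs(confidence - observed)
    return ece


# Toy predictions, not project results.
toy_probs = [0.9, 0.8, 0.2, 0.1]
toy_labels = [1, 1, 0, 0]
toy_brier = brier_score(toy_probs, toy_labels)
toy_ece = expected_calibration_error(toy_probs, toy_labels, n_bins=2)
```

Lower is better for both metrics: the Brier score penalises every individual miss, whereas ECE only measures how far binned confidence drifts from the observed event rate.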
14+
15+
The complete pipeline includes NHS-validated NEWS2 preprocessing with CO₂ retainer logic, GCS mapping, and supplemental O₂ protocols; extensive evaluation metrics and model-specific interpretability methods for clinical validation (SHAP for LightGBM, absolute gradient×input saliency for TCN); and a deployment-ready dual inference system (batch and per-patient) for end-to-end usability.
16+
17+
**Key Contributions:**
18+
- Clinical validity pipeline with robust NEWS2 computation
19+
- Dual feature engineering (patient-level vs timestamp) for both classical and deep learning models
20+
- Dual model training with hyperparameter tuning
21+
- Rigorous refinement and model evaluation
22+
- Transparent interpretability validated against domain knowledge
23+
- Deployment-lite inference pipeline demonstrating end-to-end usability
24+
25+
---
26+
27+
## Table of Contents
28+
1. [Introduction](#introduction)
29+
2. [Clinical Motivation](#clinical-motivation)
30+
3. [Data Pipeline Overview](#data-pipeline-overview)
31+
4. [Phase 1: CO₂ Retainer Identification & NEWS2 Tracker](#phase-1-co2-retainer-identification--news2-tracker)
32+
5. [Phase 2: ML-Ready Feature Engineering](#phase-2-ml-ready-feature-engineering)
33+
6. [Phase 3: LightGBM Training & Validation](#phase-3-lightgbm-training--validation)
34+
7. [Next Steps](#next-steps)
35+
36+
---
37+
38+
39+
## 1. Clinical Background & Motivation
40+
41+
### The Problem
42+
ICU patient deterioration manifests through subtle vital sign changes hours before critical events. The National Early Warning Score 2 (NEWS2) is widely used in UK hospitals to detect and escalate care for deteriorating patients. Accurate, real-time scoring and risk stratification can:
43+
- Enable earlier intervention and ICU escalation
44+
- Support clinical decision-making with actionable, interpretable metrics
45+
- Provide a foundation for advanced ML models to improve patient outcomes
46+
47+
Although NEWS2 is the national standard for deterioration detection, it has well-recognised constraints:
48+
- **No true temporal modelling:** Although observations are charted sequentially, the scoring algorithm treats each set of vitals independently and does not incorporate trend, slope, variability, or rate-of-change.
49+
- **Discrete scoring limitations:** NEWS2 discretises continuous physiological signals into coarse bands and does not model interactions between multiple variables, which limits sensitivity to subtle multivariate deterioration patterns.
50+
- **Escalation overload:** Threshold-based scoring generates many false positives in elderly and multimorbid cohorts, contributing to alert burden and escalation fatigue.
51+
- **Limited predictive horizon:** NEWS2 typically identifies deterioration only after thresholds are crossed, offering limited early-warning capability compared with models that can detect sub-threshold physiological drift.
52+
53+
### Clinical Escalation Context
54+
NEWS2 scoring bands map directly to clinical monitoring frequency and escalation actions; these operational consequences define the clinical targets we aim to predict:
55+
56+
| NEWS2 Score | Clinical Risk | Monitoring Frequency | Clinical Response |
57+
|-----------------------------------|---------------|--------------------------------------------------------|------------------------------------------------------------------------------------|
58+
| **0** | Low | Minimum every **12 hours** | Routine monitoring by registered nurse. |
59+
| **1–4** | Low | Minimum every **4–6 hours** | Nurse to assess need for change in monitoring or escalation. |
60+
| **Score of 3 in any parameter** | Low–Medium | Minimum every **1 hour** | **Urgent** review by ward-based doctor to decide monitoring/escalation. |
61+
| **5–6** | Medium | Minimum every **1 hour** | **Urgent** review by ward-based doctor or acute team nurse; consider critical care team review. |
62+
| **≥7** | High | **Continuous** monitoring | **Emergent** assessment by clinical/critical-care team; usually transfer to HDU/ICU. |
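The banding in the table can be expressed as a small lookup. The sketch below is an illustration of the table only, not the project's `compute_news2.py`; the function name `news2_risk_band` and the string labels are assumptions.

```python
def news2_risk_band(total_score, max_single_parameter_score):
    """Map a NEWS2 aggregate score to the clinical-risk band in the table above.

    total_score                : aggregate NEWS2 score (non-negative integer)
    max_single_parameter_score : highest score of any single parameter (0-3)
    """
    if total_score < 0 or not 0 <= max_single_parameter_score <= 3:
        raise ValueError("scores out of range")

    if total_score >= 7:
        return "high"
    if total_score >= 5:
        return "medium"
    # A single parameter scoring 3 raises an otherwise low aggregate score.
    if max_single_parameter_score == 3:
        return "low-medium"
    return "low"


bands = [
    news2_risk_band(0, 0),
    news2_risk_band(4, 1),
    news2_risk_band(3, 3),
    news2_risk_band(6, 2),
    news2_risk_band(9, 3),
]
```

Checking the aggregate thresholds before the single-parameter rule means a score of 5 or more is never downgraded to low-medium.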
63+
64+
#### Why this matters
65+
- Transitions between risk bands (especially into medium or high) drive clinical workload and resource allocation, including urgent reviews and ICU involvement.
66+
- Predicting imminent transitions into these categories (e.g., entering high risk within the next 4–6 hours) enables earlier intervention, reducing delayed escalations and improving critical-care resource planning.
67+
68+
#### Why NEWS2 is used as the reference standard
69+
- NEWS2 is the nationally accepted standard for ward-based clinical deterioration assessment. Using it as the ground truth ensures that ML models are trained and evaluated against a clinically validated reference.
70+
- ML models predict summary outcomes derived from NEWS2 clinical-risk categories:
71+
- `max_risk`: Maximum risk attained during the observation window
72+
- `median_risk`: Median (typical) sustained risk across the stay
73+
- `pct_time_high`: Percentage of time spent in high-risk state
74+
- Evaluating ML predictions against these NEWS2-derived outcomes allows assessment of **predictive horizon**, **sensitivity**, and the ability to anticipate **clinically actionable deterioration trends** before standard escalation would occur.
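As a rough illustration of how the three targets relate to per-timestamp risk labels, the sketch below collapses one patient's label sequence into `max_risk`, `median_risk`, and `pct_time_high`. The ordinal encoding, the function name `summarise_risk`, and the assumption of evenly spaced observations are hypothetical and may differ from the project's pipeline.

```python
from statistics import median

# Assumed ordinal encoding of the NEWS2 clinical-risk bands (illustrative only).
RISK_LEVELS = {"low": 0, "low-medium": 1, "medium": 2, "high": 3}
HIGH = RISK_LEVELS["high"]


def summarise_risk(risk_labels):
    """Collapse one patient's per-timestamp risk labels into the three targets."""
    if not risk_labels:
        raise ValueError("need at least one observation")

    encoded = [RISK_LEVELS[label] for label in risk_labels]
    return {
        "max_risk": max(encoded),
        "median_risk": median(encoded),
        # Fraction of observations in the high band; equals the share of time
        # at high risk only if timestamps are evenly spaced (assumed here).
        "pct_time_high": sum(level == HIGH for level in encoded) / len(encoded),
    }


example = summarise_risk(
    ["low", "low", "medium", "high", "high", "medium", "low", "low"]
)
```

In this toy stay the patient peaks at high risk (`max_risk` of 3) but spends only a quarter of the observations there, so the three targets carry different information about the same trajectory.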
75+
76+
### Why Machine Learning?
77+
ICU deterioration is complex and often subtle, involving multivariate temporal patterns that standard threshold-based systems cannot fully capture. ML models allow us to go beyond static scoring by predicting summary outcomes derived from NEWS2 clinical-risk categories.
78+
79+
#### LightGBM (classical, non-temporal ML)
80+
- LightGBM, a gradient-boosted decision tree (GBDT) algorithm, provides a strong baseline for tabular clinical data
81+
- Captures nonlinear interactions between vital signs
82+
- Fast to train and tune, handles missing data robustly
83+
- Highly interpretable via SHAP
84+
- Often competitive or superior when temporal structure is weak
85+
86+
#### Temporal Convolutional Network (TCN) (temporal deep learning)
87+
- TCN captures time-dependent patterns, slopes, and variability
88+
- Models long-range temporal context
89+
- Robust to irregular sampling
90+
- Potentially detects subtle deterioration earlier than threshold-based approaches
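The building block behind these properties is the causal dilated convolution. The plain-Python sketch below shows the idea under stated assumptions: it is not the project's PyTorch `TemporalBlock`, the helper names are hypothetical, and the dilation schedule `[1, 2, 4]` with one convolution per layer is assumed for the receptive-field arithmetic.

```python
def causal_dilated_conv1d(signal, kernel, dilation=1):
    """y[t] = sum_i kernel[i] * x[t - i * dilation], treating x as zero before t = 0.

    Each output depends only on the current and earlier samples, never later ones.
    """
    out = []
    for t in range(len(signal)):
        total = 0.0
        for i, weight in enumerate(kernel):
            idx = t - i * dilation
            if idx >= 0:
                total += weight * signal[idx]
        out.append(total)
    return out


def receptive_field(kernel_size, dilations):
    """Timesteps visible to one output, assuming a single convolution per layer."""
    return 1 + (kernel_size - 1) * sum(dilations)


heart_rate = [80, 82, 85, 90, 98, 110]

# Kernel [1, -1] with dilation 2 yields the change over the previous two steps
# once enough history is available (the first two outputs see zero padding).
delta = causal_dilated_conv1d(heart_rate, [1.0, -1.0], dilation=2)

# Kernel size 3 with assumed dilations 1, 2, 4 across three layers.
rf = receptive_field(3, [1, 2, 4])
```

Doubling the dilation at each layer lets the receptive field grow much faster than the layer count, which is how a shallow stack can cover a long stretch of a monitoring window.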
91+
92+
#### Why compare both
93+
- LightGBM provides a robust classical-ML baseline for tabular clinical data.
94+
- TCN evaluates whether temporal modelling yields measurable gains by capturing sequential patterns and slopes in vital signs.
95+
- This comparison reflects realistic deployment: classical ML may suffice for lower-frequency ward data, whereas temporal models exploit high-resolution ICU monitoring to detect early deterioration.
96+
- The evaluation clarifies where temporal modelling adds value, where classical ML is sufficient, and the trade-offs between interpretability and predictive performance.
97+
98+
This project therefore systematically evaluates temporal vs. non-temporal ML approaches for predicting ICU deterioration, using clinically meaningful NEWS2-derived summary outcomes as targets.
