README.md: 53 additions & 43 deletions
@@ -12,7 +12,7 @@ This project implements a dual-architecture early warning system comparing gradi
  Models were trained on the MIMIC-IV Clinical Demo v2.2 dataset (100 patients), using dual feature engineering pipelines: 171 timestamp-level temporal features (24-hour windows) for TCN, and 40 patient-level aggregated features for LightGBM.

- The hybrid approach reveals complementary strengths: LightGBM achieves superior calibration and regression fidelity (68% Brier reduction, +17% AUC, +44% R²) for sustained risk assessment, while TCN demonstrates stronger acute event discrimination (+9.3% AUC, superior sensitivity) for detecting rapid deterioration. Together, they characterise short-term instability and longer-term exposure to physiological risk.
+ **The hybrid approach reveals complementary strengths:** LightGBM achieves superior calibration and regression fidelity (68% Brier reduction, +17% AUC, +44% R²) for sustained risk assessment, while TCN demonstrates stronger acute event discrimination (+9.3% AUC, superior sensitivity) for detecting rapid deterioration. Together, they characterise short-term instability and longer-term exposure to physiological risk.

  The complete pipeline includes clinically validated NEWS2 preprocessing (CO₂ retainer logic, GCS mapping, supplemental O₂ protocols), comprehensive feature engineering, robust evaluation, and model-specific interpretability (SHAP for LightGBM; gradient×input saliency for TCN).
@@ -786,15 +786,15 @@ Maintaining both feature sets ensures flexibility and robustness in model select
  3. Permute back for pooling → `(B, L, C_last)`
  4. Apply masked mean pooling → `(B, C_last)`
     - If mask provided, ignore padding, average over real timestamps.
- 6. **Optional dense head (if enabled)**
+ 5. **Optional dense head (if enabled)**
     - Linear → ReLU → Dropout
     - Adds non-linearity and regularisation after pooling
     - Output `(B, head_hidden)`
- 7. **Pass to task-specific heads:**
+ 6. **Pass to task-specific heads:**
     - `classifier_max`: binary logit `(B,)`
     - `classifier_median`: binary logit `(B,)`
     - `regressor`: continuous risk `(B,)`
- 8. **Output:**
+ 7. **Output:**
     - Return dictionary of patient-level predictions (ready for loss functions).
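
The masked mean pooling in step 4 can be illustrated with a small sketch. This is written in NumPy for brevity and is not the project's own code: the function name `masked_mean_pool`, the mask convention (1 for real timestamps, 0 for padding) and the toy shapes are assumptions made for the example.

```python
import numpy as np


def masked_mean_pool(x, mask=None):
    """Average over the time axis, ignoring padded timestamps.

    x:    (B, L, C_last) array of per-timestamp features
    mask: (B, L) array, 1 for real timestamps and 0 for padding (assumed convention)
    """
    if mask is None:
        return x.mean(axis=1)                    # plain mean over all L timestamps
    m = mask.astype(x.dtype)[:, :, None]         # (B, L, 1) so it broadcasts over channels
    summed = (x * m).sum(axis=1)                 # (B, C_last); padded steps contribute 0
    counts = np.clip(m.sum(axis=1), 1.0, None)   # (B, 1); guards against all-padding rows
    return summed / counts


# Two patients, four timestamps, three channels; patient 0 has two padded steps.
x = np.arange(24, dtype=np.float64).reshape(2, 4, 3)
mask = np.array([[1, 1, 0, 0],
                 [1, 1, 1, 1]])
pooled = masked_mean_pool(x, mask)
print(pooled.shape)  # (2, 3)
print(pooled[0])     # [1.5 2.5 3.5], the mean of the first two timestamps only
```

Dividing by the count of real timestamps, rather than by `L`, is what keeps short stays from being diluted by their padding.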
     - Compute individual losses with loss functions → compare predictions to true labels (`y_max, y_median, y_reg`)
     - Combine losses into 1 (`loss = loss_max + loss_median + loss_reg`) → one scalar loss value means each task contributes equally (multi-task learning).
-    - Backward pass → calculate gradients of this total loss w.r.t. every model parameter.
-    - Gradient clipping (`clip_grad_norm_`) → prevents exploding gradients (if gradients get too large, clipping rescales gradients, keeps training stable).
-    - Optimiser step → updates weights in opposite direction of the gradients.
-    - **This is the deep learning itself**: forward pass → loss → backward pass → update weights.
+    - Backward pass → calculate gradients of this total loss w.r.t. every model parameter
+    - Gradient clipping (`clip_grad_norm_`) → prevents exploding gradients (if gradients get too large, clipping rescales gradients, keeps training stable)
+    - Optimiser step → updates weights in opposite direction of the gradients
+    - **This is the deep learning itself**: forward pass → loss → backward pass → update weights
  3. **Track Average Training Loss per Epoch**
     - Weighted average over batch sizes → mean epoch training loss per patient
-    - Logged and compared with validation loss for analysis → see if model is learning.
+    - Logged and compared with validation loss for analysis → see if model is learning
  2. **Validation Loop**
     - Set model to evaluation mode → disable dropout, batch norm updates
-    - Run the model on validation split (no gradients or optimiser step).
-    - Compute and track average validation loss per epoch .
-    - Scheduler step → Update LR scheduler based on validation loss.
-    - **Logic**:
+    - Run the model on validation split (no gradients or optimiser step)
+    - Compute and track average validation loss per epoch
+    - Scheduler step → Update LR scheduler based on validation loss
+    - **Logic**:
       - When validation loss improves (validation loss ↓) → save model, final model state will be best one
-      - When validation loss stagnates/gets worse (validation loss ↑) → patience counter increases.
-      - Early stopping: Training stops early when overfitting begins (after 7 epochs of no improvement).
-    - **Rationale**: validation loss tells us if the model is generalising or just memorising training data .
+      - When validation loss stagnates/gets worse (validation loss ↑) → patience counter increases
+      - Early stopping: Training stops early when overfitting begins (after 7 epochs of no improvement)
+    - **Rationale**: validation loss tells us if the model is generalising or just memorising training data
  10. **Early Stopping**
-    - If validation loss improves → save .pt model
-    - If no improvement for 7 epochs → stop training early.
-    - **Rationale**: protects against overfitting and wasted compute.
+    - If validation loss improves → save .pt model
+    - If no improvement for 7 epochs → stop training early
+    - **Rationale**: protects against overfitting and wasted compute
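
The epoch-level bookkeeping described above (batch-size-weighted training loss, patience counter, stop after 7 epochs without improvement) can be sketched in plain Python. The names `epoch_mean_loss` and `EarlyStopper` and the validation-loss values are hypothetical, chosen only to show the control flow; the actual checkpointing call is left as a comment.

```python
def epoch_mean_loss(batch_losses, batch_sizes):
    """Batch-size-weighted mean, i.e. the mean loss per patient over the epoch."""
    total = sum(loss * n for loss, n in zip(batch_losses, batch_sizes))
    return total / sum(batch_sizes)


class EarlyStopper:
    """Tracks the best validation loss and counts epochs without improvement."""

    def __init__(self, patience=7):
        self.patience = patience
        self.best_loss = float("inf")
        self.best_epoch = None
        self.bad_epochs = 0

    def step(self, val_loss, epoch):
        """Record one epoch; return True when training should stop."""
        if val_loss < self.best_loss:
            self.best_loss = val_loss
            self.best_epoch = epoch
            self.bad_epochs = 0
            # A real loop would save the model checkpoint (.pt file) here.
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# A last, smaller batch pulls the epoch mean less than a full batch would.
print(epoch_mean_loss([1.0, 2.0], [3, 1]))  # 1.25

# Made-up validation losses: improve for 3 epochs, then drift upwards.
val_losses = [0.90, 0.80, 0.75] + [0.76 + 0.01 * i for i in range(10)]
stopper = EarlyStopper(patience=7)
stopped_at = None
for epoch, val_loss in enumerate(val_losses, start=1):
    if stopper.step(val_loss, epoch):
        stopped_at = epoch
        break

print(stopped_at, stopper.best_epoch)  # 10 3
```

Because the checkpoint is only written on improvement, the saved model is the epoch-3 state even though training ran until epoch 10.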

  #### 7.6.3 Loop Rationale
  - **Multi-task learning:** Losses from 3 outputs combined for joint learning.
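
As a rough illustration of how the three losses are combined in a single training step, here is a minimal PyTorch sketch. Everything beyond `loss = loss_max + loss_median + loss_reg` and `clip_grad_norm_` is an assumption: the `TinyMultiTask` stand-in model, the dictionary keys, the choice of `BCEWithLogitsLoss` and `MSELoss`, the Adam optimiser and `max_norm=1.0` are illustrative and not taken from the project.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)


class TinyMultiTask(nn.Module):
    """Hypothetical stand-in for the TCN: a shared trunk plus three heads."""

    def __init__(self, in_dim=8, hidden=16):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.classifier_max = nn.Linear(hidden, 1)
        self.classifier_median = nn.Linear(hidden, 1)
        self.regressor = nn.Linear(hidden, 1)

    def forward(self, x):
        h = self.trunk(x)
        # Dictionary keys are assumed names; each value has shape (B,)
        return {
            "logit_max": self.classifier_max(h).squeeze(-1),
            "logit_median": self.classifier_median(h).squeeze(-1),
            "regression": self.regressor(h).squeeze(-1),
        }


model = TinyMultiTask()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()  # assumed loss for the two binary heads
mse = nn.MSELoss()            # assumed loss for the regression head

# Synthetic batch of 4 "patients" with already-pooled features
x = torch.randn(4, 8)
y_max = torch.tensor([1.0, 0.0, 1.0, 0.0])
y_median = torch.tensor([0.0, 0.0, 1.0, 1.0])
y_reg = torch.randn(4)

model.train()
optimizer.zero_grad()
out = model(x)                                   # forward pass
loss_max = bce(out["logit_max"], y_max)
loss_median = bce(out["logit_median"], y_median)
loss_reg = mse(out["regression"], y_reg)
loss = loss_max + loss_median + loss_reg         # unweighted sum of the three tasks
loss.backward()                                  # gradients w.r.t. every parameter
# Rescale gradients if their global norm exceeds max_norm; returns the pre-clip norm
grad_norm = torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
optimizer.step()                                 # update the weights
```

An unweighted sum gives each task the same coefficient, although tasks whose losses are numerically larger still dominate the gradient; per-task weights would be the usual extension if that becomes a problem.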