
Commit ce86a44

committed
Update
1 parent 81d27cc commit ce86a44

2 files changed

Lines changed: 35 additions & 38 deletions


README.md

Lines changed: 35 additions & 31 deletions
Original file line number | Diff line number | Diff line change
@@ -65,7 +65,7 @@ Although the national standard for deterioration detection, NEWS2 has well-recog
6565

6666
##
6767
### 1.2 Clinical Escalation Context
68-
### 1.2.1 How NEWS2 Scoring Is Used
68+
#### 1.2.1 How NEWS2 Scoring Is Used
6969
NEWS2 scoring bands map directly to clinical monitoring frequency and escalation actions; these operational consequences define the clinical targets we aim to predict:
7070

7171
| NEWS2 Score | Clinical Risk | Monitoring Frequency | Clinical Response |
@@ -264,22 +264,24 @@ Maintaining both feature sets ensures flexibility and robustness in model select
264264
### 4.2 Timestamp-Level Features (for TCN)
265265
**Purpose:** Capture temporal dynamics for sequential modeling
266266

267-
#### 4.2.1 Imputation Strategy
267+
#### 4.2.1 Feature Computation
268+
269+
**Imputation Strategy**
268270
- **Missingness flags:** Binary indicators (1/0) for each vital parameter (1 = value was missing and carried forward) so models can learn from missingness patterns. Flags are computed before carry-forward so that originally missing values remain identifiable.
269271
- **LOCF (Last Observation Carried Forward) flags:** Propagate the previous valid measurement and create a binary (1/0) carried-forward flag.
270272

271-
#### 4.2.2 Rolling Window Features (1/4/24h)
273+
**Rolling Window Features (1/4/24h)**
272274
- **Mean:** Average value
273275
- **Min/Max:** Range boundaries
274276
- **Std**: Variability
275277
- **Slope:** Linear trend coefficient
276278
- **AUC (Area Under Curve):** Integral of value over time
277279

278-
#### 4.2.3 Other Features
280+
**Other Features**
279281
- **Staleness:** Time since last observation (staleness per vital)
280282
- **Numeric risk encoding:** Encoded as 0 (low), 1 (medium-low), 2 (medium), 3 (high)
281283

282-
#### 4.2.4 Output Format
284+
#### 4.2.2 Output Format
283285
- **Timestamp-level:** `news2_features_timestamp.csv`
284286
- ML-ready per patient per timestamp features (ordered by `subject_id` and by `charttime`).
285287
- Full time-series per patient allows modelling of temporal patterns, trends, dynamics
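
The imputation strategy in the hunk above (flag first, then carry forward) can be sketched in pandas. This is a minimal illustration under assumptions, not the repository's code: the helper name `add_locf_features`, the `_missing` column suffix, and the demo `heart_rate` values are hypothetical; only `subject_id` and `charttime` come from the README.

```python
import numpy as np
import pandas as pd


def add_locf_features(df, vital_cols, id_col="subject_id", time_col="charttime"):
    """Hypothetical helper: flag missing vitals, then carry the last value forward."""
    out = df.sort_values([id_col, time_col]).copy()
    for col in vital_cols:
        # Flag BEFORE filling, so originally missing values stay identifiable.
        out[f"{col}_missing"] = out[col].isna().astype(int)
        # LOCF within each patient only; values never leak across patients.
        out[col] = out.groupby(id_col)[col].ffill()
    return out


# Made-up demo data for two patients.
demo = pd.DataFrame(
    {
        "subject_id": [1, 1, 1, 2, 2],
        "charttime": pd.to_datetime(
            [
                "2024-01-01 00:00",
                "2024-01-01 01:00",
                "2024-01-01 02:00",
                "2024-01-01 00:00",
                "2024-01-01 01:00",
            ]
        ),
        "heart_rate": [80.0, np.nan, 90.0, np.nan, 70.0],
    }
)

features = add_locf_features(demo, ["heart_rate"])
print(features)
```

A value that is missing at the start of a patient's record has nothing to carry forward and stays `NaN` in this sketch; how the actual pipeline handles that case is not stated in the diff.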
@@ -402,7 +404,7 @@ Maintaining both feature sets ensures flexibility and robustness in model select
402404
- **Targets:** Patient-level outcomes (`max_risk`, `median_risk`, `pct_time_high`).
403405
- **Training approach:** Each timestep inherits the patient label → TCN maps whole sequence → patient-level prediction.
404406

405-
**Why not per-timestep prediction:**
407+
**Why not per-timestep prediction**
406408
- True sequence-to-sequence labeling would predict risk per timestep for richer early-warning capability.
407409
- Challenges: requires detailed labels for every timestamp (rare in ICU datasets); per-timestep prediction on a small dataset is prone to overfitting and instability; and evaluation is complex.
408410

@@ -495,7 +497,7 @@ Maintaining both feature sets ensures flexibility and robustness in model select
495497

496498
##
497499
### 6.3 Hyperparameter Tuning
498-
### 6.3.1 Tuning Process
500+
#### 6.3.1 Tuning Process
499501
- This is the only Phase 3 component used in later phases (Phase 5 LightGBM evaluation).
500502
- Tuned the four parameters with the highest impact on stability and generalisation for small tabular datasets:
501503
- `learning_rate` → controls step size; balances speed vs overfitting.
@@ -564,11 +566,13 @@ Maintaining both feature sets ensures flexibility and robustness in model select
564566
**Purpose:**
565567
- Build, configure, and train a causal deep-learning model that captures temporal deterioration patterns beyond what classical ML can learn.
566568
- Deliver a fully reproducible end-to-end pipeline: data preparation → model design → training → validation → diagnostics → refinement
567-
**Why This Phase Matters**
569+
570+
**Why This Phase Matters:**
568571
- Classical models (e.g., LightGBM) cannot model temporal dynamics; the TCN extends the system to sequence-level reasoning
569572
- Initial TCN runs revealed class imbalance and regression skew, requiring corrective steps (pos-weighting, log-transform)
570573
- This phase demonstrates mature ML workflow: identify issues → diagnose → correct → retrain
571-
**End Products of Phase 4**
574+
575+
**End Products of Phase 4:**
572576
- Clean model-ready preprocessed data and patient splits
573577
- Clean padded/masked tensor datasets for sequence modelling
574578
- Fully defined multi-task causal TCN architecture
@@ -613,24 +617,24 @@ Maintaining both feature sets ensures flexibility and robustness in model select
613617
- Add `max_risk`, `median_risk`, `pct_time_high` from patient-level data
614618
2. **Patient-level Stratified Split**
615619
- Split: Train/validation/test → 70/15/15
616-
- Stratified by the same binary risk labels used for LightGBM (see Section 6.2) → stratification keeps class proportions consistent across splits; random state fixed for reproducibility
617-
- Splitting by patient prevents leakage across sequences
620+
- Stratified by the same binary risk labels used for LightGBM (see Section 6.2) → stratification keeps class proportions consistent across splits; random state fixed for reproducibility
621+
- Splitting by patient prevents leakage across sequences
618622
3. **Feature Cleaning**
619-
- Identify all feature columns (exclude IDs and targets), remove unused categorical fields, convert certain labels to binary → 171 timestamp-level features
620-
- Separate continuous (for z-score scaling) vs binary (no scaling needed) features.
623+
- Identify all feature columns (exclude IDs and targets), remove unused categorical fields, convert certain labels to binary → 171 timestamp-level features
624+
- Separate continuous (for z-score scaling) vs binary (no scaling needed) features.
621625
4. **Normalisation**
622626
- Apply z-scoring to continuous variables (categorical features unchanged) on train/val/test splits → ensures features are on comparable scales, preserving trends.
623-
- Fit `StandardScaler()` on training patients only (avoids information leakage).
627+
- Fit `StandardScaler()` on training patients only (avoids information leakage).
624628
5. **Sequence Construction**
625-
- Group rows per patient.
626-
- Convert each patient to (timesteps × features) 2D NumPy arrays.
629+
- Group rows per patient.
630+
- Convert each patient to (timesteps × features) 2D NumPy arrays.
627631
6. **Sequence Padding/Truncation and Masking**
628-
- Use fixed length 96 hours → `MAX_SEQ_LEN = 96` for uniform input sizes
629-
- Short sequences → zero-pad; long sequences → truncate.
630-
- Masks mark real (1) vs padded (0) timesteps for loss computation.
632+
- Use fixed length 96 hours → `MAX_SEQ_LEN = 96` for uniform input sizes
633+
- Short sequences → zero-pad; long sequences → truncate.
634+
- Masks mark real (1) vs padded (0) timesteps for loss computation.
631635
7. **Stack Sequences + Mask tensors For Each Split (train/val/test):**
632636
- Sequences: `train.pt`, `val.pt`, `test.pt` → shape `(num_patients, 96, num_features)`
633-
- Masks: `train_mask.pt`, `val_mask.pt`, `test_mask.pt` → shape `(num_patients, 96)`
637+
- Masks: `train_mask.pt`, `val_mask.pt`, `test_mask.pt` → shape `(num_patients, 96)`
634638
8. **Save Preprocessed Artifacts**
635639
- `patient_splits.json` → dictionary of patient IDs train/val/test split
636640
- `standard_scaler.pkl` → z-score scaler (training-set mean/std)
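
Step 6 in the hunk above (pad or truncate to `MAX_SEQ_LEN = 96`, plus a real/padded mask) can be sketched in NumPy. The helper name `pad_or_truncate` is hypothetical, and keeping the first 96 timesteps when truncating is an assumption: the README does not say which end of a long sequence is dropped.

```python
import numpy as np

MAX_SEQ_LEN = 96  # fixed sequence length stated in the README


def pad_or_truncate(seq, max_len=MAX_SEQ_LEN):
    """Hypothetical helper: return a (max_len, n_features) array and a (max_len,) mask."""
    seq = np.asarray(seq, dtype=np.float32)
    # Assumption: long sequences keep their FIRST max_len timesteps.
    n_real = min(len(seq), max_len)
    padded = np.zeros((max_len, seq.shape[1]), dtype=np.float32)
    padded[:n_real] = seq[:n_real]
    # Mask: 1 for real timesteps, 0 for zero-padding.
    mask = np.zeros(max_len, dtype=np.float32)
    mask[:n_real] = 1.0
    return padded, mask


# Made-up patients: one shorter and one longer than MAX_SEQ_LEN, 3 features each.
short_seq, short_mask = pad_or_truncate(np.ones((10, 3)))
long_seq, long_mask = pad_or_truncate(np.ones((120, 3)))

# Stacking per-patient arrays gives the (num_patients, 96, num_features) layout.
batch = np.stack([short_seq, long_seq])
batch_mask = np.stack([short_mask, long_mask])
print(batch.shape, batch_mask.shape)
```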
@@ -704,16 +708,16 @@ Maintaining both feature sets ensures flexibility and robustness in model select
704708
- Residual connection (adds input back to output → maintain gradient flow in deep stacks)
705709
- Purpose is for feature extraction
706710
3. **Dilated, Stacked TCN Layers**
707-
- TemporalBlocks stacked with exponentially increasing dilations (1 → 2 → 4 → …).
708-
- Expands the receptive field efficiently, enabling modelling of: short-range changes (first layers) → medium-range trends → long-range deterioration patterns (deeper layers) without huge kernels
709-
- Final block outputs tensor `(B, C_last, L)`
711+
- TemporalBlocks stacked with exponentially increasing dilations (1 → 2 → 4 → …).
712+
- Expands the receptive field efficiently, enabling modelling of: short-range changes (first layers) → medium-range trends → long-range deterioration patterns (deeper layers) without huge kernels
713+
- Final block outputs tensor `(B, C_last, L)`
710714
4. **Masked Mean Pooling**
711715
- Aggregates variable-length (padded) patient sequences into a fixed-size vector `(B, C_last)` per patient for downstream heads
712716
- Masked pooling computes the mean over only real (non-padded) timesteps → ignores padded timesteps to prevent gradient/feature distortion
713717
5. **Dense Head (Optional)**
714-
- Patient vector: Linear → ReLU (non-linearity) → Dropout.
715-
- Used to mix pooled features → adds extra representational capacity before task heads
716-
- Can be disabled for a direct connection from pooled features → task heads.
718+
- Patient vector: Linear → ReLU (non-linearity) → Dropout.
719+
- Used to mix pooled features → adds extra representational capacity before task heads
720+
- Can be disabled for a direct connection from pooled features → task heads.
717721
6. **Multi-Task Output Heads**
718722
- Separate linear heads generate patient-level outputs:
719723
- Classification: `classifier_max` (max risk), `classifier_median` (median risk)
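
To make the effect of exponentially increasing dilations concrete, the receptive field of a causal dilated stack can be computed directly: each causal convolution with kernel size `k` and dilation `d` extends the receptive field by `(k - 1) * d` timesteps. The sketch below is illustrative only. The kernel size of 3 and the two convolutions per block are assumptions, since the diff does not state either value.

```python
def receptive_field(kernel_size, dilations, convs_per_block=2):
    """Timesteps visible to one output position of a stack of dilated causal convs.

    Assumption: every block applies `convs_per_block` convolutions that share
    the block's dilation (a common TCN layout, not confirmed by the README).
    """
    return 1 + convs_per_block * (kernel_size - 1) * sum(dilations)


# Dilations 1 -> 2 -> 4 as described; kernel_size=3 is a hypothetical value.
dilations = [2 ** i for i in range(3)]
rf = receptive_field(kernel_size=3, dilations=dilations)
print(dilations, rf)  # [1, 2, 4] 29
```

Under these assumed settings, three blocks already cover 29 of the 96 timesteps; adding a fourth block with dilation 8 would raise this to 61 without enlarging the kernel.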
@@ -735,7 +739,7 @@ Maintaining both feature sets ensures flexibility and robustness in model select
735739
2. **Stack temporal blocks:** Builds TCN layers with exponentially increasing dilations (1, 2, 4, …).
736740
2. Sets feature dimension = last block's channel size → ready for dense head.
737741
3. **Creates optional dense head if `head_hidden` is provided:** Linear → ReLU → Dropout
738-
4. Defines three final linear heads for multi-task prediction → one scalar prediction per patient.
742+
4. Defines three final linear heads for multi-task prediction → one scalar prediction per patient:
739743
- `classifier_max`
740744
- `classifier_median`
741745
- `regressor`
@@ -778,13 +782,13 @@ Maintaining both feature sets ensures flexibility and robustness in model select
778782
- Downsample (1×1 conv): reshapes channels for residual addition
779783
- Residual/skip connection: add original input, gradient shortcut path back
780784
- Output tensor `(B, C_last, L)` → ready for next block or pooling or final classifier
781-
- Dilations increase per block (1 → 2 → 4), expanding receptive field
785+
- Dilations increase per block (1 → 2 → 4), expanding receptive field
782786
3. Permute back for pooling → `(B, L, C_last)`
783787
4. Apply masked mean pooling → `(B, C_last)`
784788
- If mask provided, ignore padding, average over real timestamps.
785789
6. **Optional dense head (if enabled)**
786790
- Linear → ReLU → Dropout
787-
- Adds non-linearity and regularisation after pooling
791+
- Adds non-linearity and regularisation after pooling
788792
- Output `(B, head_hidden)`
789793
7. **Pass to task-specific heads:**
790794
- `classifier_max`: binary logit `(B,)`
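
The masked mean pooling step in the forward pass above reduces `(B, L, C_last)` to `(B, C_last)` while ignoring padded timesteps. A minimal NumPy sketch of that arithmetic follows; the function name `masked_mean_pool` and the `eps` guard are assumptions, and the same operations carry over to PyTorch tensors.

```python
import numpy as np


def masked_mean_pool(features, mask=None, eps=1e-8):
    """Hypothetical helper: average (B, L, C) over real timesteps only.

    features: array of shape (B, L, C)
    mask:     array of shape (B, L), 1 = real timestep, 0 = padding
    """
    if mask is None:
        # No mask provided: plain mean over the time axis.
        return features.mean(axis=1)
    m = mask[:, :, None].astype(features.dtype)  # (B, L, 1), broadcasts over C
    summed = (features * m).sum(axis=1)          # padded steps contribute 0
    counts = np.maximum(m.sum(axis=1), eps)      # avoid division by zero
    return summed / counts


# One patient, three timesteps, two channels; the last timestep is padding.
features = np.array([[[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]])
mask = np.array([[1, 1, 0]])

pooled = masked_mean_pool(features, mask)
print(pooled)  # [[2. 3.]]
```

Without the mask, the padded timestep would pull the pooled vector towards its values, which is the distortion the README's masking is meant to prevent.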
@@ -816,7 +820,7 @@ The following hyperparameters were used when training the final TCN model and st
816820

817821
#### 7.4.1 Core Training Parameters
818822
| Component | Value | Notes |
819-
|----------|--------|-------|
823+
|----------|-------------|-------|
820824
| Device | `cuda` (if available) or `cpu` | GPU acceleration |
821825
| Batch size | `32` | Number of patient sequences processed in one pass |
822826
| Epochs | `50` | Number of complete passes through the training dataset before stopping; more epochs → more weight adjustments to reduce loss |
@@ -875,7 +879,7 @@ The following hyperparameters were used when training the final TCN model and st
875879
│ • train/val/test sequences │
876880
│ • train/val/test masks │
877881
│ • patient_splits.json │
878-
TCNModel (tcn_model.py)
882+
│ • TCNModel (tcn_model.py)
879883
│ • config_refined.json │
880884
└──────────────┬────────────────┘
881885

requirements.txt

Lines changed: 0 additions & 7 deletions
Original file line number | Diff line number | Diff line change
@@ -4,19 +4,12 @@ pandas>=2.1.0
44

55
# Visualization
66
matplotlib>=3.7.0
7-
seaborn>=0.12.2
87

98
# Machine learning (baseline, more later)
109
scikit-learn>=1.3.0
1110

1211
# Deep learning
1312
torch>=2.2.0
14-
torchvision>=0.17.0
15-
torchaudio>=2.2.0
16-
17-
# Jupyter for exploration
18-
jupyterlab>=4.0.0
19-
notebook>=7.0.0
2013

2114
# Utilities and file handling
2215
tqdm>=4.66.0
