This project implements a multimodal machine learning pipeline that uses AutoGluon to predict survival in head and neck cancer patients. The model combines structured clinical records, surgery report text, tissue microarray measurements, and tumor annotations, with the goal of clinically relevant performance in survival prediction.
The objective is to develop an AutoGluon MultiModal model that predicts the survival status of head and neck cancer patients (deceased vs. living) by integrating multiple types of medical data.
This project uses the HANCOCK dataset from hancock.research.fau.edu.
From the full HANCOCK dataset, this project specifically uses:
- Structured Data
  - clinical_data.json - Patient demographics and treatment history
  - pathological_data.json - Tumor staging and molecular markers
  - blood_data.json - Laboratory measurements
  - blood_data_reference_ranges.json - Reference ranges
- Text Data
  - German surgery reports
  - English translations of surgery reports
- TMA Cell Density Measurements
  - Tissue microarray (TMA) cell density quantification
  - 6,332 individual measurements across patients
- Primary Tumor Annotations
  - Geometric annotations of primary tumors from whole-slide images (WSI)
  - Used for extracting tumor shape and spatial features
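
To make the integration step concrete, here is a minimal sketch of how the cell density measurements and tumor annotations could be reduced to one feature row per patient. It is not the project's actual code. The field names (`patient_id`, `cell_density`), the function names, and the assumption that each annotation is a closed polygon of (x, y) vertices are all hypothetical; the real HANCOCK files may be structured differently.

```python
import math
from collections import defaultdict
from statistics import mean


def polygon_shape_features(vertices):
    """Area, perimeter, and compactness of a closed polygon of (x, y) pairs."""
    n = len(vertices)
    if n < 3:
        raise ValueError("a polygon needs at least three vertices")
    twice_area = 0.0
    perimeter = 0.0
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]  # wrap around to close the polygon
        twice_area += x0 * y1 - x1 * y0  # shoelace formula term
        perimeter += math.hypot(x1 - x0, y1 - y0)
    area = abs(twice_area) / 2.0
    # Compactness is 1.0 for a circle and smaller for irregular shapes.
    compactness = 4.0 * math.pi * area / perimeter ** 2 if perimeter > 0 else 0.0
    return {
        "tumor_area": area,
        "tumor_perimeter": perimeter,
        "tumor_compactness": compactness,
    }


def mean_density_per_patient(measurements):
    """Collapse many measurements per patient into one mean value each."""
    grouped = defaultdict(list)
    for m in measurements:
        # "patient_id" and "cell_density" are assumed field names.
        grouped[m["patient_id"]].append(m["cell_density"])
    return {pid: mean(values) for pid, values in grouped.items()}


def build_rows(clinical, measurements, annotations):
    """Join clinical records with derived features, one row per patient.

    `annotations` is assumed to map patient_id -> list of polygon vertices.
    Patients without measurements or annotations keep missing values.
    """
    density = mean_density_per_patient(measurements)
    rows = []
    for record in clinical:
        pid = record["patient_id"]
        row = dict(record)  # copy so the input records stay unchanged
        row["mean_cell_density"] = density.get(pid)
        if pid in annotations:
            row.update(polygon_shape_features(annotations[pid]))
        rows.append(row)
    return rows
```

Rows built this way could be loaded into a pandas DataFrame, together with the surgery report text as an additional column, and passed to AutoGluon's `MultiModalPredictor` with the survival status column as the label. Averaging the cell density per patient is only one possible aggregation; other summaries may suit the data better.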