MBBS and Applied Machine Learning Engineer building end-to-end clinical ML systems across time-series modelling and clinical NLP using large-scale medical datasets.
Developed pipelines with a core emphasis on clinical transparency and model interpretability, using PyTorch, scikit-learn, Hugging Face, and Google Cloud Run deployment with GitHub Actions versioning.
Currently contributing to R&D workflows using GKE-based cloud infrastructure for a production-scale clinical NLP system at RadNomics.
Python | PyTorch | Hugging Face | scikit-learn | pandas | FastAPI | Docker | Google Cloud Run | GitHub Actions
- Cloud-deployed hybrid clinical NLP system combining rule-based extraction and transformer validation to generate structured entity outputs from ICU progress notes for downstream analysis and ML workflows
- Implemented regex-based schemas for recall-focused extraction of 3 clinical entity types; fine-tuned a BioClinicalBERT classifier on 1,000+ manually annotated entities as a precision-oriented validation layer
- Extracted 780,000+ structured entities from filtered adult ICU corpus of 160,000+ notes (30,000+ stays)
- Transformer validation achieved +45.9% in precision and −83.3% in false positives relative to rule-only baseline
- Deployed inference pipeline as stateless, containerised API on Google Cloud Run; versioned via GitHub Actions
Access Live API | View Repository | https://doi.org/10.5281/zenodo.20018309
Python | PyTorch | LightGBM | scikit-learn | pandas | NumPy
- Dual-architecture ICU early warning system combining Temporal CNN (TCN) and LightGBM to predict NEWS2-derived deterioration outcomes across 3 clinical risk dimensions
- Clinically validated data preprocessing included CO2 retainer logic, GCS mapping, and supplemental O2 protocols
- Engineered 171 timestamp-level features (8 vital parameters; 96-hour windows) and 40 aggregated patient-level features from 70,000+ time-series observations across 140 ICU stays
- TCN achieved +9.3% AUC improvement for acute-event detection; LightGBM achieved −68% Brier score and −48% RMSE for prolonged risk exposure
- Implemented SHAP and saliency mapping for clinician-interpretable feature insights
View Repository | https://doi.org/10.5281/zenodo.18487174
Applied Machine Learning Engineer @ RadNomics Ltd
- Contributed to R&D workflows for a production clinical NLP system within GCP/GKE cloud infrastructure
- Processed 2.3M+ radiology reports involving medical data cleaning, preprocessing, and feature engineering
- Implemented report augmentation pipeline, generating 17M+ report pairs for downstream LLM modelling and evaluation workflows
- Used containerised remote development environments, Kubernetes pods, and Git-based collaboration under senior engineering supervision
- Machine Learning: PyTorch, scikit-learn, LightGBM, Hugging Face Transformers, NLP
- Cloud / Infra: Google Cloud Platform (GKE, Cloud Run), Kubernetes, Docker
- Software Engineering: Python (OOP), FastAPI, Git/GitHub, CI/CD (GitHub Actions)
- Data: Pandas, NumPy, SQL
- MSc, Computer Science with Artificial Intelligence @ City St George’s, University of London
- Relevant Modules: Machine Learning, Artificial Intelligence, Cloud Computing, Software Engineering, Databases, Big Data Analytics and Visualisation
- Final Project: Evaluating Retrieval-Augmented Generation for Radiology Impression Drafting Using Clinically Significant Error Metrics
- MBBS, Medicine @ Norwich Medical School, University of East Anglia
- Yip, S. (2026). Time-Series ICU Patient Deterioration Predictor (1.0.0). Zenodo. https://doi.org/10.5281/zenodo.18487174
- Yip, S. (2026). Clinical Entity Extraction-Validation System (1.0.0). Zenodo. https://doi.org/10.5281/zenodo.20018309
MBBS Medical Student @ Norwich Medical School, University of East Anglia
Audit & Research Assistant @ Norfolk and Norwich University Hospital
- Lacertus syndrome and its surgical management using WALANT - our first 12 cases (Research Poster)
- Giant trichoblastic carcinoma initially misdiagnosed as basal cell carcinoma (Case Report)
- Head and Neck Surgery, Integrated Care Pathway Surgical Proforma Audit
- Plastic Surgery, Free Flap Surgical Outcomes Audit
- Clinical Informatics: EHR Systems (ICE, SystmOne, MediViewer, EPMA), HL7-FHIR, NEWS2, GDPR
- Clinical Research: Audit Methodology, Literature Review, Critical Appraisal, Manuscript Preparation



