🏥 AstraMed: Clinical Risk AI

Next-Generation Predictive Analytics & Decision Support System

Hugging Face Spaces Vercel Deployment License

AstraMed represents a paradigm shift in

Explore Live App | Test API Engine | View Code

Python FastAPI React TypeScript Docker



🌍 Live Deployment

| Component | Status | Stack | Link |
|---|---|---|---|
| Prediction Engine | 🟢 Online | FastAPI + XGBoost/CatBoost | API Docs |
| ML Inference Node | 🟢 Online | Python 3.10 | Model Spaces |
| Frontend App | 🟢 Online | React + TypeScript | Live App |

Interactive Demo: Visit the API Docs link to explore the Swagger documentation and test the model inference directly.


🎯 Problem Statement

🏥 Track 1: Clinical Decision Support

graph LR
    A[😷 Silent Disease<br/>Progression] --> B[⏰ Late Detection]
    B --> C[💰 Costly<br/>Interventions]
    C --> D[📉 Poor<br/>Outcomes]

    style A fill:#ff6b6b,stroke:#c92a2a,color:#fff
    style B fill:#ffa94d,stroke:#e8590c,color:#fff
    style C fill:#ffd43b,stroke:#fab005,color:#000
    style D fill:#ff6b6b,stroke:#c92a2a,color:#fff

🔍 Context

Chronic diseases such as diabetes often develop silently. By the time symptoms appear, interventions become costly and outcomes worsen. Clinicians operate under:

  • ⏱️ Time Pressure: Limited consultation windows
  • 📊 Data Gaps: Incomplete historical records
  • ❓ Uncertainty: Complex probabilistic assessments

Meanwhile, patients struggle to understand probabilistic health risks and preventive actions.

⚠️ The Challenge

Design a clinical decision support workflow that:

  • ✅ Surfaces early risk signals from routine patient data
  • ✅ Supports informed, timely interventions
  • ✅ Doesn't overwhelm doctors or mislead patients

💡 What Our Solution Enables

🔬 For Clinicians

  • 📈 High-density risk scores with confidence intervals
  • 🎯 Key contributing factors ranked by importance
  • 📊 SHAP-based explanations and visualizations
  • 💊 Evidence-based action recommendations
  • 📉 Longitudinal trend analysis

👤 For Patients

  • 🚦 Simple risk gauges (traffic light system)
  • 📝 Plain-language summaries
  • 🥗 Personalized lifestyle guidance
  • 📱 Progress tracking over time
  • ✨ AI-generated action plans

💡 Solution Architecture

🏗️ System Design

image

🔄 Data Flow Pipeline

image

🚀 Key Features

🎯 Core Capabilities

1️⃣ Risk Scoring & Stratification

  • 📊 Multi-Level Classification: Low / Medium / High risk tiers
  • 📈 Confidence Intervals: Uncertainty quantification
  • 📉 Longitudinal Tracking: Risk velocity over time
  • 🎯 Percentile Rankings: Population-based context
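
As a rough illustration of the tiering and longitudinal tracking listed above, the sketch below maps a probability to a tier and computes a simple risk velocity. The thresholds (0.20 and 0.50) and both function names are assumptions made for this example; the README does not state the cut-offs or the formula the project actually uses.

```python
from bisect import bisect_right
from typing import Sequence

# Hypothetical cut-offs between tiers; the README does not state the real ones.
TIER_THRESHOLDS = (0.20, 0.50)
TIER_LABELS = ("Low", "Medium", "High")


def risk_tier(probability: float) -> str:
    """Map a probability in [0, 1] to a Low / Medium / High tier."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    # bisect_right counts how many thresholds are <= probability.
    return TIER_LABELS[bisect_right(TIER_THRESHOLDS, probability)]


def risk_velocity(scores: Sequence[float], days: Sequence[float]) -> float:
    """Average change in risk per day between the first and last visit."""
    if len(scores) != len(days) or len(scores) < 2:
        raise ValueError("need at least two visits with matching timestamps")
    elapsed = days[-1] - days[0]
    if elapsed <= 0:
        raise ValueError("visits must be in chronological order")
    return (scores[-1] - scores[0]) / elapsed
```

A positive velocity means the score is rising between visits. A production version would more likely fit a trend across all visits than compare only the endpoints.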

2️⃣ Explainability & Transparency

  • 🔍 SHAP Values: Feature importance rankings
  • 📊 Force Plots: Visual explanation of predictions
  • 🎨 Interactive Charts: Drill-down analysis
  • 📋 Audit Trails: Complete decision logs

3️⃣ Counterfactual Reasoning

  • 🎛️ What-If Scenarios: "Reduce BMI by 5% → Risk ↓15%"
  • 🔄 Interactive Simulation: Real-time slider controls
  • 🎯 Modifiable Factors: Focus on actionable changes
  • 📈 Impact Visualization: Before/after comparisons
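
A what-if scenario of this kind can be sketched as "score the patient, apply the proposed changes, score again". The example below is a minimal sketch under stated assumptions: `toy_risk` is a stand-in logistic model with invented coefficients (not the project's ensemble), and the set of modifiable factors is hypothetical.

```python
import math
from typing import Callable, Dict

# Hypothetical coefficients for a stand-in model; NOT the trained ensemble.
COEFFICIENTS = {"bmi": 0.09, "hba1c": 0.85, "age": 0.03}
INTERCEPT = -9.5

# Assumed list of factors a patient can act on; age is deliberately excluded.
MODIFIABLE = {"bmi", "hba1c"}


def toy_risk(patient: Dict[str, float]) -> float:
    """Logistic risk score from the illustrative coefficients above."""
    z = INTERCEPT + sum(COEFFICIENTS[name] * patient[name] for name in COEFFICIENTS)
    return 1.0 / (1.0 + math.exp(-z))


def what_if(
    predict: Callable[[Dict[str, float]], float],
    patient: Dict[str, float],
    changes: Dict[str, float],
) -> Dict[str, float]:
    """Compare risk before and after changing modifiable factors only."""
    blocked = set(changes) - MODIFIABLE
    if blocked:
        raise ValueError(f"not modifiable: {sorted(blocked)}")
    modified = {**patient, **changes}  # copy, so the original record is untouched
    before, after = predict(patient), predict(modified)
    return {"before": before, "after": after, "delta": after - before}
```

Calling `what_if(toy_risk, patient, {"bmi": 29.0, "hba1c": 5.7})` returns the two scores and their difference, which is the kind of value a before/after chart or slider control would display.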

4️⃣ AI-Powered Reports

  • 📝 Clinical Summaries: Technical detail for providers
  • 👤 Patient Explanations: Plain-language versions
  • 🤖 BioMistral-7B: Medical-grade language model
  • 📄 PDF Generation: Exportable reports

5️⃣ Population Analytics

  • 👥 Digital Twin Matching: Find similar patient outcomes
  • 📊 Cohort Analysis: Demographic comparisons
  • 🎯 Percentile Context: "Your risk is higher than 82% of peers"
  • 📈 Trend Detection: Population-level patterns
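
One plausible reading of digital twin matching and percentile context is a nearest-neighbour search over standardized features plus a rank within the cohort. The sketch below assumes exactly that; the function names, the feature names and the choice of Euclidean distance are illustrative and are not taken from the project's code.

```python
import math
from typing import Dict, List, Sequence


def percentile_rank(score: float, cohort_scores: Sequence[float]) -> float:
    """Percentage of cohort scores that are strictly below `score`."""
    if not cohort_scores:
        raise ValueError("cohort must not be empty")
    below = sum(1 for other in cohort_scores if other < score)
    return 100.0 * below / len(cohort_scores)


def _feature_scales(cohort: List[Dict], features: Sequence[str]) -> Dict[str, float]:
    """Population standard deviation per feature (1.0 if a feature is constant)."""
    scales = {}
    for name in features:
        values = [row[name] for row in cohort]
        mean = sum(values) / len(values)
        std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
        scales[name] = std or 1.0
    return scales


def nearest_twins(patient: Dict, cohort: List[Dict],
                  features: Sequence[str], k: int = 3) -> List[Dict]:
    """Return the k cohort records closest to the patient.

    Distance is Euclidean after dividing each feature difference by the
    cohort standard deviation, so no single unit dominates the match.
    """
    scales = _feature_scales(cohort, features)

    def distance(row: Dict) -> float:
        return math.sqrt(sum(((row[f] - patient[f]) / scales[f]) ** 2
                             for f in features))

    return sorted(cohort, key=distance)[:k]
```

The outcomes recorded for the returned twins can then be summarized for the clinician, and `percentile_rank` supplies the "higher than N% of peers" figure.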

6️⃣ Pro Max UI/UX Experience

  • 💎 Glassmorphism 2.0: Premium, accessible aesthetic
  • 🔲 Bento Grid Layout: Information-dense, organized dashboard
  • 🎛️ Interactive Sliders: Real-time "What-If" adjustments
  • 📊 Radial Risk Gauges: Dynamic, animated risk visualization

🧠 The Machine Learning Engine

AstraMed is powered by an enterprise-grade Ensemble Learning Pipeline designed for high-stakes clinical environments where accuracy and explainability are paramount.

🔬 Architecture: The "Tri-Force" Ensemble

Instead of relying on a single model, we leverage a Soft-Voting Ensemble of three industry-leading gradient boosting algorithms:

  1. XGBoost (eXtreme Gradient Boosting): Optimized for speed and performance on structured clinical data.
  2. CatBoost (Categorical Boosting): Handles categorical features (e.g., "Gender", "Smoking History") natively, using ordered target statistics to limit target leakage.
  3. LightGBM: Provides high efficiency on large-scale datasets.
graph TD
    A[Patient Data] --> B[Preprocessing Pipeline]
    B --> C{Ensemble Core}
    C -->|Probability| D[XGBoost]
    C -->|Probability| E[CatBoost]
    C -->|Probability| F[LightGBM]
    D & E & F --> G[Soft Voting Aggregator]
    G --> H[Final Risk Score]
    H --> I[Calibration Layer]
    I --> J[Risk Stratification]
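
Soft voting, as drawn in the diagram above, averages the probability each model assigns to the positive class. The sketch below shows that aggregation step only. The three stub functions are hypothetical stand-ins returning fixed values, not the trained XGBoost, CatBoost and LightGBM models, and the optional weights are an assumption (the README does not say whether the ensemble is weighted).

```python
from typing import Callable, Mapping, Optional, Sequence

Model = Callable[[Mapping[str, float]], float]


def soft_vote(models: Sequence[Model], patient: Mapping[str, float],
              weights: Optional[Sequence[float]] = None) -> float:
    """Weighted average of each model's positive-class probability."""
    probabilities = [model(patient) for model in models]
    if weights is None:
        weights = [1.0] * len(probabilities)  # plain average by default
    if len(weights) != len(probabilities):
        raise ValueError("one weight per model is required")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(w * p for w, p in zip(weights, probabilities)) / total


# Stand-ins for the three boosters (hypothetical fixed outputs).
def xgboost_stub(patient: Mapping[str, float]) -> float:
    return 0.6


def catboost_stub(patient: Mapping[str, float]) -> float:
    return 0.7


def lightgbm_stub(patient: Mapping[str, float]) -> float:
    return 0.8


if __name__ == "__main__":
    stubs = [xgboost_stub, catboost_stub, lightgbm_stub]
    print(soft_vote(stubs, {"bmi": 31.0, "hba1c": 6.4}))
```

With scikit-learn estimators, `VotingClassifier(voting="soft")` performs the same averaging over `predict_proba` outputs; whether the project uses that class or a custom aggregator is not stated here.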

🔍 Explainable AI (XAI) with SHAP

We solve the "Black Box" problem using SHAP (SHapley Additive exPlanations). Every prediction comes with a mathematical justification:

  • Local Interpretability: Why did this specific patient get a high risk score? (e.g., "+15% due to High HbA1c").
  • Global Interpretability: What factors drive disease risk across the entire population?
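
To make the local-interpretability idea concrete, the sketch below computes exact Shapley values by brute force for a tiny stand-in model. It is an illustration of the underlying definition, not the project's SHAP integration: `toy_risk`, its coefficients and the baseline patient are all invented for this example, and the enumeration grows as 2^n, which is why the SHAP library provides faster tree-specific explainers such as `TreeExplainer`.

```python
from itertools import combinations
from math import factorial
from typing import Callable, Dict


def shapley_values(predict: Callable[[Dict[str, float]], float],
                   patient: Dict[str, float],
                   baseline: Dict[str, float]) -> Dict[str, float]:
    """Exact Shapley value of each feature for one prediction.

    Features outside a coalition take their baseline value, so the
    contributions sum to predict(patient) - predict(baseline).
    """
    names = list(patient)
    n = len(names)

    def value(coalition: frozenset) -> float:
        mixed = {f: (patient[f] if f in coalition else baseline[f]) for f in names}
        return predict(mixed)

    contributions = {}
    for feature in names:
        others = [f for f in names if f != feature]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size: |S|! (n-|S|-1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = frozenset(subset)
                total += weight * (value(members | {feature}) - value(members))
        contributions[feature] = total
    return contributions


def toy_risk(p: Dict[str, float]) -> float:
    """Hypothetical linear risk model used only for this illustration."""
    return 0.10 + 0.08 * (p["hba1c"] - 5.5) + 0.01 * (p["bmi"] - 25) + 0.002 * (p["age"] - 40)
```

For a patient whose HbA1c is well above the baseline, the `hba1c` entry in the result is the largest positive contribution, which is the kind of statement a clinician-facing explanation panel would surface.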

🔄 "What-If" Counterfactual Simulation

AstraMed goes beyond static predictions. Our Counterfactual Engine allows clinicians to simulate outcomes:

"If the patient reduces BMI by 2 points and lowers HbA1c to 5.7%, how does their 5-year risk change?"

This empowers shared decision-making and personalized goal setting.


🛠️ Tech Stack

Frontend

React TypeScript Tailwind Vite Recharts Lucide

Backend & ML

Python FastAPI scikit-learn XGBoost SHAP Pandas

DevOps & Infrastructure

Docker Vercel Hugging Face Git

📊 Technology Matrix

| Layer | Technology | Purpose |
|---|---|---|
| 🎨 Frontend | React + TypeScript | Interactive UI components |
| 🎨 Styling | Tailwind CSS | Responsive design system |
| ⚡ Backend | FastAPI | High-performance REST API |
| 🧠 ML Engine | XGBoost + LightGBM + CatBoost | SOTA ensemble prediction |
| 🔍 Explainability | SHAP | Feature importance analysis |
| 🤖 AI Engine | BioMistral-7B | Medical language model |
| 💾 Database | JSON Store (MVP) → PostgreSQL | Patient history & records |
| 🐳 Container | Docker + Docker Compose | Consistent deployment |
| 🚀 Deployment | Hugging Face (Backend) + Vercel (Frontend) | Cloud hosting |

👥 Team Structure

🎯 4-Member Multidisciplinary Team

🔬 ML Engineer

Model Development & Explainability

🎯 Responsibilities

  • 📊 Dataset cleaning and exploratory data analysis
  • 🤖 Risk model development (XGBoost, LightGBM, CatBoost)
  • 📈 Uncertainty quantification and calibration
  • 🔍 SHAP-based feature importance
  • 🎲 Counterfactual reasoning implementation
  • ⚖️ Bias detection and fairness analysis

📦 Deliverables

  • ml-research/train.py: Model training pipeline
  • backend/models/risk_model.py: Inference engine
  • backend/models/explainability.py: SHAP integration
  • Model performance reports and visualizations

⚙️ Backend Engineer

FastAPI Services & Infrastructure

🎯 Responsibilities

  • 🏗️ API architecture and endpoint design
  • 📥 Patient data ingestion and validation
  • 🔐 Authentication and authorization
  • 📊 Risk computation API endpoints
  • 👥 Cohort analysis and digital twin matching
  • 🚀 Deployment setup (Docker, Render)

📦 Deliverables

  • backend/app.py: Main FastAPI application
  • backend/routes/: All API endpoints
  • backend/schemas/: Pydantic models
  • API documentation (OpenAPI/Swagger)

👨‍⚕️ Frontend Engineer (Clinician)

Professional Dashboard Interface

🎯 Responsibilities

  • 🎨 Clinician dashboard UI/UX design
  • 🔍 Patient search and filtering system
  • 📊 Risk score visualization (gauges, charts)
  • 🎯 Key driver display components
  • 📋 Explanation panels and tooltips
  • 💊 Action recommendation interface

📦 Deliverables

  • frontend/src/components/Clinician/: Dashboard components
  • Risk visualization library
  • Clinical workflow integration
  • Responsive design implementation

👤 Frontend Engineer (Patient)

Patient Portal & Documentation

🎯 Responsibilities

  • 🎨 Patient portal UI/UX design
  • 🚦 Simple risk gauge (traffic light)
  • 📝 Plain-language explanation generation
  • 🥗 Lifestyle recommendation interface
  • 📈 Progress tracking visualizations
  • 📚 Project documentation and pitch deck

📦 Deliverables

  • frontend/src/components/Patient/: Patient components
  • User-friendly health guidance interface
  • docs/: Comprehensive documentation
  • Presentation slides and demo materials

📅 Development Timeline

📋 Detailed Sprint Plan

🗓️ Week 1: Design & Core Model (by Jan 24)

  • Project Setup

    • Initialize GitHub repository with proper structure
    • Set up development environments (Python, Node.js)
    • Configure CI/CD pipelines (GitHub Actions)
    • Create project board and issue templates
  • ML Foundation

    • Load and explore diabetes dataset
    • Perform statistical analysis and visualization
    • Handle missing values and outliers
    • Feature engineering (interactions, scaling)
    • Initial model prototyping
  • Backend Architecture

    • Design API schema (Pydantic models)
    • Set up FastAPI boilerplate
    • Implement health check endpoints
    • Configure CORS and middleware
  • Frontend Design

    • Create wireframes for clinician dashboard
    • Design patient portal mockups
    • Set up React + TypeScript + Vite
    • Implement component structure

🗓️ Week 2: Full Stack Development (by Jan 31)

  • ML Pipeline

    • Train ensemble model (XGBoost + LightGBM + CatBoost)
    • Implement SHAP explainability
    • Calculate feature importance
    • Serialize models (.joblib files)
    • Validate model performance (AUC-ROC, calibration)
  • Backend APIs

    • /predict: Risk assessment endpoint
    • /simulate: What-if analysis endpoint
    • /report: AI report generation endpoint
    • /history: Patient timeline endpoint
    • /cohort: Population analysis endpoint
  • Frontend Integration

    • Clinician dashboard with risk visualization
    • Patient portal with simple gauges
    • Connect to backend APIs
    • Implement state management
    • Add loading states and error handling
  • End-to-End Testing

    • Integration tests for API endpoints
    • UI component tests
    • Full workflow validation
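
The ML Pipeline task above names AUC-ROC and calibration as the validation criteria. As a dependency-free sketch of what those two checks measure, the functions below compute AUC-ROC from pairwise comparisons and the Brier score as a simple calibration-sensitive metric. The function names are illustrative, and the choice of the Brier score is an assumption; the README does not say which calibration metric the project reports.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive case outranks a random negative one.

    Ties count as half a win. This is the pairwise (Mann-Whitney) form of AUC-ROC.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


def brier_score(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Mean squared gap between predicted probability and outcome (lower is better)."""
    if len(labels) != len(scores) or not labels:
        raise ValueError("labels and scores must be non-empty and the same length")
    return sum((s - y) ** 2 for y, s in zip(labels, scores)) / len(labels)
```

The pairwise AUC is quadratic in the number of samples, so for real datasets a library routine such as scikit-learn's `roc_auc_score` is the practical choice.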

🗓️ Week 3: Polish & Submission (by Feb 9)

  • Advanced Features

    • Implement counterfactual engine
    • Add cohort comparison functionality
    • Integrate BioMistral-7B for AI reports
    • Build what-if simulation interface
  • Documentation

    • Write comprehensive README
    • Create MODEL_CARD.md
    • Document ETHICS_AND_LIMITATIONS.md
    • Complete ARCHITECTURE.md
    • Generate API documentation
  • Presentation

    • Design pitch deck (15 slides)
    • Record demo video (5-7 minutes)
    • Prepare talking points
    • Rehearse presentation
  • Final Polish

    • UI/UX refinement and accessibility
    • Performance optimization
    • Bug fixes and edge case handling
    • Docker deployment testing
    • Submit repository and materials

⚡ Quick Start

📋 Prerequisites

# Required software
✅ Python 3.10+
✅ Node.js 18+
✅ Git 2.30+
✅ Docker 24.0+ (optional)

🐍 Backend Setup

# Navigate to backend directory
cd backend

# Create virtual environment
python -m venv venv

# Activate virtual environment
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Run the server from the repository root so the backend.api module path resolves
cd ..
uvicorn backend.api:app --reload --port 8001

# 🎉 Server running at http://localhost:8001
# 📚 API docs at http://localhost:8001/docs
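
Once the server is up, the `/predict` endpoint listed in the sprint plan can be exercised from a short script. The sketch below uses only the standard library. The JSON field names in the usage note are hypothetical, and the request is assumed to be a JSON `POST`; the authoritative schema is whatever the Swagger page at `/docs` shows.

```python
import json
from urllib import request

BASE_URL = "http://localhost:8001"  # port taken from the uvicorn command above


def build_predict_request(patient: dict, base_url: str = BASE_URL) -> request.Request:
    """Build (but do not send) a JSON POST request for the /predict endpoint."""
    body = json.dumps(patient).encode("utf-8")
    return request.Request(
        url=f"{base_url}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(patient: dict, base_url: str = BASE_URL, timeout: float = 10.0) -> dict:
    """Send the request and decode the JSON response; requires a running server."""
    with request.urlopen(build_predict_request(patient, base_url), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, `predict({"age": 55, "bmi": 31.0, "hba1c": 6.4})` would return the decoded response body, assuming the server accepts those (illustrative) fields.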

⚛️ Frontend Setup

# Navigate to frontend directory
cd frontend

# Install dependencies
npm install

# Start development server
npm run dev

# 🎉 App running at http://localhost:5173

🧠 ML Model Training

# Navigate to ML research directory
cd ml-research

# Install dependencies
pip install -r requirements.txt

# Train the model
python train_pro.py

# 📦 Models saved to backend/models/
# 📊 Performance metrics in outputs/

🐳 Docker Deployment (Recommended)

# Build and run all services
docker-compose up --build

# Services available at:
# 🎨 Frontend: http://localhost:3000
# ⚡ Backend: http://localhost:8001
# 📚 API Docs: http://localhost:8001/docs

📦 Project Structure

clinical-risk-predictor/
│
├── 📁 backend/                     # FastAPI Server
│   ├── 📄 app.py                   # Main application entry point
│   ├── 📄 requirements.txt         # Python dependencies
│   │
│   ├── 📁 models/                  # ML Risk Models
│   │   ├── 📄 risk_model.py        # Ensemble prediction engine
│   │   ├── 📄 counterfactuals.py   # What-if analysis logic
│   │   └── 📄 explainability.py    # SHAP feature importance
│   │
│   ├── 📁 routes/                  # API Endpoints
│   │   ├── 📄 patient.py           # Patient data management
│   │   ├── 📄 risk.py              # Risk computation APIs
│   │   └── 📄 cohort.py            # Population analytics
│   │
│   ├── 📁 schemas/                 # Data Validation
│   │   ├── 📄 patient.py           # Patient data models
│   │   └── 📄 prediction.py        # Prediction schemas
│   │
│   └── 📁 utils/                   # Helper Functions
│       ├── 📄 preprocessing.py     # Feature engineering
│       └── 📄 validation.py        # Data validation
│
├── 📁 frontend/                    # React Application
│   ├── 📄 package.json             # Node dependencies
│   ├── 📄 vite.config.ts           # Vite configuration
│   │
│   ├── 📁 public/                  # Static Assets
│   │   └── 🖼️ logo.svg
│   │
│   └── 📁 src/
│       ├── 📄 App.tsx              # Root component
│       ├── 📄 main.tsx             # Entry point
│       │
│       ├── 📁 components/          # React Components
│       │   │
│       │   ├── 📁 Clinician/       # Doctor Dashboard
│       │   │   ├── 📄 RiskDashboard.tsx
│       │   │   ├── 📄 PatientList.tsx
│       │   │   ├── 📄 RiskDetail.tsx
│       │   │   └── 📄 CohortAnalysis.tsx
│       │   │
│       │   ├── 📁 Patient/         # Patient Portal
│       │   │   ├── 📄 RiskGauge.tsx
│       │   │   ├── 📄 SimpleReport.tsx
│       │   │   ├── 📄 ActionPlan.tsx
│       │   │   └── 📄 Progress.tsx
│       │   │
│       │   └── 📁 Common/          # Shared Components
│       │       ├── 📄 Header.tsx
│       │       ├── 📄 Footer.tsx
│       │       └── 📄 LoadingSpinner.tsx
│       │
│       ├── 📁 pages/               # Page Components
│       │   ├── 📄 ClinicianView.tsx
│       │   └── 📄 PatientView.tsx
│       │
│       ├── 📁 hooks/               # Custom Hooks
│       │   └── 📄 useRiskPrediction.ts
│       │
│       └── 📁 utils/               # Utilities
│           └── 📄 api.ts           # API client
│
├── 📁 ml-research/                 # ML Development
│   ├── 📄 train.py                 # Model training script
│   ├── 📄 evaluate.py              # Model evaluation
│   ├── 📄 requirements.txt         # ML dependencies
│   │
│   ├── 📁 notebooks/               # Jupyter Notebooks
│   │   ├── 📓 01_EDA.ipynb         # Exploratory analysis
│   │   ├── 📓 02_Modeling.ipynb    # Model development
│   │   └── 📓 03_Evaluation.ipynb  # Performance analysis
│   │
│   └── 📁 experiments/             # Experiment Logs
│       └── 📄 model_metrics.json
│
├── 📁 data/                        # Datasets
│   ├── 📊 diabetes_dataset.csv     # Training data (provided)
│   ├── 📊 synthetic_patients.csv   # Test data
│   └── 📊 population_stats.json    # Cohort statistics
│
├── 📁 docs/                        # Documentation
│   ├── 📄 ARCHITECTURE.md          # System design details
│   ├── 📄 API_SPEC.md              # API documentation
│   ├── 📄 MODEL_CARD.md            # Model specifications
│   ├── 📄 ETHICS_AND_LIMITATIONS.md # Safety considerations
│   ├── 📄 TEAM_ROLES.md            # Team structure
│   ├── 📄 TIMELINE.md              # Sprint planning
│   └── 📄 DEPLOYMENT.md            # Deployment guide
│
├── 📁 .github/                     # GitHub Configuration
│   └── 📁 workflows/
│       ├── 📄 backend-tests.yml    # Backend CI/CD
│       └── 📄 frontend-tests.yml   # Frontend CI/CD
│
├── 📄 docker-compose.yml           # Multi-container setup
├── 📄 .gitignore                   # Git ignore rules
├── 📄 README.md                    # This file
├── 📄 CONTRIBUTING.md              # Contribution guidelines
└── 📄 LICENSE                      # MIT License


📊 Expected Deliverables

🎯 Final Showcase Outputs

📦 1. Public GitHub Repository

Complete Source Code with Documentation

  • ✅ Well-organized file structure
  • ✅ Comprehensive README.md
  • ✅ Code comments and docstrings
  • ✅ Architectural diagrams
  • ✅ API documentation (OpenAPI)
  • ✅ Version control history

Repository Link: GitHub.com/YourTeam/clinical-risk-predictor

💻 2. Working Prototype

Full-Stack Application Demo

  • ✅ FastAPI backend (deployed)
  • ✅ React frontend (deployed)
  • ✅ Clinician dashboard interface
  • ✅ Patient portal interface
  • ✅ Real-time risk predictions
  • ✅ Interactive visualizations

Live Demo: app.clinical-risk.demo

🎥 3. Demo Video

5-7 Minute Walkthrough

  • ✅ Problem statement explanation
  • ✅ Solution architecture overview
  • ✅ Live feature demonstration
  • ✅ Key technical insights
  • ✅ Impact and use cases
  • ✅ Future roadmap

Video Link: YouTube/Product-Demo

📚 4. Comprehensive Documentation

Technical & Clinical Documentation

  • ✅ MODEL_CARD.md: ML model details
  • ✅ ETHICS_AND_LIMITATIONS.md: Safety analysis
  • ✅ ARCHITECTURE.md: System design
  • ✅ API_SPEC.md: Endpoint reference
  • ✅ DEPLOYMENT.md: Setup guide
  • ✅ Presentation slides (PDF)

📚 Documentation

📖 Available Documentation

  • 🏗️ Architecture (docs/ARCHITECTURE.md): System design, data flow, component interactions
  • 🔌 API Reference (docs/API_SPEC.md): Endpoint documentation, request/response schemas
  • 🤖 Model Card (docs/MODEL_CARD.md): ML model details, performance metrics
  • ⚖️ Ethics & Safety (docs/ETHICS_AND_LIMITATIONS.md): Bias analysis, limitations, safety guidelines
  • 👥 Team Structure (docs/TEAM_ROLES.md): Detailed role breakdown, deliverables
  • 🚀 Deployment (docs/DEPLOYMENT.md): Production setup, Docker guide


📄 License

MIT License. See the LICENSE file for details.

Ready to Transform Healthcare Through AI?
⭐ Star this repository • 🍴 Fork and contribute • 📧 Get in touch


Last Updated: January 2025 | Version: 1.0.0 | Status: 🚧 In Active Development
