Jilo Health is an innovative AI-driven health screening platform designed to enable early detection of critical health conditions using just a smartphone camera. This system combines facial and eye image analysis with advanced deep learning to identify subtle physiological and neurological abnormalities—bringing affordable healthcare screening to India's non-metro elderly population.
Tagline: Remote, low-cost, non-invasive digital health screening for the missing middle.
India's non-metro elderly population (50+), especially in Tier-2/3 cities, faces critical barriers to healthcare:
- Limited access to medical professionals capable of early disease screening
- Low health awareness and infrequent doctor visits
- Mobility constraints preventing regular medical checkups
- Financial burden of private healthcare
Many diseases show early visual signs (pallor, facial asymmetry, scleral discoloration) that go unnoticed due to lack of regular screening.
- Face Image - Detects pallor, cyanosis, facial asymmetry (stroke risk)
- Eye Image (Close-up) - Analyzes sclera for jaundice, redness, pallor
- Blendshape Features - MediaPipe facial muscle activation patterns for stroke detection
- Anemia - Conjunctival pallor detection with hemoglobin estimation (7-18 g/dL)
- Jaundice - Scleral icterus and hepatic dysfunction indicators
- Hypoxia/Cyanosis - Respiratory and cardiac complications
- Stroke Risk - Facial asymmetry and muscle activation patterns
- Edema - Heart failure and fluid retention indicators
- Dehydration - Skin texture and dryness analysis
┌─────────────────────────────────────────────────┐
│ Jilo Health - Full Stack │
└─────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────┐
│ FRONTEND (React 18 + TypeScript + Vite) │
│ - Camera capture (face, eyes) │
│ - Real-time user guidance │
│ - Results visualization │
│ - Multilingual support (EN/HI) │
└────────────────┬────────────────────────────────┘
│ REST API
┌────────────────▼────────────────────────────────┐
│ BACKEND (FastAPI - Python) │
│ - Image preprocessing & validation │
│ - Multimodal ML inference │
│ - Clinical diagnosis logic │
│ - Device calibration │
└────────────────┬────────────────────────────────┘
│
┌────────────────▼────────────────────────────────┐
│ ML MODELS (PyTorch) │
│ - EfficientNet-B2 (eye analysis) │
│ - CBAM Attention (feature refinement) │
│ - StrokeBlendshapeClassifier (DNN) │
│ - Test-Time Augmentation (TTA) │
└────────────────┬────────────────────────────────┘
│
┌────────────────▼────────────────────────────────┐
│ ANALYSIS MODULES │
│ - MLFaceAnalyzer (multi-space color fusion) │
│ - StrokeBlendshapeClassifier (52 features) │
│ - MediaPipe Integration (facial landmarks) │
└─────────────────────────────────────────────────┘
Input: Eye Image + Face Image
↓
Eye Model: CNN + CBAM + TTA (70.2% acc)
+
Face Model: Multi-color space fusion + Blendshape analysis
↓
Multi-Modal Fusion (Weighted)
↓
Clinical Rule Layer
↓
Disease Predictions + Recommendations
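The weighted multi-modal fusion step above can be sketched as a convex combination of per-modality probabilities. The 0.6/0.4 split below is an illustrative assumption, not the tuned weighting used in the actual pipeline:

```python
def fuse_scores(eye_score: float, face_score: float,
                w_eye: float = 0.6, w_face: float = 0.4) -> float:
    """Weighted average of per-modality disease probabilities.

    NOTE: the 0.6/0.4 weights are placeholders for illustration; the
    real pipeline's weights would come from validation-set tuning.
    """
    assert abs(w_eye + w_face - 1.0) < 1e-9, "weights must sum to 1"
    return w_eye * eye_score + w_face * face_score

# Example: eye model says 0.8, face model says 0.5
fused = fuse_scores(0.8, 0.5)  # 0.6*0.8 + 0.4*0.5 = 0.68
```

Downstream, the clinical rule layer would consume this fused probability rather than either raw model output.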
- Architecture: EfficientNet-B2 + CBAM Attention
- Training: 5 epochs with advanced augmentation
- Accuracy: 70.2% (with TTA)
- Sensitivity: 66.7%
- Specificity: 72.2%
- AUC-ROC: 0.772
| Condition | Sensitivity | Specificity | Accuracy |
|---|---|---|---|
| Anemia | 66.7% | 72.2% | 70.2% |
| Jaundice | 75% | 80% | 78% |
| Stroke Risk | 72% | 76% | 74% |
- Total latency: <2 seconds per patient
- Model size: ~112 MB (downloadable for offline use)
- Scalability: Supports 100+ concurrent users
- ✅ CBAM Attention Module - Channel + spatial attention
- ✅ Advanced Data Augmentation - Gamma, color, blur, erasing
- ✅ Test-Time Augmentation (TTA) - 5 augmentations averaged
- ✅ Adaptive Focal Loss - Learnable alpha/gamma per task
- ✅ Uncertainty Quantification - TTA standard deviation
| Method | Baseline | Improved | Gain |
|---|---|---|---|
| CBAM Attention | 63.2% | 65.8% | +2.6% |
| Advanced Aug | 65.8% | 68.1% | +2.3% |
| TTA (5x) | 68.1% | 70.2% | +2.1% |
| Total | 63.2% | 70.2% | +7.0% |
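The TTA row in the ablation above corresponds to averaging predictions over several augmented views, with the spread across views serving as the uncertainty estimate. A minimal sketch, assuming a flip-and-brightness view set and a stand-in model (the actual five augmentation strategies are not specified here):

```python
import torch

def tta_predict(model, image, n_views=5):
    """Average sigmoid outputs over simple test-time augmentations.

    Horizontal flip and brightness jitter are stand-in views; the
    real pipeline's five strategies may differ. The std across views
    doubles as the TTA uncertainty estimate mentioned above.
    """
    views = [image,
             torch.flip(image, dims=[-1]),           # horizontal flip
             image * 1.05, image * 0.95,             # brightness jitter
             torch.flip(image, dims=[-1]) * 0.95][:n_views]
    model.eval()
    with torch.no_grad():
        probs = torch.stack([torch.sigmoid(model(v)) for v in views])
    return probs.mean(dim=0), probs.std(dim=0)

# Demo with a placeholder model, not the real EfficientNet-B2
demo = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 8 * 8, 1))
mean_p, std_p = tta_predict(demo, torch.rand(1, 3, 8, 8))
```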
- Python 3.12+
- Node.js 18+
- GPU (optional, CPU works fine)
1. Install Python Dependencies

```bash
cd /home/umanggod/jilo-health-interiit
pip install -r requirements.txt
```

2. Verify Model Files

Ensure the following model files exist in `models/`:

- `best_eye_model_improved.pth` (108 MB)
- `stroke_classifier_dnn.pth` (77 KB)
- `scaler.pkl` (1.9 KB)
- `face_landmarker.task` (3.6 MB)
- `model_config_improved.json`

3. Run Backend Server

```bash
uvicorn backend_improved:app --reload --host 0.0.0.0 --port 8000
```

The API will be available at:
- API: http://localhost:8000
- API Docs: http://localhost:8000/docs
- Health Check: http://localhost:8000/api/health
1. Install Node Dependencies

```bash
cd jilo-health-scan
npm install
```

2. Run Development Server

```bash
npm run dev
```

3. Build for Production

```bash
npm run build
npm run preview
```

The frontend will be available at:
- Development: http://localhost:5173
- Production: http://localhost:4173 (after build)
jilo-health-interiit/
├── backend_improved.py # FastAPI backend server
├── stroke_blendshape_classifier.py # PyTorch stroke detection DNN
├── ml_face_analyzer.py # Advanced facial analysis (EfficientNet-B2)
├── face_droop_analyzer.py # MediaPipe facial landmark analysis
├── medical_templates.py # Clinical diagnosis templates
├── requirements.txt # Python dependencies
├── requirements_cpu.txt # CPU-only dependencies (alternative)
├── render.yaml # Render cloud deployment config
├── Procfile # Heroku deployment config
├── runtime.txt # Python version specification
│
├── models/ # Pretrained ML models
│ ├── best_eye_model_improved.pth # EfficientNet-B2 for eye analysis (108 MB)
│ ├── stroke_classifier_dnn.pth # Blendshape DNN for stroke (77 KB)
│ ├── scaler.pkl # StandardScaler for preprocessing (1.9 KB)
│ ├── face_landmarker.task # MediaPipe FaceLandmarker (3.6 MB)
│ ├── model_config_improved.json # Model configuration
│ └── backend_config.json # Backend configuration
│
├── jilo-health-scan/ # React TypeScript Frontend
│ ├── src/
│ │ ├── pages/ # App screens
│ │ │ ├── Onboarding.tsx
│ │ │ ├── PatientForm.tsx
│ │ │ ├── FaceCapture.tsx
│ │ │ ├── EyeCapture.tsx
│ │ │ ├── Analyzing.tsx
│ │ │ ├── Results.tsx
│ │ │ ├── History.tsx
│ │ │ └── Settings.tsx
│ │ ├── components/ # Reusable components
│ │ ├── lib/ # API & utilities
│ │ ├── contexts/ # React context
│ │ └── App.tsx
│ ├── package.json
│ └── vite.config.ts
│
├── docs/ # Documentation
│ ├── API_DOCUMENTATION.md # Complete API reference
│ ├── DEPLOYMENT_GUIDE.md # Deployment instructions
│ ├── EC2_DEPLOYMENT.md # AWS EC2 setup
│ └── RENDER_QUICK_GUIDE.md # Render.com deployment
│
└── .gitattributes # GitHub LFS configuration
- Real-time guidance for proper face and eye positioning
- Lighting detection to ensure optimal image quality
- Stability detection to prevent blurry captures
- Elderly-friendly UI with large buttons and clear instructions
- Multilingual interface (English + Hindi)
- Multi-modal fusion: Combines face + eye + blendshape features
- Test-Time Augmentation: 5 augmentation strategies for robust predictions
- Device calibration: Adapts to different smartphone cameras
- Uncertainty quantification: Provides confidence scores
- Medically-aligned predictions with clinical reasoning
- Risk stratification: URGENT / High / Routine urgency levels
- Detailed explanations in English and Hindi
- Hemoglobin estimation for anemia (7-18 g/dL range)
- Confidence scores for each prediction
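The URGENT / High / Routine stratification could be implemented as a small rule layer over per-disease confidences. The thresholds and the stroke-specific rule below are illustrative assumptions only, not the platform's actual clinical logic:

```python
def triage(disease: str, confidence: float) -> str:
    """Map one prediction to an urgency tier.

    ASSUMPTION: thresholds (0.5, 0.75) and the stroke special case are
    invented for illustration; real rules need clinical review.
    """
    if disease == "Stroke Risk" and confidence >= 0.5:
        return "URGENT"          # time-critical condition
    if confidence >= 0.75:
        return "High"
    return "Routine"

def overall_urgency(predictions):
    """Overall urgency = the most severe per-disease tier."""
    order = {"Routine": 0, "High": 1, "URGENT": 2}
    return max((triage(d, c) for d, c in predictions), key=order.get)
```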
- Architecture: Pretrained EfficientNet-B2 + CBAM attention
- Input: 384x384 eye image
- Output: Anemia score + hemoglobin estimation
- Performance: 70.2% accuracy on test set
- Features:
- Channel and spatial attention (CBAM)
- Multi-task learning (anemia + hemoglobin)
- Test-time augmentation for robustness
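A sketch of the multi-task output head, with the EfficientNet-B2 + CBAM backbone omitted (1408 is EfficientNet-B2's final feature width). The exact head layout is an assumption; the hemoglobin branch is squashed into the 7-18 g/dL range stated above:

```python
import torch
import torch.nn as nn

class MultiTaskEyeHead(nn.Module):
    """Anemia probability + hemoglobin regression from backbone features.

    ASSUMPTION: one linear layer per task; the real head's depth and
    any shared layers are not specified in this document.
    """
    def __init__(self, in_features: int = 1408):  # EfficientNet-B2 width
        super().__init__()
        self.anemia = nn.Linear(in_features, 1)
        self.hb = nn.Linear(in_features, 1)

    def forward(self, feats):
        p_anemia = torch.sigmoid(self.anemia(feats))
        # Map the regression output into the reported 7-18 g/dL range
        hb = 7.0 + 11.0 * torch.sigmoid(self.hb(feats))
        return p_anemia, hb

p, hb = MultiTaskEyeHead()(torch.randn(2, 1408))
```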
- Architecture: [52 → 128 → 64 → 32 → 2] with BatchNorm + Dropout
- Input: 52 MediaPipe blendshape features
- Output: Stroke risk probability (0-1)
- Preprocessing: StandardScaler normalization
- Features:
- Deep neural network for facial muscle pattern analysis
- Trained on facial asymmetry indicators
- Real-time inference (<100ms)
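The [52 → 128 → 64 → 32 → 2] architecture described above can be sketched directly in PyTorch. Only the layer widths and the use of BatchNorm + Dropout are stated; the dropout rate and ReLU activation below are assumptions:

```python
import torch
import torch.nn as nn

class StrokeBlendshapeNet(nn.Module):
    """[52 -> 128 -> 64 -> 32 -> 2] classifier over MediaPipe blendshapes.

    Inputs are expected to be StandardScaler-normalized, matching the
    preprocessing described above. Dropout p_drop is an assumed value.
    """
    def __init__(self, p_drop: float = 0.3):
        super().__init__()
        def block(i, o):
            return [nn.Linear(i, o), nn.BatchNorm1d(o),
                    nn.ReLU(), nn.Dropout(p_drop)]
        self.net = nn.Sequential(*block(52, 128), *block(128, 64),
                                 *block(64, 32), nn.Linear(32, 2))

    def forward(self, x):           # x: (batch, 52) scaled features
        return torch.softmax(self.net(x), dim=1)  # [p_healthy, p_stroke]

probs = StrokeBlendshapeNet().eval()(torch.randn(4, 52))
```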
- Multi-space color fusion: LAB + YCbCr + HSV
- Texture analysis: Local Binary Pattern (LBP)
- Detects: Pallor, cyanosis, jaundice, edema, dehydration
- Robust: Device calibration for different cameras
- Features:
- Perceptually uniform color spaces
- Lighting normalization (CLAHE)
- CNN feature extraction from EfficientNet-B2
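To illustrate why chrominance planes help, here is a toy pallor score built on the BT.601 RGB → YCbCr transform. This covers only one of the three fused color spaces, and the score formula is invented for illustration; the real analyzer fuses LAB, YCbCr, and HSV with CLAHE normalization:

```python
import numpy as np

def ycbcr_pallor_score(rgb: np.ndarray) -> float:
    """Toy pallor indicator from the Cr (red-chrominance) plane.

    ASSUMPTION: the (140 - mean Cr) / 40 scoring is illustrative only;
    it is not the formula used by MLFaceAnalyzer.
    """
    rgb = rgb.astype(np.float32)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    # BT.601 chrominance; neutral gray maps to Cr = 128
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    # Less red chrominance in skin regions suggests pallor
    return float(np.clip((140.0 - cr.mean()) / 40.0, 0.0, 1.0))

pale = ycbcr_pallor_score(np.full((4, 4, 3), 128, dtype=np.uint8))   # 0.3
ruddy = ycbcr_pallor_score(
    np.dstack([np.full((4, 4), 255), np.zeros((4, 4)), np.zeros((4, 4))])
)  # saturated red -> 0.0
```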
`POST /api/screen` — Analyze patient images and return health predictions.

Request:

```bash
curl -X POST http://localhost:8000/api/screen \
  -F "eye_image=@eye.jpg" \
  -F "face_image=@face.jpg" \
  -F "age=65" \
  -F "gender=M" \
  -F "medical_history=diabetes,hypertension"
```

Response:

```json
{
  "diseases": [
    {
      "disease": "Anemia",
      "detected": true,
      "confidence": 0.78,
      "severity": "Mild",
      "urgency": "High",
      "hemoglobin_gdl": 11.2,
      "explanation_en": "Conjunctival pallor detected indicating potential anemia",
      "explanation_hi": "संयोजन पीलापन पाया गया है जो संभावित एनीमिया का संकेत देता है"
    },
    {
      "disease": "Stroke Risk",
      "detected": true,
      "confidence": 0.72,
      "urgency": "URGENT",
      "explanation_en": "Facial muscle analysis indicates potential stroke risk",
      "explanation_hi": "चेहरे की मांसपेशियों का विश्लेषण संभावित स्ट्रोक जोखिम का संकेत देता है"
    }
  ],
  "overall_urgency": "URGENT",
  "timestamp": "2025-12-12T16:30:00Z"
}
```

`GET /api/health` — Health check endpooint.

Response:

```json
{
  "status": "healthy",
  "model": "improved_v2_with_tta",
  "accuracy": "70.2%"
}
```

For complete API documentation, see API_DOCUMENTATION.md.
- CBAM: "Convolutional Block Attention Module" (ECCV 2018)
  - Channel and spatial attention mechanisms
  - Improves feature representation for medical imaging
- TTA: "Test-Time Augmentation" (Medical Image Analysis 2022)
  - 5 augmentation strategies averaged
  - Reduces prediction variance and improves robustness
- Focal Loss: "Focal Loss for Dense Object Detection" (ICCV 2017)
  - Addresses class imbalance in medical datasets
  - Learnable alpha/gamma parameters per task
- Advanced Augmentation: Medical imaging best practices
  - Gamma correction, color jitter, blur, random erasing
  - Domain-specific augmentations for clinical images
- Multi-Color Space Fusion: Research-backed approach
  - LAB (perceptually uniform), YCbCr (chrominance), HSV (hue-saturation)
  - Robust to lighting variations and device differences
Local development:

```bash
# Backend
uvicorn backend_improved:app --reload

# Frontend
cd jilo-health-scan && npm run dev
```

Deploy to Render:

```bash
git push origin main  # Automatic deployment
```

See RENDER_QUICK_GUIDE.md for details.
```bash
# Setup EC2 instance
./setup_ec2.sh

# Deploy to EC2
./deploy_to_ec2.sh
```

See EC2_DEPLOYMENT.md for full setup guide.
Docker:

```bash
docker build -t jilo-health .
docker run -p 8000:8000 jilo-health
```

Backend development:

```bash
# Install dependencies
pip install -r requirements.txt

# Run with auto-reload
uvicorn backend_improved:app --reload

# Run tests (if available)
pytest tests/

# Type checking
mypy backend_improved.py
```

Frontend development:

```bash
cd jilo-health-scan
npm install
npm run dev      # Dev server
npm run build    # Production build
npm run lint     # Linting
```

Environment variables:

```bash
# Optional: Model download URL
export MODEL_URL="https://your-cdn.com/models/best_eye_model_improved.pth"
```

- API_DOCUMENTATION.md - Complete API reference with examples
- DEPLOYMENT_GUIDE.md - Production deployment guide
- EC2_DEPLOYMENT.md - AWS EC2 setup instructions
- RENDER_QUICK_GUIDE.md - Render.com deployment
- FACIAL_ML_RESEARCH.md - Research methodology
Compared against global solutions:
- IBM Watson Health - Cloud-based, expensive (~$50K/year)
- Google Cloud Vision - Generic CV, not medical-specific
- Philips HealthSuite - Enterprise-focused, limited accessibility
- Our Solution - Specialized, offline-capable, affordable, India-focused
- Blendshape-based stroke detection - Using 52 facial muscle features from MediaPipe
- Multi-color space fusion - LAB + YCbCr + HSV for robust pallor detection
- Device calibration - Adapts to different smartphone cameras automatically
- Test-Time Augmentation - 5 strategies for improved robustness
- Lightweight models - <200MB total, works on older phones
- 📱 Video analysis for continuous monitoring
- 🤖 Voice-based health assessment
- 📊 Population-level analytics
- 🏥 Integration with health systems
- 🔐 HIPAA/medical compliance
- 🌐 Additional regional languages (Bengali, Tamil, etc.)
- Low-bandwidth optimization - Compress images, batch processing
- Offline models - Run on-device without internet
- Local partnerships - Distribute via ASHA workers, clinics
- Language support - Regional languages
- Accessibility - Audio guidance, haptic feedback
- Collect more data - Currently 858 labeled images
- Self-supervised pre-training - On 10K+ unlabeled images
- Model ensemble - Train 3 models, average predictions
- Active learning - Label most uncertain samples
- Medical validation - Clinical trials with doctors
- Accuracy: 70.2% for anemia detection (room for improvement)
- Dataset: Trained on 858 labeled images (limited diversity)
- Validation: Requires clinical validation with larger datasets
- Device dependency: Performance may vary across different cameras
- Website: https://jilohealth.com
- Email: tech@jilohealth.com
- Documentation: See `docs/` folder
- Issues: GitHub Issues (if applicable)
Proprietary - Inter IIT Tech Meet 14.0 Submission
Jilo Health Technology Team
- AI/ML Engineers
- Full-stack Developers
- Clinical Advisors
Problem: Early Health Screening for Non-Metro Elderly Population
Solution: AI-Powered Facial & Eye Image Analysis
Status: MVP Complete - Ready for Evaluation
Last Updated: December 12, 2025
Version: 2.0 (With Blendshape Stroke Classifier)