AkshatRaj00/visapredictor

VisaIQ — Visa Processing Intelligence

A production-grade visa processing time predictor built with Random Forest ML, confidence-aware estimates, and optional Gemini AI guidance.


Features

  • Confidence-aware predictions: each estimate is a range with a High/Medium/Low confidence score derived from Random Forest tree agreement, not just a single number
  • Gemini AI guidance: an optional 3-sentence practical insight per prediction via Google Gemini 1.5 Flash
  • CSV retraining: upload your own historical visa records to retrain the model in-app (max 5 MB / 10,000 rows)
  • Training diagnostics: MAE, R², and the countries and visa types covered
  • Prediction history: the last 100 predictions, with CSV export
  • Dark UI: a clean, production-style Streamlit interface

Tech Stack

| Layer | Technology |
| --- | --- |
| Frontend / App | Streamlit 1.45 |
| ML Model | Random Forest (scikit-learn 1.5) |
| Data | pandas, numpy |
| AI Insight | Google Gemini 1.5 Flash |
| Model persistence | joblib |
| Containerization | Docker |

Project Structure

visapredictor/
├── app.py              # Streamlit UI — Predict, Train, Analytics tabs
├── predict.py          # Prediction logic, confidence scoring, input validation
├── train.py            # Training pipeline, feature engineering
├── sample_data.csv     # Sample dataset for testing
├── rf_model.pkl        # Trained Random Forest model (auto-generated)
├── country_encoder.pkl # LabelEncoder for countries (auto-generated)
├── visa_encoder.pkl    # LabelEncoder for visa types (auto-generated)
├── requirements.txt    # Pinned dependencies
├── Dockerfile          # Container definition
└── .gitignore

Quickstart

1. Clone the repo

git clone https://github.com/AkshatRaj00/visapredictor.git
cd visapredictor

2. Create virtual environment

python -m venv venv
source venv/bin/activate        # Linux/macOS
venv\Scripts\activate           # Windows

3. Install dependencies

pip install -r requirements.txt

4. Train the model

python train.py

This generates rf_model.pkl, country_encoder.pkl, and visa_encoder.pkl.

5. Run the app

streamlit run app.py

Open http://localhost:8501


Docker

docker build -t visaiq .
docker run -p 8501:8501 visaiq

Gemini AI Setup (Optional)

To enable AI-generated guidance per prediction:

  1. Get a free API key from Google AI Studio
  2. For Streamlit Cloud: add it to the app's Secrets (for local Streamlit runs, put the same line in .streamlit/secrets.toml)
    GOOGLE_API_KEY = "your-key-here"
  3. For local: set environment variable
    export GOOGLE_API_KEY="your-key-here"
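
Because the Gemini guidance is optional, the app has to cope with the key being absent. A minimal sketch of that lookup, reading only the environment variable shown above; the function name `resolve_api_key` is hypothetical and not taken from this repo:

```python
import os


def resolve_api_key(env=None):
    """Return the Gemini API key, or None when guidance should stay disabled.

    Hypothetical helper: the real app may read the key differently
    (for example through Streamlit's secrets).
    """
    env = os.environ if env is None else env
    key = env.get("GOOGLE_API_KEY", "").strip()
    # Treat an empty or whitespace-only value the same as an unset variable.
    return key or None


if __name__ == "__main__":
    if resolve_api_key() is None:
        print("GOOGLE_API_KEY not set: AI guidance disabled")
    else:
        print("GOOGLE_API_KEY found: AI guidance enabled")
```

Returning None rather than raising lets the prediction path run unchanged when no key is configured.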

CSV Training Format

Upload a CSV with these exact columns:

| Column | Format | Example |
| --- | --- | --- |
| country | String | India |
| visa_type | String | Student |
| application_date | YYYY-MM-DD | 2024-01-15 |
| decision_date | YYYY-MM-DD | 2024-03-20 |

The training pipeline computes the target, processing time in days, as decision_date - application_date, and extracts the month as a seasonal feature.

Limits: Max 5 MB, max 10,000 rows per upload.
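
The column check and feature derivation described above could look roughly like the sketch below. It uses only the standard library rather than the pandas code in train.py. The function name `load_training_rows` is hypothetical, and taking the month from application_date is an assumption, since the README does not say which date supplies it.

```python
import csv
import io
from datetime import date

REQUIRED_COLUMNS = ("country", "visa_type", "application_date", "decision_date")
MAX_ROWS = 10_000  # upload limit stated in this README


def load_training_rows(csv_text):
    """Parse CSV text into rows with processing_days and month features."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {', '.join(missing)}")

    rows = []
    for line_no, record in enumerate(reader, start=1):
        if line_no > MAX_ROWS:
            raise ValueError(f"more than {MAX_ROWS} rows")
        # date.fromisoformat accepts the YYYY-MM-DD format required above.
        applied = date.fromisoformat(record["application_date"])
        decided = date.fromisoformat(record["decision_date"])
        days = (decided - applied).days
        if days < 0:
            raise ValueError(f"row {line_no}: decision_date precedes application_date")
        rows.append({
            "country": record["country"],
            "visa_type": record["visa_type"],
            "processing_days": days,
            "month": applied.month,  # assumption: month of application_date
        })
    return rows


sample = (
    "country,visa_type,application_date,decision_date\n"
    "India,Student,2024-01-15,2024-03-20\n"
)
print(load_training_rows(sample))
```

For the example row in the table, the derived target is 65 days (2024 is a leap year) and the month feature is 1.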


Deployment — Streamlit Community Cloud

  1. Push code to GitHub
  2. Go to share.streamlit.io
  3. Connect repo → set main file as app.py
  4. In Advanced Settings → set Python 3.12
  5. Add GOOGLE_API_KEY in Secrets (optional)

⚠️ The .pkl model files must either be committed to the repository or generated at startup. If they are not committed, call train.py from your startup script.
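
One way to generate the files at startup is to check for them and run train.py only when something is missing. This is a sketch: the helper names `missing_artifacts` and `ensure_model` are hypothetical, and it assumes train.py sits in the same directory and writes the three files listed in Project Structure.

```python
import subprocess
import sys
from pathlib import Path

# File names taken from the Project Structure section.
ARTIFACTS = ("rf_model.pkl", "country_encoder.pkl", "visa_encoder.pkl")


def missing_artifacts(base_dir="."):
    """Return the artifact file names that do not exist under base_dir."""
    base = Path(base_dir)
    return [name for name in ARTIFACTS if not (base / name).exists()]


def ensure_model(base_dir="."):
    """Run train.py once if any model or encoder file is missing."""
    if missing_artifacts(base_dir):
        # check=True raises CalledProcessError if training fails,
        # so the app does not start with a partial set of files.
        subprocess.run([sys.executable, "train.py"], cwd=base_dir, check=True)
```

Calling `ensure_model()` before the model is loaded would cover a fresh deployment where the .pkl files are not in the repository.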


How Confidence is Calculated

Confidence is derived from Random Forest tree agreement:

  1. Each of the 100 trees in the forest makes an individual prediction.
  2. Standard deviation of tree predictions is computed.
  3. Confidence = (1 - std_dev / max_expected_std) × 100, clamped to 0–100.

| Label | Confidence Range |
| --- | --- |
| 🟢 High | ≥ 70% |
| 🟡 Medium | 40–69% |
| 🔴 Low | < 40% |
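
The three steps and the label thresholds can be sketched in plain Python. The per-tree predictions are passed in as a list; with scikit-learn they would typically be gathered by calling predict on each tree in the fitted forest's `estimators_`. The value 30.0 for `max_expected_std` is an assumption, as the README does not state the constant used in predict.py, and the function names are hypothetical.

```python
from statistics import pstdev


def confidence_score(tree_predictions, max_expected_std=30.0):
    """Map the spread of per-tree predictions (in days) to a 0-100 score.

    max_expected_std=30.0 is an assumed value, not taken from the repo.
    """
    std_dev = pstdev(tree_predictions)  # population standard deviation
    score = (1 - std_dev / max_expected_std) * 100
    return max(0.0, min(100.0, score))  # clamp to 0-100


def confidence_label(score):
    """Apply the High/Medium/Low thresholds from the table above."""
    if score >= 70:
        return "High"
    if score >= 40:
        return "Medium"
    return "Low"


# Trees that agree closely give a high score; trees that disagree give a low one.
for preds in ([40, 41, 39, 40], [30, 60], [0, 100]):
    score = confidence_score(preds)
    print(preds, round(score, 1), confidence_label(score))
```

When every tree returns the same value, the standard deviation is zero and the score is 100. A spread at or beyond `max_expected_std` clamps the score to 0.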

Contributing

  1. Fork the repo
  2. Create a feature branch: git checkout -b feat/your-feature
  3. Commit your changes
  4. Open a Pull Request

License

MIT License — see LICENSE for details.


Author

Akshat Raj
GitHub · Portfolio · Twitter

About

VisaIQ: Visa Processing Time Estimator. A machine-learning-powered web app that predicts visa processing times from historical data, with AI-generated insights from Google Gemini. Built for the Infosys Springboard Internship, Milestone 4.
