End-to-end Machine Learning system for flight price prediction with REST API, web interface, CI/CD pipeline, and AWS deployment
- Overview
- Architecture
- Key Features
- Tech Stack
- Project Structure
- Local Installation
- Usage
- Testing
- AWS Deployment
- CI/CD Pipeline
- Model Details
- API Documentation
- Contributing
- License
Production-ready Machine Learning system that predicts flight prices in India using historical data. The project demonstrates MLOps best practices with:
- ML Model: XGBoost regressor achieving R² ~0.988
- REST API: FastAPI with automatic Swagger documentation
- Web Interface: Interactive Streamlit dashboard
- CI/CD: Automated testing and deployment with GitHub Actions
- Cloud Infrastructure: AWS Elastic Beanstalk with auto-scaling
- Testing: Comprehensive test suite with pytest
Dataset: 300,153 historical flight records with 11 predictor variables
Performance Metrics:
- RMSE: ~2,450 INR
- MAE: ~1,260 INR
- R²: ~0.988
┌──────────────────────────────────────────────────────────────┐
│ Users │
└────────────┬─────────────────────────┬───────────────────────┘
│ │
┌────────▼─────────┐ ┌────────▼──────────┐
│ Streamlit UI │ │ External Apps │
│ (Frontend) │ │ (API Clients) │
└────────┬─────────┘ └────────┬──────────┘
│ │
└─────────┬───────────────┘
│
┌────────▼────────────────────┐
│ FastAPI REST API │
│ ┌──────────────────────┐ │
│ │ Request Validation │ │
│ │ (Pydantic) │ │
│ └──────────┬───────────┘ │
│ │ │
│ ┌──────────▼───────────┐ │
│ │ Preprocessing │ │
│ │ Pipeline │ │
│ └──────────┬───────────┘ │
│ │ │
│ ┌──────────▼───────────┐ │
│ │ XGBoost Model │ │
│ │ (R² = 0.988) │ │
│ └──────────────────────┘ │
└─────────────────────────────┘
│
┌─────────────┴──────────────┐
│ │
┌────▼──────────┐ ┌────────▼─────────┐
│ AWS Elastic │ │ GitHub Actions │
│ Beanstalk │ │ (CI/CD) │
│ │ │ │
│ • Auto-scaling│ │ • Auto-test │
│ • Load Balancer│ │ • Auto-deploy │
│ • CloudWatch │ │ │
└───────────────┘ └──────────────────┘
Data Flow:
- User inputs flight parameters via Streamlit UI or API
- FastAPI validates input with Pydantic schemas
- Preprocessing pipeline transforms features
- XGBoost model generates prediction
- Response returned with predicted price and metadata
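The steps above can be sketched with the standard library alone. This is a simplified illustration, not the project's code: `validate`, `preprocess`, `handle_request`, and `StubModel` are hypothetical names, the stub returns a constant where the real service would call the trained XGBoost model, and a required-field check stands in for the Pydantic schema.

```python
from datetime import datetime

# Field names taken from the /predict request body documented in this README.
REQUIRED_FIELDS = (
    "airline", "flight", "source_city", "departure_time", "stops",
    "arrival_time", "destination_city", "class", "duration", "days_left",
)


class StubModel:
    """Hypothetical stand-in for the trained model; always predicts 0.0."""

    def predict(self, rows):
        return [0.0 for _ in rows]


def validate(payload):
    """Step 2: reject payloads that lack a required field."""
    missing = [name for name in REQUIRED_FIELDS if name not in payload]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return payload


def preprocess(payload):
    """Step 3 (placeholder): return the values in a fixed column order."""
    return [payload[name] for name in REQUIRED_FIELDS]


def handle_request(payload, model):
    """Steps 2-5: validate, transform, predict, and build the response."""
    features = preprocess(validate(payload))
    price = model.predict([features])[0]
    return {
        "predicted_price": round(float(price), 2),
        "currency": "INR",
        "timestamp": datetime.now().isoformat(),
    }
```

Keeping each stage as its own function mirrors the diagram: a validation failure stops the request before any model work happens.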
- ✅ Optimized XGBoost model with RandomizedSearchCV hyperparameter tuning
- ✅ Advanced feature engineering: cyclic encoding for time, target encoding for high-cardinality features
- ✅ Reproducible pipeline with scikit-learn ColumnTransformer
- ✅ High accuracy: RMSE ~2,450 INR, MAE ~1,260 INR, R² ~0.988
- ✅ Feature importance analysis for model interpretability
- ✅ Individual prediction endpoint (POST /predict)
- ✅ Batch prediction endpoint (POST /predict/batch)
- ✅ Health monitoring (GET /health)
- ✅ Feature importance analysis (GET /feature-importance)
- ✅ Interactive documentation with Swagger UI (/docs)
- ✅ Data validation with Pydantic
- ✅ CORS enabled for cross-origin requests
- ✅ Structured logging for debugging
- ✅ Intuitive UI with real-time predictions
- ✅ Single prediction form with dropdown selectors
- ✅ Price comparison tool (same flight, different classes/dates)
- ✅ CSV batch upload for multiple predictions
- ✅ Feature importance visualization
- ✅ API health status indicator
- ✅ Download results as CSV
- ✅ CI/CD pipeline with GitHub Actions
- ✅ Automated testing with pytest (7 test cases)
- ✅ Auto-deployment to AWS Elastic Beanstalk
- ✅ Docker support for containerization
- ✅ CloudWatch integration for logging and monitoring
- ✅ Auto-scaling configuration
- ✅ Application Load Balancer for high availability
| Category | Technologies |
|---|---|
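| ML/Data Science | XGBoost, scikit-learn |
| Backend | FastAPI, Pydantic, Uvicorn |
| Frontend | Streamlit |
| Testing | pytest |
| Cloud/DevOps | AWS Elastic Beanstalk, CloudWatch, S3 |
| CI/CD | GitHub Actions |
| Containerization | Docker |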
flight-price-ml-production/
├── .ebextensions/ # AWS Elastic Beanstalk configuration
│ ├── 01_python.config # Python environment setup
│ └── 02_logging.config # CloudWatch logging setup
├── .github/workflows/ # GitHub Actions CI/CD
│ └── deploy.yml # Automated deployment workflow
├── src/ # Source code
│ ├── api/ # FastAPI application
│ │ └── main.py # API endpoints and configuration
│ ├── data/ # Data processing
│ │ └── make_dataset.py # Data loading and cleaning
│ ├── features/ # Feature engineering
│ │ └── build_features.py # Feature transformation pipeline
│ └── models/ # ML models
│ ├── train.py # Model training script
│ └── predict.py # Prediction service
├── tests/ # Test suite
│ └── test_api.py # API endpoint tests
├── models/ # Trained model artifacts
│ ├── xgboost_model.pkl # Trained XGBoost model (~9MB)
│ └── preprocessor.pkl # Fitted preprocessing pipeline (~4KB)
├── data/ # Data directory
│ ├── raw/ # Raw data files
│ └── processed/ # Processed data files
├── notebooks/ # Jupyter notebooks (EDA, experiments)
├── application.py # Entry point for Elastic Beanstalk
├── streamlit_app.py # Streamlit web interface
├── requirements.txt # Python dependencies
├── Dockerfile # Docker configuration (optional)
├── Procfile # Process file for EB
└── README.md # This file
- Python 3.11 or higher
- pip or conda package manager
- Git
- (Optional) Docker for containerization
- (Optional) AWS CLI for cloud deployment
git clone https://github.com/Acquarts/flight-price-ml-production.git
cd flight-price-ml-production
Option A: Using venv (Linux/Mac)
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
Option B: Using venv (Windows)
python -m venv venv
.\venv\Scripts\Activate.ps1
pip install -r requirements.txt
Option C: Using conda (Recommended for Windows)
conda env create -f environment.yml
conda activate flight-price
Place the airlines_flights_data.csv file in:
data/raw/airlines_flights_data.csv
python -m src.models.train
Output:
- models/xgboost_model.pkl - Trained XGBoost model (~9MB)
- models/preprocessor.pkl - Fitted preprocessing pipeline (~4KB)
Expected console output:
INFO - === INITIATING TRAINING ===
INFO - Loading data from data/raw/airlines_flights_data.csv
INFO - Data loaded: 300153 rows, 12 columns
INFO - Data cleaned: (300153, 11)
INFO - Time features encoded
INFO - Train: (240122, 12), Test: (60031, 12)
INFO - Preprocessor created with 30 features
INFO - Data transformed: (240122, 30)
INFO - Training XGBoost with params: {'n_estimators': 600, ...}
INFO - Model trained successfully
INFO - Model metrics:
RMSE: 2458.58
MAE: 1258.63
R²: 0.9883
INFO - Model saved at models/xgboost_model.pkl
INFO - Preprocessor saved at models/preprocessor.pkl
INFO - === TRAINING COMPLETED ===
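The train and test row counts in this log follow from an 80/20 split of 300,153 rows. A minimal check, assuming the test share is rounded up (the convention scikit-learn's `train_test_split` uses for a fractional `test_size`); `split_sizes` is a hypothetical helper, not part of the project:

```python
import math


def split_sizes(n_rows, test_fraction):
    """Return (train_rows, test_rows), rounding the test share up."""
    n_test = math.ceil(n_rows * test_fraction)
    return n_rows - n_test, n_test


train_rows, test_rows = split_sizes(300_153, 0.2)
print(train_rows, test_rows)  # 240122 60031
```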
uvicorn application:application --reload --host 0.0.0.0 --port 8000
Server will start at: http://localhost:8000
Open in your browser:
- Swagger UI: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc
| Method | Endpoint | Description |
|---|---|---|
| GET | / | Root endpoint with API info |
| GET | /health | Health check and model status |
| POST | /predict | Single flight price prediction |
| POST | /predict/batch | Batch predictions for multiple flights |
| GET | /feature-importance | Get model's feature importance |
| GET | /docs | Interactive Swagger documentation |
cURL:
curl -X POST "http://localhost:8000/predict" \
-H "Content-Type: application/json" \
-d '{
"airline": "SpiceJet",
"flight": "SG-8157",
"source_city": "Delhi",
"departure_time": "Evening",
"stops": "zero",
"arrival_time": "Night",
"destination_city": "Mumbai",
"class": "Economy",
"duration": 2.17,
"days_left": 1
}'
Python:
import requests
url = "http://localhost:8000/predict"
payload = {
"airline": "SpiceJet",
"flight": "SG-8157",
"source_city": "Delhi",
"departure_time": "Evening",
"stops": "zero",
"arrival_time": "Night",
"destination_city": "Mumbai",
"class": "Economy",
"duration": 2.17,
"days_left": 1
}
response = requests.post(url, json=payload)
result = response.json()
print(f"Predicted Price: ₹{result['predicted_price']:.2f}")
# Output: Predicted Price: ₹8659.98
JavaScript (fetch):
const url = "http://localhost:8000/predict";
const payload = {
airline: "SpiceJet",
flight: "SG-8157",
source_city: "Delhi",
departure_time: "Evening",
stops: "zero",
arrival_time: "Night",
destination_city: "Mumbai",
class: "Economy",
duration: 2.17,
days_left: 1
};
fetch(url, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify(payload)
})
.then(res => res.json())
.then(data => console.log(`Predicted Price: ₹${data.predicted_price}`));
Response:
{
"predicted_price": 8659.98,
"currency": "INR",
"timestamp": "2026-01-20T10:30:00.123456"
}
curl -X POST "http://localhost:8000/predict/batch" \
-H "Content-Type: application/json" \
-d '{
"flights": [
{
"airline": "SpiceJet",
"flight": "SG-8157",
"source_city": "Delhi",
"departure_time": "Evening",
"stops": "zero",
"arrival_time": "Night",
"destination_city": "Mumbai",
"class": "Economy",
"duration": 2.17,
"days_left": 1
},
{
"airline": "Vistara",
"flight": "UK-995",
"source_city": "Delhi",
"departure_time": "Morning",
"stops": "zero",
"arrival_time": "Afternoon",
"destination_city": "Mumbai",
"class": "Business",
"duration": 2.25,
"days_left": 1
}
]
}'
Response:
{
"predictions": [
{
"predicted_price": 8659.98,
"currency": "INR",
"timestamp": "2026-01-20T10:30:00.123456"
},
{
"predicted_price": 49954.50,
"currency": "INR",
"timestamp": "2026-01-20T10:30:00.123456"
}
],
"total": 2
}
Airlines: SpiceJet, AirAsia, Vistara, GO_FIRST, Indigo, Air_India
Cities: Delhi, Mumbai, Bangalore, Kolkata, Hyderabad, Chennai
Time Slots: Early_Morning, Morning, Afternoon, Evening, Night, Late_Night
Stops: zero, one, two_or_more
Class: Economy, Business
Duration: Float (hours), range: 0.5 - 50.0
Days Left: Integer, range: 1 - 49
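These accepted values can be checked on the client before a request is sent. The sketch below is an assumption about how such a check might look, not the project's Pydantic schema; `find_errors` and the `ALLOWED` table are hypothetical, and the free-form `flight` field is left unchecked.

```python
CITIES = {"Delhi", "Mumbai", "Bangalore", "Kolkata", "Hyderabad", "Chennai"}
TIME_SLOTS = {"Early_Morning", "Morning", "Afternoon", "Evening", "Night", "Late_Night"}

# Categorical fields and the values listed in this README.
ALLOWED = {
    "airline": {"SpiceJet", "AirAsia", "Vistara", "GO_FIRST", "Indigo", "Air_India"},
    "source_city": CITIES,
    "destination_city": CITIES,
    "departure_time": TIME_SLOTS,
    "arrival_time": TIME_SLOTS,
    "stops": {"zero", "one", "two_or_more"},
    "class": {"Economy", "Business"},
}


def find_errors(payload):
    """Return a list of human-readable problems; empty means the payload looks valid."""
    errors = []
    for field, allowed in ALLOWED.items():
        if payload.get(field) not in allowed:
            errors.append(f"{field}: unexpected value {payload.get(field)!r}")
    duration = payload.get("duration")
    if not isinstance(duration, (int, float)) or not 0.5 <= duration <= 50.0:
        errors.append("duration: expected hours between 0.5 and 50.0")
    days_left = payload.get("days_left")
    if not isinstance(days_left, int) or not 1 <= days_left <= 49:
        errors.append("days_left: expected an integer between 1 and 49")
    return errors


sample = {
    "airline": "SpiceJet", "flight": "SG-8157", "source_city": "Delhi",
    "departure_time": "Evening", "stops": "zero", "arrival_time": "Night",
    "destination_city": "Mumbai", "class": "Economy", "duration": 2.17, "days_left": 1,
}
print(find_errors(sample))  # []
```

The server remains the source of truth; a client-side check like this only avoids a round trip for obviously invalid input.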
The project includes a complete web interface built with Streamlit for non-technical users.
Step 1: Start the API (in one terminal)
uvicorn application:application --reload --host 0.0.0.0 --port 8000
Step 2: Start Streamlit (in another terminal)
streamlit run streamlit_app.py
The web interface will automatically open at: http://localhost:8501
Interactive form to predict single flight prices:
- Flight Information
  - Select airline from dropdown
  - Choose origin and destination cities
  - Select cabin class (Economy/Business)
- Schedule & Details
  - Departure and arrival time slots
  - Number of stops
  - Flight duration (hours)
  - Days until departure (slider: 1-49)
- Prediction Result
  - Large display of predicted price in INR
  - Route, class, and airline summary
  - Factors influencing the price
Example workflow:
- Select "SpiceJet" as airline
- Choose "Delhi" → "Mumbai" route
- Set "Economy" class
- Select "Evening" departure, "Night" arrival
- Set "zero" stops, duration 2.17 hours
- Slide "Days left" to 1
- Click "🚀 Predict Price"
- View result: ₹8,659.98
Two options for comparing multiple flights:
Option 1: Upload CSV File
- Upload a CSV with required columns
- Automatic batch prediction for all rows
- View results in sortable table
- Statistics: min/max/average prices
- Download results as CSV
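A CSV with the required columns maps directly onto the body of POST /predict/batch. The sketch below shows one way to do that conversion with the standard library; `csv_to_batch` is a hypothetical helper and not necessarily how streamlit_app.py does it. The sample rows are the ones shown under "CSV Format".

```python
import csv
import io


def csv_to_batch(text):
    """Parse CSV text into the {"flights": [...]} body used by the batch endpoint."""
    flights = []
    for row in csv.DictReader(io.StringIO(text)):
        # CSV cells are strings; the API expects numeric duration and days_left.
        row["duration"] = float(row["duration"])
        row["days_left"] = int(row["days_left"])
        flights.append(row)
    return {"flights": flights}


sample_csv = (
    "airline,flight,source_city,departure_time,stops,arrival_time,"
    "destination_city,class,duration,days_left\n"
    "SpiceJet,SG-8157,Delhi,Evening,zero,Night,Mumbai,Economy,2.17,1\n"
    "Vistara,UK-995,Delhi,Morning,zero,Afternoon,Mumbai,Business,2.25,1\n"
)
body = csv_to_batch(sample_csv)
print(len(body["flights"]))  # 2
```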
CSV Format:
airline,flight,source_city,departure_time,stops,arrival_time,destination_city,class,duration,days_left
SpiceJet,SG-8157,Delhi,Evening,zero,Night,Mumbai,Economy,2.17,1
Vistara,UK-995,Delhi,Morning,zero,Afternoon,Mumbai,Business,2.25,1
Option 2: Quick Comparator
- Select airline, route, duration, stops
- Automatically compares:
- Economy vs Business class
- Different booking dates (1, 7, 14, 30 days advance)
- View in pivot table format
- Insights on price variation by advance booking
Example output:
Class | 1 day | 7 days | 14 days | 30 days
------------|--------|--------|---------|--------
Economy | ₹8,659 | ₹7,234 | ₹6,892 | ₹6,453
Business | ₹49,954| ₹45,123| ₹42,876 | ₹40,234
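A comparison like this amounts to one batch request covering every class and booking-date combination. The following sketch builds those eight payloads from a single base flight; `comparison_payloads` is a hypothetical name, and the actual Streamlit code may assemble the request differently.

```python
from itertools import product

CLASSES = ("Economy", "Business")
DAYS_AHEAD = (1, 7, 14, 30)


def comparison_payloads(base):
    """Return one payload per (class, days_left) pair, copying the base flight."""
    return [
        {**base, "class": cabin, "days_left": days}
        for cabin, days in product(CLASSES, DAYS_AHEAD)
    ]


base_flight = {
    "airline": "SpiceJet", "flight": "SG-8157", "source_city": "Delhi",
    "departure_time": "Evening", "stops": "zero", "arrival_time": "Night",
    "destination_city": "Mumbai", "duration": 2.17,
}
batch_body = {"flights": comparison_payloads(base_flight)}
print(len(batch_body["flights"]))  # 8
```

Sending `batch_body` to POST /predict/batch would return the predictions in the same order, which is enough to fill the pivot table row by row.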
Feature Importance Visualization
- Click "🔍 Get Feature Importance"
- View bar chart of top 15 features
- See table with exact importance values
- Interpretation guide
Top Features (typical):
- class_Business - ~45% importance
- class_Economy - ~38% importance
- duration - ~9% importance
- days_left - ~1.4% importance
- stops - ~1.2% importance
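Selecting the top features from an importance mapping is a sort followed by a slice. A small sketch, using the example scores from the feature-importance response documented in this README; `top_features` is a hypothetical helper, with its default of 15 chosen to match the chart.

```python
def top_features(importance, top_n=15):
    """Return the top_n features ordered from most to least important."""
    ranked = sorted(importance.items(), key=lambda item: item[1], reverse=True)
    return dict(ranked[:top_n])


scores = {
    "days_left": 0.0143,
    "class_Economy": 0.3841,
    "duration": 0.0892,
    "class_Business": 0.4523,
    "stops_zero": 0.0456,
}
print(list(top_features(scores, top_n=3)))
# ['class_Business', 'class_Economy', 'duration']
```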
Model Information
- Training data: 300,153 flights
- Model: XGBoost Regressor
- Metrics: RMSE ~2,450 INR, MAE ~1,260 INR, R² ~0.988
API Status Monitor (real-time):
- ✅ API Active / ❌ API Unavailable
- ✅ Model Loaded / ❌ Model Not Loaded
- Auto-refreshes on page load
Information Panel:
- Model details
- Performance metrics
- Quick start tips
While this project focuses on FastAPI deployment to AWS Elastic Beanstalk, the Streamlit interface can be deployed separately:
Option 1: Streamlit Cloud (Free)
- Push your code to GitHub
- Go to share.streamlit.io
- Connect your GitHub repository
- Select streamlit_app.py as the main file
- Set environment variable: API_URL=https://your-elastic-beanstalk-url.amazonaws.com
- Deploy
Option 2: Heroku
# Create Procfile for Streamlit (single quotes keep $PORT literal so it is expanded at runtime, not by this shell)
echo 'web: streamlit run streamlit_app.py --server.port=$PORT --server.address=0.0.0.0' > Procfile.streamlit
# Deploy
heroku create flight-price-streamlit
git push heroku main
heroku open
Option 3: Docker + AWS ECS
# Dockerfile.streamlit
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt streamlit_app.py ./
COPY src/ ./src/
RUN pip install --no-cache-dir -r requirements.txt
EXPOSE 8501
CMD ["streamlit", "run", "streamlit_app.py", "--server.port=8501", "--server.address=0.0.0.0"]
# Build and push to ECR
docker build -f Dockerfile.streamlit -t flight-price-streamlit .
docker tag flight-price-streamlit:latest <AWS_ACCOUNT>.dkr.ecr.us-east-1.amazonaws.com/flight-price-streamlit:latest
docker push <AWS_ACCOUNT>.dkr.ecr.us-east-1.amazonaws.com/flight-price-streamlit:latest
# Deploy to ECS (configure task definition and service)
Note: Update API_URL in streamlit_app.py (line 16) to point to your production API:
# For local development
API_URL = "http://localhost:8000"
# For production
API_URL = "http://your-app.elasticbeanstalk.com"
# Run all tests
pytest tests/ -v
# Run with coverage report
pytest tests/ -v --cov=src --cov-report=term-missing
# Run specific test file
pytest tests/test_api.py -v
# Run specific test
pytest tests/test_api.py::test_predict_endpoint -v
| Test | Description | Status |
|---|---|---|
| test_root_endpoint | Validates root endpoint returns API info | ✅ |
| test_health_check | Verifies health endpoint and model status | ✅ |
| test_predict_endpoint | Tests single prediction with valid data | ✅ |
| test_predict_invalid_data | Checks that invalid input is rejected (422 error) | ✅ |
| test_batch_prediction | Tests batch prediction endpoint | ✅ |
| test_feature_importance | Verifies feature importance endpoint | ✅ |
| test_docs_endpoint | Checks Swagger documentation availability | ✅ |
Expected Output:
======================== test session starts ========================
tests/test_api.py::test_root_endpoint PASSED [ 14%]
tests/test_api.py::test_health_check PASSED [ 28%]
tests/test_api.py::test_predict_endpoint PASSED [ 42%]
tests/test_api.py::test_predict_invalid_data PASSED [ 57%]
tests/test_api.py::test_batch_prediction PASSED [ 71%]
tests/test_api.py::test_feature_importance PASSED [ 85%]
tests/test_api.py::test_docs_endpoint PASSED [100%]
========================= 7 passed in 1.73s =========================
- AWS account with appropriate IAM permissions
- AWS CLI installed and configured
- EB CLI installed: pip install awsebcli
eb init -p python-3.11 flight-price-api --region us-east-1
eb create flight-price-prod \
--instance-type t3.medium \
--envvars MODEL_PATH=models/xgboost_model.pkl,PREPROCESSOR_PATH=models/preprocessor.pkl,LOG_LEVEL=INFO
This creates:
- EC2 instances (t3.medium)
- Application Load Balancer
- Auto Scaling Group (1-4 instances)
- CloudWatch logging
- Security groups
Wait 5-10 minutes for environment creation.
# Deploy current code
eb deploy
# Check status
eb status
# View logs
eb logs
# Open in browser
eb open
eb status | grep CNAME
# Output: CNAME: flight-price-prod.us-east-1.elasticbeanstalk.com
Your API will be available at:
http://flight-price-prod.us-east-1.elasticbeanstalk.com
Test it:
curl http://flight-price-prod.us-east-1.elasticbeanstalk.com/health
# Scale instances
eb scale 3
# SSH into instance
eb ssh
# View environment info
eb status --verbose
# Restart application
eb restart
# Terminate environment (to stop costs)
eb terminate flight-price-prod
See CI/CD Pipeline section below.
Estimated Monthly Costs (us-east-1):
- t3.medium instance: ~$30/month
- Application Load Balancer: ~$20/month
- CloudWatch + S3: ~$5/month
- Total: ~$55/month
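The total is the sum of the three line items. As a rough illustration of how it moves when the environment is scaled (for example with `eb scale 3`), the sketch below assumes only the instance charge grows with the instance count; data transfer and storage growth are ignored, and `monthly_cost` is a hypothetical helper.

```python
# Approximate monthly prices in USD, taken from the estimate above.
INSTANCE_USD = 30   # one t3.medium instance
LOAD_BALANCER_USD = 20
LOGS_AND_STORAGE_USD = 5


def monthly_cost(instances=1):
    """Estimate monthly spend, scaling only the per-instance charge."""
    return instances * INSTANCE_USD + LOAD_BALANCER_USD + LOGS_AND_STORAGE_USD


print(monthly_cost(1))  # 55
print(monthly_cost(3))  # 115
```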
Cost Optimization:
- Use t3.small for low traffic: ~$15/month
- Terminate environment when not in use: $0
- Use AWS Free Tier for first year
- Enable auto-scaling with min=0 for dev environments
To stop costs:
eb terminate flight-price-prod
The project includes a complete CI/CD pipeline that automatically:
- ✅ Runs tests on every push
- ✅ Builds deployment package
- ✅ Uploads to S3
- ✅ Deploys to Elastic Beanstalk (on main branch)
Go to your repository: Settings → Secrets and variables → Actions → New repository secret
Add these secrets:
| Secret Name | Value | Description |
|---|---|---|
| AWS_ACCESS_KEY_ID | Your AWS access key | IAM user credentials |
| AWS_SECRET_ACCESS_KEY | Your AWS secret key | IAM user credentials |
| S3_BUCKET | flight-price-deployments | S3 bucket for artifacts |
aws s3 mb s3://flight-price-deployments --region us-east-1
Edit .github/workflows/deploy.yml (line 10):
env:
  AWS_REGION: us-east-1
  EB_APPLICATION_NAME: flight-price-api
  EB_ENVIRONMENT_NAME: flight-price-prod # Match your EB environment name
# Make a change
echo "# Update" >> README.md
# Commit and push
git add README.md
git commit -m "Trigger deployment"
git push origin main
Go to: GitHub → Actions tab
You'll see the workflow running with these steps:
- 🧪 Test (runs pytest)
- 📦 Build (creates deployment package)
- ☁️ Deploy (uploads to EB)
Typical execution time: 2-3 minutes
The workflow is defined in .github/workflows/deploy.yml:
name: Deploy to AWS Elastic Beanstalk
on:
  push:
    branches: [ main ]
  pull_request:
    branches: [ main ]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
      - run: pip install -r requirements.txt
      - run: pytest tests/ -v --cov=src
  deploy:
    needs: test
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/main'
    steps:
      - uses: actions/checkout@v3
      - uses: aws-actions/configure-aws-credentials@v2
      - run: zip -r deploy.zip .
      - run: aws s3 cp deploy.zip s3://${{ secrets.S3_BUCKET }}/
      - run: aws elasticbeanstalk create-application-version ...
      - run: aws elasticbeanstalk update-environment ...
┌─────────────┐
│ git push │
│ to main │
└──────┬──────┘
│
▼
┌─────────────┐
│ Run Tests │ ← pytest tests/
│ │
└──────┬──────┘
│ PASS ✅
▼
┌─────────────┐
│Build Package│ ← zip deployment files
│ │
└──────┬──────┘
│
▼
┌─────────────┐
│ Upload to S3│ ← aws s3 cp
│ │
└──────┬──────┘
│
▼
┌─────────────┐
│Create App │ ← create-application-version
│ Version │
└──────┬──────┘
│
▼
┌─────────────┐
│ Deploy to │ ← update-environment
│ EB │
└──────┬──────┘
│
▼
┌─────────────┐
│ Live! 🎉 │ ← API accessible
└─────────────┘
XGBoost (eXtreme Gradient Boosting) - Gradient boosted decision trees optimized for speed and performance.
Optimized using RandomizedSearchCV with 20 iterations:
{
'n_estimators': 600, # Number of boosting rounds
'learning_rate': 0.2, # Step size shrinkage
'max_depth': 8, # Maximum tree depth
'subsample': 1.0, # Fraction of samples for each tree
'colsample_bytree': 0.8, # Fraction of features for each tree
'gamma': 1, # Minimum loss reduction for split
'random_state': 42 # Reproducibility
}
Total Features: ~30 after transformation
Transformations Applied:
- Cyclic Encoding for time features:
  time_sin = sin(2π × seconds / 86400)
  time_cos = cos(2π × seconds / 86400)
  Applied to: departure_time, arrival_time
- One-Hot Encoding for low-cardinality categorical:
  - airline (6 categories)
  - stops (3 categories)
  - class (2 categories)
- Target Encoding for high-cardinality categorical:
  - source_city (6 categories)
  - destination_city (6 categories)
- Numerical Features (passthrough):
  - duration
  - days_left
  - Time encoding features
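The cyclic encoding formula can be tried in isolation. The sketch below implements it for a time given in seconds since midnight; how the project converts its time slots (Morning, Evening, and so on) into seconds is not stated here, so that mapping is left out, and `encode_time` is a hypothetical helper name.

```python
import math

SECONDS_PER_DAY = 86_400


def encode_time(seconds):
    """Map seconds since midnight to a (sin, cos) point on the unit circle."""
    angle = 2 * math.pi * seconds / SECONDS_PER_DAY
    return math.sin(angle), math.cos(angle)


# One minute before and one minute after midnight land close together,
# which a plain 0..86399 number would not capture.
late = encode_time(23 * 3600 + 59 * 60)
early = encode_time(60)
print(math.dist(late, early) < 0.01)  # True
```

Using both sine and cosine is what makes the encoding unambiguous: either value alone maps two different times of day to the same number.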
Top 10 features by importance:
| Rank | Feature | Importance | Impact |
|---|---|---|---|
| 1 | class_Business | ~45% | Strong positive (higher prices) |
| 2 | class_Economy | ~38% | Moderate positive |
| 3 | duration | ~9% | Positive correlation |
| 4 | days_left | ~1.4% | Negative correlation (book early = lower price) |
| 5 | stops | ~1.2% | More stops = lower price |
| 6 | departure_time_sin | ~0.8% | Time of day effect |
| 7 | arrival_time_cos | ~0.7% | Time of day effect |
| 8 | source_city (encoded) | ~0.6% | Origin city effect |
| 9 | destination_city (encoded) | ~0.5% | Destination city effect |
| 10 | airline (encoded) | ~0.4% | Carrier effect |
Cross-Validation (5-fold):
- RMSE: 2,449.52 ± 16.93 INR
- MAE: 1,259.90 ± 7.60 INR
- R²: 0.9884 ± 0.0002
Test Set:
- RMSE: 2,458.58 INR (~11.8% of mean price)
- MAE: 1,258.63 INR (~6.0% of mean price)
- R²: 0.9883
Interpretation:
- Model explains 98.8% of price variance
- Average prediction error: ₹1,258.63
- Root mean squared error: ₹2,458.58
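For readers who want to see how these three metrics are defined, here is a dependency-free sketch. The prices in the example are made-up toy values used only to exercise the function, not data from the project, and `regression_metrics` is a hypothetical helper (the training script presumably relies on library implementations).

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (rmse, mae, r2) for two equal-length sequences of numbers."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)          # residual sum of squares
    rmse = math.sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1 - ss_res / ss_tot
    return rmse, mae, r2


# Toy values for illustration only.
actual = [5000, 10000, 50000]
predicted = [5500, 9000, 51000]
rmse, mae, r2 = regression_metrics(actual, predicted)
print(f"RMSE={rmse:.2f} MAE={mae:.2f} R2={r2:.4f}")
```

RMSE squares each error before averaging, so it is always at least as large as MAE; the gap between the two reported values suggests a minority of flights with comparatively large errors.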
python -m src.models.train
Pipeline:
- Load data (300,153 flights)
- Clean data (remove duplicates)
- Encode time features (cyclic)
- Train/test split (80/20)
- Create preprocessing pipeline
- Fit preprocessor on training data
- Transform features
- Train XGBoost model
- Evaluate on test set
- Save model and preprocessor
Artifacts Generated:
- models/xgboost_model.pkl - Trained model (~9MB)
- models/preprocessor.pkl - Fitted pipeline (~4KB)
Key Findings:
- Class is the dominant factor: Business class flights cost ~5.7x more than Economy
- Early booking saves money: Booking 30 days vs 1 day in advance saves ~15-20%
- Direct flights are premium: Non-stop flights cost ~10-15% more
- Flight duration matters: Longer flights generally cost more (non-linear relationship)
- Airline differences: Vistara most expensive, Air India most economical
- Time of day: Evening/Night departures slightly more expensive
Description: Root endpoint with API information
Response:
{
"message": "Flight Price Predictor API",
"version": "1.0.0",
"docs": "/docs"
}
Description: Health check and model status
Response:
{
"status": "healthy",
"model_loaded": true,
"timestamp": "2026-01-20T10:30:00.123456"
}
Status Codes:
- 200: Service healthy
- 503: Service unhealthy
Description: Predict price for a single flight
Request Body:
{
"airline": "SpiceJet",
"flight": "SG-8157",
"source_city": "Delhi",
"departure_time": "Evening",
"stops": "zero",
"arrival_time": "Night",
"destination_city": "Mumbai",
"class": "Economy",
"duration": 2.17,
"days_left": 1
}
Response:
{
"predicted_price": 8659.98,
"currency": "INR",
"timestamp": "2026-01-20T10:30:00.123456"
}
Status Codes:
- 200: Success
- 422: Validation error
- 500: Internal server error
Description: Predict prices for multiple flights
Request Body:
{
"flights": [
{
"airline": "SpiceJet",
"flight": "SG-8157",
"source_city": "Delhi",
"departure_time": "Evening",
"stops": "zero",
"arrival_time": "Night",
"destination_city": "Mumbai",
"class": "Economy",
"duration": 2.17,
"days_left": 1
},
{
"airline": "Vistara",
"flight": "UK-995",
"source_city": "Delhi",
"departure_time": "Morning",
"stops": "zero",
"arrival_time": "Afternoon",
"destination_city": "Mumbai",
"class": "Business",
"duration": 2.25,
"days_left": 1
}
]
}
Response:
{
"predictions": [
{
"predicted_price": 8659.98,
"currency": "INR",
"timestamp": "2026-01-20T10:30:00.123456"
},
{
"predicted_price": 49954.50,
"currency": "INR",
"timestamp": "2026-01-20T10:30:00.123456"
}
],
"total": 2
}
Description: Get model's feature importance scores
Query Parameters:
- top_n (optional): Number of top features to return (default: 15)
Response:
{
"feature_importance": {
"class_Business": 0.4523,
"class_Economy": 0.3841,
"duration": 0.0892,
"stops_zero": 0.0456,
"airline_Air_India": 0.0234,
"destination_city": 0.0198,
"source_city": 0.0187,
"stops_one": 0.0154,
"days_left": 0.0143,
"airline_Vistara": 0.0112
},
"top_n": 10
}
422 Validation Error:
{
"detail": [
{
"loc": ["body", "duration"],
"msg": "ensure this value is greater than 0",
"type": "value_error.number.not_gt"
}
]
}
500 Internal Server Error:
{
"detail": "Error in prediction: [error message]"
}
Contributions are welcome! Please follow these guidelines:
- Fork the repository
- Create a feature branch: git checkout -b feature/amazing-feature
- Make your changes
- Run tests: pytest tests/ -v
- Commit: git commit -m 'Add amazing feature'
- Push: git push origin feature/amazing-feature
- Open a Pull Request
- Follow PEP 8 guidelines
- Use type hints where applicable
- Add docstrings to all functions
- Write tests for new features
- Update documentation
- Ensure all tests pass
- Update README.md if needed
- Add description of changes
- Request review from maintainers
This project is licensed under the MIT License - see the LICENSE file for details.
Adrián - Acquarts
- Dataset from Kaggle: Flight Price Prediction Dataset
- FastAPI documentation and community
- Streamlit for the amazing UI framework
- XGBoost developers for the powerful ML library
- AWS for cloud infrastructure
If you encounter any issues or have questions:
- Check the TROUBLESHOOTING.md guide
- Open an issue on GitHub
- Review existing issues and discussions
- Add authentication (API keys, JWT)
- Implement rate limiting
- Add caching layer (Redis)
- Multi-region deployment
- Real-time price updates
- Historical price tracking
- Price alerts system
- Mobile app (React Native)
- GraphQL API
- Kubernetes deployment option
⭐ If you find this project useful, please consider giving it a star! ⭐
Made with ❤️ by Acquarts