This project analyzes international passenger traffic data for AeroConnect to identify high- and low-demand routes, uncover seasonal and regional trends, and build a model to forecast traffic for resource allocation decisions.
The challenge required:
- Cleaning and preparing the dataset.
- Performing exploratory data analysis (EDA).
- Building a predictive model.
- Evaluating the model’s accuracy.
- Providing recommendations for AeroConnect’s future route planning.
I cleaned the dataset by:
- Converting Year and Month_num into a proper Date column.
- Checking for missing or inconsistent values.
- Creating a Route column.
- Sorting the dataset chronologically for modeling.
Cleaned data link: https://drive.google.com/file/d/17vA1u8qtmHL24LFEBr2MuxxVBUZV_oqm/view?usp=sharing
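The cleaning steps above can be sketched roughly as follows. This is a minimal illustration, not the exact code used for the project: only `Year` and `Month_num` are named in this write-up, so the `Origin`, `Destination`, and `Passengers` columns, the sample rows, the `clean_traffic` helper, and the rule of dropping rows with missing passenger counts are all assumptions.

```python
import pandas as pd

# Hypothetical sample rows. Only Year and Month_num come from the write-up;
# Origin, Destination, and Passengers are assumed column names.
raw = pd.DataFrame({
    "Year": [2019, 2018, 2019, 2018],
    "Month_num": [2, 12, 1, 11],
    "Origin": ["SYD", "MEL", "SYD", "MEL"],
    "Destination": ["LAX", "SIN", "LAX", "SIN"],
    "Passengers": [1200, 950, 1100, None],
})

def clean_traffic(df: pd.DataFrame) -> pd.DataFrame:
    """Illustrative cleaning pipeline (hypothetical helper)."""
    out = df.copy()
    # Inconsistent values: keep only months in the range 1..12.
    out = out[out["Month_num"].between(1, 12)]
    # Missing values: drop rows without a passenger count (one possible policy).
    out = out.dropna(subset=["Passengers"]).copy()
    # Combine Year and Month_num into a first-of-month Date column.
    parts = out[["Year", "Month_num"]].rename(
        columns={"Year": "year", "Month_num": "month"}
    )
    out["Date"] = pd.to_datetime(parts.assign(day=1))
    # Route label built from the assumed Origin and Destination columns.
    out["Route"] = out["Origin"] + "-" + out["Destination"]
    # Sort chronologically so later train/test splits respect time order.
    return out.sort_values("Date").reset_index(drop=True)

cleaned = clean_traffic(raw)
print(cleaned[["Date", "Route", "Passengers"]])
```

Sorting by `Date` before modeling matters because a time-based split (train on earlier months, test on later ones) is only meaningful when the rows are in chronological order.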
- Setup: Downloaded and installed all necessary Python libraries (e.g., pandas, numpy, matplotlib, scikit-learn) to support data cleaning, visualization, and modeling.
- EDA: Identified the most and least trafficked routes, seasonal travel patterns, and geographical concentration of international traffic.
- Model: Built a ridge regression model (α = 0.1) with a third-degree polynomial time trend and monthly dummy variables to capture both long-term growth and seasonality.
- Evaluation: Assessed accuracy using mean absolute error (MAE), root mean squared error (RMSE), and mean absolute percentage error (MAPE).
- Recommendations: Suggested which routes AeroConnect should expand or scale back, and how to apply forecasts for planning.
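To make the model and evaluation steps concrete, here is a minimal sketch of a cubic trend plus monthly dummies fitted with ridge regression (α = 0.1) and scored with MAE, RMSE, and MAPE. The series is synthetic and the details are assumptions rather than the project's actual setup: the time index is rescaled to [0, 1], the first month is dropped from the dummies, and the last 12 months are held out as the test set.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_absolute_error, mean_squared_error

# Synthetic monthly passenger series (illustrative only, not AeroConnect data).
rng = np.random.default_rng(0)
dates = pd.date_range("2010-01-01", periods=120, freq="MS")
month = dates.month.to_numpy()
t = np.arange(len(dates), dtype=float)
y = 1000 + 5 * t + 100 * np.sin(2 * np.pi * month / 12) + rng.normal(0, 20, len(t))

# Third-degree polynomial trend on a rescaled time index (assumed scaling,
# used here to keep the cubic terms numerically well behaved).
ts = t / t.max()
trend = np.column_stack([ts, ts**2, ts**3])

# Monthly dummy variables; January is the baseline absorbed by the intercept.
dummies = pd.get_dummies(month, prefix="m", drop_first=True).to_numpy(dtype=float)
X = np.hstack([trend, dummies])

# Chronological split: hold out the final 12 months for evaluation.
X_train, X_test = X[:-12], X[-12:]
y_train, y_test = y[:-12], y[-12:]

model = Ridge(alpha=0.1)
model.fit(X_train, y_train)
pred = model.predict(X_test)

# Accuracy metrics on the held-out months.
mae = mean_absolute_error(y_test, pred)
rmse = np.sqrt(mean_squared_error(y_test, pred))
mape = np.mean(np.abs((y_test - pred) / y_test)) * 100  # in percent

print(f"MAE={mae:.1f}  RMSE={rmse:.1f}  MAPE={mape:.2f}%")
```

The ridge penalty depends on feature scale, so a different scaling of the time index would change how strongly α = 0.1 shrinks the trend coefficients. MAPE is undefined when an actual value is zero, which is worth checking for very low-traffic routes.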