Predicts hourly taxi demand at airports using historical data and time patterns. Helps optimize driver allocation during peak hours, improving service efficiency and reducing wait times. Enables proactive staffing decisions through accurate short-term forecasting of passenger demand.

gorop51-2/Taxi-Demand-Forecasting

🚕 Taxi Demand Forecasting

📌 Project Overview

The "Chetenkoe Taxi" company collects historical data on taxi orders at airports. To bring more drivers on during peak hours, it needs a forecast of the number of taxi orders for the next hour. This project builds a time series model that predicts hourly taxi demand.

🎯 Objective

Build a model that accurately predicts hourly taxi order demand with:

  • RMSE ≤ 48 on the test set
  • Ability to forecast demand for multiple future hours
  • Timely identification of peak demand periods to optimize driver allocation
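
The RMSE target in the first bullet can be checked with a few lines of NumPy. This is a minimal sketch: the names `rmse`, `meets_target`, and `RMSE_THRESHOLD` are illustrative and are not taken from the project code.

```python
import numpy as np

RMSE_THRESHOLD = 48  # target stated in the objective


def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted order counts."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def meets_target(y_true, y_pred, threshold=RMSE_THRESHOLD):
    """Return True when the forecast error is within the threshold."""
    return rmse(y_true, y_pred) <= threshold
```

Because RMSE squares each error before averaging, a few badly missed peak hours raise it more than many small misses, which suits a task focused on peak demand.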

📊 Dataset Description

  • Source: taxi.csv - historical taxi order data
  • Time period: March 1, 2018 to August 31, 2018 (spring-summer season)
  • Original frequency: Every 10 minutes
  • Resampled frequency: Hourly (after processing)
  • Key feature:
    • num_orders - number of taxi orders

The dataset contains 26,496 ten-minute entries (4,416 hourly observations after resampling) with no missing values or anomalies. The time series shows:

  • Mild upward trend
  • Daily seasonality patterns
  • No yearly seasonality (limited to spring-summer period)
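
The resampling step described above can be sketched with pandas. Since `taxi.csv` is not reproduced here, the example generates synthetic 10-minute counts over the same date range. The `datetime` index name and the Poisson-distributed values are assumptions for illustration; only `num_orders` comes from the dataset description.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for taxi.csv: one row every 10 minutes.
# The index name `datetime` and the random counts are assumptions.
index = pd.date_range("2018-03-01 00:00", "2018-08-31 23:50", freq="10min")
rng = np.random.default_rng(0)
raw = pd.DataFrame({"num_orders": rng.poisson(lam=14, size=len(index))}, index=index)
raw.index.name = "datetime"

# Orders are counts, so hourly demand is the sum of the six 10-minute bins.
hourly = raw.resample("1h").sum()

print(len(raw), len(hourly))
```

Summing rather than averaging keeps the target in units of orders per hour, which is the quantity the forecast is meant to predict.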

🔍 Methodology

Data Preparation

  • Resampled original 10-minute data to hourly frequency
  • Conducted stationarity analysis using the Augmented Dickey-Fuller test (the test rejected a unit root, indicating stationarity)
  • Created time-based features:
    • Hour of day
    • Day of week
    • Weekend indicator
    • Lag features (up to 248 hours)
    • Rolling mean (window sizes: 19, 24, 29 hours)
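
The feature list above could be built roughly as follows. This is a sketch under assumptions, not the project's actual pipeline: the function name `make_features`, the column names, and the choice to shift the series by one hour before computing rolling means are all illustrative.

```python
import numpy as np
import pandas as pd


def make_features(hourly, max_lag, rolling_windows):
    """Return a copy of `hourly` with calendar, lag, and rolling-mean features.

    `hourly` is expected to have a DatetimeIndex and a `num_orders` column.
    """
    data = hourly.copy()
    data["hour"] = data.index.hour
    data["dayofweek"] = data.index.dayofweek
    data["is_weekend"] = (data["dayofweek"] >= 5).astype(int)

    # Build all lag columns at once to avoid repeated column insertion.
    lags = {f"lag_{k}": data["num_orders"].shift(k) for k in range(1, max_lag + 1)}
    data = pd.concat([data, pd.DataFrame(lags)], axis=1)

    # Shift by one hour first so a window never includes the current target.
    for window in rolling_windows:
        data[f"rolling_mean_{window}"] = data["num_orders"].shift(1).rolling(window).mean()

    # Rows at the start lack a full history and are dropped.
    return data.dropna()
```

With the settings listed above, a call might look like `make_features(hourly, max_lag=248, rolling_windows=[19, 24, 29])`. Shifting before the rolling mean matters: without it, the current hour's order count would leak into its own features.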

Modeling Approach

Implemented a custom time series forecasting framework with:

  • TimeSeriesSplit cross-validation
  • Feature engineering pipeline
  • Two candidate models with hyperparameter optimization:
    • Ridge Regression (L2-regularized linear model)
    • LGBMRegressor (LightGBM gradient boosting)
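
A minimal sketch of how `TimeSeriesSplit` can be combined with a hyperparameter search for Ridge Regression in scikit-learn. The synthetic data, the `alpha` grid, and the number of splits are assumptions chosen for illustration; the project's actual search space is not stated here.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit

# Synthetic, time-ordered data standing in for the engineered features.
rng = np.random.default_rng(0)
n_samples = 500
X = rng.normal(size=(n_samples, 5))
y = X @ np.array([3.0, -2.0, 0.5, 0.0, 1.0]) + rng.normal(scale=0.1, size=n_samples)

# Each fold trains on an initial segment and validates on the block that follows,
# so the model is never evaluated on data that precedes its training window.
cv = TimeSeriesSplit(n_splits=5)

search = GridSearchCV(
    estimator=Ridge(),
    param_grid={"alpha": [0.1, 1.0, 10.0]},  # hypothetical grid
    cv=cv,
    scoring="neg_root_mean_squared_error",
)
search.fit(X, y)

cv_rmse = -search.best_score_  # scikit-learn reports the negated RMSE
print(search.best_params_, round(cv_rmse, 3))
```

The same search object would accept `LGBMRegressor` as the estimator with a matching parameter grid. A shuffled K-fold split would let future observations inform predictions of the past, which is why an ordered split is used for time series.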

Model Selection

  • Tested multiple combinations of lag features and window sizes
  • Evaluated models using RMSE metric
  • Selected best model based on cross-validation performance

📈 Results

Best Model: Ridge Regression with tuned hyperparameters

Performance Metrics:

  • Cross-validation RMSE: 25
  • Test set RMSE: 33.79 (well below the 48 threshold)
  • Successfully captures daily demand patterns and peak hours

Practical Application:

  • Model can forecast demand for multiple future hours
  • Example prediction for September 1, 2018:
    • First hour: ~279 orders
    • Full day forecast shows clear daily pattern with morning and evening peaks
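
One common way to forecast several hours ahead with a one-step model is the recursive strategy, where each prediction is fed back in as the newest lag. Whether the project uses this strategy is an assumption. The sketch below also assumes a model trained on lag features only, ordered from `lag_1` to `lag_n`; the function name `recursive_forecast` is hypothetical.

```python
import numpy as np


def recursive_forecast(model, history, n_lags, steps):
    """Forecast `steps` values ahead by feeding each prediction back as a lag.

    `model` is any fitted estimator with a scikit-learn style `predict` method
    whose input columns are lag_1, lag_2, ..., lag_n (most recent first).
    """
    window = np.asarray(history, dtype=float)[-n_lags:].tolist()
    forecasts = []
    for _ in range(steps):
        # Reverse the window so the most recent value comes first (lag_1).
        features = np.array(window[::-1]).reshape(1, -1)
        next_value = float(model.predict(features)[0])
        forecasts.append(next_value)
        # Slide the window: drop the oldest value, append the new prediction.
        window = window[1:] + [next_value]
    return forecasts
```

A 24-hour forecast would call this with `steps=24`. A model that also uses calendar or rolling-mean features would need those recomputed at every step, and errors accumulate as predictions replace observed values, so accuracy usually degrades with the horizon.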

The solution enables "Chetenkoe Taxi" to proactively allocate drivers based on predicted demand, improving service quality during peak hours while optimizing operational costs.
