This project implements a Unified Multivariate and Dual-Objective Forecasting Model for Sustainable Food Supply Chains using External Signals. The system combines real-time data integration, machine learning models, and sustainability metrics to forecast both food demand and food waste.
- ✅ Dual-Objective Prediction: Simultaneously forecasts food demand (orders) and food waste (kg)
- ✅ Real-time API Integration: Weather, fuel prices, economic indicators, and Google Trends
- ✅ Advanced ML Models: Random Forest, Gradient Boosting, and Ensemble methods
- ✅ Sustainability Metrics: CO2 impact calculations and waste reduction potential
- ✅ Production-Ready Architecture: Database integration, model persistence, and API endpoints
- ✅ Comprehensive Evaluation: Multiple metrics including R², MAE, RMSE, and MAPE
- Real-time Weather Data (WeatherAPI)
- Fuel Price Data (Multiple sources)
- Economic Indicators (World Bank/Financial APIs)
- Google Trends (Consumer behavior)
- Holiday Information (Country-specific)

- Feature Engineering (Temporal, lag, interaction features)
- Data Normalization (MinMaxScaler, StandardScaler)
- Sequence Creation (10-week lookback windows)
- Data Validation and Cleaning

- Random Forest Regressor
- Gradient Boosting Regressor
- LSTM-style Ensemble
- Multi-output Regression

- Real-time Prediction API
- Batch Prediction System
- Alert Generation
- Confidence Scoring

pip install "pandas>=1.5.0"
pip install "numpy>=1.20.0"
pip install "scikit-learn>=1.1.0"
pip install "requests>=2.28.0"
pip install "holidays>=0.16"
# sqlite3 ships with the Python standard library; it is not installed with pip

pip install "tensorflow>=2.10.0"  # For LSTM/Transformer models
pip install "pytrends>=4.9.0"     # For Google Trends
pip install "optuna>=3.0.0"       # For hyperparameter optimization
pip install "flask>=2.0.0"        # For API deployment

The system automatically creates a SQLite database with the following tables:
- food_data: Historical demand and waste data
- enhanced_food_data: Data with external features
- predictions: Model predictions and actuals
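As an illustration of what such a database might look like, the sketch below creates the three tables in an in-memory SQLite database. Only the table names come from this README; every column name and type, and the helpers `init_db` and `list_tables`, are assumptions made for the example and may differ from what the project actually creates.

```python
import sqlite3

# Hypothetical schema: table names are from this README, all columns are assumptions.
SCHEMA = """
CREATE TABLE IF NOT EXISTS food_data (
    week_start TEXT PRIMARY KEY,   -- ISO date of the week
    orders     INTEGER NOT NULL,   -- demand, in orders
    waste_kg   REAL NOT NULL       -- food waste, in kg
);
CREATE TABLE IF NOT EXISTS enhanced_food_data (
    week_start  TEXT PRIMARY KEY,
    orders      INTEGER NOT NULL,
    waste_kg    REAL NOT NULL,
    temperature REAL,              -- example external features
    fuel_price  REAL,
    is_holiday  INTEGER
);
CREATE TABLE IF NOT EXISTS predictions (
    week_start       TEXT NOT NULL,
    model_name       TEXT NOT NULL,
    predicted_orders REAL,
    predicted_waste  REAL,
    actual_orders    REAL,         -- filled in once actuals are known
    actual_waste     REAL,
    PRIMARY KEY (week_start, model_name)
);
"""


def init_db(path=":memory:"):
    """Open (or create) the database and make sure all tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def list_tables(conn):
    """Return the names of all tables, sorted alphabetically."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
    ).fetchall()
    return [row[0] for row in rows]
```

Calling `init_db()` with no argument keeps everything in memory, which is convenient for experiments; passing a file path would persist the data.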
from advanced_forecaster import AdvancedFoodSupplyChainForecaster
# Initialize forecaster
forecaster = AdvancedFoodSupplyChainForecaster({
    'weather_api_key': 'YOUR_API_KEY',
    'location': 'Ahmedabad,India',
    'lookback': 10,
    'prediction_horizon': 1
})

# Prepare comprehensive dataset with external features
enhanced_data = forecaster.prepare_comprehensive_dataset(
    start_date='2022-01-01',
    end_date='2024-01-01'
)
# Create sequences for training
X, y = forecaster.create_sequences(enhanced_data, lookback=10)
# Split data
(X_train, y_train), (X_val, y_val), (X_test, y_test) = forecaster.split_data(X, y)

from model_trainer import ModelTrainer
trainer = ModelTrainer(forecaster)
# Train multiple models
rf_result = trainer.train_random_forest(X_train, y_train, X_val, y_val)
gb_result = trainer.train_gradient_boosting(X_train, y_train, X_val, y_val)
lstm_result = trainer.train_lstm_model(X_train, y_train, X_val, y_val)
# Evaluate models
test_results = trainer.evaluate_all_models(X_test, y_test)

from datetime import datetime, timedelta
from prediction_system import RealTimePredictionSystem
# Initialize prediction system
# best_model is not defined above; set it to the trained model you select from test_results
prediction_system = RealTimePredictionSystem(forecaster, trainer, best_model)
# Make real-time prediction
prediction = prediction_system.create_prediction_api_response()
# Batch predictions
batch_predictions = prediction_system.batch_predict(
    start_date=datetime.now(),
    end_date=datetime.now() + timedelta(weeks=4)
)

- Sign up at WeatherAPI
- Get your free API key
- Set the key in configuration:
config = {
    'weather_api_key': 'YOUR_WEATHER_API_KEY',
    'location': 'Your_City,Country'
}

For production use, integrate with:
- World Bank API: Economic indicators
- Financial Modeling Prep: Real-time economic data
- Trading Economics: Comprehensive economic data
Available options:
- Fuel Price APIs India: Real-time fuel prices
- HERE Technologies: Global fuel price data
- Local government APIs: Region-specific data
Based on our evaluation with 105 weeks of data:
| Model | Demand MAE | Demand R² | Waste MAE | Waste R² | Overall Score |
|---|---|---|---|---|---|
| Random Forest | 20.9 | -1.71 | 5.10 | 0.415 | -0.650 |
| LSTM Ensemble | 20.9 | -1.79 | 5.01 | 0.412 | -0.689 |
| Gradient Boosting | 26.3 | -3.22 | 6.75 | 0.116 | -1.553 |
Note: Negative R² values indicate that the model performs worse than a naive predictor that always outputs the mean, which suggests the need for more data or better feature engineering.
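To make the note on negative R² concrete, here is a minimal sketch of how MAE and R² can be computed in plain Python. The function names and the sample numbers are illustrative and are not taken from the project code. The `overall_score` helper reflects an inference from the table (the Overall Score column is consistent, to within rounding, with the mean of the two R² values); the README does not state that formula.

```python
def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def r2_score(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def overall_score(demand_r2, waste_r2):
    """Assumed combination rule: plain average of the two R² values."""
    return (demand_r2 + waste_r2) / 2


# Made-up weekly orders, for illustration only
actual = [100, 110, 90, 105, 95]
mean_predictor = [100] * 5                    # always predicts the mean
biased_predictor = [120, 130, 110, 125, 115]  # consistently 20 orders too high

print(r2_score(actual, mean_predictor))    # 0.0: the baseline
print(r2_score(actual, biased_predictor))  # below zero: worse than the baseline
print(mae(actual, biased_predictor))
```

A predictor that tracks the shape of the series but is offset by a constant can still score far below zero, which is one reason to inspect MAE alongside R².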
The system calculates:
- CO2 Impact: 2.5 kg CO2 per kg food waste
- Waste Reduction Potential: Difference between predicted and optimal waste
- Cost Savings: Based on operational cost models
- Environmental Benefits: Quantified sustainability improvements
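The first two metrics above reduce to simple arithmetic, sketched here under stated assumptions. The 2.5 kg CO2 per kg factor is the one quoted in this README; the function names are hypothetical, and clamping the reduction potential at zero (when predicted waste is already below the optimum) is a choice made for this example, not documented behavior.

```python
CO2_KG_PER_KG_WASTE = 2.5  # emission factor quoted in this README


def co2_impact_kg(waste_kg):
    """CO2 attributed to a given mass of food waste."""
    return waste_kg * CO2_KG_PER_KG_WASTE


def waste_reduction_potential_kg(predicted_waste_kg, optimal_waste_kg):
    """Predicted minus optimal waste, clamped at zero (assumption)."""
    return max(predicted_waste_kg - optimal_waste_kg, 0.0)


def avoidable_co2_kg(predicted_waste_kg, optimal_waste_kg):
    """CO2 that would be avoided if waste dropped to the optimal level."""
    return co2_impact_kg(
        waste_reduction_potential_kg(predicted_waste_kg, optimal_waste_kg)
    )


# Example: 40 kg predicted against a 30 kg target leaves 10 kg of avoidable
# waste, which corresponds to 25 kg of CO2 at the factor above.
print(avoidable_co2_kg(40.0, 30.0))
```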
python advanced_forecaster.py

from flask import Flask, jsonify, request

app = Flask(__name__)

# prediction_system is assumed to be initialized as in the usage example above
@app.route('/predict', methods=['POST'])
def predict():
    date = request.json.get('date')
    prediction = prediction_system.create_prediction_api_response(date)
    return jsonify(prediction)

if __name__ == '__main__':
    app.run(debug=True)  # debug mode is for local development only

Deploy to AWS, Google Cloud, or Azure using containerization:
FROM python:3.9-slim
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
CMD ["python", "app.py"]

- Training: At least 52 weeks (1 year) of historical data
- Validation: 20-30% of training data
- Features: 10-15 external features recommended
- Completeness: <5% missing values
- Consistency: Regular weekly intervals
- Accuracy: Validated business data
- Freshness: Real-time external signals
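Several of the requirements above can be checked mechanically before training. The sketch below tests history length, completeness, and weekly regularity on a simple list-based series; the function names and the representation of missing values as `None` are assumptions for illustration, and the thresholds mirror the figures listed above.

```python
from datetime import date, timedelta


def missing_fraction(values):
    """Share of entries that are None."""
    return sum(v is None for v in values) / len(values)


def has_regular_weekly_intervals(dates):
    """True if consecutive dates are exactly 7 days apart."""
    return all(b - a == timedelta(days=7) for a, b in zip(dates, dates[1:]))


def check_dataset(dates, values, min_weeks=52, max_missing=0.05):
    """Return one boolean per requirement so failures are easy to report."""
    return {
        "enough_history": len(dates) >= min_weeks,
        "complete": missing_fraction(values) < max_missing,
        "regular": has_regular_weekly_intervals(dates),
    }


# Example: 60 consecutive weeks with two missing observations
dates = [date(2022, 1, 3) + timedelta(weeks=i) for i in range(60)]
values = [float(i) for i in range(60)]
values[5] = None
values[6] = None
print(check_dataset(dates, values))
```

Returning a dictionary rather than a single boolean makes it easier to log which requirement failed.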
- Add more temporal features (seasonality, trends)
- Include interaction terms
- Use domain-specific features (menu changes, promotions)
- Hyperparameter tuning with Optuna
- Ensemble methods combining multiple algorithms
- Deep learning models with TensorFlow
- Increase historical data size
- Add more external signals
- Improve data quality and preprocessing
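The suggestion to increase historical data size is easier to appreciate once the effect of lookback windows on sample count is visible. The sketch below builds sliding windows in plain Python; `make_windows` is a hypothetical stand-in and is not the project's `create_sequences`, whose exact behavior this README does not show.

```python
def make_windows(rows, lookback=10, horizon=1):
    """Pair each run of `lookback` rows with the row `horizon` steps after it."""
    X, y = [], []
    last_start = len(rows) - lookback - horizon
    for start in range(last_start + 1):
        X.append(rows[start:start + lookback])
        y.append(rows[start + lookback + horizon - 1])
    return X, y


# With 105 weekly rows, a 10-week lookback, and a 1-week horizon,
# the number of usable samples is 105 - 10 - 1 + 1 = 95.
weeks = list(range(105))
X, y = make_windows(weeks, lookback=10, horizon=1)
print(len(X), len(y))
```

Under these assumptions the 95 samples still have to be divided among training, validation, and test sets, so each additional year of weekly data adds roughly 52 samples.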
# Track prediction accuracy over time
# The helper functions below are placeholders for project-specific code
def monitor_model_performance(threshold):
    recent_predictions = get_recent_predictions()
    actual_values = get_actual_values()
    current_mae = calculate_mae(recent_predictions, actual_values)
    # Retrain when the recent MAE exceeds the acceptable threshold
    if current_mae > threshold:
        trigger_model_retraining()

- Monitor API response times and availability
- Validate data ranges and distributions
- Alert on anomalous values
- Schedule monthly model retraining
- A/B test new models before deployment
- Maintain model versioning
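One simple way to alert on anomalous values is a z-score check against recent history, sketched below. The function name, the threshold of 3 standard deviations, and the sample readings are all assumptions for illustration; the project may use a different rule.

```python
from statistics import mean, stdev


def find_anomalies(history, new_values, z_threshold=3.0):
    """Return the new values lying more than z_threshold sample standard
    deviations away from the mean of the historical values."""
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # A constant history: anything different counts as anomalous
        return [v for v in new_values if v != mu]
    return [v for v in new_values if abs(v - mu) / sigma > z_threshold]


# Made-up temperature readings from an external weather signal
history = [20.0, 22.0, 21.0, 19.0, 23.0, 20.0, 21.0, 22.0]
incoming = [21.5, 35.0, 18.0]
print(find_anomalies(history, incoming))  # only 35.0 is flagged
```

A z-score rule assumes roughly symmetric, stable data; signals with strong seasonality may need the check applied to deviations from a seasonal baseline instead.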
- Novel Architecture: Dual-objective forecasting with real-time signals
- Comprehensive Integration: Multiple external data sources
- Sustainability Focus: Environmental impact quantification
- Production-Ready: Complete end-to-end system
- Journals: Nature Food, Journal of Cleaner Production, Food Policy
- Conferences: ICML, NeurIPS, IEEE Big Data
- Industry: Supply Chain Management Review
- Collect larger dataset (2-3 years)
- Implement Transformer architecture
- Add graph neural networks for supply chain modeling
- Conduct real-world pilot study
- Publish research findings
1. API Rate Limits
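# Implement exponential backoff
import time

class RateLimitError(Exception):
    """Placeholder; use the exception your API client raises when rate-limited."""

def api_call_with_retry(api_func, max_retries=3):
    for attempt in range(max_retries):
        try:
            return api_func()
        except RateLimitError:
            time.sleep(2 ** attempt)  # wait 1 s, 2 s, 4 s, ...
    raise Exception("Max retries exceeded")

2. Missing Dependencies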
# Install all requirements
pip install -r requirements.txt
# For specific issues
pip install --upgrade scikit-learn
pip install --upgrade pandas

3. Memory Issues
# Reduce batch size or sequence length
config['batch_size'] = 16
config['lookback'] = 5

For technical support or research collaboration:
- Email: iampayal018@gmail.com
- GitHub: [Repository URL]