Aurora Wave Prediction

A production-ready web application for oceanic wave prediction using Microsoft's Aurora model with real-time ECMWF data.

Features

Real-time Wave Prediction: Generate wave forecasts using the Aurora model
Interactive Web Interface: User-friendly Streamlit frontend
ECMWF Data Integration: Fetch real-time atmospheric and wave data
Flexible Geographic Selection: Choose custom regions for prediction
Multiple Time Steps: Generate multi-step forecasts
Memory-Optimized Processing: Efficient handling of large datasets
Robust Data Pipeline: 3-day wave data strategy with batched processing

Architecture

Frontend (Streamlit) ↔ Backend (Flask) ↔ Aurora Model
                                ↓
                        ECMWF Data Sources
                        (Surface, Atmospheric, Wave)

Quick Start

Prerequisites

Python 3.8+
CUDA-compatible GPU (recommended)
Git

Installation

Clone the repository

git clone <repository-url>
cd Aurora-Wave-Prediction

Set up Backend

cd backend
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt

Set up Frontend

cd ../frontend
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt

Running the Application

Start the Backend

cd backend
source venv/bin/activate  # On Windows: venv\Scripts\activate
python backend_app.py

Start the Frontend (in a new terminal)

cd frontend
source venv/bin/activate  # On Windows: venv\Scripts\activate
streamlit run streamlit_app.py

Access the Application
- Frontend: http://localhost:8501
- Backend API: http://localhost:5000

API Endpoints

Download Data

POST /api/download-data
Content-Type: application/json

{
  "target_date": "2024-01-15",
  "lat_bounds": [2.2, 4.2],
  "lon_bounds": [72.2, 74.2]
}

Generate Predictions

POST /api/predict_oceanic
Content-Type: application/json

{
  "target_date": "2024-01-15",
  "lat_bounds": [2.2, 4.2],
  "lon_bounds": [72.2, 74.2],
  "steps": 5
}

Data Sources

ECMWF OpenData (Primary)

Surface Variables: 2m temperature, 10m wind, mean sea level pressure
Atmospheric Variables: Temperature, wind (5 pressure levels: 1000, 925, 850, 700, 500 hPa)
Wave Variables: Significant wave height, wave direction, wave period, peak period
Resolution: 0.25° global grid
Update Frequency: 4 times daily (00z, 06z, 12z, 18z)
Data Strategy: 3-day coverage with optimal forecast time selection

Wave Data Processing

Batched Processing: Handles large GRIB files (12MB each) in memory-efficient batches
Forecast Step Averaging: Combines multiple forecast steps for robust predictions
Automatic Fallback: Synthetic data generation when ECMWF is unavailable
File Caching: Reuses downloaded data to prevent duplicate downloads

Model Details

Aurora Wave Model

Type: Transformer-based weather prediction model
Input: Multi-variable atmospheric and oceanic data
Output: Wave forecasts (height, direction, period)
Resolution: Configurable (default: 32x32, enhanced to 96x96)
Forecast Steps: 1-10 steps (6-hour intervals)
Variables: 22 surface variables, 3 atmospheric variables, 5 static variables

Technical Improvements

Memory Management: Batched processing prevents allocation errors
Tensor Shape Handling: Proper atmospheric level preservation (5 levels)
Data Type Detection: Automatic distinction between wave and atmospheric data
Spatial Resolution: Automatic resizing and interpolation

Configuration

Key configuration options in backend/config.py:

ECMWF Configuration: Variables, pressure levels, resolution
Wave Data Configuration: Variables, mapping, processing options
Aurora Variables: Required and optional variables for model
Processing Configuration: Batch size, time steps, spatial resolution

Development

Project Structure

Aurora-Wave-Prediction/
├── backend/
│   ├── aurora_data_sources/     # Data source implementations
│   │   ├── base.py             # Base data source class
│   │   ├── ecmwf_source.py     # ECMWF atmospheric/surface data
│   │   ├── ecmwf_wave_source.py # ECMWF wave data with batching
│   │   ├── huggingface_source.py # Static variables from HuggingFace
│   │   └── __init__.py         # Package initialization
│   ├── aurora_interface.py      # Aurora model interface
│   ├── backend_app.py          # Flask API server
│   ├── config.py               # Configuration settings
│   └── requirements.txt        # Python dependencies
├── frontend/
│   ├── streamlit_app.py        # Streamlit web interface
│   └── requirements.txt        # Python dependencies
├── downloads/                   # Data cache directory
└── README.md

Key Dependencies

ecmwf-opendata: ECMWF data access
cfgrib: GRIB file processing
xarray: Multi-dimensional data handling
torch: PyTorch for Aurora model
aurora: Microsoft Aurora model package

Performance Optimizations

Memory Management

Batched Processing: Process GRIB files in groups of 2 to prevent memory overflow
Immediate Reduction: Reduce dataset memory usage during loading
Data Type Optimization: Use float32 instead of float64
Automatic Cleanup: Close datasets after processing

Data Processing

Smart Caching: Reuse downloaded files within 6-hour window
Forecast Step Averaging: Combine multiple forecast steps for wave data
Pressure Level Preservation: Maintain 5 atmospheric levels for model compatibility
Spatial Interpolation: Efficient resizing using scipy interpolation

Troubleshooting

Common Issues

Memory Allocation Errors
- Fixed: Batched processing handles large datasets
- Use smaller geographic regions for very large areas
- Monitor system memory usage
Tensor Shape Mismatches
- Fixed: Proper data type detection and processing
- Atmospheric data preserves 5 pressure levels
- Wave data averages forecast steps correctly
Data Download Failures
- Check internet connectivity
- Verify ECMWF service status
- System falls back to synthetic data automatically
Model Loading Errors
- Ensure sufficient disk space for model checkpoints
- Check HuggingFace connectivity
- Verify CUDA availability for GPU acceleration

Performance Tips

Use smaller geographic regions for faster processing
Enable GPU acceleration when available
Monitor download cache to prevent disk space issues
Use recent data (within 3 days) for best ECMWF availability

Production Deployment

The system is production-ready with:

✅ Memory-efficient data processing
✅ Robust error handling and fallbacks
✅ Efficient data caching and reuse
✅ Real ECMWF data integration
✅ Scalable architecture

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

Microsoft Research for the Aurora model
ECMWF for weather and wave data access
Streamlit and Flask communities
Contributors to ecmwf-opendata and cfgrib packages
GPU acceleration recommended for faster predictions

Troubleshooting

Backend won't start: Check Python version (3.8+) and install requirements Frontend connection error: Ensure backend is running on port 5000 Slow predictions: Consider using GPU or reducing forecast steps Memory issues: Close other applications, use smaller prediction areas

License

Uses Microsoft's Aurora model under their license terms.

Name		Name	Last commit message	Last commit date
Latest commit History 44 Commits
assets		assets
backend		backend
frontend		frontend
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md

License

AbhishekSai1551/Aurora-Wave-Prediction

Folders and files

Latest commit

History

Repository files navigation

Aurora Wave Prediction

Features

Architecture

Quick Start

Prerequisites

Installation

Running the Application

API Endpoints

Download Data

Generate Predictions

Data Sources

ECMWF OpenData (Primary)

Wave Data Processing

Model Details

Aurora Wave Model

Technical Improvements

Configuration

Development

Project Structure

Key Dependencies

Performance Optimizations

Memory Management

Data Processing

Troubleshooting

Common Issues

Performance Tips

Production Deployment

License

Acknowledgments

Troubleshooting

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages