This example is an implementation of the DLWP Cubed-Sphere model. The DLWP model predicts a future state of the atmosphere from a previous one. A 320-member ensemble of six-week forecasts at 1.4° resolution can be inferred within a couple of minutes, demonstrating the potential of AI for developing near-real-time digital twins for weather prediction.
The goal is to train an AI model that emulates the state of the atmosphere and predicts global weather over a certain time span. The Deep Learning Weather Prediction (DLWP) model uses deep CNNs for globally gridded weather prediction. DLWP CNNs directly map u(t) to its future state u(t + Δt) by learning from historical observations of the weather, with Δt set to 6 hr.
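Because each forward pass advances the state by only Δt = 6 hr, longer forecasts are produced autoregressively: each prediction is fed back in as the next input. A minimal sketch of that rollout loop, with a placeholder step function standing in for the trained CNN (the names here are illustrative, not PhysicsNeMo APIs):

```python
import numpy as np

def rollout(step_fn, u0, n_steps):
    """Autoregressively apply a one-step (6 hr) model to build a trajectory."""
    states = [u0]
    u = u0
    for _ in range(n_steps):
        u = step_fn(u)          # u(t) -> u(t + 6 hr)
        states.append(u)
    return np.stack(states)     # (n_steps + 1, channels, faces, height, width)

# Placeholder "model": a simple damping map standing in for the trained CNN.
def dummy_step(u):
    return u * 0.99

# One atmospheric state on the cubed sphere: 7 channels, 6 faces, 64x64 cells.
u0 = np.ones((7, 6, 64, 64), dtype=np.float32)

# A six-week forecast at 6 hr steps is 6 * 7 * 4 = 168 model applications.
trajectory = rollout(dummy_step, u0, n_steps=168)
print(trajectory.shape)  # (169, 7, 6, 64, 64)
```

Ensemble forecasting repeats this rollout from many perturbed initial states, which is cheap because each step is a single network evaluation.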
DLWP uses convolutional neural networks (CNNs) on a cubed-sphere grid to produce global forecasts. The latest DLWP model leverages a U-Net architecture with skip connections to capture multi-scale processes. The model architecture is described in the following paper:

- Sub-Seasonal Forecasting With a Large Ensemble of Deep-Learning Weather Prediction Models
To get started:

- Install PhysicsNeMo with required extras:

  ```bash
  pip install .[launch]
  ```
- Install additional dependencies:

  ```bash
  pip install -r requirements.txt
  ```
- Install TempestRemap (required for coordinate transformation):

  ```bash
  git clone https://github.com/ClimateGlobalChange/tempestremap
  cd tempestremap
  mkdir build && cd build
  cmake ..
  make
  make install
  ```
There are two methods to prepare the training data for DLWP. The first, downloading ERA5 through the dataset mirror, is the recommended approach for full model training: it provides more control over variable selection and time periods.
- First, ensure you have set up your CDS API key as described in the `dataset_download` README.
- Use the provided DLWP configuration:

  ```bash
  python dataset_download/start_mirror.py --config-name="config_dlwp.yaml"
  ```

  The configuration includes:
- 7 ERA5 variables mapped to cubed-sphere grid
- Resolution: 64x64 grid cells per face
- Years: 1980-2015 (training), 2016-2017 (validation), 2018 (testing)
- Temporal resolution: 6-hourly
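The settings above correspond roughly to a configuration of the following shape. This is a hypothetical sketch only; the authoritative key names and values are in `dataset_download/config_dlwp.yaml`:

```yaml
# Illustrative sketch, not the actual config_dlwp.yaml
variables: 7          # ERA5 variables mapped to the cubed sphere
resolution: 64        # grid cells per cube face (64x64)
dt_hours: 6           # temporal resolution
years:
  train: [1980, 2015]
  validation: [2016, 2017]
  test: [2018]
```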
- Transform the downloaded data to cubed-sphere format:

  ```bash
  cd data_curation
  python post_processing.py --input-dir /path/to/downloaded/data --output-dir /path/to/output
  ```
For testing or development, you can use the simplified data preparation scripts in the `data_curation` directory:
- Download a minimal set of ERA5 variables:

  ```bash
  cd data_curation
  python data_download_simple.py
  ```

- Process the downloaded data:

  ```bash
  python post_processing.py
  ```
See the `data_curation/README.md` for detailed instructions and parameters.
The final dataset should be organized as follows:
```
data_dir/
├── train/
│   ├── 1980.h5
│   ├── 1981.h5
│   └── ...
├── test/
│   ├── 2017.h5
│   └── ...
├── out_of_sample/
│   └── 2018.h5
└── stats/
    ├── global_means.npy
    └── global_stds.npy
```

Each HDF5 file contains:
- Shape: (time_steps, channels, faces, height, width)
- Faces: 6 (cubed-sphere)
- Height/Width: 64 (resolution parameter)
- Channels: 7 (atmospheric variables)
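The `stats/` files are used to normalize inputs per channel before training. A minimal sketch of that normalization, with synthetic data standing in for an HDF5 file (the exact array shape stored in the `.npy` statistics files is an assumption here; real data would be loaded with `h5py`):

```python
import numpy as np

# Synthetic stand-in for one year of data: (time_steps, channels, faces, h, w).
data = np.random.default_rng(0).normal(5.0, 2.0, size=(8, 7, 6, 64, 64))

# global_means.npy / global_stds.npy hold per-channel statistics; a
# (1, channels, 1, 1, 1) shape broadcasts across time, faces, and space.
means = data.mean(axis=(0, 2, 3, 4), keepdims=True)
stds = data.std(axis=(0, 2, 3, 4), keepdims=True)

normalized = (data - means) / stds
print(normalized.shape)            # (8, 7, 6, 64, 64)
print(float(normalized.mean()))    # close to 0 after per-channel normalization
```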
To train the model, run:

```bash
python train_dlwp.py
```

For distributed training:

```bash
mpirun -np <NUM_GPUS> python train_dlwp.py
```

Note: add `--allow-run-as-root` if running in a container as root.
Progress can be monitored using MLflow:

```bash
mlflow ui -p 2458
```

References:

- Sub-Seasonal Forecasting With a Large Ensemble of Deep-Learning Weather Prediction Models
- Arbitrary-Order Conservative and Consistent Remapping and a Theory of Linear Maps: Part 1
- Arbitrary-Order Conservative and Consistent Remapping and a Theory of Linear Maps, Part 2