This repository contains the ESFM training and evaluation code used for atmospheric and Earth-system forecasting experiments.
The codebase is based on the Aurora model stack and extends it with ESFM-specific model components, data pipelines, and experiment scripts (both training and evaluation).
- ESFM: Earth System Foundation Model
This repo provides:
- ESFM model implementations in esfm/model/
- Dataset wrappers and mixed-source loading in utils/dataset.py
- Training entrypoints for standard ESFM and encoder KD pretraining
- Inference entrypoints for single-step and multi-step rollout forecasting
- Metric computation for station datasets (ECMWF 11k & Weather-5K) using evaluate_station_metrics.py; gridded dataset evaluation (ERA5, CMIP6, MODIS) using the SwissClim_Evaluations (v0.2.0) toolbox.
Important environment assumption:
- Many scripts are configured for CSCS paths (for example /capstor/store/cscs/...) and Slurm execution.
- If you run outside CSCS, you must adapt data paths and launcher scripts.
- esfm/: core package (model, batch representation, rollout helpers, normalization)
- configs/: experiment configuration files (model size, optimization, logging, etc.)
- dataset_config.yaml: dataset definitions
- masking_config.yaml: optional masking strategies for training
- loss_config.yaml: per-variable loss weights
- train.py: main ESFM training entrypoint
- train_encoder_KD.py: encoder knowledge-distillation training entrypoint
- inference.py: single-step inference/evaluation entrypoint
- inference_rollout.py: multi-step rollout inference entrypoint
- evaluate_station_metrics.py: metrics computation for station experiments from prediction zarr files
- scripts/training/: Slurm training scripts to reproduce experiments from the manuscript
- scripts/inference/: Slurm inference scripts to reproduce experiments from the manuscript
All experiments have been tested on the Container Engine of CSCS Alps.
Most of our experiments were run on an image based on modulus:24.04 and later verified with physicsnemo:25.03.
Model weights for the different experiments from the manuscript are available on Hugging Face (ESFM).
The ESFM preprint is available [here].
CLI arguments are defined in config.py.
You usually run with:
- --config: experiment yaml in configs/
- --dataset_config_path: dataset registry yaml (default: dataset_config.yaml)
- --loss_config_path: variable-weight config (default: loss_config.yaml)
- --mask_config_path and --mask_config_type: optional masking setup
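As a rough illustration of how the flags above fit together, the following is a minimal argparse sketch. It is not the contents of config.py: the function name build_parser, the help strings, and the None defaults for the masking flags are assumptions made for this example. Only the flag names and the two default file names come from the list above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented flags (not the real config.py)."""
    parser = argparse.ArgumentParser(description="ESFM-style CLI sketch")
    parser.add_argument("--config", required=True, help="experiment yaml in configs/")
    parser.add_argument("--dataset_config_path", default="dataset_config.yaml",
                        help="dataset registry yaml")
    parser.add_argument("--loss_config_path", default="loss_config.yaml",
                        help="per-variable loss weights")
    # Masking is optional, so both flags default to None here (assumption).
    parser.add_argument("--mask_config_path", default=None)
    parser.add_argument("--mask_config_type", default=None)
    return parser


# Parse an explicit argument list so the sketch does not depend on sys.argv.
args = build_parser().parse_args(["--config", "configs/config_ESFM_s_nm.yaml"])
print(args.config, args.dataset_config_path, args.loss_config_path)
```

Passing only --config leaves the dataset and loss configs at their documented defaults. Check config.py for the authoritative list of arguments.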
Example config file:
configs/config_ESFM_s_enc_KD_nm.yaml
This sets options such as:
- architecture (str_architecture_size)
- devices and workers
- optimization and gradient settings
- logging (wnb_*)
- data sources (data_sources)
Dataset path root override:
- Entry points (train.py, train_encoder_KD.py, inference.py, inference_rollout.py, merge_esfm_encoder.py) read ESFM_DATA_PATH_PREFIX if set.
- If unset, they fall back to /capstor/store/cscs/.
- Keep in mind that you will nevertheless need to adjust the relative paths under dataset_config.yaml.
Example:
export ESFM_DATA_PATH_PREFIX=/your/storage/root
Prediction copy destination override:
- utils/validation_utils.py reads ESFM_RESULTS_COPY_BASE when copying merged inference predictions.
- If unset, it falls back to /capstor/store/cscs/swissai/a122/ESFM_Results.
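Both overrides follow the same read-or-fall-back pattern. The sketch below shows one way to express it. The helper name env_or_default is hypothetical, and treating an empty value as unset is an assumption. The repository code may resolve these variables differently.

```python
import os

# Fallback values quoted from the text above.
DEFAULT_DATA_PREFIX = "/capstor/store/cscs/"
DEFAULT_RESULTS_COPY_BASE = "/capstor/store/cscs/swissai/a122/ESFM_Results"


def env_or_default(name: str, default: str, environ=None) -> str:
    """Return the value of `name` if it is set and non-empty, else `default`."""
    environ = os.environ if environ is None else environ
    value = environ.get(name, "")
    # Assumption: an empty string is treated the same as an unset variable.
    return value if value else default


data_root = env_or_default("ESFM_DATA_PATH_PREFIX", DEFAULT_DATA_PREFIX)
results_base = env_or_default("ESFM_RESULTS_COPY_BASE", DEFAULT_RESULTS_COPY_BASE)
print(data_root, results_base)
```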
Example:
export ESFM_RESULTS_COPY_BASE=/your/results/storage/ESFM_Results
To reproduce the ESFM variants proposed in our work, see the Slurm scripts under scripts/training/.
Further details are provided in the README.
For example, after adapting paths to your compute environment, you can launch a training run with:
sbatch scripts/training/train_ESFM_s_wm_ri_pre.sh
For a quick interactive ERA5 visualisation, see notebooks/inference_ESFMs_on_ERA5.ipynb. For efficient inference on local datasets, use the Python and Slurm examples below.
python inference.py --config ./configs/config_ESFM_s_nm.yaml
Slurm example:
sbatch scripts/inference/inference_ESFM_s_nm.sh
Rollout uses inference_rollout.py and supports custom args such as --NUM_ROLLOUT_STEPS.
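Multi-step rollout is commonly implemented as an autoregressive loop in which each single-step prediction becomes the input of the next step. The sketch below illustrates that generic idea only. Whether inference_rollout.py is structured this way is an assumption, and rollout, step_fn, and the scalar toy "model" are hypothetical names used for illustration.

```python
from typing import Callable, List, TypeVar

State = TypeVar("State")


def rollout(step_fn: Callable[[State], State], initial: State, num_steps: int) -> List[State]:
    """Apply `step_fn` autoregressively, feeding each prediction back as the next input."""
    if num_steps < 0:
        raise ValueError("num_steps must be non-negative")
    states = []
    current = initial
    for _ in range(num_steps):
        current = step_fn(current)  # one single-step forecast
        states.append(current)
    return states


# Toy stand-in for a model: advances a scalar "lead time in hours" by 6.
trajectory = rollout(lambda hours: hours + 6, 0, 4)
print(trajectory)  # [6, 12, 18, 24]
```

In this picture, a flag such as --NUM_ROLLOUT_STEPS would play the role of num_steps.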
Example:
python inference_rollout.py \
--config ./configs/config_ESFM_s_nm_e11k_lt6h.yaml \
--NUM_ROLLOUT_STEPS 56
Slurm example:
sbatch scripts/inference/inference_ESFM_s_nm_e11k_eval1_lt6h.sh
CMIP6, MODIS, Weather-5K, and ECMWF 11k datasets have been preprocessed using scripts released under the SwissClim Data Processing Scripts repository.
After generating predictions (zarr output), compute detailed station metrics with:
python evaluate_station_metrics.py \
--p_preds /path/to/predictions.zarr \
--station_type 11k \
--p_station /path/to/station_dataset_root \
--output_dir ./eval_out
For manuscript-aligned experiment commands and checkpoint notes, see:
- scripts/README.md
- scripts/training/
- scripts/inference/
These scripts are tuned for a specific HPC environment and should be adapted before running elsewhere.
Two helper scripts are provided to safely bulk-update common path roots.
- scripts/update_workdir_roots.sh: updates default workdir="..." lines in Slurm scripts under scripts/training/ and scripts/inference/.
- scripts/update_logdir_roots.sh: updates log_dir: base paths in configs/*.yaml while preserving each run-specific suffix.
Both scripts support:
- --dry-run to preview matches without modifying files.
- --apply to perform replacements.
- automatic timestamped backups under .path_update_backups/ before any write.
Recommended workflow:
- Run with --dry-run.
- Run with --apply.
- Review with git diff.
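To make the dry-run/apply/backup behaviour concrete, here is a small Python sketch of the same idea. It is not a port of the shell helpers: replace_root and its parameters are hypothetical, and the flat backup layout (keyed by file name only) is a simplification assumed for this example.

```python
import shutil
import time
from pathlib import Path


def replace_root(files, old, new, apply=False, backup_dir=".path_update_backups"):
    """Report files containing `old`; rewrite them (after a backup) only if `apply` is True."""
    matched = []
    stamp = time.strftime("%Y%m%d_%H%M%S")
    for path in map(Path, files):
        text = path.read_text()
        if old not in text:
            continue
        matched.append(path)
        if not apply:
            print(f"[dry-run] would update {path}")
            continue
        # Copy the untouched file into a timestamped backup folder before writing.
        # Simplification: backups are keyed by file name only.
        target = Path(backup_dir) / stamp / path.name
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(path, target)
        path.write_text(text.replace(old, new))
    return matched
```

Calling the function first with apply=False and then with apply=True mirrors the recommended workflow: nothing is written, and no backup is created, until the preview has been reviewed.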
# Preview
scripts/update_workdir_roots.sh --dry-run \
--old-workdir '/users/$USER/projects/ESFM' \
--new-workdir '/path/to/repo/ESFM'
# Apply
scripts/update_workdir_roots.sh --apply \
--old-workdir '/users/$USER/projects/ESFM' \
--new-workdir '/path/to/repo/ESFM'
# Preview
scripts/update_logdir_roots.sh --dry-run \
--old-log-base '/iopsstor/scratch/cscs/fozdemir/ESFM_outputs' \
--new-log-base '/new/path/to/checkpoints/ESFM_outputs'
# Apply
scripts/update_logdir_roots.sh --apply \
--old-log-base '/iopsstor/scratch/cscs/fozdemir/ESFM_outputs' \
--new-log-base '/new/path/to/checkpoints/ESFM_outputs'
After applying updates, inspect the result:
git diff -- scripts/training scripts/inference configs
- If W&B logging is enabled, set WANDB_KEY (or disable with --wnb_mode disabled).
- Verify dataset paths in dataset_config.yaml and hard-coded path prefixes in entrypoint scripts.
- If running outside Slurm/CSCS, remove or adapt Slurm-specific environment and NCCL settings.
- Ensure CUDA/NCCL/PyTorch versions are compatible across all nodes for distributed runs.
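Some of these checks can be automated before submitting a job. The following is a minimal preflight sketch covering only the W&B key and the data root. The function preflight is hypothetical and not part of the repository, and it does not check CUDA, NCCL, or PyTorch versions.

```python
import os
from pathlib import Path


def preflight(wnb_mode: str, data_root: str, environ=None) -> list:
    """Return a list of human-readable problems; an empty list means the checks passed."""
    environ = os.environ if environ is None else environ
    problems = []
    # W&B needs a key unless logging is disabled.
    if wnb_mode != "disabled" and not environ.get("WANDB_KEY"):
        problems.append("WANDB_KEY is not set (or pass --wnb_mode disabled)")
    # The data root must exist before any dataset path can resolve.
    if not Path(data_root).is_dir():
        problems.append(f"data root does not exist: {data_root}")
    return problems


root = os.environ.get("ESFM_DATA_PATH_PREFIX", "/capstor/store/cscs/")
for problem in preflight("disabled", root):
    print("preflight:", problem)
```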
You can cite ESFM as follows:
@misc{ozdemir2026esfm,
title={Earth System Foundation Model (ESFM): A unified framework for heterogeneous data integration and forecasting},
author={Firat Ozdemir and Yun Cheng and Salman Mohebi and Fanny Lehmann and Simon Adamov and Zhenyi Zhang and Leonardo Trentini and Dana Grund and Oliver Fuhrer and Torsten Hoefler and Siddhartha Mishra and Sebastian Schemm and Benedikt Soja and Mathieu Salzmann},
year={2026},
eprint={2605.00850},
archivePrefix={arXiv},
primaryClass={physics.ao-ph},
url={https://arxiv.org/abs/2605.00850},
}
MIT License. See LICENSE.txt.
This work was supported under project IDs a01 and a122 as part of the Swiss AI Initiative, through a grant from the ETH Domain and computational resources provided by the Swiss National Supercomputing Centre (CSCS) under the Alps infrastructure.
Contributions are welcome.


