IceNet-MP is a multimodal pipeline for predicting sea ice.
You will need to install the following tools if you want to develop this project:
On an HPC system, this will install to ~/.local/bin, so make sure that your home directory has enough free space.
There is no aarch64 wheel for cf-units, so before installing on Isambard-AI you will need to set the following environment variables:
export UDUNITS2_XML_PATH=/projects/u5gf/seaice/udunits/share/udunits/udunits2.xml
export UDUNITS2_INCDIR=/projects/u5gf/seaice/udunits/include/
export UDUNITS2_LIBDIR=/projects/u5gf/seaice/udunits/lib/

You can then install the project as follows (for DAWN / Baskerville, you can ignore the previous step):
git clone git@github.com:alan-turing-institute/icenet-mp.git
cd icenet-mp
uv sync --managed-python

Create a file in the folder icenet_mp/config that is called <your chosen name here>.local.yaml.
You will typically want this to inherit from base.yaml, and then you can apply your own changes on top.
For example, the following config will override the base_path option in base.yaml:
defaults:
- base
- _self_
base_path: /local/path/to/my/data

You can then run this with, e.g.:
uv run imp <command> --config-name <your local config>.yaml

You can also use this config to override other options in the base.yaml file, as shown below:
defaults:
- base
- override /model: cnn_unet_cnn # Use this format if you want to use a different config
- _self_
# Override specific model parameters
model:
  processor:
    start_out_channels: 37 # Use this format to override specific model parameters in the named configs
base_path: /local/path/to/my/data

Alternatively, you can apply overrides to specific options at the command line like this:
uv run imp <command> ++base_path=/local/path/to/my/data

See config/demo_north.yaml for an example of this.
Note that base_persistence.yaml overrides the specific options in base.yaml needed to run the Persistence model.
For running on shared HPC systems (Baskerville, DAWN or Isambard-AI), you will want to use the pre-downloaded data and the right GPU accelerator. This is handled for you by including the appropriate config file:
defaults:
- base_baskerville OR base_dawn OR base_isambardai
- override /data: full # if you want to run over the full dataset instead of the sample dataset
- _self_

ℹ️ Note that if you are running the below commands locally, specify the base path in your local config, then add the argument --config-name <your local config>.yaml.
You will need a CDS account to download data with anemoi (e.g. the ERA5 data).
Run uv run imp datasets create to download datasets.
N.b. For very large datasets, use load_in_parts instead (see Downloading large datasets below).
Run uv run imp datasets inspect to inspect datasets (i.e. to get dataset properties and statistical summaries of the variables).
You will need a Weights & Biases account to run training. Generate an API key, then run the following to allow automatic authentication:
export WANDB_API_KEY=<your_api_key>
wandb login
Run uv run imp train to train using the datasets specified in the config.
ℹ️ This will save checkpoints to ${BASE_DIR}/training/wandb/run-${DATE}-${RANDOM_STRING}/checkpoints/${CHECKPOINT_NAME}.ckpt, where BASE_DIR is the base path to the data defined in your config file.
If you are running locally on an Apple Silicon Mac, some PyTorch operations are not yet supported on the MPS backend, so you may need to prefix the uv run command with PYTORCH_ENABLE_MPS_FALLBACK=1. For example:
PYTORCH_ENABLE_MPS_FALLBACK=1 uv run imp train

Run uv run imp evaluate --checkpoint PATH_TO_A_CHECKPOINT to evaluate using a checkpoint from a training run.
You can plot static images or animations of the raw data by adding the following option to your local config:
evaluate:
  callbacks:
    plotting:
      make_input_plots: true
Settings (output directories, styling, animation parameters) are read from config.evaluate.callbacks.raw_inputs in your YAML config files. Command-line options can override config values if needed.
An IceNet-MP model needs to be able to run over multiple different datasets with different dimensions.
These are structured in NTCHW format, where:
- N is the batch size
- T is the number of history (forecast) steps for inputs (outputs)
- C is the number of channels or variables
- H is a height dimension
- W is a width dimension
N and T will be the same for all inputs, but C, H and W might vary.
Taking as an example a batch size of 2 (N=2), 3 history steps and 4 forecast steps, we will have k inputs of shape (2, 3, C_k, H_k, W_k) and one output of shape (2, 4, C_out, H_out, W_out).
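For illustration, such a batch might be represented as a dict of tensors like the one below; the dataset names and spatial sizes are hypothetical.

```python
import torch

# Hypothetical batch with k = 2 inputs: N = 2, T_history = 3, T_forecast = 4.
batch = {
    "era5": torch.randn(2, 3, 5, 100, 120),    # (N, T_history, C_1, H_1, W_1)
    "osisaf": torch.randn(2, 3, 1, 432, 432),  # (N, T_history, C_2, H_2, W_2)
}
target = torch.randn(2, 4, 1, 432, 432)        # (N, T_forecast, C_out, H_out, W_out)
```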
A standalone model will need to accept a dict[str, TensorNTCHW] which maps dataset names to an NTCHW Tensor of values.
The model might want to use one or more of these for training, and will need to produce an output with shape N, T, C_out, H_out, W_out.
As can be seen in the example below, a separate instance of the model is likely to be needed for each output to be predicted.
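A minimal sketch of this interface is shown below, assuming a single hypothetical input dataset ("era5") and an output on the same grid as that input; the layer choices are illustrative only, not the actual IceNet-MP implementation.

```python
import torch
from torch import nn


class StandaloneSketch(nn.Module):
    """Shape-level sketch of the standalone-model interface (illustrative only)."""

    def __init__(self, in_channels: int, out_channels: int, n_history_steps: int, n_forecast_steps: int):
        super().__init__()
        self.out_channels = out_channels
        self.n_forecast_steps = n_forecast_steps
        # Fold history steps into channels and map them to forecast steps x output channels.
        self.head = nn.Conv2d(n_history_steps * in_channels, n_forecast_steps * out_channels, kernel_size=1)

    def forward(self, inputs: dict[str, torch.Tensor]) -> torch.Tensor:
        x = inputs["era5"]  # hypothetical dataset name: select the input(s) this instance trains on
        n, t, c, h, w = x.shape                   # (N, T_history, C_k, H_k, W_k)
        y = self.head(x.reshape(n, t * c, h, w))  # (N, T_forecast * C_out, H_k, W_k)
        return y.reshape(n, self.n_forecast_steps, self.out_channels, h, w)


# One instance per predicted output, e.g. a single-channel forecast from the "era5" input:
model = StandaloneSketch(in_channels=5, out_channels=1, n_history_steps=3, n_forecast_steps=4)
prediction = model({"era5": torch.randn(2, 3, 5, 100, 120)})
print(prediction.shape)  # torch.Size([2, 4, 1, 100, 120])
```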
Pros:
- all input variables are available without transformation
Cons:
- hard to add new inputs
- hard to add new outputs
A processor model is part of a larger encode-process-decode model.
Start by defining a latent space as (C_latent, H_latent, W_latent) - in the example below, this has been set to (10, 64, 64).
The encode-process-decode model automatically creates one encoder for each input and one decoder for each output.
The dataset-specific encoder takes the input data and converts it to shape (N, T, C_latent, H_latent, W_latent).
The k encoded datasets can then be combined in latent space to give a single dataset of shape (N, T, k * C_latent, H_latent, W_latent).
This is then passed to the processor, which must accept input of shape (N, T, k * C_latent, H_latent, W_latent) and produce output of the same shape.
This output is then passed to one or more output-specific decoders which take input of shape (N, T, k * C_latent, H_latent, W_latent) and produce output of shape (N, T, C_out, H_out, W_out).
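The shape flow above can be sketched with toy modules as below; the latent size, dataset names, layer choices, and the decision to fold history steps into channels inside the decoder are illustrative assumptions, not the actual IceNet-MP encoders, processor, or decoders.

```python
import torch
from torch import nn

# Illustrative latent space (C_latent, H_latent, W_latent).
C_LATENT, H_LATENT, W_LATENT = 10, 64, 64


class ToyEncoder(nn.Module):
    """Maps (N, T, C_k, H_k, W_k) -> (N, T, C_latent, H_latent, W_latent)."""

    def __init__(self, in_channels: int):
        super().__init__()
        self.project = nn.Conv2d(in_channels, C_LATENT, kernel_size=1)
        self.resample = nn.AdaptiveAvgPool2d((H_LATENT, W_LATENT))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n, t, c, h, w = x.shape
        y = self.resample(self.project(x.reshape(n * t, c, h, w)))
        return y.reshape(n, t, C_LATENT, H_LATENT, W_LATENT)


class ToyDecoder(nn.Module):
    """Maps (N, T, k * C_latent, H_latent, W_latent) -> (N, T_forecast, C_out, H_out, W_out)."""

    def __init__(self, k: int, t_history: int, t_forecast: int, out_channels: int, out_size: tuple):
        super().__init__()
        self.t_forecast, self.out_channels = t_forecast, out_channels
        self.project = nn.Conv2d(t_history * k * C_LATENT, t_forecast * out_channels, kernel_size=1)
        self.resample = nn.Upsample(size=out_size)

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        n, t, c, h, w = z.shape
        y = self.resample(self.project(z.reshape(n, t * c, h, w)))
        return y.reshape(n, self.t_forecast, self.out_channels, *y.shape[-2:])


# Hypothetical batch with k = 2 inputs of different shapes.
inputs = {
    "era5": torch.randn(2, 3, 5, 100, 120),
    "osisaf": torch.randn(2, 3, 1, 432, 432),
}

# One encoder per input, then concatenate along the channel axis in latent space.
encoders = {name: ToyEncoder(x.shape[2]) for name, x in inputs.items()}
latent = torch.cat([encoders[name](x) for name, x in inputs.items()], dim=2)
assert latent.shape == (2, 3, 2 * C_LATENT, H_LATENT, W_LATENT)

# The processor must map this shape to the same shape; an identity stand-in is used here.
processed = latent

# One decoder per output produces the forecast on its own grid.
decoder = ToyDecoder(k=2, t_history=3, t_forecast=4, out_channels=1, out_size=(432, 432))
print(decoder(processed).shape)  # torch.Size([2, 4, 1, 432, 432])
```

In the real pipeline the identity stand-in is replaced by a learned processor, and the toy modules by the automatically created dataset-specific encoders and output-specific decoders.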
Pros:
- easy to add new inputs
- easy to add new outputs
Cons:
- input variables have been transformed into latent space
There are various demonstrator Jupyter notebooks in the notebooks folder.
You can run these with uv run --group notebooks jupyter notebook.
A good one to start with is notebooks/demo_pipeline.ipynb which gives a more detailed overview of the pipeline.
For particularly large datasets, e.g. the full ERA5 dataset, it may be necessary to download the data in parts.
The load_in_parts command automates the process of downloading datasets in parts, tracking progress, and allowing you to resume interrupted downloads:
uv run imp datasets load_in_parts --config-name <your config>.yaml

This command will:
- Automatically initialise the dataset if it doesn't exist
- Load all parts sequentially, tracking progress in a part tracker file
- Skip already completed parts if the process is interrupted and restarted
- Handle errors gracefully (by default, continues to the next part on error)
You will then need to finalise the dataset when done.
uv run imp datasets finalise --config-name <your config>.yaml

The load_in_parts command accepts the following options:
- --continue-on-error / --no-continue-on-error (default: --continue-on-error): Continue to the next part on error
- --force-reset: Clear the existing progress tracker and start from part 1. Anemoi will check whether you already have the data and continue.
- --dataset <name>: Run only a single dataset by name (useful when you have multiple datasets in your config). Make sure you use the dataset name and not the name of the config.
- --total-parts <n>: Override the computed total number of parts (useful if you want more / fewer parts than the default 10)
- --overwrite: Delete the dataset directory before loading (use with caution!)
Load all parts for all datasets, resuming from where you left off:
uv run imp datasets load_in_parts --config-name <your config>.yaml

Load a specific dataset with a custom number of parts:

uv run imp datasets load_in_parts --config-name <your config>.yaml --dataset my_dataset --total-parts 25

Start fresh, clearing any previous progress (doesn't delete any data):

uv run imp datasets load_in_parts --config-name <your config>.yaml --force-reset

Start over and destroy any previously saved data (careful):

uv run imp datasets load_in_parts --config-name <your config>.yaml --overwrite

If you need more control, you can manually manage the download process:
- First initialise the dataset:
uv run imp datasets init --config-name <your config>.yaml

- Then load each part i of the total n in turn:

uv run imp datasets load --config-name <your config>.yaml --parts i/n

- When all the parts are loaded, finalise the dataset:
uv run imp datasets finalise --config-name <your config>.yaml
