
Updates for 128x128, optimizer config #309

Merged
sgreenbury merged 21 commits into main from test-scripts/run-128x128 on Apr 2, 2026
Conversation

@sgreenbury (Contributor)

This pull request introduces several new experiment configuration files for different autoencoder and ensemble probabilistic diffusion (EPD) models, expands and updates datamodule configurations, and makes improvements to optimizer and dataloader settings. The changes enhance experiment reproducibility, flexibility in model training, and compatibility with various distributed training setups.

Experiment Configuration Additions and Enhancements:

  • Added new experiment YAML files for autoencoder and EPD models, including variants with periodic boundary conditions, different normalization settings, and various loss functions and noise configurations. These provide ready-to-use templates for running experiments under different scenarios.
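As a rough illustration, an experiment file of this kind might look like the sketch below. Every key and value here is an assumption for illustration, not the contents of any file in the PR; it only shows how the variants described above (boundary conditions, normalization, loss, noise) are typically expressed in a Hydra-style experiment config.

```yaml
# Hypothetical Hydra-style experiment config; names and values are illustrative.
# @package _global_
defaults:
  - override /datamodule: shallow_water2d_128
  - override /model: autoencoder

model:
  periodic_boundary: true      # variant with periodic boundary conditions
  loss: ensemble_mae           # one of several loss options
  noise_std: 0.01              # noise configuration for this variant

datamodule:
  normalization: standard      # per-experiment normalization setting
```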

Datamodule and Dataset Configuration Updates:

  • Introduced new datamodule configs for conditioned_navier_stokes_128, gpe_laser_only_wake_128, and shallow_water2d_128, and updated the data path for gpe_laser_only_wake to use a new dataset version. These changes support experiments on new or updated datasets with consistent normalization and loader settings.
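A hedged sketch of what one of these datamodule configs could contain follows; the `_target_` path, data path, and all values are assumptions, not copied from the PR:

```yaml
# Hypothetical datamodule config for one of the new 128x128 datasets.
_target_: autocast.datamodules.SpatioTemporalDataModule  # assumed class path
data_dir: /path/to/shallow_water2d_128                   # placeholder path
batch_size: 8
num_workers: 4
normalization: standard
```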

Dataloader Performance Improvements:

  • Enabled pin_memory=True in all DataLoader instantiations within SpatioTemporalDataModule, which can improve data transfer performance when using GPUs.
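The idea can be sketched in plain Python: build the DataLoader keyword arguments in one place, enabling pinned memory only when a GPU is actually present. The helper name and exact kwargs below are illustrative, not the repo's code; the PR itself simply passes pin_memory=True.

```python
def loader_kwargs(batch_size: int, num_workers: int, cuda_available: bool) -> dict:
    """Build DataLoader kwargs, pinning host memory only when a GPU is present.

    Pinned (page-locked) host memory lets CUDA copy batches to the device
    asynchronously, which is why pin_memory=True helps GPU training but is
    wasted overhead on CPU-only runs. Illustrative helper, not repo code.
    """
    return {
        "batch_size": batch_size,
        "num_workers": num_workers,
        # Only pin when a CUDA device can consume the pinned buffers.
        "pin_memory": cuda_available,
        # Keep worker processes alive between epochs when workers are used.
        "persistent_workers": num_workers > 0,
    }
```

With torch available, this would be spliced in as `DataLoader(dataset, **loader_kwargs(8, 4, torch.cuda.is_available()))`.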

Optimizer and Training Infrastructure Improvements:

  • Added a new optimizer config adamw_half.yaml for compatibility with external projects, and provided detailed comments for scheduler control in adamw.yaml, allowing more flexibility in learning rate scheduling.
  • Updated the single GPU SLURM distributed config to explicitly set gpus_per_node, tasks_per_node, and ntasks for improved SLURM job control.
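A hedged sketch of what such an optimizer config could contain: the actual adamw.yaml/adamw_half.yaml contents are not shown in this PR summary, so every key and value below is an assumption, shown only to illustrate the commented-out scheduler-control pattern.

```yaml
# Hypothetical adamw.yaml-style config with commented scheduler controls.
optimizer:
  _target_: torch.optim.AdamW
  lr: 1.0e-4
  weight_decay: 0.01

# Scheduler control: uncomment to enable cosine decay of the learning rate.
# scheduler:
#   _target_: torch.optim.lr_scheduler.CosineAnnealingLR
#   T_max: 100
```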
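For the single-GPU SLURM config, explicitly pinning the resource counts might look like the fragment below; the exact keys depend on the launcher plugin the project uses, so treat this as an illustration of the three settings named above rather than the file's actual contents.

```yaml
# Hypothetical single-GPU SLURM launcher settings.
nodes: 1
gpus_per_node: 1
tasks_per_node: 1
ntasks: 1
```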

Loss Function Module Update:

  • Updated autocast.losses.__init__.py to include EnsembleMAELoss in the import and __all__ list, making it available for use in experiment configs.

Other Minor Improvements:

  • Updated pyproject.toml to specify test paths and directories to ignore for pytest, improving test discovery and isolation.
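The pytest settings likely resemble the fragment below; the paths are assumptions chosen to illustrate the two knobs the bullet describes (where tests are discovered, and which directories are skipped):

```toml
# Hypothetical pyproject.toml fragment; actual paths in the PR may differ.
[tool.pytest.ini_options]
testpaths = ["tests"]
norecursedirs = ["data", "outputs", ".git"]
```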

sgreenbury merged commit cb09424 into main on Apr 2, 2026
3 checks passed
sgreenbury deleted the test-scripts/run-128x128 branch on April 2, 2026 at 11:12