
Updates for 128x128, optimizer config #309

Merged
sgreenbury merged 21 commits into main from test-scripts/run-128x128 on Apr 2, 2026
Conversation

@sgreenbury (Contributor)

This pull request introduces several new experiment configuration files for different autoencoder and ensemble probabilistic diffusion (EPD) models, expands and updates datamodule configurations, and makes improvements to optimizer and dataloader settings. The changes enhance experiment reproducibility, flexibility in model training, and compatibility with various distributed training setups.

Experiment Configuration Additions and Enhancements:

  • Added new experiment YAML files for autoencoder and EPD models, including variants with periodic boundary conditions, different normalization settings, and various loss functions and noise configurations. These provide ready-to-use templates for running experiments under different scenarios.
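As a rough illustration, an experiment file of this kind might look like the sketch below. Every key and value here is an assumption for illustration, not the contents of any file in the PR; it only shows how the variants described above (boundary conditions, normalization, loss, noise) are typically expressed in a Hydra-style experiment config.

```yaml
# Hypothetical Hydra-style experiment config; names and values are illustrative.
# @package _global_
defaults:
  - override /datamodule: shallow_water2d_128
  - override /model: autoencoder

model:
  periodic_boundary: true      # variant with periodic boundary conditions
  loss: ensemble_mae           # one of several loss options
  noise_std: 0.01              # noise configuration for this variant

datamodule:
  normalization: standard      # per-experiment normalization setting
```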

Datamodule and Dataset Configuration Updates:

  • Introduced new datamodule configs for conditioned_navier_stokes_128, gpe_laser_only_wake_128, and shallow_water2d_128, and updated the data path for gpe_laser_only_wake to use a new dataset version. These changes support experiments on new or updated datasets with consistent normalization and loader settings.
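A hedged sketch of what one of these datamodule configs could contain follows; the `_target_` path, data path, and all values are assumptions, not copied from the PR:

```yaml
# Hypothetical datamodule config for one of the new 128x128 datasets.
_target_: autocast.datamodules.SpatioTemporalDataModule  # assumed class path
data_dir: /path/to/shallow_water2d_128                   # placeholder path
batch_size: 8
num_workers: 4
normalization: standard
```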

Dataloader Performance Improvements:

  • Enabled pin_memory=True in all DataLoader instantiations within SpatioTemporalDataModule, which can improve data transfer performance when using GPUs.
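The idea can be sketched in plain Python: build the DataLoader keyword arguments in one place, enabling pinned memory only when a GPU is actually present. The helper name and exact kwargs below are illustrative, not the repo's code; the PR itself simply passes pin_memory=True.

```python
def loader_kwargs(batch_size: int, num_workers: int, cuda_available: bool) -> dict:
    """Build DataLoader kwargs, pinning host memory only when a GPU is present.

    Pinned (page-locked) host memory lets CUDA copy batches to the device
    asynchronously, which is why pin_memory=True helps GPU training but is
    wasted overhead on CPU-only runs. Illustrative helper, not repo code.
    """
    return {
        "batch_size": batch_size,
        "num_workers": num_workers,
        # Only pin when a CUDA device can consume the pinned buffers.
        "pin_memory": cuda_available,
        # Keep worker processes alive between epochs when workers are used.
        "persistent_workers": num_workers > 0,
    }
```

With torch available, this would be spliced in as `DataLoader(dataset, **loader_kwargs(8, 4, torch.cuda.is_available()))`.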

Optimizer and Training Infrastructure Improvements:

  • Added a new optimizer config adamw_half.yaml for compatibility with external projects, and provided detailed comments for scheduler control in adamw.yaml, allowing more flexibility in learning rate scheduling.
  • Updated the single GPU SLURM distributed config to explicitly set gpus_per_node, tasks_per_node, and ntasks for improved SLURM job control.
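A hedged sketch of what such an optimizer config could contain: the actual adamw.yaml/adamw_half.yaml contents are not shown in this PR summary, so every key and value below is an assumption, shown only to illustrate the commented-out scheduler-control pattern.

```yaml
# Hypothetical adamw.yaml-style config with commented scheduler controls.
optimizer:
  _target_: torch.optim.AdamW
  lr: 1.0e-4
  weight_decay: 0.01

# Scheduler control: uncomment to enable cosine decay of the learning rate.
# scheduler:
#   _target_: torch.optim.lr_scheduler.CosineAnnealingLR
#   T_max: 100
```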
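For the single-GPU SLURM config, explicitly pinning the resource counts might look like the fragment below; the exact keys depend on the launcher plugin the project uses, so treat this as an illustration of the three settings named above rather than the file's actual contents.

```yaml
# Hypothetical single-GPU SLURM launcher settings.
nodes: 1
gpus_per_node: 1
tasks_per_node: 1
ntasks: 1
```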

Loss Function Module Update:

  • Updated autocast.losses.__init__.py to include EnsembleMAELoss in the import and __all__ list, making it available for use in experiment configs.

Other Minor Improvements:

  • Updated pyproject.toml to specify test paths and directories to ignore for pytest, improving test discovery and isolation.
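The pytest settings likely resemble the fragment below; the paths are assumptions chosen to illustrate the two knobs the bullet describes (where tests are discovered, and which directories are skipped):

```toml
# Hypothetical pyproject.toml fragment; actual paths in the PR may differ.
[tool.pytest.ini_options]
testpaths = ["tests"]
norecursedirs = ["data", "outputs", ".git"]
```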

sgreenbury merged commit cb09424 into main on Apr 2, 2026
3 checks passed
sgreenbury deleted the test-scripts/run-128x128 branch on April 2, 2026 at 11:12