Update ablation scripts for ensemble and model size evaluations #340

Merged
sgreenbury merged 19 commits into main from 2026-04-19/ablation-scripts on Apr 22, 2026

@sgreenbury
Contributor

This pull request adds a comprehensive set of configuration files and documentation to support ablation studies and sensitivity sweeps for model size, ensemble size, loss function variants, architectural comparisons, and conditioning strategies in the CNS (Conditioned Navier-Stokes) and related datasets. The changes enable flexible experimentation and evaluation workflows by introducing new YAML configs for model variants, detailed README guides for each ablation, and an eval script for ensemble size studies.

Key changes include:

Model Size Ablation Configs

  • Added four new YAML configuration files under local_hydra/local_experiment/ablations/model_size/conditioned_navier_stokes/ to support model size sweeps for both CRPS-ViT and FM-ViT models, providing 0.4x and 2x parameter count variants.

Documentation for Ablation Studies

  • Introduced a top-level README.md in slurm_scripts/ablations/ outlining the scope, current status, design notes, and workflow for all ablation, comparison, and sweep experiments.
  • Added detailed README files for specific ablation studies, including:
    • Ensemble size (ensemble_size/README.md) with batch regime details and scheduling instructions.
    • CRPS loss variants (crps_variants/README.md) with implementation sketches for swapping loss functions.
    • Architecture comparison between U-Net, FNO, and ViT (arch_unet_fno_vit/README.md).
    • Cached-latent CRPS loss study (cached_latent_crps/README.md).
    • Conditioning strategies: global vs permute (cond_global_vs_permute/README.md).

Evaluation Script for Ensemble Size Ablation

  • Added submit_eval_crps_ambient.sh under slurm_scripts/ablations/ensemble_size/eval/ to automate evaluation of ensemble-size ablation runs, with consistent evaluation parameters and clear documentation.
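For context on why the number of ensemble members matters at evaluation time, here is a minimal sketch of the standard empirical ensemble CRPS estimator for a single scalar target. It is the textbook estimator and may differ from what the repository's evaluation code computes; the function name `ensemble_crps` is hypothetical and not taken from this PR.

```python
from itertools import combinations


def ensemble_crps(members: list[float], target: float) -> float:
    """Empirical CRPS of an ensemble for one scalar target.

    Uses the standard estimator E|X - y| - 0.5 * E|X - X'|, with both
    expectations taken over the ensemble members (pairs include i == j).
    Hypothetical helper; not taken from the repository.
    """
    m = len(members)
    if m == 0:
        raise ValueError("need at least one ensemble member")
    # Accuracy term: mean absolute error of the members against the target.
    accuracy = sum(abs(x - target) for x in members) / m
    # Spread term: mean absolute difference over all ordered pairs.
    # Each unordered pair appears twice; i == j pairs contribute zero.
    pair_sum = sum(abs(a - b) for a, b in combinations(members, 2))
    spread = 2.0 * pair_sum / (m * m)
    return accuracy - 0.5 * spread
```

With a single member the spread term vanishes and the score reduces to the absolute error, which is one reason scores from runs evaluated at different ensemble sizes are not directly comparable.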

These changes provide a structured foundation for running, extending, and documenting a wide range of ablation and comparison studies in the project.

Label matches the honest measured ~2.09x / ~2.10x scaling rather than
the imprecise 160M target. Updates preset filenames, variant IDs,
wandb names, and README accordingly.

Extends the scan to 3 points per architecture (0p4x, baseline, 2x) using
aspect-preserving, heads-fixed scaling. Keeps the smaller point at
more-standard transformer dimensions to avoid confounding from overly narrow /
shallow settings.
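One way to read "aspect-preserving, heads-fixed scaling" is sketched below. It rests on assumptions not confirmed by this PR: a transformer block costs roughly 12 × width² parameters, "aspect-preserving" means width and depth are scaled by the same ratio, and "heads-fixed" means the head count stays constant so width must remain a multiple of it. The function names and the baseline dimensions are illustrative only.

```python
def approx_block_params(width: int, depth: int) -> int:
    """Rough transformer parameter count: 12 * width^2 per block.

    Assumes 4*width^2 for attention projections and 8*width^2 for an MLP
    with expansion ratio 4; ignores embeddings, biases and norms.
    """
    return 12 * width * width * depth


def scale_vit(width: int, depth: int, heads: int, factor: float) -> tuple[int, int]:
    """Scale width and depth by the same ratio to hit ~factor x parameters.

    Parameters grow as width^2 * depth, so scaling both by factor**(1/3)
    keeps the depth/width aspect ratio roughly constant. The head count is
    held fixed, so width is rounded to a multiple of `heads`.
    """
    s = factor ** (1.0 / 3.0)
    new_width = max(heads, round(width * s / heads) * heads)
    new_depth = max(1, round(depth * s))
    return new_width, new_depth


# Illustrative baseline only; these are not the repository's settings.
base_width, base_depth, heads = 768, 12, 12
base = approx_block_params(base_width, base_depth)
for factor in (0.4, 2.0):
    w, d = scale_vit(base_width, base_depth, heads, factor)
    measured = approx_block_params(w, d) / base
    print(f"target {factor}x -> width={w}, depth={d}, measured {measured:.2f}x")
```

Because width and depth are rounded to valid integers, the measured ratio generally lands near the target rather than exactly on it, which is why a variant is better labelled by its measured scaling than by a nominal parameter target.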

Pin comparison eval submitters to n_members=10 explicitly.
This keeps reruns and future submissions aligned even if the
global eval default changes later.
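The pinning idea can be sketched as a small helper that always writes the ensemble size into the submitted overrides. The override keys `eval.run_dir` and `eval.n_members` and the function name are hypothetical; the actual submitters in this PR are shell scripts and may spell this differently.

```python
DEFAULT_N_MEMBERS = 10  # value stated in the commit message above


def build_eval_overrides(run_dir: str, n_members: int = DEFAULT_N_MEMBERS) -> list[str]:
    """Return Hydra-style `key=value` overrides for one eval submission.

    The key names `eval.run_dir` and `eval.n_members` are hypothetical; the
    point is that n_members is always written out, never left to a default.
    """
    if n_members < 1:
        raise ValueError("n_members must be a positive integer")
    return [f"eval.run_dir={run_dir}", f"eval.n_members={n_members}"]
```

Since the value is part of every submitted command, a later change to the global eval default would not silently alter reruns of these comparisons.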

@sgreenbury merged commit 4fe6d1a into main on Apr 22, 2026
3 checks passed
@sgreenbury deleted the 2026-04-19/ablation-scripts branch on April 22, 2026 at 11:57