Update ablation scripts for ensemble and model size evaluations #340
Merged
sgreenbury merged 19 commits into main on Apr 22, 2026
Label matches the honest measured ~2.09x / ~2.10x scaling rather than the imprecise 160M target. Updates preset filenames, variant IDs, wandb names, and README accordingly.
Extends the scan to 3 points per architecture (0p4x, baseline, 2x) using aspect-preserving, heads-fixed scaling. Keeps the smaller point at more-standard transformer dimensions to avoid confounding from overly narrow / shallow settings.
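A minimal sketch of what aspect-preserving, heads-fixed scaling could look like, assuming a standard ViT block whose parameter count grows as depth × width². The baseline dimensions (width 384, depth 12, 6 heads) and the helper names are hypothetical and are not taken from the presets in this PR. Because width is rounded to a multiple of the head count and depth to an integer, the realised ratio generally differs from the nominal target, so labelling variants by the measured ratio is the safer choice.

```python
def vit_block_params(width: int, depth: int, mlp_ratio: int = 4) -> int:
    """Approximate parameter count of the transformer blocks only.

    Per block: 4*d^2 for attention (Q, K, V, output projection) plus
    2*mlp_ratio*d^2 for the MLP. Biases, norms, and embeddings are ignored.
    """
    return depth * (4 + 2 * mlp_ratio) * width * width


def scale_config(width: int, depth: int, heads: int, target_ratio: float):
    """Scale width and depth by one linear factor while keeping heads fixed.

    Params ~ depth * width^2, so a shared linear factor s changes the count
    by s^3; hence s = target_ratio ** (1/3).
    """
    s = target_ratio ** (1.0 / 3.0)
    # Width must stay divisible by the (fixed) number of heads.
    new_width = max(heads, round(width * s / heads) * heads)
    new_depth = max(1, round(depth * s))
    return new_width, new_depth, heads


# Hypothetical baseline, for illustration only.
BASE_WIDTH, BASE_DEPTH, BASE_HEADS = 384, 12, 6
base_params = vit_block_params(BASE_WIDTH, BASE_DEPTH)

for label, target in [("0p4x", 0.4), ("2x", 2.0)]:
    w, d, h = scale_config(BASE_WIDTH, BASE_DEPTH, BASE_HEADS, target)
    measured = vit_block_params(w, d) / base_params
    print(f"{label}: width={w} depth={d} heads={h} measured_ratio={measured:.3f}")
```

In a real model the embeddings, patch projections, and conditioning layers do not scale the same way as the blocks, which is another reason the measured ratio can drift from the nominal one.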
… into 2026-04-19/ablation-scripts
Pin comparison eval submitters to n_members=10 explicitly. This keeps reruns and future submissions aligned even if the global eval default changes later.
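The reasoning behind pinning can be illustrated with a small sketch. The config-merging helper and the dictionaries below are hypothetical stand-ins, not the project's actual eval code; only the key `n_members` and the value 10 come from the commit message above.

```python
from typing import Dict


def effective_config(defaults: Dict[str, int], overrides: Dict[str, int]) -> Dict[str, int]:
    """Merge explicit overrides on top of global defaults (overrides win)."""
    return {**defaults, **overrides}


# Hypothetical global eval defaults, today and after a later change.
defaults_today = {"n_members": 10}
defaults_later = {"n_members": 50}

# A submitter that relies on the default silently changes behaviour.
unpinned: Dict[str, int] = {}
# A submitter that pins the value stays reproducible.
pinned = {"n_members": 10}

print(effective_config(defaults_today, unpinned)["n_members"])
print(effective_config(defaults_later, unpinned)["n_members"])
print(effective_config(defaults_later, pinned)["n_members"])
```

With the pin in place, a rerun evaluates with the same ensemble size as the original submission regardless of what the global default becomes.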
This pull request adds a comprehensive set of configuration files and documentation to support ablation studies and sensitivity sweeps for model size, ensemble size, loss function variants, architectural comparisons, and conditioning strategies in the CNS (Conditioned Navier-Stokes) and related datasets. The changes enable flexible experimentation and evaluation workflows by introducing new YAML configs for model variants, detailed README guides for each ablation, and an eval script for ensemble size studies.
Key changes include:

Model Size Ablation Configs
- Configs under `local_hydra/local_experiment/ablations/model_size/conditioned_navier_stokes/` to support model size sweeps for both CRPS-ViT and FM-ViT models, providing 0.4x and 2x parameter count variants.

Documentation for Ablation Studies
- `README.md` in `slurm_scripts/ablations/` outlining the scope, current status, design notes, and workflow for all ablation, comparison, and sweep experiments.
- `ensemble_size/README.md` with batch regime details and scheduling instructions.
- `crps_variants/README.md` with implementation sketches for swapping loss functions.
- `arch_unet_fno_vit/README.md`
- `cached_latent_crps/README.md`
- `cond_global_vs_permute/README.md`

Evaluation Script for Ensemble Size Ablation
- `submit_eval_crps_ambient.sh` under `slurm_scripts/ablations/ensemble_size/eval/` to automate evaluation of ensemble-size ablation runs, with consistent evaluation parameters and clear documentation.

These changes provide a structured foundation for running, extending, and documenting a wide range of ablation and comparison studies in the project.