
Commit 4fe6d1a

Merge pull request #340 from alan-turing-institute/2026-04-19/ablation-scripts
Update ablation scripts for ensemble and model size evaluations
2 parents de86747 + 017144d commit 4fe6d1a

30 files changed

Lines changed: 1454 additions & 0 deletions
Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
# @package _global_
defaults:
  - /local_experiment/epd/conditioned_navier_stokes/crps_vit_azula_large
  - _self_

experiment_name: ablation_model_size_crps_vit_azula_0p4x_conditioned_navier_stokes

datamodule:
  # Keep effective per-GPU batch at 256 after increasing n_members 8 -> 16.
  batch_size: 16

model:
  n_members: 16
  processor:
    # Aspect-preserving ~0.39x variant (depth 8, width aspect-matched to
    # baseline 568/12 = 47.3). Measured at ~31.6M processor params versus
    # ~80.8M for the baseline 568/12/8 model.
    hidden_dim: 376
    n_layers: 8
    num_heads: 8
    n_noise_channels: 1024
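The "aspect-matched" width in the config comment above can be reproduced with a quick back-of-envelope check. This is an illustrative reading of that comment, not taken from the repo; the rounding-to-a-multiple-of-`num_heads` rule is an assumption.

```bash
# Illustrative only: baseline aspect = hidden_dim / n_layers = 568 / 12 ≈ 47.3.
# At depth 8 the aspect-matched width is ≈ 8 × 47.3 ≈ 379; 376 is the nearest
# value divisible by num_heads = 8 (that rounding rule is an assumption).
awk 'BEGIN {
  aspect = 568 / 12
  raw    = 8 * aspect
  printf "aspect=%.1f  raw_width=%.1f  chosen=376 (=%d*8)\n", aspect, raw, 376 / 8
}'
```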
Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
# @package _global_
defaults:
  - /local_experiment/epd/conditioned_navier_stokes/crps_vit_azula_large
  - _self_

experiment_name: ablation_model_size_crps_vit_azula_2x_conditioned_navier_stokes

datamodule:
  # Keep effective per-GPU batch at 256 after increasing n_members 8 -> 16.
  batch_size: 16

model:
  n_members: 16
  processor:
    # Measured at ~169.3M processor params in the CNS ambient setup, versus
    # ~80.8M for the baseline 568/12/8 model (~2.09x).
    hidden_dim: 768
    n_layers: 16
    num_heads: 8
    n_noise_channels: 1024
Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
# @package _global_
defaults:
  - /local_experiment/epd/conditioned_navier_stokes/fm_vit_large
  - _self_

experiment_name: ablation_model_size_fm_vit_0p4x_conditioned_navier_stokes

model:
  processor:
    backbone:
      # Aspect-preserving ~0.32x variant (depth 8, width aspect-matched to
      # baseline 704/12 = 58.7). Measured at ~25.6M processor params versus
      # ~80.0M for the baseline 704/12/8 backbone.
      hid_channels: 472
      hid_blocks: 8
      attention_heads: 8
Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
# @package _global_
defaults:
  - /local_experiment/epd/conditioned_navier_stokes/fm_vit_large
  - _self_

experiment_name: ablation_model_size_fm_vit_2x_conditioned_navier_stokes

model:
  processor:
    backbone:
      # Measured at ~168.6M processor params in the CNS ambient setup, versus
      # ~80.3M for the baseline 704/12/8 backbone (~2.10x).
      hid_channels: 896
      hid_blocks: 16
      attention_heads: 8

slurm_scripts/ablations/README.md

Lines changed: 66 additions & 0 deletions
@@ -0,0 +1,66 @@
# Ablations

Sensitivity sweeps, comparisons, and ablations that sit on top of the main
4-dataset comparison in `slurm_scripts/comparison/`. "Ablation" is used
loosely here for all three — true ablations (EMA on/off), comparisons
(FM vs diffusion, ViT vs U-Net), and sweeps (ensemble size, noise
channels) — to match how ML papers usually label this section.

Most ablations are still **CNS-only for now**. The current exception is
`ensemble_size` under the `eff_bs1024` regime, which now extends to the
other three main comparison datasets (`gray_scott`,
`gpe_laser_only_wake`, `advection_diffusion`) in addition to CNS. Each
script keeps dataset coverage local so widening an ablation remains a
small edit.

## Status table

| ablation | type | datasets | runs | status |
|---|---|---|---|---|
| ensemble_size (m=16, fixed bs=32) | sweep | CNS | 1 | ready |
| ensemble_size (m=16, fixed global eff. bs=1024) | sweep | GS / GPE / CNS / AD | 4 | timing ready |
| noise_channels | sweep | CNS | 1+ | stub |
| crps_variants (AlphaFair / Fair / CRPS) | comparison | CNS | 3 | stub |
| fm_vs_diffusion | comparison | CNS | 1 | stub |
| arch_unet_fno_vit | comparison | CNS | 2 | stub |
| model_size | sweep | CNS | 2 | ready |
| cached_latent_crps | comparison | CNS | 1 (done, 2026-04-19) | stub |
| cond_global_vs_permute | comparison | CNS | 1 (done for CRPS-ViT, 2026-04-18) | stub |
| eval_only/ode_steps | eval-only | FM runs | 0 | stub |
| eval_only/ema | eval-only | EMA ckpts | 0 | stub |

"Done" entries refer to runs already produced by
`slurm_scripts/comparison/` that double as the CNS data point for this
ablation — no new training required, but they should be eval'd through
the same pipeline.

## Design notes

- **Flexible by construction.** Each ablation is a self-contained
  subdirectory. Changing the knob values, swapping to a different
  baseline, or dropping an ablation is a localized edit. Dataset
  coverage lives inside each ablation's submit scripts, so extending one
  ablation does not spill into the others.
- **Baselines stay in `local_hydra/local_experiment/{epd,processor}/`.**
  Ablation configs extend those via Hydra `defaults`. When the sweep is
  a one-liner (e.g. ensemble size → `model.n_members` +
  `datamodule.batch_size`), the submit script uses CLI overrides and no
  new config file is created. When the ablation materially changes the
  architecture (model size, arch comparison), each variant gets its own
  yaml under `local_hydra/local_experiment/ablations/<name>/<dataset>/`.
- **Timing first, then 24h schedule.** Same two-step pattern as
  `slurm_scripts/comparison/`: each ablation has a `*_timing.sh` (5-epoch
  run → `timing.ckpt`) and a `*_large.sh` (24h run with cosine epochs
  computed from timing).

## Submission workflow

1. `submit_*_timing.sh` — 5-epoch timing runs, producing `timing.ckpt`.
2. Extract per-combo `cosine_epochs` via
   `uv run autocast time-epochs --from-checkpoint <path>/timing.ckpt -b 24`
   and paste into `submit_*_large.sh` (matches `comparison/` flow).
3. `submit_*_large.sh` — 24h production runs, dry-run first.
4. Eval from the script local to the study:
   `slurm_scripts/comparison/eval/` for the canonical comparison suite, and
   `slurm_scripts/ablations/<name>/eval/` for ablation-only run sets that have
   not been promoted into the main comparison yet.
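A minimal shell sketch of steps 1–3 for one ablation (ensemble size). The script names, the `time-epochs` invocation, and the "dry-run first" step come from this README; the output directory placeholders and the `DRY_RUN` toggle are assumptions to check against the actual submit scripts.

```bash
#!/usr/bin/env bash
# Sketch only — the real scripts live under slurm_scripts/ablations/<name>/.
set -euo pipefail

# 1. 5-epoch timing runs, each producing a timing.ckpt in its output dir.
bash slurm_scripts/ablations/ensemble_size/submit_ensemble_timing.sh

# 2. Derive per-combo cosine epochs for a 24h budget from the timing checkpoint,
#    then paste the result into submit_ensemble_large.sh (matches comparison/ flow).
uv run autocast time-epochs \
  --from-checkpoint outputs/<date>/<timing_run>/timing.ckpt -b 24

# 3. 24h production runs — dry-run first (the DRY_RUN toggle is hypothetical).
DRY_RUN=1 bash slurm_scripts/ablations/ensemble_size/submit_ensemble_large.sh
bash slurm_scripts/ablations/ensemble_size/submit_ensemble_large.sh
```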
Lines changed: 37 additions & 0 deletions
@@ -0,0 +1,37 @@
# Architecture comparison: U-Net, FNO, ViT

Compare U-Net and FNO backbones against the ViT (Azula) baseline on the
CRPS ambient path.

**Status:** stub — no scripts yet.

## Baseline

`local_hydra/local_experiment/epd/conditioned_navier_stokes/crps_vit_azula_large.yaml`
(ViT-Azula, ~81M params).

## Knob

Swap the `model.processor` backbone while trying to match parameter count
(~80M) and per-epoch budget. Candidate configs to crib from:

- `local_hydra/local_experiment/epd_crps_unet_azula.yaml` — U-Net +
  CRPS.
- `local_hydra/local_experiment/epd_crps_fno.yaml` — FNO + CRPS.

Each will need a per-CNS `local_experiment/ablations/arch/<arch>.yaml`
that matches the ambient baseline's encoder/decoder/loss so only the
backbone varies.

## Datasets

CNS only for now. The table spec'd 2 datasets × 2 non-ViT archs = 4 runs
(CNS gives 2: U-Net and FNO).

## Outstanding decisions

- How to match parameter count across architectures — the comparison
  table for the main study (see `slurm_scripts/comparison/README.md`)
  locked ~80M for ViT variants; we need equivalent targets for U-Net
  and FNO.
- Whether FNO needs a different patch-size / token structure.
Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
# Cached-latent CRPS

CRPS loss trained in cached-latent space (processor-only training on
pre-encoded latents, decoded only at eval time).

**Status:** CNS data point exists —
`outputs/2026-04-19/crps_cns64_vit_azula_large_58712c4_71ba7be`.
No new training script is needed for this pass; eval is handled by
`slurm_scripts/comparison/eval/submit_eval_crps_latent.sh`.

## Baseline

`local_hydra/local_experiment/processor/conditioned_navier_stokes/crps_vit_azula_large.yaml`.

## Next steps

- When the second dataset is added, extend the `DATASETS` map in
  `submit_eval_crps_latent.sh` (sketched below) and submit a matching
  training run via
  `slurm_scripts/comparison/cached_latents/submit_crps_latent_large.sh`.
- Decide whether to include an `eval.mode=latent` ablation alongside
  `eval.mode=ambient` for this ablation specifically — it answers "how
  much of the latent-CRPS gap is decode/encode drift?".
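The `DATASETS` map mentioned in the first step might look roughly like this. The associative-array shape and the commented-out key are assumptions about `submit_eval_crps_latent.sh`, not copied from it; only the CNS run dir quoted above is real.

```bash
# Hypothetical shape of the DATASETS map in submit_eval_crps_latent.sh;
# adding the second dataset should be a one-line edit of this kind.
declare -A DATASETS=(
  # Existing CNS data point (run dir from the status line above).
  [conditioned_navier_stokes]="outputs/2026-04-19/crps_cns64_vit_azula_large_58712c4_71ba7be"
  # [gray_scott]="outputs/<date>/<gray_scott_run_dir>"  # add once that run exists
)
```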
Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
# Conditioning: global_cond (AdaLN) vs permute_concat

Swap the CRPS ambient conditioning path from `permute_concat` (spatial
channel concatenation) to an `identity` encoder + `include_global_cond:
true` (AdaLN modulation on the backbone). This makes the conditioning
flow match FM ambient, isolating the encoder effect.

**Status:** CNS data point exists for CRPS-ViT —
`outputs/2026-04-18/crps_cns64_vit_azula_large_0f89f06_cf53b48`. No new
CRPS-ViT training is needed for this pass; the U-Net equivalent is pending.

## Baselines

- CRPS-ViT with identity + global_cond:
  `local_hydra/local_experiment/epd/conditioned_navier_stokes/crps_vit_azula_large_identity_global_cond.yaml`.
- CRPS-ViT with permute_concat (main baseline):
  `.../crps_vit_azula_large.yaml`.

## Outstanding

- U-Net analogue: needs a `crps_unet_large_identity_global_cond.yaml`
  mirroring the ViT ablation. The U-Net backbone's `include_global_cond`
  path is still to be verified against the U-Net module in
  `src/autocast/processors/`.
- Eval for the existing CNS ViT ablation run is covered by
  `slurm_scripts/comparison/eval/submit_eval_crps_ambient.sh` (included
  in its RUN_DIRS).
Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
# CRPS loss variants

Compare `AlphaFairCRPS` (baseline) vs `FairCRPS` vs `CRPS`.

**Status:** stub — no scripts yet.

## Baseline

`local_hydra/local_experiment/epd/conditioned_navier_stokes/crps_vit_azula_large.yaml`
(uses `AlphaFairCRPSLoss`).

## Knob

Swap `model.loss_func._target_` and the matching `train_metrics.crps`
target:

| variant | loss_func | metric |
|---|---|---|
| AlphaFairCRPS (baseline) | `autocast.losses.ensemble.AlphaFairCRPSLoss` | `autocast.metrics.ensemble.AlphaFairCRPS` |
| FairCRPS | `autocast.losses.ensemble.FairCRPSLoss` | `autocast.metrics.ensemble.FairCRPS` |
| CRPS | `autocast.losses.ensemble.CRPSLoss` | `autocast.metrics.ensemble.CRPS` |

Exact class paths are to be verified against
`src/autocast/losses/ensemble.py` and `metrics/ensemble.py` before
scripting.

## Datasets

CNS only for now. The table spec'd 2 datasets × 3 losses = 6 runs — CNS
gives us 3 runs for this pass.

## Implementation sketch

Single-file sweep via CLI overrides in `submit_crps_variants_*.sh` with
a `LOSSES` array of `(name, loss_target, metric_target)` triples.
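A hedged sketch of that single-file sweep. The class paths are the unverified ones from the table above; the `train_metrics.crps` override location, experiment-name pattern, and final submit call are assumptions to confirm before scripting.

```bash
# Sketch of the LOSSES sweep described above; class paths and the
# train_metrics override location still need verifying against
# src/autocast/losses/ensemble.py and metrics/ensemble.py.
LOSSES=(
  "alphafair|autocast.losses.ensemble.AlphaFairCRPSLoss|autocast.metrics.ensemble.AlphaFairCRPS"
  "fair|autocast.losses.ensemble.FairCRPSLoss|autocast.metrics.ensemble.FairCRPS"
  "crps|autocast.losses.ensemble.CRPSLoss|autocast.metrics.ensemble.CRPS"
)

for entry in "${LOSSES[@]}"; do
  IFS="|" read -r name loss_target metric_target <<< "${entry}"
  overrides=(
    "experiment_name=ablation_crps_variant_${name}_conditioned_navier_stokes"
    "model.loss_func._target_=${loss_target}"
    "model.train_metrics.crps._target_=${metric_target}"
  )
  # Placeholder: wire these overrides into whatever sbatch/training call
  # the other ablation submit scripts use.
  echo "[dry-run] ${name}: ${overrides[*]}"
done
```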
Lines changed: 90 additions & 0 deletions
@@ -0,0 +1,90 @@
# Ensemble size ablation

First-pass defaults focus on `n_members=16` under two batch-size
regimes. For the current submission pass, the active scripts are pared
down to just three `eff_bs1024` runs on `gray_scott`,
`gpe_laser_only_wake`, and `advection_diffusion`; the CNS entries and
the `fixed_bs32` combo are left commented for later reuse. All runs
inherit from the matching per-dataset
`local_hydra/local_experiment/epd/<dataset>/crps_vit_azula_large.yaml`;
the ablation is a pure CLI override on `model.n_members` +
`datamodule.batch_size`, so no new experiment configs are needed.

## Knob map

The main baseline is `bs_crps=32 × n_members=8 × 4 GPUs = 1024 global
effective` (i.e. `256 effective per-GPU`).

### Fixed batch size = 32/GPU (same as baseline)

Keep `datamodule.batch_size=32` and set `n_members=16`.
This doubles the effective batch vs the baseline.

| n_members | bs_per_gpu | effective per-GPU | effective global |
|---:|---:|---:|---:|
| 16 | 32 | 512 | 2048 |

### Fixed global effective batch = 1024 (matches baseline compute budget)

Keep `bs_crps × n_members × 4 GPUs = 1024`. With `n_members=16`,
`bs_per_gpu=16`.

| n_members | bs_per_gpu | effective per-GPU | effective global |
|---:|---:|---:|---:|
| 16 | 16 | 256 | 1024 |

## Dataset coverage

| dataset | `fixed_bs32` | `eff_bs1024` |
|---|---:|---:|
| `conditioned_navier_stokes` | yes | yes |
| `gray_scott` | no | yes |
| `gpe_laser_only_wake` | no | yes |
| `advection_diffusion` | no | yes |

This keeps the original CNS pilot in reserve while the active submit
scripts target only the three compute-matched (`1024` effective global
batch) CRPS ablations on the other comparison datasets.

## Files

| file | purpose |
|---|---|
| `submit_ensemble_timing.sh` | 5-epoch timing for the three active `eff_bs1024` runs (`gray_scott`, `gpe_laser_only_wake`, `advection_diffusion`) → `timing.ckpt` per run |
| `submit_ensemble_large.sh` | 24h production runs for the same three active runs, using cached or timing-derived cosine schedules |
| `eval/submit_eval_crps_ambient.sh` | ambient eval for the current `m=16` CRPS run set (CNS `fixed_bs32` pilot plus all available `eff_bs1024` runs), with conservative `eval.batch_size=4` and explicit `eval.n_members=10` to match the comparison-study eval regime |

## Extending the sweep

Add more lines to `COMBOS` in both submit scripts (see the sketch after
this section). Invariants are checked per regime so bad tuples fail fast
before any submission:

- `fixed_bs32`: require `bs_per_gpu=32`; vary `n_members`.
- `eff_bs1024`: require `bs_per_gpu × n_members × 4 GPUs = 1024`.

Dataset coverage is controlled separately via `REGIMES_BY_DATASET` in
each submit script, so extending `eff_bs1024` without broadening
`fixed_bs32` is a one-line change per dataset.
`fixed_bs32` is a one-line change per dataset.
68+
69+
## Eval placement
70+
71+
Ensemble-size eval now lives under `slurm_scripts/ablations/ensemble_size/eval/`
72+
rather than `slurm_scripts/comparison/eval/`. The reason is organizational:
73+
the run set is still partly ablation-only (`fixed_bs32`) even though the
74+
`eff_bs1024` subset may later graduate into the main comparison baseline.
75+
76+
If that promotion happens, move the promoted run dirs into a comparison-level
77+
eval script and leave only the genuinely ablation-only runs here.
78+
79+
## Scheduling
80+
81+
`submit_ensemble_large.sh` first checks `COSINE_EPOCHS_BY_COMBO`. If a
82+
key is missing, it looks for the matching timing run
83+
`outputs/*/crps_<dataset>_<regime>_m<n_members>/timing.ckpt` and derives
84+
`trainer.max_epochs` on the fly with:
85+
86+
`uv run autocast time-epochs --from-checkpoint <path>/timing.ckpt -b 24 -m 0.02`
87+
88+
That means the added `gray_scott`, `gpe_laser_only_wake`, and
89+
`advection_diffusion` `eff_bs1024` runs become submit-ready as soon as
90+
their timing jobs finish, without another script edit.
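The fallback described above might look roughly like this. The map name, flag values, and timing-run glob come from this README; the key format and the assumption that `time-epochs` prints a bare epoch count would need checking against the actual script.

```bash
# Sketch of the cosine-schedule fallback in submit_ensemble_large.sh
# (illustrative; key format and output parsing are assumptions).
declare -A COSINE_EPOCHS_BY_COMBO=(
  # ["gray_scott|eff_bs1024|m16"]=123   # pasted here once timing has run
)

# Assumes dataset/regime/n_members were parsed from the current COMBOS entry.
combo_key="${dataset}|${regime}|m${n_members}"
if [[ -n "${COSINE_EPOCHS_BY_COMBO[${combo_key}]:-}" ]]; then
  max_epochs="${COSINE_EPOCHS_BY_COMBO[${combo_key}]}"
else
  # Fall back to the timing checkpoint and derive the 24h cosine schedule.
  timing_ckpt=$(ls outputs/*/crps_${dataset}_${regime}_m${n_members}/timing.ckpt | head -n 1)
  max_epochs=$(uv run autocast time-epochs --from-checkpoint "${timing_ckpt}" -b 24 -m 0.02)
fi
echo "trainer.max_epochs=${max_epochs}"
```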
