1 change: 1 addition & 0 deletions .github/CODEOWNERS
@@ -250,6 +250,7 @@ physicsnemo/models/vfgn/ @mnabian

# Experimental deliberately has no codeowner
physicsnemo/experimental/
physicsnemo/experimental/datapipes/healda/ @pzharrington

# ==============================================================================
# EXAMPLES - Active Learning
218 changes: 218 additions & 0 deletions examples/weather/healda/README.md
@@ -0,0 +1,218 @@
# HealDA — AI-based Data Assimilation on the HEALPix Grid

> **🏗️ This recipe is under active construction. 🏗️**
> Structure and functionality are subject to change.

HealDA is a stateless assimilation model that produces a single
global weather analysis from conventional and satellite
observations. It operates on a HEALPix level-6 padded XY grid
and outputs ERA5-compatible atmospheric variables.
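
For a sense of scale, here is a minimal sketch of the grid size
this implies, assuming "level 6" denotes HEALPix order 6
(i.e. `nside = 2**6`):

```python
# Assumption: HEALPix level 6 means order 6, so nside = 2**6 = 64.
nside = 2 ** 6
npix = 12 * nside ** 2  # standard HEALPix pixel count: 12 * nside^2
print(nside, npix)  # 64 pixels per face side, 49152 pixels globally
```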

## Setup

Start by installing PhysicsNeMo (if not already installed) with
the `healda` optional dependency group, along with the packages
in `requirements.txt`. Then, copy this folder
(`examples/weather/healda`) to a system with a GPU available.
Also, prepare a dataset that can serve training data according
to the protocols outlined in the
[Generalized Data Loading](#generalized-data-loading) section
below.
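
As a rough sketch, the setup steps might look like the following;
the install source and the target host/path are placeholders, so
adjust them to your environment:

```bash
# Install PhysicsNeMo with the healda optional dependency group
# (here from a repository checkout; adjust if installing another way).
pip install ".[healda]"

# Install the recipe-specific requirements.
pip install -r examples/weather/healda/requirements.txt

# Copy the recipe folder to the GPU system (placeholder host/path).
scp -r examples/weather/healda user@gpu-host:/path/to/workdir/
```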

### Normalization statistics

Per-sensor observation stats (`configs/normalizations/*.csv`)
and the ERA5 stats table (`configs/era5_13_levels_stats.csv`)
ship with this recipe rather than the installed package, since
they are training-set specific. Point the datapipe at them by
setting `HEALDA_STATS_DIR` before importing
`physicsnemo.experimental.datapipes.healda`:

```bash
export HEALDA_STATS_DIR=$(pwd)/configs
```

If unset, sensor configs fall back to zero-mean / unit-std
(useful for tests and structural checks); the ERA5 stats
loader will raise instead, since there is no sensible default.
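
For illustration, here is a minimal sketch of how a stats table
like these is typically applied; the column names match the CSVs
shipped in `configs/`, but the lookup itself is hypothetical and
not the datapipe's internal API:

```python
import pandas as pd

# Load the ERA5 stats table shipped with this recipe.
stats = pd.read_csv("configs/era5_13_levels_stats.csv")

# Look up mean/std for one (variable, level) pair, e.g. T at 500 hPa.
row = stats[(stats["variable"] == "T") & (stats["level"] == 500)].iloc[0]

# Z-score normalization before feeding the field to the model.
def normalize(field):
    return (field - row["mean"]) / row["std"]
```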

## Generalized Data Loading

The `physicsnemo.experimental.datapipes.healda` package provides
a composable data loading pipeline with clear extension points.
The architecture separates components into loaders, transforms,
datasets, and sampling infrastructure.

### Architecture

```text
ObsERA5Dataset(era5_data, obs_loader, transform)
| Temporal windowing via FrameIndexGenerator
| __getitems__ -> get() per index -> transform.transform()
v
RestartableDistributedSampler (stateful distributed sampling with checkpointing)
|
DataLoader (pin_memory, persistent_workers)
|
prefetch_map(loader, transform.device_transform)
|
Training loop (GPU-ready batch)
```

### Key Protocols

Custom data sources and transforms plug in via these protocols
(see `physicsnemo.experimental.datapipes.healda.protocols`):

**`ObsLoader`** — the observation loading interface:

```python
class MyObsLoader:
async def sel_time(self, times):
"""Return {"obs": [pa.Table, ...]}"""
...
```

**`Transform`** / **`DeviceTransform`** — two-stage batch
processing:

```python
class MyTransform:
def transform(self, times, frames):
"""CPU-side: normalize, encode obs, time features."""
...

def device_transform(self, batch, device):
"""GPU-side: move to device, compute obs features."""
...
```
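
As a concrete (illustrative) example, a minimal transform following
this two-stage split could look like the sketch below; the keys
`"state"` and `"times"` are placeholders, not the batch layout used
by `ERA5ObsTransform`:

```python
import torch


class MinimalTransform:
    """Illustrative two-stage transform: CPU-side prep, then device move."""

    def __init__(self, mean: float, std: float):
        self.mean = mean
        self.std = std

    def transform(self, times, frames):
        # CPU side (runs in DataLoader workers): tensorize and normalize.
        state = torch.as_tensor(frames["state"], dtype=torch.float32)
        state = (state - self.mean) / self.std
        return {"state": state, "times": times}

    def device_transform(self, batch, device):
        # GPU side (runs in the prefetch stage): async host-to-device copy.
        return {
            k: v.to(device, non_blocking=True) if torch.is_tensor(v) else v
            for k, v in batch.items()
        }
```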

### Provided Implementations

| Component | Module | Description |
|---|---|---|
| `ObsERA5Dataset` | `dataset` | ERA5 state + observations |
| `UFSUnifiedLoader` | `loaders.ufs_obs` | Parquet obs loader |
| `ERA5Loader` | `loaders.era5` | Async ERA5 zarr loader |
| `ERA5ObsTransform` | `transforms.era5_obs` | Two-stage transform |
| `RestartableDistributedSampler` | `samplers` | Stateful distributed sampler |
| `prefetch_map` | `prefetch` | CUDA stream prefetching |

All modules above are under
`physicsnemo.experimental.datapipes.healda`.

### Writing a Custom Observation Loader

Implement `async def sel_time(times)` returning a dict with
observation data per timestamp:

```python
class GOESRadianceLoader:
def __init__(self, data_path, channels):
self.data_path = data_path
self.channels = channels

async def sel_time(self, times):
tables = []
for t in times:
table = self._load_goes_radiances(t)
tables.append(table)
return {"obs": tables}
```

Then pass it to the dataset:

```python
from physicsnemo.experimental.datapipes.healda import (
ObsERA5Dataset,
)
from physicsnemo.experimental.datapipes.healda.transforms.era5_obs import (
ERA5ObsTransform,
)
from physicsnemo.experimental.datapipes.healda.configs.variable_configs import (
VARIABLE_CONFIGS,
)

dataset = ObsERA5Dataset(
era5_data=era5_xr["data"],
obs_loader=GOESRadianceLoader(...),
transform=ERA5ObsTransform(sensors=["goes"], ...),
variable_config=VARIABLE_CONFIGS["era5"],
)
```

### Putting It Together

A complete training pipeline wires together all the
components — dataset, sampler, DataLoader, and GPU prefetch:

```python
import torch
from torch.utils.data import DataLoader

from physicsnemo.experimental.datapipes.healda import (
ObsERA5Dataset,
RestartableDistributedSampler,
identity_collate,
prefetch_map,
)
from physicsnemo.experimental.datapipes.healda.loaders.ufs_obs import (
UFSUnifiedLoader,
)
from physicsnemo.experimental.datapipes.healda.transforms.era5_obs import (
ERA5ObsTransform,
)
from physicsnemo.experimental.datapipes.healda.configs.variable_configs import (
VARIABLE_CONFIGS,
)

sensors = ["atms", "mhs", "conv"]

# 1. Build loaders
obs_loader = UFSUnifiedLoader(
data_path="/path/to/processed_obs",
sensors=sensors,
obs_context_hours=(-21, 3),
)
transform = ERA5ObsTransform(
variable_config=VARIABLE_CONFIGS["era5"],
sensors=sensors,
)

# 2. Build dataset
dataset = ObsERA5Dataset(
era5_data=era5_xr["data"],
obs_loader=obs_loader,
transform=transform,
variable_config=VARIABLE_CONFIGS["era5"],
split="train",
)

# 3. Sampler + DataLoader
sampler = RestartableDistributedSampler(
dataset, rank=rank, num_replicas=world_size,
)
sampler.set_epoch(0)
dataloader = DataLoader(
dataset,
sampler=sampler,
batch_size=2,
num_workers=8,
collate_fn=identity_collate,
pin_memory=True,
persistent_workers=True,
)

# 4. GPU prefetch (hides CPU→GPU transfer behind training)
device = torch.device("cuda")
loader = prefetch_map(
dataloader,
lambda batch: transform.device_transform(batch, device),
queue_size=1,
)

# 5. Training loop — batches arrive GPU-ready
for batch in loader:
loss = model(batch)
...
```
77 changes: 77 additions & 0 deletions examples/weather/healda/configs/era5_13_levels_stats.csv
@@ -0,0 +1,77 @@
variable,level,std,mean
U,1000,6.0761532856545495,-0.4012036280949351
U,925,7.8438841318137085,0.1875586020534487
U,850,8.117545798838897,1.0676392264524672
U,700,9.161262910521476,3.1993547283822656
U,600,10.351920350301654,4.753506500293617
U,500,12.020863918100371,6.643045555863143
U,400,14.37786455089647,9.166420770646372
U,300,17.368910160479544,12.604205075841238
U,250,18.63126418939197,14.458098693848147
U,200,18.826071570402007,15.603421939654016
U,150,17.338743800703853,14.694208759254527
U,100,14.207755780676433,10.084541123485184
U,50,14.828288062604061,3.447298300210448
V,1000,5.0562302721142185,0.1881812905956076
V,925,6.087634459436962,0.18351119177675437
V,850,5.788139666651214,0.08865528649740444
V,700,6.326620906882278,-0.01624326876106696
V,600,7.18152443362136,-0.046446792220198006
V,500,8.45848562350414,-0.032289933410975316
V,400,10.38343861775161,-0.02010503440455113
V,300,12.609499025107015,-0.025803734238538843
V,250,13.057155983469311,-0.04324580333432856
V,200,12.04089626815829,-0.07330144652625546
V,150,9.733874174308893,-0.06046714286886432
V,100,7.052043591267376,0.016139828470052277
V,50,5.626839932669169,0.000506756767366129
T,1000,13.3926808838397,288.34811963631944
T,925,12.75887207878649,284.14180529926887
T,850,12.337912690900826,281.1236718383152
T,700,11.485912313471955,273.6522781145025
T,600,10.914242703518575,266.8225997851199
T,500,10.947319271596404,258.48344241794354
T,400,10.889317302118906,247.5078738667765
T,300,9.38121872205424,233.2984452545507
T,250,7.28726013236191,225.6329557585312
T,200,5.29271246649905,218.53493343877926
T,150,7.447585393440247,211.46310701251957
T,100,11.427625612914374,204.83933151080333
T,50,7.494050550499991,211.44287440973292
Z,1000,893.4657098478391,935.5049140418249
Z,925,1008.7658942548571,7360.676041141024
Z,850,1197.2156800245555,14248.78134896371
Z,700,1736.7489061933863,29767.144398835746
Z,600,2188.213530821762,41747.04921153613
Z,500,2729.9217253050874,55509.29599258445
Z,400,3402.601245989482,71726.5567290568
Z,300,4213.390725518206,91575.34112031905
Z,250,4602.533249864132,103579.02469155286
Z,200,4831.571016444258,117791.77588622355
Z,150,4711.8643515216945,135539.4191093247
Z,100,4105.159434190763,159702.46332065153
Z,50,3851.9190680022352,200924.18906200194
Q,1000,0.005781177709317301,0.00936470198136523
Q,925,0.004981825818460944,0.008008882445730241
Q,850,0.004179579792520903,0.006031608541363992
Q,700,0.002737859580661181,0.0031682692015671905
Q,600,0.0019479582011688481,0.0019997242562778714
Q,500,0.0012087878889463929,0.0011044648648525284
Q,400,0.0005684813340060475,0.0005001939892631466
Q,300,0.00018788822542209214,0.00016860593486309685
Q,250,8.253217825122686e-05,7.728339864456503e-05
Q,200,2.476106074180645e-05,2.5977126545155267e-05
Q,150,4.08396736011409e-06,6.446599195006216e-06
Q,100,6.154373024833254e-07,2.6868721151493317e-06
Q,50,2.5960505606920645e-07,2.6752383145132987e-06
tcwv,-1,16.707112756025314,24.224098621015028
tas,-1,15.355112655773071,287.40642670567786
uas,-1,5.436672873915895,-0.3751276131251324
vas,-1,4.491587780564064,0.18441198768158903
100u,-1,6.684378709883516,-0.36058662354821425
100v,-1,5.613643415909631,0.1893235299805761
pres_msl,-1,1109.2809461275167,101138.98195665042
sst,-1,8.851771853018453,290.8944586515412
sic,-1,0.18702627733432053,0.04226433445734063
orog,-1.0,627.3885284872,232.56013904090733
lfrac,-1.0,0.4695501683565522,0.3410480857539571
@@ -0,0 +1,65 @@
Raw_Channel_ID,Platform_ID,obs_std,obs_mean,obs_min,obs_max
1,-1,8.214708551830356,-0.007816872517859445,-54.583263,25.529593
1,0,8.214708551830421,-0.007816872517859456,-54.583263,25.529593
2,-1,6.204024141659925,0.060118015162609874,-54.41878,28.946451
2,0,6.204024141659928,0.060118015162609874,-54.41878,28.946451
3,-1,1.7957874344187403,-0.04797540239732354,-11.155533,14.771423
3,0,1.7957874344187437,-0.047975402397323494,-11.155533,14.771423
4,-1,1.4675865871956242,-0.0006924302620134747,-9.648644,9.285899
4,0,1.4675865871956315,-0.0006924302620134743,-9.648644,9.285899
5,-1,0.8913469566843333,-0.005252729000527411,-5.767573,7.9473076
5,0,0.8913469566843248,-0.005252729000527415,-5.767573,7.9473076
6,-1,0.623793467189813,-0.003225442053726113,-5.3376155,3.5984225
6,0,0.6237934671898061,-0.0032254420537261115,-5.3376155,3.5984225
7,-1,0.39469768069981787,-0.0023120367162287455,-5.812195,7.6763997
7,0,0.39469768069981237,-0.002312036716228743,-5.812195,7.6763997
8,-1,0.33593410156027675,0.005151922195575682,-9.785906,9.08154
8,0,0.33593410156027365,0.005151922195575682,-9.785906,9.08154
9,-1,0.28105098555908775,0.001158328666100196,-3.716396,3.4178336
9,0,0.281050985559087,0.0011583286661001964,-3.716396,3.4178336
10,-1,0.22843497168668764,0.0036994384355875766,-3.6491232,4.755522
10,0,0.22843497168668825,0.003699438435587577,-3.6491232,4.755522
11,-1,0.21192941892048298,-0.0003545486185746004,-13.372186,3.3225749
11,0,0.21192941892048256,-0.0003545486185746005,-13.372186,3.3225749
12,-1,0.1347720053776834,-0.0007126142925684956,-6.045535,2.4844725
12,0,0.13477200537768147,-0.0007126142925684956,-6.045535,2.4844725
13,-1,0.11211111955725256,-0.0005063795610494643,-1.8508754,4.7352424
13,0,0.11211111955725352,-0.000506379561049464,-1.8508754,4.7352424
14,-1,0.11594864230525384,0.002015852397427362,-2.623983,5.3442745
14,0,0.11594864230525545,0.002015852397427362,-2.623983,5.3442745
15,-1,0.11065300843350233,0.0006929825699963552,-2.9404325,3.254625
15,0,0.11065300843350065,0.0006929825699963552,-2.9404325,3.254625
16,-1,0.11059560698104977,0.00025496963146238625,-5.1120696,1.139813
16,0,0.11059560698104766,0.0002549696314623864,-5.1120696,1.139813
17,-1,0.10827448760995192,-0.0008072181292068404,-11.610527,1.4685344
17,0,0.1082744876099539,-0.0008072181292068405,-11.610527,1.4685344
18,-1,0.10103222698655831,-0.001441306732794491,-3.8353717,4.2164545
18,0,0.10103222698655692,-0.0014413067327944903,-3.8353717,4.2164545
19,-1,0.10147859528906694,-0.0006220671585844798,-2.781216,3.3901258
19,0,0.10147859528906512,-0.0006220671585844798,-2.781216,3.3901258
20,-1,0.10368942955363945,-0.00020344373144167612,-1.5792689,8.15311
20,0,0.10368942955363931,-0.00020344373144167606,-1.5792689,8.15311
21,-1,0.10095808472137889,-0.00023500744504603737,-1.6666156,1.577552
21,0,0.10095808472137917,-0.00023500744504603707,-1.6666156,1.577552
22,-1,0.10052316918429405,-3.0905465844828725e-05,-1.1307209,9.927661
22,0,0.1005231691842936,-3.090546584482874e-05,-1.1307209,9.927661
23,-1,0.09880220102875298,0.00047988024789517484,-0.9339404,6.719097
23,0,0.0988022010287523,0.00047988024789517495,-0.9339404,6.719097
24,-1,0.08934753826574814,0.0005772402600992258,-10.382512,1.9781011
24,0,0.0893475382657479,0.0005772402600992256,-10.382512,1.9781011
25,-1,0.08806596592232664,-0.001283645375872585,-5.6984606,2.2212803
25,0,0.08806596592232749,-0.001283645375872586,-5.6984606,2.2212803
26,-1,0.08731936349619145,-0.0006102947848890263,-1.3738391,7.589438
26,0,0.08731936349619131,-0.0006102947848890257,-1.3738391,7.589438
27,-1,0.08717698513915002,-0.0005823197553379738,-2.9844427,1.3560678
27,0,0.087176985139149,-0.0005823197553379737,-2.9844427,1.3560678
28,-1,0.08492734980919361,-0.00038831001660858896,-2.6943138,1.0411136
28,0,0.08492734980919496,-0.0003883100166085897,-2.6943138,1.0411136
29,-1,0.08305636257870289,-0.00016189496049658577,-9.235278,2.5213282
29,0,0.08305636257870143,-0.0001618949604965858,-9.235278,2.5213282
30,-1,0.08194517117431961,-0.00030206228476307577,-1.3414413,1.4573736
30,0,0.08194517117432089,-0.0003020622847630757,-1.3414413,1.4573736
31,-1,0.0812069714976724,-8.516453259253043e-05,-0.6720387,7.2829967
31,0,0.08120697149767325,-8.516453259253042e-05,-0.6720387,7.2829967
32,-1,0.08075419401644526,-0.0008985027944582864,-11.253723,0.53134376
32,0,0.08075419401644411,-0.0008985027944582865,-11.253723,0.53134376