Description
Dear StormCast Team,
I hope this message finds you well. I am writing to ask for your advice regarding an issue I encountered while running extended forecasts with the StormCast UNet model.
I trained the StormCast UNet model using HRRR and ERA5 datasets from 2018 to 2024, except for 2022, which was reserved for validation. Figure 1 shows the evolution of the loss function during training.
Figure 1. Loss curves during weighted training. The training dataset contains 52800 samples, while the validation dataset contains 8800 samples. MA denotes the moving average.
When conducting forecasts longer than four days, however, unexpected behavior emerges. As an example, I examined an atmospheric river event from Mar 8th to 20th, 2017. The StormCast UNet model was initialized with HRRR data at 00:00 UTC on Mar 8th, 2017, and the ERA5 fields were used as synoptic-scale forcing, updated hourly. The simulation was run for 10 days (Mar 8th to 18th, 2017).
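For clarity, the rollout procedure described above can be sketched roughly as follows. This is only an illustration of the loop structure, not the actual inference script: `step_fn` is a hypothetical stand-in for the trained model's one-hour step, and the toy arrays replace the real HRRR and ERA5 fields.

```python
import numpy as np

def rollout(step_fn, initial_state, forcing, n_hours):
    """Autoregressive rollout: each prediction becomes the next input.

    step_fn  -- hypothetical one-hour model step: (state, forcing_t) -> state
    forcing  -- synoptic-scale fields, one per lead hour (ERA5 in the real setup)
    """
    if len(forcing) < n_hours:
        raise ValueError("need one forcing field per forecast hour")
    state = np.asarray(initial_state, dtype=np.float64)
    states = [state]
    for t in range(n_hours):
        state = step_fn(state, forcing[t])
        # Stop early if the rollout blows up, and report the lead hour.
        if not np.all(np.isfinite(state)):
            raise FloatingPointError(f"non-finite values at lead hour {t + 1}")
        states.append(state)
    return np.stack(states)

# Toy stand-in for the model: relax the state toward the forcing.
def toy_step(state, forcing_t):
    return 0.9 * state + 0.1 * forcing_t

init = np.zeros((4, 4))            # placeholder for the HRRR initial condition
forcing = np.ones((240, 4, 4))     # placeholder for 10 days of hourly forcing
traj = rollout(toy_step, init, forcing, n_hours=240)
```

Keeping the full trajectory makes it possible to check at which lead hour the fields start to drift.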
Figure 2 presents daily-mean fields of specific humidity at 850 hPa (A, D, G, J), wind speed at 850 hPa (B, E, H, K), and precipitation (C, F, I, L). The model produces reasonable simulations during the first 96 lead hours, but the forecast degrades substantially beyond that point.
Figure 2. Daily-mean specific humidity at 850 hPa (A, D, G, J), wind speed at 850 hPa (B, E, H, K), and precipitation (C, F, I, L) for the first (A to C), second (D to F), fourth (G to I) and fifth (J to L) forecast days.
To further diagnose the issue, Figure 3 shows the time series of daily total precipitation averaged over the Western US (30°N to 50°N, 115°W to 130°W). While precipitation during the first three days remains within a reasonable range, it increases unrealistically beginning around Mar 10th, 2017.
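The regional diagnostic can be sketched as below. This is a minimal illustration under several assumptions: hourly precipitation on a grid with 2D latitude and longitude arrays, western longitudes expressed as negative degrees, and an unweighted mean over grid cells (no area weighting). The function and variable names are hypothetical.

```python
import numpy as np

def regional_daily_total(precip_hourly, lat, lon, lat_bounds, lon_bounds):
    """Daily total precipitation averaged over a lat/lon box.

    precip_hourly -- array (n_hours, ny, nx) of hourly accumulations
    lat, lon      -- arrays (ny, nx) in degrees; west longitudes are negative
    """
    mask = (
        (lat >= lat_bounds[0]) & (lat <= lat_bounds[1])
        & (lon >= lon_bounds[0]) & (lon <= lon_bounds[1])
    )
    if not mask.any():
        raise ValueError("no grid points inside the requested box")
    n_days = precip_hourly.shape[0] // 24          # drop any incomplete last day
    hourly_mean = precip_hourly[: n_days * 24][:, mask].mean(axis=1)
    return hourly_mean.reshape(n_days, 24).sum(axis=1)

# Toy grid and a constant 0.5 mm/h field over two days.
lat, lon = np.meshgrid(
    np.linspace(25.0, 55.0, 7), np.linspace(-135.0, -110.0, 6), indexing="ij"
)
precip = np.full((48, 7, 6), 0.5)
daily = regional_daily_total(precip, lat, lon, (30.0, 50.0), (-130.0, -115.0))
```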
We also tested a nudging configuration in which the mesoscale state near the domain boundaries is replaced with HRRR data. However, similar issues persist in the nudged simulations.
Figure 3. Daily-mean precipitation averaged over the western US from StormCast UNet simulations. The no-nudging simulation is initialized from HRRR at 00:00 UTC on Mar 8th, 2017, and constrained by ERA5 data. The nudging simulation is identical except that boundary states are replaced by HRRR fields at each hour.
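The boundary replacement used in the nudging simulation can be illustrated with the sketch below. It assumes a rectangular frame of fixed width in grid points around the domain edge; the width, the function names, and the toy arrays are illustrative assumptions rather than the exact configuration used.

```python
import numpy as np

def boundary_mask(ny, nx, width):
    """Boolean mask that is True within `width` grid points of the domain edge."""
    if width < 0:
        raise ValueError("width must be non-negative")
    mask = np.ones((ny, nx), dtype=bool)
    mask[width:ny - width, width:nx - width] = False
    return mask

def nudge_boundaries(state, reference, width):
    """Replace the boundary frame of `state` with `reference` values.

    state, reference -- arrays (channels, ny, nx); reference plays the role
    of the HRRR fields valid at the same hour.
    """
    mask = boundary_mask(state.shape[-2], state.shape[-1], width)
    return np.where(mask, reference, state)

# Toy example: the interior keeps the model state, the frame takes the reference.
state = np.zeros((2, 6, 6))
reference = np.ones((2, 6, 6))
nudged = nudge_boundaries(state, reference, width=1)
```

In a rollout, this replacement would be applied to the predicted state once per hour before the next model step.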
I am not certain whether this behavior reflects an issue with the simulation setup or a known limitation of the model. If possible, could you please advise whether there are example scripts or best practices for running longer simulations using a local checkpoint?
Separately, I would like to mention that the training of the StormCast diffusion model is currently underway.
Thank you very much for your time and guidance!
Sincerely,
Ziming