WeatherQuest Calibration pipeline with weekly support#1747

Open
costachris wants to merge 12 commits into main from cc/wxquest_v4_final
@costachris
Member

Purpose

To-do

Content


  • I have read and checked the items on the review checklist.

disable O3 timevarying, land periodic cal LAI

Use MODIS LAI climatology (land) and albedo from era5 (bucket) for historical runs. Initialize sea ice with sea ice temperatures from era5.

Fix sea ice (use surface temp for init). Use config outputs consistent with era5 val comparison.

lower ice min temp and revert CL grid

update config

Remove oceananigans and climaocean. Pin GPU compilers.

Add subseasonal calibration pipeline

Update calibration pipeline for weatherquest weekly calibrations on derecho

Add and use 1 day calibration. Fix some bugs in observations map and daily data handling.

add data normalizations and modify noise.

Add land mask, modify noise, add land parameter, write to scratch and deal with HDF errs

Add support and functionality for TransformInversion, Inversion. Precompute large matrices on cpu as initial step

add gravity wave to calibration, move to copies3

add options, add more parameters, log top level run script

fix normalization bug for multi variable option. Add ice albedo to toml.

fix precip units, add logging, start trying 7 day

weekly runs

Add NH average option, remove precip, allow specification of noise by var for NH option, add ensemble spatial plotting

update plots

Add script for analyzing parameters

clean rebase

update manifests

add TOA flux, update prior
@costachris costachris force-pushed the cc/wxquest_v4_final branch 2 times, most recently from 52210eb to f00776f Compare February 21, 2026 23:41
@costachris costachris changed the title from "Cc/wxquest v4 final" to "WeatherQuest Calibration pipeline with weekly support" Feb 22, 2026
…week comparison period to 1 month CERES data for now.
@@ -0,0 +1,53 @@
#!/bin/bash
Member Author

@costachris costachris Feb 23, 2026

Will likely remove this file (in favor of run_full_calibration.sh).

sample_date_ranges,
# Monthly run: 7-day spinup + 21-day calibration = 28 days total
# Model starts at (Jan 8 - 7 days) = Jan 1, ends at (Jan 8 + 21 days) = Jan 29
extend = Dates.Day(21),
Member Author

@costachris costachris Feb 23, 2026

This is a temporary hack to compare 3 weeks of model data (after 1 week of spinup) to 1 month of CERES data.
I'm adding IC files that start a week before the start date, so we can then probably run 1 week of spinup + 1 month of comparison.
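
The date arithmetic in the code comment above can be checked with a minimal sketch. `simulation_window` is a hypothetical helper, not a function from this PR, and the year is arbitrary; only the 7-day spinup and the 21-day `extend` come from the diff.

```julia
using Dates

# Hypothetical helper (not part of the PR): compute the simulated window for
# one calibration sample from its nominal start date.
function simulation_window(start_date::Date; spinup = Day(7), extend = Day(21))
    model_start = start_date - spinup   # the run begins one spinup period early
    model_end = start_date + extend     # the comparison period ends here
    return (; model_start, model_end, total = model_end - model_start)
end

# The year is an arbitrary choice for illustration.
w = simulation_window(Date(2025, 1, 8))
println(w.model_start, " to ", w.model_end, " (", w.total, ")")
```

With these defaults the window runs from Jan 1 to Jan 29, which is 28 simulated days, matching the comment in the diff.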

Comment on lines +564 to +571
else
    # Compute from h_elem (spectral element grid)
    h_elem = get(config_dict, "h_elem", 12)
    # Default formula: h_elem * 4 panels * 3 (spectral degree)
    nlon = h_elem * 4 * 3
    nlat = nlon ÷ 2
    @info "Using model grid from h_elem=$h_elem: $(nlon)×$(nlat)"
end
Member

Use the actual model output grid instead of recomputing it from h_elem.
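
One way to read this suggestion, as a rough sketch: take the grid size from the coordinate vectors of an already-loaded output variable and treat the h_elem formula only as a cross-check. `grid_size_from_output` is a hypothetical name, and the coordinate vectors below are synthetic stand-ins for whatever the model output actually provides.

```julia
# Hypothetical helper (not part of the PR): derive the lon/lat grid size from
# the coordinate vectors of a model output rather than from h_elem.
function grid_size_from_output(lons::AbstractVector, lats::AbstractVector)
    (isempty(lons) || isempty(lats)) && error("output has no lon/lat coordinates")
    return (nlon = length(lons), nlat = length(lats))
end

# Synthetic coordinates standing in for real output (144 × 72 points).
lons = range(-180.0, 180.0; length = 145)[1:(end - 1)]
lats = range(-90.0, 90.0; length = 72)
grid = grid_size_from_output(lons, lats)

# The formula from the diff, kept only as a sanity check.
h_elem = 12
formula = (nlon = h_elem * 4 * 3, nlat = (h_elem * 4 * 3) ÷ 2)
grid == formula || @warn "Output grid differs from the h_elem formula" grid formula
```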

end

"""
time_average_with_date(var, date)
Member

I can add a function for adding a singleton dimension in ClimaAnalysis, so you don't need to do this.

end
end

g_ensemble = EnsembleBuilder.get_g_ensemble(g_ens_builder)
Member

This pattern should not be followed since you don't know if the G ensemble matrix is fully completed. See

g_ens = EnsembleBuilder.get_g_ensemble(g_ens_builder)
if count(isnan, g_ens) > 0.9 * length(g_ens)
    error("Too many NaNs")
end
return EnsembleBuilder.is_complete(g_ens_builder) ? g_ens :
       error("G ensemble matrix is not completed")
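
The same two checks can be exercised on a plain matrix. The sketch below assumes a hypothetical `checked_g_ensemble` helper whose Boolean `is_complete` argument stands in for `EnsembleBuilder.is_complete(g_ens_builder)`; the 0.9 threshold is taken from the snippet above.

```julia
# Hypothetical stand-alone version of the check in the review comment.
# `is_complete` stands in for EnsembleBuilder.is_complete(g_ens_builder).
function checked_g_ensemble(
    g_ens::AbstractMatrix,
    is_complete::Bool;
    max_nan_fraction = 0.9,
)
    nan_fraction = count(isnan, g_ens) / length(g_ens)
    nan_fraction > max_nan_fraction &&
        error("Too many NaNs: $(round(100 * nan_fraction; digits = 1))% of entries")
    is_complete || error("G ensemble matrix is not completed")
    return g_ens
end

g_ok = [1.0 2.0; 3.0 NaN]   # 25% NaN, below the threshold
g_checked = checked_g_ensemble(g_ok, true)
```

Returning the matrix only after both checks pass means a caller cannot pick up a partially filled G ensemble by accident.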

Comment on lines +419 to +422
covar_estimator = ClimaCalibrate.ObservationRecipe.ScalarCovariance(;
    scalar = Float64(CALIBRATION_NOISE_SCALAR),
    use_latitude_weights = true,
)
Member

I realized this function doesn't work too well since sample_date_ranges is overloaded: it determines both the calibration period and which samples are chosen for the observation. For example, the samples might span 50 years, but you are only running a calibration for a couple of years.

Comment on lines +9 to 10
# Surface temperature/pressure
"pr" => "kg m^-2 s^-1",
Member

This should be simplified once we put all of this in a data loader.


Shifts the time axis by the largest period in the date range, sets units, and
windows to the date range. Calls `largest_period` (defined per-pipeline) to
determine the shift.
Member

There shouldn't be a need for shifting the times.



# Lazy-loaded CERES data loader (initialized on first use)
const _CERES_LOADER = Ref{Union{Nothing, CalibrationTools.CERESDataLoader}}(nothing)
Member

Creating a CERESDataLoader shouldn't be that expensive since it only involves opening the NetCDF file and parsing its metadata.

Comment on lines +277 to +283
var = ClimaAnalysis.window(var, "time"; left = month_start, right = month_end)

# Get the data for this month (should be single time point)
times = ClimaAnalysis.times(var)
if length(times) == 0
    error("No CERES data found for $short_name in month of $start_date")
end
Member

By default, window always gets the nearest dates. I don't think it is possible for the length of times to be zero.

Comment on lines +288 to +290
if length(times) > 1
    var = ClimaAnalysis.average_time(var)
end
Member

I don't like this since it is introducing a fallback that shouldn't exist.

Comment on lines +276 to +277
# CERES dates are at start of month, so window to get the correct month
var = ClimaAnalysis.window(var, "time"; left = month_start, right = month_end)
Member

Instead of windowing, this should use select with MatchValue().
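
The idea behind the last three comments is exact matching on the time axis: select the one entry whose date equals the requested month start, and fail loudly otherwise, so no nearest-date or averaging fallback is needed. The sketch below illustrates that behavior on a plain vector of dates. `exact_time_index` is a hypothetical helper and does not use the ClimaAnalysis API; the dates are made up.

```julia
using Dates

# Hypothetical helper (not the ClimaAnalysis API): return the index of the
# time entry that equals `target` exactly, and error if there is none.
function exact_time_index(times::AbstractVector{DateTime}, target::DateTime)
    idx = findfirst(==(target), times)
    idx === nothing && error("No time entry exactly at $target")
    return idx
end

# Made-up monthly time axis with entries at the start of each month.
ceres_times = [DateTime(2025, m, 1) for m in 1:3]
idx = exact_time_index(ceres_times, DateTime(2025, 2, 1))
```

Because a miss raises an error instead of snapping to a neighboring month, the zero-length check and the `average_time` fallback in the diff would no longer be needed under this approach.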

Comment on lines +452 to +455
Compute global mean and std for each variable across all date ranges.
Returns a Dict mapping short_name -> (mean, std).
Uses latitude-weighted averaging for physically meaningful statistics.
"""
Member

This can probably be implemented in ClimaCalibrate or ClimaAnalysis instead.
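
For reference, the statistic the docstring describes can be sketched for a single lon × lat field as below. This is an illustration under stated assumptions, not the PR's implementation: `lat_weighted_mean_std` is a hypothetical name, weights are taken as cos(latitude), NaNs are skipped, and pooling across date ranges is left out.

```julia
# Hypothetical helper: cos(latitude)-weighted mean and standard deviation of a
# lon × lat field, ignoring NaN entries. Latitudes are in degrees.
function lat_weighted_mean_std(field::AbstractMatrix, lats::AbstractVector)
    size(field, 2) == length(lats) || error("expected a lon × lat field")
    # One weight per grid point; every longitude at a latitude shares cosd(lat).
    weights = [cosd(lat) for _ in axes(field, 1), lat in lats]
    valid = .!isnan.(field)
    w = weights[valid]
    x = field[valid]
    μ = sum(w .* x) / sum(w)
    σ = sqrt(sum(w .* (x .- μ) .^ 2) / sum(w))
    return (mean = μ, std = σ)
end

# Synthetic field: 1.0 at ±60° latitude and 3.0 at the equator.
lats = [-60.0, 0.0, 60.0]
field = repeat([1.0 3.0 1.0], 4, 1)
stats = lat_weighted_mean_std(field, lats)
```

The weights down-weight high-latitude cells, which cover less area on a regular lon/lat grid, so the equator value counts twice as much as each 60° row in this example.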

3 participants