Obtaining truly modular Observation Covariance #328

@odunbar

During some recent discussions it became clear that, though we have a couple of dispatching covariance definitions in ObservationRecipes, e.g., ScalarCovariance and SVDplusD, they do not act in a way that is truly modular. In particular, the following needs to be true:

"From a top level script, and for a given set of observation samples y_1,y_2,...,y_n, (and some user-inputs e.g. scalings), we should be able to define a variety of covariances (scalar*I, diagonal, SVDplusD) easily."

As the fundamental EKP objects (scalar*I, diagonal, SVDplusD) can all operate similarly on inputs of the form y_1,y_2,...,y_n, there appears to be some inconsistency in how the ObservationRecipe applies these constructions. After discussions with @ph-kev, we tentatively trace this back to an inconsistent mapping from ClimaAnalysis OutputVars to the samples.
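To make the "same samples, different covariance" point concrete, here is a minimal sketch in plain Julia (standard library only, not the EKP or ClimaCalibrate API). The sample matrix `Y`, the scalar value, and the diagonal regularizer `D` are made-up assumptions, and the third construction only mirrors the idea of a low-rank SVD term plus a diagonal; it is not claimed to match how SVDplusD is actually built.

```julia
using LinearAlgebra, Statistics

# Hypothetical data: each column is one sample y_i (d = 4 outputs, n = 3 samples).
Y = [1.0 2.0 3.0;
     2.0 2.0 5.0;
     0.0 1.0 2.0;
     4.0 4.0 7.0]

d, n = size(Y)
Yc = Y .- mean(Y, dims = 2)            # centered samples

# (1) scalar * I: a user-supplied scalar variance (assumed value).
scalar = 0.5
cov_scalar = scalar * I(d)

# (2) diagonal: per-component sample variance.
cov_diag = Diagonal(vec(var(Y, dims = 2)))

# (3) low-rank SVD term plus a diagonal term.
# The sample covariance equals U * Diagonal(S.^2) * U' for the SVD of Yc / sqrt(n - 1).
F = svd(Yc ./ sqrt(n - 1))
r = min(d, n - 1)                      # centered samples have rank at most n - 1
D = 1e-2 * I(d)                        # assumed diagonal regularizer
U_r = F.U[:, 1:r]
cov_svd_plus_d = U_r * Diagonal(F.S[1:r] .^ 2) * U_r' + D
```

All three constructions consume only `Y` and a few user inputs, which is the property the quoted requirement asks for.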

Possible direction:

The CovarianceEstimator types, as they are, do too many different things, including handling different complexities in the stacking of data. Instead, we could separate this task into an ObservationSamplesBuilder that generates an object considered to be i.i.d. samples of data. Building the covariance should then be a simple (modular) task: applying an estimation approach to this set of samples. Any complexity in organizing or stacking dimensions based on OutputVars should be delegated to the ObservationSamplesBuilder.

In effect this is providing a struct that observation dispatches on, rather than the current kwarg-based approach?

I think one important consequence is that the different covariance estimator structures will no longer depend on the state vars. They will use the state only when they are applied via a dispatched method (e.g. one called estimate_covariance).

As a sketch... the user/outer loop defines

cov_estimator = ScalarCovariance(scalar, ...)
obs_sample_builder = ObsSampleBuilder( 
    sample_dims = ["time"], 
    sample_dim_operation = ("aggregate", "30mins"),
)

then later, in ObservationRecipe.covariance(vars, cov_estimator::CE, obs_sample_builder::OSB) where {CE <: ..., OSB <: ...}:

samples = build_observation_samples(obs_sample_builder, vars, ...)

estimate_covariance(
    cov_estimator,
    samples,
    ...
)
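A runnable toy version of this split might look like the following. It reuses the names from the sketch, but every field, the `DiagonalCovariance` type, the `block_length` option, and the plain matrix standing in for stacked OutputVars are assumptions for illustration; block averaging over time is only a stand-in for an operation like ("aggregate", "30mins").

```julia
using LinearAlgebra, Statistics

# Hypothetical types: estimators hold only user inputs, never the data.
abstract type AbstractCovarianceEstimator end

struct ScalarCovariance <: AbstractCovarianceEstimator
    scalar::Float64
end

struct DiagonalCovariance <: AbstractCovarianceEstimator end

# The builder describes how raw data becomes samples; it also holds no data.
struct ObsSampleBuilder
    block_length::Int   # consecutive time steps averaged into one sample
end

# data is an (n_outputs x n_times) matrix; each returned column is one sample.
function build_observation_samples(b::ObsSampleBuilder, data::AbstractMatrix)
    d, nt = size(data)
    n = nt ÷ b.block_length
    samples = Matrix{Float64}(undef, d, n)
    for j in 1:n
        cols = ((j - 1) * b.block_length + 1):(j * b.block_length)
        samples[:, j] = vec(mean(view(data, :, cols), dims = 2))
    end
    return samples
end

# Estimation dispatches on the estimator type and sees only the samples.
estimate_covariance(e::ScalarCovariance, samples::AbstractMatrix) =
    e.scalar * I(size(samples, 1))
estimate_covariance(::DiagonalCovariance, samples::AbstractMatrix) =
    Diagonal(vec(var(samples, dims = 2)))

# Top-level entry point: build samples once, then apply any estimator.
function covariance(data, est::AbstractCovarianceEstimator, builder::ObsSampleBuilder)
    samples = build_observation_samples(builder, data)
    return estimate_covariance(est, samples)
end

data = [1.0 3.0 2.0 4.0 6.0 8.0;
        0.0 2.0 5.0 5.0 1.0 3.0]
builder = ObsSampleBuilder(2)
cov_s = covariance(data, ScalarCovariance(0.1), builder)
cov_d = covariance(data, DiagonalCovariance(), builder)
```

Swapping the estimator changes only the last argument-level choice; the stacking logic lives entirely in the builder, so adding a new estimator means adding one `estimate_covariance` method.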

Happy to discuss

Link to observation/covariance usage

https://github.com/CliMA/ClimaCalibrate.jl/blob/ca6550f7930b4a2f9f535534c851d61f00630786/ext/observation_recipe.jl
