
Replication Issue: Lorenz '63/Lorenz '96 Benchmarks from Cleary et al. (2021) #396

@Monoxido45

Description

I am currently working on replicating the results from the Calibrate-Emulate-Sample (CES) framework as presented in the Journal of Computational Physics (Cleary et al., 2021). While I have successfully implemented the Calibration phase (using both EKI and EKS), I am encountering significant hurdles in replicating the Emulation and Sampling phases using the current Julia implementation.

Questions & Issues

  1. Missing Reference Examples: I have searched the main branch of the CalibrateEmulateSample.jl repository but cannot find the exact scripts used for the Lorenz '63 or Lorenz '96 benchmarks shown in the paper. Is there a legacy branch or a separate research repository that contains the original experiment configurations and results?
  2. Emulator Stability & Convergence: When using the GaussianProcess emulator on the ensemble trajectory, I am encountering some numerical instability and poor hyperparameter convergence. Even when using SVD decorrelation, the posterior results deviate from the paper's figures.
  3. Parameter Uncorrelation: I am seeking clarification on the "parameter uncorrelated" version mentioned in the documentation. I have attempted to implement this via the encoder_schedule, but the approximation quality remains poor.

Environment

  • Julia Version: v1.12.0
  • CalibrateEmulateSample.jl (CES) Version: v0.7.0

Minimal Working Example (L63 Replication)

Below is my implementation for replicating the Lorenz '63 experiment. It includes the forward model, the variance estimation, a "Gold Standard" RWMH sampler for reference, and the full CES pipeline.

using Pkg
Pkg.activate(".")

using Random
using LinearAlgebra
using Distributions
using CalibrateEmulateSample
using CalibrateEmulateSample.Emulators
using CalibrateEmulateSample.MarkovChainMonteCarlo
using CalibrateEmulateSample.Utilities
using CalibrateEmulateSample.DataContainers
using CalibrateEmulateSample.EnsembleKalmanProcesses
using CalibrateEmulateSample.ParameterDistributions
using OrdinaryDiffEq
using Statistics
using JLD2, Printf

const EKP = CalibrateEmulateSample.EnsembleKalmanProcesses

# 1. Lorenz '63 System Definition
function lorenz_63(du, u, p, t)
    r, b = p
    du[1] = 10 * (u[2] - u[1])
    du[2] = u[1] * (r - u[3]) - u[2]
    du[3] = u[1] * u[2] - b * u[3]
end

# Forward map to compute 9 moments over window T
function get_l63_stats(theta_phys, u_start; T=10.0, T0=30.0, dt=0.001)
    p = [theta_phys[1], theta_phys[2]]
    # Spin-up phase
    prob_spin = ODEProblem(lorenz_63, u_start, (0.0, T0), p)
    sol_spin = solve(prob_spin, Euler(), dt=dt, saveat=[T0])
    # Observation phase
    prob_obs = ODEProblem(lorenz_63, sol_spin.u[end], (0.0, T), p)
    sol_obs = solve(prob_obs, Euler(), dt=dt, saveat=dt)
    states = hcat(sol_obs.u...)
    x, y, z = states[1, :], states[2, :], states[3, :]
    stats = [mean(x), mean(y), mean(z), mean(x.^2), mean(y.^2), mean(z.^2), 
             mean(x.*y), mean(y.*z), mean(z.*x)]
    return stats, copy(sol_obs.u[end])
end
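
# NOTE: estimate_gamma_obs_faithful is called below but was omitted from this snippet.
# The following is a minimal sketch of what it does (the window count and window
# chaining here are illustrative): Γ_obs is estimated as the sample covariance of the
# 9 time-averaged moments over many consecutive windows of length T at the true
# parameters, following the internal-variability argument in Cleary et al. (2021).
function estimate_gamma_obs_faithful(theta_phys; n_windows = 200, T = 10.0, T0 = 30.0)
    stats_samples = zeros(9, n_windows)
    u_current = [1.0, 1.0, 1.0]
    for k in 1:n_windows
        # long spin-up for the first window only, then chain from the previous end state
        spin = (k == 1) ? T0 : 1.0
        stats_samples[:, k], u_current = get_l63_stats(theta_phys, u_current; T = T, T0 = spin)
    end
    return cov(stats_samples, dims = 2)
end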

# 2. Main CES Experiment
function run_final_l63_experiment()
    seed = 125
    rng = MersenneTwister(seed)

    m0, Σ0 = [3.3, 1.2], diagm([0.15^2, 0.5^2])
    prior = ParameterDistribution(Dict(
        "distribution" => Parameterized(MvNormal(m0, Σ0)),
        "constraint" => [no_constraint(), no_constraint()],
        "name" => "log_rho_beta"))

    theta_true_phys = [28.0, 8/3]
    y_obs, _ = get_l63_stats(theta_true_phys, [1.0, 1.0, 1.0])
    
    println("Estimating Gamma_obs...")
    Γ_obs = estimate_gamma_obs_faithful(theta_true_phys)
    obs_names = ["x_y_z_x2_y2_z2_xy_yz_zx"]

    # Constructing observation object
    truth = EKP.Observation(
        Dict("samples" => y_obs,
            "covariances" => Γ_obs,
            "names" => obs_names
        )
    )

    # CALIBRATE (EKS with Persistence)
    println("Running EKS Calibration")
    N_ens, N_iter = 100, 30
    # construct_initial_ensemble returns an ensemble in the parameterization used by the prior (log-space)
    initial_params = construct_initial_ensemble(rng, prior, N_ens)
    # create EnsembleKalmanProcess from (initial_ensemble, observation, process)
    eks = EnsembleKalmanProcess(
        initial_params,
        truth,
        EKP.Sampler(prior),
        verbose=true,
    )
    ens_states = [[1.0, 1.0, 1.0] for _ in 1:N_ens]
    err_eks = zeros(N_iter)

    for i in 1:N_iter
        # map the log-space ensemble to physical parameters (rho, beta)
        params_phys = exp.(get_ϕ_final(prior, eks))
        g_ens = zeros(9, N_ens)
        for j in 1:N_ens
            g_ens[:, j], ens_states[j] = get_l63_stats(params_phys[:, j], ens_states[j]; T=10.0)
        end
        # update ensemble with the forward evaluations
        update_ensemble!(eks, g_ens)

        err_eks[i] = EKP.get_error(eks)[end]  # misfit error tracked by EKP
        println("Iteration: $i, Error: $(err_eks[i])")

        # Simple diagnostics: print parameter error relative to truth
        if i == 1 || i % 5 == 0
            current_mean_phys = vec(mean(exp.(get_ϕ_final(prior, eks)), dims=2))
            param_err = norm(current_mean_phys - theta_true_phys)
            println("Iter $i | param_err=$(round(param_err, digits=6))")
        end
    end

    
    # Get ensemble points from the last EKS iteration (log-space and physical)
    last_ensemble_log = get_ϕ_final(prior, eks)
    last_ensemble_phys = exp.(last_ensemble_log)

    println("Captured last-iteration ensemble: $(size(last_ensemble_phys, 2)) members.")
    Γy = get_obs_noise_cov(eks)
    # EMULATE (GP with SVD Decorrelation)
    println("Training Emulator using last-iteration ensemble as training data...")
    # Using same GPJL arguments as in CES paper
    gppackage = Emulators.GPJL()
    pred_type = Emulators.YType()

    mlt = GaussianProcess(
        gppackage;
        kernel=nothing, # use default squared exponential kernel
        prediction_type=pred_type,
        noise_learn=true,
    )

    # Encoder schedule: decorrelate inputs with the ensemble sample covariance,
    # decorrelate outputs against the observational noise covariance (full variance retained)
    retain_var = 1.0
    encoder_schedule = [
        (decorrelate_sample_cov(), "in"),
        (decorrelate_structure_mat(retain_var=retain_var), "out")
    ]

    # use ensemble members from EKS iterations min_iter:max_iter as emulator training data
    min_iter = 5
    max_iter = N_iter
    min_iter = clamp(min_iter, 1, max_iter)
    input_output_pairs = Utilities.get_training_points(eks, min_iter:max_iter)

    emulator = Emulator(
        mlt,
        input_output_pairs;
        encoder_schedule=encoder_schedule,
        encoder_kwargs=(; obs_noise_cov=Γy),
    )
    optimize_hyperparameters!(emulator)
    # SAMPLE (CES Emulator Posterior)
    println("Sampling CES Posterior...")
    u0 = vec(mean(get_inputs(input_output_pairs), dims=2))
    println("initial parameters: ", u0)

    ksampler = RWMHSampling()
    # First let's run a short chain to determine a good step size
    mcmc = MCMCWrapper(
        ksampler,
        y_obs,
        prior,
        emulator;
        init_params=u0
    )
    new_step = optimize_stepsize(mcmc; init_stepsize=0.1, N=2000, discard_initial=0)

    # Now begin the actual MCMC
    println("Begin MCMC - with step size ", new_step)
    chain = MarkovChainMonteCarlo.sample(
        mcmc,
        40_000;
        stepsize=new_step,
        discard_initial=1_000
    )
    posterior = MarkovChainMonteCarlo.get_posterior(mcmc, chain)

    post_mean = mean(posterior)
    post_cov = cov(posterior)
    println("post_mean: ", post_mean)
    println("post_cov: ", post_cov)

    ces_samples_eks = vcat([get_distribution(posterior)[name] for name in get_name(posterior)]...)
    
    # GOLD STANDARD SAMPLER
    println("Sampling Gold Standard (Reference)...")
    gold_samples_log = run_l63_gold_standard_independent(
        y_obs,
        Γ_obs,
        prior;
        rng=rng,
        N_samples=50000
    )
    # Thin and remove burn-in from the gold-standard samples
    burnin = 500
    n_keep = 10000
    N_total_gold = size(gold_samples_log, 2)
    available = N_total_gold - burnin

    # thinning step for gold samples
    jump = max(1, fld(available, n_keep))
    idx = collect(burnin+1:jump:N_total_gold)

    # if not enough indices, decrease jump until we have >= n_keep (or jump == 1)
    while length(idx) < n_keep && jump > 1
        jump -= 1
        idx = collect(burnin+1:jump:N_total_gold)
    end
    # trim to exactly n_keep samples
    if length(idx) > n_keep
        idx = idx[1:n_keep]
    end

    # thinning step for the CES samples (burn-in already discarded by the sampler)
    N_total_ces = size(ces_samples_eks, 2)
    jump = max(1, fld(N_total_ces, n_keep))
    idx_ces = collect(1:jump:N_total_ces)

    # if not enough indices, decrease jump until we have >= n_keep (or jump == 1)
    while length(idx_ces) < n_keep && jump > 1
        jump -= 1
        idx_ces = collect(1:jump:N_total_ces)
    end
    # trim to exactly n_keep samples
    if length(idx_ces) > n_keep
        idx_ces = idx_ces[1:n_keep]
    end

    ces_samples_eks = ces_samples_eks[:, idx_ces]
    gold_samples_log = gold_samples_log[:, idx]

    println("Experiment finished. Samples saved to l63_final_comparison.jld2")

    # include the version string passed to the function in the saved filename (sanitise it)
    ver_str = replace(string(version), r"[^\w\-.]" => "_")
    jld_path = joinpath(@__DIR__, "l63_final_comparison_$(ver_str).jld2")
    ces_physical_eks = exp.(ces_samples_eks)
    gold_physical = exp.(gold_samples_log)
    
    return ces_physical_eks, gold_physical, theta_true_phys, last_ensemble_phys
end
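
# NOTE: run_l63_gold_standard_independent is called inside the experiment above but was
# omitted from this snippet. Minimal sketch of what it does, assuming a plain random-walk
# Metropolis in log-parameter space against the *true* forward model, with a Gaussian
# likelihood built from y_obs and Γ_obs and the same log-space prior as the CES run
# (the proposal step size `stepsize` below is illustrative).
function run_l63_gold_standard_independent(y_obs, Γ_obs, prior; rng = Random.default_rng(),
                                           N_samples = 50_000, stepsize = 0.05)
    # underlying MvNormal of the (log-space) prior
    prior_dist = get_distribution(prior)[get_name(prior)[1]]
    Γ_inv = inv(Γ_obs)

    # unnormalized log-posterior: Gaussian misfit in the 9 moments plus log-prior
    function log_post(theta_log)
        stats, _ = get_l63_stats(exp.(theta_log), [1.0, 1.0, 1.0])
        misfit = stats - y_obs
        return -0.5 * dot(misfit, Γ_inv * misfit) + logpdf(prior_dist, theta_log)
    end

    theta = rand(rng, prior_dist)
    d = length(theta)
    samples = zeros(d, N_samples)
    lp_curr = log_post(theta)
    n_accept = 0
    for k in 1:N_samples
        prop = theta .+ stepsize .* randn(rng, d)
        lp_prop = log_post(prop)
        if log(rand(rng)) < lp_prop - lp_curr
            theta, lp_curr = prop, lp_prop
            n_accept += 1
        end
        samples[:, k] = theta
    end
    println("Gold standard acceptance rate: ", round(n_accept / N_samples, digits = 3))
    return samples
end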


# Execute
ces_s, gold_s, t_true, ensemble_phys = run_final_l63_experiment()

# checking means and variances
ces_means_eks = vec(mean(ces_s, dims=2))
ces_vars_eks = vec(var(ces_s, dims=2))
# ces_means_eki = vec(mean(ces_eki, dims=2))
# ces_vars_eki = vec(var(ces_eki, dims=2))

gold_means = vec(mean(gold_s, dims=2))
gold_vars = vec(var(gold_s, dims=2))

eks_means = vec(mean(ensemble_phys, dims=2))
eks_vars = vec(var(ensemble_phys, dims=2))
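
# Quick textual comparison of the posterior moments (CES vs gold standard vs final EKS
# ensemble); illustrative printing only, using the quantities computed above.
for (i, pname) in enumerate(["rho", "beta"])
    @printf("%-4s | CES: mean=%7.3f var=%8.4f | gold: mean=%7.3f var=%8.4f | EKS: mean=%7.3f var=%8.4f\n",
            pname, ces_means_eks[i], ces_vars_eks[i],
            gold_means[i], gold_vars[i], eks_means[i], eks_vars[i])
end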

Results

The estimated CES posteriors are significantly overconfident relative to the gold-standard reference, even with parameter decorrelation applied. I would appreciate any guidance on whether there is a branch containing the original paper's code, or whether there is an issue with my current setup.
