Copilot AI commented Nov 25, 2025

Plan: Add Local Timestepping Option for Steady-State Convergence

  • Add use_local_timestepping boolean flag to SolverChoice in ERF_DataStruct.H
  • Add input parameter parsing for erf.use_local_timestepping in SolverChoice::init_params
  • Create per-cell timestep MultiFab storage in ERF.H (dt_cell for each level)
  • Modify ERF_ComputeTimestep.cpp to compute local timesteps when enabled
    • Compute local CFL-based dt for each cell
    • Store in dt_cell MultiFab
    • Global dt still computed as minimum for reporting
    • Use actual physical spacing (dxinv[2]/detJ) for vertical direction
    • Add optional smoothing of local timestep field with 27-point stencil
  • Update ERF_TI_no_substep_fun.H to use local timesteps in RK stages
    • Apply cell-specific dt when updating conserved variables
    • Apply face-averaged dt for momentum equations
    • Support both EB and non-EB cases
    • Add bounds checking for face-centered timestep averaging
    • Use mfi.validbox() for proper bounds checking with 0 ghost cells
  • Update ERF_TI_slow_rhs_pre.H to handle variable dt
    • Apply local dt when computing scratch = S_old + dt*RHS for implicit solver
    • Apply local dt when backing out RHS as (S_new - S_old) / dt
    • Handle cell-centered and face-centered variables appropriately
  • Update ERF_SlowRhsPre.cpp to handle variable dt for anelastic mode
    • Apply local dt when computing (S_data - S_old) / dt for cell-centered variables
    • Apply local dt when computing (rho_u - rho_u_old) / dt for face-centered momentum
    • Use averaged dt from adjacent cells for momentum variables
  • Update ERF_PoissonSolve.cpp to handle variable dt in project_momenta
    • Apply local dt when computing gradp from fluxes (gradp = -flux / dt)
    • Apply local dt when updating pressure increment (pp_inc += phi / dt)
    • Use averaged dt from adjacent cells for face-centered gradients
  • Add documentation for the new feature in TimeAdvance.rst
  • Add input parameter documentation in Inputs.rst
  • Add realistic ABL single column example with local timestepping
    • Single column geometry (1x1 horizontal)
    • Vertical grid stretching
    • Geostrophic wind forcing
    • MOST surface layer boundary condition
    • Input sounding with neutral layer, capping inversion, and free atmosphere
    • MRF PBL scheme for turbulence closure
  • Address code review feedback
    • Add bounds checking for array accesses in momentum updates
    • Define LARGE_DT constant instead of magic number 1.e20
    • Use actual physical vertical spacing (dxinv[2]/detJ) accounting for terrain/grid stretching
    • Use mfi.validbox() for accurate bounds checking
    • Handle variable dt in erf_slow_rhs_pre for implicit solver
    • Handle variable dt in ERF_SlowRhsPre.cpp for anelastic mode
    • Handle variable dt in project_momenta for anelastic projection
    • Add timestep field smoothing option
  • Add CFL ramping capability
    • Add erf.cfl_init parameter for initial CFL value
    • Add erf.cfl_ramping_factor parameter for ramping rate
    • Implement ramping logic in ComputeDt
    • Document new parameters in Inputs.rst
  • Add convergence monitoring
    • Add erf.convergence_interval parameter
    • Compute L2 norm of RHS for RhoTheta and 3 momentum components
    • Print convergence metrics every convergence_interval timesteps
    • Document feature in Inputs.rst
    • Enable in test case

Implementation Complete

This PR implements three features to accelerate convergence to steady state:

1. Local Timestepping

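Enables per-cell timestepping in which each cell advances with its own CFL-limited timestep rather than the global minimum. Activated with erf.use_local_timestepping = true. This is most useful when cell sizes or characteristic speeds vary widely across the domain, so that a few restrictive cells would otherwise set the timestep everywhere. Because cells advance at different rates, intermediate states are not time-accurate; only the converged steady state is meaningful.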
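
As a rough illustration of the idea (not the ERF implementation), the per-cell timestep can be sketched in plain C++ without AMReX. The `Cell` struct, `local_dt`, and `global_dt` are hypothetical names introduced only for this example; `LARGE_DT` mirrors the constant mentioned in the plan, with the value 1.e20 taken from the review note.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Fallback for cells with no CFL restriction (value assumed from the plan's "1.e20").
constexpr double LARGE_DT = 1.0e20;

// Hypothetical per-cell state; field names are illustrative, not ERF's.
struct Cell {
    double u, v, w;  // cell-centered velocity components [m/s]
    double c;        // local speed of sound [m/s]
};

// CFL-limited timestep for one cell: dt = cfl / max_d((|u_d| + c) / dx_d).
double local_dt(const Cell& s, double dx, double dy, double dz, double cfl)
{
    assert(dx > 0.0 && dy > 0.0 && dz > 0.0 && cfl > 0.0);
    const double dt_inv = std::max({(std::abs(s.u) + s.c) / dx,
                                    (std::abs(s.v) + s.c) / dy,
                                    (std::abs(s.w) + s.c) / dz});
    return (dt_inv > 0.0) ? cfl / dt_inv : LARGE_DT;
}

// Global timestep kept for reporting: the minimum over all per-cell values.
double global_dt(const std::vector<double>& dt_cell)
{
    assert(!dt_cell.empty());
    return *std::min_element(dt_cell.begin(), dt_cell.end());
}
```

With a global timestep every cell would advance by `global_dt(...)`; with local timestepping each cell advances by its own `local_dt(...)`, so coarse or slow regions take larger pseudo-time steps.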

Timestep Smoothing: Setting erf.smooth_local_dt = true applies a 27-point stencil average to smooth the timestep field after computation. The smoothed timestep at each cell is computed as the average of all valid neighbors in a 3×3×3 stencil, reducing sharp gradients and improving stability while maintaining the convergence benefits of local timestepping.

Current Limitation: Local timestepping is currently only supported with erf.substepping_type = None (no acoustic substepping). Support for acoustic substepping would require redesigning the implicit solver to handle spatially-varying fast timesteps, which is planned for future work.

The implementation now properly handles variable dt in:

  • Time integration (ERF_TI_no_substep_fun.H)
  • Implicit solver RHS computations (ERF_TI_slow_rhs_pre.H)
  • Anelastic mode RHS computations (ERF_SlowRhsPre.cpp)
  • Anelastic projection (ERF_PoissonSolve.cpp - project_momenta)

2. CFL Ramping

Allows gradual ramping of the CFL number from an initial value (erf.cfl_init) to a target value (erf.cfl) using a multiplicative scaling factor (erf.cfl_ramping_factor) applied each timestep. This improves stability during startup while allowing aggressive CFL values during steady running.
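
A minimal sketch of the ramping rule described above, assuming the factor is applied once per timestep and the result is capped at the target. `ramp_cfl` is a hypothetical helper name, not a function from the PR.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper: advance the ramped CFL by one timestep.
// The current value is multiplied by `factor` and capped at `cfl_target`.
double ramp_cfl(double cfl_current, double cfl_target, double factor)
{
    assert(cfl_current > 0.0 && cfl_target > 0.0 && factor >= 1.0);
    return std::min(cfl_current * factor, cfl_target);
}
```

Starting from `cfl_init` and calling `ramp_cfl` each step gives a geometric increase that stops changing once the target is reached. With `factor = 1.0` the CFL stays at its starting value.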

3. Convergence Monitoring

Monitors convergence to steady state by computing and printing the L2 norm of RHS source terms for RhoTheta and the three momentum components. Activated with erf.convergence_interval > 0. Decreasing L2 norms indicate the solution is approaching steady state. This is particularly useful with local timestepping enabled to assess convergence acceleration.

Example usage:

# Local timestepping (requires substepping_type = None)
erf.use_local_timestepping = true
erf.smooth_local_dt = true  # Optional: smooth timestep field
erf.substepping_type = None

# CFL ramping
erf.cfl = 0.8              # Target CFL
erf.cfl_init = 0.1         # Start at CFL = 0.1
erf.cfl_ramping_factor = 1.05  # Increase by 5% each step

# Convergence monitoring
erf.convergence_interval = 10  # Monitor every 10 timesteps

Each feature is documented (TimeAdvance.rst and Inputs.rst) and exercised by the ABL single column test case.

Original prompt

Add a local timestepping option


Copilot AI changed the title from "[WIP] Add local timestepping option" to "Add local timestepping for steady-state convergence acceleration" on Nov 25, 2025
Copilot AI requested a review from ewquon November 25, 2025 17:28
Copilot finished work on behalf of ewquon November 25, 2025 17:28
Contributor

ewquon commented Nov 25, 2025

@copilot For the example test case, use a simulation of an idealized neutral ABL in a single column with vertical grid stretching. The ABL is forced by a fixed geostrophic wind, which is in balance with the Coriolis forces at steady state. Surface shear stress is modeled by a MOST surface layer. The top boundary is a slip wall with fixed theta_grad to match the initial temperature lapse rate. Initialize with an input sounding with constant theta=300K up to 750 m AGL; a capping inversion that is 100 m thick and a temperature increase of 8K; and a lapse rate of 3 K/km in the free atmosphere. Turbulence closure is provided by the MRF PBL scheme.

Author

Copilot AI commented Nov 25, 2025

@copilot For the example test case, use a simulation of an idealized neutral ABL in a single column with vertical grid stretching. The ABL is forced by a fixed geostrophic wind, which is in balance wi...

Replaced the example test case with an ABL single column simulation as requested. The new test case ABL_SingleColumn_LocalTimestep includes:

  • Single column (1x1 horizontal, 64 vertical cells with stretching ratio 1.03)
  • Geostrophic wind forcing (5 m/s) balanced with Coriolis at latitude 45°
  • MOST surface layer (z₀=0.1m) at bottom
  • Slip wall with θ_grad=0.003 K/m at top
  • Input sounding: neutral layer (θ=300K, 0-750m), capping inversion (8K increase, 750-850m), free atmosphere (3 K/km lapse rate, >850m)
  • MRF PBL scheme for turbulence closure

Commit: 23d01e0

Copilot finished work on behalf of ewquon November 25, 2025 18:03
Comment on lines 233 to 329
// Compute local timesteps for steady-state convergence acceleration
if (solverChoice.use_local_timestepping && dt_cell[level]) {
    MultiFab& dt_local = *dt_cell[level];

    // Compute local timestep for each cell based on local CFL constraint
    if (solverChoice.terrain_type == TerrainType::EB) {
        const eb_& eb_lev = get_eb(level);
        const MultiFab& detJ = (eb_lev.get_const_factory())->getVolFrac();

        for (MFIter mfi(dt_local, TilingIfNotGPU()); mfi.isValid(); ++mfi) {
            const Box& bx = mfi.tilebox();
            const Array4<Real>& dt_arr = dt_local.array(mfi);
            const Array4<const Real>& s = S_new.const_array(mfi);
            const Array4<const Real>& u = ccvel.const_array(mfi);
            const Array4<const Real>& vf = detJ.const_array(mfi);

            ParallelFor(bx, [=] AMREX_GPU_DEVICE(int i, int j, int k) noexcept {
                if (vf(i,j,k) > 0.) {
                    const Real rho = s(i, j, k, Rho_comp);
                    const Real rhotheta = s(i, j, k, RhoTheta_comp);
                    Real pressure = getPgivenRTh(rhotheta);
                    Real c = std::sqrt(Gamma * pressure / rho);

                    Real dt_inv = 0.0;
                    if (l_substepping) {
                        if ((nxc > 1) && (nyc==1)) {
                            dt_inv = (amrex::Math::abs(u(i,j,k,0))+c)*dxinv[0];
                        } else if ((nyc > 1) && (nxc==1)) {
                            dt_inv = (amrex::Math::abs(u(i,j,k,1))+c)*dxinv[1];
                        } else {
                            dt_inv = amrex::max((amrex::Math::abs(u(i,j,k,0))+c)*dxinv[0],
                                                (amrex::Math::abs(u(i,j,k,1))+c)*dxinv[1]);
                        }
                    } else {
                        if (nxc > 1 && nyc > 1) {
                            dt_inv = amrex::max((amrex::Math::abs(u(i,j,k,0))+c)*dxinv[0],
                                                (amrex::Math::abs(u(i,j,k,1))+c)*dxinv[1],
                                                (amrex::Math::abs(u(i,j,k,2))+c)*dzinv);
                        } else if (nxc > 1) {
                            dt_inv = amrex::max((amrex::Math::abs(u(i,j,k,0))+c)*dxinv[0],
                                                (amrex::Math::abs(u(i,j,k,2))+c)*dzinv);
                        } else if (nyc > 1) {
                            dt_inv = amrex::max((amrex::Math::abs(u(i,j,k,1))+c)*dxinv[1],
                                                (amrex::Math::abs(u(i,j,k,2))+c)*dzinv);
                        } else {
                            dt_inv = (amrex::Math::abs(u(i,j,k,2))+c)*dzinv;
                        }
                    }
                    dt_arr(i,j,k) = (dt_inv > 0.0) ? cfl / dt_inv : LARGE_DT;
                } else {
                    dt_arr(i,j,k) = LARGE_DT;
                }
            });
        }
    } else {
        for (MFIter mfi(dt_local, TilingIfNotGPU()); mfi.isValid(); ++mfi) {
            const Box& bx = mfi.tilebox();
            const Array4<Real>& dt_arr = dt_local.array(mfi);
            const Array4<const Real>& s = S_new.const_array(mfi);
            const Array4<const Real>& u = ccvel.const_array(mfi);

            ParallelFor(bx, [=] AMREX_GPU_DEVICE(int i, int j, int k) noexcept {
                const Real rho = s(i, j, k, Rho_comp);
                const Real rhotheta = s(i, j, k, RhoTheta_comp);
                Real pressure = getPgivenRTh(rhotheta);
                Real c = std::sqrt(Gamma * pressure / rho);

                Real dt_inv = 0.0;
                if (l_substepping) {
                    if ((nxc > 1) && (nyc==1)) {
                        dt_inv = (amrex::Math::abs(u(i,j,k,0))+c)*dxinv[0];
                    } else if ((nyc > 1) && (nxc==1)) {
                        dt_inv = (amrex::Math::abs(u(i,j,k,1))+c)*dxinv[1];
                    } else {
                        dt_inv = amrex::max((amrex::Math::abs(u(i,j,k,0))+c)*dxinv[0],
                                            (amrex::Math::abs(u(i,j,k,1))+c)*dxinv[1]);
                    }
                } else {
                    if (nxc > 1 && nyc > 1) {
                        dt_inv = amrex::max((amrex::Math::abs(u(i,j,k,0))+c)*dxinv[0],
                                            (amrex::Math::abs(u(i,j,k,1))+c)*dxinv[1],
                                            (amrex::Math::abs(u(i,j,k,2))+c)*dzinv);
                    } else if (nxc > 1) {
                        dt_inv = amrex::max((amrex::Math::abs(u(i,j,k,0))+c)*dxinv[0],
                                            (amrex::Math::abs(u(i,j,k,2))+c)*dzinv);
                    } else if (nyc > 1) {
                        dt_inv = amrex::max((amrex::Math::abs(u(i,j,k,1))+c)*dxinv[1],
                                            (amrex::Math::abs(u(i,j,k,2))+c)*dzinv);
                    } else {
                        dt_inv = (amrex::Math::abs(u(i,j,k,2))+c)*dzinv;
                    }
                }
                dt_arr(i,j,k) = (dt_inv > 0.0) ? cfl / dt_inv : LARGE_DT;
            });
        }
    }
}
Contributor

@copilot Instead of dzinv, use the actual inverse of the physical spacing here, dxinv[2] divided by the local detJ

Author

Updated to use actual physical spacing dxinv[2] / detJ instead of dzinv for the vertical direction. This properly accounts for terrain and grid stretching, which is especially important for the ABL single column test case with vertical grid stretching.

Changes applied to both EB and non-EB cases in the local timestep computation.
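
To make the change concrete, here is a small sketch of the vertical term with the metric factor included. It assumes detJ is the ratio of physical to computational cell height, so the physical spacing is detJ times the computational spacing. The function names are hypothetical and the code does not use AMReX types.

```cpp
#include <cassert>
#include <cmath>

// Inverse physical vertical spacing on a stretched or terrain-following grid.
// dz_inv_comp plays the role of dxinv[2]; detJ is the local metric factor
// (assumed here: physical dz = detJ * computational dz).
double inv_physical_dz(double dz_inv_comp, double detJ)
{
    assert(dz_inv_comp > 0.0 && detJ > 0.0);
    return dz_inv_comp / detJ;
}

// Vertical contribution to 1/dt in the CFL constraint: (|w| + c) / dz_physical.
double vertical_dt_inv(double w, double c, double dz_inv_comp, double detJ)
{
    return (std::abs(w) + c) * inv_physical_dz(dz_inv_comp, detJ);
}
```

Under this assumption, cells compressed near the surface (detJ < 1) get a larger inverse spacing and therefore a smaller local timestep, which a constant dzinv would miss.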

Commit: 516f68d

Copilot AI requested a review from ewquon November 25, 2025 18:23
Copilot finished work on behalf of ewquon November 25, 2025 18:24

const bool use_local_dt = (solverChoice.use_local_timestepping && dt_cell[level]);
const Array4<const Real> dt_local_arr = use_local_dt ? dt_cell[level]->const_array(mfi) : Array4<const Real>{};
const Box& dt_box = use_local_dt ? dt_cell[level]->box(mfi) : Box();
Contributor

@copilot shouldn't this be a mfi.validbox() instead of dt_cell[level]->box(mfi)?

Author

Correct. Changed to use mfi.validbox() for proper bounds checking since dt_cell has 0 ghost cells. This gives the valid region for this MFIter rather than the MultiFab box.

Commit: a2e3a75

Copilot AI requested a review from ewquon November 25, 2025 18:58
Copilot finished work on behalf of ewquon November 25, 2025 18:58
Contributor

ewquon commented Nov 25, 2025

@copilot Add the ability to ramp from erf.cfl_init with an erf.cfl_ramping_factor (a scaling factor for the CFL at that timestep) up to erf.cfl. If erf.cfl_init is not specified, then the target CFL is constant.

Author

Copilot AI commented Nov 25, 2025

@copilot Add the ability to ramp from erf.cfl_init with an erf.cfl_ramping_factor (a scaling factor for the CFL at that timestep) up to erf.cfl. If erf.cfl_init is not specified, then the targe...

Added CFL ramping capability:

  • erf.cfl_init: Initial CFL value (if not specified, no ramping occurs)
  • erf.cfl_ramping_factor: Scaling factor applied each timestep (default: 1.0)
  • CFL ramps from cfl_init to target cfl by multiplying by cfl_ramping_factor each step

Example usage:

erf.cfl = 0.8              # Target CFL
erf.cfl_init = 0.1         # Start at CFL = 0.1
erf.cfl_ramping_factor = 1.05  # Increase by 5% each step

The ramping stops when the target CFL is reached. Documented in Inputs.rst.
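
As a worked check of the example values, assuming the factor r is applied once per timestep starting from cfl_init (the exact step at which ramping begins is an assumption here), the CFL after n steps and the number of steps needed to reach the target are:

```latex
\mathrm{CFL}_n = \min\!\left(\mathrm{CFL}_{\mathrm{init}}\, r^{\,n},\ \mathrm{CFL}_{\mathrm{target}}\right),
\qquad
n^{*} = \left\lceil \frac{\ln\!\left(\mathrm{CFL}_{\mathrm{target}} / \mathrm{CFL}_{\mathrm{init}}\right)}{\ln r} \right\rceil
      = \left\lceil \frac{\ln 8}{\ln 1.05} \right\rceil
      = \left\lceil 42.6 \right\rceil = 43
```

So with cfl_init = 0.1, cfl = 0.8, and a ramping factor of 1.05, the target is reached after roughly 43 timesteps.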

Commit: 0948f1e

Copilot finished work on behalf of ewquon November 26, 2025 05:02
Contributor

ewquon commented Nov 26, 2025

@copilot You haven't addressed the use of dt in erf_slow_rhs_pre in TimeIntegration/ERF_SlowRhsPre.cpp.

Author

Copilot AI commented Nov 26, 2025

@copilot You haven't addressed the use of dt in erf_slow_rhs_pre in TimeIntegration/ERF_SlowRhsPre.cpp.

Modified ERF_SlowRhsPre.cpp to handle variable dt for anelastic mode:

  1. Added use_local_dt and dt_cell parameters to erf_slow_rhs_pre function signature
  2. When computing (cell_data - cell_old) / dt for cell-centered variables (rho, rhotheta), use local dt per cell
  3. When computing (rho_u - rho_u_old) / dt for face-centered momentum, use averaged dt from adjacent cells with bounds checking
  4. Updated call site in ERF_TI_slow_rhs_pre.H to pass the new parameters

This ensures anelastic mode correctly handles spatially-varying timesteps when l_anelastic && (nrk == 1).
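
The face-centered treatment in item 3 can be sketched in one dimension. The arithmetic mean of the two adjacent cells is an assumption (the comment only says "averaged"), and `face_dt` and `momentum_tendency` are hypothetical names used for illustration.

```cpp
#include <cassert>
#include <vector>

// Timestep at x-face i, which sits between cells i-1 and i.
// Faces on the edge of the valid region have only one neighbor with data,
// so the bounds check falls back to that single cell.
double face_dt(const std::vector<double>& dt_cell, int i)
{
    const int n = static_cast<int>(dt_cell.size());
    assert(n > 0 && i >= 0 && i <= n);
    const bool has_lo = (i - 1 >= 0);
    const bool has_hi = (i < n);
    if (has_lo && has_hi) {
        return 0.5 * (dt_cell[i - 1] + dt_cell[i]);
    }
    return has_lo ? dt_cell[i - 1] : dt_cell[i];
}

// Momentum tendency recovered at a face: (rho_u_new - rho_u_old) / dt_face.
double momentum_tendency(double rho_u_new, double rho_u_old, double dt_face)
{
    assert(dt_face > 0.0);
    return (rho_u_new - rho_u_old) / dt_face;
}
```

The fallback matters because dt_cell carries no ghost cells, so a face on the boundary of the valid box cannot read a neighbor outside it.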

Commit: fd9bb62

Copilot finished work on behalf of ewquon November 26, 2025 05:09
Contributor

ewquon commented Nov 26, 2025

@copilot The anelastic path is still not fully functional with local timestepping. project_momenta (ERF_PoissonSolve.cpp) called at the end of no_substep_fun still uses a constant "slow_dt"

Author

Copilot AI commented Nov 26, 2025

@copilot The anelastic path is still not fully functional with local timestepping. project_momenta (ERF_PoissonSolve.cpp) called at the end of no_substep_fun still uses a constant "slow_dt"

Modified project_momenta in ERF_PoissonSolve.cpp to handle variable dt for anelastic projection:

  1. Added use_local_dt and dt_cell parameters to function signature
  2. When computing gradp from fluxes (gradp = -flux / dt), use local dt for each face with averaging from adjacent cells
  3. When updating pressure increment (pp_inc += phi / dt), use local dt per cell
  4. Updated function declaration in ERF.H with default parameters for backward compatibility
  5. Updated call site in ERF_TI_no_substep_fun.H to pass local timestepping parameters

This completes the anelastic path support for local timestepping.

Commit: fa7b46e

Copilot finished work on behalf of ewquon November 26, 2025 06:51
Contributor

ewquon commented Nov 26, 2025

@copilot After computing the local dt, add an option to smooth the dt field such that dt_cell[i,j,k] = mean(dt_cell[i-1:i+2, j-1:j+2, k-1:k+2]) -- nominally a 27-pt average (3×3×3) except at boundaries.

Author

Copilot AI commented Nov 26, 2025

@copilot After computing the local dt, add an option to smooth the dt field such that dt_cell[i,j,k] = mean(dt_cell[i-1:i+2, j-1:j+2, k-1:k+2]) -- nominally a 27-pt average (3×3×3) except at boundaries.

Added timestep smoothing option with erf.smooth_local_dt parameter:

  1. Added smooth_local_dt boolean flag to SolverChoice
  2. Implemented 27-point stencil averaging: dt_smooth(i,j,k) = mean(dt[i-1:i+1, j-1:j+1, k-1:k+1]), with inclusive index ranges
  3. Handles boundaries properly by using only valid neighbors
  4. Applied after local timestep computation in ERF_ComputeTimestep.cpp
  5. Documented in Inputs.rst with mathematical formula
  6. Updated test case and README to demonstrate usage

The smoothing reduces sharp gradients in the timestep field, improving stability while maintaining the benefits of local timestepping for steady-state convergence.
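
The stencil average in items 2 and 3 can be sketched as follows. This is a plain C++ illustration under stated assumptions: `Field3D` is a hypothetical stand-in for the dt_cell storage, the mean is arithmetic, and the result is written to a separate array so that every cell reads unsmoothed neighbors.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Minimal dense 3D field (x-fastest storage); a hypothetical stand-in for dt_cell.
struct Field3D {
    int nx, ny, nz;
    std::vector<double> data;

    Field3D(int nx_, int ny_, int nz_, double value = 0.0)
        : nx(nx_), ny(ny_), nz(nz_),
          data(static_cast<std::size_t>(nx_) * ny_ * nz_, value)
    {
        assert(nx_ > 0 && ny_ > 0 && nz_ > 0);
    }

    bool contains(int i, int j, int k) const {
        return i >= 0 && i < nx && j >= 0 && j < ny && k >= 0 && k < nz;
    }
    std::size_t index(int i, int j, int k) const {
        return (static_cast<std::size_t>(k) * ny + j) * nx + i;
    }
    double& operator()(int i, int j, int k)       { return data[index(i, j, k)]; }
    double  operator()(int i, int j, int k) const { return data[index(i, j, k)]; }
};

// Average each cell with its neighbors in a 3x3x3 stencil, skipping
// neighbors that fall outside the field.
Field3D smooth_dt(const Field3D& dt)
{
    Field3D out(dt.nx, dt.ny, dt.nz);
    for (int k = 0; k < dt.nz; ++k)
    for (int j = 0; j < dt.ny; ++j)
    for (int i = 0; i < dt.nx; ++i) {
        double sum = 0.0;
        int count = 0;
        for (int dk = -1; dk <= 1; ++dk)
        for (int dj = -1; dj <= 1; ++dj)
        for (int di = -1; di <= 1; ++di) {
            if (dt.contains(i + di, j + dj, k + dk)) {
                sum += dt(i + di, j + dj, k + dk);
                ++count;
            }
        }
        out(i, j, k) = sum / count;  // count >= 1: the cell itself is always valid
    }
    return out;
}
```

In a single column (1x1 horizontal) the stencil collapses to at most three points in the vertical, so the smoothing there is a simple vertical running mean.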

Commit: dcff2e4

Copilot finished work on behalf of ewquon November 26, 2025 19:25
Contributor

ewquon commented Nov 27, 2025

@copilot Verify consistency in changes since commit dcff2e4. Then, add convergence monitoring at the end of each timestep based on the L2 norm of the RHS source term for RhoTheta and the 3 components of momentum.

Author

Copilot AI commented Nov 27, 2025

@copilot Verify consistency in changes since commit dcff2e4. Then, add convergence monitoring at the end of each timestep based on the L2 norm of the RHS source term for RhoTheta and the 3 components ...

Verified consistency of changes since dcff2e4 - all commits after that were made by @ewquon and add improvements to the local timestepping implementation (harmonic mean smoothing, blending, diffusive timescales).

Added convergence monitoring feature:

  1. New parameter erf.convergence_interval - controls how often convergence metrics are printed (similar to sum_interval)
  2. Computes L2 norms of RHS source terms for:
    • RhoTheta (thermal energy equation)
    • X-momentum
    • Y-momentum
    • Z-momentum
  3. Prints convergence metrics at specified intervals with format:
    CONVERGENCE: Step 100 Time 5.000000 L2(RHS): RhoTheta=1.234e-05 Xmom=2.345e-06 Ymom=3.456e-06 Zmom=4.567e-06
    
  4. Documentation added to Inputs.rst with parameter table and usage examples
  5. Test case updated to enable convergence monitoring every 10 timesteps

Decreasing L2 norms indicate the solution is approaching steady state. This feature is particularly useful for monitoring local timestepping convergence acceleration.
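
A minimal sketch of the metric and the reporting interval, in plain C++. Whether the PR normalizes the norm by cell count or volume is not stated, so this version is the unnormalized square root of the sum of squares. `l2_norm` and `should_report` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Unnormalized L2 norm of an RHS field: sqrt(sum of squares).
double l2_norm(const std::vector<double>& rhs)
{
    double sum_sq = 0.0;
    for (double r : rhs) {
        sum_sq += r * r;
    }
    return std::sqrt(sum_sq);
}

// Report on steps that are multiples of the interval; a non-positive
// interval disables monitoring (matching "activated with interval > 0").
bool should_report(int step, int convergence_interval)
{
    assert(step >= 0);
    return convergence_interval > 0 && step % convergence_interval == 0;
}
```

One norm would be computed per monitored quantity (RhoTheta and each momentum component). A steady state corresponds to all four norms dropping toward zero.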

Commit: 960e756

Copilot AI changed the title from "Add local timestepping and CFL ramping for steady-state convergence acceleration" to "Add local timestepping, CFL ramping, and convergence monitoring for steady-state convergence acceleration" on Nov 27, 2025
Copilot finished work on behalf of ewquon November 27, 2025 07:14