@polunma commented Oct 15, 2025

please summarize your changes

Collaborator

@mjschmidt271 left a comment

I am strongly against merging this in its current form and would argue that major questions need to be addressed before including anything resembling these changes.

Also, note that the GPU tests are failing, as is the clang-format check.

(bad_boltzmann * temperature)) *
Kcoll_dust_a3 * icnlx;

// ---- dt guards and small-dt finite-difference limits ----
Collaborator

Why are we doing this? This raises some concerns for me:

  1. Assuming dt < 1e-12 is the chosen setting for a run, does it seem appropriate to effectively reduce dt to zero for these calculations?
    • A dt of that size is not yet small enough to lose precision, and how small would we ever expect dt to be?
  2. This change introduces a spurious infinity in order to give the same result as a comparison we've already done.
    • This is an inappropriate way of accomplishing something much simpler, so can we outline what the goal is?
  3. Also, this leads to guaranteed unnecessary calculations:
    • It introduces 3 ifs to assign the desired values of rate_over_dt_cnt<i> when the result is already determined by the initial if (dt < 1e-12).
    • It introduces 3 calls to min() when the result is also predetermined.
    • Note that this all assumes that the J values are strictly positive, but that seems to be the case because things would look straaaange otherwise.
  4. Now, even assuming that setting dt = 0 is an appropriate choice, nearly all of the new code is unnecessary.
    • It effectively defines division by zero to be equal to infinity (it is not) and then appears to take great pains to not explicitly divide by zero.

Here, for the quantities q<i> = {frzbccnt, frzducnt} and in the case that their corresponding do_<xyz> flag is true, this appears to lead to the same result as

if (deltat <= 1e-12) {
  // x1 will never be smaller than x2, so why do the comparison?
  x1 = z * infinity;
  x2 = z * J<i>;
  q<i> += min(x1, x2);
} else {
  // both of these should be strictly positive, and if I'm not mistaken,
  // for fixed J<i> > 0, we always have x2 >= x1
  // so, again, introducing all of the comparisons is unnecessary
  x1 = z * ((1.0 - haero::exp(-J<i> * deltat)) / deltat);
  x2 = z / deltat;
  q<i> += min(x1, x2);
}
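
To make the argument above concrete, here is a minimal standalone sketch in plain C++ (no Kokkos or haero, and not the MAM4xx code). The names freezing_rate and capped_rate are made up for illustration, and Real is replaced by double. It assumes J >= 0, dt >= 0, and a cap of exactly z / dt. A prefactor smaller than one on the cap, which is what Hetfrz::limfacbc in the BC expression might be (its value is not shown in this conversation), would make the min meaningful again. Under those assumptions, 1 - exp(-J * dt) never exceeds 1, so the min always selects the exponential term, and that term tends to z * J as dt goes to zero, so no infinity is needed to describe the small-dt limit.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical helper: rate at which a number concentration z freezes over
// a step dt, given a nucleation rate J. std::expm1 keeps precision when
// J * dt is tiny, and the dt == 0 case returns the analytic limit z * J.
inline double freezing_rate(double z, double J, double dt) {
  assert(dt >= 0.0 && J >= 0.0);
  if (dt == 0.0) return z * J;
  return z * (-std::expm1(-J * dt)) / dt;  // z * (1 - exp(-J * dt)) / dt
}

// The capped form from the pseudocode above, with a cap of z / dt (dt > 0).
// Because 0 <= 1 - exp(-J * dt) <= 1, the cap is never the smaller value.
inline double capped_rate(double z, double J, double dt) {
  return std::min(z / dt, freezing_rate(z, J, dt));
}
```

With these definitions, capped_rate(z, J, dt) and freezing_rate(z, J, dt) agree for any dt > 0, which is the sense in which the extra comparisons are predetermined.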

// dust_a1
Real coat_ratio1 = vol_shell[0] * (r_bc * 2.0) * fac_volsfc_pcarbon;

const Real n_so4_monolayers_dust = 1.0;
Collaborator

I strongly disagree with this approach. Where do the choices for these in-line constants come from?

Our goal has been to identify and subsequently get rid of all in-line constants in MAM4xx (work in progress--maybe one day). So, adding more is not something I can agree with.

Real rhw_ss_clamped = haero::max(rhw_ss, 1.0 + 1.0e-12);
Real denom = haero::max(bad_boltzmann * temperature * haero::log(rhw_ss_clamped),1.0e-30);
const Real rgdep = 2.0 * vwice * sigma_iv / denom;
//(bad_boltzmann * temperature * haero::log(rhw_ss_clamped));
Collaborator

Please remove leftover comments.

}

KOKKOS_INLINE_FUNCTION
Real get_Aimm(const Real vwice, const Real rgimm, const Real temperature,
Collaborator

This is causing a test failure on GPU. No idea why the CPU test still passes.

Also, please delete leftover comments.

// ---- dt guards and small-dt finite-difference limits ----
constexpr Real eps_dt = 1e-12;
const bool small_dt = haero::abs(deltat) <= eps_dt;
const Real inv_dt = small_dt ? std::numeric_limits<Real>::infinity()
Contributor

@jeff-cohere @bartgol, can we use std::numeric_limits<Real>::infinity() on GPUs (CUDA)? What I recall is that we cannot use standard library features inside GPU kernels, but I do not know if this is okay nowadays.

Collaborator

Yeah, you can use the Kokkos numeric traits.

haero::min(Hetfrz::limfacbc * uncoated_aer_num[Hetfrz::id_bc] / deltat,
uncoated_aer_num[Hetfrz::id_bc] / deltat *
(1.0 - haero::exp(-Jcnt_bc * deltat)));
std::fmin(Hetfrz::limfacbc * uncoated_aer_num[Hetfrz::id_bc] * inv_dt,
Contributor

Should we replace std::fmin with the Kokkos version of min?
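
For context on why the choice of min matters once an infinity is in play, here is a small standalone sketch in plain C++ (no Kokkos or haero; the helper names are hypothetical and the values are illustrative). It relies only on IEEE 754 arithmetic and the documented behavior of std::fmin and std::min: multiplying zero by infinity gives NaN, std::fmin skips a single NaN argument, and a comparison-based min does not. Whether this has anything to do with the GPU failure in this PR is not established here, and the behavior of the haero/Kokkos min is not shown.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>

// Illustrative stand-in for `num * inv_dt` when inv_dt is set to infinity.
inline double scaled_by_inf(double num) {
  return num * std::numeric_limits<double>::infinity();
}

// Documents the IEEE 754 and standard-library behavior with assertions.
inline void demo_checks() {
  const double nan_val = scaled_by_inf(0.0);  // 0 * inf is NaN
  assert(std::isnan(nan_val));
  assert(std::isinf(scaled_by_inf(1.0)));     // positive * inf is inf

  // std::fmin returns the non-NaN argument when exactly one is NaN.
  assert(std::fmin(nan_val, 5.0) == 5.0);

  // std::min(a, b) returns b only if b < a. Comparisons with NaN are
  // false, so the result depends on the argument order.
  assert(std::isnan(std::min(nan_val, 5.0)));
  assert(std::min(5.0, nan_val) == 5.0);
}
```

So swapping one min for another is not purely cosmetic if a zero number concentration can ever meet an infinite inv_dt.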

frzducnt + haero::min(uncoated_aer_num[Hetfrz::id_dst1] / deltat,
uncoated_aer_num[Hetfrz::id_dst1] / deltat *
(1.0 - haero::exp(-Jcnt_dust_a1 * deltat)));
frzducnt + std::fmin(uncoated_aer_num[Hetfrz::id_dst1] * inv_dt,
Contributor

Use the Kokkos min.

frzducnt + haero::min(uncoated_aer_num[Hetfrz::id_dst3] / deltat,
uncoated_aer_num[Hetfrz::id_dst3] / deltat *
(1.0 - haero::exp(-Jcnt_dust_a3 * deltat)));
frzducnt + std::fmin(uncoated_aer_num[Hetfrz::id_dst3] * inv_dt,
Contributor

Use the Kokkos min.
