Replace a single loop, and a few changes are needed for the small kernel PR #473

odiazib · 2025-08-05T16:21:01Z

Replace a single loop with nested parallel loops and make other changes needed for PR: E3SM-Project/E3SM#7567

Also testing: SMS_Ln5_P8x1.ne4pg2_oQU480.F2010-EAMxx-MAM4xx.frontier_craygnu-mphipcc.eamxx-L72

…instead of raw pointers. Replace the single loop with two nested parallel loops.

codecov · 2025-10-29T23:43:55Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 93.33%. Comparing base (34c4d5c) to head (7013325).

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #473      +/-   ##
==========================================
- Coverage   93.43%   93.33%   -0.10%     
==========================================
  Files         303      303              
  Lines       25177    24232     -945     
  Branches     2766     2768       +2     
==========================================
- Hits        23523    22618     -905     
+ Misses       1654     1614      -40

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

src/mam4xx/mo_gas_phase_chemdr.hpp

overfelt · 2025-10-30T14:35:59Z

src/mam4xx/mo_photo.hpp

+        [&](int i, Real &lsum) {
+          lsum += 0.5 * (o3_col_deltas[i] + o3_col_deltas[i + 1]);
+        },
+        o3_col_dens(kk));


Is it not possible to replace these two nested Kokkos loops with a single Kokkos::parallel_scan?

Yes, it is possible. Does Kokkos::parallel_scan perform two nested loops internally? I prefer to keep it in the current form because it is easier to understand how the operation is performed.

I do not know the internal implementation of Kokkos::parallel_scan but I assume it is a slight modification of Kokkos::parallel_sum that assigns the intermediate values to the output array.

Is this code based on chemistry/mozart/mo_photo.F90? If so it seems the setcol function is incorrect anyway. The initial case for the 0th entry has the factor of 1/2 incorrect. When I write out what the result of o3_col_dens is after simplifying the arithmetic, I think this might be a trivial scan followed by a trivial element-by-element sum. Seems to me the Fortran implementation obscures what the result is and could have been simplified.

I am not 100% sure if this routine is from chemistry/mozart/mo_photo.F90. @singhbalwinder or @jeff-cohere, do you recall if we directly ported this routine from the Fortran code? If not, I believe we should update it.

Here's the commit for the original code, annotated by git blame:

9797b647a (Jeffrey N. Johnson 2023-11-28 16:32:20 -0800 286) KOKKOS_INLINE_FUNCTION 9797b647a (Jeffrey N. Johnson 2023-11-28 16:32:20 -0800 287) void setcol(const Real o3_col_deltas[mam4::nlev + 1], ColumnView &o3_col_dens) { 9797b647a (Jeffrey N. Johnson 2023-11-28 16:32:20 -0800 288) // we can probably accelerate this with a parallel_scan, but let's just do 9797b647a (Jeffrey N. Johnson 2023-11-28 16:32:20 -0800 289) // a simple loop for now 9797b647a (Jeffrey N. Johnson 2023-11-28 16:32:20 -0800 290) constexpr int nlev = mam4::nlev; 9797b647a (Jeffrey N. Johnson 2023-11-28 16:32:20 -0800 291) o3_col_dens(0) = 0.5 * (o3_col_deltas[0] + o3_col_deltas[1]); 9797b647a (Jeffrey N. Johnson 2023-11-28 16:32:20 -0800 292) for (int k = 1; k < nlev; ++k) { 9797b647a (Jeffrey N. Johnson 2023-11-28 16:32:20 -0800 293) o3_col_dens(k) = 9797b647a (Jeffrey N. Johnson 2023-11-28 16:32:20 -0800 294) o3_col_dens(k - 1) + 0.5 * (o3_col_deltas[k] + o3_col_deltas[k + 1]); 9797b647a (Jeffrey N. Johnson 2023-11-28 16:32:20 -0800 295) } 9797b647a (Jeffrey N. Johnson 2023-11-28 16:32:20 -0800 296) }

And here's the log message for that commit:

commit 9797b647a848da2ce0eb3d3f73409f0ab1253f03 (HEAD) Author: Jeffrey N. Johnson <[email protected]> Date: Tue Nov 28 16:32:20 2023 -0800 Ported set_ub_col and setcol, and added validation tests (still in progress).

It would be good to check this against the original Fortran code, of course!

And here's the original code. As @overfelt has pointed out, there's an incorrect factor of 1/2 in there, so it should definitely be updated:

subroutine setcol( col_delta, & ! in col_dens ) ! out !--------------------------------------------------------------- ! ... set the column densities !--------------------------------------------------------------- use chem_mods, only : nabscol implicit none !--------------------------------------------------------------- ! ... dummy arguments !--------------------------------------------------------------- real(r8), intent(in) :: col_delta(:,0:,:) ! layer column densities [molecules/cm^2] real(r8), intent(out) :: col_dens(:,:,:) ! column densities [ 1/cm**2 ] !--------------------------------------------------------------- ! the local variables !--------------------------------------------------------------- integer :: kk, km1, mm ! indicies !--------------------------------------------------------------- ! ... compute column densities down to the ! current eta index in the calling routine. ! the first column is o3 and the second is o2. !--------------------------------------------------------------- do mm = 1,nabscol col_dens(:,1,mm) = col_delta(:,0,mm) + .5_r8 * col_delta(:,1,mm) ! kk=1 do kk = 2,pver km1 = kk - 1 col_dens(:,kk,mm) = col_dens(:,km1,mm) + .5_r8 * (col_delta(:,km1,mm) + col_delta(:,kk,mm)) enddo enddo end subroutine setcol

Sorry for the bug!

Thanks, @jeff-cohere, and no worries. @singhbalwinder, I will fix this bug in a follow-up PR because this fix will be NBFB.

issue: #482

Thanks, all! Yes, we should fix this in a follow-up PR. I also think we should use parallel_scan here instead of the nested loops, if possible.

src/mam4xx/gas_chem_mechanism.hpp

overfelt · 2025-10-30T15:02:02Z

We have discussed getting rid of the *.data() calls and pass around the Kokkos::Views instead. Is it too much trouble to do some of that in this code while you are refactoring some of it anyway?

odiazib · 2025-10-30T16:40:45Z

We have discussed getting rid of the *.data() calls and pass around the Kokkos::Views instead. Is it too much trouble to do some of that in this code while you are refactoring some of it anyway?

I have a PR to replace *.data() here. For some reason, we did not merge this PR. I will take a look to see if we can merge it.

singhbalwinder

Nice work! I have just a clarifying question and a suggestion.

singhbalwinder · 2025-10-30T22:14:32Z

src/mam4xx/gas_chem_mechanism.hpp

+  for (int i = 4; i < 31; ++i) {
+    loss[i] = het_rates[i + 1] * y[i + 1];
+    prod[i] = zero;
+  }


Thanks for removing the redundant + signs. Could you clarify why the loop runs up to 31? The extent of prod and loss arrays is clscnt4, where constexpr int clscnt4 = 30; // number of species in implicit class? It will be better to replace 31 (or any other extent) using variables like clscnt4.

It should be clscnt4. Thanks for finding this bug.

singhbalwinder · 2025-10-30T22:27:09Z

src/mam4xx/mo_photo.hpp

+        [&](int i, Real &lsum) {
+          lsum += 0.5 * (o3_col_deltas[i] + o3_col_deltas[i + 1]);
+        },
+        o3_col_dens(kk));


Thanks, all! Yes, we should fix this in a follow-up PR. I also think we should use parallel_scan here instead of the nested loops, if possible.

singhbalwinder · 2025-10-30T22:31:14Z

src/validation/mo_sethet/sethet.cpp

    for (int mm = 0; mm < gas_pcnst; ++mm) {
      for (int kk = 0; kk < pver; ++kk) {
-        qin_host[mm](kk) = qin_in[count];
+        qin_host(kk, mm) = qin_in[count];


Whenever we are switching between arrays and views, we should test it on Frontier so that the model doesn't break on it.

odiazib added 2 commits October 29, 2025 17:22

Replace arrays of ColumnView with a 2D view. Use template parameters …

ab0e899

…instead of raw pointers. Replace the single loop with two nested parallel loops.

Fixing compilation errors in validation tests and applying clang-format.

3bd41e1

odiazib force-pushed the oscar/small_kernels2 branch from 6f549b8 to 3bd41e1 Compare October 29, 2025 23:22

odiazib changed the title ~~Oscar/small kernels2~~ Replace a single loop, and a few changes are needed for the small kernel PR Oct 29, 2025

odiazib marked this pull request as ready for review October 29, 2025 23:26

odiazib requested review from overfelt and singhbalwinder and removed request for singhbalwinder October 29, 2025 23:26

overfelt reviewed Oct 30, 2025

View reviewed changes

src/mam4xx/mo_gas_phase_chemdr.hpp Show resolved Hide resolved

overfelt reviewed Oct 30, 2025

View reviewed changes

src/mam4xx/gas_chem_mechanism.hpp Show resolved Hide resolved

MAM4xx: Adding the size of the view.

a14c823

MAM4xx: Adding for loop.

998eb20

singhbalwinder requested changes Oct 30, 2025

View reviewed changes

MAM4xx: Fixing bug.

7013325

Replace a single loop, and a few changes are needed for the small kernel PR #473

Are you sure you want to change the base?

Replace a single loop, and a few changes are needed for the small kernel PR #473

Uh oh!

Conversation

odiazib commented Aug 5, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

codecov bot commented Oct 29, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

odiazib Oct 30, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jeff-cohere Oct 30, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

overfelt commented Oct 30, 2025

Uh oh!

odiazib commented Oct 30, 2025

Uh oh!

singhbalwinder left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

odiazib commented Aug 5, 2025 •

edited

Loading

codecov bot commented Oct 29, 2025 •

edited

Loading

odiazib Oct 30, 2025 •

edited

Loading

jeff-cohere Oct 30, 2025 •

edited

Loading