@hydrotian
Contributor

A quick fix to eliminate the negative runoff sent from ROF to OCN. It is activated by setting redirect_negative_qgwl = .true. in user_nl_mosart. Two scenarios are considered:
Scenario A (net_global_qgwl ≥ 0):

  • Proportionally scales down positive qgwl cells
  • Zeros out negative qgwl cells
  • No outlet redistribution

Scenario B (net_global_qgwl < 0):

  • Zeros out all qgwl
  • Redistributes deficit to all outlets proportionally
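
The two scenarios can be sketched in a few lines of Python. This is a minimal illustration, not the MOSART Fortran code: the function name, the list-based inputs, and the choice to weight outlets by their share of total discharge are assumptions, since the description only says the deficit is redistributed "proportionally".

```python
def redirect_negative_qgwl(qgwl, outlet_discharge):
    """Hypothetical sketch: return (new_qgwl, new_outlet_discharge).

    qgwl             -- per-cell qgwl values, may be negative
    outlet_discharge -- per-outlet discharges, assumed non-negative
    """
    positive_sum = sum(q for q in qgwl if q > 0.0)
    net = sum(qgwl)

    if net >= 0.0:
        # Scenario A: scale positive cells so their total equals the net,
        # zero the negative cells, and leave the outlets untouched.
        scale = net / positive_sum if positive_sum > 0.0 else 0.0
        new_qgwl = [q * scale if q > 0.0 else 0.0 for q in qgwl]
        return new_qgwl, list(outlet_discharge)

    # Scenario B: zero every cell and take the deficit out at the outlets.
    # Assumption: each outlet's weight is its share of total discharge.
    total_discharge = sum(outlet_discharge)
    if total_discharge <= 0.0:
        raise ValueError("no outlet discharge to weight the deficit by")
    new_qgwl = [0.0] * len(qgwl)
    new_outlets = [d + net * d / total_discharge for d in outlet_discharge]
    return new_qgwl, new_outlets
```

In both branches the sum of qgwl plus outlet discharge is unchanged, so the sketch conserves the global total while removing every negative qgwl value. Under the weighting assumed here, an outlet can still go negative in Scenario B if the deficit exceeds the total outlet discharge.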

@hydrotian added the BFB (PR leaves answers BFB) and MOSART (Concerning the MOSART river model) labels on Oct 20, 2025
@rljacob requested a review from jonbob on October 20, 2025, 15:45
@proteanplanet
Contributor

@hydrotian Please can you provide a location of the coupled simulation with these changes for us to explore? Also can you provide diagnostics for this simulation? Finally, can you confirm that these changes pass SMS, PET, PEM and ERS tests in a B-case?

@hydrotian
Contributor Author

hydrotian commented Oct 20, 2025

@proteanplanet I don't have a coupled simulation done with this PR yet, but I plan to submit one following my previous Bluetip simulation. This PR passed the e3sm_land_developer test suite, which includes 50+ tests, on Compy, with some Namelist changes and Throughput changes. See the attached test results.
test_results.txt

@rljacob
Member

rljacob commented Oct 20, 2025

Those test results don't have any PET or PEM tests. Try PET.ne4pg2_ne4pg2.I1850CNPRDCTCBCTOP and PEM.ne4pg2_ne4pg2.I1850CNPRDCTCBCTOP

@hydrotian
Contributor Author

@rljacob The PET.ne4pg2_ne4pg2.I1850CNPRDCTCBCTOP simulation failed on Compy with the following error message:

 Opened existing file 
 /compyfs/inputdata/share/domains/domain.lnd.ne4pg2_oQU240.190321.nc          23
 lat/lon grid flag (isgrid2d) is  F
 ncd_inqvid: variable LANDMASK is not on dataset
 decompInit_lnd(): Number of clumps exceeds number of land grid cells
         320         211
 ENDRUN:
 ERROR in decompInitMod.F90 at line 183

It is strange as I did not modify the land model in this PR. Any ideas? Should I try it on Chrysalis instead?

@rljacob
Member

rljacob commented Oct 20, 2025

Yes, try Chrysalis. There may not be a good pelayout for that case on Compy.

integer, allocatable :: outlet_gindices_local(:) ! Local array of global indices of outlets on this task
real(r8), allocatable :: outlet_discharges_local(:) ! Local array of discharges for these outlets
integer :: local_outlet_count
integer, allocatable :: all_outlet_gindices(:) ! Gathered on master
Member

A number of these variables look unused.

Contributor Author

They are removed. Thanks.


! Reproducible sum for negative qgwl
neg_local(1,1) = local_negative_qgwl_sum
call shr_reprosum_calc(neg_local, neg_global, 1, 1, 1, &
Member

You could combine the two calls to shr_reprosum_calc into one call because it looks like the two fields are independent of each other. That would be more efficient than two calls.

Contributor Author

Thanks. The two calls are now combined.
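
The idea behind the reviewer's suggestion, reducing several independent fields in one call, can be illustrated with a small Python sketch. It is not the shr_reprosum_calc interface: the function name and the row layout are hypothetical, and math.fsum (an exactly rounded sum) stands in for the reproducible-sum algorithm, assuming all that matters here is that the result does not depend on summation order.

```python
import math


def combined_global_sum(rows):
    """Hypothetical sketch: sum several independent fields in one pass.

    rows -- one entry per summand; each entry holds one value per field,
            for example [negative_qgwl_part, positive_qgwl_part].
    Returns one total per field.
    """
    if not rows:
        return []
    nflds = len(rows[0])
    # math.fsum is exactly rounded, so each total is independent of the
    # order in which the rows arrive.
    return [math.fsum(row[f] for row in rows) for f in range(nflds)]


# Illustrative values only: field 0 is a negative part, field 1 a positive part.
rows = [[-0.1, 1e16], [-0.2, 1.0], [-0.3, -1e16]]
totals = combined_global_sum(rows)
totals_reordered = combined_global_sum(list(reversed(rows)))
```

Both fields come back from a single reduction, and reordering the rows (as a different task layout might) leaves the totals unchanged.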

@jonbob
Contributor

jonbob commented Oct 22, 2025

I ran a PEM.ne30pg2_r05_IcoswISC30E3r5.WCYCL1850.chrysalis_intel test and it failed the comparison between the two runs. The PEM_Ln9.ne30pg2_r05_IcoswISC30E3r5.WCYCL1850.chrysalis_intel test that's in e3sm_integration passes, but since it's only running 9 steps mosart only runs once in that test

@hydrotian
Contributor Author

> I ran a PEM.ne30pg2_r05_IcoswISC30E3r5.WCYCL1850.chrysalis_intel test and it failed the comparison between the two runs. The PEM_Ln9.ne30pg2_r05_IcoswISC30E3r5.WCYCL1850.chrysalis_intel test that's in e3sm_integration passes, but since it's only running 9 steps mosart only runs once in that test

My PET.ne4pg2_ne4pg2.I1850CNPRDCTCBCTOP test passed, but the PEM.ne4pg2_ne4pg2.I1850CNPRDCTCBCTOP test failed on comparison as well, because the second run couldn't complete. I increased the walltime to 2 hours (maximum for a debug queue on Chrysalis?) but the simulation appeared to stall at some point. Then I tested the baseline (64046ec) and it failed at the same point.

@jonbob
Contributor

jonbob commented Oct 22, 2025

Thanks @hydrotian -- I checked and both runs for my PEM test completed fine, just had different results. I'm running a similar PET test right now

@hydrotian
Contributor Author

@jonbob Thanks. Could you share the cprnc.out report? I want to see which fields are different between the two runs.

@jonbob
Contributor

jonbob commented Oct 22, 2025

Sure, but after five days it ends up with 351 out of 507 fields different. It's at:

/lcrc/group/acme/ac.jwolfe/scratch/chrys/PEM.ne30pg2_r05_IcoswISC30E3r5.WCYCL1850.chrysalis_intel.20251022_120245_ruutak/PEM.ne30pg2_r05_IcoswISC30E3r5.WCYCL1850.chrysalis_intel.20251022_120245_ruutak.cpl.hi.0001-01-06-00000.nc.base.cprnc.out

@jonbob
Contributor

jonbob commented Oct 22, 2025

OK, the similar PET test (PET.ne30pg2_r05_IcoswISC30E3r5.WCYCL1850.chrysalis_intel) passed

@hydrotian
Contributor Author

Thanks, @jonbob. Any insights into the PEM test failure? Would you mind running the same PEM test on the baseline master that I branched from (64046ec)?

@jonbob
Contributor

jonbob commented Oct 22, 2025

No insights from the PEM test -- we would have to do one where we tried to catch the first field that gets different answers. @proteanplanet noticed that you have a routine for sort_outlets_by_discharge_desc but we couldn't see it getting called?

@hydrotian
Contributor Author

Yes. That was from an earlier commit on this branch. I can clean it up.

@rljacob
Member

rljacob commented Oct 22, 2025

To get a better idea of when it diffs, change the river coupling frequency to match the other models. That might allow you to go back to a 9 nstep test. Also change the coupler history output to be every timestep.

@jonbob
Contributor

jonbob commented Oct 22, 2025

@hydrotian -- I set redirect_negative_qgwl = .false. in your branch and the PEM test passes

@hydrotian
Contributor Author

The PEM test has passed now. Both @jonbob and I confirmed that in our separate tests.

Labels

BFB (PR leaves answers BFB), MOSART (Concerning the MOSART river model), v3.1beta
