Skip to content

Conversation

@billsacks
Copy link
Member

@billsacks billsacks commented Jul 9, 2025

Resolves #3313

Description of changes

The previous logic caused problems if "GRAIN" appeared in two (or more) strings, where one was a substring of the other. For example, in LREPRSTRUCT_Ly1_P128x1.f10_f10_mg37.I1850Clm50BgcCrop.derecho_gnu.clm-ciso--clm-cropMonthOutput, before this replacement, one line contained 'GRAINN_TO_FOOD' and a later line contained (among other things) "'GRAINN_TO_FOOD_PERHARV', 'GRAINN_TO_FOOD_ANN'". This was problematic because the first replacement of GRAINN_TO_FOOD incorrectly led to replacements in the later strings as well.

This new logic should solve this issue.

Specific notes

Contributors other than yourself, if any:

CTSM Issues Fixed (include github issue #):
Resolves #3313

Are answers expected to change (and if so in what way)? Possible field diffs for the LREPRSTRUCT test

Any User Interface Changes (namelist or namelist defaults changes)? No

Does this create a need to change or add documentation? Did you do so? No

Testing performed, if any:

  • Created & built LREPRSTRUCT_Ly1_P128x1.f10_f10_mg37.I1850Clm50BgcCrop.derecho_gnu.clm-ciso--clm-cropMonthOutput, compared user_nl_clm with before; but did not try running the test

IMPORTANT NOTE: This DOES show as an answer change because of field list differences for these tests:

LREPRSTRUCT_Ly1_P128x1.f10_f10_mg37.I1850Clm50BgcCrop.derecho_gnu.clm-ciso--clm-cropMonthOutput
LREPRSTRUCT_Ly2_P128x1.f10_f10_mg37.I1850Clm45BgcCrop.derecho_gnu.clm-ciso--clm-cropMonthOutput

Obviously this doesn't fundamentally change the results for the tests though.

@billsacks
Copy link
Member Author

I compared user_nl_clm before and after the changes in this PR, in LREPRSTRUCT_Ly1_P128x1.f10_f10_mg37.I1850Clm50BgcCrop.derecho_gnu.clm-ciso--clm-cropMonthOutput. It is identical except:

Old has: "... 'REPRODUCTIVE1N_TO_FOOD', 'REPRODUCTIVE2N_TO_FOOD_PERHARV', 'REPRODUCTIVE1N_TO_FOOD', 'REPRODUCTIVE2N_TO_FOOD_ANN', ..."

New has, in that location: "... 'REPRODUCTIVE1N_TO_FOOD_PERHARV', 'REPRODUCTIVE2N_TO_FOOD_PERHARV', 'REPRODUCTIVE1N_TO_FOOD_ANN', 'REPRODUCTIVE2N_TO_FOOD_ANN', ..."

I think that indicates that this resolves #3313 .

@billsacks billsacks force-pushed the fix_lreprstruct_grain_replacement branch from b76e47b to 2dd5c25 Compare July 9, 2025 00:47
@billsacks
Copy link
Member Author

(I just force-pushed to clean up python formatting.)

@ekluzek
Copy link
Collaborator

ekluzek commented Jul 9, 2025

@billsacks and @slevis-lmwg I'm thinking we should put this on b4b-dev and bring it in on the b4b-dev cycle. The alternative would be to have @slevis-lmwg merge it into ctsm5.3.062 -- but I think he's already ran testing there so it would slow that tag up. How does that sound to the both of you?

Copy link
Contributor

@slevis-lmwg slevis-lmwg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving, thank you for the quick turnaround @billsacks!

@slevis-lmwg
Copy link
Contributor

@ekluzek I agree with you, I would rather let this come in with b4b-dev.

@slevis-lmwg slevis-lmwg added bug something is working incorrectly bfb bit-for-bit labels Jul 9, 2025
@ekluzek ekluzek changed the base branch from master to b4b-dev July 9, 2025 17:23
@ekluzek ekluzek self-assigned this Jul 9, 2025
@ekluzek
Copy link
Collaborator

ekluzek commented Jul 9, 2025

@billsacks the one thing I wonder about this now, is that we could add a unit tester to validate that the conversion of the user_nl_clm file is correct. You tested by hand for the current test, is it worth adding a tester to make sure that continues to be the case?

The previous logic caused problems if "GRAIN" appeared in two (or more)
strings, where one was a substring of the other. For example, in
LREPRSTRUCT_Ly1_P128x1.f10_f10_mg37.I1850Clm50BgcCrop.derecho_gnu.clm-ciso--clm-cropMonthOutput,
before this replacement, one line contained 'GRAINN_TO_FOOD' and a later
line contained (among other things) "'GRAINN_TO_FOOD_PERHARV',
'GRAINN_TO_FOOD_ANN'". This was problematic because the first
replacement of GRAINN_TO_FOOD incorrectly led to replacements in the
later strings as well.

This new logic should solve this issue.

Resolves ESCOMP#3313
@billsacks billsacks force-pushed the fix_lreprstruct_grain_replacement branch from 2dd5c25 to 96d44ea Compare July 9, 2025 17:48
@billsacks
Copy link
Member Author

I just force-pushed to be based on the b4b-dev branch.

@billsacks
Copy link
Member Author

the one thing I wonder about this now, is that we could add a unit tester to validate that the conversion of the user_nl_clm file is correct. You tested by hand for the current test, is it worth adding a tester to make sure that continues to be the case?

My feeling is that this isn't worth the effort... I don't think it would be too hard, but I also don't think it would provide a ton of value at this point. But if you feel it's worthwhile, I can do it: I don't have strong feelings one way or the other.

@ekluzek
Copy link
Collaborator

ekluzek commented Jul 9, 2025

@billsacks and I met and talked about this a bit. Adding a unit-test here could be done in maybe a few hours from Bill, so not a lot of time.

But, reasons it might not be that useful:

  • We aren't going to refactor this code (test nor Fortran code) anytime soon
  • The amount of code in the test is small (we have unit tests for more complicated system tests as we should)
  • Unit testing is most valuable during development
  • Unit testing is also valuable to improve the code and make what it does more understandable to other developers -- this code is clean enough as is

@ekluzek
Copy link
Collaborator

ekluzek commented Jul 9, 2025

OK, I ran the two tests and they PASS. They differ from baseline just because fieldlists differ, but that's expected.

LREPRSTRUCT_Ly1_P128x1.f10_f10_mg37.I1850Clm50BgcCrop.derecho_gnu.clm-ciso--clm-cropMonthOutput
LREPRSTRUCT_Ly2_P128x1.f10_f10_mg37.I1850Clm45BgcCrop.derecho_gnu.clm-ciso--clm-cropMonthOutput

@ekluzek ekluzek merged commit 92d2a5c into ESCOMP:b4b-dev Jul 9, 2025
6 checks passed
@ekluzek ekluzek deleted the fix_lreprstruct_grain_replacement branch July 9, 2025 22:20
@samsrabin
Copy link
Member

samsrabin commented Jul 10, 2025

@ekluzek:

@billsacks and I met and talked about this a bit. Adding a unit-test here could be done in maybe a few hours from Bill, so not a lot of time.

But, reasons it might not be that useful:

  • We aren't going to refactor this code (test nor Fortran code) anytime soon
  • The amount of code in the test is small (we have unit tests for more complicated system tests as we should)
  • Unit testing is most valuable during development
  • Unit testing is also valuable to improve the code and make what it does more understandable to other developers -- this code is clean enough as is

I'd like to gently push back against a few of these points:

We aren't going to refactor this code (test nor Fortran code) anytime soon

Changes to this Python code and the related FORTRAN code aren't the only things that can cause a test like this to break. This is illustrated by the fact that this bug was only illuminated during #2445, which was about separating instantaneous and non-instantaneous output fields.

Unit testing is most valuable during development

But not just development of the code you're testing; it's also helpful in debugging other development. If that string replacement had been unit-tested, it might have helped @slevis-lmwg diagnose the real issue, rather than resolving it in a way that had side effects. (Maybe not in this case, since it wasn't obvious that it was a Python issue, but in general this is something to consider.)

The amount of code in the test is small

That doesn't mean it doesn't deserve unit-testing, because small ≠ easy to write with no latent bugs. Unit testing could have caught this bug during initial development, when the code was even smaller than it is now.

I'm of the philosophy that if something breaks, we need to add a test to make sure it doesn't break again—or at least, to make it easy to identify why it's breaking. The more we do that, the more efficient we'll get at writing tests, and the lower the time cost of adding tests.

@billsacks
Copy link
Member Author

@samsrabin - I'll own some of the points here that you're pushing back against, or at least variations on those points. I expressed feelings to @ekluzek that, at this point, the time spent developing unit tests of this replacement code didn't feel justifiable, but I didn't / don't have strong feelings on that, and if others feel it would be good to have unit tests of this, I'd support that.

Part of my own calculus was that I need to focus on ESMF stuff for the upcoming release. How would you feel about adding a unit test of this? I got as far as thinking about how I'd do this:

  1. Pull replace_grain into a top-level function
  2. Extract the line, user_nl_clm_text = re.sub(r"GRAIN\w*", replace_grain, user_nl_clm_text) into a new top-level function in the module. This will be the function that would be unit tested.
  3. Do a manual test of building the LREPRSTRUCT test to ensure that the user_nl_clm file is still the same as before after steps (1) and (2).
  4. Create a unit test module for LREPRSTRUCT
  5. Create at least one unit test for the extracted function, giving it a string as input and testing the resulting string. At a minimum this should have the situation mentioned in my comment above, where one line contains 'GRAINN_TO_FOOD' and a later line contains (among other things) "'GRAINN_TO_FOOD_PERHARV', 'GRAINN_TO_FOOD_ANN'". (The later line could include a more extensive line that matches what's in that line in the current user_nl_clm.)

@samsrabin
Copy link
Member

Thanks, @billsacks! I definitely agree with you not being burdened with the unit-testing on this one; I just wanted to push back on those points in a more general sense. That unit test plan sounds perfect; I'll file an issue quoting it.

@ekluzek
Copy link
Collaborator

ekluzek commented Jul 14, 2025

@samsrabin thanks so much for the discussion here. I do in general feel like we do way too little unit-testing. But, I also think we need to think about the testing that we add and make sure it's beneficial. And as you point out I wasn't sure that we should have @billsacks be the one to do the unit testing. But, you raise some really good thoughtful points here...

We aren't going to refactor this code (test nor Fortran code) anytime soon

Changes to this Python code and the related FORTRAN code aren't the only things that can cause a test like this to break. This is illustrated by the fact that this bug was only illuminated during #2445, which was about separating instantaneous and non-instantaneous output fields.

This is a really good point. Bugs can show up for unrelated things in untested code, and this is just an example of this.

The amount of code in the test is small

That doesn't mean it doesn't deserve unit-testing, because small ≠ easy to write with no latent bugs. Unit testing could have caught this bug during initial development, when the code was even smaller than it is now.

Yes, actually when you think about sometimes it's concise code that's tricky that is the most problematic. And it's hard to find because it's just a small bit.

I'm of the philosophy that if something breaks, we need to add a test to make sure it doesn't break again—or at least, to make it easy to identify why it's breaking. The more we do that, the more efficient we'll get at writing tests, and the lower the time cost of adding tests.

I do highly endorse this philosophy as well. It's a practice that I heard about that has always made sense to me. I do like to try to put it in practice as much as we can. And I'm concerned when we aren't able to put it in practice.

But, thanks again for the discussion.

@ekluzek
Copy link
Collaborator

ekluzek commented Aug 15, 2025

We talked this over in the CTSM SE meeting yesterday, and we concur with @samsrabin points above. Very small bits of code can be problematic especially when untested. And since bugs can often be created in place distant from code that's changed, having better testing can not only be about one particular place in the code -- it can be about having better code that is less likely to have changes elsewhere trigger problems. That is what happened here, the changes that caused us to notice this problem were completely unrelated to these bits of code.

The solid reason for not adding the testing though that was unstated above is that @billsacks should be the one that adds the unit testing here.

We also expressed the desire to shift our culture from "having to justify testing" to "having to justify NOT adding testing". I think that would be a good shift in our thinking.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bfb bit-for-bit bug something is working incorrectly

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

Hist fields REPRODUCTIVE1N_TO_FOOD_PERHARV and _ANN lose their suffixes in LREPR* tests

4 participants