Skip to content

Conversation

@billsacks
Copy link
Member

Previously, any namelist variable needed to appear in the namelist_definition xml file. With the changes here, the namelist_definition file can contain a generic variable name that gives metadata for one or more final namelist variables with different names. This is done by adding multiple copies of a namelist_definition entry, one for each final name.

The use case for this is having configuration variables whose names are not known ahead of time, but instead are determined at build-namelist time based on some xml variables. Specifically, I'm planning to use this to support the addition of arbitrary water tracers, each of which will have one or more configuration variables whose name contains the name of that water tracer, where the list of water tracer names is set in an xml variable. The type and default value will be the same for each water tracer, so we can take this metadata from a single, generic entry in the namelist_definition file, but then we need to support an add_default call for each individual, dynamically-named variable.

Specifically, in CMEPS, I plan to have a namelist_definition entry like this:

  <entry id="wtracer_initial_ratio" multi_variable_entry="true">
    <type>real</type>
    <category>water_tracers</category>
    <group>ALLCOMP_attributes</group>
    <desc>
      Initial tracer ratio for each water tracer in this simulation.

      Note that there will be a separate entry for each water tracer, with a name like
      wtracer_initial_ratio_MyWaterTracer.
    </desc>
    <values>
      <value>0.0</value>
    </values>
  </entry>

But there won't actually be a wtracer_initial_ratio variable in the nuopc.runconfig file. Instead, for each water tracer in this simulation (defined via a list in an XML variable in env_run.xml), the nuopc.runconfig file will have a variable like wtracer_initial_ratio_MyWaterTracer. This will be accomplished in the CMEPS buildnml file by setting up a dict like:

multi_variable_mappings = {"wtracer_initial_ratio": ["wtracer_initial_ratio_MyWaterTracer", "wtracer_initial_ratio_AnotherWaterTracer"]}

(though this will be done dynamically based on the entries in the new XML variable).

Without the changes in this PR, I believe that the main way to accomplish something like this would be to have an array of values. However, that feels more error-prone from a user perspective, and perhaps more importantly, would lead to less consistency with how I plan to set things up for water isotopes - for which we will have predefined names, following a more standard approach.

Checklist

  • My code follows the style guidlines of this proejct (black formatting)
  • I have performed a self-review of my own code
  • My changes generate no new warnings
  • I have added tests that excerise my feature/fix and existing tests continue to pass
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding additions and changes to the documentation

Testing

I ran scripts_regression_tests on my Mac. Failures are the same as on master.

Previously, any namelist variable needed to appear in the
namelist_definition xml file. With the changes here, the
namelist_definition file can contain a generic variable name that gives
metadata for one or more final namelist variables with different names.
This is done by adding multiple copies of a namelist_definition entry,
one for each final name.

The use case for this is having configuration variables whose names are
not known ahead of time, but instead are determined at build-namelist
time based on some xml variables. Specifically, I'm planning to use this
to support the addition of arbitrary water tracers, each of which will
have one or more configuration variables whose name contains the name of
that water tracer, where the list of water tracer names is set in an xml
variable. The type and default value will be the same for each water
tracer, so we can take this metadata from a single, generic entry in the
namelist_definition file, but then we need to support an add_default
call for each individual, dynamically-named variable.
@codecov
Copy link

codecov bot commented Oct 13, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 31.00%. Comparing base (d856a52) to head (aa0b7ea).
⚠️ Report is 3 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #4877      +/-   ##
==========================================
+ Coverage   30.98%   31.00%   +0.01%     
==========================================
  Files         264      264              
  Lines       38663    38677      +14     
  Branches     8384     8390       +6     
==========================================
+ Hits        11979    11991      +12     
- Misses      25459    25466       +7     
+ Partials     1225     1220       -5     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Comment on lines -92 to -93
self._entry_nodes = []
self._entry_ids = []
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that I have removed _entry_nodes from NamelistDefinition: this wasn't being used, and would be problematic now that there could be the same node with different names, since the name might not match the id.

I have also renamed _entry_ids to _var_names, since this no longer necessarily gives the "id" attribute but instead may give the rename.

self._group_names[name] = self.get_group_name(node)

def set_nodes(self, skip_groups=None):
def _set_node_values(self, names, node):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that I have made this function private. I have changed its interface (changing the name argument to now be a list of names - which will often be a list of a single item, but could be more) and want to make sure it isn't being used externally (which, as far as I can tell, it isn't)… and it seems like it's not meant to be used externally.

Comment on lines +135 to +145
# The reason we need add_default False for a multi_variable_entry is that the
# returned default_nodes are just the nodes themselves, without the mappings
# in multi_variable_mappings - so users of these default_nodes examine their
# ids, which is problematic for a multi_variable_entry. The implication is
# that add_default will need to be called explicitly for the final names in a
# multi_variable_entry.
add_default = (
not skip_default_entry
and not per_stream_entry
and not multi_variable_entry
)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that I don't return variables with multi_variable_entry in the list of defaults because use of these default nodes relies on the id matching the name. My original hope was that I could change the id in the node, but my sense from reading some code is that any modification of nodes assumes you want to write back to their originating file. If I'm wrong about that and it's actually possible to modify the nodes in memory without modifying the original xml file, I'd be happy to hear about that. Without that ability, I think returning these in the default list would require a change to how this default list is encoded and used, such as making it a dictionary mapping a name to a node, but it felt like that would be a pretty invasive change.

At least for my purposes, it's not a big deal that these variables don't appear in the default list: We need to know all of the names in buildnml anyway in order to add them to the multi_variable_mappings argument, so it's easy enough to loop through them and explicitly call add_default on each of them.

@gold2718
Copy link

I'm trying to get my head around what problem is being solved here. It would be nice to have an issue where the problem and potential solultions could be discussed. For now, I have three questions:

  1. One solution I see is to simply have a namelist variable for each potential water tracer with unused ones set to zero. Why is this not possible or practical?
  2. Why not use ParamGen which provides very flexible namelist building (cf. the CAM-SIMA buildnml)?
  3. We currently build namelist documentation from the namelist_definition.xml files. How would this work with the solution proposed in this PR?

@billsacks
Copy link
Member Author

Thanks for your questions, @gold2718 .

Following some additional discussion with @nusbaume and @jiang-zhu , I have more clarity on the requirements for water isotopes/tracers, and it turns out that this feature seems relatively unimportant, so I'm closing this PR.

@gold2718 let me know if you're still interested in answers to your questions... I'm happy to try to answer them if you are, but I figure they might be irrelevant since I'm closing this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants