1 change: 1 addition & 0 deletions .cspell/ok-unknown-words.txt
@@ -63,6 +63,7 @@ autoattribute
autoclass
autodoc
autofunction
autouse
automethod
automodule
bnds
2 changes: 1 addition & 1 deletion docs/commands.rst
@@ -40,7 +40,7 @@ workflows are supported. Available subcommands:
* Minimal Syntax: ``fremor run -d [indir] -l [varlist] -r [table_config] -p [exp_config] -o [outdir] [options]``
* Required Options:
- ``-d, --indir TEXT`` — Input directory with netCDF files
- ``-l, --varlist TEXT`` — Variable list dictionary mapping local to MIP variable names
- ``-l, --varlist TEXT`` — Variable list dictionary mapping modeler variable names to MIP table variable names
- ``-r, --table_config TEXT`` — MIP table JSON configuration
- ``-p, --exp_config TEXT`` — Experiment/model metadata JSON
- ``-o, --outdir TEXT`` — Output directory prefix
18 changes: 16 additions & 2 deletions docs/cookbook.rst
@@ -74,7 +74,7 @@ You will need to split the platform-target string appropriately to extract the i
Creating Variable Lists
~~~~~~~~~~~~~~~~~~~~~~~

Variable lists map your local variable names to MIP table variable names. Generate a variable list from a directory of netCDF files:
Variable lists map your modeler variable names to MIP table variable names. Generate a variable list from a directory of netCDF files:

.. code-block:: bash

@@ -86,6 +86,20 @@ This tool examines filenames to extract variable names. It assumes FRE-style naming
(e.g., ``component.YYYYMMDD.variable.nc``). Review the generated file and edit as needed to map
modeler variable names to MIP table variable names.

When a modeler's variable name differs from the MIP table variable name, the variable list
maps between them. For example, if your model produces ``sea_sfc_salinity`` but the MIP table
expects ``sos``:

.. code-block:: json

{
"sea_sfc_salinity": "sos"
}

The key (``sea_sfc_salinity``) is the modeler's variable name — it must match both the filename
and the variable name inside the netCDF file. The value (``sos``) is the MIP table variable name
used for metadata lookups.

To verify variables exist in MIP tables, search for variable definitions:

.. code-block:: bash
@@ -145,7 +159,7 @@ For processing individual directories or debugging specific issues, use ``fremor
Required arguments:

* ``--indir``: Directory containing netCDF files to CMORize
* ``--varlist``: JSON file mapping local variable names to target variable names
* ``--varlist``: JSON file mapping modeler variable names to MIP table variable names
* ``--table_config``: MIP table JSON file (e.g., ``CMIP6_Omon.json``)
* ``--exp_config``: Experiment configuration JSON with metadata
* ``--outdir``: Output directory root for CMORized files
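As a rough illustration of the mapping described in the cookbook changes above, the sketch below reads a variable list and lists, for each entry, the modeler variable name (the key, expected in the filename and inside the netCDF file) and the MIP table variable name (the value). The helper name `describe_varlist` and the inline JSON text are hypothetical and are not part of fremorizer.

```python
import json


def describe_varlist(varlist_text: str) -> list:
    """Return (modeler_name, mip_name, filename_suffix) tuples for each varlist entry."""
    varlist = json.loads(varlist_text)
    rows = []
    for modeler_name, mip_name in varlist.items():
        # The key is expected both in the filename and as the variable name in the file.
        filename_suffix = f'.{modeler_name}.nc'
        rows.append((modeler_name, mip_name, filename_suffix))
    return rows


# Hypothetical varlist: one renamed variable and one self-mapped variable.
example = '{"sea_sfc_salinity": "sos", "tos": "tos"}'
rows = describe_varlist(example)
for modeler_name, mip_name, suffix in rows:
    print(f'{modeler_name} -> {mip_name} (files ending in {suffix})')
```

Under these assumptions, only the first entry needs a real mapping; the second is the common case where key and value are identical.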
2 changes: 1 addition & 1 deletion docs/glossary.rst
@@ -27,7 +27,7 @@ Glossary
``source_id``, ``grid_label``, and ``nominal_resolution``. Published by the CMIP community.

variable list
A JSON file mapping local model variable names to MIP table variable names. Generated by
A JSON file mapping modeler variable names to MIP table variable names. Generated by
``fremor varlist`` and consumed by ``fremor run``.

experiment configuration
26 changes: 14 additions & 12 deletions fremorizer/cli.py
@@ -20,16 +20,17 @@

fre_logger = logging.getLogger(__name__)

OPT_VAR_NAME_HELP='optional, specify a variable name to specifically process only filenames ' + \
'matching that variable name. I.e., this string help target local_vars, not ' + \
'target_vars.'
VARLIST_HELP='path pointing to a json file containing directory of key/value pairs. ' + \
'the keys are the \'local\' names used in the filename, and the values ' + \
'pointed to by those keys are strings representing the name of the variable ' + \
'contained in targeted files. the key and value are often the same, ' + \
'but it is not required.'
RUN_ONE_HELP='process only one file, then exit. mostly for debugging and isolating issues.'
DRY_RUN_HELP='don\'t call the cmor_mixer subtool, just printout what would be called and move on until natural exit'
OPT_VAR_NAME_HELP="optional, specify a variable name to specifically process only filenames " + \
"matching that variable name. I.e., this string helps target local_vars, not " + \
"target_vars."
VARLIST_HELP="path pointing to a json file containing a dictionary of key/value pairs. " + \
"the keys are the modeler's variable names used in the filename and " + \
"expected as the variable name within the targeted files. the values " + \
"pointed to by those keys are strings representing the corresponding " + \
"MIP table variable name. the key and value are often the same, " + \
"but it is not required."
RUN_ONE_HELP="process only one file, then exit. mostly for debugging and isolating issues."
DRY_RUN_HELP="don't call the cmor_mixer subtool, just printout what would be called and move on until natural exit"
START_YEAR_HELP = 'string representing the minimum calendar year CMOR should start processing for. ' + \
'currently, only YYYY format is supported.'
STOP_YEAR_HELP = 'string representing the maximum calendar year CMOR should stop processing for. ' + \
@@ -217,11 +218,10 @@ def find(varlist, table_config_dir, opt_var_name): #uncovered
required = False)
def run(indir, varlist, table_config, exp_config, outdir, run_one, opt_var_name,
grid_label, grid_desc, nom_res, start, stop, calendar):
# pylint: disable=unused-argument
"""
Rewrite climate model output files with CMIP-compliant metadata for down-stream publishing
"""
cmor_run_subtool(
result = cmor_run_subtool(
indir = indir,
json_var_list = varlist,
json_table_config = table_config,
@@ -236,6 +236,8 @@ def run(indir, varlist, table_config, exp_config, outdir, run_one, opt_var_name,
stop = stop,
calendar_type = calendar
)
if result < 0:
raise click.ClickException(f'cmor_run_subtool returned non-zero status: {result}')


@fremor.command('varlist')
15 changes: 12 additions & 3 deletions fremorizer/cmor_finder.py
@@ -56,6 +56,8 @@ def print_var_content(table_config_file: IO[str],
table_name = proj_table_vars['Header'].get('table_id').split(' ')[1]
except KeyError:
fre_logger.warning('couldn\'t get header and table_name field')
except IndexError:
fre_logger.warning("couldn't get header and table_name, probably not a variable table")

if table_name is not None:
fre_logger.info('looking for %s data in table %s!', var_name, table_name)
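The header-parsing change in this hunk can be sketched in isolation as follows. This is a minimal sketch, assuming a `table_id` of the form `'Table Omon'`; the function name `table_name_from_header` is hypothetical. It uses direct indexing rather than `.get()`, so a missing `table_id` raises `KeyError` instead of calling `.split` on `None`.

```python
from typing import Optional


def table_name_from_header(table: dict) -> Optional[str]:
    """Return the short table name (e.g. 'Omon') from a MIP table dict, or None."""
    try:
        # Assumed form of table_id: 'Table Omon' -> second word is the table name.
        return table['Header']['table_id'].split(' ')[1]
    except KeyError:
        # No 'Header' section, or no 'table_id' inside it.
        return None
    except IndexError:
        # table_id has no second word, so this is probably not a variable table.
        return None


omon = table_name_from_header({'Header': {'table_id': 'Table Omon'}})
not_a_var_table = table_name_from_header({'Header': {'table_id': 'coordinate'}})
no_header = table_name_from_header({})
print(omon, not_a_var_table, no_header)
```

Note that with `.get('table_id')`, as in the diff, a header without `table_id` would yield `None` and the subsequent `.split` call would raise `AttributeError`, which neither `except` clause catches.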
@@ -182,12 +184,19 @@ def make_simple_varlist( dir_targ: str,
# Build a deduplicated dict of variable names extracted from all filenames across
# all datetimes. Assigning to a dict naturally deduplicates while preserving
# first-seen insertion order (Python 3.7+).
# If a MIP table is provided, variables that match a MIP variable name get
# self-mapped (key==value). Variables NOT in the MIP table get an empty string
# as value, signaling they need manual mapping by the user.
var_list: Dict[str, str] = {}
for targetfile in all_nc_files:
var_name=os.path.basename(targetfile).split('.')[-2]
if mip_vars is not None and var_name not in mip_vars:
continue
var_list[var_name] = var_name
if mip_vars is not None:
if var_name in mip_vars:
var_list[var_name] = var_name
else:
var_list[var_name] = ''
else:
var_list[var_name] = var_name

if not var_list:
fre_logger.warning('WARNING: no variables in target mip table found, or no matching pattern,'
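The new varlist-building behavior in this hunk can be sketched as a standalone function. This is an illustrative sketch, not the fremorizer implementation: `build_varlist` and the file paths are hypothetical, and `mip_vars` is assumed to be a set of MIP table variable names.

```python
import os
from typing import Dict, Iterable, Optional, Set


def build_varlist(nc_files: Iterable[str],
                  mip_vars: Optional[Set[str]] = None) -> Dict[str, str]:
    """Map each variable name found in the filenames to a MIP name, or '' if unknown."""
    var_list: Dict[str, str] = {}
    for path in nc_files:
        # FRE-style names end in '.<variable>.nc', so the variable is the second-to-last field.
        var_name = os.path.basename(path).split('.')[-2]
        if mip_vars is None or var_name in mip_vars:
            var_list[var_name] = var_name  # self-mapped
        else:
            var_list[var_name] = ''  # needs manual mapping by the user
    return var_list


files = [
    '/data/ocean_monthly.199301-199302.sos.nc',
    '/data/ocean_monthly.199303-199304.sos.nc',
    '/data/ocean_monthly.199301-199302.sea_sfc_salinity.nc',
]
varlist = build_varlist(files, mip_vars={'sos', 'tos'})
unmapped = [name for name, target in varlist.items() if target == '']
print(varlist)
print('needs manual mapping:', unmapped)
```

Collecting the empty-string entries afterwards, as `unmapped` does here, is one way to show the user which variables still need a MIP table name filled in.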
52 changes: 27 additions & 25 deletions fremorizer/cmor_mixer.py
@@ -65,11 +65,11 @@ def rewrite_netcdf_file_var( mip_var_cfgs: dict = None,

:param mip_var_cfgs: Variable table, as loaded from the MIP table JSON config.
:type mip_var_cfgs: dict
:param local_var: Variable name used for finding files locally.
:param local_var: Modeler's variable name, used for finding files and reading data from them.
:type local_var: str
:param netcdf_file: Path to the input NetCDF file to be CMORized.
:type netcdf_file: str
:param target_var: Name of the variable to be processed.
:param target_var: MIP table variable name for metadata lookups.
:type target_var: str
:param json_exp_config: Path to experiment configuration JSON file (for dataset metadata).
:type json_exp_config: str
@@ -86,17 +86,17 @@ def rewrite_netcdf_file_var( mip_var_cfgs: dict = None,
.. note:: This function performs extensive setup of axes and metadata, and conditionally handles tripolar
ocean grids.
"""
fre_logger.info('input data:')
fre_logger.info(' local_var = %s', local_var)
fre_logger.info(' target_var = %s', target_var)
fre_logger.info("input data:")
fre_logger.info(" local_var = %s (modeler variable name, in filename and file)", local_var)
fre_logger.info(" target_var = %s (MIP table variable name)", target_var)

# open the input file
fre_logger.info('opening %s', netcdf_file)
ds = nc.Dataset(netcdf_file, 'r+')

# read the input variable data
fre_logger.info('attempting to read variable data, %s', target_var)
var = from_dis_gimme_dis(from_dis=ds, gimme_dis=target_var)
# read the input variable data using the modeler's variable name (local_var)
fre_logger.info('attempting to read variable data, %s', local_var)
var = from_dis_gimme_dis(from_dis=ds, gimme_dis=local_var)

## var type
#var_dtype = var.dtype
@@ -142,12 +142,14 @@ def rewrite_netcdf_file_var( mip_var_cfgs: dict = None,
var_brand = filter_brands(
brands, target_var, mip_var_cfgs,
has_time_bnds = 'time_bnds' in ds.variables,
input_vert_dim = get_vertical_dimension(ds, target_var)
input_vert_dim = get_vertical_dimension(ds, local_var)
)

else:
fre_logger.error('cmip7 case detected, but dimensions of input data do not match '
'any of those found for the associated brands.')
raise ValueError
raise ValueError('no variable brand was able to be identified for this CMIP7 case')
fre_logger.debug('cmip7 case, filtered possible brands to %s', var_brand)
else:
fre_logger.debug('non-cmip7 case detected, skipping variable brands')

@@ -212,8 +214,8 @@ def rewrite_netcdf_file_var( mip_var_cfgs: dict = None,
time_bnds = from_dis_gimme_dis(from_dis=ds, gimme_dis='time_bnds')

# determine the vertical dimension by looping over netcdf variables
vert_dim = get_vertical_dimension(ds, target_var) # returns int(0) if not present
fre_logger.info('Vertical dimension of %s: %s', target_var, vert_dim)
vert_dim = get_vertical_dimension(ds, local_var) # returns int(0) if not present
fre_logger.info("Vertical dimension of %s: %s", local_var, vert_dim)

# Check var_dim and vert_dim and assign lev if relevant.
lev, lev_units = None, '1'
@@ -530,7 +532,7 @@ def rewrite_netcdf_file_var( mip_var_cfgs: dict = None,

elif vert_dim in ALT_HYBRID_SIGMA_COORDS:
# find the ps file nearby
ps_file = netcdf_file.replace(f'.{target_var}.nc', '.ps.nc')
ps_file = netcdf_file.replace(f'.{local_var}.nc', '.ps.nc')
ds_ps = nc.Dataset(ps_file)
ps = from_dis_gimme_dis(ds_ps, 'ps')
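To see why the companion `ps` file has to be derived from `local_var` rather than `target_var`, consider the sketch below. The helper `companion_ps_path` and the filenames are hypothetical. Where the diff uses `str.replace`, the sketch swaps in an explicit suffix check, which is an assumption about the desired behavior, so that a mismatch fails loudly instead of returning the input path unchanged.

```python
def companion_ps_path(netcdf_file: str, local_var: str) -> str:
    """Derive the path of the surface-pressure file that sits next to netcdf_file."""
    suffix = f'.{local_var}.nc'
    if not netcdf_file.endswith(suffix):
        raise ValueError(f'{netcdf_file} does not end in {suffix}')
    # Swap only the trailing '.<local_var>.nc' for '.ps.nc'.
    return netcdf_file[:-len(suffix)] + '.ps.nc'


# Hypothetical case: the modeler calls the variable 'air_temp', the MIP table calls it 'ta'.
nc_path = 'atmos.199301-199312.air_temp.nc'
derived = companion_ps_path(nc_path, 'air_temp')
print(derived)

# Substituting on the MIP table name finds nothing to replace in this filename.
unchanged = nc_path.replace('.ta.nc', '.ps.nc')
print(unchanged == nc_path)
```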

@@ -690,9 +692,9 @@ def cmorize_target_var_files(indir: str = None,

:param indir: Path to the directory containing NetCDF files to process.
:type indir: str
:param target_var: Name of the variable to process in each file.
:param target_var: MIP table variable name for metadata lookups.
:type target_var: str
:param local_var: Local/filename variable name (often identical to target_var).
:param local_var: Modeler's variable name, used for file-targeting and reading data from files.
:type local_var: str
:param iso_datetime_range_arr: List of ISO datetime strings, each identifying a specific file.
:type iso_datetime_range_arr: list of str
@@ -717,10 +719,9 @@ def cmorize_target_var_files(indir: str = None,
.. note:: Copies files to a temporary directory, runs CMORization, moves results to output, cleans up temp files.
"""

fre_logger.info('local_var = %s to be used for file-targeting.\n'
'target_var = %s to be used for reading the data \n'
'from the file\n'
'outdir = %s', local_var, target_var, outdir)
fre_logger.info("local_var = %s to be used for file-targeting and reading data.\n"
"target_var = %s to be used for MIP table lookups.\n"
"outdir = %s", local_var, target_var, outdir)

# determine a tmp dir for working on files.
tmp_dir = create_tmp_dir(outdir, json_exp_config) + '/'
@@ -847,7 +848,7 @@ def cmorize_all_variables_in_dir(vars_to_run: Dict[str, Any],
"""
CMORize all variables in a directory according to a variable mapping.

:param vars_to_run: Mapping of local variable names (in filenames) to target variable names (in NetCDF).
:param vars_to_run: Mapping of modeler variable names to MIP table variable names.
:type vars_to_run: dict
:param indir: Directory containing NetCDF files to process.
:type indir: str
@@ -871,16 +872,17 @@ def cmorize_all_variables_in_dir(vars_to_run: Dict[str, Any],
.. note:: Errors for individual variables are logged and processing continues (except for run_one_mode).
"""

# loop over local-variable:target-variable pairs in vars_to_run
# loop over modeler-variable:mip-variable pairs in vars_to_run
return_status = -1
omissions = []
for local_var in vars_to_run:
# look up the MIP table variable name that this modeler variable name maps to.
target_var = vars_to_run[local_var] # often equiv to local_var but not necessarily.
if local_var != target_var:
fre_logger.warning('local_var == %s != %s == target_var\n'
'i am expecting %s to be in the filename, and i expect the variable\n'
'in that file to be named %s', local_var, target_var, local_var, target_var)
fre_logger.info('local_var == %s != %s == target_var\n'
'modeler variable name differs from MIP table variable name.\n'
'i am expecting %s in both the filename and the file, and will map it\n'
'to MIP table variable %s', local_var, target_var, local_var, target_var)

fre_logger.info('........beginning CMORization for %s/%s..........', local_var, target_var)
try:
@@ -938,7 +940,7 @@ def cmor_run_subtool(indir: str = None,

:param indir: Directory containing NetCDF files to process.
:type indir: str
:param json_var_list: Path to JSON file with variable mapping (local to target names).
:param json_var_list: Path to JSON file with variable mapping (modeler names to MIP table names).
:type json_var_list: str
:param json_table_config: Path to MIP table JSON file (per-variable metadata).
:type json_table_config: str
13 changes: 12 additions & 1 deletion fremorizer/tests/conftest.py
@@ -22,11 +22,13 @@
INDIR = ROOTDIR / 'ocean_sos_var_file'
VARLIST = ROOTDIR / 'varlist'
VARLIST_DIFF = ROOTDIR / 'varlist_local_target_vars_differ'
VARLIST_MAPPED = ROOTDIR / 'varlist_mapped'
EXP_CONFIG = ROOTDIR / 'CMOR_input_example.json'
EXP_CONFIG_CMIP7 = ROOTDIR / 'CMOR_CMIP7_input_example.json'

SOS_NC_FILENAME = 'reduced_ocean_monthly_1x1deg.199301-199302.sos.nc'
SOSV2_NC_FILENAME = 'reduced_ocean_monthly_1x1deg.199301-199302.sosV2.nc'
MAPPED_NC_FILENAME = 'reduced_ocean_monthly_1x1deg.199301-199302.sea_sfc_salinity.nc'

YYYYMMDD = date.today().strftime('%Y%m%d')

@@ -155,7 +157,7 @@ def _write_exp_configs():

The JSON data lives in this module (_CMIP6_EXP_CONFIG_DATA /
_CMIP7_EXP_CONFIG_DATA) so the on-disk files are no longer tracked by git.
This session-scoped autouse fixture materialises fresh copies before any
This session-scoped autouse fixture materializes fresh copies before any
test that needs them runs, and cleans them up afterwards.
"""
EXP_CONFIG.write_text(json.dumps(_CMIP6_EXP_CONFIG_DATA, indent=4))
@@ -231,3 +233,12 @@ def cli_sosv2_nc_file(cli_sos_nc_file): # pylint: disable=redefined-outer-name
shutil.copy(cli_sos_nc_file, str(nc_path))
assert nc_path.exists()
return str(nc_path)


@pytest.fixture(scope='session')
def cli_mapped_nc_file():
"""Generate the sea_sfc_salinity NetCDF file from CDL (session-scoped)."""
INDIR.mkdir(parents=True, exist_ok=True)
nc_path = INDIR / MAPPED_NC_FILENAME
_ncgen('reduced_ocean_monthly_1x1deg.199301-199302.sea_sfc_salinity.cdl', nc_path)
return str(nc_path)