-
Notifications
You must be signed in to change notification settings - Fork 446
EAMxx: Transposed Output #7757
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
EAMxx: Transposed Output #7757
Changes from all commits
bc4f1b0
f7bb809
e935ff9
f9aff9b
fe445c1
8f7c9a3
f3a17a5
87dfa47
753f368
75219a5
56ba62f
6f6b878
6a52b46
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,8 +1,7 @@ | ||
| from utils import expect, ensure_netcdf4 | ||
| from utils import expect, _ensure_pylib_impl | ||
| _ensure_pylib_impl("xarray") | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is how we've been doing things, but I am having second thoughts on these "under the hood" pip installs. @mahf708 @jgfouca I am not fond of the script adding pacakges without the user probably even knowing it (yes, we print a msg, but we don't ask for permission). I find it conceptually not right. How do you feel about removing all the "ensure_xyz" lines from our scripts, and simply erroring out if the pkg is not installed?
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It's going to be hard to get rid of this without risking all sorts of bad stuff, I think... :( My vote is to keep it, despite the weirdness/ugliness, because it allows us so much portability. I think it's a smart solution for this tricky problem. We don't want to prepare the env and for the most part these packages are light so installing them a la carte is good. I defer to Jim though
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I would lean towards keeping it as well |
||
|
|
||
| ensure_netcdf4() | ||
|
|
||
| from netCDF4 import Dataset | ||
| import xarray as xr | ||
mahf708 marked this conversation as resolved.
Show resolved
Hide resolved
|
||
| import numpy as np | ||
|
|
||
| import pathlib | ||
|
|
@@ -12,14 +11,15 @@ class CompareNcFiles(object): | |
| ############################################################################### | ||
|
|
||
| ########################################################################### | ||
| def __init__(self,src_file,tgt_file=None,compare=None): | ||
| def __init__(self,src_file,tgt_file=None,compare=None,allow_transpose=False): | ||
| ########################################################################### | ||
|
|
||
| self._src_file = pathlib.Path(src_file).resolve().absolute() | ||
| expect (self._src_file.exists(), | ||
| "Error! File '{}' does not exist.".format(self._src_file)) | ||
|
|
||
| self._compare = compare | ||
| self._allow_transpose = allow_transpose | ||
|
|
||
| if tgt_file is None: | ||
| self._tgt_file = self._src_file | ||
|
|
@@ -62,10 +62,27 @@ def get_name_and_dims(self,name_dims): | |
| def compare_variables(self): | ||
| ########################################################################### | ||
|
|
||
| ds_src = Dataset(self._src_file,'r') | ||
| ds_tgt = Dataset(self._tgt_file,'r') | ||
| ds_src = xr.open_dataset(self._src_file) | ||
| ds_tgt = xr.open_dataset(self._tgt_file) | ||
|
|
||
| success = True | ||
| if self._compare is None or self._compare == []: | ||
| self._compare = [] | ||
| # If compare is an empty list, compare all variables | ||
| print(f"Specific comparison variables not provided,\n" | ||
| f"will compare ALL variables in \n" | ||
| f"{self._src_file}\n" | ||
| f"with\n" | ||
| f"{self._tgt_file}\n") | ||
| for var in ds_src.variables: | ||
| if var not in ds_tgt.variables: | ||
| print (f" Comparison failed! Variable not found.\n" | ||
| f" - var name: {var}\n" | ||
| f" - file name: {self._tgt_file}") | ||
| success = False | ||
| continue | ||
| self._compare.append(var+"="+var) | ||
|
|
||
| for expr in self._compare: | ||
| # Split the expression, to get the output var name | ||
| tokens = expr.split('=') | ||
|
|
@@ -89,11 +106,11 @@ def compare_variables(self): | |
| f" - file name: {self._tgt_file}") | ||
| success = False | ||
| continue | ||
| lvar = ds_src.variables[lname]; | ||
| rvar = ds_tgt.variables[rname]; | ||
| lvar = ds_src[lname]; | ||
| rvar = ds_tgt[rname]; | ||
|
|
||
| lvar_rank = len(lvar.dimensions) | ||
| rvar_rank = len(rvar.dimensions) | ||
| lvar_rank = len(lvar.dims) | ||
| rvar_rank = len(rvar.dims) | ||
|
|
||
| expect (len(ldims)==0 or len(ldims)==lvar_rank, | ||
| f"Invalid slice specification for {lname}.\n" | ||
|
|
@@ -104,53 +121,33 @@ def compare_variables(self): | |
| f" input request: ({','.join(rdims)})\n" | ||
| f" variable rank: {rvar_rank}") | ||
|
|
||
|
|
||
| lslices = [[idim,slice] for idim,slice in enumerate(ldims) if slice!=":"] | ||
| rslices = [[idim,slice] for idim,slice in enumerate(rdims) if slice!=":"] | ||
|
|
||
| lrank = lvar_rank - len(lslices) | ||
| rrank = rvar_rank - len(rslices) | ||
|
|
||
| if lrank!=rrank: | ||
| print (f" Comparison failed. Rank mismatch.\n" | ||
| f" - input comparison: {expr}\n" | ||
| f" - upon slicing, rank({lname}) = {lrank}\n" | ||
| f" - upon slicing, rank({rname}) = {rrank}") | ||
| success = False | ||
| continue | ||
|
|
||
| lvals = self.slice_variable(lvar,lvar[:],lslices) | ||
| rvals = self.slice_variable(rvar,rvar[:],rslices) | ||
|
|
||
| if not np.array_equal(lvals,rvals): | ||
| # print (f"lvals: {lvals}") | ||
| # print (f"rvals: {rvals}") | ||
| item = np.argwhere(lvals!=rvals)[0] | ||
| rval = self.slice_variable(rvar,rvals, | ||
| [[idim,slice] for idim,slice in enumerate(item)]) | ||
| lval = self.slice_variable(lvar,lvals, | ||
| [[idim,slice] for idim,slice in enumerate(item)]) | ||
| loc = ",".join([str(i+1) for i in item]) | ||
| print (f" Comparison failed. Values differ.\n" | ||
| f" - input comparison: {expr}\n" | ||
| f' - upon slicing, {lname}({loc}) = {lval}\n' | ||
| f' - upon slicing, {rname}({loc}) = {rval}') | ||
| lslices = {lvar.dims[idim]:int(slice)-1 for idim,slice in enumerate(ldims) if slice!=":"} | ||
| rslices = {rvar.dims[idim]:int(slice)-1 for idim,slice in enumerate(rdims) if slice!=":"} | ||
| lvar_sliced = lvar.sel(lslices) | ||
| rvar_sliced = rvar.sel(rslices) | ||
| expect (set(lvar_sliced.dims) == set(rvar_sliced.dims), | ||
| f"Error, even when sliced these two elements do not share the same dimensionsn\n" | ||
| f" - left var name : {lname}\n" | ||
| f" - right var name : {rname}\n" | ||
| f" - left dimensions : {lvar_sliced.dims}\n" | ||
| f" - right dimensions: {rvar_sliced.dims}\n") | ||
|
|
||
| if self._allow_transpose: | ||
| rvar_sliced = rvar_sliced.transpose(*lvar_sliced.dims) | ||
|
|
||
| equal = (lvar_sliced.data==rvar_sliced.data).all() | ||
| if not equal: | ||
| rse = np.sqrt((lvar_sliced.data-rvar_sliced.data)**2) | ||
| nonmatch_count = np.count_nonzero(rse) | ||
| print (f" Comparison failed. Values differ at {nonmatch_count} out of {rse.size} locations.\n" | ||
| f" - input comparison: {expr}\n" | ||
| f' - max L2 error, {rse.max()}\n' | ||
| f' - max L2 location, [{",".join(map(str,(np.array(np.unravel_index(rse.argmax(),rse.shape))+1).tolist()))}]\n' | ||
| f' - dimensions, {lvar_sliced.dims}') | ||
| success = False | ||
|
|
||
| return success | ||
|
|
||
| ########################################################################### | ||
| def slice_variable(self,var,vals,slices): | ||
| ########################################################################### | ||
|
|
||
| if len(slices)==0: | ||
| return vals | ||
|
|
||
| idim, slice_idx = slices.pop(-1) | ||
| vals = vals.take(int(slice_idx)-1,axis=int(idim)) | ||
|
|
||
| return self.slice_variable(var,vals,slices) | ||
|
|
||
| ########################################################################### | ||
| def run(self): | ||
| ########################################################################### | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -100,6 +100,11 @@ AtmosphereOutput::AtmosphereOutput(const ekat::Comm &comm, const ekat::Parameter | |
| // the name of the yaml file where the options are read from. | ||
| m_stream_name = params.name(); | ||
|
|
||
| // Is this output set to be transposed? | ||
| if (params.isParameter("transpose")) { | ||
| m_transpose = params.get<bool>("transpose"); | ||
| } | ||
|
|
||
| auto gm = field_mgr->get_grids_manager(); | ||
|
|
||
| // Figure out what kind of averaging is requested | ||
|
|
@@ -337,7 +342,23 @@ void AtmosphereOutput::init() | |
|
|
||
| // Store the field layout, so that calls to setup_output_file are easier | ||
| const auto& layout = fid.get_layout(); | ||
| m_vars_dims[fname] = get_var_dimnames(layout); | ||
| m_vars_dims[fname] = get_var_dimnames(m_transpose ? layout.transpose() : layout); | ||
|
|
||
| // Initialize a helper_field for each unique layout. This can be used for operations | ||
| // such as writing transposed output. | ||
| if (m_transpose) { | ||
| const auto helper_layout = layout.transpose(); | ||
| const std::string helper_name = "transposed_"+helper_layout.to_string(); | ||
| if (m_helper_fields.find(helper_name) == m_helper_fields.end()) { | ||
| // We can add a new helper field for this layout | ||
| using namespace ekat::units; | ||
| FieldIdentifier fid_helper(helper_name,helper_layout,Units::invalid(),fid.get_grid_name()); | ||
| Field helper(fid_helper); | ||
| helper.get_header().get_alloc_properties().request_allocation(); | ||
| helper.allocate_view(); | ||
| m_helper_fields[helper_name] = helper; | ||
| } | ||
| } | ||
|
|
||
| // Now check that all the dims of this field are already set to be registered. | ||
| const auto& tags = layout.tags(); | ||
|
|
@@ -495,7 +516,15 @@ run (const std::string& filename, | |
| count.sync_to_host(); | ||
|
|
||
| auto func_start = std::chrono::steady_clock::now(); | ||
| scorpio::write_var(filename,count.name(),count.get_internal_view_data<int,Host>()); | ||
| if (m_transpose) { | ||
| const auto& fl = count.get_header().get_identifier().get_layout().to_string(); | ||
| const std::string helper_name = "transposed_"+fl; | ||
| auto& temp = m_helper_fields.at(helper_name); | ||
| transpose(count,temp); | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Are you ok having the call to transpose end up in the "duration_write" timing?
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Actually, maybe it doesn't matter. I think we should prob get rid of that timing anyways, as it doesn't add much to what's already reported in the timings file. |
||
| scorpio::write_var(filename,count.name(),temp.get_internal_view_data<int,Host>()); | ||
| } else { | ||
| scorpio::write_var(filename,count.name(),count.get_internal_view_data<int,Host>()); | ||
| } | ||
| auto func_finish = std::chrono::steady_clock::now(); | ||
| auto duration_loc = std::chrono::duration_cast<std::chrono::milliseconds>(func_finish - func_start); | ||
| duration_write += duration_loc.count(); | ||
|
|
@@ -573,7 +602,15 @@ run (const std::string& filename, | |
|
|
||
| // Write to file | ||
| auto func_start = std::chrono::steady_clock::now(); | ||
| scorpio::write_var(filename,field_name,f_out.get_internal_view_data<Real,Host>()); | ||
| if (m_transpose) { | ||
| const auto& fl = f_out.get_header().get_identifier().get_layout().to_string(); | ||
| const std::string helper_name = "transposed_"+fl; | ||
| auto& temp = m_helper_fields.at(helper_name); | ||
| transpose(f_out,temp); | ||
| scorpio::write_var(filename,field_name,temp.get_internal_view_data<Real,Host>()); | ||
| } else { | ||
| scorpio::write_var(filename,field_name,f_out.get_internal_view_data<Real,Host>()); | ||
| } | ||
| auto func_finish = std::chrono::steady_clock::now(); | ||
| auto duration_loc = std::chrono::duration_cast<std::chrono::milliseconds>(func_finish - func_start); | ||
| duration_write += duration_loc.count(); | ||
|
|
@@ -651,7 +688,7 @@ void AtmosphereOutput::set_avg_cnt_tracking(const std::string& name, const Field | |
| } | ||
|
|
||
| // We have not created this avg count field yet. | ||
| m_vars_dims[avg_cnt_name] = get_var_dimnames(layout); | ||
| m_vars_dims[avg_cnt_name] = get_var_dimnames(m_transpose ? layout.transpose() : layout); | ||
|
|
||
| auto nondim = ekat::units::Units::nondimensional(); | ||
| FieldIdentifier count_id (avg_cnt_name,layout,nondim,m_io_grid->name(),DataType::IntType); | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -749,28 +749,34 @@ void reset_time_dim_len (const std::string& filename, const int new_length) | |
| void set_var_decomp (PIOVar& var, | ||
| const std::string& filename) | ||
| { | ||
| for (size_t i=1; i<var.dims.size(); ++i) { | ||
| EKAT_REQUIRE_MSG (var.dims[i]->offsets==nullptr, | ||
| "Error! We currently only allow decomposition on slowest-striding dimension.\n" | ||
| " Generalizing is not complicated, but it was not a priority.\n" | ||
| std::shared_ptr<const PIODim> decomp_dim; | ||
| for (size_t i=0; i<var.dims.size(); ++i) { | ||
| if (var.dims[i]->offsets) { | ||
| EKAT_REQUIRE_MSG (decomp_dim==nullptr, | ||
| "Error! Call set_var_decomp, more than one dimension set as decomposition dimension.\n" | ||
| " - filename: " + filename + "\n" | ||
| " - varname : " + var.name + "\n" | ||
| " - var dims: " + ekat::join(var.dims,get_entity_name,",") + "\n" | ||
| " - bad dim : " + var.dims[i]->name + "\n"); | ||
| " - varname : " + var.name + "\n"); | ||
| decomp_dim = var.dims[i]; | ||
| } | ||
| } | ||
| EKAT_REQUIRE_MSG (var.dims[0]->offsets!=nullptr, | ||
| "Error! Calling set_var_decomp, but the var first dimension does not appear to be decomposed.\n" | ||
| " - filename: " + filename + "\n" | ||
| " - varname : " + var.name + "\n" | ||
| " - var dims: " + ekat::join(var.dims,get_entity_name,",") + "\n"); | ||
| EKAT_REQUIRE_MSG(decomp_dim!=nullptr, | ||
| "Error! Calling set_var_decomp, none of the dimensions have been set as the decomposition dimension.\n" | ||
| " - filename: " + filename + "\n" | ||
| " - varname : " + var.name + "\n"); | ||
| EKAT_REQUIRE_MSG (var.decomp==nullptr, | ||
| "Error! You should have invalidated var.decomp before attempting to reset it.\n" | ||
| " - filename : " + filename + "\n" | ||
| " - varname : " + var.name + "\n" | ||
| " - var decomp: " + var.decomp->name + "\n"); | ||
| EKAT_REQUIRE_MSG (decomp_dim==var.dims[0] or decomp_dim==var.dims.back(), | ||
| "Error! Variable decomposition only supports decompostion over the first or last dimension.\n" | ||
| " - filename : " + filename + "\n" | ||
| " - varname : " + var.name + "\n" | ||
| " - var decomp: " + var.decomp->name + "\n"); | ||
| bool decomp_first_dim = var.dims[0]->offsets ? true : false; | ||
|
|
||
|
|
||
| // Create decomp name: dtype-dim1<len1>_dim2<len2>_..._dimk<lenN> | ||
| std::shared_ptr<const PIODim> decomp_dim; | ||
| std::string decomp_tag = var.dtype + "-"; | ||
| for (auto d : var.dims) { | ||
| decomp_tag += d->name + "<" + std::to_string(d->length) + ">_"; | ||
|
|
@@ -800,28 +806,42 @@ void set_var_decomp (PIOVar& var, | |
| // We haven't create this decomp yet. Go ahead and create one | ||
| decomp = std::make_shared<PIODecomp>(); | ||
| decomp->name = decomp_tag; | ||
| decomp->dim = var.dims[0]; | ||
| decomp->dim = decomp_dim; | ||
|
|
||
| int ndims = var.dims.size(); | ||
|
|
||
| // Get ALL dims global lengths, and compute prod of *non-decomposed* dims | ||
| std::vector<int> gdimlen = {decomp->dim->length}; | ||
| // Note, if the offsets for the dimension are null then this is by definition | ||
| // a non-decomposed dimension. | ||
| std::vector<int> gdimlen = {}; | ||
| int non_decomp_dim_prod = 1; | ||
| for (int idim=1; idim<ndims; ++idim) { | ||
| for (int idim=0; idim<ndims; ++idim) { | ||
| auto d = var.dims[idim]; | ||
| gdimlen.push_back(d->length); | ||
| non_decomp_dim_prod *= d->length; | ||
| if (d->offsets==nullptr) { | ||
| non_decomp_dim_prod *= d->length; | ||
| } | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You also need to change the offsets creation below. |
||
| } | ||
|
|
||
| // Create offsets list | ||
| const auto& dim_offsets = *decomp->dim->offsets; | ||
| int dim_loc_len = dim_offsets.size(); | ||
| decomp->offsets.resize (non_decomp_dim_prod*dim_loc_len); | ||
| for (int idof=0; idof<dim_loc_len; ++idof) { | ||
| auto dof_offset = dim_offsets[idof]; | ||
| auto beg = decomp->offsets.begin()+ idof*non_decomp_dim_prod; | ||
| auto end = beg + non_decomp_dim_prod; | ||
| std::iota (beg,end,non_decomp_dim_prod*dof_offset); | ||
| if (decomp_first_dim) { | ||
| for (int idof=0; idof<dim_loc_len; ++idof) { | ||
| auto dof_offset = dim_offsets[idof]; | ||
| auto beg = decomp->offsets.begin()+ idof*non_decomp_dim_prod; | ||
| auto end = beg + non_decomp_dim_prod; | ||
| std::iota (beg,end,non_decomp_dim_prod*dof_offset); | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Up to you, but to make the symmetry between first/last dim cases clearer, we could replace the beg/end/iota logic with a manual for loop: if (decomp_first_dim) {
for (int idof=0; idof<dim_loc_len; ++idof) {
auto dof_offset = dim_offsets[idof];
for (int i=0; i<non_decomp_dim_prod; ++i) {
decomp->offsets[idof*non_decomp_dim_prod + i] =
non_decomp_dim_prod*dof_offset +i;
}
}
} else {
for (int ii = 0; ii<non_decomp_dim_prod; ++ii) {
for (int idof=0; idof<dim_loc_len; ++idof) {
auto dof_offset = dim_offsets[idof];
decomp->offsets[ii * dim_loc_len + idof] =
ii * decomp_dim->length + dof_offset;
}
}
} |
||
| } | ||
| } else { | ||
| for (int ii = 0; ii<non_decomp_dim_prod; ++ii) { | ||
| for (int idof=0; idof<dim_loc_len; ++idof) { | ||
| auto dof_offset = dim_offsets[idof]; | ||
| decomp->offsets[ii * dim_loc_len + idof] = | ||
| ii * decomp_dim->length + dof_offset; | ||
| } | ||
| } | ||
| } | ||
|
|
||
| // Create PIO decomp | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"allow" seems ambiguous: are we checking both
var1=var2andvar1=var2'? Maybe we should call the option just--transpose-rhs(-tfor short)?Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Going to the previous question. The allow_transpose conditional just rearranges the dimensions so both lhs and rhs have matching dimension ordering. If we "allow_transpose" on two files that already have the dimensions ordered correctly the transpose call will do nothing.
I wanted to make it so that we have the option to FAIL the test if variables are transposed and we don't expect that. Otherwise I could hard code the aligning dimensions and drop this arg altogether.