-
Notifications
You must be signed in to change notification settings - Fork 446
EAMxx: Transposed Output #7757
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
EAMxx: Transposed Output #7757
Conversation
|
|
||
| auto func_start = std::chrono::steady_clock::now(); | ||
| scorpio::write_var(filename,count.name(),count.get_internal_view_data<int,Host>()); | ||
| if (m_transpose) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Watch out for stray indentation. You prob use tabs instead of spaces, which cause indentation to be different depending on user settings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I try to be diligent. The vi editor on perlmutter defaults to "automatic" identation that I have to remove and put in myself. For the record, I never use tab manually.
| auto func_start = std::chrono::steady_clock::now(); | ||
| scorpio::write_var(filename,count.name(),count.get_internal_view_data<int,Host>()); | ||
| if (m_transpose) { | ||
| const auto count_tmp = transpose(count); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are two overloads of the transpose function. Beside what you are using, we also have void transpose (const Field&, Field&), which uses a pre-constructed field for the result. This is preferable for stuff that happens at runtime, so we don't have to allocate (and destroy) a new field every time.
I would consider rethinking the implementation, so that we construct and allocate all fields during init, and call the other version of transpose at runtime.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Notice that you don't need a transpose field for every field. Since these fields are used immediately and don't need to have persistent data, you could create one helper field per layout. During init, you could create a map Layout->Field, and then at run time do something like
const auto& fl = count.get_header().get_identifier().get_layout();
auto& temp = m_helper_fields.at(fl);
transpose(count,temp);
scorpio::write_var(filename,count.name(),temp.get_internal_view_data<int,Host>());| strvec_t dims; | ||
| for (int i=0; i<layout.rank(); ++i) { | ||
| const auto t = layout.tag(i); | ||
| const int ind = m_transpose ? layout.rank()-i-1 : i; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FieldLayout has the method transpose, that returns the transposed layout. Perhaps you could transpose the layout at the call site, and avoid the index logic here (index logic is often confusing)
m_var_dims[fname] = get_var_dimnames(m_transpose ? layout.transpose() : layout);| non_decomp_dim_prod *= d->length; | ||
| if (d->offsets==nullptr) { | ||
| non_decomp_dim_prod *= d->length; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You also need to change the offsets creation below.
bartgol
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. I have a couple of comments.
| auto dof_offset = dim_offsets[idof]; | ||
| auto beg = decomp->offsets.begin()+ idof*non_decomp_dim_prod; | ||
| auto end = beg + non_decomp_dim_prod; | ||
| std::iota (beg,end,non_decomp_dim_prod*dof_offset); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Up to you, but to make the symmetry between first/last dim cases clearer, we could replace the beg/end/iota logic with a manual for loop:
if (decomp_first_dim) {
for (int idof=0; idof<dim_loc_len; ++idof) {
auto dof_offset = dim_offsets[idof];
for (int i=0; i<non_decomp_dim_prod; ++i) {
decomp->offsets[idof*non_decomp_dim_prod + i] =
non_decomp_dim_prod*dof_offset +i;
}
}
} else {
for (int ii = 0; ii<non_decomp_dim_prod; ++ii) {
for (int idof=0; idof<dim_loc_len; ++idof) {
auto dof_offset = dim_offsets[idof];
decomp->offsets[ii * dim_loc_len + idof] =
ii * decomp_dim->length + dof_offset;
}
}
}|
|
||
| // Initialize a helper_field for each unique layout. This can be used for operations | ||
| // such as writing transposed output. | ||
| if (m_helper_fields.find(layout.to_string()) == m_helper_fields.end()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we skip this part altogether if m_transpose is false?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went back and forth on this. Do we anticipate using helper fields more generically? If we really only see them as useful for transposed output should we name them something more specific then "helper"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We definitely don't need them now (unless transpose is true), so no point in allocating additional fields for no reason. If we ever find we need other helper fields in other contexts, we can recycle them. And yes, you could call the member m_transpose_fields maybe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually, I think keeping the name as "helper" fields makes more sense. But I will only create these specific fields when transposed output is asked for, in this case.
| auto& temp = m_helper_fields.at(fl); | ||
| transpose(count,temp); | ||
| scorpio::write_var(filename,count.name(),temp.get_internal_view_data<int,Host>()); | ||
| } else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indentation :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
uggg, getting better though :-)
| f' - upon slicing, {lname}({loc}) = {lval}\n' | ||
| f' - upon slicing, {rname}({loc}) = {rval}') | ||
| success = False | ||
| if self._compare == None or self._compare == []: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe, instead of having this big if/else, with repeated comparison code, you could do
if self._compare==None or self._compare==[]:
self._compare = []
for vname in ds_src.variables:
var = ds_src.variables[vname]
s = "(" + ",".join([":"]*len(lvar.dimensions))
self._compare.append(f"{vname}{s}={vname}{s}")Then, you are sure that there is a list of vars to compare.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we could certainly do this.
| help="Compare variables from src file against variables from tgt file") | ||
|
|
||
| # Allow variables to be transposed | ||
| parser.add_argument("-a","--allow_transpose",action="store_true", default=False, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"allow" seems ambiguous: are we checking both var1=var2 and var1=var2' ? Maybe we should call the option just --transpose-rhs (-t for short)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Going to the previous question. The allow_transpose conditional just rearranges the dimensions so both lhs and rhs have matching dimension ordering. If we "allow_transpose" on two files that already have the dimensions ordered correctly the transpose call will do nothing.
I wanted to make it so that we have the option to FAIL the test if variables are transposed and we don't expect that. Otherwise I could hard code the aligning dimensions and drop this arg altogether.
mahf708
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good from a quick inspection. We need to make sure the tests pass and get some transposed output to play with
| @@ -1,8 +1,7 @@ | |||
| from utils import expect, ensure_netcdf4 | |||
| from utils import expect, _ensure_pylib_impl | |||
| _ensure_pylib_impl("xarray") | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is how we've been doing things, but I am having second thoughts on these "under the hood" pip installs.
@mahf708 @jgfouca I am not fond of the script adding pacakges without the user probably even knowing it (yes, we print a msg, but we don't ask for permission). I find it conceptually not right.
How do you feel about removing all the "ensure_xyz" lines from our scripts, and simply erroring out if the pkg is not installed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's going to be hard to get rid of this without risking all sorts of bad stuff, I think... :(
My vote is to keep it, despite the weirdness/ugliness, because it allows us so much portability. I think it's a smart solution for this tricky problem. We don't want to prepare the env and for the most part these packages are light so installing them a la carte is good.
I defer to Jim though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would lean towards keeping it as well
| const std::string helper_name = "transposed_"+helper_layout.to_string(); | ||
| if (m_helper_fields.find(helper_name == m_helper_fields.end()) { | ||
| // We can add a new helper field for this layout | ||
| const auto helper_layout = m_transpose ? layout.transpose() : layout; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the check on m_transpose here is superfluous (possibly confusing), since we are already inside if (m_transpose).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ahh yes, good point. I missed this when I add the if (m_transpose) conditional. I'll fix it.
| const auto& fl = count.get_header().get_identifier().get_layout().to_string(); | ||
| const std::string helper_name = "transposed_"+fl; | ||
| auto& temp = m_helper_fields.at(helper_name); | ||
| transpose(count,temp); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you ok having the call to transpose end up in the "duration_write" timing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, maybe it doesn't matter. I think we should prob get rid of that timing anyways, as it doesn't add much to what's already reported in the timings file.
This commit updates the definition of variable dimensions in the scorpio output interface to take into account whether or not the output should be transposed. This commmit transposes the output just before the write step, when desired. Tested locally using python. Next commit will add a unit test.
This commit also addresses a few reviewer comments. It uses the FieldLayout transpose command rather than using confusing index skipping when defining a transposed variable for output. It fixes an indentation error.
This commit addresses a reviewer suggestion that we can add "helper" fields to the output interface for cases like transposed output. We can then initialize those helper fields, which are defined by field layout, once and resue the allocated memory when we transpose and write output. This commit also updates the checks for the field transpose function to ignore cases where either of the fields in the transpose operation have the units set to "invalid"
This commit does two things for the compare-nc-files testing script. 1) If no comparison string is provided the script will compare all variables in the source file to the target file. This happens when -c is left blank in the command line arguments. 2) Provides an option to allow testing even when the dimension of the variables in one file might be transposed in the other. This happens when -a or --allow-transform is passed as a command line argument.
This commit changes the current method for slicing variables in favor of xarray build in functions. It also addresses a reviewer comment about spacing and in building helper fields only for transposed output.
244078c to
6a52b46
Compare
|
@AaronDonahue are you still working on this? |
Yes, I had set it to retest expecting that after rebase it would start to pass. I'll pick this up again and figure out why it is failing testing. |
Make it possible to request output transposed in EAMxx. Note,
timewill still be the first dimension in all cases. This option will transpose the remaining dimensions.To trigger the option, the user needs to add
to the output YAML file.
[BFB]