Update `_get_start_data` to always grab the beginning of timestep time #3414
Conversation
I've not pushed the fix for now, so the new test exposing the issue can be seen to fail; I will push the fix after CI gets started. Also realizing that `return (self.operator.prev_res[-1].time[0], len(self.operator.prev_res) - 1)` might be clearer than `return (self.operator.prev_res[-1].time[-2], len(self.operator.prev_res) - 1)`. It seems this member variable will always have length 2, so perhaps a `0` is clearer.
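Since the `time` member always has length 2, indices `0` and `-2` address the same element, which is the point of the suggestion above. A trivial illustration with made-up values:

```python
# StepResult.time holds [step_start, step_end] for each depletion step.
time = [86400.0, 172800.0]  # illustrative start/end times in seconds

# For a length-2 list, index 0 and index -2 are the same element,
# so time[0] reads more clearly than time[-2].
assert time[0] == time[-2]
print(time[0])  # step start time
```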
Looks like I made a silly mistake regarding the test files. Perhaps I'm misunderstanding what files GitHub Actions has access to, but I thought that pushing the XML/h5 files would be enough. Either way, @paulromano, I'm curious about your thoughts on this test in general, since it feels a little non-standard. I think it's important to test for this case, though. I'm still going to hold off on pushing the fix until this test fails in the way we expect (the same as in #3387).
It looks like the standard/best practice is to use …
…run can restart with the proper timesteps (force-pushed af35184 to 879350c)
Tested locally; this commit should now show only a failure for the new test. Once that is the only failure, I will push the correction and show that the commit then passes tests and can be merged.
I was hoping there would be more printout from GitHub Actions (and this time it stopped after failing this test), but it looks like this is the only failing test in CI. Locally, I get the expected failure as printout. I will now push the fix.
Force-pushed a244474 to 624559d
…estep time so it will work whether a previous simulation completes or is interrupted (force-pushed 624559d to 5643fae)
I realized that there already exists a (much) simpler chain in the repo, so I switched to using that. I also realized that the tests use the NNDC cross sections, while the XML/h5 files I generated used ENDF/B-VIII.0. There are some nuclide differences (e.g. the handling of carbon) which might be causing the error. To eliminate possible sources of error, I regenerated the XML/h5 files.
gonuke
left a comment
A few edits of the comments you added
Co-authored-by: Paul Wilson <paul.wilson@wisc.edu>
gonuke
left a comment
This all looks good to me - thanks @lewisgross1296
In case @paulromano may be wondering: this is basically a one-line PR (two characters, in fact) other than the tests...
paulromano
left a comment
@lewisgross1296 Thanks for the fix here (and thanks for the nudge @gonuke). The one-line change looks good to me. The test, however, is a little more problematic, because if and when we have other changes that require updating the `continue_depletion_results.h5` file, there is no easy way to do that. Other tests recognize when you call `pytest --update` and will update their reference result files accordingly, but I don't see any way of doing that for this test since it requires killing a process. Are you OK with just removing the test for the time being? Obviously not ideal, but honestly I would prefer to have no test rather than a test that is difficult to update in the future.
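The `pytest --update` convention mentioned above boils down to a regenerate-or-compare pattern. The helper below is a hypothetical sketch (not OpenMC's actual test machinery) of what such a flag typically gates:

```python
import json
import os
import tempfile

def check_against_reference(result, ref_path, update=False):
    """Hypothetical helper: compare a result against a stored reference,
    regenerating the reference file when update=True (or when missing)."""
    if update or not os.path.exists(ref_path):
        with open(ref_path, "w") as f:
            json.dump(result, f)
    with open(ref_path) as f:
        reference = json.load(f)
    return result == reference

ref = os.path.join(tempfile.mkdtemp(), "reference.json")
print(check_against_reference({"keff": 1.0}, ref, update=True))  # True
print(check_against_reference({"keff": 1.0}, ref))               # True
print(check_against_reference({"keff": 2.0}, ref))               # False
```

The difficulty pointed out here is that this particular test's reference file cannot be regenerated by any such flag, since producing it requires killing a live process.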
Probably fine to get rid of the test from our perspective. We briefly discussed whether we could generate a test that would result in an interrupted job, thus creating this file on the fly, but I'm not sure whether that's possible.
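On creating the interrupted job on the fly: one conceivable approach is to launch the run in a subprocess and terminate it partway through, letting it leave a partially written results file behind. A minimal stdlib sketch of just the kill mechanics, with a sleeping child process standing in for a real depletion run:

```python
import subprocess
import sys
import time

# Stand-in for a long-running depletion simulation.
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])

time.sleep(0.5)    # let the "simulation" get partway through a step
child.terminate()  # simulate a wall-time kill (SIGTERM on POSIX)
returncode = child.wait()

# A terminated process exits with a nonzero (negative on POSIX) return
# code, leaving any partially written output behind.
print(returncode != 0)  # True
```

Whether this is robust enough for CI (timing, platform signal differences) is exactly the open question raised above.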
OK, sounds good. I'll go ahead and remove the test and merge this. If you guys are able to figure out a good way to test it, feel free to follow up with another PR, but no sweat otherwise.
Yeah, the test seems difficult to maintain, since I had to locally kill the simulation to create that h5 file. It probably makes the most sense to remove it, but knowing the test passed here is hopefully sufficient. If something in the future causes the continue runs to break, we can cross that bridge when we get there. Hopefully this change is resistant to being broken, and the updated docstring should help anyone who might need to use or change `_get_start_data`. Thanks for finishing this up @paulromano!

Description
As of right now, any interrupted depletion simulation (think max wall time or unintended power off) will record an incorrect time for the point at which the simulation restarts. This happens whether `continue_timesteps` is True or False (independent of #3272) and needs to be fixed. This function is the problem:

`openmc/openmc/deplete/abc.py`, lines 821 to 825 in a7d1ceb
The fix is actually a one-liner: `.time[-1]` → `.time[-2]`. I will explain. Every `StepResult` in a depletion simulation has a member `time`, which is a list of floats:

`openmc/openmc/deplete/stepresult.py`, lines 25 to 36 in a7d1ceb
For every step, the `time` member has values `[t, t + dt]` for the given step. When reading in previous results, the last depletion step appears to have `dt = 0` and is thus stored as `[t_final, t_final]`. I discovered this by printing out `time_dset[step, :]` in the `from_hdf5` method each time it is called in the `Results` constructor:

`openmc/openmc/deplete/results.py`, lines 66 to 77 in a7d1ceb
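The pattern described above can be mimicked with a toy reconstruction (values invented, not read from a real `depletion_results.h5`):

```python
# Toy model of the per-step `time` lists in a depletion results file.
# Each step stores [t, t + dt]; a run that completes normally appends a
# final zero-width entry [t_final, t_final].
dt = 86400.0  # one-day steps (illustrative)
steps = [[i * dt, (i + 1) * dt] for i in range(2)]  # [[0, 86400], [86400, 172800]]
steps.append([steps[-1][1]] * 2)                    # final step: [172800, 172800]

print(steps[-1])                      # [172800.0, 172800.0]
print(steps[-1][0] == steps[-1][-1])  # True: dt == 0 on the final step
```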
This is fine if the simulation never gets killed, but it introduces a problem when a depletion simulation is not guaranteed to reach the last step with `[t, t]`. Basically, if a simulation gets killed, the final `time` member will look something like `[86400, 172800]`. Since that step never finishes, in the next run (continue or not) `self.operator.prev_res[-1].time[-1]` is actually `172800`. However, a simulation that restarts from this previous run should restart at `t = 86400` and finish the step. In the restart, we currently start at `t = 172800` with the first `dt` of the new simulation. When this gets written, it appears as if the wrong `dt` exists in the `depletion_results.h5` file: the sum of the final `dt` of the last step and the first new `dt` for the interrupted step. Since the first value in the `time` list will always be the time you want in a restart (whether the previous run ran to completion or was killed), we should grab this value instead. Very subtle.
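The effect of the change can be sketched with stand-in objects; `get_start_data` below is a simplified stand-in for `_get_start_data`, not the real method, and the `SimpleNamespace` results mimic `StepResult.time`. Whether the previous run completed (final step `[t_final, t_final]`) or was interrupted (final step `[t, t + dt]`), the first entry of the last step's `time` list is the correct restart time:

```python
from types import SimpleNamespace

def get_start_data(prev_res):
    """Sketch of the fixed behavior: use the last step's *start* time."""
    return prev_res[-1].time[0], len(prev_res) - 1

# Completed run: final step stored as [t_final, t_final].
completed = [SimpleNamespace(time=[0.0, 86400.0]),
             SimpleNamespace(time=[86400.0, 86400.0])]
# Interrupted run: final step never finished, stored as [t, t + dt].
interrupted = [SimpleNamespace(time=[0.0, 86400.0]),
               SimpleNamespace(time=[86400.0, 172800.0])]

print(get_start_data(completed))    # (86400.0, 1)
print(get_start_data(interrupted))  # (86400.0, 1)
```

Both cases yield the same restart time, which is why the subtle off-by-one-index bug only showed up for interrupted runs.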
Fixes #3387.
Checklist
I have run clang-format (version 15) on any C++ source files (if applicable)