Update constants to use values from CODATA 2022 #5661
Conversation
Thanks, Dave. If there is a large number of benchmarks that need to be reset, this could be a good opportunity to test our tool Tools/DevUtils/update_benchmarks_from_azure_output.py, and its instructions in our documentation, again. In theory, I updated and tested the tool manually in #5372, but it is not tested automatically yet.
Thanks @EZoni ! It worked and was easy to do. BTW, to download the raw log file, I copied the URL from the location bar and pasted it into the … Note that almost all of the changes in the benchmarks are small, as expected (~1.e-9 or smaller). One exception is the …
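For a sense of that scale, comparing one constant across the two CODATA releases shows the expected relative shift. The two electron-mass values below are the published CODATA 2018 and CODATA 2022 recommended values:

```python
# Electron mass in kg: CODATA 2018 vs CODATA 2022 recommended values.
m_e_2018 = 9.1093837015e-31
m_e_2022 = 9.1093837139e-31

# The relative shift between the releases sets the scale of
# the benchmark changes observed in this PR.
rel_shift = abs(m_e_2022 - m_e_2018) / m_e_2018
print(rel_shift)  # ~1.4e-9
```

Most constants moved by a comparable or smaller relative amount, which matches the ~1.e-9 changes seen in the checksums.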
I agree. I think that test has relatively large tolerances anyways, if I remember correctly. @aeriforme, what do you think?
Thanks for pointing this out! I added this hint to our documentation in #5663.
…tion_psb analysis test
lucafedeli88
left a comment
Thanks @dpgrote ! I think that updating constants to CODATA 2022 is a good idea.
I've reviewed the logs of the tests that have failed.
I agree that issues with test_3d_beam_beam_collision.checksum (and also with test_2d_collision_xz_picmi.checksum) are likely due to the fact that these tests are relatively long.
We need to investigate a bit better why test_1d_ohm_solver_ion_beam_picmi and test_3d_qed_schwinger_2 show non-negligible discrepancies (I will have a look at the QED-related one).
We need to increase the tolerance of several analysis scripts, since they seem to be a bit too strict:
test_2d_theta_implicit_jfnk_vandb, test_2d_theta_implicit_jfnk_vandb_filtered, test_2d_theta_implicit_jfnk_vandb_picmi, test_2d_theta_implicit_strang_psatd, test_2d_pec_field_insulator_implicit_restart, test_3d_particle_boundaries, test_3d_load_external_field_grid_picmi, test_3d_particle_fields_diags, test_3d_reduced_diags.
I can provide a possible explanation for the QED-related test. Looking at the analysis script, I found a comment noting that the analysis script uses constants from (I think) CODATA 2014, while PICSAR-QED uses CODATA 2018.
@lucafedeli88 Thanks for looking over this PR. I think a number of the issues are related to the use of scipy.constants in the analysis scripts, since the constants are inconsistent, i.e., still the CODATA 2018 values. Until this issue is resolved, I don't think this PR should be merged. Unfortunately, there is no easy solution. The simplest is probably to use the most recent version of Ubuntu for the CI tests, which comes with the most recent scipy and its updated constants. A more robust, longer-term solution would be to create a new light-weight Python module that provides the constants with values consistent with the ones in C++, and then use it everywhere instead of relying on scipy.constants. This would guarantee that the values are always consistent.
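A minimal sketch of what such a module could look like. The module name and the particular set of constants shown are hypothetical; the numerical values are the CODATA 2022 ones, with mu0 and alpha derived so that the relations between the constants hold exactly, in the spirit of what this PR does on the C++ side:

```python
# Hypothetical light-weight module, e.g. "warpx_constants.py" (sketch).

# Exact SI defining constants (unchanged across CODATA releases):
c = 299792458.0          # speed of light [m/s]
q_e = 1.602176634e-19    # elementary charge [C]
h = 6.62607015e-34       # Planck constant [J*s]

# Measured constants (CODATA 2022 recommended values):
m_e = 9.1093837139e-31   # electron mass [kg]
eps0 = 8.8541878188e-12  # vacuum permittivity [F/m]

# Derived, so the relations between constants are exact; these will
# differ slightly from the tabulated scipy.constants values.
mu0 = 1.0 / (eps0 * c**2)              # vacuum permeability [H/m]
alpha = q_e**2 / (2.0 * eps0 * h * c)  # fine-structure constant
```

Analysis scripts would then import from this module instead of scipy.constants, so the Python side always agrees with the C++ side regardless of the scipy version installed on the runner.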
We could maybe discuss this in the upcoming developers' meeting. Using Ubuntu 24.04 in CI tests should be rather straightforward. |
Yes, we can do that soon. There had been discussions about the "need" to have less strict tolerances for the checksums to be compatible with version upgrades like this one. The work done in #5456 has set up things so that we can do that easily (see point "Add logic to reset tolerances based on environment variables" in the follow-up list of that PR description). This said, Ubuntu LTS version upgrades come once every two years, so I personally think that the tolerance fine-tuning is not a real roadblock for this particular update; we could simply upgrade the Ubuntu version and reset the checksums that need to be reset. I will get to this, one way or another, as soon as possible.
Ideally I thought it could be good to have the Ubuntu upgrade run through first, without prior checksum changes, and record how the tests fail, like I did in #5731 (comment), to make it easier to discuss if/when a tolerance upgrade could be appropriate. But let's see if we can still assess this despite the pre-existing checksum changes.
As mentioned in one of the last comments in #5731, we can try to merge the following:

```shell
# remove system copy of Matplotlib to avoid conflict
# with version set in the requirements file - see, e.g.,
# https://github.com/matplotlib/matplotlib/issues/28768
sudo apt remove python3-matplotlib
```
This test is still failing. I looked into it some and don't understand what is happening. The momentum is never modified but the values somehow change slightly at some point, from 299792458 to 299792458.00000006.
Should we relax that tolerance to the machine limit for double-precision floating point to finalize this PR? We could then open a follow-up PR where someone familiar with the test can reset the tolerance to strict zero (if it is indeed true that the momentum variable is never touched by any part of the code at any point in the test) and debug further.
To get this finished, I agree that the tolerance should be increased; I'll set it accordingly. Future work is to examine why the change is happening and reduce the tolerance back to zero when that is fixed.
Even though the momentum is not modified, somehow the last bit is jiggled, changing the value. So, instead of a tolerance of zero, it is set to the smallest difference resolvable in double precision.
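For what it's worth, the jump from 299792458 to 299792458.00000006 is exactly one ULP (unit in the last place) for a double of that magnitude, which can be checked directly in Python:

```python
import math

c = 299792458.0

# Spacing of double-precision floats at this magnitude:
# 2**(28 - 52) = 2**-24, since 2**28 <= c < 2**29.
print(math.ulp(c))                  # ~5.96e-08 (exactly 2**-24)

# The next representable double above c is exactly the
# value observed in the failing test.
print(math.nextafter(c, math.inf))  # 299792458.00000006
```

So the observed change is the smallest possible perturbation of the stored value, consistent with setting the tolerance to the double-precision limit.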
@EZoni There is still some weirdness. The values for the CI test …
Kind of difficult to know if this was just some kind of Azure runner glitch or not. I think it could be worth trying to reset the checksums again and see if they still fail. @roelof-groenewald might be familiar with that test as well, not sure if he observed anything like this already in the past. |
I don't know why the ion impact ionization test would be impacted but not the electron impact ionization one. They run the same DSMC routine, just with different colliding species. So it is odd for only one to have its checksum values changed. The ion version of that test is pretty new, though, so maybe it is a bit more unstable than the others. We can keep an eye on it with future PRs to see if it fails "randomly".
@EZoni @roelof-groenewald I did update the benchmarks again and it failed again. |
This is very strange, I don't understand it. Especially, as @roelof-groenewald said, based on the input file, this test seems identical to the electron one, just with an ion species instead. I don't understand why repeating the test on Azure runners changes the checksums of the ion test (and only of that one). |
If I repeat the test multiple times locally on my computer, I get the same checksums. Once I reset them the first time, they seem to be okay on each subsequent run.
…_velocity_for_diagnostics This seems to fix the issue. There was apparently an out-of-bounds access during the momentum synchronization, which was being done before the particle boundaries were applied at the last step. Turning on the option fixes this.
@EZoni @roelof-groenewald The CI test is now fixed. The issue was a long-suspected bug related to how the synchronization of the velocity is done for the final diagnostic. The fix was to use the new option warpx.synchronize_velocity_for_diagnostics.
Thanks for digging into that test failure, @dpgrote!
roelof-groenewald
left a comment
Co-authored-by: Roelof Groenewald <[email protected]>
lucafedeli88
left a comment
Thanks a lot for this PR!
One question: if not setting warpx.synchronize_velocity_for_diagnostics = 1 can lead to out-of-bounds accesses, shouldn't we set this to 1 by default and maybe warn the user if they set it to 0?
Ah, yes, we probably should; otherwise @dpgrote would not have opened #5816.
Thanks a lot for the fix and for all the work, @dpgrote. And thanks, @lucafedeli88 and @roelof-groenewald, for your work and feedback as well. Is it clear why this issue started to appear when we updated the physical constants? Why did it not occur before? Anyways, I will approve and merge the PR in a minute.
Thanks @EZoni ! I don't know why that CI test started failing here - I can only guess that it was chance. I didn't dig into finding exactly what was going wrong. |
The Ubuntu 20.04 runner image will be deprecated soon: actions/runner-images#11101. Only support CUDA 11.7+ going forward.

To-do:
- [x] GitHub Actions workflows
- [ ] ~~Azure DevOps workflows~~ (transferred to BLAST-WarpX#5661)

### Notes
I renamed the following workflows, with the aim of not having to rename them again in the future when we upgrade version numbers:
- `NVCC 11.3 SP`, now `NVCC SP`
- `NVCC 11.8.0 GNUmake`, now `NVC GNU Make`
- `[email protected] NVCC/NVC++ Release [tests]`, now `NVHPC`

If you are okay with the renaming, the list of "required" workflows should be updated to reflect the new names; I don't have the necessary access to do so.
This PR adds a simple hint suggested by @dpgrote in BLAST-WarpX#5661 (comment) related to the how-to guide on how to use our [update_benchmarks_from_azure_output.py](https://github.com/ECP-WarpX/WarpX/blob/development/Tools/DevUtils/update_benchmarks_from_azure_output.py) to update checksum benchmarks from the Azure raw log output.
…AST-WarpX#5661) The values of the physical constants were from CODATA 2018. These should be updated to the current accepted values as specified in CODATA 2022. This breaks many CI benchmarks, since the checks are at a higher precision than the changes in the constants. This PR change is also needed since the older version of Ubuntu (which is used for the Azure CI tests) is being deprecated. This PR switches to the newest Ubuntu, which includes a newer version of scipy where the constants have also been updated (since version 1.15.0, January 3). Note that this PR has fine-tuned values of `mu0` and `alpha` so that the relation between the constants is exact. This means that these values will differ slightly from the values in scipy.constants. Many of the CI tests were using locally defined constants in the input files and analysis scripts. These were all changed to rely on the built-in constants in the parser and `scipy.constants`. --------- Co-authored-by: Edoardo Zoni <[email protected]> Co-authored-by: Luca Fedeli <[email protected]> Co-authored-by: Roelof Groenewald <[email protected]>