Clang UB sanitizer CI test: increase coverage #5597

lucafedeli88 · 2025-01-23T11:08:11Z

In the CI test based on the Clang UB sanitizer, most of the time (~1h30) is spent compiling the code, while only a few minutes are spent actually running simulations. We can therefore increase the coverage of the test by adding more simulations, with a negligible increase in the total runtime.
This PR does just that: most of the cases in Examples/Physics_applications are now tested with the UB sanitizer.

Note that some cases cannot run in single precision (see below). For this reason, the PR also splits the UB sanitizer test into a single-precision test and a double-precision test (the double-precision test covers only the cases that cannot run in single precision).

Updates:

1) Issue found while running inputs_test_3d_beam_beam_collision --> We need to run this case in double precision

The sanitizer found an issue while running `mpirun -n 2 ./build/bin/warpx.3d Examples/Physics_applications/beam_beam_collision/inputs_test_3d_beam_beam_collision`:

    STEP 1 starts ...
    /home/runner/work/WarpX/WarpX/build/_deps/fetchedpicsar-src/multi_physics/QED/include/picsar_qed/containers/picsar_tables.hpp:310:17: runtime error: -nan is outside the range of representable values of type 'int'
    /home/runner/work/WarpX/WarpX/build/_deps/fetchedpicsar-src/multi_physics/QED/include/picsar_qed/containers/picsar_tables.hpp:310:17: runtime error: -nan is outside the range of representable values of type 'int'
    SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /home/runner/work/WarpX/WarpX/build/_deps/fetchedpicsar-src/multi_physics/QED/include/picsar_qed/containers/picsar_tables.hpp:310:17 in
    SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /home/runner/work/WarpX/WarpX/build/_deps/fetchedpicsar-src/multi_physics/QED/include/picsar_qed/containers/picsar_tables.hpp:310:17 in

I've temporarily commented out this case while I investigate the cause. ~~For the moment, I am not able to reproduce the issue on my local machine~~. The cause is the use of single precision for this specific test case: momenta end up being NaN, and the sanitizer detects the attempt to convert a floating-point NaN into an integer.
I have added an `ABLASTR_ALWAYS_ASSERT_WITH_MESSAGE` in `PoissonSolver.H` so that the user gets a more readable error message.
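
The failure mode reported by the sanitizer can be sketched in isolation. Converting a floating-point value that has no integer representation (NaN, infinity, or an out-of-range value) to `int` is undefined behavior in C++, and this is what the runtime error above points at. The helper below, `checked_to_int`, is a hypothetical name used for illustration only; it is not part of PICSAR or WarpX, and it is only meant for `float` and `double` inputs.

```cpp
#include <cmath>
#include <limits>
#include <optional>

// Hypothetical helper (not part of PICSAR or WarpX): convert a float or
// double to int only when the conversion is well defined.
template <typename Real>
std::optional<int> checked_to_int (Real x)
{
    // NaN and infinities have no integer representation.
    if (!std::isfinite(x)) { return std::nullopt; }

    // The conversion truncates toward zero, so it is valid only if
    // INT_MIN - 1 < x < INT_MAX + 1. Both bounds are exact in double.
    constexpr auto lo = static_cast<double>(std::numeric_limits<int>::min());
    constexpr auto hi = static_cast<double>(std::numeric_limits<int>::max());
    const auto xd = static_cast<double>(x);
    if (xd <= lo - 1.0 || xd >= hi + 1.0) { return std::nullopt; }

    return static_cast<int>(x);
}
```

With such a guard, a NaN produced upstream (for instance by `std::sqrt` of a negative argument) yields an empty `std::optional` instead of an undefined conversion, and the caller can decide how to report the problem.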

2) Issue found while running inputs_test_2d_background_mcc --> We need to run this case in double precision

MLMG does not converge in single precision for this simulation case. We need to run it in double precision.

3) Issue found while running free_electron_laser --> We need to run this case in double precision

I have observed this issue:

    STEP 444 starts ...
    0::1::Assertion `m_current_z_lab[i_buffer] >= m_buffer_domain_lab[i_buffer].lo(m_moving_window_dir) and m_current_z_lab[i_buffer] <= m_buffer_domain_lab[i_buffer].hi(m_moving_window_dir)' failed, file "/home/runner/work/WarpX/WarpX/Source/Diagnostics/BTDiagnostics.cpp", line 870, Msg:
     ### ERROR   : z-slice in lab-frame (0.299976) is outside the buffer domain
    #            physical extent (0.299976 to 0.299988).
     !!!

which seems to be related to using single precision instead of double precision. Therefore, we need to run this case in double precision.

4) Issue found while running inputs_test_2d_laser_ion_acc --> bugfix in WarpX

The `inputs_test_2d_laser_ion_acc` case fails as follows in single precision:

    --- INFO    : Writing openPMD file diags/openPMDbw000000
    terminate called after throwing an instance of 'std::runtime_error'
      what():  Datatypes of chunk data (FLOAT) and record component (DOUBLE) do not match.
    SIGABRT
    See Backtrace.0.0 file for details

This comes from the fact that the datatype of this dataset in ParticleHistogram2D.cpp is hard-coded as double:

    auto dataset = io::Dataset(
            io::determineDatatype<double>(),
            {static_cast<unsigned long>(m_bin_num_ord), static_cast<unsigned long>(m_bin_num_abs)});

This PR modifies these lines as follows:

    auto dataset = io::Dataset(
            io::determineDatatype<amrex::Real>(),
            {static_cast<unsigned long>(m_bin_num_ord), static_cast<unsigned long>(m_bin_num_abs)});
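
The mismatch behind this error can be illustrated without openPMD. The sketch below uses hypothetical stand-ins (`Datatype`, `determine_datatype`, `matches`, and a `Real` alias that plays the role of `amrex::Real` in a single-precision build); none of these are the actual openPMD-api or AMReX definitions. It only shows why a dataset declared from a hard-coded `double` cannot match a buffer whose element type follows the build precision.

```cpp
#include <type_traits>

// Hypothetical stand-ins, NOT the openPMD-api types.
enum class Datatype { Float, Double };

// Map a C++ element type to the datatype tag of a dataset.
template <typename T>
constexpr Datatype determine_datatype ()
{
    static_assert(std::is_same_v<T, float> || std::is_same_v<T, double>,
                  "this sketch only covers float and double");
    return std::is_same_v<T, float> ? Datatype::Float : Datatype::Double;
}

// Assumption: stand-in for amrex::Real in a single-precision build.
using Real = float;

// True when a buffer with element type T can be stored in a dataset
// that was declared with the datatype `declared`.
template <typename T>
constexpr bool matches (Datatype declared)
{
    return determine_datatype<T>() == declared;
}
```

In this sketch, `matches<Real>(determine_datatype<double>())` is false in a single-precision build, while `matches<Real>(determine_datatype<Real>())` holds for either precision, which is the idea behind deriving the datatype from `amrex::Real`.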

5) Perform all the simulations in double precision

See #5936

6) Use `ctest` instead of running the tests directly

7) In `IntervalsParser` prevent the use of an uninitialized object

This was discovered by adding a new test case.
The condition `temp_slice.getStart() > m_slices[i_slice].getStart() && i_slice < static_cast<int>(m_slices.size())` must be reversed to `i_slice < static_cast<int>(m_slices.size()) && temp_slice.getStart() > m_slices[i_slice].getStart()`. Otherwise, when `i_slice` equals `m_slices.size()`, we call a method of a non-existent object before the bounds check is evaluated.
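
The role of the operand order can be shown with a small sketch. The `Slice` struct and the `insertion_index` function below are hypothetical simplifications written for illustration; the actual loop in `IntervalsParser` may look different. The point is that `&&` short-circuits, so placing the bounds check first guarantees that the container is never indexed out of range.

```cpp
#include <vector>

// Hypothetical, simplified slice type (the real class has more state).
struct Slice
{
    int start;
    int getStart () const { return start; }
};

// Return the position at which temp_slice should be inserted so that
// slices stays sorted by start value.
int insertion_index (const std::vector<Slice>& slices, const Slice& temp_slice)
{
    int i_slice = 0;
    // Bounds check first: && short-circuits, so slices[i_slice] is only
    // evaluated when i_slice is a valid index.
    while (i_slice < static_cast<int>(slices.size()) &&
           temp_slice.getStart() > slices[i_slice].getStart())
    {
        ++i_slice;
    }
    return i_slice;
}
```

With the operands swapped, the last iteration would read `slices[slices.size()]` before discovering that the index is out of range.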

lucafedeli88 · 2025-03-03T11:51:16Z

ping, @EZoni !

dpgrote

This looks good, thanks!

I will comment that NaNs always make me nervous that there is something bad lurking in the code. Do you know why they appear in that case with single precision?

lucafedeli88 · 2025-04-09T13:20:32Z

> This looks good, thanks!
>
> I will comment that NaNs always make me nervous that there is something bad lurking in the code. Do you know why they appear in that case with single precision?

Thanks, @dpgrote. The NaN comes from this line in `PoissonSolver.H`. Here `beta_solver[2]` is 1.00059783 due to numerical errors, so the argument of the square root is negative. We could indeed try to fix that.

    geom[lev].CellSize(2)/std::sqrt(1._rt-beta_solver[2]*beta_solver[2]))};

lucafedeli88 · 2025-04-09T14:00:08Z

> > This looks good, thanks!
> > I will comment that NaNs always make me nervous that there is something bad lurking in the code. Do you know why they appear in that case with single precision?
>
> Thanks, @dpgrote. The NaN comes from this line in `PoissonSolver.H`. Here `beta_solver[2]` is 1.00059783 due to numerical errors, so the argument of the square root is negative. We could indeed try to fix that.
>
>     geom[lev].CellSize(2)/std::sqrt(1._rt-beta_solver[2]*beta_solver[2]))};

Actually, in this case, we have a 250 GeV electron beam, so we can't represent beta reliably in single precision at such high energies. I think that the best we can do is to add an assert to ensure that beta < 1.
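
A rough estimate supports this. For a 250 GeV electron, the Lorentz factor is about 4.9e5, so 1 - beta is about 1/(2 gamma^2), i.e. roughly 2e-12, which is far below the spacing of single-precision numbers just below 1 (about 6e-8). The sketch below is for illustration only and is not WarpX code: `beta_from_gamma` and `electron_rest_energy_eV` are hypothetical names, and the rest energy is an approximate value.

```cpp
#include <cmath>

// Electron rest energy in eV (approximate value, about 0.511 MeV).
constexpr double electron_rest_energy_eV = 510998.95;

// Normalized velocity beta = v/c for a given Lorentz factor gamma,
// evaluated entirely in the floating-point type Real.
template <typename Real>
Real beta_from_gamma (Real gamma)
{
    return std::sqrt(Real(1) - Real(1)/(gamma*gamma));
}
```

Taking `gamma = 250.0e9 / electron_rest_energy_eV`, the double-precision result stays strictly below 1, whereas the single-precision result rounds to exactly `1.0f`. Any further rounding error in single precision can then push beta above 1, consistent with the value observed above.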

lucafedeli88 · 2025-04-09T14:17:48Z

@dpgrote, I've added this check inside `PoissonSolver.H`:

    ABLASTR_ALWAYS_ASSERT_WITH_MESSAGE (
        std::all_of(beta_solver.begin(), beta_solver.end(),
            [](const auto b){return (b<1.0_rt);}),
        "Components of beta_solver must be < 1.");

so that at least the user has a more readable error message.

EZoni

Looks good to me.

EZoni

Just one last comment, in case it helps make this more compact and robust.

.github/workflows/clang_sanitizers.yml

I left one more comment to address.

Co-authored-by: Edoardo Zoni <[email protected]>

In CI test based on clang UB sanitizers, most of the time (~ 1h30) is spent in compiling the code, while just a few minutes are spent actually running some simulations. This means that we can increase the coverage of the test by adding some more simulations to the tests with a negligible increase of the total runtime. This PR does just that: now all tests in `Examples/Physics_applications` are tested with the UB sanitizer.

Co-authored-by: Axel Huebl <[email protected]>
Co-authored-by: Edoardo Zoni <[email protected]>

add new tests

19333ab

lucafedeli88 added the component: tests Tests and CI label Jan 23, 2025

fix bug

47041c6

lucafedeli88 mentioned this pull request Jan 23, 2025

[WIP] Increase coverage of clang sanitizer tests #5280

Closed

lucafedeli88 changed the title ~~Clang UB sanitizer CI test: increase coverage~~ [WIP] Clang UB sanitizer CI test: increase coverage Jan 23, 2025

lucafedeli88 added 7 commits January 23, 2025 13:50

temporary workaround to test more cases

1f564e7

split clang UB sanitizer into single- and double- precision tests

ee64375

add echo to ease debugging

97cedfd

Move test case to double precision

82bb0cf

add even more test cases

62be62c

mv free_electron_laser to DP tests

d234bc2

dataset in ParticleHistogram2D must be of type amrex::Real, rather th…

bedd665

…an double

lucafedeli88 changed the title ~~[WIP] Clang UB sanitizer CI test: increase coverage~~ Clang UB sanitizer CI test: increase coverage Jan 24, 2025

lucafedeli88 requested review from EZoni and ax3l January 24, 2025 09:32

ax3l reviewed Jan 24, 2025

Source/Diagnostics/ReducedDiags/ParticleHistogram2D.cpp

lucafedeli88 and others added 2 commits January 27, 2025 10:17

Update Source/Diagnostics/ReducedDiags/ParticleHistogram2D.cpp

bdb665a

Co-authored-by: Axel Huebl <[email protected]>

change data type for m_h_data_2D

8ed4236

lucafedeli88 requested a review from ax3l January 27, 2025 12:58

RemiLehe assigned EZoni Jan 28, 2025

lucafedeli88 added 2 commits April 8, 2025 13:24

fix merge conflict

12e395a

test 2d_theta_implicit_jfnk_vandb in single precision

d1af8e3

lucafedeli88 requested a review from dpgrote April 8, 2025 13:52

dpgrote approved these changes Apr 8, 2025

lucafedeli88 added 2 commits April 9, 2025 16:01

Merge remote-tracking branch 'upstream/development' into increase_cov…

82a5bb4

…erage_clang_ub_sanitizer

add error message if beta >= 1 in Poisson Solver

ac2459e

lucafedeli88 enabled auto-merge (squash) April 10, 2025 12:16

Merge remote-tracking branch 'upstream/development' into increase_cov…

33e286e

…erage_clang_ub_sanitizer

lucafedeli88 closed this Apr 29, 2025

auto-merge was automatically disabled April 29, 2025 14:47
Pull request was closed

lucafedeli88 reopened this Apr 29, 2025

Merge remote-tracking branch 'upstream/development' into increase_cov…

6b1d120

…erage_clang_ub_sanitizer

EZoni reviewed Apr 30, 2025

.github/workflows/clang_sanitizers.yml

lucafedeli88 requested a review from EZoni May 13, 2025 15:23

lucafedeli88 changed the title ~~Clang UB sanitizer CI test: increase coverage~~ [WIPClang UB sanitizer CI test: increase coverage Jun 11, 2025

lucafedeli88 changed the title ~~[WIPClang UB sanitizer CI test: increase coverage~~ [WIP] Clang UB sanitizer CI test: increase coverage Jun 11, 2025

lucafedeli88 added 5 commits June 11, 2025 18:11

perform all the tests in double precision

5c0811b

Merge remote-tracking branch 'upstream/development' into increase_cov…

45d1d66

…erage_clang_ub_sanitizer

using ctest to run test cases for clang UB sanitizer

782e502

fix bug

25ac38e

fix minor issue in IntervalsParser.cpp

f5fd7db

lucafedeli88 changed the title ~~[WIP] Clang UB sanitizer CI test: increase coverage~~ Clang UB sanitizer CI test: increase coverage Jun 27, 2025

EZoni previously approved these changes Jun 27, 2025

EZoni reviewed Jun 27, 2025

.github/workflows/clang_sanitizers.yml

EZoni self-requested a review June 27, 2025 19:08

Update .github/workflows/clang_sanitizers.yml

d014e7e

Co-authored-by: Edoardo Zoni <[email protected]>

EZoni approved these changes Jun 28, 2025

EZoni merged commit 468e3f3 into BLAST-WarpX:development Jun 28, 2025
50 checks passed
