Description
In some cases, Ifpack2::BlockTriDiContainer tries to allocate a huge amount of memory for the view with label amd.A_x_offsets. This allocation happens inside BlockHelperDetails::precompute_A_x_offsets, which is one of the steps in symbolic setup.
Example output from when this is triggered:
Kokkos ERROR: HIP memory space failed to allocate 205.1 TiB (label="amd.A_x_offsets").
It looks like there are 2 possible causes:
- The allocation is a 3D view. One dimension is the result of a Kokkos::Max reduction, which will return the minimum value for the type (a very large negative number for signed ints) if the policy has 0 elements. On the other hand, another dimension is numRows, so if numRows==0, the total size of A_x_offsets should also be 0.
- Some value that the size depends on is uninitialized or garbage when precompute_A_x_offsets is called. Valgrind doesn't find any invalid or uninitialized reads in the unit tests though. Will try Kokkos debug+bounds checking.
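To illustrate the first hypothesis, here is a minimal standard-library sketch of how a max reduction over an empty range can produce an enormous extent. It is not the Ifpack2 or Kokkos code: the helper names (max_entries_per_row, safe_extent, total_elements) and the meaning given to each dimension are assumptions made for illustration only.

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical stand-in for a max reduction over per-row counts.
// Like the identity of a max reducer, the result starts at the lowest
// representable int, so an empty input returns that value unchanged.
int max_entries_per_row(const std::vector<int>& row_counts) {
  int result = std::numeric_limits<int>::lowest();
  for (int c : row_counts) result = std::max(result, c);
  return result;
}

// Guarded variant: clamp to 0 so an empty range gives a valid extent
// instead of a negative number reinterpreted as a huge unsigned one.
std::size_t safe_extent(const std::vector<int>& row_counts) {
  return static_cast<std::size_t>(std::max(0, max_entries_per_row(row_counts)));
}

// Total element count of a 3D allocation (unsigned arithmetic, wraps on overflow).
std::size_t total_elements(std::size_t num_rows, std::size_t max_per_row,
                           std::size_t third_dim) {
  return num_rows * max_per_row * third_dim;
}
```

With an empty input, the unguarded result converts to a very large std::size_t, while safe_extent returns 0. If the row count is also 0, total_elements is 0 either way, which matches the observation above that an empty reduction alone should not explain the failed allocation unless the reduction range and numRows can disagree.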
- Find reproducer
- Add unit test
- Fix