In the reduction function, there is a point where we may set intermediate values that may play a role in the reduction to a value of 0. This is done here.
This is technically a bug
While this isn't a problem in the timestep calculation, it could be a problem if we ever reuse the machinery anywhere else. In reality, we should set the new minimum to the equivalent of std::numeric_limits::lowest(). When Real is a float this provides -FLT_MAX and when Real is a double, it provides -DBL_MAX.
In reality there are 2-ish options solutions
- we could directly use
std::numeric_limits<Real>::lowest(). There is a bit of a compatibility issue with CUDA (but I'm pretty sure there isn't an issue with HIP). We would need to either:
- compile cuda with the
--expt-relaxed-constexpr flag. This is technically an experimental feature, but I think just about everyone uses it these days. It lets you use constexpr functions on the device
- we could look into using libc++
- We can add custom logic to pick between
-FLT_MAX and -DBL_MAX based on the way that Real has been defined
I would personally prefer option 1, since there are a few other things from the C++ standard library that I have reached for in the past, that would have simplified some code.
In the reduction function, there is a point where we may set intermediate values that may play a role in the reduction to a value of 0. This is done here.
This is technically a bug
While this isn't a problem in the timestep calculation, it could be a problem if we ever reuse the machinery anywhere else. In reality, we should set the new minimum to the equivalent of std::numeric_limits::lowest(). When
Realis afloatthis provides-FLT_MAXand whenRealis adouble, it provides-DBL_MAX.In reality there are 2-ish options solutions
std::numeric_limits<Real>::lowest(). There is a bit of a compatibility issue with CUDA (but I'm pretty sure there isn't an issue with HIP). We would need to either:--expt-relaxed-constexprflag. This is technically an experimental feature, but I think just about everyone uses it these days. It lets you useconstexprfunctions on the device-FLT_MAXand-DBL_MAXbased on the way thatRealhas been definedI would personally prefer option 1, since there are a few other things from the C++ standard library that I have reached for in the past, that would have simplified some code.