Testing on GPU

1a. to get gcc to match the intel compiler's `-fp-model=strict`, one uses
`-ffp-contract=off`, which stops gcc from contracting `a*b + c` into a fused
multiply-add.

1b. to get cuda to do the same, one adds `--fmad=false` to `cuda_args` in
`nvcc_wrapper`, so that nvcc does not fuse multiplies and adds in device code.

2. in addition, team reductions and scans will still diff, because the order in
which partial results are combined differs on gpu. hommexx serializes these
with a wrapper. however, they account for too much of the code in p3, so we
can't use this route without losing a lot of testing of important parallelism.
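
to make the two sources of diffs concrete, here is a minimal standalone sketch in plain c++ (no kokkos). all function names are hypothetical and are not part of p3 or hommexx; `Real` is assumed to be `double`. the first pair of functions shows what fma contraction changes (1a/1b), the second pair shows why summation order matters (2).

```cpp
#include <cmath>

using Real = double;  // assumption: double-precision build

// a*b + c with the product rounded before the add (no contraction).
// volatile keeps the compiler from fusing the two operations.
inline Real mul_then_add(Real a, Real b, Real c) {
  volatile Real p = a * b;
  return p + c;
}

// a*b + c with a single rounding at the end (fused multiply-add).
inline Real fused_mul_add(Real a, Real b, Real c) {
  return std::fma(a, b, c);
}

// sum three values left to right, as a serial loop would.
inline Real sum_left_to_right(Real a, Real b, Real c) {
  volatile Real s = a + b;
  return s + c;
}

// sum the same values combining the last two first, as one possible
// tree-reduction order would.
inline Real sum_right_to_left(Real a, Real b, Real c) {
  volatile Real s = b + c;
  return a + s;
}
```

with `a = 1 + 2^-27`, `b = 1 - 2^-27`, `c = -1`, the unfused version returns exactly 0 while the fused one returns `-2^-54`. with inputs `1e16, -1e16, 1`, the two summation orders return 1 and 0. the flags in 1a/1b remove the first kind of diff only; the second kind stays unless the reduction is serialized.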

thus, for gpu tests, we should do non-bfb (non-bit-for-bit) testing and set an
appropriate tolerance. we can augment the tolerance specification in
`run_and_cmp` with something like
```
  if (OnGpu<Kokkos::DefaultExecutionSpace>::value)
    tol = 10*std::numeric_limits<Real>::epsilon();
```
we can still use 1a and 1b to get the diffs down to just the reduction/scan
ones.
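
for reference, a tolerance-based comparison could look roughly like the sketch below. this is not the actual `run_and_cmp` code: `approx_equal` and `count_diffs` are hypothetical names, `Real` is assumed to be `double`, and the use of a relative (rather than absolute) difference is an assumption. `tol == 0` is treated as a request for a bfb comparison.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

using Real = double;  // assumption: double-precision build

// compare a reference value against a test value.
// tol == 0 means bit-for-bit; otherwise the difference is measured
// relative to the larger magnitude of the two values.
inline bool approx_equal(Real ref, Real val, Real tol) {
  if (ref == val) return true;
  if (tol == 0) return false;
  const Real scale = std::max(std::abs(ref), std::abs(val));
  return std::abs(ref - val) <= tol * scale;
}

// count the entries that differ by more than tol.
// the two arrays are assumed to have the same length.
inline std::size_t count_diffs(const std::vector<Real>& ref,
                               const std::vector<Real>& val, Real tol) {
  std::size_t ndiff = 0;
  for (std::size_t i = 0; i < ref.size(); ++i) {
    if (!approx_equal(ref[i], val[i], tol)) ++ndiff;
  }
  return ndiff;
}
```

a cpu run would call this with `tol = 0` and a gpu run with something like `10*std::numeric_limits<Real>::epsilon()`, so the same comparison routine serves both cases.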

