@orvedahl documented some relevant tests for running Rayleigh on GPUs using OpenACC in this repository.
Before starting the actual port, we should enable support for the relevant compilers in the Rayleigh build infrastructure. For Nvidia GPUs that would be nvfortran, the successor to the PGI Fortran compiler. It lacks support for quad-precision floating-point numbers, which we use in Math_Layer/Legendre_Polynomials.F90, so we need to find a way around this.
There are several options to fix this:
- Rewrite Legendre_Polynomials.F90 to avoid the use of quad-precision numbers. It needs to be checked if the precision is needed after all.
- Implement quad precision through an external library, such as GMP.
- Build the code using gfortran with its OpenACC support for Nvidia-PTX offloading. I have tested this and it works, but the downside here is that we will essentially always have to build our own compiler on each Nvidia GPU enabled cluster.
- Build only Legendre_Polynomials.F90 with a different compiler (e.g., gfortran). This is hard to do, because the .mod file formats of nvfortran and gfortran are not compatible.
I am leaning towards option 1 and, if that doesn't work, falling back to option 2.
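One way to approach the "is the precision needed after all" question for option 1 is to compare a double-precision evaluation against an exact reference. The following is only a minimal sketch under stated assumptions: it uses the textbook three-term (Bonnet) recurrence for the unnormalized Legendre polynomials P_l(x) as a stand-in, which is not necessarily what Legendre_Polynomials.F90 computes, and the function names `legendre_recurrence` and `max_double_error` are hypothetical.

```python
from fractions import Fraction


def legendre_recurrence(l_max, x, one):
    """Return [P_0(x), ..., P_{l_max}(x)] using Bonnet's recurrence:
    (l + 1) P_{l+1}(x) = (2l + 1) x P_l(x) - l P_{l-1}(x).

    `one` selects the arithmetic: 1.0 for float64, Fraction(1) for exact
    rational arithmetic (used here as the high-precision reference).
    """
    p_prev, p_curr = one, x
    values = [p_prev]
    if l_max >= 1:
        values.append(p_curr)
    for l in range(1, l_max):
        p_next = ((2 * l + 1) * x * p_curr - l * p_prev) / (l + 1)
        values.append(p_next)
        p_prev, p_curr = p_curr, p_next
    return values


def max_double_error(l_max, x_exact):
    """Largest absolute error of the float64 recurrence up to degree l_max.

    x_exact should be a Fraction that float64 represents exactly (for
    example 3/8), so that only the rounding inside the recurrence is
    measured and not the rounding of the input.
    """
    exact = legendre_recurrence(l_max, x_exact, Fraction(1))
    approx = legendre_recurrence(l_max, float(x_exact), 1.0)
    # Fraction(a) converts a float to its exact rational value.
    return max(abs(Fraction(a) - e) for a, e in zip(approx, exact))


if __name__ == "__main__":
    err = max_double_error(200, Fraction(3, 8))
    print(f"max abs error up to l = 200: {float(err):.3e}")
```

A small error here would only suggest that double precision can be enough for this particular recurrence. The actual routine in Rayleigh (its normalization, the degrees it reaches, and the points at which it is evaluated) would need the same kind of comparison before dropping quad precision.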
Once the build system works, we should implement @orvedahl's changes to the loops and also explore whether we can make use of direct GPU-to-GPU MPI communication. That would hopefully allow us to keep the data on the GPU for the whole computation, except for I/O.