Thank you for taking the time to submit an issue!
Background information
A key NOAA/NCEP data assimilation Fortran code (GSI) uses quad precision in some sections of the code. The MPI reductions return correct answers with Intel OneAPI MPI and cray-mpich. Open MPI returns incorrect answers when built with either GNU gcc/gfortran or Intel OneAPI icx/ifx.
What version of Open MPI are you using? (e.g., v4.1.6, v5.0.1, git branch name and hash, etc.)
v5.0.9
Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)
Open MPI was installed from a recent GitHub release tarball and built with gcc and gfortran v12.4.0. The Fortran test case was built with gfortran.
If you are building/installing from a git clone, please copy-n-paste the output from git submodule status.
Not applicable; installed from a tarball.
Please describe the system on which you are running
- Operating system/version: Rocky Linux 9.4 (Blue Onyx)
- Computer hardware: AMD Genoa 9654
- Network type: IB NDR200
Details of the problem
A Fortran test case is attached; run it with 2 MPI ranks. The test prints both double and quad precision results, with the double precision result serving as a correct reference. When run with Intel OneAPI we see the following output (both results are correct):
rank= 1 input = 6.28318530717958647692528676655901
rank= 0 input = 3.14159265358979323846264338327950
double precision result rank= 0 output= 9.42477796076938
quad precision result rank= 0 output= 9.42477796076937971538793014983851
OneAPI result= 9.42477796076937971538793014983851
When run with gfortran and Open MPI, the double precision result is correct but the quad precision result is wrong (for 2 ranks the expected sum is pi + 2*pi = 3*pi, about 9.42477796076937971538793014983851):
rank= 1 input = 6.28318530717958647692528676655900559
rank= 0 input = 3.14159265358979323846264338327950280
double precision result rank= 0 output= 9.4247779607693793
quad precision result rank= 0 output= 6.28318530717960068778000196856272302
OneAPI result= 9.42477796076937971538793014983851
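For reference, a typical build-and-run sequence using the Open MPI wrapper compilers (the exact commands were not part of the report, so mpifort here is an assumed but standard choice):

   mpifort -o test_reduce_sum_real16 test_reduce_sum_real16.f90
   mpirun -np 2 ./test_reduce_sum_real16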
Source code:
! test for quad precision (real16) mpi_allreduce
! [email protected]
program test_reduce_sum_real16
   use, intrinsic :: iso_fortran_env, only : real64, real128
   use, intrinsic :: iso_c_binding
   use mpi_f08
   implicit none

   integer, parameter :: n = 1
   ! kind with at least 33 decimal digits (matches real128 where supported); unused below
   integer, parameter :: r_quad = selected_real_kind(33)
   real(kind=real64)  :: output64, input64
   real(kind=real128) :: output, input
   ! real(kind=real128), parameter :: output_ref_impi = 9.42477796076937971538793014983851Q0
   real(kind=real128), parameter :: pi   = ACOS(-1.0Q0)
   real(kind=real64),  parameter :: pi64 = ACOS(-1.0D0)
   integer :: me, np, ierror

   call MPI_Init()
   call MPI_Comm_rank(MPI_COMM_WORLD, me)
   call MPI_Comm_size(MPI_COMM_WORLD, np)

   ! Initialize data: rank 0 contributes pi, rank 1 contributes 2*pi
   output   = 0.0Q0
   output64 = 0.0D0
   input    = pi   * real(me+1, kind=real128)
   input64  = pi64 * real(me+1, kind=real64)
   print *, 'rank=', me, ' input =', input

   ! Quad precision reduction (the failing case) and double precision reference
   call MPI_Allreduce(input,   output,   n, MPI_REAL16, MPI_SUM, MPI_COMM_WORLD, ierror)
   call MPI_Allreduce(input64, output64, n, MPI_REAL8,  MPI_SUM, MPI_COMM_WORLD, ierror)

   if (me == 0) then
      print *, 'double precision result rank=', me, ' output=', output64
      print *, 'quad precision result rank=', me, ' output=', output
      print *, 'OneAPI result= 9.42477796076937971538793014983851'
   endif

   call MPI_Finalize()
end program test_reduce_sum_real16
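A possible workaround sketch while the reduction is broken, assuming the defect is confined to the predefined MPI_REAL16 / MPI_SUM path: gather the raw 16-byte real128 values as MPI_BYTE and sum them locally on every rank. The subroutine below is a hypothetical illustration (the name and interface are mine, not GSI's):

subroutine allreduce_sum_real128(input, output, comm)
   ! Hypothetical workaround: avoid the built-in MPI_REAL16 reduction by
   ! shipping each rank's value as opaque bytes and summing locally.
   use, intrinsic :: iso_fortran_env, only : real128
   use mpi_f08
   implicit none
   real(kind=real128), intent(in)  :: input
   real(kind=real128), intent(out) :: output
   type(MPI_Comm),     intent(in)  :: comm
   real(kind=real128), allocatable :: vals(:)
   integer :: np, i, nbytes

   call MPI_Comm_size(comm, np)
   allocate(vals(np))
   nbytes = storage_size(input) / 8   ! 16 bytes for IEEE binary128
   call MPI_Allgather(input, nbytes, MPI_BYTE, vals, nbytes, MPI_BYTE, comm)
   output = 0.0_real128
   do i = 1, np
      output = output + vals(i)   ! identical summation order on every rank
   end do
end subroutine allreduce_sum_real128

Since every rank sums the gathered values in the same order, the result is also bitwise identical across ranks; the cost is O(np) data movement instead of a tree reduction.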