[Flang][RT][OpenMP] system_clock behavior with multiple threads #74746

Open
@vzakhari

Reproducer:

program main
  use omp_lib
  integer, parameter :: n = 1000000
  integer(8) :: t1, t2, rate
  real :: a(n), b(n)
  real(8) :: start, end
  b = 1.0
  call system_clock(t1)
  start = omp_get_wtime()
  call work(a, b, n)
  call system_clock(t2, rate)
  end = omp_get_wtime()
  print *, 'system_clock: ', dble(t2-t1)/dble(rate)
  print *, 'omp_get_wtime: ', end-start
contains
  subroutine work(a, b, n)
    real :: a(:), b(:)
    !$omp parallel do
    do i=1,n
       a(i) = sin(b(i))
    end do
    !$omp end parallel do
  end subroutine work
end program main

Compile with: flang-new -fopenmp clock.f90 -O0
Run with OMP_NUM_THREADS=1 ./a.out and OMP_NUM_THREADS=128 ./a.out

In the multithreaded case, the time computed from system_clock is roughly the wall time multiplied by the number of threads, while omp_get_wtime reports the wall time. I guess this might be acceptable behavior, but many other compilers return the wall time from system_clock.

I think this might be related to our preference for CLOCK_PROCESS_CPUTIME_ID over other timers. That clock counts CPU time consumed by all threads of the process rather than elapsed time:

#elif defined CLOCK_PROCESS_CPUTIME_ID
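
To make the difference between the two kinds of clock concrete, here is a minimal C sketch, assuming a POSIX system with clock_gettime. It is not the Flang runtime code: the helper names clock_seconds, work and time_work are hypothetical, and the loop body is a stand-in for the reproducer's sin loop. It times the same parallel loop with CLOCK_MONOTONIC (elapsed time) and CLOCK_PROCESS_CPUTIME_ID (CPU time summed over all threads of the process).

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <time.h>

/* Return the current value of a POSIX clock in seconds, or -1.0 on failure. */
static double clock_seconds(clockid_t id) {
  struct timespec ts;
  if (clock_gettime(id, &ts) != 0)
    return -1.0;
  return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Hypothetical stand-in for the reproducer's work(). The pragma is ignored,
   and the loop runs serially, when the file is built without -fopenmp. */
static void work(double *a, const double *b, size_t n, int reps) {
  for (int r = 0; r < reps; ++r) {
#pragma omp parallel for
    for (long i = 0; i < (long)n; ++i)
      a[i] = a[i] * 0.5 + b[i];
  }
}

/* Time one call to work() with both clocks.
   *wall receives elapsed seconds, *cpu receives process CPU seconds. */
static void time_work(double *a, const double *b, size_t n, int reps,
                      double *wall, double *cpu) {
  double w0 = clock_seconds(CLOCK_MONOTONIC);
  double c0 = clock_seconds(CLOCK_PROCESS_CPUTIME_ID);
  work(a, b, n, reps);
  *cpu = clock_seconds(CLOCK_PROCESS_CPUTIME_ID) - c0;
  *wall = clock_seconds(CLOCK_MONOTONIC) - w0;
}
```

Built with OpenMP enabled (for example cc -fopenmp) and called from a small driver, the cpu value should grow with OMP_NUM_THREADS while the wall value does not, as long as enough cores are available. That would match the ratio seen in the reproducer.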

Note that gcc, for example, tries to use CLOCK_MONOTONIC and then CLOCK_REALTIME: https://github.com/gcc-mirror/gcc/blob/9693459e030977d6e906ea7eb587ed09ee4fddbd/libgfortran/intrinsics/system_clock.c#L39
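
A rough sketch of what a wall-clock-first selection could look like, modeled on the order described above for gcc (CLOCK_MONOTONIC, then CLOCK_REALTIME), with the process CPU clock only as a last resort. The names WALL_CLOCK_ID, kCountRate and wall_clock_count are hypothetical and are not taken from either runtime; the nanosecond count rate is also just an assumption for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Hypothetical selection order: prefer wall-clock sources and fall back
   to process CPU time only when nothing else is defined. */
#if defined CLOCK_MONOTONIC
#define WALL_CLOCK_ID CLOCK_MONOTONIC
#elif defined CLOCK_REALTIME
#define WALL_CLOCK_ID CLOCK_REALTIME
#elif defined CLOCK_PROCESS_CPUTIME_ID
#define WALL_CLOCK_ID CLOCK_PROCESS_CPUTIME_ID
#else
#error "no POSIX clock available"
#endif

/* Assumed count rate: clock ticks per second at nanosecond resolution. */
static const int64_t kCountRate = 1000000000;

/* Return the selected clock's value in nanoseconds,
   or -1 if the clock cannot be read. */
static int64_t wall_clock_count(void) {
  struct timespec ts;
  if (clock_gettime(WALL_CLOCK_ID, &ts) != 0)
    return -1;
  return (int64_t)ts.tv_sec * kCountRate + (int64_t)ts.tv_nsec;
}
```

With this order, two successive calls differ by elapsed time regardless of how many threads ran in between, which is what the other compilers mentioned above appear to report.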

@rovka, @Leporacanthicus, you've made changes in this code. Would you agree that we should change the order of the ifdefs to match other compilers?
