[Flang][RT][OpenMP] system_clock behavior with multiple threads

Reproducer:
```fortran
program main
  use omp_lib
  integer, parameter :: n = 1000000
  integer(8) :: t1, t2, rate
  real :: a(n), b(n)
  real(8) :: start, end
  b = 1.0
  call system_clock(t1)
  start = omp_get_wtime()
  call work(a, b, n)
  call system_clock(t2, rate)
  end = omp_get_wtime()
  print *, 'system_clock: ', dble(t2-t1)/dble(rate)
  print *, 'omp_get_wtime: ', end-start
contains
  subroutine work(a, b, n)
    real :: a(:), b(:)
    !$omp parallel do
    do i=1,n
       a(i) = sin(b(i))
    end do
    !$omp end parallel do
  end subroutine work
end program main
```

Compile with `flang-new -fopenmp clock.f90 -O0`.
Run with `OMP_NUM_THREADS=1 ./a.out` and `OMP_NUM_THREADS=128 ./a.out`

In the multithreaded case, the time computed from `system_clock` is roughly the wall time multiplied by the number of threads. I guess this might be acceptable behavior, but many other compilers return the wall time.

I think this might be related to our preference for `CLOCK_PROCESS_CPUTIME_ID` over other timers: https://github.com/llvm/llvm-project/blob/e3720bbc088d904ed7fad9ad1a4db294d2bcfc05/flang/runtime/time-intrinsic.cpp#L64
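To illustrate why the choice of clock matters, here is a minimal C sketch (not the Flang runtime code) that reads `CLOCK_PROCESS_CPUTIME_ID` and `CLOCK_MONOTONIC` around a sleep. The helper names `read_clock`, `idle_ms`, and `measure_idle` are made up for this example. The process CPU-time clock counts CPU time consumed by all threads of the process, so it barely moves while the process sleeps, and with N busy threads it would advance about N times faster than wall time.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Hypothetical helper: read a POSIX clock in seconds, or -1.0 on failure. */
static double read_clock(clockid_t id) {
  struct timespec ts;
  if (clock_gettime(id, &ts) != 0) return -1.0;
  return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Hypothetical helper: sleep for about `ms` milliseconds without using CPU. */
static void idle_ms(long ms) {
  struct timespec req = { ms / 1000, (ms % 1000) * 1000000L };
  nanosleep(&req, NULL);
}

/* Report how far each clock advances across a sleep of `ms` milliseconds. */
static void measure_idle(long ms, double *cpu, double *wall) {
  double c0 = read_clock(CLOCK_PROCESS_CPUTIME_ID);
  double w0 = read_clock(CLOCK_MONOTONIC);
  idle_ms(ms);
  *cpu = read_clock(CLOCK_PROCESS_CPUTIME_ID) - c0;   /* CPU time consumed */
  *wall = read_clock(CLOCK_MONOTONIC) - w0;           /* elapsed wall time */
}
```

Calling `measure_idle(200, &cpu, &wall)` from a `main` should give a `wall` value of about 0.2 seconds and a much smaller `cpu` value. That is the opposite direction of the discrepancy seen in the reproducer, but it has the same cause: the two clocks measure different things.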

Note that gcc, for example, tries to use `CLOCK_MONOTONIC` and then `CLOCK_REALTIME`: https://github.com/gcc-mirror/gcc/blob/9693459e030977d6e906ea7eb587ed09ee4fddbd/libgfortran/intrinsics/system_clock.c#L39
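The fallback order described above could be sketched roughly as follows. This is an illustration only, not the actual libgfortran or Flang code: the function names `wall_clock_now` and `wall_clock_count` and the caller-supplied `rate` parameter are assumptions made for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Hypothetical helper: prefer CLOCK_MONOTONIC, then fall back to
   CLOCK_REALTIME. Returns 0 on success and -1 if no clock is usable. */
static int wall_clock_now(struct timespec *ts) {
#if defined(CLOCK_MONOTONIC)
  if (clock_gettime(CLOCK_MONOTONIC, ts) == 0) return 0;
#endif
#if defined(CLOCK_REALTIME)
  if (clock_gettime(CLOCK_REALTIME, ts) == 0) return 0;
#endif
  return -1;
}

/* Hypothetical helper: convert the clock reading to a tick count at
   `rate` ticks per second (assumed to be at most 1e9). Returns -1 on failure. */
static long long wall_clock_count(long long rate) {
  struct timespec ts;
  if (wall_clock_now(&ts) != 0) return -1;
  return (long long)ts.tv_sec * rate +
         (long long)ts.tv_nsec * rate / 1000000000LL;
}
```

With this ordering, the difference between two counts tracks elapsed wall time regardless of how many threads are running, which is the behavior the other compilers show.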

@rovka, @Leporacanthicus, you've made changes in this code. Would you agree that we should change the order of the ifdefs to match other compilers?
