Description
In a recent investigation of Chapel's performance versus Fortran (especially Intel Fortran), I found that enabling clang/LLVM's auto-vectorization with a vector library lets Chapel code be vectorized further, especially Math library routines like log2 and pow (which normally are not vectorizable).
This is achievable today in Chapel through some heroics, but only using the C backend.
chpl --fast --no-ieee-float --ccflags -fveclib=...
Note that --no-ieee-float is required; otherwise, floating-point precision constraints limit vectorization. I am also focusing here on -fveclib, the clang flag. GCC has a similar flag named -mveclibabi.
Using -fveclib allowed me to compile Chapel code against SVML (--ccflags -fveclib=SVML) and gain about a 1.3x performance boost. With SVML, I additionally had to add -L/path/to/SVML/lib -lsvml to avoid linking issues. I could also link against other available vector libraries like libmvec for similar performance (--ccflags -fveclib=libmvec). This did not require extra linking args (because clang/gcc already know how to find libmvec).
However, all of this requires the C backend. When using the LLVM backend, -fveclib is not respected.
I think we should add a Chapeltastic flag, like --vector-library=..., to achieve this. With the C backend using clang, this would be equivalent to --ccflags -fveclib. With the C backend using gcc, it would be equivalent to --ccflags -mveclibabi. With LLVM, we just need to thread in the right options. There is actually already a TODO in clangUtil.cpp about this.
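To make the proposed lowering concrete, here is a minimal sketch of how such a flag might map onto each backend. Everything in it is hypothetical: the function name `lower_vector_library` and the backend labels `c-clang`, `c-gcc`, and `llvm` are invented for illustration and are not part of the Chapel compiler. It only encodes the equivalences described above.

```python
# Hypothetical sketch of lowering a --vector-library=NAME option.
# The function name and the backend labels are illustrative assumptions,
# not existing Chapel compiler code.

def lower_vector_library(backend, library):
    """Return the extra chpl arguments that request `library` on `backend`."""
    if backend == "c-clang":
        # Matches what already works today: --ccflags -fveclib=NAME
        return ["--ccflags", "-fveclib=" + library]
    if backend == "c-gcc":
        # GCC spells the equivalent option -mveclibabi
        return ["--ccflags", "-mveclibabi=" + library]
    if backend == "llvm":
        # No command-line equivalent exists yet; the option would have to be
        # threaded through inside the compiler itself.
        raise NotImplementedError("LLVM backend needs internal support")
    raise ValueError("unknown backend: " + backend)


if __name__ == "__main__":
    print(lower_vector_library("c-clang", "SVML"))
    print(lower_vector_library("c-gcc", "svml"))
```

The sketch passes the library name through unchanged, so the caller is still responsible for using the spelling each compiler expects.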
There are a few design questions here:
- What should the new flag name be?
- What should the behavior be if the backend compiler doesn't support -fveclib or the requested library?
  - Should we do some validation of the requested library, or just pass whatever we get along to the backend?
- Should we try and come up with portable names across compilers?
  - e.g., in GCC the name is svml and in clang the name is SVML, with the wrong casing being rejected by the backend.
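
One possible answer to the validation and portable-naming questions is a small lookup table keyed by a case-insensitive canonical name. The sketch below is only an illustration: the table `SPELLINGS` and the function `backend_spelling` are hypothetical names, and the table lists only the spellings mentioned in this issue rather than everything clang or GCC accept.

```python
# Hypothetical sketch: portable vector library names.
# Canonical (lowercase) name -> spelling accepted by each backend compiler.
# Only spellings mentioned in this issue are listed; a real table would
# need to be checked against each compiler's documentation.
SPELLINGS = {
    "svml": {"clang": "SVML", "gcc": "svml"},
    "libmvec": {"clang": "libmvec"},
}


def backend_spelling(name, compiler):
    """Translate a user-supplied library name to the compiler's spelling."""
    per_compiler = SPELLINGS.get(name.lower())
    if per_compiler is None or compiler not in per_compiler:
        # Validate up front instead of letting the backend reject the name.
        raise ValueError(
            "vector library %r is not known for %s" % (name, compiler)
        )
    return per_compiler[compiler]


if __name__ == "__main__":
    print(backend_spelling("svml", "clang"))
    print(backend_spelling("SVML", "gcc"))
```

The alternative design is to skip the table and pass the name through untouched, which avoids maintaining a list but exposes users to the casing differences between compilers.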