Summary
Are all the ctest unit tests expected to pass with the oneAPI backend on Intel PVC? I ran the unit tests on Sunspot at Argonne National Lab and got 4 failures:
The following tests FAILED:
983 - BLAS/RT/GraphGemmBatchUsmTestSuite/GraphGemmBatchUsmTests.RealSinglePrecision/Column_Major_Intel_R__Data_Center_GPU_Max_1550 (Failed)
984 - BLAS/RT/GraphGemmBatchUsmTestSuite/GraphGemmBatchUsmTests.RealSinglePrecision/Row_Major_Intel_R__Data_Center_GPU_Max_1550 (Failed)
1967 - BLAS/CT/GraphGemmBatchUsmTestSuite/GraphGemmBatchUsmTests.RealSinglePrecision/Column_Major_Intel_R__Data_Center_GPU_Max_1550 (Failed)
1968 - BLAS/CT/GraphGemmBatchUsmTestSuite/GraphGemmBatchUsmTests.RealSinglePrecision/Row_Major_Intel_R__Data_Center_GPU_Max_1550 (Failed)
Version
oneMath commit cf9c3cb
Environment
This is running on Sunspot at Argonne National Lab.
- HW you use: PVC
- Backend library version: oneMKL from Intel oneAPI 2025.2
- OS name and version: SLES 15.4
- Compiler version: Intel oneAPI 2025.2 and UMD/KMD 1146.12
- CMake output log: Build is below, hopefully it's enough
Steps to reproduce
This is the build I use on Aurora:
#!/bin/bash -ex
blas_root=$PWD
# first get lapack
rm -rf lapack
git clone https://github.com/Reference-LAPACK/lapack.git
cd lapack/
mkdir build
cd build/
cmake -DLAPACKE=ON -DBUILD_INDEX64=ON -DBUILD_SHARED_LIBS=ON -DCBLAS=ON -DCMAKE_INSTALL_LIBDIR=${blas_root} -DCMAKE_INSTALL_PREFIX=${blas_root} ..
cmake --build . -j48 --target install
cd ../
rm -rf build
mkdir build
cd build
cmake -DLAPACKE=ON -DBUILD_INDEX64=OFF -DBUILD_SHARED_LIBS=ON -DCBLAS=ON -DCMAKE_INSTALL_LIBDIR=${blas_root} -DCMAKE_INSTALL_PREFIX=${blas_root} ..
cmake --build . -j48 --target install
cd ../..
# build
rm -rf oneMath
git clone https://github.com/uxlfoundation/oneMath.git
cd oneMath
rm -rf build
mkdir build
cd build
cmake ../ -DMKL_ROOT=$MKLROOT -DREF_BLAS_ROOT=${blas_root} -DENABLE_MKLCPU_BACKEND=FALSE -DREF_LAPACK_ROOT=${blas_root}
cmake --build . -j48
ctest --no-tests=error
Observed behavior
The output of ctest --no-tests=error is:
The following tests FAILED:
983 - BLAS/RT/GraphGemmBatchUsmTestSuite/GraphGemmBatchUsmTests.RealSinglePrecision/Column_Major_Intel_R__Data_Center_GPU_Max_1550 (Failed)
984 - BLAS/RT/GraphGemmBatchUsmTestSuite/GraphGemmBatchUsmTests.RealSinglePrecision/Row_Major_Intel_R__Data_Center_GPU_Max_1550 (Failed)
1967 - BLAS/CT/GraphGemmBatchUsmTestSuite/GraphGemmBatchUsmTests.RealSinglePrecision/Column_Major_Intel_R__Data_Center_GPU_Max_1550 (Failed)
1968 - BLAS/CT/GraphGemmBatchUsmTestSuite/GraphGemmBatchUsmTests.RealSinglePrecision/Row_Major_Intel_R__Data_Center_GPU_Max_1550 (Failed)
Errors while running CTest
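To narrow this down, the failing cases can be pulled out of the summary and re-run with full output. The snippet below is only a sketch: `failed_test_names` is a hypothetical helper (not part of oneMath or CTest), and it assumes the summary lines have the `<number> - <name> (Failed)` shape shown above. The `ctest` commands in the comments use the standard `-R`, `--rerun-failed`, and `--output-on-failure` options.

```shell
# Hypothetical helper: read ctest summary text on stdin and print
# only the names of the tests marked "(Failed)", one per line.
failed_test_names() {
  sed -n 's/^[[:space:]]*[0-9][0-9]* - \(.*\) (Failed)[[:space:]]*$/\1/p'
}

# Possible usage from the oneMath build directory:
#   ctest --no-tests=error 2>&1 | tee ctest.log
#   failed_test_names < ctest.log
#
# Re-run only the failing suite and show its output:
#   ctest -R GraphGemmBatchUsm --output-on-failure
# or re-run whatever failed in the previous ctest invocation:
#   ctest --rerun-failed --output-on-failure
```

The per-test output from `--output-on-failure` should show whether the four cases fail on a numerical mismatch or on an unsupported feature.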
Expected behavior
I expected all the unit tests to pass. Are these tests not supported on PVC, or is there some other reason for the failures?
Thanks!