I get the following error message when TUNING is enabled on the hip_compile branch:
ERROR: Failed to find key entry (cudaMemset:bytes=221184:absorb,PLEGMA_Field.cu,1092) (rank 1, host jwb0001.juwels, tune.cpp:1045 in quda::TuneParam quda::tuneLaunch(quda::Tunable&, QudaTune, QudaVerbosity)())
I tried adding this entry to tune_cache.tsv manually, but that only led to errors about other missing entries in the tune cache.
I have also tried to find where quda::tuneLaunch is actually called, but so far without success. I
was looking in ``../lib/kernels/PLEGMA_kernel_tuner.cuh''.
I also checked that tuning appears to work correctly for some other kernel calls in PLEGMA, for example:
bytes=191102976 cudaMemcpyDeviceToHost unload,PLEGMA_Field.cu,217 32 1 1 11 1 0 -1 -1 -1 -1 0.0207882 # 0.00 Gflop/s, 9.19 GB/s, tuning took 0.143715 seconds at Wed Dec 7 15:48:16 2022
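As a quick diagnostic, a small script can check whether a key from the error message is present in the tune cache. This is only a sketch: it assumes the error message prints the key as name:volume:aux, and that each cache line starts with the whitespace-separated columns volume, name, aux, as the entry above suggests. The helper names parse_error_key and find_entry are hypothetical and not part of QUDA or PLEGMA.

```python
def parse_error_key(key):
    """Split a key printed as 'name:volume:aux' (assumed format) into its parts."""
    name, volume, aux = key.split(":", 2)
    return name, volume, aux


def find_entry(cache_lines, name, volume, aux):
    """Return the first cache line whose leading columns match the key, else None.

    Assumes each entry starts with: volume, name, aux (whitespace-separated).
    """
    for line in cache_lines:
        fields = line.split(None, 3)
        if len(fields) < 3:
            continue  # skip blank or header-like lines
        if fields[0] == volume and fields[1] == name and fields[2] == aux:
            return line.rstrip("\n")
    return None
```

To use it, read the cache with `open("tune_cache.tsv").readlines()` and pass the result to `find_entry` together with the parts returned by `parse_error_key("cudaMemset:bytes=221184:absorb,PLEGMA_Field.cu,1092")`. A result of `None` means no entry with that key was found under the assumed column layout.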
I will investigate the issue further.