Activity
Updated oneTBB version in driver install instructions
Fixed dual CU and IPC reporting on AMD RDNA1-4 GPUs, updated AMD driv…
Fixed compiler warning with min_int
Disabled native dp4a in Intel CPU Runtime for OpenCL because it is sl…
Disabled native dp4a in Intel CPU Runtime for OpenCL because it is sl…
Updated OpenCL driver installation guide and links
Fixed missing <chrono> header on some compilers
Also allow to use #undef in OpenCL C code
Fixed broken CL_DEVICE_OPENCL_C_ALL_VERSIONS on AMD GPUs
OpenCL kernels are now compiled for latest supported OpenCL C standar…
Fixed compiling on macOS with new OpenCL headers
Renamed def_workgroup_size to cl_workgroup_size
Better detection for nvidia__64_cores_per_cu
Updated OpenCL headers, better device detection using vendor ID and N…
Faster enqueueReadBuffer on modern CPUs with 64-Byte-aligned host_buf…
Better VRAM capacity reporting correction for Intel dGPUs
Fixed TFlops estimate for Intel Battlemage GPUs
Fixed broken make.sh compile script
Automatically use zero-copy buffers on CPUs/iGPUs to reduce memory fo…
Enabled basic FP16 vector arithmetic support on Nvidia Pascal and new…
Fixed maximum buffer allocation size limit for AMD GPUs
Fixed maximum buffer allocation size limit in Intel CPU Runtime for O…