Performance regression using `^` #193

As reported in https://github.com/luraess/JuliaGPUPerf/issues/2 and https://github.com/luraess/JuliaGPUPerf/issues/3, using the `^` operator inside GPU kernels significantly degrades performance.

The Int32 on Int32 case (https://github.com/luraess/JuliaGPUPerf/issues/2) may have been fixed by following @vchuravy's suggestion to use:
```
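# Call the LLVM powi intrinsic directly (Float32 base, Int32 exponent)
my_pow(x, p) = ccall("llvm.powi.f32.i32", llvmcall, Float32, (Float32, Int32), x, p)
#[...]
# Inside the kernel, use my_pow in place of `^`
A[ix,iy] = B[ix,iy] + s*my_pow(C[ix,iy], pow_int)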
```

However, the Float32 and Float64 cases are still lagging behind.
