[ROCm] fixes ambiguous calls to shfl* where there is no explicit type conversion from c10::Half to __half
#190
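
For illustration only, here is a minimal host-side sketch of how an overloaded shuffle call can become ambiguous and how an explicit conversion resolves it. Everything in it is hypothetical: `DeviceHalf` stands in for `__half`, `HostHalf` for `c10::Half`, and `fake_shfl` for the `shfl*` intrinsics. It also assumes the half wrapper converts implicitly to both `float` and the device half type, which may not match the real headers or the actual change in this PR.

```cpp
// Hypothetical stand-in for the device half type (__half in the real code).
struct DeviceHalf {
    float bits;  // the real type stores 16 bits; a float keeps the sketch simple
};

// Hypothetical stand-in for c10::Half. ASSUMPTION: it converts implicitly to
// both float and the device half type, so neither conversion is preferred.
struct HostHalf {
    float value;
    operator float() const { return value; }
    operator DeviceHalf() const { return DeviceHalf{value}; }
};

// Hypothetical stand-ins for the overloaded shfl* intrinsics. Each returns a
// tag so the caller can tell which overload was selected.
inline int fake_shfl(float, int /*lane*/) { return 1; }       // float overload
inline int fake_shfl(DeviceHalf, int /*lane*/) { return 2; }  // half overload

inline int shuffle_with_explicit_cast(HostHalf h) {
    // fake_shfl(h, 0);  // would not compile: each overload needs a different
    //                   // user-defined conversion, so the call is ambiguous
    return fake_shfl(static_cast<DeviceHalf>(h), 0);  // explicit: half overload
}
```

The explicit `static_cast` at the call site removes the need for the compiler to choose between conversions, so overload resolution has exactly one exact match.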