[ROCm] Fix ambiguous calls to shfl* that lack an explicit type
conversion from c10::Half to __half
#190
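
The kind of ambiguity named in the title can be sketched on the host, without a GPU toolchain. Everything below is a hypothetical stand-in and not code from this PR: `NativeHalf` plays the role of `__half`, `WrappedHalf` plays the role of `c10::Half`, and the two `shfl` overloads stand in for the overloaded shfl* intrinsics. The sketch assumes the wrapper type converts implicitly to both `float` and the native half type, which is what makes an unqualified call ambiguous.

```cpp
#include <string>

// Hypothetical stand-in for the device-native half type (__half).
struct NativeHalf {
    float v;
};

// Hypothetical stand-in for c10::Half. Assumption: it converts
// implicitly to both float and the native half type.
struct WrappedHalf {
    float v;
    operator float() const { return v; }
    operator NativeHalf() const { return NativeHalf{v}; }
};

// Hypothetical stand-ins for the overloaded shfl* intrinsics.
// Each returns the name of the overload that was selected.
std::string shfl(float) { return "float"; }
std::string shfl(NativeHalf) { return "half"; }

std::string shuffle_half(WrappedHalf x) {
    // Calling shfl(x) directly would not compile: both overloads need
    // one user-defined conversion, so neither is a better match.
    // Naming the target type explicitly selects a single overload.
    return shfl(static_cast<NativeHalf>(x));
}
```

With the cast in place, `shuffle_half(WrappedHalf{1.0f})` resolves to the `NativeHalf` overload; removing the cast should produce an "ambiguous call" compile error under these assumptions.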