Dear DaCe team,
We encountered a bug in the codegen (SDFG linked and repro below).
The quirk is that a tasklet takes a np.float32 scalar but is given a np.float64. This triggers a malformed cast inside the kernel at the call site of a nested function, e.g.:
void nested_sdfg_0_0_3(const double& scalar, float* __restrict__ out_field, int __i, int __j);
__global__ void __launch_bounds__(32) horizontal_loop_139965476328208_0_0_5(float * __restrict__ stencil_out, const double stencil_scalar) {
// code removed
nested_sdfg_0_0_3(*(float *)(&stencil_scalar), &stencil_out[0], __i, __j);
//-------------------------------^ bad idea
}
Repro:
import dace
sdfg = dace.SDFG.from_file("./bad_cast_on_scalar.sdfgz.zip")
sdfg.compile()
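
To illustrate why the cast is a bad idea, here is a minimal sketch in plain Python (not DaCe code). It assumes a little-endian target, and the two helper names are hypothetical. It contrasts reinterpreting the bytes of a float64 as a float32, which is what *(float *)(&stencil_scalar) does, with an actual value conversion.

```python
import struct


def reinterpret_double_as_float(value: float) -> float:
    """Mimic *(float *)(&double_value), assuming a little-endian machine."""
    raw = struct.pack("<d", value)          # the 8 bytes of the float64
    return struct.unpack("<f", raw[:4])[0]  # first 4 bytes read as a float32


def convert_double_to_float(value: float) -> float:
    """Mimic a value conversion such as static_cast<float>(double_value)."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


scalar = 1.5
print(reinterpret_double_as_float(scalar))  # 0.0 (low mantissa bytes of the double)
print(convert_double_to_float(scalar))      # 1.5
```

The reinterpreted value has nothing to do with the scalar that was passed in, so the nested function would silently compute with garbage even if the generated code compiled.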