Description
Which component has the problem?
CuTe DSL
Bug Report
Describe the bug
- Host side
  In cutlass/test/unit/gemm/device/gemm_testbed_3x_evt.hpp, the HostRowReduce struct has the member tensor_row_reduce_, whose storage is allocated with tensor_row_reduce_.resize(cutlass::Coord<1>(N_));
  It therefore has only N elements: the M dimension is collapsed to 1.
- Device side
  tCrRow has the shape (CPY,CPY_M,CPY_N,EPI_M,EPI_N), so it has extents along both M and N:

      Tensor mRow = make_tensor(make_gmem_ptr<ElementOutput>(params.ptr_row), make_shape(M,N,L), params.dRow); // (M,N,L)
      Tensor gRow_l = local_tile(mRow, take<0,2>(args.tile_shape_mnk), make_coord(m,n,_)); // (CTA_M,CTA_N,L)
      Tensor tCgRow = sm90_partition_for_epilogue<ReferenceSrc>( // (CPY,CPY_M,CPY_N,EPI_M,EPI_N)
          gRow_l(_,_,l), args.epi_tile, args.tiled_copy, args.thread_idx);
      Tensor tCrRow = make_tensor_like<ElementCompute>(tCgRow);
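To make the mismatch concrete, here is a minimal standalone C++ sketch of the two views. It does not use CUTLASS or CuTe. The helper names (`offset`, `distinct_offsets`, `row_reduce`) are hypothetical and exist only for illustration. It also assumes, without confirmation from the code quoted above, that `params.dRow` might have a stride of 0 along M. If that assumption holds, an (M,N)-shaped view would address only N distinct elements, which would match the length-N host buffer. With a nonzero stride along M it would address M*N elements.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical helper: linear offset of logical coordinate (m, n)
// under strides (stride_m, stride_n).
std::size_t offset(std::size_t m, std::size_t n,
                   std::size_t stride_m, std::size_t stride_n) {
    return m * stride_m + n * stride_n;
}

// Counts how many distinct memory offsets an (M, N)-shaped view touches.
std::size_t distinct_offsets(std::size_t M, std::size_t N,
                             std::size_t stride_m, std::size_t stride_n) {
    std::set<std::size_t> seen;
    for (std::size_t m = 0; m < M; ++m) {
        for (std::size_t n = 0; n < N; ++n) {
            seen.insert(offset(m, n, stride_m, stride_n));
        }
    }
    return seen.size();
}

// Host-style reference: reduce a row-major M x N matrix over M,
// producing a single row of N sums (the shape of tensor_row_reduce_).
std::vector<float> row_reduce(const std::vector<float>& a,
                              std::size_t M, std::size_t N) {
    assert(a.size() == M * N);
    std::vector<float> out(N, 0.0f);
    for (std::size_t m = 0; m < M; ++m) {
        for (std::size_t n = 0; n < N; ++n) {
            out[n] += a[m * N + n];
        }
    }
    return out;
}
```

For example, `distinct_offsets(4, 3, 0, 1)` counts the elements reached by a 4x3 view whose M stride is 0, while `distinct_offsets(4, 3, 3, 1)` does the same for an ordinary row-major layout. Whether the device-side `dRow` really uses a zero M stride is the point that needs confirming.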