[BUG] Sm90RowReduction struct tCrRow does not match the dimensions of the host-side param tensor_row_reduce #2864

@muyudy

Description

Which component has the problem?

CuTe DSL

Bug Report

Describe the bug

  1. Host side
    In cutlass/test/unit/gemm/device/gemm_testbed_3x_evt.hpp, the HostRowReduce struct has the member tensor_row_reduce_, whose storage is allocated with tensor_row_reduce_.resize(cutlass::Coord<1>(N_));
    It has only the N dimension; the M dimension is collapsed to 1.

  2. Device side
    tCrRow has shape (CPY,CPY_M,CPY_N,EPI_M,EPI_N), so it has extents along both M and N:

      Tensor mRow = make_tensor(make_gmem_ptr<ElementOutput>(params.ptr_row), make_shape(M,N,L), params.dRow); // (M,N,L)
      Tensor gRow_l = local_tile(mRow, take<0,2>(args.tile_shape_mnk), make_coord(m,n,_));                     // (CTA_M,CTA_N,L)
      Tensor tCgRow = sm90_partition_for_epilogue<ReferenceSrc>(                                               // (CPY,CPY_M,CPY_N,EPI_M,EPI_N)
        gRow_l(_,_,l), args.epi_tile, args.tiled_copy, args.thread_idx);
      Tensor tCrRow = make_tensor_like<ElementCompute>(tCgRow);
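
To make the comparison above concrete, here is a minimal sketch in plain C++ that uses no CUTLASS or CuTe code. It assumes, and the report does not confirm, that params.dRow carries a stride of 0 along M. Under that assumption a tensor with logical shape (M,N) addresses only N distinct elements, which would match a host buffer sized with Coord<1>(N_). Layout2D and row_reduce are hypothetical names invented for this illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a 2-D strided layout. This is NOT a CuTe type.
struct Layout2D {
  std::size_t M, N;    // logical shape
  std::size_t dM, dN;  // strides, in elements

  // Linear offset of logical coordinate (m, n).
  std::size_t offset(std::size_t m, std::size_t n) const { return m * dM + n * dN; }

  // Number of elements the layout can address (assumes M > 0 and N > 0).
  std::size_t cosize() const { return (M - 1) * dM + (N - 1) * dN + 1; }
};

// Sum a row-major M x N matrix over M, writing through a layout whose
// M stride is 0 (ASSUMPTION: this mirrors what params.dRow might hold).
std::vector<float> row_reduce(const std::vector<float>& in, std::size_t M, std::size_t N) {
  Layout2D row{M, N, /*dM=*/0, /*dN=*/1};
  std::vector<float> out(row.cosize(), 0.0f);  // only N elements are backed by memory
  for (std::size_t m = 0; m < M; ++m) {
    for (std::size_t n = 0; n < N; ++n) {
      // Every m maps to the same offset n, so the M mode folds into one row.
      out[row.offset(m, n)] += in[m * N + n];
    }
  }
  return out;
}
```

For example, row_reduce({1.f, 2.f, 3.f, 4.f, 5.f, 6.f}, 2, 3) returns a vector of 3 elements, {5, 7, 9}, even though the loop iterates over the full 2 x 3 logical shape. Whether the register fragment tCrRow keeps or compacts such a zero-stride mode is not something this sketch shows.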
    
