@tsu-bin tsu-bin commented Dec 12, 2025

Suppose I have a small helper function that automatically creates a tiled_copy with an optimal thread tile layout, and I want this to happen at compile time. Here is an example.

template <int thr_num, int buf_rows, int buf_cols,
    template<class> class CopyOpType, class CopyAsType, class OrigType>
constexpr auto make_tiled_cp()
{
  // Put as many threads as possible along the rows; the rest go along the columns.
  constexpr auto thr_rows = std::min(thr_num, buf_rows);
  constexpr auto thr_cols = thr_num / thr_rows;

  // Column-major (LayoutLeft) arrangement of the threads.
  auto thr_tile = make_layout(
      make_shape(Int<thr_rows>{}, Int<thr_cols>{}), LayoutLeft{});

  // Each thread covers a (buf_rows/thr_rows) x (buf_cols/thr_cols) block of values.
  return make_tiled_copy(
          Copy_Atom<CopyOpType<CopyAsType>, OrigType>{},
          thr_tile,
          Layout<Shape<Int<buf_rows/thr_rows>, Int<buf_cols/thr_cols>>>{});
}
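
To make the tile arithmetic in the helper concrete, here is a minimal standalone sketch that mirrors only the integer computations and does not use CuTe at all. The `TileDims` struct, the `tile_dims` function, and the example sizes (128 threads, a 64 x 128 buffer) are assumptions chosen for illustration; the divisibility checks are also an addition that the helper above does not perform.

```cpp
#include <algorithm>

// Hypothetical stand-in: holds the four numbers the helper derives.
struct TileDims {
  int thr_rows;  // threads along the rows
  int thr_cols;  // threads along the columns
  int val_rows;  // values per thread along the rows
  int val_cols;  // values per thread along the columns
};

// Mirrors the integer arithmetic of make_tiled_cp (no CuTe types involved).
template <int thr_num, int buf_rows, int buf_cols>
constexpr TileDims tile_dims() {
  constexpr int thr_rows = std::min(thr_num, buf_rows);
  constexpr int thr_cols = thr_num / thr_rows;
  // Extra checks, not present in the original helper: the integer
  // divisions are only exact when these hold.
  static_assert(thr_num % thr_rows == 0, "threads must tile the rows evenly");
  static_assert(buf_rows % thr_rows == 0, "rows must divide among threads");
  static_assert(buf_cols % thr_cols == 0, "columns must divide among threads");
  return TileDims{thr_rows, thr_cols, buf_rows / thr_rows, buf_cols / thr_cols};
}

// Assumed example sizes: 128 threads copying a 64 x 128 buffer.
constexpr TileDims dims = tile_dims<128, 64, 128>();
static_assert(dims.thr_rows == 64 && dims.thr_cols == 2, "thread tile is 64 x 2");
static_assert(dims.val_rows == 1 && dims.val_cols == 64, "value tile is 1 x 64");
```

With these assumed sizes the threads form a 64 x 2 tile and each thread covers a 1 x 64 block, so the whole 64 x 128 buffer is covered exactly once.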

I call this helper function inside the kernel.

constexpr auto g2s_async_cp_q = make_tiled_cp<k_thr_num, CTA_Q, HEAD_DIM, SM80_CP_ASYNC_CACHEGLOBAL, cute::uint128_t, int8_t>();

However, make_tiled_copy currently lacks the constexpr specifier, so the result of the call above cannot be used to initialize a constexpr variable. I think it would be helpful to add it.
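
The language rule behind this request can be shown with a small standalone sketch. Everything here is a hypothetical stand-in (`Tiled`, `make_plain`, `make_cx`, `helper`), not CuTe code: it only illustrates that a constexpr helper can initialize a constexpr variable when the factory it calls is itself constexpr.

```cpp
// Hypothetical stand-in for the object a factory returns.
struct Tiled {
  int rows;
  int cols;
};

// Stand-in for a factory declared without constexpr.
inline Tiled make_plain(int r, int c) { return Tiled{r, c}; }

// The same factory with the constexpr specifier added.
constexpr Tiled make_cx(int r, int c) { return Tiled{r, c}; }

// A constexpr helper that forwards to the constexpr factory.
template <int R, int C>
constexpr Tiled helper() { return make_cx(R, C); }

// OK: every call in the chain is usable in a constant expression.
constexpr Tiled t = helper<4, 8>();
static_assert(t.rows == 4 && t.cols == 8, "evaluated at compile time");

// Would not compile: make_plain is not constexpr.
// constexpr Tiled bad = make_plain(4, 8);

// Compiles, but the compiler is not required to evaluate it at compile time.
const Tiled u = make_plain(4, 8);
```

In this sketch, adding constexpr to the factory is the only change needed for the constexpr initialization to compile, which is the effect the request above is after for make_tiled_copy.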
