
Conversation

@vpietila-amd (Contributor)

Proposed changes

Refactored the CK Builder convolution signature such that

  • data type
  • data type for computation
  • layout
  • elementwise operation

are defined per tensor. This refactoring allows us to build instances for complex fused operations such as scale-add-scale-add-relu. I added a fwd conv builder test that demonstrates how complex ops benefit from the new signature design pattern. At a high level, the signature is composed as follows:

┌─────────────────────────────────────────────────────────┐
│                    ConvSignature                        │
├─────────────────────────────────────────────────────────┤
│ Properties:                                             │
│   • spatial_dim: int           (1D, 2D, or 3D)          │
│   • direction: ConvDirection   (Fwd/BwdData/BwdWeight)  │
│   • data_type: DataType        (default data type)      │
│   • accumulation_data_type: DataType                    │
│   • input: ConvTensor          ──┐                      │
│   • weight: ConvTensor         ──│                      │
│   • output: ConvTensor         ──│                      │
└──────────────────────────────────┼──────────────────────┘
                                   │
                                   ▼
              ┌─────────────────────────────────────────┐
              │           ConvTensor                    │
              ├─────────────────────────────────────────┤
              │ ╔═════════════════════════════════════╗ │
              │ ║ TensorConfig (required)             ║ │
              │ ╠═════════════════════════════════════╣ │
              │ ║  • layout: ConvLayout               ║ │
              │ ║  • data_type: DataType (optional)   ║ │
              │ ║  • compute_type: DataType (optional)║ │
              │ ╚═════════════════════════════════════╝ │
              │                                         │
              │ ┌─────────────────────────────────────┐ │
              │ │ TensorOperation (optional)          │ │
              │ ├─────────────────────────────────────┤ │
              │ │  • elementwise_operation            │ │
              │ │  • auxiliary_operand_configs[]      │ │
              │ │    (each is also ConvTensor)  ◄───────┼─┐
              │ └─────────────────────────────────────┘ │ │
              └─────────────────────────────────────────┘ │
                                                          │
                                 Recursive ───────────────┘

Design Points:

  • ConvSignature contains three ConvTensor instances (input, weight, output)
  • All tensors share the same ConvTensor structure
  • Each ConvTensor has:
    • TensorConfig (required): Defines layout as well as optional data and compute type overrides
    • TensorOperation (optional): Defines fused elementwise operations
  • Auxiliary operands (e.g., bias) in TensorOperation also use the ConvTensor type

The convolution traits and descriptors do not fully utilize the new structure, but all tests are passing. They could be refactored separately.
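As a rough usage sketch of the new design (hypothetical: the designated-initializer style, the ElementwiseOp enumerator, and the exact field spellings are illustrative, not necessarily the merged API), a fused scale-add-scale-add-relu output could be described as:

// Hypothetical sketch; names follow the diagram above, not necessarily the merged API.
using namespace ck_tile::builder;

// Auxiliary operands are ConvTensor values themselves, so the structure is recursive.
ConvTensor residual{.config = {.layout = TensorLayout::G_C_strided}};

ConvTensor output{
    .config    = {.layout = TensorLayout::GNHWK},
    .operation = TensorOperation{
        .elementwise_operation     = ElementwiseOp::SCALE_ADD_SCALE_ADD_RELU, // hypothetical enumerator
        .auxiliary_operand_configs = {residual, residual}}};

ConvSignature signature{
    .spatial_dim            = 2,
    .direction              = ConvDirection::FORWARD,
    .data_type              = DataType::FP16, // default for all tensors
    .accumulation_data_type = DataType::FP32,
    .input  = {.config = {.layout = TensorLayout::GNHWC}},
    .weight = {.config = {.layout = TensorLayout::GKYXC}},
    .output = output};

Per-tensor data_type and compute_type overrides in TensorConfig would take effect only where set, falling back to the signature-level defaults otherwise.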

@vpietila-amd changed the title from "[CK_TILE] Refactor convolution signature to provide data type/layout/elementwis op per tensor" to "[CK_TILE] Refactor convolution signature to provide data type/layout/elementwise op per tensor" on Dec 1, 2025
@@ -0,0 +1,244 @@
# Composable Kernel Builder Design Documentation
Collaborator:

This is really nice documentation! Thanks for taking time to create this!


enum class DataType
{
UNDEFINDED = 0,
Collaborator:

Nit: UNDEFINED

Only fix if you need to make other changes.

// G: Group, N: Batch, K: Output Channel, C: Input Channel, W: Width
// Enum defines Input, Weight, and Output tensor layouts respectively.
enum class GroupConvLayout1D
enum class TensorLayout
Collaborator:
This is pretty clean. Another strategy I sometimes see with flat lists of enums like this is to add extra enum values for start and end of ranges of grouped elements to enable classification checks with math. We can add that later if it helps logic, but concepts and lists with constexpr std::array<TensorLayout> can also be used if we need more structure.
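For reference, a minimal sketch of that range-marker idea (the grouped enumerators here are invented for illustration):

// Sketch of the range-marker pattern; the enumerator grouping is invented.
enum class TensorLayout
{
    // --- input layouts ---
    GNWC,
    GNHWC,
    GNDHWC,
    INPUT_LAYOUT_FIRST = GNWC,
    INPUT_LAYOUT_LAST  = GNDHWC,
    // --- weight layouts, output layouts, ... ---
};

// Classification reduces to two comparisons instead of a switch.
constexpr bool IsInputLayout(TensorLayout layout)
{
    return layout >= TensorLayout::INPUT_LAYOUT_FIRST &&
           layout <= TensorLayout::INPUT_LAYOUT_LAST;
}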

inline std::ostream&
operator<<(std::ostream& os,
const std::variant<GroupConvLayout1D, GroupConvLayout2D, GroupConvLayout3D>& layout)
inline std::ostream& operator<<(std::ostream& os, TensorLayout layout)
Collaborator:
It might be better to have a "toString" function with the switch case, and then use that to define ostream printing.
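Something like this sketch (GKYXC is assumed here as one of the enumerators):

#include <ostream>
#include <string_view>

// Sketch: the switch lives in ToString; operator<< just delegates.
inline std::string_view ToString(TensorLayout layout)
{
    switch(layout)
    {
    case TensorLayout::GNHWC: return "GNHWC";
    case TensorLayout::GKYXC: return "GKYXC";
    // ... remaining enumerators ...
    default: return "UNKNOWN";
    }
}

inline std::ostream& operator<<(std::ostream& os, TensorLayout layout)
{
    return os << ToString(layout);
}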

│ ConvTensor │
├─────────────────────────────────────────┤
│ ╔═════════════════════════════════════╗ │
│ ║ TensorConfig (required) ║ │
Collaborator:
It's not clear to me why we need the TensorConfig wrapper instead of just directly having a layout, datatype, and compute_type for the tensor.

EXPECT_EQ(Traits::direction, ck_tile::builder::ConvDirection::FORWARD);
EXPECT_EQ(Traits::layout, ck_tile::builder::GroupConvLayout2D::GNHWC_GKYXC_GNHWK);
EXPECT_THAT(Traits::layout,
::testing::ElementsAre(ck_tile::builder::TensorLayout::GNHWC,
@shumway (Collaborator), Dec 4, 2025:

Note: generally add a using declaration at the top of the test file:

using ::testing::ElementsAre

There are a lot of advantages to that, including shorter code and documenting what helpers you are importing.

(The same goes for test utils, prefer using declarations instead of importing entire namespaces. Once you import an entire namespace, it can quickly become difficult to deduce which functions you are using or even which function is being called.)

using ::ck_tile::builder::factory::internal::GetTensorLayout;
using ::ck_tile::builder::factory::internal::LayoutToCK;

using namespace ::ck_tile::builder::test;
Collaborator:
It's much better to have specific using declarations for symbols in a namespace instead of importing everything from a namespace.

using ::ck_tile::builder::factory::internal::LayoutToCK;

using namespace ::ck_tile::builder::test;
using enum ::ck_tile::builder::ConvDirection;
Collaborator:
It's much cleaner to only have using enum in a function scope (even if it's repeated many times) rather than have it at file scope.
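I.e., a sketch of the function-scope form (test name invented here; Traits as in the test above):

// Sketch: scope the enum import to each test body.
TEST(FwdConvBuilder, DirectionIsForward)
{
    using enum ::ck_tile::builder::ConvDirection;
    EXPECT_EQ(Traits::direction, FORWARD); // resolves via the function-scope using enum
}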

TEST(AuxiliaryTensorLayout, AssignsLayoutForG_C_strided)
{
using CKLayout = LayoutToCK<TensorLayout::G_C_strided>::type;
EXPECT_TRUE((std::is_same_v<CKLayout, ck::tensor_layout::convolution::G_C>));
@shumway (Collaborator) left a comment:

This looks great! I added a lot of comments, but you don't have to implement them unless you end up needing to do more edits (no reason to run this through testing again). The feedback is more for incremental improvement and to help us converge on common best practices.
