[DOC] Need clarity on SGV Layout in new Xe re-architecture #607

@sspintel

Hi, I have several questions about the copy atoms and opened this issue to seek clarification.

1/
Image

From this representation, I understand that this 8x4 block load needs 2 subgroups of 16 threads each to operate. However, it is unclear what the indices in the "Subgroup View" represent. If they are the linear indices of that block in memory, I don't see why it is called a Subgroup View.

2/
Are there any utilities to visualize SG-V partitioning similar to the tools to visualize TV partitioning in CuTe?

3/
Under the "Subgroup Scope and Thread-Local Data" section of the Xe-rearch documentation:

DPAS and block 2D copy atoms are subgroup operations, meaning that all 16 threads of the subgroup collectively execute these operations, and collectively own all input/output data.

If we take a block 2D copy operation like XE_LOAD_2D_TRANSPOSE with 32 bits, height 32, and width 8, it is not immediately clear to me whether it is executed by a single work-item, a single sub-group, or multiple sub-groups. If we go by the logic in the example above, then we have 32x8 = 256 elements to copy and should need 256/16 = 16 sub-groups, where each work-item handles loading of 16 elements. Is that so?
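To make the arithmetic behind questions 1 and 3 concrete, here is a minimal standalone C++ sketch of the two readings. All helper names are hypothetical and do not come from the CUTLASS/CuTe API. The only assumption is the subgroup size of 16 stated in the quoted documentation. The sketch does not say which reading the hardware follows; it only shows that 256/16 = 16 can count either sub-groups (one element per work-item) or elements per work-item (one sub-group), but not both at once.

```cpp
#include <cassert>

// Hypothetical helper names for illustration; not part of any library API.
constexpr int kSubgroupSize = 16;  // threads per subgroup, per the quoted docs

// Total number of elements in a height x width block.
constexpr int block_elements(int height, int width) { return height * width; }

// Reading A: a single subgroup executes the whole copy, so each
// work-item owns (elements / subgroup size) values.
constexpr int elements_per_work_item(int height, int width) {
  return block_elements(height, width) / kSubgroupSize;
}

// Reading B: each work-item owns exactly one element, so the copy
// would need (elements / subgroup size) subgroups.
constexpr int subgroups_if_one_element_each(int height, int width) {
  return block_elements(height, width) / kSubgroupSize;
}

// 8x4 block from question 1: 32 elements, i.e. 2 subgroups under reading B.
static_assert(block_elements(8, 4) == 32, "8x4 block has 32 elements");
static_assert(subgroups_if_one_element_each(8, 4) == 2, "reading B, 8x4");

// 32x8 block from question 3: 256 elements.
static_assert(block_elements(32, 8) == 256, "32x8 block has 256 elements");

// Runtime check for the 32x8 case: each reading alone yields 16, but
// 16 subgroups x 16 work-items x 16 elements would cover 4096 elements.
inline void check_question3() {
  const int total = block_elements(32, 8);
  assert(elements_per_work_item(32, 8) == 16);
  assert(subgroups_if_one_element_each(32, 8) == 16);
  assert(16 * kSubgroupSize * 16 != total);  // 4096 != 256
}
```

Under reading A the 32x8 load is one subgroup operation with 16 values per work-item; under reading B it would be 16 subgroups with one value per work-item.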

cc @petercad @mkumargarg
