Work Group Size Optimization for Tuning #603

@EJainDev

Description

Larger work group sizes always result in better performance, unless the larger size leaves fewer compute units in use. Why not calculate the work group size as follows:

WGS_Total = max(min(Num_Threads / Compute_Units, Max_WGS), Sub_Group_Size)

This reduces the number of parameters to explore while maintaining maximum performance, which improves tuning time.
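To make the proposal concrete, here is a minimal sketch of the formula in Python. The function name and the use of integer division for `Num_Threads / Compute_Units` are my assumptions; the formula itself does not say how a non-integer quotient should be rounded.

```python
def total_work_group_size(num_threads, compute_units, max_wgs, sub_group_size):
    """Evaluate WGS_Total = max(min(Num_Threads / Compute_Units, Max_WGS), Sub_Group_Size).

    Assumption: the quotient is rounded down (integer division).
    """
    # Threads available per compute unit if the work is spread evenly.
    threads_per_cu = num_threads // compute_units
    # Never exceed the device limit, never go below one sub-group.
    return max(min(threads_per_cu, max_wgs), sub_group_size)


# Illustrative values only, not measurements from any device.
print(total_work_group_size(4096, 16, 1024, 32))     # 4096 / 16 = 256, within both bounds
print(total_work_group_size(1 << 20, 16, 1024, 32))  # capped at Max_WGS
print(total_work_group_size(64, 16, 1024, 32))       # raised to Sub_Group_Size
```

With these inputs the three calls give 256, 1024 and 32, so the `min` clamp applies to large problems and the `max` clamp to small ones.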

For GEMM specifically, the ratio of WGS_M to WGS_N should be as close to the ratio of M to N as possible.
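One possible way to pick the split, sketched in Python: enumerate every factor pair of WGS_Total and keep the pair whose ratio is nearest to M / N. The function name, the requirement that WGS_M * WGS_N equals WGS_Total exactly, and the log-ratio distance used to define "closest" are all assumptions of this sketch, not something the tuner currently does.

```python
import math


def split_work_group(wgs_total, m, n):
    """Return (WGS_M, WGS_N) with WGS_M * WGS_N == wgs_total and
    WGS_M / WGS_N as close as possible to m / n.

    Assumption: closeness is measured on a log scale, so a ratio that is
    2x too large counts the same as one that is 2x too small.
    """
    target = math.log(m / n)
    best = None
    for wgs_m in range(1, wgs_total + 1):
        if wgs_total % wgs_m != 0:
            continue  # only exact factor pairs are considered
        wgs_n = wgs_total // wgs_m
        error = abs(math.log(wgs_m / wgs_n) - target)
        if best is None or error < best[0]:
            best = (error, wgs_m, wgs_n)
    return best[1], best[2]


# Illustrative GEMM shapes only.
print(split_work_group(256, 1024, 1024))  # square problem
print(split_work_group(256, 4096, 1024))  # M is 4x N
print(split_work_group(256, 1024, 4096))  # N is 4x M
```

For a total of 256 this returns (16, 16), (32, 8) and (8, 32) respectively. A real tuner would probably also have to respect per-dimension limits and any kernel constraints on tile sizes, which this sketch ignores.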
