[Issue]: rocWMMA performance regression on SageAttention between 1.7 and 2.0 #3579

@bsyrowik

Problem Description

Three issues are mentioned in a PR for SageAttention (thu-ml/SageAttention#332):

  1. Coop API changes in rocWMMA 2+
  • This is well documented, and rwfsmith posted a link to the migration docs. No action required.
  2. Windows support for rocWMMA
  • Support for this is now available through TheRock builds, and jammm posted a link to the relevant PR. No action required.
  3. ~25% performance regression for this workload between rocWMMA 1.7 and 2.0. This may be worth investigating.

According to a comment on that PR (thu-ml/SageAttention#332 (comment)), a ~25% performance regression is observed for SageAttention when moving from rocWMMA 1.7 to 2.0. We should try to understand where this regression comes from and what can be done to mitigate it. Based on our benchmarking, I am not aware of any significant known performance regressions from 1.7 to 2.0, but I am also not aware of any benchmarking on the 9070, which is the device where this regression is observed.

Since there are code and API changes between the two versions of the code being run, it is unclear whether this regression truly originates in rocWMMA or whether something else is going on. A side-by-side comparison of the two implementations may provide insight; otherwise, we will need a self-contained reproducer that unambiguously demonstrates a performance regression in rocWMMA before we can investigate further.
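To make the ~25% figure comparable across runs, a reproducer could report the regression in a consistent way. The following is only a minimal sketch of such a timing harness, not an existing rocWMMA or SageAttention utility: `time_workload` and `relative_slowdown` are hypothetical helper names, and the callable passed in is assumed to launch the attention kernel and synchronize the device before returning, since otherwise only launch overhead is measured.

```python
import statistics
import time
from typing import Callable, List


def time_workload(fn: Callable[[], None], warmup: int = 3, iters: int = 20) -> List[float]:
    """Return per-iteration wall-clock times (seconds) for fn().

    Assumption: fn() blocks until the GPU work has finished.
    """
    for _ in range(warmup):
        fn()  # warm-up runs are discarded
    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return samples


def relative_slowdown(baseline: List[float], candidate: List[float]) -> float:
    """Fractional slowdown of candidate vs. baseline, using medians.

    A return value of 0.25 means the candidate takes 25% longer.
    """
    base = statistics.median(baseline)
    cand = statistics.median(candidate)
    return (cand - base) / base
```

With this, the same workload built against rocWMMA 1.7 and against 2.0 would each be timed with `time_workload`, and `relative_slowdown(times_1_7, times_2_0)` would give the regression as a fraction of the 1.7 runtime. Medians are used here so that a few outlier iterations do not dominate the result; note that a 25% increase in runtime is not the same as a 25% drop in throughput, so a reproducer should state which one it reports.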

Operating System

Linux

CPU

n/a

GPU

9070

ROCm Version

2.0

ROCm Component

rocWMMA

Steps to Reproduce

No response

(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support

No response

Additional Information

No response
