Added benchmark for single attention layer across different sequence lengths #3929

Merged
howardzhang-cv merged 34 commits into main from gh/howardzhang-cv/19/head
Mar 9, 2026

Conversation

@howardzhang-cv
Contributor

@howardzhang-cv howardzhang-cv commented Feb 21, 2026

Stack from ghstack (oldest at bottom):

Summary

  • Added a benchmark for the new low-precision attention API. It times a single attention layer; for the fp8 backends, the quantization kernel is included in the measured time.
  • The baseline and test backends can each be set to any of: fa2, fa3, fa3_fp8, fa4, fa4_fp8.
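
The baseline-versus-test comparison summarized above can be sketched roughly as follows. This is a minimal, stdlib-only illustration and not the script from this PR: `naive_attention`, `time_ms`, and `sweep` are hypothetical names, and a pure-Python attention stands in for the real backends so the harness runs without a GPU. Timing real GPU kernels would also need device synchronization, which this sketch leaves out.

```python
import math
import random
import time


def naive_attention(q, k, v):
    """Single-head softmax(QK^T / sqrt(d)) V on lists of lists (stand-in kernel)."""
    d = len(q[0])
    scale = 1.0 / math.sqrt(d)
    out = []
    for qi in q:
        scores = [scale * sum(a * b for a, b in zip(qi, kj)) for kj in k]
        m = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        out.append([
            sum(w * vj[c] for w, vj in zip(weights, v)) / total
            for c in range(d)
        ])
    return out


def time_ms(fn, *args, warmup=1, iters=3):
    """Median wall-clock time of fn(*args) in milliseconds."""
    for _ in range(warmup):
        fn(*args)
    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        fn(*args)
        samples.append((time.perf_counter() - start) * 1e3)
    samples.sort()
    return samples[len(samples) // 2]


def sweep(baseline_fn, test_fn, seq_lens, head_dim=8, seed=0):
    """Time both callables at each sequence length and report the speedup."""
    rng = random.Random(seed)
    rows = []
    for s in seq_lens:
        # Three random (seq_len x head_dim) matrices: q, k, v.
        q, k, v = (
            [[rng.uniform(-1, 1) for _ in range(head_dim)] for _ in range(s)]
            for _ in range(3)
        )
        base = time_ms(baseline_fn, q, k, v)
        test = time_ms(test_fn, q, k, v)
        rows.append((s, base, test, base / test))
    return rows


if __name__ == "__main__":
    # Both sides use the same stand-in here; a real run would pass two backends.
    for s, base, test, speedup in sweep(naive_attention, naive_attention, [16, 32, 64]):
        print(f"seq_len={s:5d}  baseline={base:8.3f} ms  test={test:8.3f} ms  speedup={speedup:.2f}x")
```

For an fp8 backend, the test callable would wrap both the quantization step and the attention call, so that both fall inside the timed region as the summary describes.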

Example Run

python benchmarks/prototype/attention/benchmark_sdpa.py --baseline fa3 --test fa3_fp8
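
For orientation, the two flags in the command above could be parsed along these lines. This is a hedged sketch using only `argparse`. Only the `--baseline` and `--test` flags and the five backend names come from this PR's description; the `build_parser` helper and the default values are assumptions for illustration, and the actual script may define further options.

```python
import argparse

# Backend names as listed in the PR summary.
BACKENDS = ("fa2", "fa3", "fa3_fp8", "fa4", "fa4_fp8")


def build_parser():
    """Hypothetical parser; the defaults are illustrative, not taken from the PR."""
    parser = argparse.ArgumentParser(
        description="Single attention layer benchmark (sketch)"
    )
    parser.add_argument("--baseline", choices=BACKENDS, default="fa3",
                        help="backend used as the reference")
    parser.add_argument("--test", choices=BACKENDS, default="fa3_fp8",
                        help="backend compared against the baseline")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"baseline={args.baseline} test={args.test}")
```

Restricting both flags with `choices` makes an unknown backend name fail at argument parsing rather than partway through a run.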

@pytorch-bot

pytorch-bot bot commented Feb 21, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/3929

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit abf45cf with merge base 42bcdc4:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla bot added the CLA Signed label Feb 21, 2026
@howardzhang-cv howardzhang-cv marked this pull request as draft February 21, 2026 02:49
howardzhang-cv added a commit to howardzhang-cv/ao that referenced this pull request Feb 23, 2026
…lengths

Summary:

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:
ghstack-source-id: f75bc1d
Pull-Request: pytorch#3929
howardzhang-cv added a commit to howardzhang-cv/ao that referenced this pull request Feb 24, 2026
…lengths

ghstack-source-id: f75bc1d
Pull-Request: pytorch#3929
howardzhang-cv added a commit to howardzhang-cv/ao that referenced this pull request Feb 25, 2026
…lengths

ghstack-source-id: dd85756
Pull-Request: pytorch#3929
howardzhang-cv added a commit to howardzhang-cv/ao that referenced this pull request Feb 25, 2026
…lengths

ghstack-source-id: dd85756
Pull-Request: pytorch#3929
@howardzhang-cv howardzhang-cv added the module: not user facing and benchmark labels Feb 25, 2026
howardzhang-cv added a commit to howardzhang-cv/ao that referenced this pull request Feb 25, 2026
…lengths

ghstack-source-id: 9f9d973
Pull-Request: pytorch#3929
howardzhang-cv added a commit to howardzhang-cv/ao that referenced this pull request Feb 25, 2026
…lengths

ghstack-source-id: 9f9d973
Pull-Request: pytorch#3929
howardzhang-cv added a commit to howardzhang-cv/ao that referenced this pull request Feb 25, 2026
…lengths

ghstack-source-id: 3d0b5ef
Pull-Request: pytorch#3929
howardzhang-cv added a commit to howardzhang-cv/ao that referenced this pull request Feb 26, 2026
…lengths

ghstack-source-id: 3d0b5ef
Pull-Request: pytorch#3929
howardzhang-cv added a commit to howardzhang-cv/ao that referenced this pull request Feb 28, 2026
Benchmark script for measuring FP8 SDPA performance on a single
attention layer across different sequence lengths, head dimensions,
and backends. Useful for isolating kernel-level performance.

ghstack-source-id: 43da6b1
Pull-Request: pytorch#3929
@howardzhang-cv howardzhang-cv requested review from drisspg and vkuzo March 2, 2026 19:28
howardzhang-cv added a commit to howardzhang-cv/ao that referenced this pull request Mar 2, 2026
Benchmark script for measuring FP8 SDPA performance on a single
attention layer across different sequence lengths, head dimensions,
and backends. Useful for isolating kernel-level performance.

ghstack-source-id: 8591090
Pull-Request: pytorch#3929
howardzhang-cv added a commit to howardzhang-cv/ao that referenced this pull request Mar 3, 2026
Benchmark script for measuring FP8 SDPA performance on a single
attention layer across different sequence lengths, head dimensions,
and backends. Useful for isolating kernel-level performance.

ghstack-source-id: ae37727
Pull-Request: pytorch#3929
howardzhang-cv added a commit to howardzhang-cv/ao that referenced this pull request Mar 3, 2026
Benchmark script for measuring FP8 SDPA performance on a single
attention layer across different sequence lengths, head dimensions,
and backends. Useful for isolating kernel-level performance.

ghstack-source-id: ae37727
Pull-Request: pytorch#3929
howardzhang-cv added a commit to howardzhang-cv/ao that referenced this pull request Mar 5, 2026
Benchmark script for measuring FP8 SDPA performance on a single
attention layer across different sequence lengths, head dimensions,
and backends. Useful for isolating kernel-level performance.

ghstack-source-id: ae37727
Pull-Request: pytorch#3929
howardzhang-cv added a commit to howardzhang-cv/ao that referenced this pull request Mar 5, 2026
Benchmark script for measuring FP8 SDPA performance on a single
attention layer across different sequence lengths, head dimensions,
and backends. Useful for isolating kernel-level performance.

ghstack-source-id: ae37727
Pull-Request: pytorch#3929
@howardzhang-cv howardzhang-cv changed the base branch from gh/howardzhang-cv/19/base to main March 9, 2026 22:18
@howardzhang-cv howardzhang-cv merged commit 1b920d0 into main Mar 9, 2026
36 checks passed
@howardzhang-cv howardzhang-cv deleted the gh/howardzhang-cv/19/head branch March 9, 2026 22:19

Labels

  • benchmark
  • CLA Signed (this label is managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed)
  • module: not user facing (use this tag if you don't want this PR to show up in release notes)

2 participants