Added benchmarking for new torchao low precision attention api #3865

Merged
howardzhang-cv merged 46 commits into main from gh/howardzhang-cv/17/head
Mar 9, 2026

@howardzhang-cv howardzhang-cv commented Feb 12, 2026

Stack from ghstack (oldest at bottom):

Summary

  • Adds a benchmark for the new low precision attention API
  • The baseline and the test model can each be set to any of these backends: fa2, fa3, fa3_fp8, fa4, fa4_fp8
  • Uses the flux.1-schnell model with 4 inference steps and DrawBench prompts
  • Has options to control the number of prompts, torch.compile usage, warmup_iters, use of debug prompts, the number of inference steps, and rope fusion
  • Follows the guidelines of "add performance and accuracy eval of flux-1.schnell" (#3502)

Example Run

python benchmarks/prototype/attention/eval_flux_model.py --baseline fa3 --test fa3_fp8 --compile
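
For readers who want a feel for how a command line like the one above could be parsed, here is a minimal sketch using only the standard library. Only `--baseline`, `--test`, `--compile` and the backend names come from this PR description. Every other flag name (`--num_prompts`, `--warmup_iters`, `--debug_prompts`, `--num_inference_steps`, `--fuse_rope`) and every default value is a hypothetical stand-in for the options listed in the summary, not the script's actual interface.

```python
import argparse

# Backend names as listed in the PR summary.
BACKENDS = ["fa2", "fa3", "fa3_fp8", "fa4", "fa4_fp8"]


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        description="Sketch of a Flux attention benchmark CLI (illustrative only)"
    )
    # Flags that appear in the example run of the PR description.
    # The default values are assumptions made for this sketch.
    parser.add_argument("--baseline", choices=BACKENDS, default="fa3")
    parser.add_argument("--test", choices=BACKENDS, default="fa3_fp8")
    parser.add_argument("--compile", action="store_true")
    # Hypothetical flag names for the remaining options named in the summary.
    parser.add_argument("--num_prompts", type=int, default=None)
    parser.add_argument("--warmup_iters", type=int, default=2)
    parser.add_argument("--debug_prompts", action="store_true")
    parser.add_argument("--num_inference_steps", type=int, default=4)
    parser.add_argument("--fuse_rope", action="store_true")
    return parser


# Parse the same arguments as the example run above.
args = build_parser().parse_args(["--baseline", "fa3", "--test", "fa3_fp8", "--compile"])
print(vars(args))
```

Restricting `--baseline` and `--test` with `choices` makes an unsupported backend name fail at parse time instead of partway through a run.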

pytorch-bot bot commented Feb 12, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/3865

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 6065db8 with merge base 42bcdc4:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla bot added the CLA Signed label Feb 12, 2026
howardzhang-cv added a commit that referenced this pull request Feb 12, 2026
ghstack-source-id: 66939a9
Pull-Request: #3865
@howardzhang-cv howardzhang-cv added the benchmark and module: not user facing labels Feb 12, 2026
howardzhang-cv added a commit that referenced this pull request Feb 12, 2026
ghstack-source-id: dd15b60
Pull-Request: #3865
howardzhang-cv added a commit that referenced this pull request Feb 13, 2026
ghstack-source-id: 4f64a25
Pull-Request: #3865
@howardzhang-cv howardzhang-cv marked this pull request as draft February 21, 2026 02:49
howardzhang-cv added a commit to howardzhang-cv/ao that referenced this pull request Feb 23, 2026
ghstack-source-id: 2c5edc7
Pull-Request: pytorch#3865
howardzhang-cv added a commit to howardzhang-cv/ao that referenced this pull request Feb 24, 2026
ghstack-source-id: 2c5edc7
Pull-Request: pytorch#3865
@howardzhang-cv howardzhang-cv requested review from drisspg and vkuzo March 2, 2026 19:28
howardzhang-cv added a commit to howardzhang-cv/ao that referenced this pull request Mar 2, 2026
Benchmark script for evaluating FP8 attention on Flux (text-to-image)
models. Measures image generation quality and runtime performance with
and without low-precision attention applied.

ghstack-source-id: 0078782
Pull-Request: pytorch#3865
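
The commit message above describes measuring runtime with and without low-precision attention. A minimal sketch of that kind of comparison, using warmup iterations followed by timed iterations, could look like the following. The helper names `time_fn` and `speedup` are hypothetical, the two workloads are plain Python stand-ins for the baseline and test pipelines, and nothing here is taken from the actual benchmark script.

```python
import statistics
import time
from typing import Callable


def time_fn(fn: Callable[[], object], warmup_iters: int = 2, timed_iters: int = 5) -> float:
    """Return the median wall-clock seconds per call, after untimed warmup calls."""
    for _ in range(warmup_iters):
        fn()  # warmup: absorbs one-time costs such as compilation or cache fills
    samples = []
    for _ in range(timed_iters):
        start = time.perf_counter()
        fn()
        # For GPU work, a device synchronization (e.g. torch.cuda.synchronize())
        # would be needed here before reading the clock.
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def speedup(baseline_s: float, test_s: float) -> float:
    """Ratio of baseline time to test time; values above 1.0 mean the test is faster."""
    return baseline_s / test_s


# Stand-in workloads (assumptions, in place of real model inference).
calls = {"baseline": 0, "test": 0}


def run_baseline() -> int:
    calls["baseline"] += 1
    return sum(range(20000))


def run_test() -> int:
    calls["test"] += 1
    return sum(range(10000))


baseline_s = time_fn(run_baseline, warmup_iters=2, timed_iters=5)
test_s = time_fn(run_test, warmup_iters=2, timed_iters=5)
print(f"baseline: {baseline_s:.6f}s  test: {test_s:.6f}s  speedup: {speedup(baseline_s, test_s):.2f}x")
```

Using the median instead of the mean keeps a single slow iteration from dominating the reported time.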
howardzhang-cv added a commit to howardzhang-cv/ao that referenced this pull request Mar 3, 2026
Benchmark script for evaluating FP8 attention on Flux (text-to-image)
models. Measures image generation quality and runtime performance with
and without low-precision attention applied.

ghstack-source-id: f287aed
Pull-Request: pytorch#3865
vkuzo commented Mar 3, 2026

Do you want to eventually compare this with results from quantizing linear, MoE, etc.? If yes, you could build this into the existing scripts from the start; if no, a separate script seems fine.

howardzhang-cv added a commit to howardzhang-cv/ao that referenced this pull request Mar 5, 2026
Benchmark script for evaluating FP8 attention on Flux (text-to-image)
models. Measures image generation quality and runtime performance with
and without low-precision attention applied.

ghstack-source-id: f287aed
Pull-Request: pytorch#3865
@howardzhang-cv howardzhang-cv changed the base branch from gh/howardzhang-cv/17/base to main March 9, 2026 22:03
@howardzhang-cv howardzhang-cv merged commit 1b390fb into main Mar 9, 2026
36 checks passed
@howardzhang-cv howardzhang-cv deleted the gh/howardzhang-cv/17/head branch March 9, 2026 22:19
Labels

benchmark, CLA Signed, module: not user facing

3 participants