Generate GPU L2 size inputs in flash_attention Triton Bench #681

sryap · 2025-12-03T21:00:50Z

Summary:
Add an option to generate inputs as large as the GPU L2 cache size, so
that the benchmark does not need an explicit cache clear in every iteration

Differential Revision: D85318459
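
A minimal sketch of one way to read the idea above, not the code in this PR: if the benchmark allocates enough distinct input sets that their combined size reaches the L2 cache size, it can rotate through them so each iteration reads data that is unlikely to still be cached. The helper names (`qkv_bytes`, `num_input_sets`, `rotating_indices`), the tensor shape, and the 50 MiB L2 size are all assumptions made for illustration.

```python
import math


def qkv_bytes(batch, heads, seq_len, head_dim, dtype_bytes=2):
    """Bytes used by one set of q, k, v tensors.

    Assumes each tensor has shape (batch, heads, seq_len, head_dim)
    and a 2-byte element type (e.g. fp16/bf16) by default.
    """
    return 3 * batch * heads * seq_len * head_dim * dtype_bytes


def num_input_sets(l2_bytes, bytes_per_set):
    """Smallest number of input sets whose combined size is >= l2_bytes."""
    if bytes_per_set <= 0:
        raise ValueError("bytes_per_set must be positive")
    return max(1, math.ceil(l2_bytes / bytes_per_set))


def rotating_indices(n_sets, n_iters):
    """Which input set each benchmark iteration would use (round-robin)."""
    return [i % n_sets for i in range(n_iters)]


if __name__ == "__main__":
    l2_bytes = 50 * 1024 * 1024          # hypothetical L2 size, passed in by the caller
    per_set = qkv_bytes(4, 16, 1024, 64)  # hypothetical attention shape
    n_sets = num_input_sets(l2_bytes, per_set)
    print(per_set, n_sets, rotating_indices(n_sets, 5))
```

The trade-off in this sketch is extra device memory for the additional input sets in exchange for skipping a per-iteration cache flush; whether the PR sizes or rotates inputs this way is not stated in the summary.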

meta-codesync · 2025-12-03T21:00:57Z

@sryap has exported this pull request. If you are a Meta employee, you can view the originating Diff in D85318459.

Summary: Add an option to generate inputs as large as the GPU L2 cache size to avoid an explicit cache clearing in every iteration

Reviewed By: henrylhtsang

Differential Revision: D85318459

sryap had a problem deploying to docker-s3-upload December 3, 2025 21:00 — with GitHub Actions Failure

meta-cla bot added the cla signed label Dec 3, 2025

meta-codesync bot added fb-exported meta-exported labels Dec 3, 2025

henrylhtsang approved these changes Dec 3, 2025


facebook-github-bot force-pushed the export-D85318459 branch from ee255d6 to 8f08261 on December 3, 2025 22:20

facebook-github-bot had a problem deploying to docker-s3-upload December 3, 2025 22:20 — with GitHub Actions Error

facebook-github-bot force-pushed the export-D85318459 branch from 8f08261 to 2d443e1 on December 3, 2025 22:26

facebook-github-bot temporarily deployed to docker-s3-upload December 3, 2025 22:26 — with GitHub Actions Inactive

facebook-github-bot had a problem deploying to docker-s3-upload December 3, 2025 22:26 — with GitHub Actions Failure

Generate GPU L2 size inputs in flash_attention Triton Bench (#681)

560e2b9

Summary: Add an option to generate inputs as large as the GPU L2 cache size to avoid an explicit cache clearing in every iteration

Reviewed By: henrylhtsang

Differential Revision: D85318459

facebook-github-bot force-pushed the export-D85318459 branch from 2d443e1 to 560e2b9 on December 4, 2025 00:33

facebook-github-bot temporarily deployed to docker-s3-upload December 4, 2025 00:33 — with GitHub Actions Inactive

facebook-github-bot had a problem deploying to docker-s3-upload December 4, 2025 00:33 — with GitHub Actions Failure
