
Add inductor_flex_attention_bwd operator #940

Open

OmarPavel wants to merge 2 commits into main from export-D95461827

Conversation

@OmarPavel
Contributor

Summary:
Add a TritonBench operator to benchmark the flex attention backward-pass
inductor kernel (triton_tem_fused_flex_attention_backward_zeros_1).

Uses FWD_ONLY=True but manually times the backward pass via
output.backward(dy, retain_graph=True). Compares aten (eager) vs. inductor
(torch.compile). The backward FLOP count uses a 2.5x multiplier (2.0x for
the backward pass plus 0.5x for recompute).

Default config: B=8, H=16, D=128, bf16, requires_grad=True on q/k/v.

Reviewed By: stashuk-olek

Differential Revision: D95461827
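
A minimal, standalone sketch of the backward-only timing and 2.5x FLOP accounting described above, assuming the default shapes (seq_len is illustrative since the description does not pin it). This is not the operator's actual TritonBench code; only the inductor path is shown, and the eager baseline would be timed the same way without torch.compile.

```python
# Sketch of manual backward timing with retain_graph=True (illustrative only).
import torch
from torch.nn.attention.flex_attention import flex_attention

B, H, S, D = 8, 16, 4096, 128            # default config; S=4096 is illustrative
dtype = torch.bfloat16

q, k, v = (torch.randn(B, H, S, D, device="cuda", dtype=dtype, requires_grad=True)
           for _ in range(3))

compiled_flex = torch.compile(flex_attention)   # inductor path; eager is analogous
out = compiled_flex(q, k, v)                    # run forward once, keep the graph
dy = torch.randn_like(out)
out.backward(dy, retain_graph=True)             # warm-up backward (triggers compile)
q.grad = k.grad = v.grad = None

def time_backward(iters: int = 10) -> float:
    """Time only the backward pass, re-using the retained forward graph."""
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    torch.cuda.synchronize()
    start.record()
    for _ in range(iters):
        # retain_graph=True lets backward re-run without re-running forward
        out.backward(dy, retain_graph=True)
        q.grad = k.grad = v.grad = None         # drop accumulated grads
    end.record()
    torch.cuda.synchronize()
    return start.elapsed_time(end) / iters      # ms per backward pass

ms = time_backward()
# Forward attention FLOPs: two matmuls (QK^T and PV), 2 FLOPs per MAC.
fwd_flops = 4 * B * H * S * S * D
# Backward multiplier from the description: 2.0x backward + 0.5x recompute.
bwd_flops = 2.5 * fwd_flops
print(f"backward: {ms:.3f} ms, {bwd_flops / (ms * 1e-3) / 1e12:.1f} TFLOPS")
```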

Summary (companion forward-pass operator):
Add a TritonBench operator to benchmark the flex attention forward-pass
inductor kernel (triton_tem_fused_flex_attention_0).

Compares aten (eager flex_attention) vs. inductor (torch.compile) with a
causal mask, sweeping seq_len from 128 to 16384. Reports latency, speedup,
and TFLOPS (adjusted for block sparsity).

Default config: B=8, H=16, D=128, bf16.

Reviewed By: stashuk-olek

Differential Revision: D95461825
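
A minimal, standalone sketch of the forward comparison described above (causal block mask, seq_len sweep, eager vs. torch.compile). It does not use the TritonBench harness, and the 0.5 block-sparsity factor is an approximation rather than the operator's exact accounting; the eager baseline may run out of memory at the largest sequence lengths on smaller GPUs.

```python
# Sketch of the causal forward benchmark: eager vs. inductor (illustrative only).
import torch
from torch.nn.attention.flex_attention import flex_attention, create_block_mask
from triton.testing import do_bench

B, H, D = 8, 16, 128                       # default config
dtype = torch.bfloat16

def causal(b, h, q_idx, kv_idx):
    # mask_mod: a query position may only attend to earlier (or equal) keys
    return q_idx >= kv_idx

compiled_flex = torch.compile(flex_attention)       # inductor path

for S in [2 ** i for i in range(7, 15)]:            # seq_len 128 .. 16384
    q, k, v = (torch.randn(B, H, S, D, device="cuda", dtype=dtype)
               for _ in range(3))
    # The mask does not depend on batch/head, so broadcast with B=None, H=None.
    block_mask = create_block_mask(causal, None, None, S, S, device="cuda")

    # NOTE: the eager baseline may exhaust GPU memory at large seq_len.
    eager_ms = do_bench(lambda: flex_attention(q, k, v, block_mask=block_mask))
    inductor_ms = do_bench(lambda: compiled_flex(q, k, v, block_mask=block_mask))

    # Two matmuls (QK^T, PV) at 2 FLOPs/MAC; causal keeps roughly half of the
    # score matrix, so scale by ~0.5 as a block-sparsity adjustment.
    flops = 4 * B * H * S * S * D * 0.5
    tflops = flops / (inductor_ms * 1e-3) / 1e12
    print(f"seq_len={S:6d}  eager {eager_ms:8.3f} ms  "
          f"inductor {inductor_ms:8.3f} ms  "
          f"speedup {eager_ms / inductor_ms:5.2f}x  {tflops:7.1f} TFLOPS")
```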

meta-codesync Bot commented Mar 10, 2026

@OmarPavel has exported this pull request. If you are a Meta employee, you can view the originating Diff in D95461827.

