🚀[FEA]: Add FLARE: Fast Low-rank Attention Routing Engine to model zoo #1150

@vpuri3

Description

Is this a new feature, an improvement, or a change to existing functionality?

New Feature

How would you describe the priority of this feature request?

Low (would be nice)

Please provide a clear description of the problem you would like to solve.

I wanted to share our latest paper, titled "FLARE: Fast Low-rank
Attention Routing Engine."

https://arxiv.org/abs/2508.12594

FLARE is a novel self-attention mechanism that learns a low-rank attention formulation that can be applied in linear time. FLARE achieves superior accuracy across diverse neural PDE surrogate benchmarks, and scales to unprecedented problem sizes (1 million tokens on a single GPU). Our code is available below.

https://github.com/vpuri3/FLARE.py

FLARE is built entirely from standard fused attention primitives and does not need any custom kernels. The model implementation is at the link below; I am happy to contribute a PR implementing FLARE.

https://github.com/vpuri3/FLARE.py/blob/master/pdebench/models/flare.py
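To give a rough sense of the idea, below is a minimal sketch of a low-rank attention layer built only from PyTorch's fused `F.scaled_dot_product_attention`. This is a hypothetical simplification for illustration, not the FLARE implementation (see the linked repo for that): it routes the N input tokens through R learned latent tokens, so each fused attention pass costs O(N*R) rather than O(N^2).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LowRankAttention(nn.Module):
    """Illustrative low-rank attention sketch (NOT the FLARE code).

    Each head attends through R << N learned latent tokens:
      encode: latent queries attend to the full sequence  -> (B, h, R, d)
      decode: sequence queries attend to the R latents    -> (B, h, N, d)
    Both passes use the standard fused attention primitive, so the
    cost is linear in N for fixed rank R.
    """

    def __init__(self, dim: int, rank: int = 32, heads: int = 4):
        super().__init__()
        assert dim % heads == 0
        self.h, self.d = heads, dim // heads
        # R learned latent query tokens per head (hypothetical parameterization)
        self.latent_q = nn.Parameter(torch.randn(heads, rank, self.d) * 0.02)
        self.to_qkv = nn.Linear(dim, 3 * dim, bias=False)
        self.proj = nn.Linear(dim, dim, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (B, N, dim)
        B, N, _ = x.shape
        q, k, v = self.to_qkv(x).chunk(3, dim=-1)
        # reshape each to (B, heads, N, head_dim)
        q, k, v = (t.view(B, N, self.h, self.d).transpose(1, 2) for t in (q, k, v))
        lq = self.latent_q.unsqueeze(0).expand(B, -1, -1, -1)  # (B, h, R, d)
        # encode: N tokens -> R latents (fused attention, O(N*R))
        z = F.scaled_dot_product_attention(lq, k, v)
        # decode: R latents -> N tokens (fused attention, O(N*R))
        out = F.scaled_dot_product_attention(q, lq, z)
        out = out.transpose(1, 2).reshape(B, N, self.h * self.d)
        return self.proj(out)
```

For example, `LowRankAttention(64, rank=8, heads=4)` maps a `(2, 100, 64)` input to a `(2, 100, 64)` output while never materializing a 100x100 attention matrix. Because only fused attention and linear layers are used, no custom kernels are required, which matches the claim above.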

Describe any alternatives you have considered

No response

Metadata

Labels

? - Needs Triage (need team to review and classify), enhancement (new feature or request)
