Test comparing attention weights fails for FlashInfer if q_len=1 #96

Description

@mseeger

Describe the bug

See test_larger_comparison in test/attention/test_attn_weights.py, specifically the q_len=1 cases.

Currently, FlashInfer is skipped for these cases. If the skip is removed (in get_variants), the tests fail for FlashInfer. All other variants agree with each other.

FlashInfer passes for all other q_len values. The likely cause is that the q_len=1 case is implemented differently.
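To make the failure mode concrete, here is a minimal, dependency-free sketch of this kind of comparison. It is not the code in test_attn_weights.py and does not call FlashInfer: weights_reference, weights_variant and max_abs_diff are hypothetical names, and the separate q_len=1 branch only illustrates the assumption that a backend may take a different code path for single-token queries.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    total = sum(es)
    return [e / total for e in es]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def weights_reference(q, k):
    """Causal attention weights, shape (q_len, kv_len).

    Assumption: the queries are the last q_len positions of the sequence,
    so query i may attend to keys 0 .. kv_len - q_len + i.
    """
    q_len, kv_len, d = len(q), len(k), len(q[0])
    rows = []
    for i, qi in enumerate(q):
        last_visible = kv_len - q_len + i
        scores = [dot(qi, kj) / math.sqrt(d) for kj in k[: last_visible + 1]]
        # Masked (future) keys get weight exactly zero.
        rows.append(softmax(scores) + [0.0] * (kv_len - last_visible - 1))
    return rows


def weights_variant(q, k):
    """Hypothetical backend with a dedicated path for q_len == 1."""
    if len(q) == 1:
        # Single-token path: the only query is the last position,
        # so it sees every key and no mask is applied.
        d = len(q[0])
        return [softmax([dot(q[0], kj) / math.sqrt(d) for kj in k])]
    # For brevity, the multi-token path reuses the reference here.
    return weights_reference(q, k)


def max_abs_diff(q_len, kv_len=12, d=4, seed=0):
    """Largest elementwise difference between the two implementations."""
    rng = random.Random(seed)
    q = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(q_len)]
    k = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(kv_len)]
    ref, var = weights_reference(q, k), weights_variant(q, k)
    return max(abs(a - b) for r, v in zip(ref, var) for a, b in zip(r, v))


# Include q_len=1 explicitly, since that is the case reported as failing.
for q_len in (1, 2, 5, 12):
    print(q_len, max_abs_diff(q_len))
```

In this sketch the two paths agree by construction. In a real backend, a mismatch confined to q_len=1 would point at the single-token branch (for example its masking, scaling, or how the weights are returned) rather than at the shared comparison logic.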

Labels: bug
