[Feature Request] expose unidirectional (causal) attribute of GQA #23409
Open
Description
Describe the feature request
We have a unidirectional attribute in MHA, but it is missing in GQA because most LLMs are causal. Transformer-based Text-to-Image models, on the other hand, are not causal. We should expose this attribute in GQA to facilitate the deployment of Text-to-Image models.
Describe scenario use case
Text-to-Image generation models (like Stable Diffusion 3) require attention implemented with causal=False.
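For illustration, here is a minimal NumPy sketch of what the flag changes. It is not the ONNX Runtime kernel; the function name `attention` and its `causal` parameter are hypothetical and only mirror the semantics described above. With causal=True each position can attend only to itself and earlier positions; with causal=False every position attends to the whole sequence, which is what Text-to-Image models need.

```python
import numpy as np

def attention(query, key, value, causal):
    """Single-head scaled dot-product attention (illustrative only).

    query, key, value: arrays of shape (seq_len, head_dim).
    causal: if True, position i may only attend to positions j <= i.
    Returns (output, weights).
    """
    scores = query @ key.T / np.sqrt(query.shape[-1])
    if causal:
        seq_len = scores.shape[0]
        # True strictly above the diagonal, i.e. for "future" positions.
        future = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
        scores = np.where(future, -np.inf, scores)
    # Numerically stable softmax over the key axis.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ value, weights

rng = np.random.default_rng(0)
q = rng.standard_normal((4, 8))
k = rng.standard_normal((4, 8))
v = rng.standard_normal((4, 8))

_, w_causal = attention(q, k, v, causal=True)   # lower-triangular weights
_, w_full = attention(q, k, v, causal=False)    # dense weights
```

In the causal case the weight matrix is lower triangular, so the first token sees only itself. In the non-causal case all entries are positive, so every token contributes to every output position.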