🚀 The feature, motivation and pitch
Hi! Thank you for adding support for MTA (#689)! Do I understand correctly that this implementation only covers the post-softmax key-query convolution? MTA also includes a pre-softmax key-query convolution, a head convolution, and a gated group norm (the last of which should probably not be part of the kernel). We have released reference code here: https://github.com/facebookresearch/RAM/blob/main/projects/mta/mta_transformer.py#L337
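To make the pre-softmax vs. post-softmax distinction concrete, here is a rough single-head NumPy sketch. It is an illustration written for this issue, not the reference code linked above and not the kernel from #689: the function names (`kq_conv`, `pre_softmax_mta`, `post_softmax_mta`) are hypothetical, the masking order (zero out future positions before mixing, then re-mask) is an assumption, and head convolution and the group norm are left out.

```python
import numpy as np

def softmax(x):
    # Row-wise softmax; rows always contain at least one finite entry here.
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

def kq_conv(scores, kernel):
    """Mix an (n, n) score map over the query and key axes.

    kernel has shape (c_q, c_k): it looks back over the previous c_q - 1
    queries and is centered over c_k neighboring keys.
    """
    n = scores.shape[0]
    c_q, c_k = kernel.shape
    pad_k = c_k // 2
    padded = np.pad(scores, ((c_q - 1, 0), (pad_k, c_k - 1 - pad_k)))
    out = np.zeros_like(scores)
    for a in range(c_q):          # query look-back offset
        for b in range(c_k):      # key offset, shifted by pad_k
            out += kernel[a, b] * padded[c_q - 1 - a : c_q - 1 - a + n, b : b + n]
    return out

def pre_softmax_mta(q, k, kernel):
    # Convolve the attention logits, then apply softmax.
    n, d = q.shape
    mask = np.tril(np.ones((n, n), dtype=bool))
    logits = np.where(mask, q @ k.T / np.sqrt(d), 0.0)  # assumed: zero future logits
    mixed = np.where(mask, kq_conv(logits, kernel), -np.inf)
    return softmax(mixed)

def post_softmax_mta(q, k, kernel):
    # Apply softmax first, then convolve the attention weights.
    n, d = q.shape
    mask = np.tril(np.ones((n, n), dtype=bool))
    weights = softmax(np.where(mask, q @ k.T / np.sqrt(d), -np.inf))
    return np.where(mask, kq_conv(weights, kernel), 0.0)

rng = np.random.default_rng(0)
q = rng.standard_normal((6, 4))
k = rng.standard_normal((6, 4))

# Identity kernel: both variants reduce to ordinary causal attention.
identity = np.zeros((2, 3))
identity[0, 1] = 1.0
pre_id = pre_softmax_mta(q, k, identity)
post_id = post_softmax_mta(q, k, identity)

# A non-trivial kernel: the two variants generally give different weights.
kernel = np.array([[0.1, 0.7, 0.1], [0.0, 0.1, 0.0]])
pre_w = pre_softmax_mta(q, k, kernel)
post_w = post_softmax_mta(q, k, kernel)
```

With the identity kernel both functions return plain causal attention weights, which is a convenient sanity check. With a general kernel the pre-softmax variant still yields rows that sum to 1, while the post-softmax variant does not guarantee that.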
Alternatives
No response
Additional context
No response