Cannot run any CUDA operators when trying to schedule with DP attention #10098
Unanswered
Hanatan1th asked this question in Q&A
I can run successfully with tp2, dp1 and enable_dp_attention=False. When I try to run with enable_dp_attention=True, tp4, dp2, it reports an error like:

When I change this torch.zeros to torch.empty, the error is no longer reported here, but a similar error is reported later:

or

It seems I can only run CUDA operators inside the model forward pass when running with enable_dp_attention.
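
One observation that may help narrow this down: torch.zeros has to launch a kernel to fill the buffer, whereas torch.empty only allocates memory. An error that moves to a later line when one is swapped for the other is therefore consistent with CUDA reporting errors asynchronously, at the next kernel launch rather than at the operation that actually failed. That is an assumption, not something confirmed in this thread. A minimal sketch for checking it is to rerun the failing process with the CUDA_LAUNCH_BLOCKING=1 environment variable so that kernel launches become synchronous. The helper names blocking_launch_env and run_with_blocking_launches are hypothetical, and the probe command is only a placeholder for the real launch command:

```python
import os
import subprocess
import sys


def blocking_launch_env(base_env=None):
    """Return a copy of the environment with CUDA_LAUNCH_BLOCKING=1 set.

    With this variable set, CUDA kernel launches are synchronous, so an
    error is raised at the launch that caused it instead of at a later one.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


def run_with_blocking_launches(cmd):
    """Run cmd (a list of arguments) as a child process with blocking launches."""
    return subprocess.run(
        cmd, env=blocking_launch_env(), capture_output=True, text=True
    )


if __name__ == "__main__":
    # Placeholder child command that only echoes the variable back.
    # Replace it with the actual server launch command (not shown here).
    probe = [
        sys.executable,
        "-c",
        "import os; print(os.environ['CUDA_LAUNCH_BLOCKING'])",
    ]
    result = run_with_blocking_launches(probe)
    print(result.stdout.strip())
```

If the traceback then points at a different line than before, the originally reported location was probably only where the earlier failure surfaced.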