fix the self_attn module of DecoderLayer#22

Open
TrinitialChan wants to merge 1 commit into SHI-Labs:master from TrinitialChan:fix_self_attn
Conversation

@TrinitialChan
Your paper specifies that the decoder performs a stacked multi-head self-attention operation; however, I found that the behavior of the DecoderLayer class in the code is inconsistent with that description. Printing the attn_output_weights of the self_attn module shows an attention map of shape ([L, 1, 1]), so each token can only attend to itself rather than to the whole sequence. There is clearly a problem with such an attention computation, and I provide a quick fix in this PR.
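For context, the symptom described above is what you get when a batch-first tensor is fed into a default (sequence-first) nn.MultiheadAttention: the module reads the batch axis as the sequence axis, sees L "batches" of length-1 sequences, and returns attention weights of shape (L, 1, 1). The sketch below is a hypothetical reconstruction of that failure mode (the variable names and sizes are illustrative, not taken from the repo), together with the transpose that restores full self-attention:

```python
import torch
import torch.nn as nn

L, E, heads = 7, 16, 4  # illustrative sequence length, embed dim, head count

# Default nn.MultiheadAttention expects (seq_len, batch, embed).
attn = nn.MultiheadAttention(E, heads)

x = torch.randn(1, L, E)  # batch-first tensor: (batch=1, seq_len=L, embed=E)

# Buggy call: the module interprets dim 0 as the sequence, so it sees
# L independent length-1 sequences.
_, w_bad = attn(x, x, x)
print(w_bad.shape)  # torch.Size([7, 1, 1]) -- each token attends only to itself

# Fix: transpose to (seq_len, batch, embed) before self-attention
# (or construct the module with batch_first=True).
x_t = x.transpose(0, 1)
_, w_ok = attn(x_t, x_t, x_t)
print(w_ok.shape)  # torch.Size([1, 7, 7]) -- full attention over the sequence
```

The diagnostic is the shape of attn_output_weights, which is (batch, target_len, source_len): a correct stacked self-attention over an L-token sequence should produce (1, L, L), not (L, 1, 1).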

@kilimchoi
@xingqian2018 can you check this?
