Why gradient_clip=0.01? #103

guoqincode started this discussion in General
Replies: 1 comment
guoqincode:
I am very confused about why gradient_clip=0.01. In most diffusion model training, gradient_clip is usually 1, which is 100 times larger than 0.01. I hope to get your answer!

Reply:
Since the training of DiT is relatively unstable
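For context, gradient clipping by global norm rescales all gradients so their combined L2 norm never exceeds a threshold; with max_norm=0.01 the updates are clipped far more aggressively than with the common max_norm=1. A minimal plain-Python sketch of the mechanism (not the repo's actual training code, which presumably uses a framework utility such as PyTorch's `clip_grad_norm_`):

```python
import math

def clip_grad_norm(grads, max_norm):
    """Scale gradients in place so their global L2 norm is at most max_norm.

    Returns the norm measured *before* clipping, mirroring the convention
    of framework utilities like torch.nn.utils.clip_grad_norm_.
    """
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm > max_norm:
        scale = max_norm / total_norm
        grads[:] = [g * scale for g in grads]
    return total_norm

grads = [3.0, 4.0]           # global norm = 5.0
clip_grad_norm(grads, 0.01)  # aggressive clip, as in the discussed config
# grads now has global norm 0.01 instead of 5.0
```

With max_norm=0.01, almost every step is clipped, so the effective step size is capped much lower than with max_norm=1; this is one way to damp instability during training at the cost of slower progress.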