Replies: 1 comment
-
Hello @xhsoldier. Does training the model without quantization with the same schedule lead to the same problem? If you could provide a reproducer, we could look at this problem in more detail.
-
Int8 quantization-aware training: I am loading a pretrained fp32 model with an unbalanced weight distribution (some weights are very large, some very small).
After 3 training steps, NaN values occur and training fails.
How can I resolve this issue?
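To illustrate why an unbalanced weight distribution is a plausible culprit, here is a minimal sketch of symmetric per-tensor int8 quantization. The weight values below are made up for illustration and are not taken from the reported model; the point is that a few very large weights inflate the single per-tensor scale, so small weights collapse to zero after quantization, which can destabilize QAT.

```python
import numpy as np

# Illustrative unbalanced weight tensor: a few huge weights dominate the range.
weights = np.array([100.0, -80.0, 0.01, -0.005, 0.02], dtype=np.float32)

# Symmetric per-tensor int8 quantization: one scale for the whole tensor,
# chosen so the largest-magnitude weight maps to 127.
scale = np.abs(weights).max() / 127.0
q = np.clip(np.round(weights / scale), -128, 127).astype(np.int8)
deq = q.astype(np.float32) * scale

print("scale:", scale)
print("quantized:", q)      # the small weights all round to 0
print("dequantized:", deq)  # so they dequantize to exactly 0.0
```

If the small weights matter for the model's output, losing them entirely (or the optimizer reacting to the resulting large quantization error) can blow up the loss within a few steps. Per-channel quantization scales, or clipping/regularizing extreme weights before QAT, are common mitigations.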