I ran the code exactly as it is published on GitHub.
However, after a certain point in training, the loss is always NaN.
I thought the dataset might be damaged, so I regenerated the pkl file with create_data.py, but the loss still becomes NaN. Is it correct to keep running the training to the end even though the loss is NaN?
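
For reference, a check along these lines could confirm whether the regenerated pkl file itself contains NaN or infinite values. This is only a sketch: `count_non_finite`, `check_pickle`, and the file name `data.pkl` are hypothetical names and not part of the repository, and it assumes the pkl holds nested dicts, lists, tuples, or arrays of numbers.

```python
import math
import pickle


def count_non_finite(obj):
    """Recursively count NaN/inf numbers inside nested containers."""
    if isinstance(obj, (str, bytes)):
        return 0  # text is not numeric data, skip it
    if isinstance(obj, dict):
        return sum(count_non_finite(v) for v in obj.values())
    try:
        items = iter(obj)
    except TypeError:
        # Not iterable, so treat it as a single scalar value.
        try:
            return 0 if math.isfinite(obj) else 1
        except (TypeError, ValueError, OverflowError):
            return 0  # non-numeric or unconvertible object, ignored
    return sum(count_non_finite(v) for v in items)


def check_pickle(path):
    """Load a pickle file and count its non-finite values."""
    with open(path, "rb") as f:
        return count_non_finite(pickle.load(f))


# Hypothetical usage; replace the path with the file written by create_data.py:
# print(check_pickle("data.pkl"))
```

If the count is zero, the NaN is probably produced during training rather than by the data file.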