Train MNIST using NumPy with manual backprop (somehow more accurate than PyTorch).

Test accuracy:
- PyTorch loss.backward(): 0.964
- backprop by hand: 0.977
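A minimal sketch of what "backprop by hand" means here: a two-layer MLP with ReLU and softmax cross-entropy, where every gradient is written out manually instead of calling an autograd engine. The architecture (784-128-10), hyperparameters, and the random stand-in data are assumptions for illustration, not the repo's exact setup; real training would load the 28x28 MNIST images.

```python
import numpy as np

# Sketch of manual backprop for an MNIST-style classifier.
# Assumption: random data stands in for MNIST; shapes match 784 inputs, 10 classes.

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)   # subtract row max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def forward_backward(params, X, y):
    """Two-layer MLP; loss and all gradients derived by hand."""
    W1, b1, W2, b2 = params
    n = len(y)
    Y = np.eye(10)[y]                       # one-hot targets

    # Forward pass.
    h_pre = X @ W1 + b1
    h = np.maximum(h_pre, 0.0)              # ReLU
    p = softmax(h @ W2 + b2)
    loss = -np.log(p[np.arange(n), y] + 1e-12).mean()

    # Backward pass: for softmax + cross-entropy, d(loss)/d(logits) = (p - Y)/n.
    dlogits = (p - Y) / n
    dW2 = h.T @ dlogits
    db2 = dlogits.sum(axis=0)
    dh = dlogits @ W2.T
    dh_pre = dh * (h_pre > 0)               # ReLU gradient mask
    dW1 = X.T @ dh_pre
    db1 = dh_pre.sum(axis=0)
    return loss, (dW1, db1, dW2, db2)

# Stand-in data: labels depend on the inputs so there is something to learn.
X = rng.standard_normal((256, 784))
y = X[:, :10].argmax(axis=1)

params = [rng.standard_normal((784, 128)) * np.sqrt(2 / 784), np.zeros(128),
          rng.standard_normal((128, 10)) * np.sqrt(2 / 128), np.zeros(10)]

lr = 0.5
first_loss = None
for step in range(300):
    loss, grads = forward_backward(params, X, y)
    if first_loss is None:
        first_loss = loss
    for param, g in zip(params, grads):
        param -= lr * g                     # plain full-batch SGD update
print(f"loss {first_loss:.3f} -> {loss:.3f}")
```

A nice property of writing the backward pass yourself is that it can be checked against finite differences of the loss, which is how hand-derived gradients are usually validated.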