Why is the next action computed from the target net rather than the online net?

https://github.com/g6ling/Reinforcement-Learning-Pytorch-Cartpole/blob/ecb7b622cfefe825ac95388cceb6752413d90a2a/POMDP/4-R2D2-Single/train.py#L76
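
For context, here is a minimal sketch of the two ways of forming the bootstrap target that I am comparing. It is not taken from the repository: the function names, the plain-list Q-values, and the `gamma` default are all hypothetical, and I may be misreading what the linked line does.

```python
def argmax(values):
    """Return the index of the largest value in a list."""
    return max(range(len(values)), key=lambda i: values[i])


def double_dqn_target(reward, done, q_online_next, q_target_next, gamma=0.99):
    """Double DQN-style target: online net selects, target net evaluates.

    q_online_next and q_target_next are hypothetical per-action Q-values
    for the next state, given as plain lists instead of tensors.
    """
    next_action = argmax(q_online_next)      # action chosen by the online net
    next_value = q_target_next[next_action]  # value read from the target net
    return reward + (0.0 if done else gamma * next_value)


def target_net_only_target(reward, done, q_target_next, gamma=0.99):
    """Variant the question is about: the target net both selects and evaluates."""
    return reward + (0.0 if done else gamma * max(q_target_next))
```

The two functions agree whenever both nets prefer the same next action and differ otherwise, which is why I am asking which one is intended here.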

Another question:
Why do you store only the hidden state from the target net, and not the one from the online net?