VarStore has many problems. It does not respect no_grad_guard. It only creates variables of float dtype. It holds shallow copies of the tensors. There is not much point to such a struct: to keep track of all the weights, a hashmap would be perfect. But the nn modules are hardcoded to depend on it. Is there any reason it is this way?