The grad_compare files are better suited as testing modules. Checks should assume float64 as the standard. Tile sizes should be as small as possible.

- [ ] Forward equivalence under a certain tile size
- [ ] Backward kernel gradient equivalence between streaming and a normal network
- [ ] Freeze streaming-incompatible layers, i.e. normalization layers and global pooling
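
A rough sketch of what the forward and kernel-gradient equivalence checks could look like as a test. This is not the repository's streaming implementation: it assumes a pure-Python 1D "valid" convolution as a stand-in for a network, and every name in it (`conv1d_valid`, `kernel_grad`, `tiled_forward_backward`, `all_close`) is hypothetical. Python floats are float64, and the first comparison uses the smallest possible tile (one output per tile).

```python
import math
import random


def conv1d_valid(x, kernel):
    """Plain 'valid' 1D cross-correlation on Python floats (float64)."""
    k = len(kernel)
    return [sum(x[i + j] * kernel[j] for j in range(k))
            for i in range(len(x) - k + 1)]


def kernel_grad(x, grad_out, k):
    """Gradient of sum(grad_out * conv1d_valid(x, kernel)) w.r.t. the kernel."""
    return [sum(grad_out[i] * x[i + j] for i in range(len(grad_out)))
            for j in range(k)]


def tiled_forward_backward(x, kernel, grad_out, tile_out):
    """Hypothetical stand-in for streaming: process `tile_out` outputs per tile.

    Each tile reads tile_out + k - 1 inputs, so neighbouring tiles overlap
    by k - 1 samples. Kernel gradients are accumulated across tiles.
    """
    k = len(kernel)
    n_out = len(x) - k + 1
    out = []
    grad = [0.0] * k
    for start in range(0, n_out, tile_out):
        stop = min(start + tile_out, n_out)
        tile = x[start:stop + k - 1]
        out.extend(conv1d_valid(tile, kernel))
        tile_grad = kernel_grad(tile, grad_out[start:stop], k)
        grad = [g + t for g, t in zip(grad, tile_grad)]
    return out, grad


def all_close(a, b, tol=1e-12):
    """Element-wise comparison with a tight tolerance suited to float64."""
    return len(a) == len(b) and all(
        math.isclose(p, q, rel_tol=tol, abs_tol=tol) for p, q in zip(a, b))


random.seed(0)
x = [random.uniform(-1, 1) for _ in range(17)]
kernel = [random.uniform(-1, 1) for _ in range(3)]

# Reference ("normal network") forward and kernel gradient.
reference_out = conv1d_valid(x, kernel)
grad_out = [random.uniform(-1, 1) for _ in reference_out]
reference_grad = kernel_grad(x, grad_out, len(kernel))

# Tiled ("streaming") version with the smallest possible tile.
streamed_out, streamed_grad = tiled_forward_backward(
    x, kernel, grad_out, tile_out=1)

assert all_close(streamed_out, reference_out)    # forward equivalence
assert all_close(streamed_grad, reference_grad)  # kernel gradient equivalence
```

Small tiles exercise the most tile boundaries for a given input, which is where overlap and accumulation bugs tend to show up. The third item (freezing normalization layers and global pooling) is not covered by this sketch.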