Added Accumulated Gradient Scoring
Added the ability to accumulate gradients over several batches to simulate larger batch sizes for RigL steps. Larger batch sizes tend to reduce noise, and the aim is to emulate a batch size of 4096 using a true batch size of 1024 (i.e. accumulating over 4 batches).
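A minimal sketch of the accumulation idea, not the code in this change: the `GradientAccumulator` class, the `mean_gradient` helper, and the toy one-parameter model are all hypothetical names chosen for illustration. It assumes equal-sized batches, in which case averaging the per-batch mean gradients of 4 batches of 1024 gives the same value as the mean gradient over one batch of 4096.

```python
import random


def mean_gradient(w, batch):
    """Mean gradient of 0.5 * (w * x - y) ** 2 with respect to w over a batch."""
    return sum((w * x - y) * x for x, y in batch) / len(batch)


class GradientAccumulator:
    """Hypothetical helper: averages gradients over a fixed number of batches."""

    def __init__(self, accumulation_steps):
        self.accumulation_steps = accumulation_steps
        self.total = 0.0
        self.count = 0

    def add(self, grad):
        # Keep a running sum of the per-batch mean gradients.
        self.total += grad
        self.count += 1

    def ready(self):
        # True once enough batches have been seen to emulate the large batch.
        return self.count >= self.accumulation_steps

    def pop_average(self):
        # Return the averaged gradient and reset for the next accumulation window.
        average = self.total / self.count
        self.total = 0.0
        self.count = 0
        return average


# Toy data: y = 3x plus a little noise (illustrative only).
random.seed(0)
data = []
for _ in range(4096):
    x = random.uniform(-1.0, 1.0)
    data.append((x, 3.0 * x + random.gauss(0.0, 0.1)))

TRUE_BATCH_SIZE = 1024
EMULATED_BATCH_SIZE = 4096
accumulator = GradientAccumulator(EMULATED_BATCH_SIZE // TRUE_BATCH_SIZE)

w = 0.0
accumulated_gradient = None
for start in range(0, len(data), TRUE_BATCH_SIZE):
    batch = data[start:start + TRUE_BATCH_SIZE]
    accumulator.add(mean_gradient(w, batch))
    if accumulator.ready():
        # This is the point where an accumulated score would be used.
        accumulated_gradient = accumulator.pop_average()

# Reference: the gradient computed on the full 4096-sample batch at once.
full_batch_gradient = mean_gradient(w, data)
```

In this sketch `accumulated_gradient` and `full_batch_gradient` agree up to floating-point rounding, which is the sense in which the smaller batches emulate the larger one. If batches were of unequal size, the per-batch means would need to be weighted by batch size instead.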