🚀 Feature
Support microbatch size > 1, i.e., clipping the averaged gradient of several samples instead of clipping each per-sample gradient.
Motivation
We want to experiment with microbatch size > 1 for some training tasks.
(I understand that microbatch size > 1 may not improve memory / computation efficiency. This ask is more about algorithm / utility.)
Pitch
A num_microbatches parameter in make_private, similar to TensorFlow Privacy.
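To make the request concrete, here is a minimal sketch of what clipping at microbatch granularity could look like, in plain Python on lists of floats. The function name microbatch_clipped_gradient is hypothetical and is not an Opacus API; num_microbatches is the parameter proposed above, and max_grad_norm and noise_multiplier are named after the existing make_private arguments. The privacy accounting implications of microbatch size > 1 are not covered here.

```python
import math
import random


def microbatch_clipped_gradient(per_sample_grads, num_microbatches,
                                max_grad_norm, noise_multiplier=0.0, seed=None):
    """Hypothetical helper (not part of Opacus): clip per-microbatch averages.

    per_sample_grads: list of equal-length lists of floats, one per sample.
    """
    batch_size = len(per_sample_grads)
    if batch_size % num_microbatches != 0:
        raise ValueError("batch size must be divisible by num_microbatches")
    size = batch_size // num_microbatches  # samples per microbatch
    dim = len(per_sample_grads[0])
    rng = random.Random(seed)

    total = [0.0] * dim
    for start in range(0, batch_size, size):
        group = per_sample_grads[start:start + size]
        # Average the gradients inside one microbatch.
        avg = [sum(g[i] for g in group) / size for i in range(dim)]
        # Scale the average so its L2 norm is at most max_grad_norm.
        norm = math.sqrt(sum(x * x for x in avg))
        scale = min(1.0, max_grad_norm / (norm + 1e-6))
        for i in range(dim):
            total[i] += avg[i] * scale

    # Gaussian noise calibrated to the clipping norm, added to the sum.
    std = noise_multiplier * max_grad_norm
    noisy = [t + rng.gauss(0.0, std) for t in total]
    # Normalize by the number of clipped units (microbatches).
    return [x / num_microbatches for x in noisy]


if __name__ == "__main__":
    grads = [[3.0, 4.0], [0.0, 0.0]]
    # num_microbatches == batch size reproduces per-sample clipping.
    print(microbatch_clipped_gradient(grads, num_microbatches=2, max_grad_norm=1.0))
    # One microbatch of size 2: the average is clipped once.
    print(microbatch_clipped_gradient(grads, num_microbatches=1, max_grad_norm=1.0))
```

With num_microbatches equal to the batch size this reduces to the current per-sample behavior, so the parameter could default to that without changing existing results.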