Gradient Accumulation with Dual (optimizer, scheduler) Training #14999
Unanswered · celsofranssa asked this question in code help: NLP / ASR / TTS
Hello, Lightning community,
I am using a dual (optimizer, scheduler) training setup, as in the code snippet below:
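A minimal sketch of such a `configure_optimizers` configuration, with two (optimizer, scheduler) pairs each using `"frequency": 1` (the optimizer types, learning rates, and schedulers below are only placeholders):

```python
import torch
from pytorch_lightning import LightningModule


class DualOptimizerModule(LightningModule):
    # ... model definition omitted ...

    def configure_optimizers(self):
        # Two (optimizer, scheduler) pairs; with "frequency": 1 on each,
        # Lightning alternates between them on successive training steps.
        optimizer_1 = torch.optim.Adam(self.parameters(), lr=1e-3)
        optimizer_2 = torch.optim.Adam(self.parameters(), lr=1e-4)
        scheduler_1 = torch.optim.lr_scheduler.StepLR(optimizer_1, step_size=10)
        scheduler_2 = torch.optim.lr_scheduler.StepLR(optimizer_2, step_size=10)
        return (
            {"optimizer": optimizer_1, "lr_scheduler": scheduler_1, "frequency": 1},
            {"optimizer": optimizer_2, "lr_scheduler": scheduler_2, "frequency": 1},
        )
```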
With `"frequency": 1` on both optimizers, the trainer calls `optimizer_1` in step `i` and `optimizer_2` in step `i+1`.

Therefore, is there an approach to combine gradient accumulation with this optimization setup, so that `optimizer_1` uses the accumulated gradient from steps `i-1` and `i`, while `optimizer_2` uses the accumulated gradient from steps `i` and `i+1`?
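For concreteness, the behaviour I have in mind could perhaps be expressed with manual optimization, keeping the previous batch's gradients around and adding them to the current ones before the alternating optimizer steps. A rough sketch (the gradient buffer and `compute_loss` are hypothetical placeholders, not existing Lightning features):

```python
import torch
from pytorch_lightning import LightningModule


class DualOptimizerModule(LightningModule):
    def __init__(self):
        super().__init__()
        # Manual optimization so gradient handling can be controlled per step.
        self.automatic_optimization = False
        self._prev_grads = None  # gradients from the previous batch (hypothetical buffer)

    def training_step(self, batch, batch_idx):
        opt_1, opt_2 = self.optimizers()

        # Fresh gradients for the current batch only.
        self.zero_grad()
        loss = self.compute_loss(batch)  # placeholder for the actual loss computation
        self.manual_backward(loss)

        # Snapshot this batch's gradients before mixing in the previous ones.
        curr_grads = [
            p.grad.detach().clone() if p.grad is not None else None
            for p in self.parameters()
        ]

        if self._prev_grads is not None:
            # Overlapping window: step with gradients from the previous and current batches.
            for p, g_prev in zip(self.parameters(), self._prev_grads):
                if p.grad is not None and g_prev is not None:
                    p.grad.add_(g_prev)
            opt = opt_1 if batch_idx % 2 == 1 else opt_2
            opt.step()
            # (lr schedulers would also need to be stepped manually here)

        self._prev_grads = curr_grads
        return loss
```

In this manual variant the `"frequency"` entries from `configure_optimizers` are no longer consulted and the alternation is handled explicitly via `batch_idx`. What I am really asking is whether an equivalent overlapping-accumulation behaviour is achievable with automatic optimization.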