Query on Gradient accumulation #2134

Open
@Vattikondadheeraj

Hey, I have a question about the gradient accumulation parameter. When I increase it from 4 to 8, I get an OOM error, which doesn't make much sense to me. Why would that happen? Are you storing the gradients individually, or summing them as new ones arrive? Or am I missing something else?
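
To make the question concrete, here is a minimal pure-Python sketch of the two behaviours I have in mind. It is a toy illustration, not code from this repository: the names `grad_w`, `accumulate_summed`, and `accumulate_stored` are hypothetical, and a 1-D linear model with squared-error loss stands in for the real network.

```python
# Toy illustration only (assumption: not how this repository implements it).
# Model: y_hat = w * x, loss per sample: (w * x - y) ** 2.

def grad_w(w, x, y):
    """Gradient of (w * x - y) ** 2 with respect to w for one sample."""
    return 2.0 * (w * x - y) * x


def micro_batch_grad(w, batch):
    """Mean gradient over one micro-batch of (x, y) pairs."""
    return sum(grad_w(w, x, y) for x, y in batch) / len(batch)


def accumulate_summed(w, micro_batches):
    """Option A: add each gradient into a single running buffer.

    Only one buffer is kept, so its size does not depend on the
    number of accumulation steps.
    """
    steps = len(micro_batches)
    acc = 0.0
    for batch in micro_batches:
        acc += micro_batch_grad(w, batch) / steps  # scale to get the mean
    return acc


def accumulate_stored(w, micro_batches):
    """Option B: keep every micro-batch gradient, then average at the end.

    The number of gradients held grows linearly with the number of
    accumulation steps. Returns (mean gradient, number of gradients held).
    """
    stored = [micro_batch_grad(w, batch) for batch in micro_batches]
    return sum(stored) / len(stored), len(stored)


if __name__ == "__main__":
    batches = [[(1.0, 2.0), (2.0, 4.0)], [(3.0, 6.0), (4.0, 8.0)]]
    print(accumulate_summed(1.0, batches))
    print(accumulate_stored(1.0, batches))
```

Both options produce the same averaged gradient; they differ only in how many gradients are held at once. If accumulation works like option A, I would not expect going from 4 to 8 steps to change gradient memory at all, which is why the OOM surprises me.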
