adding support for LR schedule for full distributed finetune #2263

Open
@tginart

Description

My understanding is that the full multi-GPU fine-tuning doesn't yet support learning rate schedules.

Would it be possible to add support for this? Even basic ones, such as linear warmup followed by cosine or linear decay?
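
For concreteness, here is a rough sketch of the kind of warmup-plus-cosine schedule I mean, using plain PyTorch's `LambdaLR` (the warmup/total step counts are made up for illustration; in a real recipe they would come from the config):

```python
import math

import torch
from torch.optim.lr_scheduler import LambdaLR

# Illustrative numbers only; in a real recipe these would come from the config.
warmup_steps = 100
total_steps = 1000

params = [torch.nn.Parameter(torch.zeros(1))]
optimizer = torch.optim.AdamW(params, lr=3e-4)

def warmup_then_cosine(step: int) -> float:
    # Linear warmup from 0 up to the base LR over the first `warmup_steps` steps...
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    # ...then cosine decay from the base LR down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * (1.0 + math.cos(math.pi * progress))

scheduler = LambdaLR(optimizer, lr_lambda=warmup_then_cosine)

# In the training loop: optimizer.step() followed by scheduler.step() once per step.
```

If I understand correctly, the LoRA recipes already expose a cosine-with-warmup scheduler through their configs, so presumably something similar could be wired into the full distributed recipe.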

I can also take a look at doing this myself if I could get a pointer to the code.

Thank you!

Metadata

Labels

- best practice: Things we should be doing but aren't
- better engineering: Tasks which help improve eng productivity, e.g. building tools, cleaning up code, writing docs
- triaged: This issue has been assigned an owner and appropriate label