feat: add WSD (Warmup-Stable-Decay) learning rate scheduler #1375

Open
Rakshitha-Ireddi wants to merge 1 commit into EleutherAI:main from Rakshitha-Ireddi:feature/wsd-lr-scheduler

@Rakshitha-Ireddi

Implements the Warmup-Stable-Decay (WSD) learning rate schedule from the MiniCPM paper (https://arxiv.org/abs/2404.06395). This schedule uses three phases: linear warmup, constant LR (stable), then cosine decay to min_lr over the final fraction of training.

Changes:

  • Add 'wsd' to lr_decay_style options in NeoXArgs
  • Add wsd_decay_ratio parameter (default 0.1) controlling decay phase length
  • Implement WSD branch in AnnealingLR.get_lr()
  • Add state_dict/load_state_dict support for wsd_decay_ratio
  • Pass wsd_decay_ratio through get_learning_rate_scheduler()
  • Add comprehensive unit tests (15 tests, all passing)
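
For reviewers who want to sanity-check the shape of the schedule, here is a minimal stand-alone sketch of the three phases described above. It is not the code in this PR: the function name `wsd_lr` and its argument names are hypothetical, and it assumes that warmup starts from zero and that `wsd_decay_ratio` is a fraction of the total iteration count. The actual `AnnealingLR.get_lr()` branch may differ in those details.

```python
import math


def wsd_lr(step, max_lr, min_lr, warmup_iters, total_iters, wsd_decay_ratio=0.1):
    """Hypothetical stand-alone WSD schedule (illustration only, not the PR's code)."""
    # Clamp so that steps past the end of training stay at min_lr.
    step = min(step, total_iters)

    # Phase 1: linear warmup from 0 up to max_lr (assumed starting point).
    if warmup_iters > 0 and step < warmup_iters:
        return max_lr * step / warmup_iters

    # Assumption: the decay phase covers the last wsd_decay_ratio of all iterations.
    decay_iters = int(total_iters * wsd_decay_ratio)
    decay_start = total_iters - decay_iters

    # Phase 2: stable, constant learning rate.
    if decay_iters == 0 or step < decay_start:
        return max_lr

    # Phase 3: cosine decay from max_lr down to min_lr.
    progress = (step - decay_start) / decay_iters
    return min_lr + 0.5 * (max_lr - min_lr) * (1.0 + math.cos(math.pi * progress))
```

With `warmup_iters=100`, `total_iters=1000` and the default ratio of 0.1, this sketch warms up over steps 0 to 99, holds `max_lr` until step 900, and reaches `min_lr` at step 1000.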

Authors

  • Ireddi Rakshitha
  • Yashwanth Devavarapu

Addresses EleutherAI#1326.

Contact: @Rakshitha-Ireddi (GitHub)
@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.


Rakshitha Ireddi does not appear to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.
