v0.2.3

@lkevinzc released this 31 Oct 01:08
· 3 commits to main since this release
c1a074c

What's Changed

  • chore: update lora and add metrics by @lkevinzc in #66
  • Fix incorrect state indexing in PPOMultiTurnLearner critic training by @MozerWang in #67
  • fix micro batch training issue in DPO training by @hmhuy0 in #68
  • feat: add fp16 training by @lkevinzc in #70

Full Changelog: v0.2.2...v0.2.3