v0.2.3
What's Changed
- chore: update lora and add metrics by @lkevinzc in #66
- Fix incorrect state indexing in PPOMultiTurnLearner critic training by @MozerWang in #67
- Fix micro-batch training issue in DPO training by @hmhuy0 in #68
- feat: add fp16 training by @lkevinzc in #70
New Contributors
- @MozerWang made their first contribution in #67
- @hmhuy0 made their first contribution in #68
Full Changelog: v0.2.2...v0.2.3