Fix GRPO to conform with TRL: Fix loss, make tests accurate, correct metrics computation #828
Annotations
3 errors
|
tests (6.2)
Process completed with exit code 2.
|
|
tests (6.3)
The job was canceled because "_6_2" failed.
|
|
tests (6.3)
The operation was canceled.
|