Fix GRPO to conform with TRL: Fix loss, make tests accurate, correct metrics computation #842
Annotations
4 errors
tests (6.2): Canceling since a higher priority waiting request for 'AMD GPU-628' exists
tests (6.2): The operation was canceled.
tests (6.3): Canceling since a higher priority waiting request for 'AMD GPU-628' exists
tests (6.3): The operation was canceled.