We have a gap in our CI for the DPO recipes which we should address. To do this, we should:
- Verify the algorithm's correctness against a reference implementation (this may be overkill, but it has been a while since the recipe was originally contributed). This is also an opportunity to check that it works well on other datasets. cc @RdoubleA RE Anthropic HH
- Run the recipe with mock inputs/models to obtain reference loss values.
- Write a test which ensures that when the recipe is launched with those inputs/models, it produces the expected loss values. This should be enough to guard the recipe against future changes that would otherwise require re-verifying correctness.
See the similar recipe tests for PPO, LoRA fine-tuning, etc.
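For the reference-implementation check, the standard DPO loss could serve as the comparison point. The sketch below is purely illustrative: the log-probabilities and `beta` value are made up, and this is not the recipe's actual interface, just a minimal pure-Python reference to sanity-check loss values against.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Reference DPO loss for a single (chosen, rejected) pair:
    -log sigmoid(beta * (policy log-ratio - reference log-ratio))."""
    logits = beta * (
        (policy_chosen_logp - policy_rejected_logp)
        - (ref_chosen_logp - ref_rejected_logp)
    )
    return -math.log(1.0 / (1.0 + math.exp(-logits)))

# Regression-style check: when the policy log-ratio equals the reference
# log-ratio, the loss is exactly log(2). All inputs here are illustrative.
assert abs(dpo_loss(-1.0, -2.0, -1.5, -2.5) - math.log(2)) < 1e-9

# A larger policy margin over the reference should strictly lower the loss.
assert dpo_loss(-1.0, -3.0, -1.5, -2.5) < math.log(2)
```

A recipe-level test would do the analogous thing end to end: launch the recipe on the mocked inputs/models and compare the logged losses to pinned expected values within a tolerance.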