Integration testing for DPO  #1411

Open
@SalmanMohammadi

Description

We have a gap in our CI for the DPO recipes which we should address. To do this, we should:

  1. Verify the correctness of the algorithm against a reference implementation (maybe this is overkill, but it's been a while since it was originally contributed). This may also be an opportunity to ensure it works on other datasets. cc @RdoubleA re: Anthropic HH
  2. Run the recipe with some mock inputs/models to obtain reference loss values.
  3. Write a test which ensures that, when the recipe is launched with the above inputs/models, it obtains the expected loss values. This should guard the recipe against future changes: a test failure signals that correctness needs to be re-verified.
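
For step 1, a small framework-free reference for the standard sigmoid DPO loss could serve as the thing to compare against. This is only a sketch: the function name `dpo_loss`, its argument names, and the default `beta=0.1` are assumptions for illustration and are not taken from the recipe. It works on per-sequence log-probabilities for a single preference pair.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Sigmoid DPO loss for one (chosen, rejected) pair.

    Inputs are summed sequence log-probabilities under the policy and
    the frozen reference model. `beta` is an illustrative default.
    """
    # Difference between the policy's and the reference's log-ratio.
    policy_logratio = policy_chosen_logp - policy_rejected_logp
    ref_logratio = ref_chosen_logp - ref_rejected_logp
    z = beta * (policy_logratio - ref_logratio)
    # -log(sigmoid(z)) == softplus(-z), written in a numerically stable form.
    return max(-z, 0.0) + math.log1p(math.exp(-abs(z)))


if __name__ == "__main__":
    # When the policy matches the reference, z == 0 and the loss is log(2).
    print(dpo_loss(-10.0, -12.0, -10.0, -12.0))
```

A useful sanity check is that the loss equals `log(2)` whenever the policy and reference log-ratios agree, and drops below that as the policy prefers the chosen response more strongly than the reference does.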

See similar recipe tests for PPO, LoRA fine-tune, etc.
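
As a rough illustration of steps 2 and 3, the comparison at the heart of such a test might look like the sketch below. The helper `assert_losses_close` and its tolerances are hypothetical, and the numbers in the usage example are placeholders rather than real DPO loss values; how the losses are collected from a recipe run is left out, since that depends on the existing test utilities.

```python
import math


def assert_losses_close(actual, expected, rel_tol=1e-4, abs_tol=1e-6):
    """Compare per-step losses from a run against stored reference values.

    Hypothetical helper: tolerances are illustrative and should be tuned
    to the numerical noise of the mock model and hardware.
    """
    assert len(actual) == len(expected), (
        f"expected {len(expected)} loss values, got {len(actual)}"
    )
    for step, (a, e) in enumerate(zip(actual, expected)):
        assert math.isclose(a, e, rel_tol=rel_tol, abs_tol=abs_tol), (
            f"loss mismatch at step {step}: got {a}, expected {e}"
        )


if __name__ == "__main__":
    # Placeholder values only; real references come from step 2.
    reference_losses = [0.5, 0.25, 0.125]
    observed_losses = [0.5, 0.25, 0.125]
    assert_losses_close(observed_losses, reference_losses)
```

Reporting the step index in the failure message makes it easier to tell whether a regression appears from the first batch or only after some optimizer updates.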


Labels

  - best practice: Things we should be doing but aren't
  - better engineering: Tasks which help improve eng productivity, e.g. building tools, cleaning up code, writing docs
  - community help wanted: We would love the community's help completing this issue
