Feature Request: Full Support for Direct Preference Optimization (DPO) #1532

Open
@pipSu

Description

Hello, I'd like to know whether there are plans to implement full support for Direct Preference Optimization (DPO) in upcoming releases.

Are there any current efforts or roadmap items related to this, or is it something that might be considered in future updates?

Thank you for your time and consideration.

Metadata

Labels

enhancement: New feature or request
rlhf: Anything related to reinforcement learning w/ human feedback
