Hello,
Thank you for sharing your excellent paper, "Arithmetic Control of LLMs for Diverse User Preferences: Directional Preference Alignment with Multi-Objective Rewards". I have some questions about the Directional Preference Alignment method, specifically about how the user preference vector v is incorporated during model training.
To be more specific, I would like to understand how the user preference vector v is actually integrated into the model during the training phase. From my reading, the attribute weights are concatenated directly onto the prompt (via the system prompt, as you mention in the paper); is that correct?
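To make my current understanding concrete, here is a rough Python sketch of what I imagine the prompt construction might look like during training. The template wording, the attribute names (helpfulness, verbosity), and the function names are my own guesses for illustration, not taken from your paper:

```python
import math


def build_system_prompt(v_helpfulness: float, v_verbosity: float) -> str:
    # Hypothetical template: embed the unit-norm preference vector
    # v = (v1, v2) as plain text inside the system prompt.
    return (
        "You are a helpful assistant. Your response should maximize the reward "
        f"{v_helpfulness:.2f} * helpfulness + {v_verbosity:.2f} * verbosity."
    )


def build_training_input(user_prompt: str, angle_deg: float) -> str:
    # Sample a direction on the unit circle (my assumption of how v is drawn)
    # and prepend the resulting system prompt to the user prompt.
    v1 = math.cos(math.radians(angle_deg))
    v2 = math.sin(math.radians(angle_deg))
    return build_system_prompt(v1, v2) + "\n\n" + user_prompt


print(build_training_input("Explain LoRA in two sentences.", angle_deg=30.0))
```

Does this roughly match how v enters the model during training, or is it incorporated through some other mechanism?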
I look forward to your response.
Thanks!