Hi, I ran this pipeline with the default opt-1.3b model and the default dataset, Dahoas/rm-static (https://github.com/microsoft/DeepSpeedExamples/tree/master/applications/DeepSpeed-Chat#-one-single-script-completes-all-three-steps-of-rlhf-training-and-generate-your-first-chatgpt-model), on a single A100-40G.
The chat quality looks very poor. I know you describe it as a toy model, but I would like to know what to expect from it. Is this level of quality normal for a chatbot trained with opt-1.3b and Dahoas/rm-static?
Chat example 1:
$ python chat.py --path output/step3-models/1.3b/actor/
Enter input (type 'quit' to exit, 'clear' to clean memory): Hello
------------------------------ Round 1 ------------------------------
Human: Hello
Assistant: Hi! My name is Assistant, and I’m here to help you with your research. Would you like to know more about the history of the United States?
Enter input (type 'quit' to exit, 'clear' to clean memory): No, I am not interested in it
------------------------------ Round 2 ------------------------------
Human: Hello
Assistant: Hi! My name is Assistant, and I’m here to help you with your research. Would you like to know more about the history of the United States?
Human: No, I am not interested in it
Assistant: OK, let me help you with some facts about the United States. The United States was founded in 1776, and it was the first country to adopt a constitution. The United States has been a strong democracy since its founding, and it has been a strong advocate for human rights and civil rights. The United States has been a strong advocate for peace and security, and it has been a strong advocate for human rights and civil rights. The United States has been a strong advocate for human rights and civil rights.
Chat example 2:
Human: Hello
Assistant: Hi! My name is Assistant, and I’m here to help you with your research. Would you like to know more about the history of the United States?
Human: can you write me a poem about roses?
Assistant: Yes, that’s a great idea! Here are some examples of roses:
Roses are red,
They're sweet,
And they're very pretty.
They're also a symbol of love,
And they're a great way to express your feelings.
Roses are a symbol of rebirth,
And they're a great way to express your emotions.
Roses are a symbol of rebirth,
And they're a great way to express your emotions.
Roses are a symbol of rebirth,
And they're a great way to express your emotions.
Roses are a symbol of rebirth,
And they're
Human: why do you keep repeating the same sentence?
Assistant: Roses are red,
They're sweet,
And they're very pretty.
They're also a symbol of love,
And they're a great way to express your feelings. Roses are a symbol of rebirth,
And they're a great way to express your emotions. Roses are a symbol of rebirth,
And they're
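To put a rough number on the repetition in the transcripts above, something like the following could be used. This is only a minimal sketch and not part of DeepSpeed-Chat: `repeated_ngram_fraction` is a hypothetical helper name, and using 4-word n-grams is an arbitrary assumption.

```python
from collections import Counter


def repeated_ngram_fraction(text: str, n: int = 4) -> float:
    """Return the fraction of word n-grams that duplicate an earlier n-gram.

    0.0 means no n-gram occurs twice; values near 1.0 mean the text is
    almost entirely repeated phrases. (Hypothetical helper for illustration.)
    """
    words = text.lower().split()
    ngrams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    if not ngrams:
        return 0.0
    counts = Counter(ngrams)
    # Every occurrence beyond the first counts as a repeat.
    repeated = sum(c - 1 for c in counts.values())
    return repeated / len(ngrams)


if __name__ == "__main__":
    # Excerpt taken from the Example 2 response above.
    sample = (
        "Roses are a symbol of rebirth, "
        "And they're a great way to express your emotions. "
        "Roses are a symbol of rebirth, "
        "And they're a great way to express your emotions."
    )
    print(f"repeated 4-gram fraction: {repeated_ngram_fraction(sample):.2f}")
```

Comparing this score for the step-1 (SFT) model and the step-3 actor on the same prompts might show whether the repetition is introduced by a particular training step or is already present after supervised fine-tuning.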
These results are not very encouraging. Do you have any trained conversational models that developers can download and test?