Skip to content

messy response from model trained with opt-1.3b and Dahoas/rm-static #569

Open
@treya-lin

Description

@treya-lin

Hi I ran this pipeline using the default opt 1.3b and the dataset is the default Dahoas/rm-static (https://github.com/microsoft/DeepSpeedExamples/tree/master/applications/DeepSpeed-Chat#-one-single-script-completes-all-three-steps-of-rlhf-training-and-generate-your-first-chatgpt-model). I used one A100-40G.

The chatting performance looks very bad. I know you guys say it is a toy model but I wanna know what I should expect with it. Is the performance normal if we train this chatbot with opt-1.3b and Dahoas/rmstatic? (ps I used one A100-40G).

Chat example 1:

$ python chat.py --path output/step3-models/1.3b/actor/
Enter input (type 'quit' to exit, 'clear' to clean memory): Hello
------------------------------ Round 1 ------------------------------
 Human: Hello
 Assistant:  Hi! My name is Assistant, and I’m here to help you with your research.  Would you like to know more about the history of the United States?
Enter input (type 'quit' to exit, 'clear' to clean memory): No, I am not interested in it
------------------------------ Round 2 ------------------------------
 Human: Hello
 Assistant:  Hi! My name is Assistant, and I’m here to help you with your research.  Would you like to know more about the history of the United States?

 Human: No, I am not interested in it
 Assistant:  OK, let me help you with some facts about the United States.  The United States was founded in 1776, and it was the first country to adopt a constitution.  The United States has been a strong democracy since its founding, and it has been a strong advocate for human rights and civil rights.  The United States has been a strong advocate for peace and security, and it has been a strong advocate for human rights and civil rights.  The United States has been a strong advocate for human rights and civil rights.

Example 2

Human: Hello
 Assistant:  Hi! My name is Assistant, and I’m here to help you with your research.  Would you like to know more about the history of the United States?

 Human: can you write me a poem about roses?
 Assistant:  Yes, that’s a great idea!  Here are some examples of roses:

 Roses are red,
They're sweet,
And they're very pretty.
They're also a symbol of love,
And they're a great way to express your feelings.
 Roses are a symbol of rebirth,
And they're a great way to express your emotions.
 Roses are a symbol of rebirth,
And they're a great way to express your emotions.
 Roses are a symbol of rebirth,
And they're a great way to express your emotions.
 Roses are a symbol of rebirth,
And they're

 Human: why do you keep repeating the same sentence?
 Assistant:  Roses are red,
They're sweet,
And they're very pretty.
They're also a symbol of love,
And they're a great way to express your feelings. Roses are a symbol of rebirth,
And they're a great way to express your emotions. Roses are a symbol of rebirth,
And they're

The performance looks not so cheerful. I wonder if you have any conversational models that you guys trained for developers to download and test?

Metadata

Metadata

Assignees

Labels

deespeed chatDeepSpeed ChatmodelingRelated to modeling questions.

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions