Cannot use bloom-560m model in step2_reward_model_finetuning #479

Open
@korlin0110

In step 1, I can use a BLOOM model such as bloom-3b.
But in step 2, when I use bloom-560m for reward model finetuning, I get the following error:

***** Running training *****
***** Evaluating reward, Epoch 0/1 *****
Traceback (most recent call last):
  File "main.py", line 352, in <module>
    main()
  File "main.py", line 306, in main
    reward_score, acc = evaluation_reward(rm_model, eval_dataloader)
  File "main.py", line 252, in evaluation_reward
    outputs = model(**batch)
  File "/workspace/anaconda3/envs/deepspeed/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/workspace/anaconda3/envs/deepspeed/lib/python3.8/site-packages/deepspeed/utils/nvtx.py", line 15, in wrapped_fn
    ret_val = func(*args, **kwargs)
  File "/workspace/anaconda3/envs/deepspeed/lib/python3.8/site-packages/deepspeed/runtime/engine.py", line 1695, in forward
    loss = self.module(*inputs, **kwargs)
  File "/workspace/anaconda3/envs/deepspeed/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/workspace/Project/DeepSpeedExamples/applications/DeepSpeed-Chat/training/utils/model/reward_model.py", line 95, in forward
    assert divergence_ind > 0
AssertionError

How can I fix this error? Or does step 2 not work with bloom-560m at all?
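
For anyone trying to reproduce this, here is a minimal sketch of what the failing assertion seems to check. It assumes `divergence_ind` is the first token position at which the chosen and rejected sequences of a pair differ; I have not confirmed that against `reward_model.py`. The helper name `first_divergence` and the token ids are hypothetical and only illustrate the two cases.

```python
from typing import Optional, Sequence


def first_divergence(chosen_ids: Sequence[int],
                     rejected_ids: Sequence[int]) -> Optional[int]:
    """Return the first index where the two token-id sequences differ.

    Returns None if no position differs within the shared length.
    """
    for i, (c, r) in enumerate(zip(chosen_ids, rejected_ids)):
        if c != r:
            return i
    return None


# Hypothetical ids: a shared prompt prefix followed by different answers.
shared_prefix = first_divergence([5, 6, 7, 8], [5, 6, 9, 9])

# Hypothetical ids: the pair already differs at the very first token,
# for example if padding or special tokens are placed differently.
no_shared_prefix = first_divergence([3, 5, 6, 7], [5, 6, 9, 9])

print(shared_prefix)     # 2
print(no_shared_prefix)  # 0, the case where `assert divergence_ind > 0` would fail
```

If that assumption holds, running a check like this over a few tokenized chosen/rejected pairs from the step 2 dataloader should show whether the bloom-560m tokenizer produces pairs that differ at position 0.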

Labels: deespeed chat (DeepSpeed Chat), new-config (a modified config from the given example), system (an issue with an environment/system setup)
