Can not use bloom-560m model in the step2_reward_model_finetuning

In step 1, I can use a BLOOM model such as bloom-3b.
But in step 2, when I use bloom-560m for reward model finetuning, I get the following error:

***** Running training *****
***** Evaluating reward, Epoch 0/1 *****
Traceback (most recent call last):
  File "main.py", line 352, in <module>
    main()
  File "main.py", line 306, in main
    reward_score, acc = evaluation_reward(rm_model, eval_dataloader)
  File "main.py", line 252, in evaluation_reward
    outputs = model(**batch)
  File "/workspace/anaconda3/envs/deepspeed/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/workspace/anaconda3/envs/deepspeed/lib/python3.8/site-packages/deepspeed/utils/nvtx.py", line 15, in wrapped_fn
    ret_val = func(*args, **kwargs)
  File "/workspace/anaconda3/envs/deepspeed/lib/python3.8/site-packages/deepspeed/runtime/engine.py", line 1695, in forward
    loss = self.module(*inputs, **kwargs)
  File "/workspace/anaconda3/envs/deepspeed/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/workspace/Project/DeepSpeedExamples/applications/DeepSpeed-Chat/training/utils/model/reward_model.py", line 95, in forward
    assert divergence_ind > 0
AssertionError

How can I fix this error? Or does step 2 simply not work with bloom-560m?
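
One possible cause, offered as a hypothesis and not as a diagnosis: judging only by its name, divergence_ind looks like the first token position at which the chosen and rejected sequences of a pair differ. If that is right, the assertion would fail whenever the two sequences already differ at position 0, which can happen when a tokenizer pads on the left and the two sequences have different lengths. The sketch below uses made-up token ids, a hypothetical pad id, and hypothetical helper names (first_divergence, pad); it is not the code from reward_model.py.

```python
PAD = 0  # hypothetical pad token id, for illustration only


def first_divergence(chosen_ids, rejected_ids):
    """Return the first index where the two token-id lists differ, or None."""
    for i, (c, r) in enumerate(zip(chosen_ids, rejected_ids)):
        if c != r:
            return i
    return None


def pad(ids, length, side):
    """Pad a list of token ids to `length` on the given side."""
    fill = [PAD] * (length - len(ids))
    return fill + ids if side == "left" else ids + fill


prompt = [11, 12, 13]          # shared prompt tokens (made up)
chosen = prompt + [21, 22]     # prompt + chosen answer, length 5
rejected = prompt + [31]       # prompt + rejected answer, length 4

# Right padding: the shared prompt lines up, so the sequences diverge after it.
right = first_divergence(pad(chosen, 5, "right"), pad(rejected, 5, "right"))

# Left padding: the shorter sequence is shifted, so they differ at index 0.
left = first_divergence(pad(chosen, 5, "left"), pad(rejected, 5, "left"))

print("right padding:", right)
print("left padding:", left)
```

If this is what is happening, one thing worth checking is the padding_side attribute of the tokenizer loaded for bloom-560m, and whether setting it to "right" before building the dataset makes the assertion pass.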

Can not use bloom-560m model in the step2_reward_model_finetuning #479

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Can not use bloom-560m model in the step2_reward_model_finetuning #479

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions