Hello, I'm trying to use BLOOMZ for reward model training, and I get the following error:
Traceback (most recent call last):
File "/users5/xydu/ChatGPT/DeepSpeed-Chat/training/step2_reward_model_finetuning/training_scripts/single_node/../../main.py", line 349, in <module>
main()
File "/users5/xydu/ChatGPT/DeepSpeed-Chat/training/step2_reward_model_finetuning/training_scripts/single_node/../../main.py", line 303, in main
reward_score, acc = evaluation_reward(rm_model, eval_dataloader)
File "/users5/xydu/ChatGPT/DeepSpeed-Chat/training/step2_reward_model_finetuning/training_scripts/single_node/../../main.py", line 249, in evaluation_reward
outputs = model(**batch)
File "/users5/xydu/anaconda3/envs/dpchat/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/users5/xydu/ChatGPT/DeepSpeed-Chat/training/utils/model/reward_model.py", line 97, in forward
return forward_call(*args, **kwargs)
File "/users5/xydu/anaconda3/envs/dpchat/lib/python3.9/site-packages/deepspeed/utils/nvtx.py", line 15, in wrapped_fn
ret_val = func(*args, **kwargs)
File "/users5/xydu/anaconda3/envs/dpchat/lib/python3.9/site-packages/deepspeed/runtime/engine.py", line 1695, in forward
loss = self.module(*inputs, **kwargs)
File "/users5/xydu/anaconda3/envs/dpchat/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
assert divergence_ind > 0, divergence_ind
AssertionError return forward_call(*args, **kwargs)
File "/users5/xydu/ChatGPT/DeepSpeed-Chat/training/utils/model/reward_model.py", line 97, in forward
assert divergence_ind > 0
After printing divergence_ind, I found that it is 0, so I changed assert divergence_ind > 0 to assert divergence_ind >= 0. Will this change affect the program?
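
To make the question more concrete, here is a minimal sketch of what a value of 0 would mean. It assumes divergence_ind is the first position at which the chosen and rejected token sequences differ. That is my reading of the variable name, not something the traceback confirms. The helper first_divergence and the token ids are hypothetical and are not taken from reward_model.py.

```python
def first_divergence(chosen_ids, rejected_ids):
    """Return the first index where the two token-id lists differ.

    Returns None if no compared position differs. This is an
    illustrative stand-in, not the code in reward_model.py.
    """
    for i, (c, r) in enumerate(zip(chosen_ids, rejected_ids)):
        if c != r:
            return i
    return None


# Hypothetical token ids: both answers start with the same prompt tokens.
shared_prompt = [5, 6, 7]
with_prefix = first_divergence(shared_prompt + [10, 11], shared_prompt + [20, 21])
print(with_prefix)  # 3: the sequences differ only after the shared prompt

# Hypothetical token ids: the two sequences differ at the very first token.
no_prefix = first_divergence([10, 11, 12], [20, 21, 22])
print(no_prefix)  # 0: there is no shared prefix at all
```

If that assumption holds, a value of 0 means the chosen and rejected sequences in that batch already differ at the first token, so it may be worth inspecting the tokenized batch (for example the padding side and whether both sequences start with the same prompt) in addition to relaxing the assert.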