In step 3, I hit an error when executing self.actor_model.eval() #593

Open
@ZJXNEFU

Here is the error I hit. It seems that self._total_batch_size is None, but I don't know why.

  File "/path/model_training/DeepSpeed-Chat/training/step3_rlhf_finetuning/main.py", line 434, in main
    out = trainer.generate_experience(batch_prompt['prompt'],
  File "/path/model_training/DeepSpeed-Chat/training/step3_rlhf_finetuning/ppo_trainer.py", line 97, in generate_experience
    self.eval()
  File "/path/model_training/DeepSpeed-Chat/training/step3_rlhf_finetuning/ppo_trainer.py", line 237, in eval
    self.actor_model.eval()
  File "/root/miniconda3/envs/dschat/lib/python3.9/site-packages/deepspeed/runtime/hybrid_engine.py", line 379, in eval
    f'|CurSamplesPerSec={(1 / latency * self._total_batch_size):.2f} ' + \
TypeError: unsupported operand type(s) for *: 'float' and 'NoneType'
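
The failing arithmetic can be reproduced outside DeepSpeed. The following is a minimal sketch, assuming only that the expression on the last traceback line is evaluated with the batch size still unset. The function names samples_per_sec and safe_samples_per_sec are hypothetical and not part of DeepSpeed; the guard only illustrates why a None value fails here, and it does not show where the value should have been set.

```python
def samples_per_sec(latency, total_batch_size):
    """Mirror the arithmetic from the traceback: 1 / latency * total_batch_size."""
    return 1 / latency * total_batch_size


def safe_samples_per_sec(latency, total_batch_size):
    """Hypothetical guard: return None instead of raising when the batch size is unset."""
    if total_batch_size is None:
        return None
    return 1 / latency * total_batch_size


# With total_batch_size=None, 1 / latency is a float and float * None raises TypeError.
try:
    samples_per_sec(0.5, None)
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```

The printed message matches the one in the traceback, which is consistent with self._total_batch_size never having been assigned before eval() was called.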
