Hi, I'm currently trying to train Qwen2.5 models with DeepSpeed and Transformers. I tried to enable TP or PP by setting ds_config.json like this:
"tensor_parallel":{"autotp_size":4},
"pipeline_parallel": {
"pp_size": 64,
"microbatches": 4,
"enable_backward_allreduce": true
},
But it doesn't seem to be working. Should I wrap the model with PipelineModule?
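For context, here is my mental model of what pipeline parallelism should do with the layers. This is just a plain-Python sketch of a uniform layer partition, not DeepSpeed's actual code, and I'm assuming the 28-decoder-layer count of Qwen2.5-7B:

```python
def partition_layers(num_layers: int, num_stages: int) -> list[range]:
    """Uniformly split num_layers decoder layers across num_stages
    pipeline stages (a sketch, not DeepSpeed's real partitioner)."""
    base, rem = divmod(num_layers, num_stages)
    stages, start = [], 0
    for s in range(num_stages):
        # early stages absorb the remainder, one extra layer each
        size = base + (1 if s < rem else 0)
        stages.append(range(start, start + size))
        start += size
    return stages

# Qwen2.5-7B has 28 decoder layers; with 4 stages each gets 7 layers.
print([len(r) for r in partition_layers(28, 4)])  # → [7, 7, 7, 7]
```

By this logic a pp_size of 64 on a 28-layer model would leave most stages with no layers at all, so I'm also wondering whether that value even makes sense here.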