v1.0.0-rc0 does not support actor_rollout_ref.rollout.tensor_model_parallel_size > 8 #5585

@l1351868270

System Info

verl: v1.0.0-rc0
Image: vllm017.latest

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

Run examples/grpo_trainer/run_qwen3moe-30b_megatron_96gb.sh with the following setting:
actor_rollout_ref.rollout.tensor_model_parallel_size=16
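
A minimal sketch of how the reproduction might be launched, in case it helps others reproduce it. It assumes the example script forwards extra command-line arguments to the trainer as config overrides, which the report does not confirm; if it does not, the value would have to be edited inside the script instead. The variable names `SCRIPT`, `TP_SIZE`, and `CMD` are only illustrative.

```shell
# Reproduction sketch (assumption: the script passes "$@" through as overrides).
SCRIPT="examples/grpo_trainer/run_qwen3moe-30b_megatron_96gb.sh"
TP_SIZE=16  # rollout tensor parallel size from the report; any value > 8 is the case in question

# Build the command and print it for inspection rather than executing it.
CMD="bash ${SCRIPT} actor_rollout_ref.rollout.tensor_model_parallel_size=${TP_SIZE}"
echo "${CMD}"
```

The snippet only prints the command; running the printed line from the verl repository root would start the actual job.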

Expected behavior

Rollout tensor parallel sizes greater than 8 (tp > 8) should be supported.

Labels: bug
