[QUESTION] Error in pipeline parallelism training for multimodal model #1220

@CFC87

I am trying to run the example at https://github.com/NVIDIA/Megatron-LM/tree/main/examples/multimodal.
However, once I set --pipeline-model-parallel-size to a value larger than 1, I get the following error:
  File "/workspace/megatron-lm/megatron/core/models/gpt/gpt_model.py", line 219, in forward
    rotary_seq_len = self.rotary_pos_emb.get_rotary_seq_len(
  File "/workspace/megatron-lm/megatron/core/models/common/embeddings/rotary_pos_embedding.py", line 173, in get_rotary_seq_len
    rotary_seq_len = transformer_input.size(0)
AttributeError: 'NoneType' object has no attribute 'size'
