
Is Llama model initialized on GPU? #776

Open
@mahmoodn


When I run the Llama training command, I see the following message:

You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.

I haven't modified the scripts, so I wonder what this message means and what effect it has. Should I set something extra in the configuration options?

The log messages before and after the warning are shown below:

[2024-11-02 19:47:06,515] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2024-11-02 19:50:02,996] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2024-11-02 19:50:02,996] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2024-11-02 19:50:08,549] [INFO] [comm.py:652:init_distributed] cdb=None
[2024-11-02 19:50:08,549] [INFO] [comm.py:652:init_distributed] cdb=None
[2024-11-02 19:50:08,550] [INFO] [comm.py:683:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl
[2024-11-02 19:50:16,865] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 2
[2024-11-02 19:50:16,865] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 2
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
[2024-11-02 19:50:29,074] [INFO] [partition_parameters.py:348:__exit__] finished initializing model - num_params = 563, num_elems = 68.98B

Loading checkpoint shards:   0%|          | 0/29 [00:00<?, ?it/s]
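
To see where the model's parameters actually live once loading has finished, a small check like the one below could be used. This is only a sketch: `device_summary` and `all_on_gpu` are hypothetical helper names, not part of the training scripts, and they assume `model` is a PyTorch-style module whose `parameters()` yields tensors with a `.device` attribute. The check reports current placement only; it does not by itself say whether the warning matters in a DeepSpeed run.

```python
from collections import Counter


def device_summary(model):
    """Count parameters per device string, e.g. {'cpu': 3, 'cuda:0': 560}.

    Assumes a PyTorch-style model: model.parameters() yields objects
    that expose a `.device` attribute.
    """
    return Counter(str(p.device) for p in model.parameters())


def all_on_gpu(model):
    """Return True only if the model has parameters and all are on CUDA devices."""
    counts = device_summary(model)
    return bool(counts) and all(dev.startswith("cuda") for dev in counts)


# Hypothetical usage, after the model object has been created:
#   print(device_summary(model))
#   print(all_on_gpu(model))
```

If the summary shows `cpu` entries after loading completes, that matches what the warning describes, and the `model.to('cuda')` call it suggests would be the thing to try.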
