When I run the Llama training command, I see the following message:
> You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
I haven't modified the scripts, so I'm wondering what this message means and what effect it has on training. Do I need to set anything extra in the configuration options?
The log messages immediately before and after it are shown below:
```
[2024-11-02 19:47:06,515] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2024-11-02 19:50:02,996] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2024-11-02 19:50:02,996] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2024-11-02 19:50:08,549] [INFO] [comm.py:652:init_distributed] cdb=None
[2024-11-02 19:50:08,549] [INFO] [comm.py:652:init_distributed] cdb=None
[2024-11-02 19:50:08,550] [INFO] [comm.py:683:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl
[2024-11-02 19:50:16,865] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 2
[2024-11-02 19:50:16,865] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 2
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
[2024-11-02 19:50:29,074] [INFO] [partition_parameters.py:348:__exit__] finished initializing model - num_params = 563, num_elems = 68.98B
Loading checkpoint shards: 0%| | 0/29 [00:00<?, ?it/s]
```
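For reference, here is a minimal sketch of the loading pattern the warning seems to refer to, assuming the standard `transformers` loading path. The checkpoint name, dtype, and arguments here are my illustration, not lines from the actual training script:

```python
# Minimal sketch of the pattern the warning refers to, assuming the
# standard `transformers` loading path. The checkpoint name below is a
# placeholder, not the one from my actual run.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-70b-hf",              # placeholder checkpoint
    attn_implementation="flash_attention_2",  # request FA2 kernels
    torch_dtype=torch.bfloat16,               # FA2 requires fp16/bf16
)

# The model is materialized on CPU first, which appears to be what
# triggers the warning; the message suggests moving it to GPU afterwards.
model.to("cuda")
```

If this is roughly what the training scripts do internally, I'd like to understand whether the warning is benign in this setup or whether something needs to change.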