[Question] How to run on GPU? Seems like it always runs on CPU. #486

Open
@sikora507

❓ Question

I am trying to run the GPU Docker image using the provided scripts. Everything appears to be set up correctly, but I still get the "Using cpu device" message during training.

I've downloaded the rl-baselines3-zoo repo.
I've run the script both with --device cuda and without it (falling back to "auto"):
./scripts/run_docker_gpu.sh python train.py --algo ppo --env CartPole-v1 --device cuda
I've checked nvidia-smi on my host machine:
NVIDIA-SMI 570.124.04 Driver Version: 570.124.04 CUDA Version: 12.8
NVIDIA GeForce GTX 1080 Ti
I've also run nvidia-smi inside the container and got the same output, so my GPU is visible from within the container.
For that I've prepared a Docker Compose file, which I run like this:
docker compose run rl-baselines3-zoo
The file itself:

version: "3.8"
services:
  rl-baselines3-zoo:
    image: stablebaselines/rl-baselines3-zoo
    volumes:
      - ./:/rl-baselines3-zoo
    runtime: nvidia # Add this line
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
    working_dir: /rl-baselines3-zoo
    command: /bin/bash

When I run nvidia-smi from the container I can see my GPU, as pasted above. The command:
(base) mambauser@c20a8aaf94f7:/rl-baselines3-zoo$ nvidia-smi
Whenever I run an example training script, it uses my CPU, which I can see in System Monitor.
I've tried to run the command:
python train.py --algo ppo --env CartPole-v1
It doesn't matter whether I run this from within the container, use ./scripts/run_docker_gpu.sh, or try to force --device cuda; it still outputs:

========== CartPole-v1 ==========
Seed: 1578490461
Loading hyperparameters from: /rl-baselines3-zoo/hyperparams/ppo.yml
Default hyperparameters for environment (ones being tuned will be overridden):
OrderedDict([('batch_size', 256),
             ('clip_range', 'lin_0.2'),
             ('ent_coef', 0.0),
             ('gae_lambda', 0.8),
             ('gamma', 0.98),
             ('learning_rate', 'lin_0.001'),
             ('n_envs', 8),
             ('n_epochs', 20),
             ('n_steps', 32),
             ('n_timesteps', 100000.0),
             ('policy', 'MlpPolicy')])
Using 8 environments
Creating test environment
Using cpu device
Log path: logs/ppo/CartPole-v1_12
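
One possible explanation for seeing "Using cpu device" even with --device cuda: as far as I can tell, Stable-Baselines3 resolves the device through a helper (get_device in stable_baselines3.common.utils) that quietly returns the CPU whenever PyTorch reports that CUDA is unavailable. The sketch below mimics that assumed behaviour in plain Python so the fallback is easy to see. The name resolve_device and its cuda_available parameter are hypothetical, made up for illustration; this is not the library's code.

```python
def resolve_device(requested: str = "auto", cuda_available: bool = False) -> str:
    """Mimic the assumed fallback: prefer CUDA, use CPU if CUDA is unavailable.

    Hypothetical helper for illustration only, not part of Stable-Baselines3.
    """
    # "auto" is treated as a request for CUDA.
    device = "cuda" if requested == "auto" else requested
    # A CUDA request is downgraded without an error when no CUDA device is usable.
    if device.startswith("cuda") and not cuda_available:
        return "cpu"
    return device


if __name__ == "__main__":
    for requested in ("auto", "cuda", "cpu"):
        print(requested, "->", resolve_device(requested, cuda_available=False))
```

If that reading is right, the message in the log says less about the --device flag and more about whether PyTorch inside the container can see a usable CUDA device.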

I know that PPO on CartPole may not benefit from a GPU and is probably better run on the CPU, but it still bothers me. What am I missing?
What else do I need to check?
I thought the device might be hardcoded to CPU for this particular environment and algorithm, but that doesn't seem to be the case.
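
One more thing that might be worth checking: nvidia-smi only confirms that the driver and the container runtime can see the GPU, and the "CUDA Version" it prints is the highest version the driver supports, not the version PyTorch was built against. Whether training can use the GPU depends on the PyTorch build inside the image. A minimal sketch of such a check, run from inside the container, could look like this. The function name torch_cuda_report is hypothetical; the torch calls are standard PyTorch APIs.

```python
def torch_cuda_report() -> dict:
    """Collect what PyTorch itself reports about CUDA (hypothetical helper)."""
    try:
        import torch
    except (ImportError, OSError):
        # PyTorch is missing or its shared libraries failed to load.
        return {"torch_installed": False}

    report = {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # None means a CPU-only PyTorch build, regardless of what nvidia-smi shows.
        "built_with_cuda": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
    }
    if report["cuda_available"]:
        report["device_name"] = torch.cuda.get_device_name(0)
        # Compute capability as (major, minor); a GTX 1080 Ti reports (6, 1).
        report["device_capability"] = torch.cuda.get_device_capability(0)
        # Architectures this PyTorch binary was compiled for, such as "sm_61".
        report["compiled_archs"] = torch.cuda.get_arch_list()
    return report


if __name__ == "__main__":
    for key, value in torch_cuda_report().items():
        print(f"{key}: {value}")
```

If built_with_cuda is None or cuda_available is False, the CPU fallback would be expected no matter which --device value is passed. If CUDA is available but the card's architecture is absent from compiled_archs, that mismatch would be another thing to investigate.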
