I modified tools/train.py as follows:
strategy = "ddp"
devices = [0, 1, 2, 3]
trainer = pl.Trainer(
    default_root_dir=cfg.save_dir,
    max_epochs=cfg.schedule.total_epochs,
    check_val_every_n_epoch=cfg.schedule.val_intervals,
    accelerator=accelerator,
    devices=devices,
    gpus=len(devices),
    log_every_n_steps=cfg.log.interval,
    num_sanity_val_steps=0,
    callbacks=[TQDMProgressBar(refresh_rate=0)],  # disable tqdm bar
    logger=logger,
    benchmark=cfg.get("cudnn_benchmark", True),
    gradient_clip_val=cfg.get("grad_clip", 0.0),
    strategy=strategy,
    precision=precision,
)
When I start training, only a single GPU is used. What did I configure incorrectly?
Environment:
pytorch-lightning 1.9.5
pytorch 1.13.1 py3.10_cuda11.6_cudnn8.3.2_0 pytorch
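
One way to narrow this down, as a sketch rather than a confirmed fix: in pytorch-lightning 1.9 the `gpus` argument is deprecated in favor of `accelerator="gpu"` together with `devices`, so the snippet above sets the device selection twice. Whether that is the cause here is an assumption; a `CUDA_VISIBLE_DEVICES` value that exposes only one GPU would give the same symptom. The helper names `visible_gpu_ids` and `multi_gpu_trainer_kwargs` below are hypothetical and not part of tools/train.py.

```python
import os


def visible_gpu_ids(env=None):
    """Return the GPU ids listed in CUDA_VISIBLE_DEVICES, or None if it is unset."""
    env = os.environ if env is None else env
    raw = env.get("CUDA_VISIBLE_DEVICES")
    if raw is None:
        return None  # unset: the process can see every GPU on the machine
    return [tok.strip() for tok in raw.split(",") if tok.strip()]


def multi_gpu_trainer_kwargs(devices, strategy="ddp"):
    """Build only the device-related pl.Trainer kwargs, leaving out the deprecated `gpus`."""
    return {
        "accelerator": "gpu",      # assumption: training is meant to run on CUDA GPUs
        "devices": list(devices),  # explicit device indices, e.g. [0, 1, 2, 3]
        "strategy": strategy,
    }


if __name__ == "__main__":
    # If this prints a single id such as ['0'], only one GPU is visible to the process.
    print("CUDA_VISIBLE_DEVICES ->", visible_gpu_ids())
    print(multi_gpu_trainer_kwargs([0, 1, 2, 3]))
```

The returned dict could be unpacked into the existing call, for example `pl.Trainer(**multi_gpu_trainer_kwargs(devices), ...)`, replacing the separate `accelerator=`, `devices=`, `gpus=` and `strategy=` arguments.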