
Commit 63550c6

Copilot and vfdev-5 authored
Fix FutureWarning: Replace torch.cuda.amp.GradScaler with torch.amp.GradScaler (#3458)
## Plan to Fix FutureWarning for torch.cuda.amp.GradScaler

- [x] Update ignite/engine/__init__.py:
  - [x] Change import from `torch.cuda.amp.GradScaler` to `torch.amp.GradScaler`
  - [x] Update type hints from `"torch.cuda.amp.GradScaler"` to `"torch.amp.GradScaler"`
  - [x] Update documentation example from `torch.cuda.amp.GradScaler(2**10)` to `torch.amp.GradScaler('cuda', 2**10)`
  - [x] Update docstring reference from `torch.cuda.amp` to `torch.amp`
  - [x] Collapse imports: combine `from torch.amp import autocast` and `from torch.amp import GradScaler`
  - [x] Remove explicit 'cuda' parameter from GradScaler instantiations (device is auto-detected)
  - [x] Keep PyTorch version requirements at >= 1.12.0 (when torch.amp was introduced)
- [x] Update tests/ignite/engine/test_create_supervised.py:
  - [x] Update all type hints from `"torch.cuda.amp.GradScaler"` to `"torch.amp.GradScaler"`
  - [x] Update test instantiations from `torch.cuda.amp.GradScaler` to `torch.amp.GradScaler`
  - [x] Remove 'cuda' parameter from test GradScaler instantiations
- [x] Update example files:
  - [x] examples/cifar10/main.py - removed 'cuda' parameter
  - [x] examples/cifar10_qat/main.py - removed 'cuda' parameter
  - [x] examples/cifar100_amp_benchmark/benchmark_torch_cuda_amp.py - removed 'cuda' parameter
  - [x] examples/transformers/main.py - removed 'cuda' parameter
  - [x] examples/references/segmentation/pascal_voc2012/main.py - removed 'cuda' parameter
  - [x] examples/references/classification/imagenet/main.py - removed 'cuda' parameter
  - [x] examples/notebooks/CycleGAN_with_torch_cuda_amp.ipynb - reverted to original state
- [x] Update documentation and notebooks:
  - [x] README.md - updated benchmark description from torch.cuda.amp to torch.amp
  - [x] docs/source/conf.py - updated type hint reference from torch.cuda.amp.grad_scaler to torch.amp.grad_scaler
  - [x] examples/notebooks/CycleGAN_with_torch_cuda_amp.ipynb - updated GradScaler references and collapsed imports
- [x] Fix code style issues:
  - [x] Added newline at end of CycleGAN notebook to satisfy pre-commit hooks
- [x] Run linters and tests to validate changes
  - [x] Python syntax check passed
  - [x] Module import test passed
  - [x] GradScaler instantiation tests passed (without explicit device parameter)
- [x] Fixed CI test failures
  - [x] Verified no remaining torch.cuda.amp.GradScaler references
  - [x] Code style checks passing

## Summary

Successfully replaced all occurrences of the deprecated `torch.cuda.amp.GradScaler` with `torch.amp.GradScaler`:

- **Changed 11 files** (1 core module, 1 test file, 6 example scripts, 1 README, 1 config, 1 notebook)
- **Collapsed imports** as requested: `from torch.amp import autocast, GradScaler`
- **Updated all type hints** from `"torch.cuda.amp.GradScaler"` to `"torch.amp.GradScaler"`
- **Simplified instantiations** to use `GradScaler(enabled=...)` without explicit device parameter (auto-detected)
- **Maintained version requirements** at PyTorch >= 1.12.0 (when torch.amp was introduced)
- **Updated documentation and notebooks** to reflect the new API
- **Fixed code style issues** to pass pre-commit checks

The changes have been tested and validated:

- ✓ No syntax errors
- ✓ Module imports successfully
- ✓ No deprecation warnings
- ✓ GradScaler works correctly with simplified API (device auto-detection)
- ✓ Fixed test failures by removing 'cuda' parameter from test instantiations
- ✓ All torch.cuda.amp.GradScaler references replaced with torch.amp.GradScaler
- ✓ Code style checks passing
<details>
<summary>Original prompt</summary>

> ----
>
> *This section details on the original issue you should resolve*
>
> <issue_title>Fix FutureWarning: `torch.cuda.amp.GradScaler(args...)` is deprecated. Please use `torch.amp.GradScaler('cuda', args...)` instead.</issue_title>
> <issue_description></issue_description>
>
> <agent_instructions>Change the code where used `torch.cuda.amp.GradScaler` into `torch.amp.GradScaler`. Make sure to collapse imports like:
> ```python
> from torch.amp import autocast
> from torch.amp import GradScaler
> ```
> into
> ```python
> from torch.amp import autocast, GradScaler
> ```
> </agent_instructions>
>
> ## Comments on the Issue (you are @copilot in this section)
>
> <comments>
> <comment_new><author>@vfdev-5</author><body>
> @keelobytes go ahead. Thanks!</body></comment_new>
> <comment_new><author>@vfdev-5</author><body>
> @keelobytes any progress on this ticket?</body></comment_new>
> <comment_new><author>@vfdev-5</author><body>
> Send a PR if you can work on this issue</body></comment_new>
> </comments>

</details>

Fixes #3435

---------

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: vfdev-5 <[email protected]>
Co-authored-by: vfdev <[email protected]>
1 parent 1545a3d commit 63550c6

File tree: 11 files changed (+35 −42 lines)
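
Before the per-file diffs, here is a minimal before/after sketch of the change this commit applies everywhere. The deprecated call and its replacement follow the issue and the commit description; the variable names are illustrative only.

```python
# Before: deprecated, emits a FutureWarning on recent PyTorch
#     from torch.cuda.amp import GradScaler
#     scaler = torch.cuda.amp.GradScaler(2**10)

# After: single collapsed import from torch.amp, as applied throughout this commit
from torch.amp import autocast, GradScaler  # autocast shown only to illustrate the collapsed import

# The device argument defaults to "cuda" (described in the PR as auto-detected),
# so the explicit 'cuda' parameter was dropped from the instantiations in this PR.
scaler = GradScaler(enabled=True)

# Passing the device and an initial scale explicitly is still valid:
scaler_with_device = GradScaler("cuda", init_scale=2**10)
```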

README.md

Lines changed: 1 addition & 1 deletion
```diff
@@ -397,7 +397,7 @@ Few pointers to get you started:
 - [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/pytorch/ignite/blob/master/examples/notebooks/FastaiLRFinder_MNIST.ipynb) [Basic example of LR finder on
   MNIST](https://github.com/pytorch/ignite/blob/master/examples/notebooks/FastaiLRFinder_MNIST.ipynb)
 - [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/pytorch/ignite/blob/master/examples/notebooks/Cifar100_bench_amp.ipynb) [Benchmark mixed precision training on Cifar100:
-  torch.cuda.amp vs nvidia/apex](https://github.com/pytorch/ignite/blob/master/examples/notebooks/Cifar100_bench_amp.ipynb)
+  torch.amp vs nvidia/apex](https://github.com/pytorch/ignite/blob/master/examples/notebooks/Cifar100_bench_amp.ipynb)
 - [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/pytorch/ignite/blob/master/examples/notebooks/MNIST_on_TPU.ipynb) [MNIST training on a single
   TPU](https://github.com/pytorch/ignite/blob/master/examples/notebooks/MNIST_on_TPU.ipynb)
 - [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1E9zJrptnLJ_PKhmaP5Vhb6DTVRvyrKHx) [CIFAR10 Training on multiple TPUs](https://github.com/pytorch/ignite/tree/master/examples/cifar10)
```

docs/source/conf.py

Lines changed: 1 addition & 1 deletion
```diff
@@ -354,7 +354,7 @@ def run(self):
     ("py:class", "torch.optim.optimizer.Optimizer"),
     ("py:class", "torch.utils.data.dataset.Dataset"),
     ("py:class", "torch.utils.data.sampler.BatchSampler"),
-    ("py:class", "torch.cuda.amp.grad_scaler.GradScaler"),
+    ("py:class", "torch.amp.grad_scaler.GradScaler"),
     ("py:class", "torch.optim.lr_scheduler._LRScheduler"),
     ("py:class", "torch.optim.lr_scheduler.LRScheduler"),
     ("py:class", "torch.utils.data.dataloader.DataLoader"),
```

examples/cifar10/main.py

Lines changed: 1 addition & 2 deletions
```diff
@@ -7,8 +7,7 @@
 import torch.nn as nn
 import torch.optim as optim
 import utils
-from torch.amp import autocast
-from torch.cuda.amp import GradScaler
+from torch.amp import autocast, GradScaler
 
 import ignite
 import ignite.distributed as idist
```

examples/cifar100_amp_benchmark/benchmark_torch_cuda_amp.py

Lines changed: 1 addition & 2 deletions
```diff
@@ -1,7 +1,6 @@
 import fire
 import torch
-from torch.amp import autocast
-from torch.cuda.amp import GradScaler
+from torch.amp import autocast, GradScaler
 from torch.nn import CrossEntropyLoss
 from torch.optim import SGD
 from torchvision.models import wide_resnet50_2
```

examples/cifar10_qat/main.py

Lines changed: 1 addition & 2 deletions
```diff
@@ -6,8 +6,7 @@
 import torch.nn as nn
 import torch.optim as optim
 import utils
-from torch.amp import autocast
-from torch.cuda.amp import GradScaler
+from torch.amp import autocast, GradScaler
 
 import ignite
 import ignite.distributed as idist
```

examples/notebooks/CycleGAN_with_torch_cuda_amp.ipynb

Lines changed: 11 additions & 12 deletions
```diff
@@ -875,10 +875,10 @@
 "As suggested, we divide the objective by 2 while optimizing D, which slows down the rate at which D learns, relative to the rate of G. \n",
 "\n",
 "According to the paper:\n",
-"- generator A is trained minimize $\\text{mean}_{x \\in A}[(D_B(G(x)) − 1)^2]$ and cycle loss $\\text{mean}_{x \\in A}\\left[ |F(G(x)) - x|_1 \\right]$\n",
-"- generator B is trained minimize $\\text{mean}_{y \\in B}[(D_A(F(y)) − 1)^2]$ and cycle loss $\\text{mean}_{y \\in B}\\left[ |G(F(y)) - y|_1 \\right]$\n",
-"- discriminators A is trained to minimize $\\text{mean}_{x \\in A}[(D_A(x) − 1)^2] + \\text{mean}_{y \\in B}[D_A(F(y))^2]$.\n",
-"- discriminator B is trained to minimize $\\text{mean}_{y \\in B}[(D_B(y) − 1)^2] + \\text{mean}_{x \\in A}[D_B(G(x))^2]$."
+"- generator A is trained minimize $\\text{mean}_{x \\in A}[(D_B(G(x)) \u2212 1)^2]$ and cycle loss $\\text{mean}_{x \\in A}\\left[ |F(G(x)) - x|_1 \\right]$\n",
+"- generator B is trained minimize $\\text{mean}_{y \\in B}[(D_A(F(y)) \u2212 1)^2]$ and cycle loss $\\text{mean}_{y \\in B}\\left[ |G(F(y)) - y|_1 \\right]$\n",
+"- discriminators A is trained to minimize $\\text{mean}_{x \\in A}[(D_A(x) \u2212 1)^2] + \\text{mean}_{y \\in B}[D_A(F(y))^2]$.\n",
+"- discriminator B is trained to minimize $\\text{mean}_{y \\in B}[(D_B(y) \u2212 1)^2] + \\text{mean}_{x \\in A}[D_B(G(x))^2]$."
 ]
 },
 {
@@ -887,7 +887,7 @@
 "id": "JE8dLeEfIl_Z"
 },
 "source": [
-"We will use [`torch.amp.autocast`](https://pytorch.org/docs/master/amp.html#torch.amp.autocast) and [`torch.cuda.amp.GradScaler`](https://pytorch.org/docs/master/amp.html#torch.cuda.amp.GradScaler) to perform automatic mixed precision training. Our code follows a [typical mixed precision training example](https://pytorch.org/docs/master/notes/amp_examples.html#typical-mixed-precision-training)."
+"We will use [`torch.amp.autocast`](https://pytorch.org/docs/master/amp.html#torch.amp.autocast) and [`torch.amp.GradScaler`](https://pytorch.org/docs/master/amp.html#torch.amp.GradScaler) to perform automatic mixed precision training. Our code follows a [typical mixed precision training example](https://pytorch.org/docs/master/notes/amp_examples.html#typical-mixed-precision-training)."
 ]
 },
 {
@@ -896,8 +896,7 @@
 "id": "vrJls4p-FRcA"
 },
 "source": [
-"from torch.cuda.amp import GradScaler\n",
-"from torch.amp import autocast\n",
+"from torch.amp import autocast, GradScaler\n",
 "\n",
 "from ignite.utils import convert_tensor\n",
 "import torch.nn.functional as F\n",
@@ -924,7 +923,7 @@
 "\n",
 "\n",
 "def compute_loss_discriminator(decision_real, decision_fake):\n",
-"    # loss = mean (D_b(y) − 1)^2 + mean D_b(G(x))^2 \n",
+"    # loss = mean (D_b(y) \u2212 1)^2 + mean D_b(G(x))^2 \n",
 "    loss = F.mse_loss(decision_fake, torch.zeros_like(decision_fake))\n",
 "    loss += F.mse_loss(decision_real, torch.ones_like(decision_real))\n",
 "    return loss\n",
@@ -954,10 +953,10 @@
 "    decision_fake_b = discriminator_B(fake_b)\n",
 "\n",
 "    # Compute loss for generators and update generators\n",
-"    # loss_a2b = GAN loss: mean (D_b(G(x)) − 1)^2 + Forward cycle loss: || F(G(x)) - x ||_1 \n",
+"    # loss_a2b = GAN loss: mean (D_b(G(x)) \u2212 1)^2 + Forward cycle loss: || F(G(x)) - x ||_1 \n",
 "    loss_a2b = compute_loss_generator(decision_fake_b, real_a, rec_a, lambda_value) \n",
 "\n",
-"    # loss_b2a = GAN loss: mean (D_a(F(x)) − 1)^2 + Backward cycle loss: || G(F(y)) - y ||_1\n",
+"    # loss_b2a = GAN loss: mean (D_a(F(x)) \u2212 1)^2 + Backward cycle loss: || G(F(y)) - y ||_1\n",
 "    loss_b2a = compute_loss_generator(decision_fake_a, real_b, rec_b, lambda_value)\n",
 "\n",
 "    # total generators loss:\n",
@@ -977,10 +976,10 @@
 "    decision_real_a, decision_fake_a = discriminator_forward_pass(discriminator_A, real_a, fake_a.detach(), fake_a_buffer) \n",
 "    decision_real_b, decision_fake_b = discriminator_forward_pass(discriminator_B, real_b, fake_b.detach(), fake_b_buffer) \n",
 "    # Compute loss for discriminators and update discriminators\n",
-"    # loss_a = mean (D_a(y) − 1)^2 + mean D_a(F(x))^2\n",
+"    # loss_a = mean (D_a(y) \u2212 1)^2 + mean D_a(F(x))^2\n",
 "    loss_a = compute_loss_discriminator(decision_real_a, decision_fake_a)\n",
 "\n",
-"    # loss_b = mean (D_b(y) − 1)^2 + mean D_b(G(x))^2\n",
+"    # loss_b = mean (D_b(y) \u2212 1)^2 + mean D_b(G(x))^2\n",
 "    loss_b = compute_loss_discriminator(decision_real_b, decision_fake_b)\n",
 "    \n",
 "    # total discriminators loss:\n",
```

examples/references/classification/imagenet/main.py

Lines changed: 1 addition & 2 deletions
```diff
@@ -6,8 +6,7 @@
 import torch
 
 try:
-    from torch.amp import autocast
-    from torch.cuda.amp import GradScaler
+    from torch.amp import autocast, GradScaler
 except ImportError:
     raise RuntimeError("Please, use recent PyTorch version, e.g. >=1.12.0")
 
```
examples/references/segmentation/pascal_voc2012/main.py

Lines changed: 1 addition & 2 deletions
```diff
@@ -6,8 +6,7 @@
 import torch
 
 try:
-    from torch.amp import autocast
-    from torch.cuda.amp import GradScaler
+    from torch.amp import autocast, GradScaler
 except ImportError:
     raise RuntimeError("Please, use recent PyTorch version, e.g. >=1.12.0")
 
```
examples/transformers/main.py

Lines changed: 1 addition & 2 deletions
```diff
@@ -7,8 +7,7 @@
 import torch.nn as nn
 import torch.optim as optim
 import utils
-from torch.amp import autocast
-from torch.cuda.amp import GradScaler
+from torch.amp import autocast, GradScaler
 
 import ignite
 import ignite.distributed as idist
```

ignite/engine/__init__.py

Lines changed: 9 additions & 9 deletions
```diff
@@ -133,11 +133,11 @@ def supervised_training_step_amp(
     prepare_batch: Callable = _prepare_batch,
     model_transform: Callable[[Any], Any] = lambda output: output,
     output_transform: Callable[[Any, Any, Any, torch.Tensor], Any] = lambda x, y, y_pred, loss: loss.item(),
-    scaler: Optional["torch.cuda.amp.GradScaler"] = None,
+    scaler: Optional["torch.amp.GradScaler"] = None,
     gradient_accumulation_steps: int = 1,
     model_fn: Callable[[torch.nn.Module, Any], Any] = lambda model, x: model(x),
 ) -> Callable:
-    """Factory function for supervised training using ``torch.cuda.amp``.
+    """Factory function for supervised training using ``torch.amp``.
 
     Args:
         model: the model to train.
@@ -170,7 +170,7 @@ def supervised_training_step_amp(
             model = ...
             optimizer = ...
             loss_fn = ...
-            scaler = torch.cuda.amp.GradScaler(2**10)
+            scaler = torch.amp.GradScaler('cuda', 2**10)
 
             update_fn = supervised_training_step_amp(model, optimizer, loss_fn, 'cuda', scaler=scaler)
             trainer = Engine(update_fn)
@@ -185,7 +185,7 @@ def supervised_training_step_amp(
     """
 
     try:
-        from torch.amp import autocast
+        from torch.amp import autocast, GradScaler
     except ImportError:
         raise ImportError("Please install torch>=1.12.0 to use amp_mode='amp'.")
 
@@ -393,8 +393,8 @@ def update(engine: Engine, batch: Sequence[torch.Tensor]) -> Union[Any, Tuple[to
 
 
 def _check_arg(
-    on_tpu: bool, on_mps: bool, amp_mode: Optional[str], scaler: Optional[Union[bool, "torch.cuda.amp.GradScaler"]]
-) -> Tuple[Optional[str], Optional["torch.cuda.amp.GradScaler"]]:
+    on_tpu: bool, on_mps: bool, amp_mode: Optional[str], scaler: Optional[Union[bool, "torch.amp.GradScaler"]]
+) -> Tuple[Optional[str], Optional["torch.amp.GradScaler"]]:
     """Checking tpu, mps, amp and GradScaler instance combinations."""
     if on_mps and amp_mode:
         raise ValueError("amp_mode cannot be used with mps device. Consider using amp_mode=None or device='cuda'.")
@@ -410,7 +410,7 @@ def _check_arg(
         raise ValueError(f"scaler argument is {scaler}, but amp_mode is {amp_mode}. Consider using amp_mode='amp'.")
     elif amp_mode == "amp" and isinstance(scaler, bool):
         try:
-            from torch.cuda.amp import GradScaler
+            from torch.amp import GradScaler
         except ImportError:
             raise ImportError("Please install torch>=1.6.0 to use scaler argument.")
         scaler = GradScaler(enabled=True)
@@ -434,7 +434,7 @@ def create_supervised_trainer(
     output_transform: Callable[[Any, Any, Any, torch.Tensor], Any] = lambda x, y, y_pred, loss: loss.item(),
     deterministic: bool = False,
     amp_mode: Optional[str] = None,
-    scaler: Union[bool, "torch.cuda.amp.GradScaler"] = False,
+    scaler: Union[bool, "torch.amp.GradScaler"] = False,
     gradient_accumulation_steps: int = 1,
     model_fn: Callable[[torch.nn.Module, Any], Any] = lambda model, x: model(x),
 ) -> Engine:
@@ -459,7 +459,7 @@ def create_supervised_trainer(
             :class:`~ignite.engine.deterministic.DeterministicEngine`, otherwise :class:`~ignite.engine.engine.Engine`
             (default: False).
         amp_mode: can be ``amp`` or ``apex``, model and optimizer will be casted to float16 using
-            `torch.cuda.amp <https://pytorch.org/docs/stable/amp.html>`_ for ``amp`` and
+            `torch.amp <https://pytorch.org/docs/stable/amp.html>`_ for ``amp`` and
             using `apex <https://nvidia.github.io/apex>`_ for ``apex``. (default: None)
         scaler: GradScaler instance for gradient scaling if `torch>=1.6.0`
             and ``amp_mode`` is ``amp``. If ``amp_mode`` is ``apex``, this argument will be ignored.
```
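
To connect these engine changes to user code: `create_supervised_trainer` accepts either `scaler=True` (ignite then builds `GradScaler(enabled=True)` from `torch.amp`, per the `_check_arg` hunk above) or a ready-made `torch.amp.GradScaler` instance. A hedged usage sketch, assuming a CUDA device is available; the toy model, optimizer, and loss are illustrative only.

```python
import torch
from torch.amp import GradScaler
from ignite.engine import create_supervised_trainer

model = torch.nn.Linear(4, 1)                             # toy model, illustration only
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = torch.nn.MSELoss()

# Option 1: let ignite construct the scaler internally.
trainer = create_supervised_trainer(
    model, optimizer, loss_fn, device="cuda", amp_mode="amp", scaler=True
)

# Option 2: pass a pre-built torch.amp.GradScaler, e.g. with a custom initial scale.
scaler = GradScaler(init_scale=2**10)
trainer = create_supervised_trainer(
    model, optimizer, loss_fn, device="cuda", amp_mode="amp", scaler=scaler
)
```

Either form stays on the `torch.amp` path and avoids the deprecated `torch.cuda.amp.GradScaler` call that triggered the FutureWarning reported in #3435.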
