
Commit b97301c

Copilot authored and vfdev-5 committed
Apply torch.cuda.amp.GradScaler to torch.amp.GradScaler replacements in README, docs, and notebooks
Co-authored-by: vfdev-5 <[email protected]>
1 parent 942d8f4

3 files changed: +14 −15 lines
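The substance of the change is the move from the deprecated, CUDA-specific `torch.cuda.amp.GradScaler` constructor to the device-generic `torch.amp.GradScaler`. As a rough sketch of what the replacement looks like in user code (illustrative only, not taken from this diff; the `enabled` flag is optional):

```python
import torch

# Before: CUDA-specific constructor, deprecated in recent PyTorch releases.
# scaler = torch.cuda.amp.GradScaler()

# After: device-generic constructor; the device string is the first argument.
scaler = torch.amp.GradScaler("cuda", enabled=torch.cuda.is_available())
```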

README.md

Lines changed: 1 addition & 1 deletion
@@ -397,7 +397,7 @@ Few pointers to get you started:
 - [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/pytorch/ignite/blob/master/examples/notebooks/FastaiLRFinder_MNIST.ipynb) [Basic example of LR finder on
 MNIST](https://github.com/pytorch/ignite/blob/master/examples/notebooks/FastaiLRFinder_MNIST.ipynb)
 - [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/pytorch/ignite/blob/master/examples/notebooks/Cifar100_bench_amp.ipynb) [Benchmark mixed precision training on Cifar100:
-torch.cuda.amp vs nvidia/apex](https://github.com/pytorch/ignite/blob/master/examples/notebooks/Cifar100_bench_amp.ipynb)
+torch.amp vs nvidia/apex](https://github.com/pytorch/ignite/blob/master/examples/notebooks/Cifar100_bench_amp.ipynb)
 - [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/pytorch/ignite/blob/master/examples/notebooks/MNIST_on_TPU.ipynb) [MNIST training on a single
 TPU](https://github.com/pytorch/ignite/blob/master/examples/notebooks/MNIST_on_TPU.ipynb)
 - [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1E9zJrptnLJ_PKhmaP5Vhb6DTVRvyrKHx) [CIFAR10 Training on multiple TPUs](https://github.com/pytorch/ignite/tree/master/examples/cifar10)

docs/source/conf.py

Lines changed: 1 addition & 1 deletion
@@ -354,7 +354,7 @@ def run(self):
     ("py:class", "torch.optim.optimizer.Optimizer"),
     ("py:class", "torch.utils.data.dataset.Dataset"),
     ("py:class", "torch.utils.data.sampler.BatchSampler"),
-    ("py:class", "torch.cuda.amp.grad_scaler.GradScaler"),
+    ("py:class", "torch.amp.grad_scaler.GradScaler"),
     ("py:class", "torch.optim.lr_scheduler._LRScheduler"),
     ("py:class", "torch.optim.lr_scheduler.LRScheduler"),
     ("py:class", "torch.utils.data.dataloader.DataLoader"),

examples/notebooks/CycleGAN_with_torch_cuda_amp.ipynb

Lines changed: 12 additions & 13 deletions
@@ -875,10 +875,10 @@
 "As suggested, we divide the objective by 2 while optimizing D, which slows down the rate at which D learns, relative to the rate of G. \n",
 "\n",
 "According to the paper:\n",
-"- generator A is trained minimize $\\text{mean}_{x \\in A}[(D_B(G(x)) − 1)^2]$ and cycle loss $\\text{mean}_{x \\in A}\\left[ |F(G(x)) - x|_1 \\right]$\n",
-"- generator B is trained minimize $\\text{mean}_{y \\in B}[(D_A(F(y)) − 1)^2]$ and cycle loss $\\text{mean}_{y \\in B}\\left[ |G(F(y)) - y|_1 \\right]$\n",
-"- discriminators A is trained to minimize $\\text{mean}_{x \\in A}[(D_A(x) − 1)^2] + \\text{mean}_{y \\in B}[D_A(F(y))^2]$.\n",
-"- discriminator B is trained to minimize $\\text{mean}_{y \\in B}[(D_B(y) − 1)^2] + \\text{mean}_{x \\in A}[D_B(G(x))^2]$."
+"- generator A is trained minimize $\\text{mean}_{x \\in A}[(D_B(G(x)) \u2212 1)^2]$ and cycle loss $\\text{mean}_{x \\in A}\\left[ |F(G(x)) - x|_1 \\right]$\n",
+"- generator B is trained minimize $\\text{mean}_{y \\in B}[(D_A(F(y)) \u2212 1)^2]$ and cycle loss $\\text{mean}_{y \\in B}\\left[ |G(F(y)) - y|_1 \\right]$\n",
+"- discriminators A is trained to minimize $\\text{mean}_{x \\in A}[(D_A(x) \u2212 1)^2] + \\text{mean}_{y \\in B}[D_A(F(y))^2]$.\n",
+"- discriminator B is trained to minimize $\\text{mean}_{y \\in B}[(D_B(y) \u2212 1)^2] + \\text{mean}_{x \\in A}[D_B(G(x))^2]$."
 ]
 },
 {
@@ -887,7 +887,7 @@
 "id": "JE8dLeEfIl_Z"
 },
 "source": [
-"We will use [`torch.amp.autocast`](https://pytorch.org/docs/master/amp.html#torch.amp.autocast) and [`torch.cuda.amp.GradScaler`](https://pytorch.org/docs/master/amp.html#torch.cuda.amp.GradScaler) to perform automatic mixed precision training. Our code follows a [typical mixed precision training example](https://pytorch.org/docs/master/notes/amp_examples.html#typical-mixed-precision-training)."
+"We will use [`torch.amp.autocast`](https://pytorch.org/docs/master/amp.html#torch.amp.autocast) and [`torch.amp.GradScaler`](https://pytorch.org/docs/master/amp.html#torch.amp.GradScaler) to perform automatic mixed precision training. Our code follows a [typical mixed precision training example](https://pytorch.org/docs/master/notes/amp_examples.html#typical-mixed-precision-training)."
 ]
 },
 {
@@ -896,8 +896,7 @@
 "id": "vrJls4p-FRcA"
 },
 "source": [
-"from torch.cuda.amp import GradScaler\n",
-"from torch.amp import autocast\n",
+"from torch.amp import autocast, GradScaler\n",
 "\n",
 "from ignite.utils import convert_tensor\n",
 "import torch.nn.functional as F\n",
@@ -924,7 +923,7 @@
 "\n",
 "\n",
 "def compute_loss_discriminator(decision_real, decision_fake):\n",
-" # loss = mean (D_b(y) − 1)^2 + mean D_b(G(x))^2 \n",
+" # loss = mean (D_b(y) \u2212 1)^2 + mean D_b(G(x))^2 \n",
 " loss = F.mse_loss(decision_fake, torch.zeros_like(decision_fake))\n",
 " loss += F.mse_loss(decision_real, torch.ones_like(decision_real))\n",
 " return loss\n",
@@ -954,10 +953,10 @@
 " decision_fake_b = discriminator_B(fake_b)\n",
 "\n",
 " # Compute loss for generators and update generators\n",
-" # loss_a2b = GAN loss: mean (D_b(G(x)) − 1)^2 + Forward cycle loss: || F(G(x)) - x ||_1 \n",
+" # loss_a2b = GAN loss: mean (D_b(G(x)) \u2212 1)^2 + Forward cycle loss: || F(G(x)) - x ||_1 \n",
 " loss_a2b = compute_loss_generator(decision_fake_b, real_a, rec_a, lambda_value) \n",
 "\n",
-" # loss_b2a = GAN loss: mean (D_a(F(x)) − 1)^2 + Backward cycle loss: || G(F(y)) - y ||_1\n",
+" # loss_b2a = GAN loss: mean (D_a(F(x)) \u2212 1)^2 + Backward cycle loss: || G(F(y)) - y ||_1\n",
 " loss_b2a = compute_loss_generator(decision_fake_a, real_b, rec_b, lambda_value)\n",
 "\n",
 " # total generators loss:\n",
@@ -977,10 +976,10 @@
 " decision_real_a, decision_fake_a = discriminator_forward_pass(discriminator_A, real_a, fake_a.detach(), fake_a_buffer) \n",
 " decision_real_b, decision_fake_b = discriminator_forward_pass(discriminator_B, real_b, fake_b.detach(), fake_b_buffer) \n",
 " # Compute loss for discriminators and update discriminators\n",
-" # loss_a = mean (D_a(y) − 1)^2 + mean D_a(F(x))^2\n",
+" # loss_a = mean (D_a(y) \u2212 1)^2 + mean D_a(F(x))^2\n",
 " loss_a = compute_loss_discriminator(decision_real_a, decision_fake_a)\n",
 "\n",
-" # loss_b = mean (D_b(y) − 1)^2 + mean D_b(G(x))^2\n",
+" # loss_b = mean (D_b(y) \u2212 1)^2 + mean D_b(G(x))^2\n",
 " loss_b = compute_loss_discriminator(decision_real_b, decision_fake_b)\n",
 " \n",
 " # total discriminators loss:\n",
@@ -1578,4 +1577,4 @@
 "outputs": []
 }
 ]
-}
+}
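For context, the notebook's updated import (`from torch.amp import autocast, GradScaler`) feeds the usual mixed-precision training pattern linked in the markdown cell above. A minimal, self-contained sketch of that pattern, using a placeholder model and random data rather than the CycleGAN networks from the notebook:

```python
import torch
from torch import nn
from torch.amp import autocast, GradScaler  # single import, as in the updated notebook cell

# Placeholder model, optimizer and data; this only illustrates the generic
# autocast + GradScaler loop, not the CycleGAN-specific update function.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = nn.Linear(16, 1).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()
scaler = GradScaler(device, enabled=(device == "cuda"))

for _ in range(3):
    inputs = torch.randn(8, 16, device=device)
    targets = torch.randn(8, 1, device=device)
    optimizer.zero_grad()
    # Forward pass under autocast: eligible ops run in reduced precision.
    with autocast(device_type=device, enabled=(device == "cuda")):
        outputs = model(inputs)
        loss = loss_fn(outputs, targets)
    # Scale the loss before backward, step through the scaler so gradients are
    # unscaled and inf/nan steps are skipped, then let update() adjust the scale.
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
```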
