| 1 | +# AWS Deep Learning Containers for PyTorch 2.4 Graviton on EC2, ECS, and EKS |
| 2 | + |
| 3 | +[AWS Deep Learning Containers](https://aws.amazon.com/machine-learning/containers/) (DLC) for Amazon Elastic Kubernetes Service (EKS), Amazon Elastic |
| 4 | +Compute Cloud (EC2), and Amazon Elastic Container Service (ECS) are now available for the [Graviton](https://aws.amazon.com/ec2/graviton/) instance |
| 5 | +type with support for PyTorch 2.4. |
| 6 | + |
| 7 | +This release includes container images for inference on CPU and GPU, optimized for performance and scale on AWS. The CPU image has been tested with |
| 8 | +each of the EC2, ECS, and EKS services, while the GPU image only supports EC2 (see the table below). The GPU image provides stable versions of NVIDIA |
| 9 | +CUDA, cuDNN, NCCL, and other components. All software components in these images are scanned for security vulnerabilities and updated or patched in |
| 10 | +accordance with AWS Security best practices. |
| 11 | + |
| 12 | +| | EC2 | ECS | EKS | |
| 13 | +| --- | --- | --- | --- | |
| 14 | +| Graviton CPU | Supported | Supported | Supported | |
| 15 | +| Graviton with NVIDIA GPU | Supported | Not Supported | Not Supported | |
| 16 | + |
| 17 | +## Release Notes |
| 18 | + |
| 19 | +- Introduced containers for PyTorch 2.4 for inference supporting EC2, ECS, and EKS on Graviton instances. For details about this release, check out |
| 20 | + our GitHub release tags: [for CPU](https://github.com/aws/deep-learning-containers/releases/tag/v1.0-pt-graviton-ec2-2.4.0-inf-cpu-py311) and |
| 21 | + [for GPU](https://github.com/aws/deep-learning-containers/releases/tag/v1.0-pt-graviton-ec2-2.4.0-inf-gpu-py311). |
| 22 | +- TorchServe version: 0.11.1 |
| 23 | +- 11/01/24: Updated TorchServe to 0.12.0 (release tags: |
| 24 | + [for CPU](https://github.com/aws/deep-learning-containers/releases/tag/v1.5-pt-graviton-ec2-2.4.0-inf-cpu-py311) and |
| 25 | + [for GPU](https://github.com/aws/deep-learning-containers/releases/tag/v1.4-pt-graviton-ec2-2.4.0-inf-gpu-py311)) |
| 26 | +- The GPU image is the first DLC to support Graviton (ARM64) platforms with GPUs. It should be used with the |
| 27 | + [G5g instance type](https://aws.amazon.com/ec2/instance-types/g5g/), which is powered by Graviton CPUs and NVIDIA T4G Tensor Core GPUs. |
| 28 | +- For framework updates, see the official [PyTorch 2.4 release notes](https://github.com/pytorch/pytorch/releases/tag/v2.4.0). |
| 29 | + |
| 30 | +## Performance Improvements |
| 31 | + |
| 32 | +These DLCs continue to deliver the best performance on Graviton CPU for BERT and RoBERTa sentiment analysis and fill-mask models, making Graviton3 the |
| 33 | +most cost-effective CPU platform on the AWS cloud for these models. For more information, see the |
| 34 | +[Graviton PyTorch User Guide](https://github.com/aws/aws-graviton-getting-started/blob/main/machinelearning/pytorch.md). |
| 35 | + |
| 36 | +## Security Advisory |
| 37 | + |
| 38 | +AWS recommends that customers monitor critical security updates in the [AWS Security Bulletin](https://aws.amazon.com/security/security-bulletins/). |
| 39 | + |
| 40 | +## Python 3.11 Support |
| 41 | + |
| 42 | +Python 3.11 is supported in the PyTorch Graviton Inference containers. |
| 43 | + |
| 44 | +## CPU Instance Type Support |
| 45 | + |
| 46 | +The containers support the Graviton CPU instance types available under each of the services listed above. |
| 47 | + |
| 48 | +## GPU Instance Type Support |
| 49 | + |
| 50 | +The containers support the Graviton GPU instance type G5g and contain the following software components for GPU support: |
| 51 | + |
| 52 | +- CUDA 12.4.0 |
| 53 | +- cuDNN 9.1.0.70+cuda12.4 |
| 54 | +- NCCL 2.20.5+cuda12.4 |
| 55 | + |
| 56 | +## Build and Test |
| 57 | + |
| 58 | +- Built on: c6g.2xlarge |
| 59 | +- Tested on: c7g.4xlarge, c6g.4xlarge, t4g.2xlarge, r6g.2xlarge, m6g.4xlarge, g5g.4xlarge |
| 60 | +- Tested with the [MNIST](http://yann.lecun.com/exdb/mnist/) dataset and ResNet50/DenseNet models on EC2, the ECS AMI (Amazon Linux AMI 2.0.20220822 arm64), and the EKS |
| 61 | + AMI (1.25.6-20230304 arm64) |
| 62 | + |
| 63 | +## Known Issues |
| 64 | + |
| 65 | +- There is no official [Triton](https://github.com/triton-lang/triton) distribution for ARM64/aarch64 yet, so some `torch.compile` workloads will fail |
| 66 | + with: |
| 67 | + |
| 68 | +``` |
| 69 | +torch._dynamo.exc.BackendCompilerFailed: backend='inductor' raised: |
| 70 | +RuntimeError: Cannot find a working triton installation. More information on installing Triton can be found at https://github.com/openai/triton |
| 71 | +``` |
| 72 | + |
| 73 | +For the latest updates, see the [aws/deep-learning-containers GitHub repo](https://github.com/aws/deep-learning-containers/tags). |
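
The Triton failure listed under Known Issues typically surfaces when a compiled function is first called, not when `torch.compile` wraps it. One way to keep an affected workload running is to catch that error and rerun the call eagerly. The sketch below is a minimal illustration and not part of the DLC: `call_with_eager_fallback` is a hypothetical helper, and matching on the word "triton" in the exception message is an assumption based on the error text shown in Known Issues.

```python
from typing import Any, Callable


def call_with_eager_fallback(
    compiled_fn: Callable[..., Any],
    eager_fn: Callable[..., Any],
    *args: Any,
    **kwargs: Any,
) -> Any:
    """Call compiled_fn; if it fails with a Triton-related error, call eager_fn.

    Hypothetical helper for illustration only. Matching on "triton" in the
    exception message is an assumption based on the error text in Known Issues.
    """
    try:
        return compiled_fn(*args, **kwargs)
    except Exception as exc:
        # Re-raise anything that does not look like the missing-Triton failure.
        if "triton" not in str(exc).lower():
            raise
        return eager_fn(*args, **kwargs)


# Intended use with PyTorch (not executed here; `model` and `batch` are placeholders):
#   compiled = torch.compile(model)
#   output = call_with_eager_fallback(compiled, model, batch)
```

This sketch retries the fallback on every failing call. A long-running inference service would probably record the first failure and use the eager model directly afterwards.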