# AWS Deep Learning Containers for PyTorch 2.4 Graviton on EC2, ECS, and EKS

[AWS Deep Learning Containers](https://aws.amazon.com/machine-learning/containers/) (DLC) for Amazon Elastic Kubernetes Service (EKS), Amazon Elastic Compute Cloud (EC2), and Amazon Elastic Container Service (ECS) are now available for [Graviton](https://aws.amazon.com/ec2/graviton/) instance types with support for PyTorch 2.4.

This release includes container images for inference on CPU and GPU, optimized for performance and scale on AWS. The CPU image has been tested with each of the EC2, ECS, and EKS services, while the GPU image supports EC2 only (see the table below). The GPU image provides stable versions of NVIDIA CUDA, cuDNN, NCCL, and other components. All software components in these images are scanned for security vulnerabilities and updated or patched in accordance with AWS security best practices.

|                          | EC2       | ECS           | EKS           |
| ------------------------ | --------- | ------------- | ------------- |
| Graviton CPU             | Supported | Supported     | Supported     |
| Graviton with NVIDIA GPU | Supported | Not Supported | Not Supported |

## Release Notes

- Introduced containers for PyTorch 2.4 inference supporting EC2, ECS, and EKS on Graviton instances. For details about this release, check out our GitHub release tags: [for CPU](https://github.com/aws/deep-learning-containers/releases/tag/v1.0-pt-graviton-ec2-2.4.0-inf-cpu-py311) and [for GPU](https://github.com/aws/deep-learning-containers/releases/tag/v1.0-pt-graviton-ec2-2.4.0-inf-gpu-py311).
- TorchServe version: 0.11.1
- 11/01/24: Updated TorchServe to 0.12.0 (release tags: [for CPU](https://github.com/aws/deep-learning-containers/releases/tag/v1.5-pt-graviton-ec2-2.4.0-inf-cpu-py311) and [for GPU](https://github.com/aws/deep-learning-containers/releases/tag/v1.4-pt-graviton-ec2-2.4.0-inf-gpu-py311)). A quick TorchServe health-check sketch follows this list.
- The GPU image is the first DLC to support the Graviton (ARM64) + GPU platform. It should be used with the [G5g instance type](https://aws.amazon.com/ec2/instance-types/g5g/), which is powered by Graviton CPUs and NVIDIA T4G Tensor Core GPUs.
- Please refer to the official [PyTorch 2.4 release notes](https://github.com/pytorch/pytorch/releases/tag/v2.4.0) for framework updates.

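The inference images ship with TorchServe. Once TorchServe is running inside a container with its default configuration (an assumption here; the default inference API listens on port 8080 and must be published to the host), a minimal health check might look like the sketch below:

```python
# Hedged health-check sketch against TorchServe's inference API.
# Assumes the container publishes TorchServe's default inference port (8080)
# to localhost; adjust host/port to match how you started the container.
import json
import urllib.request

with urllib.request.urlopen("http://localhost:8080/ping", timeout=5) as resp:
    print(json.loads(resp.read()))  # expected: {"status": "Healthy"}
```
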
## Performance Improvements

These DLCs continue to deliver the best performance on Graviton CPUs for BERT and RoBERTa sentiment analysis and fill-mask models, making Graviton3 the most cost-effective CPU platform on the AWS cloud for these models. For more information, please refer to the [Graviton PyTorch User Guide](https://github.com/aws/aws-graviton-getting-started/blob/main/machinelearning/pytorch.md).

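The Graviton PyTorch User Guide linked above covers CPU tuning options such as bfloat16 fast math on Graviton3 via oneDNN. The snippet below is a hedged sketch of that idea, not the benchmark AWS used: the environment variable is a oneDNN knob described in the guide, and the toy model and tensor shapes are purely illustrative.

```python
# Hedged sketch: enable oneDNN bf16 fast math (per the Graviton PyTorch User Guide)
# and time a toy model. Set the env var before the first inference so oneDNN picks it up.
import os
os.environ["DNNL_DEFAULT_FPMATH_MODE"] = "BF16"  # Graviton3 bf16 fast-math knob (assumption: see the guide)

import time
import torch

model = torch.nn.Sequential(
    torch.nn.Linear(768, 3072),
    torch.nn.GELU(),
    torch.nn.Linear(3072, 768),
).eval()
x = torch.randn(32, 128, 768)

with torch.inference_mode():
    model(x)  # warm-up
    start = time.perf_counter()
    for _ in range(10):
        model(x)
    print(f"avg latency: {(time.perf_counter() - start) / 10 * 1e3:.1f} ms")
```
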
## Security Advisory

AWS recommends that customers monitor critical security updates in the [AWS Security Bulletin](https://aws.amazon.com/security/security-bulletins/).

## Python 3.11 Support

Python 3.11 is supported in the PyTorch Graviton Inference containers.

## CPU Instance Type Support

The containers support the Graviton CPU instance types available under each of the services mentioned above.

## GPU Instance Type Support

The containers support the Graviton GPU instance type G5g and contain the following software components for GPU support (a quick verification sketch follows the list):

- CUDA 12.4.0
- cuDNN 9.1.0.70+cuda12.4
- NCCL 2.20.5+cuda12.4

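To confirm that the GPU stack in the GPU image is visible to PyTorch on a G5g instance, a check along these lines can be run inside the container (a sketch; the printed versions should line up with the components listed above):

```python
# Sanity check of the CUDA/cuDNN/NCCL stack as seen by PyTorch.
import torch

print(torch.__version__)                 # expect 2.4.x
print(torch.version.cuda)                # CUDA version PyTorch was built against
print(torch.backends.cudnn.version())    # e.g. 90100 for cuDNN 9.1.x
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0)) # GPU model, e.g. the T4G on G5g
    print(torch.cuda.nccl.version())     # NCCL version tuple
```
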
## Build and Test

- Built on: c6g.2xlarge
- Tested on: c7g.4xlarge, c6g.4xlarge, t4g.2xlarge, r6g.2xlarge, m6g.4xlarge, g5g.4xlarge
- Tested with the [MNIST](http://yann.lecun.com/exdb/mnist/) dataset and ResNet50/DenseNet models on EC2, the ECS AMI (Amazon Linux AMI 2.0.20220822 arm64), and the EKS AMI (1.25.6-20230304 arm64). A minimal smoke-test sketch follows this list.

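For a rough local reproduction of the ResNet50 smoke test, the following sketch can be run inside the inference container; it assumes torchvision is available in the image and uses randomly initialized weights so no dataset download is needed:

```python
# Minimal ResNet50 inference smoke test (illustrative only, not the DLC test suite).
import torch
import torchvision

model = torchvision.models.resnet50(weights=None).eval()  # random weights, no download
with torch.inference_mode():
    out = model(torch.randn(1, 3, 224, 224))
print(out.shape)  # torch.Size([1, 1000])
```
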
## Known Issues

- There is no official [Triton](https://github.com/triton-lang/triton) distribution for ARM64/aarch64 yet, so some torch.compile workloads will fail with the error below (a possible workaround is sketched after the error output):

```
torch._dynamo.exc.BackendCompilerFailed: backend='inductor' raised:
RuntimeError: Cannot find a working triton installation. More information on installing Triton can be found at https://github.com/openai/triton
```

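Until an aarch64 Triton build ships in these images, one possible workaround (a sketch, trading Inductor's codegen speedups for compatibility) is to point torch.compile at a backend that does not need Triton, or to skip compilation and run the model eagerly:

```python
# Hedged workaround sketch: use a torch.compile backend with no Triton dependency.
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
model = torch.nn.Linear(16, 16).to(device).eval()
x = torch.randn(4, 16, device=device)

compiled = torch.compile(model, backend="aot_eager")  # avoids the Triton-backed Inductor path
with torch.inference_mode():
    print(compiled(x).shape)
```
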
For the latest updates, please refer to the [aws/deep-learning-containers GitHub repo](https://github.com/aws/deep-learning-containers/tags).