Commit 36b503b

add release notes section
Signed-off-by: sirutBuasai <sirutbuasai27@outlook.com>
1 parent 38b3a22 commit 36b503b

49 files changed, +1614 -0 lines changed

docs/.nav.yml

Lines changed: 7 additions & 0 deletions
@@ -3,6 +3,13 @@ nav:
   - Getting Started:
     - get_started/index.md
     - Using Deep Learning Containers: get_started/using_dlcs.md
+  - Release Notes:
+    - releasenotes/index.md
+    - Base: releasenotes/base/index.md
+    - SGLang: releasenotes/sglang/index.md
+    - vLLM: releasenotes/vllm/index.md
+    - PyTorch: releasenotes/pytorch/index.md
+    - TensorFlow: releasenotes/tensorflow/index.md
   - Tutorials: tutorials
   - Reference:
     - Available Images: reference/available_images.md
Lines changed: 44 additions & 0 deletions
@@ -0,0 +1,44 @@
1+
# AWS Deep Learning Base Containers for EC2, ECS, EKS (CUDA 12.8)
2+
3+
[AWS Deep Learning Containers (DLCs)](https://aws.amazon.com/machine-learning/containers/) now include Base images, built on Ubuntu 24.04, that serve as a foundational layer for building machine learning environments on EC2, ECS, and EKS.
4+
5+
These Base DLCs package the essential deep learning components and dependencies without being tied to a specific framework implementation, providing users the flexibility to customize the DLCs with their preferred frameworks.
6+
7+
## Release Notes
8+
9+
- Development Tools: Includes curl, build-essential, cmake, and git
10+
- Python Environment: Python 3.12 with AWS CLI, boto3, and requests pre-installed
11+
- GPU Support: CUDA 12.8.1 with cuda-compat for backward compatibility
12+
- Neural Network Libraries: cuDNN 9.8.0.87 for deep neural network operations
13+
- Distributed Training: NCCL 2.26.2-1 for multi-GPU and multi-node communication
14+
- Network Performance: EFA 1.40.0 for low-latency network communications
15+
16+
## Security Advisory
17+
18+
AWS recommends that customers monitor critical security updates in the [AWS Security Bulletin](https://aws.amazon.com/security/security-bulletins/).
19+
20+
## Python Support
21+
22+
Python 3.12 is supported.
23+
24+
## GPU Instance Type Support
25+
26+
- CUDA 12.8
27+
- cuDNN 9.8.0.87
28+
- NCCL 2.26.2-1
29+
30+
## Example URL
31+
32+
```
33+
763104351884.dkr.ecr.us-east-1.amazonaws.com/base:12.8.1-gpu-py312-cu128-ubuntu24.04-ec2
34+
```
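The example URL above follows the usual Amazon ECR layout of `<account>.dkr.ecr.<region>.amazonaws.com/<repository>:<tag>`. As a minimal sketch, the following Python snippet splits such a URI into its parts. The names `ImageRef` and `parse_ecr_image_uri` are hypothetical helpers introduced here for illustration, and reading the tag as hyphen-separated fields is an assumption based on this one example rather than a documented contract.

```python
from typing import NamedTuple


class ImageRef(NamedTuple):
    """Components of an ECR image URI (hypothetical helper type)."""
    account: str
    region: str
    repository: str
    tag: str


def parse_ecr_image_uri(uri: str) -> ImageRef:
    """Split '<account>.dkr.ecr.<region>.amazonaws.com/<repo>:<tag>' into parts."""
    registry, _, rest = uri.partition("/")
    repository, _, tag = rest.partition(":")
    parts = registry.split(".")
    # Expected registry host: <account>.dkr.ecr.<region>.amazonaws.com
    if len(parts) != 6 or parts[1:3] != ["dkr", "ecr"]:
        raise ValueError(f"not an ECR registry host: {registry!r}")
    if not repository or not tag:
        raise ValueError(f"missing repository or tag in: {uri!r}")
    return ImageRef(account=parts[0], region=parts[3], repository=repository, tag=tag)


ref = parse_ecr_image_uri(
    "763104351884.dkr.ecr.us-east-1.amazonaws.com/base:12.8.1-gpu-py312-cu128-ubuntu24.04-ec2"
)
# Assumption: the tag fields are separated by hyphens, as in the example above.
tag_fields = ref.tag.split("-")
```

A helper like this can be handy when a script needs the region or the CUDA field of the tag, but the tag layout should be checked against the Available Images page before relying on it.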
35+
36+
## Build and Test
37+
38+
- Built on: c5.18xlarge
39+
- Tested on: p4d.24xlarge, p4de.24xlarge, p5.48xlarge
40+
- Tested with: [openclip](https://github.com/mlfoundations/open_clip), [nccl-tests](https://github.com/NVIDIA/nccl-tests)
41+
42+
## Known Issues
43+
44+
No known issues so far.
Lines changed: 44 additions & 0 deletions
@@ -0,0 +1,44 @@
1+
# AWS Deep Learning Base Containers for EC2, ECS, EKS (CUDA 12.9)
2+
3+
[AWS Deep Learning Containers (DLCs)](https://aws.amazon.com/machine-learning/containers/) now include Base images, built on Ubuntu 22.04, that serve as a foundational layer for building machine learning environments on EC2, ECS, and EKS.
4+
5+
These Base DLCs package the essential deep learning components and dependencies without being tied to a specific framework implementation, providing users the flexibility to customize the DLCs with their preferred frameworks.
6+
7+
## Release Notes
8+
9+
- Development Tools: Includes curl, build-essential, cmake, and git
10+
- Python Environment: Python 3.12 with AWS CLI, boto3, and requests pre-installed
11+
- GPU Support: CUDA 12.9.1 with cuda-compat for backward compatibility
12+
- Neural Network Libraries: cuDNN 9.10.2.21 for deep neural network operations
13+
- Distributed Training: NCCL 2.27.3-1 for multi-GPU and multi-node communication
14+
- Network Performance: EFA 1.43.1 for low-latency network communications
15+
16+
## Security Advisory
17+
18+
AWS recommends that customers monitor critical security updates in the [AWS Security Bulletin](https://aws.amazon.com/security/security-bulletins/).
19+
20+
## Python Support
21+
22+
Python 3.12 is supported.
23+
24+
## GPU Instance Type Support
25+
26+
- CUDA 12.9
27+
- cuDNN 9.10.2.21
28+
- NCCL 2.27.3-1
29+
30+
## Example URL
31+
32+
```
33+
763104351884.dkr.ecr.us-east-1.amazonaws.com/base:12.9.1-gpu-py312-cu129-ubuntu22.04-ec2
34+
```
35+
36+
## Build and Test
37+
38+
- Built on: c5.18xlarge
39+
- Tested on: p4d.24xlarge, p4de.24xlarge, p5.48xlarge
40+
- Tested with: [openclip](https://github.com/mlfoundations/open_clip), [nccl-tests](https://github.com/NVIDIA/nccl-tests)
41+
42+
## Known Issues
43+
44+
No known issues so far.
Lines changed: 44 additions & 0 deletions
@@ -0,0 +1,44 @@
1+
# AWS Deep Learning Base Containers for EC2, ECS, EKS (CUDA 13.0)
2+
3+
[AWS Deep Learning Containers (DLCs)](https://aws.amazon.com/machine-learning/containers/) now include Base images, built on Ubuntu 22.04, that serve as a foundational layer for building machine learning environments on EC2, ECS, and EKS.
4+
5+
These Base DLCs package the essential deep learning components and dependencies without being tied to a specific framework implementation, providing users the flexibility to customize the DLCs with their preferred frameworks.
6+
7+
## Release Notes
8+
9+
- Development Tools: Includes curl, build-essential, cmake, and git
10+
- Python Environment: Python 3.12 with AWS CLI, boto3, and requests pre-installed
11+
- GPU Support: CUDA 13.0.0 with cuda-compat for backward compatibility
12+
- Neural Network Libraries: cuDNN 9.13.0.50 for deep neural network operations
13+
- Distributed Training: NCCL 2.27.7-1 for multi-GPU and multi-node communication
14+
- Network Performance: EFA 1.44.0 for low-latency network communications
15+
16+
## Security Advisory
17+
18+
AWS recommends that customers monitor critical security updates in the [AWS Security Bulletin](https://aws.amazon.com/security/security-bulletins/).
19+
20+
## Python Support
21+
22+
Python 3.12 is supported.
23+
24+
## GPU Instance Type Support
25+
26+
- CUDA 13.0
27+
- cuDNN 9.13.0.50
28+
- NCCL 2.27.7-1
29+
30+
## Example URL
31+
32+
```
33+
763104351884.dkr.ecr.us-east-1.amazonaws.com/base:13.0.0-gpu-py312-cu130-ubuntu22.04-ec2
34+
```
35+
36+
## Build and Test
37+
38+
- Built on: c5.18xlarge
39+
- Tested on: p4d.24xlarge, p4de.24xlarge, p5.48xlarge
40+
- Tested with: [openclip](https://github.com/mlfoundations/open_clip), [nccl-tests](https://github.com/NVIDIA/nccl-tests)
41+
42+
## Known Issues
43+
44+
No known issues so far.

docs/releasenotes/base/index.md

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
1+
# Base Container Release Notes
2+
3+
Release notes for AWS Deep Learning Base Containers with CUDA support.
4+
5+
## CUDA 13.0
6+
7+
| Platform | Type | Link |
8+
| ------------- | ------- | --------------------------------------------------- |
9+
| EC2, ECS, EKS | General | [Base CUDA 13.0 on EC2, ECS, EKS](cuda-13.0-ec2.md) |
10+
11+
## CUDA 12.9
12+
13+
| Platform | Type | Link |
14+
| ------------- | ------- | --------------------------------------------------- |
15+
| EC2, ECS, EKS | General | [Base CUDA 12.9 on EC2, ECS, EKS](cuda-12.9-ec2.md) |
16+
17+
## CUDA 12.8
18+
19+
| Platform | Type | Link |
20+
| ------------- | ------- | --------------------------------------------------- |
21+
| EC2, ECS, EKS | General | [Base CUDA 12.8 on EC2, ECS, EKS](cuda-12.8-ec2.md) |

docs/releasenotes/index.md

Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
1+
# Release Notes
2+
3+
This section contains release notes for AWS Deep Learning Containers, organized by framework.
4+
5+
## Frameworks
6+
7+
- [Base](base/index.md) - Release notes for Base CUDA containers
8+
- [SGLang](sglang/index.md) - Release notes for SGLang inference containers
9+
- [vLLM](vllm/index.md) - Release notes for vLLM inference containers
10+
- [PyTorch](pytorch/index.md) - Release notes for PyTorch containers
11+
- [TensorFlow](tensorflow/index.md) - Release notes for TensorFlow containers
12+
13+
## Resources
14+
15+
- [Available Images](../reference/available_images.md)
16+
- [Support Policy](../reference/support_policy.md)
17+
- [GitHub Repository](https://github.com/aws/deep-learning-containers)
18+
- [Discussion Forum](https://repost.aws/tags/TAtQOYCNQXQAypuIl0ZxRowA/aws-deep-learning-containers)
Lines changed: 59 additions & 0 deletions
@@ -0,0 +1,59 @@
1+
# AWS Deep Learning Containers for PyTorch 2.4 Graviton on EC2, ECS, and EKS
2+
3+
[AWS Deep Learning Containers](https://aws.amazon.com/machine-learning/containers/) (DLCs) for Amazon Elastic Kubernetes Service (EKS), Amazon Elastic Compute Cloud (EC2), and Amazon Elastic Container Service (ECS) are now available for the [Graviton](https://aws.amazon.com/ec2/graviton/) instance type with support for PyTorch 2.4.
4+
5+
This release includes container images for inference on CPU and GPU, optimized for performance and scale on AWS. The CPU image has been tested with each of the EC2, ECS, and EKS services, while the GPU image only supports EC2 (see the table below). The GPU image provides stable versions of NVIDIA CUDA, cuDNN, NCCL, and other components. All software components in these images are scanned for security vulnerabilities and updated or patched in accordance with AWS Security best practices.
6+
7+
| | EC2 | ECS | EKS |
8+
| ------------------------ | --------- | ------------- | ------------- |
9+
| Graviton CPU | Supported | Supported | Supported |
10+
| Graviton with NVIDIA GPU | Supported | Not Supported | Not Supported |
11+
12+
## Release Notes
13+
14+
- Introduced containers for PyTorch 2.4 for inference supporting EC2, ECS, and EKS on Graviton instances. For details about this release, check out our GitHub release tags: [for CPU](https://github.com/aws/deep-learning-containers/releases/tag/v1.0-pt-graviton-ec2-2.4.0-inf-cpu-py311) and [for GPU](https://github.com/aws/deep-learning-containers/releases/tag/v1.0-pt-graviton-ec2-2.4.0-inf-gpu-py311).
15+
- TorchServe version: 0.11.1
16+
- 11/01/24: Updated TorchServe to 0.12.0 (release tags: [for CPU](https://github.com/aws/deep-learning-containers/releases/tag/v1.5-pt-graviton-ec2-2.4.0-inf-cpu-py311) and [for GPU](https://github.com/aws/deep-learning-containers/releases/tag/v1.4-pt-graviton-ec2-2.4.0-inf-gpu-py311))
17+
- The GPU image is the first DLC to support Graviton (ARM64) platforms with GPUs. It should be used with the [G5g instance type](https://aws.amazon.com/ec2/instance-types/g5g/), which is powered by Graviton CPUs and NVIDIA T4G Tensor Core GPUs.
18+
- Please refer to the official PyTorch 2.4 release notes [here](https://github.com/pytorch/pytorch/releases/tag/v2.4.0) for framework updates.
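Since these inference images ship TorchServe, a client typically sends requests to its inference API. The following is a minimal sketch, assuming TorchServe is listening on its default inference address (`http://localhost:8080`) and that a model has already been registered. The model name `my_model`, the JSON payload shape, and the helper names `build_prediction_request` and `predict` are hypothetical; the payload format in practice depends on the model's handler.

```python
import json
import urllib.request


def build_prediction_request(
    model_name: str,
    payload: dict,
    host: str = "http://localhost:8080",  # assumed TorchServe default inference address
) -> urllib.request.Request:
    """Build a POST request for TorchServe's /predictions/<model_name> endpoint."""
    url = f"{host}/predictions/{model_name}"
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(model_name: str, payload: dict, timeout: float = 10.0) -> bytes:
    """Send the request and return the raw response body (needs a running server)."""
    request = build_prediction_request(model_name, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


# Building the request does not touch the network; "my_model" is a placeholder.
example_request = build_prediction_request("my_model", {"text": "hello"})
```

Separating request construction from sending keeps the sketch checkable without a running container; `predict` is only meaningful once TorchServe is up and the model is loaded.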
19+
20+
## Performance Improvements
21+
22+
These DLCs continue to deliver the best performance on Graviton CPU for BERT and RoBERTa sentiment analysis and fill-mask models, making Graviton3 the most cost-effective CPU platform on the AWS cloud for these models. For more information, refer to the [Graviton PyTorch User Guide](https://github.com/aws/aws-graviton-getting-started/blob/main/machinelearning/pytorch.md).
23+
24+
## Security Advisory
25+
26+
AWS recommends that customers monitor critical security updates in the [AWS Security Bulletin](https://aws.amazon.com/security/security-bulletins/).
27+
28+
## Python 3.11 Support
29+
30+
Python 3.11 is supported in the PyTorch Graviton Inference containers.
31+
32+
## CPU Instance Type Support
33+
34+
The containers support the Graviton CPU instance types available on each of the services listed above.
35+
36+
## GPU Instance Type Support
37+
38+
The containers support the Graviton GPU instance type G5g and contain the following software components for GPU support:
39+
40+
- CUDA 12.4.0
41+
- cuDNN 9.1.0.70+cuda12.4
42+
- NCCL 2.20.5+cuda12.4
43+
44+
## Build and Test
45+
46+
- Built on: c6g.2xlarge
47+
- Tested on: c7g.4xlarge, c6g.4xlarge, t4g.2xlarge, r6g.2xlarge, m6g.4xlarge, g5g.4xlarge
48+
- Tested with the [MNIST](http://yann.lecun.com/exdb/mnist/) dataset and ResNet50/DenseNet models on EC2, ECS AMI (Amazon Linux AMI 2.0.20220822 arm64), and EKS AMI (1.25.6-20230304 arm64)
49+
50+
## Known Issues
51+
52+
- There is no official [Triton](https://github.com/triton-lang/triton) distribution for ARM64/aarch64 yet, so some torch.compile workloads will fail with:
53+
54+
```
55+
torch._dynamo.exc.BackendCompilerFailed: backend='inductor' raised:
56+
RuntimeError: Cannot find a working triton installation. More information on installing Triton can be found at https://github.com/openai/triton
57+
```
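One way to avoid the failure above is to choose the `torch.compile` backend based on whether Triton can be imported. The sketch below is an assumption-laden workaround, not an official recommendation: `pick_compile_backend` is a hypothetical helper, and it conservatively falls back to the `eager` backend (which skips code generation) whenever the `triton` module is missing, even for CPU-only workloads where the default `inductor` backend may not need Triton at all.

```python
import importlib.util


def pick_compile_backend(triton_module: str = "triton") -> str:
    """Return 'inductor' when Triton is importable, otherwise 'eager'.

    The check only looks for an installed module; it does not verify that
    Triton actually works on the current hardware.
    """
    if importlib.util.find_spec(triton_module) is not None:
        return "inductor"
    return "eager"


backend = pick_compile_backend()

# Usage inside the container (requires PyTorch, so it is left as a comment):
#   import torch
#   compiled_model = torch.compile(model, backend=backend)
```

Falling back to `eager` trades away the speedups of compiled kernels for predictable behavior, so it is best treated as a stopgap until an ARM64 Triton distribution is available.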
58+
59+
For the latest updates, please refer to the [aws/deep-learning-containers GitHub repo](https://github.com/aws/deep-learning-containers/tags).
Lines changed: 42 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,42 @@
1+
# AWS Deep Learning Containers for PyTorch 2.4 Graviton on SageMaker
2+
3+
[AWS Deep Learning Containers](https://aws.amazon.com/machine-learning/containers/) (DLCs) for Amazon SageMaker are now available for the [Graviton](https://aws.amazon.com/ec2/graviton/) instance type with support for PyTorch 2.4.
4+
5+
This release includes a container image for inference on CPU, optimized for performance and scale on AWS. This Docker image was tested on SageMaker. All software components in this image are scanned for security vulnerabilities and updated or patched in accordance with AWS Security best practices.
6+
7+
Please refer to the [SageMaker Graviton blog](https://aws.amazon.com/blogs/machine-learning/run-machine-learning-inference-workloads-on-aws-graviton-based-instances-with-amazon-sagemaker/) and the DLC [developer guide](https://docs.aws.amazon.com/dlami/latest/devguide/deep-learning-containers.html) for guidance on migrating deep learning workloads to Graviton instances.
8+
9+
## Release Notes
10+
11+
- Introduced a container for PyTorch 2.4 for inference supporting SageMaker services on Graviton instances. For details about this release, check out our GitHub [release tag](https://github.com/aws/deep-learning-containers/releases/tag/v1.0-pt-graviton-sagemaker-2.4.0-inf-cpu-py311).
12+
- TorchServe version: 0.11.1
13+
- 10/25/24: Updated TorchServe to 0.12.0 ([release tag](https://github.com/aws/deep-learning-containers/releases/tag/v1.3-pt-graviton-sagemaker-2.4.0-inf-cpu-py311))
14+
- Please refer to the official PyTorch 2.4 release notes [here](https://github.com/pytorch/pytorch/releases/tag/v2.4.0) for framework updates.
15+
16+
## Performance Improvements
17+
18+
These DLCs continue to deliver the best performance on Graviton for BERT and RoBERTa sentiment analysis and fill-mask models, making Graviton3 the most cost-effective CPU platform on the AWS cloud for these models. For more information, refer to the [Graviton PyTorch User Guide](https://github.com/aws/aws-graviton-getting-started/blob/main/machinelearning/pytorch.md).
19+
20+
## Security Advisory
21+
22+
AWS recommends that customers monitor critical security updates in the [AWS Security Bulletin](https://aws.amazon.com/security/security-bulletins/).
23+
24+
## Python 3.11 Support
25+
26+
Python 3.11 is supported in the PyTorch Graviton Inference containers.
27+
28+
## CPU Instance Type Support
29+
30+
The containers support Graviton CPU instance types supported under SageMaker.
31+
32+
## Build and Test
33+
34+
- Built on: c6g.2xlarge
35+
- Tested on: c7g.4xlarge, c6g.4xlarge, t4g.2xlarge, r6g.2xlarge, m6g.4xlarge
36+
- Tested with the [MNIST](http://yann.lecun.com/exdb/mnist/) dataset and ResNet50/DenseNet models on EC2, ECS AMI (Amazon Linux AMI 2.0.20220822 arm64), and EKS AMI (1.25.6-20230304 arm64)
37+
38+
## Known Issues
39+
40+
- None
41+
42+
For the latest updates, please refer to the [aws/deep-learning-containers GitHub repo](https://github.com/aws/deep-learning-containers/tags).
Lines changed: 41 additions & 0 deletions
@@ -0,0 +1,41 @@
1+
# AWS Deep Learning Containers for PyTorch 2.4 Inference on EC2, ECS and EKS
2+
3+
[AWS Deep Learning Containers](https://aws.amazon.com/machine-learning/containers/) (DLCs) for Amazon Elastic Compute Cloud (EC2), Amazon Elastic Container Service (ECS), and Amazon Elastic Kubernetes Service (EKS) are now available with PyTorch 2.4 and support for CUDA 12.4 on Ubuntu 22.04.
4+
5+
This release includes container images for inference on CPU and GPU, optimized for performance and scale on AWS. These Docker images have been tested with EC2, ECS and EKS services, and provide stable versions of NVIDIA CUDA, cuDNN, Intel MKL, and other components. All software components in these images are scanned for security vulnerabilities and updated or patched in accordance with AWS Security best practices. If you are looking for a DLC to use with SageMaker, please refer to [this documentation](https://github.com/aws/deep-learning-containers/blob/master/available_images.md#general-framework-containers-ec2-ecs-eks--sm-support).
6+
7+
## Release Notes
8+
9+
- Introduced containers for PyTorch 2.4.0 for inference supporting EC2, ECS, and EKS. For details about this release, check out our GitHub [release tag](https://github.com/aws/deep-learning-containers/releases/tag/v1.0-pt-ec2-2.4.0-inf-py311).
10+
- PyTorch 2.4 adds a Python custom operator API that lets users integrate custom kernels, such as Triton kernels, with torch.compile.
11+
- Please refer to the official PyTorch 2.4 release notes [here](https://github.com/pytorch/pytorch/releases/tag/v2.4.0) for the full description of updates.
12+
- Added Python 3.11 support
13+
- Added CUDA 12.4 support
14+
- Added Ubuntu 22.04 support
15+
- The Dockerfile for CPU can be found [here](https://github.com/aws/deep-learning-containers/blob/master/pytorch/inference/docker/2.4/py3/Dockerfile.cpu), and the Dockerfile for GPU can be found [here](https://github.com/aws/deep-learning-containers/blob/master/pytorch/inference/docker/2.4/py3/cu124/Dockerfile.gpu).
16+
17+
## Security Advisory
18+
19+
AWS recommends that customers monitor critical security updates in the [AWS Security Bulletin](https://aws.amazon.com/security/security-bulletins/).
20+
21+
## Python 3.11 Support
22+
23+
Python 3.11 is supported in the PyTorch Inference containers.
24+
25+
## CPU Instance Type Support
26+
27+
The containers support x86_64 CPU instance types.
28+
29+
## GPU Instance Type Support
30+
31+
The containers support GPU instance types and contain the following software components for GPU support:
32+
33+
- CUDA 12.4.1
34+
- cuDNN 9.1.0.70+cuda12.4
35+
- NCCL 2.22.3+cuda12.4
36+
37+
## Build and Test
38+
39+
- Built on: c5.18xlarge
40+
- Tested on: c5.18xlarge, g3.16xlarge, m5.16xlarge, t3.2xlarge, p3.16xlarge, p3dn.24xlarge, p4d.24xlarge, g4dn.xlarge
41+
- Tested with [MNIST](http://yann.lecun.com/exdb/mnist/) and ResNet50/ImageNet datasets on EC2, ECS AMI (Amazon Linux AMI 2.0.20221102), and EKS AMI (amazon-eks-gpu-node-1.25.16-20240307)
Lines changed: 43 additions & 0 deletions
@@ -0,0 +1,43 @@
1+
# AWS Deep Learning Containers for PyTorch 2.4 Inference on SageMaker
2+
3+
[AWS Deep Learning Containers](https://aws.amazon.com/machine-learning/containers/) (DLCs) for Amazon SageMaker are now available with PyTorch 2.4 and support for CUDA 12.4 on Ubuntu 22.04.
4+
5+
This release includes container images for inference on CPU and GPU, optimized for performance and scale on AWS. These Docker images have been tested with SageMaker services, and provide stable versions of NVIDIA CUDA, cuDNN, and other components. All software components in these images are scanned for security vulnerabilities and updated or patched in accordance with AWS Security best practices.
6+
7+
## Release Notes
8+
9+
- Introduced containers for PyTorch 2.4 for inference supporting SageMaker services. For details about this release, check out our GitHub [release tag](https://github.com/aws/deep-learning-containers/releases/tag/v1.0-pt-sagemaker-2.4.0-inf-py311).
10+
- PyTorch 2.4 adds a Python custom operator API that lets users integrate custom kernels, such as Triton kernels, with torch.compile.
11+
- Please refer to the official PyTorch 2.4 release notes [here](https://github.com/pytorch/pytorch/releases/tag/v2.4.0) for the full description of updates.
12+
- Added Python 3.11 support
13+
- Added CUDA 12.4 support
14+
- Added Ubuntu 22.04 support
15+
- Added TorchServe 0.11.1 support
16+
- 10/25/24: Updated TorchServe to 0.12.0 ([release tag](https://github.com/aws/deep-learning-containers/releases/tag/v1.1-pt-sagemaker-2.4.0-inf-py311))
17+
- The Dockerfile for CPU can be found [here](https://github.com/aws/deep-learning-containers/blob/master/pytorch/inference/docker/2.4/py3/Dockerfile.cpu), and the Dockerfile for GPU can be found [here](https://github.com/aws/deep-learning-containers/blob/master/pytorch/inference/docker/2.4/py3/cu124/Dockerfile.gpu).
18+
19+
## Security Advisory
20+
21+
AWS recommends that customers monitor critical security updates in the [AWS Security Bulletin](https://aws.amazon.com/security/security-bulletins/).
22+
23+
## Python 3.11 Support
24+
25+
Python 3.11 is supported in the PyTorch Inference containers.
26+
27+
## CPU Instance Type Support
28+
29+
The containers support x86_64 CPU instance types.
30+
31+
## GPU Instance Type Support
32+
33+
The containers support GPU instance types and contain the following software components for GPU support:
34+
35+
- CUDA 12.4.1
36+
- cuDNN 9.1.0.70+cuda12.4
37+
- NCCL 2.22.3+cuda12.4
38+
39+
## Build and Test
40+
41+
- Built on: c5.18xlarge
42+
- Tested on: c5.18xlarge, g3.16xlarge, m5.16xlarge, t3.2xlarge, p3.16xlarge, p3dn.24xlarge, p4d.24xlarge, g4dn.xlarge
43+
- Tested with [MNIST](http://yann.lecun.com/exdb/mnist/) and ResNet50/ImageNet datasets on EC2, ECS AMI (Amazon Linux AMI 2.0.20221102), and EKS AMI (amazon-eks-gpu-node-1.25.16-20240307)
