
Commit ada3460

Merge branch 'master' into hf-pt-2-7-tr4-55-0-training

2 parents: d62505f + 7dcbba3

File tree

8 files changed: +71 / -20 lines


available_images.md

Lines changed: 4 additions & 0 deletions
@@ -366,8 +366,12 @@ Note: Starting from Neuron SDK 2.17.0, Dockerfiles for PyTorch Neuron Containers
 
 | Framework | Neuron Package | Neuron SDK Version | Job Type | Supported EC2 Instance Types | Python Version Options | Example URL |
 |-----------|----------------|--------------------|----------|------------------------------|------------------------|-------------|
+| [PyTorch 2.8.0](https://github.com/aws-neuron/deep-learning-containers/blob/2.26.0/docker/pytorch/inference/2.8.0/Dockerfile.neuronx) | torch-neuronx, neuronx_distributed, neuronx_distributed_inference | Neuron 2.26.0 | inference | trn1,trn2,inf2 | 3.11 (py311) | 763104351884.dkr.ecr.us-west-2.amazonaws.com/pytorch-inference-neuronx:2.8.0-neuronx-py311-sdk2.26.0-ubuntu22.04 |
+| [PyTorch 2.8.0](https://github.com/aws-neuron/deep-learning-containers/blob/2.26.0/docker/pytorch/training/2.8.0/Dockerfile.neuronx) | torch-neuronx, neuronx_distributed, neuronx_distributed_training | Neuron 2.26.0 | training | trn1,trn2,inf2 | 3.11 (py311) | 763104351884.dkr.ecr.us-west-2.amazonaws.com/pytorch-training-neuronx:2.8.0-neuronx-py311-sdk2.26.0-ubuntu22.04 |
 | [PyTorch 2.7.0](https://github.com/aws-neuron/deep-learning-containers/blob/2.25.0/docker/pytorch/inference/2.7.0/Dockerfile.neuronx) | torch-neuronx, transformers-neuronx, neuronx_distributed, neuronx_distributed_inference | Neuron 2.25.0 | inference | trn1,trn2,inf2 | 3.10 (py310) | 763104351884.dkr.ecr.us-west-2.amazonaws.com/pytorch-inference-neuronx:2.7.0-neuronx-py310-sdk2.25.0-ubuntu22.04 |
 | [PyTorch 2.7.0](https://github.com/aws-neuron/deep-learning-containers/blob/2.25.0/docker/pytorch/training/2.7.0/Dockerfile.neuronx) | torch-neuronx, transformers-neuronx, neuronx_distributed, neuronx_distributed_training | Neuron 2.25.0 | training | trn1,trn2,inf2 | 3.10 (py310) | 763104351884.dkr.ecr.us-west-2.amazonaws.com/pytorch-training-neuronx:2.7.0-neuronx-py310-sdk2.25.0-ubuntu22.04 |
+| [PyTorch 2.7.0](https://github.com/aws-neuron/deep-learning-containers/blob/2.24.1/docker/pytorch/inference/2.7.0/Dockerfile.neuronx) | torch-neuronx, transformers-neuronx, neuronx_distributed, neuronx_distributed_inference | Neuron 2.24.1 | inference | trn1,trn2,inf2 | 3.10 (py310) | 763104351884.dkr.ecr.us-west-2.amazonaws.com/pytorch-inference-neuronx:2.7.0-neuronx-py310-sdk2.24.1-ubuntu22.04 |
+| [PyTorch 2.7.0](https://github.com/aws-neuron/deep-learning-containers/blob/2.24.1/docker/pytorch/training/2.7.0/Dockerfile.neuronx) | torch-neuronx, transformers-neuronx, neuronx_distributed, neuronx_distributed_training | Neuron 2.24.1 | training | trn1,trn2,inf2 | 3.10 (py310) | 763104351884.dkr.ecr.us-west-2.amazonaws.com/pytorch-training-neuronx:2.7.0-neuronx-py310-sdk2.24.1-ubuntu22.04 |
 | [PyTorch 2.6.0](https://github.com/aws-neuron/deep-learning-containers/blob/2.23.0/docker/pytorch/inference/2.6.0/Dockerfile.neuronx) | torch-neuronx, transformers-neuronx, neuronx_distributed, neuronx_distributed_inference | Neuron 2.23.0 | inference | trn1,trn2,inf2 | 3.10 (py310) | 763104351884.dkr.ecr.us-west-2.amazonaws.com/pytorch-inference-neuronx:2.6.0-neuronx-py310-sdk2.23.0-ubuntu22.04 |
 | [PyTorch 2.6.0](https://github.com/aws-neuron/deep-learning-containers/blob/2.23.0/docker/pytorch/training/2.6.0/Dockerfile.neuronx) | torch-neuronx, transformers-neuronx, neuronx_distributed, neuronx_distributed_training | Neuron 2.23.0 | training | trn1,trn2,inf2 | 3.10 (py310) | 763104351884.dkr.ecr.us-west-2.amazonaws.com/pytorch-training-neuronx:2.6.0-neuronx-py310-sdk2.23.0-ubuntu22.04 |
 | [PyTorch 2.5.1](https://github.com/aws-neuron/deep-learning-containers/blob/2.22.0/docker/pytorch/inference/2.5.1/Dockerfile.neuronx) | torch-neuronx, transformers-neuronx, neuronx_distributed, neuronx_distributed_inference | Neuron 2.22.0 | inference | trn1,trn2,inf2 | 3.10 (py310) | 763104351884.dkr.ecr.us-west-2.amazonaws.com/pytorch-inference-neuronx:2.5.1-neuronx-py310-sdk2.22.0-ubuntu22.04 |
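
For reference, pulling one of the newly listed images follows the usual AWS DLC flow: authenticate Docker against the DLC ECR registry, then pull by the tag shown in the table. This is a minimal sketch, assuming AWS CLI credentials with ECR read access in us-west-2; swap in any other URI from the table as needed.

```
# Authenticate Docker to the AWS Deep Learning Containers registry (requires AWS CLI credentials)
aws ecr get-login-password --region us-west-2 \
  | docker login --username AWS --password-stdin 763104351884.dkr.ecr.us-west-2.amazonaws.com

# Pull the newly added PyTorch 2.8.0 Neuron training image (Neuron SDK 2.26.0, py311)
docker pull 763104351884.dkr.ecr.us-west-2.amazonaws.com/pytorch-training-neuronx:2.8.0-neuronx-py311-sdk2.26.0-ubuntu22.04
```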

huggingface/pytorch/training/docker/2.1/py3/sdk2.20.0/Dockerfile.neuronx.os_scan_allowlist.json

Lines changed: 29 additions & 0 deletions
@@ -845,6 +845,35 @@
         }
     ],
     "linux-libc-dev": [
+        {
+            "description": "In the Linux kernel, the following vulnerability has been resolved: of: module: add buffer overflow check in of_modalias(). In of_modalias(), if the buffer happens to be too small even for the 1st snprintf() call, the len parameter will become negative and str parameter (if not NULL initially) will point beyond the buffer's end. Add the buffer overflow check after the 1st snprintf() call and fix such check after the strlen() call (accounting for the terminating NUL char).",
+            "vulnerability_id": "CVE-2024-38541",
+            "name": "CVE-2024-38541",
+            "package_name": "linux-libc-dev",
+            "package_details": {
+                "file_path": null,
+                "name": "linux-libc-dev",
+                "package_manager": "OS",
+                "version": "5.4.0",
+                "release": "192.212"
+            },
+            "remediation": {
+                "recommendation": {
+                    "text": "None Provided"
+                }
+            },
+            "cvss_v3_score": 7.8,
+            "cvss_v30_score": 0.0,
+            "cvss_v31_score": 7.8,
+            "cvss_v2_score": 0.0,
+            "cvss_v3_severity": "HIGH",
+            "source_url": "https://ubuntu.com/security/CVE-2024-38541",
+            "source": "UBUNTU_CVE",
+            "severity": "CRITICAL",
+            "status": "ACTIVE",
+            "title": "CVE-2024-38541 - linux-libc-dev",
+            "reason_to_ignore": "N/A"
+        },
         {
             "description": "In the Linux kernel, the following vulnerability has been resolved: greybus: Fix use-after-free bug in gb_interface_release due to race condition. In gb_interface_create, &intf->mode_switch_completion is bound with gb_interface_mode_switch_work. Then it will be started by gb_interface_request_mode_switch. Here is the relevant code. if (!queue_work(system_long_wq, &intf->mode_switch_work)) { ... } If we call gb_interface_release to make cleanup, there may be an unfinished work. This function will call kfree to free the object \"intf\". However, if gb_interface_mode_switch_work is scheduled to run after kfree, it may cause use-after-free error as gb_interface_mode_switch_work will use the object \"intf\". The possible execution flow that may lead to the issue is as follows: CPU0 CPU1 | gb_interface_create | gb_interface_request_mode_switch gb_interface_release | kfree(intf) (free) | | gb_interface_mode_switch_work | mutex_lock(&intf->mutex) (use) Fix it by canceling the work before kfree.",
             "vulnerability_id": "CVE-2024-39495",
Lines changed: 5 additions & 1 deletion

@@ -1,3 +1,7 @@
 {
-    "70612": "In Jinja2, the from_string function is prone to Server Side Template Injection (SSTI) where it takes the \"source\" parameter as a template object, renders it, and then returns it. The attacker can exploit it with {{INJECTION COMMANDS}} in a URI. \r\nNOTE: The maintainer and multiple third parties believe that this vulnerability isn't valid because users shouldn't use untrusted templates without sandboxing."
+    "70612": "In Jinja2, the from_string function is prone to Server Side Template Injection (SSTI) where it takes the \"source\" parameter as a template object, renders it, and then returns it. The attacker can exploit it with {{INJECTION COMMANDS}} in a URI. \r\nNOTE: The maintainer and multiple third parties believe that this vulnerability isn't valid because users shouldn't use untrusted templates without sandboxing.",
+    "79077": "Affected versions of the h2 package are vulnerable to HTTP Request Smuggling due to improper validation of illegal characters in HTTP headers. The package allows CRLF characters to be injected into header names and values without proper sanitisation, which can cause request boundary manipulation when HTTP/2 requests are downgraded to HTTP/1.1 by downstream servers.",
+    "78828": "Affected versions of the PyTorch package are vulnerable to Denial of Service (DoS) due to improper handling in the MKLDNN pooling implementation. The torch.mkldnn_max_pool2d function fails to properly validate input parameters, allowing crafted inputs to trigger resource exhaustion or crashes in the underlying MKLDNN library. An attacker with local access can exploit this vulnerability by passing specially crafted tensor dimensions or parameters to the max pooling function, causing the application to become unresponsive or crash.",
+    "77744": "urllib3 is a user-friendly HTTP client library for Python. Prior to 2.5.0, it is possible to disable redirects for all requests by instantiating a PoolManager and specifying retries in a way that disable redirects. By default, requests and botocore users are not affected. An application attempting to mitigate SSRF or open redirect vulnerabilities by disabling redirects at the PoolManager level will remain vulnerable. This issue has been patched in version 2.5.0.",
+    "77745": "Urllib3 is a user-friendly HTTP client library for Python. Starting in version 2.2.0 and before 2.5.0, urllib3 does not control redirects in browsers and Node.js. urllib3 supports being used in a Pyodide runtime, utilizing the JavaScript Fetch API or falling back on XMLHttpRequest. This means Python libraries can be used to make HTTP requests from a browser or Node.js. Additionally, urllib3 provides a mechanism to control redirects, but the retries and redirect parameters are ignored with Pyodide; the runtime itself determines redirect behaviour. This issue has been patched in version 2.5.0."
 }

pytorch/inference/docker/2.6/py3/Dockerfile.arm64.cpu

Lines changed: 1 addition & 1 deletion
@@ -189,8 +189,8 @@ RUN chmod +x /usr/local/bin/dockerd-entrypoint.py
 
 # add telemetry
 COPY deep_learning_container.py /usr/local/bin/deep_learning_container.py
-COPY sitecustomize.py /usr/local/lib/${PYTHON_SHORT_VERSION}/sitecustomize.py
 RUN chmod +x /usr/local/bin/deep_learning_container.py
+# COPY sitecustomize.py /usr/local/lib/${PYTHON_SHORT_VERSION}/sitecustomize.py
 
 RUN HOME_DIR=/root \
  && curl -o ${HOME_DIR}/oss_compliance.zip https://aws-dlinfra-utilities.s3.amazonaws.com/oss_compliance.zip \

release_images_general.yml

Lines changed: 3 additions & 3 deletions
@@ -44,14 +44,14 @@ release_images:
       public_registry: True
   4:
     framework: "vllm"
-    version: "0.10.1"
+    version: "0.10.2"
     arch_type: "x86"
     customer_type: "ec2"
    general:
      device_types: [ "gpu" ]
      python_versions: [ "py312" ]
      os_version: "ubuntu22.04"
-      cuda_version: "cu128"
+      cuda_version: "cu129"
      example: False
      disable_sm_tag: False
      force_release: False
@@ -69,4 +69,4 @@ release_images:
      example: False
      disable_sm_tag: False
      force_release: False
-      public_registry: False
+      public_registry: True

vllm/CHANGELOG.md

Lines changed: 26 additions & 12 deletions
@@ -2,14 +2,28 @@
 
 All notable changes to vLLM Deep Learning Containers will be documented in this file.
 
+## [0.10.2] - 2025-09-18
+### Updated
+- vllm/vllm-openai version `v0.10.2`, see [release note](https://github.com/vllm-project/vllm/releases/tag/v0.10.2) for details.
+
+### Added
+- Introducing vLLM ARM64 support for AWS Graviton (g5g) with NVIDIA T4 GPUs, using XFormers/FlashInfer as attention backend and V0 engine for Turing architecture compatibility - [release tag](https://github.com/aws/deep-learning-containers/releases/tag/v1.1-vllm-arm64-ec2-0.10.2-gpu-py312)
+
+### Sample ECR URI
+```
+763104351884.dkr.ecr.us-west-2.amazonaws.com/vllm-arm64:0.10.2-gpu-py312-cu129-ubuntu22.04-ec2-v1.1
+763104351884.dkr.ecr.us-west-2.amazonaws.com/vllm:0.10.2-gpu-py312-cu129-ubuntu22.04-ec2-v1.0
+763104351884.dkr.ecr.us-east-1.amazonaws.com/vllm:0.10.2-gpu-py312-cu129-ubuntu22.04-ec2
+```
+
 ## [0.10.1] - 2025-08-25
 ### Updated
 - vllm/vllm-openai version `v0.10.1.1`, see [release note](https://github.com/vllm-project/vllm/releases/tag/v0.10.1.1) for details.
 - EFA installer version `1.43.2`
 ### Sample ECR URI
 ```
-763104351884.dkr.ecr.us-east-1.amazonaws.com/0.10-gpu-py312-ec2
-763104351884.dkr.ecr.us-east-1.amazonaws.com/0.10.1-gpu-py312-cu128-ubuntu22.04-ec2
+763104351884.dkr.ecr.us-east-1.amazonaws.com/vllm:0.10-gpu-py312-ec2
+763104351884.dkr.ecr.us-east-1.amazonaws.com/vllm:0.10.1-gpu-py312-cu128-ubuntu22.04-ec2
 ```
 
 ## [0.10.0] - 2025-08-04
@@ -18,17 +32,17 @@ All notable changes to vLLM Deep Learning Containers will be documented in this
 - EFA installer version `1.43.1`
 ### Sample ECR URI
 ```
-763104351884.dkr.ecr.us-east-1.amazonaws.com/0.10-gpu-py312-ec2
-763104351884.dkr.ecr.us-east-1.amazonaws.com/0.10.0-gpu-py312-cu128-ubuntu22.04-ec2
+763104351884.dkr.ecr.us-east-1.amazonaws.com/vllm:0.10-gpu-py312-ec2
+763104351884.dkr.ecr.us-east-1.amazonaws.com/vllm:0.10.0-gpu-py312-cu128-ubuntu22.04-ec2
 ```
 
 ## [0.9.2] - 2025-07-15
 ### Updated
 - vllm/vllm-openai version `v0.9.2`, see [release note](https://github.com/vllm-project/vllm/releases/tag/v0.9.2) for details.
 ### Sample ECR URI
 ```
-763104351884.dkr.ecr.us-east-1.amazonaws.com/0.9-gpu-py312-ec2
-763104351884.dkr.ecr.us-east-1.amazonaws.com/0.9.2-gpu-py312-cu128-ubuntu22.04-ec2
+763104351884.dkr.ecr.us-east-1.amazonaws.com/vllm:0.9-gpu-py312-ec2
+763104351884.dkr.ecr.us-east-1.amazonaws.com/vllm:0.9.2-gpu-py312-cu128-ubuntu22.04-ec2
 ```
 
 ## [0.9.1] - 2025-06-13
@@ -37,8 +51,8 @@ All notable changes to vLLM Deep Learning Containers will be documented in this
 - EFA installer version `1.42.0`
 ### Sample ECR URI
 ```
-763104351884.dkr.ecr.us-east-1.amazonaws.com/0.9-gpu-py312-ec2
-763104351884.dkr.ecr.us-east-1.amazonaws.com/0.9.1-gpu-py312-cu128-ubuntu22.04-ec2
+763104351884.dkr.ecr.us-east-1.amazonaws.com/vllm:0.9-gpu-py312-ec2
+763104351884.dkr.ecr.us-east-1.amazonaws.com/vllm:0.9.1-gpu-py312-cu128-ubuntu22.04-ec2
 ```
 
 
@@ -48,8 +62,8 @@ All notable changes to vLLM Deep Learning Containers will be documented in this
 - EFA installer version `1.41.0`
 ### Sample ECR URI
 ```
-763104351884.dkr.ecr.us-east-1.amazonaws.com/0.9-gpu-py312-ec2
-763104351884.dkr.ecr.us-east-1.amazonaws.com/0.9.0-gpu-py312-cu128-ubuntu22.04-ec2
+763104351884.dkr.ecr.us-east-1.amazonaws.com/vllm:0.9-gpu-py312-ec2
+763104351884.dkr.ecr.us-east-1.amazonaws.com/vllm:0.9.0-gpu-py312-cu128-ubuntu22.04-ec2
 ```
 
 ## [0.8.5] - 2025-06-02
@@ -59,6 +73,6 @@ All notable changes to vLLM Deep Learning Containers will be documented in this
 - EFA installer version `1.40.0`
 ### Sample ECR URI
 ```
-763104351884.dkr.ecr.us-east-1.amazonaws.com/0.8-gpu-py312-ec2
-763104351884.dkr.ecr.us-east-1.amazonaws.com/0.8.5-gpu-py312-cu128-ubuntu22.04-ec2
+763104351884.dkr.ecr.us-east-1.amazonaws.com/vllm:0.8-gpu-py312-ec2
+763104351884.dkr.ecr.us-east-1.amazonaws.com/vllm:0.8.5-gpu-py312-cu128-ubuntu22.04-ec2
 ```
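
As a usage sketch for the new 0.10.2 entry, and assuming the DLC keeps the OpenAI-compatible server entrypoint of its vllm/vllm-openai base image (see the vllm/x86_64/gpu/Dockerfile change below), the EC2 image can be launched as follows; the model name is only an illustrative placeholder, and the host needs NVIDIA GPUs plus the NVIDIA Container Toolkit:

```
# Start an OpenAI-compatible vLLM server from the 0.10.2 EC2 image (model name is an example)
docker run --rm --gpus all --ipc=host -p 8000:8000 \
  763104351884.dkr.ecr.us-east-1.amazonaws.com/vllm:0.10.2-gpu-py312-cu129-ubuntu22.04-ec2 \
  --model Qwen/Qwen2.5-0.5B-Instruct
```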

vllm/buildspec.yml

Lines changed: 2 additions & 2 deletions
@@ -2,7 +2,7 @@ account_id: &ACCOUNT_ID <set-$ACCOUNT_ID-in-environment>
 prod_account_id: &PROD_ACCOUNT_ID 763104351884
 region: &REGION <set-$REGION-in-environment>
 framework: &FRAMEWORK vllm
-version: &VERSION "0.10.1"
+version: &VERSION "0.10.2"
 short_version: &SHORT_VERSION "0.10"
 arch_type: &ARCH_TYPE x86_64
 autopatch_build: "False"
@@ -35,7 +35,7 @@ images:
     <<: *BUILD_CONTEXT
     image_size_baseline: 20000
     device_type: &DEVICE_TYPE gpu
-    cuda_version: &CUDA_VERSION cu128
+    cuda_version: &CUDA_VERSION cu129
     python_version: &DOCKER_PYTHON_VERSION py3
     tag_python_version: &TAG_PYTHON_VERSION py312
     os_version: &OS_VERSION ubuntu22.04

vllm/x86_64/gpu/Dockerfile

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
-FROM docker.io/vllm/vllm-openai:v0.10.1.1 as final
+FROM docker.io/vllm/vllm-openai:v0.10.2 as final
 ARG PYTHON="python3"
 ARG EFA_VERSION="1.43.2"
 LABEL maintainer="Amazon AI"
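
For a local build of this stage (a sketch only; the published images come from the buildspec-driven release pipeline rather than a plain docker build), the repository root can serve as the build context and the local tag is arbitrary:

```
# Build the x86_64 GPU vLLM image from the updated Dockerfile (local tag is hypothetical)
docker build -f vllm/x86_64/gpu/Dockerfile -t vllm-dlc:0.10.2-local .
```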
