hf vllm 0.12.0 #5548
Open. fgbelidji wants to merge 45 commits into aws:master from fgbelidji:hf-vllm-0.12.0
Changes from 17 commits
Commits
af074e3 Added hf-vllm v0.12.0
d431ead Added tests for hf-vllm
9e1d266 Changed dlc_developer_config.toml
9e4d893 update conflict
8930020 Merge branch 'master' into hf-vllm-0.12.0 (DevakiBolleneni)
b84eab4 modify toml file to add huggingface-vllm
6099561 Merge branch 'master' into hf-vllm-0.12.0 (DevakiBolleneni)
81a94d9 updated buildspec following new pipeline creation
289fb12 Fix test role
8948619 added transformers version
198d432 Merge branch 'master' into hf-vllm-0.12.0 (fgbelidji)
7d0e3a7 fix region and suffix of base image
36e5adc fix suffix of base image
73dae49 fix repo name
23cdda6 Merge branch 'master' into hf-vllm-0.12.0 (fgbelidji)
5aa6216 reverted dlc_developer_config.toml
b36a91e Merge branch 'master' into hf-vllm-0.12.0 (DevakiBolleneni)
4e95e1f Merge branch 'master' into hf-vllm-0.12.0 (DevakiBolleneni)
ea680df huggingface_vllm in dlc_developer_config.toml
cf7f384 Merge branch 'master' into hf-vllm-0.12.0 (fgbelidji)
17d50f3 Renamed hf-vllm to vllm
fb2c4a3 renamed hf-vllm tests to vllm
f46e252 removed renamed folders
e120746 added conftest, utils, requirements, and updated text_vllm
9ae89fb changed testrunner so it won't skip hf-vllm tests
b2c4295 support for huggingface_vllm
7a5e5ba changed image_type buildspec
8f0092e enforce g6 instance
ee50601 fix instance
bae57d3 added local test
0e40bdd Merge branch 'master' into hf-vllm-0.12.0 (fgbelidji)
243eb2f Merge branch 'master' into hf-vllm-0.12.0
181d61b changed cuda compat logic
b5a604b updated test to sagemaker v3
b7769c3 Merge branch 'master' into hf-vllm-0.12.0 (fgbelidji)
8eb99bd Enable local tests for huggingface_vllm
2d993e1 Merge branch 'hf-vllm-0.12.0' of github.com:fgbelidji/deep-learning-c…
ea1f74c Add huggingface/vllm local mode tests with tiny-random-qwen3 model
83d5d3f Fix indentation error in __init__.py
524f3aa Download Qwen2.5-0.5B model at runtime for huggingface/vllm local tests
73f1d41 hf hub in requirements.txt
ee13c4f Trigger CI
66602d1 Merge branch 'master' into hf-vllm-0.12.0 (fgbelidji)
8b9347a Fix: use docker_image instead of ecr_image for local tests
9f42c8c Merge branch 'master' into hf-vllm-0.12.0 (fgbelidji)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,53 @@ | ||
| account_id: &ACCOUNT_ID <set-$ACCOUNT_ID-in-environment> | ||
| prod_account_id: &PROD_ACCOUNT_ID 763104351884 | ||
| region: &REGION <set-$REGION-in-environment> | ||
| base_framework: &BASE_FRAMEWORK vllm | ||
| framework: &FRAMEWORK !join [ "huggingface_", *BASE_FRAMEWORK] | ||
| version: &VERSION "0.12.0" | ||
| short_version: &SHORT_VERSION "0.12" | ||
| arch_type: &ARCH_TYPE x86_64 | ||
| autopatch_build: "False" | ||
|
|
||
| repository_info: | ||
| build_repository: &BUILD_REPOSITORY | ||
| image_type: &IMAGE_TYPE gpu | ||
| root: huggingface/hf-vllm | ||
| repository_name: &REPOSITORY_NAME !join [ "pr", "-", "huggingface", "-", *BASE_FRAMEWORK ] | ||
| repository: &REPOSITORY !join [ *ACCOUNT_ID, .dkr.ecr., *REGION, .amazonaws.com/, *REPOSITORY_NAME ] | ||
| release_repository_name: &RELEASE_REPOSITORY_NAME !join [ "huggingface", "-", *BASE_FRAMEWORK ] | ||
| release_repository: &RELEASE_REPOSITORY !join [ *PROD_ACCOUNT_ID, .dkr.ecr., *REGION, .amazonaws.com/, *RELEASE_REPOSITORY_NAME ] | ||
|
|
||
| context: | ||
| build_context: &BUILD_CONTEXT | ||
| deep_learning_container: | ||
| source: ../../src/deep_learning_container.py | ||
| target: deep_learning_container.py | ||
| cuda-compatibility-lib: | ||
| source: ../build_artifacts/inference/cuda-compatibility-lib.sh | ||
| target: cuda-compatibility-lib.sh | ||
|
|
||
|
|
||
| images: | ||
| BuildHuggingFaceVllmGpuPy312Cu129DockerImage: | ||
| <<: *BUILD_REPOSITORY | ||
| context: | ||
| <<: *BUILD_CONTEXT | ||
| image_size_baseline: 26000 | ||
| device_type: &DEVICE_TYPE gpu | ||
| cuda_version: &CUDA_VERSION cu129 | ||
| python_version: &DOCKER_PYTHON_VERSION py3 | ||
| tag_python_version: &TAG_PYTHON_VERSION py312 | ||
| os_version: &OS_VERSION ubuntu22.04 | ||
| transformers_version: &TRANSFORMERS_VERSION 4.57.3 | ||
| vllm_version: &VLLM_VERSION 0.12.0 | ||
| tag: !join [ "hf-vllm", "-", *VERSION, "-", *DEVICE_TYPE, "-", *TAG_PYTHON_VERSION, "-", *CUDA_VERSION, "-", *OS_VERSION, "-sagemaker" ] | ||
| latest_release_tag: !join [ "hf-vllm", "-", *VERSION, "-", *DEVICE_TYPE, "-", *TAG_PYTHON_VERSION, "-", *CUDA_VERSION, "-", *OS_VERSION, "-sagemaker" ] | ||
| docker_file: !join [ docker/, *SHORT_VERSION, /, *CUDA_VERSION, /Dockerfile ] | ||
| target: sagemaker | ||
| build: true | ||
| enable_common_stage_build: false | ||
| test_configs: | ||
| test_platforms: | ||
| - sanity | ||
| - security | ||
| - sagemaker |
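
The buildspec above assembles its image tag, Dockerfile path, and repository names from YAML anchors through a custom `!join` tag. The following is a minimal sketch of what those expressions would evaluate to, assuming `!join` simply concatenates its list items with no separator (the explicit `"-"` items in the buildspec suggest this, but the diff does not show the tag's implementation). The `join` helper is hypothetical and exists only for illustration.

```python
# Hypothetical stand-in for the buildspec's `!join` tag.
# Assumption: `!join` concatenates its items with no separator.
def join(parts):
    return "".join(str(p) for p in parts)


# Values copied from the anchors in the buildspec diff.
BASE_FRAMEWORK = "vllm"
VERSION = "0.12.0"
SHORT_VERSION = "0.12"
DEVICE_TYPE = "gpu"
TAG_PYTHON_VERSION = "py312"
CUDA_VERSION = "cu129"
OS_VERSION = "ubuntu22.04"

framework = join(["huggingface_", BASE_FRAMEWORK])
release_repository_name = join(["huggingface", "-", BASE_FRAMEWORK])
tag = join([
    "hf-vllm", "-", VERSION, "-", DEVICE_TYPE, "-", TAG_PYTHON_VERSION,
    "-", CUDA_VERSION, "-", OS_VERSION, "-sagemaker",
])
docker_file = join(["docker/", SHORT_VERSION, "/", CUDA_VERSION, "/Dockerfile"])

print(framework)                # huggingface_vllm
print(release_repository_name)  # huggingface-vllm
print(tag)                      # hf-vllm-0.12.0-gpu-py312-cu129-ubuntu22.04-sagemaker
print(docker_file)              # docker/0.12/cu129/Dockerfile
```

Under that assumption, the resolved `docker_file` is relative to the `root` entry of the buildspec, and `tag` and `latest_release_tag` resolve to the same string.
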
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,42 @@ | ||
| ARG FINAL_BASE_IMAGE=763104351884.dkr.ecr.us-west-2.amazonaws.com/vllm:0.12.0-gpu-py312-cu129-ubuntu22.04-sagemaker-v1.0 | ||
| FROM ${FINAL_BASE_IMAGE} AS vllm-base | ||
|
|
||
| LABEL maintainer="Amazon AI" | ||
| LABEL dlc_major_version="1" | ||
|
|
||
| ARG HUGGINGFACE_HUB_VERSION=0.36.0 | ||
| ARG HF_XET_VERSION=1.2.0 | ||
|
|
||
| RUN apt-get update -y \ | ||
| && apt-get install -y --no-install-recommends curl unzip \ | ||
| && rm -rf /var/lib/apt/lists/* | ||
|
|
||
|
|
||
| RUN pip install --upgrade pip && \ | ||
| pip install --no-cache-dir \ | ||
| huggingface-hub==${HUGGINGFACE_HUB_VERSION} \ | ||
| hf-xet==${HF_XET_VERSION} \ | ||
| grpcio | ||
|
|
||
|
|
||
| FROM vllm-base AS sagemaker | ||
| ENV HF_HUB_ENABLE_HF_TRANSFER="1" \ | ||
| HF_HUB_USER_AGENT_ORIGIN="aws:sagemaker:gpu-cuda:inference:hf-vllm" | ||
|
|
||
| COPY cuda-compatibility-lib.sh /usr/local/bin/cuda-compatibility-lib.sh | ||
| RUN chmod +x /usr/local/bin/cuda-compatibility-lib.sh | ||
|
|
||
| RUN set -eux; \ | ||
| HOME_DIR=/root; \ | ||
| uv pip install --system --upgrade pip requests PTable; \ | ||
| curl -o ${HOME_DIR}/oss_compliance.zip https://aws-dlinfra-utilities.s3.amazonaws.com/oss_compliance.zip; \ | ||
| unzip ${HOME_DIR}/oss_compliance.zip -d ${HOME_DIR}/; \ | ||
| cp ${HOME_DIR}/oss_compliance/test/testOSSCompliance /usr/local/bin/testOSSCompliance; \ | ||
| chmod +x /usr/local/bin/testOSSCompliance; \ | ||
| chmod +x ${HOME_DIR}/oss_compliance/generate_oss_compliance.sh; \ | ||
| ${HOME_DIR}/oss_compliance/generate_oss_compliance.sh ${HOME_DIR} python3; \ | ||
| rm -rf ${HOME_DIR}/oss_compliance* | ||
|
|
||
|
|
||
| ENTRYPOINT ["/usr/local/bin/sagemaker_entrypoint.sh"] | ||
|
|
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,14 @@ | ||
| import os | ||
|
|
||
| try: | ||
| if os.path.exists("/usr/local/bin/deep_learning_container.py") and ( | ||
| os.getenv("OPT_OUT_TRACKING") is None or os.getenv("OPT_OUT_TRACKING", "").lower() != "true" | ||
| ): | ||
| import threading | ||
|
|
||
| cmd = "python /usr/local/bin/deep_learning_container.py --framework huggingface_pytorch --framework-version 2.7.1 --container-type training &>/dev/null" | ||
| x = threading.Thread(target=lambda: os.system(cmd)) | ||
| x.setDaemon(True) | ||
| x.start() | ||
| except Exception: | ||
| pass |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,12 @@ | ||
| # telemetry.sh | ||
| #!/bin/bash | ||
| if [ -f /usr/local/bin/deep_learning_container.py ] && [[ -z "${OPT_OUT_TRACKING}" || "${OPT_OUT_TRACKING,,}" != "true" ]]; then | ||
| ( | ||
| python /usr/local/bin/deep_learning_container.py \ | ||
| --framework "hf-vllm" \ | ||
| --framework-version "0.12.0" \ | ||
| --container-type "inference" \ | ||
| &>/dev/null & | ||
| ) | ||
| fi | ||
|
|
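
Both telemetry hooks above (the Python one and `telemetry.sh`) gate on the same two conditions: the tracking script exists at `/usr/local/bin/deep_learning_container.py`, and `OPT_OUT_TRACKING` is either unset or anything other than a case-insensitive `true`. The sketch below restates that predicate as a pure function so the cases can be checked without touching the filesystem. The function name `telemetry_enabled` and its parameters are hypothetical and are not part of the PR.

```python
# Hypothetical restatement of the opt-out check used by both telemetry hooks.
# `env` is any mapping of environment variables; `script_present` stands in
# for the os.path.exists / `[ -f ... ]` test on deep_learning_container.py.
def telemetry_enabled(env, script_present):
    # Unset and empty values both count as "not opted out", matching the
    # `-z "${OPT_OUT_TRACKING}"` branch of the shell version.
    opted_out = env.get("OPT_OUT_TRACKING", "").lower() == "true"
    return script_present and not opted_out


print(telemetry_enabled({}, True))                            # True
print(telemetry_enabled({"OPT_OUT_TRACKING": "TRUE"}, True))  # False
print(telemetry_enabled({"OPT_OUT_TRACKING": "false"}, True)) # True
print(telemetry_enabled({}, False))                           # False
```

Written this way, the `is None` branch in the Python hook is redundant with the default passed to `os.getenv`, since an unset variable already compares unequal to `"true"`.
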
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,13 @@ | ||
| # Copyright 2019-2020 Amazon.com, Inc. or its affiliates. All Rights Reserved. | ||
| # | ||
| # Licensed under the Apache License, Version 2.0 (the "License"). You | ||
| # may not use this file except in compliance with the License. A copy of | ||
| # the License is located at | ||
| # | ||
| # http://aws.amazon.com/apache2.0/ | ||
| # | ||
| # or in the "license" file accompanying this file. This file is | ||
| # distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF | ||
| # ANY KIND, either express or implied. See the License for the specific | ||
| # language governing permissions and limitations under the License. | ||
| from __future__ import absolute_import |