[HuggingFace][Neuronx] Inference - Optimum Neuron 0.3.0 - Neuron sdk 2.24.1 - Transformers to 4.51.3 #5274
Merged
45 commits
68347f0
revertme: dlc developer config
JingyaHuang 47738d5
update neuronx dockerfile
JingyaHuang b6f5740
downgrade trfrs
JingyaHuang 35cd1fd
update tiny artifacts
JingyaHuang 80b2e0b
fix: remove transformer-neuronx
JingyaHuang 9d6ec3a
address comments
JingyaHuang 5cec25c
add pypi to extra index for networkx compatibility
JingyaHuang 7badb09
fix: unbuntu version tag
JingyaHuang ed97698
Merge branch 'master' into update-hf-pt2.3-inf
ahsan-z-khan 9878fe4
Merge branch 'master' into update-hf-pt2.3-inf
JingyaHuang 5df31f0
fix: tackle vulneralbility
JingyaHuang e43b179
Merge branch 'master' into update-hf-pt2.3-inf
ahsan-z-khan a6a5ce8
Merge branch 'master' into update-hf-pt2.3-inf
malav-shastri 288b282
add empty allowlists
JingyaHuang fced133
Remove empty allowlist
sirutBuasai 60ea52e
Merge branch 'master' into update-hf-pt2.3-inf
ahsan-z-khan 3c6f536
Merge branch 'master' into update-hf-pt2.3-inf
ahsan-z-khan ebf920e
update sdk to 2.24.1 and add allowlist images
ahsan-z-khan 934b9bc
Merge branch 'master' into update-hf-pt2.3-inf
ahsan-z-khan c49ea97
add suffix on apt libraries
ahsan-z-khan d239a0d
remove allowlist
ahsan-z-khan 452251c
remove suffix from installed_framework_version
ahsan-z-khan 753f743
remove transformers_neuronx as its not used anymore
ahsan-z-khan 192e5a4
add python allowlist
ahsan-z-khan 28f001a
Merge branch 'master' into update-hf-pt2.3-inf
ahsan-z-khan 9b4e932
fix: sentence trfrs no deps
JingyaHuang cd25836
format
ahsan-z-khan 3449bdb
Update Dockerfile.neuronx.py_scan_allowlist.json
ahsan-z-khan f3ea766
Merge branch 'master' into update-hf-pt2.3-inf
ahsan-z-khan 1efc5ac
Update Dockerfile.neuronx.py_scan_allowlist.json
ahsan-z-khan f645cdb
add no deps to peft
ahsan-z-khan f810f64
update allowlist for transformers
ahsan-z-khan 7f5ee91
change upgrade strategy
ahsan-z-khan 6fd0fac
Merge branch 'master' into update-hf-pt2.3-inf
ahsan-z-khan ebfcf1b
Merge branch 'master' into update-hf-pt2.3-inf
ahsan-z-khan fabfc8f
update req
ahsan-z-khan 36297c9
add no upgrade networkx
ahsan-z-khan 2bf751e
fix: sdxl compiled with bs=1
JingyaHuang c477273
Update Dockerfile.neuronx
arjraman 31b4727
Update Dockerfile.neuronx.py_scan_allowlist.json
arjraman a4af4b0
Update requirements.txt
arjraman e796825
Update Dockerfile.neuronx.py_scan_allowlist.json
arjraman 5ad59a1
Update Dockerfile.neuronx.py_scan_allowlist.json
arjraman 9cdcd76
Update dlc_developer_config.toml
ahsan-z-khan 071408d
Update dlc_developer_config.toml
ahsan-z-khan
196 changes: 196 additions & 0 deletions
huggingface/pytorch/inference/docker/2.7/py3/sdk2.24.1/Dockerfile.neuronx
Review comment: Overall, some of the pip commands can be combined to reduce layers.
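A minimal sketch of what this suggestion could look like, assuming the Neuron package install and the model-server install are the steps being merged (the comment does not say which pip commands it has in mind). Each `RUN` instruction produces one image layer, so chaining the two pip invocations with `&&` inside a single `RUN` removes a layer while leaving each command's flags and ordering unchanged. This is an illustration only, not the change made in the PR.

```dockerfile
# Sketch only: a hypothetical consolidation, not the Dockerfile merged in this PR.
# Assumes these ARGs are declared earlier in the file, as in Dockerfile.neuronx:
#   NEURONX_CC_VERSION, NEURONX_FRAMEWORK_VERSION,
#   NEURONX_DISTRIBUTED_VERSION, MMS_VERSION

# One RUN instruction yields one layer. Both pip invocations keep their
# original flags and order; only the instruction boundary between them goes away.
RUN pip install --index-url https://pip.repos.neuron.amazonaws.com \
        --extra-index-url https://pypi.org/simple \
        --trusted-host pip.repos.neuron.amazonaws.com \
        "neuronx-cc==${NEURONX_CC_VERSION}" \
        "torch-neuronx==${NEURONX_FRAMEWORK_VERSION}" \
        "neuronx_distributed==${NEURONX_DISTRIBUTED_VERSION}" \
    && pip install --no-cache-dir \
        "multi-model-server==${MMS_VERSION}" \
        sagemaker-inference
```

The trade-off is cache granularity: with a single `RUN`, a change to either package set rebuilds the whole step, whereas separate `RUN` instructions let Docker reuse the earlier layer. In the original file a `WORKDIR /` sits between these two steps; this sketch assumes it can be placed before or after the merged `RUN`, since neither pip command depends on the working directory.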
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,196 @@ | ||
| FROM ubuntu:22.04 | ||
|
|
||
| LABEL dlc_major_version="1" | ||
| LABEL maintainer="Amazon AI" | ||
| LABEL com.amazonaws.sagemaker.capabilities.accept-bind-to-port=true | ||
|
|
||
| ARG PYTHON=python3.10 | ||
| ARG PYTHON_VERSION=3.10.12 | ||
| ARG MMS_VERSION=1.1.11 | ||
| ARG MAMBA_VERSION=23.1.0-4 | ||
|
|
||
| # Neuron SDK components version numbers | ||
| ARG NEURONX_FRAMEWORK_VERSION=2.7.0.2.8.6734 | ||
| ARG NEURONX_DISTRIBUTED_VERSION=0.13.14393 | ||
| ARG NEURONX_CC_VERSION=2.19.8089.0 | ||
|
||
| ARG NEURONX_COLLECTIVES_LIB_VERSION=2.26.43.0-47cc904ea | ||
| ARG NEURONX_RUNTIME_LIB_VERSION=2.26.42.0-2ff3b5c7d | ||
| ARG NEURONX_TOOLS_VERSION=2.24.54.0 | ||
|
|
||
| # HF ARGS | ||
| ARG TRANSFORMERS_VERSION | ||
| ARG DIFFUSERS_VERSION=0.35.1 | ||
| ARG HUGGINGFACE_HUB_VERSION=0.35.0 | ||
| ARG OPTIMUM_NEURON_VERSION=0.3.0 | ||
| ARG SENTENCE_TRANSFORMERS=5.1.0 | ||
| ARG PEFT_VERSION=0.17.0 | ||
| ARG DATASETS_VERSION=4.1.0 | ||
|
|
||
| # See http://bugs.python.org/issue19846 | ||
| ENV LANG C.UTF-8 | ||
| ENV LD_LIBRARY_PATH /opt/aws/neuron/lib:/lib/x86_64-linux-gnu:/opt/conda/lib/:$LD_LIBRARY_PATH | ||
| ENV PATH /opt/conda/bin:/opt/aws/neuron/bin:$PATH | ||
| ENV SAGEMAKER_SERVING_MODULE sagemaker_pytorch_serving_container.serving:main | ||
| ENV TEMP=/home/model-server/tmp | ||
|
|
||
| RUN apt-get update \ | ||
| && apt-get upgrade -y \ | ||
| && apt-get install -y --no-install-recommends \ | ||
| apt-transport-https \ | ||
| build-essential \ | ||
| ca-certificates \ | ||
| cmake \ | ||
| curl \ | ||
| emacs \ | ||
| git \ | ||
| gnupg2 \ | ||
| gpg-agent \ | ||
| jq \ | ||
| libgl1-mesa-glx \ | ||
| libglib2.0-0 \ | ||
| libsm6 \ | ||
| libxext6 \ | ||
| libxrender-dev \ | ||
| libcap-dev \ | ||
| libhwloc-dev \ | ||
| openjdk-11-jdk \ | ||
| unzip \ | ||
| vim \ | ||
| wget \ | ||
| zlib1g-dev \ | ||
| && rm -rf /var/lib/apt/lists/* \ | ||
| && rm -rf /tmp/tmp* \ | ||
| && apt-get clean | ||
|
|
||
| RUN echo "deb https://apt.repos.neuron.amazonaws.com focal main" > /etc/apt/sources.list.d/neuron.list | ||
| RUN wget -qO - https://apt.repos.neuron.amazonaws.com/GPG-PUB-KEY-AMAZON-AWS-NEURON.PUB | apt-key add - | ||
|
|
||
| # Install Neuronx tools | ||
| RUN apt-get update \ | ||
| && apt-get install -y \ | ||
| aws-neuronx-tools=$NEURONX_TOOLS_VERSION \ | ||
| aws-neuronx-collectives=$NEURONX_COLLECTIVES_LIB_VERSION \ | ||
| aws-neuronx-runtime-lib=$NEURONX_RUNTIME_LIB_VERSION \ | ||
| && rm -rf /var/lib/apt/lists/* \ | ||
| && rm -rf /tmp/tmp* \ | ||
| && apt-get clean | ||
|
|
||
| # https://github.com/docker-library/openjdk/issues/261 https://github.com/docker-library/openjdk/pull/263/files | ||
| RUN keytool -importkeystore -srckeystore /etc/ssl/certs/java/cacerts -destkeystore /etc/ssl/certs/java/cacerts.jks -deststoretype JKS -srcstorepass changeit -deststorepass changeit -noprompt; \ | ||
| mv /etc/ssl/certs/java/cacerts.jks /etc/ssl/certs/java/cacerts; \ | ||
| /var/lib/dpkg/info/ca-certificates-java.postinst configure; | ||
|
|
||
| RUN curl -L -o ~/mambaforge.sh https://github.com/conda-forge/miniforge/releases/download/${MAMBA_VERSION}/Mambaforge-${MAMBA_VERSION}-Linux-x86_64.sh \ | ||
| && chmod +x ~/mambaforge.sh \ | ||
| && ~/mambaforge.sh -b -p /opt/conda \ | ||
| && rm ~/mambaforge.sh \ | ||
| && /opt/conda/bin/conda update -y conda \ | ||
| && /opt/conda/bin/conda install -c conda-forge -y \ | ||
| python=$PYTHON_VERSION \ | ||
| pyopenssl \ | ||
| cython \ | ||
| mkl-include \ | ||
| mkl \ | ||
| botocore \ | ||
| parso \ | ||
| scipy \ | ||
| typing \ | ||
| # Below 2 are included in miniconda base, but not mamba so need to install | ||
| conda-content-trust \ | ||
| charset-normalizer \ | ||
| && /opt/conda/bin/conda update -y conda \ | ||
| && /opt/conda/bin/conda clean -ya | ||
|
|
||
| RUN conda install -c conda-forge \ | ||
| scikit-learn \ | ||
| h5py \ | ||
| requests \ | ||
| && conda clean -ya \ | ||
| && pip install --upgrade pip --trusted-host pypi.org --trusted-host files.pythonhosted.org \ | ||
| && ln -s /opt/conda/bin/pip /usr/local/bin/pip3 \ | ||
| && pip install packaging \ | ||
| enum-compat \ | ||
| ipython \ | ||
| && rm -rf ~/.cache/pip/* | ||
|
|
||
| RUN pip install --no-cache-dir -U \ | ||
| "opencv-python>=4.8.1.78" \ | ||
| "numpy>=1.22.2, <=1.25.2" \ | ||
| "scipy>=1.8.0" \ | ||
| six \ | ||
| "pillow>=10.0.1" \ | ||
| "awscli<2" \ | ||
| pandas==1.* \ | ||
| boto3 \ | ||
| "cryptography<46,>=41.0.5" \ | ||
| "protobuf>=3.20.3, <4" \ | ||
| "networkx~=2.6" \ | ||
| && pip install --no-deps --no-cache-dir -U torchvision==0.22.* \ | ||
| && rm -rf ~/.cache/pip/* | ||
|
|
||
| # Install Neuronx-cc and PyTorch | ||
| RUN pip install --index-url https://pip.repos.neuron.amazonaws.com \ | ||
| --extra-index-url https://pypi.org/simple \ | ||
| --trusted-host pip.repos.neuron.amazonaws.com \ | ||
| neuronx-cc==$NEURONX_CC_VERSION \ | ||
| torch-neuronx==$NEURONX_FRAMEWORK_VERSION \ | ||
| neuronx_distributed==$NEURONX_DISTRIBUTED_VERSION | ||
|
|
||
| WORKDIR / | ||
|
|
||
| RUN pip install --no-cache-dir \ | ||
| multi-model-server==$MMS_VERSION \ | ||
| sagemaker-inference | ||
|
|
||
| RUN useradd -m model-server \ | ||
| && mkdir -p /home/model-server/tmp \ | ||
| && chown -R model-server /home/model-server | ||
|
|
||
| COPY neuron-entrypoint.py /usr/local/bin/dockerd-entrypoint.py | ||
| COPY neuron-monitor.sh /usr/local/bin/neuron-monitor.sh | ||
| COPY config.properties /etc/sagemaker-mms.properties | ||
|
|
||
| RUN chmod +x /usr/local/bin/dockerd-entrypoint.py \ | ||
| && chmod +x /usr/local/bin/neuron-monitor.sh | ||
|
|
||
| ADD https://raw.githubusercontent.com/aws/deep-learning-containers/master/src/deep_learning_container.py /usr/local/bin/deep_learning_container.py | ||
|
|
||
| RUN chmod +x /usr/local/bin/deep_learning_container.py | ||
|
|
||
| ################################# | ||
| # Hugging Face specific section # | ||
| ################################# | ||
|
|
||
| RUN curl -o /license.txt https://aws-dlc-licenses.s3.amazonaws.com/pytorch-2.7/license.txt | ||
|
|
||
| # install Hugging Face libraries and its dependencies | ||
| RUN pip install --no-cache-dir -U \ | ||
| networkx~=2.6 \ | ||
| transformers[sentencepiece,audio,vision]==${TRANSFORMERS_VERSION} \ | ||
| diffusers==${DIFFUSERS_VERSION} \ | ||
| compel \ | ||
| controlnet-aux \ | ||
| huggingface_hub==${HUGGINGFACE_HUB_VERSION} \ | ||
| hf_transfer \ | ||
| datasets==${DATASETS_VERSION} \ | ||
| optimum-neuron==${OPTIMUM_NEURON_VERSION} \ | ||
| "sagemaker-huggingface-inference-toolkit>=2.4.1,<3" \ | ||
| sentence_transformers==${SENTENCE_TRANSFORMERS} \ | ||
| peft==${PEFT_VERSION} \ | ||
| && rm -rf ~/.cache/pip/* | ||
|
|
||
| RUN HOME_DIR=/root \ | ||
| && curl -o ${HOME_DIR}/oss_compliance.zip https://aws-dlinfra-utilities.s3.amazonaws.com/oss_compliance.zip \ | ||
| && unzip ${HOME_DIR}/oss_compliance.zip -d ${HOME_DIR}/ \ | ||
| && cp ${HOME_DIR}/oss_compliance/test/testOSSCompliance /usr/local/bin/testOSSCompliance \ | ||
| && chmod +x /usr/local/bin/testOSSCompliance \ | ||
| && chmod +x ${HOME_DIR}/oss_compliance/generate_oss_compliance.sh \ | ||
| && ${HOME_DIR}/oss_compliance/generate_oss_compliance.sh ${HOME_DIR} ${PYTHON} \ | ||
| && rm -rf ${HOME_DIR}/oss_compliance* \ | ||
| # conda leaves an empty /root/.cache/conda/notices.cache file which is not removed by conda clean -ya | ||
| && rm -rf ${HOME_DIR}/.cache/conda | ||
|
|
||
| ENV HF_HUB_USER_AGENT_ORIGIN="aws:sagemaker:neuron:inference:regular" | ||
| EXPOSE 8080 8081 | ||
| ENTRYPOINT ["python", "/usr/local/bin/dockerd-entrypoint.py"] | ||
| CMD ["serve"] | ||
9 changes: 9 additions & 0 deletions
...face/pytorch/inference/docker/2.7/py3/sdk2.24.1/Dockerfile.neuronx.py_scan_allowlist.json
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,9 @@ | ||
| { | ||
| "77740": "protobuf, required by Neuron SDK. Affected versions of this package are vulnerable to a potential Denial of Service (DoS) attack due to unbounded recursion when parsing untrusted Protocol Buffers data.", | ||
| "77986": "In transformers, the vulnerability arises from insecure URL validation using the `startswith()` method, which can be bypassed through URL username injection. This allows attackers to craft URLs that appear to be from YouTube but resolve to malicious domains, potentially leading to phishing attacks, malware distribution, or data exfiltration. The issue is fixed in version 4.52.1. We cannot upgrade now because the transformers version is co-dependent on the Neuron SDK version and required by HF.", | ||
| "78153": "A Regular Expression Denial of Service (ReDoS) vulnerability was discovered in the Hugging Face Transformers library. This vulnerability affects versions 4.51.3 and earlier, and is fixed in version 4.52.1.", | ||
| "78688": "also In transformers", | ||
| "79595": "also In transformers", | ||
| "79596": "also In transformers", | ||
| "79855": "also In transformers" | ||
| } |
Binary file modified
BIN
+9.3 KB
(100%)
...maker_tests/huggingface/inference/resources/tiny-distilbert-sst-2/pt_neuronx_model.tar.gz
Binary file not shown.
Binary file removed
BIN
-130 KB
test/sagemaker_tests/huggingface/inference/resources/tiny-gpt2/pt_neuronx_model.tar.gz
Binary file not shown.
Binary file added
BIN
+9.21 MB
test/sagemaker_tests/huggingface/inference/resources/tiny-llama3/pt_neuronx_model.tar.gz
Binary file not shown.
Binary file modified
BIN
-233 KB
(96%)
test/sagemaker_tests/huggingface/inference/resources/tiny-sdxl/pt_neuronx_model.tar.gz
Binary file not shown.