Commit 08f832b

ErnevSharma and shaernev authored
Huggingface pytorch training latest (#4688)
* chore: manually installing latest Go version
* Update ['dlc_developer_config.toml']
  dlc_developer_config.toml: {
    'build': { 'build_frameworks': ['huggingface_pytorch'], 'build_inference': False, 'build_training': True},
    'buildspec_override': { 'dlc-pr-huggingface-pytorch-training': 'huggingface/pytorch/training/buildspec.yml'},
    'dev': { 'arm64_mode': False, 'deep_canary_mode': False, 'graviton_mode': False, 'neuronx_mode': False},
    'test': { 'ec2_tests': True, 'ecs_tests': True, 'eks_tests': True, 'sagemaker_local_tests': True, 'sagemaker_remote_tests': True, 'sanity_tests': True, 'security_tests': True}}
* Revert "Update ['dlc_developer_config.toml']"
  This reverts commit edc424d.
* chore: ignoring two vuln for unused neko binaries and upgrading go
* Update ['dlc_developer_config.toml']
  dlc_developer_config.toml: {
    'build': { 'build_frameworks': ['huggingface_pytorch'], 'build_inference': False, 'build_training': True},
    'buildspec_override': { 'dlc-pr-huggingface-pytorch-training': 'huggingface/pytorch/training/buildspec.yml'},
    'dev': { 'arm64_mode': False, 'deep_canary_mode': False, 'graviton_mode': False, 'neuronx_mode': False},
    'test': { 'ec2_tests': True, 'ecs_tests': True, 'eks_tests': True, 'sagemaker_local_tests': True, 'sagemaker_remote_tests': True, 'sanity_tests': True, 'security_tests': True}}
* fix: rm correct tarball version
* chore: adding logs to ecr scan test and patching library requirements to pass pip check
* chore: fixing logging
* chore: updating test to remove package from vulnerability list if there are no packages effected
* chore: formatting
* chore: more formatting
* Revert "Update ['dlc_developer_config.toml']"
  This reverts commit 4555f31.

---------

Co-authored-by: shaernev <shaernev@amazon.com>
1 parent f596f35 commit 08f832b
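
The commit message item "fix: rm correct tarball version" points at a maintenance hazard in the Dockerfile: the Go version string appears separately in the `wget`, `tar`, and `rm` commands, so the three can drift apart. One possible way to avoid that, sketched below, is to derive the tarball name and URL from a single variable. `GO_VERSION`, `GO_TARBALL`, and `GO_URL` are hypothetical names that do not appear in the commit, and the sketch only prints the commands so it needs neither network access nor root.

```shell
#!/bin/sh
set -eu

# Hypothetical variables: the Dockerfile in this commit hardcodes the
# version in each command instead.
GO_VERSION="1.22.3"
GO_TARBALL="go${GO_VERSION}.linux-amd64.tar.gz"
GO_URL="https://go.dev/dl/${GO_TARBALL}"

# Print the install steps in the same order the Dockerfile runs them.
# Every step reuses GO_TARBALL, so download, extract, and cleanup
# always refer to the same file.
echo "wget ${GO_URL}"
echo "rm -rf /usr/local/go"
echo "tar -C /usr/local -xzf ${GO_TARBALL}"
echo "ln -s /usr/local/go/bin/go /usr/bin/go"
echo "rm ${GO_TARBALL}"
```

In a Dockerfile the same idea would typically be expressed with an `ARG` or `ENV` declared before the `RUN` instruction; whether that suits this image's build conventions is not something the commit indicates.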

4 files changed, +649 -12 lines changed

huggingface/pytorch/training/docker/2.5/py3/cu124/Dockerfile.gpu

Lines changed: 11 additions & 5 deletions
@@ -49,23 +49,29 @@ RUN pip install --no-cache-dir \
 peft==${PEFT_VERSION} \
 flash-attn==${FLASH_ATTN_VERSION}
 
+# Override conflicting versions to satisfy datasets
+RUN pip install --no-cache-dir dill==0.3.8 multiprocess==0.70.16 \
+&& pip install --no-cache-dir pathos==0.3.3 --no-deps \
+&& PATHOS_META=$(find /opt/conda/lib -type f -path "*pathos-0.3.3.dist-info/METADATA") \
+&& sed -i 's/dill.*/dill/' $PATHOS_META \
+&& sed -i 's/multiprocess.*/multiprocess/' $PATHOS_META
 
-# hf_transfer will be a built-in feature, remove the env variavle then
+
+# hf_transfer will be a built-in feature, remove the env variable then
 ENV HF_HUB_ENABLE_HF_TRANSFER="1"
 
 RUN apt-get update \
 # TODO: Remove upgrade statements once packages are updated in base image
 && apt-get -y upgrade --only-upgrade systemd openssl cryptsetup libkrb5-3 \
 && apt-get install -y git git-lfs wget tar \
-&& wget https://go.dev/dl/go1.22.2.linux-amd64.tar.gz \
+&& wget https://go.dev/dl/go1.22.3.linux-amd64.tar.gz \
 && rm -rf /usr/local/go \
-&& tar -C /usr/local -xzf go1.22.2.linux-amd64.tar.gz \
+&& tar -C /usr/local -xzf go1.22.3.linux-amd64.tar.gz \
 && ln -s /usr/local/go/bin/go /usr/bin/go \
-&& rm go1.22.2.linux-amd64.tar.gz \
+&& rm go1.22.3.linux-amd64.tar.gz \
 && apt-get clean \
 && rm -rf /var/lib/apt/lists/*
 
-
 COPY cuda-compatibility-lib.sh /usr/local/bin/cuda-compatibility-lib.sh
 RUN chmod +x /usr/local/bin/cuda-compatibility-lib.sh
