Commit f3a0233

Begunner authored
[docker] fix: new images for sgl056 and vllm012 have compatibility issues (#4714)
### What does this PR do?

> TransformerEngine-v2.8 leads to unexpected crashes. Try to update it to v2.10.
> Fix other resultant compatibility issues.

---------

Co-authored-by: Begunner <[email protected]>
1 parent cd4072d commit f3a0233
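
To confirm that a rebuilt image actually picked up the newer TransformerEngine, a check along the following lines could be run inside the container. This is a minimal sketch, not part of the commit: the helper names are hypothetical, and the distribution name `transformer_engine` is an assumption about how the source install registers itself. The comparison is done on integer tuples because a plain string comparison would rank "2.8" above "2.10".

```python
from __future__ import annotations

from importlib import metadata


def parse_release(version: str) -> tuple[int, ...]:
    """Turn '2.10.0+abc' or '2.10.0.dev0' into (2, 10, 0), dropping suffixes."""
    parts = []
    # Drop any local version label after '+', then read leading numeric fields.
    for piece in version.split("+", 1)[0].split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(version: str, minimum: str = "2.10") -> bool:
    """Compare numerically, so that 2.10 is treated as newer than 2.8."""
    return parse_release(version) >= parse_release(minimum)


def installed_te_version(dist_name: str = "transformer_engine") -> str | None:
    """Return the installed version string, or None if the distribution is absent.

    The default distribution name is an assumption; adjust it if the package
    is registered under a different name in your image.
    """
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_te_version()
    if found is None:
        print("TransformerEngine distribution not found")
    else:
        status = "ok" if meets_minimum(found) else "older than 2.10"
        print(f"TransformerEngine {found}: {status}")
```

Running the script in each of the two images gives a quick signal on whether the `release_v2.10` pin took effect.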

3 files changed: +3 -2 lines changed

docker/Dockerfile.stable.sglang

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@ RUN pip install nvidia-mathdx
 
 RUN MAX_JOBS=128 pip install -v --disable-pip-version-check --no-cache-dir --no-build-isolation --config-settings "--build-option=--cpp_ext" --config-settings "--build-option=--cuda_ext" git+https://github.com/NVIDIA/apex.git
 
-RUN export NVTE_FRAMEWORK=pytorch && MAX_JOBS=128 NVTE_BUILD_THREADS_PER_JOB=4 pip3 install --resume-retries 999 --no-cache-dir --no-build-isolation git+https://github.com/NVIDIA/TransformerEngine.git@release_v2.8
+RUN export NVTE_FRAMEWORK=pytorch && MAX_JOBS=128 NVTE_BUILD_THREADS_PER_JOB=4 pip3 install --resume-retries 999 --no-cache-dir --no-build-isolation git+https://github.com/NVIDIA/TransformerEngine.git@release_v2.10
 
 RUN pip install --upgrade --no-cache-dir transformers tokenizers
 

docker/Dockerfile.stable.vllm

Lines changed: 1 addition & 1 deletion
@@ -42,7 +42,7 @@ RUN pip install nvidia-mathdx
 
 RUN MAX_JOBS=128 pip install -v --disable-pip-version-check --no-build-isolation --config-settings "--build-option=--cpp_ext" --config-settings "--build-option=--cuda_ext" git+https://github.com/NVIDIA/apex.git
 
-RUN export NVTE_FRAMEWORK=pytorch && MAX_JOBS=128 NVTE_BUILD_THREADS_PER_JOB=4 pip3 install --resume-retries 999 --no-build-isolation git+https://github.com/NVIDIA/TransformerEngine.git@release_v2.8
+RUN export NVTE_FRAMEWORK=pytorch && MAX_JOBS=128 NVTE_BUILD_THREADS_PER_JOB=4 pip3 install --resume-retries 999 --no-build-isolation git+https://github.com/NVIDIA/TransformerEngine.git@release_v2.10
 
 RUN pip install --upgrade transformers tokenizers
 

verl/__init__.py

Lines changed: 1 addition & 0 deletions
@@ -49,6 +49,7 @@
 
 patch_hub()
 
+
 if is_npu_available:
     # Workaround for torch-npu's lack of support for creating nested tensors from NPU tensors.
     #
