Commit 04eb5d2

Attempt to fix CUDA version incompatibility
Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com>
1 parent 231de88 commit 04eb5d2

1 file changed: +3 -1 lines changed

docker/ray-llm/Dockerfile

Lines changed: 3 additions & 1 deletion
@@ -44,7 +44,9 @@ export UV_SYSTEM_PYTHON=1
 export TORCH_CUDA_ARCH_LIST="9.0a 10.0a"
 
 # Install EP kernels (PPLX, DeepEP, and NVSHMEM)
-curl -fsSL "${VLLM_RAW}/tools/ep_kernels/install_python_libraries.sh" | bash -s -- --workspace /home/ray/llm_ep_support
+# Fix CUDA version mismatch: Use nvshmem 3.3.20 which was compiled with CUDA 12.8
+curl -fsSL "${VLLM_RAW}/tools/ep_kernels/install_python_libraries.sh" | \
+bash -s -- --workspace /home/ray/llm_ep_support --nvshmem-ver 3.3.20
 
 # Install DeepGEMM
 curl -fsSL "${VLLM_RAW}/tools/install_deepgemm.sh" | bash
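
The comment added in this change attributes the fix to NVSHMEM 3.3.20 having been compiled with CUDA 12.8. Below is a minimal sketch of how the image's toolkit release could be checked against that expectation before the install step runs. It assumes `nvcc` is on the `PATH` and prints its usual `release X.Y` line. `cuda_release` and `cuda_matches` are hypothetical helper names introduced here for illustration; they are not part of this Dockerfile or of the vLLM install script.

```shell
# Hypothetical helper (not part of this commit): read `nvcc --version`
# output on stdin and print the toolkit release as "MAJOR.MINOR".
cuda_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

# Hypothetical helper: succeed only when the detected release is non-empty
# and equal to the expected one. Both arguments are "MAJOR.MINOR" strings.
cuda_matches() {
    local actual="$1" expected="$2"
    [ -n "$actual" ] && [ "$actual" = "$expected" ]
}

# Possible usage inside the build step (assumes nvcc is installed):
#   cuda_matches "$(nvcc --version | cuda_release)" "12.8" \
#       || { echo "CUDA toolkit is not 12.8" >&2; exit 1; }
```

Failing the build early on a mismatch would surface the incompatibility at image build time rather than when the EP kernels are first loaded.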
