Provide Dockerfile for midstream MPI CUDA image #656
Merged
184 changes: 184 additions & 0 deletions
images/runtime/training/py312-cuda130-torch29-openmpi41/Dockerfile
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,184 @@ | ||
| FROM quay.io/opendatahub/odh-midstream-cuda-base-13-0:268a2d4baec5ed3c3ae09a6cce325fb83622d87b | ||
| | ||
| ARG SSH_PORT=2222 | ||
| ARG OPENMPI_VERSION=4.1.6 | ||
| ARG UCX_VERSION=1.20.0 | ||
| # SHA256 checksums for supply-chain integrity verification. | ||
| ARG UCX_SHA256=2c15df0e7ae297997480793489292f272b108a8fec6c9be0b1c3d3c06fc15cb1 | ||
| ARG OPENMPI_SHA256=44da277b8cdc234e71c62473305a09d63f4dcca292ca40335aab7c4bf0e6a566 | ||
| | ||
| LABEL name="training:py312-cuda130-torch29-openmpi41" \ | ||
| summary="CUDA 13.0 Python 3.12 PyTorch 2.9.0 OpenMPI 4.1 image based on C9S for Training" \ | ||
| description="CUDA 13.0 Python 3.12 PyTorch 2.9.0 OpenMPI 4.1 image based on C9S for Training" \ | ||
| io.k8s.display-name="CUDA 13.0 Python 3.12 PyTorch 2.9.0 OpenMPI 4.1 base image for Training" \ | ||
| io.k8s.description="CUDA 13.0 Python 3.12 PyTorch 2.9.0 OpenMPI 4.1 image based on C9S for Training" \ | ||
| authoritative-source-url="https://github.com/opendatahub-io/distributed-workloads" | ||
| | ||
| USER 0 | ||
| | ||
| # libjpeg-turbo: libjpeg.so.62 required by torchvision image I/O extension. | ||
| # libpng: libpng16.so.16 required by torchvision image I/O extension. | ||
| # libwebp: libwebp.so.7 required by torchvision image I/O extension. | ||
| # The C9S base image compiles numpy/scipy/pyarrow/pillow against system libraries | ||
| # not available in C9S (libopenblasp.so.0, libthrift-0.15.0.so, libre2.so.9); | ||
| # those packages are reinstalled from manylinux wheels after micropipenv runs. | ||
| # numactl-libs and openblas-openmp are installed separately AFTER the OpenMPI | ||
| # build step to avoid dnf clean_requirements_on_remove sweeping them out. | ||
| RUN dnf install -y openssh-server libjpeg-turbo libpng libwebp && dnf clean all | ||
| | ||
| # Install UCX 1.20.0 prebuilt RPMs (BSD-3-Clause licensed — fully redistributable). | ||
| # Built against MOFED 24.10 + CUDA 13 — first UCX release with CUDA 13 support. | ||
| # Runtime packages (ucx, ucx-cuda, ucx-ib, ucx-ib-mlx5, ucx-rdmacm) are kept in | ||
| # the image; ucx-devel (headers) is removed after OpenMPI is built against it. | ||
| # UCX transport plugins are dlopen'd at runtime: ucx-ib-mlx5 gracefully skips | ||
| # if MOFED is absent; ucx-cuda activates when CUDA libs are present (always true here). | ||
| RUN curl -fsSLo /tmp/ucx.tar.bz2 \ | ||
| https://github.com/openucx/ucx/releases/download/v${UCX_VERSION}/ucx-${UCX_VERSION}-rocky9-mofed24.10-cuda13-x86_64.tar.bz2 \ | ||
| && echo "${UCX_SHA256} /tmp/ucx.tar.bz2" | sha256sum -c - \ | ||
| && tar -xjf /tmp/ucx.tar.bz2 -C /tmp \ | ||
| && rpm -ivh --nodeps \ | ||
| /tmp/ucx-${UCX_VERSION}-1.el9.x86_64.rpm \ | ||
| /tmp/ucx-cuda-${UCX_VERSION}-1.el9.x86_64.rpm \ | ||
| /tmp/ucx-ib-${UCX_VERSION}-1.el9.x86_64.rpm \ | ||
| /tmp/ucx-ib-mlx5-${UCX_VERSION}-1.el9.x86_64.rpm \ | ||
| /tmp/ucx-rdmacm-${UCX_VERSION}-1.el9.x86_64.rpm \ | ||
| /tmp/ucx-devel-${UCX_VERSION}-1.el9.x86_64.rpm \ | ||
| && rm -f /tmp/ucx-*.rpm /tmp/ucx.tar.bz2 | ||
| | ||
| # Build OpenMPI from source with CUDA support (BSD licensed — fully redistributable). | ||
| # OpenMPI install prefix: /usr/lib64/openmpi (keeps existing symlink/PATH layout) | ||
| # | ||
| # --with-cuda: opal_built_with_cuda_support=true — MPI calls accept GPU pointers. | ||
| # --with-ucx: links against UCX 1.20.0 (CUDA 13 aware) installed at /usr. | ||
| # ucx-cuda provides cuda_copy/cuda_ipc transports for GPU-Direct; | ||
| # ucx-ib-mlx5 provides rc_mlx5/dc_mlx5 for GPU-Direct RDMA over IB | ||
| # (activated when MOFED is present on the host at runtime). | ||
| RUN dnf install -y \ | ||
| # Runtime IB/RDMA libraries (kept after build) | ||
| rdma-core libibverbs librdmacm libibumad libmlx5 infiniband-diags \ | ||
| # OpenMPI runtime dependencies (kept after build) | ||
| hwloc libevent pmix \ | ||
| # Build tools (removed after build) | ||
| gcc gcc-c++ make perl \ | ||
| # Dev headers (removed after build) | ||
| rdma-core-devel libibverbs-devel librdmacm-devel \ | ||
| hwloc-devel libevent-devel pmix-devel zlib-devel \ | ||
| && curl -fsSLo /tmp/openmpi.tar.gz \ | ||
| https://download.open-mpi.org/release/open-mpi/v4.1/openmpi-${OPENMPI_VERSION}.tar.gz \ | ||
| && echo "${OPENMPI_SHA256} /tmp/openmpi.tar.gz" | sha256sum -c - \ | ||
| && tar -xzf /tmp/openmpi.tar.gz -C /tmp \ | ||
| && cd /tmp/openmpi-${OPENMPI_VERSION} \ | ||
| && ./configure \ | ||
| --prefix=/usr/lib64/openmpi \ | ||
| --with-cuda=/usr/local/cuda \ | ||
| --with-ucx=/usr \ | ||
| --with-verbs \ | ||
| --with-hwloc \ | ||
| --with-libevent=external \ | ||
| --with-pmix=external \ | ||
| --enable-mpi-cxx \ | ||
| --enable-mpi1-compatibility \ | ||
| --disable-silent-rules \ | ||
| && make -j$(nproc) install \ | ||
| && rpm -e --nodeps ucx-devel-${UCX_VERSION}-1.el9.x86_64 \ | ||
| && dnf remove -y \ | ||
| gcc gcc-c++ make perl \ | ||
| rdma-core-devel libibverbs-devel librdmacm-devel \ | ||
| hwloc-devel libevent-devel pmix-devel zlib-devel \ | ||
| && dnf clean all \ | ||
| && rm -rf /tmp/openmpi-${OPENMPI_VERSION} /tmp/openmpi.tar.gz | ||
| | ||
| # Re-install PyTorch runtime deps that dnf clean_requirements_on_remove sweeps | ||
| # out during the OpenMPI build step above (numactl-libs and openblas-openmp are | ||
| # transitive deps of hwloc-devel / the build toolchain and get auto-removed). | ||
| # Installing them in a fresh RUN marks them as explicit user installs. | ||
| RUN dnf install -y numactl-libs openblas-openmp && dnf clean all | ||
| | ||
| # Create symlinks for OpenMPI binaries in /usr/bin so they're in default SSH PATH | ||
| RUN ln -s /usr/lib64/openmpi/bin/orted /usr/bin/orted \ | ||
| && ln -s /usr/lib64/openmpi/bin/mpiexec /usr/bin/mpiexec | ||
| | ||
| # mpirun wrapper: registers the OpenShift random UID in /etc/passwd before launching mpirun. | ||
| # This is needed because the SDK overrides the container entrypoint, bypassing uid_entrypoint.sh. | ||
| # Placed in /usr/local/bin (not /usr/bin) so it takes precedence over the real mpirun in | ||
| # /usr/lib64/openmpi/bin, which the base image puts before /usr/bin in PATH. | ||
| COPY mpirun_wrapper.sh /usr/local/bin/mpirun | ||
| RUN chmod +x /usr/local/bin/mpirun | ||
| | ||
| # Wrapper script so python is reachable in SSH sessions AND the virtualenv is activated. | ||
| # A symlink won't work: Python uses argv[0] to locate pyvenv.cfg, and /usr/local/bin/python | ||
| # has no venv in its parent chain. The wrapper execs /opt/app-root/bin/python, which does. | ||
| RUN printf '#!/bin/sh\nexec /opt/app-root/bin/python "$@"\n' > /usr/local/bin/python \ | ||
| && chmod +x /usr/local/bin/python | ||
| | ||
| # Set LD_LIBRARY_PATH in /etc/environment for SSH sessions (loaded by PAM, not inherited from container). | ||
| # CUDA and cuDNN are registered in ldconfig; OpenMPI and UCX plugin dirs need explicit entries. | ||
| RUN echo "LD_LIBRARY_PATH=/usr/lib64/openmpi/lib:/usr/lib64/ucx" >> /etc/environment | ||
| | ||
| RUN mkdir -p /var/run/sshd | ||
| | ||
| # SSH client config | ||
| RUN sed -i "s/[ #]\(.*StrictHostKeyChecking \).*/ \1no/g" /etc/ssh/ssh_config \ | ||
| && echo " UserKnownHostsFile /dev/null" >> /etc/ssh/ssh_config \ | ||
| && echo " Port ${SSH_PORT}" >> /etc/ssh/ssh_config \ | ||
| && echo " SendEnv PATH LD_LIBRARY_PATH" >> /etc/ssh/ssh_config | ||
| | ||
| # SSH server config | ||
| RUN sed -i "s/#\(StrictModes \).*/\1no/g" /etc/ssh/sshd_config \ | ||
| && sed -i "s/#\(Port \).*/\1${SSH_PORT}/g" /etc/ssh/sshd_config \ | ||
| && echo "StrictModes no" >> /etc/ssh/sshd_config \ | ||
| && echo "Port ${SSH_PORT}" >> /etc/ssh/sshd_config | ||
| | ||
| # User-level sshd config for running as non-root | ||
| # OpenMPI MCA params file: read by every OpenMPI process regardless of environment variables, | ||
| # so this reliably applies to orted and worker processes launched via SSH. | ||
| RUN mkdir -p /home/mpiuser /home/mpiuser/.openmpi && \ | ||
| echo "PidFile /tmp/sshd.pid" > /home/mpiuser/.sshd_config && \ | ||
| echo "HostKey /home/mpiuser/.ssh/id_rsa" >> /home/mpiuser/.sshd_config && \ | ||
| echo "StrictModes no" >> /home/mpiuser/.sshd_config && \ | ||
| echo "Port ${SSH_PORT}" >> /home/mpiuser/.sshd_config && \ | ||
| echo "AcceptEnv PATH LD_LIBRARY_PATH" >> /home/mpiuser/.sshd_config | ||
| | ||
| # Install micropipenv to deploy packages from Pipfile.lock | ||
| RUN pip install --no-cache-dir -U "micropipenv[toml]" | ||
| | ||
| # Install Python dependencies from Pipfile.lock file | ||
| WORKDIR /opt/app-root/bin | ||
| COPY Pipfile.lock ./ | ||
| | ||
| RUN micropipenv install -- --no-cache-dir && \ | ||
| rm -f ./Pipfile.lock && \ | ||
| pip install --no-cache-dir --no-deps s3fs==2026.1.0 && \ | ||
| # The C9S base image ships numpy, scipy, pyarrow, and pillow compiled against | ||
| # system libraries that are not present in the C9S repos (libopenblasp.so.0, | ||
| # libthrift-0.15.0.so, libre2.so.9). Reinstall from manylinux wheels, which | ||
| # bundle all required native libraries, replacing the C9S-compiled builds. | ||
| pip install --force-reinstall --no-cache-dir \ | ||
| numpy==1.26.4 \ | ||
| scipy==1.17.0 \ | ||
| pyarrow==22.0.0 \ | ||
| pillow==12.1.0 && \ | ||
| chmod -R g+w /opt/app-root/lib/python3.12/site-packages | ||
| | ||
| # OpenShift GID 0 pattern: give root group same permissions as owner. | ||
| # OpenShift random UIDs always have GID 0 as primary group. | ||
| RUN chgrp -R 0 /home/mpiuser && chmod -R g=u /home/mpiuser | ||
| | ||
| # Allow uid_entrypoint to add random UID to /etc/passwd at runtime | ||
| RUN chmod g=u /etc/passwd | ||
| | ||
| # uid_entrypoint: register the OpenShift random UID in /etc/passwd so that | ||
| # getpwuid() calls (used by Python getpass, PyTorch cache dirs, etc.) succeed. | ||
| COPY uid_entrypoint.sh /usr/local/bin/uid_entrypoint.sh | ||
| RUN chmod +x /usr/local/bin/uid_entrypoint.sh | ||
| | ||
| WORKDIR /home/mpiuser | ||
| ENV HOME=/home/mpiuser | ||
| ENV PATH=/usr/local/bin:$PATH:$HOME/.local/bin | ||
| ENV LD_LIBRARY_PATH=/usr/lib64/openmpi/lib:/usr/lib64/ucx:${LD_LIBRARY_PATH} | ||
| # Override the base image's overly restrictive NVIDIA_REQUIRE_CUDA which only lists | ||
| # specific driver branches (535, 550, 565, 570, 575). Any driver >= 570 supports | ||
| # CUDA 13.0; the constraint caused the nvidia-container-runtime-hook to fail on 580.x+. | ||
| ENV NVIDIA_REQUIRE_CUDA="cuda>=13.0 driver>=570" | ||
| ENTRYPOINT ["/usr/local/bin/uid_entrypoint.sh"] | ||
| USER 1001 | ||
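
The Dockerfile copies in `uid_entrypoint.sh` and makes `/etc/passwd` group-writable, but the script itself is not shown in this diff. As a rough illustration of the pattern the comments describe, the sketch below shows one way such a script might register an arbitrary UID. It is not the file from this PR: the function name `register_uid`, the `PASSWD_FILE` override, and the format of the `mpiuser` entry are all assumptions made for the example.

```shell
#!/bin/sh
# Illustrative sketch only; NOT the uid_entrypoint.sh shipped in this PR.
# PASSWD_FILE is a hypothetical override so the logic can be tried
# against a scratch file instead of the real /etc/passwd.
PASSWD_FILE="${PASSWD_FILE:-/etc/passwd}"

# register_uid (hypothetical name): append a passwd entry for the current
# UID if none exists, so that getpwuid() lookups succeed.
register_uid() {
    uid="$(id -u)"
    # Field 3 of each passwd line is the numeric UID.
    if awk -F: -v uid="$uid" '$3 == uid { found = 1 } END { exit !found }' "$PASSWD_FILE"; then
        return 0
    fi
    # Writable for GID 0 because the Dockerfile runs `chmod g=u /etc/passwd`.
    if [ -w "$PASSWD_FILE" ]; then
        printf 'mpiuser:x:%s:0:MPI user:%s:/bin/sh\n' \
            "$uid" "${HOME:-/home/mpiuser}" >> "$PASSWD_FILE"
    fi
}

# Act as an entrypoint only when a command was passed in.
if [ "$#" -gt 0 ]; then
    register_uid
    exec "$@"
fi
```

Calling `register_uid` a second time leaves the file unchanged, which matters if both the entrypoint and the `mpirun` wrapper were to run the same logic.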
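
The UCX and OpenMPI download steps in the Dockerfile both pipe an expected digest into `sha256sum -c -` before unpacking the archive. A minimal standalone sketch of that check follows, assuming GNU coreutils `sha256sum`; the helper name `verify_sha256` is hypothetical and does not appear in the PR.

```shell
#!/bin/sh
# verify_sha256 FILE EXPECTED_HEX (hypothetical helper, not part of the PR)
# Returns 0 when FILE's SHA-256 digest equals EXPECTED_HEX, non-zero otherwise.
verify_sha256() {
    file="$1"
    expected="$2"
    # sha256sum -c reads "<digest>  <path>" lines (two spaces) from stdin.
    printf '%s  %s\n' "$expected" "$file" | sha256sum -c - >/dev/null 2>&1
}
```

Chained with `&&` as in the Dockerfile, a non-zero return stops the build before the archive is unpacked.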