Adds first-class cu130 (CUDA 13.0) support to Ray's image build pipeline and release test suite. Previously cu130 existed only as a one-off compiled_graph experiment pinned to py3.12; this generalizes it into proper cu130 base images plus a reusable cu130 GPU test dependency layer spanning Python 3.10–3.13.
Image builds
- ci/raydepsets/configs/rayimg.depsets.yaml
  - Added cu130 build-arg sets and made CUDA explicit in the build-arg key rather than implied: regular keys (py310…py314) carry no CUDA_CODE, and CUDA builds use explicit *_cu128 / *_cu130 keys. This prevents a "regular" Python build from silently pulling a CUDA index.
  - New ray_img_cu130_* depset that relaxes cupy-cuda12x → cupy-cuda13x for the cu130 image chain.
  - New ray-gpu-cu130-base_extra_testdeps_* depset for the cu130 GPU base-extra-testdeps image.
  - Left the ray-ml depset as a single, non-cuda-coded, py3.10-only lock (ray-ml only ships a CUDA 12.x image), keeping the lock that byod.Dockerfile actually consumes.
- .buildkite/{base,build,_images,linux_aarch64,release/build}.rayci.yml: wire cu13.0.0-cudnn into the base/release image matrices.
- New compiled locks: python/deplocks/ray_img/ray_img_cu130_py3.{10–14}.lock and ray-gpu-cu130-base_extra_testdeps_py3.{10–14}.lock.
- ray-images.json / ci/ray_ci/test_ray_docker_container.py updated for cu130.
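The explicit-key convention described above can be illustrated with a small sketch. This is not code from the PR: `BuildKey` and `parse_build_key` are hypothetical names, and the key grammar (`py3NN` with an optional `_cuNNN` suffix) is an assumption inferred from the key names mentioned in this description. It shows only the intended property that a key without a CUDA suffix never resolves to a CUDA code.

```python
import re
from typing import NamedTuple, Optional


class BuildKey(NamedTuple):
    """Hypothetical parsed form of a build-arg key such as 'py312_cu130'."""
    python_version: str
    cuda_code: Optional[str]  # None for regular (non-CUDA) keys


# Assumed key grammar: 'py3' + minor version, optionally followed by '_cuNNN'.
_KEY_RE = re.compile(r"^py3(\d{1,2})(?:_(cu\d+))?$")


def parse_build_key(key: str) -> BuildKey:
    """Split a build-arg key into its Python version and explicit CUDA code."""
    match = _KEY_RE.match(key)
    if match is None:
        raise ValueError(f"unrecognized build-arg key: {key!r}")
    minor, cuda_code = match.groups()
    # A regular key has no suffix, so cuda_code stays None: nothing is implied.
    return BuildKey(python_version=f"3.{minor}", cuda_code=cuda_code)
```

With this shape, `parse_build_key("py310")` yields no CUDA code, while `parse_build_key("py310_cu130")` carries `cu130` explicitly, so a CUDA index can only be selected by a key that names it.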
GPU release tests
- ci/raydepsets/configs/release_gpu_cu130.depsets.yaml (new): a shared gpu_cu130_py3.{10–13}.lock torch layer that expands the gpu-cu130 base image with a CUDA 13.x torch build, constrained to the base image lock so versions (e.g. cupy-cuda13x) stay consistent. Installed via python_depset (BYOD); replaces the old post_build_script + ray-ml-image approach.
- Removed the legacy release_compiled_graph_gpu_cu130.depsets.yaml, byod_compiled_graph_gpu_cu130.sh, and requirements_compiled_graph_gpu_cu130.in.
- New requirements_gpu_cu130.in / requirements_byod_gpu_cu130.in.
- New workload jobs_check_cuda_version.py asserting the runtime torch CUDA version is 13.0.
- release/release_tests.yaml: cu130 tests across Python 3.10–3.13:
  - hello_world_cu130_py{3.10–3.13}
  - jobs_check_cuda_version_cu130_py{3.10–3.13}
  - jobs_check_cuda_available.py3{10–13}_cu130 (new variations)
  - compiled_graphs_GPU_cu130_py{3.10–3.13} and compiled_graphs_GPU_multinode_cu130_py{3.10–3.13} (converted from a single py3.12 test to a full Python matrix on the shared cu130 torch layer)
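For readers unfamiliar with the CUDA version check listed above, a minimal sketch of such a workload follows. It is not the actual contents of jobs_check_cuda_version.py: `cuda_version_matches` and `main` are hypothetical names. The only library attribute used is `torch.version.cuda`, which is the CUDA version string torch was built against, or `None` for CPU-only builds.

```python
from typing import Optional

EXPECTED_CUDA_VERSION = "13.0"  # the version the cu130 images are meant to provide


def cuda_version_matches(
    actual: Optional[str], expected: str = EXPECTED_CUDA_VERSION
) -> bool:
    """Return True only if torch reports exactly the expected CUDA version.

    `actual` is None when torch was built without CUDA support.
    """
    return actual is not None and actual == expected


def main() -> None:
    # Imported lazily so the helper above stays importable without torch.
    import torch

    actual = torch.version.cuda
    assert cuda_version_matches(actual), (
        f"expected torch CUDA {EXPECTED_CUDA_VERSION}, got {actual!r}"
    )
    print(f"torch CUDA version OK: {actual}")
```

Calling `main()` on a node built from a cu130 image would fail fast if the torch layer had resolved to a CUDA 12.x or CPU-only wheel.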
Testing
Running the new cu130 GPU release tests (20 total, Python 3.10–3.13): hello_world_cu130, jobs_check_cuda_version_cu130, jobs_check_cuda_available.*_cu130, compiled_graphs_GPU_cu130, compiled_graphs_GPU_multinode_cu130.
Release tests:
https://buildkite.com/ray-project/release/builds/96047/canvas?sid=019eab1c-8fd9-4539-a08d-44f060784f05&open=false
---------
Signed-off-by: sai.miduthuri <sai.miduthuri@anyscale.com>
Signed-off-by: elliot-barn <elliot.barnwell@anyscale.com>
Co-authored-by: elliot-barn <elliot.barnwell@anyscale.com>