feat: add Ray GRPO runtime image (CUDA 12.8, Python 3.12) #795
Fiona-Waters wants to merge 3 commits into opendatahub-io:main
Adds a derivative runtime image extending `quay.io/modh/ray:2.53.0-py312-cu128` with vLLM 0.12.0, verl 0.7.1, flash-attn 2.8.3, and training-hub for LoRA-GRPO training on KubeRay clusters.

Signed-off-by: Fiona-Waters <fiwaters6@gmail.com>
Made-with: Cursor
Remove --no-deps workaround, manual dependency list, and patch_init.py now that instructlab-training, mini_trainer, and training_hub have all relaxed their numba constraint to >=0.61.2, resolving the conflict with vllm's numba==0.61.2. Install training-hub[grpo] from the lora-grpo branch with proper dependency resolution. Made-with: Cursor
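The pin conflict this commit removes can be illustrated with a small version-specifier check. This is a standalone sketch; the `satisfies` helper is hypothetical and not part of any of the packages involved:

```python
# Minimal illustration of the resolved pin conflict: vllm pins
# numba==0.61.2, and the trainers now allow numba>=0.61.2, so a
# single numba version satisfies every constraint at once.
def satisfies(version: str, op: str, bound: str) -> bool:
    """Check a dotted version string against a single ==/>= specifier."""
    v = tuple(map(int, version.split(".")))
    b = tuple(map(int, bound.split(".")))
    return v == b if op == "==" else v >= b

# numba 0.61.2 satisfies both constraints, so pip can resolve the
# full dependency set without the --no-deps workaround.
resolvable = satisfies("0.61.2", "==", "0.61.2") and satisfies("0.61.2", ">=", "0.61.2")
print(resolvable)  # → True
```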
Align with existing Ray runtime image conventions in the repo. Made-with: Cursor
Summary
Adds a new Ray runtime image for LoRA-GRPO training on KubeRay, extending `quay.io/modh/ray:2.53.0-py312-cu128` with the ML stack required by training-hub's GRPO backend. The image layers vLLM 0.12.0, verl 0.7.1, flash-attn 2.8.3, PyTorch 2.9.0, and training-hub on top of the existing base Ray CUDA 12.8 image, following the same directory layout as the existing runtime images.
Key packages

- vLLM 0.12.0
- verl 0.7.1
- flash-attn 2.8.3
- PyTorch 2.9.0
- training-hub (GRPO backend)
Notes
`instructlab-training` and `rhai-innovation-mini-trainer` are installed from their `main` branches to pick up the relaxed `numba>=0.61.2` constraint (merged but not yet released to PyPI). `training-hub[grpo]` is installed from the `lora-grpo` branch (GRPO code not yet merged to `main`). Once all packages are released, Step 4 simplifies to `pip install "training-hub[grpo]"`.

Changes
- `images/runtime/ray/cuda/2.53.0-py312-cu128-grpo/` with Dockerfile and README

Test Plan
`quay.io/rh_ee_fwaters/ray-grpo:2.53.0-py312-cu128-v6`
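The build described above can be sketched as a Dockerfile. This is a hedged outline based only on this PR description, not the merged file: the `<org>` placeholder, step ordering, and pip flags are assumptions.

```dockerfile
# Sketch of the GRPO runtime image layering (assumptions marked below).
FROM quay.io/modh/ray:2.53.0-py312-cu128

# ML stack listed in the summary (PyTorch 2.9.0 is pulled in as a
# dependency of these pins on the CUDA 12.8 base).
RUN pip install --no-cache-dir vllm==0.12.0 verl==0.7.1 flash-attn==2.8.3

# Interim installs from git branches (see Notes); <org> is a placeholder.
# Once all packages are released to PyPI, this step simplifies to:
#   pip install "training-hub[grpo]"
RUN pip install --no-cache-dir \
      "git+https://github.com/<org>/instructlab-training@main" \
      "git+https://github.com/<org>/rhai-innovation-mini-trainer@main" \
      "training-hub[grpo] @ git+https://github.com/<org>/training-hub@lora-grpo"
```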