[Test] Add support for fractional GPU values in Ray start parameters … by tiennguyentony · Pull Request #4454 · ray-project/kuberay

tiennguyentony · 2026-01-28T20:05:08Z

[Feature] Add support for fractional GPU values in Ray start parameters and corresponding tests

Why are these changes needed?

This PR adds support for fractional GPU values in Ray start parameters, addressing issue #4447.

Problem: Users need to serve multiple small LLM models on a single GPU using Ray's fractional GPU serving feature (e.g., 0.4 GPU per model). The autoscaler was rejecting fractional GPU values with the error: "0.4 is not of type 'integer'".

Solution:

- Modified pod.go: changed GPU resource conversion from int64() to float64() to support fractional values.
- Added unit test in pod_test.go: TestUpdateRayStartParamsResources_WithFractionalGPU validates the conversion logic.
- Added e2e test in raycluster_test.go: TestRayClusterWithFractionalGPU validates end-to-end integration.

This enables users to specify fractional GPU allocations like GPU: "0.4" in their Ray placement groups for efficient multi-model serving.
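As a rough illustration of the conversion described above, the following standalone sketch writes a fractional GPU count into a start-parameter map. It is not the code from pod.go: `setNumGPUs` is a hypothetical helper, the `num-gpus` key and the "do not overwrite a user-supplied value" behavior are assumptions, and the GPU amount is assumed to be available as thousandths of a GPU.

```go
package main

import (
	"fmt"
	"strconv"
)

// setNumGPUs is a hypothetical helper (not from pod.go). It records a GPU
// count, given in thousandths of a GPU, under the assumed "num-gpus" key
// unless the caller already supplied a value.
func setNumGPUs(startParams map[string]string, milliGPU int64) {
	if _, ok := startParams["num-gpus"]; ok {
		return // keep the explicit user-provided value
	}
	// Converting to float64 before formatting preserves the fraction.
	// Precision -1 asks for the shortest representation that round-trips.
	startParams["num-gpus"] = strconv.FormatFloat(float64(milliGPU)/1000, 'f', -1, 64)
}

func main() {
	params := map[string]string{}
	setNumGPUs(params, 400) // 400 thousandths, i.e. 0.4 GPU
	fmt.Println(params["num-gpus"])
}
```

With an integer conversion in place of the float64 one, the same input would be recorded as a whole number and the fraction would be lost.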

Related issue number

Closes #4447

Checks

I've made sure the tests are passing.
Testing Strategy
- Unit tests - TestUpdateRayStartParamsResources_WithFractionalGPU validates GPU conversion logic
- Manual tests - Ran e2e test TestRayClusterWithFractionalGPU locally (passes in 1.07s)
- This PR is not tested :(

Test Results

=== RUN   TestRayClusterWithFractionalGPU
    raycluster_test.go:327: [2026-01-28] Created RayCluster for testing fractional GPU conversion
    raycluster_test.go:343: [2026-01-28] RayCluster pods created successfully
    raycluster_test.go:366: ✓ Test passed: RayCluster with fractional GPU configuration created successfully
--- PASS: TestRayClusterWithFractionalGPU (1.07s)
PASS

Changes Summary

| File | Lines Changed | Description |
|---|---|---|
| `ray-operator/controllers/ray/common/pod.go` | 4 (+3, -1) | Core fix: Convert GPU resources using float64 instead of int64 |
| `ray-operator/controllers/ray/common/pod_test.go` | 41 (+41) | Unit test for fractional GPU conversion |
| `ray-operator/test/e2e/raycluster_test.go` | 102 (+102) | E2E test for RayCluster with fractional GPU config |
| Total | 147 (+146, -1) | |

…cceleratorResources

Critical bug fix: The addWellKnownAcceleratorResources function was using strconv.FormatInt, which truncated fractional GPU values to integers. When users specify GPU resources via container.Resources.Limits (the standard Kubernetes pattern), values like 400m (0.4 GPU) were truncated to 0.

This fix applies the same FormatFloat conversion used in updateRayStartParamsResources, ensuring both code paths properly handle fractional GPU values:
- 400m: 0.4 GPU
- 1: 1 GPU
- 4: 4 GPUs

Added unit test TestAddWellKnownAcceleratorResources_WithFractionalGPU to validate the fix covers container resource limits.

Fixes Issue ray-project#4447: Enable fractional GPU serving support
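The truncation described in this commit message can be reproduced with the standard library alone. The sketch below assumes the GPU amount is available as an integer count of thousandths of a GPU (so 400m is 400); `truncatingGPU` and `fractionalGPU` are illustrative names, not functions from pod.go.

```go
package main

import (
	"fmt"
	"strconv"
)

// truncatingGPU mimics an integer-based conversion: integer division drops
// the fractional part before the value is formatted.
func truncatingGPU(milliGPU int64) string {
	return strconv.FormatInt(milliGPU/1000, 10)
}

// fractionalGPU converts to float64 first, so the fraction survives.
// Precision -1 yields the shortest representation that round-trips.
func fractionalGPU(milliGPU int64) string {
	return strconv.FormatFloat(float64(milliGPU)/1000, 'f', -1, 64)
}

func main() {
	// The three cases listed in the commit message: 400m, 1, and 4.
	for _, m := range []int64{400, 1000, 4000} {
		fmt.Printf("%dm: truncating=%s fractional=%s\n", m, truncatingGPU(m), fractionalGPU(m))
	}
}
```

Only the 400m case differs between the two functions, which is why the bug would surface only for fractional requests.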

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

…sterWithFractionalGPU
- Changed WithResources(rayv1ac.GroupResource().WithRequestedResources(...)) to WithResources(map[string]string{...})
- Fixed API usage to match the correct signature for setting resource specs in worker group
- Added 2-second graceful shutdown to allow operator cleanup before namespace deletion
- Prevents race condition where test cleanup happens before operator finishes cleanup operations
- Fixes issue ray-project#4447: Add support for fractional GPU values in Ray start parameters

…nt namespace termination race
- Added 2-second sleep before namespace deletion in TestRayClusterWithResourceQuota
- Prevents 'unable to create new content in namespace because it is being terminated' error
- Same fix as applied to TestRayClusterWithFractionalGPU
- Addresses CI test flakiness during cleanup phase

[Test] Add support for fractional GPU values in Ray start parameters …

54e7ab4

…and corresponding tests

tiennguyentony requested review from MortalHappiness, andrewsykim, kevin85421 and rueian as code owners January 28, 2026 20:05

tiennguyentony marked this pull request as draft January 28, 2026 20:05

tiennguyentony marked this pull request as ready for review January 28, 2026 20:06

cursor bot reviewed Jan 28, 2026

ray-operator/test/e2e/raycluster_test.go (resolved)

[Test] Added 0.4 GPU Resource as the Bot mention in the test

9308b3f

cursor bot reviewed Jan 28, 2026

ray-operator/controllers/ray/common/pod.go (resolved)

ray-operator/test/e2e/raycluster_test.go (resolved)

ray-operator/test/e2e/raycluster_test.go (outdated, resolved)

cursor bot reviewed Jan 28, 2026

ray-operator/controllers/ray/common/pod_test.go (resolved)

tiennguyentony marked this pull request as draft January 29, 2026 00:43

tiennguyentony added 4 commits January 28, 2026 17:07

Test: Enhance TestRayClusterWithFractionalGPU to validate conversion …

f906d02

…of fractional GPU resources to Ray start parameters

Fix: Simplify resource specification in TestRayClusterWithFractionalG…

4eea5d7

…PU by removing unnecessary GroupResource wrapper

Copilot AI mentioned this pull request Feb 5, 2026

Review all open pull requests #4482

Closed

4 tasks

tiennguyentony wants to merge 7 commits into ray-project:master from tiennguyentony:fix/4447-fractional-gpu-support
