Commit 1d4a403 (parent 017cb2f)

[CI][RayService] deflaky the TestAutoscalingRayService (ray-project#3119)

Signed-off-by: Rueian <rueiancsie@gmail.com>

File tree

1 file changed (+4, -1 lines)


ray-operator/test/e2erayservice/rayservice_ha_test.go

Lines changed: 4 additions & 1 deletion
@@ -98,7 +98,10 @@ func TestAutoscalingRayService(t *testing.T) {
 	g.Expect(err).NotTo(HaveOccurred())

 	// Check the number of worker pods is correct when RayService is steady
-	g.Eventually(WorkerPods(test, rayServiceUnderlyingRayCluster), TestTimeoutShort).Should(HaveLen(numberOfPodsWhenSteady),
+	// TODO (rueian): with the current Ray version (2.43.0), the autoscaler can race with the scheduler, which causes overprovisioning.
+	// So, we use TestTimeoutLong here to wait for the autoscaler to scale down in the case of overprovisioning.
+	// We may revisit the timeout once the issue has been solved. See: https://github.com/ray-project/kuberay/issues/2981#issuecomment-2686172278
+	g.Eventually(WorkerPods(test, rayServiceUnderlyingRayCluster), TestTimeoutLong).Should(HaveLen(numberOfPodsWhenSteady),
 		"The WorkerGroupSpec.Replicas is %d", *rayServiceUnderlyingRayCluster.Spec.WorkerGroupSpecs[0].Replicas)

 	// Create Locust RayCluster
