Release v0.3.1
Added
- A new `failure_tolerance_timeout` configuration parameter for `KubeRayInteractiveJob` and `KubeRayCluster`. It can be set to a positive value to give the cluster some time to transition out of the `failed` state (which can be transient in some scenarios) before raising an error.
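
The tolerance behaviour described above can be pictured with a small polling loop. This is only a sketch under stated assumptions, not the library's implementation: `wait_until_ready`, `get_state`, the `"ready"` state label, and the `timeout`, `poll_interval`, `clock` and `sleep` parameters are all hypothetical names introduced for illustration. Only `failure_tolerance_timeout` and the `failed` state come from the release notes.

```python
import time
from typing import Callable, Optional


def wait_until_ready(
    get_state: Callable[[], str],  # hypothetical: returns the current cluster state
    timeout: float = 60.0,  # hypothetical overall readiness timeout, in seconds
    failure_tolerance_timeout: float = 0.0,
    poll_interval: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Poll until the state is "ready", tolerating a transient "failed" state."""
    start = clock()
    failed_since: Optional[float] = None  # when the current failed streak began

    while True:
        state = get_state()
        now = clock()

        if state == "ready":  # "ready" is an assumed label for this sketch
            return

        if state == "failed":
            if failed_since is None:
                failed_since = now
            # Raise only once the cluster has stayed failed for the whole window.
            if now - failed_since >= failure_tolerance_timeout:
                raise RuntimeError("cluster stayed in failed state too long")
        else:
            # The cluster left the failed state, so the window resets.
            failed_since = None

        if now - start >= timeout:
            raise TimeoutError("cluster did not become ready in time")

        sleep(poll_interval)
```

With `failure_tolerance_timeout=0` the sketch raises on the first `failed` observation, while a positive value lets a cluster that recovers within the window still be reported as ready.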
Fixes
- Ensure both `.head.serviceIP` and `.head.serviceName` are set on the `RayCluster` while waiting for cluster readiness.
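
As a rough illustration of the check this fix describes, the snippet below tests that both head fields are populated in a status mapping. It is a sketch only: `head_service_is_set` is a hypothetical helper, and it assumes the status is available as a plain dictionary with a `head` key holding `serviceIP` and `serviceName`.

```python
from typing import Any, Mapping


def head_service_is_set(status: Mapping[str, Any]) -> bool:
    """Return True only when both head service fields are non-empty."""
    # Assumed shape: {"head": {"serviceIP": "...", "serviceName": "..."}}
    head = status.get("head") or {}
    return bool(head.get("serviceIP")) and bool(head.get("serviceName"))
```

A readiness loop would keep waiting while this returns `False`, so a status that carries only one of the two fields is not treated as ready.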
Full Changelog: v0.3.0...v0.3.1