Releases: danielgafni/dagster-ray
Release v0.4.0
This release introduces a new feature that is very useful in dev environments: Cluster Sharing. Cluster sharing allows reusing existing RayCluster resources created by previous Dagster steps. It is implemented for the KubeRayCluster Dagster resource. This feature enables faster iteration and reduces infrastructure costs (at the expense of job isolation). KubeRayCluster is therefore now recommended over KubeRayInteractiveJob for use in dev environments.
Learn more in the Cluster Sharing docs.
Added
- KubeRayCluster.cluster_sharing parameter that controls cluster sharing behavior.
- dagster_ray.kuberay.sensors.cleanup_expired_kuberay_clusters sensor that cleans up expired clusters (both shared and non-shared). Learn more in docs.
- dagster-ray entry now appears in the Dagster libraries list in the web UI.
Changed
- [:bomb: breaking] removed cleanup_kuberay_clusters_op and other associated definitions in favor of the more flexible dagster_ray.kuberay.sensors.cleanup_expired_kuberay_clusters sensor.
Full Changelog: v0.3.1...v0.4.0
Release v0.3.1
Added
- A new failure_tolerance_timeout configuration parameter for KubeRayInteractiveJob and KubeRayCluster. It can be set to a positive value to give the cluster some time to transition out of the failed state (which can be transient in some scenarios) before raising an error.
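
The behavior described for failure_tolerance_timeout can be pictured with a small, library-independent sketch. It is not dagster-ray's implementation: wait_until_ready, poll_state, the "ready"/"failed" state strings, and the injectable clock and sleep arguments are hypothetical names introduced only for illustration.

```python
import time
from typing import Callable, Optional


def wait_until_ready(
    poll_state: Callable[[], str],
    failure_tolerance_timeout: float = 0.0,
    timeout: float = 60.0,
    poll_interval: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Block until poll_state() returns "ready" (hypothetical helper).

    A "failed" state raises only once it has persisted for at least
    failure_tolerance_timeout seconds; 0 means fail immediately.
    """
    start = clock()
    failed_since: Optional[float] = None

    while True:
        state = poll_state()
        now = clock()
        if state == "ready":
            return
        if state == "failed":
            if failed_since is None:
                failed_since = now  # start of the tolerance window
            if now - failed_since >= failure_tolerance_timeout:
                raise RuntimeError("cluster stayed in the failed state")
        else:
            failed_since = None  # the failure was transient; reset the window
        if now - start >= timeout:
            raise TimeoutError("cluster did not become ready in time")
        sleep(poll_interval)
```

With a positive tolerance, a cluster that briefly reports a failed state and then recovers is still treated as healthy; with the default of 0 the first failed observation raises.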
Fixes
- ensure both .head.serviceIP and .head.serviceName are set on the RayCluster while waiting for cluster readiness
Full Changelog: v0.3.0...v0.3.1
Release v0.3.0
This release includes massive docs improvements and drops support for Python 3.9.
Changes
- [:bomb: breaking] dropped Python 3.9 support (EOL October 2025)
- [internal] most of the general, backend-agnostic code has been moved to dagster_ray.core (top-level imports still work)
Full Changelog: v0.2.1...v0.3.0
Release v0.2.1
Release v0.2.0
Changed
- KubeRayInteractiveJob.deletion_strategy now defaults to DeleteCluster for both successful and failed executions. This is a reasonable default for the use case.
- KubeRayInteractiveJob.ttl_seconds_after_finished now defaults to 600 seconds.
- KubeRayCluster.lifecycle.cleanup now defaults to always
- [:bomb: breaking] RayJob and RayCluster clients and resources Kubernetes init parameters have been renamed to kube_config and kube_context.
Added
- new enable_legacy_debugger configuration parameter to subclasses of RayResource
- new on_exception option for lifecycle.cleanup policy. It's triggered during resource setup/cleanup (including KeyboardInterrupt), but not by user @op/@asset code.
- KubeRayInteractiveJob now respects lifecycle.cleanup. It defaults to on_exception. Users are advised to rely on built-in RayJob cleanup mechanisms, such as ttlSecondsAfterFinished and deletionStrategy.
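
The difference between the always and on_exception cleanup policies can be sketched as a plain context manager. This is only an illustration under stated assumptions, not the library's code: managed_cluster and its create/delete callbacks are hypothetical, and the sketch models failures during setup only (not failures during cleanup itself).

```python
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def managed_cluster(
    create: Callable[[], None],
    delete: Callable[[], None],
    cleanup: str = "always",  # "always" or "on_exception", as named in the notes
) -> Iterator[None]:
    """Hypothetical wrapper illustrating the two cleanup policies."""
    try:
        create()  # resource setup
    except BaseException:
        # BaseException also covers a KeyboardInterrupt raised during setup.
        if cleanup in ("always", "on_exception"):
            delete()
        raise
    try:
        yield  # user @op/@asset code runs here
    finally:
        # Errors raised by user code do not trigger "on_exception";
        # only the "always" policy deletes at this point.
        if cleanup == "always":
            delete()
```

Under on_exception, a cluster that was set up successfully is left in place after the step finishes, which is why relying on the RayJob's own TTL and deletion settings makes sense for KubeRayInteractiveJob.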
Fixes
- removed ignore_reinit_error from RayResource init options: it's potentially dangerous, for example in case the user has accidentally connected to another Ray cluster (including local ray) before initializing the resource.
Release v0.1.0
As far as I can tell, dagster-ray's functionality has been quite stable for a while. The only thing keeping dagster-ray from graduating to 0.1.x was its API -- it wasn't very polished and obviously required lots of improvements.
dagster-ray 0.1.0 comes with a better API for KubeRayCluster that's aligned with standard Kubernetes resources.
It also includes a bunch of new features, the main highlight being, of course, KubeRayInteractiveJob. This resource is a beast: it combines the slick UX of KubeRayCluster with convenient RayJob features such as timeouts, automatic cleanup, and the ability to target existing clusters (via clusterSelector).
KubeRayInteractiveJob is now the recommended way of running Ray applications in client mode on Kubernetes.
LifeCycleACtions(create=<bool>, wait=<bool>, connect=<bool>) options and the .create(), .wait() and .connect() API bring more UX improvements: they make it possible to lazily create a RayCluster/RayJob and only wait for it to become ready once actually needed.
Breaking API changes are still expected in the future (I haven't got to Pipes yet!), but they are now guaranteed to happen only on minor version updates.
Changed
- [:bomb: breaking] RayResource: top-level skip_init and skip_setup configuration parameters have been removed. The lifecycle field is the new way of configuring steps performed during resource initialization. KubeRayCluster's skip_cleanup has been moved to lifecycle as well.
- [:bomb: breaking] injected dagster.io/run_id Kubernetes label has been renamed to dagster/run-id. Keys starting with dagster.io/ have been converted to dagster/ to match how dagster-k8s does it.
- [:bomb: breaking] dagster_ray.kuberay configurations have been unified with KubeRay APIs.
- dagster-ray now populates Kubernetes labels with more values (including some useful Dagster Cloud values such as git-sha)
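
For anyone with label selectors or dashboards keyed on the old names, the label rename can be expressed as a small mapping. This is a hedged sketch: migrate_label_key and migrate_labels are hypothetical helpers, and only the dagster.io/ to dagster/ prefix change and the run_id to run-id rename are taken from the notes above.

```python
from typing import Dict


def migrate_label_key(key: str) -> str:
    """Map a pre-0.1.0 label key to its new form (hypothetical helper)."""
    old_prefix, new_prefix = "dagster.io/", "dagster/"
    if not key.startswith(old_prefix):
        return key  # labels outside the old prefix are left untouched
    name = key[len(old_prefix):]
    if name == "run_id":
        name = "run-id"  # the one rename stated explicitly in the notes
    return new_prefix + name


def migrate_labels(labels: Dict[str, str]) -> Dict[str, str]:
    """Apply the key migration to a whole label mapping; values are unchanged."""
    return {migrate_label_key(k): v for k, v in labels.items()}
```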
Added
- KubeRayInteractiveJob -- a new resource that utilizes the new InteractiveMode for RayJob. It can be used to connect to Ray in Client mode -- like KubeRayCluster -- but gives access to RayJob features, such as automatic cleanup (ttlSecondsAfterFinished), retries (backoffLimit) and timeouts (activeDeadlineSeconds).
- RayResource setup lifecycle has been overhauled: resources now have an actions parameter with 3 configuration options: create, wait and connect. The user can disable them and run .create(), .wait() and .connect() manually if needed.
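
The create/wait/connect split can be illustrated with a toy resource that does not depend on dagster-ray. Actions, ToyResource, setup, and the events log are hypothetical stand-ins chosen for this sketch; the real resource classes and their configuration may look different.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Actions:
    """Which setup steps run automatically (all enabled by default)."""
    create: bool = True
    wait: bool = True
    connect: bool = True


@dataclass
class ToyResource:
    actions: Actions = field(default_factory=Actions)
    events: List[str] = field(default_factory=list)

    def setup(self) -> None:
        """Run only the enabled steps; the rest can be called manually later."""
        if self.actions.create:
            self.create()
        if self.actions.wait:
            self.wait()
        if self.actions.connect:
            self.connect()

    def create(self) -> None:
        self.events.append("create")  # stands in for submitting the RayCluster/RayJob

    def wait(self) -> None:
        self.events.append("wait")  # stands in for blocking until it is ready

    def connect(self) -> None:
        self.events.append("connect")  # stands in for opening the Ray client connection


# Lazy usage: create early, then wait and connect only when Ray is needed.
resource = ToyResource(actions=Actions(wait=False, connect=False))
resource.setup()  # only the create step runs here
# ... other work could happen while the cluster starts ...
resource.wait()
resource.connect()
```

Disabling wait and connect up front lets a step kick off cluster creation and overlap the startup time with other work.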
Full Changelog: v0.0.23...v0.1.0
Release v0.1.0-alpha.0
This is a pre-release for dagster-ray 0.1.0. It's available for testing.
I want to merge #193 before releasing 0.1.0. There are no other API changes planned (currently).
🔧 Changed
- 💣 breaking: RayResource: top-level skip_init and skip_setup configuration parameters have been removed. The lifecycle field is the new way of configuring steps performed during resource initialization. KubeRayCluster's skip_cleanup has been moved to lifecycle as well.
- 💣 breaking: injected dagster.io/run_id Kubernetes label has been renamed to dagster/run-id. Keys starting with dagster.io/ have been converted to dagster/ to match how dagster-k8s does it.
- 💣 breaking: dagster_ray.kuberay configurations have been unified with KubeRay APIs.
- dagster-ray now populates Kubernetes labels with more values (including some useful Dagster Cloud values such as git-sha)
✨ Added
- KubeRayInteractiveJob -- a new resource that utilizes the new InteractiveMode for RayJob. It can be used to connect to Ray in Client mode -- like KubeRayCluster -- but gives access to RayJob features, such as automatic cleanup (ttlSecondsAfterFinished), retries (backoffLimit) and timeouts (activeDeadlineSeconds).
- RayResource setup lifecycle has been overhauled: resources now have an actions parameter with 3 configuration options: create, wait and connect. The user can disable them and run .create(), .wait() and .connect() manually if needed.
Full Changelog: v0.0.23...v0.1.0-alpha.0
Release v0.0.23
What's Changed
- ⬆️ support Dagster 1.11.6 by @geoHeil in #181
- 🐛 fix early exiting raycluster readiness waiting loop by @danielgafni in #182
Full Changelog: v0.0.22...v0.0.23
Release v0.0.22
What's Changed
- fix: fix typos by @geoHeil in #164
- 🐛 retry urllib3.exceptions.ProtocolError when streaming k8s events by @danielgafni in #172
New Contributors
Full Changelog: v0.0.21...v0.0.22
Release v0.0.21
What's Changed
- Improve KubeRayCluster cleanup by @photoroman in #149 - thanks!
New Contributors
- @photoroman made their first contribution in #149
Full Changelog: v0.0.20...v0.0.21