Releases: danielgafni/dagster-ray

Release v0.4.0

10 Oct 15:26
f2fe537

This release introduces Cluster Sharing, a new feature that is especially useful in dev environments. Cluster sharing allows reusing existing RayCluster resources created by previous Dagster steps. It is implemented for the KubeRayCluster Dagster resource. The feature enables faster iteration and lower infrastructure costs, at the expense of job isolation. KubeRayCluster is therefore now recommended over KubeRayInteractiveJob for use in dev environments.

Learn more in Cluster Sharing docs.

Added

  • KubeRayCluster.cluster_sharing parameter that controls cluster sharing behavior.
  • dagster_ray.kuberay.sensors.cleanup_expired_kuberay_clusters sensor that cleans up expired clusters (both shared and non-shared). Learn more in docs.
  • dagster-ray entry now appears in the Dagster libraries list in the web UI.

Changed

  • [:bomb: breaking] removed cleanup_kuberay_clusters_op and other associated definitions in favor of the more flexible dagster_ray.kuberay.sensors.cleanup_expired_kuberay_clusters sensor.

Full Changelog: v0.3.1...v0.4.0

Release v0.3.1

02 Oct 12:01
573d1c8

Added

  • A new failure_tolerance_timeout configuration parameter for KubeRayInteractiveJob and KubeRayCluster. It can be set to a positive value to give the cluster some time to transition out of failed state (which can be transient in some scenarios) before raising an error.
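
The idea behind a failure tolerance window can be sketched in plain Python. This is only an illustration of the behavior described above, not dagster-ray's implementation: the function name `wait_until_ready`, the state strings, and every parameter except `failure_tolerance_timeout` are hypothetical.

```python
import time
from typing import Callable, Optional


def wait_until_ready(
    get_state: Callable[[], str],
    failure_tolerance_timeout: float = 0.0,
    timeout: float = 60.0,
    poll_interval: float = 0.1,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Poll until the state is "ready", tolerating a transient "failed" state.

    Hypothetical helper: the state names and signature are assumptions.
    """
    start = clock()
    failed_since: Optional[float] = None  # start of the current "failed" streak

    while True:
        state = get_state()
        now = clock()
        if state == "ready":
            return
        if state == "failed":
            if failed_since is None:
                failed_since = now
            # With a timeout of 0 the first "failed" observation raises immediately.
            if now - failed_since >= failure_tolerance_timeout:
                raise RuntimeError("cluster stayed in failed state past the tolerance window")
        else:
            failed_since = None  # recovered, so reset the tolerance window
        if now - start >= timeout:
            raise TimeoutError("cluster did not become ready in time")
        sleep(poll_interval)
```

With a positive `failure_tolerance_timeout`, a cluster that briefly reports a failed state and then recovers is treated as healthy. With the value left at 0, the first failed observation raises.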

Fixes

  • ensure both .head.serviceIP and .head.serviceName are set on the RayCluster while waiting for cluster readiness

Full Changelog: v0.3.0...v0.3.1

Release v0.3.0

19 Sep 16:26
872934c

This release includes massive docs improvements and drops support for Python 3.9.

Changes

  • [:bomb: breaking] dropped Python 3.9 support (EOL October 2025)
  • [internal] most of the general, backend-agnostic code has been moved to dagster_ray.core (top-level imports still work)

Full Changelog: v0.2.1...v0.3.0

Release v0.2.1

18 Sep 15:47
9a10991

Fixes

  • Fixed broken wheel on PyPI

Full Changelog: v0.2.0...v0.2.1

Release v0.2.0

18 Sep 11:38
f747208

Changed

  • KubeRayInteractiveJob.deletion_strategy now defaults to DeleteCluster for both successful and failed executions. This is a reasonable default for the use case.
  • KubeRayInteractiveJob.ttl_seconds_after_finished now defaults to 600 seconds.
  • KubeRayCluster.lifecycle.cleanup now defaults to always.
  • [:bomb: breaking] The Kubernetes init parameters of RayJob and RayCluster clients and resources have been renamed to kube_config and kube_context.

Added

  • new enable_legacy_debugger configuration parameter to subclasses of RayResource
  • new on_exception option for lifecycle.cleanup policy. It's triggered during resource setup/cleanup (including KeyboardInterrupt), but not by user @op/@asset code.
  • KubeRayInteractiveJob now respects lifecycle.cleanup. It defaults to on_exception. Users are advised to rely on built-in RayJob cleanup mechanisms, such as ttlSecondsAfterFinished and deletionStrategy.
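
The difference between the always and on_exception cleanup policies can be illustrated with a small standalone sketch. It does not use dagster-ray and is not its implementation: the function `run_with_cleanup_policy`, its callbacks, and the `never` value are hypothetical, introduced only to show when cleanup would fire under the semantics described above.

```python
from typing import Callable


def run_with_cleanup_policy(
    cleanup: str,
    setup: Callable[[], None],
    user_code: Callable[[], None],
    delete_cluster: Callable[[], None],
) -> None:
    """Run setup and user code, deleting the cluster according to `cleanup`.

    "always" and "on_exception" come from the release notes; "never" is an
    assumed third value used here for completeness.
    """
    if cleanup not in ("always", "on_exception", "never"):
        raise ValueError(f"unknown cleanup policy: {cleanup!r}")

    try:
        setup()
    except BaseException:  # BaseException also covers KeyboardInterrupt
        if cleanup in ("always", "on_exception"):
            delete_cluster()
        raise

    try:
        user_code()
    finally:
        # Errors raised by user code do not trigger "on_exception" cleanup.
        if cleanup == "always":
            delete_cluster()
```

In this sketch an interrupted setup triggers cleanup under on_exception, while an error raised by user code leaves the cluster in place, to be removed by other mechanisms such as ttlSecondsAfterFinished.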

Fixes

  • removed ignore_reinit_error from RayResource init options: it's potentially dangerous, for example in case the user has accidentally connected to another Ray cluster (including local ray) before initializing the resource.

Release v0.1.0

05 Sep 20:06
35278e0

As far as I can tell, dagster-ray's functionality has been quite stable for a while. The only thing keeping dagster-ray from graduating to 0.1.x was its API -- it wasn't very polished and obviously required lots of improvements.

dagster-ray 0.1.0 comes with a better API for KubeRayCluster that's aligned with standard Kubernetes resources.

It also includes a bunch of new features, the main highlight being, of course, KubeRayInteractiveJob. This resource is a beast: it combines the slick UX of KubeRayCluster with convenient RayJob features such as timeouts, automatic cleanup, and the ability to target existing clusters (via clusterSelector).

KubeRayInteractiveJob is now the recommended way of running Ray applications in client mode on Kubernetes.

LifeCycleACtions(create=<bool>, wait=<bool>, connect=<bool>) options and the .create(), .wait() and .connect() API bring more UX improvements: they make it possible to lazily create a RayCluster/RayJob and only wait for it to become ready once actually needed.

Breaking API changes are still expected in the future (I haven't got to Pipes yet!), but they are now guaranteed to happen only on minor version updates.


Changed

  • [:bomb: breaking] RayResource: top-level skip_init and skip_setup configuration parameters have been removed. The lifecycle field is the new way of configuring steps performed during resource initialization. KubeRayCluster's skip_cleanup has been moved to lifecycle as well.
  • [:bomb: breaking] injected dagster.io/run_id Kubernetes label has been renamed to dagster/run-id. Keys starting with dagster.io/ have been converted to dagster/ to match how dagster-k8s does it.
  • [:bomb: breaking] dagster_ray.kuberay configurations have been unified with KubeRay APIs.
  • dagster-ray now populates Kubernetes labels with more values (including some useful Dagster Cloud values such as git-sha)
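
For anyone updating label selectors or dashboards after the label rename, the mapping can be sketched as a small helper. The function name `migrate_label_key` is hypothetical. Replacing underscores with hyphens for every key is an assumption generalized from the single documented example (dagster.io/run_id becoming dagster/run-id).

```python
def migrate_label_key(key: str) -> str:
    """Map an old-style `dagster.io/...` label key to the new `dagster/...` form.

    Assumption: underscores in the key name become hyphens, as in the
    documented `run_id` -> `run-id` case. Other keys are returned unchanged.
    """
    old_prefix = "dagster.io/"
    if not key.startswith(old_prefix):
        return key
    name = key[len(old_prefix):]
    return "dagster/" + name.replace("_", "-")


old_labels = {"dagster.io/run_id": "abc123", "app.kubernetes.io/name": "ray"}
new_labels = {migrate_label_key(k): v for k, v in old_labels.items()}
```

Here `new_labels` keeps the `app.kubernetes.io/name` entry as it was and carries the run ID under `dagster/run-id`.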

Added

  • KubeRayInteractiveJob -- a new resource that utilizes the new InteractiveMode for RayJob. It can be used to connect to Ray in Client mode -- like KubeRayCluster -- but gives access to RayJob features, such as automatic cleanup (ttlSecondsAfterFinished), retries (backoffLimit) and timeouts (activeDeadlineSeconds).
  • RayResource setup lifecycle has been overhauled: resources now have an actions parameter with 3 configuration options: create, wait and connect. The user can disable them and run .create(), .wait() and .connect() manually if needed.
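
The lazy setup pattern described above can be illustrated with a toy model. The classes `Actions` and `ToyRayResource` below are hypothetical stand-ins that only mirror the create/wait/connect flags and methods named in these notes; they are not dagster-ray classes and do not talk to Kubernetes or Ray.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Actions:
    """Toy flags for which setup steps run automatically (hypothetical)."""
    create: bool = True
    wait: bool = True
    connect: bool = True


@dataclass
class ToyRayResource:
    """Toy resource that records the steps it performs instead of doing them."""
    actions: Actions = field(default_factory=Actions)
    log: List[str] = field(default_factory=list)

    def setup(self) -> None:
        # Run only the steps that are enabled; the rest can be called later.
        if self.actions.create:
            self.create()
        if self.actions.wait:
            self.wait()
        if self.actions.connect:
            self.connect()

    def create(self) -> None:
        self.log.append("create")

    def wait(self) -> None:
        self.log.append("wait")

    def connect(self) -> None:
        self.log.append("connect")


# Create the cluster up front, but defer waiting and connecting.
lazy = ToyRayResource(actions=Actions(wait=False, connect=False))
lazy.setup()      # only "create" runs here
# ... other work that does not need Ray yet ...
lazy.wait()       # block only once the cluster is actually needed
lazy.connect()
```

The point of this arrangement is that cluster creation can start early while unrelated work proceeds, and the blocking wait is paid only at the moment Ray is first used.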

Full Changelog: v0.0.23...v0.1.0

Release v0.1.0-alpha.0

01 Sep 21:03
dc6f11e

Pre-release

This is a pre-release for dagster-ray 0.1.0. It's available for testing.

I want to merge #193 before releasing 0.1.0. There are no other API changes planned (currently).

🔧 Changed

  • 💣 breaking: RayResource: top-level skip_init and skip_setup configuration parameters have been removed. The lifecycle field is the new way of configuring steps performed during resource initialization. KubeRayCluster's skip_cleanup has been moved to lifecycle as well.
  • 💣 breaking: injected dagster.io/run_id Kubernetes label has been renamed to dagster/run-id. Keys starting with dagster.io/ have been converted to dagster/ to match how dagster-k8s does it.
  • 💣 breaking: dagster_ray.kuberay configurations have been unified with KubeRay APIs.
  • dagster-ray now populates Kubernetes labels with more values (including some useful Dagster Cloud values such as git-sha)

✨ Added

  • KubeRayInteractiveJob -- a new resource that utilizes the new InteractiveMode for RayJob. It can be used to connect to Ray in Client mode -- like KubeRayCluster -- but gives access to RayJob features, such as automatic cleanup (ttlSecondsAfterFinished), retries (backoffLimit) and timeouts (activeDeadlineSeconds).
  • RayResource setup lifecycle has been overhauled: resources now have an actions parameter with 3 configuration options: create, wait and connect. The user can disable them and run .create(), .wait() and .connect() manually if needed.

Full Changelog: v0.0.23...v0.1.0-alpha.0

Release v0.0.23

18 Aug 08:40
3fc87b1

What's Changed

Full Changelog: v0.0.22...v0.0.23

Release v0.0.22

15 Aug 10:52
0cbe3de

What's Changed

New Contributors

Full Changelog: v0.0.21...v0.0.22

Release v0.0.21

14 Jul 14:22
844d9e1

What's Changed

New Contributors

Full Changelog: v0.0.20...v0.0.21