22 changes: 11 additions & 11 deletions CHANDELOG.md → CHANGELOG.md
@@ -13,41 +13,41 @@ Learn more in [Cluster Sharing docs](tutorial/kuberay.md/#cluster-sharing).

### Added
- `KubeRayCluster.cluster_sharing` parameter that controls cluster sharing behavior.
-- `dagster_ray.kuberay.sensors.cleanup_expired_kuberay_clusters` sensor that cleans up expired clusters (both shared and non-shared). Learn mode in [docs](api/kuberay.md#dagster_ray.kuberay.sensors.cleanup_expired_kuberay_clusters)
-- `dagster-ray` entry now appears in the Dagster libraries list in the web UI
+- `dagster_ray.kuberay.sensors.cleanup_expired_kuberay_clusters` sensor that cleans up expired clusters (both shared and non-shared). Learn more in [docs](api/kuberay.md#dagster_ray.kuberay.sensors.cleanup_expired_kuberay_clusters).
+- `dagster-ray` entry now appears in the Dagster libraries list in the web UI.

### Changed
-- [:bomb: breaking] - removed `cleanup_kuberay_clusters_op` and other associated definitions in favor of `dagster_ray.kuberay.sensors.cleanup_expired_kuberay_clusters` sensor that is more flexible
+- [:bomb: breaking] - removed `cleanup_kuberay_clusters_op` and other associated definitions in favor of `dagster_ray.kuberay.sensors.cleanup_expired_kuberay_clusters` sensor that is more flexible.

## 0.3.1

### Added
- `failure_tolerance_timeout` configuration parameter for `KubeRayInteractiveJob` and `KubeRayCluster`. It can be set to a positive value to give the cluster some time to transition out of `failed` state (which can be transient in some scenarios) before raising an error.

### Fixes
-- ensure both `.head.serviceIP` and `.head.serviceName` are set on the `RayCluster` while waiting for cluster readiness
+- ensure both `.head.serviceIP` and `.head.serviceName` are set on the `RayCluster` while waiting for cluster readiness.

## 0.3.0

-This release includes massive docs improvements and drops support for Python 3.9
+This release includes massive docs improvements and drops support for Python 3.9.

### Changes

-- [:bomb: breaking] dropped Python 3.9 support (EOL October 2025)
-- [internal] most of the general, backend-agnostic code has been moved to `dagster_ray.core` (top-level imports still work)
+- [:bomb: breaking] dropped Python 3.9 support (EOL October 2025).
+- [internal] most of the general, backend-agnostic code has been moved to `dagster_ray.core` (top-level imports still work).

## 0.2.1

### Fixes

-- Fixed broken wheel on PyPI
+- Fixed broken wheel on PyPI.

## 0.2.0

### Changed
- `KubeRayInteractiveJob.deletion_strategy` now defaults to `DeleteCluster` for both successful and failed executions. This is a reasonable default for the use case.
- `KubeRayInteractiveJob.ttl_seconds_after_finished` now defaults to `600` seconds.
-- `KubeRayCluster.lifecycle.cleanup` now defaults to `always`
+- `KubeRayCluster.lifecycle.cleanup` now defaults to `always`.
- [:bomb: breaking] `RayJob` and `RayCluster` clients and resources Kubernetes init parameters have been renamed to `kube_config` and `kube_context`.

### Added
@@ -64,8 +64,8 @@ This release includes massive docs improvements and drops support for Python 3.9
- [:bomb: breaking] `RayResource`: top-level `skip_init` and `skip_setup` configuration parameters have been removed. The `lifecycle` field is the new way of configuring steps performed during resource initialization. `KubeRayCluster`'s `skip_cleanup` has been moved to `lifecycle` as well.
- [:bomb: breaking] injected `dagster.io/run_id` Kubernetes label has been renamed to `dagster/run-id`. Keys starting with `dagster.io/` have been converted to `dagster/` to match how `dagster-k8s` does it.
- [:bomb: breaking] `dagster_ray.kuberay` Configurations have been unified with KubeRay APIs.
-- `dagster-ray` now populates Kubernetes labels with more values (including some useful Dagster Cloud values such as `git-sha`)
+- `dagster-ray` now populates Kubernetes labels with more values (including some useful Dagster Cloud values such as `git-sha`).

### Added
-- `KubeRayInteractiveJob` -- a resource that utililizes the new `InteractiveMode` for `RayJob`. It can be used to connect to Ray in Client mode -- like `KubeRayCluster` -- but gives access to `RayJob` features, such as automatic cleanup (`ttlSecondsAfterFinished`), retries (`backoffLimit`) and timeouts (`activeDeadlineSeconds`).
+- `KubeRayInteractiveJob` -- a resource that utilizes the new `InteractiveMode` for `RayJob`. It can be used to connect to Ray in Client mode -- like `KubeRayCluster` -- but gives access to `RayJob` features, such as automatic cleanup (`ttlSecondsAfterFinished`), retries (`backoffLimit`) and timeouts (`activeDeadlineSeconds`).
- `RayResource` setup lifecycle has been overhauled: resources now have an `actions` parameter with 3 configuration options: `create`, `wait` and `connect`. The user can disable them and run `.create()`, `.wait()` and `.connect()` manually if needed.
2 changes: 1 addition & 1 deletion docs/changelog.md
2 changes: 1 addition & 1 deletion docs/tutorial/kuberay.md
@@ -126,7 +126,7 @@ ray_cluster = KubeRayInteractiveJob(

## KubeRayCluster

-While [`KubeRayInteractiveJob`](../api/kuberay.md#dagster_ray.kuberay.KubeRayInteractiveJob) is recommended for production environments, [`KubeRayCluster`](../api/kuberay.md#dagster_ray.kuberay.KubeRayCluster) might be better alternative for dev environments.
+While [`KubeRayInteractiveJob`](../api/kuberay.md#dagster_ray.kuberay.KubeRayInteractiveJob) is recommended for production environments, [`KubeRayCluster`](../api/kuberay.md#dagster_ray.kuberay.KubeRayCluster) might be a better alternative for dev environments.

Unlike `KubeRayInteractiveJob`, which can outsource garbage collection to the KubeRay controller, `KubeRayCluster` is entirely responsible for cluster management. This is bad for production environments (may result in dangling `RayCluster` instances if the Dagster step pod fails unexpectedly), but good for dev environments, because it allows `dagster-ray` to implement **cluster sharing**.

2 changes: 1 addition & 1 deletion examples/local/run_launcher/README.md
@@ -17,6 +17,6 @@ dagster dev

3. From the UI, run the example job and observe how the steps are executed in a Ray job.

-Note that this example doens't have the `ray_executor` configured, so steps will be executed in the same Ray job using the default `multiprocess_executor`.
+Note that this example doesn't have the `ray_executor` configured, so steps will be executed in the same Ray job using the default `multiprocess_executor`.

To see an example of how to use both `RayRunLauncher` and `ray_executor`, see the [RunLauncher and Executor example](../run_launcher_and_executor/README.md).