ray-project/kuberay

KubeRay

KubeRay is a powerful, open-source Kubernetes operator that simplifies the deployment and management of Ray applications on Kubernetes. It offers several key components:

KubeRay core: This is the official, fully maintained component of KubeRay. It provides three custom resource definitions: RayCluster, RayJob, and RayService. These resources are designed to help you run a wide range of workloads with ease.

  • RayCluster: KubeRay fully manages the lifecycle of RayCluster, including cluster creation/deletion, autoscaling, and ensuring fault tolerance.

  • RayJob: With RayJob, KubeRay automatically creates a RayCluster and submits a job when the cluster is ready. You can also configure RayJob to automatically delete the RayCluster once the job finishes.

  • RayService: RayService is made up of two parts: a RayCluster and a Ray Serve deployment graph. RayService offers zero-downtime upgrades for RayCluster and high availability.
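
As a rough illustration of the RayJob behavior described above, the sketch below builds a minimal RayJob manifest as a Python dict and prints it as JSON. The helper make_rayjob, the job name, the entrypoint script, and the image tag are hypothetical placeholders chosen for this example. The field names (entrypoint, shutdownAfterJobFinishes, rayClusterSpec) are assumed to follow the ray.io/v1 RayJob CRD, so check them against the CRD version installed in your cluster.

```python
import json


def make_rayjob(name: str, entrypoint: str, image: str) -> dict:
    """Build a minimal RayJob manifest as a plain dict (illustrative helper)."""
    # A single head container; worker groups are omitted to keep the sketch short.
    head_container = {"name": "ray-head", "image": image}
    return {
        "apiVersion": "ray.io/v1",
        "kind": "RayJob",
        "metadata": {"name": name},
        "spec": {
            # Command submitted once the RayCluster is ready.
            "entrypoint": entrypoint,
            # Ask KubeRay to delete the RayCluster after the job finishes.
            "shutdownAfterJobFinishes": True,
            "rayClusterSpec": {
                "headGroupSpec": {
                    "rayStartParams": {},
                    "template": {"spec": {"containers": [head_container]}},
                },
            },
        },
    }


# All three arguments are placeholder values, not taken from the KubeRay docs.
manifest = make_rayjob(
    name="example-rayjob",
    entrypoint="python /home/ray/samples/example.py",
    image="rayproject/ray:2.9.0",
)
print(json.dumps(manifest, indent=2))
```

Because kubectl accepts JSON manifests, the printed output could be piped to kubectl apply -f - on a cluster where the KubeRay operator is already installed.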

KubeRay ecosystem: Some optional components.

  • Kubectl Plugin (Beta): Starting from KubeRay v1.3.0, you can use the kubectl ray plugin to simplify common workflows when deploying Ray on Kubernetes, which is especially helpful if you aren’t familiar with Kubernetes. See kubectl-plugin for more details.

  • KubeRay APIServer (Alpha): It provides a layer of simplified configuration for KubeRay resources. The KubeRay API server is used internally by some organizations to back user interfaces for KubeRay resource management.

  • KubeRay Dashboard (Experimental): Starting from KubeRay v1.4.0, we have introduced a new dashboard that enables users to view and manage KubeRay resources. While it is not yet production-ready, we welcome your feedback.

Documentation

Since September 2023, all user-facing KubeRay documentation has been hosted on the Ray documentation. The KubeRay repository only contains documentation related to the development and maintenance of KubeRay.

Quick Start

Examples

KubeRay examples are hosted on the Ray documentation. Examples span a wide range of use cases, including training, LLM online inference, batch inference, and more.

Kubernetes Ecosystem

KubeRay integrates with the Kubernetes ecosystem, including observability tools (e.g., Prometheus, Grafana, py-spy), queuing systems (e.g., Volcano, Apache YuniKorn, Kueue), ingress controllers (e.g., Nginx), and more. See KubeRay Ecosystem for more details.

Blog Posts

Talks

Development

Please read our CONTRIBUTING guide before making a pull request. Refer to our DEVELOPMENT guide to build and run tests locally.

Getting Involved

Join Ray's Slack workspace, and search for the following public channel:

  • #kuberay-questions: This channel aims to help KubeRay users with their questions. The messages will be closely monitored by the Ray and KubeRay maintainers.

KubeRay contributors are welcome to join the bi-weekly KubeRay community meetings.

Security

If you discover, or think you may have discovered, a potential security issue in this project, we ask that you notify KubeRay Security via our Slack Channel. Please do not create a public GitHub issue.

License

This project is licensed under the Apache-2.0 License.