* Update operator/api to now use PodCliqueSet instead of PodGangSet.
* Update doc strings for scheduler/api to use PodCliqueSet instead of
PodGangSet.
* Regenerated CRDs, clientset and deepcopy functions.
* Regenerated API docs.
* Initial commit on adaption of PodGangSet to PodCliqueSet in the rest
of the code base.
* Adapted charts to PodCliqueSet and the changed OperatorConfig.
* Adapted operator/samples/simple to use PodCliqueSet.
* Adapted operator/internal/components to use PodCliqueSet
* Changed docs/installation.md to reflect PodCliqueSet changes.
* Renamed component/podgangset to component/podcliqueset
* Adapted operator/internal/controller to use PodCliqueSet.
* Adapted rest of the operator/internal packages to use PodCliqueSet.
* Adapted operator/test to use PodCliqueSet
* Adapted docs to use PodCliqueSet
* Corrected charts to now use PodCliqueSet
* Corrected Dockerfile label to use PodCliqueSet
* Changed docs/assets to use PodCliqueSet
---------
Signed-off-by: madhav bhargava <madhav.bhargava@sap.com>
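
A rename of this breadth is easy to leave incomplete, so a quick scan for leftover references can help when reviewing it. The following is a minimal sketch, not part of this commit: the helper name `findStaleLines` and the sample input are hypothetical. It matches the old name case-insensitively so that both `PodGangSet` and package paths such as `component/podgangset` are caught, while the scheduler API name `PodGang`, which this commit keeps, is not flagged.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strings"
)

// staleName matches the old resource name in any casing (PodGangSet,
// podgangset, ...). It does not match "PodGang" on its own, which remains
// the name of the scheduler API.
var staleName = regexp.MustCompile(`(?i)podgangset`)

// findStaleLines (hypothetical helper) returns the 1-based numbers of the
// lines in text that still mention the old name.
func findStaleLines(text string) []int {
	var hits []int
	scanner := bufio.NewScanner(strings.NewReader(text))
	for n := 1; scanner.Scan(); n++ {
		if staleName.MatchString(scanner.Text()) {
			hits = append(hits, n)
		}
	}
	return hits
}

func main() {
	// Illustrative input only; not taken from the repository.
	sample := "kind: PodCliqueSet\n" +
		"# scheduled as a PodGang\n" +
		"path: component/podgangset\n"
	fmt.Println(findStaleLines(sample)) // prints [3]
}
```

Note that `bufio.Scanner` uses a default per-line buffer limit, so very long lines (for example in generated files) would need `scanner.Buffer` to raise it.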
README.md: 11 additions & 11 deletions
@@ -4,7 +4,7 @@
 # Grove

-Grove is a Kubernetes API purpose-built for orchestrating AI workloads in GPU clusters, where a single custom resource allows you to hierarchically compose multiple AI components with flexible gang-scheduling and auto-scaling specfications at multiple levels. Through native support for network topology-aware gang scheduling, multi-dimensional auto-scaling and prescriptive startup ordering, Grove enables developers to define complex AI stacks in a concise, declarative, and framework-agnostic manner.
+Grove is a Kubernetes API purpose-built for orchestrating AI workloads in GPU clusters, where a single custom resource allows you to hierarchically compose multiple AI components with flexible gang-scheduling and auto-scaling specification at multiple levels. Through native support for network topology-aware gang scheduling, multidimensional auto-scaling and prescriptive startup ordering, Grove enables developers to define complex AI stacks in a concise, declarative, and framework-agnostic manner.

 Grove was originally motivated by the challenges of orchestrating multinode, disaggregated inference systems. It provides a consistent and unified API that allows users to define, configure, and scale prefill, decode, and any other components like routing within a single custom resource. However, it is flexible enough to map naturally to the roles, scaling behaviors, and dependencies of any real-world inference systems, from "traditional" single node aggregated inference to agentic pipelines with multiple models.
@@ -15,24 +15,24 @@ Modern inference systems are often no longer single-pod workloads. They involve
 ## Core Concepts

-The Grove API consists of a user API and a scheduling API. While the user API (`PodGangSet`, `PodClique`, `PodCliqueScalingGroup`) allows users to represent their AI workloads, the scheduling API (`PodGang`) enables scheduler integration to support the network topology-optimized gang-scheduling and auto-scaling requirements of the workload.
+The Grove API consists of a user API and a scheduling API. While the user API (`PodCliqueSet`, `PodClique`, `PodCliqueScalingGroup`) allows users to represent their AI workloads, the scheduling API (`PodGang`) enables scheduler integration to support the network topology-optimized gang-scheduling and auto-scaling requirements of the workload.
-|[PodGangSet](operator/api/core/v1alpha1/podgangset.go)| The top-level Grove object that defines a group of components managed and colocated together. Also supports autoscaling with topology aware spread of PodGangSet replicas for availability. |
-|[PodClique](operator/api/core/v1alpha1/podclique.go)| A group of pods representing a specific role (e.g., leader, worker, frontend). Each clique has an independent configuration and supports custom scaling logic. |
-|[PodCliqueScalingGroup](operator/api/core/v1alpha1/scalinggroup.go)| A set of PodCliques that scale and are scheduled together. Ideal for tightly coupled roles like prefill leader and worker. |
-|[PodGang](scheduler/api/core/v1alpha1/podgang.go)| The scheduler API that defines a unit of gang-scheduling. A PodGang is a collection of groups of similar pods, where each pod group defines a minimum number of replicas guaranteed for gang-scheduling. |
+|[PodCliqueSet](operator/api/core/v1alpha1/podcliqueset.go)| The top-level Grove object that defines a group of components managed and colocated together. Also supports autoscaling with topology aware spread of PodCliqueSet replicas for availability.|
+|[PodClique](operator/api/core/v1alpha1/podclique.go)| A group of pods representing a specific role (e.g., leader, worker, frontend). Each clique has an independent configuration and supports custom scaling logic.|
+|[PodCliqueScalingGroup](operator/api/core/v1alpha1/scalinggroup.go)| A set of PodCliques that scale and are scheduled together. Ideal for tightly coupled roles like prefill leader and worker. |
+|[PodGang](scheduler/api/core/v1alpha1/podgang.go)| The scheduler API that defines a unit of gang-scheduling. A PodGang is a collection of groups of similar pods, where each pod group defines a minimum number of replicas guaranteed for gang-scheduling. |

 ## Key Capabilities

 - **Declarative composition of Role-Based Pod Groups**
-`PodGangSet` API provides users a capability to declaratively compose tightly coupled group of pods with explicit role based logic, e.g. disaggregated roles in a model serving stack such as `prefill`, `decode` and `routing`.
+`PodCliqueSet` API provides users a capability to declaratively compose tightly coupled group of pods with explicit role based logic, e.g. disaggregated roles in a model serving stack such as `prefill`, `decode` and `routing`.
 - **Flexible Gang Scheduling**
-`PodClique`'s and `PodCliqueScalingGroup`s allow users to specify flexible gang-scheduling requirements at multiple levels within a `PodGangSet` to prevent resource deadlocks.
+`PodClique`'s and `PodCliqueScalingGroup`s allow users to specify flexible gang-scheduling requirements at multiple levels within a `PodCliqueSet` to prevent resource deadlocks.
 - **Multi-level Horizontal Auto-Scaling**
-Supports pluggable horizontal auto-scaling solutions to scale `PodGangSet`, `PodClique` and `PodCliqueScalingGroup` custom resources.
+Supports pluggable horizontal auto-scaling solutions to scale `PodCliqueSet`, `PodClique` and `PodCliqueScalingGroup` custom resources.
 - **Network Topology-Aware Scheduling**
 Allows specifying network topology pack and spread constraints to optimize for both network performance and service availability.
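
The README text in this diff describes a `PodGang` as a collection of pod groups, each with a minimum number of replicas guaranteed for gang-scheduling. The all-or-nothing idea behind that can be sketched as below. This is an illustration under stated assumptions, not Grove's types or its scheduler logic: `PodGroup`, `MinReplicas`, `gangFits` and the notion of interchangeable "free slots" (one slot per pod) are all hypothetical simplifications.

```go
package main

import "fmt"

// PodGroup is a hypothetical, simplified stand-in for one group of similar
// pods inside a gang. It is not the Grove API type.
type PodGroup struct {
	Name        string
	MinReplicas int // minimum pods that must be placed together
}

// gangFits reports whether the minimum replicas of every group can be placed
// at the same time. Assumption: each pod consumes exactly one free slot.
// Either the whole gang fits or nothing is admitted, which avoids partially
// placed workloads holding resources while waiting for the rest.
func gangFits(groups []PodGroup, freeSlots int) bool {
	needed := 0
	for _, g := range groups {
		needed += g.MinReplicas
	}
	return needed <= freeSlots
}

func main() {
	gang := []PodGroup{
		{Name: "prefill", MinReplicas: 2},
		{Name: "decode", MinReplicas: 4},
	}
	fmt.Println(gangFits(gang, 5)) // false: 6 pods needed, 5 slots free
	fmt.Println(gangFits(gang, 6)) // true
}
```

A real scheduler would compare per-pod resource requests and topology constraints rather than a single slot count; the sketch only shows the admission rule.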