[WIP] OSDOCS-14875: Kueue FairSharing docs #94463

Open
wants to merge 1 commit into
base: kueue-docs
8 changes: 6 additions & 2 deletions _topic_maps/_topic_map.yml
@@ -23,14 +23,16 @@
# topic groups and topics on the main page.

---
Name: About
Name: Overview
Dir: welcome
Distros: openshift-kueue
Topics:
- Name: Welcome
File: index
- Name: About Red Hat build of Kueue
- Name: Introduction to Red Hat build of Kueue
File: about-kueue
- Name: Understanding Red Hat build of Kueue components
File: kueue-components
---
Name: Install
Dir: install
@@ -53,6 +55,8 @@ Distros: openshift-kueue
Topics:
- Name: Configuring quotas
File: configuring-quotas
- Name: Configuring fair sharing
File: configuring-fairsharing
---
Name: Develop
Dir: develop
27 changes: 27 additions & 0 deletions configure/configuring-fairsharing.adoc
@@ -0,0 +1,27 @@
:_mod-docs-content-type: ASSEMBLY
include::_attributes/common-attributes.adoc[]
[id="configuring-fairsharing"]
= Configuring fair sharing
:context: configuring-fairsharing

toc::[]

You can enable weighted access to _borrowable_ resources among different tenants in a cluster queue _cohort_ by configuring fair sharing.

include::snippets/snippet-borrowable-resources.adoc[]
include::snippets/snippet-cohort.adoc[]

When fair sharing is enabled, {product-title} uses the cluster queue `weight` values and the preemption policies set by the `withinClusterQueue` and `reclaimWithinCohort` configurations to determine whether pending workloads can preempt admitted workloads.

After you configure fair sharing, {product-title} assigns a numeric share value to each cluster queue. The share value summarizes the usage of borrowed resources in a cluster queue, in comparison to the other cluster queues in the same cohort.

The share value is weighted by the `.spec.fairSharing.weight` value defined in the `ClusterQueue` object.

During admission, {product-title} prefers to admit workloads from the cluster queues that have the lowest share value. During preemption, {product-title} prefers to preempt workloads from the cluster queues that have the highest share value.
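
The following sketch shows how a weight and the preemption policies might be set on a cluster queue. It assumes the upstream Kueue `kueue.x-k8s.io/v1beta1` API, so verify the `apiVersion` and field names against the custom resource definitions installed in your cluster. The queue name, cohort name, `default-flavor` resource flavor, quota amounts, and weight are illustrative values only.

.Example `ClusterQueue` object with a fair sharing weight (illustrative)
[source,yaml]
----
apiVersion: kueue.x-k8s.io/v1beta1 # Assumption: upstream Kueue API group and version
kind: ClusterQueue
metadata:
  name: queue-a
spec:
  cohort: example
  namespaceSelector: {} # Matches all namespaces
  fairSharing:
    weight: "2" # Weights the share value calculated for this cluster queue
  preemption:
    withinClusterQueue: LowerPriority # Preempt only lower-priority workloads in this queue
    reclaimWithinCohort: Any # Reclaim lent quota from any workload in the cohort
  resourceGroups:
  - coveredResources: ["cpu", "memory"]
    flavors:
    - name: default-flavor # Hypothetical resource flavor name
      resources:
      - name: cpu
        nominalQuota: 10
      - name: memory
        nominalQuota: 20Gi
----

In upstream Kueue, a higher weight lowers the share value that is calculated for the same amount of borrowing, so a cluster queue with a higher weight is favored during admission and is less likely to be chosen for preemption.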

[id="configuring-fairsharing-prereqs"]
== Prerequisites

* Ensure that classic preemption using the `borrowWithinCohort` configuration is not enabled.
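
As a rough check for this prerequisite, the following fragment shows a `ClusterQueue` object in which borrowing-based preemption is explicitly disabled. It assumes the upstream Kueue `kueue.x-k8s.io/v1beta1` API, where `Never` is the default `borrowWithinCohort` policy. The queue name is illustrative, and the other fields of the object are omitted.

.Example `ClusterQueue` fragment with `borrowWithinCohort` disabled (illustrative)
[source,yaml]
----
apiVersion: kueue.x-k8s.io/v1beta1 # Assumption: upstream Kueue API group and version
kind: ClusterQueue
metadata:
  name: queue-a
spec:
  preemption:
    borrowWithinCohort:
      policy: Never # Do not preempt workloads in order to borrow from the cohort
----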
3 changes: 1 addition & 2 deletions modules/configuring-clusterqueues.adoc
@@ -6,8 +6,7 @@
[id="configuring-clusterqueues_{context}"]
= Configuring a cluster queue

A cluster queue is a cluster-scoped resource, represented by a `ClusterQueue` object, that governs a pool of resources such as CPU, memory, and pods.
Cluster queues can be used to define usage limits, quotas for resource flavors, order of consumption, and fair sharing rules.
include::snippets/snippet-clusterqueue.adoc[]

[NOTE]
====
49 changes: 49 additions & 0 deletions modules/configuring-cohorts.adoc
@@ -0,0 +1,49 @@
// Module included in the following assemblies:
//
// * welcome/kueue-components.adoc

:_mod-docs-content-type: REFERENCE
[id="configuring-cohorts_{context}"]
= Example cohort configuration

In the following example configuration, cluster queues A and B are included in the `example` cohort:

.Example `ClusterQueue` object A
[source,yaml]
----
apiVersion: kueue.openshift.io/v1
kind: ClusterQueue
metadata:
  name: queue-a
spec:
  cohort: example
  resourceQuota:
    static:
      cpu: "10"
      memory: "20Gi"
----

.Example `ClusterQueue` object B
[source,yaml]
----
apiVersion: kueue.openshift.io/v1
kind: ClusterQueue
metadata:
  name: queue-b
spec:
  cohort: example
  resourceQuota:
    static:
      cpu: "15"
      memory: "30Gi"
----

.Example `Cohort` object
[source,yaml]
----
apiVersion: kueue.openshift.io/v1
kind: Cohort
metadata:
  name: example
spec: {}
----
4 changes: 1 addition & 3 deletions modules/configuring-localqueues.adoc
@@ -6,9 +6,7 @@
[id="configuring-localqueues_{context}"]
= Configuring a local queue

A local queue is a namespaced object, represented by a `LocalQueue` object, that groups closely related workloads that belong to a single namespace.

As an administrator, you can configure a `LocalQueue` object to point to a cluster queue. This allocates resources from the cluster queue to workloads in the namespace specified in the `LocalQueue` object.
include::snippets/snippet-localqueue.adoc[]

.Prerequisites

4 changes: 2 additions & 2 deletions modules/configuring-resourceflavors.adoc
@@ -6,11 +6,11 @@
[id="configuring-resourceflavors_{context}"]
= Configuring a resource flavor

After you have configured a `ClusterQueue` object, you can configure a `ResourceFlavor` object.
include::snippets/snippet-resourceflavor.adoc[]

Resources in a cluster are typically not homogeneous. If the resources in your cluster are homogeneous, you can use an empty `ResourceFlavor` instead of adding labels to custom resource flavors.

You can use a custom `ResourceFlavor` object to represent different resource variations that are associated with cluster nodes through labels, taints, and tolerations. You can then associate workloads with specific node types to enable fine-grained resource management.
After you have configured a `ClusterQueue` object, you can configure a `ResourceFlavor` object. You can then associate workloads with specific node types to enable fine-grained resource management.

.Prerequisites

12 changes: 12 additions & 0 deletions snippets/snippet-borrowable-resources.adoc
@@ -0,0 +1,12 @@
// Text snippet included in the following modules:
//
// *
//
// Text snippet included in the following assemblies:
//
// * welcome/kueue-components.adoc
// * configure/configuring-fairsharing.adoc

:_mod-docs-content-type: SNIPPET

Borrowable resources are the unused nominal quota of all the cluster queues in a cohort.
12 changes: 12 additions & 0 deletions snippets/snippet-clusterqueue.adoc
@@ -0,0 +1,12 @@
// Text snippet included in the following modules:
//
// * modules/configuring-clusterqueues.adoc
//
// Text snippet included in the following assemblies:
//
// * welcome/kueue-components.adoc

:_mod-docs-content-type: SNIPPET

A cluster queue is a cluster-scoped resource, represented by a `ClusterQueue` object, that governs a pool of resources such as CPU, memory, and pods.
Cluster queues can be used to define usage limits, quotas for resource flavors, order of consumption, and fair sharing rules.
12 changes: 12 additions & 0 deletions snippets/snippet-cohort.adoc
@@ -0,0 +1,12 @@
// Text snippet included in the following modules:
//
// *
//
// Text snippet included in the following assemblies:
//
// * welcome/kueue-components.adoc
// * configure/configuring-fairsharing.adoc

:_mod-docs-content-type: SNIPPET

A cohort is a group of cluster queues, defined by a `Cohort` object, that can share borrowable resources with one another.
13 changes: 13 additions & 0 deletions snippets/snippet-localqueue.adoc
@@ -0,0 +1,13 @@
// Text snippet included in the following modules:
//
// * modules/configuring-localqueues.adoc
//
// Text snippet included in the following assemblies:
//
// * welcome/kueue-components.adoc

:_mod-docs-content-type: SNIPPET

A local queue is a namespaced object, represented by a `LocalQueue` object, that groups closely related workloads that belong to a single namespace.

As an administrator, you can configure a `LocalQueue` object to point to a cluster queue. This allocates resources from the cluster queue to workloads in the namespace specified in the `LocalQueue` object.
12 changes: 12 additions & 0 deletions snippets/snippet-resourceflavor.adoc
@@ -0,0 +1,12 @@
// Text snippet included in the following modules:
//
// * modules/configuring-resourceflavors.adoc
//
// Text snippet included in the following assemblies:
//
// * welcome/kueue-components.adoc

:_mod-docs-content-type: SNIPPET

Resource flavors represent different resource variations that are associated with cluster nodes through labels, taints, and tolerations.
Resource flavors are defined in a `ResourceFlavor` object.
6 changes: 2 additions & 4 deletions welcome/about-kueue.adoc
@@ -1,7 +1,7 @@
:_mod-docs-content-type: ASSEMBLY
include::_attributes/common-attributes.adoc[]
[id="about-kueue"]
= About {product-title}
= Introduction to {product-title}
:context: about-kueue

toc::[]
@@ -18,11 +18,8 @@ In the context of {product-title}, a job can be defined as a one-time or on-dema

{product-title} is compatible with environments that use heterogeneous, elastic resources. This means that the environment has many different resource types, and those resources are capable of dynamic scaling.

{product-title} does not replace any existing components in a Kubernetes cluster, but instead integrates with the existing Kubernetes API server, scheduler, and cluster autoscaler components.

{product-title} supports all-or-nothing semantics. This means that either an entire job with all of its components is admitted to the cluster, or the entire job is rejected if it does not fit on the cluster.

// Personas
[id="about-kueue-personas"]
== Personas

@@ -33,6 +30,7 @@ Batch users:: Batch users run jobs on the cluster. Examples of batch users might
Serving users:: Serving users run jobs on the cluster, for example, to expose a trained AI/ML model for inference.
Platform developers:: Platform developers integrate {product-title} with other software. They might also contribute to the Kueue open source project.


[id="about-kueue-workflow"]
== Workflow overview
// TODO: add diagram?
38 changes: 38 additions & 0 deletions welcome/kueue-components.adoc
@@ -0,0 +1,38 @@
:_mod-docs-content-type: ASSEMBLY
include::_attributes/common-attributes.adoc[]
[id="kueue-components"]
= Understanding {product-title} components
:context: kueue-components

toc::[]

{product-title} does not replace any existing components in a Kubernetes cluster, but instead integrates with the existing Kubernetes API server, scheduler, and cluster autoscaler components.

The following sections describe the {product-title} components.

[id="kueue-components-borrowable-resources"]
== Borrowable resources

include::snippets/snippet-borrowable-resources.adoc[]

[id="kueue-components-cohort"]
== Cohorts

include::snippets/snippet-cohort.adoc[]

include::modules/configuring-cohorts.adoc[leveloffset=+2]

[id="kueue-components-cq"]
== Cluster queues

include::snippets/snippet-clusterqueue.adoc[]

[id="kueue-components-lq"]
== Local queues

include::snippets/snippet-localqueue.adoc[]

[id="kueue-components-resourceflavor"]
== Resource flavors

include::snippets/snippet-resourceflavor.adoc[]