Telcodocs 2124: Ensure NUMA Resources Operator Works on HyperShift Hosted Clusters #91162

Open · wants to merge 1 commit into base: main
9 changes: 7 additions & 2 deletions modules/cnf-configuring-kubelet-nro.adoc
@@ -2,7 +2,7 @@
//
// *scalability_and_performance/cnf-numa-aware-scheduling.adoc

-:_module-type: PROCEDURE
+:_mod-docs-content-type: PROCEDURE
[id="cnf-configuring-kubelet-config-nro_{context}"]
= Creating a KubeletConfig CRD

@@ -41,11 +41,16 @@ spec:
memory: "512Mi"
topologyManagerPolicy: "single-numa-node" <5>
----
-<1> Adjust this label to match the `machineConfigPoolSelector` in the `NUMAResourcesOperator` CR.
+<1> Ensure that this label matches the `machineConfigPoolSelector` setting in the `NUMAResourcesOperator` CR that you configure later in "Creating the NUMAResourcesOperator custom resource".
<2> For `cpuManagerPolicy`, `static` must use a lowercase `s`.
<3> Adjust this based on the CPU on your nodes.
<4> For `memoryManagerPolicy`, `Static` must use an uppercase `S`.
<5> `topologyManagerPolicy` must be set to `single-numa-node`.
+
[NOTE]
====
For hosted control plane clusters, the `machineConfigPoolSelector` setting does not have any functional effect. Node association is instead determined by the specified `NodePool` object.
====

.. Create the `KubeletConfig` CR by running the following command:
+
142 changes: 142 additions & 0 deletions modules/cnf-creating-nrop-cr-hosted-control-plane.adoc
@@ -0,0 +1,142 @@
// Module included in the following assemblies:
🤖 [error] OpenShiftAsciiDoc.ModuleContainsContentType: Module is missing the '_mod-docs-content-type' variable.

//
// *scalability_and_performance/cnf-numa-aware-scheduling.adoc

:_mod-docs-content-type: PROCEDURE
[id="cnf-creating-nrop-cr-hosted-control-plane_{context}"]
= Creating the NUMAResourcesOperator custom resource for {hcp}

After you install the NUMA Resources Operator, create the `NUMAResourcesOperator` custom resource (CR) that instructs the NUMA Resources Operator to install all the cluster infrastructure needed to support the NUMA-aware scheduler on {hcp}, including daemon sets and APIs.

--
:FeatureName: Creating the NUMAResourcesOperator custom resource for {hcp}
include::snippets/technology-preview.adoc[]
--

.Prerequisites

* Install the OpenShift CLI (`oc`).
* Log in as a user with `cluster-admin` privileges.
* Install the NUMA Resources Operator.

.Procedure

. Export the management cluster kubeconfig file by running the following command:
+
[source,terminal]
----
$ export KUBECONFIG=</path/to/management-cluster-kubeconfig>
----

. Find the `node-pool-name` for your cluster by running the following command:
+
[source,terminal]
----
$ oc get np -A
----
+
.Example output
[source,terminal]
----
NAMESPACE   NAME                     CLUSTER       DESIRED NODES   CURRENT NODES   AUTOSCALING   AUTOREPAIR   VERSION   UPDATINGVERSION   UPDATINGCONFIG   MESSAGE
clusters    democluster-us-east-1a   democluster   1               1               False         False        4.19.0    False             False
----
+
The `node-pool-name` is the `NAME` field in the output. In this example, the `node-pool-name` is `democluster-us-east-1a`.
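+
The `NAME` column can also be captured programmatically for use in later steps. The following is a hedged sketch that parses the sample output shown above with `awk`; against a live cluster, you could pipe the real `oc get np -A` output instead. The names are illustrative.
+
```shell
# Hedged sketch: extract the node pool name (the NAME column) from the
# sample `oc get np -A` output shown above.
sample_output='NAMESPACE   NAME                     CLUSTER       DESIRED NODES
clusters    democluster-us-east-1a   democluster   1'

# NR==2 selects the first data row; $2 is the NAME column.
node_pool_name=$(printf '%s\n' "$sample_output" | awk 'NR==2 {print $2}')
echo "$node_pool_name"
# prints: democluster-us-east-1a
```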

. Create the `NUMAResourcesOperator` custom resource by completing the following steps:

.. Create a YAML file named `nrop-hcp.yaml`. At a minimum, the file must contain the following content:
+
[source,yaml]
----
apiVersion: nodetopology.openshift.io/v1
kind: NUMAResourcesOperator
metadata:
  name: numaresourcesoperator
spec:
  nodeGroups:
  - poolName: democluster-us-east-1a <1>
----
<1> The `poolName` is the `node-pool-name` retrieved in step 2.

. On the management cluster, run the following command to list the available secrets:
+
[source,terminal]
----
$ oc get secrets -n clusters
----
+
.Example output
[source,terminal]
----
NAME                              TYPE                      DATA   AGE
builder-dockercfg-25qpp           kubernetes.io/dockercfg   1      128m
default-dockercfg-mkvlz           kubernetes.io/dockercfg   1      128m
democluster-admin-kubeconfig      Opaque                    1      127m
democluster-etcd-encryption-key   Opaque                    1      128m
democluster-kubeadmin-password    Opaque                    1      126m
democluster-pull-secret           Opaque                    1      128m
deployer-dockercfg-8lfpd          kubernetes.io/dockercfg   1      128m

. Extract the `kubeconfig` file for the hosted cluster by running the following command:
+
[source,terminal]
----
$ oc get secret <SECRET_NAME> -n clusters -o jsonpath='{.data.kubeconfig}' | base64 -d > hosted-cluster-kubeconfig
----
+
.Example
[source,terminal]
----
$ oc get secret democluster-admin-kubeconfig -n clusters -o jsonpath='{.data.kubeconfig}' | base64 -d > hosted-cluster-kubeconfig
----
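+
The extraction command works because the secret stores the kubeconfig base64-encoded in its `data` field, and `base64 -d` recovers the plain text. The following self-contained round-trip sketch demonstrates the decode step in isolation; the string stands in for real kubeconfig contents.
+
```shell
# Demo of the encode/decode round trip used in the extraction command.
# 'kubeconfig-contents' is a stand-in for the real file contents.
encoded=$(printf 'kubeconfig-contents' | base64)

# This is the same decode step the `oc get secret ... | base64 -d`
# pipeline performs on the jsonpath output.
printf '%s' "$encoded" | base64 -d
# prints: kubeconfig-contents
```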

. Export the hosted cluster `kubeconfig` file by running the following command:
+
[source,terminal]
----
$ export HC_KUBECONFIG=<path_to_hosted-cluster-kubeconfig>
----

. Create the `NUMAResourcesOperator` CR by running the following command on the hosted cluster:
+
[source,terminal]
----
$ oc --kubeconfig="$HC_KUBECONFIG" create -f nrop-hcp.yaml
----

.Verification

. Verify that the NUMA Resources Operator deployed successfully by running the following command:
+
[source,terminal]
----
$ oc --kubeconfig="$HC_KUBECONFIG" get numaresourcesoperators.nodetopology.openshift.io
----
+
.Example output
[source,terminal]
----
NAME                    AGE
numaresourcesoperator   27s
----

. After a few minutes, run the following command to verify that the required resources deployed successfully:
+
[source,terminal]
----
$ oc --kubeconfig="$HC_KUBECONFIG" get all -n openshift-numaresources
----
+
.Example output
[source,terminal]
----
NAME                                                    READY   STATUS    RESTARTS   AGE
pod/numaresources-controller-manager-7d9d84c58d-qk2mr   1/1     Running   0          12m
pod/numaresourcesoperator-democluster-7d96r             2/2     Running   0          97s
pod/numaresourcesoperator-democluster-crsht             2/2     Running   0          97s
pod/numaresourcesoperator-democluster-jp9mw             2/2     Running   0          97s
----
5 changes: 5 additions & 0 deletions modules/cnf-deploying-the-numa-aware-scheduler.adoc
@@ -35,6 +35,11 @@ spec:
----
$ oc create -f nro-scheduler.yaml
----
+
[NOTE]
====
For hosted control plane clusters, run this command on the hosted cluster.
====

. After a few seconds, run the following command to confirm the successful deployment of the required resources:
+
11 changes: 8 additions & 3 deletions modules/cnf-sample-single-numa-policy-from-pp.adoc
@@ -2,7 +2,7 @@
//
// *scalability_and_performance/cnf-numa-aware-scheduling.adoc

-:_module-type: REFERENCE
+:_mod-docs-content-type: REFERENCE
[id="cnf-sample-performance-policy_{context}"]
= Sample performance profile

@@ -32,5 +32,10 @@ spec:
realTime: true
----

-<1> This should match the `MachineConfigPool` that you want to configure the NUMA Resources Operator on. For example, you might have created a `MachineConfigPool` named `worker-cnf` that designates a set of nodes that run telecommunications workloads.
-<2> The `topologyPolicy` must be set to `single-numa-node`. Ensure that this is the case by setting the `topology-manager-policy` argument to `single-numa-node` when running the PPC tool.
+<1> This value should match the `MachineConfigPool` value that you want to configure the NUMA Resources Operator on. For example, you might create a `MachineConfigPool` object named `worker-cnf` that designates a set of nodes that run telecommunications workloads. The value for `MachineConfigPool` should match the `machineConfigPoolSelector` value in the `NUMAResourcesOperator` CR that you configure later in "Creating the NUMAResourcesOperator custom resource".
+<2> Ensure that the `topologyPolicy` is set to `single-numa-node` by setting the `topology-manager-policy` argument to `single-numa-node` when you run the PPC tool.
+
[NOTE]
====
For hosted control plane clusters, the `machineConfigPoolSelector` does not have any functional effect. Node association is instead determined by the specified `NodePool` object.
====
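Because node association on hosted control planes flows through the `NodePool` object rather than a `MachineConfigPool`, performance tuning is typically attached to the node pool itself. The following is a hedged sketch of that pattern, not content from this PR: it assumes a `NodePool` named `democluster-us-east-1a` from the earlier example, and the ConfigMap name `perfprofile-democluster` is illustrative. See "Creating a performance profile for hosted control planes" for the supported procedure.

```yaml
# Hedged sketch: a NodePool referencing a tuning ConfigMap that wraps
# the performance profile. Names are illustrative.
apiVersion: hypershift.openshift.io/v1beta1
kind: NodePool
metadata:
  name: democluster-us-east-1a
  namespace: clusters
spec:
  tuningConfig:
  - name: perfprofile-democluster
```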
21 changes: 14 additions & 7 deletions scalability_and_performance/cnf-numa-aware-scheduling.adoc
@@ -32,13 +32,7 @@ include::modules/cnf-installing-numa-resources-operator-cli.adoc[leveloffset=+2]

include::modules/cnf-installing-numa-resources-operator-console.adoc[leveloffset=+2]

-include::modules/cnf-scheduling-numa-aware-workloads-overview.adoc[leveloffset=+1]
-
-include::modules/cnf-creating-nrop-cr.adoc[leveloffset=+2]
-
-include::modules/cnf-deploying-the-numa-aware-scheduler.adoc[leveloffset=+2]
-
-include::modules/cnf-configuring-single-numa-policy.adoc[leveloffset=+2]
+include::modules/cnf-configuring-single-numa-policy.adoc[leveloffset=+1]

[role="_additional-resources"]
.Additional resources
@@ -51,6 +45,19 @@ include::modules/cnf-sample-single-numa-policy-from-pp.adoc[leveloffset=+2]

include::modules/cnf-configuring-kubelet-nro.adoc[leveloffset=+2]

include::modules/cnf-scheduling-numa-aware-workloads-overview.adoc[leveloffset=+1]

include::modules/cnf-creating-nrop-cr.adoc[leveloffset=+2]

include::modules/cnf-creating-nrop-cr-hosted-control-plane.adoc[leveloffset=+2]

[role="_additional-resources"]
.Additional resources

* xref:../scalability_and_performance/cnf-tuning-low-latency-hosted-cp-nodes-with-perf-profile.adoc#cnf-create-performance-profiles-hosted-cp[Creating a performance profile for hosted control planes]

include::modules/cnf-deploying-the-numa-aware-scheduler.adoc[leveloffset=+2]

include::modules/cnf-scheduling-numa-aware-workloads.adoc[leveloffset=+2]

include::modules/cnf-configuring-node-groups-for-the-numaresourcesoperator.adoc[leveloffset=+1]