fhpa-usage tutorial construction by Krishiv-Mahajan · Pull Request #28 · karmada-io/playground

Krishiv-Mahajan · 2026-05-22T11:06:50Z

Description

This PR introduces a complete, interactive Killercoda tutorial demonstrating how to use the Karmada FederatedHPA (FHPA) to perform cross-cluster workload autoscaling.
The scenario guides the user through setting up a multi-cluster metrics pipeline, deploying a scalable workload, and triggering dynamic scale-up and scale-down events based on CPU utilization.

Testing:

The scenario can be tested at: https://killercoda.com/karmada-demo/scenario/karmada-FHPA-example

karmada-bot · 2026-05-22T11:06:54Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign jwcesign for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

gemini-code-assist

Code Review

This pull request introduces a new tutorial for Karmada FederatedHPA, providing scripts and documentation to demonstrate multi-cluster autoscaling based on CPU metrics. The reviewer suggested several improvements to the setup script's robustness, including better error handling in generated scripts, safer JSON patching for resource limits, and ensuring idempotency for node operations. Additionally, feedback was provided to correct invalid kubeconfig merging logic and fix minor typos and redundancies in the documentation and verification scripts.

gemini-code-assist · 2026-05-22T11:11:54Z

@@ -0,0 +1,3 @@
+# Summary
+
+In this scenario, we installed metrics-server on member clusters and enabled the karmada-metrics-adapter on the control plane to provide CPU metrics for autoscaling,created a FederatedHPA resource to monitor CPU utilization across all member clusters and automatically adjust the number of replicas based on workload demand. To test the scaling behavior, we generated CPU load using the `williamyeh/hey` load-generator pod and observed the FederatedHPA trigger a scale-up event. After stopping the load generation, we also observed the FederatedHPA scale the workload back down automatically once CPU usage returned to normal.


There is a missing space after the comma in "autoscaling,created".

Suggested change

In this scenario, we installed metrics-server on member clusters and enabled the karmada-metrics-adapter on the control plane to provide CPU metrics for autoscaling,created a FederatedHPA resource to monitor CPU utilization across all member clusters and automatically adjust the number of replicas based on workload demand. To test the scaling behavior, we generated CPU load using the `williamyeh/hey` load-generator pod and observed the FederatedHPA trigger a scale-up event. After stopping the load generation, we also observed the FederatedHPA scale the workload back down automatically once CPU usage returned to normal.

In this scenario, we installed metrics-server on member clusters and enabled the karmada-metrics-adapter on the control plane to provide CPU metrics for autoscaling, created a FederatedHPA resource to monitor CPU utilization across all member clusters and automatically adjust the number of replicas based on workload demand. To test the scaling behavior, we generated CPU load using the `williamyeh/hey` load-generator pod and observed the FederatedHPA trigger a scale-up event. After stopping the load generation, we also observed the FederatedHPA scale the workload back down automatically once CPU usage returned to normal.

gemini-code-assist · 2026-05-22T11:11:54Z

+KUBECONFIG_PATH=${KUBECONFIG_PATH:-"${HOME}/.kube"}
+
+function installKind() {
+    cat << EOF > installKind.sh


The generated script installKind.sh should include set -e to ensure it exits immediately if any command (like wget) fails. This applies to other generated scripts in this file as well (e.g., createCluster.sh, installMetricsServer.sh, etc.).

Suggested change

cat << EOF > installKind.sh

cat << EOF > installKind.sh

set -e
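
To illustrate the effect of this suggestion, here is a minimal sketch that generates two throwaway scripts, one without and one with `set -e`. The file names `no_guard.sh` and `guarded.sh` are hypothetical, and `false` stands in for a failing command such as `wget`. The sketch quotes the heredoc delimiter (`'EOF'`) so that nothing is expanded at generation time; the PR uses an unquoted `EOF` and escapes variables as `\$HOME` instead.

```shell
#!/bin/bash
# Hypothetical demo: compare a generated script with and without `set -e`.
workdir=$(mktemp -d)

# Without set -e: the failing command is ignored and the script carries on.
cat << 'EOF' > "${workdir}/no_guard.sh"
false            # stands in for a failed download
echo "continued"
EOF

# With set -e: the script stops at the first failing command.
cat << 'EOF' > "${workdir}/guarded.sh"
set -e
false            # stands in for a failed download
echo "continued"
EOF

# Capture output and exit status of each run without aborting this script.
no_guard_out=$(bash "${workdir}/no_guard.sh") && no_guard_rc=0 || no_guard_rc=$?
guarded_out=$(bash "${workdir}/guarded.sh") && guarded_rc=0 || guarded_rc=$?

echo "without set -e: rc=${no_guard_rc} output='${no_guard_out}'"
echo "with set -e:    rc=${guarded_rc} output='${guarded_out}'"
rm -rf "${workdir}"
```

The unguarded script reports success even though a step failed, which is the situation the review comment is trying to avoid.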

gemini-code-assist · 2026-05-22T11:11:54Z

+}
+
+function createCluster() {
+    cat << EOF > createCluster.sh


The generated script createCluster.sh should include set -e to ensure it exits immediately if cluster creation or configuration fails.

Suggested change

cat << EOF > createCluster.sh

cat << EOF > createCluster.sh

set -e

gemini-code-assist · 2026-05-22T11:11:54Z

+    kind delete cluster --name=member1 || true
+    kind create cluster --name=member1 --config=cluster1.yaml
+    # Patch kindnet to use less CPU
+    kubectl --kubeconfig \$HOME/.kube/config patch daemonset kindnet -n kube-system --type='json' -p='[{"op": "replace", "path": "/spec/template/spec/containers/0/resources/requests/cpu", "value": "50m"}, {"op": "replace", "path": "/spec/template/spec/containers/0/resources/limits/cpu", "value": "200m"}]'


Using replace in a JSON patch will fail if the target path (e.g., resources) does not already exist in the object. Since kindnet typically does not have resource requests/limits defined by default in Kind, it is safer to use the add operation for the entire resources object.

Suggested change

kubectl --kubeconfig \$HOME/.kube/config patch daemonset kindnet -n kube-system --type='json' -p='[{"op": "replace", "path": "/spec/template/spec/containers/0/resources/requests/cpu", "value": "50m"}, {"op": "replace", "path": "/spec/template/spec/containers/0/resources/limits/cpu", "value": "200m"}]'

kubectl --kubeconfig \$HOME/.kube/config patch daemonset kindnet -n kube-system --type='json' -p='[{"op": "add", "path": "/spec/template/spec/containers/0/resources", "value": {"requests": {"cpu": "50m"}, "limits": {"cpu": "200m"}}}]'
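
As a sketch of how the suggested patch could be built without repeating the long JSON string for each cluster, the helper below prints the same `add` operation for given CPU values. The function name `build_resources_patch` is hypothetical and not part of the PR.

```shell
#!/bin/bash
# Hypothetical helper (not part of the PR): build a JSON patch that sets the
# whole resources object of the first container in a single "add" operation.
build_resources_patch() {
    local cpu_request="$1" cpu_limit="$2"
    printf '[{"op": "add", "path": "/spec/template/spec/containers/0/resources", "value": {"requests": {"cpu": "%s"}, "limits": {"cpu": "%s"}}}]' \
        "${cpu_request}" "${cpu_limit}"
}

patch=$(build_resources_patch 50m 200m)
echo "${patch}"
```

The output could then be passed to `kubectl patch ... --type='json' -p="${patch}"`. One trade-off to keep in mind: in JSON Patch (RFC 6902), `add` on an object member that already exists replaces its value, so any memory requests or limits already set on the container would be dropped by this patch.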

gemini-code-assist · 2026-05-22T11:11:54Z

+
+    kind delete cluster --name=member2 || true
+    kind create cluster --name=member2 --config=cluster2.yaml
+    kubectl --kubeconfig \$HOME/.kube/config patch daemonset kindnet -n kube-system --type='json' -p='[{"op": "replace", "path": "/spec/template/spec/containers/0/resources/requests/cpu", "value": "50m"}, {"op": "replace", "path": "/spec/template/spec/containers/0/resources/limits/cpu", "value": "200m"}]'


As with the previous patch, using add is safer than replace if the resources field is missing from the kindnet manifest.

Suggested change

kubectl --kubeconfig \$HOME/.kube/config patch daemonset kindnet -n kube-system --type='json' -p='[{"op": "replace", "path": "/spec/template/spec/containers/0/resources/requests/cpu", "value": "50m"}, {"op": "replace", "path": "/spec/template/spec/containers/0/resources/limits/cpu", "value": "200m"}]'

kubectl --kubeconfig \$HOME/.kube/config patch daemonset kindnet -n kube-system --type='json' -p='[{"op": "add", "path": "/spec/template/spec/containers/0/resources", "value": {"requests": {"cpu": "50m"}, "limits": {"cpu": "200m"}}}]'

gemini-code-assist · 2026-05-22T11:11:54Z

+    kubectl --kubeconfig \$HOME/.kube/config patch deployment coredns -n kube-system --type='json' -p='[{"op": "replace", "path": "/spec/template/spec/containers/0/resources/requests/cpu", "value": "30m"}, {"op": "replace", "path": "/spec/template/spec/containers/0/resources/limits/cpu", "value": "100m"}]'
+    mv \$HOME/.kube/config config-member2
+
+    KUBECONFIG=~/config-member1:~/config-member2 kubectl config view --merge --flatten >> ${KUBECONFIG_PATH}/config


Appending to a kubeconfig file using >> is incorrect as it creates an invalid YAML structure (multiple documents without separators). Furthermore, this merged config on the member node appears to be unused since the individual config files are scp-ed back to the host cluster and used explicitly via --kubeconfig in subsequent steps. This line should be removed to avoid confusion and potential corruption of the local config.
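
The problem with `>>` can be shown without a cluster. The sketch below uses stand-in files in a temporary directory; the three-line YAML content is a placeholder for a real kubeconfig and for the output of `kubectl config view --merge --flatten`. If a merged file were ever needed, writing to a temporary file and renaming it is one way to replace the content rather than append to it.

```shell
#!/bin/bash
# Hypothetical demo with stand-in files; no cluster or kubectl is needed.
workdir=$(mktemp -d)
target="${workdir}/config"

# Pretend this is an existing kubeconfig.
printf 'apiVersion: v1\nkind: Config\nclusters: []\n' > "${target}"

# Stand-in for the output of `kubectl config view --merge --flatten`.
merged=$(printf 'apiVersion: v1\nkind: Config\nclusters: []\n')

# Appending (>>) leaves two sets of top-level keys in one file.
echo "${merged}" >> "${target}"
appended_count=$(grep -c '^apiVersion:' "${target}")

# Writing to a temporary file and renaming it replaces the content instead.
echo "${merged}" > "${target}.tmp"
mv "${target}.tmp" "${target}"
replaced_count=$(grep -c '^apiVersion:' "${target}")

echo "after >>: ${appended_count} apiVersion keys; after > and mv: ${replaced_count}"
rm -rf "${workdir}"
```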

gemini-code-assist · 2026-05-22T11:11:54Z

+        root@${member_cluster_ip}:~
+}
+
+kubectl delete node node01


If the script is re-run or if the node has already been removed from the cluster, kubectl delete node will return an error and stop the script (due to set -e). Adding || true makes the script more idempotent.

Suggested change

kubectl delete node node01

kubectl delete node node01 || true
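
A small sketch of the behavior described above, using a stand-in for the delete command so it runs without a cluster. The function `delete_node` and the marker file are hypothetical; the function fails when the "node" is already gone, as `kubectl delete node` does.

```shell
#!/bin/bash
set -e
workdir=$(mktemp -d)
touch "${workdir}/node01"          # stand-in for the node object

# Hypothetical stand-in for `kubectl delete node`: fails if the node is gone.
delete_node() {
    rm "${workdir}/$1" 2>/dev/null
}

delete_node node01 || true         # first run: the node exists, delete succeeds
delete_node node01 || true         # re-run: delete fails, but the script continues
reached_end=yes
echo "script finished: ${reached_end}"
rm -rf "${workdir}"
```

With a real cluster, `kubectl delete` also accepts `--ignore-not-found`, which treats a missing resource as a successful delete while still surfacing other errors; `|| true` hides every failure.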

gemini-code-assist · 2026-05-22T11:11:54Z

+#!/bin/bash
+set -e
+
+set -e


The set -e command is duplicated.

zhzhuang-zju

Thanks @Krishiv-Mahajan

zhzhuang-zju · 2026-05-23T08:54:08Z

+1. Join `kind-member1` and `kind-member2` to the host cluster.
+
+   RUN `MEMBER_CLUSTER_NAME=kind-member1`{{exec}}
+
+   This sets the variable `MEMBER_CLUSTER_NAME` to `kind-member1` for use in the join command.
+
+   RUN `karmadactl --kubeconfig /etc/karmada/karmada-apiserver.config join ${MEMBER_CLUSTER_NAME} --cluster-kubeconfig=$HOME/.kube/config-member1 --cluster-context=kind-member1`{{exec}}
+
+   This joins the `kind-member1` cluster to the Karmada control plane using its kubeconfig file and context.
+
+   RUN `MEMBER_CLUSTER_NAME=kind-member2`{{exec}}
+
+   This sets the variable to `kind-member2` for the second cluster join.
+
+   RUN `karmadactl --kubeconfig /etc/karmada/karmada-apiserver.config join ${MEMBER_CLUSTER_NAME} --cluster-kubeconfig=$HOME/.kube/config-member2 --cluster-context=kind-member2`{{exec}}


You keep switching between different scenarios here, which is not ideal.
Just use the non-variable approach.

Yeah, I got a bit confused.

I will use the non-variable approach in all scenarios.

zhzhuang-zju · 2026-05-23T08:55:52Z

+The FederatedHPA (FHPA) relies on a two-layer metrics pipeline to gather the data needed for autoscaling.
+
+First, the `metrics-server` component must be running on each member cluster. It is responsible for collecting per-pod resource utilization data (such as CPU and memory usage) at the local cluster level.


Suggested change

The FederatedHPA (FHPA) relies on a two-layer metrics pipeline to gather the data needed for autoscaling.

First, the `metrics-server` component must be running on each member cluster. It is responsible for collecting per-pod resource utilization data (such as CPU and memory usage) at the local cluster level.

We need to install metrics-server on the member clusters to provide the metrics API.

zhzhuang-zju · 2026-05-23T08:57:46Z

+
+RUN `bash ~/installMetricsServer.sh`{{exec}}
+
+It automatically downloads the upstream metrics-server manifest, patches it with the `--kubelet-insecure-tls=true` flag for compatibility with our Kind environment, and applies it to both `kind-member1` and `kind-member2`.


Suggested change

It automatically downloads the upstream metrics-server manifest, patches it with the `--kubelet-insecure-tls=true` flag for compatibility with our Kind environment, and applies it to both `kind-member1` and `kind-member2`.

It automatically downloads the upstream metrics-server manifest and applies it to both `kind-member1` and `kind-member2`.

zhzhuang-zju · 2026-05-23T09:37:13Z

+
+**2. Register the Custom Metrics API:**
+Next, we must register the custom metrics `APIService` on both member clusters so the adapter can securely access their local metric endpoints.
+


The demo FHPA uses resource metrics, right? Why do we still need to register the Custom Metrics API?

You are absolutely right, we don't need it. I will just remove it.

zhzhuang-zju · 2026-05-23T09:41:59Z

+
+RUN `kubectl --kubeconfig /etc/karmada/karmada-apiserver.config get deployment nginx`{{exec}}
+
+This command confirms that the Nginx Deployment template has been successfully registered in the control plane.


Suggested change

This command confirms that the Nginx Deployment template has been successfully registered in the control plane.

zhzhuang-zju · 2026-05-23T11:23:50Z

+
+> **Note:** It takes a brief moment for the scheduler to distribute the workload and for the clusters to pull the container image. If the command returns "No resources found", wait ~30 seconds and run it again. Since our initial replica count is 1, you should see exactly 1 pod running on one of the member clusters.
+>
+> *Troubleshooting:* If you see several lines of `Unhandled Error` regarding `metrics.k8s.io`, don't worry! This is normal and just indicates that the Karmada metrics adapter is still starting up in the background. You can safely ignore these warnings.


Just a gentle correction. This is unrelated to Karmada metrics adapter, it actually depends on the metrics server within member clusters.

We can move the verification step earlier. After deploying metrics-server, confirm that it is available by checking that `kubectl --kubeconfig $HOME/.kube/config-memberX top pods` runs without errors.

Thanks for the correction

zhzhuang-zju · 2026-05-23T11:31:12Z

+**Verify the Multi-Cluster Service:**
+
+RUN `karmadactl --kubeconfig /etc/karmada/karmada-apiserver.config get svc --operation-scope members`{{exec}}
+
+> *Note: If you see `Unhandled Error` warnings regarding metrics, you can safely ignore them.*
+
+You should see the `nginx-service` running on the member clusters. This is the service we will use to generate load!


Suggested change

**Verify the Multi-Cluster Service:**

RUN `karmadactl --kubeconfig /etc/karmada/karmada-apiserver.config get svc --operation-scope members`{{exec}}

> *Note: If you see `Unhandled Error` warnings regarding metrics, you can safely ignore them.*

You should see the `nginx-service` running on the member clusters. This is the service we will use to generate load!

It is improper to verify Multi-Cluster Service via the command `karmadactl --kubeconfig /etc/karmada/karmada-apiserver.config get svc --operation-scope members`. Services on member clusters are not created by Multi-Cluster Service; they are distributed to member clusters in advance via PropagationPolicy.

We can remove this part for simplicity.

I've applied this change and completely removed that misleading verification step from step10/text.md.

zhzhuang-zju

Thanks, you made a great improvement compared with the initial version.

After all the new scenarios are merged, you can submit a PR to update them in README.md

zhzhuang-zju · 2026-05-25T01:52:55Z

+
+To confirm the metrics server is running normally, wait a few moments and then check if it can successfully serve pod metrics:
+
+RUN `kubectl --kubeconfig=$HOME/.kube/config-member1 top pods --all-namespaces`{{exec}}


Suggested change

RUN `kubectl --kubeconfig=$HOME/.kube/config-member1 top pods --all-namespaces`{{exec}}

RUN `kubectl --kubeconfig=$HOME/.kube/config-member1 top pods --all-namespaces`{{exec}}

RUN `kubectl --kubeconfig=$HOME/.kube/config-member2 top pods --all-namespaces`{{exec}}

zhzhuang-zju · 2026-05-25T01:58:29Z

+set -e
+
+kubectl --kubeconfig=$HOME/.kube/config-member1 -n kube-system get deployment metrics-server
+kubectl --kubeconfig=$HOME/.kube/config-member2 -n kube-system get deployment metrics-server


Suggested change

kubectl --kubeconfig=$HOME/.kube/config-member2 -n kube-system get deployment metrics-server

kubectl --kubeconfig=$HOME/.kube/config-member2 -n kube-system get deployment metrics-server

kubectl --kubeconfig=$HOME/.kube/config-member1 top pods --all-namespaces

kubectl --kubeconfig=$HOME/.kube/config-member2 top pods --all-namespaces

We can add a metrics availability check here to avoid unhandled metrics.k8s.io error warnings in subsequent steps.
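
Because metrics-server needs some time before it can serve metrics, a single `top pods` call in a `set -e` script may fail right after installation. One possible approach, sketched below, is a small retry helper. The name `retry`, the attempt count, and the delay are assumptions for illustration and are not part of the PR; `flaky_top` is a stand-in that fails twice before succeeding so the sketch runs without a cluster.

```shell
#!/bin/bash
# Hypothetical helper (not part of the PR): retry a command until it succeeds
# or the attempt budget is used up.
retry() {
    local attempts="$1" delay="$2"
    shift 2
    local i
    for ((i = 1; i <= attempts; i++)); do
        if "$@" > /dev/null 2>&1; then
            return 0
        fi
        # Wait between attempts, but not after the last one.
        if ((i < attempts)); then sleep "${delay}"; fi
    done
    return 1
}

# Stand-in for `kubectl top pods`: fails twice, then succeeds.
counter_file=$(mktemp)
echo 0 > "${counter_file}"
flaky_top() {
    local n
    n=$(($(cat "${counter_file}") + 1))
    echo "${n}" > "${counter_file}"
    [ "${n}" -ge 3 ]
}

if retry 5 0 flaky_top; then
    retry_result=ok
else
    retry_result=failed
fi
calls=$(cat "${counter_file}")
echo "result=${retry_result} after ${calls} calls"
rm -f "${counter_file}"
```

In the verification script this could wrap the real check, for example `retry 12 5 kubectl --kubeconfig=$HOME/.kube/config-member1 top pods --all-namespaces`, where 12 attempts and a 5-second delay are illustrative values.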

zhzhuang-zju · 2026-05-25T01:59:01Z

+> *Note: As before, you can safely ignore any `metrics.k8s.io` Unhandled Error warnings if they appear.*
+


Suggested change

> *Note: As before, you can safely ignore any `metrics.k8s.io` Unhandled Error warnings if they appear.*

zhzhuang-zju · 2026-05-25T02:19:35Z

+With the local metrics servers running, we now need to bridge that data to the Karmada control plane so the FHPA controller can make global scaling decisions.
+
+**Install the `karmada-metrics-adapter`:**
+This add-on runs on the Karmada control plane and aggregates the metrics collected from the member clusters. It also automatically registers the `custom.metrics.k8s.io` APIService in the control plane, which the FederatedHPA controller uses to fetch metrics.


Suggested change

This add-on runs on the Karmada control plane and aggregates the metrics collected from the member clusters. It also automatically registers the `custom.metrics.k8s.io` APIService in the control plane, which the FederatedHPA controller uses to fetch metrics.

This add-on runs on the Karmada control plane and aggregates the metrics collected from the member clusters. It also automatically registers the `metrics.k8s.io` and `custom.metrics.k8s.io` APIServices in the control plane, which the FederatedHPA controller uses to fetch metrics.

zhzhuang-zju · 2026-05-25T02:20:38Z

+set -e
+
+kubectl --kubeconfig $HOME/.kube/config -n karmada-system get deployment karmada-metrics-adapter
+kubectl --kubeconfig $HOME/.kube/config get apiservice v1beta1.custom.metrics.k8s.io


Suggested change

kubectl --kubeconfig $HOME/.kube/config get apiservice v1beta1.custom.metrics.k8s.io

kubectl --kubeconfig $HOME/.kube/config get apiservice v1beta1.metrics.k8s.io

Krishiv-Mahajan · 2026-05-25T04:22:32Z

> Thanks, you made a great improvement compared with the initial version.
>
> After all the new scenarios are merged, you can submit a PR to update them in README.md

OK, once all the PRs are merged, I will update README.md.

Signed-off-by: Krishiv-Mahajan <mahajankrishiv10@gmail.com>

zhzhuang-zju · 2026-05-25T09:44:19Z

Thanks
/lgtm
/cc @RainbowMango for APPROVAL

karmada-bot · 2026-05-25T09:44:22Z

@zhzhuang-zju: GitHub didn't allow me to request PR reviews from the following users: for, APPROVAL.

Note that only karmada-io members and repo collaborators can review this PR, and authors cannot review their own PRs.

Details

In response to this:

Thanks
/lgtm
/cc @RainbowMango for APPROVAL

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

karmada-bot requested review from RainbowMango and jwcesign May 22, 2026 11:06

karmada-bot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label May 22, 2026

gemini-code-assist Bot reviewed May 22, 2026


zhzhuang-zju mentioned this pull request May 23, 2026

[LFX term 1] Enhance Karmada's Quick Start Experience and Incorporate macOS Support karmada-io/karmada#7269

Open

13 tasks

zhzhuang-zju reviewed May 23, 2026


zhzhuang-zju reviewed May 25, 2026


Krishiv-Mahajan force-pushed the fhpa-usage branch from ecc8f88 to 013d297 Compare May 25, 2026 04:19

fhpa-usage tutorial construction

5ebe4f3

Signed-off-by: Krishiv-Mahajan <mahajankrishiv10@gmail.com>

Krishiv-Mahajan force-pushed the fhpa-usage branch from 2f8d5a9 to 5ebe4f3 Compare May 25, 2026 09:41

karmada-bot assigned zhzhuang-zju May 25, 2026

karmada-bot added the lgtm Indicates that a PR is ready to be merged. label May 25, 2026

