fhpa-usage tutorial construction #28

Open
Krishiv-Mahajan wants to merge 1 commit into karmada-io:main from Krishiv-Mahajan:fhpa-usage

@Krishiv-Mahajan
Contributor

@Krishiv-Mahajan Krishiv-Mahajan commented May 22, 2026

part of karmada-io/karmada#7269

Description

This PR introduces a complete, interactive Killercoda tutorial demonstrating how to use the Karmada FederatedHPA (FHPA) to perform cross-cluster workload autoscaling.
The scenario guides the user through setting up a multi-cluster metrics pipeline, deploying a scalable workload, and triggering dynamic scale-up and scale-down events based on CPU utilization.

Testing:

The scenario can be tested at: https://killercoda.com/karmada-demo/scenario/karmada-FHPA-example

@karmada-bot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign jwcesign for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@karmada-bot karmada-bot added the size/XL label (denotes a PR that changes 500-999 lines, ignoring generated files) on May 22, 2026

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces a new tutorial for Karmada FederatedHPA, providing scripts and documentation to demonstrate multi-cluster autoscaling based on CPU metrics. The reviewer suggested several improvements to the setup script's robustness, including better error handling in generated scripts, safer JSON patching for resource limits, and ensuring idempotency for node operations. Additionally, feedback was provided to correct invalid kubeconfig merging logic and fix minor typos and redundancies in the documentation and verification scripts.

Comment thread karmada-FHPA-example/finish.md Outdated
@@ -0,0 +1,3 @@
# Summary

In this scenario, we installed metrics-server on member clusters and enabled the karmada-metrics-adapter on the control plane to provide CPU metrics for autoscaling,created a FederatedHPA resource to monitor CPU utilization across all member clusters and automatically adjust the number of replicas based on workload demand. To test the scaling behavior, we generated CPU load using the `williamyeh/hey` load-generator pod and observed the FederatedHPA trigger a scale-up event. After stopping the load generation, we also observed the FederatedHPA scale the workload back down automatically once CPU usage returned to normal.

medium

There is a missing space after the comma in "autoscaling,created".

Suggested change
In this scenario, we installed metrics-server on member clusters and enabled the karmada-metrics-adapter on the control plane to provide CPU metrics for autoscaling,created a FederatedHPA resource to monitor CPU utilization across all member clusters and automatically adjust the number of replicas based on workload demand. To test the scaling behavior, we generated CPU load using the `williamyeh/hey` load-generator pod and observed the FederatedHPA trigger a scale-up event. After stopping the load generation, we also observed the FederatedHPA scale the workload back down automatically once CPU usage returned to normal.
In this scenario, we installed metrics-server on member clusters and enabled the karmada-metrics-adapter on the control plane to provide CPU metrics for autoscaling, created a FederatedHPA resource to monitor CPU utilization across all member clusters and automatically adjust the number of replicas based on workload demand. To test the scaling behavior, we generated CPU load using the `williamyeh/hey` load-generator pod and observed the FederatedHPA trigger a scale-up event. After stopping the load generation, we also observed the FederatedHPA scale the workload back down automatically once CPU usage returned to normal.

KUBECONFIG_PATH=${KUBECONFIG_PATH:-"${HOME}/.kube"}

function installKind() {
cat << EOF > installKind.sh

medium

The generated script installKind.sh should include set -e to ensure it exits immediately if any command (like wget) fails. This applies to other generated scripts in this file as well (e.g., createCluster.sh, installMetricsServer.sh, etc.).

Suggested change
cat << EOF > installKind.sh
cat << EOF > installKind.sh
set -e
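
As a minimal sketch of the effect (not part of the PR; the file names `with_e.sh` and `without_e.sh` and the use of `false` as a stand-in for a failing `wget` are assumptions for illustration), the following generates two scripts through a heredoc, in the same way `foreground.sh` does, and shows that only the one starting with `set -e` stops at the failing command:

```shell
#!/bin/bash
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)

# Generated script WITHOUT set -e: keeps going after the failure.
cat << EOF > "${workdir}/without_e.sh"
false   # stand-in for a failing download such as wget
touch "${workdir}/without_e.done"
EOF

# Generated script WITH set -e: aborts at the first failing command.
cat << EOF > "${workdir}/with_e.sh"
set -e
false   # stand-in for a failing download such as wget
touch "${workdir}/with_e.done"
EOF

# "|| true" keeps this outer demo running even when a generated script fails.
bash "${workdir}/without_e.sh" || true
bash "${workdir}/with_e.sh" || true

# Only without_e.done is expected among the generated files.
ls "${workdir}"
```

Because the heredoc delimiter is unquoted, `${workdir}` is expanded when the script is generated, which is why the original file escapes variables such as `\$HOME` that must be expanded later.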

}

function createCluster() {
cat << EOF > createCluster.sh

medium

The generated script createCluster.sh should include set -e to ensure it exits immediately if cluster creation or configuration fails.

Suggested change
cat << EOF > createCluster.sh
cat << EOF > createCluster.sh
set -e

Comment thread karmada-FHPA-example/foreground.sh Outdated
kind delete cluster --name=member1 || true
kind create cluster --name=member1 --config=cluster1.yaml
# Patch kindnet to use less CPU
kubectl --kubeconfig \$HOME/.kube/config patch daemonset kindnet -n kube-system --type='json' -p='[{"op": "replace", "path": "/spec/template/spec/containers/0/resources/requests/cpu", "value": "50m"}, {"op": "replace", "path": "/spec/template/spec/containers/0/resources/limits/cpu", "value": "200m"}]'

medium

Using replace in a JSON patch will fail if the target path (e.g., resources) does not already exist in the object. Since kindnet typically does not have resource requests/limits defined by default in Kind, it is safer to use the add operation for the entire resources object.

Suggested change
kubectl --kubeconfig \$HOME/.kube/config patch daemonset kindnet -n kube-system --type='json' -p='[{"op": "replace", "path": "/spec/template/spec/containers/0/resources/requests/cpu", "value": "50m"}, {"op": "replace", "path": "/spec/template/spec/containers/0/resources/limits/cpu", "value": "200m"}]'
kubectl --kubeconfig \$HOME/.kube/config patch daemonset kindnet -n kube-system --type='json' -p='[{"op": "add", "path": "/spec/template/spec/containers/0/resources", "value": {"requests": {"cpu": "50m"}, "limits": {"cpu": "200m"}}}]'

Comment thread karmada-FHPA-example/foreground.sh Outdated

kind delete cluster --name=member2 || true
kind create cluster --name=member2 --config=cluster2.yaml
kubectl --kubeconfig \$HOME/.kube/config patch daemonset kindnet -n kube-system --type='json' -p='[{"op": "replace", "path": "/spec/template/spec/containers/0/resources/requests/cpu", "value": "50m"}, {"op": "replace", "path": "/spec/template/spec/containers/0/resources/limits/cpu", "value": "200m"}]'

medium

As with the previous patch, using add is safer than replace if the resources field is missing from the kindnet manifest.

Suggested change
kubectl --kubeconfig \$HOME/.kube/config patch daemonset kindnet -n kube-system --type='json' -p='[{"op": "replace", "path": "/spec/template/spec/containers/0/resources/requests/cpu", "value": "50m"}, {"op": "replace", "path": "/spec/template/spec/containers/0/resources/limits/cpu", "value": "200m"}]'
kubectl --kubeconfig \$HOME/.kube/config patch daemonset kindnet -n kube-system --type='json' -p='[{"op": "add", "path": "/spec/template/spec/containers/0/resources", "value": {"requests": {"cpu": "50m"}, "limits": {"cpu": "200m"}}}]'
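
Since the same patch is suggested for both member clusters, one option is to build it once and reuse it. The sketch below is only an illustration: `cpu_patch` and `kindnet_patch` are hypothetical names that do not appear in the PR, and the `kubectl` call is left as a comment so the snippet runs without a cluster.

```shell
#!/bin/bash
# Hypothetical helper (not in the PR): prints a JSON patch that sets the
# first container's CPU request ($1) and CPU limit ($2) with a single "add".
cpu_patch() {
  printf '[{"op": "add", "path": "/spec/template/spec/containers/0/resources", "value": {"requests": {"cpu": "%s"}, "limits": {"cpu": "%s"}}}]' "$1" "$2"
}

kindnet_patch=$(cpu_patch 50m 200m)
echo "${kindnet_patch}"

# Intended use inside createCluster.sh, once per member cluster:
#   kubectl --kubeconfig "$HOME/.kube/config" patch daemonset kindnet \
#     -n kube-system --type='json' -p="${kindnet_patch}"
```

One thing to keep in mind: under RFC 6902, `add` on an object member that already exists replaces its value, so this patch overwrites whatever `resources` block is present, including any memory settings.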

Comment thread karmada-FHPA-example/foreground.sh Outdated
kubectl --kubeconfig \$HOME/.kube/config patch deployment coredns -n kube-system --type='json' -p='[{"op": "replace", "path": "/spec/template/spec/containers/0/resources/requests/cpu", "value": "30m"}, {"op": "replace", "path": "/spec/template/spec/containers/0/resources/limits/cpu", "value": "100m"}]'
mv \$HOME/.kube/config config-member2

KUBECONFIG=~/config-member1:~/config-member2 kubectl config view --merge --flatten >> ${KUBECONFIG_PATH}/config

medium

Appending to a kubeconfig file using >> is incorrect as it creates an invalid YAML structure (multiple documents without separators). Furthermore, this merged config on the member node appears to be unused since the individual config files are scp-ed back to the host cluster and used explicitly via --kubeconfig in subsequent steps. This line should be removed to avoid confusion and potential corruption of the local config.
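
To illustrate the failure mode, here is a small sketch using stand-in files. The file contents are hypothetical and far smaller than a real kubeconfig; the point is only that `>>` stacks a second document onto the first, while `>` leaves a single one.

```shell
#!/bin/bash
# Stand-in files (hypothetical content, not real kubeconfigs).
workdir=$(mktemp -d)
printf 'apiVersion: v1\nkind: Config\ncurrent-context: kind-member1\n' > "${workdir}/config"
printf 'apiVersion: v1\nkind: Config\ncurrent-context: kind-member2\n' > "${workdir}/merged"

# Appending (>>) adds the merged output after the existing content,
# so every top-level key now appears twice in the same file.
cp "${workdir}/config" "${workdir}/appended"
cat "${workdir}/merged" >> "${workdir}/appended"
appended_keys=$(grep -c '^apiVersion:' "${workdir}/appended")

# Overwriting (>) leaves exactly one document.
cp "${workdir}/config" "${workdir}/overwritten"
cat "${workdir}/merged" > "${workdir}/overwritten"
overwritten_keys=$(grep -c '^apiVersion:' "${workdir}/overwritten")

echo "appended: ${appended_keys} apiVersion keys, overwritten: ${overwritten_keys}"
```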

Comment thread karmada-FHPA-example/foreground.sh Outdated
root@${member_cluster_ip}:~
}

kubectl delete node node01

medium

If the script is re-run or if the node has already been removed from the cluster, kubectl delete node will return an error and stop the script (due to set -e). Adding || true makes the script more idempotent.

Suggested change
kubectl delete node node01
kubectl delete node node01 || true
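
A minimal sketch of how `|| true` interacts with `set -e`. The `delete_node` function is a hypothetical stand-in for `kubectl delete node` in the case where the node is already gone, so the snippet runs without a cluster:

```shell
#!/bin/bash
set -e

# Hypothetical stand-in for "kubectl delete node node01" when the node
# has already been removed: it reports an error and returns non-zero.
delete_node() {
  echo "node $1 not found" >&2
  return 1
}

# Without "|| true", set -e would stop the script on this line.
delete_node node01 || true

status="reached the end"
echo "${status}"
```

Note that `|| true` hides every failure of the command, not just the not-found case. `kubectl delete` also accepts `--ignore-not-found`, which is narrower and may be a better fit here.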

Comment thread karmada-FHPA-example/step4/verify.sh Outdated
#!/bin/bash
set -e

set -e

medium

The set -e command is duplicated.

Contributor

@zhzhuang-zju zhzhuang-zju left a comment

Comment thread karmada-FHPA-example/step5/text.md Outdated
Comment on lines +3 to +17
1. Join `kind-member1` and `kind-member2` to the host cluster.

RUN `MEMBER_CLUSTER_NAME=kind-member1`{{exec}}

This sets the variable `MEMBER_CLUSTER_NAME` to `kind-member1` for use in the join command.

RUN `karmadactl --kubeconfig /etc/karmada/karmada-apiserver.config join ${MEMBER_CLUSTER_NAME} --cluster-kubeconfig=$HOME/.kube/config-member1 --cluster-context=kind-member1`{{exec}}

This joins the `kind-member1` cluster to the Karmada control plane using its kubeconfig file and context.

RUN `MEMBER_CLUSTER_NAME=kind-member2`{{exec}}

This sets the variable to `kind-member2` for the second cluster join.

RUN `karmadactl --kubeconfig /etc/karmada/karmada-apiserver.config join ${MEMBER_CLUSTER_NAME} --cluster-kubeconfig=$HOME/.kube/config-member2 --cluster-context=kind-member2`{{exec}}
Contributor

You keep switching between different scenarios here, which is not ideal.
Just use the non-variables approach.

Contributor Author

Yeah, I got a bit confused.

Contributor Author

I will use the non-variable approach in all scenarios.

Comment thread karmada-FHPA-example/step6/text.md Outdated
Comment on lines +3 to +5
The FederatedHPA (FHPA) relies on a two-layer metrics pipeline to gather the data needed for autoscaling.

First, the `metrics-server` component must be running on each member cluster. It is responsible for collecting per-pod resource utilization data (such as CPU and memory usage) at the local cluster level.
Contributor

Suggested change
The FederatedHPA (FHPA) relies on a two-layer metrics pipeline to gather the data needed for autoscaling.
First, the `metrics-server` component must be running on each member cluster. It is responsible for collecting per-pod resource utilization data (such as CPU and memory usage) at the local cluster level.
We need to install metrics-server on the member clusters to provide the metrics API.

Comment thread karmada-FHPA-example/step6/text.md Outdated

RUN `bash ~/installMetricsServer.sh`{{exec}}

It automatically downloads the upstream metrics-server manifest, patches it with the `--kubelet-insecure-tls=true` flag for compatibility with our Kind environment, and applies it to both `kind-member1` and `kind-member2`.
Contributor

Suggested change
It automatically downloads the upstream metrics-server manifest, patches it with the `--kubelet-insecure-tls=true` flag for compatibility with our Kind environment, and applies it to both `kind-member1` and `kind-member2`.
It automatically downloads the upstream metrics-server manifest and applies it to both `kind-member1` and `kind-member2`.

Comment thread karmada-FHPA-example/step7/text.md Outdated

**2. Register the Custom Metrics API:**
Next, we must register the custom metrics `APIService` on both member clusters so the adapter can securely access their local metric endpoints.

Contributor

The demo FHPA uses resource metrics, right? Why do we still need to register the Custom Metrics API?

Contributor Author

You are absolutely right, we don't need it. I will just remove it.

Comment thread karmada-FHPA-example/step8/text.md Outdated

RUN `kubectl --kubeconfig /etc/karmada/karmada-apiserver.config get deployment nginx`{{exec}}

This command confirms that the Nginx Deployment template has been successfully registered in the control plane.
Contributor

Suggested change
This command confirms that the Nginx Deployment template has been successfully registered in the control plane.

Comment thread karmada-FHPA-example/step9/text.md Outdated

> **Note:** It takes a brief moment for the scheduler to distribute the workload and for the clusters to pull the container image. If the command returns "No resources found", wait ~30 seconds and run it again. Since our initial replica count is 1, you should see exactly 1 pod running on one of the member clusters.
>
> *Troubleshooting:* If you see several lines of `Unhandled Error` regarding `metrics.k8s.io`, don't worry! This is normal and just indicates that the Karmada metrics adapter is still starting up in the background. You can safely ignore these warnings.
Contributor

Just a gentle correction: this is unrelated to the Karmada metrics adapter. It actually depends on the metrics server within the member clusters.

We can move the verification step earlier. After deploying the metrics server, confirm that it is available by running `kubectl --kubeconfig $HOME/.kube/config-memberX top pods` and checking that it returns no errors.

Contributor Author

Thanks for the correction

Comment thread karmada-FHPA-example/step10/text.md Outdated
Comment on lines +32 to +38
**Verify the Multi-Cluster Service:**

RUN `karmadactl --kubeconfig /etc/karmada/karmada-apiserver.config get svc --operation-scope members`{{exec}}

> *Note: If you see `Unhandled Error` warnings regarding metrics, you can safely ignore them.*

You should see the `nginx-service` running on the member clusters. This is the service we will use to generate load!
Contributor

Suggested change
**Verify the Multi-Cluster Service:**
RUN `karmadactl --kubeconfig /etc/karmada/karmada-apiserver.config get svc --operation-scope members`{{exec}}
> *Note: If you see `Unhandled Error` warnings regarding metrics, you can safely ignore them.*
You should see the `nginx-service` running on the member clusters. This is the service we will use to generate load!

It is improper to verify Multi-Cluster Service with the command `karmadactl --kubeconfig /etc/karmada/karmada-apiserver.config get svc --operation-scope members`. The Services on the member clusters are not created by Multi-Cluster Service; they are distributed to the member clusters in advance via a PropagationPolicy.

We can remove this part for simplicity.

Contributor Author

I've applied this change and completely removed that misleading verification step from step10/text.md.

Contributor

@zhzhuang-zju zhzhuang-zju left a comment

Thanks, you made a great improvement compared with the initial version.

After all the new scenarios are merged, you can submit a PR to update them in README.md


To confirm the metrics server is running normally, wait a few moments and then check if it can successfully serve pod metrics:

RUN `kubectl --kubeconfig=$HOME/.kube/config-member1 top pods --all-namespaces`{{exec}}
Contributor

Suggested change
RUN `kubectl --kubeconfig=$HOME/.kube/config-member1 top pods --all-namespaces`{{exec}}
RUN `kubectl --kubeconfig=$HOME/.kube/config-member1 top pods --all-namespaces`{{exec}}
RUN `kubectl --kubeconfig=$HOME/.kube/config-member2 top pods --all-namespaces`{{exec}}

set -e

kubectl --kubeconfig=$HOME/.kube/config-member1 -n kube-system get deployment metrics-server
kubectl --kubeconfig=$HOME/.kube/config-member2 -n kube-system get deployment metrics-server
Contributor

Suggested change
kubectl --kubeconfig=$HOME/.kube/config-member2 -n kube-system get deployment metrics-server
kubectl --kubeconfig=$HOME/.kube/config-member2 -n kube-system get deployment metrics-server
kubectl --kubeconfig=$HOME/.kube/config-member1 top pods --all-namespaces
kubectl --kubeconfig=$HOME/.kube/config-member2 top pods --all-namespaces

We can add a metrics availability check here to avoid unhandled metrics.k8s.io error warnings in subsequent steps.
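
Because metrics-server usually needs some time before it can serve metrics, the availability check may benefit from a retry. The sketch below is one possible shape, not what the PR implements: `retry` and `flaky_check` are hypothetical names, the attempt counts and delays are illustrative, and `flaky_check` stands in for `kubectl top pods` so the snippet runs without a cluster.

```shell
#!/bin/bash
# Hypothetical helper (not in the PR): run a command until it succeeds,
# up to $1 attempts, sleeping $2 seconds between attempts.
retry() {
  local attempts=$1 delay=$2 i
  shift 2
  for ((i = 1; i <= attempts; i++)); do
    if "$@"; then
      return 0
    fi
    # Only sleep when another attempt is still to come.
    if ((i < attempts)); then
      sleep "${delay}"
    fi
  done
  return 1
}

# Stand-in for "kubectl top pods": fails twice, then succeeds.
counter_file=$(mktemp)
echo 0 > "${counter_file}"
flaky_check() {
  local n
  n=$(($(cat "${counter_file}") + 1))
  echo "${n}" > "${counter_file}"
  [ "${n}" -ge 3 ]
}

retry 5 0 flaky_check && echo "metrics available after $(cat "${counter_file}") attempts"

# Intended use in verify.sh (values are illustrative):
#   retry 12 5 kubectl --kubeconfig="$HOME/.kube/config-member1" top pods --all-namespaces
```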

Comment thread karmada-FHPA-example/step12/text.md Outdated
Comment on lines +11 to +12
> *Note: As before, you can safely ignore any `metrics.k8s.io` Unhandled Error warnings if they appear.*

Contributor

Suggested change
> *Note: As before, you can safely ignore any `metrics.k8s.io` Unhandled Error warnings if they appear.*

Comment thread karmada-FHPA-example/step7/text.md Outdated
With the local metrics servers running, we now need to bridge that data to the Karmada control plane so the FHPA controller can make global scaling decisions.

**Install the `karmada-metrics-adapter`:**
This add-on runs on the Karmada control plane and aggregates the metrics collected from the member clusters. It also automatically registers the `custom.metrics.k8s.io` APIService in the control plane, which the FederatedHPA controller uses to fetch metrics.
Contributor

Suggested change
This add-on runs on the Karmada control plane and aggregates the metrics collected from the member clusters. It also automatically registers the `custom.metrics.k8s.io` APIService in the control plane, which the FederatedHPA controller uses to fetch metrics.
This add-on runs on the Karmada control plane and aggregates the metrics collected from the member clusters. It also automatically registers the `metrics.k8s.io` and `custom.metrics.k8s.io` APIServices in the control plane, which the FederatedHPA controller uses to fetch metrics.

Comment thread karmada-FHPA-example/step7/verify.sh Outdated
set -e

kubectl --kubeconfig $HOME/.kube/config -n karmada-system get deployment karmada-metrics-adapter
kubectl --kubeconfig $HOME/.kube/config get apiservice v1beta1.custom.metrics.k8s.io
Contributor

Suggested change
kubectl --kubeconfig $HOME/.kube/config get apiservice v1beta1.custom.metrics.k8s.io
kubectl --kubeconfig $HOME/.kube/config get apiservice v1beta1.metrics.k8s.io

@Krishiv-Mahajan
Contributor Author

> Thanks, you made a great improvement compared with the initial version.
>
> After all the new scenarios are merged, you can submit a PR to update them in README.md

OK, once all the PRs are merged, I will update the README.md.

Signed-off-by: Krishiv-Mahajan <mahajankrishiv10@gmail.com>
@zhzhuang-zju
Contributor

Thanks
/lgtm
/cc @RainbowMango for APPROVAL

@karmada-bot
Contributor

@zhzhuang-zju: GitHub didn't allow me to request PR reviews from the following users: for, APPROVAL.

Note that only karmada-io members and repo collaborators can review this PR, and authors cannot review their own PRs.

Details

In response to this:

Thanks
/lgtm
/cc @RainbowMango for APPROVAL

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@karmada-bot karmada-bot added the lgtm label (indicates that a PR is ready to be merged) on May 25, 2026