multi-component-workload tutorial creation by Krishiv-Mahajan · Pull Request #29 · karmada-io/playground

Krishiv-Mahajan · 2026-05-23T14:29:07Z

This PR introduces a new interactive 10-step Killercoda tutorial scenario designed to demonstrate how Karmada handles complex, multi-component workloads.
Unlike simple stateless deployments, advanced applications (such as Big Data frameworks like Apache Flink or batch schedulers like Volcano) consist of multiple tightly-coupled components with distinct resource profiles (e.g., JobManagers and TaskManagers). This tutorial guides users through the process of teaching Karmada how to accurately interpret these resources and schedule them atomically.

Testing:

This can be tested on: https://killercoda.com/testing-scenario/scenario/karmada-multi-component-workload-example

karmada-bot · 2026-05-23T14:29:11Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign rainbowmango for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

gemini-code-assist · 2026-05-23T14:29:11Z

Warning

Gemini encountered an error creating the review. You can try again by commenting /gemini review.

Krishiv-Mahajan · 2026-05-23T14:29:30Z

/gemini review

gemini-code-assist

Code Review

This pull request introduces a comprehensive tutorial for scheduling multi-component workloads, such as Apache Flink and Volcano, across multiple clusters using Karmada. The changes include environment setup scripts, tutorial documentation, and verification logic. Feedback from the review identifies several critical issues, including a logic bug in the Flink Lua interpreter where JobManager memory was incorrectly referenced for TaskManagers, and a potential kubeconfig corruption issue during cluster creation. Additionally, the reviewer pointed out incorrect memory units in the Flink manifest, inconsistencies between the code and tutorial text, and missing verification checks for the scheduler component.

gemini-code-assist · 2026-05-23T14:31:16Z

+          local tm_cpu    = get(observedObj, {"spec","taskManager","resource","cpu"})
+          local tm_memory = get(observedObj, {"spec","taskManager","resource","memory"})
+          if tm_cpu ~= nil then tm_requires.resourceRequest.cpu = tm_cpu end
+          if tm_memory ~= nil then tm_requires.resourceRequest.memory = kube.getResourceQuantity(tm_memory) end


There is a bug here: jm_memory is used instead of tm_memory. Additionally, removing kube.getResourceQuantity() will keep the output consistent with the tutorial text and the Volcano example.

Suggested change

if tm_memory ~= nil then tm_requires.resourceRequest.memory = kube.getResourceQuantity(tm_memory) end

if tm_memory ~= nil then tm_requires.resourceRequest.memory = tm_memory end

gemini-code-assist · 2026-05-23T14:31:16Z

+    mv $HOME/.kube/config ~/config-member1
+    kind create cluster --name=member2 --config=cluster2.yaml
+    mv $HOME/.kube/config config-member2
+    KUBECONFIG=~/config-member1:~/config-member2 kubectl config view --merge --flatten >> ${KUBECONFIG_PATH}/config


Using >> to append the output of kubectl config view --flatten to an existing kubeconfig file will result in an invalid YAML file if the destination already contains data (e.g., the default config in a Killercoda environment). It is better to overwrite the file or use a proper merge strategy.

Suggested change

KUBECONFIG=~/config-member1:~/config-member2 kubectl config view --merge --flatten >> ${KUBECONFIG_PATH}/config

KUBECONFIG=~/config-member1:~/config-member2 kubectl config view --merge --flatten > ${KUBECONFIG_PATH}/config
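
The difference between `>>` and `>` can be reproduced without a cluster. The sketch below is illustrative only: plain text files stand in for real kubeconfigs, and the `merged` variable is an assumed stand-in for the output of `kubectl config view --merge --flatten`.

```shell
#!/usr/bin/env bash
# Illustration only: plain text files stand in for real kubeconfigs.
workdir="$(mktemp -d)"
target="$workdir/config"

# Assumed stand-in for the output of `kubectl config view --merge --flatten`.
merged=$'apiVersion: v1\nkind: Config\n'

# Simulate a kubeconfig that already exists in the environment.
printf '%s' "$merged" > "$target"

# Appending (>>) leaves two copies of every top-level key in one file.
printf '%s' "$merged" >> "$target"
append_count="$(grep -c '^kind: Config$' "$target")"

# Overwriting (>) leaves exactly one copy.
printf '%s' "$merged" > "$target"
overwrite_count="$(grep -c '^kind: Config$' "$target")"

echo "kind entries after >>: $append_count, after >: $overwrite_count"
rm -rf "$workdir"
```

After the append the file holds duplicate top-level keys in a single YAML document, which parsers may reject or resolve unpredictably. After the overwrite it holds one clean copy.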

gemini-code-assist · 2026-05-23T14:31:16Z

+          local jm_cpu    = get(observedObj, {"spec","jobManager","resource","cpu"})
+          local jm_memory = get(observedObj, {"spec","jobManager","resource","memory"})
+          if jm_cpu ~= nil then jm_requires.resourceRequest.cpu = jm_cpu end
+          if jm_memory ~= nil then jm_requires.resourceRequest.memory = kube.getResourceQuantity(jm_memory) end


For consistency with the Volcano example and to match the expected output in the tutorial steps, it is better to avoid using kube.getResourceQuantity() for memory. This ensures the original string (e.g., "100Mi") is preserved in the ResourceBinding.

Suggested change

if jm_memory ~= nil then jm_requires.resourceRequest.memory = kube.getResourceQuantity(jm_memory) end

if jm_memory ~= nil then jm_requires.resourceRequest.memory = jm_memory end

gemini-code-assist · 2026-05-23T14:31:17Z

+    replicas: 1
+    resource:
+      cpu: 1
+      memory: 100m


The memory quantity 100m is likely a typo for 100Mi. In Kubernetes resource quantities, the m suffix denotes milli-units, so 100m of memory is 100 millibytes (0.1 bytes), which is almost certainly not intended and may cause scheduling issues.

Suggested change

memory: 100m

memory: 100Mi
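
To make the gap between the two suffixes concrete, here is a minimal sketch that converts both quantities to millibytes. `to_millibytes` is a hypothetical helper written for this comparison; it handles only a few suffixes and is not how Kubernetes parses quantities.

```shell
#!/usr/bin/env bash
# to_millibytes is a hypothetical helper covering only a few suffixes;
# real Kubernetes quantity parsing supports many more forms.
to_millibytes() {
  local q="$1"
  case "$q" in
    *Ki) echo $(( ${q%Ki} * 1024 * 1000 )) ;;
    *Mi) echo $(( ${q%Mi} * 1024 * 1024 * 1000 )) ;;
    *Gi) echo $(( ${q%Gi} * 1024 * 1024 * 1024 * 1000 )) ;;
    *m)  echo $(( ${q%m} )) ;;          # already expressed in milli-units
    *)   echo $(( q * 1000 )) ;;        # plain bytes
  esac
}

to_millibytes 100m    # 100 millibytes, i.e. 0.1 bytes
to_millibytes 100Mi   # 104857600000 millibytes, i.e. 104857600 bytes
```

The two values differ by roughly nine orders of magnitude, so the suffix changes what the scheduler is asked to reserve.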

gemini-code-assist · 2026-05-23T14:31:17Z

+  taskManager:
+    resource:
+      cpu: 1
+      memory: 100m


The memory quantity 100m should be corrected to 100Mi for consistency and correctness.

Suggested change

memory: 100m

memory: 100Mi

gemini-code-assist · 2026-05-23T14:31:17Z

@@ -0,0 +1,3 @@
+#!/bin/bash
+
+kubectl -n karmada-system get deployment karmada-controller-manager -o json | jq -r '.spec.template.spec.containers[0].command[]' | grep -q "MultiplePodTemplatesScheduling=true" && kubectl -n karmada-system get deployment karmada-webhook -o json | jq -r '.spec.template.spec.containers[0].command[]' | grep -q "MultiplePodTemplatesScheduling=true"


The verification script should also check the karmada-scheduler deployment, as it was also patched in the instructions to enable the MultiplePodTemplatesScheduling feature gate.

Suggested change

kubectl -n karmada-system get deployment karmada-controller-manager -o json | jq -r '.spec.template.spec.containers[0].command[]' | grep -q "MultiplePodTemplatesScheduling=true" && kubectl -n karmada-system get deployment karmada-webhook -o json | jq -r '.spec.template.spec.containers[0].command[]' | grep -q "MultiplePodTemplatesScheduling=true"

kubectl -n karmada-system get deployment karmada-controller-manager -o json | jq -r '.spec.template.spec.containers[0].command[]' | grep -q "MultiplePodTemplatesScheduling=true" && kubectl -n karmada-system get deployment karmada-scheduler -o json | jq -r '.spec.template.spec.containers[0].command[]' | grep -q "MultiplePodTemplatesScheduling=true" && kubectl -n karmada-system get deployment karmada-webhook -o json | jq -r '.spec.template.spec.containers[0].command[]' | grep -q "MultiplePodTemplatesScheduling=true"
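
The three-way check can also be written as a loop, which keeps the list of deployments in one place. This is only a sketch: `has_gate` and `verify_gates` are hypothetical helper names, and it assumes `kubectl` and `jq` are available in the scenario environment, as the original script does.

```shell
#!/bin/bash
# Sketch only: has_gate and verify_gates are hypothetical helper names.

# Succeeds when any line on stdin enables the feature gate.
has_gate() {
  grep -q "MultiplePodTemplatesScheduling=true"
}

# Checks each deployment in turn and fails at the first one missing the gate.
verify_gates() {
  local deploy
  for deploy in karmada-controller-manager karmada-scheduler karmada-webhook; do
    kubectl -n karmada-system get deployment "$deploy" -o json \
      | jq -r '.spec.template.spec.containers[0].command[]' \
      | has_gate || return 1
  done
}
```

Calling `verify_gates` as the last command of the verification script would make its exit status the result of the check.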

gemini-code-assist · 2026-05-23T14:31:17Z

+The Flink manifest specified `parallelism: 2` and `taskmanager.numberOfTaskSlots: "2"`. Using the Lua interpreter we applied earlier, Karmada correctly calculates that `ceil(2/2) = 1` taskManager replica is needed. Let's verify that Karmada captured this, along with the CPU (1) and memory (100m) requests:
+
+RUN `kubectl --kubeconfig /etc/karmada/karmada-apiserver.config get resourcebinding $BINDING_NAME -n default -o json | jq '.spec.components[] | select(.name=="taskmanager") | .replicaRequirements.resourceRequest'`{{exec}}
+
+This outputs a JSON object with `"cpu": "1"` and `"memory": "100m"`.


The expected memory value in the tutorial text should be updated to 100Mi to be consistent with the fix in the manifest and the actual resource requirements.

Suggested change

The Flink manifest specified `parallelism: 2` and `taskmanager.numberOfTaskSlots: "2"`. Using the Lua interpreter we applied earlier, Karmada correctly calculates that `ceil(2/2) = 1` taskManager replica is needed. Let's verify that Karmada captured this, along with the CPU (1) and memory (100m) requests:

RUN `kubectl --kubeconfig /etc/karmada/karmada-apiserver.config get resourcebinding $BINDING_NAME -n default -o json | jq '.spec.components[] | select(.name=="taskmanager") | .replicaRequirements.resourceRequest'`{{exec}}

This outputs a JSON object with `"cpu": "1"` and `"memory": "100m"`.

The Flink manifest specified `parallelism: 2` and `taskmanager.numberOfTaskSlots: "2"`. Using the Lua interpreter we applied earlier, Karmada correctly calculates that `ceil(2/2) = 1` taskManager replica is needed. Let's verify that Karmada captured this, along with the CPU (1) and memory (100Mi) requests:

RUN `kubectl --kubeconfig /etc/karmada/karmada-apiserver.config get resourcebinding $BINDING_NAME -n default -o json | jq '.spec.components[] | select(.name=="taskmanager") | .replicaRequirements.resourceRequest'`{{exec}}

This outputs a JSON object with `"cpu": "1"` and `"memory": "100Mi"`.

zhzhuang-zju · 2026-05-25T03:13:17Z

+
+Multi-component scheduling (`MultiplePodTemplatesScheduling`) is currently an **Alpha** feature in Karmada and is **disabled by default**. We need to explicitly enable it on the `karmada-controller-manager`, `karmada-scheduler`, and `karmada-webhook` components.
+
+> **Note:** Because these components are running as native Pods on the underlying host cluster, we patch them using the default `kubectl` context, **not** the Karmada API server kubeconfig. We also temporarily change their deployment strategy to `Recreate` to prevent resource deadlocks during the rollout.


We also temporarily change their deployment strategy to Recreate to prevent resource deadlocks during the rollout.

Can you elaborate more on this?

I have added a bit more detail in the latest commit

Thanks for the explanation. However, these three components don't seem to have resource configurations declared. So there shouldn't be deadlocks during their restart, right?

zhzhuang-zju · 2026-05-25T03:14:02Z

@@ -0,0 +1,3 @@
+#!/bin/bash
+
+kubectl -n karmada-system get deployment karmada-controller-manager -o json | jq -r '.spec.template.spec.containers[0].command[]' | grep -q "MultiplePodTemplatesScheduling=true" && kubectl -n karmada-system get deployment karmada-webhook -o json | jq -r '.spec.template.spec.containers[0].command[]' | grep -q "MultiplePodTemplatesScheduling=true"


zhzhuang-zju · 2026-05-25T07:01:08Z

+1. Apply their **Custom Resource Definitions (CRDs)** to the Karmada control plane and propagate them to the member clusters.
+2. Apply **Resource Interpreter Customizations** to teach Karmada how to extract per-component resource requirements from these specific workload types.
+
+> **Note:** We have pre-downloaded the necessary CRDs and placed them in `/root/examples/` for you.


Suggested change

> **Note:** We have pre-downloaded the necessary CRDs and placed them in `/root/examples/` for you.

> **Note:** Karmada has built-in support for interpreting common third-party multi-component workload resources such as FlinkDeployment and VolcanoJob. They define rules for Karmada to parse these resources, covering extraction of replicas and resource requirements of each component, judgment of workload health status and identification of dependent resources.

zhzhuang-zju · 2026-05-25T07:02:27Z

+**Apply the Resource Interpreter Customizations:**
+
+Karmada uses a built-in "Resource Interpreter" to dynamically inspect unfamiliar custom resources. By applying these Lua-based configurations, we teach the interpreter exactly where to look in a `FlinkDeployment` and `VolcanoJob` to find their individual components, replicas, and CPU/Memory requests.
+
+RUN `kubectl --kubeconfig /etc/karmada/karmada-apiserver.config apply -f /root/examples/flink-interpreter.yaml`{{exec}}
+
+This applies the Flink Resource Interpreter Customization.
+
+RUN `kubectl --kubeconfig /etc/karmada/karmada-apiserver.config apply -f /root/examples/volcano-interpreter.yaml`{{exec}}
+
+This applies the Volcano Resource Interpreter Customization.


Suggested change

**Apply the Resource Interpreter Customizations:**

Karmada uses a built-in "Resource Interpreter" to dynamically inspect unfamiliar custom resources. By applying these Lua-based configurations, we teach the interpreter exactly where to look in a `FlinkDeployment` and `VolcanoJob` to find their individual components, replicas, and CPU/Memory requests.

RUN `kubectl --kubeconfig /etc/karmada/karmada-apiserver.config apply -f /root/examples/flink-interpreter.yaml`{{exec}}

This applies the Flink Resource Interpreter Customization.

RUN `kubectl --kubeconfig /etc/karmada/karmada-apiserver.config apply -f /root/examples/volcano-interpreter.yaml`{{exec}}

This applies the Volcano Resource Interpreter Customization.

zhzhuang-zju · 2026-05-25T07:04:32Z

@@ -0,0 +1,68 @@
+### Provide Workload Definitions to Karmada
+
+Before Karmada can schedule complex Flink and Volcano workloads, it needs to understand their structure. 


Suggested change

Before Karmada can schedule complex Flink and Volcano workloads, it needs to understand their structure.

Before Karmada can schedule complex FlinkDeployment workloads, it needs to understand their structure.

Using FlinkDeployment for the demonstration is sufficient.

zhzhuang-zju · 2026-05-25T07:05:50Z

+
+RUN `kubectl --kubeconfig /etc/karmada/karmada-apiserver.config apply --validate=false -f /root/examples/flinkdeployment-cr.yaml`{{exec}}
+
+This applies the Flink Custom Resource.


Suggested change

This applies the Flink Custom Resource.

This applies the FlinkDeployment Custom Resource.

Please use the full name

zhzhuang-zju · 2026-05-25T07:08:18Z

+
+</details>
+
+RUN `kubectl --kubeconfig /etc/karmada/karmada-apiserver.config apply --validate=false -f /root/examples/flinkdeployment-cr.yaml`{{exec}}


A question: why do we need validate=false here?

FlinkDeployment CRD is not registered on the Karmada API server itself and --validate=false bypasses schema validation on the Karmada API server.

What exactly are you referring to as "registered"? The --validate=false flag is unnecessary if the steps are followed properly.

zhzhuang-zju · 2026-05-25T07:17:33Z

+
+If a multi-cluster scheduler treats these complex jobs as a single generic workload, it may underestimate the total resources required, or accidentally scatter the tightly-coupled components across entirely different geographical clusters, destroying the low-latency communication required for the job to function.
+
+In this scenario, we will deploy multi-component workloads (Flink and Volcano) and use custom Resource Interpreters to teach Karmada how to extract their individual components. We will then use `SpreadConstraints` to ensure all components of a job are scheduled atomistically to the exact same target cluster.


Suggested change

In this scenario, we will deploy multi-component workloads (Flink and Volcano) and use custom Resource Interpreters to teach Karmada how to extract their individual components. We will then use `SpreadConstraints` to ensure all components of a job are scheduled atomistically to the exact same target cluster.

In this scenario, we will deploy multi-component workloads (FlinkDeployment) and use custom Resource Interpreters to teach Karmada how to extract their individual components. We will then use `SpreadConstraints` to ensure all workload components are scheduled atomistically to the identical target cluster with sufficient resources.

zhzhuang-zju · 2026-05-25T07:33:06Z

+
+When you apply a workload, Karmada uses its Resource Interpreter to analyze the custom resource, extract its requirements, and wrap it into a `ResourceBinding`. Let's inspect this binding to see what Karmada discovered.
+
+First, extract the dynamic binding name into a variable:


We can divide this page into several sections to improve readability:

#### Replicas

#### Resource Requirement

These two sections ensure Karmada can accurately perceive resource demands of multi-component workloads, serving as basis for filtering available clusters.

#### Scheduling Result

Only one cluster will be selected as the scheduling result.

zhzhuang-zju · 2026-05-26T01:23:40Z

+
+Multi-component scheduling (`MultiplePodTemplatesScheduling`) is currently an **Alpha** feature in Karmada and is **disabled by default**. We need to explicitly enable it on the `karmada-controller-manager`, `karmada-scheduler`, and `karmada-webhook` components.
+
+> **Note:** Because these components are running as native Pods on the underlying host cluster, we patch them using the default `kubectl` context, **not** the Karmada API server kubeconfig. We also temporarily change their deployment strategy to `Recreate` to prevent resource deadlocks during the rollout.


Thanks for the explanation. However, these three components don't seem to have resource configurations declared. So there shouldn't be deadlocks during their restart, right?

zhzhuang-zju · 2026-05-26T03:47:23Z

+
+Multi-component scheduling (`MultiplePodTemplatesScheduling`) is currently an **Alpha** feature in Karmada and is **disabled by default**. We need to explicitly enable it on three core control plane components to ensure the entire scheduling pipeline can process multi-component workloads:
+
+- **`karmada-webhook`**: Needs the feature gate to successfully validate and mutate the multi-component fields within incoming `ResourceBinding` and `PropagationPolicy` objects.


Suggested change

- **`karmada-webhook`**: Needs the feature gate to successfully validate and mutate the multi-component fields within incoming `ResourceBinding` and `PropagationPolicy` objects.

- **`karmada-webhook`**: Needs the feature gate to successfully validate the multi-component fields within incoming `ResourceBinding` objects.

zhzhuang-zju · 2026-05-26T03:52:18Z

+
+- **`karmada-webhook`**: Needs the feature gate to successfully validate and mutate the multi-component fields within incoming `ResourceBinding` and `PropagationPolicy` objects.
+- **`karmada-controller-manager`**: Requires it to execute custom Resource Interpreters that extract the specific components, and to build the comprehensive `ResourceBinding` that contains them.
+- **`karmada-scheduler`**: Uses it to compute the aggregate resource requirements of all extracted components, ensuring the selected target cluster has sufficient capacity to host the entire complex workload.


Suggested change

- **`karmada-scheduler`**: Uses it to compute the aggregate resource requirements of all extracted components, ensuring the selected target cluster has sufficient capacity to host the entire complex workload.

- **`karmada-scheduler`**: Uses it to obtain the detailed resource requirements of the workload, ensuring the selected target cluster has sufficient capacity to host the entire complex workload.

zhzhuang-zju · 2026-05-26T06:35:52Z

+
+</details>
+
+RUN `kubectl --kubeconfig /etc/karmada/karmada-apiserver.config apply --validate=false -f /root/examples/flinkdeployment-cr.yaml`{{exec}}


What exactly are you referring to as "registered"? The --validate=false flag is unnecessary if the steps are followed properly.

zhzhuang-zju · 2026-05-26T06:38:41Z

+
+RUN `kubectl --kubeconfig /etc/karmada/karmada-apiserver.config apply --validate=false -f /root/examples/flinkdeployments.flink.apache.org-v1.yaml`{{exec}}
+
+This applies the Flink CRD.


Suggested change

This applies the Flink CRD.

This applies the FlinkDeployment CRD.

Please use the full name

zhzhuang-zju · 2026-05-26T06:40:32Z

+
+Karmada needs to know exactly how to parse the custom resources to find their component definitions. We provide Lua scripts that teach Karmada how to do this.
+
+RUN `kubectl --kubeconfig /etc/karmada/karmada-apiserver.config apply -f /root/examples/flink-interpreter.yaml`{{exec}}


We have built-in interpreters, so why do we need to apply the interpreter here?

You're right, thanks for catching that, I will just fix it

zhzhuang-zju · 2026-05-26T06:47:32Z

+Let's check if Karmada successfully parsed the `spec.components` array. The array should contain exactly 2 distinct components:
+
+RUN `kubectl --kubeconfig /etc/karmada/karmada-apiserver.config get resourcebinding $BINDING_NAME -n default -o json | jq '.spec.components | length'`{{exec}}
+
+This outputs `2`, confirming exactly two distinct components were extracted.
+
+Check the specific names of the components Karmada identified:
+
+RUN `kubectl --kubeconfig /etc/karmada/karmada-apiserver.config get resourcebinding $BINDING_NAME -n default -o json | jq '.spec.components[].name'`{{exec}}
+
+This outputs `"jobmanager"` and `"taskmanager"`.


The title is replicas, but the content inside has nothing to do with replicas at all!

zhzhuang-zju · 2026-05-26T07:09:04Z

+
+RUN `kubectl --kubeconfig /etc/karmada/karmada-apiserver.config apply --validate=false -f /root/examples/flinkdeployment-cr.yaml`{{exec}}
+
+This applies the FlinkDeployment Custom Resource.


You can add a brief description of this FlinkDeployment CR, like:

This FlinkDeployment includes a JobManager (1 replica, 1 CPU, 100Mi memory) and a TaskManager.
The TaskManager replica count is automatically computed as 1 using ceil(parallelism/numberOfTaskSlots), with resources of 1 CPU and 100Mi memory.
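
The `ceil(parallelism/numberOfTaskSlots)` calculation can be checked with integer arithmetic. This is a minimal sketch; `tm_replicas` is a hypothetical helper used only to illustrate the formula, not code taken from the Karmada interpreter.

```shell
#!/usr/bin/env bash
# tm_replicas is a hypothetical helper illustrating ceil(parallelism / slots)
# with integer arithmetic: (a + b - 1) / b rounds up for positive integers.
tm_replicas() {
  local parallelism="$1" slots="$2"
  echo $(( (parallelism + slots - 1) / slots ))
}

tm_replicas 2 2   # prints 1, the case used in this tutorial
tm_replicas 5 2   # prints 3, since two slots per TaskManager cannot cover 5 with 2 replicas
```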

zhzhuang-zju · 2026-05-26T07:13:01Z

+#### Replicas
+
+Let's check if Karmada successfully parsed the `spec.components` array. The array should contain exactly 2 distinct components:
+
+RUN `kubectl --kubeconfig /etc/karmada/karmada-apiserver.config get resourcebinding $BINDING_NAME -n default -o json | jq '.spec.components | length'`{{exec}}
+
+This outputs `2`, confirming exactly two distinct components were extracted.
+
+Check the specific names of the components Karmada identified:
+
+RUN `kubectl --kubeconfig /etc/karmada/karmada-apiserver.config get resourcebinding $BINDING_NAME -n default -o json | jq '.spec.components[].name'`{{exec}}
+
+This outputs `"jobmanager"` and `"taskmanager"`.
+
+#### Resource Requirement
+
+The Flink manifest specified `parallelism: 2` and `taskmanager.numberOfTaskSlots: "2"`. Using the Lua interpreter we applied earlier, Karmada correctly calculates that `ceil(2/2) = 1` taskManager replica is needed. Let's verify that Karmada captured this, along with the CPU (1) and memory (100Mi) requests:
+
+RUN `kubectl --kubeconfig /etc/karmada/karmada-apiserver.config get resourcebinding $BINDING_NAME -n default -o json | jq '.spec.components[] | select(.name=="taskmanager") | .replicaRequirements.resourceRequest'`{{exec}}
+
+This outputs a JSON object with `"cpu": "1"` and `"memory": "100Mi"`.


Since the FlinkDeployment CR has been described in Step 7, we can keep this part concise. Just print the components within bindingSpec. If the result matches the expectation from Step 7, it proves that Karmada can parse FlinkDeployment correctly.

zhzhuang-zju · 2026-05-26T07:17:06Z

+Check that the workload was successfully scheduled by the Karmada control plane:
+
+RUN `kubectl --kubeconfig /etc/karmada/karmada-apiserver.config get resourcebinding $BINDING_NAME -n default -o json | jq '.status.conditions[] | select(.type=="Scheduled") | .status'`{{exec}}
+
+This outputs `"True"`, indicating the workload was successfully scheduled.
+
+Finally, let's see which cluster it landed on and verify that the Flink components actually exist there:
+
+RUN `TARGET_CLUSTER=$(kubectl --kubeconfig /etc/karmada/karmada-apiserver.config get resourcebinding $BINDING_NAME -n default -o json | jq -r '.spec.clusters[0].name')`{{exec}}
+
+This extracts the scheduled target cluster into a variable.
+
+Verify that the FlinkDeployment exists on the target cluster:
+
+RUN `kubectl --kubeconfig=$HOME/.kube/config-${TARGET_CLUSTER#kind-} get flinkdeployment -n default`{{exec}}
+
+This lists the `flinkdeployment-sample` resource, verifying it exists on the target cluster.


Print bindingSpec.clusters; it should contain only one target cluster.

Use the command karmadactl get flinkdeployment --operation-scope members to verify the flinkdeployment exists on the target cluster

zhzhuang-zju

Thanks

zhzhuang-zju · 2026-05-26T12:37:07Z

+
+**Apply the Resource Interpreters:**
+
+While Karmada's newer versions have built-in support for parsing FlinkDeployment workloads, the version installed in this environment requires us to explicitly provide a Lua script that teaches Karmada how to extract the component definitions.


the version installed in this environment requires us to explicitly provide a Lua script that teaches Karmada how to extract the component definitions.

The karmadactl version is v1.17.2, so it should already have the resource interpreter for FlinkDeployment. Why do we need to explicitly provide a Lua script again?

I ran into the null components issue during testing and reverted the changes, but it was actually caused by something else. I have fixed it now.

zhzhuang-zju · 2026-05-26T12:41:55Z

+
+**Apply the CRDs and PropagationPolicy:**
+
+RUN `kubectl --kubeconfig /etc/karmada/karmada-apiserver.config apply --validate=false -f /root/examples/flinkdeployments.flink.apache.org-v1.yaml`{{exec}}


I still have the question: why do we need validate=false here?

zhzhuang-zju · 2026-05-26T12:47:39Z

+- **`karmada-controller-manager`**: Requires it to execute custom Resource Interpreters that extract the specific components, and to build the comprehensive `ResourceBinding` that contains them.
+- **`karmada-scheduler`**: Uses it to obtain the detailed resource requirements of the workload, ensuring the selected target cluster has sufficient capacity to host the entire complex workload.
+
+> **Note:** Because these components are running as native Pods on the underlying host cluster, we patch them using the default `kubectl` context, **not** the Karmada API server kubeconfig. We temporarily change their deployment strategy to `Recreate`. This isn't strictly for resource limitations, but to prevent a race condition in this tutorial where the old leader pod processes incoming workloads (with the feature gate disabled) while the new pod is starting up.


This isn't strictly for resource limitations, but to prevent a race condition in this tutorial where the old leader pod processes incoming workloads (with the feature gate disabled) while the new pod is starting up.

If you have concerns about this, you can set the corresponding feature gates via command-line arguments when running the command karmadactl init --xxxx to avoid restarts later. This will reduce complex operations and extra explanations down the line.

Run karmadactl init --help to check the usage.

Thank you for the suggestion to use the karmadactl init flags.

I checked karmadactl init --help and have updated the initialization command in Step 3 to pass the feature gates directly at start time using:

--karmada-controller-manager-extra-args="--feature-gates=MultiplePodTemplatesScheduling=true"
--karmada-scheduler-extra-args="--feature-gates=MultiplePodTemplatesScheduling=true"
--karmada-webhook-extra-args="--feature-gates=MultiplePodTemplatesScheduling=true"
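
Since the three flags differ only in the component name, they can be generated in a loop. The sketch below builds the argument list and prints the resulting command instead of running it; the flag names are the ones quoted above, while the variable names are illustrative.

```shell
#!/usr/bin/env bash
# Builds the karmadactl init arguments quoted in this thread.
# The command is printed, not executed; variable names are illustrative.
gate="--feature-gates=MultiplePodTemplatesScheduling=true"
init_args=()
for component in karmada-controller-manager karmada-scheduler karmada-webhook; do
  init_args+=("--${component}-extra-args=${gate}")
done

printf 'karmadactl init'
printf ' %s' "${init_args[@]}"
echo
```

Keeping the gate string in a single variable means a typo cannot leave one component with a different setting from the other two.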

zhzhuang-zju

Thanks @Krishiv-Mahajan, much better

zhzhuang-zju · 2026-05-27T02:05:38Z

+
+To achieve this, we must apply their **Custom Resource Definitions (CRDs)** to the Karmada control plane and propagate them to the member clusters.
+
+> **Note:** Karmada (v1.17+) has built-in support for interpreting FlinkDeployment workloads. It automatically handles extraction of each component's replicas and resource requirements, so no manual Resource Interpreter is needed.


Suggested change

> **Note:** Karmada (v1.17+) has built-in support for interpreting FlinkDeployment workloads. It automatically handles extraction of each component's replicas and resource requirements, so no manual Resource Interpreter is needed.

> **Note:** Karmada has built-in support for interpreting FlinkDeployment workloads. It automatically handles extraction of each component's replicas and resource requirements, so no manual Resource Interpreter is needed.

Actually, it starts from v1.15. But we can simply remove this info

zhzhuang-zju · 2026-05-27T02:22:07Z

+
+If a multi-cluster scheduler treats these complex jobs as a single generic workload, it may underestimate the total resources required, or accidentally scatter the tightly-coupled components across entirely different geographical clusters, destroying the low-latency communication required for the job to function.
+
+In this scenario, we will deploy a multi-component workload (FlinkDeployment, though VolcanoJob is also fully supported) and use custom Resource Interpreters to teach Karmada how to extract its individual components. We will then use `SpreadConstraints` to ensure all workload components are scheduled atomistically to the identical target cluster with sufficient resources.


Suggested change

In this scenario, we will deploy a multi-component workload (FlinkDeployment, though VolcanoJob is also fully supported) and use custom Resource Interpreters to teach Karmada how to extract its individual components. We will then use `SpreadConstraints` to ensure all workload components are scheduled atomistically to the identical target cluster with sufficient resources.

In this scenario, we will deploy a FlinkDeployment and use a dedicated PropagationPolicy to atomically propagate this multi-component workload to a member cluster with sufficient resources.

zhzhuang-zju · 2026-05-27T02:26:30Z

+
+RUN `karmadactl init --karmada-controller-manager-extra-args="--feature-gates=MultiplePodTemplatesScheduling=true" --karmada-scheduler-extra-args="--feature-gates=MultiplePodTemplatesScheduling=true" --karmada-webhook-extra-args="--feature-gates=MultiplePodTemplatesScheduling=true"`{{exec}}
+
+This sets up the Karmada control plane on the host cluster with multi-component scheduling enabled on the `karmada-controller-manager`, `karmada-scheduler`, and `karmada-webhook` from the start — no additional patches or restarts needed.


The previous explanation of each component's role in MultiplePodTemplatesScheduling works well and can be retained.

zhzhuang-zju · 2026-05-27T02:35:42Z

+    resource:
+      cpu: 1
+      memory: 100m
+  serviceAccount: flink
+  taskManager:
+    resource:
+      cpu: 1
+      memory: 100m


Suggested change

    resource:
      cpu: 1
      memory: 100m
  serviceAccount: flink
  taskManager:
    resource:
      cpu: 1
      memory: 100m

    resource:
      cpu: 0.01
      memory: 1m
  serviceAccount: flink
  taskManager:
    resource:
      cpu: 0.02
      memory: 2m

We just fixed a bug where multi-template resources could still be scheduled even when cluster resources were insufficient, so we should lower the resource requests.

Please update the relevant content accordingly as well.

zhzhuang-zju · 2026-05-27T02:40:13Z

+
+This FlinkDeployment includes a JobManager (1 replica, 1 CPU, 100Mi memory) and a TaskManager.
+The TaskManager replica count is automatically computed as 1 using `ceil(parallelism/numberOfTaskSlots)`, with resources of 1 CPU and 100Mi memory.
+


We can add a note here:

During scheduling, karmada-scheduler selects only clusters with sufficient resources, based on the node resources and quotas of the member clusters. So in this scenario, we set the resource requests of the FlinkDeployment as low as possible to ensure successful propagation.
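
The effect of lowering the requests can be illustrated with a toy capacity check. This is a deliberately simplified model and an assumption on my part, not Karmada's scheduling algorithm: it only sums CPU across components and compares the total with a single free-capacity number, ignoring memory, per-node placement, and quotas. The values correspond to the suggested `cpu: 0.01` and `cpu: 0.02`, expressed in millicores.

```shell
#!/usr/bin/env bash
# Toy model only: NOT Karmada's scheduling logic.
# Each entry: component name, replicas, CPU request in millicores.
components=(
  "jobmanager 1 10"
  "taskmanager 1 20"
)

total_cpu_m=0
for entry in "${components[@]}"; do
  read -r _name replicas cpu_m <<< "$entry"
  total_cpu_m=$(( total_cpu_m + replicas * cpu_m ))
done

# fits is a hypothetical helper: $1 is a cluster's free CPU in millicores.
fits() {
  [ "$total_cpu_m" -le "$1" ]
}

echo "total CPU request: ${total_cpu_m}m"
```

With the original `cpu: 1` per component the same sum would be 2000m, which a small Killercoda environment may not be able to satisfy.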

zhzhuang-zju

Thanks, others LGTM

Please squash your commits after making the changes so that the PR is ready to be merged.

zhzhuang-zju · 2026-05-27T06:16:30Z

+
+- **karmada-controller-manager**: Parses the workload using the Resource Interpreter framework and populates the `spec.components` array in the `ResourceBinding` to declare the resource requests of all sub-components.
+- **karmada-scheduler**: Reads the `spec.components` array to calculate the total aggregated resources needed, ensuring the workload is only scheduled to member clusters with sufficient capacity to co-locate all components.
+- **karmada-webhook**: Intercepts the scheduling policies and validates the multi-component configurations.


Suggested change

-- **karmada-webhook**: Intercepts the scheduling policies and validates the multi-component configurations.
+- **karmada-webhook**: Validates the multi-component fields within incoming `ResourceBinding` objects.

Signed-off-by: Krishiv-Mahajan <mahajankrishiv10@gmail.com>

karmada-bot requested review from RainbowMango and jwcesign May 23, 2026 14:29

karmada-bot added the size/XL label (denotes a PR that changes 500-999 lines, ignoring generated files) May 23, 2026

gemini-code-assist Bot reviewed May 23, 2026

zhzhuang-zju mentioned this pull request May 25, 2026

[LFX term 1] Enhance Karmada's Quick Start Experience and Incorporate macOS Support karmada-io/karmada#7269

zhzhuang-zju reviewed May 25, 2026

Krishiv-Mahajan force-pushed the big-data branch 2 times, most recently from 5f4c501 to c6ae27d on May 25, 2026 16:37

zhzhuang-zju reviewed May 26, 2026

Krishiv-Mahajan force-pushed the big-data branch from c0a7a26 to ee4c598 on May 26, 2026 13:30

zhzhuang-zju reviewed May 27, 2026

multi-component-workload tutorial construction

434d84c

Signed-off-by: Krishiv-Mahajan <mahajankrishiv10@gmail.com>

Krishiv-Mahajan force-pushed the big-data branch from cda4ad6 to 434d84c on May 27, 2026 07:13

-	if tm_memory ~= nil then tm_requires.resourceRequest.memory = kube.getResourceQuantity(tm_memory) end
+	if tm_memory ~= nil then tm_requires.resourceRequest.memory = tm_memory end

-	KUBECONFIG=~/config-member1:~/config-member2 kubectl config view --merge --flatten >> ${KUBECONFIG_PATH}/config
+	KUBECONFIG=~/config-member1:~/config-member2 kubectl config view --merge --flatten > ${KUBECONFIG_PATH}/config
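
The difference between the two redirections in the kubeconfig merge command can be reproduced without a cluster. The sketch below uses a temporary file and a two-line stand-in for the merged kubeconfig (both are assumptions for illustration, not the tutorial's real files). Appending on a second run stacks a second copy of the document onto the first, while truncating leaves exactly one copy.

```shell
# Hypothetical stand-in for the output of `kubectl config view --merge --flatten`.
merged_config() {
  printf 'apiVersion: v1\nkind: Config\n'
}

out=$(mktemp)

# First run writes the file; a re-run with ">>" appends a second copy.
merged_config > "$out"
merged_config >> "$out"
append_lines=$(wc -l < "$out" | tr -d ' ')

# A re-run with ">" truncates first, so only one copy remains.
merged_config > "$out"
truncate_lines=$(wc -l < "$out" | tr -d ' ')

echo "append: $append_lines lines, truncate: $truncate_lines lines"
# prints "append: 4 lines, truncate: 2 lines"

rm -f "$out"
```

A file holding two concatenated `kind: Config` documents is not a single well-formed kubeconfig, which is why the truncating form is the safer choice for a setup script that may run more than once.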

-	if jm_memory ~= nil then jm_requires.resourceRequest.memory = kube.getResourceQuantity(jm_memory) end
+	if jm_memory ~= nil then jm_requires.resourceRequest.memory = jm_memory end

		@@ -0,0 +1,3 @@
		#!/bin/bash

		kubectl -n karmada-system get deployment karmada-controller-manager -o json | jq -r '.spec.template.spec.containers[0].command[]' | grep -q "MultiplePodTemplatesScheduling=true" && kubectl -n karmada-system get deployment karmada-webhook -o json | jq -r '.spec.template.spec.containers[0].command[]' | grep -q "MultiplePodTemplatesScheduling=true"
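
The check above covers `karmada-controller-manager` and `karmada-webhook` only. A possible way to cover all three components is sketched below. It assumes that `karmada-scheduler` is also a Deployment in `karmada-system` and that it carries the gate in `containers[0].command` like the other two. The helper names `has_gate` and `check_component` are hypothetical.

```shell
# Succeeds if the feature gate appears in the container command read from stdin.
has_gate() {
  grep -q "MultiplePodTemplatesScheduling=true"
}

# Assumed layout: the gate sits in containers[0].command of the named Deployment.
check_component() {
  kubectl -n karmada-system get deployment "$1" -o json \
    | jq -r '.spec.template.spec.containers[0].command[]' \
    | has_gate
}

# Usage (needs access to the host cluster, so it is left commented out here):
# for c in karmada-controller-manager karmada-scheduler karmada-webhook; do
#   check_component "$c" || { echo "$c: feature gate not enabled"; exit 1; }
# done
```

Splitting the text match from the `kubectl` call keeps the matching logic testable on plain input, and adding a component to the loop is a one-word change.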


		Multi-component scheduling (`MultiplePodTemplatesScheduling`) is currently an Alpha feature in Karmada and is disabled by default. We need to explicitly enable it on the `karmada-controller-manager`, `karmada-scheduler`, and `karmada-webhook` components.

		> Note: Because these components are running as native Pods on the underlying host cluster, we patch them using the default `kubectl` context, not the Karmada API server kubeconfig. We also temporarily change their deployment strategy to `Recreate` to prevent resource deadlocks during the rollout.

	> Note: We have pre-downloaded the necessary CRDs and placed them in `/root/examples/` for you.
	> Note: Karmada has built-in support for interpreting common third-party multi-component workload resources such as FlinkDeployment and VolcanoJob. They define rules for Karmada to parse these resources, covering extraction of replicas and resource requirements of each component, judgment of workload health status and identification of dependent resources.

		@@ -0,0 +1,68 @@
		### Provide Workload Definitions to Karmada

		Before Karmada can schedule complex Flink and Volcano workloads, it needs to understand their structure.


		RUN `kubectl --kubeconfig /etc/karmada/karmada-apiserver.config apply --validate=false -f /root/examples/flinkdeployment-cr.yaml`{{exec}}

		This applies the Flink Custom Resource.

-	This applies the Flink Custom Resource.
+	This applies the FlinkDeployment Custom Resource.


		</details>

		RUN `kubectl --kubeconfig /etc/karmada/karmada-apiserver.config apply --validate=false -f /root/examples/flinkdeployment-cr.yaml`{{exec}}


		If a multi-cluster scheduler treats these complex jobs as a single generic workload, it may underestimate the total resources required, or accidentally scatter the tightly-coupled components across entirely different geographical clusters, destroying the low-latency communication required for the job to function.

		In this scenario, we will deploy multi-component workloads (Flink and Volcano) and use custom Resource Interpreters to teach Karmada how to extract their individual components. We will then use `SpreadConstraints` to ensure all components of a job are scheduled atomistically to the exact same target cluster.


		When you apply a workload, Karmada uses its Resource Interpreter to analyze the custom resource, extract its requirements, and wrap it into a `ResourceBinding`. Let's inspect this binding to see what Karmada discovered.

		First, extract the dynamic binding name into a variable:


		Multi-component scheduling (`MultiplePodTemplatesScheduling`) is currently an Alpha feature in Karmada and is disabled by default. We need to explicitly enable it on three core control plane components to ensure the entire scheduling pipeline can process multi-component workloads:

		- `karmada-webhook`: Needs the feature gate to successfully validate and mutate the multi-component fields within incoming `ResourceBinding` and `PropagationPolicy` objects.

-	- `karmada-scheduler`: Uses it to compute the aggregate resource requirements of all extracted components, ensuring the selected target cluster has sufficient capacity to host the entire complex workload.
+	- `karmada-scheduler`: Uses it to obtain the detailed resource requirements of the workload, ensuring the selected target cluster has sufficient capacity to host the entire complex workload.


		RUN `kubectl --kubeconfig /etc/karmada/karmada-apiserver.config apply --validate=false -f /root/examples/flinkdeployments.flink.apache.org-v1.yaml`{{exec}}

		This applies the Flink CRD.

-	This applies the Flink CRD.
+	This applies the FlinkDeployment CRD.


		Karmada needs to know exactly how to parse the custom resources to find their component definitions. We provide Lua scripts that teach Karmada how to do this.

		RUN `kubectl --kubeconfig /etc/karmada/karmada-apiserver.config apply -f /root/examples/flink-interpreter.yaml`{{exec}}


		Apply the Resource Interpreters:

		While Karmada's newer versions have built-in support for parsing FlinkDeployment workloads, the version installed in this environment requires us to explicitly provide a Lua script that teaches Karmada how to extract the component definitions.


		Apply the CRDs and PropagationPolicy:

		RUN `kubectl --kubeconfig /etc/karmada/karmada-apiserver.config apply --validate=false -f /root/examples/flinkdeployments.flink.apache.org-v1.yaml`{{exec}}


		To achieve this, we must apply their Custom Resource Definitions (CRDs) to the Karmada control plane and propagate them to the member clusters.

		> Note: Karmada (v1.17+) has built-in support for interpreting FlinkDeployment workloads. It automatically handles extraction of each component's replicas and resource requirements, so no manual Resource Interpreter is needed.


		If a multi-cluster scheduler treats these complex jobs as a single generic workload, it may underestimate the total resources required, or accidentally scatter the tightly-coupled components across entirely different geographical clusters, destroying the low-latency communication required for the job to function.

		In this scenario, we will deploy a multi-component workload (FlinkDeployment, though VolcanoJob is also fully supported) and use custom Resource Interpreters to teach Karmada how to extract its individual components. We will then use `SpreadConstraints` to ensure all workload components are scheduled atomistically to the identical target cluster with sufficient resources.

-	In this scenario, we will deploy a multi-component workload (FlinkDeployment, though VolcanoJob is also fully supported) and use custom Resource Interpreters to teach Karmada how to extract its individual components. We will then use `SpreadConstraints` to ensure all workload components are scheduled atomistically to the identical target cluster with sufficient resources.
+	In this scenario, we will deploy a FlinkDeployment and use a dedicated PropagationPolicy to atomically propagate this multi-component workload to a member cluster with sufficient resources.


		RUN `karmadactl init --karmada-controller-manager-extra-args="--feature-gates=MultiplePodTemplatesScheduling=true" --karmada-scheduler-extra-args="--feature-gates=MultiplePodTemplatesScheduling=true" --karmada-webhook-extra-args="--feature-gates=MultiplePodTemplatesScheduling=true"`{{exec}}

		This sets up the Karmada control plane on the host cluster with multi-component scheduling enabled on the `karmada-controller-manager`, `karmada-scheduler`, and `karmada-webhook` from the start — no additional patches or restarts needed.


		This FlinkDeployment includes a JobManager (1 replica, 1 CPU, 100Mi memory) and a TaskManager.
		The TaskManager replica count is automatically computed as 1 using `ceil(parallelism/numberOfTaskSlots)`, with resources of 1 CPU and 100Mi memory.
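
The `ceil(parallelism/numberOfTaskSlots)` rule can be checked with integer arithmetic alone. This is a minimal sketch; the function name `tm_replicas` and the sample inputs are hypothetical, since the manifest's actual `parallelism` and `numberOfTaskSlots` values are not quoted here.

```shell
# Ceiling division for positive integers: ceil(a / b) == (a + b - 1) / b.
tm_replicas() {
  parallelism=$1
  slots=$2
  echo $(( (parallelism + slots - 1) / slots ))
}

tm_replicas 2 2   # prints 1: both parallel tasks fit in one TaskManager
tm_replicas 3 2   # prints 2: the third task needs a second TaskManager
```

Any parallelism that does not exceed the slot count yields a single TaskManager replica, which matches the replica count of 1 stated above.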

-	- karmada-webhook: Intercepts the scheduling policies and validates the multi-component configurations.
+	- karmada-webhook: Validates the multi-component fields within incoming `ResourceBinding` objects.