operator pdb: fix the remaining issues #7013

zhzhuang-zju · 2025-12-12T03:41:58Z

What type of PR is this?
/kind cleanup

What this PR does / why we need it:
Address the remaining review comments from #6933.

Which issue(s) this PR fixes:

Part of #6282

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

karmada-bot · 2025-12-12T03:42:08Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign poor12 for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

operator/OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

gemini-code-assist · 2025-12-12T03:42:17Z

Summary of Changes

Hello @zhzhuang-zju, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request focuses on refining the operator's handling of PodDisruptionBudgets (PDBs) by streamlining the creation of owner references for Deployments and StatefulSets. It enhances efficiency by eliminating redundant API calls to retrieve resource UIDs and improves code clarity through the introduction of standardized GVK constants and general import cleanup across various control plane components. These changes contribute to a more robust and maintainable PDB management system.

Highlights

PDB Owner Reference Streamlining: The process of creating owner references for PodDisruptionBudgets (PDBs) has been optimized. Instead of fetching the Deployment or StatefulSet after creation/update to get its UID, the CreateOrUpdate functions now directly return the updated object, which is then used to establish the owner reference.
API Client Function Signature Update: The CreateOrUpdateDeployment and CreateOrUpdateStatefulSet functions in idempotency.go have been modified to return the created or updated Kubernetes resource object, in addition to any error encountered. This change enables more direct use of the resource's properties, such as its UID.
Standardized GroupVersionKind (GVK) Definitions: New package-level variables, DeploymentGVK and StatefulSetGVK, have been introduced to provide standardized GroupVersionKind definitions for Deployment and StatefulSet resources, promoting consistency across the codebase.
Code Cleanup and Refactoring: Unused context and schema imports have been removed from several control plane component files, and the createPodDisruptionBudget function in pdb.go has been renamed to buildPodDisruptionBudget and refactored to directly return the PDB object without an error, as it no longer performs API calls.
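
The flow described in the highlights can be sketched in a few lines of standalone Go. This is only an illustration under stated assumptions: `GroupVersionKind`, `OwnerReference`, `Deployment`, `fakeServer`, and `ownerReferenceFor` are simplified, hypothetical stand-ins rather than the Kubernetes or Karmada types, and the object names are made up. The point it shows is that once the create-or-update helper returns the stored object, its UID is available for the owner reference without a second fetch.

```go
package main

import (
	"errors"
	"fmt"
)

// Simplified stand-ins for the Kubernetes API types (assumption: only the
// fields needed for this sketch are modeled).
type GroupVersionKind struct{ Group, Version, Kind string }

type OwnerReference struct {
	APIVersion, Kind, Name, UID string
}

type Deployment struct {
	Namespace, Name string
	UID             string // assigned by the server; empty on the desired object
}

// DeploymentGVK echoes the variable name from the summary; its value here is
// the well-known apps/v1 Deployment kind.
var DeploymentGVK = GroupVersionKind{Group: "apps", Version: "v1", Kind: "Deployment"}

var errAlreadyExists = errors.New("already exists")

// fakeServer is a hypothetical in-memory store that assigns a UID on create
// and preserves it on update.
type fakeServer struct {
	objects map[string]Deployment
	nextUID int
}

func (s *fakeServer) create(d Deployment) (*Deployment, error) {
	key := d.Namespace + "/" + d.Name
	if _, ok := s.objects[key]; ok {
		return nil, errAlreadyExists
	}
	s.nextUID++
	d.UID = fmt.Sprintf("uid-%d", s.nextUID)
	s.objects[key] = d
	return &d, nil
}

func (s *fakeServer) update(d Deployment) (*Deployment, error) {
	key := d.Namespace + "/" + d.Name
	stored, ok := s.objects[key]
	if !ok {
		return nil, errors.New("not found")
	}
	d.UID = stored.UID
	s.objects[key] = d
	return &d, nil
}

// createOrUpdateDeployment returns the stored object in both paths, so the
// caller never needs a follow-up Get to learn the UID.
func createOrUpdateDeployment(s *fakeServer, d Deployment) (*Deployment, error) {
	latest, err := s.create(d)
	if err != nil {
		if !errors.Is(err, errAlreadyExists) {
			return nil, err
		}
		if latest, err = s.update(d); err != nil {
			return nil, err
		}
	}
	return latest, nil
}

// ownerReferenceFor builds the reference from the returned object and a GVK.
func ownerReferenceFor(d *Deployment, gvk GroupVersionKind) OwnerReference {
	return OwnerReference{
		APIVersion: gvk.Group + "/" + gvk.Version,
		Kind:       gvk.Kind,
		Name:       d.Name,
		UID:        d.UID,
	}
}

func main() {
	s := &fakeServer{objects: map[string]Deployment{}}
	latest, err := createOrUpdateDeployment(s, Deployment{Namespace: "demo", Name: "example"})
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", ownerReferenceFor(latest, DeploymentGVK))
}
```

In the real operator the owner would be an `*appsv1.Deployment` or `*appsv1.StatefulSet` returned by the helpers in idempotency.go; the sketch only mirrors the data flow.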

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

This pull request refactors the creation of PodDisruptionBudgets (PDBs) for various components, simplifying the code by avoiding an extra API call to fetch the owner object. The changes in CreateOrUpdateDeployment and CreateOrUpdateStatefulSet to return the created/updated object are central to this improvement.

While the overall direction is good, I've found a couple of issues in operator/pkg/util/apiclient/idempotency.go that should be addressed:

CreateOrUpdateDeployment is missing a Get call before Update, which will likely cause update conflicts.
CreateOrUpdateStatefulSet has a variable shadowing issue and returns an incorrect error value.

I've provided detailed comments and suggestions to fix these issues. Once these are addressed, the PR will be a solid improvement.

gemini-code-assist · 2025-12-12T03:45:03Z

operator/pkg/util/apiclient/idempotency.go

+func CreateOrUpdateDeployment(client clientset.Interface, deployment *appsv1.Deployment) (*appsv1.Deployment, error) {
+	var latest *appsv1.Deployment
+	var err error
+	latest, err = client.AppsV1().Deployments(deployment.GetNamespace()).Create(context.TODO(), deployment, metav1.CreateOptions{})
 	if err != nil {
 		if !apierrors.IsAlreadyExists(err) {
-			return err
+			return nil, err
 		}

-		_, err := client.AppsV1().Deployments(deployment.GetNamespace()).Update(context.TODO(), deployment, metav1.UpdateOptions{})
+		latest, err = client.AppsV1().Deployments(deployment.GetNamespace()).Update(context.TODO(), deployment, metav1.UpdateOptions{})
 		if err != nil {
-			return err
+			return nil, err
 		}
 	}

 	klog.V(5).InfoS("Successfully created or updated deployment", "deployment", deployment.GetName())
-	return nil
+	return latest, err
 }


The update path in this function is likely to fail with a conflict error. When updating a resource, the ResourceVersion of the object must be set to the current version in the cluster to avoid conflicts. The deployment object here is the desired state and does not have the ResourceVersion of the existing object.

To fix this, you should Get the existing deployment before calling Update to retrieve its ResourceVersion, similar to the pattern used in CreateOrUpdateStatefulSet.

func CreateOrUpdateDeployment(client clientset.Interface, deployment *appsv1.Deployment) (*appsv1.Deployment, error) {
	var latest *appsv1.Deployment
	var err error
	latest, err = client.AppsV1().Deployments(deployment.GetNamespace()).Create(context.TODO(), deployment, metav1.CreateOptions{})
	if err != nil {
		if !apierrors.IsAlreadyExists(err) {
			return nil, err
		}

		older, getErr := client.AppsV1().Deployments(deployment.GetNamespace()).Get(context.TODO(), deployment.GetName(), metav1.GetOptions{})
		if getErr != nil {
			return nil, getErr
		}

		deployment.ResourceVersion = older.ResourceVersion
		latest, err = client.AppsV1().Deployments(deployment.GetNamespace()).Update(context.TODO(), deployment, metav1.UpdateOptions{})
		if err != nil {
			return nil, err
		}
	}

	klog.V(5).InfoS("Successfully created or updated deployment", "deployment", deployment.GetName())
	return latest, err
}

gemini-code-assist · 2025-12-12T03:45:03Z

operator/pkg/util/apiclient/idempotency.go

+func CreateOrUpdateStatefulSet(client clientset.Interface, statefulSet *appsv1.StatefulSet) (*appsv1.StatefulSet, error) {
+	var latest *appsv1.StatefulSet
+	var err error
+	latest, err = client.AppsV1().StatefulSets(statefulSet.GetNamespace()).Create(context.TODO(), statefulSet, metav1.CreateOptions{})
 	if err != nil {
 		if !apierrors.IsAlreadyExists(err) {
-			return err
+			return nil, err
 		}

 		older, err := client.AppsV1().StatefulSets(statefulSet.GetNamespace()).Get(context.TODO(), statefulSet.GetName(), metav1.GetOptions{})
 		if err != nil {
-			return err
+			return nil, err
 		}

 		statefulSet.ResourceVersion = older.ResourceVersion
-		_, err = client.AppsV1().StatefulSets(statefulSet.GetNamespace()).Update(context.TODO(), statefulSet, metav1.UpdateOptions{})
+		latest, err = client.AppsV1().StatefulSets(statefulSet.GetNamespace()).Update(context.TODO(), statefulSet, metav1.UpdateOptions{})
 		if err != nil {
-			return err
+			return nil, err
 		}
 	}

 	klog.V(5).InfoS("Successfully created or updated statefulset", "statefulset", statefulSet.GetName())
-	return nil
+	return latest, nil
 }


This function has a variable shadowing issue and an incorrect return value that should be addressed.

At line 262, older, err := ... shadows the err variable from the outer scope. This can lead to unexpected behavior as the error from the Create operation is not correctly handled in the update path.

At line 275, the function returns nil for the error, which is inconsistent with CreateOrUpdateDeployment and can mask an error from a successful update path.

To fix these issues and improve consistency, the function should be refactored to handle errors properly without shadowing and return the final error state.

func CreateOrUpdateStatefulSet(client clientset.Interface, statefulSet *appsv1.StatefulSet) (*appsv1.StatefulSet, error) {
	var latest *appsv1.StatefulSet
	var err error
	latest, err = client.AppsV1().StatefulSets(statefulSet.GetNamespace()).Create(context.TODO(), statefulSet, metav1.CreateOptions{})
	if err != nil {
		if !apierrors.IsAlreadyExists(err) {
			return nil, err
		}

		var older *appsv1.StatefulSet
		older, err = client.AppsV1().StatefulSets(statefulSet.GetNamespace()).Get(context.TODO(), statefulSet.GetName(), metav1.GetOptions{})
		if err != nil {
			return nil, err
		}

		statefulSet.ResourceVersion = older.ResourceVersion
		latest, err = client.AppsV1().StatefulSets(statefulSet.GetNamespace()).Update(context.TODO(), statefulSet, metav1.UpdateOptions{})
		if err != nil {
			return nil, err
		}
	}

	klog.V(5).InfoS("Successfully created or updated statefulset", "statefulset", statefulSet.GetName())
	return latest, err
}

@gemini-code-assist

> At line 262, older, err := ... shadows the err variable from the outer scope. This can lead to unexpected behavior as the error from the Create operation is not correctly handled in the update path.

Could you elaborate on the specific issues with shadowed declarations? I don’t really see a problem here—err appears to be properly handled in all cases.

Thanks for asking for clarification! I understand why it might seem like the err variable is handled correctly in all cases, but there's a subtle yet important issue with variable shadowing in Go that can lead to unexpected behavior.

When you have the line older, err := client.AppsV1().StatefulSets(...).Get(...) at line 262, the := operator declares a new err variable within that if block's scope. This new err variable shadows the err variable declared in the outer function scope.

Here's the problem:

1. Initial Create call: The err variable in the outer scope holds the result of the Create operation. If it's an AlreadyExists error, the code proceeds into the if err != nil block.

2. Get call within the if block: older, err := client.AppsV1().StatefulSets(...).Get(...) declares a new err variable. If this Get call fails, this new err variable will hold that error, and the function correctly returns nil, err (referring to the new err).

3. Update call within the if block: _, err = client.AppsV1().StatefulSets(...).Update(...) then assigns the result of the Update operation to this new, shadowed err variable. If this Update call fails, this new err variable will hold that error, and the function correctly returns nil, err (referring to the new err).

The critical flaw occurs if the Update call succeeds, but the Create call originally failed with AlreadyExists:

1. The outer err satisfies apierrors.IsAlreadyExists(err), so the code enters the if block.

2. The Get call succeeds.

3. The Update call succeeds. The new err variable (declared at line 262) is now nil.

4. The function then executes return latest, nil (referring to the new, nil err from the inner scope).

This means that even though the original Create operation failed (with AlreadyExists), and the Update operation successfully handled the existing resource, the function's return value for err would be nil. This masks the fact that the resource wasn't created but rather updated, which can be misleading for callers expecting a nil error only on successful creation or update without prior existence issues.

My suggested change addresses this by ensuring that the err variable is consistently the same variable throughout the function, so its final state accurately reflects the overall outcome of the CreateOrUpdate logic. By changing older, err := ... to older, err = ... (and similarly for the latest, err = ... assignment), we avoid introducing a new err variable and ensure that the final return latest, err correctly propagates any error that occurred at any stage, or nil if everything was successful.

> This masks the fact that the resource wasn't created but rather updated, which can be misleading for callers expecting a nil error only on successful creation or update without prior existence issues.

You've misunderstood one key point: the purpose of this function is CreateOrUpdate. If the create fails but the subsequent update succeeds, the operation is considered successful from the caller's perspective. This function is specifically designed to abstract away whether the resource already exists or not.
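
The scoping behavior debated in this thread can be checked with a standalone Go program. This is a sketch using hypothetical stand-ins (`create`, `get`, `update`) that return strings instead of the client-go calls in idempotency.go. It compares the shape from the diff (inner `:=` plus an explicit `return latest, nil`) with the shape from the suggestion (a single `err` plus `return latest, err`) when the create reports that the object already exists and the update succeeds.

```go
package main

import (
	"errors"
	"fmt"
)

var errAlreadyExists = errors.New("already exists")

// Hypothetical stand-ins: create always reports that the object exists,
// while get and update succeed.
func create() (string, error) { return "", errAlreadyExists }
func get() (string, error)    { return "stored", nil }
func update() (string, error) { return "updated", nil }

// withShadowing mirrors the shape of the diff: the inner := declares a new
// err scoped to the if block, and the final return uses a literal nil.
func withShadowing() (string, error) {
	var latest string
	var err error
	latest, err = create()
	if err != nil {
		if !errors.Is(err, errAlreadyExists) {
			return "", err
		}
		older, err := get() // new err, visible only inside this block
		if err != nil {
			return "", err
		}
		_ = older
		latest, err = update() // assigns the outer latest and the inner err
		if err != nil {
			return "", err
		}
	}
	return latest, nil
}

// withoutShadowing mirrors the shape of the suggestion: one err variable
// for the whole function, returned at the end.
func withoutShadowing() (string, error) {
	var latest, older string
	var err error
	latest, err = create()
	if err != nil {
		if !errors.Is(err, errAlreadyExists) {
			return "", err
		}
		older, err = get()
		if err != nil {
			return "", err
		}
		_ = older
		latest, err = update()
		if err != nil {
			return "", err
		}
	}
	return latest, err
}

func main() {
	a, errA := withShadowing()
	b, errB := withoutShadowing()
	fmt.Println(a, errA)
	fmt.Println(b, errB)
}
```

With these stand-ins both variants return the updated value and a nil error, because every failure path returns early and the only way to reach the final return through the update path is a successful update.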

Signed-off-by: zhzhuang-zju <[email protected]>

codecov-commenter · 2025-12-12T06:57:25Z

⚠️ Please install the Codecov GitHub app to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

❌ Patch coverage is 20.00000% with 28 lines in your changes missing coverage. Please review.
✅ Project coverage is 46.61%. Comparing base (88b3e4b) to head (b1ac00e).
⚠️ Report is 4 commits behind head on master.

Files with missing lines	Patch %	Lines
operator/pkg/util/apiclient/idempotency.go	0.00%	17 Missing ⚠️
operator/pkg/controlplane/apiserver/apiserver.go	40.00%	0 Missing and 3 partials ⚠️
operator/pkg/controlplane/pdb/pdb.go	0.00%	3 Missing ⚠️
operator/pkg/controlplane/controlplane.go	50.00%	0 Missing and 1 partial ⚠️
operator/pkg/controlplane/etcd/etcd.go	50.00%	0 Missing and 1 partial ⚠️
.../pkg/controlplane/metricsadapter/metricsadapter.go	50.00%	0 Missing and 1 partial ⚠️
operator/pkg/controlplane/search/search.go	50.00%	0 Missing and 1 partial ⚠️
operator/pkg/controlplane/webhook/webhook.go	50.00%	0 Missing and 1 partial ⚠️
❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #7013      +/-   ##
==========================================
- Coverage   46.65%   46.61%   -0.05%     
==========================================
  Files         699      699              
  Lines       48163    48158       -5     
==========================================
- Hits        22469    22447      -22     
- Misses      23999    24022      +23     
+ Partials     1695     1689       -6

Flag	Coverage Δ
unittests	`46.61% <20.00%> (-0.05%)`	⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.


RainbowMango · 2025-12-16T03:57:10Z

@zhzhuang-zju Please rebase this PR as the E2E tests have been merged #7010.

zhzhuang-zju · 2025-12-16T03:57:49Z

~~Since #7010 has been merged, I will rebase to trigger the newly added e2e test.~~

> @zhzhuang-zju Please rebase this PR as the E2E tests have been merged #7010.

Sure

karmada-bot added the kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. label Dec 12, 2025

karmada-bot requested review from chaosi-zju and lonelyCZ December 12, 2025 03:42

karmada-bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Dec 12, 2025

gemini-code-assist bot reviewed Dec 12, 2025

operator pdb: fix the remaining issues

b1ac00e

Signed-off-by: zhzhuang-zju <[email protected]>

zhzhuang-zju force-pushed the operator-pdb branch from 46a6ae6 to b1ac00e Compare December 12, 2025 06:40

zhzhuang-zju mentioned this pull request Dec 16, 2025

operator e2e: add pdb e2e test #7010

Merged
