
K8SPSMDB-1455: avoid repeated delete calls #2389

Merged
egegunes merged 22 commits into main from K8SPSMDB-1455 on Jun 16, 2026

Conversation

@mayankshah1607 (Member) commented Jun 8, 2026

CHANGE DESCRIPTION

Problem:
Fixes #1751

The operator makes repeated, redundant Delete calls to the Kubernetes API.

Cause:
Much of the clean-up logic makes unconditional Delete calls on every reconcile, which floods the Kubernetes API with unnecessary requests.

Solution:

  • Added a DeleteIfExists helper that deletes an object only if it exists. The initial Get call is served from the cache, so the number of requests to the Kubernetes API is effectively reduced.
  • This PR also fixes a potential regression that completely disabled the controller-runtime client cache.

CHECKLIST

Jira

  • Is the Jira ticket created and referenced properly?
  • Does the Jira ticket have the proper statuses for documentation (Needs Doc) and QA (Needs QA)?
  • Does the Jira ticket link to the proper milestone (Fix Version field)?

Tests

  • Is an E2E test/test case added for the new feature/change?
  • Are unit tests added where appropriate?
  • Are OpenShift compare files changed for E2E tests (compare/*-oc.yml)?

Config/Logging/Testability

  • Are all needed new/changed options added to default YAML files?
  • Are all needed new/changed options added to the Helm Chart?
  • Did we add proper logging messages for operator actions?
  • Did we ensure compatibility with the previous version or cluster upgrade process?
  • Does the change support oldest and newest supported MongoDB version?
  • Does the change support oldest and newest supported Kubernetes version?

Signed-off-by: Mayank Shah <mayank.shah@percona.com>
Copilot AI review requested due to automatic review settings June 8, 2026 10:11
@pull-request-size pull-request-size Bot added the size/M 30-99 lines label Jun 8, 2026

Copilot AI left a comment


Pull request overview

This PR introduces a small Kubernetes utility helper to make resource deletions idempotent and updates the main PSMDB controller to use it, aiming to avoid repeated delete attempts when objects are already gone.

Changes:

  • Added k8s.DeleteIfExists(ctx, client, obj) helper in pkg/k8s/utils.go.
  • Replaced several direct Delete(...)/IsNotFound(...) patterns in the PSMDB controller with DeleteIfExists (arbiter SFS, non-voting SFS, config SFS/service, mongos SFS).

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File | Description
pkg/k8s/utils.go | Adds DeleteIfExists helper used to skip deletion when the object is absent.
pkg/controller/perconaservermongodb/psmdb_controller.go | Switches multiple deletion call sites to use the new helper to reduce repeated delete calls.


Comment thread pkg/k8s/utils.go
Comment thread pkg/k8s/utils.go Outdated
Comment on lines +49 to +55
func DeleteIfExists(ctx context.Context, c client.Client, obj client.Object) error {
	if err := c.Get(ctx, client.ObjectKeyFromObject(obj), obj); k8serrors.IsNotFound(err) {
		return nil
	} else if err != nil {
		return errors.Wrapf(err, "failed to get object: %s", obj.GetName())
	}
	return c.Delete(ctx, obj)
Copilot AI review requested due to automatic review settings June 8, 2026 10:14

Copilot AI left a comment


Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

Comment thread pkg/k8s/utils.go Outdated
Comment thread pkg/controller/perconaservermongodb/psmdb_controller.go Outdated
err = errors.Errorf("delete nonVoting statefulset %s: %v", replset.Name, err)
return err
if err := k8sutils.DeleteIfExists(ctx, r.client, psmdb.NewStatefulSet(naming.NonVotingStatefulSetName(cr, replset), cr.Namespace)); err != nil {
return errors.Wrapf(err, "failed to delete non voting statefulset: %s", naming.NonVotingStatefulSetName(cr, replset))
Signed-off-by: Mayank Shah <mayank.shah@percona.com>
Comment on lines -84 to -92
client, err := client.New(mgr.GetConfig(), client.Options{
	Scheme: mgr.GetScheme(),
	Cache: &client.CacheOptions{
		DisableFor: []client.Object{&corev1.Node{}},
	},
})
if err != nil {
	return nil, errors.Wrap(err, "create client")
}

@mayankshah1607 (Member, Author) commented:

@egegunes this was added in #1533

It effectively disabled client caching: Reader is missing, so the client ignores the manager's cache. If the intention was only to drop caching for Nodes, that must be set in NewManager. Can you confirm the expectation of that change?
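
For reference, a sketch of what disabling the cache only for Nodes at the manager level might look like. It assumes a controller-runtime version whose manager Options has a Client field (v0.15 or later); `newManager` and the package name are hypothetical, and this is not necessarily the change made in this PR.

```go
package example

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/runtime"
	"k8s.io/client-go/rest"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// newManager is a hypothetical constructor, not the operator's actual code.
// The manager builds its default client from these options and supplies its
// own cache as the reader, so every type except Node is read from the cache.
func newManager(cfg *rest.Config, scheme *runtime.Scheme) (ctrl.Manager, error) {
	return ctrl.NewManager(cfg, ctrl.Options{
		Scheme: scheme,
		Client: client.Options{
			Cache: &client.CacheOptions{
				// Node reads bypass the cache and go to the API server.
				DisableFor: []client.Object{&corev1.Node{}},
			},
		},
	})
}
```

Controllers would then use mgr.GetClient() instead of building a separate client with client.New, which is what leaves the cache reader unset in the removed snippet above.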

Copilot AI review requested due to automatic review settings June 8, 2026 11:34

Copilot AI left a comment


Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

Comment thread pkg/k8s/utils.go Outdated
Comment thread pkg/k8s/utils.go Outdated
@hors hors added this to the v1.23.0 milestone Jun 8, 2026
@mayankshah1607 mayankshah1607 marked this pull request as ready for review June 9, 2026 04:07
egegunes previously approved these changes Jun 9, 2026
Signed-off-by: Mayank Shah <mayank.shah@percona.com>
egegunes previously approved these changes Jun 9, 2026
gkech previously approved these changes Jun 9, 2026

Copilot AI left a comment


Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

Comment thread pkg/k8s/utils.go
Comment on lines +49 to +60
func DeleteIfExists(ctx context.Context, c client.Client, obj client.Object) error {
	if err := c.Get(ctx, client.ObjectKeyFromObject(obj), obj); k8serrors.IsNotFound(err) {
		return nil
	} else if err != nil {
		return errors.Wrapf(err, "failed to get %T %s/%s", obj, obj.GetNamespace(), obj.GetName())
	}

	if err := c.Delete(ctx, obj); client.IgnoreNotFound(err) != nil {
		return errors.Wrapf(err, "failed to delete %T %s/%s", obj, obj.GetNamespace(), obj.GetName())
	}
	return nil
}
}

return r.deleteMongos(ctx, cr)
if err := k8sutils.DeleteIfExists(ctx, r.client, psmdb.MongosStatefulset(cr)); err != nil {
Signed-off-by: Mayank Shah <mayank.shah@percona.com>
Copilot AI review requested due to automatic review settings June 12, 2026 11:11

Copilot AI left a comment


Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

Comment thread pkg/k8s/utils.go
Comment on lines +49 to +60
func DeleteIfExists(ctx context.Context, c client.Client, obj client.Object) error {
	if err := c.Get(ctx, client.ObjectKeyFromObject(obj), obj); k8serrors.IsNotFound(err) {
		return nil
	} else if err != nil {
		return errors.Wrapf(err, "failed to get %T %s/%s", obj, obj.GetNamespace(), obj.GetName())
	}

	if err := c.Delete(ctx, obj); client.IgnoreNotFound(err) != nil {
		return errors.Wrapf(err, "failed to delete %T %s/%s", obj, obj.GetNamespace(), obj.GetName())
	}
	return nil
}
Comment on lines +56 to +60
const eventRegardingNameIndex = "regarding.name"

func eventRegardingNameIndexer(o client.Object) []string {
	return []string{o.(*eventsv1.Event).Regarding.Name}
}
Signed-off-by: Mayank Shah <mayank.shah@percona.com>
Copilot AI review requested due to automatic review settings June 15, 2026 07:56
@github-actions github-actions Bot added the tests label Jun 15, 2026

Copilot AI left a comment


Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

Comment on lines +58 to +60
func eventRegardingNameIndexer(o client.Object) []string {
	return []string{o.(*eventsv1.Event).Regarding.Name}
}
Comment thread pkg/k8s/utils.go
Comment on lines +49 to +60
func DeleteIfExists(ctx context.Context, c client.Client, obj client.Object) error {
	if err := c.Get(ctx, client.ObjectKeyFromObject(obj), obj); k8serrors.IsNotFound(err) {
		return nil
	} else if err != nil {
		return errors.Wrapf(err, "failed to get %T %s/%s", obj, obj.GetNamespace(), obj.GetName())
	}

	if err := c.Delete(ctx, obj); client.IgnoreNotFound(err) != nil {
		return errors.Wrapf(err, "failed to delete %T %s/%s", obj, obj.GetNamespace(), obj.GetName())
	}
	return nil
}
Signed-off-by: Mayank Shah <mayank.shah@percona.com>
Copilot AI review requested due to automatic review settings June 15, 2026 08:45

Copilot AI left a comment


Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

Comment on lines +77 to +82
// the volume resize logic lists PVC events through the cached client,
// which requires the field to be indexed
err = mgr.GetFieldIndexer().IndexField(context.TODO(), &eventsv1.Event{}, eventRegardingNameIndex, eventRegardingNameIndexer)
if err != nil {
	return errors.Wrapf(err, "index events by %s", eventRegardingNameIndex)
}
Comment thread e2e-tests/pvc-resize/run

# user should be able to restore to the previous size and make the cluster ready
patch_pvc_request "${cluster}" "3G"
wait_pvc_request_revert "${cluster}" "3G"

@mayankshah1607 (Member, Author) commented:

We changed this behaviour in #2341

We no longer land in an error state; instead, we revert the update when downscaling. This test was very flaky.

@mayankshah1607 mayankshah1607 marked this pull request as ready for review June 15, 2026 14:23
@mayankshah1607 mayankshah1607 requested a review from egegunes June 15, 2026 16:52
@JNKPercona (Collaborator) commented:
Test Name Result Time
arbiter passed 00:00:00
balancer passed 00:00:00
cert-management-policy passed 00:00:00
cross-site-sharded passed 00:00:00
custom-replset-name passed 00:00:00
custom-tls passed 00:00:00
custom-users-roles passed 00:00:00
custom-users-roles-sharded passed 00:00:00
data-at-rest-encryption passed 00:00:00
data-sharded passed 00:00:00
demand-backup passed 00:16:22
demand-backup-eks-credentials-irsa passed 00:00:00
demand-backup-fs passed 00:23:13
demand-backup-if-unhealthy passed 00:00:00
demand-backup-incremental-aws passed 00:00:00
demand-backup-incremental-azure passed 00:00:00
demand-backup-incremental-gcp-native passed 00:00:00
demand-backup-incremental-gcp-s3 passed 00:00:00
demand-backup-incremental-minio passed 00:00:00
demand-backup-incremental-sharded-aws passed 00:18:33
demand-backup-incremental-sharded-azure passed 00:00:00
demand-backup-incremental-sharded-gcp-native passed 00:00:00
demand-backup-incremental-sharded-gcp-s3 passed 00:00:00
demand-backup-incremental-sharded-minio passed 00:00:00
demand-backup-logical-minio-native-tls passed 00:00:00
demand-backup-physical-parallel passed 00:00:00
demand-backup-physical-aws passed 00:00:00
demand-backup-physical-azure passed 00:00:00
demand-backup-physical-gcp-s3 passed 00:00:00
demand-backup-physical-gcp-native passed 00:00:00
demand-backup-physical-minio passed 00:00:00
demand-backup-physical-minio-native passed 00:00:00
demand-backup-physical-minio-native-tls passed 00:00:00
demand-backup-physical-sharded-parallel passed 00:00:00
demand-backup-physical-sharded-aws passed 00:00:00
demand-backup-physical-sharded-azure passed 00:00:00
demand-backup-physical-sharded-gcp-native passed 00:00:00
demand-backup-physical-sharded-minio passed 00:00:00
demand-backup-physical-sharded-minio-native passed 00:00:00
demand-backup-sharded passed 00:00:00
demand-backup-snapshot passed 00:00:00
demand-backup-snapshot-vault passed 00:00:00
disabled-auth passed 00:00:00
expose-sharded passed 00:00:00
finalizer passed 00:00:00
ignore-labels-annotations passed 00:00:00
init-deploy passed 00:00:00
ldap passed 00:00:00
ldap-tls passed 00:12:57
limits passed 00:00:00
liveness passed 00:00:00
mongod-major-upgrade passed 00:00:00
mongod-major-upgrade-sharded passed 00:00:00
monitoring-2-0 passed 00:00:00
monitoring-pmm3 passed 00:00:00
multi-cluster-service passed 00:00:00
multi-storage passed 00:00:00
non-voting-and-hidden passed 00:00:00
one-pod passed 00:00:00
operator-self-healing-chaos passed 00:00:00
pitr passed 00:00:00
pitr-physical passed 00:59:14
pitr-sharded passed 00:00:00
pitr-to-new-cluster passed 00:00:00
pitr-physical-backup-source passed 00:00:00
preinit-updates passed 00:00:00
pvc-auto-resize passed 00:13:34
pvc-resize passed 00:00:00
recover-no-primary passed 00:00:00
replset-overrides passed 00:00:00
replset-remapping passed 00:00:00
replset-remapping-sharded passed 00:00:00
rs-shard-migration passed 00:00:00
scaling passed 00:00:00
scheduled-backup passed 00:00:00
security-context passed 00:00:00
self-healing-chaos passed 00:00:00
service-per-pod passed 00:00:00
serviceless-external-nodes passed 00:00:00
smart-update passed 00:00:00
split-horizon passed 00:00:00
split-horizon-manual-tls passed 00:00:00
stable-resource-version passed 00:00:00
storage passed 00:00:00
tls-issue-cert-manager passed 00:00:00
unsafe-psa passed 00:00:00
upgrade passed 00:00:00
upgrade-consistency passed 00:00:00
upgrade-consistency-sharded-tls passed 00:00:00
upgrade-sharded passed 00:00:00
upgrade-partial-backup passed 00:00:00
users passed 00:00:00
users-vault passed 00:00:00
version-service passed 00:00:00
Summary Value
Tests Run 94/94
Job Duration 01:29:47
Total Test Time 02:23:55

commit: 136576e
image: perconalab/percona-server-mongodb-operator:PR-2389-136576e0b

@egegunes egegunes merged commit faa8943 into main Jun 16, 2026
20 checks passed
@egegunes egegunes deleted the K8SPSMDB-1455 branch June 16, 2026 11:30

Labels

size/L 100-499 lines tests

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Non-existing statefulsets are being deleted over and over

7 participants