reconcileOperatorDefaults performs unnecessary full Update on RabbitmqCluster CR, causing Helm SSA conflicts #2098

@antoineozenne

Describe the bug

reconcileOperatorDefaults calls r.updateRabbitmqCluster() on the RabbitmqCluster CR every reconciliation loop when ImagePullSecrets is nil, even if DEFAULT_IMAGE_PULL_SECRETS is empty and nothing was actually modified.

This causes two problems:

  1. No-op Update: strings.SplitSeq("", ",") yields a single empty string, which the len(reference) > 0 check skips, so nothing is appended. r.updateRabbitmqCluster() is called anyway:
    if rabbitmqCluster.Spec.ImagePullSecrets == nil {
        // split the comma separated list of default image pull secrets from
        // the 'DEFAULT_IMAGE_PULL_SECRETS' env var, but ignore empty strings.
        for reference := range strings.SplitSeq(r.DefaultImagePullSecrets, ",") {
            if len(reference) > 0 {
                rabbitmqCluster.Spec.ImagePullSecrets = append(rabbitmqCluster.Spec.ImagePullSecrets, corev1.LocalObjectReference{Name: reference})
            }
        }
        if requeue, err := r.updateRabbitmqCluster(ctx, rabbitmqCluster, "image pull secrets"); err != nil {
            return requeue, err
        }
    }
  2. Full PUT instead of targeted Patch: r.updateRabbitmqCluster() sends the entire CR, so the field manager "manager" takes ownership of all fields, not just imagePullSecrets. Additionally, Go's resource.Quantity.MarshalJSON() always serializes as a JSON string, silently converting e.g. cpu: 1 (int) to cpu: "1" (string) on x-kubernetes-int-or-string fields.
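The first problem can be sketched in isolation. This is a minimal, standalone illustration, not the operator's code: pullSecretNames and needsUpdate are hypothetical helpers, and strings.Split is used in place of strings.SplitSeq (both produce one empty element for an empty input) so the sketch also builds on older Go toolchains.

```go
package main

import (
	"fmt"
	"strings"
)

// pullSecretNames (hypothetical helper) splits a comma-separated list, such as
// the value of DEFAULT_IMAGE_PULL_SECRETS, and drops empty entries.
func pullSecretNames(raw string) []string {
	var names []string
	for _, name := range strings.Split(raw, ",") {
		if name != "" {
			names = append(names, name)
		}
	}
	return names
}

// needsUpdate (hypothetical helper) reports whether applying the defaults
// would change the spec. current == nil mirrors the
// `Spec.ImagePullSecrets == nil` check quoted in the issue.
func needsUpdate(current []string, raw string) bool {
	return current == nil && len(pullSecretNames(raw)) > 0
}

func main() {
	// Splitting "" still yields one element: the empty string.
	fmt.Println(len(strings.Split("", ","))) // 1

	fmt.Println(needsUpdate(nil, ""))                     // false: nothing to append
	fmt.Println(needsUpdate(nil, "regcred,other"))        // true
	fmt.Println(needsUpdate([]string{"mine"}, "regcred")) // false: user already set a value
}
```

With a guard of this shape, the update call would only be reached when at least one secret name was actually appended.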

The combination causes SSA conflicts on subsequent helm upgrade because the stored JSON type no longer matches what Helm sends.

To Reproduce

  1. Deploy the operator with DEFAULT_IMAGE_PULL_SECRETS unset or empty
  2. Create a RabbitmqCluster via Helm with a bare integer resource value:
    spec:
      resources:
        limits:
          cpu: 1
          memory: 1Gi
  3. Wait for at least one reconciliation loop
  4. Run helm upgrade on the same release
  5. Observe:
    UPGRADE FAILED: conflict occurred while applying object <ns>/<name> rabbitmq.com/v1beta1, Kind=RabbitmqCluster: Apply failed with 1 conflict: conflict with "manager" using rabbitmq.com/v1beta1: .spec.resources.limits.cpu

Only limits.cpu conflicts because it's the only value that is a bare YAML integer. Values like 1Gi or 500m are YAML strings and survive the round-trip unchanged.
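The type change can be reproduced with the standard library alone. In this sketch, stringQuantity is a hypothetical stand-in for resource.Quantity that only mimics the behavior described above (accept a number or a string, always marshal as a string); it is not the real type and does no quantity parsing.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// stringQuantity is a hypothetical stand-in for resource.Quantity: it accepts
// a JSON number or a JSON string, but always marshals as a JSON string.
type stringQuantity string

func (q *stringQuantity) UnmarshalJSON(b []byte) error {
	var s string
	if err := json.Unmarshal(b, &s); err == nil {
		*q = stringQuantity(s)
		return nil
	}
	// Not a JSON string: fall back to a JSON number and keep its literal text.
	var n json.Number
	if err := json.Unmarshal(b, &n); err != nil {
		return err
	}
	*q = stringQuantity(n.String())
	return nil
}

func (q stringQuantity) MarshalJSON() ([]byte, error) {
	return json.Marshal(string(q))
}

// roundTrip decodes the limits into the typed map and re-encodes them, which
// is roughly what a full Update of a typed object does to every field.
func roundTrip(in string) (string, error) {
	var limits map[string]stringQuantity
	if err := json.Unmarshal([]byte(in), &limits); err != nil {
		return "", err
	}
	out, err := json.Marshal(limits)
	return string(out), err
}

func main() {
	out, err := roundTrip(`{"cpu":1,"memory":"1Gi"}`)
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // {"cpu":"1","memory":"1Gi"}
}
```

Only cpu changes JSON type (number to string); memory was already a string and comes back byte-for-byte identical.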

Expected behavior

helm upgrade should succeed. The operator should:

  1. Not call Update when nothing was appended.
  2. Not take field ownership of fields it did not intend to modify.
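One way to picture both expectations together is a request body that names only spec.imagePullSecrets. The sketch below builds such a JSON merge patch with the standard library; imagePullSecretsPatch and localObjectReference are hypothetical names for illustration, and this is not a proposal for the operator's actual implementation. With controller-runtime, a comparable effect is usually obtained through the client's Patch method with client.MergeFrom, though whether that fits this reconciler is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// localObjectReference mirrors the JSON shape of corev1.LocalObjectReference
// (hypothetical local copy so the sketch has no external dependencies).
type localObjectReference struct {
	Name string `json:"name"`
}

// imagePullSecretsPatch (hypothetical helper) builds a JSON merge patch that
// touches only spec.imagePullSecrets. ok is false when there is nothing to
// patch, in which case the caller should send no request at all.
func imagePullSecretsPatch(names []string) (patch []byte, ok bool, err error) {
	if len(names) == 0 {
		return nil, false, nil
	}
	refs := make([]localObjectReference, 0, len(names))
	for _, name := range names {
		refs = append(refs, localObjectReference{Name: name})
	}
	body := map[string]any{
		"spec": map[string]any{"imagePullSecrets": refs},
	}
	patch, err = json.Marshal(body)
	return patch, err == nil, err
}

func main() {
	if _, ok, _ := imagePullSecretsPatch(nil); !ok {
		fmt.Println("nothing to patch, skip the API call")
	}
	patch, _, err := imagePullSecretsPatch([]string{"regcred"})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(patch)) // {"spec":{"imagePullSecrets":[{"name":"regcred"}]}}
}
```

Because the body never mentions spec.resources, a value such as cpu: 1 is not re-serialized by the operator and stays in the form Helm stored it.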

Version and environment information

  • RabbitMQ: 4.1.3
  • RabbitMQ Cluster Operator: 2.19.0
  • Kubernetes: 1.32
  • Cloud provider or hardware configuration: Azure AKS

Additional context

The current workaround for our specific use case is to quote integer values in Helm values (cpu: "1" instead of cpu: 1).

Labels: bug