Readiness probe succeeds even if the "active" label can't be updated #142

@faroukbi

Description

While operating a Solace cluster provisioned via Helm, we hit a situation where the readiness probe could not update the "active" label of the messaging pods; nevertheless, the script readiness_check.sh returned exit code 0 and the pods continued to be reported as ready. As a consequence, the service kept forwarding traffic to the inactive node, which then rejected connections.

Steps to reproduce the issue:

  1. Provision a Solace cluster. The primary node should have label "active" set to "true" and the backup node should have the label "active" set to "false"
  2. Remove the rolebinding created by the chart so that the service account in use no longer has permission to call the pod patch API
  3. Execute a failover from the primary to the backup
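
For step 2, assuming a Helm release named `my-release` in namespace `solace` (both hypothetical; substitute your own), the rolebinding can be located and removed with kubectl:

```shell
# Hypothetical release/namespace names; first list the rolebindings the chart created.
kubectl get rolebinding --namespace solace

# Remove the one that grants the pods' service account the "patch pods" verb.
# The exact name depends on the chart and release; the one below is illustrative.
kubectl delete rolebinding my-release-serviceaccounts-to-podtagupdater --namespace solace

# After triggering a failover, confirm the probe could not patch the labels.
kubectl get pods --namespace solace --show-labels
```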

Noticed behavior

The primary node remains ready and keeps the label "active" set to "true" even though it is now inactive.
The backup node remains ready and keeps the label "active" set to "false" even though it is now active.
The service keeps forwarding traffic to the primary node, which is inactive and rejects connections.

Expected behavior

Both pods should be marked as not ready, since the readiness probe cannot call the pod patch API. The script readiness_check.sh should return a non-zero exit code.

Probable cause

The following two curl commands return exit code 0 even when the Kubernetes API answers with HTTP 403. Neither invocation passes --fail (-f), and without it curl treats any completed HTTP transfer as a success regardless of status code, so the error branch is never taken.

solaceConfigMap.yaml

        if ! curl -sS --output /dev/null --cacert $CACERT --connect-timeout 5 \
            --request PATCH --data "$(cat /tmp/patch_label.json)" \
            -H "Authorization: Bearer $KUBE_TOKEN" -H "Content-Type:application/json-patch+json" \
            $K8S/api/v1/namespaces/$NAMESPACE/pods/$HOSTNAME ; then
          # Label update didn't work this way, fall back to alternative legacy method to update label
          if ! curl -sSk --output /dev/null -H "Authorization: Bearer $KUBE_TOKEN" --request PATCH --data "$(cat /tmp/patch_label.json)" \
            -H "Content-Type:application/json-patch+json" \
            https://$KUBERNETES_SERVICE_HOST:$KUBERNETES_PORT_443_TCP_PORT/api/v1/namespaces/$STATEFULSET_NAMESPACE/pods/$HOSTNAME ; then
            echo "`date` ERROR: ${APP}-Unable to update pod label, check access from pod to K8s API or RBAC authorization" >&2
            rm -f ${FINAL_ACTIVITY_LOGGED_TRACKING_FILE}; exit 1
          fi
        fi
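
curl's exit-code semantics can be demonstrated without a cluster. The sketch below uses a hypothetical local server on port 8403 (illustration only) that answers HTTP 403 Forbidden to every request, the way the API server does when the service account lacks the "patch pods" permission:

```shell
#!/bin/sh
# Throwaway local server that returns 403 to every request (hypothetical,
# stands in for an RBAC-denied Kubernetes API server).
python3 -c '
import http.server
class Forbidden(http.server.BaseHTTPRequestHandler):
    def do_PATCH(self): self.send_response(403); self.end_headers()
    do_GET = do_PATCH
    def log_message(self, *a): pass
http.server.HTTPServer(("127.0.0.1", 8403), Forbidden).serve_forever()
' &
SERVER_PID=$!
sleep 1

# As in readiness_check.sh: no --fail, so the HTTP 403 still exits 0.
curl -sS --output /dev/null --request PATCH http://127.0.0.1:8403/
WITHOUT_FAIL=$?
echo "without --fail: exit $WITHOUT_FAIL"   # exit 0

# With --fail, curl exits 22 for any HTTP status >= 400.
curl -sS --fail --output /dev/null --request PATCH http://127.0.0.1:8403/
WITH_FAIL=$?
echo "with --fail: exit $WITH_FAIL"         # exit 22

kill $SERVER_PID
```

Adding --fail to both curl invocations in solaceConfigMap.yaml would therefore make an HTTP 403 take the existing error branch, so readiness_check.sh exits 1 and the pod is marked not ready.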
