
CA gets re-generated every time in ArgoCD #489

@chernetskyi

ArgoCD does not support the Helm lookup template function, which the chart uses by default to re-use the CA of the existing certificate.

Description

By default, the Helm chart generates the certificate for the admission webhooks via Helm. To do so, it uses the lookup template function to re-use the CA of the existing certificate:

{{- $prevSecret := (lookup "v1" "Secret" .Release.Namespace (include "k8s-agents-operator.certificateSecret.name" . )) }}

The problem is that the lookup function does not work in ArgoCD:
argoproj/argo-cd#5202
argoproj/argo-cd#21745
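
To illustrate why the CA ends up being re-generated, here is a simplified sketch of the pattern. It is not the chart's actual template: the secret name, the CA name, and the validity period are placeholders. ArgoCD renders charts with helm template, which has no cluster connection, so lookup returns an empty map and the "no existing secret" branch runs on every render.

```yaml
{{- /* Simplified, hypothetical sketch; not the chart's actual template. */ -}}
{{- /* "example-webhook-cert" and "example-ca" are placeholder names. */ -}}
{{- $prevSecret := lookup "v1" "Secret" .Release.Namespace "example-webhook-cert" }}
{{- $caCert := "" }}
{{- if $prevSecret }}
{{- /* Existing Secret found (helm install/upgrade against a live cluster): re-use its CA. */ -}}
{{- $caCert = index $prevSecret.data "ca.crt" }}
{{- else }}
{{- /* No Secret found: first install, or any render via helm template (as in ArgoCD). */ -}}
{{- $ca := genCA "example-ca" 365 }}
{{- $caCert = $ca.Cert | b64enc }}
{{- end }}
apiVersion: v1
kind: Secret
metadata:
  name: example-webhook-cert
  namespace: {{ .Release.Namespace }}
type: Opaque
data:
  ca.crt: {{ $caCert }}
```

Because genCA produces a fresh key pair each time it is called, every ArgoCD render of a template like this yields a different CA.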

Apparently, at some point, one part of the setup relies on the certificate issued by the old CA and another part on the certificate issued by the new one. I am not sure when or why this happens, but the regular ArgoCD application sync (which, among other things, re-creates the CA) is not enough to fix it.

When this happens, admission webhooks stop working, and the following log message can be observed in the manager container's logs:

http: TLS handshake error from 172.24.76.230:33540: remote error: tls: bad certificate

To temporarily address the issue, we sync the ArgoCD application with prune on.

Steps to Reproduce

Sorry, I am not exactly sure. I tried syncing the ArgoCD application with prune off, and that was not enough by itself to reproduce the issue.

Expected Behavior

Because I do not think ArgoCD is an exotic environment, I expect the default configuration of the chart to work in ArgoCD.

I know the chart actually provides two other options to manage the certificates, and both should work in ArgoCD. Given that, I would at least expect this issue to be mentioned somewhere in the documentation, with something like:

k8s-agents-operator Helm chart in its default configuration will not work correctly in ArgoCD. Please configure one of the two alternative methods to manage the certificates for admission webhooks.

Since we also know that syncing with prune on resolves the issue temporarily, maybe the recommendation could be to enable pruning with automatic syncs:
https://argo-cd.readthedocs.io/en/stable/user-guide/auto_sync/#automatic-pruning
However, I am not sure about the consequences of such a solution.
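
For reference, automatic pruning is enabled through the Application's sync policy. The manifest below is a minimal sketch, not our actual configuration: the application name, namespaces, project, and repository URL are assumptions, and only the chart name and version come from the environment listed in this issue.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: nri-bundle                # placeholder application name
  namespace: argocd               # assumed ArgoCD namespace
spec:
  project: default                # assumed project
  source:
    repoURL: https://helm-charts.newrelic.com   # assumed chart repository
    chart: nri-bundle
    targetRevision: 6.0.26
  destination:
    server: https://kubernetes.default.svc
    namespace: newrelic           # placeholder target namespace
  syncPolicy:
    automated:
      prune: true                 # delete resources that are no longer rendered from the source
```

With prune: true, every automated sync also deletes cluster resources that the source no longer defines, so it affects all resources managed by the application, not only the webhook certificates.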

Relevant Logs / Console output

The following log message can be observed in the manager container's logs:

http: TLS handshake error from 172.24.76.230:33540: remote error: tls: bad certificate

Your Environment

  • EKS 1.33
  • ArgoCD v3.1.1+fa342d1
  • nri-bundle 6.0.26

Additional context

While browsing through GitHub issues, I found out that the k8s-metadata-injection chart, for example, defaults to using Jobs to generate certificates instead of generating them via Helm (it does not even offer the latter as an option). I do not know why the default approaches differ between the Helm charts, but unlike relying on lookup, Jobs should work in ArgoCD.


Labels: bug
