Description
Checklist:
- I've searched in the docs and FAQ for my answer: https://bit.ly/argocd-faq.
- I've included steps to reproduce the bug.
- I've pasted the output of argocd version.
Describe the bug
Related to #21661
Currently, executing the command argocd app sync --dry-run affects both the application's state and the internal metrics exposed by ArgoCD (e.g. argocd_app_info).
The main issue arises when alerts are based on these metrics. If the dry-run execution identifies an error (e.g., a change that violates a Kyverno policy or an invalid CRD schema), the application state changes to SyncErr. The metrics are updated accordingly, which can trigger those alerts.
For example:
# alert manager alert definition
- alert: ArgoCDApplicationUnknown
  expr: sum by (cluster) (argocd_app_info{sync_status="Unknown"}) > 0
  for: 15m
  ...
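To make the alerting behaviour concrete, here is a toy Python model of how an expression like the one above is evaluated. It is only a sketch: the sample series, the cluster and app names, and the helpers `sum_by_cluster` and `firing` are made up for illustration. It assumes a dry-run leaves one app reporting sync_status="Unknown", and it does not model the `for: 15m` hold period.

```python
from collections import defaultdict

def sum_by_cluster(series, **matchers):
    """Mimic `sum by (cluster) (metric{...})` over (labels, value) samples."""
    totals = defaultdict(float)
    for labels, value in series:
        if all(labels.get(k) == v for k, v in matchers.items()):
            totals[labels["cluster"]] += value
    return dict(totals)

def firing(series):
    """Return the clusters for which the `> 0` condition holds."""
    totals = sum_by_cluster(series, sync_status="Unknown")
    return {cluster for cluster, total in totals.items() if total > 0}

# Hypothetical argocd_app_info samples before any dry-run.
before = [
    ({"cluster": "prod", "name": "app-a", "sync_status": "Synced"}, 1),
    ({"cluster": "prod", "name": "app-b", "sync_status": "Synced"}, 1),
]
# The same apps after a dry-run changed what app-b reports (assumption).
after = [
    ({"cluster": "prod", "name": "app-a", "sync_status": "Synced"}, 1),
    ({"cluster": "prod", "name": "app-b", "sync_status": "Unknown"}, 1),
]

print(firing(before))  # set()
print(firing(after))   # {'prod'}
```

Nothing was actually deployed in the second case, yet the condition becomes true for the cluster.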
Since a dry-run is by definition a request executed without persistence, I wonder if it makes sense to handle it in a way that ensures no changes are made.
Alternatively, adding a dry_run label to differentiate operational requests from dry-run requests could also be an option (a similar change has been proposed for Kyverno).
To Reproduce
Run the argocd app sync --dry-run command.
Expected behavior
There are different proposals:
- Add a dryrun label to distinguish dryrun activity from real ones.
- Ignore dryrun executions (not affecting metrics).
Initially I favor the first option, because a dryrun has a cost in terms of resource consumption and performance. Ignoring dryrun activity entirely would make it extremely hard to identify the cause of performance issues.
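The difference between the two proposals can be sketched with a toy Python model. Everything here is hypothetical: the event list, the `dry_run` label values, and both recording functions are illustrative stand-ins, not ArgoCD's actual metrics code.

```python
from collections import Counter

# Hypothetical sync executions: (app name, is_dry_run, failed)
events = [
    ("app-a", False, False),
    ("app-b", True, True),   # dry-run rejected, e.g. by a policy
    ("app-b", True, True),
    ("app-c", False, True),  # real sync failure
]

def record_with_label(events):
    """Option 1: count every execution, tagged with a dry_run label."""
    counts = Counter()
    for _name, is_dry_run, failed in events:
        counts[(str(is_dry_run).lower(), "failed" if failed else "ok")] += 1
    return counts

def record_ignoring_dry_run(events):
    """Option 2: drop dry-run executions before recording anything."""
    counts = Counter()
    for _name, is_dry_run, failed in events:
        if not is_dry_run:
            counts["failed" if failed else "ok"] += 1
    return counts

labelled = record_with_label(events)
ignored = record_ignoring_dry_run(events)

# An alert filtering on dry_run="false" sees only the real failure.
print(labelled[("false", "failed")])  # 1
# Dry-run load stays visible for performance investigations.
print(sum(v for (dry, _), v in labelled.items() if dry == "true"))  # 2
# With option 2 the two dry-run executions leave no trace.
print(sum(ignored.values()))  # 2
```

With the label, alert rules can exclude dry-run series while dashboards still show how much work dry-runs generate. With the second option that load disappears from the metrics.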
Version
This affects all versions, because dryrun executions are currently not distinguished from real ones.