Description
Checklist:
- I've included steps to reproduce the bug.
- I've included the version of argo rollouts.
Describe the bug
We are using AWS Load Balancers to manage traffic routing for our canary deployments. The Rollout steps include an AnalysisRun that queries CloudWatch for metrics on the canary Target Group. These CloudWatch queries are built using metadata from the .status.alb.canaryTargetGroup field.
Initially this all works as expected, but recently we've noticed that something is re-creating the Target Groups behind the scenes, so they end up with a different ARN than the one initially written to the Rollout's status. In theory this shouldn't be an issue, but we've observed that the Target Group metadata under the .status.alb and .status.albs fields is never updated to reflect the new values. Because the stale ARNs remain in the status, our AnalysisRuns ultimately fail: they attempt to query CloudWatch metrics for an AWS resource that no longer exists.
To Reproduce
- Create an Ingress object using an ALB
- Define stable/canary/root Service objects following the documented examples
- Add a Rollout resource that uses the canary deployment strategy with the above services (docs).
- Confirm that the corresponding AWS Load Balancer and Target Groups are created in AWS.
- Verify that the Target Group ARNs in the Rollout resource match those in the AWS console.
- Re-create one or more of the Target Groups on the AWS Load Balancer.
- Verify that the Target Group ARNs remain unchanged on the Rollout resource, even after the argo-rollouts controller performs reconciliation.
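The verification steps above can also be scripted. Below is a minimal sketch, assuming the Rollout has been exported with `kubectl get rollout my-service -n my-team -o json` and the live ARNs collected with `aws elbv2 describe-target-groups`. The helper name is hypothetical, and the `arn` and `stableTargetGroup` keys are my assumption about the status layout (only `canaryTargetGroup` is confirmed above).

```python
def stale_target_groups(alb_status: dict, live_arns: set) -> list:
    """Return the status keys whose recorded ARN is not among the live ARNs."""
    stale = []
    # Key names other than canaryTargetGroup are assumed, not confirmed.
    for key in ("canaryTargetGroup", "stableTargetGroup"):
        ref = alb_status.get(key) or {}
        arn = ref.get("arn")
        if arn and arn not in live_arns:
            stale.append(key)
    return stale


# Hypothetical data standing in for the kubectl and aws CLI output.
alb_status = {
    "canaryTargetGroup": {"arn": "arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/canary/aaaa"},
    "stableTargetGroup": {"arn": "arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/stable/bbbb"},
}
live_arns = {
    # The canary Target Group was re-created and now has a new ID.
    "arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/canary/cccc",
    "arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/stable/bbbb",
}

result = stale_target_groups(alb_status, live_arns)
```

With this bug present, the list stays non-empty after the Target Group is re-created, no matter how many times the controller reconciles.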
Expected behavior
I would expect the argo-rollouts controller to eventually reconcile any differences between the actual Load Balancer or Target Group ARNs and the values in the Rollout's .status.alb and .status.albs fields.
Screenshots
N/A
Version
controller: v1.8.3
Helm chart: 2.40.4
Logs
From the argo-rollouts controller:
time="2025-09-24T20:53:36Z" level=info msg="No status changes. Skipping patch" generation=117 namespace=my-team resourceVersion=81340549 rollout=my-service
^ Indicates that the controller believes there are no diffs on the .status field, so it never patches the Rollout object.
Message from the maintainers:
Impacted by this bug? Give it a 👍. We prioritize the issues with the most 👍.