Description
Bug Report
What did you do?
Given a controller manager Deployment managed by a CSV, running kubectl rollout restart on the controller manager produces an unexpected result.
What did you expect to see?
A new ReplicaSet with a new set of pods comes up, replacing the existing RS.
What did you see instead? Under which circumstances?
A new RS is created, but it is immediately scaled down and the old RS takes over instead. For example:
```
➜ oc describe deploy
Name:                   quay-operator-tng
Namespace:              default
CreationTimestamp:      Wed, 28 Aug 2024 09:16:06 +0200
Labels:                 olm.deployment-spec-hash=5UgbKi05MO7Ei5ZQssoGmupLx4sNY2p8bWGNDS
                        olm.managed=true
                        olm.owner=quay-operator.v3.8.13
                        olm.owner.kind=ClusterServiceVersion
                        olm.owner.namespace=default
                        operators.coreos.com/project-quay.default=
Annotations:            deployment.kubernetes.io/revision: 8
Selector:               name=quay-operator-alm-owned
Replicas:               1 desired | 1 updated | 1 total | 1 available | 0 unavailable
StrategyType:           RollingUpdate
MinReadySeconds:        0
RollingUpdateStrategy:  25% max unavailable, 25% max surge
.......
.......
Conditions:
  Type           Status  Reason
  ----           ------  ------
  Available      True    MinimumReplicasAvailable
  Progressing    True    NewReplicaSetAvailable
OldReplicaSets:  quay-operator-tng-684667795f (0/0 replicas created)
NewReplicaSet:   quay-operator-tng-78f9489957 (1/1 replicas created)
Events:
  Type    Reason             Age  From                   Message
  ----    ------             ---  ----                   -------
  Normal  ScalingReplicaSet  32h  deployment-controller  Scaled up replica set quay-operator-tng-5ff7b59799 to 1
  Normal  ScalingReplicaSet  32h  deployment-controller  Scaled up replica set quay-operator-tng-78f9489957 to 1
  Normal  ScalingReplicaSet  32h  deployment-controller  Scaled down replica set quay-operator-tng-5ff7b59799 to 0 from 1
  Normal  ScalingReplicaSet  32h  deployment-controller  Scaled up replica set quay-operator-tng-6b9c5fc95b to 1
  Normal  ScalingReplicaSet  32h  deployment-controller  Scaled down replica set quay-operator-tng-6b9c5fc95b to 0 from 1
  Normal  ScalingReplicaSet  31h  deployment-controller  Scaled up replica set quay-operator-tng-f8bc859f5 to 1
  Normal  ScalingReplicaSet  31h  deployment-controller  Scaled down replica set quay-operator-tng-f8bc859f5 to 0 from 1
  Normal  ScalingReplicaSet  31h  deployment-controller  Scaled up replica set quay-operator-tng-684667795f to 1
  Normal  ScalingReplicaSet  31h  deployment-controller  Scaled down replica set quay-operator-tng-684667795f to 0 from 1
```
The reason for this is that OLM reverts the kubectl.kubernetes.io/restartedAt annotation that the rollout restart action places on the Deployment's pod template.
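For context, kubectl rollout restart does nothing more than patch that annotation into the pod template. A minimal client-go sketch of the equivalent call (the kubeconfig wiring, namespace, and Deployment name below are illustrative values, not taken from the OLM code):

```go
package main

import (
	"context"
	"fmt"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/types"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

// restartDeployment is the programmatic equivalent of
// `kubectl rollout restart deployment/<name>`: it patches the
// kubectl.kubernetes.io/restartedAt annotation into the pod template,
// which changes the template and triggers a new ReplicaSet.
func restartDeployment(ctx context.Context, cs kubernetes.Interface, namespace, name string) error {
	patch := fmt.Sprintf(
		`{"spec":{"template":{"metadata":{"annotations":{"kubectl.kubernetes.io/restartedAt":%q}}}}}`,
		time.Now().Format(time.RFC3339))
	_, err := cs.AppsV1().Deployments(namespace).Patch(
		ctx, name, types.StrategicMergePatchType, []byte(patch), metav1.PatchOptions{})
	return err
}

func main() {
	// Kubeconfig wiring, namespace, and name are illustrative values only.
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	cs := kubernetes.NewForConfigOrDie(config)
	if err := restartDeployment(context.Background(), cs, "default", "quay-operator-tng"); err != nil {
		panic(err)
	}
}
```

When OLM reconciles the Deployment back to the spec derived from the CSV, the annotation disappears, the pod template reverts to its previous hash, and the Deployment controller scales the freshly created ReplicaSet back down, which matches the event history above.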
Possible Solution
A 3-way merge patch here might avoid overwriting fields that OLM does not care about -
https://github.com/operator-framework/operator-lifecycle-manager/blob/master/pkg/api/wrappers/deployment_install_client.go#L124
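A rough sketch of that direction, assuming the existing apply flow can keep track of the Deployment OLM last applied (the package and function below are illustrative, not the actual deployment_install_client.go code; CreateThreeWayMergePatch is a real k8s.io/apimachinery API):

```go
// Illustrative sketch only; not the actual OLM wrapper code.
package deploymentpatch

import (
	"encoding/json"

	appsv1 "k8s.io/api/apps/v1"
	"k8s.io/apimachinery/pkg/util/strategicpatch"
)

// threeWayPatch computes a strategic merge patch that moves the live
// Deployment toward `desired`, using `original` (what OLM last applied)
// to decide which fields OLM owns. Fields set out of band on `live`,
// such as the restartedAt annotation on the pod template, are left
// untouched unless OLM explicitly changes them.
func threeWayPatch(original, desired, live *appsv1.Deployment) ([]byte, error) {
	originalJSON, err := json.Marshal(original)
	if err != nil {
		return nil, err
	}
	desiredJSON, err := json.Marshal(desired)
	if err != nil {
		return nil, err
	}
	liveJSON, err := json.Marshal(live)
	if err != nil {
		return nil, err
	}
	// overwrite=true mirrors kubectl apply semantics: on conflict the
	// OLM-desired value wins, everything else on the live object is kept.
	return strategicpatch.CreateThreeWayMergePatch(
		originalJSON, desiredJSON, liveJSON, &appsv1.Deployment{}, true)
}
```

The resulting bytes could then be sent with a strategic-merge Patch call rather than a full Update, which would leave the restartedAt annotation (and the new ReplicaSet it triggers) in place. Server-side apply with a dedicated field manager would be another way to get similar ownership semantics.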