Kubectl rollout restart of a deployment managed by OLM doesn't work as expected #3392

Open
@itroyano

Bug Report

What did you do?

Given a controller-manager Deployment managed by a CSV, running kubectl rollout restart against that Deployment produces an unexpected result.
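For reference, a restart of this kind is triggered with the standard rollout command; the deployment name and namespace below are taken from the describe output further down:

kubectl rollout restart deployment/quay-operator-tng -n default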

What did you expect to see?

A new ReplicaSet with a new set of pods comes up, replacing the existing ReplicaSet.

What did you see instead? Under which circumstances?

A new ReplicaSet is created, but it is immediately scaled down and the old ReplicaSet takes over instead. For example:

➜ oc describe deploy
Name:                   quay-operator-tng
Namespace:              default
CreationTimestamp:      Wed, 28 Aug 2024 09:16:06 +0200
Labels:                 olm.deployment-spec-hash=5UgbKi05MO7Ei5ZQssoGmupLx4sNY2p8bWGNDS
                        olm.managed=true
                        olm.owner=quay-operator.v3.8.13
                        olm.owner.kind=ClusterServiceVersion
                        olm.owner.namespace=default
                        operators.coreos.com/project-quay.default=
Annotations:            deployment.kubernetes.io/revision: 8
Selector:               name=quay-operator-alm-owned
Replicas:               1 desired | 1 updated | 1 total | 1 available | 0 unavailable
StrategyType:           RollingUpdate
MinReadySeconds:        0
RollingUpdateStrategy:  25% max unavailable, 25% max surge
.......
.......
Conditions:
  Type           Status  Reason
  ----           ------  ------
  Available      True    MinimumReplicasAvailable
  Progressing    True    NewReplicaSetAvailable
OldReplicaSets:  quay-operator-tng-684667795f (0/0 replicas created)
NewReplicaSet:   quay-operator-tng-78f9489957 (1/1 replicas created)
Events:
  Type    Reason             Age   From                   Message
  ----    ------             ----  ----                   -------
  Normal  ScalingReplicaSet  32h   deployment-controller  Scaled up replica set quay-operator-tng-5ff7b59799 to 1
  Normal  ScalingReplicaSet  32h   deployment-controller  Scaled up replica set quay-operator-tng-78f9489957 to 1
  Normal  ScalingReplicaSet  32h   deployment-controller  Scaled down replica set quay-operator-tng-5ff7b59799 to 0 from 1
  Normal  ScalingReplicaSet  32h   deployment-controller  Scaled up replica set quay-operator-tng-6b9c5fc95b to 1
  Normal  ScalingReplicaSet  32h   deployment-controller  Scaled down replica set quay-operator-tng-6b9c5fc95b to 0 from 1
  Normal  ScalingReplicaSet  31h   deployment-controller  Scaled up replica set quay-operator-tng-f8bc859f5 to 1
  Normal  ScalingReplicaSet  31h   deployment-controller  Scaled down replica set quay-operator-tng-f8bc859f5 to 0 from 1
  Normal  ScalingReplicaSet  31h   deployment-controller  Scaled up replica set quay-operator-tng-684667795f to 1
  Normal  ScalingReplicaSet  31h   deployment-controller  Scaled down replica set quay-operator-tng-684667795f to 0 from 1

The reason for this is that OLM reverts the kubectl.kubernetes.io/restartedAt annotation that kubectl rollout restart places on the Deployment's pod template (spec.template.metadata.annotations): when OLM reconciles the Deployment back to the spec defined in the CSV, the annotation is removed and the Deployment rolls back to its previous template.
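One way to observe this (a sketch, reusing the name and namespace from the output above) is to read the annotation back from the pod template right after the restart, and again once OLM has reconciled the Deployment:

# shortly after the restart, the annotation set by kubectl rollout restart is present
kubectl get deployment quay-operator-tng -n default \
  -o jsonpath='{.spec.template.metadata.annotations.kubectl\.kubernetes\.io/restartedAt}'

# after OLM reconciles the Deployment from the CSV, the same query is expected to come back empty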

Possible Solution

A three-way merge patch here might avoid overriding fields OLM doesn't care about (see the sketch below the link) -
https://github.com/operator-framework/operator-lifecycle-manager/blob/master/pkg/api/wrappers/deployment_install_client.go#L124
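As a rough sketch of that direction (illustrative only, not the existing OLM code; the helper name and the assumption that the previously applied Deployment is available as "original" are mine), the apimachinery strategicpatch package can compute a three-way patch between the last-applied spec, the newly desired spec, and the live object. Because the patch only contains changes between the first two, fields that exist only on the live object, such as the restartedAt annotation, are left untouched:

package wrappers

import (
	"context"
	"encoding/json"

	appsv1 "k8s.io/api/apps/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/types"
	"k8s.io/apimachinery/pkg/util/strategicpatch"
	"k8s.io/client-go/kubernetes"
)

// patchDeployment is an illustrative helper, not OLM's existing code.
//   original: the Deployment OLM applied last time (e.g. rebuilt from the CSV's install strategy)
//   modified: the Deployment OLM wants now
//   current:  the live Deployment from the cluster, possibly carrying
//             kubectl.kubernetes.io/restartedAt on its pod template
func patchDeployment(ctx context.Context, c kubernetes.Interface, original, modified, current *appsv1.Deployment) (*appsv1.Deployment, error) {
	originalJSON, err := json.Marshal(original)
	if err != nil {
		return nil, err
	}
	modifiedJSON, err := json.Marshal(modified)
	if err != nil {
		return nil, err
	}
	currentJSON, err := json.Marshal(current)
	if err != nil {
		return nil, err
	}

	// overwrite=true lets OLM's desired values win when a field it manages was
	// also changed on the live object (the same default kubectl apply uses).
	// Fields present only on the live object never enter the patch, so the
	// restartedAt annotation survives.
	patch, err := strategicpatch.CreateThreeWayMergePatch(
		originalJSON, modifiedJSON, currentJSON, &appsv1.Deployment{}, true)
	if err != nil {
		return nil, err
	}

	return c.AppsV1().Deployments(current.Namespace).Patch(
		ctx, current.Name, types.StrategicMergePatchType, patch, metav1.PatchOptions{})
}

Server-side apply with a dedicated field manager would be another option in the same spirit, since it only asserts ownership of the fields OLM actually sets.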

Labels
kind/bug, lifecycle/stale