Description
Report
After a previous restore fails with an error, the Percona operator is left in a faulty state:
when another restore custom resource is applied, the operator does not react to it in any way, and the restore resource never receives a status.
More about the problem
The custom resource never reaches the requested state.
The operator logs, even at DEBUG level, show no sign of any reaction to the new restore object.
The only way I could find to resolve the faulty state was to restart the operator pod.
After the restart, the operator recognizes restore objects again.
Steps to reproduce
Reproduction:
1. Create a restore YAML that references a backup in another AWS account (first-restore.yaml).
2. Set the role policies and add the bucket policies, but omit the statement in the KMS key policy that allows access to the relevant S3 bucket.
3. Apply the restore custom resource.
4. The restore logs an error in the Percona operator, and its state does not change (log-first-restore.txt).
5. Delete the restore custom resource.
6. Set the correct KMS key policy to allow the role to decrypt the S3 bucket.
7. Apply another restore custom resource, which can be valid or not. The operator does not react to it, and its state also does not change (second-restore.yaml, log-second-restore.txt).
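For readers without access to the attachments, a restore resource of the kind used in steps 1 and 7 might look roughly like the sketch below. It assumes this is the Percona Operator for MongoDB (the database version suggests so) and follows the general shape of that operator's sample restore manifest. It is not the content of first-restore.yaml; every name, bucket, region, and path is a hypothetical placeholder.

```yaml
# Illustrative sketch only, NOT the attached first-restore.yaml.
# All names, the bucket, the region, and the destination are placeholders.
apiVersion: psmdb.percona.com/v1
kind: PerconaServerMongoDBRestore
metadata:
  name: first-restore                 # hypothetical resource name
spec:
  clusterName: my-cluster-name        # hypothetical target cluster
  backupSource:
    # Backup located in a bucket owned by another AWS account (placeholder path)
    destination: s3://EXAMPLE-BACKUP-BUCKET/EXAMPLE-BACKUP-PATH
    s3:
      bucket: EXAMPLE-BACKUP-BUCKET
      region: eu-central-1            # placeholder region
      # Access is assumed to come from an IAM role, so no credentialsSecret
      # is set here; the reporter's actual setup may differ.
```

With a manifest like this, the failure in step 4 would come from the missing KMS permission rather than from the manifest itself, which is why the second resource in step 7 would be expected to be processed once the key policy is fixed.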
Versions
1. Kubernetes Version: 1.32
2. Percona Operator: 1.19.1
3. Database Version: v7.0.15-9
Anything else?
No response