Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not requeue when finalizing managed service instance #3500

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

danail-branekov
Copy link
Member

Is there a related GitHub Issue?

Flakes

What is this change about?

A couple of flakes on the CI showed that a managed service might be
deprovisioned with the broker twice:

https://ci.korifi.cf-app.com/teams/main/pipelines/main/jobs/run-tests-periodic/builds/19402
https://ci.korifi.cf-app.com/teams/main/pipelines/main/jobs/run-tests-periodic/builds/19396
https://ci.korifi.cf-app.com/teams/main/pipelines/main/jobs/run-tests-periodic/builds/19358

Analysis showed that when the controller requeues the reconcile event
on
finalization
,
the very same object (same generation, same resource version) could be
reconciled twice thus request deprovision the instance from the broker
twice.

In order to address this, this change does not requeue the event after
deprovision is requested from the broker. Instead, upon successful
deprovision from the broker, the controller removes the finalizer, sets
the DeprovisionRequested condition and returns an empty result (no
requeue).

The drawback of this approach is that we need to remove the finalizers
on two places - whenever deprovision from the broker is requested, and
when the status has the DeprovisionRequested condition. Not
removing the finalizer after the deprovisionRequested check seems to
make the issue much harder to reproduce, however still possible. I could
speculate that the service instance CR and its status are different
resources ("main resource" and subresource) and therefore could go out
of sync for a while.

Tag your pair, your PM, and/or team

@georgethebeatle

A couple of flakes on the CI showed that a managed service might be
deprovisioned with the broker twice:

```
https://ci.korifi.cf-app.com/teams/main/pipelines/main/jobs/run-tests-periodic/builds/19402
https://ci.korifi.cf-app.com/teams/main/pipelines/main/jobs/run-tests-periodic/builds/19396
https://ci.korifi.cf-app.com/teams/main/pipelines/main/jobs/run-tests-periodic/builds/19358
```

Analysis showed that when the controller [requeues the reconcile event
on
finalization](https://github.com/cloudfoundry/korifi/blob/2dfdb3848bc8690e3e5be14a535ae52c46152d46/controllers/controllers/services/instances/managed/controller.go#L265),
the very same object (same generation, same resource version) could be
reconciled twice thus request deprovision the instance from the broker
twice.

In order to address this, this change does not requeue the event after
deprovision is requested from the broker. Instead, upon successful
deprovision from the broker, the controller removes the finalizer, sets
the `DeprovisionRequested` condition and returns an empty result (no
requeue).

The drawback of this approach is that we need to remove the finalizers
on two places - whenever deprovision from the broker is requested, and
when the status has the `DeprovisionRequested` condition. Not
removing the finalizer after the `deprovisionRequested` check seems to
make the issue much harder to reproduce, however still possible. I could
speculate that the service instance CR and its status are different
resources ("main resource" and subresource) and therefore could go out
of sync for a while.
@danail-branekov danail-branekov enabled auto-merge (rebase) October 2, 2024 13:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant