Description
What happened?
We create managed resources using only the Observe
management policy and the provider will initially set the Synced
condition to True
but will not set the Ready
condition at all.
We can see in the provider logs that the object seems to be successfully reconciled and is requeued according to the configured sync interval
.
Once the sync interval is hit, the resource becomes ready.
How can we reproduce it?
- Create any MR in the cluster with only the
Observe
management policy. We can easily reproduce this withprovider-upjet-azure
in any resource, for instanceVirtualNetwork
with the following spec:
apiVersion: network.azure.upbound.io/v1beta2
kind: VirtualNetwork
metadata:
annotations:
crossplane.io/external-name: vnet-test-crossplane
name: vnet-test-crossplane
spec:
deletionPolicy: Delete
forProvider:
resourceGroupName: rg-test-crossplane
initProvider: {}
managementPolicies:
- Observe
providerConfigRef:
name: default
- You will see the object become
Synced
almost immediately, but theReady
condition will staynil
. Also, the wholeatProvider
struct staysnil
. Provider logs will show the following:
2024-12-06T20:57:33Z DEBUG provider-azure Reconciling {"controller": "managed/network.azure.upbound.io/v1beta1, kind=virtualnetwork", "request": {"name":"vnet-test-crossplane"}}
2024-12-06T20:57:33Z DEBUG provider-azure Connecting to the service provider {"uid": "bb757cb5-8934-471d-981b-7862513ac9b1", "name": "vnet-test-crossplane", "gvk": "network.azure.upbound.io/v1beta1, Kind=VirtualNetwork"}
2024-12-06T20:57:33Z DEBUG provider-azure Instance state not found in cache, reconstructing... {"uid": "bb757cb5-8934-471d-981b-7862513ac9b1", "name": "vnet-test-crossplane", "gvk": "network.azure.upbound.io/v1beta1, Kind=VirtualNetwork"}
2024-12-06T20:57:33Z DEBUG provider-azure Observing the external resource {"uid": "bb757cb5-8934-471d-981b-7862513ac9b1", "name": "vnet-test-crossplane", "gvk": "network.azure.upbound.io/v1beta1, Kind=VirtualNetwork"}
2024-12-06T20:57:34Z DEBUG provider-azure Diff detected {"uid": "bb757cb5-8934-471d-981b-7862513ac9b1", "name": "vnet-test-crossplane", "gvk": "network.azure.upbound.io/v1beta1, Kind=VirtualNetwork", "instanceDiff": "*terraform.InstanceDiff{mu:sync.Mutex{state:0, sema:0x0}, Attributes:map[string]*terraform.ResourceAttrDiff{\"location\":*terraform.ResourceAttrDiff{Old:\"canadacentral\", New:\"\", NewComputed:false, NewRemoved:true, NewExtra:interface {}(nil), RequiresNew:true, Sensitive:false, Type:0x0}}, Destroy:false, DestroyDeposed:false, DestroyTainted:false, RawConfig:cty.NilVal, RawState:cty.NilVal, RawPlan:cty.NilVal, Meta:map[string]interface {}(nil)}"}
2024-12-06T20:57:34Z DEBUG provider-azure Skipping update due to managementPolicies. Reconciliation succeeded {"controller": "managed/network.azure.upbound.io/v1beta1, kind=virtualnetwork", "request": {"name":"vnet-test-crossplane"}, "uid": "bb757cb5-8934-471d-981b-7862513ac9b1", "version": "1176637016", "external-name": "vnet-test-crossplane", "requeue-after": "2024-12-07T02:49:57Z"}
- You can either wait for the requeue to trigger or you can restart the provider.
- After that, the
Ready
condition will be set totrue
.
What environment did it happen in?
Crossplane version: 1.18.0
Additional information
After step-by-step debugging, we have found that the bug seems to come from the AddFinalizer
function in the APIFinalizer
struct. This struct is used by provider-upjet-azure
and probably all upjet
based providers (although I didn't validate).
The AddFinalizer
function calls the kube client's Update
function which will apply an object in the kube API, but also return the updated object returned by the API to the caller. This, in turn, nullifies any struct that is not applied by the Update
function such as the Status
sub-struct.
Here are the relevant parts of the code:
- AddFinalizer function
- This ends up calling Into from the kube client
- Reconciler's call to AddFinalizer