
Complete deletion of CnsFileAccessConfig instance if associated nodeVM instance is not found #3151

Conversation

vdkotkar
Contributor

@vdkotkar vdkotkar commented Jan 10, 2025

What this PR does / why we need it:
This PR contains the following changes:

  1. If the nodeVM instance is deleted and we later get a request to delete the CnsFileAccessConfig instance, we currently return an error and the CnsFileAccessConfig instance never gets deleted.
    The code is updated to proceed with deleting the CnsFileAccessConfig instance when the associated nodeVM instance is not found (already deleted). The VM IP, which is required to configure the ACL, is fetched from the CnsFileVolumeClient CR instead (see the sketch after this list).
  2. Add a finalizer to the CnsFileVolumeClient CR during creation, so that it does not get deleted abruptly (the CnsFileAccessConfig reconciler relies on it).
  3. Delete the finalizer on the CnsFileVolumeClient CR just before we delete the instance.
  4. During an upgrade from an older WCP version to the current WCP version, the finalizer also needs to be added to pre-existing CnsFileVolumeClient CRs. Code is added at syncer start to add the finalizer to all existing CnsFileVolumeClient CRs (also covered in the sketch below).
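
The change can be pictured with a minimal, hypothetical Go sketch. It assumes a controller-runtime client and unstructured access to the CRs; the function names, the fallback helper, and the exact ACL/finalizer handling are illustrative assumptions based on this description and the YAML output below, not the driver's actual code.

```go
// Hypothetical sketch, not the driver's actual implementation.
package sketch

import (
	"context"
	"fmt"

	apierrors "k8s.io/apimachinery/pkg/api/errors"
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/runtime/schema"
	ctrlclient "sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"
)

// Finalizer string as seen on the CRs in the test output below.
const cnsFinalizer = "cns.vmware.com"

var fileVolumeClientGVK = schema.GroupVersionKind{
	Group:   "cns.vmware.com",
	Version: "v1alpha1",
	Kind:    "CnsFileVolumeClient",
}

// vmIPFromFileVolumeClient falls back to the CnsFileVolumeClient CR to find the
// external IP of a node VM whose VirtualMachine object no longer exists.
// spec.externalIPtoClientVms maps an external IP to the client VMs using it.
func vmIPFromFileVolumeClient(ctx context.Context, c ctrlclient.Client,
	namespace, fvcName, vmName string) (string, error) {
	fvc := &unstructured.Unstructured{}
	fvc.SetGroupVersionKind(fileVolumeClientGVK)
	if err := c.Get(ctx, ctrlclient.ObjectKey{Namespace: namespace, Name: fvcName}, fvc); err != nil {
		return "", err
	}
	ipToVMs, found, err := unstructured.NestedMap(fvc.Object, "spec", "externalIPtoClientVms")
	if err != nil || !found {
		return "", fmt.Errorf("externalIPtoClientVms not found on %s/%s: %v", namespace, fvcName, err)
	}
	for ip, vms := range ipToVMs {
		vmList, ok := vms.([]interface{})
		if !ok {
			continue
		}
		for _, v := range vmList {
			if v == vmName {
				return ip, nil
			}
		}
	}
	return "", fmt.Errorf("no external IP recorded for VM %q", vmName)
}

// reconcileDelete sketches the new behavior: when the node VM lookup returns
// NotFound, the reconciler no longer errors out; it resolves the VM IP from the
// CnsFileVolumeClient CR, revokes the ACL, and removes its finalizer so the
// CnsFileAccessConfig instance can finally be deleted.
func reconcileDelete(ctx context.Context, c ctrlclient.Client,
	instance *unstructured.Unstructured, vmLookupErr error, fvcName, vmName string) error {
	if vmLookupErr != nil && !apierrors.IsNotFound(vmLookupErr) {
		return vmLookupErr // transient errors still requeue
	}
	if apierrors.IsNotFound(vmLookupErr) {
		ip, err := vmIPFromFileVolumeClient(ctx, c, instance.GetNamespace(), fvcName, vmName)
		if err != nil {
			return err
		}
		_ = ip // the real reconciler uses this IP to remove the NFS ACL entry
	}
	// With the ACL cleaned up, drop the finalizer so deletion can complete.
	if controllerutil.RemoveFinalizer(instance, cnsFinalizer) {
		return c.Update(ctx, instance)
	}
	return nil
}

// addFinalizerOnExistingFileVolumeClients sketches the upgrade path in item 4:
// on syncer start, list existing CnsFileVolumeClient CRs and add the finalizer
// to any that are missing it.
func addFinalizerOnExistingFileVolumeClients(ctx context.Context, c ctrlclient.Client) error {
	list := &unstructured.UnstructuredList{}
	list.SetGroupVersionKind(schema.GroupVersionKind{
		Group: "cns.vmware.com", Version: "v1alpha1", Kind: "CnsFileVolumeClientList",
	})
	if err := c.List(ctx, list); err != nil {
		return err
	}
	for i := range list.Items {
		item := &list.Items[i]
		if controllerutil.AddFinalizer(item, cnsFinalizer) {
			if err := c.Update(ctx, item); err != nil {
				return err
			}
		}
	}
	return nil
}
```

The key behavioral change is that a NotFound error on the VM lookup is treated as a valid deletion path rather than a hard error, with the CnsFileVolumeClient CR (kept alive by its own finalizer) supplying the IP needed for ACL cleanup.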

Which issue this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close that issue when PR gets merged): fixes #

Testing done:
make check and make test are passing.

Verified that namespace deletion now succeeds when a file volume is attached to a pod in the namespace.

Followed the steps below:

  1. Create a WCP namespace "test"; assign a storage policy, VM class, and TKG library
  2. Create a TKG cluster, test-cluster-e2e-script, in the above namespace
  3. Create a "test" namespace inside the TKG cluster and create a workload (a CSI file volume, attached/mounted to a pod)
  4. Wait for the pod with the file volume to be up and running
    Verify that a cnsfileaccessconfigs.cns.vmware.com instance is created
  5. Delete the "test" namespace from the VC UI

Logs for reference:

GC (TKG guest cluster):

# k get pvc -A
NAMESPACE   NAME                       STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   VOLUMEATTRIBUTESCLASS   AGE
test        example-vanilla-file-pvc   Bound    pvc-b2968c0f-5a4c-407b-8c5c-d650beaa9ad7   100Mi      RWX            test-policy    <unset>                 38s

# k get pods -n test
NAME                        READY   STATUS    RESTARTS   AGE
example-vanilla-file-pod1   1/1     Running   0          41s


WCP (supervisor cluster):

# k get cluster -A
NAMESPACE   NAME                      CLUSTERCLASS             PHASE         AGE   VERSION
test        test-cluster-e2e-script   builtin-generic-v3.1.0   Provisioned   24m   v1.31.4+vmware.1-fips

# k get cnsfilevolumeclient -n test -o yaml
apiVersion: v1
items:
- apiVersion: cns.vmware.com/v1alpha1
  kind: CnsFileVolumeClient
  metadata:
    creationTimestamp: "2025-02-05T14:09:12Z"
    finalizers:
    - cns.vmware.com
    generation: 1
    name: 8bd9c3f4-fe7d-407e-9f5e-d456ab664ef8-b2968c0f-5a4c-407b-8c5c-d650beaa9ad7
    namespace: test
    resourceVersion: "8483851"
    uid: 2803e72f-28a9-4540-af57-47f85744b4bd
  spec:
    externalIPtoClientVms:
      166.168.16.44:
      - test-cluster-e2e-script-node-pool-1-qwblx-lg2bf-7qszx
kind: List
metadata:
  resourceVersion: ""


# k get cnsfileaccessconfig -A -oyaml
apiVersion: v1
items:
- apiVersion: cns.vmware.com/v1alpha1
  kind: CnsFileAccessConfig
  metadata:
    creationTimestamp: "2025-02-05T14:09:05Z"
    finalizers:
    - cns.vmware.com
    generation: 2
    name: test-cluster-e2e-script-node-pool-1-qwblx-lg2bf-7qszx-8bd9c3f4-fe7d-407e-9f5e-d456ab664ef8-b2968c0f-5a4c-407b-8c5c-d650beaa9ad7
    namespace: test
    ownerReferences:
    - apiVersion: v1
      blockOwnerDeletion: true
      controller: true
      kind: VirtualMachine
      name: test-cluster-e2e-script-node-pool-1-qwblx-lg2bf-7qszx
      uid: cac71469-97b1-45f2-9b48-4c2607f00bed
    resourceVersion: "8483852"
    uid: 1a4f5c7e-b0af-46a9-b8f6-cba3d6dba526
  spec:
    pvcName: 8bd9c3f4-fe7d-407e-9f5e-d456ab664ef8-b2968c0f-5a4c-407b-8c5c-d650beaa9ad7
    vmName: test-cluster-e2e-script-node-pool-1-qwblx-lg2bf-7qszx
  status:
    accessPoints:
      NFSv3: host10.cibgst.com:/525978d8-2bd6-40d8-2859-36dc5210c436
      NFSv4.1: host10.cibgst.com:/vsanfs/525978d8-2bd6-40d8-2859-36dc5210c436
    done: true
kind: List
metadata:
  resourceVersion: ""
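
Similarly, a hypothetical stand-in for the CnsFileAccessConfig fields visible above (field names and json tags are read off the YAML, not the driver's API package):

```go
// Hypothetical stand-in mirroring the CnsFileAccessConfig YAML above.
package sketch

// CnsFileAccessConfigSpec names the PVC and the node VM that needs NFS access.
type CnsFileAccessConfigSpec struct {
	PvcName string `json:"pvcName"`
	VMName  string `json:"vmName"`
}

// CnsFileAccessConfigStatus reports the NFS access points once the ACL is set.
type CnsFileAccessConfigStatus struct {
	AccessPoints map[string]string `json:"accessPoints,omitempty"` // e.g. "NFSv3", "NFSv4.1"
	Done         bool              `json:"done,omitempty"`
}
```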


# k describe ns test
Name:         test
Labels:       kubernetes.io/metadata.name=test
              vSphereClusterID=domain-c10
Annotations:  vmware-system-resource-pool-cpu-limit: 
              vmware-system-resource-pool-memory-limit: 
Status:       Active
...


//// Delete namespace from VC UI

# k get pvc -n test
No resources found in test namespace.

# k get pod -n test
No resources found in test namespace.

# k get cnsfileaccessconfig -n test 
No resources found in test namespace.

# k get cnsfilevolumeclient -n test 
No resources found in test namespace.

# k describe ns test
Error from server (NotFound): namespaces "test" not found

Special notes for your reviewer:

Release note:

Complete deletion of CnsFileAccessConfig instance if associated nodeVM instance is not found

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jan 10, 2025
@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Jan 10, 2025
@k8s-ci-robot
Contributor

Hi @vdkotkar. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Jan 10, 2025
@divyenpatel
Copy link
Member

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jan 10, 2025
@vdkotkar vdkotkar changed the title Complete deletion of CnsFileAccessConfig instance if associated nodeVM instance is not found [WIP] Complete deletion of CnsFileAccessConfig instance if associated nodeVM instance is not found Jan 21, 2025
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jan 21, 2025
@vdkotkar vdkotkar force-pushed the cnsfileaccessconfig_deletion_vm_not_found branch from 91af55d to 8b5cbd7 Compare February 3, 2025 13:56
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Feb 3, 2025
@vdkotkar vdkotkar force-pushed the cnsfileaccessconfig_deletion_vm_not_found branch 3 times, most recently from d94a9bb to 35c10d7 Compare February 4, 2025 13:09
@vdkotkar vdkotkar changed the title [WIP] Complete deletion of CnsFileAccessConfig instance if associated nodeVM instance is not found Complete deletion of CnsFileAccessConfig instance if associated nodeVM instance is not found Feb 4, 2025
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 4, 2025
@vdkotkar vdkotkar force-pushed the cnsfileaccessconfig_deletion_vm_not_found branch from 35c10d7 to 0ca04d5 Compare February 5, 2025 13:10
@vdkotkar
Contributor Author

vdkotkar commented Feb 5, 2025

/test pull-vsphere-csi-driver-unit-test

@vdkotkar vdkotkar force-pushed the cnsfileaccessconfig_deletion_vm_not_found branch from 0ca04d5 to 7586cba Compare February 5, 2025 17:56
Member

@divyenpatel divyenpatel left a comment


/approve
/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 6, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: divyenpatel, vdkotkar

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 6, 2025
@k8s-ci-robot k8s-ci-robot merged commit 23d7e01 into kubernetes-sigs:master Feb 6, 2025
12 checks passed