VolumeAttachment not marked as detached causes problems when the Node is deleted. #215

@msau42

Description

In #184, we had decided that instead of marking the VolumeAttachment as detached, we would just requeue the volume to have the workqueue process it again.

However, this doesn't work in the case where the Node is deleted. In that scenario:

  1. ListVolumes() shows that the volume is no longer attached to the node.
  2. ReconcileVA() sets force sync.
  3. syncAttach() just tries to reattach the volume and fails because the node is gone.
  4. In the k/k AD controller, we try to attach to a new node, but it fails the multi-attach check because the volume is still recorded as attached in the actual state of world (asw).

What should happen is:

  1. ListVolumes() shows that the volume is no longer attached to the node.
  2. We actually mark the VolumeAttachment as detached (VolumeAttachment.status.attached set to false).
  3. In the k/k AD controller, VerifyVolumesAttached() sees that the VolumeAttachment is detached and updates the asw.
  4. The AD reconciler allows a new Attach on the new node to proceed.
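
To make the desired flow concrete, here is a minimal, self-contained sketch. It is a toy model, not the real external-attacher or k/k code: the `VolumeAttachment` struct, `ActualStateOfWorld` map, and the functions `reconcileVA`, `verifyVolumesAttached`, and `canAttach` are simplified stand-ins that borrow the names used above, and their signatures are assumptions made for illustration.

```go
package main

import "fmt"

// VolumeAttachment is a simplified stand-in for the real API object.
// Attached mirrors VolumeAttachment.status.attached.
type VolumeAttachment struct {
	Name     string
	Volume   string
	Node     string
	Attached bool
}

// ActualStateOfWorld is a toy version of the AD controller's asw:
// volume name -> set of nodes the volume is believed to be attached to.
type ActualStateOfWorld map[string]map[string]bool

// reconcileVA compares the VolumeAttachment with what the driver reports
// (published: volume -> nodes, as a ListVolumes() result might be summarized).
// If the driver no longer reports the attachment, the VA is marked detached
// instead of only being requeued.
func reconcileVA(va *VolumeAttachment, published map[string][]string) {
	for _, node := range published[va.Volume] {
		if node == va.Node {
			return // still attached according to the driver
		}
	}
	va.Attached = false
}

// verifyVolumesAttached drops asw entries whose VolumeAttachment
// is no longer marked as attached.
func verifyVolumesAttached(asw ActualStateOfWorld, vas []*VolumeAttachment) {
	for _, va := range vas {
		if !va.Attached {
			delete(asw[va.Volume], va.Node)
		}
	}
}

// canAttach is a toy multi-attach check for a single-attach volume:
// attaching is allowed only if no other node holds the volume in asw.
func canAttach(asw ActualStateOfWorld, volume, node string) bool {
	for n := range asw[volume] {
		if n != node {
			return false
		}
	}
	return true
}

func main() {
	asw := ActualStateOfWorld{"vol-1": {"node-a": true}}
	va := &VolumeAttachment{Name: "va-1", Volume: "vol-1", Node: "node-a", Attached: true}

	fmt.Println("attach to node-b allowed before reconcile:", canAttach(asw, "vol-1", "node-b"))

	// node-a was deleted, so the driver reports no attachments for vol-1.
	reconcileVA(va, map[string][]string{})
	verifyVolumesAttached(asw, []*VolumeAttachment{va})

	fmt.Println("attach to node-b allowed after reconcile:", canAttach(asw, "vol-1", "node-b"))
}
```

In this model the multi-attach check only clears once the VolumeAttachment itself is marked as detached, which is the step the requeue-only approach skips.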

I'm not sure of the best way to fix step 2. Some suggestions, in order of preference:

  1. We go back to actually updating the VolumeAttachment in ReconcileVA(), like the original PR did, but call markAsDetached to make sure everything is updated properly.
  2. We pass some more state to syncVA() so that it can call markAsDetached if csiAttach fails on the force sync.
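
The second suggestion could look roughly like the sketch below. This is only an illustration under stated assumptions: the `forceSync` parameter, the `attachFunc` type, the `AttachError` field, and the `failingAttach` helper are hypothetical, and `syncVA` and `markAsDetached` here are simplified stand-ins that reuse the names above rather than the real external-attacher signatures.

```go
package main

import (
	"errors"
	"fmt"
)

// VolumeAttachment is a simplified stand-in for the real API object.
type VolumeAttachment struct {
	Name        string
	Attached    bool   // mirrors status.attached
	AttachError string // hypothetical: last attach error, if any
}

// attachFunc is a hypothetical hook standing in for the CSI attach call.
type attachFunc func(va *VolumeAttachment) error

// errNodeGone simulates the attach failure seen after the Node is deleted.
var errNodeGone = errors.New("node not found")

// failingAttach always fails, as an attach to a deleted node would.
func failingAttach(va *VolumeAttachment) error { return errNodeGone }

// markAsDetached is a stand-in: here it only flips the status flag.
func markAsDetached(va *VolumeAttachment) {
	va.Attached = false
}

// syncVA receives the extra state (forceSync). If the attach fails during a
// forced sync, the VA is marked detached so that other controllers can
// observe it; on a regular sync the previous status is left untouched.
func syncVA(va *VolumeAttachment, forceSync bool, attach attachFunc) error {
	if va.Attached && !forceSync {
		return nil // nothing to do
	}
	if err := attach(va); err != nil {
		va.AttachError = err.Error()
		if forceSync {
			markAsDetached(va)
		}
		return err
	}
	va.Attached = true
	va.AttachError = ""
	return nil
}

func main() {
	va := &VolumeAttachment{Name: "va-1", Attached: true}
	err := syncVA(va, true, failingAttach)
	fmt.Println("error:", err, "attached:", va.Attached)
}
```

Compared with the first suggestion, this keeps the status update inside the sync path, at the cost of threading one more piece of state through the workqueue handling.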

Labels

kind/bug: Categorizes issue or PR as related to a bug.
lifecycle/frozen: Indicates that an issue or PR should not be auto-closed due to staleness.
