Multi-Attach error for RWO (Block) volume when the Node VM is shut down before Pods are evicted and Volumes are detached from the Node VM.
Note: This issue is present in all Kubernetes releases.

Impact: After the Node is shut down, the Pod that was running on that Node does not come up on a new Node. The events on the Pod show a warning with the reason FailedAttachVolume. Error Message: Multi-Attach error for volume "pvc-uuid" Volume is already exclusively attached to one node and can't be attached to another.
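
To confirm which Pods are affected, the events can be scanned for this warning. The following is a minimal sketch, assuming the event list has been exported with kubectl get events --all-namespaces -o json. The function name pods_with_multi_attach_error is hypothetical and not part of any tool; the fields it reads (reason, message, involvedObject) are standard fields of Kubernetes Event objects.

```python
import json


def pods_with_multi_attach_error(events_json):
    """Return sorted 'namespace/name' strings for Pods that have a
    FailedAttachVolume event whose message mentions a Multi-Attach error.

    events_json is assumed to be the text printed by:
        kubectl get events --all-namespaces -o json
    """
    events = json.loads(events_json).get("items", [])
    affected = set()
    for event in events:
        involved = event.get("involvedObject", {})
        if (
            event.get("reason") == "FailedAttachVolume"
            and "Multi-Attach error" in event.get("message", "")
            and involved.get("kind") == "Pod"
        ):
            affected.add(f"{involved.get('namespace')}/{involved.get('name')}")
    return sorted(affected)
```

Pods returned by this check are candidates for the workaround below; the events do not say which Node still holds the volume, so the shut down Node still has to be identified separately.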

Upstream Issue: Kubernetes is being enhanced to fix this issue. For more information, see the Kubernetes Enhancement Proposals (KEP) PR - kubernetes/enhancements#1116.

Workaround

Pods stuck in this state can be recovered with the following steps.
  1. Find the Node VM in the vCenter Inventory. Make sure the correct VM associated with the Node is used for further instructions.
  2. Detach all the Persistent Volume disks attached to this Node VM.
    Note: Do not detach the Primary disks used by the Guest OS.
  3. Right-click the Node VM in the inventory and select Edit Settings.
  4. Under Virtual Hardware, find all the Hard Disks that back the Persistent Volumes and remove them.
    Note: Do not select Delete files from datastore.
  5. Click OK to reconfigure the VM. This detaches all the Persistent Volume disks from the shut down (powered off) Node VM.
  6. Run kubectl get volumeattachments and find all the volumeattachment objects associated with the shut down Node VM.
  7. Edit each of these volumeattachment objects with kubectl edit volumeattachments <volumeattachments-object-name> and remove its finalizers.
  8. Check whether Kubernetes has deleted the volumeattachment object. If the object remains on the system, you can safely delete it with kubectl delete volumeattachments <volumeattachments-object-name>.
  9. Wait for some time for the Pod to come up on a new Node.
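
Steps 6 and 7 can be tedious when the Node had many volumes attached. The following is a minimal sketch of one way to assist with them, assuming the output of kubectl get volumeattachments -o json is available as text. Both function names are hypothetical. The sketch relies on the spec.nodeName field of the VolumeAttachment object, and it prints a kubectl patch command with a merge patch that sets the finalizers to null, which is assumed here to be an acceptable substitute for removing them by hand with kubectl edit. It only builds the command strings and does not run anything against the cluster.

```python
import json
import shlex


def attachments_for_node(volumeattachments_json, node_name):
    """Return the names of volumeattachment objects whose spec.nodeName
    matches node_name.

    volumeattachments_json is assumed to be the text printed by:
        kubectl get volumeattachments -o json
    """
    items = json.loads(volumeattachments_json).get("items", [])
    return [
        item["metadata"]["name"]
        for item in items
        if item.get("spec", {}).get("nodeName") == node_name
    ]


def finalizer_patch_command(attachment_name):
    """Build (but do not run) a kubectl command that clears the finalizers
    on one volumeattachment object using a JSON merge patch."""
    patch = json.dumps({"metadata": {"finalizers": None}})
    return (
        "kubectl patch volumeattachment "
        f"{shlex.quote(attachment_name)} --type=merge -p {shlex.quote(patch)}"
    )
```

Review each generated command before running it, and run it only after the disks have been detached from the Node VM in vCenter (steps 1 to 5), since removing the finalizers tells Kubernetes to stop tracking the attachment regardless of the actual disk state.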