When a flash caching device fails, vSAN evaluates the accessibility of the objects on the disk group that contains the cache device. If the Primary level of failures to tolerate is set to 1 or more, vSAN rebuilds the objects on another host, if possible.
Component Failure State and Accessibility
vSAN interprets the failure of a single flash caching device as a failure of the entire disk group. Both the cache device and the capacity devices that reside in the disk group (for example, magnetic disks) are marked as degraded.
Behavior of vSAN
vSAN responds to the failure of a flash caching device in the following way:
Parameter | Behavior |
---|---|
Primary level of failures to tolerate | If the Primary level of failures to tolerate in the VM storage policy is equal to or greater than 1, the virtual machine objects are still accessible from another ESXi host in the cluster. If resources are available, vSAN starts an automatic reprotection. If the Primary level of failures to tolerate is set to 0, a virtual machine object is inaccessible if one of the object's components is on the failed disk group. |
I/O operations on the disk group | vSAN stops all running I/O operations for 5-7 seconds while it re-evaluates whether an object is still available without the failed component. If vSAN determines that the object is available, all running I/O operations are resumed. |
Rebuilding data | vSAN examines whether the hosts and the capacity devices can satisfy the requirements for space and placement rules for the objects on the failed device or disk group. If such a host with capacity is available, vSAN starts the recovery process immediately because the components are marked as degraded. |
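
The behavior in the table above can be read as a small decision rule. The following Python sketch is an illustration only and is not a vSAN API: the function name `cache_failure_outcome`, its parameters, and the `Outcome` type are hypothetical. It assumes a single flash caching device failure, with all other hosts and disk groups healthy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    """Hypothetical summary of what happens to one object."""
    accessible: bool       # can the VM still reach the object?
    rebuild_starts: bool   # does vSAN begin reprotection right away?


def cache_failure_outcome(
    pftt: int,
    component_on_failed_group: bool,
    capacity_available: bool,
) -> Outcome:
    """Model the table for one object after one cache device failure.

    pftt: Primary level of failures to tolerate from the VM storage policy.
    component_on_failed_group: True if any component of the object lives
        on the disk group whose cache device failed.
    capacity_available: True if other hosts and capacity devices can meet
        the space and placement requirements (an assumed input here).
    """
    if pftt < 0:
        raise ValueError("pftt must be 0 or greater")

    # Objects with no component on the failed disk group are unaffected.
    if not component_on_failed_group:
        return Outcome(accessible=True, rebuild_starts=False)

    # PFTT = 0: no other replica exists, so the object is inaccessible.
    if pftt == 0:
        return Outcome(accessible=False, rebuild_starts=False)

    # PFTT >= 1: the object stays accessible from another host, and the
    # rebuild starts immediately (components are degraded) if capacity exists.
    return Outcome(accessible=True, rebuild_starts=capacity_available)
```

For example, `cache_failure_outcome(1, True, True)` describes an object that stays accessible and is rebuilt immediately, while `cache_failure_outcome(0, True, True)` describes an object that becomes inaccessible. The sketch does not model the short I/O pause during re-evaluation.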