You can enable vSphere HA and vSAN on the same cluster. vSphere HA provides the same level of protection for virtual machines on vSAN datastores as it does on traditional datastores. This level of protection imposes specific restrictions when vSphere HA and vSAN interact.
ESXi Host Requirements
- All ESXi hosts in the cluster must be version 5.5 Update 1 or later.
- The cluster must have a minimum of three ESXi hosts, unless it is a vSAN two-host cluster. For best results, configure the vSAN cluster with four or more hosts.
Networking Differences
vSAN uses its own logical network. When vSAN and vSphere HA are enabled for the same cluster, the HA interagent traffic flows over this storage network rather than the management network. vSphere HA uses the management network only when vSAN is turned off. vCenter Server chooses the appropriate network when vSphere HA is configured on a host.
When a virtual machine is only partially accessible in all network partitions, you cannot power on the virtual machine or fully access it in any partition. For example, suppose a cluster is partitioned into P1 and P2. The VM namespace object is accessible to P1 but not to P2, and the VMDK is accessible to P2 but not to P1. Because neither partition can reach both objects, the virtual machine cannot be powered on and is not fully accessible in either partition.
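The partition example can be sketched as a small model. This is an illustrative simplification, not vSAN or vSphere HA code: the `vm_objects` mapping and the function name are hypothetical, and a real virtual machine can consist of more objects than the two shown.

```python
# Hypothetical model: for each object of the VM, the set of partitions
# that can reach it. Mirrors the P1/P2 example in the text.
vm_objects = {
    "namespace": {"P1"},  # VM namespace object reachable only from P1
    "vmdk": {"P2"},       # VMDK reachable only from P2
}


def partitions_with_full_access(vm_objects, partitions):
    """Return the partitions from which every object of the VM is reachable."""
    return [
        partition
        for partition in partitions
        if all(partition in reachable for reachable in vm_objects.values())
    ]


print(partitions_with_full_access(vm_objects, ["P1", "P2"]))  # prints []
```

An empty result means no partition can power on the virtual machine, which matches the behavior described above.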
The following table shows the differences in vSphere HA networking whether or not vSAN is used.
Setting | vSAN On | vSAN Off |
---|---|---|
Network used by vSphere HA | vSAN storage network | Management network |
Heartbeat datastores | Any datastore mounted to more than one host, but not vSAN datastores | Any datastore mounted to more than one host |
Host declared isolated | Isolation addresses not pingable and vSAN storage network inaccessible | Isolation addresses not pingable and management network inaccessible |
If you change the vSAN network configuration, the vSphere HA agents do not automatically acquire the new network settings. To change the vSAN network, deactivate Host Monitoring for the vSphere HA cluster before you make the change and reactivate it afterward. Perform the following steps in order:
- Deactivate Host Monitoring for the vSphere HA cluster.
- Make the vSAN network changes.
- Right-click all hosts in the cluster and select Reconfigure HA.
- Reactivate Host Monitoring for the vSphere HA cluster.
Capacity Reservation Settings
When you reserve capacity for your vSphere HA cluster with an admission control policy, you must coordinate this setting with the corresponding Failures to tolerate policy setting in the vSAN rule set. The Failures to tolerate setting must not be lower than the capacity reserved by the vSphere HA admission control setting. For example, if the vSAN rule set allows for only two failures, the vSphere HA admission control policy must reserve capacity that is equivalent to only one or two host failures. If you are using the Percentage of Cluster Resources Reserved policy for a cluster that has eight hosts, you must not reserve more than 25 percent of the cluster resources, which is the share that two of the eight hosts represent. In the same cluster, with the Failures to tolerate policy, the setting must not be higher than two hosts.
If vSphere HA reserves less capacity, failover activity might be unpredictable. Reserving too much capacity overly constrains the powering on of virtual machines and intercluster vSphere vMotion migrations. For information about the Percentage of Cluster Resources Reserved policy, see the vSphere Availability documentation.
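The eight-host arithmetic can be checked with a short calculation. This is a minimal sketch that assumes all hosts contribute equal resources; the function name and parameters are hypothetical and do not correspond to any vSphere API.

```python
def max_ha_reservation(num_hosts, failures_to_tolerate):
    """Return the upper bound for the vSphere HA reservation.

    Assumes equally sized hosts, so each host is 1/num_hosts of the cluster.
    The result is (maximum host failures, maximum percentage of resources).
    """
    if num_hosts <= 0:
        raise ValueError("num_hosts must be positive")
    if not 0 <= failures_to_tolerate <= num_hosts:
        raise ValueError("failures_to_tolerate must be between 0 and num_hosts")
    max_percent = 100 * failures_to_tolerate / num_hosts
    return failures_to_tolerate, max_percent


hosts, percent = max_ha_reservation(num_hosts=8, failures_to_tolerate=2)
print(f"Reserve at most {hosts} host failures or {percent:.0f}% of cluster resources")
# prints: Reserve at most 2 host failures or 25% of cluster resources
```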
vSAN and vSphere HA Behavior in a Multiple Host Failure
After a vSAN cluster fails with a loss of failover quorum for a virtual machine object, vSphere HA might not be able to restart the virtual machine even when the cluster quorum has been restored. vSphere HA guarantees the restart only when it has a cluster quorum and can access the most recent copy of the virtual machine object. The most recent copy is the last copy to be written.
Consider an example where a vSAN virtual machine is provisioned to tolerate one host failure. The virtual machine runs on a vSAN cluster that includes three hosts, H1, H2, and H3. All three hosts fail in a sequence, with H3 being the last host to fail.
After H1 and H2 recover, the cluster has a quorum (one host failure tolerated). Despite this quorum, vSphere HA is unable to restart the virtual machine because the last host that failed (H3) contains the most recent copy of the virtual machine object and is still inaccessible.
In this example, either all three hosts must recover at the same time, or the two-host quorum must include H3. If neither condition is met, vSphere HA attempts to restart the virtual machine when host H3 is online again.
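The restart condition in this example can be expressed as a small predicate. The sketch below is a simplified model, not the vSphere HA algorithm: it assumes quorum means that at most `failures_tolerated` hosts are offline, and that the most recent copy of the object resides on the last host to fail, as in the H1, H2, H3 scenario. All names are hypothetical.

```python
def ha_can_restart(online_hosts, all_hosts, last_host_to_fail, failures_tolerated=1):
    """Return True if the modeled restart conditions are both met."""
    online = set(online_hosts) & set(all_hosts)
    # Simplified quorum: no more than `failures_tolerated` hosts are offline.
    has_quorum = len(online) >= len(all_hosts) - failures_tolerated
    # In this scenario the last host to fail holds the most recent copy.
    has_latest_copy = last_host_to_fail in online
    return has_quorum and has_latest_copy


hosts = ["H1", "H2", "H3"]
print(ha_can_restart(["H1", "H2"], hosts, "H3"))        # prints False
print(ha_can_restart(["H1", "H3"], hosts, "H3"))        # prints True
print(ha_can_restart(["H1", "H2", "H3"], hosts, "H3"))  # prints True
```

The first call reproduces the case described above: H1 and H2 form a quorum, but the restart is blocked until H3 is online again.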