You can enable vSphere HA and Virtual SAN on the same cluster. vSphere HA provides the same level of protection for virtual machines that reside on Virtual SAN datastores as it does for virtual machines on traditional datastores. However, specific considerations exist when vSphere HA and Virtual SAN interact.

ESXi Host Requirements

You can use Virtual SAN with a vSphere HA cluster only if the following conditions are met:

  • All ESXi hosts in the cluster must be version 5.5 or later.

  • The cluster must have a minimum of three ESXi hosts.
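The two conditions above can be expressed as a small pre-check. This is a minimal sketch, not a VMware API: the function names and the list of version strings are hypothetical inputs that you would collect from your own inventory.

```python
MIN_ESXI_VERSION = (5, 5)  # from the requirement above: version 5.5 or later
MIN_HOSTS = 3              # from the requirement above: at least three hosts


def parse_version(version):
    """Turn a dotted version string such as '5.5.0' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def meets_vsan_ha_requirements(host_versions):
    """Return True if the cluster satisfies both host requirements.

    host_versions: one ESXi version string per host (hypothetical input).
    """
    if len(host_versions) < MIN_HOSTS:
        return False
    return all(parse_version(v) >= MIN_ESXI_VERSION for v in host_versions)


print(meets_vsan_ha_requirements(["5.5.0", "5.5.0", "6.0"]))  # three hosts, all 5.5 or later
print(meets_vsan_ha_requirements(["5.5.0", "5.5.0"]))         # only two hosts
print(meets_vsan_ha_requirements(["5.1", "5.5", "5.5"]))      # one host is too old
```

Versions are compared as integer tuples so that, for example, 5.10 would sort after 5.5, which a plain string comparison would get wrong.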

Networking Differences

Virtual SAN has its own network. When Virtual SAN and vSphere HA are enabled for the same cluster, the HA interagent traffic flows over this storage network rather than the management network. The management network is used by vSphere HA only when Virtual SAN is disabled. vCenter Server chooses the appropriate network when vSphere HA is configured on a host.

Note:

Virtual SAN can only be enabled when vSphere HA is disabled.

The following table shows how vSphere HA networking differs depending on whether Virtual SAN is enabled.

Table 1. vSphere HA networking differences

| | Virtual SAN Enabled | Virtual SAN Disabled |
|---|---|---|
| Network used by vSphere HA | Virtual SAN storage network | Management network |
| Heartbeat datastores | Any datastore mounted to > 1 host, but not Virtual SAN datastores | Any datastore mounted to > 1 host |
| Host declared isolated | Isolation addresses not pingable and Virtual SAN storage network inaccessible | Isolation addresses not pingable and management network inaccessible |
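The rules in Table 1 can be restated as three small functions. This is an illustrative sketch only: the `Datastore` class and all function names are hypothetical and do not correspond to any vSphere API.

```python
from dataclasses import dataclass


@dataclass
class Datastore:
    """Hypothetical description of a datastore, for illustration only."""
    name: str
    mounted_host_count: int
    is_vsan: bool


def ha_network(vsan_enabled):
    """Network that carries vSphere HA traffic (first row of the table)."""
    return "Virtual SAN storage network" if vsan_enabled else "management network"


def heartbeat_candidates(datastores, vsan_enabled):
    """Names of datastores usable for heartbeating (second row of the table)."""
    candidates = []
    for ds in datastores:
        if ds.mounted_host_count <= 1:
            continue  # must be mounted to more than one host
        if vsan_enabled and ds.is_vsan:
            continue  # Virtual SAN datastores are excluded
        candidates.append(ds.name)
    return candidates


def host_isolated(isolation_addresses_pingable, ha_network_accessible):
    """Isolation condition (third row): both checks must fail."""
    return not isolation_addresses_pingable and not ha_network_accessible


stores = [
    Datastore("nfs01", mounted_host_count=3, is_vsan=False),
    Datastore("local01", mounted_host_count=1, is_vsan=False),
    Datastore("vsanDatastore", mounted_host_count=3, is_vsan=True),
]
print(ha_network(vsan_enabled=True))
print(heartbeat_candidates(stores, vsan_enabled=True))
```

Here `ha_network_accessible` refers to whichever network `ha_network` returns for the cluster, so the same isolation function covers both columns of the table.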

If you change the Virtual SAN network configuration, the vSphere HA agents do not automatically pick up the new network settings. To change the Virtual SAN network, you must therefore perform the following steps in the vSphere Web Client:

  1. Disable Host Monitoring for the vSphere HA cluster.

  2. Make the Virtual SAN network changes.

  3. Right-click all hosts in the cluster and select Reconfigure HA.

  4. Re-enable Host Monitoring for the vSphere HA cluster.
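If you script this procedure, the ordering of the four steps is the important part. The following sketch shows that ordering only. The `cluster` object, its `set_host_monitoring` and `reconfigure_ha` methods, and the `RecordingCluster` stand-in are all hypothetical; a real script would replace them with calls from whatever automation toolkit you use.

```python
def change_vsan_network(cluster, apply_network_changes):
    """Run the four steps above in order.

    cluster: hypothetical object with a hosts list and the methods
             set_host_monitoring(bool) and reconfigure_ha(host).
    apply_network_changes: callable that performs the network change.
    """
    cluster.set_host_monitoring(False)        # step 1: disable Host Monitoring
    try:
        apply_network_changes()               # step 2: change the Virtual SAN network
        for host in cluster.hosts:            # step 3: Reconfigure HA on every host
            cluster.reconfigure_ha(host)
    finally:
        cluster.set_host_monitoring(True)     # step 4: re-enable Host Monitoring


class RecordingCluster:
    """Stand-in cluster that only records the calls it receives."""

    def __init__(self, hosts):
        self.hosts = list(hosts)
        self.log = []

    def set_host_monitoring(self, enabled):
        self.log.append(("host_monitoring", enabled))

    def reconfigure_ha(self, host):
        self.log.append(("reconfigure_ha", host))


cluster = RecordingCluster(["esx1", "esx2", "esx3"])
change_vsan_network(cluster, lambda: cluster.log.append(("network_change", None)))
for entry in cluster.log:
    print(entry)
```

The `try`/`finally` block is a design choice of this sketch rather than part of the documented procedure: it re-enables Host Monitoring even if the network change raises an error, so the cluster is not left unmonitored.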

Capacity Reservation Settings

When you reserve capacity for your vSphere HA cluster with an admission control policy, this setting must be coordinated with the corresponding Virtual SAN setting that ensures data accessibility on failures. Specifically, the Number of Failures Tolerated setting in the Virtual SAN rule set must not be lower than the capacity reserved by the vSphere HA admission control setting.

For example, if the Virtual SAN rule set allows for only two failures, the vSphere HA admission control policy must reserve capacity that is equivalent to only one or two host failures. If you are using the Percentage of Cluster Resources Reserved policy for a cluster that has eight hosts, you must not reserve more than 25% of the cluster resources, because two of eight hosts correspond to 25% of the cluster. In the same cluster, with the Host Failures Cluster Tolerates policy, the setting must not be higher than two hosts. If vSphere HA reserves less capacity, failover activity might be unpredictable, while reserving too much capacity overly constrains the powering on of virtual machines and inter-cluster vMotion migrations.
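The arithmetic behind the eight-host example can be written out as follows. This is a sketch under one stated assumption that the text does not spell out: all hosts contribute equal resources, so each host represents the same share of the cluster. The function names are hypothetical.

```python
def max_reserved_percentage(failures_tolerated, host_count):
    """Largest percentage vSphere HA should reserve for a given Virtual SAN setting.

    Assumes equally sized hosts, so each host is 100 / host_count percent.
    """
    return 100 * failures_tolerated / host_count


def reservation_within_limit(reserved_percentage, failures_tolerated, host_count):
    """True if the HA reservation does not exceed the Virtual SAN setting."""
    return reserved_percentage <= max_reserved_percentage(failures_tolerated, host_count)


print(max_reserved_percentage(2, 8))        # 25.0, matching the example above
print(reservation_within_limit(25, 2, 8))   # True
print(reservation_within_limit(30, 2, 8))   # False
```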

Virtual SAN and vSphere HA Behavior in a Multiple Host Failure Situation

After a Virtual SAN cluster fails with a loss of failover quorum for a virtual machine object, vSphere HA might not be able to restart the virtual machine even when the cluster quorum has been restored. vSphere HA guarantees the restart only when it has a cluster quorum and can access the most recent copy of the virtual machine object. The most recent copy is the last copy to be written.

Consider an example where a Virtual SAN virtual machine is provisioned to tolerate one host failure. The virtual machine runs on a Virtual SAN cluster that includes three hosts, H1, H2, and H3. All three hosts fail in a sequence with H3 being the last host to fail.

After H1 and H2 recover, the cluster has a quorum (one host failure tolerated). Despite this, vSphere HA is unable to restart the virtual machine because the last host that failed (H3) contains the most recent copy of the virtual machine object and is still inaccessible.

In this example, either all three hosts must recover at the same time, or the two-host quorum must include H3. If neither condition is met, HA attempts to restart the virtual machine when host H3 comes back online.
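The restart condition in this example can be modeled in a few lines. This is a simplified sketch, not how vSphere HA is implemented. It assumes, as in the example, that the last host to fail holds the most recent copy, and it treats quorum as having at most `failures_tolerated` hosts still down. The function and parameter names are hypothetical.

```python
def can_restart(all_hosts, failure_order, recovered, failures_tolerated=1):
    """Return True if the virtual machine restart conditions are met.

    all_hosts: every host in the cluster.
    failure_order: hosts in the order they failed; the last entry is assumed
                   to hold the most recent copy of the object.
    recovered: hosts that are back and accessible.
    """
    has_quorum = len(recovered) >= len(all_hosts) - failures_tolerated
    has_latest_copy = failure_order[-1] in recovered
    return has_quorum and has_latest_copy


hosts = ["H1", "H2", "H3"]
order = ["H1", "H2", "H3"]  # H3 fails last

print(can_restart(hosts, order, {"H1", "H2"}))        # False: quorum, but H3 is missing
print(can_restart(hosts, order, {"H2", "H3"}))        # True: quorum that includes H3
print(can_restart(hosts, order, {"H1", "H2", "H3"}))  # True: all hosts recovered
print(can_restart(hosts, order, {"H3"}))              # False: latest copy, but no quorum
```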