The vSphere HA configuration protects customer workloads in the VI workload domain. You consider the varying and sometimes significant CPU or memory reservations for the customer workloads and the requirements of vSAN.
You configure several vSphere HA features to provide high availability for the customer workloads in the domain.

| vSphere HA Feature | Description |
|---|---|
| Host failure response | vSphere HA can respond to individual host failures by restarting virtual machines on other hosts within the cluster. |
| Response for host isolation | If a host becomes isolated, vSphere HA can detect the isolation and shut down the affected virtual machines or restart them on available hosts. |
| Datastore with PDL or APD | When virtual machines are hosted on non-vSAN datastores, vSphere HA can detect datastore outages and restart virtual machines on hosts that have datastore access. |
| Admission control policy | Configure how the cluster determines available resources. In a smaller vSphere HA cluster, a larger proportion of the cluster resources is reserved to accommodate ESXi host failures according to the selected admission control policy. |
| VM and Application Monitoring | If a virtual machine failure occurs, the VM and Application Monitoring service restarts that virtual machine. The service uses VMware Tools to evaluate whether a virtual machine in the cluster is running. |

| Policy Name | Description |
|---|---|
| Host failures the cluster tolerates | vSphere HA ensures that a specified number of ESXi hosts can fail and sufficient resources remain in the cluster to fail over all the virtual machines from those ESXi hosts. |
| Percentage of cluster resources reserved | vSphere HA reserves a specified percentage of aggregated CPU and memory resources for failover. |
| Specify failover hosts | If an ESXi host fails, vSphere HA attempts to restart its virtual machines on any of the specified failover ESXi hosts. If a restart is not possible, for example, because the failover ESXi hosts have insufficient resources or have failed as well, vSphere HA attempts to restart the virtual machines on other ESXi hosts in the vSphere cluster. |

| Decision ID | Design Decision | Design Justification | Design Implication |
|---|---|---|---|
| VCF-WLD-VCS-CLS-005 | Use vSphere HA to protect all virtual machines against failures. | vSphere HA supports a robust level of protection for both ESXi host and virtual machine availability. | You must provide sufficient resources on the remaining hosts so that virtual machines can be migrated to those hosts in the event of a host outage. |
| VCF-WLD-VCS-CLS-006 | Set host isolation response to | vSAN requires that the host isolation response be set to Power Off and to restart virtual machines on available ESXi hosts. | If a false positive event occurs, an ESXi host is incorrectly declared isolated and its virtual machines are powered off. |
| VCF-WLD-VCS-CLS-007 | Set the advanced cluster setting | Ensures that vSphere HA uses the manual isolation addresses instead of the default management network gateway address. | You must configure this parameter manually. |

| Decision ID | Design Decision | Design Justification | Design Implication |
|---|---|---|---|
| VCF-WLD-VCS-CLS-008 | Configure admission control for 1 ESXi host failure and percentage-based failover capacity. | Using the percentage-based reservation works well in situations where virtual machines have varying and sometimes significant CPU or memory reservations. vSphere automatically calculates the reserved percentage according to the number of ESXi host failures to tolerate and the number of ESXi hosts in the cluster. | In a cluster of 4 ESXi hosts, the resources of only 3 ESXi hosts are available for use. |
| VCF-WLD-VCS-CLS-009 | Set the isolation address for the cluster to the gateway IP address for the vSAN network. | Allows vSphere HA to validate complete network isolation if a connection failure occurs on an ESXi host. | You must configure the isolation address manually. |
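
The percentage arithmetic behind VCF-WLD-VCS-CLS-008 can be sketched as a short calculation. This is an illustration only, not the vSphere implementation: the function name `reserved_failover_percentage` is hypothetical, and the sketch assumes that every ESXi host contributes identical capacity, so tolerating N host failures in a cluster of H hosts reserves N/H of the CPU and memory.

```python
def reserved_failover_percentage(host_count: int,
                                 host_failures_to_tolerate: int = 1) -> float:
    """Return the share of cluster CPU and memory to reserve, as a percentage.

    Assumption of this sketch: all ESXi hosts have the same capacity.
    """
    if host_count < 2:
        raise ValueError("a vSphere HA cluster needs at least two hosts")
    if not 0 < host_failures_to_tolerate < host_count:
        raise ValueError(
            "host failures to tolerate must be between 1 and host_count - 1"
        )
    return 100.0 * host_failures_to_tolerate / host_count


if __name__ == "__main__":
    # Compare clusters of different sizes that each tolerate one host failure.
    for hosts in (4, 8, 16):
        pct = reserved_failover_percentage(hosts)
        print(f"{hosts} hosts, 1 failure tolerated: reserve {pct:.2f}%")
```

For 4 hosts the function returns 25.0, which corresponds to the design implication that only the resources of 3 ESXi hosts remain available. The reserved share shrinks as the cluster grows, which is why smaller clusters give up a larger proportion of their resources.
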

| Decision ID | Design Decision | Design Justification | Design Implication |
|---|---|---|---|
| VCF-WLD-VCS-CLS-010 | Increase the admission control percentage to half of the ESXi hosts in the cluster. | Allocating only half of a stretched cluster ensures that all VMs have enough resources if an availability zone outage occurs. | In a cluster of 8 ESXi hosts, the resources of only 4 ESXi hosts are available for use. If you add more ESXi hosts to the cluster, add them in pairs, one per availability zone. |
| VCF-WLD-VCS-CLS-011 | Set an additional isolation address to the vSAN network gateway in the second availability zone. | Allows vSphere HA to validate complete network isolation if a connection failure occurs on an ESXi host or between availability zones. | None. |
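
The reasoning behind the 50 percent reservation in VCF-WLD-VCS-CLS-010, and behind the advice to add hosts in pairs, can be illustrated with a small calculation. This is a sketch under stated assumptions, not a vSphere API: the function name `zone_failover_percentage` and the zone labels are hypothetical, and all ESXi hosts are assumed to have identical capacity.

```python
def zone_failover_percentage(hosts_per_zone: dict[str, int]) -> int:
    """Percentage of cluster resources to reserve so that the surviving
    availability zone can run all VMs after the larger zone is lost.

    Assumptions of this sketch: exactly two zones, identical hosts.
    """
    if len(hosts_per_zone) != 2:
        raise ValueError("expected exactly two availability zones")
    if min(hosts_per_zone.values()) < 1:
        raise ValueError("each availability zone needs at least one host")
    total = sum(hosts_per_zone.values())
    largest = max(hosts_per_zone.values())
    # Integer ceiling of 100 * largest / total, avoiding float rounding.
    return (100 * largest + total - 1) // total


if __name__ == "__main__":
    balanced = {"az1": 4, "az2": 4}
    unbalanced = {"az1": 5, "az2": 4}
    print("balanced:", zone_failover_percentage(balanced))
    print("unbalanced:", zone_failover_percentage(unbalanced))
```

With 4 hosts in each zone, the result is 50, matching the design decision. With 5 and 4 hosts, surviving the loss of the larger zone would require reserving 56 percent, more than the 50 percent the design sets, which is one way to see why hosts are added one per availability zone.
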

| Decision ID | Design Decision | Design Justification | Design Implication |
|---|---|---|---|
| VCF-WLD-VCS-CLS-012 | Enable VM Monitoring for each cluster. | VM Monitoring provides in-guest protection for most VM workloads. The application or service running on the virtual machine must be capable of restarting successfully after a reboot; otherwise, restarting the virtual machine is not sufficient. | None. |
| VCF-WLD-VCS-CLS-013 | Set the advanced cluster setting | When an OS failure occurs and VMware Tools heartbeats are not received, the NSX Edge appliances in the cluster are restarted without waiting for the I/O check to complete. I/O monitoring is also deactivated for the workload virtual machines. | You must manually enable I/O monitoring by configuring the |
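
The effect of deactivating the I/O check described in VCF-WLD-VCS-CLS-013 can be modeled as a simple decision function. This is a simplified model for illustration, not the actual vSphere HA logic: the names `VmStatus` and `should_restart`, the parameter `io_check_enabled`, and the threshold values are all hypothetical and are not taken from the table above.

```python
from dataclasses import dataclass


@dataclass
class VmStatus:
    """Observed state of one virtual machine (hypothetical model)."""
    seconds_since_last_heartbeat: float  # VMware Tools heartbeat age
    seconds_since_last_io: float         # age of last disk or network I/O


def should_restart(status: VmStatus,
                   failure_interval_s: float = 30.0,    # illustrative value
                   io_check_enabled: bool = True,
                   io_stats_interval_s: float = 120.0   # illustrative value
                   ) -> bool:
    """Decide whether VM Monitoring would restart the virtual machine."""
    heartbeat_lost = status.seconds_since_last_heartbeat >= failure_interval_s
    if not heartbeat_lost:
        return False
    if not io_check_enabled:
        # I/O check deactivated: a lost heartbeat alone triggers the restart.
        return True
    # I/O check active: restart only if the VM also shows no recent I/O.
    return status.seconds_since_last_io >= io_stats_interval_s


if __name__ == "__main__":
    hung_guest_os = VmStatus(seconds_since_last_heartbeat=60,
                             seconds_since_last_io=5)
    print(should_restart(hung_guest_os, io_check_enabled=True))
    print(should_restart(hung_guest_os, io_check_enabled=False))
```

In this model, a virtual machine with lost heartbeats but recent I/O is restarted only when the I/O check is deactivated, which reflects the justification that the restart happens without waiting for the I/O check to complete.
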