If a host fails and its virtual machines need to be restarted, you can control the order in which this is done with the VM restart priority setting. You can also configure how vSphere HA responds if hosts lose management network connectivity with other hosts by using the host isolation response setting.
These settings apply to all virtual machines in the cluster in the case of a host failure or isolation. You can also configure exceptions for specific virtual machines. See Customize an Individual Virtual Machine in the vSphere Web Client.
VM Restart Priority
VM restart priority determines the relative order in which virtual machines are placed on new hosts after a host failure. Such virtual machines are restarted, with the highest priority virtual machines attempted first and continuing to those with lower priority until all virtual machines are restarted or no more cluster resources are available. Note that if vSphere HA fails to power on a high-priority virtual machine, it does proceed to try any lower-priority virtual machines. Because of this, the VM restart priority cannot be used to enforce a restart priority for a multiple virtual machine application. Also, if the number of hosts failures exceeds what admission control permits, the virtual machines with lower priority might not be restarted until more resources become available. Virtual machines are restarted on the failover hosts, if specified.
The values for this setting are: Disabled, Low, Medium (the default), and High. If you select Disabled, vSphere HA is disabled for the virtual machine, which means that it is not restarted on other ESXi hosts if its host fails. The Disabled setting is ignored by the vSphere HA VM/Application monitoring feature since this feature protects virtual machines against operating system-level failures and not virtual machine failures. When an operating system-level failure occurs, the operating system is rebooted by vSphere HA and the virtual machine is left running on the same host. You can change this setting for individual virtual machines.
A virtual machine reset causes a hard guest operating system reboot but does not power cycle the virtual machine.
The restart priority settings for virtual machines vary depending on user needs. Assign higher restart priority to the virtual machines that provide the most important services.
For example, in the case of a multitier application you might rank assignments according to functions hosted on the virtual machines.
High. Database servers that will provide data for applications.
Medium. Application servers that consume data in the database and provide results on web pages.
Low. Web servers that receive user requests, pass queries to application servers, and return results to users.
Host Isolation Response
Host isolation response determines what happens when a host in a vSphere HA cluster loses its management network connections but continues to run. You can use the isolation response to have vSphere HA power off virtual machines that are running on an isolated host and restart them on a non-isolated host. Host isolation responses require that Host Monitoring Status is enabled. If Host Monitoring Status is disabled, host isolation responses are also suspended. A host determines that it is isolated when it is unable to communicate with the agents running on the other hosts and it is unable to ping its isolation addresses. When this occurs, the host executes its isolation response. The responses are: Leave powered on (the default), Power off then failover, and Shut down then failover. You can customize this property for individual virtual machines.
If a virtual machine has a restart priority setting of Disabled, then no host isolation response is made.
To use the Shut down VM setting, you must install VMware Tools in the guest operating system of the virtual machine. Shutting down the virtual machine provides the advantage of preserving its state. Shutting down is better than powering off the virtual machine, which does not flush most recent changes to disk or commit transactions. Virtual machines that are in the process of shutting down will take longer to fail over while the shutdown completes. Virtual Machines that have not shut down in 300 seconds, or the time specified in the advanced attribute das.isolationshutdowntimeout seconds, are powered off.
After you create a vSphere HA cluster, you can override the default cluster settings for Restart Priority and Isolation Response for specific virtual machines. Such overrides are useful for virtual machines that are used for special tasks. For example, virtual machines that provide infrastructure services like DNS or DHCP might need to be powered on before other virtual machines in the cluster.
If a host has its isolation response disabled (that is, it leaves virtual machines powered on when isolated) and the host loses access to both the management and storage networks, a "split brain" situation can arise. In this case, the isolated host loses the disk locks and the virtual machines are failed over to another host even though the original instances of the virtual machines remain running on the isolated host. When the host regains access to the virtual machine's datastore, there will be two copies of the virtual machines, although the copy on the originally isolated host does not have access to the vmdk files and data corruption is prevented.
To recover from this situation, ESXi generates a question on the virtual machine that has lost the disk locks for when the host comes out of isolation and realizes that it cannot reacquire the disk locks. vSphere HA automatically answers this question and this allows the virtual machine instance that has lost the disk locks to power off, leaving just the instance that has the disk locks.