Observe the following best practices for monitoring the status and validity of your vSphere HA cluster.
Setting Alarms to Monitor Cluster Changes
When vSphere HA or Fault Tolerance takes action to maintain availability, such as failing over a virtual machine, you can be notified of the change. Configure alarms in vCenter Server to trigger when these actions occur and to send alerts, such as emails, to a specified set of administrators.
Several default vSphere HA alarms are available:
Insufficient failover resources (a cluster alarm)
Cannot find master (a cluster alarm)
Failover in progress (a cluster alarm)
Host HA status (a host alarm)
VM monitoring error (a virtual machine alarm)
VM monitoring action (a virtual machine alarm)
Failover failed (a virtual machine alarm)
The name of each default alarm includes the feature name, vSphere HA.
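The alarm list above can be captured as a small lookup table, which may help when deciding who should be notified for each object type. The following is only a sketch: the dictionary mirrors the alarm labels and scopes listed in this topic, while the helper names `alarms_for_scope` and `is_ha_alarm` are hypothetical and do not belong to any vSphere API. The display names passed to `is_ha_alarm` are assumed to come from your own alarm inventory.

```python
# Scope of each default vSphere HA alarm, as listed in this topic.
DEFAULT_HA_ALARMS = {
    "Insufficient failover resources": "cluster",
    "Cannot find master": "cluster",
    "Failover in progress": "cluster",
    "Host HA status": "host",
    "VM monitoring error": "virtual machine",
    "VM monitoring action": "virtual machine",
    "Failover failed": "virtual machine",
}


def alarms_for_scope(scope):
    """Return the default alarms defined on the given object type."""
    return [name for name, s in DEFAULT_HA_ALARMS.items() if s == scope]


def is_ha_alarm(display_name, feature="vSphere HA"):
    """Hypothetical helper: True if an alarm's display name contains the feature name."""
    return feature.lower() in display_name.lower()
```

For example, `alarms_for_scope("cluster")` returns the three cluster alarms, and `is_ha_alarm` can be used to pick the vSphere HA alarms out of a longer list of alarm display names.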
Monitoring Cluster Validity
A valid cluster is one in which the admission control policy has not been violated.
A cluster enabled for vSphere HA becomes invalid when the number of powered-on virtual machines exceeds the failover requirements, that is, when the current failover capacity is smaller than the configured failover capacity. If admission control is disabled, clusters do not become invalid.
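The validity rule described above can be expressed as a short check. This is a minimal sketch rather than how vCenter Server evaluates the rule: the `ClusterFailoverState` class and its field names are assumptions made for illustration, and both capacities are assumed to be expressed in the same unit (for example, a number of host failures the cluster can tolerate).

```python
from dataclasses import dataclass


@dataclass
class ClusterFailoverState:
    """Hypothetical snapshot of the values the validity rule depends on."""
    admission_control_enabled: bool
    current_failover_capacity: int     # capacity the cluster can provide now
    configured_failover_capacity: int  # capacity required by the policy


def is_cluster_valid(state):
    """Apply the validity rule from this topic to a cluster snapshot."""
    # With admission control disabled, a cluster never becomes invalid.
    if not state.admission_control_enabled:
        return True
    # Invalid when current capacity is smaller than configured capacity.
    return state.current_failover_capacity >= state.configured_failover_capacity
```

Under these assumptions, a cluster with admission control enabled, a current failover capacity of 0, and a configured failover capacity of 1 is reported as invalid, while the same cluster with admission control disabled is reported as valid.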
In the vSphere Web Client, select vSphere HA from the cluster's Monitor tab and then select Configuration Issues. A list of current vSphere HA issues appears.
DRS behavior is not affected if a cluster is red because of a vSphere HA issue.