Learn how Tanzu Mission Control determines the health of attached and provisioned clusters.
The cluster detail page for each cluster in the Tanzu Mission Control console shows the current overall health of the cluster at the top of the page. This health status is also displayed in the list of all clusters on the Clusters page. Further down the cluster detail page, the overall health is broken down into more detailed aspects. Tanzu Mission Control continuously monitors each cluster and updates the console with changes.
The cluster agent extensions that are deployed on your cluster (both provisioned and attached) send two kinds of events to Tanzu Mission Control: change events from nodes and ports as they occur, and a regularly occurring component status event for each component. These events are regarded collectively as the heartbeat, which Tanzu Mission Control uses to determine the health of the cluster.
Cluster Health
- HEALTHY
A cluster is healthy when all nodes and components are healthy, and a heartbeat for the cluster is received every minute.
- UNHEALTHY
A cluster is unhealthy if either of the following is reported as unhealthy:
- one or more of the cluster's control plane nodes
- one or more of the cluster's components
- WARNING
A cluster can have a warning status if any of its worker nodes are in an unhealthy or unknown state.
A cluster can also have a warning status if any nodes (worker or control plane) are in a warning state.
- UNKNOWN
The health status of a cluster is unknown if either of the following is reported as unknown:
- one or more of the cluster's control plane nodes
- one or more of the cluster's components
- DISCONNECTED
A cluster is considered disconnected if no heartbeat is received from the cluster for more than three minutes.
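The cluster status rules above can be sketched as a small roll-up function. This is only an illustration, not how Tanzu Mission Control is implemented: the `Health` enum, the `cluster_health` function, and the threshold constant are hypothetical names, and the order in which the rules are checked is an assumption, because the text does not say which status wins when several rules match at once.

```python
from enum import Enum


class Health(Enum):
    HEALTHY = "HEALTHY"
    WARNING = "WARNING"
    UNHEALTHY = "UNHEALTHY"
    UNKNOWN = "UNKNOWN"
    DISCONNECTED = "DISCONNECTED"


# No heartbeat for more than three minutes means the cluster is disconnected.
DISCONNECT_AFTER_SECONDS = 180


def cluster_health(control_plane, workers, components, seconds_since_heartbeat):
    """Roll node and component statuses up into one cluster status.

    control_plane, workers, and components are lists of Health values.
    The precedence of the checks below is an assumption.
    """
    if seconds_since_heartbeat > DISCONNECT_AFTER_SECONDS:
        return Health.DISCONNECTED

    # Control plane nodes and components decide UNHEALTHY and UNKNOWN.
    critical = list(control_plane) + list(components)
    if Health.UNHEALTHY in critical:
        return Health.UNHEALTHY
    if Health.UNKNOWN in critical:
        return Health.UNKNOWN

    # Worker nodes that are unhealthy or unknown only produce a warning.
    if any(w in (Health.UNHEALTHY, Health.UNKNOWN) for w in workers):
        return Health.WARNING
    # Any node (worker or control plane) in a warning state also warns.
    if Health.WARNING in list(control_plane) + list(workers):
        return Health.WARNING

    return Health.HEALTHY
```

For example, under these assumptions a cluster whose control plane nodes and components are all healthy, but which has one unhealthy worker node, evaluates to WARNING rather than UNHEALTHY.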
Node Health
The title of the Worker nodes section shows you how many worker nodes you have in the cluster, and below that the number of worker nodes that are healthy. To see all the nodes (including the control plane), click the Nodes tab, which shows the health of each individual node in the Status column.
- HEALTHY
If NodeReady is True, and all other conditions are healthy, then the node is healthy.
- UNHEALTHY
The node is unhealthy if NodeReady is False. The node is also unhealthy if NodeReady is True and more than half of the other conditions are in an unhealthy state.
- WARNING
The warning status indicates that NodeReady is True, but some (fewer than half) of the other conditions are in an unhealthy state.
- UNKNOWN
If NodeReady has any value other than True or False, the health status of the node is unknown. The node can also have an unknown status if no heartbeat has been received from the cluster for more than three minutes.
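As a rough illustration, the node rules above might be expressed as follows. The function name `node_health` and its parameters are hypothetical. Two points are assumptions that the text does not settle: a stale heartbeat is checked before NodeReady, and a node with exactly half of its other conditions unhealthy is treated as WARNING.

```python
def node_health(node_ready, other_unhealthy, seconds_since_heartbeat=0):
    """Return a node status string from its reported conditions.

    node_ready: the reported NodeReady value, for example "True" or "False".
    other_unhealthy: one boolean per other condition, True when that
        condition is in an unhealthy state.
    seconds_since_heartbeat: time since the last cluster heartbeat.
    """
    # Assumption: a missing heartbeat overrides the last reported values.
    if seconds_since_heartbeat > 180:
        return "UNKNOWN"
    if node_ready == "False":
        return "UNHEALTHY"
    if node_ready != "True":
        # Any value other than True or False.
        return "UNKNOWN"

    bad = sum(1 for flag in other_unhealthy if flag)
    if bad == 0:
        return "HEALTHY"
    if bad * 2 > len(other_unhealthy):
        # More than half of the other conditions are unhealthy.
        return "UNHEALTHY"
    # Assumption: exactly half falls through to WARNING as well.
    return "WARNING"
```

With four other conditions, one unhealthy condition gives WARNING and three give UNHEALTHY, even though NodeReady is True in both cases.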
Component Health
Component health is reported for the following components:
- kube-apiserver
- scheduler
- controller-manager
- one or more etcd components (etcd-0, etcd-1, etcd-2, and so on)
- HEALTHY
If the last reported value of the Healthy condition of the component is True, then the component is healthy.
- UNHEALTHY
If the last reported value of the Healthy condition of the component is False, then the component is unhealthy.
- UNKNOWN
If the last reported value of the Healthy condition of the component is Unknown, or it is something other than True or False, then the health status of the component is unknown. The component can also be in this state if no heartbeat has been received from the cluster for more than three minutes.
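The component rules above amount to a three-way mapping on the last reported Healthy condition. The sketch below is illustrative only: `component_health` and its parameters are hypothetical names, and checking the heartbeat age before the condition value is an assumption.

```python
def component_health(healthy_condition, seconds_since_heartbeat=0):
    """Map the last reported Healthy condition to a component status."""
    # Assumption: a heartbeat gap of more than three minutes takes priority.
    if seconds_since_heartbeat > 180:
        return "UNKNOWN"
    if healthy_condition == "True":
        return "HEALTHY"
    if healthy_condition == "False":
        return "UNHEALTHY"
    # "Unknown" or any other value.
    return "UNKNOWN"


# Example: evaluate several components at once (the values are made up).
reported = {"kube-apiserver": "True", "scheduler": "False", "etcd-0": "Unknown"}
statuses = {name: component_health(value) for name, value in reported.items()}
```

Here `statuses` maps kube-apiserver to HEALTHY, scheduler to UNHEALTHY, and etcd-0 to UNKNOWN.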