You can monitor the cluster status by using the VMware Cloud Director appliance management user interface.

By using the VMware Cloud Director appliance management UI or the VMware Cloud Director appliance API, you can view the names of the cells in a cluster, their roles and status, the name of the cell that the standby cells are following, and the cluster failover mode. This procedure describes how to monitor the appliance cluster health in the management UI, where the Embedded Database Availability tab shows the cluster health and failover mode of the appliance.

Procedure

  1. Log in as root to the appliance management UI at https://primary_eth1_ip_address:5480.
  2. In the left panel, select Embedded DB Availability.

    For each node, you can view its short DNS name, role, status, the name of its upstream node (the current primary), and the available actions.

    In the Following column, a question mark (?) in front of the host name indicates that the current primary is unreachable. An exclamation mark (!) in front of the host name indicates either that the metadata of the current primary is out of date and might be wrong, or that the node is not attached to the current primary. This problem might occur if you restart the node after a prolonged downtime. If the node cannot attach to the primary, you must unregister it and replace it with a new standby.

  3. View the cluster Health.
    Cluster Health Status | Description
    Healthy

    The cluster is in a healthy state. The primary and both of the standby cells are online and operational.

    The VMware Cloud Director UI and API are functional.

    Degraded

    The cluster is in a degraded state. The primary and one of the standby cells are online and operational, but the other standby cell is non-functional. The primary database is functional in this state, but if another database failure occurs on either of the operational cells, the primary becomes non-functional. To restore the cluster to a Healthy state, replace the non-functional standby cell with a new, functioning standby cell as soon as possible.

    The VMware Cloud Director UI and API are functional.

    No_Active_Primary

    There is no operational primary database. If there are two operational standby cells, one of them must be promoted to become the new primary cell. If the environment does not have two operational standby cells, you must diagnose the problem and remedy the situation manually.

    The VMware Cloud Director UI and API are not available.

    Read_Only_Primary

    There is an online primary database, but it is in read-only mode because the environment does not have an operational standby cell. You must deploy two new standby cells.

    The VMware Cloud Director UI and API are not available.

    Critical_Problem

    The cluster is in an inconsistent state. For example, more than one primary cell is online or a standby cell is following the wrong primary. You must diagnose the problem and remedy the situation manually.

    This state might affect the VMware Cloud Director UI and API availability.

    SSH_Problem

    This status indicates that the postgres user cannot connect to its peer database nodes over SSH. You must fix this critical problem as soon as possible. See Your VMware Cloud Director Cluster Health Indicates an SSH Problem.

    The VMware Cloud Director UI and API might not be fully functional.

  4. View the appliance failover mode.
    Failover Mode | Description
    Automatic

    If a failure of the primary database occurs, VMware Cloud Director automatically triggers a database failover.

    Manual

    If a failure of the primary database occurs, you must initiate a database failover using the VMware Cloud Director appliance management UI or failover API.

    Indeterminate

    Failover mode is not consistent across all the nodes of the cluster. You must diagnose the problem and remedy the situation. By using the VMware Cloud Director appliance API, reset the FailoverMode to either Manual or Automatic. See the Failovermode information in the VMware Cloud Director Appliance API Schema Reference.
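
If you record the values from this procedure in your own runbook or monitoring notes, the interpretation rules above can be sketched as a small lookup. The following Python is an illustrative sketch only, not part of the appliance or its API. The names `parse_following`, `triage`, and `cluster_failover_mode` are hypothetical, and the code assumes that the marker appears as the first character of the Following value and that you have already collected the per-node failover modes as strings.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, Tuple


class Following(Enum):
    OK = "no marker"
    PRIMARY_UNREACHABLE = "current primary is unreachable"          # "?" marker
    STALE_OR_DETACHED = "metadata outdated or node not attached"    # "!" marker


def parse_following(value: str) -> Tuple[str, Following]:
    """Split a Following-column value into (host name, state).

    Assumption: the marker, if present, is the first non-blank character.
    """
    value = value.strip()
    if value.startswith("?"):
        return value[1:].strip(), Following.PRIMARY_UNREACHABLE
    if value.startswith("!"):
        return value[1:].strip(), Following.STALE_OR_DETACHED
    return value, Following.OK


@dataclass(frozen=True)
class HealthInfo:
    ui_api: str   # availability of the VMware Cloud Director UI and API
    action: str   # follow-up taken from the status descriptions


# Hand-written summary of the cluster health descriptions in step 3.
HEALTH = {
    "Healthy": HealthInfo("functional", "none"),
    "Degraded": HealthInfo("functional", "replace the non-functional standby cell"),
    "No_Active_Primary": HealthInfo(
        "not available",
        "promote a standby if two are operational, otherwise diagnose manually",
    ),
    "Read_Only_Primary": HealthInfo("not available", "deploy two new standby cells"),
    "Critical_Problem": HealthInfo("might be affected", "diagnose and remedy manually"),
    "SSH_Problem": HealthInfo(
        "might not be fully functional",
        "restore SSH connectivity for the postgres user",
    ),
}


def triage(status: str) -> HealthInfo:
    """Return the summary for a cluster health status, or raise ValueError."""
    try:
        return HEALTH[status]
    except KeyError:
        raise ValueError(f"unknown cluster health status: {status!r}") from None


def cluster_failover_mode(node_modes: Iterable[str]) -> str:
    """Report one mode if every node agrees, otherwise 'Indeterminate'."""
    distinct = set(node_modes)
    return next(iter(distinct)) if len(distinct) == 1 else "Indeterminate"
```

For example, `triage("Degraded").action` returns the replacement reminder for a degraded cluster, and `cluster_failover_mode(["Manual", "Automatic", "Manual"])` returns `"Indeterminate"` because the nodes disagree. An empty list is also treated as `"Indeterminate"`, which is a choice made for this sketch rather than documented behavior.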