Troubleshooting a postgres cluster outage in VMware Workspace ONE Access deployed through VMware Aria Suite Lifecycle.

Problem

VMware Workspace ONE Access cluster health status displays as CRITICAL in VMware Aria Suite Lifecycle Health Notification due to network loss in the VMware Workspace ONE Access appliance.

Cause

The primary node of the postgres cluster lost network connectivity. To check the cluster state, run the /usr/local/bin/pcp_watchdog_info -p 9898 -h localhost -U pgpool command on a VMware Workspace ONE Access node. The command prompts for a password. If the /usr/local/etc/pgpool.pwd file is present on the VMware Workspace ONE Access node, it contains the password. If the password is not available, use the default password, which is password.

The command parameters are as follows:

-h : The host against which the command is run is localhost.

-p : The port on which pgpool accepts connections is 9898.

-U : The pgpool health check and replication delay check user is pgpool.

The command is expected to return a response similar to the following:

3 YES <Host1>:9999 Linux <Host1> <Host1>

<Host1>:9999 Linux <Host1> <Host1> 9999 9000 4 MASTER

<Host2>:9999 Linux <Host2> <Host2> 9999 9000 7 STANDBY

<Host3>:9999 Linux <Host3> <Host3> 9999 9000 7 STANDBY

The response must contain one MASTER node and two STANDBY nodes. If the status of any node is SHUTDOWN, or if the command execution is stuck, resolve the issue as described in the following Solution section.
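
The check described above can be sketched as a small script. This is only an illustration, not part of the product: the function name check_watchdog_output is hypothetical, and the parsing assumes that each node line ends with a numeric status code followed by a single-word status name, as in the sample response above. Output from other pgpool versions may be laid out differently.

```python
def check_watchdog_output(text):
    """Classify saved pcp_watchdog_info output.

    Returns (healthy, nodes), where nodes is a list of (node, status) pairs.
    Assumption: a node line ends with "<status code> <STATUS NAME>", as in
    the sample response; the summary line does not match that shape.
    """
    nodes = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # blank or very short line
        code, status = fields[-2], fields[-1]
        if code.isdigit() and status.isalpha() and status.isupper():
            nodes.append((fields[0], status))

    statuses = [status for _, status in nodes]
    # Healthy means exactly one MASTER, two STANDBY, and nothing else.
    healthy = (
        len(nodes) == 3
        and statuses.count("MASTER") == 1
        and statuses.count("STANDBY") == 2
    )
    return healthy, nodes


if __name__ == "__main__":
    # Hypothetical host names stand in for <Host1>, <Host2>, <Host3>.
    sample = (
        "3 YES host1:9999 Linux host1 host1\n"
        "\n"
        "host1:9999 Linux host1 host1 9999 9000 4 MASTER\n"
        "host2:9999 Linux host2 host2 9999 9000 7 STANDBY\n"
        "host3:9999 Linux host3 host3 9999 9000 10 SHUTDOWN\n"
    )
    healthy, nodes = check_watchdog_output(sample)
    print("healthy:", healthy)
    for node, status in nodes:
        print(node, status)
```

The sketch only inspects text that has already been captured. It does not detect a stuck command, which has to be noticed when the command is run.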

Solution

  1. Bring down the services on VMware Workspace ONE Access nodes. Refer to KB 78815 for the required steps.
  2. Power OFF the VMware Workspace ONE Access appliances in vCenter.
  3. Power ON the VMware Workspace ONE Access nodes through VMware Aria Suite Lifecycle.