Notes for troubleshooting the manager-service-automatic-failover command.

Problem

  1. The manager-service-automatic-failover command fails or displays this message for more than two minutes: Enabling Manager Service automatic failover mode on node: IAAS_MANAGER_SERVICE_NODEID.

    1. Log in to the VMware vRealize™ Automation appliance management console at https://va-hostname.domain.name:5480 with the user name host and the password you entered when you deployed the appliance.

    2. Select vRA Settings > Cluster.

    3. Verify that the Management Agent service is running on all Manager Service hosts.

    4. Verify that the last connected time for all IaaS Manager Service nodes is less than 30 seconds.

    If you find any Management Agent connectivity issues, resolve them manually and retry the command to enable the Manager Service automatic failover.
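The 30-second check in step 4 can be sketched as a small helper. This is only an illustration: the function name, the node names, and the idea of copying the last connected times by hand from the Cluster page into a mapping are assumptions, not part of the product.

```python
from datetime import datetime, timedelta, timezone

# Threshold from the step above: last connected time must be under 30 seconds.
MAX_AGE = timedelta(seconds=30)


def stale_nodes(last_connected, now=None, max_age=MAX_AGE):
    """Return the sorted names of nodes whose last connection is too old.

    last_connected is a hypothetical mapping of node name to a
    timezone-aware datetime of the last Management Agent connection.
    """
    now = now or datetime.now(timezone.utc)
    return sorted(
        name
        for name, seen in last_connected.items()
        if now - seen >= max_age
    )
```

Any node the helper returns is a candidate for a Management Agent connectivity problem and should be checked before you retry the command.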

  2. The manager-service-automatic-failover command fails to enable failover on a Manager Service node. It is safe to rerun the command to fix this.

  3. Some Manager Service hosts in the IaaS deployment have failover enabled while other hosts do not. All Manager Service hosts in the IaaS deployment must have the feature enabled or it does not work. To correct this issue, do one of the following:

    • Disable failover on all Manager Service nodes and use the manual failover approach instead. Only run failover on one host at a time.

    • If multiple attempts fail to enable the feature on a Manager Service node, stop the Windows VMware vCloud Automation Center Service on this node and set the service startup type to Manual until you resolve the issue.
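The all-or-nothing rule in item 3 can be expressed as a short consistency check. This is a minimal sketch that assumes you have already recorded, for each Manager Service host, whether failover is enabled. The function names and host names are hypothetical.

```python
def failover_state_summary(enabled_by_host):
    """Classify a deployment as 'enabled', 'disabled', 'mixed', or 'empty'.

    enabled_by_host is a hypothetical mapping of host name to a bool that
    records whether failover is enabled on that Manager Service host.
    """
    states = set(enabled_by_host.values())
    if not states:
        return "empty"
    if states == {True}:
        return "enabled"
    if states == {False}:
        return "disabled"
    return "mixed"


def hosts_without_failover(enabled_by_host):
    """Return the sorted host names on which failover is not enabled."""
    return sorted(host for host, on in enabled_by_host.items() if not on)
```

A result of "mixed" corresponds to the unsupported situation described above, and hosts_without_failover lists the hosts to fix or to take out of service.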

  4. Validate that failover is enabled on each Manager Service node using Python.

    1. Open a command prompt on a Manager Service node.

    2. Run python /usr/lib/vcac/tools/vami/commands/manager-service-automatic-failover ENABLE.

    3. Verify that the system returns this message: Enabling Manager Service automatic failover mode on node: IAAS_MANAGER_SERVICE_NODEID done.

  5. Validate that failover is enabled on each Manager Service node by inspecting the Manager Service configuration file.

    1. Open a command prompt on a Manager Service node.

    2. Navigate to the vRealize Automation installation folder and open the Manager Service configuration file at VMware\vCAC\Server\ManagerService.exe.config.

    3. Verify that the following elements are present in the <appSettings> section.

      • <add key="FailoverModeEnabled" value="True" />

      • <add key="FailoverPingIntervalMilliseconds" value="30000" />

      • <add key="FailoverNodeState" value="active" />

      • <add key="FailoverMaxFailedDatabasePingAttempts" value="5" />

      • <add key="FailoverMaxFailedRepositoryPingAttempts" value="5" />
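The manual inspection of the <appSettings> section can also be scripted. The sketch below uses only the Python standard library and assumes the file is a standard .NET configuration file with <add key="..." value="..." /> entries, as shown above. The function names are hypothetical, and you supply the expected keys and values yourself.

```python
import xml.etree.ElementTree as ET


def read_app_settings(config_xml):
    """Return {key: value} for every <add> element under <appSettings>."""
    root = ET.fromstring(config_xml)
    settings = {}
    for section in root.iter("appSettings"):
        for add in section.findall("add"):
            settings[add.get("key")] = add.get("value")
    return settings


def mismatched_settings(config_xml, expected):
    """Return {key: actual_value} for each expected key that does not match.

    A key that is missing from the file is reported with the value None.
    """
    actual = read_app_settings(config_xml)
    return {
        key: actual.get(key)
        for key, value in expected.items()
        if actual.get(key) != value
    }
```

For example, read the contents of ManagerService.exe.config into a string and call mismatched_settings(text, {"FailoverModeEnabled": "True"}). An empty result means every expected setting is present with the expected value. Passing {"FailoverModeEnabled": "False"} checks for the disabled state instead.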

  6. Verify that the Windows VMware vCloud Automation Center Service status is Started and that its startup type is Automatic.

  7. Validate that failover is disabled on each Manager Service node using Python.

    1. Open a command prompt on a Manager Service node.

    2. Run python /usr/lib/vcac/tools/vami/commands/manager-service-automatic-failover DISABLE.

    3. Verify that the system returns this message: Disabling Manager Service automatic failover mode on node: IAAS_MANAGER_SERVICE_NODEID done.

  8. Validate that failover is disabled on each Manager Service node by inspecting the Manager Service configuration file.

    1. Open a command prompt on a Manager Service node.

    2. Navigate to the vRealize Automation installation folder and open the Manager Service configuration file at VMware\vCAC\Server\ManagerService.exe.config.

    3. Verify that the following element is present in the <appSettings> section.

      • <add key="FailoverModeEnabled" value="False" />

  9. To create a cold standby Manager Service node, stop the Windows VMware vCloud Automation Center Service on the node and set the service startup type to Manual.

  10. For an active Manager Service node, the Windows VMware vCloud Automation Center Service on the node must be started and the service startup type must be Automatic.
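The service requirements for the two node roles in items 9 and 10 can be summarized as a lookup table. This is a sketch for illustration only. The role labels and the function name are assumptions, and the status and startup type are expected to be read manually from the Windows Services console.

```python
# Expected (status, startup type) for each role, as described in items 9 and 10.
EXPECTED_SERVICE_STATE = {
    "active": ("started", "automatic"),
    "cold standby": ("stopped", "manual"),
}


def service_state_ok(role, status, startup_type):
    """Return True if the observed service state matches the node's role.

    role must be one of the hypothetical labels in EXPECTED_SERVICE_STATE.
    The comparison ignores letter case and surrounding whitespace.
    """
    expected = EXPECTED_SERVICE_STATE[role]
    observed = (status.strip().lower(), startup_type.strip().lower())
    return observed == expected
```

For instance, service_state_ok("cold standby", "Stopped", "Manual") is true, while a cold standby node whose service is still set to Automatic fails the check.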

  11. The manager-service-automatic-failover command identifies each Manager Service node by its internal ID, IAAS_MANAGER_SERVICE_NODEID. To find the host name that corresponds to this internal ID, run the command vra-command list-nodes and look for the Manager Service host with NodeId: IAAS_MANAGER_SERVICE_NODEID.

  12. To find the Manager Service node that the system has automatically elected as the currently active node, perform these steps.

    1. Open a command prompt with an SSH connection to the master vRealize Automation appliance node.

    2. Run vra-command list-nodes --components.

      • If failover is enabled, find the Manager Service node with State: Active.

      • If failover is disabled, find the Manager Service node with State: Started.