Failover Manager actions for various failure scenarios are described in Table 1, Failure scenarios and Failover Manager actions.

Table 1. Failure scenarios and Failover Manager actions

Scenario

Failover Manager action

VMware Smart Assurance Manager fails

Failover Manager initiates an automatic failover to use the corresponding Standby for the failed component.

Host where the VMware Smart Assurance Manager is running fails

Failover Manager initiates an automatic failover to use the corresponding Standby for the failed component.

SAM with Notification Cache Publishing enabled fails, or any one of its services (Tomcat, RabbitMQ, or Elasticsearch) fails

The Failover Manager treats any single component failure as a collective failure of all components and initiates automatic failover to use the corresponding Standby components.

The Failover Manager also initiates a reconfigure on the SolutionPack for VMware Smart Assurance to point to the new location of the VMware Smart Assurance Tomcat.

VMware Smart Assurance Domain Manager fails and it is associated with a VMware M&R collector

When one of the analysis Domain Managers fails and you have configured a hook script for the Failover Manager, the script can stop the running smarts-collector and start another smarts-collector that collects data from the newly promoted Domain Manager in the Standby location.

The Failover Manager monitors the Domain Managers but does not monitor the VMware M&R collectors. Typically, a hook script is configured for deployments where delays due to high latency are a concern. “Advanced techniques: Hook scripts and VMware M&R collectors” on page 59 provides more information about scenarios and a hook script procedure.

VMware Smart Assurance Broker fails

If the host where the Broker resides is running and the Broker is down, Failover Manager initiates an automatic failover to promote the Standby Broker as Active and attempts to restart the failed Active Broker.

If the attempt to restart the failed Broker is successful, then both Active and Standby Brokers maintain the same list of active Managers. The Failover Manager registers the promoted Active Broker with the Active Managers.

Also, if there is a SolutionPack for VMware Smart Assurance that interacts with the Broker, the Failover Manager is not able to update the SolutionPack. The SolutionPack continues to use the restarted Broker.

Host where the Broker is running fails or the Broker cannot be restarted

If the host where the Broker resides is down or the Broker is down, Failover Manager initiates an automatic failover to promote the Standby Broker as Active and attempts to restart the failed Active Broker.

If the attempt to restart the failed Broker is not successful:

  • The newly promoted Active Broker maintains the list of active Managers. The Failover Manager registers the promoted Active Broker with the Active Managers.

  • User intervention is required to reconfigure the SolutionPack for VMware Smart Assurance with information about the newly promoted Active Broker on Site B (Standby).

VMware Smart Assurance Manager or Broker is running but is not responding or is intermittently responding

No failover is initiated. The Failover Manager sends an email message to indicate that the VMware Smart Assurance component is not responding.

Intermittent network communication failure with the host where a VMware Smart Assurance component is running

No failover is initiated. The Failover Manager attempts to send an email message to indicate this issue.

Communication between the Active and Standby location is lost

No failover is initiated. The Failover Manager attempts to send an email message to indicate this issue.

VMware Smart Assurance Trap Exploder stops

Failover Manager attempts to restart the Trap Exploder. If two restart attempts fail, the Failover Manager fails over to the Standby Trap Exploder: it registers the Standby Trap Exploder with the Broker, un-suspends trap forwarding in it, and changes its role to Active.

Chapter 10, “Trap Exploder Failover,” provides more information.

The site fails (all components in Location A fail)

Failover Manager switches the components, including the Broker, to the Standby location.
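
The rows of Table 1 reduce to three kinds of response: automatic failover, restart attempts followed by failover, and notification only. The following Python sketch encodes that reading as a lookup table, plus the restart-then-failover branch described for the Trap Exploder. It is illustrative only: the scenario keys, the `Action` enum, and the functions `planned_action` and `handle_trap_exploder_stop` are hypothetical names chosen for this example and are not part of the Failover Manager.

```python
from enum import Enum


class Action(Enum):
    FAILOVER = "automatic failover to the Standby"
    RESTART_THEN_FAILOVER = "restart attempts, then failover if they fail"
    NOTIFY_ONLY = "email notification, no failover"


# Hypothetical scenario keys; each maps to the response listed in Table 1.
TABLE_1 = {
    "manager_fails": Action.FAILOVER,
    "manager_host_fails": Action.FAILOVER,
    "sam_or_service_fails": Action.FAILOVER,
    # For a Broker failure the Standby Broker is promoted, and a restart
    # of the failed Active Broker is also attempted.
    "broker_fails": Action.FAILOVER,
    "broker_host_fails": Action.FAILOVER,
    "component_not_responding": Action.NOTIFY_ONLY,
    "intermittent_network_failure": Action.NOTIFY_ONLY,
    "inter_site_link_lost": Action.NOTIFY_ONLY,
    "trap_exploder_stops": Action.RESTART_THEN_FAILOVER,
    "site_fails": Action.FAILOVER,
}


def planned_action(scenario: str) -> Action:
    """Return the Table 1 response for a scenario key."""
    try:
        return TABLE_1[scenario]
    except KeyError:
        raise ValueError(f"unknown scenario: {scenario!r}") from None


def handle_trap_exploder_stop(try_restart, attempts: int = 2) -> str:
    """Try to restart; fail over to the Standby only if every attempt fails.

    `try_restart` is a hypothetical callable that returns True on success.
    """
    for _ in range(attempts):
        if try_restart():
            return "restarted"
    return "failed over to Standby Trap Exploder"
```

A lookup table keeps the mapping easy to compare against Table 1 row by row. The notification-only rows are the cases where a component is still running or only the network path is degraded.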

In the case of a failover:

  • If a failover is initiated, Failover Manager checks the state of the corresponding Standby component before completing the failover. If, for any reason, the Standby component is not operational, failover is not performed.

    If the Standby component is operational, the failover continues. When the failover is completed, the Failover Manager automatically changes the configuration of the solution so that the corresponding Standby component becomes the new Active component. The Broker registration is changed to list the location of the new Active component.

    If the Failover Manager cannot reach the Broker or the Broker is down at the time of a VMware Smart Assurance Manager failover, the VMware Smart Assurance Manager failover is not initiated. Instead, the Failover Manager performs a Broker failover. Once the Broker failover is completed, Failover Manager continues with the VMware Smart Assurance Manager failover.

    All clients, including Global Consoles and other VMware Smart Assurance components, automatically reconnect to the new Active component. If you open another Global Console, you must establish a new console connection. To do so, specify the new Active Broker value in the Attach Manager dialog box.

    When the failed component is restarted, it automatically resumes the Standby role. Also, in the case of SAM and Adapter Platform, it is automatically configured to listen to the Active component.

  • If the Failover Manager fails, no failover is initiated.

    If the Failover Manager monitoring process exits for some reason, you are notified and you must restart Failover Manager manually.

  • If a network component fails and causes the failure of multiple VMware Smart Assurance components, all affected VMware Smart Assurance components are switched to the Standby, provided that communications between locations to the Failover Manager and to the Broker are operating correctly.

  • The Failover Manager generates email to notify administrators of a failure situation.

  • The Failover Manager periodically synchronizes configuration files from an Active component to the Standby counterpart. Passwordless communication is required for this automatic action. “Set up passwordless communication between Active and Standby hosts” on page 20 provides instructions for configuring passwordless communication and the SSH connection.
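
The checks that precede a Manager failover, as described in the first item of the list above, can be sketched as a short sequence. This is a minimal Python illustration under stated assumptions, not the product's implementation: `SiteState` and `fail_over_manager` are hypothetical names, the relative order of the Standby check and the Broker check is assumed, and applying the Standby-operational rule to the Standby Broker is also an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class SiteState:
    """Hypothetical snapshot of what the Failover Manager can observe."""
    standby_operational: bool
    broker_reachable: bool
    standby_broker_operational: bool = True
    log: list = field(default_factory=list)


def fail_over_manager(state: SiteState) -> bool:
    """Walk the checks in the order assumed above.

    Returns True if the Manager failover is performed, False otherwise.
    """
    # 1. The Standby must be operational, or failover is not performed.
    if not state.standby_operational:
        state.log.append("standby not operational: failover not performed")
        return False

    # 2. If the Broker is down or unreachable, fail over the Broker first.
    if not state.broker_reachable:
        if not state.standby_broker_operational:
            state.log.append("standby broker not operational: failover not performed")
            return False
        state.log.append("broker failover")
        state.broker_reachable = True

    # 3. Promote the Standby and update the Broker registration.
    state.log.append("manager failover")
    state.log.append("broker registration updated")
    return True
```

For example, calling `fail_over_manager(SiteState(standby_operational=True, broker_reachable=False))` records a Broker failover before the Manager failover, which mirrors the rule that the Manager failover waits until the Broker failover is completed.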

Additionally:

  • In a warm failover for IP Manager, MPLS, Server Manager, NPM, and VoIP Availability Manager, only the Active Domain Manager monitors the network.

  • Host configurations for Active and Standby VMware Smart Assurance components must be the same, because the Active and Standby roles switch from one host to the other when a failover occurs.