When you are planning your failover system, consider the following:

  • Ensure that you have superuser (User ID 0) or administrative privileges to perform the procedures to set up a failover system.

  • Failover Manager is supported for Linux and Windows platform versions listed in the VMware Smart Assurance Support Matrix for SAM, IP, ESM, MPLS, NPM, OTM, and VoIP Managers.

  • For Windows failover support, the Failover Manager and all components must be installed and running on Windows. Cross-platform configurations are not supported. For example, running the Failover Manager on Linux with Domain Managers on Windows is not supported.

  • Determine where to install the components of the VMware Smart Assurance Failover System.

    VMware Smart Assurance products installed in the Active location must have corresponding products installed in the Standby location, except for IP Configuration Manager. Since IP Configuration Manager is not supported for failover, only one instance is allowed.

    In this document, all instructions assume that Location A (also called Side A or Site A) is the Active site and Location B (also called Side B or Site B) is the Standby site. For example, instructions about domain configuration changes on the Active side assume that the changes are made on the host defined as Site A.

    Each Active and Standby component that is a part of the VMware Smart Assurance Failover System can be installed on a separate host or multiple components can be installed on one host. At the very least, an Active component and its Standby component should operate from distinct installation areas. Ensure that your hosts have sufficient resources.

    • It is recommended that installation locations are the same in the Active and Standby sides. If that is not possible, set the StrictSitemod parameter to False in the BASEDIR/conf/failover/failover.conf file.

    • Do not mix hardware types. Failover is supported from one physical host to another physical host, or from one virtual machine to another virtual machine, and only if both run the same operating system.

    • Both the Active and Standby components must run the same version of VMware Smart Assurance software.

      Supported products for failover on page 13 provides a list of supported VMware Smart Assurance products. The VMware Smart Assurance Support Matrix for SAM, IP, ESM, MPLS, NPM, OTM, and VoIP Managers provides resource (hardware and memory) information.
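
The installation rules above (matching products on both sides, a single IP Configuration Manager instance, and identical versions) can be checked before deployment. The following is a minimal sketch only: the inventory format (a mapping of product name to version per site), the function name, and the sample data are hypothetical and are not part of the product.

```python
# Hypothetical inventory format: {product name: installed version} per site.
# IP Configuration Manager is not supported for failover, so only one
# instance is allowed (per the text above).
SINGLE_INSTANCE = {"IP Configuration Manager"}


def check_site_parity(active, standby):
    """Return a list of problems found when comparing the two inventories."""
    problems = []
    for product in sorted(set(active) | set(standby)):
        in_active, in_standby = product in active, product in standby
        if product in SINGLE_INSTANCE:
            if in_active and in_standby:
                problems.append(f"{product}: only one instance is allowed")
            continue
        if not (in_active and in_standby):
            missing = "Standby" if in_active else "Active"
            problems.append(f"{product}: missing on the {missing} side")
        elif active[product] != standby[product]:
            problems.append(
                f"{product}: version mismatch "
                f"({active[product]} vs {standby[product]})"
            )
    return problems


# Illustrative data only; the version strings are placeholders.
site_a = {"SAM": "x.y", "IP Manager": "x.y", "IP Configuration Manager": "x.y"}
site_b = {"SAM": "x.y", "IP Manager": "x.z"}
for line in check_site_parity(site_a, site_b):
    print(line)
```

With the sample data, the only reported problem is the IP Manager version mismatch. The single IP Configuration Manager instance on Site A is accepted.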

  • Determine which ports to use for Active and Standby components.

    Service Assurance Manager, Adapter Platform, BIM, and all Domain Managers must be started on predefined ports. The ports are also specified in the ServerSection entries in the failover.conf file.

    For Linux, change the sm_service install --port value for the Broker, and the sm_service install --port and --sport values for the Trap Exploder, to values greater than 1024. The Failover Manager uses a non-root user account when restarting these components, so privileged ports (below 1024) cannot be used. This restriction does not apply to Windows, where the Failover Manager cannot be configured with a non-administrator user account to perform failover actions.
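
As an illustration of the Linux port rule above, the following sketch flags planned ports that a non-root account could not use. The function name, the dictionary layout, and the port numbers are assumptions made for this example; they are not read from any product file.

```python
import sys

PRIVILEGED_LIMIT = 1024  # ports below this value require root on Linux


def find_unusable_ports(ports, platform=sys.platform):
    """Return the entries whose port the rule above excludes on Linux."""
    if not platform.startswith("linux"):
        return {}  # the restriction does not apply to Windows
    # The text asks for values greater than 1024, so 1024 itself is rejected.
    return {name: port for name, port in ports.items() if port <= PRIVILEGED_LIMIT}


# Hypothetical port plan; the labels and numbers are examples only.
planned = {
    "Broker --port": 426,
    "Trap Exploder --port": 9000,
    "Trap Exploder --sport": 162,
}
print(find_unusable_ports(planned, platform="linux"))
```

In this example, the Broker port 426 and the Trap Exploder --sport value 162 are reported and would need to be moved above 1024.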

  • The Failover Manager software is included in any VMware Smart Assurance product installation. No additional installation task for Failover Manager is required. Later, for one of the deployment tasks, you issue a command to manually install the service for the Failover Manager and then start the service.

    When deciding on where to run the Failover Manager, consider the following scenario:

    • If the Failover Manager is running with the Standby SAM on the host in Location B and the Active SAM in Location A fails, the failover occurs and the Standby SAM in Location B is promoted to Active. Then, if the newly promoted Active SAM host in Location B fails, the Failover Manager fails with it. You lose the capability to fail over.

      To avoid losing failover capability, you need to initiate a failback as soon as the failed SAM on Location A is operational.

    • If the Failover Manager is running on a separate dedicated host, you do not lose the capability to fail over. The Failover Manager will change the failed SAM host on Location B to Active.

      The Failover Manager may reside in a separate location or on the Standby side. Ideally, the Failover Manager should run on a highly available host that is separate from the components it monitors.

    • If you plan to run the Failover Manager on a separate host, you need to install a VMware Smart Assurance product on that host.

    • If you do not plan to run the Failover Manager from a separate host, start the Failover Manager service from the host where the Standby SAM is running.
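
The placement trade-off described above reduces to one condition: failover capability survives only while the host running the Failover Manager is up. The sketch below models just that condition. The host names and the function are hypothetical, and this is not how the Failover Manager itself decides anything.

```python
def can_still_fail_over(failover_manager_host, failed_hosts):
    """Failover capability is kept only while the Failover Manager's host is up."""
    return failover_manager_host not in failed_hosts


# Hypothetical host names. The Location B SAM host was promoted to Active
# after a failover and has now failed as well.
failed = {"sam-b"}

# Failover Manager co-located with the SAM on Location B.
print(can_still_fail_over("sam-b", failed))    # False

# Failover Manager on a separate dedicated host.
print(can_still_fail_over("fm-host", failed))  # True
```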

  • For SAM with Notification Cache Publishing that is configured in the SolutionPack for VMware Smart Assurance for use with the VMware M&R UI, you need to determine VMware M&R and SolutionPack block information for the Failover Manager configuration file. “MNRSection” on page 57 describes the information that is required.

    Also, if you have upgraded VMware M&R, check the version information after the upgrade. You might need to update version information in the MNRSection of the failover.conf file.

  • When SAM is configured with Notification Cache Publishing enabled, Domain Managers are geographically dispersed, and VMware M&R collectors are not located near the Domain Managers, high latency can cause delays when an Active Domain Manager in the Active location (Site A) fails. To reduce latency, consider installing a second collector near the Standby location (Site B). “Example of smarts-collector messages that indicate high latency” on page 69 provides additional information about high latency.

    The Failover Manager monitors the Domain Managers and does not monitor the VMware M&R collectors. You can configure a hook script so that the script stops the smarts-collector and starts a second smarts-collector to collect data from the newly-promoted Domain Manager in the Standby location.

    Not all scenarios require multiple VMware M&R collectors. “Advanced techniques: Hook scripts and VMware M&R collectors” on page 59 provides more information about scenarios and a hook script procedure.

  • The Failover Manager monitors any number of Brokers you have defined. The Brokers can be running on the same host as the Failover Manager or on different hosts. VMware recommends that the Broker operate on the same host as the Failover Manager. The Broker requires minimal resources and recovers very quickly after a failure.

  • The Active Broker to which the Failover Manager is connected must be specified as the default Broker for every component in the runcmd_env.sh file. The Failover Manager checks for this default Broker when it monitors the components.

  • Run the Linux hostname command for each host. The output is used later when you configure the failover.conf file.

    The host name specified in the ServerSection of the failover.conf file must exactly match the name displayed when you run the Linux hostname command. If the host names do not match, the hosts will not be registered with the Failover Manager.

    In Windows, instead of the output of the Linux hostname command, use the Fully Qualified Domain Name (FQDN) of the host. Although both the short name and the FQDN are supported, the best practice is to use the FQDN everywhere a host name is required, such as in runcmd_env.sh, failover.conf, and trusted host entries. To locate the FQDN, check the Full computer name under Control Panel > System and Security > System.
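
Because the host name in failover.conf must exactly match the name the host reports, it can help to print both forms of the name and compare them with the value you plan to enter. The sketch below uses Python's standard socket module. The helper function and the sample names are hypothetical, and treating the comparison as case-sensitive is an assumption based on the "exactly match" wording.

```python
import socket


def hostname_matches(configured_name, actual_name=None):
    """Exact, case-sensitive comparison (assumed from the 'exactly match' rule)."""
    if actual_name is None:
        actual_name = socket.gethostname()
    return configured_name == actual_name


# Names reported by this host; compare them with the planned failover.conf value.
print("hostname:", socket.gethostname())
print("FQDN:    ", socket.getfqdn())

# Illustrative names only: a short name and an FQDN do not match each other.
print(hostname_matches("host-a.example.com", "host-a"))
```

The last line prints False, which shows why the short name and the FQDN should not be mixed across runcmd_env.sh, failover.conf, and trusted host entries.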