VMware Data Services Manager High Availability refers to the deployment of the Provider VM. By default, when you deploy a Provider VM, it runs in standalone mode. You can instead choose to deploy a Provider High Availability (HA) cluster. See Configuring Provider High Availability.

You can also create High Availability clusters of MySQL databases or PostgreSQL databases. For more information on High Availability clusters, see Managing High Availability Clusters.

Deploying your Provider in HA mode may protect against the following types of failures:

  • VM failure
  • ESXi host failure or server hardware failure, when you deploy the Provider nodes on separate hosts and the Providers use different datastores
  • VMware Data Services Manager service failure

In a Provider HA cluster, you deploy multiple Provider VMs. You designate one node as the Primary; every other node that you add to the cluster is a Standby Provider node:

  • VMware Data Services Manager supports a minimum of 3 nodes (1 Primary, 2 Standby) and a maximum of 4 nodes (1 Primary, 3 Standby) in a Provider HA cluster.
  • A 2-node Provider HA cluster (1 Primary, 1 Standby) may be functional in some deployments, but is not recommended because it provides incomplete HA and leaves a single point of failure.
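The sizing rules above can be summarized as a small planning check. This is a minimal sketch and not part of VMware Data Services Manager: the function name classify_cluster_size and the labels it returns are hypothetical, and the code only encodes the node counts stated in this topic.

```python
def classify_cluster_size(standby_count: int) -> str:
    """Classify a planned Provider deployment by its number of Standby nodes.

    Hypothetical helper for illustration only. A deployment always has
    exactly one Primary, so the total node count is 1 + standby_count.
    """
    if standby_count < 0:
        raise ValueError("standby_count must not be negative")

    total_nodes = 1 + standby_count
    if total_nodes == 1:
        return "standalone"        # default mode, no HA
    if total_nodes == 2:
        return "not recommended"   # may function, but HA is incomplete
    if total_nodes in (3, 4):
        return "supported"         # 1 Primary with 2 or 3 Standbys
    return "unsupported"           # more than 4 nodes


if __name__ == "__main__":
    for standbys in range(5):
        print(standbys, "standby node(s):", classify_cluster_size(standbys))
```

A check like this could run before you provision any VMs, so that an undersized or oversized plan is caught early.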

In HA mode, VMware Data Services Manager employs continuous replication from the Primary Provider to all Standby Provider nodes. The Provider HA cluster is stateless; any Provider node can serve a read request at any time. Write requests are always forwarded to the Primary Provider node.
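The read and write behavior described above can be modeled in a few lines. The following is a conceptual sketch only, assuming a simplified model in which each request is either a read or a write. The ProviderNode class and the handling_node function are hypothetical names and do not reflect how VMware Data Services Manager is implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProviderNode:
    """Hypothetical model of one node in a Provider HA cluster."""
    name: str
    is_primary: bool


def handling_node(request_kind: str, receiving: ProviderNode,
                  cluster: list[ProviderNode]) -> ProviderNode:
    """Return the node that ends up serving a request.

    Reads are served by whichever node receives them; writes are
    forwarded to the single Primary node.
    """
    if request_kind == "read":
        return receiving
    if request_kind == "write":
        primaries = [node for node in cluster if node.is_primary]
        if len(primaries) != 1:
            raise ValueError("cluster must have exactly one Primary")
        return primaries[0]
    raise ValueError(f"unknown request kind: {request_kind!r}")


if __name__ == "__main__":
    primary = ProviderNode("provider-1", True)
    standby = ProviderNode("provider-2", False)
    cluster = [primary, standby, ProviderNode("provider-3", False)]
    print(handling_node("read", standby, cluster).name)
    print(handling_node("write", standby, cluster).name)
```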

Restrictions

Keep the following restrictions in mind as you plan your Provider HA deployment:

  • It is recommended that you place a load balancer in front of all of the Provider instances. VMware Data Services Manager clients access the load balancer IP address, and the load balancer distributes the traffic to any of the Provider nodes in the HA cluster.
  • VMware Data Services Manager supports Provider VM log aggregation and filtering only when you have configured Elastic Search settings.
  • VMware Data Services Manager does not automatically handle log file purging. You must manually configure log rotation.
  • All Standby Providers must be running the same version of the VMware Data Services Manager control plane software as the Primary Provider (that is, they must be deployed from the same .ova).
  • VMware Data Services Manager supports only manual failover to promote a Standby to Primary; you must SSH into the Standby Provider VM and run the failover --operation promote command on that node.
  • VMware Data Services Manager does not support deleting a Standby Provider via the console. You must use the VMware Data Services Manager API to remove a Standby Provider from the HA cluster.
  • After you delete a Standby Provider, the node cannot be re-registered as a Standby; you must deploy a fresh Provider VM.
  • Provider software update is a manual process; you must update the Provider software from each individual Provider node.
  • VMware Data Services Manager does not support moving from Provider HA mode to standalone mode. If only a single Primary Provider remains in your HA cluster and you want to return to standalone Provider mode, you must deploy a new Provider VM and restore it from a backup of the remaining (Primary) Provider node.
  • VMware Data Services Manager makes every attempt to synchronize Agent VMs when the Primary Provider changes. If an Agent is down and is not notified of the change, you must use the VMware Data Services Manager API to notify the Agent to update its RabbitMQ settings.

About Provider Software Updates and High Availability Mode

All nodes in the Provider HA cluster must be running the same version of the VMware Data Services Manager .ova.

You stage and update the software on one or more Provider nodes via the Update Manager view in the Provider console, or by using VMware Data Services Manager API /updatemanager/ endpoints.

VMware Data Services Manager does not support rolling updates. You must manually update the VMware Data Services Manager software on each of your Primary and Standby Provider nodes:

  • Do not stage and update the Provider software through the load balancer. You must trigger the update using the actual FQDN or IP address of each Provider node.
  • You must update the Primary Provider first.
  • No particular order is required when you update the software on the Standby Providers, but you must update the Standbys sequentially, not simultaneously.
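The ordering rules above can be captured in a small planning helper. This is a minimal sketch under stated assumptions: the function name plan_provider_updates and the example host names are hypothetical, and the code only builds the order in which to update nodes. It does not call the VMware Data Services Manager API.

```python
def plan_provider_updates(primary: str, standbys: list[str],
                          load_balancer: str) -> list[str]:
    """Return node addresses in the order they should be updated.

    Hypothetical helper: the Primary comes first, followed by the
    Standbys in the order given. The caller is expected to update one
    node at a time and wait for it to finish before starting the next.
    """
    nodes = [primary, *standbys]
    if load_balancer in nodes:
        # Updates must target each node's own FQDN or IP address.
        raise ValueError("do not update through the load balancer address")
    if len(set(nodes)) != len(nodes):
        raise ValueError("each node address must appear only once")
    return nodes


if __name__ == "__main__":
    plan = plan_provider_updates(
        primary="provider-1.example.com",
        standbys=["provider-2.example.com", "provider-3.example.com"],
        load_balancer="dsm-lb.example.com",
    )
    for step, address in enumerate(plan, start=1):
        print(f"Step {step}: update {address}, then wait for completion")
```

Returning a plain ordered list keeps the sequential requirement explicit: a caller that iterates over it and blocks on each update cannot update two Standbys at the same time.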

VMware Data Services Manager triggers Agent software updates only after the software is updated on all nodes in the Provider HA cluster.
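The gating condition above amounts to a simple all-nodes check. The sketch below is illustrative only: the function name agent_updates_can_start, the node names, and the version strings are hypothetical, and in practice VMware Data Services Manager evaluates this condition itself.

```python
def agent_updates_can_start(node_versions: dict[str, str],
                            target_version: str) -> bool:
    """Return True only when every Provider node reports the target version.

    Hypothetical helper: node_versions maps a node name to the software
    version that node currently runs.
    """
    if not node_versions:
        raise ValueError("node_versions must contain at least one node")
    return all(version == target_version
               for version in node_versions.values())


if __name__ == "__main__":
    # Made-up version strings, used only to show the check.
    mid_update = {"provider-1": "B", "provider-2": "B", "provider-3": "A"}
    print(agent_updates_can_start(mid_update, "B"))
```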
