In Data Management for VMware Tanzu, High Availability (HA) applies to the deployment of the Provider VM. By default, a Provider VM that you deploy runs in standalone mode. You can instead choose to deploy a Provider HA cluster.
Deploying your Provider in HA mode may protect against the following types of failures:
- VM failure
- ESXi host failure or server hardware failure, when you deploy the Provider nodes on separate hosts and the Providers use different datastores
- Data Management for VMware Tanzu service failure
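The host and datastore condition above can be checked before deployment. The following is a minimal sketch, assuming you keep your own inventory of planned Provider nodes; the `ProviderNode` record and the `placement_issues` helper are hypothetical and are not part of any Data Management for VMware Tanzu API.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ProviderNode:
    # Hypothetical inventory record; these fields are illustrative only.
    name: str
    esxi_host: str
    datastore: str


def placement_issues(nodes):
    """Return a list of placement problems; the list is empty if none are found."""
    issues = []
    for label, attr in (("ESXi host", "esxi_host"), ("datastore", "datastore")):
        # Count how many planned nodes land on each host or datastore.
        counts = Counter(getattr(node, attr) for node in nodes)
        for value, count in sorted(counts.items()):
            if count > 1:
                issues.append(f"{count} Provider nodes share {label} {value}")
    return issues
```

An empty result means that every planned node sits on its own ESXi host and its own datastore, which is the condition under which HA mode may protect against host or hardware failure.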
In a Provider HA cluster, you deploy multiple Provider VMs. You designate one node as the Primary; every other node that you add to the cluster is a Standby Provider node:
- Data Management for VMware Tanzu supports a minimum of 3 nodes (1 Primary, 2 Standby) and a maximum of 4 nodes (1 Primary, 3 Standby) in a Provider HA cluster.
- A 2-node Provider HA cluster (1 Primary, 1 Standby) configuration may be functional in some deployments, but is not recommended due to its incomplete HA status and single point of failure.
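The node-count rules above can be summarized as a small planning check. This is an illustrative sketch only; the function name and the returned labels are assumptions made for this example, not product terminology.

```python
def classify_cluster_size(primaries, standbys):
    """Classify a planned Provider topology against the documented node counts."""
    if primaries != 1 or standbys < 0:
        # A cluster has exactly one Primary.
        return "invalid"
    total = primaries + standbys
    if total == 1:
        return "standalone"
    if total == 2:
        # May function in some deployments, but HA is incomplete.
        return "not recommended"
    if total in (3, 4):
        return "supported"
    return "unsupported"
```

For example, `classify_cluster_size(1, 2)` reports the minimum supported HA configuration, while `classify_cluster_size(1, 4)` exceeds the 4-node maximum.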
In HA mode, Data Management for VMware Tanzu employs continuous replication from the Primary Provider to all Standby Provider nodes. The Provider HA cluster is stateless; any Provider node can serve a read request at any time. Write requests are always forwarded to the Primary Provider node.
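The read and write behavior described above can be pictured with a small model. This is a conceptual sketch, not the product's implementation; the `ProviderCluster` class, its method names, and the node names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProviderCluster:
    # Hypothetical model of an HA cluster; node names are placeholders.
    primary: str
    standbys: tuple

    def nodes(self):
        return (self.primary, *self.standbys)

    def handler_for(self, receiving_node, is_write):
        """Return the node that ends up handling a request."""
        if receiving_node not in self.nodes():
            raise ValueError(f"{receiving_node} is not in the cluster")
        # Writes are always forwarded to the Primary; reads stay where they landed.
        return self.primary if is_write else receiving_node
```

In this model, a read that arrives at a Standby is served by that Standby, while a write that arrives at the same Standby is handled by the Primary.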
Keep the following restrictions in mind as you plan your Provider HA deployment:
- It is recommended that you place a load balancer in front of all of the Provider instances. DMS clients access the load balancer IP address, and the load balancer distributes the traffic across the Provider nodes in the HA cluster.
- Data Management for VMware Tanzu supports Provider VM log aggregation and filtering only when you have configured Elastic Search settings.
- Data Management for VMware Tanzu does not automatically handle log file purging. You must manually configure log rotation.
- All Standby Providers must be running the same version of the Data Management for VMware Tanzu control plane software as the Primary Provider (i.e. use the same
- Data Management for VMware Tanzu supports only manual failover to promote a Standby to Primary; you must run the failover --operation promote command on the Standby Provider after you ssh into the Provider VM.
- Data Management for VMware Tanzu does not support deleting a Standby Provider via the console. You must use the Data Management for VMware Tanzu API to remove a Standby Provider from the HA cluster.
- After you delete a Standby Provider, the node cannot be re-registered as a Standby; you must deploy a fresh Provider VM.
- Provider software update is a manual process; you must update the Provider software from each individual Provider node.
- Data Management for VMware Tanzu does not support moving from Provider HA mode to standalone mode. If only a single Primary Provider remains in your HA cluster and you want to return to standalone Provider mode, you must deploy a new Provider VM and restore it from a backup of the remaining (Primary) Provider node.
- Data Management for VMware Tanzu makes every attempt to synchronize Agent VMs when the Primary Provider changes. If an Agent is down and does not receive the notification, you must use the DMS API to notify the Agent to update its RabbitMQ settings.
About Provider Software Updates and High Availability Mode
All nodes in the Provider HA cluster must be running the same version of the Data Management for VMware Tanzu software.
You stage and update the software on one or more Provider nodes via the Update Manager view in the Provider console, or by using the Data Management for VMware Tanzu API.
Data Management for VMware Tanzu does not support rolling updates. You must manually update the DMS software on each of your Primary and Standby Provider nodes:
- Do not stage and update the Provider software through the load balancer. You must trigger the update using the actual FQDN or IP address of each Provider node.
- You must update the Primary Provider first.
- No particular order is required when you update the software on the Standby Providers, but you must update the Standbys sequentially, not simultaneously.
Data Management for VMware Tanzu triggers Agent software updates only after the software is updated on all nodes in the Provider HA cluster.
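The update ordering rules in this section can be sketched as a simple plan builder. This is an illustration under stated assumptions: the `build_update_plan` function and the addresses passed to it are hypothetical, and the sketch only orders the steps; it does not call the Update Manager or the Data Management for VMware Tanzu API.

```python
def build_update_plan(primary, standbys, load_balancer=None):
    """Return the ordered, one-at-a-time update steps for a Provider HA cluster."""
    targets = [primary, *standbys]
    if load_balancer is not None and load_balancer in targets:
        # Each node must be addressed by its own FQDN or IP address.
        raise ValueError("update each Provider node directly, not through the load balancer")
    if len(set(targets)) != len(targets):
        raise ValueError("each Provider node must appear exactly once")
    # The Primary goes first; Standbys follow sequentially in any order.
    steps = [f"update primary {primary}"]
    steps += [f"update standby {standby}" for standby in standbys]
    # Agent updates are triggered only after every Provider node is updated.
    steps.append("agent updates become eligible")
    return steps
```

Because the plan is a flat list, running its steps strictly in order keeps the Standby updates sequential rather than simultaneous.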