This topic describes how multi-site replication works in VMware SQL with MySQL for Tanzu Application Service and provides information to help you decide whether to use multi-site replication.
Multi-site capability in VMware Tanzu for MySQL is a disaster recovery solution that enables developers to provision leader-follower instances across two foundations. Each leader-follower pair consists of service instances that are configured for replication. Operators can configure the two foundations to be in the same data center or spread across multiple data centers or geographical regions.
Within your primary foundation you have a choice of topologies for your replication leader instance:
The secondary foundation always uses the multi‑site replication type for the replication follower instance.
Important: VMware Tanzu for MySQL supports using a high-availability (HA) cluster service instance as a multi-site leader only for clusters running MySQL version 8.0.x or later. The MySQL version is set in the plan definition section of Ops Manager. For more information, see Configuring Service Plans.
The foundation types are:
The multi‑site replication and high-availability cluster plan types are configured separately from the leader-follower plan type.
For information about enabling multi-site replication, see Preparing for multi-site replication.
For information for developers about using multi-site replication, see Using VMware Tanzu for MySQL for multi-site replication.
Multi-site replication has the following benefits:
Resilience for Service Instances: Developers can trigger a failover to maintain app uptime during a data center outage. For more information, see Triggering multi-site replication failover and switchover.
Data Center Upgrades with Zero Downtime: Operators can upgrade data centers without taking databases offline by triggering a switchover first.
Support for Multiple Cloud Deployment Models: Operators can configure multi-site replication with a single cloud or hybrid cloud deployment model. Both deployment models have the same end-user experience.
Support for Active-Passive and App-Layer Active-Active Disaster Recovery: For more information, see About Active-Passive topology and About App-Layer Active-Active topology.
In an active-passive topology, when all foundations and workloads are healthy, all app traffic is directed to the primary foundation. The secondary foundation receives no app traffic.
VMware recommends using this topology when your secondary foundation is scaled down, generally inactive, and does not receive significant app traffic.
For information about active-passive failover and switchover, see About failover and switchover.
The following diagram describes the active-passive topology in a healthy state:
In the previous diagram:
In an app-layer active-active topology, when all foundations and workloads are healthy, app traffic is directed to both the primary and secondary foundations.
VMware recommends using this topology when both your primary and secondary foundations are scaled up and are expected to serve traffic.
For information about app-layer active-active failover and switchover, see About failover and switchover.
The following diagram describes the app-layer active-active topology in a healthy state:
In the previous diagram:
The GLB directs traffic to the apps in the primary and secondary foundations. The apps in both foundations issue transactions to the leader service instance.
The app in the secondary foundation issues transactions to the leader:
The follower service instance in the secondary foundation continuously replicates data from the leader service instance in the primary foundation.
The app-layer active-active topology does not enable multi-primary (bidirectional) replication. The follower service instance is read-only, and apps can write only to the leader.
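The single-writer rule described above can be illustrated with a minimal Python sketch. This is only a model for reasoning about the topology, not part of VMware Tanzu for MySQL: the `ServiceInstance` class, the `write_target` helper, and the `"primary"`/`"secondary"` labels are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceInstance:
    foundation: str  # "primary" or "secondary" (illustrative labels)
    role: str        # "leader" or "follower"

    @property
    def read_only(self) -> bool:
        # The follower is read-only; only the leader accepts writes.
        return self.role == "follower"


def write_target(instances):
    """Return the one instance that apps can write to (the leader)."""
    leaders = [i for i in instances if i.role == "leader"]
    if len(leaders) != 1:
        # Multi-primary (bidirectional) replication is not enabled.
        raise ValueError("expected exactly one leader")
    return leaders[0]


# Healthy app-layer active-active state: leader in primary, follower in secondary.
healthy = [
    ServiceInstance("primary", "leader"),
    ServiceInstance("secondary", "follower"),
]

# Apps in both foundations issue their transactions to the same leader.
for app_foundation in ("primary", "secondary"):
    target = write_target(healthy)
    print(f"app in {app_foundation} writes to leader in {target.foundation}")
```

In this model, "active-active" refers only to the app layer: both foundations serve traffic, but there is still exactly one writable database instance.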
VMware Tanzu for MySQL prioritizes data consistency over availability. Therefore, VMware Tanzu for MySQL does not trigger failover or switchover automatically, and developers must trigger a failover or switchover manually. For instructions about triggering failover or switchover, see Triggering multi-site replication failover and switchover.
The following table describes when you can trigger a failover or switchover:
Failover if... | Switchover if... |
---|---|
|
|
For information about multi-site replication topologies in a healthy state, see About Active-Passive topology and About App-Layer Active-Active topology.
In both the active-passive and app-layer active-active topologies, if a developer triggers a failover, all app traffic is directed to the leader in the secondary foundation and the primary foundation receives no app traffic.
The following diagram describes what happens when you trigger a failover for a multi-site replication topology:
In the previous diagram:
In an active-passive topology, if a developer triggers a switchover, all app traffic is directed to the leader in the secondary foundation. The primary foundation receives no app traffic.
The following diagram describes what happens when you trigger a switchover in an active-passive topology:
In the preceding diagram:
In an app-layer active-active topology, if a developer triggers a switchover, app traffic is still directed to both the primary and secondary foundations. However, the leader service instance is now in the secondary foundation.
The following diagram describes what happens when you trigger a switchover in an app-layer active-active topology:
In the preceding diagram:
The GLB directs traffic to the apps in the primary and secondary foundations. The apps in both foundations issue transactions to the leader service instance in the secondary foundation.
The app in the primary foundation issues transactions to the leader:
The follower service instance in the primary foundation continuously replicates data from the leader service instance in the secondary foundation.
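The healthy, failover, and switchover states described above can be summarized as a small lookup table. The following Python sketch is only a way to check your understanding of where app traffic and the leader end up. The `State` type, the `OUTCOMES` table, and the string labels are hypothetical names introduced for illustration.

```python
from typing import NamedTuple


class State(NamedTuple):
    app_traffic: frozenset  # foundations that receive app traffic
    leader: str             # foundation that hosts the leader instance


PRIMARY_ONLY = frozenset({"primary"})
SECONDARY_ONLY = frozenset({"secondary"})
BOTH = frozenset({"primary", "secondary"})

# (topology, event) -> resulting state, as described in this topic.
OUTCOMES = {
    ("active-passive", "healthy"): State(PRIMARY_ONLY, "primary"),
    ("active-passive", "failover"): State(SECONDARY_ONLY, "secondary"),
    ("active-passive", "switchover"): State(SECONDARY_ONLY, "secondary"),
    ("app-layer active-active", "healthy"): State(BOTH, "primary"),
    ("app-layer active-active", "failover"): State(SECONDARY_ONLY, "secondary"),
    ("app-layer active-active", "switchover"): State(BOTH, "secondary"),
}


def outcome(topology: str, event: str) -> State:
    """Look up where traffic and the leader are after the given event."""
    return OUTCOMES[(topology, event)]


for (topology, event), state in OUTCOMES.items():
    traffic = ", ".join(sorted(state.app_traffic))
    print(f"{topology:24} {event:10} traffic: {traffic:18} leader: {state.leader}")
```

The table shows that the two topologies differ only in the healthy and switchover states. After a failover, both behave the same way: the secondary foundation hosts the leader and receives all app traffic.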
If external-access is enabled, replication can occur between two non-routable foundations. Replication traffic goes through tcp-router.
The following diagram describes what happens when you enable external-access for multi-site replication:
In the preceding diagram:
Before external-access is enabled, replication traffic goes directly to the follower.
After external-access is enabled, replication traffic goes through tcp-router to the follower.
Before you deploy a multi-site configuration, consider the following requirements.
When calculating IaaS usage, take into account that each multi-site configuration deploys two service instances: a primary of type either multi‑site replication or HA cluster, and a secondary of type multi‑site replication.
For more information, see Setting limits for on-demand service instances and Persistent Disk Usage.
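As a rough planning aid, the two-instances-per-configuration rule can be turned into a short calculation. The following Python sketch makes assumptions: the function names are hypothetical, and the per-instance VM counts are inputs that you must take from your own plan definitions. The example values passed in are placeholders, not figures from the product documentation.

```python
# Each multi-site configuration deploys a primary and a secondary instance.
INSTANCES_PER_MULTI_SITE_CONFIG = 2


def service_instances_needed(multi_site_configs: int) -> int:
    """Total service instances across both foundations."""
    if multi_site_configs < 0:
        raise ValueError("multi_site_configs must not be negative")
    return multi_site_configs * INSTANCES_PER_MULTI_SITE_CONFIG


def estimated_vms(multi_site_configs: int, leader_vms: int, follower_vms: int) -> int:
    """Estimate VM usage given per-instance VM counts from your plans.

    leader_vms and follower_vms are assumptions supplied by the caller;
    an HA cluster leader uses more VMs than a single-node leader.
    """
    if min(multi_site_configs, leader_vms, follower_vms) < 0:
        raise ValueError("inputs must not be negative")
    return multi_site_configs * (leader_vms + follower_vms)


# Placeholder numbers for illustration only.
print(service_instances_needed(5))
print(estimated_vms(5, leader_vms=3, follower_vms=1))
```

Remember that the two instances count against the quotas of different foundations, so check the service instance limits on both.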
Consider that multi‑site replication instances deployed in data centers that are geographically farther apart experience higher latencies.
For information about the standard networking rules, see Required Networking Rules for multi‑site replication.
Because service instances can scale up and down between single-node multi-site leaders and highly available (HA) leaders, take extra care to ensure that the availability zones (AZs) are compatible between the two plans.
To minimize the impact of an availability zone (AZ) outage and to remove single points of failure, VMware recommends that you provision three AZs if you use HA deployments. With three AZs, nodes are deployed to separate AZs.
Important: To scale between the two plan types, the availability zones configured for the HA plan must match those configured for the single-node multi-site plan.
For more information, see Availability Using Multiple AZs in VMware SQL with MySQL for Tanzu Application Service Recommended Usage and Limitations.
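Before scaling between the two plan types, it can help to compare the AZ lists of both plans. The following Python sketch is one way to do that, assuming that "match" means both plans list the same set of AZs regardless of order. The function names and the AZ names are hypothetical, and you would copy the real values from your plan configuration.

```python
def az_differences(ha_plan_azs, single_node_plan_azs):
    """Return the AZs that appear in only one of the two plans."""
    ha = set(ha_plan_azs)
    single = set(single_node_plan_azs)
    return {
        "only_in_ha_plan": sorted(ha - single),
        "only_in_single_node_plan": sorted(single - ha),
    }


def azs_match(ha_plan_azs, single_node_plan_azs) -> bool:
    """True when both plans are configured with the same set of AZs."""
    diff = az_differences(ha_plan_azs, single_node_plan_azs)
    return not diff["only_in_ha_plan"] and not diff["only_in_single_node_plan"]


# Hypothetical AZ names for illustration.
ha_azs = ["az1", "az2", "az3"]
single_node_azs = ["az1", "az2"]

if azs_match(ha_azs, single_node_azs):
    print("AZs match; the plans are compatible for scaling.")
else:
    print("AZ mismatch:", az_differences(ha_azs, single_node_azs))
```

With the example values, the check reports that `az3` is configured only for the HA plan, so the single-node multi-site plan would need the same AZ before scaling between the plans.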
Multi-site replication has the following limitation: