About multi-site replication

This topic describes how the multi‑site replication topology works in VMware SQL with MySQL for Tanzu Application Service and provides information to help you decide whether to use the Multi‑Site Replication plan.

The multi‑site replication topology in VMware Tanzu for MySQL is a disaster recovery solution that enables developers to provision a leader-follower instance across two foundations. This leader-follower service instance consists of two single-node instances that are configured for replication. Operators can configure the two foundations to be in the same data center or spread across multiple data centers or geographical regions.

The foundation types are:

Primary Foundation: This foundation is deployed in your main data center. Generally, this data center receives the majority of app traffic. VMware Tanzu for MySQL assumes that the leader is deployed on this foundation when healthy.
Secondary Foundation: This foundation is deployed in your failover data center. Generally, this data center receives less app traffic than the primary foundation, or no traffic at all. VMware Tanzu for MySQL assumes that the follower is deployed on this foundation unless a developer triggers a failover.

The Multi‑Site Replication plan type is configured separately from the leader-follower service plan.

For information about enabling Multi‑Site Replication, see Preparing for Multi‑Site Replication.

For information for developers about using Multi‑Site Replication, see Using VMware Tanzu for MySQL for Multi‑Site Replication.

Multi‑Site Replication benefits

The Multi‑Site Replication plan has the following benefits:

Resilience for Service Instances: Developers can trigger a failover to maintain app uptime during a data center outage. For more information, see Triggering Multi‑Site Replication failover and switchover.
Data Center Upgrades with Zero Downtime: Operators can upgrade data centers without taking databases offline by triggering a switchover first.
Support for Multiple Cloud Deployment Models: Operators can configure multi‑site replication with a single cloud or hybrid cloud deployment model. Both deployment models have the same end-user experience.
Support for Active-Passive and App-Layer Active-Active Disaster Recovery: For more information, see About Active-Passive topology and About App-Layer Active-Active topology.

Active-Passive topology

In an active-passive topology, when all foundations and workloads are healthy, all app traffic is directed to the primary foundation. The secondary foundation receives no app traffic.

VMware recommends using this topology when your secondary foundation is scaled down, generally inactive, and does not receive significant app traffic.

For information about active-passive failover and switchover, see About failover and switchover.

The following diagram describes the active-passive topology in a healthy state:

This active-passive topology diagram is described here

In the previous diagram:

The global DNS load balancer (GLB) directs traffic to the app in the primary foundation. This app issues transactions to the leader service instance.
The follower service instance in the secondary foundation continuously replicates data from the leader service instance in the primary foundation.

App-Layer Active-Active topology

In an app-layer active-active topology, when all foundations and workloads are healthy, app traffic is directed to both the primary and secondary foundation.

VMware recommends using this topology when both your primary and secondary foundations are scaled up and are expected to serve traffic.

For information about app-layer active-active failover and switchover, see About failover and switchover.

The following diagram describes the app-layer active-active topology in a healthy state:

This app-layer active-active topology diagram is described here

In the previous diagram:

The GLB directs traffic to the apps in the primary and secondary foundations. The apps in the primary and secondary foundation issue transactions to the leader service instance.

To issue transactions to the leader, the app in the secondary foundation:
1. Connects to the follower service instance.
2. Issues transactions to the follower service instance.
3. Forwards the transactions from the follower service instance to the leader.
The follower service instance in the secondary foundation continuously replicates data from the leader service instance in the primary foundation.

In the Multi‑Site Replication plan, the app-layer active-active topology does not enable multi-primary replication. Therefore, the follower service instance is read-only. Apps can only write to the leader.

Failover and switchover

VMware Tanzu for MySQL prioritizes data consistency over availability. Therefore, VMware Tanzu for MySQL does not trigger failover or switchover automatically. Developers must trigger a failover or switchover manually. For instructions about triggering failover or switchover, see Triggering Multi‑Site Replication failover and switchover.

The following table describes when you can trigger a failover or switchover:

Failover if...	Switchover if...
The leader MySQL process has crashed or is unhealthy and is not automatically recovered by BOSH.	Both the leader and the follower instance are healthy.
The leader VM is destroyed or unrecoverable.	You plan to do foundation or data center upgrades or maintenance on your primary site. For example, upgrading stemcells or data center hardware.
The availability zone (AZ) for the leader VM experiences an unexpected outage.	You plan to degrade performance on the primary site.
The data center for the leader VM experiences an unexpected outage.	
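
The conditions in the table above can be read as a simple decision rule. The following Python sketch is illustrative only: the `SiteStatus` fields and the `recommend_action` function are hypothetical names, not part of any VMware Tanzu for MySQL API, and the result is a recommendation for a developer to act on, because failover and switchover are never triggered automatically.

```python
from dataclasses import dataclass


@dataclass
class SiteStatus:
    """Hypothetical summary of what an operator or developer has observed."""
    leader_healthy: bool               # leader MySQL process is running and healthy
    follower_healthy: bool             # follower instance is healthy
    leader_recoverable: bool = True    # False if BOSH cannot recover the leader
                                       # (VM destroyed, AZ outage, data center outage)
    planned_maintenance: bool = False  # upgrades, maintenance, or planned
                                       # performance degradation on the primary site


def recommend_action(status: SiteStatus) -> str:
    """Return "failover", "switchover", or "none" based on the table's conditions."""
    # Failover: the leader is down and is not coming back on its own.
    if not status.leader_healthy and not status.leader_recoverable:
        return "failover"
    # Switchover: both instances are healthy and work is planned on the primary site.
    if status.leader_healthy and status.follower_healthy and status.planned_maintenance:
        return "switchover"
    # Otherwise, no manual action is suggested by the table.
    return "none"
```

For example, `recommend_action(SiteStatus(leader_healthy=True, follower_healthy=True, planned_maintenance=True))` returns `"switchover"`.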

For information about multi‑site replication topologies in a healthy state, see About Active-Passive topology and About App-Layer Active-Active topology.

Failover

In both the active-passive and app-layer active-active topologies, if a developer triggers a failover, all app traffic is directed to the leader in the secondary foundation and the primary foundation receives no app traffic.

The following diagram describes what happens when you trigger a failover for a multi‑site replication topology:

This multi-site failover diagram is described here

In the previous diagram:

The GLB directs traffic to the app in the secondary foundation. This app issues transactions to the leader service instance in the secondary foundation.
The leader service instance in the secondary foundation does not replicate data to another service instance.

Active-Passive switchover

In an active-passive topology, if a developer triggers a switchover, all app traffic is directed to the leader in the secondary foundation. The primary foundation receives no app traffic.

The following diagram describes what happens when you trigger a switchover in an active-passive topology:

This active-passive switchover diagram is described here

In the preceding diagram:

The GLB directs traffic to the app in the secondary foundation. This app issues transactions to the leader service instance in the secondary foundation.
The leader service instance in the secondary foundation replicates data to the follower service instance in the primary foundation.

App-Layer Active-Active switchover

In an app-layer active-active topology, if a developer triggers a switchover, app traffic is still directed to both the primary and secondary foundations. However, the leader service instance is in the secondary foundation.

The following diagram describes what happens when you trigger a switchover in an app-layer active-active topology:

This active-active switchover diagram is described here

In the preceding diagram:

The GLB directs traffic to the apps in the primary and secondary foundations. The apps in the primary and secondary foundation issue transactions to the leader service instance in the secondary foundation.

To issue transactions to the leader, the app in the primary foundation:
1. Connects to the follower service instance.
2. Issues transactions to the follower service instance.
3. Forwards the transactions from the follower service instance to the leader.
The follower service instance in the primary foundation continuously replicates data from the leader service instance in the secondary foundation.
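
The failover and switchover behavior described in the sections above can be summarized as a small state model. The following Python sketch is a simplified illustration under stated assumptions: `Topology`, `failover`, `switchover`, and `traffic_sites` are hypothetical names for this example and do not correspond to product commands or APIs.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass(frozen=True)
class Topology:
    leader_site: str              # foundation that hosts the leader
    follower_site: Optional[str]  # foundation that hosts the follower,
                                  # or None if the leader is not replicating


# Healthy state: leader on the primary foundation, follower on the secondary.
HEALTHY = Topology(leader_site="primary", follower_site="secondary")


def failover(t: Topology) -> Topology:
    """The follower site takes over as leader and replicates to no one."""
    if t.follower_site is None:
        raise ValueError("no follower available to promote")
    return Topology(leader_site=t.follower_site, follower_site=None)


def switchover(t: Topology) -> Topology:
    """Leader and follower swap sites; replication continues in reverse."""
    if t.follower_site is None:
        raise ValueError("switchover requires a healthy follower")
    return Topology(leader_site=t.follower_site, follower_site=t.leader_site)


def traffic_sites(t: Topology, active_active: bool) -> Set[str]:
    """Foundations that receive app traffic from the GLB in this model."""
    if active_active and t.follower_site is not None:
        return {t.leader_site, t.follower_site}
    return {t.leader_site}
```

In this model, `traffic_sites(failover(HEALTHY), active_active=True)` yields only the secondary foundation, while `traffic_sites(switchover(HEALTHY), active_active=True)` still yields both foundations, which matches the descriptions above.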

About enabling external access

If external-access is enabled, replication can occur between two non-routable foundations. Replication traffic goes through tcp-router.

The following diagram describes what happens when you enable external-access for multi-site replication:


In the preceding diagram:

Before external-access is enabled, replication traffic goes directly to the follower.
After external-access is enabled, replication traffic goes through tcp-router to the follower.

Infrastructure requirements for Multi‑Site Replication

Before you use the Multi‑Site Replication plan, consider the following requirements.

Capacity planning

When calculating IaaS usage, you must take into account that each Multi‑Site Replication instance requires two VMs. Therefore, a Multi‑Site Replication plan uses twice the resources of a single-node plan.
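
As a rough aid for this calculation, the following Python sketch counts VMs for a mix of plans. The function and constant names are hypothetical, and the sketch covers only the VM count stated above. It does not account for disk, memory, or any other per-plan sizing.

```python
# Each Multi-Site Replication instance requires two VMs (leader and follower).
VMS_PER_MULTI_SITE_INSTANCE = 2
# A single-node instance requires one VM.
VMS_PER_SINGLE_NODE_INSTANCE = 1


def total_vms(multi_site_instances: int, single_node_instances: int = 0) -> int:
    """Return the total number of service instance VMs across all foundations."""
    if multi_site_instances < 0 or single_node_instances < 0:
        raise ValueError("instance counts must be non-negative")
    return (multi_site_instances * VMS_PER_MULTI_SITE_INSTANCE
            + single_node_instances * VMS_PER_SINGLE_NODE_INSTANCE)
```

For example, `total_vms(10)` returns 20, and `total_vms(10, 5)` returns 25.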

For more information, see Setting limits for on-demand service instances and Persistent Disk Usage.

Networking requirements

Consider that Multi‑Site Replication instances deployed in data centers that are geographically farther apart experience higher latencies.

For information about the standard networking rules, see Required Networking Rules for Multi‑Site Replication.

Multi‑Site Replication limitations

The Multi‑Site Replication plan has the following limitation:

Synchronous replication is not supported. For multi‑site replication plans, any data that is written to the leader is asynchronously replicated to the follower.
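
To illustrate what asynchronous replication implies, the following Python sketch uses a toy model in which the follower can lag behind the leader. This is not how MySQL replication is implemented, and the `Leader`, `Follower`, and `unreplicated` names are invented for this example. It only shows that writes acknowledged by the leader might not yet exist on the follower at a given moment.

```python
class Leader:
    """Toy leader: every write is appended to an ordered log."""

    def __init__(self):
        self.log = []

    def write(self, row):
        # The write completes here without waiting for the follower.
        self.log.append(row)


class Follower:
    """Toy follower: applies the leader's log some time after the writes."""

    def __init__(self):
        self.applied = []

    def replicate_from(self, leader, max_rows=None):
        # Copy log entries that have not been applied yet, optionally
        # limited to max_rows to simulate replication lag.
        pending = leader.log[len(self.applied):]
        if max_rows is not None:
            pending = pending[:max_rows]
        self.applied.extend(pending)


def unreplicated(leader, follower):
    """Rows written to the leader that the follower has not applied yet."""
    return leader.log[len(follower.applied):]
```

In this model, if the leader accepts three writes and the follower has applied only two of them, `unreplicated(leader, follower)` returns the third write.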