This topic describes how highly available (HA) clusters work in VMware Tanzu SQL with MySQL for VMs and contains information to help you decide whether to use HA clusters.
In an HA cluster topology, a cluster consists of three nodes, where each node contains the same set of data synchronized across nodes. Data is replicated across nodes simultaneously or not written at all if replication fails even on a single node.
With three nodes, HA clusters are highly available and automatically resilient to failure. This is because if a node loses contact with the other two nodes, the other two nodes can remain in contact and continue to accept transactions. Because the number of nodes is odd, the two nodes can assume they represent the primary component. For more information about cluster availability, see Percona XtraDB Cluster: Quorum and Availability of the cluster in the Percona XtraDB Cluster documentation.
In HA topology, the Tanzu SQL for VMs service uses the following:
Three Switchboard proxies that are co-located on the MySQL nodes. The proxies gracefully handle failure of nodes which enables fast failover to other nodes within the cluster. For more information, see MySQL Proxy.
A jumpbox VM called
mysql-monitor for monitoring, diagnosing and backing up MySQL nodes. You can run the
mysql-diag tool on the
mysql-monitor VM to view the status of your cluster nodes. For more information about
mysql-diag, see Running mysql-diag.
The following diagram shows an HA cluster in three availability zones (AZs):
The diagram shows an app communicating with a MySQL cluster using BOSH DNS. Each of the three MySQL nodes is located in its own AZ. Each MySQL node contains a proxy and a server, and has its own disk. Arrows are shown connecting the MySQL nodes, indicating that they can communicate with each other across AZs. The dotted lines indicate data replication from the primary to the secondary databases.
Traffic from apps and clients is sent round-robin to the proxies on all nodes over BOSH DNS. The proxies direct their traffic to the server on the primary node.
If the primary server fails, a secondary node is promoted to become the new primary. The proxies will update to ensure that traffic is sent to the new primary.
For more information about other availability options supported by Tanzu SQL for VMs, see Availability Options.
Before deciding to use HA clusters for Tanzu SQL for VMs service plans, consider the following requirements.
When calculating IaaS usage, you must take into account that each HA instance requires three VMs. Therefore, the resources used for an HA plan must be tripled. For more information, see Resource Usage Planning for On-Demand Plans.
To minimize impact of an availability zone (AZ) outage and to remove single points of failure, VMware recommends that you provision three AZs if using HA deployments. With three AZs, nodes are deployed to separate AZs.
For more information, see Availability Using Multiple AZs in VMware Tanzu SQL with MySQL for VMs Recommended Usage and Limitations.
In addition to the standard networking rules needed for Tanzu SQL for VMs, the operator must ensure HA cluster specific network rules are also configured.
For information about HA cluster specific networking rules, see Required Networking Rules for Highly Available Cluster Plans.
When deployed with HA topology, Tanzu SQL for VMs runs three nodes. This cluster arrangement imposes some limitations which do not apply to single node or leader-follower MySQL database servers. For more information about the difference between single node, leader-follower, and HA cluster topologies, see Availability Options.
HA clusters perform validations at startup and during runtime to prevent you from using MySQL features that are not supported. If a validation fails during startup, the server is halted and throws an error. If a validation fails during runtime the operation is denied and throws an error.
These validations are designed to ensure optimal operation for common cluster setups that do not require experimental features and do not rely on operations not supported by HA clusters.
For more information about HA limitations, see Percona XtraDB Cluster Limitations in the Percona XtraDB Cluster documentation.
HA clusters only supports the InnoDB storage engine. The InnoDB is the default storage engine for new tables. Pre-existing tables that are not using InnoDB are at risk because they are not replicated within a cluster.
Large DDL statements can lock all schemas. This can be mitigated by using the Rolling Schema Upgrade (RSU) method. For instructions on how to use the RSU method, see Using a rolling schema upgrade (RSU) in the VMware Tanzu Support knowledge base.
Large transaction sizes can inhibit the performance of the cluster and apps using the cluster because they are buffered in memory.
You cannot execute a DML statement in parallel with a DDL statement, if both statements affect the same tables. If they are expected in parallel, the DML and DDL statements are both applied immediately to the cluster, causing errors.
All tables must have a primary key. You can use multi-column primary keys. This is because HA clusters replicate using row based replication and ensure unique rows on each instance.
Explicit table locking is not supported. For more information, see EXPLICIT TABLE LOCKING in the Percona XtraDB Cluster documentation.
By default, auto-increment variables are not sequential and each node has gaps in IDs. This prevents auto-increment replication conflicts across the cluster. For more information, see wsrep_auto_increment_control in the Percona XtraDB Cluster documentation.
Users cannot change the session behavior of auto-increment variables. This is because the HA cluster controls the behavior of these variables. For example, if an app runs
SET SESSION auto_increment_increment, the cluster ignores the change.
In InnoDB, some transactions can cause deadlocks. You can minimize deadlocks in your apps by rewriting transactions and SQL statements. For more information about deadlocks, see Deadlocks in InnoDB and How to Minimize and Handle Deadlocks in MySQL documentation.
Table partitioning can cause performance issues in the cluster. This is as a result of the implicit table locks that are used when running table partition commands.
This section describes some defaults that Tanzu SQL for VMs applies to HA components.
When a new node is added to or rejoins a cluster, a primary node from the cluster is designated as the state donor. The new node synchronizes with the donor through the State Snapshot Transfer (SST) process.
Tanzu SQL for VMs uses Xtrabackup for SST, which lets the state donor node to continue accepting reads and writes during the transfer. HA clusters default to using rsync to perform SST, which is fast, but blocks requests during the transfer. For more information about Percona XtraBackup SST Configuration in the Percona XtraDB Cluster documentation.
By default, HA clusters splits large
Load Data commands into small manageable units. This deviates from the standard behavior for MySQL. For more information, see wsrep_load_data_splitting in the Percona XtraDB Cluster documentation.
These are the maximum transaction sizes set for HA clusters:
The maximum write-set size allowed in HA clusters is 2 GB. For more information, see wsrep_max_ws_size.
The maximum number of rows each write-set can contain defaults to no limit. For more information, see wsrep_max_ws_rows.
VMware Tanzu SQL with MySQL for VMs uses a proxy to send client connections to the healthy MySQL database cluster nodes in a highly available cluster plan. Using a proxy gracefully handles failure of nodes, enabling fast, failover to other nodes within the cluster. When a node becomes unhealthy, the proxy closes all connections to the unhealthy node and re-routes all subsequent connections to a healthy node.
The proxy used in Tanzu SQL for VMs is Switchboard. Switchboard was developed to replace HAProxy as the proxy tier for the high availability cluster for MySQL databases in Tanzu SQL for VMs.
Switchboard offers the following features:
MySQL Server Access
MySQL clients communicate with nodes through this network port. These connections are automatically passed through to the nodes.
Switchboard and API
Operators can connect to Switchboard to view the state of the nodes.
For more information about monitoring proxy health status, see Monitoring Node Health.