This topic describes the replication canary for a highly available (HA) cluster. The replication canary runs on the jumpbox VM and monitors an HA cluster to ensure that replication is working.
The replication canary writes to a private dataset in the cluster and then attempts to read the written data from each node. It pauses between the write and the reads to give the write-sets time to replicate. The private dataset does not use a significant amount of disk capacity.
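The write, pause, and read cycle described above can be illustrated with a minimal sketch. It is not the canary's actual implementation: the function name `check_replication`, the `write` and `readers` callables, and the token value are hypothetical, and the nodes are assumed to be reachable through simple callables rather than real database connections.

```python
import time
from typing import Callable, Dict, List


def check_replication(
    write: Callable[[str], None],
    readers: Dict[str, Callable[[], List[str]]],
    token: str,
    pause_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Write a token, pause, then return the names of nodes that cannot read it.

    An empty list means every node returned the token, that is, replication
    appears to be working. All names here are illustrative assumptions.
    """
    write(token)            # write to the private dataset
    sleep(pause_seconds)    # give the write-set time to replicate
    # Read from each node and collect the ones that do not have the token.
    return [name for name, read in readers.items() if token not in read()]
```

If the returned list is non-empty, at least one node did not see the write, which corresponds to the failure condition described in this topic.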
When replication fails, the canary cannot read the data from all the nodes. It then sends an email notification and deactivates client access to the cluster, as described in the rest of this topic.
When replication fails, data can be lost. Contact Support immediately in the case of replication failure.
If the canary detects replication failure, it immediately sends an email through the VMware Tanzu Application Service for VMs (TAS for VMs) notification service. You must have configured email notifications in TAS for VMs for the replication canary to send emails.
For more information about configuring email notifications, see the Configure Email Notifications procedure for your IaaS in the Tanzu Operations Manager documentation.
The notification service sends emails similar to the following:
Subject: CF Notification: p-mysql Replication Canary, alert 417
This message was sent directly to your email address.
{alert-code 417}
This is an email to notify you that the MySQL service's replication canary
has detected an unsafe cluster condition in which replication is not
performing as expected across all nodes.
When the canary detects replication failure, it deactivates connections to the database cluster through the proxies. When the replication issue is resolved, the canary automatically restores client access to the cluster. To determine whether the canary has deactivated cluster access, query the cluster through the Switchboard API.
To determine if cluster access is deactivated, do the following:
1. Do the prerequisite procedure in Monitor node health.
2. To view cluster access, run the following command:
curl https://USERNAME:PASSWORD@N-HOSTNAME/v0/cluster
Where:
USERNAME is the username you recorded in step 1.
PASSWORD is the password you recorded in step 1.
N is 0, 1, or 2 depending on the proxy you want to connect to.
HOSTNAME is the hostname you recorded in step 1.
For example:
$ curl https://abcdefghijklmno:012345678912345@0-proxy.123abc45-67d8-912e-34f5-g34612c10dba.org.dedicated-mysql.cf-app.com/v0/cluster
[
  {
    "currentBackendIndex":0,
    "trafficEnabled":false,
    "message":"Disabling cluster traffic",
    "lastUpdated":"2016-07-27T05:16:29.197754077Z"
  }
]
When cluster access is deactivated, trafficEnabled is set to false.
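To check the trafficEnabled field from a script rather than by eye, the following sketch may help. It assumes the response has the shape shown in the example above (a JSON object, possibly wrapped in a list) and that the endpoint accepts the same HTTP Basic credentials that curl sends for a USERNAME:PASSWORD URL. The function names `traffic_enabled` and `fetch_cluster_status` are hypothetical and are not part of the Switchboard API.

```python
import base64
import json
import urllib.request


def traffic_enabled(response_body: str) -> bool:
    """Return the trafficEnabled value from a /v0/cluster response body."""
    data = json.loads(response_body)
    # The example response wraps the object in a list; accept either shape.
    if isinstance(data, list):
        data = data[0]
    return bool(data["trafficEnabled"])


def fetch_cluster_status(username: str, password: str, n: int, hostname: str) -> str:
    """Fetch /v0/cluster from proxy N (0, 1, or 2) using HTTP Basic auth."""
    url = f"https://{n}-{hostname}/v0/cluster"
    credentials = base64.b64encode(f"{username}:{password}".encode()).decode()
    request = urllib.request.Request(
        url, headers={"Authorization": f"Basic {credentials}"}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8")


# Response body copied from the example in this topic.
SAMPLE = (
    '[ { "currentBackendIndex":0, "trafficEnabled":false, '
    '"message":"Disabling cluster traffic", '
    '"lastUpdated":"2016-07-27T05:16:29.197754077Z" } ]'
)

if __name__ == "__main__":
    # Parse the sample rather than calling a live proxy.
    print("traffic enabled:", traffic_enabled(SAMPLE))
```

Calling `traffic_enabled(fetch_cluster_status(USERNAME, PASSWORD, N, HOSTNAME))` with the values recorded in step 1 would return False while the canary has cluster access deactivated.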
If you must restore client access to the cluster while replication is failing, contact Support.
For more information about the Switchboard API, see Monitoring node health.