You can trigger a failover of apps from the leader to the follower.
You can trigger a failover in the following scenarios:
You can use the following metrics to determine if you need to trigger a failover:
/p.mysql/available
: This metric monitors whether the MySQL server is currently available. For more information, see Server availability.
/p.mysql/follower/seconds_behind_master
: This metric monitors how far behind the follower is in applying writes from the leader. For more information, see Leader-Follower metrics.
/p.mysql/follower/seconds_since_leader_heartbeat
: This metric monitors the number of seconds that elapse between the leader heartbeat and the replication of the heartbeat in the follower. For more information, see Leader-Follower metrics.
For information about errands used to trigger failover, see configure-leader-follower, make-leader, and make-read-only.
To trigger a failover:
To retrieve the information necessary for stopping the leader and promoting the follower:
Log in to your deployment by running:
cf login -a API-URL
When prompted, enter your credentials.
Target the org and space where the leader-follower service instance is located by running:
cf target -o DESTINATION-ORG -s DESTINATION-SPACE
Record the GUID of the service instance by running:
cf service SERVICE-INSTANCE-NAME --guid
Where SERVICE-INSTANCE-NAME is the name of the leader-follower service instance.
For example:
$ cf service my-lf-instance --guid
82ddc607-710a-404e-b1b8-a7e3ea7ec063
If you do not know the name of the service instance, you can list the service instances in the space by running cf services.
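Every BOSH command in this procedure targets a deployment named service-instance_GUID, so it can help to build that name once from the recorded GUID. The following is a minimal sketch; the function name `deployment_name` is hypothetical, and the format check assumes the GUID uses the lowercase 8-4-4-4-12 hexadecimal layout shown in the example.

```shell
# Hypothetical helper: build the BOSH deployment name for a service instance GUID.
# Prints service-instance_GUID, or fails if the argument does not look like a GUID.
deployment_name() {
  guid="$1"
  # Expect the lowercase 8-4-4-4-12 hexadecimal layout shown in the example.
  if printf '%s\n' "$guid" | grep -Eq '^[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}$'; then
    printf 'service-instance_%s\n' "$guid"
  else
    echo "not a GUID: $guid" >&2
    return 1
  fi
}

# Usage (requires a logged-in cf CLI):
#   GUID=$(cf service my-lf-instance --guid)
#   DEPLOYMENT=$(deployment_name "$GUID")
```

Rejecting a malformed value early avoids passing an empty or truncated deployment name to later commands.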
To SSH into the Tanzu Operations Manager VM, follow the procedures in Gather credential and IP Address information and SSH into Tanzu Operations Manager.
From the Tanzu Operations Manager VM, log in to your BOSH Director with the BOSH CLI. For more information on logging in with the BOSH CLI, see Log in to the BOSH Director.
Use the BOSH CLI to run the inspect errand:
bosh -d service-instance_GUID run-errand inspect
Where GUID is the GUID of the leader-follower service instance you recorded.
For example:
$ bosh -d service-instance_82ddc607-710a-404e-b1b8-a7e3ea7ec063 \
run-errand inspect
In the output for the leader-follower MySQL VMs, identify the instance marked Role: leader.
Example output:
Instance   mysql/ca0ed8b5-7590-4cde-bba8-7ca2935f2bd0
Exit Code  0
Stdout     2018/04/03 18:08:46 Started executing command: inspect
           2018/04/03 18:08:46
           IP Address: 10.0.8.11
           Role: leader
           Read Only: false
           Replication Configured: false
           Replication Mode: async
           Has Data: true
           GTID Executed: 82ddc607-710a-404e-b1b8-a7e3ea7ec063:1-18
           2018/04/03 18:08:46 Successfully executed command: inspect
Stderr     -

Instance   mysql/37e4b6bc-2ed6-4bd2-84d1-e59a91f5e7f8
Exit Code  0
Stdout     2018/04/03 18:08:46 Started executing command: inspect
           2018/04/03 18:08:46
           IP Address: 10.0.8.10
           Role: follower
           Read Only: true
           Replication Configured: true
           Replication Mode: async
           Has Data: true
           GTID Executed: 82ddc607-710a-404e-b1b8-a7e3ea7ec063:1-18
           2018/04/03 18:08:46 Successfully executed command: inspect
Record the index of the instance marked Role: leader. In this example output, the index of the leader VM is ca0ed8b5-7590-4cde-bba8-7ca2935f2bd0.
Record the index of the other instance, which is the follower VM. In this example output, the index of the follower VM is 37e4b6bc-2ed6-4bd2-84d1-e59a91f5e7f8.
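Copying the two indexes by hand is easy to get wrong, so the lookup can be sketched as a small filter over saved inspect output. The function name `index_for_role` is hypothetical, and the sketch assumes each instance block contains a `mysql/INDEX` token followed later by a `Role:` line, as in the example output.

```shell
# Hypothetical helper: print the index of the instance that reports the given role.
# Usage: index_for_role leader < inspect-output.txt
index_for_role() {
  role="$1"
  # Reduce the output to the instance names and role lines, one per line,
  # then print the most recent instance index seen before the matching role.
  grep -oE 'mysql/[0-9a-f-]+|Role: [a-z]+' |
    awk -v role="$role" '
      /^mysql\// { sub(/^mysql\//, ""); index_id = $0; next }
      $0 == ("Role: " role) { print index_id; exit }
    '
}

# Usage (requires a logged-in BOSH CLI):
#   bosh -d service-instance_GUID run-errand inspect > inspect-output.txt
#   LEADER_INDEX=$(index_for_role leader < inspect-output.txt)
#   FOLLOWER_INDEX=$(index_for_role follower < inspect-output.txt)
```

If no instance reports the requested role, the function prints nothing, so check that the result is non-empty before using it.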
If you still have access to the AZ where the leader VM is located, determine if the leader VM is in the AZ you want to take offline by running:
bosh -d service-instance_GUID instances
For example:
$ bosh -d service-instance_82ddc607-710a-404e-b1b8-a7e3ea7ec063 \
instances
Deployment 'service-instance_82ddc607-710a-404e-b1b8-a7e3ea7ec063'

Instance                                    Process State  AZ             IPs
mysql/ca0ed8b5-7590-4cde-bba8-7ca2935f2bd0  failing        us-central1-f  10.0.8.11
mysql/37e4b6bc-2ed6-4bd2-84d1-e59a91f5e7f8  running        us-central1-a  10.0.8.10

2 instances
The leader VM might not display its status as failing if you are performing planned maintenance.
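To check the AZ of the leader VM without reading the table by eye, the lookup can be sketched as a one-line filter. The function name `az_for_instance` is hypothetical, and the sketch assumes the four-column table layout in the example (Instance, Process State, AZ, IPs), with one instance per line and a single-word process state.

```shell
# Hypothetical helper: print the AZ of one instance from `bosh instances` table output.
# Usage: bosh -d service-instance_GUID instances | az_for_instance INDEX
az_for_instance() {
  # Column 1 is the instance name and column 3 is the AZ in the assumed layout.
  awk -v inst="mysql/$1" '$1 == inst { print $3 }'
}
```

Compare the printed AZ with the one you plan to take offline before you continue.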
To stop the leader VM and promote the follower VM to the new leader:
Stop any data from being written to the leader VM by setting it to read-only by running:
bosh -d service-instance_GUID \
run-errand make-read-only \
--instance=mysql/INDEX
Where:
GUID
: This is the GUID of the leader-follower service instance retrieved above.
INDEX
: This is the index of the leader VM retrieved above.
For example:
$ bosh -d service-instance_82ddc607-710a-404e-b1b8-a7e3ea7ec063 \
run-errand make-read-only \
--instance=mysql/ca0ed8b5-7590-4cde-bba8-7ca2935f2bd0
If you still have access to the AZ where the leader VM is located, stop the leader VM by running:
bosh -d service-instance_GUID stop mysql/INDEX
Use the index of the leader VM retrieved above.
For example:
$ bosh -d service-instance_82ddc607-710a-404e-b1b8-a7e3ea7ec063 \
stop mysql/ca0ed8b5-7590-4cde-bba8-7ca2935f2bd0
Set the follower VM as writable by running:
bosh -d service-instance_GUID run-errand make-leader --instance=mysql/INDEX
Use the index of the follower VM retrieved above.
For example:
$ bosh -d service-instance_82ddc607-710a-404e-b1b8-a7e3ea7ec063 \
run-errand make-leader \
--instance=mysql/37e4b6bc-2ed6-4bd2-84d1-e59a91f5e7f8
If this command returns an error, re-run it until the follower VM has completed applying the transactions.
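Re-running the errand by hand can be replaced with a bounded retry loop. The following is a minimal sketch: the function name `retry` is hypothetical, and the attempt count and delay are values you choose, not values recommended by the product.

```shell
# Hypothetical helper: re-run a command until it succeeds or attempts run out.
# Usage: retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry() {
  max="$1"; delay="$2"; shift 2
  attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max" ]; then
      echo "giving up after $attempt attempts: $*" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Usage (MAX_ATTEMPTS and DELAY_SECONDS are placeholders you choose):
#   retry MAX_ATTEMPTS DELAY_SECONDS \
#     bosh -d service-instance_GUID run-errand make-leader --instance=mysql/INDEX
```

Bounding the number of attempts means that a follower that never catches up produces a clear failure instead of an endless loop.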
At this point, a single instance is working but leader-follower replication has not yet been restored. To fail your app over to a single instance instead of restoring leader-follower, skip to Unbind and Rebind the App.
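The promotion steps above can be collected into one function so that they always run in the same order. This is a sketch under stated assumptions, not a supported tool: the function name `promote_follower` and the `BOSH` and `LEADER_AZ_REACHABLE` variables are hypothetical. Setting `BOSH="echo bosh"` prints the commands instead of running them, which is a convenient way to review them first.

```shell
# Hypothetical helper: run the promotion steps in order, stopping at the first failure.
# Usage: promote_follower GUID LEADER_INDEX FOLLOWER_INDEX
promote_follower() {
  deployment="service-instance_$1"
  leader="mysql/$2"
  follower="mysql/$3"
  bosh_cmd="${BOSH:-bosh}"   # hypothetical override; BOSH="echo bosh" gives a dry run

  # 1. Stop writes to the current leader.
  $bosh_cmd -d "$deployment" run-errand make-read-only --instance="$leader" || return 1

  # 2. Stop the leader VM, but only if its AZ is still reachable.
  if [ "${LEADER_AZ_REACHABLE:-yes}" = "yes" ]; then
    $bosh_cmd -d "$deployment" stop "$leader" || return 1
  fi

  # 3. Make the follower writable.
  $bosh_cmd -d "$deployment" run-errand make-leader --instance="$follower"
}
```

Because the function stops at the first failing step, the follower is never promoted while the old leader might still accept writes.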
If you are triggering a failover in response to the AZ of the leader VM going offline, you can fail your app over to a single instance by following the procedure in Unbind and Rebind the App. However, to restore leader-follower, you must regain access to the AZ where your leader VM is located before following the procedure in Clean Up Former Leader VM (Optional) and Configure the New Follower.
If you are triggering a failover in response to a failing leader VM, to clean up the former leader VM:
Deactivate resurrection by running:
bosh update-resurrection off
Retrieve the CID of the failing former leader VM by running:
bosh -d service-instance_GUID instances \
--details \
--failing \
--column="VM CID" \
--json
For example:
$ bosh -d service-instance_82ddc607-710a-404e-b1b8-a7e3ea7ec063 instances \
--details \
--failing \
--column="VM CID" \
--json
Retrieve the disk CID of the failing former leader VM by running:
bosh -d service-instance_GUID instances \
--details \
--failing \
--column="Disk CIDs" \
--json
For example:
$ bosh -d service-instance_82ddc607-710a-404e-b1b8-a7e3ea7ec063 instances \
--details \
--failing \
--column="Disk CIDs" \
--json
Delete the failing former leader VM by running:
bosh -d service-instance_GUID delete-vm VM-CID
Where:
GUID
: This is the GUID of the leader-follower service instance retrieved above.
VM-CID
: This is the CID of the failing former leader VM retrieved above.
For example:
$ bosh -d service-instance_82ddc607-710a-404e-b1b8-a7e3ea7ec063 \
delete-vm i-1db9ede6
Orphan the disk of the failing former leader VM by running:
bosh -d service-instance_GUID orphan-disk DISK-CID
Where:
GUID
: This is the GUID of the leader-follower service instance retrieved above.
DISK-CID
: This is the disk CID of the failing former leader VM retrieved above.
For example:
$ bosh -d service-instance_82ddc607-710a-404e-b1b8-a7e3ea7ec063 \
orphan-disk b-1db9ede6
Orphaning a disk rather than deleting it preserves the disk for possible recovery. After performing recovery operations, you can reattach the disk to a VM. BOSH deletes orphaned disks after five days by default.
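Because delete-vm is destructive, it is worth refusing to run it when one of the retrieved values is empty. The following sketch wraps the two cleanup commands with that check. The function name `clean_up_former_leader` and the `BOSH` override variable are hypothetical; setting `BOSH="echo bosh"` prints the commands without running them.

```shell
# Hypothetical helper: delete the failed VM and orphan its disk.
# Refuses to run if any argument is empty.
# Usage: clean_up_former_leader GUID VM_CID DISK_CID
clean_up_former_leader() {
  if [ -z "$1" ] || [ -z "$2" ] || [ -z "$3" ]; then
    echo "usage: clean_up_former_leader GUID VM_CID DISK_CID" >&2
    return 2
  fi
  deployment="service-instance_$1"
  bosh_cmd="${BOSH:-bosh}"   # hypothetical override; BOSH="echo bosh" gives a dry run

  # Delete the VM first, and orphan the disk only if the deletion succeeded.
  $bosh_cmd -d "$deployment" delete-vm "$2" || return 1
  $bosh_cmd -d "$deployment" orphan-disk "$3"
}
```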
To start the former leader VM again and configure it as the new follower:
Create the former leader VM again by running:
bosh -d service-instance_GUID \
recreate \
mysql/INDEX
Where:
GUID
: This is the GUID of the leader-follower service instance retrieved above.
INDEX
: This is the index of the former leader VM that you are re-creating.
For example:
$ bosh -d service-instance_82ddc607-710a-404e-b1b8-a7e3ea7ec063 \
recreate \
mysql/ca0ed8b5-7590-4cde-bba8-7ca2935f2bd0
Set the former leader VM as a follower, using the same values as previously shown, by running:
bosh -d service-instance_GUID \
run-errand configure-leader-follower \
--instance=mysql/INDEX
For example:
$ bosh -d service-instance_82ddc607-710a-404e-b1b8-a7e3ea7ec063 \
run-errand configure-leader-follower \
--instance=mysql/ca0ed8b5-7590-4cde-bba8-7ca2935f2bd0
Use the BOSH CLI to run the inspect errand, using the same GUID as previously shown, by running:
bosh -d service-instance_GUID \
run-errand inspect
For example:
$ bosh -d service-instance_82ddc607-710a-404e-b1b8-a7e3ea7ec063 \
run-errand inspect
If the output displays one instance marked Role: leader and another instance marked Role: follower, then leader-follower replication and high availability are resumed. The deployment should be in its original, working state. If you wish, you can re-enable resurrection.
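The final check can also be scripted. The sketch below reads inspect output on standard input and succeeds only when it finds exactly one leader and one follower. The function name `replication_restored` is hypothetical, and the sketch assumes the roles appear as the literal strings `Role: leader` and `Role: follower`, as in the example output.

```shell
# Hypothetical helper: succeed only if the inspect output shows exactly one
# leader and exactly one follower.
# Usage: bosh -d service-instance_GUID run-errand inspect | replication_restored
replication_restored() {
  output=$(cat)
  # Count occurrences of each role; tr strips padding that some wc versions add.
  leaders=$(printf '%s\n' "$output" | grep -o 'Role: leader' | wc -l | tr -d ' ')
  followers=$(printf '%s\n' "$output" | grep -o 'Role: follower' | wc -l | tr -d ' ')
  [ "$leaders" -eq 1 ] && [ "$followers" -eq 1 ]
}
```

A non-zero exit status means the roles are not yet as expected, for example two leaders or a missing follower.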
Note If you have BOSH DNS enabled in Tanzu Operations Manager, you do not need to unbind and rebind your app to a leader-follower service instance to fail over the app. The operator activates BOSH DNS in BOSH Director > BOSH DNS Config.
To fail their apps over to the new leader VM, your developers must unbind and rebind their apps to the leader-follower service instance.
Important If a developer rebinds an app to the VMware SQL with MySQL for TAS service after unbinding, they must also rebind any existing custom schemas to the app. When you rebind an app, stored code, programs, and triggers break. For more information about binding custom schemas, see Use custom schemas.
To unbind and rebind your app:
Unbind the app from the leader-follower service instance by running:
cf unbind-service APP-NAME SERVICE-INSTANCE-NAME
Where:
APP-NAME
: This is the name of the app bound to the leader-follower service instance.
SERVICE-INSTANCE-NAME
: This is the name of the leader-follower service instance.
Rebind the app to the leader-follower service instance by running:
cf bind-service APP-NAME SERVICE-INSTANCE-NAME
Restage the app by running:
cf restage APP-NAME
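When several apps are bound to the same service instance, the three commands above can be wrapped in one function and called once per app. This is a minimal sketch: the function name `rebind_app` and the `CF` override variable are hypothetical. Setting `CF="echo cf"` prints the commands without running them.

```shell
# Hypothetical helper: unbind, rebind, and restage one app.
# Stops at the first failing step.
# Usage: rebind_app APP-NAME SERVICE-INSTANCE-NAME
rebind_app() {
  cf_cmd="${CF:-cf}"   # hypothetical override; CF="echo cf" gives a dry run
  $cf_cmd unbind-service "$1" "$2" &&
    $cf_cmd bind-service "$1" "$2" &&
    $cf_cmd restage "$1"
}

# Usage (requires a logged-in cf CLI targeting the right org and space):
#   for app in APP-NAME-1 APP-NAME-2; do
#     rebind_app "$app" SERVICE-INSTANCE-NAME || break
#   done
```

Chaining the steps with && means the app is not restaged if the rebind fails, which leaves the failure visible instead of restaging an unbound app.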