This topic describes database failover functionality for appliances running version 9.0.0 or later. For these appliances, failover is fully automated with one exception: when a service provider appliance shuts down unexpectedly, failover must still be performed manually.
Note: This topic only applies to appliances running version 9.0.0 or later. For information pertinent to older appliances, see Database Failover - Legacy Appliances.
Automated Failover
Automated failover functionality includes the following.
- Single datasource connection - All appliances now use a single datasource connection which is proxied through the new pgbouncer service. Pgbouncer is a connection pooler which maintains connections to the underlying postgres databases while providing a single connection point for the appliance services to use. This service also handles all failover and primary migrations; it is updated automatically to point to the new primary database, enabling a seamless transition between primaries.
- Controlled switchover of primary database on appliance shutdown or restart - During any guest-initiated shutdown or restart of the current primary database, the primary role is migrated to the other appliance in the HA pair in a controlled manner. This causes only minor disruption to the functioning of the appliances, primarily impacting desktop connections via the Blast (web) portal. Desktop connections made via the Horizon Client are unaffected. There is also a brief period of horizonadmin unavailability, and users requesting a desktop connection during the switchover might get an error that no desktops are available (but retrying shortly afterwards should allow them to connect to a desktop).
- Auto failover of primary database for unexpected failure of tenant appliance - Primary failures are now detected automatically. When a failure is detected, a request is sent to the service provider appliances to verify that the appliance is actually down, and the failover process is then initiated after a three-minute delay. This new failover process is less disruptive than past failover processes and leaves the slony cluster in a normal replicating state afterwards, so no slony reinit is required after a failover occurs.
These features are supported by the services described below.
Service | Description | Log location | Notes |
---|---|---|---|
Pgbouncer | Connection pooler that brokers database connections between the platform services and postgres. | /var/log/pgbouncer/ | |
Dbmonitor | Monitoring service that does the following: | /var/log/dbmonitor/ | |
Switchover | Script that runs on shutdown to perform the switchover action. | /var/log/desktone/slony-services | |
Resubscribe | Script that runs on startup to perform the resubscribe action. | /var/log/desktone/slony-services | |
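Because every appliance service connects through pgbouncer, the host that pgbouncer forwards to shows which database is currently treated as primary. The following is a minimal sketch for listing those hosts. It assumes the appliance's /etc/pgbouncer/pgbouncer.ini uses the standard PgBouncer layout, with a [databases] section containing entries of the form name = host=... port=...; this topic does not describe the actual contents of the file. The function name pgbouncer_db_hosts is hypothetical and is not part of the appliance tooling.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with the appliance).
# Prints "<database> <host>" for every entry in the [databases] section
# of a pgbouncer.ini-style file that has a host= setting.
pgbouncer_db_hosts() {
    ini=${1:-/etc/pgbouncer/pgbouncer.ini}
    awk '
        # A section header switches the [databases] flag on or off.
        /^[ \t]*\[/ { in_db = ($0 ~ /^[ \t]*\[databases\]/); next }
        # Skip comment lines (PgBouncer accepts ";" and "#").
        in_db && /^[ \t]*[;#]/ { next }
        in_db && /=/ {
            # Entry name is everything before the first "=".
            name = $0
            sub(/=.*/, "", name)
            gsub(/[ \t]/, "", name)
            # Pull the value of host= out of the connection string.
            if (match($0, /host=[^ \t]+/)) {
                print name, substr($0, RSTART + 5, RLENGTH - 5)
            }
        }
    ' "$ini"
}

# Example (run on the appliance as a user that can read the file):
#   pgbouncer_db_hosts /etc/pgbouncer/pgbouncer.ini
```

After a switchover or failover, every listed host should be the address of the new primary; an entry still showing the old address suggests the configuration was not updated.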
Manual Failover of Service Provider Appliance
If a service provider appliance shuts down unexpectedly, perform a manual failover by completing the steps below on the other service provider appliance.
- Run the failover-slony-master script (located in /usr/local/desktone/scripts/) as root:
failover-slony-master <database type> '<database password>'
where <database type> is fdb, edb, or avdb.
- Confirm that the pgbouncer.ini file is pointing to the front-end IP address of the current appliance:
/usr/local/desktone/scripts# grep '<IP address>' /etc/pgbouncer/pgbouncer.ini
- Reload the pgbouncer service:
service pgbouncer reload
- Confirm primary status and replication by running slony-status:
/usr/local/desktone/scripts# slony-status <org #>
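The confirm and reload steps above can be combined so that pgbouncer is reloaded only when the check passes. The following is a minimal sketch under that assumption. The function name pgbouncer_points_to and the FRONT_END_IP variable are hypothetical and are not part of the appliance tooling; the file path and the reload command are the ones given in the steps.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with the appliance).
# Succeeds only if the given IP address appears in the pgbouncer config.
pgbouncer_points_to() {
    ip=$1
    ini=${2:-/etc/pgbouncer/pgbouncer.ini}
    # -F treats the address as a literal string, so "." is not a wildcard.
    # -w requires a whole-word match, so 10.0.0.1 does not match 10.0.0.12.
    # -q suppresses output; only the exit status is used.
    grep -qFw -- "$ip" "$ini"
}

# Example usage as root (FRONT_END_IP is a placeholder you must set):
#   if pgbouncer_points_to "$FRONT_END_IP"; then
#       service pgbouncer reload
#   else
#       echo "pgbouncer.ini does not reference $FRONT_END_IP" >&2
#   fi
```

The whole-word match matters here because a plain grep for one address would also succeed on any longer address that begins with the same digits.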