VMware Postgres includes pg_auto_failover, an open source extension for PostgreSQL. pg_auto_failover facilitates the creation of a High Availability (HA) configuration, monitors replication between Postgres instances, and manages automatic failover for a group of Postgres nodes.
pg_auto_failover was introduced in VMware Postgres 12.6.1, 11.11.1, and 10.16.1.
This page describes:
The pg_auto_failover architecture requires three key components as a minimum:
This architecture guarantees availability of the Postgres service to users and applications, while automating maintenance operations.
For more details on the key architecture concepts of pg_auto_failover, see Architecture Basics in the pg_auto_failover documentation.
pg_auto_failover is included with the VMware Postgres release, and installed during the rpm
package installation. For details see Installing the Postgres Server RPM.
Install the VMware Postgres release on all three nominated hosts, the monitor, primary, and secondary.
pg_auto_failover uses the pg_autoctl
utility with subcommands to initialize, manage, and orchestrate the replicated environment. Login to each host, and confirm the pg_auto_failover installation by using:
$ pg_autoctl --version
For the pg_autoctl command reference see Main pg_autoctl commands.
Prior to creating the HA architecture in your environment, review the important decision points discussed in Configuring pg_auto_failover.
pg_auto_failover relies on the creation of a monitor service that manages one or several HA formations. A formation can contain a group of two or more Postgres nodes.
In the following examples, replace the example hostnames MONITOR-HOST
, HA-PRIMARY
, and HA-SECONDARY
with your network’s fully qualified hostnames.
Create the monitor service, similar to:
$ export PGDATA=/monitor
$ pg_autoctl create monitor --no-ssl --auth trust --run
where PGDATA=/monitor
is set here for example clarity. Alter the PGDATA
environment variable to reflect the data location on your monitor host.
IMPORTANT: This example creates a monitor in a test environment with no security restrictions (--auth trust
), which is not a recommended practice for production environments.
For further configuration details, including security options, refer to the example in the pg_auto_failover Run a Monitor tutorial or the pg_auto_failover Monitor page.
Add the new primary and secondary nodes to the formation.
On the monitor node, obtain the monitor connection URI, which is used to add the primary and the secondary nodes to the formation.
$ pg_autoctl show uri --formation monitor
postgres://autoctl_node@MONITOR-HOST/pg_auto_failover?sslmode=prefer
Login to the primary node, and bring up a Postgres instance, adding it to the HA formation:
$ pg_autoctl create postgres \
--auth trust \
--ssl-self-signed \
--pgdata /primary_data \
--monitor postgres://autoctl_node@MONITOR-HOST/pg_auto_failover?sslmode=prefer \
--run
NOTE: When creating the instance, ensure that the $PGDATA
directory is not a mount point, and that it can be recreated and deleted by the pg_autofailover user and Postgres. If $PGDATA
is a mountpoint for a filesystem, change it to be a directory in the mounted filesystem. For example: if a filesystem is mounted to /var/lib/psql/data
and $PGDATA
is set to /var/lib/pgsql/data
, change $PGDATA
to /var/lib/pgsql/data/data
.
Login to the secondary node, which does not contain any database files and is not initialized, and add it to the formation, similar to:
$ pg_autoctl create postgres \
--auth trust \
--ssl-self-signed \
--pgdata /secondary_data \
--monitor postgres://autoctl_node@MONITOR-HOST/pg_auto_failover?sslmode=prefer \
--run
For further examples, refer to the pg_auto_failover documentation Bring up the nodes, or the pg_auto_failover documentaton command reference for pg_autoctl create postgres.
On the monitor, confirm the primary and secondary node architecture, similar to:
$ pg_autoctl show state --pgdata monitor
Name | Node | Host:Port | LSN | Reachable | Current State | Assigned State
-------+-------+-------------------+-----------+-----------+-------------------+------------------
node_1 | 1 | HA-PRIMARY:5432 | 0/3000060 | yes | primary | primary
node_2 | 2 | HA-SECONDARY:5432 | 0/3000060 | yes | secondary | secondary
For further details, refer to pg_autoctl show state.
Verify the default formation and the postgres uri used to connect to the pg_auto_failover nodes.
$ pg_autoctl show uri
Type | Name | Connection String
-----------+---------+-------------------------------
monitor | monitor | postgres://autoctl_node@MONITOR-HOST:5432/pg_auto_failover?sslmode=prefer
formation | default |
For details see pg_autoctl show uri.
For examples of failover scenarios, refer to the pg_auto_failover documentation.
This topic describes the migration process from an existing Tanzu Postgres Patroni High Availability (HA) solution, to pg_auto_failover HA.
IMPORTANT: Moving from a Patroni HA configuration to pg_auto_failover requires system downtime.
Summary:
The documented migration process is based on a two node Patroni HA installation example, using VMware Postgres 12.6.1 instances. These prerequisites apply to the example scenario, so you must adjust the requirements to suit your environment. To upgrade a two-node VMware Postgres installation you require:
pg_dumpall
and psql
restore.(Optional) Take a backup of the Patroni cluster primary node, to ensure you can recover the initial state of the 12.6.1 database node.
(Optional) Save your Patroni configuration files on your local client. These settings will be used as a reference to configure your new pg_auto_failover environment.
Stop the Patroni cluster and the VMware Postgres 12.6.1 instances.
Install the VMware Postgres 15.1.0 release on each of the new nodes: primary, secondary, and pg_auto_failover monitor. For details see Installing the Postgres Server RPM.
Initialize and start the 15.1.0 VMware Postgres instance on the new primary node.
$ initdb -D /new_primary_data
where /new_primary_data
is the new database file location. Replace with your own values.
$ pg_ctl -D /new_primary_data start
From the new primary node, take a copy of the 12.6.1 data files using pg_dumpall
with a connection URI similar to:
$ pg_dumpall --dbname=postgres://<user>@<IP-address-old-primary>:<port-old-primary>/postgres?sslmode=prefer -f /home/postgres/primary_data.sql
where the URI protocol notation is in the format of <user>@<host>:<port>/<dbname>?optionkey=optionvalue
. For more details on the URI parameters, refer to Connection URIs in the PostgreSQL documentation.
For more details on pg_dumpall, refer to the PostgreSQL pg_dumpall documentation.
Using your preferred copy method (for example, scp
), copy the /home/postgres/primary_data.sql
file from the old primary node to the new primary. In this example, we used /tmp/new_primary_data.sql
as the target location in the new primary.
On the new primary node, use psql
to restore the data files, similar to:
$ psql postgres://<user>@<IP-of-new-primary>:<port of new primary>/postgres?sslmode=prefer -f /tmp/new_primary_data.sql
where /tmp/new_primary_data.sql
is the source file for the restore.
Inspect the data by running test queries.
Bring down the new 15.1.0 instance.
The following examples do not use any strict security settings. Use them only as guidelines.
On the new monitor host, create the pg_auto_failover monitor:
$ pg_autoctl create monitor \
--auth trust \
--ssl-self-signed \
--pgdata /monitor \
--run
Add the new primary node to the HA formation. Login to the new primary host, and execute pg_autoctl create postgres
similar to:
$ pg_autoctl create postgres \
--auth trust \
--ssl-self-signed \
--monitor postgres://autoctl_node@MONITOR-HOST/pg_auto_failover?sslmode=prefer \
--run
where MONITOR-HOST
is the monitor node example hostname, and 5000 is the example port the monitor instance is running on.
Login to the secondary node, which does not contain any database files and is not initialized, and add it to the HA formation, similar to:
$ pg_autoctl create postgres \
--auth trust \
--ssl-self-signed \
--pgdata /new_secondary_data \
--monitor postgres://autoctl_node@MONITOR-HOST/pg_auto_failover?sslmode=prefer \
--run
Verify the state of the configuration, using pg_autoctl show state
.
Create a test table on the new primary node and verify that it appears on the secondary node.
Adjust the pg_auto_failvoer settings to match your old Patroni environment, where applicable. Patroni configuration parameters do not map one to one with pg_auto_failover setup parameters.
Establish client connectivity to the new pg_auto_failover cluster and test your client applications.
When all tests are complete, you may drop the old hosts and their data.