Extracts all databases in a Greenplum Database system to a single script file or other archive file.
pg_dumpall [<connection-option> ...] [<dump_option> ...]
pg_dumpall -? | --help
pg_dumpall -V | --version
pg_dumpall
is a standard PostgreSQL utility for backing up all databases in a Greenplum Database (or PostgreSQL) instance, and is also supported in Greenplum Database. It creates a single (non-parallel) dump file. For routine backups of Greenplum Database it is better to use the Greenplum Database backup utility, gpbackup, for the best performance.
pg_dumpall
creates a single script file that contains SQL commands that can be used as input to psql to restore the databases. It does this by calling pg_dump for each database. pg_dumpall
also dumps global objects that are common to all databases. (pg_dump
does not save these objects.) This currently includes information about database users and groups, and access permissions that apply to databases as a whole.
Since pg_dumpall
reads tables from all databases you will most likely have to connect as a database superuser in order to produce a complete dump. Also you will need superuser privileges to run the saved script in order to be allowed to add users and groups, and to create databases.
The SQL script will be written to the standard output. Use the [-f | --file]
option or shell operators to redirect it into a file.
pg_dumpall
needs to connect several times to the Greenplum Database coordinator server (once per database). If you use password authentication it is likely to ask for a password each time. It is convenient to have a ~/.pgpass
file in such cases.
Dump Options
PGCLIENTENCODING
environment variable to the desired dump encoding.)
pg_dump
sends the output to standard out.
ALTER OWNER
or
SET SESSION AUTHORIZATION
statements to set ownership of created database objects. These statements will fail when the script is run unless it is started by a superuser (or the same user that owns all of the objects in the script). To make a script that can be restored by any user, but will give that user ownership of all the objects, specify
-O
. This option is only meaningful for the plain-text format. For the archive formats, you may specify the option when you call
pg_restore.
Specify the superuser user name to use when deactivating triggers. This is relevant only if --disable-triggers
is used. It is better to leave this out, and instead start the resulting script as a superuser.
NoteGreenplum Database does not support user-defined triggers.
[pg\_dump](pg_dump.html)
to output detailed object comments and start/stop times to the dump file, and progress messages to standard error.
pg_dumpall
version and exit.
GRANT/REVOKE
commands).
INSERT
commands with explicit column names (
INSERT INTO <table> (<column>, ...) VALUES ...
). This will make restoration very slow; it is mainly useful for making dumps that can be loaded into non-PostgreSQL-based databases. Also, since this option generates a separate command for each row, an error in reloading a row causes only that row to be lost rather than the entire table contents.
This option is relevant only when creating a data-only dump. It instructs pg_dumpall
to include commands to temporarily deactivate triggers on the target tables while the data is reloaded. Use this if you have triggers on the tables that you do not want to invoke during data reload. The commands emitted for --disable-triggers
must be done as superuser. So, you should also specify a superuser name with -S
, or preferably be careful to start the resulting script as a superuser.
NoteGreenplum Database does not support user-defined triggers.
extra_float_digits
when dumping floating-point data, instead of the maximum available precision. Routine dumps made for backup purposes should not use this option.
--exclude-database
switches. The pattern parameter is interpreted as a pattern according to the same rules used by
psql
's
\d
commands, so multiple databases can also be excluded by writing wildcard characters in the pattern. When using wildcards, be careful to quote the pattern if needed to prevent shell wildcard expansion.
IF EXISTS
clause) to drop databases and other objects. This option is not valid unless
--clean
is also specified.
INSERT
commands (rather than
COPY
). This will make restoration very slow; it is mainly useful for making dumps that can be loaded into non-PostgreSQL-based databases. Also, since this option generates a separate command for each row, an error in reloading a row causes only that row to be lost rather than the entire table contents. Note that the restore may fail altogether if you have rearranged column order. The
--column-inserts
option is safe against column order changes, though even slower.
COPY
or
INSERT
statements target the root of the partitioning hierarchy that contains it, rather than the partition itself. This causes the appropriate partition to be re-determined for each row when the data is loaded. This may be useful when restoring data on a server where rows do not always fall into the same partitions as they did on the original server. That could happen, for example, if the partitioning column is of type
text
and the two systems have different definitions of the collation used to sort the partitioning column.
SET statement_timeout
. Allowed values vary depending on the server version you are dumping from, but an integer number of milliseconds is accepted by all Greenplum Database versions.
pg_roles
instead of
pg_authid
. Consequently, this option also helps if access to
pg_authid
is restricted by some security policy.
pg_dumpall
waits for all files to be written safely to disk. Specifying this option causes
pg_dumpall
to return without waiting, which is faster, but implies that a subsequent operating system crash can leave the dump corrupt. This option is generally useful for testing, and should not be used when dumping data from a production installation.
ON CONFLICT DO NOTHING
to
INSERT
commands. This option is not valid unless
--inserts
or
--column-inserts
is also specified.
pg_dumpall
's, or when the output is intended to be loaded into a server of a different major version. By default,
pg_dumpall
quotes only identifiers that are reserved words in its own major version. This sometimes results in compatibility issues when dealing with servers of other versions that may have slightly different sets of reserved words. Using
--quote-all-identifiers
prevents such issues, at the price of a harder-to-read dump script.
INSERT
commands (rather than
COPY)
. Controls the maximum number of rows per
INSERT
command. The value specified must be a number greater than zero. Any error during restoring will cause only rows that are part of the problematic
INSERT
to be lost, rather than the entire table contents.
SET SESSION AUTHORIZATION
commands instead of
ALTER OWNER
commands to determine object ownership. This makes the dump more standards compatible, but depending on the history of the objects in the dump, may not restore properly. A dump using
SET SESSION AUTHORIZATION
requires superuser privileges to restore correctly, whereas
ALTER OWNER
requires lesser privileges.
CREATE TABLE
statements. This allows the distribution policy (
DISTRIBUTED BY
or
DISTRIBUTED RANDOMLY
clauses) of a Greenplum Database table to be dumped, which is useful for restoring into other Greenplum Database systems.
CREATE TABLE
statements.
pg_dumpall
command line arguments, and exit.
Connection Options
Specifies parameters used to connect to the server, as a connection string. See Connection Strings in the PostgreSQL documentation for more information.
--dbname
for consistency with other client applications, but because
pg_dumpall
needs to connect to many databases, the database name in the connection string will be ignored. Use the
-l
option to specify the name of the database used to dump global objects and to discover what other databases should be dumped.
PGHOST
or defaults to
localhost
.
postgres
database is used. If the
postgres
database does not exist, the
template1
database is used.
PGPORT
or defaults to
5432
.
PGUSER
or defaults to the current system role name.
.pgpass
file the connection attempt will fail. This option can be useful in batch jobs and scripts where no user is present to enter a password.
pg_dumpall
will automatically prompt for a password if the server demands password authentication. However,
pg_dumpall
will waste a connection attempt finding out that the server wants a password. In some cases it is worth typing
-W
to avoid the extra connection attempt.
~/.pgpass
file rather than to rely on manual password entry.
pg_dumpall
to issue a
SET ROLE <rolename>
command after connecting to the database. It is useful when the authenticated user (specified by
-U
) lacks privileges needed by
pg_dumpall
, but can switch to a role with the required rights. Some installations have a policy against logging in directly as a superuser, and use of this option allows dumps to be made without violating the policy.
always
,
auto
, and
never
.
This utility also uses the environment variables supported by libpq.
Since pg_dumpall
calls pg_dump internally, some diagnostic messages will refer to pg_dump
.
The --clean
option can be useful even when your intention is to restore the dump script into a fresh cluster. Use of --clean
authorizes the script to drop and re-create the built-in postgres
and template1
databases, ensuring that those databases will retain the same properties (for instance, locale and encoding) that they had in the source cluster. Without the option, those databases will retain their existing database-level properties, as well as any pre-existing contents.
Once restored, it is wise to run ANALYZE
on each database so the query planner has useful statistics. You can also run vacuumdb -a -z
to vacuum and analyze all databases.
Do no expect the dump script to run completely without errors. In particular, because the script issues CREATE ROLE
for every role existing in the source cluster, it is certain to generate a "role already exists" error for the bootstrap superuser, unless the destination cluster was initialized with a different bootstrap superuser name. This error is harmless and can be ignored. Use of the --clean
option is likely to produce additional harmless error messages about non-existent objects; minimize those by adding --if-exists
.
pg_dumpall
requires all needed tablespace directories to exist before the restore; otherwise, database creation will fail for databases in non-default locations.
To dump all databases:
pg_dumpall > db.out
To reload database(s) from this file, you can use:
psql template1 -f db.out
To dump only global objects (including resource queues):
pg_dumpall -g --resource-queues
Check pg_dump for details on possible error conditions.