Note: The DD Boost storage plugin is available only in the commercial release of VMware Greenplum Backup and Restore.
The DD Boost Storage Plugin can be used with the gpbackup and gprestore utilities to perform faster backups to the Dell EMC Data Domain storage appliance, which uses Dell EMC Data Domain Boost (DD Boost) software. The DD Boost Storage Plugin supports filtered restore, which increases performance by selectively reading and restoring only the subset of backup data that you specify via gprestore filter options.
You can also create disaster recovery scenarios using gpbackup or gpbackup_manager by replicating a backup on a separate, remote Data Domain appliance. See Replicating Backups for more information.
The DD Boost storage plugin is installed in the $GPHOME/bin directory of your Greenplum coordinator host when you add the gpbackup package to your installation.
To use the DD Boost storage plugin application, you first create a configuration file to specify the location of the plugin, the DD Boost login, and the backup location. For information about the configuration file, see DD Boost Storage Plugin Configuration File Format.
To run gpbackup or gprestore with the plugin, specify the configuration file with the --plugin-config option.
If you perform a backup operation with the gpbackup --plugin-config option, you must also specify the --plugin-config option when you restore the backup with gprestore.
The configuration file specifies the absolute path to the Greenplum Database DD Boost storage plugin executable, DD Boost connection credentials, and Data Domain location. The configuration file is required only on the coordinator host. The DD Boost storage plugin application must be in the same location on every Greenplum Database host.
The DD Boost storage plugin configuration file uses the YAML 1.1 document format and implements its own schema for specifying the DD Boost information.
The configuration file must be a valid YAML document. The gpbackup and gprestore utilities process the configuration file document in order and use indentation (spaces) to determine the document hierarchy and the relationships of the sections to one another. White space is significant: do not use it simply for formatting purposes, and do not use tabs at all.
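Because tabs are not allowed, it can help to scan a configuration file for them before passing it to gpbackup. The following is a minimal sketch, not part of the gpbackup tooling: check_no_tabs is a hypothetical helper name, and it checks only for tab characters, not for overall YAML validity.

```shell
# check_no_tabs is a hypothetical helper name, not part of gpbackup.
# It prints every line of the given file that contains a tab character
# and returns non-zero when it finds one.
check_no_tabs() {
  config_file="$1"
  if [ ! -r "$config_file" ]; then
    echo "error: cannot read $config_file" >&2
    return 2
  fi
  tab=$(printf '\t')   # a literal tab, portable across POSIX shells
  if grep -n "$tab" "$config_file"; then
    echo "error: tab characters found in $config_file" >&2
    return 1
  fi
  return 0
}
```

For example, after sourcing the function, run check_no_tabs with the path to your configuration file; a zero exit status means no tabs were found.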
This is the structure of a DD Boost storage plugin configuration file.
executablepath: <absolute-path-to-gpbackup_ddboost_plugin>
options:
  hostname: "<data-domain-host>"
  username: "<ddboost-ID>"
  password_encryption: "on" | "off"
  password: "<ddboost-pwd>"
  storage_unit: "<data-domain-id>"
  directory: "<data-domain-dir>"
  replication: "on" | "off"
  replication_streams: <integer>
  remote_hostname: "<remote-dd-host>"
  remote_username: "<remote-ddboost-ID>"
  remote_password_encryption: "on" | "off"
  remote_password: "<remote-dd-pwd>"
  remote_storage_unit: "<remote-dd-ID>"
  remote_directory: "<remote-dd-dir>"
  restore_subset: "on" | "off"
executablepath
Absolute path to the plugin executable, for example $GPHOME/bin/gpbackup_ddboost_plugin. The plugin must be in the same location on every Greenplum Database host.
options
Required. Begins the DD Boost storage plugin options section.
password_encryption
Specifies whether the password option value is encrypted. Default value is off. Use the gpbackup_manager encrypt-password command to encrypt the plain-text password for the DD Boost user. If the replication option is on, gpbackup_manager also encrypts the remote Data Domain user's password. Copy the encrypted password(s) from the gpbackup_manager output to the password options in the configuration file.
password
The password for the DD Boost user. If the password_encryption option is on, this is an encrypted password.
directory
The backup directory, /<data-domain-dir>, in the storage unit of the system.
Note: During a backup operation, the plugin creates the directory location if it does not exist in the storage unit and stores the backup in the directory /<data-domain-dir>/YYYYMMDD/YYYYMMDDHHMMSS/.
replication
Activates or deactivates backup replication when gpbackup performs a backup operation. Value is either on or off. Default value is off: backup replication is deactivated. When the value is on, the DD Boost plugin replicates the backup on the Data Domain system that you specify with the remote_* options.
Note: The replication option and remote_* options are ignored when performing a restore operation with gprestore. The remote_* options are ignored if replication is off.
Note: This option is ignored when you perform replication with the gpbackup_manager replicate-backup command. For information about replication, see Replicating Backups.
replication_streams
Used with the gpbackup_manager replicate-backup command, ignored otherwise. Specifies the maximum number of Data Domain I/O streams that can be used when replicating a backup set on a remote Data Domain server from the Data Domain server that contains the backup. Default value is 1.
Note: This option is ignored when you perform replication with gpbackup. The default value is used.
remote_password_encryption
Specifies whether the remote_password option value is encrypted. The default value is off. To set up password encryption, use the gpbackup_manager encrypt-password command to encrypt the plain-text passwords for the DD Boost user. If the replication parameter is on, gpbackup_manager also encrypts the remote Data Domain user's password. Copy the encrypted passwords from the gpbackup_manager output to the password options in the configuration file.
remote_password
The password for the remote Data Domain user. If the remote_password_encryption option is on, this is an encrypted password.
remote_directory
The backup directory, /<remote-dd-dir>, in the storage unit of the remote system.
Note: During a backup operation, the plugin creates the directory location if it does not exist in the storage unit of the remote Data Domain system and stores the replicated backup in the directory /<remote-dd-dir>/YYYYMMDD/YYYYMMDDHHMMSS/.
restore_subset
When the gpbackup and gprestore commands specify certain backup and filter conditions (see Filtered Restore with the DD Boost Storage Plugin), specifies whether gprestore should perform a filtered restore operation. The default value is on: perform a filtered restore. Set restore_subset to off to deactivate this optimization.
This is an example DD Boost storage plugin configuration file that is used in the next gpbackup example command. The name of the file is ddboost-test-config.yaml.
executablepath: $GPHOME/bin/gpbackup_ddboost_plugin
options:
  hostname: "192.0.2.230"
  username: "test-ddb-user"
  password: "asdf1234asdf"
  storage_unit: "gpdb-backup"
  directory: "test/backup"
This gpbackup example backs up the database demo using the DD Boost storage plugin. The absolute path to the DD Boost storage plugin configuration file is /home/gpadmin/ddboost-test-config.yaml.
gpbackup --dbname demo --single-data-file --no-compression --plugin-config /home/gpadmin/ddboost-test-config.yaml
The DD Boost storage plugin writes the backup files to this directory of the Data Domain storage unit gpdb-backup.
<directory>/backups/<datestamp>/<timestamp>
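Where <directory> is the value of the directory option in the configuration file, <datestamp> is the backup date in YYYYMMDD format, and <timestamp> is the backup time in YYYYMMDDHHMMSS format.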
For example:
/test/backup/<YYYYMMDD>/<YYYYMMDDHHMMSS>/
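To locate a particular backup on the storage unit, you can derive its directory from the directory option and the backup timestamp. The sketch below follows the /<directory>/<YYYYMMDD>/<YYYYMMDDHHMMSS>/ form shown in the example above; backup_dir_for is a hypothetical helper name and the timestamp is an invented value, so verify the result against your own Data Domain system.

```shell
# backup_dir_for is a hypothetical helper, not part of gpbackup.
# Arguments:
#   $1  value of the directory option, for example test/backup
#   $2  14-digit backup timestamp in YYYYMMDDHHMMSS format
backup_dir_for() {
  directory="$1"
  timestamp="$2"
  # The datestamp is the first eight characters (YYYYMMDD) of the timestamp.
  datestamp=$(printf '%s' "$timestamp" | cut -c1-8)
  printf '/%s/%s/%s/\n' "$directory" "$datestamp" "$timestamp"
}

# Invented timestamp, for illustration only.
backup_dir_for "test/backup" "20240115093000"
# prints /test/backup/20240115/20240115093000/
```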
This is an example DD Boost storage plugin configuration file that enables replication.
executablepath: $GPHOME/bin/gpbackup_ddboost_plugin
options:
  hostname: "192.0.2.230"
  username: "test-ddb-user"
  password: "asdf1234asdf"
  storage_unit: "gpdb-backup"
  directory: "test/backup"
  replication: "on"
  remote_hostname: "192.0.3.20"
  remote_username: "test-dd-remote"
  remote_password: "qwer2345erty"
  remote_storage_unit: "gpdb-remote"
  remote_directory: "test/replication"
To restore from the replicated backup in the previous example, you can run gprestore with the DD Boost storage plugin and specify a configuration file with this information.
executablepath: $GPHOME/bin/gpbackup_ddboost_plugin
options:
  hostname: "192.0.3.20"
  username: "test-dd-remote"
  password: "qwer2345erty"
  storage_unit: "gpdb-remote"
  directory: "test/replication"
Include these recommended flags when using the DD Boost storage plugin:
--no-compression, because compressed data does not allow DD Boost to do any deduplication.
--single-data-file, because multiple data files may cause additional overhead on the Data Domain file system, resulting in slower than optimal backup speed.
Filtered restore increases performance by reading and restoring only a subset of the backup data stored on the DD Boost storage system.
gprestore performs a filtered restore operation with the DD Boost Storage plugin when all of the following conditions hold:
You specify the --plugin-config ddboost-config.yml option when you invoke both the gpbackup and gprestore commands.
The backup is uncompressed and stored in a single data file (you ran the gpbackup command with the --no-compression and --single-data-file flags).
You specify filtering options (--include-table, --exclude-table, --include-table-file, or --exclude-table-file) on the gprestore command line.
The DD Boost Storage Plugin reads only the relations that you specify from the backup file on the DD Boost storage system, and restores them in Greenplum Database.
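As an illustration of a backup and restore pair that meets the filtered-restore conditions, the sketch below assembles the two commands and prints them rather than running them. The timestamp and the table name public.sales are invented for the example; substitute the timestamp reported by your own gpbackup run and a table that exists in your backup.

```shell
# Assumed values for illustration only; none come from a real system.
PLUGIN_CONFIG=/home/gpadmin/ddboost-config.yml   # same file for backup and restore
BACKUP_TIMESTAMP=20240115093000                  # hypothetical backup timestamp
TABLE=public.sales                               # hypothetical table to restore

# Backup side: uncompressed, single data file, taken through the plugin.
backup_cmd="gpbackup --dbname demo --single-data-file --no-compression --plugin-config $PLUGIN_CONFIG"

# Restore side: same plugin configuration plus a table filter.
restore_cmd="gprestore --timestamp $BACKUP_TIMESTAMP --plugin-config $PLUGIN_CONFIG --include-table $TABLE"

# Print the commands instead of running them, so the sketch is safe to try anywhere.
printf '%s\n' "$backup_cmd" "$restore_cmd"
```

Filtered restore also depends on restore_subset being left at its default value of on in the configuration file.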
Dell EMC DD Boost is integrated with VMware Greenplum and requires a DD Boost license. Open source Greenplum Database cannot use the DD Boost software, but can back up to a Dell EMC Data Domain system mounted as an NFS share on the Greenplum coordinator and segment hosts.