Note: The DD Boost storage plugin is available only in the commercial release of VMware Greenplum Backup and Restore.

The DD Boost Storage Plugin can be used with the gpbackup and gprestore utilities to perform faster backups to the Dell EMC Data Domain storage appliance, which uses Dell EMC Data Domain Boost (DD Boost) software. The DD Boost Storage Plugin supports filtered restore, which increases performance by selectively reading and restoring only the subset of backup data that you specify via gprestore filter options.

You can also create disaster recovery scenarios using gpbackup or gpbackup_manager by replicating a backup on a separate, remote Data Domain appliance. See Replicating Backups for more information.

The DD Boost storage plugin is installed in the $GPHOME/bin directory of your Greenplum coordinator host when you add the gpbackup package to your installation.

To use the DD Boost storage plugin application, you first create a configuration file to specify the location of the plugin, the DD Boost login, and the backup location. For information about the configuration file, see DD Boost Storage Plugin Configuration File Format.

To run gpbackup or gprestore with the plugin, specify the configuration file with the option --plugin-config.

If you perform a backup operation with the gpbackup option --plugin-config, you must also specify the --plugin-config option when you restore the backup with gprestore.
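
The pairing of the two commands can be sketched as follows. This is a minimal illustration rather than a prescribed procedure: the configuration file path and the timestamp value are hypothetical placeholders, and the script only prints the commands so that you can review them before running them on the coordinator host.

```shell
# Hypothetical values for illustration only.
PLUGIN_CONFIG=/home/gpadmin/ddboost-test-config.yaml
BACKUP_TIMESTAMP=20240101120000   # placeholder; use the timestamp that gpbackup reports

# Both commands reference the same plugin configuration file.
BACKUP_CMD="gpbackup --dbname demo --plugin-config $PLUGIN_CONFIG"
RESTORE_CMD="gprestore --timestamp $BACKUP_TIMESTAMP --plugin-config $PLUGIN_CONFIG"

# Print the commands for review instead of running them.
echo "$BACKUP_CMD"
echo "$RESTORE_CMD"
```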

DD Boost Storage Plugin Configuration File Format

The configuration file specifies the absolute path to the Greenplum Database DD Boost storage plugin executable, DD Boost connection credentials, and Data Domain location. The configuration file is required only on the coordinator host. The DD Boost storage plugin application must be in the same location on every Greenplum Database host.

The DD Boost storage plugin configuration file uses the YAML 1.1 document format and implements its own schema for specifying the DD Boost information.

The configuration file must be a valid YAML document. The gpbackup and gprestore utilities process the configuration file document in order and use indentation (spaces) to determine the document hierarchy and the relationships of the sections to one another. The use of white space is significant. White space should not be used simply for formatting purposes, and tabs should not be used at all.
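
Because tabs are not allowed, it can help to scan a configuration file for tab characters before using it. The following is a small sketch of such a check, not part of the plugin itself: it writes a sample configuration (with placeholder values) to a temporary file and reports whether any line contains a tab.

```shell
# Write a sample configuration to a temporary file (all values are placeholders).
CONFIG=$(mktemp)
cat > "$CONFIG" <<'EOF'
executablepath: $GPHOME/bin/gpbackup_ddboost_plugin
options:
  hostname: "192.0.2.230"
  username: "test-ddb-user"
  password: "asdf1234asdf"
  storage_unit: "gpdb-backup"
  directory: "test/backup"
EOF

# YAML indentation must use spaces; list any line that contains a tab character.
TAB=$(printf '\t')
if grep -n "$TAB" "$CONFIG"; then
  CONFIG_STATUS="tabs found"
else
  CONFIG_STATUS="no tabs"
fi
echo "$CONFIG_STATUS"

# Remove the temporary file.
rm -f "$CONFIG"
```

To check your own file, point CONFIG at its path instead of creating a temporary file, and omit the final rm command.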

This is the structure of a DD Boost storage plugin configuration file.

executablepath: <absolute-path-to-gpbackup_ddboost_plugin>
options:
  hostname: "<data-domain-host>"
  username: "<ddboost-ID>"
  password_encryption: "on" | "off"
  password: "<ddboost-pwd>"
  storage_unit: "<data-domain-id>"
  directory: "<data-domain-dir>"
  replication: "on" | "off"
  replication_streams: <integer>
  remote_hostname: "<remote-dd-host>"
  remote_username: "<remote-ddboost-ID>"
  remote_password_encryption: "on" | "off"
  remote_password: "<remote-dd-pwd>"
  remote_storage_unit: "<remote-dd-ID>"
  remote_directory: "<remote-dd-dir>"
  restore_subset: "on" | "off"
executablepath
Required. Absolute path to the plugin executable. For example, the VMware Greenplum installation location is $GPHOME/bin/gpbackup_ddboost_plugin. The plugin must be in the same location on every Greenplum Database host.
options

Required. Begins the DD Boost storage plugin options section.

hostname
Required. The IP address or hostname of the Data Domain host. There is a 30-character limit.
username
Required. The Data Domain Boost user name. There is a 30-character limit.
password_encryption
Optional. Specifies whether the password option value is encrypted. Default value is off. Use the gpbackup_manager encrypt-password command to encrypt the plain-text password for the DD Boost user. If the replication option is on, gpbackup_manager also encrypts the remote Data Domain user's password. Copy the encrypted password(s) from the gpbackup_manager output to the password options in the configuration file.
password
Required. The passcode for the DD Boost user to access the Data Domain storage unit. If the password_encryption option is on, this is an encrypted password.
storage_unit
Required. A valid storage unit name for the Data Domain system that is used for backup and restore operations.
directory
Required. The location for the backup files, configuration files, and global objects on the Data Domain system. The location on the system is /<data-domain-dir> in the storage unit of the system.

During a backup operation, the plugin creates the directory location if it does not exist in the storage unit and stores the backup in the directory /<data-domain-dir>/YYYYMMDD/YYYYMMDDHHMMSS/.

replication
Optional. Activates or deactivates backup replication with DD Boost managed file replication when gpbackup performs a backup operation. Value is either on or off. The default value is off (backup replication is deactivated). When the value is on, the DD Boost plugin replicates the backup on the Data Domain system that you specify with the remote_* options.

The replication option and remote_* options are ignored when performing a restore operation with gprestore. The remote_* options are ignored if replication is off.

This option is ignored when you perform replication with the gpbackup_manager replicate-backup command. For information about replication, see Replicating Backups.

replication_streams
Optional. Used with the gpbackup_manager replicate-backup command, ignored otherwise. Specifies the maximum number of Data Domain I/O streams that can be used when replicating a backup set on a remote Data Domain server from the Data Domain server that contains the backup. Default value is 1.

This option is ignored when you perform replication with gpbackup. The default value is used.

remote_hostname
Required when performing replication. The IP address or hostname of the Data Domain system that is used for remote backup storage. There is a 30-character limit.
remote_username
Required when performing replication. The Data Domain Boost user name that accesses the remote Data Domain system. There is a 30-character limit.
remote_password_encryption
Optional when performing replication. Specifies whether the remote_password option value is encrypted. The default value is off. To set up password encryption, use the gpbackup_manager encrypt-password command to encrypt the plain-text password for the DD Boost user. If the replication option is on, gpbackup_manager also encrypts the remote Data Domain user's password. Copy the encrypted passwords from the gpbackup_manager output to the password options in the configuration file.
remote_password
Required when performing replication. The passcode for the DD Boost user to access the Data Domain storage unit on the remote system. If the remote_password_encryption option is on, this is an encrypted password.
remote_storage_unit
Required when performing replication. A valid storage unit name for the remote Data Domain system that is used for backup replication.
remote_directory
Required when performing replication. The location for the replicated backup files, configuration files, and global objects on the remote Data Domain system. The location on the system is /<remote-dd-dir> in the storage unit of the remote system.

During a backup operation, the plugin creates the directory location if it does not exist in the storage unit of the remote Data Domain system and stores the replicated backup in the directory /<remote-dd-dir>/YYYYMMDD/YYYYMMDDHHMMSS/.

restore_subset
Optional. Specifies whether gprestore performs a filtered restore operation when the gpbackup and gprestore commands meet certain backup and filter conditions (see Filtered Restore with the DD Boost Storage Plugin). The default value is on (perform a filtered restore). Set restore_subset to off to deactivate this optimization.
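
Several of the options above (hostname, username, remote_hostname, and remote_username) have a 30-character limit. The following sketch shows one way to check candidate values before writing them to the configuration file. The within_limit helper is a hypothetical name introduced for this illustration, and the values are the sample ones used in the examples in this topic.

```shell
# Hypothetical helper: succeeds when the value fits the documented 30-character limit.
within_limit() {
  [ "${#1}" -le 30 ]
}

DD_HOSTNAME="192.0.2.230"      # sample hostname value
DD_USERNAME="test-ddb-user"    # sample username value

# Report each value that is within or over the limit.
for value in "$DD_HOSTNAME" "$DD_USERNAME"; do
  if within_limit "$value"; then
    echo "ok: $value"
  else
    echo "too long (over 30 characters): $value"
  fi
done
```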

Examples

This is an example DD Boost storage plugin configuration file that is used in the next gpbackup example command. The name of the file is ddboost-test-config.yaml.

executablepath: $GPHOME/bin/gpbackup_ddboost_plugin
options:
  hostname: "192.0.2.230"
  username: "test-ddb-user"
  password: "asdf1234asdf"
  storage_unit: "gpdb-backup"
  directory: "test/backup"

This gpbackup example backs up the database demo using the DD Boost storage plugin. The absolute path to the DD Boost storage plugin configuration file is /home/gpadmin/ddboost-test-config.yaml.

gpbackup --dbname demo --single-data-file --no-compression --plugin-config /home/gpadmin/ddboost-test-config.yaml

The DD Boost storage plugin writes the backup files to this directory of the Data Domain storage unit gpdb-backup.

<directory>/backups/<datestamp>/<timestamp>

Where:

  • <directory> is the location you specified in the DD Boost configuration file.
  • <datestamp> is the backup date stamp.
  • <timestamp> is the backup time stamp.

For example:

/test/backup/<YYYYMMDD>/<YYYYMMDDHHMMSS>/
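
As a rough illustration of how the example path above is composed, the following sketch builds it from the directory option and a backup timestamp. The timestamp is a hypothetical value, and the sketch assumes that the date stamp is the first eight characters (YYYYMMDD) of the timestamp (YYYYMMDDHHMMSS).

```shell
DIRECTORY="test/backup"       # directory option from the example configuration file
TIMESTAMP="20240101120000"    # hypothetical backup timestamp (YYYYMMDDHHMMSS)

# Assumption: the date stamp is the first eight characters of the timestamp.
DATESTAMP=$(printf '%s' "$TIMESTAMP" | cut -c1-8)

# Compose the path in the same form as the example above.
BACKUP_PATH="/$DIRECTORY/$DATESTAMP/$TIMESTAMP/"
echo "$BACKUP_PATH"
```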

This is an example DD Boost storage plugin configuration file that enables replication.

executablepath: $GPHOME/bin/gpbackup_ddboost_plugin
options:
  hostname: "192.0.2.230"
  username: "test-ddb-user"
  password: "asdf1234asdf"
  storage_unit: "gpdb-backup"
  directory: "test/backup"
  replication: "on"
  remote_hostname: "192.0.3.20"
  remote_username: "test-dd-remote"
  remote_password: "qwer2345erty"
  remote_storage_unit: "gpdb-remote"
  remote_directory: "test/replication"

To restore from the replicated backup in the previous example, you can run gprestore with the DD Boost storage plugin and specify a configuration file with this information.

executablepath: $GPHOME/bin/gpbackup_ddboost_plugin
options:
  hostname: "192.0.3.20"
  username: "test-dd-remote"
  password: "qwer2345erty"
  storage_unit: "gpdb-remote"
  directory: "test/replication"

Best Practices

Include these recommended flags when using the DD Boost storage plugin:

  • --no-compression, because compressed data does not allow DD Boost to do any deduplication.
  • --single-data-file, because multiple data files may cause additional overhead on the Data Domain file system, resulting in slower than optimal backup speed.

Filtered Restore with the DD Boost Storage Plugin

Filtered restore increases performance by reading and restoring only a subset of the backup data stored on the DD Boost storage system.

gprestore performs a filtered restore operation with the DD Boost Storage plugin when all of the following conditions hold:

  • You specify the --plugin-config ddboost-config.yml option when you invoke both the gpbackup and gprestore commands.
  • The backup is an uncompressed, single-data-file backup (you invoked the gpbackup command with the --no-compression and --single-data-file flags).
  • You specify filtering options (--include-table, --exclude-table, --include-table-file, or --exclude-table-file) on the gprestore command line.

The DD Boost Storage Plugin reads only the relations that you specify from the backup file on the DD Boost storage system, and restores them in Greenplum Database.
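
A command pair that meets these conditions might look like the following sketch. The timestamp and the table name public.sales are hypothetical placeholders, and the script only prints the commands for review rather than running them.

```shell
# Hypothetical values for illustration only.
PLUGIN_CONFIG=/home/gpadmin/ddboost-config.yml
BACKUP_TIMESTAMP=20240101120000   # placeholder; use the timestamp that gpbackup reports
TABLE=public.sales                # hypothetical table to restore

# Backup: uncompressed, single data file, with the DD Boost plugin configuration.
BACKUP_CMD="gpbackup --dbname demo --no-compression --single-data-file --plugin-config $PLUGIN_CONFIG"

# Restore: same plugin configuration, plus a filtering option.
RESTORE_CMD="gprestore --timestamp $BACKUP_TIMESTAMP --plugin-config $PLUGIN_CONFIG --include-table $TABLE"

echo "$BACKUP_CMD"
echo "$RESTORE_CMD"
```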

Notes

Dell EMC DD Boost is integrated with VMware Greenplum and requires a DD Boost license. Open source Greenplum Database cannot use the DD Boost software, but can back up to a Dell EMC Data Domain system mounted as an NFS share on the Greenplum coordinator and segment hosts.

