Supported Platforms

The VMware Greenplum Data Copy Utility is compatible with these VMware Greenplum versions:

  • VMware Greenplum 5.9 and later
  • VMware Greenplum 6.x
  • VMware Greenplum 7 Beta 3+

Version 2.7.0

Release Date: July 12, 2024

VMware Greenplum Data Copy Utility version 2.7.0 is a minor release that includes new and changed features and bug fixes.

Features

gpcopy 2.7.0 includes the following new and changed features:

  • VMware Greenplum Data Copy now supports YAML format for specifying options, which avoids long or complicated command lines.
  • VMware Greenplum Data Copy provides the gpcopy job subcommand, which lets you observe and control the number of parallel tasks.
  • Updated dependency versions:
    • Golang 1.2.1
    • ginkgo 2.19.0
    • golangci-lint 1.57.2

Resolved Issues

  • [N/A] Resolves an issue where redistribution was not triggered automatically when the distribution keys for the source and destination were different.
  • [N/A] Resolves an issue where IncludeTableFile and IncludeTableJson had the wrong type.
  • [N/A] Resolves an issue where the partitioned table was not truncated when the flag --truncate-source-after was set to on.
  • [N/A] Resolves an issue where a schema in the destination had the wrong owner.
  • [N/A] Resolves an issue where gpcopy_helper timed out when dialing a local IPv6 address.

Known Issues and Limitations

  • GPDB 6.x and earlier support append-optimized (AO) tables with COMPRESSTYPE=quicklz, but GPDB 7.x does not. If you run gpcopy to copy an AO table with COMPRESSTYPE=quicklz from GPDB 6.x or earlier to GPDB 7.x, gpcopy sets gp_quicklz_fallback to true, and the COMPRESSTYPE becomes zstd in the destination.

  • Running gpcopy to copy data from a cluster running a higher version of Greenplum to a cluster running a lower version is not supported; gpcopy displays a warning message.

  • If gpcopy is not running on the coordinator node of the source GPDB cluster, it displays the warning message "It is recommended to run gpcopy on the coordinator node of the source GPDB cluster".

Version 2.6.0

Release Date: August 18, 2023

VMware Greenplum Data Copy Utility version 2.6.0 is a minor release that includes new and changed features and bug fixes.

Features

gpcopy 2.6.0 includes the following new and changed features:

  • In previous versions, gpcopy consulted the Greenplum version number of the source and destination clusters, the number of segments in the source and destination clusters, and the hash key data type to determine whether or not to redistribute the target table data. gpcopy now additionally consults the gp_use_legacy_hashops server configuration parameter in this check.
  • The default value of --on-segment-threshold is changed from 10000 to -1, which instructs gpcopy to copy the table data using the source and destination Greenplum Database segment instances.
  • The --on-segment-threshold option now accepts the value -2, which instructs gpcopy to copy the table data using the source and destination Greenplum Database coordinators.
  • gpcopy now gives higher priority to the exclude list, and when a partitioned table is excluded, gpcopy excludes its child partitions as well.
  • gpcopy introduces a new --snapshot <snapshot_id> option. You can use this option to specify the snapshot identifier of the transaction in which you want gpcopy to run the copy operation. Refer to About Specifying a Transaction Snapshot for more information.
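
The special values of --on-segment-threshold described above can be summarized in a small decision function. This is only an illustrative sketch: the function name copy_path and the returned labels are hypothetical, and the handling of non-negative values (tables at or below the threshold row count go through the coordinators) is an assumption that these notes do not state.

```python
def copy_path(threshold: int, row_count: int) -> str:
    """Return which instances would carry the table data (illustrative only)."""
    if threshold == -1:
        # Default since 2.6.0: always copy using the segment instances.
        return "segments"
    if threshold == -2:
        # New in 2.6.0: always copy using the coordinators.
        return "coordinators"
    if threshold < 0:
        raise ValueError(f"unsupported threshold: {threshold}")
    # ASSUMPTION (not stated in these notes): a non-negative threshold is a
    # row count, and tables at or below it are copied via the coordinators.
    return "coordinators" if row_count <= threshold else "segments"


if __name__ == "__main__":
    for threshold in (-1, -2, 10000):
        print(threshold, copy_path(threshold, row_count=500))
```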

Resolved Issues

  • [32955] Resolves an issue where gpcopy did not exclude child partitions when the parent partitioned table was excluded. gpcopy now gives higher priority to the exclude list, and when a partitioned table is excluded, gpcopy excludes its child partitions as well.

  • [N/A] Resolves an issue where gpcopy returned the error distribution key doesn't belong to segment with ID <num>, it belongs to segment with ID <other_num> when the source and destination Greenplum Database clusters were configured to use different hashing algorithms for the table distribution key(s). gpcopy now consults the gp_use_legacy_hashops server configuration parameter when it creates the target table.
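
The exclude-list behavior described for issue 32955 can be pictured as a filter in which exclusion always wins and propagates to child partitions. The following is a minimal sketch under stated assumptions: the function tables_to_copy, the table names, and the children mapping are all hypothetical, and gpcopy's internal implementation is not described in these notes.

```python
def tables_to_copy(included, excluded, children):
    """Filter `included`, giving the exclude list priority.

    `children` maps a partitioned table to its child partitions
    (hypothetical input; a real tool would read this from the catalog).
    """
    blocked = set()
    stack = list(excluded)
    while stack:
        name = stack.pop()
        if name in blocked:
            continue
        blocked.add(name)
        # Excluding a parent also excludes every partition beneath it.
        stack.extend(children.get(name, ()))
    return [table for table in included if table not in blocked]


if __name__ == "__main__":
    children = {"public.sales": ["public.sales_1_prt_jan", "public.sales_1_prt_feb"]}
    included = ["public.sales", "public.sales_1_prt_jan",
                "public.sales_1_prt_feb", "public.customers"]
    print(tables_to_copy(included, ["public.sales"], children))
```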

Version 2.5.0

Release Date: May 15, 2023

VMware Greenplum Data Copy Utility version 2.5.0 is a minor release that includes new and changed features.

Note

This version of the VMware Greenplum Data Copy Utility documentation replaces the term master with the term coordinator.

Features

gpcopy 2.5.0 includes the following new and changed features:

  • Adds support for VMware Greenplum version 7 Beta 3+.

    gpcopy now supports copying data from VMware Greenplum versions 5 and 6 to VMware Greenplum 7 Beta 3+.

  • Updates the Go library dependency to version 1.19.

  • Updates supporting library dependencies.

  • Because the default SSL mode of an updated library dependency does not support prefer, you must set PGSSLMODE to disable when one of the Greenplum clusters is configured for encryption and the other is not.
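
As a sketch of the PGSSLMODE requirement above, a wrapper script could prepare the environment before it launches gpcopy. The helper gpcopy_env and its two boolean parameters are hypothetical; only the PGSSLMODE variable and the value disable come from these notes.

```python
import os


def gpcopy_env(source_encrypted: bool, dest_encrypted: bool, base=None) -> dict:
    """Build an environment mapping for launching gpcopy (illustrative only)."""
    env = dict(os.environ if base is None else base)
    if source_encrypted != dest_encrypted:
        # Exactly one cluster is configured for encryption, so the notes
        # require PGSSLMODE=disable.
        env["PGSSLMODE"] = "disable"
    return env


if __name__ == "__main__":
    env = gpcopy_env(source_encrypted=True, dest_encrypted=False, base={})
    print(env)
```

The returned mapping could then be passed as the env argument of subprocess.run when the wrapper starts gpcopy.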

Version 2.4.1

Release Date: December 19, 2022

gpcopy version 2.4.1 is a maintenance release that includes a single bug fix.

Resolved Issues

  • [N/A] Resolves an issue where the gpcopy version 2.4.0 download packages available on Broadcom Support Portal were corrupt.

Version 2.4.0

Release Date: December 9, 2022

VMware Greenplum Data Copy Utility version 2.4.0 is a minor release that includes new features and a bug fix.

Features

gpcopy 2.4.0 includes the following new features:

  • gpcopy now supports SSL/TLS encryption on the data channel between the source and destination Greenplum Database clusters. This feature relies on an update to the gpcopy_helper utility. About SSL/TLS Encryption on the Data Channel describes how to direct gpcopy to use this encryption method.
  • When the pg_hba.conf file specifies password authentication, gpcopy can now obtain the connection password for the source and/or destination Greenplum Database user from the value of the PGPASSWORD or the PGPASSFILE environment variable; refer to About Connecting Using Password Authentication for more information.
  • When the pg_hba.conf file specifies the SSL/TLS connection type, gpcopy can now initiate an SSL-encrypted connection to the source and/or destination Greenplum Database cluster; refer to About Connecting Using SSL/TLS for more information and configuration details.

Resolved Issues

  • [32528] Resolves an issue where gpcopy returned the error relation <name> already exists when it failed to copy a partitioned table that was created with an explicit sequence column, and the owner of the sequence was since altered.

Version 2.3.2

Release Date: April 13, 2022

gpcopy version 2.3.2 is a maintenance release that includes changes and resolves several issues.

Changed Features

  • gpcopy now respects the case sensitivity of database, schema, and table names that you specify with --include-table[-xxx], --exclude-table[-xxx], and --dest-table options when you enclose the individual name in double quotes. For example:

    --include-table '"testdb"."Schema"."T1"'
    

    gpcopy always transforms unquoted names to lower case.

  • gpcopy relaxes the gpcopy_helper version check.
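
The quoting rule above (quoted names keep their case, unquoted names are folded to lower case) can be illustrated with a small parser. This is a minimal sketch and not gpcopy's own code: the function normalize_name is hypothetical, and the treatment of a doubled quote inside a quoted name as an escaped quote is an assumption borrowed from SQL identifier syntax.

```python
def normalize_name(qualified: str) -> list[str]:
    """Split a dotted name and apply the case rule to each part."""
    parts, buf = [], []
    in_quotes = False   # currently inside "..."
    quoted = False      # the current part contained a quoted section
    i = 0
    while i < len(qualified):
        ch = qualified[i]
        if ch == '"':
            if in_quotes and i + 1 < len(qualified) and qualified[i + 1] == '"':
                buf.append('"')  # ASSUMPTION: "" is an escaped quote
                i += 1
            else:
                in_quotes = not in_quotes
                quoted = True
        elif ch == "." and not in_quotes:
            text = "".join(buf)
            parts.append(text if quoted else text.lower())
            buf, quoted = [], False
        else:
            buf.append(ch)
        i += 1
    if in_quotes:
        raise ValueError("unterminated quoted name")
    text = "".join(buf)
    parts.append(text if quoted else text.lower())
    return parts


if __name__ == "__main__":
    print(normalize_name('"testdb"."Schema"."T1"'))  # ['testdb', 'Schema', 'T1']
    print(normalize_name("TestDB.Schema.T1"))        # ['testdb', 'schema', 't1']
```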

Resolved Issues

  • gpcopy updates the version of Go that it uses to build its CLI tool to version 1.17.6 to mitigate CVE-2021-44716.
  • [31760] Resolves an issue where gpcopy did not respect the case-sensitivity of database, schema, and table names specified via the --include-table[-xxx], --exclude-table[-xxx], and --dest-table options when the name was enclosed in double quotes.
  • [31680] Resolves an issue where gpcopy did not terminate queries and the helper daemon when it failed to create cleanup files in the current working directory.
  • [30925] Resolves an issue where gpcopy did not print a summary report when a copy operation was cancelled with a Ctrl-C.

Version 2.3.1

gpcopy version 2.3.1 is a maintenance release that resolves several issues.

Resolved Issues

  • [N/A] To help with debugging, gpcopy now prints additional logging information:

    • The source and destination cluster versions are now displayed when the utility initializes at startup. For example:

      Initializing gpcopy
      Source cluster version: 6.9.0+dev.30.ge53fbea1b0 build dev
      Destination cluster version: 6.9.0+dev.30.ge53fbea1b0 build dev
      
    • The results of IP resolution for each destination segment are displayed with the message: Resolving destination segments hostname IP address results.

  • [31467] Fixed an issue where the count validation could fail if the source and destination tables did not distribute the data in the same way (for example, for randomly-distributed tables).

  • [31467] Fixed an issue that could cause an md5xor validation failure if a row had 64KB or more of CSV data.

    Note: You must update the gpcopy_helper utility to version 2.3.1 on every segment in order to apply this fix.

  • [31309] Fixed a crash that could occur if gpcopy did not have permission to write copy results to a file. The resulting crash could prevent the gpcopy_helper utility from terminating correctly on every segment.

  • [178122513] Fixed an issue where gpcopy did not handle DISTRIBUTED REPLICATED tables correctly, allowing duplicated data to be copied.
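
The name md5xor suggests an order-independent checksum: hash each row with MD5 and XOR the digests together, so that the result does not depend on how rows are spread across segments. The sketch below illustrates that general technique only. These notes do not describe gpcopy's actual implementation, and the function name, the UTF-8 encoding, and the CSV-style row strings are assumptions.

```python
import hashlib


def md5xor(rows) -> str:
    """XOR the MD5 digests of all rows and return the result as hex."""
    acc = 0
    for row in rows:
        digest = hashlib.md5(row.encode("utf-8")).digest()
        # XOR is commutative and associative, so row order is irrelevant.
        acc ^= int.from_bytes(digest, "big")
    return acc.to_bytes(16, "big").hex()


if __name__ == "__main__":
    source = ["1,alice", "2,bob", "3,carol"]
    dest = ["3,carol", "1,alice", "2,bob"]  # same rows, different order
    print(md5xor(source) == md5xor(dest))
```

One known weakness of any XOR-based checksum is that a row appearing an even number of times cancels itself out, which is one reason to pair it with a row count check.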

Version 2.3.0

gpcopy version 2.3.0 is a minor release that adds features and resolves several issues.

Features

  • The --timeout option specifies the maximum time in seconds to wait until both source and destination systems are ready for data transfer. The default is 30 seconds. A value of 0 deactivates the timeout.
  • For a copy operation, gpcopy lists the tables and views that were successfully copied to the destination system in the text file gpcopy_date_success.list in the ~/gpAdminLogs directory on the coordinator host.
  • If gpcopy fails to copy tables or views, the utility creates a text file gpcopy_date_failure.list that lists the failed tables or views in the ~/gpAdminLogs directory on the coordinator host. After resolving issues that caused the failures, you can run gpcopy with the --include-table-file option to copy the tables or views that were not copied.
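
The retry workflow above can be sketched as a helper that finds the most recent failure list and builds a follow-up command. This is a sketch under stated assumptions: the glob pattern gpcopy_*_failure.list stands in for the date portion of the file name, whose exact format these notes do not give, and the connection_args parameter is a hypothetical placeholder for whatever connection options the original run used.

```python
from pathlib import Path


def latest_failure_list(log_dir):
    """Return the most recently modified failure list in log_dir, or None."""
    # ASSUMPTION: "*" matches the date portion of gpcopy_date_failure.list.
    candidates = sorted(
        Path(log_dir).expanduser().glob("gpcopy_*_failure.list"),
        key=lambda path: path.stat().st_mtime,
    )
    return candidates[-1] if candidates else None


def retry_command(log_dir, connection_args):
    """Build an argument list that re-copies the tables that failed."""
    path = latest_failure_list(log_dir)
    if path is None:
        return None  # nothing failed, so there is nothing to retry
    return ["gpcopy", *connection_args, "--include-table-file", str(path)]


if __name__ == "__main__":
    print(retry_command("~/gpAdminLogs", []))
```

The returned list is only constructed, not executed, so you can review it before running the retry.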

Resolved Issues

  • [30675] In some cases, copying an append-optimized partitioned table in parallel with the gpcopy options --truncate and --parallelize-leaf-partitions=true failed with the error message deadlock detected.
  • [30720, 30703] If the gpcopy destination cluster was busy processing requests, the default gpcopy network timeout of 5 seconds could result in panics caused by the utility using closed network connections. This issue is resolved by changing the default timeout to 30 seconds and adding the --timeout option that allows changing the connection timeout. See Features.
  • [30746] Resolved a typographical error in the gpcopy log files where "transaction" was misspelled as "trasaction" in several messages.
  • [30772] Resolved a problem where gpcopy could create numerous, large log files in the /tmp directory and cause the copy operation to fail with a no space left on device error.
  • [173678813] When copying a sequence, gpcopy did not correctly copy the sequence owner and privileges.

Known Issues and Limitations

  • The sql: JSON key, used with the --include-table-json option, is compatible only with Greenplum Database version 5.20 and later.