The VMware Greenplum Data Copy Utility is compatible with these VMware Greenplum versions:
Release Date: July 12, 2024
VMware Greenplum Data Copy Utility version 2.7.0 is a minor release that includes new and changed features and bug fixes.
gpcopy 2.7.0 includes the following new and changed features:
gpcopy job to observe and control the number of parallel tasks.
IncludeTableFile and IncludeTableJson.
--truncate-source-after was set to on.
GPDB 6x and below support AO tables with COMPRESSTYPE=quicklz. However, GPDB 7x does not support this format for AO tables. If you run gpcopy to copy an AO table with COMPRESSTYPE=quicklz from GPDB 6x or below to GPDB 7x, gpcopy sets gp_quicklz_fallback to true, and the COMPRESSTYPE becomes zstd in the destination.
Running gpcopy to copy data from a higher version of Greenplum Data Copy to a lower version of Greenplum Data Copy is not supported, and a warning message will be shown.
If gpcopy is not running on the coordinator node of the source GPDB cluster, the warning message "It is recommended to run gpcopy on the coordinator node of the source GPDB cluster" is displayed.
Release Date: August 18, 2023
VMware Greenplum Data Copy Utility version 2.6.0 is a minor release that includes new and changed features and bug fixes.
gpcopy 2.6.0 includes the following new and changed features:
Previously, gpcopy consulted the Greenplum version number of the source and destination clusters, the number of segments in the source and destination clusters, and the hash key data type to determine whether or not to redistribute the target table data. gpcopy now additionally consults the gp_use_legacy_hashops server configuration parameter in this check.
The default value of the --on-segment-threshold option is changed from 10000 to -1, which instructs gpcopy to copy the table data using the source and destination Greenplum Database segment instances.
The --on-segment-threshold option now accepts the value -2, which instructs gpcopy to copy the table data using the source and destination Greenplum Database coordinators.
gpcopy now gives higher priority to the exclude list, and when a partitioned table is excluded, gpcopy excludes its child partitions as well.
gpcopy introduces a new --snapshot <snapshot_id> option. You can use this option to specify the snapshot identifier of the transaction in which you want gpcopy to run the copy operation. Refer to About Specifying a Transaction Snapshot for more information.
[32955] Resolves an issue where gpcopy did not exclude child partitions when the parent partitioned table was excluded. gpcopy now gives higher priority to the exclude list, and when a partitioned table is excluded, gpcopy excludes its child partitions as well.
[N/A] Resolves an issue where gpcopy returned the error distribution key doesn't belong to segment with ID <num>, it belongs to segment with ID <other_num> when the source and destination Greenplum Database clusters were configured to use different hashing algorithms for the table distribution key(s). gpcopy now consults the gp_use_legacy_hashops server configuration parameter when it creates the target table.
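The exclude-list precedence described for 2.6.0 can be pictured with a small sketch. This is an illustration of the stated rule only, not gpcopy's actual implementation; the function name select_tables and the children_of mapping are hypothetical, and the table names are made up.

```python
def select_tables(candidates, include, exclude, children_of):
    """Return the tables to copy; the exclude list wins over the include list.

    candidates  -- table names found in the source (hypothetical input)
    include     -- set of names requested for copy
    exclude     -- set of names to skip
    children_of -- dict mapping a partitioned parent to its child partitions
    """
    # Expand the exclude set so that excluding a parent also excludes
    # every child partition beneath it, at any depth.
    excluded = set(exclude)
    pending = list(exclude)
    while pending:
        parent = pending.pop()
        for child in children_of.get(parent, ()):
            if child not in excluded:
                excluded.add(child)
                pending.append(child)
    # Exclusion is checked last, so it overrides any include entry.
    return [t for t in candidates if t in include and t not in excluded]
```

With a parent such as public.sales in the exclude set, its partitions drop out of the result even when they also appear in the include set.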
Release Date: May 15, 2023
VMware Greenplum Data Copy Utility version 2.5.0 is a minor release that includes new and changed features.
Note: This version of the VMware Greenplum Data Copy Utility documentation replaces the term master with the term coordinator.
gpcopy 2.5.0 includes the following new and changed features:
Adds support for VMware Greenplum version 7 Beta 3+.
gpcopy now supports copying data from VMware Greenplum versions 5 and 6 to VMware Greenplum 7 Beta 3+.
Updates the go library dependency to version 1.19.
Updates supporting library dependencies.
Because the default SSL mode of an updated library dependency does not support prefer, you must set PGSSLMODE to disable when one of the Greenplum clusters is configured for encryption and the other is not.
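One way to apply the PGSSLMODE requirement from a wrapper script is sketched below. Only PGSSLMODE and its value disable come from these notes. The helper name gpcopy_environment, the host names, and the database name are hypothetical, and the --source-host, --dest-host, and --dbname options are assumptions to verify against the gpcopy reference.

```python
import os


def gpcopy_environment(base_env=None):
    """Return a copy of the environment with PGSSLMODE forced to 'disable'."""
    env = dict(os.environ if base_env is None else base_env)
    env["PGSSLMODE"] = "disable"
    return env


# Hypothetical invocation: hosts, database, and option choice are placeholders.
command = [
    "gpcopy",
    "--source-host", "source-cdw",
    "--dest-host", "dest-cdw",
    "--dbname", "testdb",
    "--truncate",
]
env = gpcopy_environment()
# To launch it, pass both to subprocess.run(command, env=env, check=True).
```

Building the environment as a copy keeps the setting scoped to the gpcopy child process instead of changing it for the whole session.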
Release Date: December 19, 2022
gpcopy version 2.4.1 is a maintenance release that includes a single bug fix.
gpcopy version 2.4.0 download packages available on the Broadcom Support Portal were corrupt.
Release Date: December 9, 2022
VMware Greenplum Data Copy Utility version 2.4.0 is a minor release that includes new features and a bug fix.
gpcopy 2.4.0 includes the following new features:
gpcopy now supports SSL/TLS encryption on the data channel between the source and destination Greenplum Database clusters. This feature relies on an update to the gpcopy_helper utility. About SSL/TLS Encryption on the Data Channel describes how to direct gpcopy to use this encryption method.
When the pg_hba.conf file specifies password authentication, gpcopy can now obtain the connection password for the source and/or destination Greenplum Database user from the value of the PGPASSWORD or the PGPASSFILE environment variable; refer to About Connecting Using Password Authentication for more information.
When the pg_hba.conf file specifies the SSL/TLS connection type, gpcopy can now initiate an SSL-encrypted connection to the source and/or destination Greenplum Database cluster; refer to About Connecting Using SSL/TLS for more information and configuration details.
gpcopy returned the error relation <name> already exists when it failed to copy a partitioned table that was created with an explicit sequence column, and the owner of the sequence was since altered.
Release Date: April 13, 2022
gpcopy version 2.3.2 is a maintenance release that includes changes and resolves several issues.
gpcopy now respects the case sensitivity of database, schema, and table names that you specify with the --include-table[-xxx], --exclude-table[-xxx], and --dest-table options when you enclose the individual name in double quotes. For example:
--include-table '"testdb"."Schema"."T1"'
gpcopy always transforms unquoted names to lower case.
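The quoting rule above can be illustrated with a short sketch: double-quoted parts keep their case, and unquoted parts are folded to lower case. This is a simplified model of the described behavior, not gpcopy's parser. The function name normalize_qualified_name is hypothetical, and doubled quotes used as an escape inside a quoted name are not handled.

```python
def normalize_qualified_name(name):
    """Split a dotted name into parts, folding unquoted parts to lower case."""
    parts, current = [], []
    quoted_part = False  # True once the current part has contained a quote
    in_quotes = False    # True while scanning inside double quotes
    for ch in name:
        if ch == '"':
            in_quotes = not in_quotes
            quoted_part = True
        elif ch == "." and not in_quotes:
            # A dot outside quotes ends the current part.
            text = "".join(current)
            parts.append(text if quoted_part else text.lower())
            current, quoted_part = [], False
        else:
            current.append(ch)
    if in_quotes:
        raise ValueError("unterminated double quote in %r" % name)
    text = "".join(current)
    parts.append(text if quoted_part else text.lower())
    return parts
```

Under this model, the quoted example keeps Schema and T1 as written, while the same name without quotes resolves to schema and t1.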
gpcopy relaxes the gpcopy_helper version check.
gpcopy updates the version of go that it uses to build its CLI tool to version 1.17.6 to mitigate CVE-2021-44716.
gpcopy did not respect the case sensitivity of database, schema, and table names specified via the --include-table[-xxx], --exclude-table[-xxx], and --dest-table options when the name was enclosed in double quotes.
gpcopy did not terminate queries and the helper daemon when it failed to create cleanup files in the current working directory.
gpcopy did not print a summary report when a copy operation was cancelled with Ctrl-C.
gpcopy version 2.3.1 is a maintenance release that resolves several issues.
[n/a] To help with debugging, gpcopy now prints additional logging information:
The source and destination cluster versions are now displayed when the utility initializes at startup. For example:
Initializing gpcopy
Source cluster version: 6.9.0+dev.30.ge53fbea1b0 build dev
Destination cluster version: 6.9.0+dev.30.ge53fbea1b0 build dev
The results of IP resolution for each destination segment are displayed with the message: Resolving destination segments hostname IP address results.
[31467] Fixed an issue where the count validation could fail if the source and destination tables did not distribute the data in the same way (for example, for randomly-distributed tables).
[31467] Fixed an issue that could cause an md5xor validation failure if a row had 64KB or more of CSV data.
Note: You must update the gpcopy_helper utility to version 2.3.1 on every segment in order to apply this fix.
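For readers unfamiliar with the md5xor validation named in these fixes, the sketch below shows the general idea, assuming from the name that it combines the MD5 digest of each row's text form with XOR. The row serialization here (comma-joined values) is a simplification and an assumption; gpcopy's actual row format and digest handling may differ.

```python
import hashlib


def md5xor(rows):
    """XOR together the MD5 digest of each row's text form; return hex."""
    acc = 0
    for row in rows:
        # Simplified serialization: NULL becomes an empty string.
        text = ",".join("" if value is None else str(value) for value in row)
        digest = hashlib.md5(text.encode("utf-8")).digest()
        acc ^= int.from_bytes(digest, "big")
    return acc.to_bytes(16, "big").hex()
```

Because XOR is commutative, the result does not depend on row order, so source and destination can be compared even when segments return rows in a different sequence. One limitation of any XOR-based checksum is that an even number of identical rows cancels out.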
[31309] Fixed a crash that could occur if gpcopy did not have permission to write copy results to a file. The resulting crash could prevent the gpcopy_helper utility from terminating correctly on every segment.
[178122513] Fixed an issue where gpcopy did not handle DISTRIBUTED REPLICATED tables correctly, allowing duplicated data to be copied.
gpcopy version 2.3.0 is a minor release that adds features and resolves several issues.
The --timeout option specifies the maximum time in seconds to wait until both source and destination systems are ready for data transfer. The default is 30 seconds. A value of 0 deactivates the timeout.
gpcopy includes a list of tables and views that were successfully copied to the destination system in the text file gpcopy_date_success.list in the ~/gpAdminLogs directory on the coordinator host.
If gpcopy fails to copy tables or views, the utility creates a text file gpcopy_date_failure.list that lists the failed tables or views in the ~/gpAdminLogs directory on the coordinator host. After resolving issues that caused the failures, you can run gpcopy with the --include-table-file option to copy the tables or views that were not copied.
gpcopy options --truncate and --parallelize-leaf-partitions=true failed with the error message deadlock detected.
When the gpcopy destination cluster was busy processing requests, the default gpcopy network timeout of 5 seconds could result in panics caused by the utility using closed network connections. This issue is resolved by changing the default timeout to 30 seconds and adding the --timeout option that allows changing the connection timeout. See Features.
gpcopy log files where "transaction" was misspelled as "trasaction" in several messages.
gpcopy could create numerous, large log files in the /tmp directory and cause the copy operation to fail with a no space left on device error.
gpcopy did not correctly copy the sequence owner and privileges.
The sql: JSON key, used with the --include-table-json option, is compatible only with Greenplum Database version 5.20 and later.
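For the 2.3.0 failure list described above, a retry helper might look like the following sketch. It assumes the date portion of gpcopy_date_failure.list can be matched with a wildcard, since these notes do not give its exact format. The function names are hypothetical, and the connection and copy-mode options that a real gpcopy run needs are left out.

```python
import glob
import os


def latest_failure_list(log_dir="~/gpAdminLogs"):
    """Return the most recently modified gpcopy_*_failure.list, or None."""
    pattern = os.path.join(os.path.expanduser(log_dir), "gpcopy_*_failure.list")
    matches = glob.glob(pattern)
    return max(matches, key=os.path.getmtime) if matches else None


def retry_arguments(failure_list):
    """Build the option pair that points gpcopy at the failed tables."""
    return ["--include-table-file", failure_list]
```

Selecting by modification time rather than by parsing the file name avoids depending on the unknown date format.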