Using the optional SQL migration scripts speeds up the upgrade process before you run the gpupgrade initialize
command.
The migration bash scripts, gpupgrade-migration-sql-generator.bash
and gpupgrade-migration-sql-executor.bash
, identify catalog inconsistencies between source and target cluster, and generate the SQL scripts that help resolve them. The gpupgrade-migration-sql-generator.bash
can be run outside the upgrade downtime window, and it’s run during the pre-upgrade phase.
IMPORTANT: Running the data migration scripts generator takes a snapshot of the database. If any new data or objects that cannot be upgraded are created after the generator is run, will be missed. In such scenario, re-run the generator in order to detect the new data and objects.
The migration SQL scripts are run by gpupgrade-migration-sql-executor.bash
during the upgrade downtime window, and resolve a subset of the issues that are found when the gpupgrade initialize
command runs the pg_upgrade check
command.
The gpupgrade-migration-sql-executor.bash
output directory has three subdirectories:
pre-initialize
- Run the SQL scripts in the pre-initialize
subdirectory just before you run the gpupgrade initialize
command. Execute these scripts as the first step after you start the downtime scheduled to perform the upgrade.post-finalize
- The SQL scripts in the post-finalize
subdirectory are run after gpupgrade finalize
has finished. These scripts perform tasks such as creating indexes that were dropped by the pre-initialize
scripts.post-revert
- The SQL scripts in the post-revert
subdirectory reverse the changes made by the pre-initialize
scripts and are to be executed only if you revert the upgrade. Run them after the gpupgrade revert
command has finished.The gpupgrade-migration-sql-executor.bash
script runs all of the SQL scripts in the directory you specify: pre-initialize
, post-finalize
, or post-revert
. See Executing the SQL Migration Scripts.
If you have used yum install
to install gpupgrade
, the bash scripts are located in /usr/local/bin
and the SQL scripts in /usr/local/bin/greenplum/gpupgrade/data-migration-scripts
. If you used rpm -prefix=<USER-DIRECTORY>
during gpupgrade
installation, the bash scripts are in /<USER-DIRECTORY>
and the SQL scripts in /<USER-DIRECTORY>/greenplum/gpupgrade/data-migration_scripts
.
The gpupgrade-migration-sql-executor.bash
script must be run during the upgrade downtime window.
Migration issues that are not handled with the generated SQL scripts will be caught by the pg_ugprade check
command when you run gpupgrade initialize
, and you may have to manually resolve them. See gpupgrade Initialize pg_upgrade Checks for information about the issues that pg_upgrade check
detects and potential workarounds.
Run the gpupgrade-migration-sql-generator.bash
script, to generate the SQL scripts:
gpupgrade-migration-sql-generator.bash <GPHOME> <PGPORT> <OUTPUT_DIR> [--help | -h]
Where:
<GPHOME>
is the path to the source Greenplum Database installation directory.<PGPORT>
is the source Greenplum system port number.<OUTPUT_DIR>
is the user-defined directory where the SQL scripts are created. If the directory does not exist, it is created. If the directory already exists, all existing .sql
and .sh
files are removed from the pre-initialize
, post-finalize
, and post-revert
subdirectories.-h
or --help
display help information, for example:
gpupgrade-migration-sql-generator.bash -h
Identifies catalog inconsistencies between the source and target Greenplum versions
and generates SQL scripts to resolve them. This command should be run prior to "gpupgrade".
Usage: gpupgrade-migration-sql-generator.bash <GPHOME> <PGPORT> <OUTPUT_DIR>
<GPHOME> : the path to the source Greenplum installation directory
<PGPORT> : the source Greenplum system port number
<OUTPUT_DIR> : the user-defined directory where the SQL scripts are created
The output directory structure is:
<output directory>
+ pre-initialize drop and alter objects prior to "gpupgrade initialize"
+ post-finalize restore and recreate objects following "gpupgrade finalize"
+ post-revert restore objects following "gpupgrade revert"
After running gpupgrade-migration-sql-generator.bash, run gpupgrade-migration-sql-executor.bash.
Run gpupgrade-migration-sql-executor.bash -h for more information.
Run the gpupgrade-migration-sql-generator.bash
script.
gpupgrade-migration-sql-generator.bash $GPHOME 5432 $HOME/gpupgrade
Output files are located in: $HOME/gpupgrade/pre-initialize
Output files are located in: $HOME/gpupgrade/post-finalize
Output files are located in: $HOME/gpupgrade/post-revert
You can review the SQL scripts in the output directory to see what changes will be applied when you execute the scripts.
Run gpupgrade-migration-sql-executor.bash
to execute the generated pre-initialize
, post-finalize
, or post-revert
SQL scripts. The shell script runs all SQL scripts in the specified directory and its subdirectories, so it is important to specify the pre-initialize
, post-finalize
, or post-revert
subdirectory.
$ gpupgrade-migration-sql-executor.bash <GPHOME> <PGPORT> <INPUT_DIR>
Where:
<GPHOME>
is the path to the source Greenplum Database installation directory.<PGPORT>
is the source Greenplum System port number.<INPUT_DIR>
is the directory containing the scripts you want to execute. This is the path to the pre-initialize
, post-finalize
, or post-revert
subdirectory in <OUTPUT_DIR>
. Do not specify the <OUTPUT_DIR>
without a subdirectory.After you execute the generated SQL scripts using gpupgrade-migration-sql-generator.bash
you can regenerate SQL scripts into the same <OUTPUT_DIR>. Regenerating the scripts in the same output directory will archive the previous run scripts in <OUTPUT_DIR>/archive/. The scripts will have
Examples:
Run the pre-initialize
migration scripts. Caution: Only run the pre-initialize
scripts after you have started the gpupgrade
downtime window.
$ gpupgrade-migration-sql-executor.bash $GPHOME 5432 $HOME/gpupgrade/pre-initialize
If you choose to revert the upgrade: After the gpupgrade revert
command has completed, run the generated post-revert
SQL scripts.
$ gpupgrade-migration-sql-executor.bash $GPHOME 5432 $HOME/gpupgrade/post-revert
If you choose to finalize the upgrade: After the gpupgrade finalize
command has completed, run the post-finalize
migration scripts.
$ gpupgrade-migration-sql-executor.bash $GPHOME 5432 $HOME/gpupgrade/post-finalize
Currently, the data migration scripts detect these issues:
external tables used in partitioning
indexes on partitioned tables
heterogeneous partitioned tables, with some exceptions; see this section for details
tsquery data type. Greenplum Database 5.x contains the tsquery
data type. The internal representation of this data type changed between Greenplum Database 5.x and 6.x. For more details, see tsquery user data type check.
gphdfs external tables. The gphdfs
external table protocol is removed in Greenplum Database 6. These tables must be converted to use PXF. The gpupgrade-migration-sql-generator.bash
script generates pre-initialize
and post-revert
SQL scripts. The pre-initialize
scripts drop external gphdfs tables from the databases. The post-revert
script recreates them.
gphdfs role. The gpupgrade-migration-sql-generator.bash
script generates a pre-initialize
script that drops the gphdfs
role and a post-revert
script to restore the role.
invalid name
user columns. The gpupgrade-migration-sql-generator.bash
script generates pre-initialize
and post-revert
scripts. For more details, see Invalid name user columns check.
abstime
, reltime
, tinterval
data types. Columns of these data types may not be used in a table’s distribution key. The gpupgrade-migration-sql-generator.bash
script generates pre-initialize
and post-revert
scripts to alter the column type.
unique or primary key contraints The source installation contains unique or primary key constraints on tables. For more details, see Unique or primary key constraints check. The gpupgrade-migration-sql-generator.bash
script generates pre-initialize
, post-finalize
, and post-revert
SQL scripts. The pre-initialize
scripts drop the constraints. The post-finalize
scripts recreate them on the source cluster after finalization. The post-revert
scripts recreate constraints on the source cluster. The scripts cannot handle the case where a constraint is found on a child of a heap table. In this case you must back up the heap table and recreate it after the upgrade is completed. To restore constraints on child tables as they were in the source cluster, you must create the constraints manually.