About the Migration Scripts

Using the optional SQL migration scripts speeds up the upgrade process before you run the gpupgrade initialize command.

The migration bash scripts, gpupgrade-migration-sql-generator.bash and gpupgrade-migration-sql-executor.bash, identify catalog inconsistencies between source and target cluster, and generate the SQL scripts that help resolve them. The gpupgrade-migration-sql-generator.bash can be run outside the upgrade downtime window, and it’s run during the pre-upgrade phase.

IMPORTANT: Running the data migration scripts generator takes a snapshot of the database. If any new data or objects that cannot be upgraded are created after the generator is run, will be missed. In such scenario, re-run the generator in order to detect the new data and objects.

The migration SQL scripts are run by gpupgrade-migration-sql-executor.bash during the upgrade downtime window, and resolve a subset of the issues that are found when the gpupgrade initialize command runs the pg_upgrade check command.

The gpupgrade-migration-sql-executor.bash output directory has three subdirectories:

  • pre-initialize - Run the SQL scripts in the pre-initialize subdirectory just before you run the gpupgrade initialize command. Execute these scripts as the first step after you start the downtime scheduled to perform the upgrade.
  • post-finalize - The SQL scripts in the post-finalize subdirectory are run after gpupgrade finalize has finished. These scripts perform tasks such as creating indexes that were dropped by the pre-initialize scripts.
  • post-revert - The SQL scripts in the post-revert subdirectory reverse the changes made by the pre-initialize scripts and are to be executed only if you revert the upgrade. Run them after the gpupgrade revert command has finished.

The gpupgrade-migration-sql-executor.bash script runs all of the SQL scripts in the directory you specify: pre-initialize, post-finalize, or post-revert. See Executing the SQL Migration Scripts.

If you have used yum install to install gpupgrade, the bash scripts are located in /usr/local/bin and the SQL scripts in /usr/local/bin/greenplum/gpupgrade/data-migration-scripts. If you used rpm -prefix=<USER-DIRECTORY> during gpupgrade installation, the bash scripts are in /<USER-DIRECTORY> and the SQL scripts in /<USER-DIRECTORY>/greenplum/gpupgrade/data-migration_scripts.

The gpupgrade-migration-sql-executor.bash script must be run during the upgrade downtime window.

Migration issues that are not handled with the generated SQL scripts will be caught by the pg_ugprade check command when you run gpupgrade initialize, and you may have to manually resolve them. See gpupgrade Initialize pg_upgrade Checks for information about the issues that pg_upgrade check detects and potential workarounds.

Generating the SQL Migration Scripts

Run the gpupgrade-migration-sql-generator.bash script, to generate the SQL scripts:

gpupgrade-migration-sql-generator.bash <GPHOME> <PGPORT> <OUTPUT_DIR> [--help | -h]

Where:

  • <GPHOME> is the path to the source Greenplum Database installation directory.
  • <PGPORT> is the source Greenplum system port number.
  • <OUTPUT_DIR> is the user-defined directory where the SQL scripts are created. If the directory does not exist, it is created. If the directory already exists, all existing .sql and .sh files are removed from the pre-initialize, post-finalize, and post-revert subdirectories.
  • -h or --help display help information, for example:

    gpupgrade-migration-sql-generator.bash -h
    
    Identifies catalog inconsistencies between the source and target Greenplum versions
    and generates SQL scripts to resolve them. This command should be run prior to "gpupgrade".
    
    Usage: gpupgrade-migration-sql-generator.bash <GPHOME> <PGPORT> <OUTPUT_DIR>
         <GPHOME>     : the path to the source Greenplum installation directory
         <PGPORT>     : the source Greenplum system port number
         <OUTPUT_DIR> : the user-defined directory where the SQL scripts are created
    
    The output directory structure is:
         <output directory>
         + pre-initialize  drop and alter objects prior to "gpupgrade initialize"
         + post-finalize   restore and recreate objects following "gpupgrade finalize"
         + post-revert     restore objects following "gpupgrade revert"
    
    After running gpupgrade-migration-sql-generator.bash, run gpupgrade-migration-sql-executor.bash.
    Run gpupgrade-migration-sql-executor.bash -h for more information.
    

Example

Run the gpupgrade-migration-sql-generator.bash script.

gpupgrade-migration-sql-generator.bash $GPHOME 5432 $HOME/gpupgrade
Output files are located in: $HOME/gpupgrade/pre-initialize
Output files are located in: $HOME/gpupgrade/post-finalize
Output files are located in: $HOME/gpupgrade/post-revert

You can review the SQL scripts in the output directory to see what changes will be applied when you execute the scripts.

Executing the SQL Migration Scripts

Run gpupgrade-migration-sql-executor.bash to execute the generated pre-initialize, post-finalize, or post-revert SQL scripts. The shell script runs all SQL scripts in the specified directory and its subdirectories, so it is important to specify the pre-initialize, post-finalize, or post-revert subdirectory.

$ gpupgrade-migration-sql-executor.bash <GPHOME> <PGPORT> <INPUT_DIR>

Where:

  • <GPHOME> is the path to the source Greenplum Database installation directory.
  • <PGPORT> is the source Greenplum System port number.
  • <INPUT_DIR> is the directory containing the scripts you want to execute. This is the path to the pre-initialize, post-finalize, or post-revert subdirectory in <OUTPUT_DIR>. Do not specify the <OUTPUT_DIR> without a subdirectory.

After you execute the generated SQL scripts using gpupgrade-migration-sql-generator.bash you can regenerate SQL scripts into the same <OUTPUT_DIR>. Regenerating the scripts in the same output directory will archive the previous run scripts in <OUTPUT_DIR>/archive/. The scripts will have to differentiate each run.

Examples:

  • Run the pre-initialize migration scripts. Caution: Only run the pre-initialize scripts after you have started the gpupgrade downtime window.

    $ gpupgrade-migration-sql-executor.bash $GPHOME 5432 $HOME/gpupgrade/pre-initialize
    
  • If you choose to revert the upgrade: After the gpupgrade revert command has completed, run the generated post-revert SQL scripts.

    $ gpupgrade-migration-sql-executor.bash $GPHOME 5432 $HOME/gpupgrade/post-revert
    
  • If you choose to finalize the upgrade: After the gpupgrade finalize command has completed, run the post-finalize migration scripts.

    $ gpupgrade-migration-sql-executor.bash $GPHOME 5432 $HOME/gpupgrade/post-finalize
    

Issues resolved by the SQL Migration Scripts

Currently, the data migration scripts detect these issues:

  • external tables used in partitioning

  • indexes on partitioned tables

  • heterogeneous partitioned tables, with some exceptions; see this section for details

  • tsquery data type. Greenplum Database 5.x contains the tsquery data type. The internal representation of this data type changed between Greenplum Database 5.x and 6.x. For more details, see tsquery user data type check.

  • gphdfs external tables. The gphdfs external table protocol is removed in Greenplum Database 6. These tables must be converted to use PXF. The gpupgrade-migration-sql-generator.bash script generates pre-initialize and post-revert SQL scripts. The pre-initialize scripts drop external gphdfs tables from the databases. The post-revert script recreates them.

  • gphdfs role. The gpupgrade-migration-sql-generator.bash script generates a pre-initialize script that drops the gphdfs role and a post-revert script to restore the role.

  • invalid name user columns. The gpupgrade-migration-sql-generator.bash script generates pre-initialize and post-revert scripts. For more details, see Invalid name user columns check.

  • abstime, reltime, tinterval data types. Columns of these data types may not be used in a table’s distribution key. The gpupgrade-migration-sql-generator.bash script generates pre-initialize and post-revert scripts to alter the column type.

  • unique or primary key contraints The source installation contains unique or primary key constraints on tables. For more details, see Unique or primary key constraints check. The gpupgrade-migration-sql-generator.bash script generates pre-initialize, post-finalize, and post-revert SQL scripts. The pre-initialize scripts drop the constraints. The post-finalize scripts recreate them on the source cluster after finalization. The post-revert scripts recreate constraints on the source cluster. The scripts cannot handle the case where a constraint is found on a child of a heap table. In this case you must back up the heap table and recreate it after the upgrade is completed. To restore constraints on child tables as they were in the source cluster, you must create the constraints manually.

check-circle-line exclamation-circle-line close-line
Scroll to top icon