When you Initialize the Upgrade (gpupgrade initialize), the utility begins by generating a series of scripts needed for the entire upgrade process. The scripts are based on your specific cluster characteristics. They are located under
$HOME/gpAdminLogs/gpupgrade/data-migration-scripts/current and include:
When generating the data migration scripts, the utility takes a snapshot of the database. If you add new data or objects that cannot be upgraded after generating the scripts, the scripts will miss them. In such scenario, re-run
gpupgrade initializeand select Archive and re-generate scripts when prompted, in order to detect the new data and objects.
Other migration issues that are not handled with the generated SQL scripts will be caught by the
pg_ugprade check command when you run
gpupgrade initialize, and you may have to manually resolve them. See pg_upgrade Checks for information about the issues that it detects and potential workarounds.
gpupgrade utility automatically runs the data migration SQL scripts during different stages of the upgrade process. If you need to manually run the scripts you may use the
gpupgrade command with the following options. Be sure that you run the command using the correct variables as it varies for each script.
Always plan a maintenance window when running the data migration SQL scripts.
Stats: the utility runs the stats data migration scripts on the source cluster
gpupgrade apply --gphome <source_gphome> --port <source_master_port> --input-dir $HOME/gpAdminLogs/gpupgrade/data-migration-scripts --phase stats
Initialize: the utility runs the initialize data migration scripts on the source cluster
gpupgrade apply --gphome <source_gphome> --port <source_master_port> --input-dir $HOME/gpAdminLogs/gpupgrade/data-migration-scripts --phase initialize
Finalize: the utility runs the finalize data migration scripts on the target cluster
gpupgrade apply --gphome <target_gphome> --port <source_master_port> --input-dir $HOME/gpAdminLogs/gpupgrade-<upgradeID>-<timestamp>/data-migration-scripts --phase finalize
Revert: the utility runs the revert data migration scripts on the source cluster
gpupgrade apply --gphome <source_gphome> --port <source_master_port> --input-dir $HOME/gpAdminLogs/gpupgrade-<upgradeID>-<timestamp>/data-migration-scripts --phase revert
<source_gphome>is the location of the Greenplum binaries for the source cluster.
<source_master_port>is the Greenplum master port number for the source cluster.
<target_gphome>is the location of the Greenplum binaries for the target cluster.
<upgradeID>-<timestamp>is the archived directory for the
gpupgradelogs for a specific upgrade.
Currently, the data migration scripts detect the issues listed below. The initialize scripts drop or alter the necessary objects on the source cluster before the upgrade begins, while the finalize and revert scripts undo the changes on the source cluster, if required.
Partitioned tables indexes drops indexes on root partitions, as well as child partitions.
Heterogeneous partitioned tables ensures child partitions have the same on-disk layout as their root.
Tables using tsquery type changes the internal representation of this data type changed as it is different between Greenplum Database 5.x and 6.x.
gphdfs external tables drops
gphdfs external tables from the databases as the protocol is removed in Greenplum Database 6.
gphdfs user roles alters the
gphdfs user role so it cannot create external tables.
Unique or primary foreign key constraint drops unique or primary key constraints. Note the script cannot handle the case where a constraint is found on a child of a heap table. In this case you must back up the heap table and recreate it after the upgrade is completed. To restore constraints on child tables as they were in the source cluster, you must create the constraints manually.
Parent partitions with segment entries makes sure that append-only (AO) and column-oriented (AO/CO) parent partitions do not contain any
pg_aocsseg entries, since it may cause unexpected failures during