When you Initialize the Upgrade (gpupgrade initialize), the utility begins by generating a series of scripts needed for the entire upgrade process. The scripts are based on your specific cluster characteristics. They are located under $HOME/gpAdminLogs/gpupgrade/data-migration-scripts/current and include:

  • Stats Data Migration SQL Scripts: They collect cluster and database specific characteristics, such as segment configuration and the number and type of tables in each database, that your Technical Support representative can use to help them estimate the duration of an upgrade or further improve the product. The generated statistics are logged in $HOME/gpAdminLogs/gpupgrade/apply_stats.log.
  • Initialize Data Migration SQL Scripts: They identify catalog inconsistencies between source and target cluster, and generate the SQL scripts that help resolve them. See Issues resolved by the SQL Migration Scripts for a list of checks performed by the initialize scripts.
  • Finalize Data Migration SQL Scripts: Some of the initialize scripts drop or alter database objects. When the upgrade finishes, the finalize scripts restore any dropped or modified objects back to their original state.
  • Revert Data Migration SQL Scripts: They get the database back to its original state if you choose to revert the upgrade, recreating or modifying any objects that the initialize scripts altered or dropped.
Important

When generating the data migration scripts, the utility takes a snapshot of the database. If you add new data or objects that cannot be upgraded after generating the scripts, the scripts will miss them. In such scenario, re-run gpupgrade initialize and select Archive and re-generate scripts when prompted, in order to detect the new data and objects.

Other migration issues that are not handled with the generated SQL scripts will be caught by the pg_ugprade check command when you run gpupgrade initialize, and you may have to manually resolve them. See pg_upgrade Checks for information about the issues that it detects and potential workarounds.

Manually Executing the SQL Migration Scripts

The gpupgrade utility automatically runs the data migration SQL scripts during different stages of the upgrade process. If you need to manually run the scripts you may use the gpupgrade command with the following options. Be sure that you run the command using the correct variables as it varies for each script.

Note

Always plan a maintenance window when running the data migration SQL scripts.

  • Stats: the utility runs the stats data migration scripts on the source cluster

    gpupgrade apply --gphome <source_gphome> --port <source_master_port> --input-dir $HOME/gpAdminLogs/gpupgrade/data-migration-scripts --phase stats
    
  • Initialize: the utility runs the initialize data migration scripts on the source cluster

    gpupgrade apply --gphome <source_gphome> --port <source_master_port> --input-dir $HOME/gpAdminLogs/gpupgrade/data-migration-scripts --phase initialize
    
  • Finalize: the utility runs the finalize data migration scripts on the target cluster

    gpupgrade apply --gphome <target_gphome> --port <source_master_port> --input-dir $HOME/gpAdminLogs/gpupgrade-<upgradeID>-<timestamp>/data-migration-scripts --phase finalize
    
  • Revert: the utility runs the revert data migration scripts on the source cluster

    gpupgrade apply --gphome <source_gphome> --port <source_master_port> --input-dir $HOME/gpAdminLogs/gpupgrade-<upgradeID>-<timestamp>/data-migration-scripts --phase revert
    

Where:

  • <source_gphome> is the location of the Greenplum binaries for the source cluster.
  • <source_master_port> is the Greenplum master port number for the source cluster.
  • <target_gphome> is the location of the Greenplum binaries for the target cluster.
  • <upgradeID>-<timestamp> is the archived directory for the gpupgrade logs for a specific upgrade.

Check your gpupgrade Configuration File or run the command gpupgrade config show to determine the values of these parameters.

Issues resolved by the SQL Migration Scripts

Currently, the data migration scripts detect the issues listed below. The initialize scripts drop or alter the necessary objects on the source cluster before the upgrade begins, while the finalize and revert scripts undo the changes on the source cluster, if required.

  • Partitioned tables indexes drops indexes on root partitions, as well as child partitions.

  • Heterogeneous partitioned tables ensures child partitions have the same on-disk layout as their root.

  • Tables using tsquery type changes the internal representation of this data type changed as it is different between Greenplum Database 5.x and 6.x.

  • gphdfs external tables drops gphdfs external tables from the databases as the protocol is removed in Greenplum Database 6.

  • gphdfs user roles alters the gphdfs user role so it cannot create external tables.

  • Unique or primary foreign key constraint drops unique or primary key constraints. Note the script cannot handle the case where a constraint is found on a child of a heap table. In this case you must back up the heap table and recreate it after the upgrade is completed. To restore constraints on child tables as they were in the source cluster, you must create the constraints manually.

  • Parent partitions with segment entries makes sure that append-only (AO) and column-oriented (AO/CO) parent partitions do not contain any pg_aoseg or pg_aocsseg entries, since it may cause unexpected failures during pg_upgrade.

See Also

pg_upgrade Checks.

check-circle-line exclamation-circle-line close-line
Scroll to top icon