During this phase you run the gpupgrade initialize command. This phase prepares the source cluster for the upgrade and initializes the target cluster. Before proceeding, ensure you have reviewed and completed the pre-upgrade phase tasks.

IMPORTANT: Start the initialize phase during a scheduled downtime. Plan and notify all appropriate groups and users that the Greenplum Database cluster will be offline for an extended period.

The following table summarizes the cluster state before and after gpupgrade initialize:

            Before Initialize          After Initialize
            Source   Target            Source   Target
Master      UP       Non Existent      UP       Initialized but DOWN
Standby     UP       Non Existent      UP       Non Existent
Primaries   UP       Non Existent      UP       Initialized but DOWN
Mirrors     UP       Non Existent      UP       Non Existent

Initialize Workflow Summary

The gpupgrade initialize command performs the following steps:

  1. Starts the gpupgrade hub process on the master host.
  2. Saves the source cluster configuration.
  3. Starts the gpupgrade agents on the master and segment hosts, one agent process on each host.
  4. Checks the environment.
  5. Checks the disk space availability.
  6. Generates the target cluster configuration.
  7. Initializes the target cluster.
  8. Sets the dynamic library path on the target cluster.
  9. Shuts down the target cluster.
  10. Runs pg_upgrade --check to check for known migration issues between the source and target Greenplum Database versions.

Preparing for Initialization

Upgrade supported extensions

Supported extensions can be upgraded by Greenplum Upgrade. Upgrade the supported extensions to their latest version on the source cluster. The supported extensions include:

  • GPText (3.9.0+)
  • MADlib (1.19.0+)
  • PostGIS (2.1.5+pivotal.3)
  • PXF (5.16.4+ and 6.3.0+)
  • Postgres native extensions (such as amcheck, dblink, hstore, and pgcrypto)

PostGIS and PXF require these additional pre-upgrade tasks:

Extension   Additional Pre-Upgrade Task(s)
PostGIS     On the source cluster, drop the following views from your database(s): geography_columns, raster_columns, raster_overviews
PXF         Perform the PXF Pre-gpupgrade Actions procedure.

Uninstall unsupported extensions

Unsupported extensions – which are extensions that gpupgrade cannot automatically upgrade – must be uninstalled on the source cluster before running gpupgrade initialize. Those extensions will be reinstalled on the target cluster after upgrading. For more information, as well as a list of unsupported extensions, see Handling Unsupported Extensions.

WARNING: Dropping any extension that defines user-defined types, aggregates, functions, operators, or views will drop the data associated with those objects.

Greenplum Command Center requires more complex steps than simply uninstalling and reinstalling. If you are upgrading Command Center, see Upgrading Greenplum Command Center.

Edit the gpupgrade configuration file

The gpupgrade initialize command requires a configuration file as an input. Review an example gpupgrade_config file in the directory where you extracted the downloaded gpupgrade utility.

Copy the example file to the $HOME/gpupgrade/ location and make edits according to your environment:

cp /usr/local/bin/greenplum/gpupgrade/gpupgrade_config $HOME/gpupgrade/

  • The source_master_port, source_gphome, and target_gphome parameters are blank and must be set to your environment’s values.

  • During gpupgrade initialize, the utility starts the hub and agent processes on the hosts. The master host hub port defaults to 7527 and the segment hosts agent port defaults to 6416. If the reserved hub and agent ports are already used by any of your applications, assign a different port in the gpupgrade configuration file.

    Ensure that there is no firewall blocking these ports so the hub and agents can communicate with each other.

  • If you are upgrading with extensions whose install location is outside of $target_gphome, you must set the dynamic_library_path parameter. For more details refer to gpupgrade Configuration File.

The remaining parameters are commented out and have default values. Change these values as necessary for your upgrade scenario. See the gpupgrade_config file reference page for further details.
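
Before choosing hub and agent ports, it can help to confirm that the defaults are not already taken. The following is a minimal sketch, not part of gpupgrade: the helper name port_is_listening is hypothetical, and it assumes the ss utility is available and that its fourth output column is the local address and port.

```shell
# Sketch: succeed if the given TCP port appears as a local listening port
# in `ss -ltn` style output read from stdin. Helper name is hypothetical.
port_is_listening() {
    # Column 4 of `ss -ltn` is "Local Address:Port"; the port follows the last colon.
    awk -v p="$1" '
        { n = split($4, parts, ":"); if (parts[n] == p) found = 1 }
        END { exit found ? 0 : 1 }
    '
}

# Example usage (run on each host; 7527 and 6416 are the documented defaults):
#   for port in 7527 6416; do
#       if ss -ltn | port_is_listening "$port"; then
#           echo "port $port is already in use"
#       fi
#   done
```

If either port is reported as in use, assign a different port in the gpupgrade configuration file as described above.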

Run the pre-initialize migration script

The gpupgrade utility package includes bash and SQL migration scripts to help resolve potential migration issues from Greenplum 5 to 6. Review About the Migration Scripts.

Running gpupgrade Initialize

To run initialize use:

gpupgrade initialize --file | -f PATH/TO/gpupgrade_config [--verbose | -v] [--automatic | -a] [--pg-upgrade-verbose --verbose]

Where:

  • --file | -f specifies the configuration file location
  • --verbose | -v is the flag for verbose output
  • --automatic | -a suppresses the summary and confirmation dialog
  • --pg-upgrade-verbose is an optional flag that provides more detailed logging for debugging; requires the --verbose option

For example:

gpupgrade initialize --file $HOME/gpupgrade/gpupgrade_config --verbose

The utility displays a summary message and waits for user confirmation before proceeding. Then it proceeds through various background steps, and displays its progress on the screen similar to:

Initialize in progress.

Starting gpupgrade hub process...                                  [IN PROGRESS]
Saving source cluster configuration...                             [COMPLETE]   
Starting gpupgrade agent processes...                              [COMPLETE]   

.....

The status of each step can be COMPLETE, FAILED, SKIPPED, or IN PROGRESS. SKIPPED indicates that the command has been run before and the step has already been executed.
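
If you saved the screen output to a file, a small filter can list only the steps that failed. This is a sketch under assumptions: the helper name failed_steps and the file name initialize.out are hypothetical, and the filter relies on the "Step name...   [STATUS]" layout shown in the sample output above.

```shell
# Sketch: print the names of steps marked [FAILED] in captured initialize output.
# Reads the captured output from stdin. Helper name is hypothetical.
failed_steps() {
    # Capture everything before the trailing "..." on lines that end in [FAILED].
    sed -n 's/^\(.*[^.]\)\.\.\.[[:space:]]*\[FAILED\][[:space:]]*$/\1/p'
}

# Example usage (initialize.out is a hypothetical file holding the screen output):
#   failed_steps < initialize.out
```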

These steps are further described below:

  • Starting gpupgrade hub process: Starts up the gpupgrade hub process on the master node.
  • Saving source cluster configuration: Collects the source cluster configuration details and generates gpupgrade state files to hold the source configuration.
  • Starting gpupgrade agent processes: Starts up agents on the standby master and segment hosts.
  • Checking Environment: Checks the environment paths for source and target to avoid mixing the two.
  • Checking disk space: Checks for available disk space.
    The default requirement is 60% free disk space. If --link is specified, the requirement is 20%.
    You can change the required ratio with --disk_free_ratio.
    To skip this check entirely, specify --disk_free_ratio: 0.0 in the configuration file.
  • Generating target cluster configuration: Populates the gpupgrade state files with the target cluster details.
  • Creating target cluster: Initializes the target master and segment hosts, in order to run pg_upgrade on the postgres instances. See About Target Cluster Directories for a description of the target cluster data directories.
  • Stopping target cluster: Shuts down the target cluster.
  • Backing up target master: Creates a backup copy of target master, to be used during execute if any issues occur.
  • Running pg_upgrade checks: Runs a thorough list of Greenplum Database checks; see Initialize Phase pg_upgrade Checks.
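
To estimate ahead of time whether a host is likely to pass the disk space check, you can compare free space against the 60% (copy) or 20% (link) thresholds described above. The sketch below is an approximation only: the helper names are hypothetical, and it assumes the ratio is free space divided by total space as reported by POSIX df, which may differ from how gpupgrade computes it.

```shell
# Sketch: succeed when free/total is at least the required ratio.
# Arguments: free space, total space (same units), required ratio (for example 0.6).
meets_free_ratio() {
    awk -v free="$1" -v total="$2" -v ratio="$3" \
        'BEGIN { exit (total > 0 && free / total >= ratio) ? 0 : 1 }'
}

# Sketch: apply the check to the filesystem that holds a given path.
# In `df -Pk` output, column 2 is total KB and column 4 is available KB.
check_path_free_ratio() {
    df -Pk "$1" | awk 'NR == 2 { print $4, $2 }' | {
        read -r free total
        meets_free_ratio "$free" "$total" "$2"
    }
}

# Example usage (the data directory path is an assumption for illustration):
#   check_path_free_ratio /data/primary 0.6 || echo "less than 60% free"
```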

To resolve any [FAILED] steps, review the error messages and recommendations displayed on screen, review the server log files in the $HOME/gpAdminLogs directory, and see To address any issues on the target cluster.

For customers with extensions:

For source clusters with preinstalled extensions, the step Running pg_upgrade checks will generate an error the first time you run initialize, similar to:

Running pg_upgrade checks...                                       [FAILED]     

Error: initialize create cluster: InitializeCreateCluster: rpc error: code = Unknown desc = substep "CHECK_UPGRADE": 4 errors occurred:
    * check master: Checking for presence of required libraries                 fatal

Your installation references loadable libraries that are missing from the
new installation.  You can add these libraries to the new installation,
or remove the functions using them from the old installation.  A list of
problem libraries is in the file:
    /home/gpadmin/.gpupgrade/pg_upgrade/seg-1/loadable_libraries.txt

When initialize starts the target cluster, as the extensions are not yet installed, it cannot locate the same libraries as in the source cluster so it generates an error. To resolve the error you need to install all missing extensions in the target cluster before re-running initialize. Follow the steps below to install the extensions in the target cluster:

  1. Follow the steps in To address any issues on the target cluster.

  2. Start the target cluster which was initialized and stopped when initialize ran:

    gpstart -a
    
  3. Install on the target cluster the same version of the extension that is on the source cluster. See each extension’s documentation for installation specifics. For GPText and PostGIS, perform these additional tasks:

    • For GPText customers, copy the following GPText files from the source cluster $MASTER_DATA_DIRECTORY to the target cluster $MASTER_DATA_DIRECTORY:

      cp $MASTER_DATA_DIRECTORY/{gptext.conf,gptxtenvs.conf,zoo_cluster.conf} /home/gpadmin/.gpupgrade/master.bak/
      

      Note: Do NOT alter any of the files in the .gpupgrade directory.

    • For PostGIS customers, drop the following views containing deprecated name datatypes.

      DROP VIEW geography_columns;
      DROP VIEW raster_columns;
      DROP VIEW raster_overviews;
      

      Make a note after the upgrade is complete to re-create these views following the PostGIS post-upgrade steps.

  4. Stop the target cluster by issuing this command: gpstop -a

  5. Re-run the initialize command.

gpupgrade Log Files

The gpupgrade log files are saved in $HOME/gpAdminLogs/gpupgrade, for example $HOME/gpAdminLogs/gpupgrade/cli.log.

About Target Cluster Directories

When the gpupgrade initialize command creates the target Greenplum cluster, it creates data directories for the target master segment instance and primary segment instances on the master and segment hosts, alongside the source cluster data directories. This applies to both copy and link mode.

The target cluster data directory names have this format:

<segment-prefix>.<hash-code>.<content-id>

Where:

  • <segment-prefix> is the segment prefix string specified when the source Greenplum Database system was initialized. This is typically gpseg.
  • <hash-code> is a 10-character string generated by gpupgrade. The hash code is the same for all segment data directories belonging to the new target Greenplum cluster. In addition to distinguishing target directories from the source data directories, the unique hash code tags all data directories belonging to the current gpupgrade instance.
  • <content-id> is the database content id for the segment. The master segment instance content id is always -1. The primary segment content ids are numbered consecutively from 0 to one less than the number of primary segments.

For example, if the $MASTER_DATA_DIRECTORY environment variable value is /data/master/gpseg-1/, the data directory for the target master is /data/master/gpseg.AAAAAAAAAA.-1, where AAAAAAAAAA is the hash code gpupgrade generated for this target cluster. Primary segment data directories for the target cluster are located on the same host and at the same path as their source cluster counterparts. If the first primary segment for the source cluster is on host sdw1 in the directory /data/primary/gpseg0, the target cluster segment directory is on the same host at /data/primary/gpseg.AAAAAAAAAA.0.
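
The naming format can be illustrated with two small helpers. This is a sketch for illustration only: the function names are hypothetical, and AAAAAAAAAA stands in for the hash code that gpupgrade generates.

```shell
# Sketch: build a target data directory name from its three parts.
# Arguments: segment prefix, hash code, content id.
target_datadir_name() {
    printf '%s.%s.%s\n' "$1" "$2" "$3"
}

# Sketch: recover the content id (the text after the last dot) from a
# target data directory path.
content_id_of() {
    printf '%s\n' "${1##*.}"
}

# Example usage:
#   target_datadir_name gpseg AAAAAAAAAA -1
#   content_id_of /data/primary/gpseg.AAAAAAAAAA.0
```

With the values from the example above, the first call prints gpseg.AAAAAAAAAA.-1 and the second prints 0.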

When the gpupgrade finalize command has completed, the source cluster data directories are renamed to:

<segment-prefix>.<hash-code>.<content-id>.old

and the target cluster data directory names are renamed to the original source directory names:

<segment-prefix><content-id>

Troubleshooting the Initialize Phase

To address any issues on the target cluster

The upgrade process is in-place, with two Greenplum Database versions installed on the same hosts. When you need to work on the target cluster, follow these steps to avoid mixing environment variables between source and target systems:

Open a new terminal:

source /usr/local/greenplum-db-<target>/greenplum_path.sh
export MASTER_DATA_DIRECTORY=$(gpupgrade config show --target-datadir)
export PGPORT=$(gpupgrade config show --target-port)

where MASTER_DATA_DIRECTORY and PGPORT are the target cluster variables.

Understanding the Format of gpupgrade Errors

This section explains the format of gpupgrade error messages.

Consider the following example:

Error: rpc error: code = Unknown desc = substep "SAVING_SOURCE_CLUSTER_CONFIG": retrieve source configuration: querying gp_segment_configuration: ERROR: Unsupported startup parameter: search_path (SQLSTATE 08P01)

The following table summarizes the meaning of each element of this sample error message:

Error Message Element                                               Meaning
Error: rpc error: code = Unknown                                    This element is inherent to gpupgrade’s underlying protocol and is an implementation detail.
desc = substep "SAVING_SOURCE_CLUSTER_CONFIG"                       Indicates which substep gpupgrade failed on, in this case the “SAVING_SOURCE_CLUSTER_CONFIG” substep.
retrieve source configuration: querying gp_segment_configuration    A series of prefixes providing additional context, from less specific to more specific.
ERROR: Unsupported startup parameter: search_path (SQLSTATE 08P01)  The actual error; in this example, there was an unsupported parameter “search_path” when querying the database.
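
When scanning logs, it can be convenient to pull out just the substep and the underlying error. The helpers below are a sketch: their names are hypothetical, and error_cause assumes the underlying error starts with an uppercase "ERROR:" as in the sample message, which may not hold for every failure.

```shell
# Sketch: print the substep name from a gpupgrade error message read from stdin.
error_substep() {
    sed -n 's/.*substep "\([^"]*\)".*/\1/p'
}

# Sketch: print the underlying error, assuming it begins with "ERROR: ".
error_cause() {
    sed -n 's/.*\(ERROR: .*\)$/\1/p'
}

# Example usage (cli.log is the log file named in gpupgrade Log Files):
#   grep 'substep' "$HOME/gpAdminLogs/gpupgrade/cli.log" | error_substep
```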

gpstart Failures

During initialize, gpinitsystem can fail when calling gpstart with errors such as the following:

[CRITICAL]:-gpstart failed. (Reason='') exiting...
stderr='Error: unable to import module: /usr/local/greenplum-db-6.20.3/lib/libpq.so.5: symbol gss_acquire_cred_from, version gssapi_krb5_2_MIT not defined in file libgssapi_krb5.so.2 with link time reference

Other similar errors may include:

/usr/local/greenplum-db-6.19.3/bin/postgres: /usr/local/greenplum-db-5.29.1/lib/libxml2.so.2: no version information available (required by /usr/local/greenplum-db-6.19.3/bin/postgres)

This occurs when the source and target Greenplum environments are mixed, causing utilities to fail. To resolve this, perform the following steps:

  1. On all segments, remove from .bashrc or .bash_profile files any lines that source greenplum_path.sh or set Greenplum variables.

  2. Start a new shell and ensure that PATH, LD_LIBRARY_PATH, PYTHONHOME, and PYTHONPATH are clear of any Greenplum values.

  3. ssh to a segment host and also ensure the above values are clear of any Greenplum values.
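
The checks in steps 2 and 3 can be scripted. The following is a minimal sketch with hypothetical helper names; it assumes that Greenplum paths contain the string "greenplum", as in the /usr/local/greenplum-db-... paths shown in the errors above, so installations in other locations would need a different pattern.

```shell
# Sketch: succeed if a value looks like it contains a Greenplum path.
has_greenplum_value() {
    case "$1" in
        *greenplum*) return 0 ;;
        *) return 1 ;;
    esac
}

# Sketch: report which of the four variables still hold a Greenplum value.
report_greenplum_env() {
    for name in PATH LD_LIBRARY_PATH PYTHONHOME PYTHONPATH; do
        # Read the variable whose name is in $name; the names are fixed above.
        eval "value=\${$name:-}"
        if has_greenplum_value "$value"; then
            echo "$name still contains a Greenplum value: $value"
        fi
    done
}

# Example usage: run report_greenplum_env in a new shell on the master host,
# then again in a shell on each segment host. No output means the variables are clear.
```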

Missing Extensions

If your Greenplum 5.x cluster has installed extensions, such as Greenplum Streaming Server, PL/Container or PL/Java, the gpupgrade initialize checks will fail until you reinstall the missing extensions on the target Greenplum Database. The error message will look like this:

Running pg_upgrade checks...                                       [FAILED]     

Error: initialize create cluster: InitializeCreateCluster: rpc error: code = Unknown desc = substep "CHECK_UPGRADE": 4 errors occurred:
   * check master: Checking for presence of required libraries                 fatal

Your installation references loadable libraries that are missing from the
new installation.  You can add these libraries to the new installation,
or remove the functions using them from the old installation.  A list of
problem libraries is in the file:
   /home/gpadmin/.gpupgrade/pg_upgrade/seg-1/loadable_libraries.txt

Host Resolution Problems

When running on a single-node system, particularly in a cloud environment, if you encounter a grpcDialer failed: error, it is possible that your local hostname is not resolvable. Verify that each host is resolvable by issuing the following command:

$ ping -q -c 1 -t 1 `hostname`

RPC Connection Errors

There may be a connection issue between gpupgrade’s various processes if you receive “transport is closing” or “context deadline exceeded” errors such as the following:

  • Error: rpc error: code = Unavailable desc = transport is closing

  • Error: connecting to hub on port 7527: context deadline exceeded

gpupgrade runs CLI, hub, and agent processes. For a variety of reasons, the underlying connections between them can break, resulting in the above errors. Try stopping these processes with gpupgrade kill-services, and restarting with gpupgrade restart-services.

Next Steps

Continue with the gpupgrade Execute Phase or gpupgrade revert.
