If you are using the Greenplum Streaming Server (GPSS) in your current Greenplum Database installation, you must perform the GPSS upgrade procedure when you upgrade to a new version of Greenplum Database or install a new version of GPSS.
The GPSS upgrade procedures describe how to upgrade GPSS in your Greenplum Database installation or on your ETL host. This procedure uses GPSS.from to refer to your currently-installed GPSS and GPSS.new to refer to the GPSS installed when you upgrade to the new version of Greenplum Database or install a new GPSS package.
The GPSS upgrade procedure has two parts. You perform one procedure before, and one procedure after, you upgrade to a new version of Greenplum Database or GPSS.
Perform this procedure in your GPSS.from installation before you upgrade to a new version of Greenplum Database or GPSS:
Log in to the Greenplum Database coordinator host or the ETL host and set up your environment. For example:
$ ssh gpadmin@<gpcoord>
gpadmin@gpcoord$ . /usr/local/greenplum-db/greenplum_path.sh
Or:
$ ssh etluser@<etlhost>
etluser@etlhost$ . /usr/local/gpss/gpss_path.sh
Identify and note the current version (GPSS.from) of GPSS. For example:
$ gpss --version
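Because the later steps depend on which version you are upgrading from, it can help to save the reported version before you change anything. The following is a minimal sketch, not part of GPSS: `save_version` is a hypothetical helper and the output file path is an assumption.

```shell
# Hypothetical helper (not a GPSS command): run a version command and
# append its output, prefixed with a label, to a notes file.
# Usage: save_version LABEL OUT_FILE COMMAND [ARGS...]
save_version() {
  label="$1"
  out_file="$2"
  shift 2
  # Run the command; give up if it is missing or exits non-zero.
  version_output="$("$@")" || return 1
  printf '%s: %s\n' "$label" "$version_output" >> "$out_file"
}

# Example, assuming gpss is on the PATH after you source the environment script:
#   save_version GPSS.from "$HOME/gpss_versions.txt" gpss --version
```

Running the same helper with the label GPSS.new after the upgrade keeps both values in one file.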
Stop all gpss jobs that are in the Running state.
Stop all running gpss instances.
Upgrade to the new version of Greenplum Database or install a new version of GPSS, and then continue your GPSS upgrade with Step 2: Upgrading GPSS.
After you upgrade to the new version of Greenplum Database or install the new version of GPSS in your Greenplum installation, perform the following procedure to upgrade the GPSS.new software:
Log in to the Greenplum Database coordinator host or the ETL host and set up your environment. For example, on the coordinator:
$ ssh gpadmin@<gpcoord>
gpadmin@gpcoord$ . /usr/local/greenplum-db/greenplum_path.sh
Identify and note the new version (GPSS.new) of GPSS. For example:
gpadmin@gpcoord$ gpss --version
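The steps that follow apply only when GPSS.from is at or below a given version. One way to check this in a script is sketched below. The function names are hypothetical. The sketch assumes that you pass plain dotted version numbers (for example 1.3.1) that you have extracted yourself from the `gpss --version` output, and that your `sort` supports the `-V` version-sort option.

```shell
# version_le A B: succeeds when version A is less than or equal to version B.
# Assumes dotted numeric versions and a sort(1) that supports -V.
version_le() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# version_lt A B: succeeds when version A is strictly less than version B.
version_lt() {
  ! version_le "$2" "$1"
}

# Example: decide whether the "1.3.1 or older" and "1.4.x or older" steps apply.
#   from_version="1.3.0"   # the GPSS.from value you noted earlier
#   version_le "$from_version" 1.3.1 && echo "1.3.1-or-older steps apply"
#   version_lt "$from_version" 1.5.0 && echo "1.4.x-or-older steps apply"
```

A threshold written as "1.4.x or older" is treated here as "strictly less than 1.5.0", which is an interpretation rather than something the procedure states.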
If you are upgrading from GPSS version 1.3.0 or older:
GPSS 1.3.0 introduced a regression that caused it to no longer recognize history tables (internal tables that GPSS creates for each job) that were created with GPSS 1.2.6. This regression could cause GPSS to load duplicate Kafka messages into Greenplum. This issue is resolved in GPSS 1.3.1.
You are not required to perform any upgrade steps related to this issue; GPSS will automatically perform the required actions when you resubmit and restart a load job that you initiated with GPSS 1.3.0. GPSS's upgrade actions are dependent upon the GPSS version(s) from which you are upgrading, and are described below:
deprecated_.
If you are upgrading from GPSS version 1.3.1 or older:
Changes to the gpss.json configuration file:
Certificates for GPSS and gpfdist: if you are using SSL to encrypt communication between GPSS and Kafka, Greenplum, or the GPSS client, you must update the gpss.json server configuration file to configure the correct Certificate block.
The ListenAddress:SSL property is removed. Ensure that you remove this property from all GPSS server configuration files.
The gpkafka check command is renamed to gpkafka history. If you have any scripts or programs that reference gpkafka check, you must replace these references with gpkafka history.
The ENCRYPTION property is removed from the gpkafka.yaml job configuration file. Ensure that you remove this property from all job configuration files, and that you provide Kafka SSL configuration properties via the PROPERTY block in the file.
The LOCAL_HOSTNAME and LOCAL_PORT properties are removed from the gpkafka.yaml job configuration file. You must remove these properties from all job configurations, and specify the gpfdist configuration for each job in one of the following ways:
If you use gpkafka load, provide the --config gpfdistconfig.json or --gpfdist-host hostaddr and --gpfdist-port portnum options when you run the command.
If you use the gpsscli job management commands, ensure that the gpss.json configuration file for the gpss server instance servicing the request specifies the desired Gpfdist:Host and Gpfdist:Port settings.
The --no-reuse flag is removed from the gpsscli load and gpsscli start commands. If you have any scripts or programs that reference this flag, you must remove the references.
If you developed a client application with GPSS 1.3.5 or earlier and you want to use the new MaxErrorRows or Abort session capabilities added to the Close service that were introduced in GPSS 1.3.6, you must:
Edit the gpss.proto service definition and add the new CloseRequest field(s):
message CloseRequest {
    Session session = 1;
    int32 MaxErrorRows = 2;
    bool Abort = 3;
}
Re-generate the GPSS client classes.
Add code to utilize the new fields.
Re-compile and re-distribute your GPSS client application. Refer to Developing a Batch Data Client for supporting information.
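For the script changes required when upgrading from GPSS 1.3.1 or older (replacing gpkafka check with gpkafka history, and removing the --no-reuse flag), a small shell sketch such as the following may help. Both function names are hypothetical helpers, not GPSS commands, and the sketch assumes that your scripts are plain text files you can safely edit in place.

```shell
# rename_gpkafka_check FILE: replace "gpkafka check" with "gpkafka history"
# in FILE, keeping a backup copy of the original at FILE.bak.
rename_gpkafka_check() {
  file="$1"
  cp "$file" "$file.bak" || return 1
  # Rewrite from the backup so the original file keeps its permissions.
  sed 's/gpkafka[[:space:]][[:space:]]*check/gpkafka history/g' "$file.bak" > "$file"
}

# list_no_reuse FILE...: print the numbered lines that still pass --no-reuse,
# so that you can remove the flag by hand.
list_no_reuse() {
  grep -n -e '--no-reuse' "$@"
}

# Example (the script path is a placeholder):
#   rename_gpkafka_check /path/to/load_jobs.sh
#   list_no_reuse /path/to/load_jobs.sh
```

Review the result before deleting the backup. Note also that a later step in this procedure requires removing references to gpkafka history when you upgrade from GPSS 1.4.x or older.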
If you are upgrading from GPSS version 1.4.x or older:
The gpsscli history and gpkafka history commands are removed. If you have any scripts or programs that reference these commands, you must remove the references.
If you are upgrading from GPSS version 1.6.x or older and you have registered the dataflow extension in any database, you must drop and re-create the extension:
DROP EXTENSION dataflow;
CREATE EXTENSION dataflow;
If you are upgrading from GPSS version 1.7.x or older:
The window property is renamed to task. If you have any Kafka load configuration files that specify window:, you must change the references to task:.
If you are upgrading from GPSS version 1.9.x or older:
A job_id field is added to the content of the server log file. You must update any scripts that you have written that rely on the log file naming format or the log file content of previous releases.
If you developed a client application with GPSS 1.9.x or earlier and you want to use the new session timeout capability added to the Connect service that was introduced in GPSS 1.10.0, you must:
Edit the gpss.proto service definition and add the new SessionTimeout field to the ConnectRequest message:
message ConnectRequest {
    string Host = 1;
    ...
    bool UseSSL = 6;
    int32 SessionTimeout = 7;
}
Re-generate the GPSS client classes.
Add code to utilize the new field.
Re-compile and re-distribute your GPSS client application. Refer to Developing a Batch Data Client for supporting information.
If you are upgrading from GPSS version 1.10.0:
If you installed a new version of Greenplum Database, or you installed the GPSS gppkg or .tar.gz packages in your Greenplum installation, you must drop and re-create the GPSS extension in any Greenplum database in which you are using GPSS to load data. A database superuser or the database owner must run these SQL commands:
DROP EXTENSION gpss;
CREATE EXTENSION gpss;
(If the extension does not already exist, GPSS automatically creates it in a database the first time a Greenplum superuser or the database owner submits a load job to any table that resides in that database.)
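If you load data into several databases, the drop and re-create commands above can be scripted. The following is a sketch under stated assumptions: the function names and the `DRY_RUN` variable are hypothetical, the database names are placeholders, and it assumes `psql` is on the PATH and connects as a superuser or the database owner.

```shell
# recreate_extension_sql EXT: print the SQL that drops and re-creates EXT.
recreate_extension_sql() {
  printf 'DROP EXTENSION %s; CREATE EXTENSION %s;\n' "$1" "$1"
}

# recreate_extension EXT DB...: run that SQL in each listed database.
# Set DRY_RUN=1 to print the psql commands instead of running them.
recreate_extension() {
  ext="$1"
  shift
  for db in "$@"; do
    sql="$(recreate_extension_sql "$ext")"
    if [ "${DRY_RUN:-0}" = "1" ]; then
      printf 'psql -d %s -c "%s"\n' "$db" "$sql"
    else
      # Stop at the first database where the commands fail.
      psql -d "$db" -c "$sql" || return 1
    fi
  done
}

# Example (database names are placeholders):
#   DRY_RUN=1 recreate_extension gpss testdb salesdb   # preview only
#   recreate_extension gpss testdb salesdb             # run for real
```

Previewing with `DRY_RUN=1` first makes it easier to confirm the database list before any extension is dropped.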
Restart your gpss instances.
Resubmit and restart your GPSS jobs.
For any Kafka job that you resubmit and restart, GPSS will consume Kafka messages from the offset associated with the latest timestamp recorded in the history table for the job.