You can configure a Greenplum system to use proxies for interconnect communication to reduce the use of connections and ports during query processing.
The Greenplum interconnect (the networking layer) refers to the inter-process communication between segments and the network infrastructure on which this communication relies. For information about the Greenplum architecture and interconnect, see About the Greenplum Architecture.
In general, when running a query, a QD (query dispatcher) on the Greenplum master creates connections to one or more QE (query executor) processes on segments, and a QE can create connections to other QEs. For a description of Greenplum query processing and parallel query processing, see About Greenplum Query Processing.
By default, connections between the QD on the master and QEs on segment instances and between QEs on different segment instances require a separate network port. You can configure a Greenplum system to use proxies when Greenplum communicates between the QD and QEs and between QEs on different segment instances. The interconnect proxies require only one network connection for Greenplum internal communication between two segment instances, so they consume fewer connections and ports than TCP mode and perform better than UDPIFC mode in a high-latency network.
To enable interconnect proxies for the Greenplum system, set these system configuration parameters.
gp_interconnect_proxy_addresses: the list of proxy ports for the system.
gp_interconnect_type: set this parameter to proxy.
Note: When expanding a Greenplum Database system, you must deactivate interconnect proxies before adding new hosts and segment instances to the system, and you must update the gp_interconnect_proxy_addresses parameter with the newly added segment instances before you re-enable interconnect proxies.
This example sets up a Greenplum system to use proxies for the Greenplum interconnect when running queries. The example sets the gp_interconnect_proxy_addresses parameter and tests the proxies before setting the gp_interconnect_type parameter for the Greenplum system.
Set the gp_interconnect_proxy_addresses parameter to specify the proxy ports for the master and segment instances. The value has the following format, and you must specify it as a single-quoted string.
<db_id>:<cont_id>:<seg_address>:<port>[, ... ]
For the master, standby master, and each segment instance, the first three fields, db_id, cont_id, and seg_address, can be found in the gp_segment_configuration catalog table. The fourth field, port, is the proxy port for the Greenplum master or a segment instance.
db_id is the dbid column in the catalog table.
cont_id is the content column in the catalog table.
seg_address is the address column in the catalog table.
Important: If a segment instance hostname is bound to a different IP address at runtime, you must run gpstop -u to reload the gp_interconnect_proxy_addresses value.
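The value format described above can be checked before it is applied. The following is a minimal sketch, not part of Greenplum: the names ProxyEntry and parse_proxy_addresses are hypothetical, and the code assumes that seg_address is a hostname or IPv4 address that contains no colon. The sample value is the one used in the gpconfig example in this topic.

```python
from typing import List, NamedTuple


class ProxyEntry(NamedTuple):
    """One <db_id>:<cont_id>:<seg_address>:<port> entry (hypothetical helper type)."""
    db_id: int
    cont_id: int
    seg_address: str
    port: int


def parse_proxy_addresses(value: str) -> List[ProxyEntry]:
    """Split a comma-separated value into entries and check each field.

    Assumption: seg_address contains no colon (hostname or IPv4 address).
    """
    entries = []
    for item in value.split(","):
        fields = item.strip().split(":")
        if len(fields) != 4:
            raise ValueError("expected 4 colon-separated fields: %r" % item)
        db_id, cont_id, seg_address, port = fields
        entry = ProxyEntry(int(db_id), int(cont_id), seg_address, int(port))
        if not 1 <= entry.port <= 65535:
            raise ValueError("port out of range: %r" % item)
        entries.append(entry)
    return entries


if __name__ == "__main__":
    sample = "1:-1:192.168.180.50:35432,2:0:192.168.180.54:35000"
    for entry in parse_proxy_addresses(sample):
        print(entry)
```

A check like this only confirms the shape of the string. It does not confirm that the entries match the rows in gp_segment_configuration.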
This is an example PL/Python function that displays or sets the segment instance proxy port values for the gp_interconnect_proxy_addresses parameter. To create and run the function, you must enable PL/Python in the database with the CREATE EXTENSION plpythonu command.
--
-- A PL/Python function to setup the interconnect proxy addresses.
-- Requires the Python modules os and socket.
--
-- Usage:
--   select my_setup_ic_proxy(-1000, '');              -- display IC proxy values for segments
--   select my_setup_ic_proxy(-1000, 'update proxy');  -- update the gp_interconnect_proxy_addresses parameter
--
-- The first argument, "delta", is used to calculate the proxy port with this formula:
--
--     proxy_port = postmaster_port + delta
--
-- The second argument, "action", is used to update the gp_interconnect_proxy_addresses parameter.
-- The parameter is not updated unless "action" is 'update proxy'.
-- Note that running "gpstop -u" is required for the update to take effect.
-- A Greenplum system restart will also work.
--
create or replace function my_setup_ic_proxy(delta int, action text)
returns table(dbid smallint, content smallint, address text, port int) as $$
    import os
    import socket

    results = []
    value = ''

    segs = plpy.execute('''SELECT dbid, content, port, address
                             FROM gp_segment_configuration
                            ORDER BY 1''')
    for seg in segs:
        dbid = seg['dbid']
        content = seg['content']
        port = seg['port']
        address = seg['address']

        # decide the proxy port
        port = port + delta

        # append to the result list
        results.append((dbid, content, address, port))

        # build the value for the GUC
        if value:
            value += ','
        value += '{}:{}:{}:{}'.format(dbid, content, address, port)

    if action.lower() == 'update proxy':
        os.system('''gpconfig --skipvalidation -c gp_interconnect_proxy_addresses -v "'{}'"'''.format(value))
        plpy.notice('''the settings are applied, please reload with 'gpstop -u' to take effect.''')
    else:
        plpy.notice('''if the settings are correct, re-run with 'update proxy' to apply.''')

    return results
$$ language plpythonu execute on master;
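The function applies proxy_port = postmaster_port + delta without checking the result. The following is a minimal sketch, run outside the database, that applies the same formula and rejects results that fall outside the valid TCP port range or that collide with a port already listed for the same address. The name plan_proxy_ports and the postmaster ports in the usage section are hypothetical, and the collision check is an assumption about what is worth verifying, not documented Greenplum behavior.

```python
from typing import List, Tuple

# (dbid, content, address, port), as in gp_segment_configuration
Segment = Tuple[int, int, str, int]


def plan_proxy_ports(segments: List[Segment], delta: int) -> List[Segment]:
    """Return (dbid, content, address, proxy_port) rows, or raise on a bad port."""
    # Ports already taken by postmasters, keyed by address (assumed check).
    in_use = {(address, port) for _, _, address, port in segments}
    planned = []
    for dbid, content, address, port in segments:
        proxy_port = port + delta  # same formula as my_setup_ic_proxy
        if not 1 <= proxy_port <= 65535:
            raise ValueError(
                "dbid %d: proxy port %d is out of range" % (dbid, proxy_port))
        if (address, proxy_port) in in_use:
            raise ValueError(
                "dbid %d: port %d is already used on %s" % (dbid, proxy_port, address))
        in_use.add((address, proxy_port))
        planned.append((dbid, content, address, proxy_port))
    return planned


if __name__ == "__main__":
    # Hypothetical postmaster ports, for illustration only.
    segments = [(1, -1, "192.168.180.50", 36432), (2, 0, "192.168.180.54", 36000)]
    for row in plan_proxy_ports(segments, -1000):
        print("%d:%d:%s:%d" % row)
```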
Note: When you run the function, you should connect to the database using the Greenplum interconnect type UDPIFC or TCP. This example uses psql to connect to the database mytest with the interconnect type UDPIFC.
PGOPTIONS="-c gp_interconnect_type=udpifc" psql -d mytest
Running this command lists the segment instance values for the gp_interconnect_proxy_addresses parameter.
select my_setup_ic_proxy(-1000, '');
This command runs the function to set the parameter.
select my_setup_ic_proxy(-1000, 'update proxy');
As an alternative, you can run the gpconfig utility to set the gp_interconnect_proxy_addresses parameter. To set the value as a string, enclose the single-quoted string in double quotes. The example Greenplum system consists of a master and a single segment instance.
gpconfig --skipvalidation -c gp_interconnect_proxy_addresses -v "'1:-1:192.168.180.50:35432,2:0:192.168.180.54:35000'"
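When the command is launched from a script rather than typed in a shell, the outer double quotes are not needed, but the single quotes must still be part of the argument. The following is a minimal sketch of that quoting, assuming the command is passed as an argument list with no shell involved. The name gpconfig_proxy_args is hypothetical; the gpconfig options are the ones shown in the command above.

```python
from typing import List


def gpconfig_proxy_args(value: str) -> List[str]:
    """Build an argument list equivalent to the shell command in this topic."""
    if "'" in value or '"' in value:
        raise ValueError("value must not contain quote characters")
    return [
        "gpconfig", "--skipvalidation",
        "-c", "gp_interconnect_proxy_addresses",
        # The single quotes are part of the argument itself; no shell
        # is involved, so the outer double quotes are dropped.
        "-v", "'" + value + "'",
    ]


if __name__ == "__main__":
    args = gpconfig_proxy_args("1:-1:192.168.180.50:35432,2:0:192.168.180.54:35000")
    print(args)
```

On a host where gpconfig is on the PATH, a list like this could be passed to subprocess.run(args, check=True). The sketch only builds the list and does not run the utility.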
After setting the gp_interconnect_proxy_addresses parameter, reload the postgresql.conf file with the gpstop -u command. This command does not stop and restart the Greenplum system.
To test the proxy ports configured for the system, you can set the PGOPTIONS environment variable when you start a psql session in a command shell. This command sets the environment variable to enable interconnect proxies, starts psql, and logs into the database mytest.
PGOPTIONS="-c gp_interconnect_type=proxy" psql -d mytest
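The same per-session override can be prepared from a script. The following is a minimal sketch that builds an environment with PGOPTIONS set for one of the three interconnect types named in this topic. The name psql_env is hypothetical, and the sketch replaces any PGOPTIONS value that is already set in the environment.

```python
import os
from typing import Dict, Mapping, Optional

# Interconnect types named in this topic.
INTERCONNECT_TYPES = ("udpifc", "tcp", "proxy")


def psql_env(interconnect_type: str,
             base: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with PGOPTIONS set for one session."""
    kind = interconnect_type.lower()
    if kind not in INTERCONNECT_TYPES:
        raise ValueError("unknown interconnect type: %r" % interconnect_type)
    env = dict(os.environ if base is None else base)
    # Overwrites any existing PGOPTIONS value (a simplification in this sketch).
    env["PGOPTIONS"] = "-c gp_interconnect_type=" + kind
    return env


if __name__ == "__main__":
    print(psql_env("proxy")["PGOPTIONS"])
```

The returned mapping could be passed as the env argument of subprocess.run(["psql", "-d", "mytest"], env=...), so the setting applies only to that session and not to the system configuration.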
You can run queries in the shell to test the system. For example, you can run a query that accesses all the primary segment instances. This query displays the segment IDs and the number of rows on each segment instance from the table sales.
# SELECT gp_segment_id, COUNT(*) FROM sales GROUP BY gp_segment_id ;
After you have tested the interconnect proxies for the system, set the server configuration parameter for the system with the gpconfig utility.
gpconfig -c gp_interconnect_type -v proxy
Reload the postgresql.conf file with the gpstop -u command. This command does not stop and restart the Greenplum system.