This topic presents best practices to follow when you use the Greenplum Streaming Server Kafka Integration.
GPSS supports two mechanisms to control how and when it commits Kafka data to Greenplum Database: a time period and a number of rows. You specify one or both of MINIMAL_INTERVAL and MAX_ROW in the Kafka load configuration file.
For best results, try various settings of MINIMAL_INTERVAL to determine which value works best in your environment.
When message flow is heavy, GPSS may receive and buffer many messages during the MINIMAL_INTERVAL time period. In this situation, also setting MAX_ROW may help reduce high memory usage.
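As a rough illustration, the two settings might appear together in a load configuration file as shown below. This is a sketch rather than a complete or authoritative file: it assumes the version 2 file layout, in which both properties sit in a COMMIT block under KAFKA, and it assumes that MINIMAL_INTERVAL is given in milliseconds. The values are arbitrary placeholders, not tuning recommendations. Check the configuration file reference for your GPSS version before relying on this layout.

```yaml
# Fragment of a GPSS Kafka load configuration file (version 2 layout assumed).
# The INPUT and OUTPUT blocks that a real file requires are omitted here.
KAFKA:
   COMMIT:
      # Time-based trigger; assumed to be in milliseconds. Placeholder value.
      MINIMAL_INTERVAL: 2000
      # Row-count trigger, intended to limit how many rows are buffered
      # during heavy message flow. Placeholder value.
      MAX_ROW: 10000
```

A reasonable way to tune this is to start with MINIMAL_INTERVAL alone, observe memory usage under a realistic message load, and add MAX_ROW only if buffering becomes a problem.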