This topic presents best practices to follow when you use the Greenplum Streaming Server Kafka Integration.

Choosing a Commit Threshold

GPSS supports two mechanisms to control how and when it commits Kafka data to Greenplum Database: a time period and a number of rows. You specify one or both of the MINIMAL_INTERVAL (time period) and MAX_ROW (number of rows) properties in the Kafka load configuration file.

For best results, test several MINIMAL_INTERVAL values to determine which one works best in your environment.

When message flow is heavy, GPSS may receive and buffer many messages during the MINIMAL_INTERVAL time period. In this situation, also setting MAX_ROW can help reduce memory usage.
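
As an illustration, a load configuration file that sets both thresholds might contain a fragment like the following. This is a minimal sketch, not a complete or recommended configuration: it assumes the YAML layout in which both properties sit in a COMMIT block under KAFKA, and it assumes that MINIMAL_INTERVAL is expressed in milliseconds. The values shown are placeholders to tune for your own environment; check the load configuration file reference for your GPSS version before relying on the exact layout.

```yaml
# Fragment only: connection, INPUT, and OUTPUT properties are omitted.
KAFKA:
   COMMIT:
      # Illustrative value: row-count threshold for a commit.
      MAX_ROW: 10000
      # Illustrative value: time threshold, assumed to be in milliseconds.
      MINIMAL_INTERVAL: 2000
```

With only MINIMAL_INTERVAL set, the time period alone governs commits; adding MAX_ROW gives GPSS a second, row-based threshold for periods of heavy message flow.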

