A transformer plugin is a set of Go functions that perform specific formatting or processing on data after it is read from a Kafka or RabbitMQ data source.
This topic describes transformer plugins and how to use them with Greenplum Streaming Server (GPSS).
You compile the Go functions that you develop for a transformer plugin into a shared library. Users of the plugin specify the file system path to this library in the GPSS load configuration file.
The GPSS transformer plugin framework exposes two entry points: an initialization function and a transform function.
The framework supports specifying properties that direct the processing performed by the transformer. GPSS passes any transformer properties specified in the load configuration file to the Go functions. The transformer-related load configuration properties are described further in Using a Transformer Plugin in GPSS.
Refer to the gp-stream-server-plugin GitHub repository for an example Greenplum Streaming Server transformer plugin implementation.
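The repository example remains the authority for the real function signatures. As a rough, self-contained illustration of the two-entry-point shape only, the sketch below pairs an initialization function that reads a property with a transform function that uses it. Every name here (Properties, SampleTransformOnInit, SampleTransform, and the separator property) is hypothetical and is not part of the GPSS plugin API.

```go
package main

import (
	"errors"
	"strings"
)

// Properties is a hypothetical stand-in for the transformer properties that
// are set in the load configuration file. The real GPSS plugin API defines
// its own types for passing properties and data to the functions.
type Properties map[string]string

// separator holds the value captured by the initialization function so that
// the transform function can use it on every message.
var separator string

// SampleTransformOnInit is a hypothetical initialization entry point.
// It validates the properties once, before any data is transformed.
func SampleTransformOnInit(props Properties) error {
	sep, ok := props["separator"]
	if !ok || sep == "" {
		return errors.New("missing required property: separator")
	}
	separator = sep
	return nil
}

// SampleTransform is a hypothetical transform entry point. It upper-cases
// the message and replaces each comma with the configured separator.
func SampleTransform(message []byte) []byte {
	upper := strings.ToUpper(string(message))
	return []byte(strings.ReplaceAll(upper, ",", separator))
}
```

In this sketch, a load configuration file would name SampleTransformOnInit as the initialization function, SampleTransform as the transform function, and supply separator as a property. Splitting validation into the initialization function means a misconfigured property fails once up front rather than on every message.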
To use a transformer plugin in a Kafka or RabbitMQ data job, you must specify an INPUT:TRANSFORMER (version 2) or sources:<source>:transformer (version 3 (Beta)) block in the load configuration file. The properties in this block identify the file system path to the transformer plugin library, the initialization and transform function names, and transform-specific properties that GPSS passes to the functions.
Version 2 format syntax:
[TRANSFORMER:
   PATH: <path_to_plugin_transform_library>
   ON_INIT: <plugin_transform_init_name>
   TRANSFORM: <plugin_transform_name>
   PROPERTIES:
      <plugin_transform_property_name>: <property_value>
      [ ... ] ]
Version 3 (Beta) format syntax:
transformer:
  path: <path_to_plugin_transform_library>
  on_init: <plugin_transform_init_name>
  transform: <plugin_transform_name>
  properties:
    <plugin_transform_property_name>: <property_value>
    ...
When you specify a transformer plugin in your GPSS load configuration file, GPSS invokes the transform function to process the data after it applies an input filter (if specified).
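The ordering described above can be modeled with a small sketch. This is not how GPSS is implemented; it only illustrates that a record rejected by the filter never reaches the transform function. The Filter and Transform types and the process function are hypothetical names introduced for this example.

```go
package main

// Filter and Transform are hypothetical function types used only to model
// the ordering of the two steps.
type Filter func(record string) bool
type Transform func(record string) string

// process applies the optional filter first, then passes only the records
// that survive the filter to the transform function.
func process(records []string, filter Filter, transform Transform) []string {
	var out []string
	for _, r := range records {
		if filter != nil && !filter(r) {
			continue // rejected by the filter; the transform never sees it
		}
		out = append(out, transform(r))
	}
	return out
}
```

A practical consequence of this ordering is that a transform function can assume its input has already passed the filter, while a nil filter in the sketch corresponds to a job with no filter specified.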