How the types correspond between a VMware Tanzu GemFire region and a VMware Tanzu Greenplum table are described in a cache.xml file.

GemFire XML Description of a Greenplum Table

In order to use the connector, a mapping between a GemFire region and a Greenplum Database table has to be described. This mapping may go in the cache.xml file, which has an extended syntax for the mapping information.

The connector makes assumptions about the intended mapping if some parts of the mapping are incomplete. We advise you to explicitly define all keys and fields, such that the connector has no assumptions to make.

This cache.xml file example provides the XML description of the mappings for the region. The XML with the gpdb tag is the mapping between GemFire region fields and the Greenplum Database table columns. It describes how to connect to Greenplum Database, as well as gpfdist protocol to be used to transfer the data between the two systems.

<?xml version="1.0" encoding="UTF-8"?>
<cache xmlns="http://geode.apache.org/schema/cache"
  xmlns:gpdb="http://schema.pivotal.io/gemfire/gpdb"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://geode.apache.org/schema/cache
  http://geode.apache.org/schema/cache/cache-1.0.xsd
  http://schema.pivotal.io/gemfire/gpdb
  http://schema.pivotal.io/gemfire/gpdb/gpdb-3.5.xsd"
  version="1.0">

  <pdx read-serialized="true" persistent="false">
      <pdx-serializer>
          <class-name>org.apache.geode.pdx.ReflectionBasedAutoSerializer</class-name>
          <parameter name="classes">
              <string>io.pivotal.gemfire.demo.entity.*</string>
          </parameter>
      </pdx-serializer>
  </pdx>
  <jndi-bindings>
      <jndi-binding jndi-name="DemoDatasource" type="SimpleDataSource"
          jdbc-driver-class="org.postgresql.Driver" user-name="gpadmin"
          password="changeme" connection-url="jdbc:postgresql://localhost:5432/gemfire_db">
      </jndi-binding>
  </jndi-bindings>
  <region name="Parent">
      <region-attributes refid="PARTITION">
          <partition-attributes redundant-copies="1" />
      </region-attributes>
      <gpdb:store datasource="DemoDatasource">
          <gpdb:types>
              <gpdb:pdx name="io.pivotal.gemfire.demo.entity.Parent"
                  schema="public"
                  table="parent">
                  <gpdb:id field="id" />
                  <gpdb:fields>
                      <gpdb:field name="id" column="id" />
                      <gpdb:field name="name" />
                      <gpdb:field name="income" class="java.math.BigDecimal" />
                  </gpdb:fields>
              </gpdb:pdx>
          </gpdb:types>
      </gpdb:store>
  </region>
  <gpdb:gpfdist port="8000" />
</cache>

jndi-binding

In the example, there is a single JNDI binding to specify the details of the connection to the example’s Greenplum database. This element defines a datasource that is referenced later in gpdb:store element using datasource attribute.

The connection to the Greenplum instance will be to localhost:5432, with a default database of gemfire_db. The Greenplum Database user name is gpadmin, and its password is changeme.

For more details on JNDI bindings, see JNDI.

gpdb:store

The gpdb:store element specifies the mapping between Greenplum Database table contents and the GemFire region.

There should be one store element for each Greenplum data source.

In this example, the gpdb:store element identifies the JNDI binding to specify the Greenplum Database instance at the other end of the connector. It is the data source.

gpdb:types

The gpdb:types element contains a list of gpdb:pdx instances. There is only one gpdb:pdx instance in this example.

gpdb:pdx

Each gpdb:pdx instance identifies the mapping of a GemFire region’s entries to its corresponding Greenplum Database table. This is needed for the data serialization and transfer implemented by the connector. The optional schema attribute specifies the Greenplum Database schema for the table. See Creating and Managing Schemas within the Greenplum Database manual for a description of Greenplum schemas. The optional schema attribute is set to the default value of public when not specified. Make the schema attribute value the same across all GemFire servers that host the region.

gpdb:id

The gpdb:id element specifies the fields to be used as the key in the identification of a GemFire object.

The example uses a singleton key named id, also mapped to table column id:

  <gpdb:id field="id" />

Here is an example of a composite key with the two fields parentId and id mapped to table columns parentId and id.

  <gpdb:id>
    <gpdb:field ref="parentId" />
    <gpdb:field ref="id" />
  </gpdb:id>

gpdb:fields

The gpdb:fields element lists the fields mapped to values of the GemFire PdxInstance. A subset of the available GemFire fields or Greenplum Database table columns may be mapped. Omission of this element causes the connector to automap the GemFire region fields to Greenplum table columns, using the assumption that the exactly matching field and column names are to be mapped.

gpdb:field

Each gpdb:field instance specifies the mapping from a GemFire PdxIntance field to a Greenplum Database table column. If no Greenplum column attribute is specified, the Greenplum column is assumed to have exactly the same name (case sensitive) as the GemFire field.

Here is an example in which a GemFire parentId field is mapped to a Greenplum Database parent_id column, and the GemFire age field is of the Java class (data type) java.lang.Integer.

  <gpdb:field name="parentId" column="parent_id" />
  <gpdb:field name="age" class="java.lang.Integer" />

Special cases exist when the gpdb:field attributes are missing from the cache.xml file. With no gpdb:field attributes defined, the connector assumes that the GemFire region fields have the same names as the Greenplum Database table columns. This list describes the current behavior for this type of special case:

  • If there exists a Greenplum table column for which there is no GemFire region field, then an export operation does not update that Greenplum Database table column. and an insert of a new row places either null or the defined default into the columns with no associated GemFire region field. An import operates on only those GemFire region fields for which the column with the matching name exists.

  • If there is a GemFire region field for which there is no matching Greenplum Database table column, then an export operation only exports the GemFire region fields for which there is a Greenplum table column of the same name. An import has side effects on the GemFire region. The import correctly acquires those GemFire region fields for which the column with the matching name exists. But, the import sets to null those GemFire region fields for which there is no Greenplum Database table column of the same name; this occurs for both an update and an insert of region entries.

If one or more gpdb:field attributes are specified, there can exist a mismatch of fields to columns. This list describes the current behavior for this type of special case:

  • If there is exactly one mapping of a field to a column, and the naming of that GemFire region field or a Greenplum Database table column within a gpdb:field attribute is incorrect such that it does not match a configured and defined name, an exception is thrown upon export, as this case appears as if there are no mappings.

  • With more than one gpdb:field attributes specified, if the naming of one GemFire region field or a Greenplum Database table column is incorrect such that it does not match a configured and defined name, upon export an update to a Greenplum table row will not change the value of the incorrectly named column. An insert of a new Greenplum table row under the same assumptions acquires the correctly mapped GemFire region fields. The incorrectly mapped Greenplum column will be set to null or its defined default value. Upon import, either an update of an existing entry or an insert of a new entry acquires the Greenplum Database table column values for any correctly mapped fields, and uses null for any incorrectly named GemFire field.

gpdb:gpfdist

The gpdb:gpfdist element is not part of the region specification, but is part of the cache specification. It provides details of the gpfdist protocol used in the connector. More details may be found within the section on Using the Greenplum Parallel File Server (gpfdist) in the Greenplum Database manual.

To increase the security of the data during transfer between GemFire and Greenplum Database, TLS/SSL transfers can be enabled, as in this example:

  <gpdb:gpfdist hostname="gpfdist.gemfire.test">
    <gpdb:ssl
        key-store="target/test-classes/io/pivotal/gemfire/gpdb/gpfdist/keystore.jks"
        key-store-password="123456"
        trust-store="target/test-classes/io/pivotal/gemfire/gpdb/gpfdist/truststore.jks"
        trust-store-password="123456" />
  </gpdb:gpfdist>

All four gpdb:ssl attributes must be present to configure the transfers to use SSL. The two password fields remain in plain text within the cache.xml file, so file system protections should be considered.

The specification of optional attributes overrides default values. These are the optional attributes:

  • A port specification overrides the listening port of gpfdist to be other than its default of the first available port within the inclusive range of 8080-8280. The value can be a single port number or a range. If specified with a single port number and that port number is already in use, an exception will be thrown. If specified with a range, the syntax uses a colon to separate the two endpoints of the range, as in <lowport>:<highport>. A value of 0 causes automatic allocation of a port number from an ephemeral port range. An example with a specific port number:

      <gpdb:gpfdist port="8000" />
    

An example that specifies a port range:

``` pre
  <gpdb:gpfdist port="8000:8100" />
```
  • A hostname specifies the host name used in URI generation. If SSL is enabled, this name must match the common name (CN) of installed key store certificate. The default value is an IP address of the local host.

      <gpdb:gpfdist hostname="host12.example.com" />
    
  • An address specification identifies the IP address for listening to be other than the default value of 0.0.0.0. For example:

      <gpdb:gpfdist address="192.0.2.0" />
    
  • idle-timeout specifies a timeout in milliseconds. If no data is transmitted within this timeout period, the connection may be closed. The default value is 60000 milliseconds. There will be no timeout if set to 0. For example

      <gpdb:gpfdist idle-timeout="120000" />
    

Requirements and Caveats in the Mapping

  • There must be only one Greenplum Database table specified per GemFire region. If more than one Greenplum table is specified for a single region by defining distinct Greenplum tables across multiple cache.xml files, then an exception is thrown. If more than one Greenplum Database table is specified within a single cache.xml file, no error will be thrown or issued, and the import or export operation will proceed, resulting in incorrect results.

Supported Datatypes

The following datatypes are supported. The left side shows the Greenplum/PostgreSQL datatype, and the right side shows the associated GemFire/Java datatype which it is mapped to.

Greenplum Database GemFire
Bigint Long (Object wrapper)
Bigint not null long (primitive)
Bigserial long (primitive)
Boolean not null boolean (primitive)
Boolean Boolean (wrapper)
Bit BitSet
Varbit BitSet
Char String
Varchar String
Decimal BigDecimal
Float Float (Object wrapper)
Float not null float (primitive)
Int Integer (Object wrapper)
Int not null int (primitive)
Real Float (Object wrapper)
Real not null float (primitive)
Serial int (primitive)
Smallint Short (Object wrapper)
Smallint not null short (primitive)
Text String
Date java.sql.Date
Time Time
Timetz Time
Timestamp Timestamp
Timestamptz Timestamp

Unsupported Greenplum Datatypes

  • inet
  • cidr
  • bytea
  • box
  • circle
  • interval
  • seg
  • macaddr
  • path
  • point
  • polygon
  • xml
  • money
  • array

Unsupported GemFire Datatypes

  • Classes with data members that are objects. Not supported, as this would require a representation of nested objects in Greenplum Database. Here is an example of an unsupported type:

    class HasAnObjectMember {
        private UserObject theObject;
    }
    
  • Any data type other than BitSet, Boolean, Date, Number, and String. Array types are an example of an unsupported data type.

check-circle-line exclamation-circle-line close-line
Scroll to top icon