How the types correspond between a VMware Tanzu GemFire region and a VMware Tanzu Greenplum table are described in a cache.xml
file.
In order to use the connector, a mapping between a GemFire region and a Greenplum Database table has to be described. This mapping may go in the cache.xml
file, which has an extended syntax for the mapping information.
The connector makes assumptions about the intended mapping if some parts of the mapping are incomplete. We advise you to explicitly define all keys and fields, such that the connector has no assumptions to make.
This cache.xml
file example provides the XML description of the mappings for the region. The XML with the gpdb
tag is the mapping between GemFire region fields and the Greenplum Database table columns. It describes how to connect to Greenplum Database, as well as gpfdist protocol to be used to transfer the data between the two systems.
<?xml version="1.0" encoding="UTF-8"?>
<cache xmlns="http://geode.apache.org/schema/cache"
xmlns:gpdb="http://schema.pivotal.io/gemfire/gpdb"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://geode.apache.org/schema/cache
http://geode.apache.org/schema/cache/cache-1.0.xsd
http://schema.pivotal.io/gemfire/gpdb
http://schema.pivotal.io/gemfire/gpdb/gpdb-3.5.xsd"
version="1.0">
<pdx read-serialized="true" persistent="false">
<pdx-serializer>
<class-name>org.apache.geode.pdx.ReflectionBasedAutoSerializer</class-name>
<parameter name="classes">
<string>io.pivotal.gemfire.demo.entity.*</string>
</parameter>
</pdx-serializer>
</pdx>
<jndi-bindings>
<jndi-binding jndi-name="DemoDatasource" type="SimpleDataSource"
jdbc-driver-class="org.postgresql.Driver" user-name="gpadmin"
password="changeme" connection-url="jdbc:postgresql://localhost:5432/gemfire_db">
</jndi-binding>
</jndi-bindings>
<region name="Parent">
<region-attributes refid="PARTITION">
<partition-attributes redundant-copies="1" />
</region-attributes>
<gpdb:store datasource="DemoDatasource">
<gpdb:types>
<gpdb:pdx name="io.pivotal.gemfire.demo.entity.Parent"
schema="public"
table="parent">
<gpdb:id field="id" />
<gpdb:fields>
<gpdb:field name="id" column="id" />
<gpdb:field name="name" />
<gpdb:field name="income" class="java.math.BigDecimal" />
</gpdb:fields>
</gpdb:pdx>
</gpdb:types>
</gpdb:store>
</region>
<gpdb:gpfdist port="8000" />
</cache>
In the example, there is a single JNDI binding to specify the details of the connection to the example’s Greenplum database. This element defines a datasource
that is referenced later in gpdb:store
element using datasource
attribute.
The connection to the Greenplum instance will be to localhost:5432
, with a default database of gemfire_db
. The Greenplum Database user name is gpadmin
, and its password is changeme
.
For more details on JNDI bindings, see JNDI.
The gpdb:store
element specifies the mapping between Greenplum Database table contents and the GemFire region.
There should be one store element for each Greenplum data source.
In this example, the gpdb:store
element identifies the JNDI binding to specify the Greenplum Database instance at the other end of the connector. It is the data source.
The gpdb:types
element contains a list of gpdb:pdx
instances. There is only one gpdb:pdx
instance in this example.
Each gpdb:pdx
instance identifies the mapping of a GemFire region’s entries to its corresponding Greenplum Database table. This is needed for the data serialization and transfer implemented by the connector. The optional schema
attribute specifies the Greenplum Database schema for the table. See Creating and Managing Schemas within the Greenplum Database manual for a description of Greenplum schemas. The optional schema
attribute is set to the default value of public
when not specified. Make the schema
attribute value the same across all GemFire servers that host the region.
The gpdb:id
element specifies the fields to be used as the key in the identification of a GemFire object.
The example uses a singleton key named id
, also mapped to table column id
:
<gpdb:id field="id" />
Here is an example of a composite key with the two fields parentId
and id
mapped to table columns parentId
and id
.
<gpdb:id>
<gpdb:field ref="parentId" />
<gpdb:field ref="id" />
</gpdb:id>
The gpdb:fields
element lists the fields mapped to values of the GemFire PdxInstance
. A subset of the available GemFire fields or Greenplum Database table columns may be mapped. Omission of this element causes the connector to automap the GemFire region fields to Greenplum table columns, using the assumption that the exactly matching field and column names are to be mapped.
Each gpdb:field
instance specifies the mapping from a GemFire PdxIntance
field to a Greenplum Database table column. If no Greenplum column
attribute is specified, the Greenplum column is assumed to have exactly the same name (case sensitive) as the GemFire field.
Here is an example in which a GemFire parentId
field is mapped to a Greenplum Database parent_id
column, and the GemFire age
field is of the Java class (data type) java.lang.Integer
.
<gpdb:field name="parentId" column="parent_id" />
<gpdb:field name="age" class="java.lang.Integer" />
Special cases exist when the gpdb:field
attributes are missing from the cache.xml
file. With no gpdb:field
attributes defined, the connector assumes that the GemFire region fields have the same names as the Greenplum Database table columns. This list describes the current behavior for this type of special case:
If there exists a Greenplum table column for which there is no GemFire region field, then an export operation does not update that Greenplum Database table column. and an insert of a new row places either null or the defined default into the columns with no associated GemFire region field. An import operates on only those GemFire region fields for which the column with the matching name exists.
If there is a GemFire region field for which there is no matching Greenplum Database table column, then an export operation only exports the GemFire region fields for which there is a Greenplum table column of the same name. An import has side effects on the GemFire region. The import correctly acquires those GemFire region fields for which the column with the matching name exists. But, the import sets to null those GemFire region fields for which there is no Greenplum Database table column of the same name; this occurs for both an update and an insert of region entries.
If one or more gpdb:field
attributes are specified, there can exist a mismatch of fields to columns. This list describes the current behavior for this type of special case:
If there is exactly one mapping of a field to a column, and the naming of that GemFire region field or a Greenplum Database table column within a gpdb:field
attribute is incorrect such that it does not match a configured and defined name, an exception is thrown upon export, as this case appears as if there are no mappings.
With more than one gpdb:field
attributes specified, if the naming of one GemFire region field or a Greenplum Database table column is incorrect such that it does not match a configured and defined name, upon export an update to a Greenplum table row will not change the value of the incorrectly named column. An insert of a new Greenplum table row under the same assumptions acquires the correctly mapped GemFire region fields. The incorrectly mapped Greenplum column will be set to null or its defined default value. Upon import, either an update of an existing entry or an insert of a new entry acquires the Greenplum Database table column values for any correctly mapped fields, and uses null for any incorrectly named GemFire field.
The gpdb:gpfdist
element is not part of the region specification, but is part of the cache specification. It provides details of the gpfdist
protocol used in the connector. More details may be found within the section on Using the Greenplum Parallel File Server (gpfdist) in the Greenplum Database manual.
To increase the security of the data during transfer between GemFire and Greenplum Database, TLS/SSL transfers can be enabled, as in this example:
<gpdb:gpfdist hostname="gpfdist.gemfire.test">
<gpdb:ssl
key-store="target/test-classes/io/pivotal/gemfire/gpdb/gpfdist/keystore.jks"
key-store-password="123456"
trust-store="target/test-classes/io/pivotal/gemfire/gpdb/gpfdist/truststore.jks"
trust-store-password="123456" />
</gpdb:gpfdist>
All four gpdb:ssl
attributes must be present to configure the transfers to use SSL. The two password fields remain in plain text within the cache.xml
file, so file system protections should be considered.
The specification of optional attributes overrides default values. These are the optional attributes:
A port
specification overrides the listening port of gpfdist
to be other than its default of the first available port within the inclusive range of 8080-8280. The value can be a single port number or a range. If specified with a single port number and that port number is already in use, an exception will be thrown. If specified with a range, the syntax uses a colon to separate the two endpoints of the range, as in <lowport>:<highport>
. A value of 0 causes automatic allocation of a port number from an ephemeral port range. An example with a specific port number:
<gpdb:gpfdist port="8000" />
An example that specifies a port range:
``` pre
<gpdb:gpfdist port="8000:8100" />
```
A hostname
specifies the host name used in URI generation. If SSL is enabled, this name must match the common name (CN) of installed key store certificate. The default value is an IP address of the local host.
<gpdb:gpfdist hostname="host12.example.com" />
An address
specification identifies the IP address for listening to be other than the default value of 0.0.0.0. For example:
<gpdb:gpfdist address="192.0.2.0" />
idle-timeout
specifies a timeout in milliseconds. If no data is transmitted within this timeout period, the connection may be closed. The default value is 60000 milliseconds. There will be no timeout if set to 0. For example
<gpdb:gpfdist idle-timeout="120000" />
cache.xml
files, then an exception is thrown. If more than one Greenplum Database table is specified within a single cache.xml
file, no error will be thrown or issued, and the import or export operation will proceed, resulting in incorrect results.The following datatypes are supported. The left side shows the Greenplum/PostgreSQL datatype, and the right side shows the associated GemFire/Java datatype which it is mapped to.
Greenplum Database | GemFire |
---|---|
Bigint | Long (Object wrapper) |
Bigint not null | long (primitive) |
Bigserial | long (primitive) |
Boolean not null | boolean (primitive) |
Boolean | Boolean (wrapper) |
Bit | BitSet |
Varbit | BitSet |
Char | String |
Varchar | String |
Decimal | BigDecimal |
Float | Float (Object wrapper) |
Float not null | float (primitive) |
Int | Integer (Object wrapper) |
Int not null | int (primitive) |
Real | Float (Object wrapper) |
Real not null | float (primitive) |
Serial | int (primitive) |
Smallint | Short (Object wrapper) |
Smallint not null | short (primitive) |
Text | String |
Date | java.sql.Date |
Time | Time |
Timetz | Time |
Timestamp | Timestamp |
Timestamptz | Timestamp |
Classes with data members that are objects. Not supported, as this would require a representation of nested objects in Greenplum Database. Here is an example of an unsupported type:
class HasAnObjectMember {
private UserObject theObject;
}
Any data type other than BitSet
, Boolean
, Date
, Number
, and String
. Array types are an example of an unsupported data type.