This topic explains how to customize the partitioning of your region data in VMware Tanzu GemFire.
By default, VMware Tanzu GemFire partitions each data entry into a bucket using a hashing policy on the key. Additionally, the physical location of the key-value pair is abstracted away from the application. You can change these policies for a partitioned region by providing your own partition resolver. The partitioning can go further with a fixed-partition resolver that specifies which members host which data buckets.
Note: If you are both colocating region data and custom partitioning, all colocated regions must use the same custom partitioning mechanism. See Colocate Data from Different Partitioned Regions.
To custom-partition your region data, follow two steps:
These steps differ based on which partition resolver is used.
Implementing Standard Partitioning
Implement the org.apache.geode.cache.PartitionResolver
interface within one of the following locations, listed here in the search order used by Tanzu GemFire:
Region
operations must be those that specify the callback as a parameter.Implement the resolver’s getName
, init
, and close
methods.
A simple implementation of getName
is
return getClass().getName();
The init
method does any initialization steps upon cache start that relate to the partition resolver’s task.
The close
method accomplishes any clean up that must be accomplished before a cache close completes. For example, close
might close files or connections that the partition resolver opened.
Implement the resolver’s getRoutingObject
method to return the routing object for each entry. A hash of that returned routing object determines the bucket. Therefore, getRoutingObject
should return an object that, when run through its hashCode
, directs grouped objects to the desired bucket.
Note: Only fields on the key should be used when creating the routing object. Do not use the value or additional metadata for this purpose.
For example, here is an implementation on a region key object that groups using the sum of a month and year:
Public class TradeKey implements PartitionResolver
{
private String tradeID;
private Month month;
private Year year;
public TradingKey(){ }
public TradingKey(Month month, Year year)
{
this.month = month;
this.year = year;
}
public Serializable getRoutingObject(EntryOperation opDetails)
{
return this.month + this.year;
}
}
Implementing the String-Based Partition Resolver
The implementation of a string-based partition resolver is in org.apache.geode.cache.util.StringPrefixPartitionResolver
. It does not require any further implementation.
Implementing Fixed Partitioning
Implement the org.apache.geode.cache.FixedPartitionResolver
interface within one of the following locations, listed here in the search order used by Tanzu GemFire:
Region
operations must be those that specify the callback as a parameter.Implement the resolver’s getName
, init
, and close
methods.
A simple implementation of getName
is
return getClass().getName();
The init
method does any initialization steps upon cache start that relate to the partition resolver’s task.
The close
method accomplishes any clean up that must be accomplished before a cache close completes. For example, close
might close files or connections that the partition resolver opened.
Implement the resolver’s getRoutingObject
method to return the routing object for each entry. A hash of that returned routing object determines the bucket within a partition.
This method can be empty for fixed partitioning where there is only one bucket per partition. That implementation assigns partitions to servers such that the application has full control of grouping entries on servers.
Note: Only fields on the key should be used when creating the routing object. Do not use the value or additional metadata for this purpose.
Implement the getPartitionName
method to return the name of the partition for each entry, based on where you want the entries to reside. All entries within a partition will be on a single server.
This example places the data based on date, with a different partition name for each quarter-year and a different routing object for each month.
/**
* Returns one of four different partition names
* (Q1, Q2, Q3, Q4) depending on the entry's date
*/
class QuarterFixedPartitionResolver implements
FixedPartitionResolver<String, String> {
@Override
public String getPartitionName(EntryOperation<String, String> opDetails,
Set<String> targetPartitions) {
Date date = (Date)opDetails.getKey();
Calendar cal = Calendar.getInstance();
cal.setTime(date);
int month = cal.get(Calendar.MONTH);
if (month >= 0 && month < 3) {
if (targetPartitions.contains("Q1")) return "Q1";
}
else if (month >= 3 && month < 6) {
if (targetPartitions.contains("Q2")) return "Q2";
}
else if (month >= 6 && month < 9) {
if (targetPartitions.contains("Q3")) return "Q3";
}
else if (month >= 9 && month < 12) {
if (targetPartitions.contains("Q4")) return "Q4";
}
return "Invalid Quarter";
}
@Override
public String getName() {
return "QuarterFixedPartitionResolver";
}
@Override
public Serializable getRoutingObject(EntryOperation<String, String> opDetails) {
Date date = (Date)opDetails.getKey();
Calendar cal = Calendar.getInstance();
cal.setTime(date);
int month = cal.get(Calendar.MONTH);
return month;
}
@Override
public void close() {
}
}
Configuring Standard Partitioning
Configure the region so Tanzu GemFire finds your resolver for all region operations. How you do this depends on where you chose to implement your custom partitioning.
Custom class. Define the class for the region at creation. Use one of these methods:
XML:
<region name="trades">
<region-attributes>
<partition-attributes>
<partition-resolver>
<class-name>myPackage.TradesPartitionResolver</class-name>
</partition-resolver>
</partition-attributes>
</region-attributes>
</region>
Java API:
PartitionResolver resolver = new TradesPartitionResolver();
PartitionAttributes attrs =
new PartitionAttributesFactory()
.setPartitionResolver(resolver).create();
Cache c = new CacheFactory().create();
Region r = c.createRegionFactory()
.setPartitionAttributes(attrs)
.create("trades");
gfsh:
Add the option --partition-resolver
to the gfsh create region
command, specifying the package and class name of the custom partition resolver.
If your colocated data is in a server system, add the PartitionResolver
implementation class to the CLASSPATH
of your Java clients. For Java single-hop access to work, the resolver class needs to have a zero-argument constructor, and the resolver class must not have any state; the init
method is included in this restriction.
Configuring the String-Based Partition Resolver
Define the class for the region at creation. The resolver will be used for every entry operation. Use one of these methods:
XML:
<region name="customers">
<region-attributes>
<partition-attributes>
<partition-resolver>
<class-name>org.apache.geode.cache.util.StringPrefixPartitionResolver
</class-name>
</partition-resolver>
</partition-attributes>
</region-attributes>
</region>
Java API:
PartitionAttributes attrs =
new PartitionAttributesFactory()
.setPartitionResolver("StringPrefixPartitionResolver").create();
Cache c = new CacheFactory().create();
Region r = c.createRegionFactory()
.setPartitionAttributes(attrs)
.create("customers");
gfsh:
Add the option --partition-resolver=org.apache.geode.cache.util.StringPrefixPartitionResolver
to the gfsh create region
command.
For colocated data, add the StringPrefixPartitionResolver
implementation class to the CLASSPATH
of your Java clients. The resolver will work with Java single-hop clients.
Configuring Fixed Partitioning
Set the fixed-partition attributes for each member.
These attributes define the data stored for the region by the member and must be different for different members. See org.apache.geode.cache.FixedPartitionAttributes
for definitions of the attributes. Define each partition-name
in your data-host members for the region. For each partition name, in the member you want to host the primary copy, define it with is-primary
set to true
. In every member you want to host the secondary copy, define it with is-primary
set to false
(the default). The number of secondaries must match the number of redundant copies you have defined for the region. See Configure High Availability for a Partitioned Region.
Note: Buckets for a partition are hosted only by the members that have defined the partition name in their FixedPartitionAttributes
.
These examples set the partition attributes for a member to be the primary host for the “Q1” partition data and a secondary host for “Q3” partition data.
XML:
<cache>
<region name="Trades">
<region-attributes>
<partition-attributes redundant-copies="1">
<partition-resolver>
<class-name>myPackage.QuarterFixedPartitionResolver</class-name>
</partition-resolver>
<fixed-partition-attributes partition-name="Q1" is-primary="true"/>
<fixed-partition-attributes partition-name="Q3" is-primary="false"
num-buckets="6"/>
</partition-attributes>
</region-attributes>
</region>
</cache>
Java:
FixedPartitionAttribute fpa1 = FixedPartitionAttributes
.createFixedPartition("Q1", true);
FixedPartitionAttribute fpa3 = FixedPartitionAttributes
.createFixedPartition("Q3", false, 6);
PartitionAttributesFactory paf = new PartitionAttributesFactory()
.setPartitionResolver(new QuarterFixedPartitionResolver())
.setTotalNumBuckets(12)
.setRedundantCopies(2)
.addFixedPartitionAttribute(fpa1)
.addFixedPartitionAttribute(fpa3);
Cache c = new CacheFactory().create();
Region r = c.createRegionFactory()
.setPartitionAttributes(paf.create())
.create("Trades");
gfsh:
You cannot specify a fixed partition resolver using gfsh.
If your colocated data is in a server system, add the class that implements the FixedPartitionResolver
interface to the CLASSPATH
of your Java clients. For Java single-hop access to work, the resolver class needs to have a zero-argument constructor, and the resolver class must not have any state; the init
method is included in this restriction.