Custom-Partition Your Region Data

This topic explains how to customize the partitioning of your region data in VMware Tanzu GemFire.

By default, VMware Tanzu GemFire partitions each data entry into a bucket using a hashing policy on the key. Additionally, the physical location of the key-value pair is abstracted away from the application. You can change these policies for a partitioned region by providing your own partition resolver. The partitioning can go further with a fixed-partition resolver that specifies which members host which data buckets.

Note: If you are both colocating region data and custom partitioning, all colocated regions must use the same custom partitioning mechanism. See Colocate Data from Different Partitioned Regions.

To custom-partition your region data, follow two steps:

Implement the interface.
Configure the regions.

These steps differ based on which partition resolver is used.

Implementing Standard Partitioning

Implement the org.apache.geode.cache.PartitionResolver interface within one of the following locations, listed here in the search order used by Tanzu GemFire:
- Within a custom class. Specify this class as the partition resolver during region creation.
- Within the key’s class. For keys implemented as objects, define the interface within the key’s class.
- Within the cache callback class. Implement the interface within a cache callback’s class. When using this implementation, any and all Region operations must be those that specify the callback as a parameter.
Implement the resolver’s getName, init, and close methods.

A simple implementation of getName is
```
return getClass().getName();
```
The init method does any initialization steps upon cache start that relate to the partition resolver’s task.

The close method accomplishes any clean up that must be accomplished before a cache close completes. For example, close might close files or connections that the partition resolver opened.
Implement the resolver’s getRoutingObject method to return the routing object for each entry. A hash of that returned routing object determines the bucket. Therefore, getRoutingObject should return an object that, when run through its hashCode, directs grouped objects to the desired bucket.

Note: Only fields on the key should be used when creating the routing object. Do not use the value or additional metadata for this purpose.

For example, here is an implementation on a region key object that groups using the sum of a month and year:
```
Public class TradeKey implements PartitionResolver
{
  private String tradeID;
  private Month month;
  private Year year;
  public TradingKey(){ }
  public TradingKey(Month month, Year year)
  {
    this.month = month;
    this.year = year;
  }
  public Serializable getRoutingObject(EntryOperation opDetails)
  {
    return this.month + this.year;
  }
}
```

Implementing the String-Based Partition Resolver

The implementation of a string-based partition resolver is in org.apache.geode.cache.util.StringPrefixPartitionResolver. It does not require any further implementation.

Implementing Fixed Partitioning

Implement the org.apache.geode.cache.FixedPartitionResolver interface within one of the following locations, listed here in the search order used by Tanzu GemFire:
- Custom class. Specify this class as the partition resolver during region creation.
- Entry key. For keys implemented as objects, define the interface for the key’s class.
- Within the cache callback class. Implement the interface within a cache callback’s class. When using this implementation, any and all Region operations must be those that specify the callback as a parameter.
Implement the resolver’s getName, init, and close methods.

A simple implementation of getName is
```
return getClass().getName();
```
The init method does any initialization steps upon cache start that relate to the partition resolver’s task.

The close method accomplishes any clean up that must be accomplished before a cache close completes. For example, close might close files or connections that the partition resolver opened.
Implement the resolver’s getRoutingObject method to return the routing object for each entry. A hash of that returned routing object determines the bucket within a partition.

This method can be empty for fixed partitioning where there is only one bucket per partition. That implementation assigns partitions to servers such that the application has full control of grouping entries on servers.

Note: Only fields on the key should be used when creating the routing object. Do not use the value or additional metadata for this purpose.

Implement the getPartitionName method to return the name of the partition for each entry, based on where you want the entries to reside. All entries within a partition will be on a single server.

This example places the data based on date, with a different partition name for each quarter-year and a different routing object for each month.

/**
 * Returns one of four different partition names
 * (Q1, Q2, Q3, Q4) depending on the entry's date
 */
class QuarterFixedPartitionResolver implements
    FixedPartitionResolver<String, String> {

  @Override
  public String getPartitionName(EntryOperation<String, String> opDetails,
    Set<String> targetPartitions) {

    Date date = (Date)opDetails.getKey();
    Calendar cal = Calendar.getInstance();
    cal.setTime(date);
    int month = cal.get(Calendar.MONTH);
    if (month >= 0 && month < 3) {
      if (targetPartitions.contains("Q1")) return "Q1";
    }
    else if (month >= 3 && month < 6) {
      if (targetPartitions.contains("Q2")) return "Q2";
    }
    else if (month >= 6 && month < 9) {
      if (targetPartitions.contains("Q3")) return "Q3";
    }
    else if (month >= 9 && month < 12) {
      if (targetPartitions.contains("Q4")) return "Q4";
    }
    return "Invalid Quarter";
  }

  @Override
  public String getName() {
    return "QuarterFixedPartitionResolver";
  }

  @Override
  public Serializable getRoutingObject(EntryOperation<String, String> opDetails) {
    Date date = (Date)opDetails.getKey();
    Calendar cal = Calendar.getInstance();
    cal.setTime(date);
    int month = cal.get(Calendar.MONTH);
    return month;
  }

  @Override
  public void close() {
  }
}

Configuring Standard Partitioning

Configure the region so Tanzu GemFire finds your resolver for all region operations. How you do this depends on where you chose to implement your custom partitioning.
- Custom class. Define the class for the region at creation. Use one of these methods:
  
  XML:
```
<region name="trades">
  <region-attributes>
    <partition-attributes>
      <partition-resolver> 
        <class-name>myPackage.TradesPartitionResolver</class-name>
      </partition-resolver>
    </partition-attributes>
  </region-attributes>
</region>
```
  Java API:
```
PartitionResolver resolver = new TradesPartitionResolver();
PartitionAttributes attrs =
  new PartitionAttributesFactory()
    .setPartitionResolver(resolver).create();

Cache c = new CacheFactory().create();

Region r = c.createRegionFactory()
  .setPartitionAttributes(attrs)
  .create("trades");
```
  gfsh:
  
  Add the option --partition-resolver to the gfsh create region command, specifying the package and class name of the custom partition resolver.
- Entry key. Use the key object with the resolver implementation for every entry operation.
- Cache callback argument. Provide the argument to every call that accesses an entry. This restricts you to calls that take a callback argument.
If your colocated data is in a server system, add the PartitionResolver implementation class to the CLASSPATH of your Java clients. For Java single-hop access to work, the resolver class needs to have a zero-argument constructor, and the resolver class must not have any state; the init method is included in this restriction.

Configuring the String-Based Partition Resolver

Define the class for the region at creation. The resolver will be used for every entry operation. Use one of these methods:

XML:

<region name="customers">
  <region-attributes>
    <partition-attributes>
      <partition-resolver>
        <class-name>org.apache.geode.cache.util.StringPrefixPartitionResolver
        </class-name>
      </partition-resolver>
    </partition-attributes>
  </region-attributes>
</region>

Java API:

PartitionAttributes attrs =
  new PartitionAttributesFactory()
  .setPartitionResolver("StringPrefixPartitionResolver").create();

Cache c = new CacheFactory().create();

Region r = c.createRegionFactory()
  .setPartitionAttributes(attrs)
  .create("customers");

gfsh:

Add the option --partition-resolver=org.apache.geode.cache.util.StringPrefixPartitionResolver to the gfsh create region command.

For colocated data, add the StringPrefixPartitionResolver implementation class to the CLASSPATH of your Java clients. The resolver will work with Java single-hop clients.

Configuring Fixed Partitioning

Set the fixed-partition attributes for each member.

These attributes define the data stored for the region by the member and must be different for different members. See org.apache.geode.cache.FixedPartitionAttributes for definitions of the attributes. Define each partition-name in your data-host members for the region. For each partition name, in the member you want to host the primary copy, define it with is-primary set to true. In every member you want to host the secondary copy, define it with is-primary set to false (the default). The number of secondaries must match the number of redundant copies you have defined for the region. See Configure High Availability for a Partitioned Region.

Note: Buckets for a partition are hosted only by the members that have defined the partition name in their FixedPartitionAttributes.

These examples set the partition attributes for a member to be the primary host for the “Q1” partition data and a secondary host for “Q3” partition data.

XML:

<cache>
  <region name="Trades">
    <region-attributes>
      <partition-attributes redundant-copies="1">
        <partition-resolver>
          <class-name>myPackage.QuarterFixedPartitionResolver</class-name>
        </partition-resolver>
        <fixed-partition-attributes partition-name="Q1" is-primary="true"/>
        <fixed-partition-attributes partition-name="Q3" is-primary="false"
          num-buckets="6"/>
      </partition-attributes> 
    </region-attributes>
  </region>
</cache>

Java:

FixedPartitionAttribute fpa1 = FixedPartitionAttributes
  .createFixedPartition("Q1", true);
FixedPartitionAttribute fpa3 = FixedPartitionAttributes
  .createFixedPartition("Q3", false, 6);

PartitionAttributesFactory paf = new PartitionAttributesFactory()
  .setPartitionResolver(new QuarterFixedPartitionResolver())
  .setTotalNumBuckets(12)
  .setRedundantCopies(2)
  .addFixedPartitionAttribute(fpa1)
  .addFixedPartitionAttribute(fpa3);

Cache c = new CacheFactory().create();

Region r = c.createRegionFactory()
  .setPartitionAttributes(paf.create())
  .create("Trades");

gfsh:

You cannot specify a fixed partition resolver using gfsh.

If your colocated data is in a server system, add the class that implements the FixedPartitionResolver interface to the CLASSPATH of your Java clients. For Java single-hop access to work, the resolver class needs to have a zero-argument constructor, and the resolver class must not have any state; the init method is included in this restriction.