This topic explains how custom partitioning and data colocation can be used separately or in conjunction with each other in VMware Tanzu GemFire.
Use custom partitioning to group like entries into region buckets within a region. By default, Tanzu GemFire assigns new entries to buckets based on the entry key’s hash code. With custom partitioning, you can assign your entries to buckets in whatever way you want.
You can generally get better performance if you use custom partitioning to group similar data within a region. For example, a query run on all accounts created in January runs faster if all January account data is hosted by a single member. Grouping all data for a single customer can improve the performance of data operations that work on customer data. Data-aware function execution also takes advantage of custom partitioning.
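The difference between hash-based and custom bucket assignment can be sketched in plain Java. This is a minimal illustration and does not use the Tanzu GemFire API: the class name, the `AccountKey` type, the bucket count of 113, and the modulo arithmetic are all assumptions made for the example. In the product, the routing value would typically be supplied by a `PartitionResolver` implementation rather than computed inline as it is here.

```java
import java.util.Objects;

public class CustomPartitioningSketch {
    // Assumed bucket count, chosen only for illustration.
    static final int TOTAL_BUCKETS = 113;

    // Hypothetical entry key: an account ID plus the month the account was created.
    static final class AccountKey {
        final String accountId;
        final String createdMonth; // for example "2024-01"

        AccountKey(String accountId, String createdMonth) {
            this.accountId = accountId;
            this.createdMonth = createdMonth;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof AccountKey)) return false;
            AccountKey other = (AccountKey) o;
            return accountId.equals(other.accountId)
                && createdMonth.equals(other.createdMonth);
        }

        @Override
        public int hashCode() {
            return Objects.hash(accountId, createdMonth);
        }
    }

    // Default-style placement: the bucket depends on the whole key's hash code,
    // so accounts created in the same month can land in unrelated buckets.
    static int defaultBucket(AccountKey key) {
        return Math.floorMod(key.hashCode(), TOTAL_BUCKETS);
    }

    // Custom placement: the bucket depends only on a routing value (the month),
    // so every account created in the same month shares one bucket.
    static int customBucket(AccountKey key) {
        return Math.floorMod(key.createdMonth.hashCode(), TOTAL_BUCKETS);
    }

    public static void main(String[] args) {
        AccountKey a = new AccountKey("acct-1001", "2024-01");
        AccountKey b = new AccountKey("acct-2002", "2024-01");
        System.out.println("default: " + defaultBucket(a) + " vs " + defaultBucket(b));
        System.out.println("custom:  " + customBucket(a) + " vs " + customBucket(b));
    }
}
```

Because both keys carry the same month, `customBucket` returns the same bucket for them, whereas `defaultBucket` offers no such guarantee.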
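With custom partitioning, you have two choices:
Standard custom partitioning. With standard custom partitioning, you group entries into buckets, but you do not specify where the buckets reside. Tanzu GemFire keeps the entries in the buckets you have specified, but can move the buckets among members for load balancing.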
Fixed custom partitioning. With fixed custom partitioning, you specify the exact member where each region entry resides. You assign an entry to a partition and then to a bucket within that partition. You name specific members as primary and secondary hosts of each partition.
This gives you complete control over the locations of your primary and any secondary buckets for the region. This can be useful when you want to store specific data on specific physical machines or when you need to keep data close to certain hardware elements.
Fixed partitioning has these requirements and caveats:
See Fixed Custom Partitioning for implementation and configuration details.
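The two-step assignment used by fixed partitioning (entry to named partition, then named partition to specific primary and secondary members) can be sketched as follows. This is a plain-Java illustration under stated assumptions, not the product API: the quarter-based partition names, the member names `server1` through `server4`, and the class and method names are all hypothetical.

```java
import java.util.List;
import java.util.Map;

public class FixedPartitioningSketch {
    // Hypothetical description of which members host one named partition.
    static final class PartitionHosts {
        final String primary;
        final List<String> secondaries;

        PartitionHosts(String primary, List<String> secondaries) {
            this.primary = primary;
            this.secondaries = secondaries;
        }
    }

    // Assumed layout: one partition per calendar quarter, each with one
    // primary member and one secondary member that differs from the primary.
    static final Map<String, PartitionHosts> LAYOUT = Map.of(
        "Q1", new PartitionHosts("server1", List.of("server2")),
        "Q2", new PartitionHosts("server2", List.of("server3")),
        "Q3", new PartitionHosts("server3", List.of("server4")),
        "Q4", new PartitionHosts("server4", List.of("server1")));

    // Step 1: assign an entry to a named partition based on its month (1 to 12).
    static String partitionNameFor(int month) {
        if (month < 1 || month > 12) {
            throw new IllegalArgumentException("month must be 1..12: " + month);
        }
        return "Q" + ((month - 1) / 3 + 1);
    }

    // Step 2: look up the member named as primary host of that partition.
    static String primaryMemberFor(int month) {
        return LAYOUT.get(partitionNameFor(month)).primary;
    }

    public static void main(String[] args) {
        for (int month : new int[] {1, 4, 8, 12}) {
            System.out.println("month " + month + " -> " + partitionNameFor(month)
                + " on " + primaryMemberFor(month));
        }
    }
}
```

The layout table is fixed by configuration in this sketch, which mirrors the idea that you, rather than the system, decide where each partition's primary and secondary copies live.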
With data colocation, Tanzu GemFire stores entries that are related across multiple data regions in a single member. Tanzu GemFire does this by storing all of the regions’ buckets with the same ID together in the same member. During rebalancing operations, Tanzu GemFire moves these bucket groups together or not at all.
For example, if you have one region with customer contact information and another region with customer orders, you can use colocation to keep all contact information and all orders for a single customer in a single member. This way, any operation performed for a single customer uses the cache of only a single member.
This figure shows two regions with data colocation where the data is partitioned by customer type.
Data colocation requires the same data partitioning mechanism for all of the colocated regions. You can use the default partitioning provided by Tanzu GemFire or any of the custom partitioning strategies.
You must use the same high availability settings across your colocated regions.
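The reason colocated regions need the same partitioning mechanism can be illustrated with a small plain-Java sketch. It assumes a hypothetical contacts region and orders region that both derive the bucket ID from the customer ID, and a single bucket-to-member table shared by both. The member names, the bucket count, and the modulo placement are assumptions for the example, not the product's actual placement logic.

```java
public class ColocationSketch {
    // Assumed bucket count and member list, chosen only for illustration.
    static final int TOTAL_BUCKETS = 113;
    static final String[] MEMBERS = {"server1", "server2", "server3"};

    // Both regions compute the bucket ID from the customer ID alone.
    static int bucketId(String customerId) {
        return Math.floorMod(customerId.hashCode(), TOTAL_BUCKETS);
    }

    // One placement table shared by all colocated regions: buckets with the
    // same ID are always hosted by the same member.
    static String memberForBucket(int bucketId) {
        return MEMBERS[bucketId % MEMBERS.length];
    }

    // Hypothetical contacts region: keyed by customer ID.
    static String hostOfContact(String customerId) {
        return memberForBucket(bucketId(customerId));
    }

    // Hypothetical orders region: keyed by customer ID plus order number,
    // but routed by customer ID only, so orders follow their customer.
    static String hostOfOrder(String customerId, int orderNumber) {
        return memberForBucket(bucketId(customerId));
    }

    public static void main(String[] args) {
        String customer = "cust-42";
        System.out.println("contact on " + hostOfContact(customer));
        System.out.println("order 1 on " + hostOfOrder(customer, 1));
        System.out.println("order 2 on " + hostOfOrder(customer, 2));
    }
}
```

If the orders region routed on the order number instead, the two regions would compute different bucket IDs for the same customer and the shared placement table could no longer keep that customer's data together.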
See Colocate Data from Different Partitioned Regions for implementation and configuration details.