This topic explains high availability for partitioned regions in VMware Tanzu GemFire.
With high availability, each member that hosts data for the partitioned region receives some primary copies and some redundant (secondary) copies.
With redundancy, if one member fails, operations continue on the partitioned region with no interruption of service.
Note: You can still lose cached data when you are using redundancy if enough members go down in a short enough time span.
You can configure how the system recovers redundancy when it is not satisfied. Recovery can take place immediately or, if you want to give replacement members a chance to start up, after a configured wait period. Redundancy recovery is also attempted automatically during any partitioned data rebalancing operation. Use the gemfire.MAX_PARALLEL_BUCKET_RECOVERIES system property to configure the maximum number of buckets that are recovered in parallel. By default, up to 8 buckets are recovered in parallel any time the system attempts to recover redundancy.
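Like any JVM system property, gemfire.MAX_PARALLEL_BUCKET_RECOVERIES would normally be supplied as a -D argument when the member's JVM starts. The following minimal Java sketch is not GemFire code. It only illustrates how such a property resolves to the documented default of 8 when it is unset. The class and method names are hypothetical.

```java
/** Illustration only: resolves the property the way a plain JVM lookup would. */
final class BucketRecoverySetting {
    // Property name taken from the text above.
    static final String PROPERTY = "gemfire.MAX_PARALLEL_BUCKET_RECOVERIES";
    // Documented default: up to 8 buckets recovered in parallel.
    static final int DEFAULT_PARALLEL_RECOVERIES = 8;

    /** Returns the configured value, or the default when the property is unset. */
    static int maxParallelBucketRecoveries() {
        return Integer.getInteger(PROPERTY, DEFAULT_PARALLEL_RECOVERIES);
    }

    public static void main(String[] args) {
        System.out.println(PROPERTY + " = " + maxParallelBucketRecoveries());
    }
}
```

Starting the JVM with -Dgemfire.MAX_PARALLEL_BUCKET_RECOVERIES=4 makes this lookup return 4 instead of 8.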
Without redundancy, the loss of any of the region’s data stores causes the loss of some of the region’s cached data. Generally, you should not use redundancy when your applications can directly read from another data source, or when write performance is more important than read performance.
By default, Tanzu GemFire places your primary and secondary data copies for you, avoiding placement of two copies on the same physical machine. If there are not enough machines to keep different copies separate, Tanzu GemFire places copies on the same physical machine. You can change this behavior, so Tanzu GemFire only places copies on separate machines.
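The placement rule described above can be pictured with a small toy model. This is a sketch under stated assumptions, not GemFire's placement algorithm: the CopyPlacement class, the Member type, and the enforceUniqueHost flag are all hypothetical names introduced for illustration. The model first puts at most one copy on each machine, and only falls back to sharing a machine when strict separation is turned off.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model of copy placement; not GemFire code. */
final class CopyPlacement {

    /** A member process and the machine it runs on (hypothetical type). */
    static final class Member {
        final String name;
        final String host;

        Member(String name, String host) {
            this.name = name;
            this.host = host;
        }
    }

    /**
     * Picks members to hold up to `copies` copies of one bucket.
     * Pass 1 places at most one copy per host.
     * Pass 2 runs only when enforceUniqueHost is false and fills any
     * remaining copies on hosts that already hold one.
     */
    static List<Member> place(List<Member> members, int copies, boolean enforceUniqueHost) {
        List<Member> chosen = new ArrayList<>();
        Set<String> usedHosts = new HashSet<>();
        for (Member m : members) {
            if (chosen.size() == copies) break;
            if (usedHosts.add(m.host)) chosen.add(m); // first copy on this host
        }
        if (!enforceUniqueHost) {
            for (Member m : members) {
                if (chosen.size() == copies) break;
                if (!chosen.contains(m)) chosen.add(m); // share a host as a fallback
            }
        }
        return chosen;
    }
}
```

In this model, two members on a single machine with two requested copies yield two placements when enforceUniqueHost is false and only one when it is true, which leaves redundancy unsatisfied rather than co-locating copies.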
You can also control which members store your primary and secondary data copies. Tanzu GemFire provides two options:
By default, Tanzu GemFire stores redundant copies on different machines. When you run your processes in virtual machines, the normal view of the machine becomes the VM and not the physical machine. If you run multiple VMs on the same physical machine, you could end up storing partitioned region primary buckets in separate VMs, but on the same physical machine as your secondaries. If the physical machine fails, you can lose data. When you run in VMs, you can configure Tanzu GemFire to identify the physical machine and store redundant copies on different physical machines.
Tanzu GemFire treats reads and writes differently in highly-available partitioned regions than in other regions because the data is available in multiple members:
Write operations (such as put and create) go to the primary for the data keys and then are distributed synchronously to the redundant copies. Events are sent to the members configured with subscription-attributes interest-policy set to all.
In this figure, M1 is reading W, Y, and Z. It gets W directly from its local copy. Since it does not have a local copy of Y or Z, it goes to a cache that does, picking the source cache at random.
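The routing behavior described above can be sketched as a toy in-memory model. It is an illustration under stated assumptions, not GemFire's implementation or API: the HaRegionModel class and its methods are hypothetical, member names are plain strings, and the first member listed for a key is treated as its primary. Writes are applied to the primary first and then to each redundant copy before the call returns. Reads use the local copy when the reader holds one and otherwise pick a holder at random.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

/** Toy model of read/write routing with redundant copies; not GemFire code. */
final class HaRegionModel {
    // Per-member storage: member name -> entries that member holds.
    private final Map<String, Map<String, String>> stores = new HashMap<>();
    // Key -> members holding a copy; index 0 is the primary.
    private final Map<String, List<String>> owners = new HashMap<>();
    private final Random random;

    HaRegionModel(long seed) {
        this.random = new Random(seed);
    }

    /** Declares which members hold a key; the first one is the primary. */
    void assign(String key, String... holders) {
        owners.put(key, new ArrayList<>(Arrays.asList(holders)));
        for (String member : holders) {
            stores.computeIfAbsent(member, m -> new HashMap<>());
        }
    }

    /** Write path: primary first, then each redundant copy, all before returning. */
    List<String> put(String key, String value) {
        List<String> updatedInOrder = new ArrayList<>();
        for (String member : owners.get(key)) {
            stores.get(member).put(key, value);
            updatedInOrder.add(member);
        }
        return updatedInOrder;
    }

    /** Read path: the reader itself if it holds a copy, otherwise a random holder. */
    String readSource(String reader, String key) {
        List<String> holders = owners.get(key);
        if (holders.contains(reader)) return reader;
        return holders.get(random.nextInt(holders.size()));
    }

    /** Reads the value from whichever member readSource selects. */
    String get(String reader, String key) {
        return stores.get(readSource(reader, key)).get(key);
    }
}
```

Mirroring the figure, if M1 holds a copy of W but not Y, readSource("M1", "W") returns M1 while readSource("M1", "Y") returns one of the members that hold Y.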