This topic explains how to define and configure disk stores in VMware Tanzu GemFire. You define disk stores in your cache, then you assign them to your regions and queues by setting the disk-store-name attribute in your region and queue configurations.
Note: Besides the disk stores that you specify, Tanzu GemFire has a default disk store that it uses when disk use is configured with no disk store name specified. By default, this disk store is saved to the application's working directory. For information about changing this behavior, see Create and Configure Your Disk Stores and Modifying the Default Disk Store.
Before you begin, review GemFire Basic Configuration and Programming.
Work with your system designers and developers to plan for anticipated disk storage requirements in your testing and production caching systems. Take into account space and functional requirements.
When calculating your disk requirements, figure in your data modification patterns and your compaction strategy. Tanzu GemFire creates each oplog file at the max-oplog-size, which defaults to 100 MiB. Obsolete operations are removed from the oplogs only during compaction, so you need enough space to store all operations that are done between compactions. For regions where you are doing a mix of updates and deletes, if you use automatic compaction, a good upper bound for the required disk space is

((1 / (compaction_threshold / 100)) * data_size) + (max_oplog_size * segments)

where data_size is the total size of all the data you store in the disk store. So, for the default compaction-threshold of 50, the disk space is roughly twice your data size. Note that the compaction thread could lag behind other operations, causing disk use to rise temporarily above the upper bound. If you deactivate automatic compaction, the amount of disk required depends on how many obsolete operations accumulate between manual compactions.
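As a rough worked example, the short Java sketch below evaluates this bound for hypothetical values: a 10 GiB data set, the default compaction-threshold of 50, the default 100 MiB max-oplog-size, and 8 segments. The numbers are illustrative only, not a sizing recommendation.

// Illustrative only: evaluates the disk-space upper bound described above.
public class DiskSpaceEstimate {
    public static void main(String[] args) {
        double dataSizeMiB = 10 * 1024;      // hypothetical: 10 GiB of stored data
        double compactionThreshold = 50;     // default compaction-threshold (percent)
        double maxOplogSizeMiB = 100;        // default max-oplog-size in MiB
        int segments = 8;                    // hypothetical segment count

        // ((1 / (compaction_threshold / 100)) * data_size) + (max_oplog_size * segments)
        double upperBoundMiB =
                (1 / (compactionThreshold / 100)) * dataSizeMiB
                + (maxOplogSizeMiB * segments);

        // Prints 21280.0: roughly twice the data size plus one oplog per segment.
        System.out.println("Estimated disk space upper bound: " + upperBoundMiB + " MiB");
    }
}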
Work with your host system administrators to determine where to place your disk store directories, based on your anticipated disk storage requirements and the available disks on your host systems.
Start a gfsh prompt and connect to the cluster. At the gfsh prompt, create and configure a disk store:
Specify the name (--name) of the disk store.
Choose disk store names that reflect how the stores should be used and that work for your operating systems. Disk store names are used in the disk file names:
Use disk store names that satisfy the file naming requirements for your operating system. For example, if you store your data to disk on a Windows system, your disk store names cannot contain any of these reserved characters: < > : " / \ | ? *.
Do not use very long disk store names. The full file names must fit within your operating system limits. On Linux, for example, the standard limit is 255 characters.
Example:
gfsh>create disk-store --name=serverOverflow --dir=c:\overflow_data
Configure the directory locations (--dir) and the maximum space to use for the store (specified after the disk directory name by a # and the maximum number in megabytes). Example:
gfsh>create disk-store --name=serverOverflow --dir=c:\overflow_data#20480
Optionally, you can configure the number of segments for the store (--segments). Example:
gfsh>create disk-store --name=serverOverflow --dir=c:\overflow_data --segments=10
Optionally, you can configure the store’s file compaction behavior. In conjunction with this, plan and program for any manual compaction. Example:
gfsh>create disk-store --name=serverOverflow --dir=c:\overflow_data --segments=10 \
--compaction-threshold=40 --auto-compact=false --allow-force-compaction=true
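If you plan to program manual compaction, the following is a minimal Java sketch that assumes the Geode-compatible API (org.apache.geode packages), an existing Cache, and the disk store name from this example. Forcing compaction this way requires that the store was created with allow-force-compaction set to true, as above.

import org.apache.geode.cache.Cache;
import org.apache.geode.cache.DiskStore;

// Sketch: manually compact a disk store created with
// --allow-force-compaction=true and --auto-compact=false.
public final class ManualCompaction {
    public static void compact(Cache cache) {
        DiskStore store = cache.findDiskStore("serverOverflow"); // name from the example above
        if (store != null) {
            boolean compacted = store.forceCompaction(); // blocks until compaction completes
            System.out.println("Compaction performed: " + compacted);
        }
    }
}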
If needed, configure the maximum size (in MB) of a single oplog. When the current files reach this size, the system rolls forward to a new file. You get better performance with relatively small maximum file sizes. Example:
gfsh>create disk-store --name=serverOverflow --dir=c:\overflow_data --segments=10 \
--compaction-threshold=40 --auto-compact=false --allow-force-compaction=true \
--max-oplog-size=512
If needed, modify queue management parameters for asynchronous queueing to the disk store. You can configure any region for synchronous or asynchronous queueing (region attribute disk-synchronous). Server queues and gateway sender queues always operate synchronously. When either the queue-size (number of operations) or the time-interval (milliseconds) is reached, enqueued data is flushed to disk. You can also synchronously flush unwritten data to disk through the DiskStore flushToDisk method. Example:
gfsh>create disk-store --name=serverOverflow --dir=c:\overflow_data --segments=10 \
--compaction-threshold=40 --auto-compact=false --allow-force-compaction=true \
--max-oplog-size=512 --queue-size=10000 --time-interval=15
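The flushToDisk call mentioned above can be issued from application code. A minimal sketch, again assuming the Geode-compatible Java API and the disk store name used in these examples:

import org.apache.geode.cache.Cache;
import org.apache.geode.cache.DiskStore;

// Sketch: force any queued, unwritten operations to disk immediately.
public final class FlushExample {
    public static void flush(Cache cache) {
        DiskStore store = cache.findDiskStore("serverOverflow"); // disk store name is an assumption
        if (store != null) {
            store.flushToDisk(); // blocks until the asynchronous queue is written to disk
        }
    }
}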
If needed, modify the size (specified in bytes) of the buffer used for writing to disk. Example:
gfsh>create disk-store --name=serverOverflow --dir=c:\overflow_data --segments=10 \
--compaction-threshold=40 --auto-compact=false --allow-force-compaction=true \
--max-oplog-size=512 --queue-size=10000 --time-interval=15 --write-buffer-size=65536
If needed, modify the disk-usage-warning-percentage and disk-usage-critical-percentage thresholds. The warning threshold is the percentage of disk usage (default: 90%) that triggers a warning; the critical threshold is the percentage (default: 99%) that generates an error and shuts down the member cache. Example:
gfsh>create disk-store --name=serverOverflow --dir=c:\overflow_data --segments=10 \
--compaction-threshold=40 --auto-compact=false --allow-force-compaction=true \
--max-oplog-size=512 --queue-size=10000 --time-interval=15 --write-buffer-size=65536 \
--disk-usage-warning-percentage=80 --disk-usage-critical-percentage=98
The following is the complete cache.xml configuration for this disk store example:
<disk-store name="serverOverflow" segments="10" compaction-threshold="40"
            auto-compact="false" allow-force-compaction="true"
            max-oplog-size="512" queue-size="10000"
            time-interval="15" write-buffer-size="65536"
            disk-usage-warning-percentage="80"
            disk-usage-critical-percentage="98">
   <disk-dirs>
      <disk-dir>c:\overflow_data</disk-dir>
      <disk-dir dir-size="20480">d:\overflow_data</disk-dir>
   </disk-dirs>
</disk-store>
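If you build the cache programmatically instead of with cache.xml, a sketch of roughly the same disk store using the DiskStoreFactory API looks like the following. This assumes the Geode-compatible org.apache.geode packages and an existing Cache; it omits the segments attribute and covers only the attributes shown above.

import java.io.File;
import org.apache.geode.cache.Cache;
import org.apache.geode.cache.DiskStore;
import org.apache.geode.cache.DiskStoreFactory;

// Sketch: programmatic equivalent of most of the cache.xml example above.
public final class CreateDiskStore {
    public static DiskStore create(Cache cache) {
        DiskStoreFactory factory = cache.createDiskStoreFactory();
        factory.setCompactionThreshold(40);
        factory.setAutoCompact(false);
        factory.setAllowForceCompaction(true);
        factory.setMaxOplogSize(512);       // megabytes
        factory.setQueueSize(10000);        // operations
        factory.setTimeInterval(15);        // milliseconds
        factory.setWriteBufferSize(65536);  // bytes
        factory.setDiskUsageWarningPercentage(80);
        factory.setDiskUsageCriticalPercentage(98);
        factory.setDiskDirsAndSizes(
            new File[] { new File("c:\\overflow_data"), new File("d:\\overflow_data") },
            new int[]  { DiskStoreFactory.DEFAULT_DISK_DIR_SIZE, 20480 }); // sizes in megabytes
        return factory.create("serverOverflow");
    }
}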
Note: As an alternative to defining cache.xml on every server in the cluster, if the cluster configuration service is enabled, then when you create a disk store in gfsh, you can share the disk store's configuration with the rest of the cluster. See Overview of the Cluster Configuration Service.
You can modify an offline disk store by using the alter disk-store command. If you are modifying the default disk store configuration, use “DEFAULT” as the disk-store name.
The following examples show how to use already created, named disk stores for regions, queues, and PDX serialization.
Example of using a disk store for region persistence and overflow:
gfsh:
gfsh>create region --name=regionName --type=PARTITION_PERSISTENT_OVERFLOW \
--disk-store=serverPersistOverflow
cache.xml
<region refid="PARTITION_PERSISTENT_OVERFLOW" disk-store-name="persistOverflow1"/>
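For programmatic region creation, a minimal Java sketch of the same idea, assuming the Geode-compatible API and the region and disk store names from the gfsh example:

import org.apache.geode.cache.Cache;
import org.apache.geode.cache.Region;
import org.apache.geode.cache.RegionFactory;
import org.apache.geode.cache.RegionShortcut;

// Sketch: a persistent, overflowing partitioned region backed by a named disk store.
public final class CreatePersistentRegion {
    public static Region<String, Object> create(Cache cache) {
        RegionFactory<String, Object> factory =
            cache.createRegionFactory(RegionShortcut.PARTITION_PERSISTENT_OVERFLOW);
        factory.setDiskStoreName("serverPersistOverflow"); // disk store must already exist
        return factory.create("regionName");
    }
}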
Example of using a named disk store for server subscription queue overflow (cache.xml):
<cache-server port="40404">
<client-subscription
eviction-policy="entry"
capacity="10000"
disk-store-name="queueOverflow2"/>
</cache-server>
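The same subscription queue overflow can be configured through the cache server API. A minimal sketch, assuming the Geode-compatible Java API and an already-created disk store named queueOverflow2:

import java.io.IOException;
import org.apache.geode.cache.Cache;
import org.apache.geode.cache.server.CacheServer;
import org.apache.geode.cache.server.ClientSubscriptionConfig;

// Sketch: overflow the client subscription queue to a named disk store.
public final class SubscriptionOverflowServer {
    public static void start(Cache cache) throws IOException {
        CacheServer server = cache.addCacheServer();
        server.setPort(40404);
        ClientSubscriptionConfig subscription = server.getClientSubscriptionConfig();
        subscription.setEvictionPolicy("entry");
        subscription.setCapacity(10000);
        subscription.setDiskStoreName("queueOverflow2"); // disk store must already exist
        server.start();
    }
}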
Example of using a named disk store for PDX serialization metadata (cache.xml):
<pdx read-serialized="true"
persistent="true"
disk-store-name="SerializationDiskStore">
</pdx>
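The PDX metadata settings can also be supplied when the cache is created programmatically. A minimal sketch using the Geode-compatible CacheFactory API, with the disk store name from the cache.xml example:

import org.apache.geode.cache.Cache;
import org.apache.geode.cache.CacheFactory;

// Sketch: persist PDX type metadata to a named disk store.
public final class PdxCacheBootstrap {
    public static Cache create() {
        return new CacheFactory()
            .setPdxReadSerialized(true)
            .setPdxPersistent(true)
            .setPdxDiskStore("SerializationDiskStore") // disk store must exist before PDX data is written
            .create();
    }
}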
Gateway sender queues are always overflowed and may be persisted. Assign them to overflow disk stores if you do not persist, and to persistence disk stores if you do.
Example of using a named disk store for serial gateway sender queue persistence:
gfsh:
gfsh>create gateway-sender --id=persistedSender1 --remote-distributed-system-id=1 \
--enable-persistence=true --disk-store-name=diskStoreA --maximum-queue-memory=100
cache.xml:
<cache>
<gateway-sender id="persistedsender1" parallel="true"
remote-distributed-system-id="1"
enable-persistence="true"
disk-store-name="diskStoreA"
maximum-queue-memory="100"/>
...
</cache>
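A programmatic version of this persistent gateway sender, sketched with the Geode-compatible GatewaySenderFactory API; the sender id, disk store name, and remote distributed system id follow the examples above:

import org.apache.geode.cache.Cache;
import org.apache.geode.cache.wan.GatewaySender;
import org.apache.geode.cache.wan.GatewaySenderFactory;

// Sketch: a persistent serial gateway sender backed by a named disk store.
public final class CreatePersistentSender {
    public static GatewaySender create(Cache cache) {
        GatewaySenderFactory factory = cache.createGatewaySenderFactory();
        factory.setParallel(false);             // serial sender, matching the gfsh example
        factory.setPersistenceEnabled(true);
        factory.setDiskStoreName("diskStoreA"); // disk store must already exist
        factory.setMaximumQueueMemory(100);     // megabytes
        return factory.create("persistedSender1", 1); // 1 = remote distributed system id
    }
}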
Example of using the default disk store for serial gateway sender queue persistence and overflow:
gfsh:
gfsh>create gateway-sender --id=persistedSender1 --remote-distributed-system-id=1 \
--enable-persistence=true --maximum-queue-memory=100
cache.xml:
<cache>
<gateway-sender id="persistedSender1" parallel="false"
remote-distributed-system-id="1"
enable-persistence="true"
maximum-queue-memory="100"/>
...
</cache>
Introduced in GemFire version 10.1, the segmented disk stores feature optimizes read, write, and recovery performance by spreading regions and partitioned region buckets across segments. You can think of segments as subsidiary disk stores. Spreading the regions across segments can reduce contention for a single shared disk store.
The number of segments can be configured when a disk store is created or upgraded. If unspecified, the number of segments defaults to the number of available CPUs; benchmarking has shown that this default provides the best performance with the fewest open files in most cases. In some cases a different number of segments may be necessary. If disk store performance is not optimal with the default value, profile your application with different segment counts.
The additional segments change the disk allocation pattern compared to non-segmented disk stores. Because each segment has its own set of oplogs, a segmented disk store can use more disk space than an unsegmented disk store. The compaction-threshold is evaluated independently for each segment's set of oplogs.
VMware recommends that you do not disable segmented disk stores unless absolutely required.
To disable segmented disk stores and fall back to the original non-segmented disk store behavior, set the Java system property gemfire.disk.disableSegmentedDiskStore=true on each server that uses disk stores.
gfsh> start server ... --J=-Dgemfire.disk.disableSegmentedDiskStore=true
Alternatively, you may set the environment variable GEMFIRE_DISABLE_SEGMENTED_DISK_STORE=true before starting any processes. Each subprocess should inherit this environment variable.
GEMFIRE_DISABLE_SEGMENTED_DISK_STORE=true gfsh
gfsh> start server ...