This topic explains how to back up and restore VMware Tanzu GemFire for Tanzu Application Service service instances.

Both disaster recovery and system validation depend upon backups. A GemFire for Tanzu Application Service service instance backup consists of the configuration of the service instance together with all region data that has been persisted to disk. The backup can be used to restore the service on a new, unconfigured GemFire for Tanzu Application Service service instance.

Before You Begin

Warning: Consider each of these important items when preparing to back up and restore a GemFire for Tanzu Application Service service instance. Since recovery depends on a correct implementation, address each item during planning.

  • The backup consists of data from regions that have been persisted to disk. All non-persistent region data will be lost. A persistent region is one that writes its region entries to disk.
  • The backup and restoration process does not restore WAN replication configuration. A GemFire for Tanzu Application Service service instance that replicates data via WAN may be backed up. A restore of that GemFire for Tanzu Application Service service instance will contain all the persistent data. That GemFire for Tanzu Application Service service instance will be able to communicate with other WAN-connected service instances after further configuration.
  • The disk on the jumpbox VM must be large enough to hold the backup artifacts. The available disk space must exceed the total disk space used within the GemFire for Tanzu Application Service service instance to be backed up. To approximate that total, look at the fully populated service instance and sum the disk space used in /var/vcap/store by each locator and server.
  • The GemFire for Tanzu Application Service service instance needs to be quiescent when the backup is taken. An active cluster could have disk writes in progress, so the backup might capture some region data and miss other data that had not yet been written to disk.
  • The fresh GemFire for Tanzu Application Service service instance that is to be used for a restore must have the same number of locators and servers as the original service instance had when its backup was taken. The fresh service instance must also have sufficient resources to hold the data that is in the backup.
  • Do not configure the fresh GemFire for Tanzu Application Service service instance that is to be used for a restore after creation. It must be empty. Do not create any regions or start any gateway senders. The restore will create all regions, both persistent and non-persistent. The non-persistent regions will be empty.
  • Service keys are not part of the backup. New service keys must be created on a restored service instance.
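
The disk space estimate described above can be sketched as a small calculation. This is a minimal sketch, not part of the product: the function names sum_kb and has_room are hypothetical, and it assumes the usage figure for each locator and server has already been collected in kilobytes (for example with du) and that the free space on the jumpbox is known in kilobytes (for example from df).

```shell
#!/bin/sh
# Hypothetical helpers for the disk space estimate; names are illustrative.

# Sum per-VM usage figures (in KB), one per line on stdin.
# Each figure is assumed to come from running this on a locator or server VM:
#   du -sk /var/vcap/store | cut -f1
sum_kb() {
  awk '{ total += $1 } END { printf "%.0f\n", total }'
}

# Succeed only if the available space strictly exceeds the required space.
# The available figure is assumed to come from the jumpbox, for example:
#   df -Pk . | awk 'NR == 2 { print $4 }'
has_room() {
  required_kb=$1
  available_kb=$2
  [ "$available_kb" -gt "$required_kb" ]
}
```

For example, printf '100\n250\n50\n' | sum_kb prints 400, and has_room 400 1000 succeeds. The comparison is strict because the guidance above says the jumpbox space needs to exceed the sum, not merely equal it.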

Backing Up a GemFire for Tanzu Application Service Service Instance

  1. Optional: Back up the VMware Tanzu Application Service for VMs environment. For instructions, see Backing Up and Restoring your Tanzu Operations Manager Deployment in the VMware Tanzu Operations Manager documentation.

  2. Optional: Compact the disk stores of the cluster to be backed up. Use the gfsh compact disk-store command after connecting to the cluster with a role that is authorized with the CLUSTER:MANAGE operation permission. See the gfsh compact disk-store command in the VMware GemFire documentation.

  3. Optional: Acquire a region statistic that may be used for a minimal validation upon using this backup for subsequent restoration. Use the gfsh describe region command on each persistent region, after connecting to the cluster with a role that is authorized with the CLUSTER:READ operation permission.

    describe region --name=REGION-NAME

    For each persistent region (the data-policy will contain the string PERSISTENT), record the size of the region. This size is the number of entries in the region. Provided that the region is quiescent when the backup is taken, comparing this number with the number of entries in the restored region gives a minimal validation of the restore.
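
Recording the region size can be sketched as a small filter over saved gfsh output. This is a sketch under a stated assumption: it assumes the describe region output contains a pipe-separated table row whose name column is size and whose value column holds the entry count. If your gfsh version prints the size differently, adjust the pattern. The function name region_size and the file names in the comments are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read saved "describe region" output on stdin and
# print the value of the first table row named "size".
# ASSUMPTION: rows look like "Region | size | 42" (pipe-separated columns).
region_size() {
  awk -F'|' '
    {
      name = $2
      gsub(/[[:space:]]/, "", name)   # strip padding around the name column
    }
    name == "size" {
      value = $3
      gsub(/[[:space:]]/, "", value)  # strip padding around the value column
      print value
      exit
    }
  '
}

# Example use, with hypothetical file names:
#   printf '%s %s\n' REGION-NAME "$(region_size < describe-output.txt)" \
#     >> region-sizes-before.txt
```

Appending one "REGION-NAME SIZE" line per persistent region produces a record that can be compared against the restored cluster later.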

  4. Use SSH to connect to the jumpbox. Assuming that the Ops Manager VM is being used as the jumpbox, follow directions in Log in to the Tanzu Operations Manager VM with SSH in Advanced Tanzu Operations Manager Troubleshooting with the BOSH CLI in the VMware Tanzu Operations Manager documentation.

  5. Determine all needed credentials and parameters for the command that makes the backup:

    • BOSH-DIRECTOR-IP: Obtain the BOSH Director IP address by following the instructions in Retrieve BOSH Director Address in Backing Up Your Tanzu Operations Manager Deployments with BBR in the VMware Tanzu Operations Manager documentation.
    • SERVICE-INSTANCE-DEPLOYMENT-NAME: Obtain the deployment name by following the instructions at Acquire the Deployment Name.
    • BOSH-CLIENT, BOSH-CLIENT-PASSWORD, and PATH-TO-BOSH-SERVER-CERTIFICATE: To find these three values, open the Credentials tab of the BOSH Director tile in Ops Manager. Find and open Bosh Commandline Credentials. PATH-TO-BOSH-SERVER-CERTIFICATE is the path to the root Certificate Authority (CA) certificate. Its value is /var/tempest/workspaces/default/root_ca_certificate if Ops Manager and the jumpbox are on the same VM.

  6. Make the backup. From the jumpbox, issue this command, substituting values acquired in the previous step:

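    BOSH_CLIENT_SECRET=BOSH-CLIENT-PASSWORD bbr deployment \
    --target BOSH-DIRECTOR-IP \
    --username BOSH-CLIENT \
    --deployment SERVICE-INSTANCE-DEPLOYMENT-NAME \
    --ca-cert PATH-TO-BOSH-SERVER-CERTIFICATE \
    backup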

    BOSH_CLIENT_SECRET is an environment variable that is set only for the duration of the command.

    The backup is written to a directory on the jumpbox whose name combines the deployment name and a timestamp.
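
Because the command depends on several values gathered in the previous step, it can help to confirm that none is missing before running it. The following is a minimal sketch, not a VMware-provided script. The function name check_bbr_params and the shell variable names BOSH_DIRECTOR_IP, BOSH_CLIENT, DEPLOYMENT_NAME, and CA_CERT_PATH are assumptions chosen for illustration. Only BOSH_CLIENT_SECRET is a name that this topic itself uses.

```shell
#!/bin/sh
# Hypothetical pre-flight check: confirm every value needed by the bbr
# command is set and that the CA certificate file is readable.
# Variable names other than BOSH_CLIENT_SECRET are illustrative assumptions.
check_bbr_params() {
  for name in BOSH_DIRECTOR_IP BOSH_CLIENT BOSH_CLIENT_SECRET \
              DEPLOYMENT_NAME CA_CERT_PATH; do
    eval "value=\${$name:-}"          # indirect lookup of the variable
    if [ -z "$value" ]; then
      echo "missing parameter: $name" >&2
      return 1
    fi
  done
  if [ ! -r "$CA_CERT_PATH" ]; then
    echo "cannot read CA certificate: $CA_CERT_PATH" >&2
    return 1
  fi
}

# Example use: run the check, and issue the bbr command only if it succeeds.
#   check_bbr_params && echo "parameters look complete"
```

The check only confirms that the values are present. It does not verify that the credentials are correct or that the BOSH Director is reachable.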

  7. Copy the backup to a permanent home for archiving. VMware recommends compressing and encrypting the files. Disaster recovery plans often recommend archiving multiple copies of each backup.
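
Compressing and encrypting the backup directory can be sketched with standard tools. This is one possible approach, assuming tar and GnuPG are installed on the jumpbox. The function names archive_backup and encrypt_archive are hypothetical, and the choice of passphrase-based (symmetric) encryption is an assumption, not a VMware recommendation.

```shell
#!/bin/sh
# Hypothetical helpers for archiving a backup directory.

# Pack a backup directory into a gzip-compressed tar file.
#   $1: path to the backup directory   $2: path of the archive to create
archive_backup() {
  backup_dir=$1
  archive=$2
  tar -czf "$archive" -C "$(dirname "$backup_dir")" "$(basename "$backup_dir")"
}

# Encrypt an archive with a passphrase; gpg prompts for it interactively
# and writes the result to ARCHIVE.gpg.
encrypt_archive() {
  gpg --symmetric --cipher-algo AES256 --output "$1.gpg" "$1"
}

# To reverse the process before a restore (file names are placeholders):
#   gpg --output backup.tgz --decrypt backup.tgz.gpg
#   tar -xzf backup.tgz
```

Archiving relative to the parent directory keeps the original directory name inside the archive, so the expanded backup has the same name it had when it was created.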

Restoring a GemFire for Tanzu Application Service Service Instance

  1. Use SSH to connect to the jumpbox. Assuming that the Ops Manager VM is being used as the jumpbox, follow directions in Log in to the Tanzu Operations Manager VM with SSH in Advanced Tanzu Operations Manager Troubleshooting with the BOSH CLI in the VMware Tanzu Operations Manager documentation.

  2. Transfer the archived backup to the jumpbox. Expand if compressed, and decrypt if encrypted.

  3. Create a new, empty service instance. See directions in Creating a GemFire for Tanzu Application Service Service Instance. If the service instance to be restored was connected to a WAN, be sure to create the new instance with the same distributed_system_id as the old one.

  4. Determine all needed credentials and parameters for the command that does the restore by following the instructions in step 5 of Backing Up a GemFire for Tanzu Application Service Service Instance.

  5. Do the restore. From the jumpbox, issue this command, substituting values acquired in the previous step:

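    BOSH_CLIENT_SECRET=BOSH-CLIENT-PASSWORD bbr deployment \
    --target BOSH-DIRECTOR-IP \
    --username BOSH-CLIENT \
    --deployment SERVICE-INSTANCE-DEPLOYMENT-NAME \
    --ca-cert PATH-TO-BOSH-SERVER-CERTIFICATE \
    restore --artifact-path PATH-TO-SERVICE-INSTANCE-ARTIFACT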

    PATH-TO-SERVICE-INSTANCE-ARTIFACT is the path to the artifact for the instance that you are currently restoring.

  6. Create a new service key, as the service key was not an artifact that the backup created. For a stand-alone service instance (one that is not part of a WAN), follow the directions at Create a Service Key. For a WAN-connected service instance, see Restoring a WAN Connection.

  7. If the region sizes were captured before making the backup, do a minimal validation of the restored cluster. Use the gfsh describe region command on each persistent region, after connecting to the cluster with a role that is authorized with the CLUSTER:READ operation permission.

    describe region --name=REGION-NAME

    For each persistent region, the size of the restored region should be the same as the value captured when the backup was made.
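
The comparison can be sketched as a check of two small text files. This is a minimal sketch with assumptions: it assumes the sizes were recorded by hand or by script as one "REGION-NAME SIZE" line per persistent region, once before the backup and once after the restore. The function name compare_sizes, the file names, and the MISMATCH and MISSING labels are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: compare region sizes recorded before the backup ($1)
# with sizes recorded after the restore ($2). Each file holds lines of the
# form "REGION-NAME SIZE". Prints one line per problem and returns 1 if any
# region differs or is absent from the restored list.
# Caveat: the first file must not be empty, or awk treats the second file
# as the reference list.
compare_sizes() {
  awk '
    NR == FNR { want[$1] = $2; next }   # first file: expected sizes
    {
      seen[$1] = 1
      if (!($1 in want) || want[$1] != $2) {
        print "MISMATCH " $1
        bad = 1
      }
    }
    END {
      for (r in want) {
        if (!(r in seen)) {
          print "MISSING " r
          bad = 1
        }
      }
      exit (bad ? 1 : 0)
    }
  ' "$1" "$2"
}

# Example use, with hypothetical file names:
#   compare_sizes region-sizes-before.txt region-sizes-after.txt
```

Matching entry counts are only a minimal validation. They do not confirm that the entry values themselves are identical.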

  8. Rebind or restage all apps.
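
Restaging several apps can be sketched as a loop over a list of app names. This is a minimal sketch, assuming the cf CLI is installed and already logged in and targeted at the correct org and space. The function name restage_all and the CF override variable are hypothetical. The override exists only so the loop can be tried as a dry run.

```shell
#!/bin/sh
# Hypothetical helper: restage every app named on stdin, one name per line.
# Set CF=echo for a dry run that prints the commands instead of running them.
restage_all() {
  while IFS= read -r app; do
    [ -n "$app" ] || continue                 # skip blank lines
    # Redirect stdin so the command cannot consume the list of app names.
    "${CF:-cf}" restage "$app" </dev/null
  done
}

# Example dry run, with hypothetical app names:
#   printf 'app-one\napp-two\n' | CF=echo restage_all
```

Apps that must pick up new credentials may need to be unbound and bound again (cf unbind-service and cf bind-service) before restaging. Whether that applies depends on how each app obtains its credentials.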

