Create or Delete a VMware Tanzu GemFire Cluster

This topic describes how to create and delete a VMware Tanzu GemFire cluster in VMware Tanzu GemFire for Kubernetes.

Create a Tanzu GemFire Cluster

Once the VMware Tanzu GemFire Operator is installed (see Install the VMware Tanzu GemFire Operator), create a Tanzu GemFire cluster by using kubectl to apply the Custom Resource Definition (CRD) that describes the cluster. The Custom Resource Definition details the mappings and fields of a Tanzu GemFire cluster.

Create a namespace to be used for the Tanzu GemFire cluster:
```
kubectl create namespace NAMESPACE-NAME
```
Where NAMESPACE-NAME is your chosen name for the Tanzu GemFire cluster namespace.

This namespace is distinct and separate from the namespace created for the Tanzu GemFire operator.
Create an image pull secret for your Tanzu GemFire cluster’s namespace. Set the user name (USERNAME) and password (PASSWD) credentials to values that have permission to access VMware Tanzu Network, as they will be used to acquire the locator and server images from the registry. Create the image pull secret:
```
kubectl create secret docker-registry image-pull-secret --namespace=NAMESPACE-NAME --docker-server=registry.tanzu.vmware.com --docker-username='USERNAME' --docker-password='PASSWD'
```
Where NAMESPACE-NAME is your chosen name for the Tanzu GemFire cluster namespace.

Surround both the USERNAME and the PASSWD values with single quotation marks to ensure that the shell handles any special characters within those values correctly.
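To see why the quoting style matters, the following sketch compares single and double quotes in a POSIX shell. The password value and the `word` variable are made up for illustration only.

```shell
# Hypothetical shell variable that happens to exist in the session.
word=oops

# Hypothetical password containing a dollar sign (illustration only).
single='p@ss$word'   # single quotes: every character is kept literally
double="p@ss$word"   # double quotes: the shell substitutes the value of $word

printf 'single-quoted: %s\n' "$single"
printf 'double-quoted: %s\n' "$double"
```

With double quotes, the registry would receive a different password than the one intended, and the image pull would fail with an authentication error.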
If you are using Red Hat OpenShift as your Kubernetes platform, add the required service account within the NAMESPACE-NAME namespace to the anyuid Security Context Constraints, so that Tanzu GemFire’s containers can be executed as root:
```
oc adm policy add-scc-to-user anyuid -z SERVICE-ACCOUNT-NAME -n NAMESPACE-NAME
```
Where:
- NAMESPACE-NAME is your chosen name for the Tanzu GemFire cluster namespace.
- SERVICE-ACCOUNT-NAME is the spec: serviceAccountName field from the deployment YAML. Use default if the spec: serviceAccountName field is empty.
Place the YAML that represents the Tanzu GemFire cluster’s CRD into a file. For example, a definition that uses all possible defaults and names the cluster gemfire1:
```
apiVersion: gemfire.vmware.com/v1
kind: GemFireCluster
metadata:
  name: gemfire1
spec:
  image: registry.tanzu.vmware.com/pivotal-gemfire/vmware-gemfire:10.1.0
```
To create the Tanzu GemFire cluster, apply the CRD specified in the file with a command of the form:
```
kubectl -n NAMESPACE-NAME apply -f CLUSTER-CRD-YAML
```
Where:
- NAMESPACE-NAME is your chosen name for the Tanzu GemFire cluster namespace.
- CLUSTER-CRD-YAML is the name of the file containing the YAML that represents the Tanzu GemFire cluster.
Check the creation status of the Tanzu GemFire cluster:
```
kubectl -n NAMESPACE-NAME get GemFireClusters
```
Where NAMESPACE-NAME is your chosen name for the Tanzu GemFire cluster namespace.

For example, if the NAMESPACE-NAME is gemfire-cluster:
```
$ kubectl -n gemfire-cluster get GemFireClusters
NAME       LOCATORS   SERVERS
gemfire1   1/1        1/2
```
The first number shows the number of running replicas. The second number shows how many replicas were specified. When the quantity running reaches the number of replicas specified for both locators and servers, the Tanzu GemFire cluster creation is complete.

If the status ErrImagePull or ImagePullBackOff is encountered, verify the following:
- The registry server and credentials used when creating the image pull secret are correct.
- The image pull secret was created in the correct namespace.
- The VMware Software EULA has been accepted for Tanzu GemFire.
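The running/specified comparison described above can be scripted. The following is a minimal sketch, assuming the three-column output format shown in the example; the function names `replicas_ready` and `cluster_ready` are hypothetical helpers, not part of kubectl or Tanzu GemFire.

```shell
# Succeeds when a "RUNNING/SPECIFIED" value such as "2/2" has equal halves.
replicas_ready() {
  case $1 in
    */*) ;;            # must contain a slash
    *) return 1 ;;
  esac
  running=${1%%/*}
  specified=${1##*/}
  [ -n "$running" ] && [ "$running" = "$specified" ]
}

# Reads `kubectl get GemFireClusters` output on stdin and succeeds only if
# the row for the named cluster shows all locators and servers running.
cluster_ready() {
  name=$1
  while read -r row_name locators servers rest; do
    if [ "$row_name" = "$name" ]; then
      replicas_ready "$locators" && replicas_ready "$servers"
      return
    fi
  done
  return 1   # the cluster is not listed
}
```

A polling loop could then take the form `until kubectl -n NAMESPACE-NAME get GemFireClusters | cluster_ready gemfire1; do sleep 10; done`.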

Delete a Tanzu GemFire Cluster

To delete a Tanzu GemFire cluster, remove the cluster:

```
kubectl -n NAMESPACE-NAME delete GemFireCluster NAME
```
Where:
- NAMESPACE-NAME is your chosen name for the cluster namespace.
- NAME is the metadata: name field from the deployment YAML.

Deleting a Tanzu GemFire cluster will not remove persistent data associated with the cluster. To remove the persistent data, see Delete Persistent Data.

Restart a Deleted Tanzu GemFire Cluster

You can restart a Tanzu GemFire cluster after deletion, as long as the persistent data from the original cluster remains.

Note: When restarting a Tanzu GemFire cluster, it is important to keep the number of locator and server replicas the same as the number of replicas that were running when the original cluster was deleted. It is safe to scale the number of server replicas only after the cluster is fully running. Scaling of locator replicas is not supported by Tanzu GemFire.

To restart a deleted cluster, simply create a Tanzu GemFire cluster with the same name and namespace as the original cluster:

```
kubectl -n NAMESPACE-NAME apply -f CLUSTER-CRD-YAML
```
Where:
- NAMESPACE-NAME is your chosen name for the Tanzu GemFire cluster namespace.
- CLUSTER-CRD-YAML is the name of the file containing the YAML that represents the Tanzu GemFire cluster.

Recover from Restart Issues

If a Tanzu GemFire cluster is restarted with a different number of locator or server replicas, it may not start up successfully:

- If restarted with fewer replicas, the servers may never come online. In this case, the server logs contain the message, “It is waiting for another member to recover the latest data.”
- If restarted with more replicas, the servers may enter a CrashLoopBackOff state during startup. In this case, the server logs contain ConflictingPersistentDataException.

In either case, follow the steps below to recover:

Delete the cluster.
Wait until the pods associated with the cluster have been terminated, that is, until the following command returns no pods:
```
kubectl -n NAMESPACE-NAME get pods -l app.kubernetes.io/name=NAME
```
Where:
- NAMESPACE-NAME is your chosen name for the Tanzu GemFire cluster namespace.
- NAME is the metadata: name field from the deployment YAML.
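The wait can be automated with a polling loop. The following is a minimal sketch; the function name `wait_for_no_pods`, the retry count, and the delay are assumptions chosen for illustration. It relies on `-o name` printing nothing on standard output when no pods match.

```shell
# Polls until no pods match the cluster's label, or gives up.
# Arguments: NAMESPACE-NAME NAME [max attempts] [seconds between attempts]
wait_for_no_pods() {
  ns=$1
  name=$2
  tries=${3:-30}
  delay=${4:-10}
  i=0
  while [ "$i" -lt "$tries" ]; do
    # -o name prints one "pod/POD-NAME" line per matching pod.
    pods=$(kubectl -n "$ns" get pods -l "app.kubernetes.io/name=$name" -o name)
    if [ -z "$pods" ]; then
      return 0   # all pods are gone
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  return 1       # pods still present after the last attempt
}
```

A non-zero return status suggests that some pods are stuck, which is the case that the force-delete step addresses.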
If any of the pods have a status of “Terminating” for over a minute, force-delete the pods using a command like the following:
```
kubectl -n NAMESPACE-NAME delete pod POD-NAME --force --grace-period=0
```
Where:
- NAMESPACE-NAME is your chosen name for the Tanzu GemFire cluster namespace.
- POD-NAME is the name of the pod to force-delete.
Restart the cluster with the correct number of locator and server replicas.
At this point, if any servers fail to become “Ready”:
- Connect to a locator using gfsh (see Work with a Tanzu GemFire Cluster).
- Execute show missing-disk-stores.
- If any missing disk stores are present, revoke the disk stores. For example, revoke missing-disk-store --id=c00727d1-d909-4cf8-a0c9-507ee3a9440a.
- Repeat the previous two steps until there are no remaining missing disk stores.
Verify that the cluster is running with the correct number of locator and server replicas:
```
kubectl -n NAMESPACE-NAME get GemFireClusters
```
Where NAMESPACE-NAME is your chosen name for the Tanzu GemFire cluster namespace.

For example, if the NAMESPACE-NAME is gemfire-cluster:
```
$ kubectl -n gemfire-cluster get GemFireClusters
NAME       LOCATORS   SERVERS
gemfire1   1/1        2/2
```
The first number shows the number of running replicas. The second number shows how many replicas were specified. Wait until the quantity running reaches the number of replicas specified for both locators and servers.

If, after completing all of the above steps, the number of server replicas is increased, the new server pods may enter a CrashLoopBackOff state during startup due to either a ConflictingPersistentDataException or a RevokedPersistentDataException in the logs. This happens because the persistent data from the previous failed cluster start-up still exists. If this happens, you must delete the persistent data for the new server pods.

For example, suppose there is a Tanzu GemFire cluster named gemfire1 in namespace gemfire-cluster with one locator and three server replicas. After the cluster is running, the number of server replicas is scaled to four. The new pod may enter a CrashLoopBackOff state during startup:

```
$ kubectl -n gemfire-cluster get pods
NAME                 READY   STATUS             RESTARTS   AGE
gemfire1-locator-0   1/1     Running            0          4d17h
gemfire1-server-0    1/1     Running            0          4d17h
gemfire1-server-1    1/1     Running            0          4d17h
gemfire1-server-2    1/1     Running            0          4d17h
gemfire1-server-3    0/1     CrashLoopBackOff   9          35m
```

In this case:

Scale the server replicas down to the original number. (In the example above, scale down to three.)
Delete the persistent volume claim only for the server that was in a CrashLoopBackOff state. Its name has the form data-POD-NAME, where POD-NAME is the name of the pod using the persistent volume claim. In the example above, the command to delete the persistent volume claim would look like the following:
```
$ kubectl -n gemfire-cluster delete persistentvolumeclaim data-gemfire1-server-3
```
After the persistent volume claim is deleted, scale the server replicas back up. (In the example above, scale up to four.)
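When several new servers fail at once, the claim names can be derived from the pod listing. The following is a minimal sketch, assuming the `kubectl get pods` column layout shown above and server pods named NAME-server-N; `crashloop_pvcs` is a hypothetical helper name.

```shell
# Reads `kubectl get pods` output on stdin and prints the persistent volume
# claim name (data-POD-NAME) for each server pod in CrashLoopBackOff.
crashloop_pvcs() {
  awk 'NR > 1 && $1 ~ /-server-[0-9]+$/ && $3 == "CrashLoopBackOff" {
         print "data-" $1
       }'
}
```

For example, `kubectl -n gemfire-cluster get pods | crashloop_pvcs` would print data-gemfire1-server-3 for the listing above. Review the printed names before deleting any claim.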

Delete Persistent Data

To remove a Tanzu GemFire cluster’s persistent data, delete the persistent volume claims associated with the Tanzu GemFire cluster. Persistent volume claims are disk claims that Kubernetes makes on the underlying system.

Obtain a list of the persistent volume claims for the locators and servers:

```
kubectl -n NAMESPACE-NAME get persistentvolumeclaims -l gemfire.vmware.com/app=NAME-locator
kubectl -n NAMESPACE-NAME get persistentvolumeclaims -l gemfire.vmware.com/app=NAME-server
```
Where:
- NAMESPACE-NAME is your chosen name for the Tanzu GemFire cluster namespace.
- NAME is the metadata: name field from the deployment YAML.

For each persistent volume claim listed, delete it:

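```
kubectl -n NAMESPACE-NAME delete persistentvolumeclaim PVC_NAME
```
Where:
- NAMESPACE-NAME is your chosen name for the Tanzu GemFire cluster namespace.
- PVC_NAME is the name of a persistent volume claim from the list obtained in the previous step.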
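The list-then-delete steps can be combined into one loop. The following is a minimal sketch that uses the same label selectors as the commands above; `delete_cluster_pvcs` is a hypothetical helper name. Deleting the claims is irreversible, so run it only when the data is no longer needed.

```shell
# Deletes every persistent volume claim of the locators and servers of one
# Tanzu GemFire cluster. Arguments: NAMESPACE-NAME NAME
delete_cluster_pvcs() {
  ns=$1
  name=$2
  for role in locator server; do
    kubectl -n "$ns" get persistentvolumeclaims \
      -l "gemfire.vmware.com/app=$name-$role" -o name |
    while read -r pvc; do
      # $pvc has the form persistentvolumeclaim/PVC_NAME
      kubectl -n "$ns" delete "$pvc"
    done
  done
}
```

For example, `delete_cluster_pvcs gemfire-cluster gemfire1` would remove the claims for the cluster used in the earlier examples.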

Create a LoadBalancer Service

Optionally, create a Kubernetes LoadBalancer service to permit access to the Tanzu GemFire cluster from outside the Kubernetes cluster. There are two types of cluster access, and each requires its own LoadBalancer service:

- The Tanzu GemFire Management API permits gfsh access to the Tanzu GemFire cluster from outside the Kubernetes cluster, so that you can execute gfsh commands. See Executing gfsh Commands through the Management API in the Tanzu GemFire documentation.
- The Tanzu GemFire Developer REST API permits Tanzu GemFire cluster data access. For more information about the Tanzu GemFire Developer REST API, see Tanzu GemFire REST API Overview in the Tanzu GemFire documentation.

Some cloud providers assign a public IP address to a LoadBalancer service, which will expose the Tanzu GemFire cluster to the internet. Understand the security risks before creating a LoadBalancer service.

Create a LoadBalancer Service for the Management API

After the Tanzu GemFire cluster has been created, a LoadBalancer service for the Tanzu GemFire Management API may be created.

Place the following YAML configuration in a file:

```
apiVersion: v1
kind: Service
metadata:
  name: lb-svc-mgmt-api
spec:
  selector:
    gemfire.vmware.com/app: NAME-locator
  ports:
    - name: management
      port: 7070
      targetPort: 7070
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 10800
  type: LoadBalancer
```

You can replace lb-svc-mgmt-api with a name that you create for the LoadBalancer service. The name that you create must follow the syntax rules for a DNS name. NAME is the metadata: name field from the Tanzu GemFire cluster’s deployment YAML.

Create the LoadBalancer service by applying this configuration to the Kubernetes cluster with a command of the form:

```
kubectl -n NAMESPACE-NAME apply -f YAML-FILE-NAME
```
Where:
- NAMESPACE-NAME is your chosen name for the Tanzu GemFire cluster namespace.
- YAML-FILE-NAME is your YAML configuration’s file name.
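Once the service exists, its external address can be read from the service status. The following is a minimal sketch, assuming the cloud provider reports either an IP address or a hostname in the standard `status.loadBalancer.ingress` field; `lb_address` is a hypothetical helper name.

```shell
# Prints the external IP address or hostname assigned to a LoadBalancer
# service. Prints nothing while the address is still being provisioned.
# Arguments: NAMESPACE-NAME SERVICE-NAME
lb_address() {
  ns=$1
  svc=$2
  kubectl -n "$ns" get service "$svc" \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}{.status.loadBalancer.ingress[0].hostname}'
}
```

For example, `lb_address NAMESPACE-NAME lb-svc-mgmt-api` prints the address on which port 7070 of the service is reachable.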

Create a LoadBalancer Service for the Developer REST API

Create a Tanzu GemFire cluster that has the Developer REST API enabled. The Developer REST API is enabled by including a Tanzu GemFire property for servers in the CRD YAML:

```
overrides:
  gemFireProperties:
    - name: "start-dev-rest-api"
      value: "true"
```

After the Tanzu GemFire cluster has been created, a LoadBalancer service for the Tanzu GemFire Developer REST API may be created.

Place the following YAML configuration in a file:

```
apiVersion: v1
kind: Service
metadata:
  name: LB-SVC-DEV-API
spec:
  selector:
    gemfire.vmware.com/app: NAME-server
  ports:
    - name: rest-api
      port: 7070
      targetPort: 7070
  type: LoadBalancer
```

Where LB-SVC-DEV-API is your chosen name for the LoadBalancer service. The chosen name must follow the syntax rules for a DNS name. NAME is the metadata: name field from the Tanzu GemFire cluster’s deployment YAML.
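A service name can be checked before it is applied. The following is a minimal sketch, assuming that the DNS name rules in question are the RFC 1035 label rules that Kubernetes has traditionally applied to Service names: at most 63 characters, lowercase letters, digits, and hyphens only, starting with a letter and ending with a letter or digit. `is_dns_label` is a hypothetical helper name.

```shell
# Succeeds if the argument is a valid RFC 1035 DNS label:
# 1 to 63 characters, lowercase alphanumerics or '-', starting with a
# letter and ending with a letter or digit.
is_dns_label() {
  printf '%s' "$1" | grep -Eq '^[a-z]([-a-z0-9]{0,61}[a-z0-9])?$'
}

for candidate in lb-svc-dev-api LB-SVC-DEV-API; do
  if is_dns_label "$candidate"; then
    echo "$candidate: valid"
  else
    echo "$candidate: invalid"
  fi
done
```

Under these rules the uppercase placeholder LB-SVC-DEV-API is not itself a usable name; it must be replaced with a lowercase name such as lb-svc-dev-api.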

Create the LoadBalancer service by applying this configuration to the Kubernetes cluster with a command of the form:

```
kubectl -n NAMESPACE-NAME apply -f YAML-FILE-NAME
```
Where:
- NAMESPACE-NAME is your chosen name for the Tanzu GemFire cluster namespace.
- YAML-FILE-NAME is your YAML configuration’s file name.

Delete a LoadBalancer Service

To delete LoadBalancer services:

Identify the services with a command of the form:
```
kubectl -n NAMESPACE-NAME get services
```
Where NAMESPACE-NAME is your chosen name for the Tanzu GemFire cluster namespace.
For each listed LoadBalancer service, delete it with a command of the form:
```
kubectl -n NAMESPACE-NAME delete service LB-SVC-NAME
```
Where LB-SVC-NAME is the name field from the service’s configuration YAML.
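The two steps above can be combined by filtering the service listing on its TYPE column. The following is a minimal sketch, assuming the default `kubectl get services` column layout (NAME, TYPE, and so on); `loadbalancer_services` is a hypothetical helper name.

```shell
# Reads `kubectl get services` output on stdin and prints the name of each
# service whose TYPE column is LoadBalancer.
loadbalancer_services() {
  awk 'NR > 1 && $2 == "LoadBalancer" { print $1 }'
}
```

For example, `kubectl -n NAMESPACE-NAME get services | loadbalancer_services` lists the candidate names to pass to the delete command. The filter matches every LoadBalancer service in the namespace, not only those created for Tanzu GemFire, so review the list before deleting.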