Back up the Workload Cluster

You can use Velero to back up and restore the current state of a workload cluster's workloads and persistent volumes, and store the backup files on object storage. Dedicate a unique storage bucket on the object storage server to each cluster.

After you install the Velero add-on on a workload cluster, you can run the Velero commands in the web terminal connected to the cluster using the Embedded SSH Client.

Alternatively, you can run the Velero commands on the standalone Velero client. See Install Standalone Velero Client.

Prerequisites

Install and Configure Velero Add-On for the Workload Clusters

Procedure

Log in to the VMware Telco Cloud Automation web interface.
Navigate to Infrastructure > Virtual Infrastructure.
Open the web terminal by clicking Options (three dots) corresponding to the workload cluster you want to back up and then selecting Open Terminal.
In the web terminal, check the service health of Velero by running the following commands:
```
# kubectl get pod -n velero  # check pod status
# kubectl get bsl -n velero  # check the Velero BackupStorageLocation CR
```
Alternatively, you can check the service health of Velero by performing the following:
1. Go to Infrastructure > CaaS Infrastructure > Cluster Instances.
2. Select the required workload cluster name.
3. Click the Add-Ons tab.
4. Select the Velero add-on deployed.
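
The command-line health check can also be scripted. The following is a minimal sketch, assuming that a healthy Velero deployment means every pod in the velero namespace is in the Running or Succeeded phase and every BackupStorageLocation reports the Available phase. The function name `velero_healthy` is hypothetical and not part of Velero.

```shell
# Sketch only: velero_healthy is a hypothetical helper, not a Velero command.
# It checks pod phases, not container readiness.
velero_healthy() {
  local pod_phases bsl_phases phase
  pod_phases=$(kubectl get pod -n velero -o jsonpath='{.items[*].status.phase}') || return 1
  bsl_phases=$(kubectl get bsl -n velero -o jsonpath='{.items[*].status.phase}') || return 1

  # An empty list means no Velero pods or no storage location were found.
  [ -n "$pod_phases" ] && [ -n "$bsl_phases" ] || return 1

  for phase in $pod_phases; do
    case "$phase" in
      Running|Succeeded) ;;
      *) echo "pod phase: $phase" >&2; return 1 ;;
    esac
  done
  for phase in $bsl_phases; do
    [ "$phase" = "Available" ] || { echo "bsl phase: $phase" >&2; return 1; }
  done
}
```

For example, `velero_healthy && echo "Velero is healthy"` prints the message only when both checks pass.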

Set environment variables to exclude cluster resources and namespaces from the backup.

# export TCA_VELERO_EXCLUDE_RESOURCES="issuers.cert-manager.io,certificates.cert-manager.io,certificaterequests.cert-manager.io,gateways.networking.x-k8s.io,gatewayclasses.networking.x-k8s.io"
# export TCA_VELERO_EXCLUDE_NAMESPACES="velero,tkg-system,tca-system,tanzu-system,kube-system,tanzu-system-monitoring,tanzu-system-logging,cert-manager,avi-system,tanzu-system-ingress,vmware-system-csi,kube-node-lease,kube-public,tanzu-package-repo-global,tkg-system-public"
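
Long comma-separated values are easy to mistype. As one possible alternative, the same two variables can be assembled from one entry per line, which makes the lists easier to review. This is a sketch that produces the same values as the export commands above. The helper name `join_lines` is hypothetical.

```shell
# join_lines is a hypothetical helper: it joins the lines on stdin with commas.
join_lines() { paste -sd, -; }

TCA_VELERO_EXCLUDE_RESOURCES=$(join_lines <<'EOF'
issuers.cert-manager.io
certificates.cert-manager.io
certificaterequests.cert-manager.io
gateways.networking.x-k8s.io
gatewayclasses.networking.x-k8s.io
EOF
)

TCA_VELERO_EXCLUDE_NAMESPACES=$(join_lines <<'EOF'
velero
tkg-system
tca-system
tanzu-system
kube-system
tanzu-system-monitoring
tanzu-system-logging
cert-manager
avi-system
tanzu-system-ingress
vmware-system-csi
kube-node-lease
kube-public
tanzu-package-repo-global
tkg-system-public
EOF
)

export TCA_VELERO_EXCLUDE_RESOURCES TCA_VELERO_EXCLUDE_NAMESPACES
```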

Back up the workload cluster.

# velero backup create <example-backup> --exclude-namespaces=$TCA_VELERO_EXCLUDE_NAMESPACES --exclude-resources=$TCA_VELERO_EXCLUDE_RESOURCES

The above backup command uses velero-plugin-for-vsphere by default to back up the Persistent Volumes created with the vSphere CSI storage class. If the cluster also has Persistent Volumes created with the nfs-client storage class that you want to back up, you have two options:

Option 1: Annotate the pod that mounts the Persistent Volumes created with the nfs-client storage class so that Restic backs up those volumes.

# kubectl -n <pod_namespace> annotate pod/<pod-name> backup.velero.io/backup-volumes=<volume-name1>,<volume-name2>,…
# velero backup create <example-backup> --exclude-namespaces=$TCA_VELERO_EXCLUDE_NAMESPACES --exclude-resources=$TCA_VELERO_EXCLUDE_RESOURCES

To avoid re-annotating pods after they restart, you can add the above annotation to the pod template metadata of the deployment.

# kubectl -n <deploy_namespace> patch deployment <deployment-name> -p '{"spec": {"template":{"metadata":{"annotations":{"backup.velero.io/backup-volumes":"<volume-name1>,<volume-name2>,…"}}}}}'
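
Nested JSON on a single command line is prone to quoting and brace errors. The following sketch builds the same patch document from a list of volume names. The function name `velero_volumes_patch` is hypothetical, and the sketch assumes that the volume names need no JSON escaping, which holds for valid Kubernetes volume names.

```shell
# velero_volumes_patch is a hypothetical helper: it prints the JSON patch that
# sets the backup.velero.io/backup-volumes annotation on the pod template.
# Arguments: one or more volume names (assumed to need no JSON escaping).
velero_volumes_patch() {
  local volumes
  # Join all arguments with commas inside a subshell so IFS is not changed globally.
  volumes=$(IFS=,; printf '%s' "$*")
  printf '{"spec":{"template":{"metadata":{"annotations":{"backup.velero.io/backup-volumes":"%s"}}}}}\n' "$volumes"
}
```

The output can then be passed to the patch command, for example `kubectl -n <deploy_namespace> patch deployment <deployment-name> -p "$(velero_volumes_patch <volume-name1> <volume-name2>)"`.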

Option 2: Change the default PV backup plugin to Restic. Restic then backs up all types of Persistent Volumes, including the ones created with the vSphere CSI plugin.

# velero backup create <example-backup> --default-volumes-to-fs-backup --exclude-namespaces=$TCA_VELERO_EXCLUDE_NAMESPACES --exclude-resources=$TCA_VELERO_EXCLUDE_RESOURCES

Check the status of the backup and the related custom resources (CRs), and wait until they reach the "Completed" status.

# velero backup get  # check the backup status
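
Instead of re-running the status command by hand, you can poll the phase of the Backup CR. The following is a minimal sketch, assuming the Backup CR is in the velero namespace and that Failed, PartiallyFailed, and FailedValidation are the failure phases to stop on. The function name `wait_for_backup` and its default interval and poll count are hypothetical.

```shell
# wait_for_backup is a hypothetical helper, not a Velero command.
# Usage: wait_for_backup <backup-name> [poll-seconds] [max-polls]
# Returns 0 on Completed, 1 on a failed phase, 2 if kubectl fails, 3 on timeout.
wait_for_backup() {
  local name="$1"
  local interval="${2:-10}"
  local max_polls="${3:-180}"
  local phase=""
  local i=0
  while [ "$i" -lt "$max_polls" ]; do
    phase=$(kubectl get backups.velero.io "$name" -n velero -o jsonpath='{.status.phase}') || return 2
    case "$phase" in
      Completed)
        echo "backup $name: Completed"
        return 0 ;;
      PartiallyFailed|Failed|FailedValidation)
        echo "backup $name: $phase" >&2
        return 1 ;;
    esac
    i=$((i + 1))
    sleep "$interval"
  done
  echo "backup $name: timed out in phase ${phase:-unknown}" >&2
  return 3
}
```

For example, `wait_for_backup <example-backup> 15` checks the backup every 15 seconds. The `velero backup create` command also accepts a `--wait` flag that blocks until the backup finishes.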

If you use velero-plugin-for-vsphere to back up PV data, check the status of the uploads CR.

# kubectl get uploads -n velero  # get the upload-name
# kubectl get uploads <upload-name> -n velero -o yaml  # check the uploads status in the YAML output

If you annotate pods and use Restic to back up PV data, check the status of podvolumebackups.

# kubectl get podvolumebackups -n velero  # get the podvolumebackup-name
# kubectl get podvolumebackups <podvolumebackup-name> -n velero -o yaml  # check the podvolumebackups status in the YAML output
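
When a backup produces many uploads or podvolumebackups, reading each one as YAML is slow. The following sketch lists only the CRs that have not reached the Completed phase, assuming that both CR types are in the velero namespace and report their state in `.status.phase`. The function name `unfinished_crs` is hypothetical.

```shell
# unfinished_crs is a hypothetical helper, not a Velero or kubectl command.
# Usage: unfinished_crs uploads | unfinished_crs podvolumebackups
# Prints "name<TAB>phase" for each resource whose phase is not Completed.
# Prints nothing when every resource of that kind is Completed.
unfinished_crs() {
  kubectl get "$1" -n velero \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.phase}{"\n"}{end}' |
    awk -F '\t' '$2 != "Completed" { print $1 "\t" $2 }'
}
```

An empty result from `unfinished_crs uploads` or `unfinished_crs podvolumebackups` indicates that all CRs of that kind are complete. Any other output names the CRs to inspect with the `-o yaml` commands above.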

What to do next

Restore the Workload Cluster and Remediate the Network Functions.