Back Up and Restore Cluster Workloads

This topic explains how to back up and restore the workloads and dynamic storage volumes hosted on Tanzu Kubernetes Grid (TKG) workload clusters for TKG with a standalone management cluster.

To back up and restore cluster infrastructure, meaning the standalone management cluster and workload cluster objects themselves, see Back Up and Restore Management and Workload Cluster Infrastructure.

To back up and restore vSphere with Tanzu clusters, including Supervisor Clusters and the workload clusters that they create, see Backing Up and Restoring vSphere with Tanzu in the VMware vSphere 8.0 Documentation.

Set Up Velero

You can use Velero, an open source community standard tool, to back up and restore TKG standalone management cluster infrastructure and workloads.

Velero supports a variety of storage providers to store its backups. Velero also supports:

  • Pre- and post-hooks for backup and restore to run custom processes before or after backup and restore events.
  • Excluding aspects of workload or cluster state that are not well-suited to backup/restore.

A Tanzu Kubernetes Grid subscription includes support for VMware’s tested, compatible distribution of Velero available from the Tanzu Kubernetes Grid downloads page.

To back up and restore TKG clusters, you need:

  • The Velero CLI, installed as described in Install the Velero CLI below.
  • A storage provider for backups, set up as described in Set Up a Storage Provider below.
  • The Velero server deployed to each target cluster, as described in Deploy Velero Server to Workload Clusters below.

After you complete the prerequisites above, you can also use Velero to migrate workloads between clusters. For instructions, see Cluster Migration and Resource Filtering in the Velero documentation.

Install the Velero CLI

Caution

If you have already installed Velero CLI v1.9.x or earlier, as distributed with prior versions of TKG, you need to upgrade to v1.10.3. Older Velero versions do not work with the CRDs used in v1.10 and later. For information, see Upgrade Velero below.

To install the Velero CLI v1.10.3, do the following:

  1. Go to the Broadcom Support Portal and log in with your VMware customer credentials.
  2. Go to the Tanzu Kubernetes Grid downloads page.
  3. Scroll to the Velero entries and download the Velero CLI .gz file for your workstation OS. Its filename starts with velero-linux-, velero-mac-, or velero-windows64-.
  4. Use the gunzip command or the extraction tool of your choice to unpack the binary:

    gzip -d <RELEASE-TARBALL-NAME>.gz
    
  5. Rename the CLI binary for your platform to velero, make sure that it is executable, and add it to your PATH.

    macOS and Linux
    1. Move the binary into the /usr/local/bin folder and rename it to velero.
    2. Make the file executable:
    chmod +x /usr/local/bin/velero
    
    Windows
    1. Create a new Program Files\velero folder and copy the binary into it.
    2. Rename the binary to velero.exe.
    3. Right-click the velero folder, select Properties > Security, and make sure that your user account has the Full Control permission.
    4. Use Windows Search to search for env.
    5. Select Edit the system environment variables and click the Environment Variables button.
    6. Select the Path row under System variables, and click Edit.
    7. Click New to add a new row and enter the path to the velero binary.

Upgrade Velero

Velero v1.10.3 uses different CRDs from v1.9.x. In addition, Velero v1.10 adopted Kopia alongside Restic as an uploader, which led to several changes in the naming of components and commands, and in how Velero functions. For more information about breaking changes between v1.9.x and v1.10, see Breaking Changes in the Velero v1.10 Changelog. If you installed Velero v1.9.x with a previous version of TKG, you must upgrade Velero.

  1. Follow the procedure in Install the Velero CLI to install Velero v1.10.3.
  2. Update the CRD definitions with the Velero v1.10 binary.

    velero install --crds-only --dry-run -o yaml | kubectl apply -f -
    
  3. Update the Velero deployment and daemon set configuration to match the component renaming that happened in Velero v1.10.

    In the command below, uploader_type can be either restic or kopia.

    kubectl get deploy -n velero -ojson \
    | sed "s#\"image\"\: \"velero\/velero\:v[0-9]*.[0-9]*.[0-9]\"#\"image\"\: \"velero\/velero\:v1.10.0\"#g" \
    | sed "s#\"server\",#\"server\",\"--uploader-type=$uploader_type\",#g" \
    | sed "s#default-volumes-to-restic#default-volumes-to-fs-backup#g" \
    | sed "s#default-restic-prune-frequency#default-repo-maintain-frequency#g" \
    | sed "s#restic-timeout#fs-backup-timeout#g" \
    | kubectl apply -f -
    
  4. (Optional) If you are using the restic daemon set, rename the corresponding components.

    echo $(kubectl get ds -n velero restic -ojson) \
    | sed "s#\"image\"\: \"velero\/velero\:v[0-9]*.[0-9]*.[0-9]\"#\"image\"\: \"velero\/velero\:v1.10.0\"#g" \
    | sed "s#\"name\"\: \"restic\"#\"name\"\: \"node-agent\"#g" \
    | sed "s#\[ \"restic\",#\[ \"node-agent\",#g" \
    | kubectl apply -f -
    kubectl delete ds -n velero restic --force --grace-period 0 
    

For more information, see Upgrading to Velero 1.10 in the Velero documentation.
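The flag renames that the sed pipelines above perform can be previewed in isolation on a sample argument string. This is a sanity check only, with illustrative flag values, not a part of the upgrade itself:

```shell
# Preview how the upgrade's sed substitutions rewrite restic-era flag names
# to their Velero v1.10 equivalents, using a sample argument string.
echo '"server","--default-volumes-to-restic=true","--restic-timeout=4h"' \
  | sed "s#default-volumes-to-restic#default-volumes-to-fs-backup#g" \
  | sed "s#restic-timeout#fs-backup-timeout#g"
# Prints: "server","--default-volumes-to-fs-backup=true","--fs-backup-timeout=4h"
```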

Set Up a Storage Provider

To back up Tanzu Kubernetes Grid workload cluster contents, you need storage locations for:

  • Cluster object storage backups for Kubernetes metadata in clusters
  • Volume snapshots for data used by clusters

See Backup Storage Locations and Volume Snapshot Locations in the Velero documentation. Velero supports a variety of storage providers, which can be either:

  • An online cloud storage provider.
  • An on-premises object storage service such as MinIO, for proxied or air-gapped environments.

VMware recommends dedicating a unique storage bucket to each cluster.

To set up MinIO:

  1. Run the minio container image with MinIO credentials and a storage location, for example:

    docker run -d --name minio --rm -p 9000:9000 -e "MINIO_ACCESS_KEY=minio" -e "MINIO_SECRET_KEY=minio123" -e "MINIO_DEFAULT_BUCKETS=mgmt" gcr.io/velero-gcp/bitnami/minio:2021.6.17-debian-10-r7
    
  2. Save the credentials to a local file to pass to the --secret-file option of velero install, for example:

    [default]
    aws_access_key_id=minio
    aws_secret_access_key=minio123
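
One way to create this file from the shell, using the same credentials as the docker run example above. The ./credentials-velero path is illustrative; any local path passed to --secret-file works:

```shell
# Write MinIO credentials in the AWS-style INI format that Velero expects.
cat > ./credentials-velero <<'EOF'
[default]
aws_access_key_id=minio
aws_secret_access_key=minio123
EOF
```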
    

Storage for vSphere

On vSphere, cluster object storage backups and volume snapshots are saved to the same storage location. This location must be S3-compatible external storage: either a bucket on Amazon Web Services (AWS) or an S3-compatible provider such as MinIO.

To set up storage for Velero on vSphere, see Velero Plugin for vSphere in Vanilla Kubernetes Cluster for the v1.5.1 plugin.

Storage for and on AWS

To set up storage for Velero on AWS, follow the procedures in the Velero Plugins for AWS repository:

  1. Create an S3 bucket.

  2. Set permissions for Velero.

Set up S3 storage as needed for each plugin. The object store plugin stores and retrieves cluster object backups, and the volume snapshotter stores and retrieves data volumes.

Storage for and on Azure

To set up storage for Velero on Azure, follow the procedures in the Velero Plugins for Azure repository:

  1. Create an Azure storage account and blob container.

  2. Get the resource group containing your VMs and disks.

  3. Set permissions for Velero.

Set up storage as needed for each plugin. The object store plugin stores and retrieves cluster object backups, and the volume snapshotter stores and retrieves data volumes.

Deploy Velero Server to Workload Clusters

To deploy the Velero Server to a workload cluster, you run the velero install command. This command creates a namespace called velero on the cluster, and places a deployment named velero in it.

Note

If the cluster already has Velero installed, follow the steps in Upgrade Velero.

Velero Install Options

To install Velero, run velero install with the following options:

  • --provider $PROVIDER: For example, aws
  • --plugins projects.registry.vmware.com/tkg/velero/velero-plugin-for-aws:v1.6.2_vmware.1
  • --bucket $BUCKET: The name of your S3 bucket
  • --backup-location-config region=$REGION: The AWS region the bucket is in
  • --snapshot-location-config region=$REGION: The AWS region the bucket is in
  • (Optional) --kubeconfig to install the Velero server to a cluster other than the current default.
  • (Optional) --secret-file ./VELERO-CREDS: One way to give Velero access to an S3 bucket is to pass this option a local VELERO-CREDS file that looks like:

    [default]
    aws_access_key_id=<AWS_ACCESS_KEY_ID>
    aws_secret_access_key=<AWS_SECRET_ACCESS_KEY>
    
  • For additional options, see Install and start Velero.


How you run the velero install command and otherwise set up Velero on a cluster depends on your infrastructure and storage provider, as described in the following sections.
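The --bucket and --backup-location-config values that you pass to velero install are recorded in a BackupStorageLocation object in the velero namespace, which you can inspect afterwards with kubectl get backupstoragelocation -n velero. As a sketch, assuming the aws provider, the generated object looks roughly like this (bucket and region values illustrative):

```yaml
apiVersion: velero.io/v1
kind: BackupStorageLocation
metadata:
  name: default              # velero install names the first location "default"
  namespace: velero
spec:
  provider: aws              # from --provider
  objectStorage:
    bucket: my-tkg-backups   # from --bucket (illustrative)
  config:
    region: us-west-2        # from --backup-location-config region= (illustrative)
```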

Install Velero on Clusters on vSphere

This procedure installs Velero on workload clusters managed by a standalone management cluster on vSphere.

To deploy Velero to a vSphere with Tanzu Supervisor Cluster that serves as your Tanzu Kubernetes Grid management cluster, see Backing Up and Restoring vSphere with Tanzu.

  1. Install the Velero server to the current default cluster in your kubeconfig by running velero install with the options listed in Velero Install Options above.

    • For example, to use MinIO as the object storage, following the MinIO server setup instructions in the Velero documentation:

      velero install --provider aws --plugins "projects.registry.vmware.com/tkg/velero/velero-plugin-for-aws:v1.6.2_vmware.1" --bucket velero --secret-file ./credentials-velero --backup-location-config "region=minio,s3ForcePathStyle=true,s3Url=minio_server_url" --snapshot-location-config region="default"
      
    • For more information, see the Install section for Vanilla Kubernetes clusters in the Velero Plugin for vSphere v1.5.1 repository.


  2. If you did not pass a credentials file to --secret-file with velero install above, grant Velero access to your backup S3 bucket. On AWS, for example, attach a policy to the IAM role nodes.tkg.cloud.vmware.com, which governs hosted applications like Velero, to allow access to the S3 bucket.

  3. Add the Velero Plugin for vSphere, which lets Velero use your S3 bucket to store CSI volume snapshots for workload data, in addition to storing cluster objects:

    1. Download the Velero Plugin for vSphere v1.5.1 image.
    2. Retrieve and decode the vSphere credentials used by your CSI driver into a secret configuration file csi-vsphere.conf:

      kubectl -n vmware-system-csi get secret vsphere-config-secret -o jsonpath='{.data.csi-vsphere\.conf}'| base64 -d > csi-vsphere.conf
      
    3. (Optional) Check and confirm the vCenter IP address, username, and password values in the secret configuration file csi-vsphere.conf, which looks like this:

      cluster-id = "CLUSTER-ID"
      
      [VirtualCenter "VCENTER-IP"]
      user = "USERNAME"
      password = "PASSWORD"
      port = "443"
      
    4. Use the configuration file to create the velero-vsphere-config-secret secret in the namespace velero:

      kubectl -n velero create secret generic velero-vsphere-config-secret --from-file=csi-vsphere.conf
      
    5. Create a ConfigMap file velero-plugin.conf for the Velero plugin that references the secret:

      apiVersion: v1
      kind: ConfigMap
      metadata:
        name: velero-vsphere-plugin-config
      data:
        cluster_flavor: VANILLA
        vsphere_secret_name: velero-vsphere-config-secret
        vsphere_secret_namespace: velero
      
    6. Apply the ConfigMap:

      kubectl -n velero apply -f velero-plugin.conf
      
    7. Add the plugin:

      velero plugin add PLUGIN-IMAGE
      

      Where PLUGIN-IMAGE is the registry path to the container image listed in the Velero Plugin for vSphere repo v1.5.1, for example, projects.registry.vmware.com/tkg/velero/velero-plugin-for-vsphere:v1.5.1_vmware.1.

    8. Enable the plugin by adding the following VirtualMachine permissions to the role you created for the Tanzu Kubernetes Grid account, if you did not already include them when you created the account:

      • Configuration > Toggle disk change tracking
      • Provisioning > Allow read-only disk access
      • Provisioning > Allow virtual machine download
      • Snapshot management > Create snapshot
      • Snapshot management > Remove snapshot

Install Velero on Clusters on AWS

  1. To install Velero on workload clusters on AWS, follow the Install and start Velero procedure in the Velero Plugins for AWS repository.
  2. Run velero install with the options listed in Velero Install Options above.

Install Velero on Clusters on Azure

  1. To install Velero on workload clusters on Azure, follow the Install and start Velero procedure in the Velero Plugins for Azure repository.
  2. Run velero install with the options listed in Velero Install Options above.

Back Up and Restore Workloads

Use Velero to back up and restore a workload cluster’s current workloads and persistent volumes state, for entire clusters or specific namespaces.

Back Up Workloads

To back up the contents of a workload cluster:

  1. Follow the Deploy Velero Server to Workload Clusters instructions for your infrastructure, above, to deploy a Velero server onto the workload cluster, along with the Velero Plugin for vSphere if needed.

  2. Back up the cluster contents:

    velero backup create your_backup_name
    
  3. If velero backup returns a transport is closing error, try again after increasing the memory limit, as described in Update resource requests and limits after install in the Velero documentation.

Note

Backup and restore of Windows and multi-OS workload clusters is not supported.

Restore Workloads

To restore a workload cluster’s contents from backup:

  1. Create a new cluster. You cannot restore a cluster backup to an existing cluster.

  2. Follow the Deploy Velero Server to Workload Clusters instructions for your infrastructure, above, to deploy a Velero server onto the new cluster, along with the Velero Plugin for vSphere if needed.

  3. Restore the cluster contents:

    velero backup get
    velero restore create your_restore_name --from-backup your_backup_name
    