Using VMware Tanzu Mission Control, you can protect the valuable data resources in your Kubernetes clusters using the backup and restore functionality provided by Velero, an open source community standard.

The data protection features of Tanzu Mission Control allow you to create the following types of backups for managed clusters (both attached and provisioned):
  • all resources in a cluster
  • selected or excluded namespaces in a cluster
  • specific or excluded resources in a cluster identified by a given label
You can selectively restore the backups you have created by specifying the following:
  • the entire backup
  • selected or excluded namespaces from the backup
  • specific or excluded resources from the backup identified by a given label
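Although you define backups and restores through Tanzu Mission Control rather than directly through Velero, the scoping options above map onto fields of the underlying Velero Backup resource. The following is a minimal sketch, not a Tanzu Mission Control artifact, assuming a hypothetical backup that includes only the my-app namespace and only resources labeled app=frontend:

    apiVersion: velero.io/v1
    kind: Backup
    metadata:
      name: example-backup        # hypothetical name
      namespace: velero
    spec:
      # Scope the backup to a single namespace ...
      includedNamespaces:
        - my-app
      # ... and, within it, to resources carrying a given label.
      labelSelector:
        matchLabels:
          app: frontend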

Additionally, you can schedule regular backups and manage the storage of backups and volume snapshots you create by specifying a retention period for each backup and deleting backups that are no longer needed.
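In Velero terms, a scheduled backup with a retention period corresponds to a Schedule resource whose backup template carries a ttl field. A minimal sketch, assuming a hypothetical daily schedule that retains each backup for 30 days:

    apiVersion: velero.io/v1
    kind: Schedule
    metadata:
      name: daily-backup          # hypothetical name
      namespace: velero
    spec:
      # Cron expression: run every day at 01:00.
      schedule: "0 1 * * *"
      template:
        # Retention period: delete each backup 720 hours (30 days) after creation.
        ttl: 720h0m0s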

When you perform a backup for a cluster, Tanzu Mission Control uses Velero to create a backup of the specified Kubernetes resources with snapshots of persistent volume data, and then stores the backup in the location that you specify.
Note: The namespaces kube-system, velero, tkg-system, and vmware-system-tmc are not included in backups.

For more information about Velero, visit https://velero.io/docs.

For information on how to use the data protection features in Tanzu Mission Control, see Protecting Data in Using VMware Tanzu Mission Control.

Note: The data protection features of Tanzu Mission Control are not available in Tanzu Mission Control Essentials.

About Backup Storage

For the storage of your backups, you can specify a target location that allows Tanzu Mission Control to manage the storage of backups, provisioning resources as necessary according to your specifications. However, if you prefer to manage your own storage for backups, you can also specify a target location that points to a storage location that you create and maintain in your cloud provider account, such as an AWS S3 or S3-compatible storage location or an Azure Blob storage location. With self-provisioned storage, you can leverage existing storage investments for backups, reducing network and cloud storage costs, and apply existing storage policies, quotas, and encryption. For a list of supported S3-compatible providers, see S3-Compatible object store providers in the Velero documentation.

Before you define a backup for a cluster, you must create a target location and credential that you will use to perform the backup.
  • The data protection credential specifies the access credentials for the account where your backup is stored. This account can be either your AWS account where Tanzu Mission Control manages backup storage, or an account where you manage backups (the account that contains your AWS S3 or S3-compatible storage or the subscription that contains your Azure Blob storage).
  • The data protection target location identifies the place where you want the backup stored, and references the associated data protection credential. You can share the target location across multiple cluster groups and clusters.
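On the cluster itself, a target location surfaces as a Velero BackupStorageLocation resource. The following is a minimal sketch of the self-managed case, assuming a hypothetical S3 bucket named my-backup-bucket in us-east-1:

    apiVersion: velero.io/v1
    kind: BackupStorageLocation
    metadata:
      name: self-provisioned-s3   # hypothetical name
      namespace: velero
    spec:
      provider: aws
      objectStorage:
        # A bucket that you create and maintain in your own AWS account.
        bucket: my-backup-bucket
      config:
        region: us-east-1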

About restic and Volume Snapshots

When you enable data protection for a cluster, Tanzu Mission Control installs Velero with restic (an open-source backup tool), configured to use the opt-out approach. With this approach, Velero uses restic to back up all pod volumes except those you explicitly exclude.

For volumes that you do not want included in backups, you can opt out by annotating the pod that contains the volume with the backup.velero.io/backup-volumes-excludes annotation. If a pod is annotated to exclude volumes from restic backups, restic does not include those volumes in the backup. In this scenario, Velero still attempts to create persistent volume snapshots if the specified target location for the backup is in the same cloud provider account as the cluster.
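For example, the following sketch opts a hypothetical pod out of restic backups for two of its volumes, cache and scratch:

    apiVersion: v1
    kind: Pod
    metadata:
      name: sample-app            # hypothetical pod
      annotations:
        # Comma-separated list of volume names to exclude from restic backups.
        backup.velero.io/backup-volumes-excludes: cache,scratch
    spec:
      containers:
        - name: app
          image: nginx            # placeholder image
          volumeMounts:
            - name: cache
              mountPath: /cache
            - name: scratch
              mountPath: /scratch
      volumes:
        - name: cache
          emptyDir: {}
        - name: scratch
          emptyDir: {}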

Because of the default node taints on Kubernetes control plane nodes, restic DaemonSet pods are not scheduled on the control plane nodes of the cluster. If your cluster contains pods with emptyDir volumes on control plane nodes, backup operations on the cluster fail unless you exclude those volumes from backups. To exclude emptyDir volumes, either annotate the pod containing the volume to opt out, as shown above, or exclude the namespace of the pod from the backup in Tanzu Mission Control.

For more information about Velero with restic, see the section on Restic Integration in the Velero documentation.

About Backup Restoration Between Different Clusters

When you create a backup using Tanzu Mission Control, that backup is available for restoration to other clusters in your organization. This feature allows you to create a backup in one cluster and restore it to a different cluster, even one running on a different platform.

When migrating workloads between clusters running different versions of Kubernetes, consider the availability of resources in each version and the compatibility of API groups for each custom resource. If the source and target clusters are running different versions of Kubernetes, keep the following in mind:
  • A Kubernetes version downgrade (restoring to a cluster running a lower version of Kubernetes) can cause incompatibility of core API groups and other issues associated with feature availability. Use this approach judiciously.
  • If a Kubernetes version upgrade (restoring to a cluster running a higher version of Kubernetes) causes incompatibility of core API groups, you must update the impacted custom resources in the source cluster prior to creating the backup.

    For example, IngressClass in the networking.k8s.io/v1beta1 API is no longer supported as of Kubernetes version 1.22.
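    Updating such a resource typically means re-applying it against the stable API group before you create the backup. A sketch of the corrected manifest, assuming a hypothetical class named example-class:

    # Previously served from networking.k8s.io/v1beta1, which Kubernetes 1.22 removed.
    apiVersion: networking.k8s.io/v1
    kind: IngressClass
    metadata:
      name: example-class                           # hypothetical name
    spec:
      controller: example.com/ingress-controller    # hypothetical controller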

For more information, see https://velero.io/docs/main/migration-case/ in the Velero documentation.

You cannot restore a backup that contains restic volumes to a cluster that does not have restic installed. Additionally, you can restore a backup that contains volume snapshots to another cluster only if both clusters share the same cloud provider account.

When migrating workloads between clusters running on different cloud providers, consider the following items:
  • By default, persistent volume claims (PVCs) might fail to bind to volumes because the appropriate storage class from the source cluster doesn't exist in the target cluster. To make sure your volumes bind, use a storage class map as described in https://velero.io/docs/v1.8/restore-reference/#changing-pvpvc-storage-classes in the Velero documentation.
    For example, the following ConfigMap maps the default and managed-premium storage classes from an AKS cluster to the gp2 storage class in an EKS cluster.
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: change-storage-class-config
      namespace: velero
      labels:
        velero.io/plugin-config: ""
        velero.io/change-storage-class: RestoreItemAction
    data:
      # Map the "default" and "managed-premium" storage classes backed by AzureDisk on 
      # the source cluster to "gp2", a storage class backed by AWS EBS on the current 
      # (destination) cluster.
      default: gp2
      managed-premium: gp2
  • Custom resources from the source cluster might not exist in the target cluster.

    For example, tiers.crd.antrea.io and tiers.security.antrea.tanzu.vmware.com from a Tanzu Kubernetes cluster are not found in an AKS cluster.

    You can exclude resources during restore to help avoid this issue (see the sketch after this list).

  • Resource differences between the source and target cluster might impact functionality.

    Some packages install webhooks that can cause issues when the source and target are not the same cluster. For example, mutatingwebhookconfiguration.admissionregistration.k8s.io and validatingwebhookconfiguration.admissionregistration.k8s.io resources from an AKS cluster can impact the functionality of an EKS cluster.

    You can exclude resources during restore to help avoid this issue.
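In Velero terms, excluding resources during a restore maps to the excludedResources field of the Restore resource. A minimal sketch covering the examples above, assuming a hypothetical backup named aks-backup:

    apiVersion: velero.io/v1
    kind: Restore
    metadata:
      name: restore-from-aks      # hypothetical name
      namespace: velero
    spec:
      backupName: aks-backup      # hypothetical backup name
      # Skip resources that exist only on the source cluster or that would
      # interfere with workloads on the destination cluster.
      excludedResources:
        - tiers.crd.antrea.io
        - mutatingwebhookconfigurations.admissionregistration.k8s.io
        - validatingwebhookconfigurations.admissionregistration.k8s.io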