To back up and restore Tanzu Kubernetes Grid management clusters, you can use Velero, an open source community standard tool for backing up and restoring Kubernetes cluster resources and persistent volumes. Velero supports a variety of storage providers to store its backups.
If a Tanzu Kubernetes Grid management cluster crashes and fails to recover, the infrastructure administrator can use a Velero backup to restore its contents to a new management cluster, including management cluster extensions and internal API objects for the workload clusters.
The following sections explain how to back up and restore a Tanzu Kubernetes Grid management cluster using the Velero CLI.
To back up and restore Tanzu Kubernetes Grid workload clusters, you can use velero backup and velero restore commands similar to the ones described below.
These sections describe what you need to back up and restore Tanzu Kubernetes Grid management clusters.
Download the Velero CLI .gz file for your workstation OS. Its filename starts with velero-linux-, velero-mac-, or velero-windows64-.
Use the gunzip command or the extraction tool of your choice to unpack the binary:
gzip -d <RELEASE-TARBALL-NAME>.gz
Rename the CLI binary for your platform to velero, make sure that it is executable, and add it to your PATH.
macOS and Linux platforms:
Move the binary into the /usr/local/bin folder and rename it to velero.
Make the binary executable:
chmod +x /usr/local/bin/velero
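The macOS and Linux steps above can be sketched as a small shell helper. This is a minimal sketch, not an official installer: the function name install_velero_cli is hypothetical, and it assumes the downloaded file is a gzip-compressed single binary, as the steps describe.

```shell
#!/usr/bin/env bash
# Sketch: unpack a downloaded Velero CLI .gz file and install it as "velero".
# install_velero_cli is a hypothetical helper name, not part of Velero.
install_velero_cli() {
  local archive="$1"                     # path to the downloaded .gz file
  local dest_dir="${2:-/usr/local/bin}"  # install directory; must be on your PATH

  [ -f "$archive" ] || { echo "archive not found: $archive" >&2; return 1; }

  # gzip -d replaces <name>.gz with <name>
  gzip -d "$archive" || return 1
  local binary="${archive%.gz}"

  # Move the binary into place under the name "velero" and make it executable
  mv "$binary" "$dest_dir/velero" || return 1
  chmod +x "$dest_dir/velero"
}
```

Call it with the path to the downloaded file, for example install_velero_cli ./<RELEASE-TARBALL-NAME>.gz. Writing to /usr/local/bin typically requires elevated permissions, so you may need to run the move and chmod steps with sudo.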
Windows platforms:
Create a Program Files\velero folder and copy the binary into it.
Rename the binary to velero.exe.
Right-click the velero folder, select Properties > Security, and make sure that your user account has the Full Control permission.
Use Windows Search to search for env.
Select Edit the system environment variables, and click Environment Variables.
Select the Path row under System variables, and click Edit.
Add the path to the velero binary.
Velero supports a variety of storage providers. You need a storage provider to store cluster object backups and snapshots of any CSI persistent volumes that the cluster objects use.
VMware recommends dedicating a unique storage bucket to each cluster, so the instructions below include setting up storage for each management cluster backup or restore operation.
How you set up a storage provider depends on your infrastructure:
vSphere (On Premises)
AWS and Azure
Use the velero install command, with parameters including:
--plugins: the object store and volume snapshotter plugin versions
--backup-location-config: object backup location
--snapshot-location-config: volume snapshot location
The following Velero CLI commands use a public Velero image repository. Replace the repository with the official Tanzu Kubernetes Grid image repository by following the Customize Velero Install instructions in the Velero documentation. For example:
velero install \
  --image=$TKG_REG/velero:$VELERO_VERSION \
  --plugins=$TKG_REG/velero-plugin-for-aws:$PLUGIN_VERSION \
  <....>
These sections describe how to back up and restore Tanzu Kubernetes Grid management clusters on vSphere.
Install Velero server on the management cluster. This example uses MinIO as the object storage:
velero install \
  --provider aws \
  --plugins "velero/velero-plugin-for-aws:v1.1.0" \
  --bucket velero \
  --secret-file ./credentials-velero \
  --backup-location-config "region=minio,s3ForcePathStyle=true,s3Url=minio_server_url" \
  --snapshot-location-config region="default"
See the MinIO server setup instructions in the Velero documentation.
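The install command above reads object storage credentials from ./credentials-velero. The Velero plugin for AWS expects this file in the AWS shared credentials format. The following is a minimal sketch for creating it: the helper name write_velero_credentials is hypothetical, and the access key and secret key are whatever your MinIO server is configured with.

```shell
# Sketch: write a credentials file in AWS shared-credentials format for --secret-file.
# write_velero_credentials is a hypothetical helper; pass your own MinIO keys.
write_velero_credentials() {
  local access_key="$1" secret_key="$2" out="${3:-./credentials-velero}"
  [ -n "$access_key" ] && [ -n "$secret_key" ] || {
    echo "usage: write_velero_credentials ACCESS_KEY SECRET_KEY [FILE]" >&2
    return 1
  }
  # Create the file so that only the current user can read it
  ( umask 077
    printf '[default]\naws_access_key_id = %s\naws_secret_access_key = %s\n' \
      "$access_key" "$secret_key" > "$out" )
}
```

Run it once before velero install, and avoid committing the resulting file to version control.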
If there are CSI-based volumes to back up, follow the Velero Plugin for vSphere setup instructions to install the Velero plugin for vSphere.
Set Cluster.Spec.Paused to true for all workload clusters:
kubectl patch cluster workload_cluster_name --type='merge' -p '{"spec":{"paused": true}}'
Back up the management cluster:
velero backup create your_backup_name --exclude-namespaces=tkg-system
Excluding tkg-system objects avoids creating duplicate management cluster API objects when restoring to a new management cluster.
Set Cluster.Spec.Paused back to false for the workload clusters.
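The pause, back up, and unpause steps above can be combined into one script. This is a sketch under stated assumptions rather than a supported tool: the function names and the KUBECTL and VELERO variables are hypothetical, the workload cluster names are passed in by you, and the --wait flag of velero backup create is added here so that the clusters stay paused until the backup finishes.

```shell
#!/usr/bin/env bash
# Sketch: pause workload clusters, back up the management cluster, then unpause.
# KUBECTL and VELERO can be overridden (for example with "echo") for a dry run.
KUBECTL="${KUBECTL:-kubectl}"
VELERO="${VELERO:-velero}"

# set_paused true|false CLUSTER... : patch Cluster.Spec.Paused on each named cluster
set_paused() {
  local value="$1"; shift
  local name
  for name in "$@"; do
    "$KUBECTL" patch cluster "$name" --type='merge' \
      -p "{\"spec\":{\"paused\": ${value}}}" || return 1
  done
}

# backup_management_cluster BACKUP_NAME CLUSTER...
backup_management_cluster() {
  local backup_name="$1"; shift
  local status=0
  set_paused true "$@" || return 1
  # --wait blocks until the backup finishes, so clusters stay paused throughout
  "$VELERO" backup create "$backup_name" --exclude-namespaces=tkg-system --wait \
    || status=$?
  # Unpause even if the backup failed, then report the backup's exit status
  set_paused false "$@" || return 1
  return "$status"
}
```

For example, backup_management_cluster your_backup_name workload_cluster_name runs the three steps in order. Setting KUBECTL=echo and VELERO=echo first prints the commands without running them.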
Install Velero server on the new management cluster. This example uses MinIO as the object storage:
velero install \
  --provider aws \
  --plugins "velero/velero-plugin-for-aws:v1.1.0" \
  --bucket velero \
  --secret-file ./credentials-velero \
  --backup-location-config "region=minio,s3ForcePathStyle=true,s3Url=minio_server_url"
See the MinIO server setup instructions in the Velero documentation.
If there are CSI-based volumes to restore, follow the Velero Plugin for vSphere setup instructions to install the Velero plugin for vSphere.
Restore the management cluster:
velero restore create your_restore_name --from-backup your_backup_name
Set Cluster.Spec.Paused to false for all workload clusters:
kubectl patch cluster cluster_name --type='merge' -p '{"spec":{"paused": false}}'
These sections describe how to back up and restore Tanzu Kubernetes Grid management clusters on AWS.
Follow the Velero Plugin for AWS setup instructions to install Velero server on the management cluster.
Set Cluster.Spec.Paused to true for all workload clusters:
kubectl patch cluster workload_cluster_name --type='merge' -p '{"spec":{"paused": true}}'
Back up the management cluster:
velero backup create your_backup_name --exclude-namespaces=tkg-system
Excluding tkg-system objects avoids creating duplicate management cluster API objects when restoring to a new management cluster.
Set Cluster.Spec.Paused back to false for the workload clusters.
Follow the Velero Plugin for AWS setup instructions to install Velero server on the new management cluster.
Restore the management cluster:
velero backup get
velero restore create your_restore_name --from-backup your_backup_name
Set Cluster.Spec.Paused to false for all workload clusters:
kubectl patch cluster cluster_name --type='merge' -p '{"spec":{"paused": false}}'
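The restore steps above can be sketched as a single function that confirms the backup is visible, restores from it, and then unpauses the workload clusters. The function name and the KUBECTL and VELERO variables are hypothetical, and the --wait flag of velero restore create is an addition here so that clusters are unpaused only after the restore command returns.

```shell
#!/usr/bin/env bash
# Sketch: restore from a named backup, then unpause workload clusters.
# KUBECTL and VELERO can be overridden (for example with "echo") for a dry run.
KUBECTL="${KUBECTL:-kubectl}"
VELERO="${VELERO:-velero}"

# restore_management_cluster RESTORE_NAME BACKUP_NAME CLUSTER...
restore_management_cluster() {
  local restore_name="$1" backup_name="$2"; shift 2
  # Confirm the backup is visible from the new cluster; stop if the command reports an error
  "$VELERO" backup get "$backup_name" || return 1
  # --wait blocks until the restore finishes
  "$VELERO" restore create "$restore_name" --from-backup "$backup_name" --wait \
    || return 1
  local name
  for name in "$@"; do
    "$KUBECTL" patch cluster "$name" --type='merge' \
      -p '{"spec":{"paused": false}}' || return 1
  done
}
```

For example, restore_management_cluster your_restore_name your_backup_name cluster_name. It is still worth inspecting the restore afterwards, since a restore can finish with warnings or partial failures.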
These sections describe how to back up and restore Tanzu Kubernetes Grid management clusters on Azure.
Follow the Velero Plugin for Azure setup instructions to install Velero server on the management cluster.
Set Cluster.Spec.Paused to true for all workload clusters:
kubectl patch cluster workload_cluster_name --type='merge' -p '{"spec":{"paused": true}}'
Back up the management cluster:
velero backup create your_backup_name --exclude-namespaces=tkg-system
Excluding tkg-system objects avoids creating duplicate management cluster API objects when restoring to a new management cluster.
If velero backup returns a transport is closing error, try again after increasing the memory limit, as described in Update resource requests and limits after install in the Velero documentation.
Set Cluster.Spec.Paused back to false for the workload clusters.
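If the backup step above fails with the transport is closing error, the memory limit can be raised by patching the Velero deployment. The following sketch assumes a default installation, where the deployment, its container, and its namespace are all named velero. The function name and the KUBECTL variable are hypothetical, and the limit you pass is your own choice, not a recommended value.

```shell
#!/usr/bin/env bash
# Sketch: raise the memory limit on the Velero server container.
# KUBECTL can be overridden (for example with "echo") for a dry run.
KUBECTL="${KUBECTL:-kubectl}"

# set_velero_memory_limit LIMIT : LIMIT is a Kubernetes quantity such as 1Gi (illustrative)
set_velero_memory_limit() {
  local limit="$1"
  [ -n "$limit" ] || { echo "usage: set_velero_memory_limit LIMIT" >&2; return 1; }
  # Assumes the default install: deployment "velero", container "velero", namespace "velero"
  "$KUBECTL" patch deployment velero -n velero --patch \
    "{\"spec\":{\"template\":{\"spec\":{\"containers\":[{\"name\":\"velero\",\"resources\":{\"limits\":{\"memory\":\"${limit}\"}}}]}}}}"
}
```

After the patch, the Velero pod restarts with the new limit, and you can rerun the backup command.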
Follow the Velero Plugin for Azure setup instructions to install Velero server on the new management cluster.
Restore the management cluster:
velero backup get
velero restore create your_restore_name --from-backup your_backup_name
Set Cluster.Spec.Paused to false for all workload clusters:
kubectl patch cluster cluster_name --type='merge' -p '{"spec":{"paused": false}}'