To restore the workload cluster, copy the existing cluster specifications and deploy a new cluster. Then, restore the data to the new cluster.

Prerequisites

  • Source and target clusters must be associated with the same Management Cluster and must be under the same vCenter Server.

  • Source and target clusters must use the same Kubernetes version.

Procedure

  1. Copy the specification and deploy a new cluster.
    Note:
    • Add node pools manually, as they are not copied from the source cluster specification.

    • Manually enter the passwords when configuring the Systemsettings and Harbor add-ons.

    • If the Prometheus add-on is stuck and does not complete, delete and re-add it after the vsphere-csi or nfs-client add-on reaches the Ready status.

    • If the TCA cert-manager add-on is enabled in the source cluster and a CNF is configured to use this add-on, the cert-manager service will not renew certificates requested by that CNF after restoration. Remediate and reconfigure the CNF from TCA to regenerate the missing resources after the restoration process.

    • If the load-balancer-and-ingress-service add-on is enabled in the source cluster, use a new service engine group setting in the target cluster. Refer to the add-on configuration for load-balancer-and-ingress-service.

    • If the TCA add-on load-balancer-and-ingress-service is enabled in the source cluster and a CNF is defined to create the Kubernetes resources gatewayclasses.networking.x-k8s.io or gateways.networking.x-k8s.io in its Helm chart, the CNF resources in the restored namespaces will be in the Pending state after restoration completes. Recreate these resources in the restored cluster with the new service engine group setting. It is recommended to define these resources in the TCA add-on instead.

  2. Restore the workload to the new cluster.
    1. Log in to the VMware Telco Cloud Automation web interface.
    2. Navigate to Infrastructure > Virtual Infrastructure.
    3. Open the web terminal by clicking the Options (three dots) menu corresponding to the workload cluster that you want to restore and selecting Open Terminal.
    4. On the web terminal, check the service health of Velero by running the following commands:
      # kubectl get pod -n velero // check pod status
      # kubectl get bsl -n velero // check the Velero BackupStorageLocation CR

      Alternatively, you can check the Velero add-on health status from the TCA UI.
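      If you prefer to script this health check, the pod status output can be verified programmatically. The sketch below parses `kubectl get pod`-style output and flags any pod that is not Running; the sample text stands in for the live command output, and the pod names in it are illustrative assumptions, not actual TCA output:

```shell
# Minimal sketch: fail if any pod in the velero namespace is not Running.
# In practice, replace the sample with:  kubectl get pod -n velero --no-headers
# The pod names below are illustrative, not real TCA output.
sample_output='velero-7d6f8b9c4-xk2lp   1/1   Running   0   5d
restic-abcde                             1/1   Running   0   5d'
# Column 3 of `kubectl get pod` output is the STATUS column.
not_running=$(printf '%s\n' "$sample_output" | awk '$3 != "Running"')
if [ -z "$not_running" ]; then
  echo "velero pods healthy"
else
  echo "pods not Running:"
  printf '%s\n' "$not_running"
fi
```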

    5. Retrieve the backup information by running the following command:
      # velero backup get
    6. Restore to the cluster by running the following command:
      # velero restore create --from-backup <example-backup> 
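      A restore runs asynchronously, so a script can poll until it finishes. This is a minimal sketch of such a wait loop; get_restore_phase is a placeholder stub (an assumption, not a TCA or Velero helper) standing in for reading the STATUS column of `velero restore get`:

```shell
# Minimal sketch of a completion wait loop.
# get_restore_phase is a stub for illustration; in practice it would parse
# the STATUS column of `velero restore get` for the restore's name.
get_restore_phase() { echo "Completed"; }

phase=""
for attempt in 1 2 3 4 5; do
  phase=$(get_restore_phase)
  [ "$phase" = "Completed" ] && break
  sleep 10
done
echo "restore phase: $phase"
```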
    7. Check the restore status and the Download CRs, and wait until both report "Completed".
      # velero restore get // check the restore status
      # kubectl get downloads -n velero // get the download name
      # kubectl get downloads <download-name> -n velero -o yaml // check the download status in YAML output
      Note:

      If the Network Function pod requires late binding for node pool VMs, the restored pods might be in the Pending status. Follow Remediate Network Functions to heal them.
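      The YAML status check in the previous step can also be scripted. The sketch below extracts .status.phase from a Download CR's YAML; the sample document is illustrative, not actual TCA output, and in practice you would pipe the live kubectl command instead:

```shell
# Minimal sketch: read .status.phase from a Download CR's YAML output.
# In practice, pipe the live command instead of the sample document:
#   kubectl get downloads <download-name> -n velero -o yaml
# The sample below is illustrative, not actual TCA output.
yaml='kind: Download
metadata:
  name: example-download
status:
  phase: Completed'
phase=$(printf '%s\n' "$yaml" | sed -n 's/^ *phase: *//p')
echo "download phase: $phase"
```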

What to do next

Remediate Network Functions.