Restoring a backup restores the state of the components at the time of the backup.
- During a restore, if data ingestion occurs while an index is in use, the restore can fail.
- After upgrading or migrating VMware Telco Cloud Service Assurance from an older version to VMware Telco Cloud Service Assurance 2.4, the custom catalog metrics that you created in the older version are not available in VMware Telco Cloud Service Assurance 2.4 because of the vsa_catalog schema changes. You can create the customized catalog metrics again in the VMware Telco Cloud Service Assurance 2.4 Catalog UI.
Procedure
- Connect to the deployer VM and run the following command:
export KUBECONFIG=/root/.kube/<KubeConfig File>
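To confirm that kubectl now points at the intended cluster before proceeding, you can run a quick check (a minimal sketch; the output depends on your deployment):
kubectl get nodes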
- In a text editor, create the restoration configuration file in YAML format.
The example file is located in the tcx-deployer/examples/backup-and-restore/restore.yaml.example.
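For example, you can copy the shipped example as a starting point and then edit it; the target file name restore.yaml is illustrative:
cp tcx-deployer/examples/backup-and-restore/restore.yaml.example restore.yaml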
Use the following template for the component restore:
apiVersion: tcx.vmware.com/v1
kind: Restore
metadata:
  name: group-restore-tps
  namespace: tps-system
spec:
  backupName: group-backup
  restore:
    postgres:
      timeout: 10m
      config:
        endpoint:
          host: postgres-cluster.tps-system.svc.cluster.local
          port: 5432
        adminSecret:
          name: postgres-db-secret
          namespace: tps-system
        dbs:
          - "analyticsservice"
          - "alarmservice"
          - "collector"
          - "grafana"
          - "keycloak"
          #- "airflow"
          #- "remediation"
          #- "dm_upgrade"
---
apiVersion: tcx.vmware.com/v1
kind: Restore
metadata:
  name: group-restore-tcsa
  namespace: tcsa-system
spec:
  backupName: group-backup
  postAction:
    name: postaction
    serviceAccount: cluster-admin-sa
    timeout: 30m
    resource:
      memory: 250Mi
      cpu: 100m
    bash:
      command:
        - /bin/bash
        - -c
        - |
          set ex;kubectl delete pods -n tcsa-system --selector run=apiservice;
          sleep 200;
          set ex;kubectl delete pod -n tcsa-system --selector=app.kubernetes.io/name=grafana;
          sleep 10;
          set ex;kubectl exec -it deploy/br-operator -n tcsa-system -- curl -k -s --show-error --stderr - -H 'Content-Type: application/json' -X POST --data '{ "isCleanUpgrade": true }' http://apiservice:8080/smartsrestcontroller/vsa/smarts/domain/migrate;
  restore:
    collectors:
      config:
        authenticationSecret:
          name: collectors-secrets
          namespace: tcsa-system
          passwordKey:
            key: COLLECTORS_PASSWORD
          usernameKey:
            key: COLLECTORS_USERNAME
        endpoint:
          basePath: /dcc/v1/
          host: collector-manager.tcsa-system.svc.cluster.local
          port: 12375
          scheme: http
      timeout: 10m
    elastic:
      authentication:
        name: elasticsearch-secret-credentials
        namespace: tcsa-system
        passwordKey:
          key: ES_PASSWORD
        usernameKey:
          key: ES_USER_NAME
      cleanUpIndices: true
      config:
        endpoint:
          host: elasticsearch.tcsa-system.svc.cluster.local
          port: 9200
          scheme: https
        region: ap-south-1
      indexList:
        - vsa_chaining_history-*
        - vsa_events_history-*
        - vsa_audit-*
        - vsarole,policy,userpreference,mapping-metadata,mnr-metadata
        - gateway-mappings
        # Uncomment vsametrics for metrics restore and set cleanUpIndices as true
        # - vsametrics*
        # Uncomment vsa_catalog to restore TCSA 2.4 backup
        # - vsa_catalog
      # Set 'removeAndAddRepository: true' and trigger Backup/Restore to clean up the repository.
      removeAndAddRepository: true
      timeout: 30m
      tls:
        caCrt:
          key: ca.crt
        insecureSkipVerify: true
        namespace: tcsa-system
        secretName: elasticsearch-cert
        tlsCrt:
          key: tls.crt
    # Uncomment kubernetesResources to restore configmaps/secrets.
    # kubernetesResources:
    #   timeout: 10m
    #   resources:
    #     - groupVersionResource:
    #         group: ""
    #         version: "v1"
    #         resource: "secrets"
    #       nameList:
    #         - name: "spe-pguser"
    #           namespace: "tcsa-system"
    #     - groupVersionResource:
    #         group: ""
    #         version: "v1"
    #         resource: "configmaps"
    #       nameList:
    #         - name: "product-info"
    #           namespace: "tcsa-system"
    zookeeper:
      endpoint:
        host: zookeeper.tcsa-system.svc.cluster.local
        port: 2181
      paths:
        - path: /vmware/vsa/gateway
        - path: /vmware/vsa/smarts
        # Uncomment the zookeeper path for NCM backup
        # - path: /vmware/vsa/ncm
      timeout: 10m
Add "/vmware/vsa/ncm" to the zookeeper paths to restore the backup of NCM reports.
Note: If you want to take a VMware Telco Cloud Service Assurance 2.3.0 backup and restore the data in VMware Telco Cloud Service Assurance 2.4.0, perform the following actions:
- Use the restore-22-or-23-version.yaml.example file to restore the backup from VMware Telco Cloud Service Assurance 2.3.0 to 2.4.0. The example file is located in the tcx-deployer/examples/backup-and-restore/restore-22-or-23-version.yaml.example.
- When restoring a backup from an older version, note that the configmaps and secrets are not backward compatible. If you need to apply an older version of a configmap or secret, you must do so manually using the kubectl command (see the sketch after this list).
- In the restore-22-or-23-version.yaml.example file, set cleanUpIndices to true if you want to delete the index data before the restore.
- To view the NCM reports, enter the NCM database IP address and password in the Grafana NCM-Postgres datasource.
- Use the Grafana export and import options to export any customized Grafana dashboards from VMware Telco Cloud Service Assurance 2.3.0 and import them into VMware Telco Cloud Service Assurance 2.4.0.
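A minimal sketch of manually reapplying an older configmap and secret, assuming you saved them as local YAML files from the older deployment (the file names are placeholders):
kubectl apply -n tcsa-system -f <older configmap YAML file>
kubectl apply -n tcsa-system -f <older secret YAML file>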
- To restore the backup, run the following command:
kubectl apply -f <configuration YAML file>
The kubectl apply command creates the restore CR, and the status of the restore is updated in its status field. You can monitor the status as follows:
~> kubectl get restore.tcx.vmware.com -A --watch
NAME           STATUS       CURRENT STATE   READY   AGE
restore-name   SUCCESSFUL   restore         true    4h16m
Note: Once a restore is triggered, it cannot be undone; a failed restore might result in a partial restore of the system. If the restore fails, a failure message is displayed in the Message field.
Option: removeAndAddRepository
Description: Remove and add repository deletes the existing repository. When you perform backup and restore across clusters, you must set removeAndAddRepository: true.
Option: cleanUpIndices
Description: When you perform backup and restore across clusters, you must update the indices. For example, cleanUpIndices: true.
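To see the failure details after a failed restore, you can inspect the restore CR status; a minimal sketch, assuming the CR name and namespace from the template above:
kubectl get restore.tcx.vmware.com group-restore-tcsa -n tcsa-system -o yaml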
Note: Comment Out Components That Are Not Backed Up
- Before starting the restore process, remove or comment out any datastores that are not required. To do this, update the Restore CR. Also, if you have not taken a backup for a datastore, you must remove it from the Restore CR.
For example: If you have not included the zookeeper path /vmware/vsa/gateway in the backup, remove that path from the restore CR (see the sketch after this list). Otherwise, the following error message appears: /backups/clusters/default/zookeeper/group-backup/_vmware_vsa_gateway does not exist
If you have not taken a backup of kubernetesResources, remove or comment out kubernetesResources in the restore CR.
- If you have not backed up some components in a datastore, you must comment out the corresponding components in the Restore CRs.
- For NCM reports, if the subnets are added in the NCM server-side configuration, then no changes are required. Otherwise, add the modified Grafana and ncm-proxy Node IPs in the NCM server-side configuration. For more information, see the NCM Server-Side Configuration topic.
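For example, a sketch of the zookeeper section of the Restore CR with the /vmware/vsa/gateway path commented out because it was not included in the backup (the other values match the template above):
zookeeper:
  endpoint:
    host: zookeeper.tcsa-system.svc.cluster.local
    port: 2181
  paths:
    # Not backed up, so commented out for the restore
    # - path: /vmware/vsa/gateway
    - path: /vmware/vsa/smarts
  timeout: 10m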