VMware Telco Cloud Service Assurance allows you to take a backup of your data on an older version of VMware Telco Cloud Service Assurance and restore it on a newer version. Perform the following procedure to migrate the data from the older version of VMware Telco Cloud Service Assurance to the newer version.
Procedure
- Take a backup on the older setup (VMware Telco Cloud Service Assurance 2.3.0 / 2.3.1) using an S3 bucket or an NFS server.
- Use the same S3 bucket or NFS server details on the target VMware Telco Cloud Service Assurance 2.4.0 cluster by providing them in the values-user-overrides.yaml file in the tcx-deployer/product-helm-charts/tcsa bundle of VMware Telco Cloud Service Assurance 2.4.0, and then re-deploy it.
- The syncbackup.yaml file of VMware Telco Cloud Service Assurance 2.4.0 contains both tcsa-system namespace components and tps-system namespace components. In the syncbackup.yaml file, provide the bucket name and backup name (the same ones used on the older setup) for both namespaces.
  Note: An example can be found in tcx-deployer/examples/backup-and-restore/synbackup.yaml.example.
- In the syncbackup.yaml file, under spec, uncomment the two lines shown below only if the backup was taken on VMware Telco Cloud Service Assurance 2.3.1 or 2.4.0. Set name to tcsa2.3.1 if the backup was taken on 2.3.1, or to tcsa2.4.0 if the backup was taken on VMware Telco Cloud Service Assurance 2.4.0.

```yaml
apiVersion: tcx.vmware.com/v1
kind: SyncBackup
metadata:
  name: sync-backup-tps
  namespace: tps-system
spec:
  overrideExisting: false # set this if the backup with same name already exists in the cluster
                          # setting this to true will not delete the data
  filter:
    componentList:
      - postgres
    backupList:
      - group-backup
  pauseIntegrityCheck: true
  overrideNamespace:
    targetNamespace: tps-system
  # Uncomment the below two lines ONLY if the backup was taken on either 2.3.1 / 2.4.0.
  # Set "name" to tcsa2.3.1 if the backup was taken on 2.3.1, or tcsa2.4.0 if taken on 2.4.0
  cluster:
    name: tcsa2.4.0
  storage:
    minio:
      bucket: vmware-tcsa-backup
      endpoint: minio.tcsa-system.svc.cluster.local:9000
      secretRef:
        name: minio-secrets
        namespace: tcsa-system
        accessKey:
          key: root-user
        secretKey:
          key: root-password
---
apiVersion: tcx.vmware.com/v1
kind: SyncBackup
metadata:
  name: sync-backup-tcsa
  namespace: tcsa-system
spec:
  overrideExisting: false # set this if the backup with same name already exists in the cluster
                          # setting this to true will not delete the data
  filter:
    componentList:
      - elasticsearch
      - collectors
      - zookeeper
      - kubernetesResources
    backupList:
      - group-backup
  pauseIntegrityCheck: true
  overrideNamespace:
    targetNamespace: tcsa-system
  cluster:
    name: tcsa2.4.0
  storage:
    minio:
      bucket: vmware-tcsa-backup
      endpoint: minio.tcsa-system.svc.cluster.local:9000
      secretRef:
        name: minio-secrets
        namespace: tcsa-system
        accessKey:
          key: root-user
        secretKey:
          key: root-password
```
- In the example yaml file, the default bucket vmware-tcsa-backup is used. To override the bucket name, update it in the yaml file to the NFS File Server bucket or the S3 bucket, whichever was used on the older setup.
- Run the following command to sync the backup in the target cluster:

```shell
kubectl apply -f synbackup.yaml.example
```
After the sync operation is complete, the status shows as SUCCESSFUL, and the backups stored in the NFS File Server are accessible in the new cluster.
Note: Performing the sync backup is essential before initiating the restore operation.

```shell
[root@wdc-10-214-150-193 backup-and-restore]# kubectl get syncbackups -A
NAMESPACE   NAME               STATUS       CURRENT STATE   READY   AGE     MESSAGE
default     sync-backup-tcsa   SUCCESSFUL   syncBackup      True    4h51m   synced: 1, skipped: 0, failed: 0
default     sync-backup-tps    SUCCESSFUL   syncBackup      True    4h51m   synced: 1, skipped: 0, failed: 0
```

In case of any failure, the MESSAGE field is populated with the error message.
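For reference, if a custom bucket was used on the older setup, the storage section of each SyncBackup CR must point at that bucket instead of the default. The following sketch is based on the storage section of the example above; the bucket name is illustrative:

```yaml
storage:
  minio:
    bucket: my-older-setup-backups   # illustrative name; must match the bucket used on the older setup
    endpoint: minio.tcsa-system.svc.cluster.local:9000
    secretRef:
      name: minio-secrets
      namespace: tcsa-system
      accessKey:
        key: root-user
      secretKey:
        key: root-password
```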
- After the sync backup is successful, check the status of the backup that you took on the previous or older VMware Telco Cloud Service Assurance setup:

```shell
kubectl get backups -A
```
- If you want to restore a VMware Telco Cloud Service Assurance 2.3.0 or 2.3.1 backup, use the following examples. Provide the VMware Telco Cloud Service Assurance 2.3.0 or 2.3.1 backup name in the restoration file for both namespaces, tcsa-system and tps-system.
  - The following example file is for restoring a 2.3.0 backup: tcx-deployer/examples/backup-and-restore/restore-230-version.yaml.example
  - The following example file is for restoring a 2.3.1 backup: tcx-deployer/examples/backup-and-restore/restore-231-version.yaml.example
The following is the example content of the restore-231-version.yaml.example file.

```yaml
apiVersion: tcx.vmware.com/v1
kind: Restore
metadata:
  name: group-restore-tps
  namespace: tps-system
spec:
  backupName: <backup name of tcsa2.3.1>
  restore:
    postgres:
      timeout: 10m
      config:
        adminSecret:
          name: postgres-db-secret
          namespace: tps-system
        endpoint:
          host: postgres-cluster.tps-system.svc.cluster.local
          port: 5432
        dbs:
          - analyticsservice
          - alarmservice
          - collector
          - grafana
          - keycloak
          #- "remediation"
          #- "airflow"
  postAction:
    name: pgpostaction
    serviceAccount: cluster-admin-sa
    timeout: 30m
    resource:
      cpu: 200m
      memory: 256Mi
    bash:
      command:
        - /bin/bash
        - -c
        - |
          set -ex;
          psql -a -U pgadmin -d grafana -c "ALTER TABLE alert_configuration_history DROP COLUMN IF EXISTS last_applied;";
      env:
        - name: PGPORT
          value: "5432"
        - name: PGHOST
          value: postgres-cluster.tps-system.svc.cluster.local
        - name: PGUSER
          valueFrom:
            secretKeyRef:
              key: username
              name: postgres-db-secret
        - name: PGPASSWORD
          valueFrom:
            secretKeyRef:
              key: password
              name: postgres-db-secret
---
apiVersion: tcx.vmware.com/v1
kind: Restore
metadata:
  name: group-restore-tcsa
  namespace: tcsa-system
spec:
  backupName: <backup name of tcsa2.3.1>
  postAction:
    name: postaction
    serviceAccount: cluster-admin-sa
    timeout: 30m
    resource:
      memory: 250Mi
      cpu: 100m
    bash:
      command:
        - /bin/bash
        - -c
        - |
          set ex;kubectl delete pods -n tcsa-system --selector run=apiservice;
          sleep 200;
          set ex;kubectl delete pod -n tcsa-system --selector=app.kubernetes.io/name=grafana;
          sleep 10;
          set ex;kubectl exec -it deploy/br-operator -n tcsa-system -- curl -k -s --show-error --stderr - -H 'Content-Type: application/json' -X POST --data '{ "isCleanUpgrade": true }' http://apiservice:8080/smartsrestcontroller/vsa/smarts/domain/migrate;
  restore:
    collectors:
      config:
        authenticationSecret:
          name: collectors-secrets
          namespace: tcsa-system
          usernameKey:
            key: COLLECTORS_USERNAME
          passwordKey:
            key: COLLECTORS_PASSWORD
        endpoint:
          basePath: /dcc/v1/
          host: collector-manager.tcsa-system.svc.cluster.local
          port: 12375
          scheme: http
      timeout: 10m
    elastic:
      authentication:
        name: elasticsearch-secret-credentials
        namespace: tcsa-system
        passwordKey:
          key: ES_PASSWORD
        usernameKey:
          key: ES_USER_NAME
      cleanUpIndices: false
      config:
        endpoint:
          host: elasticsearch.tcsa-system.svc.cluster.local
          port: 9200
          scheme: https
        region: ap-south-1
        indexList:
          - vsa_chaining_history-*
          - vsa_events_history-*
          - vsa_audit-*
          - gateway-mappings
          - vsarole,policy,userpreference,mapping-metadata,mnr-metadata
          # Uncomment vsametrics to restore metrics and set cleanUpIndices as true
          #- vsametrics*
          # Uncomment vsa_catalog to restore TCSA 2.4 backup
          #- vsa_catalog
      # 'removeAndAddRepository: true' and trigger Backup/Restore, to cleanup the repository.
      removeAndAddRepository: true
      timeout: 30m
      tls:
        caCrt:
          key: ca.crt
        insecureSkipVerify: true
        namespace: tcsa-system
        secretName: elasticsearch-cert
        tlsCrt:
          key: tls.crt
    # Uncomment kubernetesResources to restore configmaps/secrets.
    # kubernetesResources:
    #   timeout: 10m
    #   resources:
    #     - groupVersionResource:
    #         group: ""
    #         version: "v1"
    #         resource: "secrets"
    #       nameList:
    #         - name: "spe-pguser"
    #           namespace: "tcsa-system"
    #     - groupVersionResource:
    #         group: ""
    #         version: "v1"
    #         resource: "configmaps"
    #       nameList:
    #         - name: "product-info"
    #           namespace: "tcsa-system"
    zookeeper:
      endpoint:
        host: zookeeper.tcsa-system.svc.cluster.local
        port: 2181
      paths:
        - path: /vmware/vsa/gateway
        - path: /vmware/vsa/smarts
        # Uncomment the zookeeper path for NCM backup
        #- path: /vmware/vsa/ncm
      timeout: 10m
```
- If you want to restore a VMware Telco Cloud Service Assurance 2.4.0 backup, use the following example. This example file can also be found in tcx-deployer/examples/backup-and-restore/restore.yaml.example.

```yaml
apiVersion: tcx.vmware.com/v1
kind: Restore
metadata:
  name: group-restore-tps
  namespace: tps-system
spec:
  backupName: <backup name of tcsa2.4.0>
  restore:
    postgres:
      timeout: 10m
      config:
        endpoint:
          host: postgres-cluster.tps-system.svc.cluster.local
          port: 5432
        adminSecret:
          name: postgres-db-secret
          namespace: tps-system
        dbs:
          - "analyticsservice"
          - "alarmservice"
          - "collector"
          - "grafana"
          - "keycloak"
          #- "airflow"
          #- "remediation"
          #- "dm_upgrade"
---
apiVersion: tcx.vmware.com/v1
kind: Restore
metadata:
  name: group-restore-tcsa
  namespace: tcsa-system
spec:
  backupName: <backup name of tcsa2.4.0>
  postAction:
    name: postaction
    serviceAccount: cluster-admin-sa
    timeout: 30m
    resource:
      memory: 250Mi
      cpu: 100m
    bash:
      command:
        - /bin/bash
        - -c
        - |
          set ex;kubectl delete pods -n tcsa-system --selector run=apiservice;
          sleep 200;
          set ex;kubectl delete pod -n tcsa-system --selector=app.kubernetes.io/name=grafana;
          sleep 10;
          set ex;kubectl exec -it deploy/br-operator -n tcsa-system -- curl -k -s --show-error --stderr - -H 'Content-Type: application/json' -X POST --data '{ "isCleanUpgrade": true }' http://apiservice:8080/smartsrestcontroller/vsa/smarts/domain/migrate;
  restore:
    collectors:
      config:
        authenticationSecret:
          name: collectors-secrets
          namespace: tcsa-system
          passwordKey:
            key: COLLECTORS_PASSWORD
          usernameKey:
            key: COLLECTORS_USERNAME
        endpoint:
          basePath: /dcc/v1/
          host: collector-manager.tcsa-system.svc.cluster.local
          port: 12375
          scheme: http
      timeout: 10m
    elastic:
      authentication:
        name: elasticsearch-secret-credentials
        namespace: tcsa-system
        passwordKey:
          key: ES_PASSWORD
        usernameKey:
          key: ES_USER_NAME
      cleanUpIndices: true
      config:
        endpoint:
          host: elasticsearch.tcsa-system.svc.cluster.local
          port: 9200
          scheme: https
        region: ap-south-1
        indexList:
          - vsa_chaining_history-*
          - vsa_events_history-*
          - vsa_audit-*
          - vsarole,policy,userpreference,mapping-metadata,mnr-metadata
          - gateway-mappings
          # Uncomment vsametrics for metrics restore and set cleanUpIndices as true
          # - vsametrics*
          # Uncomment vsa_catalog to restore TCSA 2.4 backup
          # - vsa_catalog
      # 'removeAndAddRepository: true' and trigger Backup/Restore, to cleanup the repository.
      removeAndAddRepository: true
      timeout: 30m
      tls:
        caCrt:
          key: ca.crt
        insecureSkipVerify: true
        namespace: tcsa-system
        secretName: elasticsearch-cert
        tlsCrt:
          key: tls.crt
    # Uncomment kubernetesResources to restore configmaps/secrets.
    # kubernetesResources:
    #   timeout: 10m
    #   resources:
    #     - groupVersionResource:
    #         group: ""
    #         version: "v1"
    #         resource: "secrets"
    #       nameList:
    #         - name: "spe-pguser"
    #           namespace: "tcsa-system"
    #     - groupVersionResource:
    #         group: ""
    #         version: "v1"
    #         resource: "configmaps"
    #       nameList:
    #         - name: "product-info"
    #           namespace: "tcsa-system"
    zookeeper:
      endpoint:
        host: zookeeper.tcsa-system.svc.cluster.local
        port: 2181
      paths:
        - path: /vmware/vsa/gateway
        - path: /vmware/vsa/smarts
        # Uncomment the zookeeper path for NCM backup
        # - path: /vmware/vsa/ncm
      timeout: 10m
```
Note:
- Provide the same backup name in the restore.yaml file. This file also contains both the tcsa-system and tps-system namespace components to restore.
- Add "/vmware/vsa/ncm" to restore the backup of NCM reports.
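For example, with NCM reports included, the zookeeper section of the restore file would look like the following sketch, based on the example file above with the NCM path uncommented:

```yaml
zookeeper:
  endpoint:
    host: zookeeper.tcsa-system.svc.cluster.local
    port: 2181
  paths:
    - path: /vmware/vsa/gateway
    - path: /vmware/vsa/smarts
    # NCM zookeeper path uncommented to restore NCM reports
    - path: /vmware/vsa/ncm
  timeout: 10m
```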
- Before starting the restore process, remove or comment out any datastores that are not required. If you have not backed up some components in a datastore, you must comment out the corresponding components in the Restore CRs.
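For instance, if the zookeeper component was not backed up on the older setup, its section in the group-restore-tcsa Restore CR can be commented out. The following sketch shows only the relevant portion of the example file:

```yaml
restore:
  collectors:
    # ... collectors configuration as in the example file ...
    timeout: 10m
  # zookeeper was not backed up, so its restore section is commented out
  # zookeeper:
  #   endpoint:
  #     host: zookeeper.tcsa-system.svc.cluster.local
  #     port: 2181
  #   paths:
  #     - path: /vmware/vsa/gateway
  #     - path: /vmware/vsa/smarts
  #   timeout: 10m
```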
- To restore the backup, run the following command:

```shell
kubectl apply -f <restoration YAML file>
```
You can check the restoration status by running the following command. After the restoration is successful, you can also launch the VMware Telco Cloud Service Assurance UI and check the restored data.

```shell
[root]# kubectl get restore -A
NAMESPACE     NAME                                   STATUS       CURRENT STATE   READY   AGE     MESSAGE
tcsa-system   scheduled-group-restore-tcsa231-tcsa   SUCCESSFUL   restore         True    2d16h
tcsa-system   scheduled-group-restore-tcsa231-tps    SUCCESSFUL   restore         True    2d16h
```
Note:
- Once a restore is triggered, it cannot be undone; a restore failure might result in a partial restore of the system. If the restore fails, a failure message is displayed in the MESSAGE field.
- During restore, if any data ingestion occurs and the index is in use, the restore can fail.
- After upgrading or migrating VMware Telco Cloud Service Assurance from an older version to VMware Telco Cloud Service Assurance 2.4, the custom catalog metrics that you created in the older version are not available in VMware Telco Cloud Service Assurance 2.4 due to the vsa_catalog schema changes. You can create the customized catalog metrics again in the VMware Telco Cloud Service Assurance 2.4 Catalog UI.
- When restoring an older version of a backup, note that the ConfigMaps and Secrets are not backward compatible. If you need to apply an older version of a ConfigMap or Secret, you must do this manually using the kubectl command.
- To view the NCM reports, enter the NCM database IP address and password in the Grafana NCM-Postgres datasource.
- Use the Grafana export and import options to export any customized Grafana dashboards from VMware Telco Cloud Service Assurance 2.3.0 and import them into VMware Telco Cloud Service Assurance 2.4.0.