Restoring a backup restores the state of the components at the time of the backup.
- During a restore, if data ingestion occurs while an index is in use, the restore can fail.
- After upgrading or migrating VMware Telco Cloud Service Assurance from an older version to VMware Telco Cloud Service Assurance 2.4, the custom catalog metrics that you created in the older version are not available in VMware Telco Cloud Service Assurance 2.4 because of the vsa_catalog schema changes. You can create the customized catalog metrics again in the VMware Telco Cloud Service Assurance 2.4 Catalog UI.
Procedure
- Connect to the deployer VM and run the following command:
export KUBECONFIG=/root/.kube/<KubeConfig File>
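To confirm that kubectl now points at the intended cluster before proceeding, you can run a quick check (a minimal sketch; the output depends on your deployment):
kubectl get nodes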
- In a text editor, create the restoration configuration file in YAML format.
The example file is located in the tcx-deployer/examples/backup-and-restore/restore.yaml.example.
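For example, you can copy the shipped example as a starting point and then edit it; the target file name restore.yaml is illustrative:
cp tcx-deployer/examples/backup-and-restore/restore.yaml.example restore.yaml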
Use the following template for the component restore:
apiVersion: tcx.vmware.com/v1
kind: Restore
metadata:
  name: group-restore-tps
  namespace: tps-system
spec:
  backupName: group-backup
  restore:
    postgres:
      timeout: 10m
      config:
        endpoint:
          host: postgres-cluster.tps-system.svc.cluster.local
          port: 5432
        adminSecret:
          name: postgres-db-secret
          namespace: tps-system
        dbs:
          - "analyticsservice"
          - "alarmservice"
          - "collector"
          - "grafana"
          - "keycloak"
          #- "airflow"
          #- "remediation"
          #- "dm_upgrade"
---
apiVersion: tcx.vmware.com/v1
kind: Restore
metadata:
  name: group-restore-tcsa
  namespace: tcsa-system
spec:
  backupName: group-backup
  postAction:
    name: postaction
    serviceAccount: cluster-admin-sa
    timeout: 30m
    resource:
      memory: 250Mi
      cpu: 100m
    bash:
      command:
        - /bin/bash
        - -c
        - |
          set ex;kubectl delete pods -n tcsa-system --selector run=apiservice;
          sleep 200;
          set ex;kubectl delete pod -n tcsa-system --selector=app.kubernetes.io/name=grafana;
          sleep 10;
          set ex;kubectl exec -it deploy/br-operator -n tcsa-system -- curl -k -s --show-error --stderr - -H 'Content-Type: application/json' -X POST --data '{ "isCleanUpgrade": true }' http://apiservice:8080/smartsrestcontroller/vsa/smarts/domain/migrate;
  restore:
    collectors:
      config:
        authenticationSecret:
          name: collectors-secrets
          namespace: tcsa-system
          passwordKey:
            key: COLLECTORS_PASSWORD
          usernameKey:
            key: COLLECTORS_USERNAME
        endpoint:
          basePath: /dcc/v1/
          host: collector-manager.tcsa-system.svc.cluster.local
          port: 12375
          scheme: http
      timeout: 10m
    elastic:
      authentication:
        name: elasticsearch-secret-credentials
        namespace: tcsa-system
        passwordKey:
          key: ES_PASSWORD
        usernameKey:
          key: ES_USER_NAME
      cleanUpIndices: true
      config:
        endpoint:
          host: elasticsearch.tcsa-system.svc.cluster.local
          port: 9200
          scheme: https
        region: ap-south-1
      indexList:
        - vsa_chaining_history-*
        - vsa_events_history-*
        - vsa_audit-*
        - vsarole,policy,userpreference,mapping-metadata,mnr-metadata
        - gateway-mappings
        # Uncomment vsametrics for metrics restore and set cleanUpIndices as true
        # - vsametrics*
        # Uncomment vsa_catalog to restore TCSA 2.4 backup
        # - vsa_catalog
      # Set 'removeAndAddRepository: true' and trigger Backup/Restore to clean up the repository.
      removeAndAddRepository: true
      timeout: 30m
      tls:
        caCrt:
          key: ca.crt
        insecureSkipVerify: true
        namespace: tcsa-system
        secretName: elasticsearch-cert
        tlsCrt:
          key: tls.crt
    # Uncomment kubernetesResources to restore configmaps/secrets.
    # kubernetesResources:
    #   timeout: 10m
    #   resources:
    #     - groupVersionResource:
    #         group: ""
    #         version: "v1"
    #         resource: "secrets"
    #       nameList:
    #         - name: "spe-pguser"
    #           namespace: "tcsa-system"
    #     - groupVersionResource:
    #         group: ""
    #         version: "v1"
    #         resource: "configmaps"
    #       nameList:
    #         - name: "product-info"
    #           namespace: "tcsa-system"
    zookeeper:
      endpoint:
        host: zookeeper.tcsa-system.svc.cluster.local
        port: 2181
      paths:
        - path: /vmware/vsa/gateway
        - path: /vmware/vsa/smarts
        # Uncomment the zookeeper path for NCM backup
        # - path: /vmware/vsa/ncm
      timeout: 10m
Add "/vmware/vsa/ncm" to the zookeeper paths to restore the backup of NCM reports.
Note: If you want to take a VMware Telco Cloud Service Assurance 2.3.0 backup and restore the data in VMware Telco Cloud Service Assurance 2.4.0, perform the following actions:
- Use the restore-22-or-23-version.yaml.example file to restore the backup from VMware Telco Cloud Service Assurance 2.3.0 to 2.4.0. The example file is located in the tcx-deployer/examples/backup-and-restore/restore-22-or-23-version.yaml.example.
- When restoring a backup from an older version, note that the configmaps and secrets are not backward compatible. If you need to apply an older version of a configmap or secret, you must do so manually using the kubectl command (see the sketch after this list).
- In the restore-22-or-23-version.yaml.example file, set cleanUpIndices to true if you want to delete the index data before the restore.
- To view the NCM reports, enter the NCM database IP address and password in the Grafana NCM-Postgres datasource.
- Use the Grafana export and import options to export any customized Grafana dashboards from VMware Telco Cloud Service Assurance 2.3.0 and import them into VMware Telco Cloud Service Assurance 2.4.0.
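A minimal sketch of manually reapplying an older configmap and secret, assuming you saved them as local YAML files from the older deployment (the file names are placeholders):
kubectl apply -n tcsa-system -f <older configmap YAML file>
kubectl apply -n tcsa-system -f <older secret YAML file>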
- To restore the backup, run the following command:
kubectl apply -f <configuration YAML file>
The kubectl apply command creates the restore CR, and the status of the restore is updated in its status field. You can monitor the status as follows:
~> kubectl get restore.tcx.vmware.com -A --watch
NAME           STATUS       CURRENT STATE   READY   AGE
restore-name   SUCCESSFUL   restore         true    4h16m
Note: Once a restore is triggered, it cannot be undone; a failed restore might result in a partial restore of the system. If the restore fails, a failure message is displayed in the Message field.
Option: removeAndAddRepository
Description: Remove and add repository deletes the existing repository. When you perform backup and restore across clusters, you must set removeAndAddRepository: true.
Option: cleanUpIndices
Description: When you perform backup and restore across clusters, you must update the indices. For example, cleanUpIndices: true.
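To see the failure details after a failed restore, you can inspect the restore CR status; a minimal sketch, assuming the CR name and namespace from the template above:
kubectl get restore.tcx.vmware.com group-restore-tcsa -n tcsa-system -o yaml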
Note: Comment Out Components That Are Not Backed Up
- Before starting the restore process, remove or comment out any datastores that are not required. To do this, update the Restore CR. Also, if you have not taken a backup for a datastore, you must remove it from the Restore CR.
For example: If you have not included the zookeeper path /vmware/vsa/gateway in the backup, remove that path from the restore CR (see the sketch after this list). Otherwise, the following error message appears: /backups/clusters/default/zookeeper/group-backup/_vmware_vsa_gateway does not exist
If you have not taken a backup of kubernetesResources, remove or comment out kubernetesResources in the restore CR.
- If you have not backed up some components in a datastore, you must comment out the corresponding components in the Restore CRs.
- For NCM reports, if the subnets are added in the NCM server-side configuration, then no changes are required. Otherwise, add the modified Grafana and ncm-proxy Node IPs in the NCM server-side configuration. For more information, see the NCM Server-Side Configuration topic.
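For example, a sketch of the zookeeper section of the Restore CR with the /vmware/vsa/gateway path commented out because it was not included in the backup (the other values match the template above):
zookeeper:
  endpoint:
    host: zookeeper.tcsa-system.svc.cluster.local
    port: 2181
  paths:
    # Not backed up, so commented out for the restore
    # - path: /vmware/vsa/gateway
    - path: /vmware/vsa/smarts
  timeout: 10m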