Back Up Replica Nodes on vSphere

Operations such as restoring or cloning require a backup of the data on the Replica nodes in the Replica Network. As a best practice, back up this data.

A backup preserves the Replica node data so that, if an operation fails, you can restore the lost or corrupted data.

Prerequisites

Verify that you have the deployed blockchain ID information.
Familiarize yourself with the backup and restore considerations for your VMware Blockchain nodes. See VMware Blockchain Node Backup and Restore Considerations on vSphere.
Verify that you have access to the latest version of VMware Blockchain.
Identify the following details from the VMware Blockchain Orchestrator output directory, /home/blockchain/output:
Blockchain ID
Current blockchain version
Replica node IP address
Client node IP address
VMware Blockchain vmbc user password for all the Replica and Client node VMs

Procedure

SSH into the VMware Blockchain Orchestrator appliance.
Enter the login credentials for the blockchain user account.
Navigate to the /home/blockchain directory.
Instantiate the Concord operator container.

You can use only one Client node, which must be part of the original blockchain, to host the Concord operator container.
1. SSH into the Client node designated to host the Concord operator container.
  
  Note:
  Reboot the Concord container before using an operator container from a different Client node.
2. Verify that the Client node has a Concord operator container image and identify the operator image ID.
  
  sudo docker images | grep "operator"
3. Verify that the Concord operator container configuration file is available.
```
sudo cat /config/daml-ledger-api/concord-operator/operator.config
```
4. For deployments with unencrypted configuration, copy the private key content into the following location.
```
sudo vi /config/daml-ledger-api/concord-operator/operator_priv.pem
```
5. For deployments with encrypted configuration, deploy the encrypted operator private key content into the following location.
  The private key is encrypted and stored on disk because the deployment secure store configuration requires private keys not to be saved unprotected on disk.
```
# Replace <Operator_IMAGE_ID> with the operator image ID, and paste the operator private key when prompted
sudo docker run -ti --network=blockchain-fabric --name=operator --entrypoint /operator/install_private_key.py --rm -v /config/daml-ledger-api/concord-operator:/operator/config-local -v /config/daml-ledger-api/concord-operator:/concord/config-public -v /config/clientservice/cert:/config/clientservice/cert -v /config/daml-ledger-api/config-public:/operator/config-public <Operator_IMAGE_ID>
```
6. Launch the Concord operator container on the Client node.
```
curl -X POST 127.0.0.1:8546/api/node/start-operator
```
7. Access the Concord operator container on the Client node.
```
sudo docker exec -it operator bash
```
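The image lookup in substep 2 can be narrowed to print only the image ID. The following is a minimal sketch, not part of the product tooling: the function name operator_image_ids is hypothetical, and it assumes the default docker images column layout, in which the image ID is the third column. Unlike the grep command above, it matches "operator" only in the repository name.

```shell
# Hypothetical helper: read a `docker images` listing from standard input and
# print the IMAGE ID (third column) of every row whose repository name
# contains "operator". The header row is skipped explicitly.
operator_image_ids() {
  awk '$1 != "REPOSITORY" && $1 ~ /operator/ { print $3 }'
}
```

For example, run sudo docker images | operator_image_ids on the Client node and use the printed value in place of <Operator_IMAGE_ID>.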
Stop all the applications that invoke connection requests to the Daml Ledger.

Stop the Replica nodes.

curl -X POST "127.0.0.1:8546/api/node/management?action=stop"

Verify that all the containers except the agent and the deployed Concord operator container are stopped on the selected Replica node.
```
sudo docker ps -a
```
If the sudo docker ps -a output shows that containers other than the agent and the deployed Concord operator container are still running, rerun the stop command or use the sudo docker stop <container_name> command to stop them.
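The container check can be reduced to a list of the containers that still need to be stopped. The following is a minimal sketch under stated assumptions: the function name unexpected_running is hypothetical, and the container names agent and operator are assumed (operator matches the --name=operator value used earlier; confirm the agent container name in your deployment).

```shell
# Hypothetical helper: read container names (one per line) from standard input
# and print every name other than the two containers that may keep running.
# The names "agent" and "operator" are assumptions; adjust them as needed.
unexpected_running() {
  # -x matches whole lines only; `|| true` keeps the exit status at 0
  # when nothing is left to print.
  grep -v -x -e 'agent' -e 'operator' || true
}
```

For example, sudo docker ps --format '{{.Names}}' | unexpected_running lists only running containers that are neither of the two. Empty output means that no other containers are running.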
Check that all the Replica nodes stopped in the same state.
Comparing the LastReachableBlockID and LastBlockID sequence numbers of each stopped Replica node helps determine whether any nodes are lagging. As a best practice, check that the LastReachableBlockID and LastBlockID values are the same on at least five Replica nodes, so that those nodes can serve as sources for recovery.

If some Replica nodes lag when you power on the Replica Network, those nodes might have to catch up in state-transfer mode. If they cannot catch up, consensus can fail, and you must restore each Replica node from the latest single copy.
```
image=$(docker images --format "{{.Repository}}:{{.Tag}}" | grep "concord");docker run --rm --entrypoint="" --mount type=bind,source=/mnt/data/rocksdbdata,target=/concord/rocksdbdata $image /concord/kv_blockchain_db_editor /concord/rocksdbdata getLastBlockID
image=$(docker images --format "{{.Repository}}:{{.Tag}}" | grep "concord");docker run --rm --entrypoint="" --mount type=bind,source=/mnt/data/rocksdbdata,target=/concord/rocksdbdata $image /concord/kv_blockchain_db_editor /concord/rocksdbdata getLastReachableBlockID
```
The image variable in these commands resolves to the Concord-core image name in the blockchain. For example:
vmwaresaas.jfrog.io/vmwblockchain/concord-core:1.7.0.0.55
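After you record the two values for each Replica node, the comparison can be automated. The following is a minimal sketch, assuming you copied the values by hand into lines of the form <node> <LastBlockID> <LastReachableBlockID>. It does not parse the kv_blockchain_db_editor output, whose format is not described here. The function names are hypothetical.

```shell
# Hypothetical helper. Input (stdin): one line per Replica node in the form
#   <node> <LastBlockID> <LastReachableBlockID>
# Output: the size of the largest group of nodes reporting the same ID pair.
largest_matching_group() {
  awk 'NF >= 3 { count[$2 " " $3]++ }
       END {
         max = 0
         for (pair in count) if (count[pair] > max) max = count[pair]
         print max
       }'
}

# Succeeds when at least $1 nodes (default 5) agree on both IDs.
enough_nodes_in_sync() {
  required="${1:-5}"
  matched=$(largest_matching_group)
  [ "$matched" -ge "$required" ]
}
```

For example, if the values are saved in a file named block_ids.txt (a hypothetical name), enough_nodes_in_sync 5 < block_ids.txt succeeds only when at least five nodes report identical values.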
Back up the data on each of the Replica nodes.
```
# Back up the RocksDB data directory
sudo tar cvzf <backup_name> /mnt/data/rocksdbdata
# For data greater than 64 GB, run tar in the background and follow its progress
cd /mnt/data/
sudo nohup tar cvzpf db-backup.tar.gz /mnt/data/rocksdbdata &
sudo tail -f nohup.out
# Wait for the tar command to complete
```
The <backup_name> must end in .tar.gz. For example, db-backup.tar.gz.

The large data backup command might time out due to SSH inactivity. Incrementally rerun the command.
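Because an interrupted session can leave a truncated archive, it can help to confirm that the backup is readable before you restart the nodes. The following is a minimal sketch, not a documented product step. The function name verify_backup is hypothetical. It checks the .tar.gz naming rule and then asks tar to list the whole archive, which fails on a truncated or corrupted file.

```shell
# Hypothetical helper: succeed only if the file name ends in .tar.gz and
# tar can read the complete archive listing without errors.
verify_backup() {
  archive="$1"
  case "$archive" in
    *.tar.gz) ;;
    *) echo "backup name must end in .tar.gz: $archive" >&2; return 1 ;;
  esac
  # Listing the contents forces tar to decompress and read every entry.
  tar -tzf "$archive" > /dev/null
}
```

For example, verify_backup db-backup.tar.gz returns a zero exit status for a readable archive. Prefix the call with sudo privileges as needed if the archive is owned by root.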
Start all the applications that invoke connection requests to the Daml Ledger.

Start all the Replica nodes in the Replica Network.

curl -X POST "127.0.0.1:8546/api/node/management?action=start"

Start all the Client nodes.

curl -X POST "127.0.0.1:8546/api/node/management?action=start"

What to do next

If there is a failure or you want to clone a deployment, you can restore it from the backup data. See Restore Replica Nodes from the Backup Data on vSphere.