You can optionally restore a failed Replica or Client node from the backup data.
Prerequisites
Verify that you have captured the IP addresses of all the Replica and Client node VMs and have access to them. You can find the information in the VMware Blockchain Orchestrator descriptor file.
Procedure
- Stop all the applications that send connection requests to the DAML Ledger.
- Stop the Client node.
curl -X POST 127.0.0.1:8546/api/node/management?action=stop
root@localhost [ ~ ]# curl -X POST 127.0.0.1:8546/api/node/management?action=stop
root@localhost [ ~ ]# docker ps -a
CONTAINER ID   IMAGE                                                    COMMAND                  CREATED        STATUS        PORTS                      NAMES
218a1bdaddd6   vmwaresaas.jfrog.io/vmwblockchain/operator:1.2.0.1.91    "/operator/operator_…"   18 hours ago   Up 18 hours                              operator
cd476a6b3d6c   vmwaresaas.jfrog.io/vmwblockchain/agent:1.2.0.1.91       "java -jar node-agen…"   18 hours ago   Up 18 hours   127.0.0.1:8546->8546/tcp   agent
root@localhost [ ~ ]#
- Repeat the stop operation on each Client node.
- Restore the backup data on each Client node.
rm -rf /mnt/data/db
tar xvzf <backup_name> --directory /
Use the backup that you created recently. The <backup_name> must end in .tar.gz. For example, db-backup.tar.gz.
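Because the restore deletes the existing database before it extracts the archive, it can help to check the archive name first. The following is a minimal sketch, not part of the product: the function names is_backup_archive and restore_client_backup are hypothetical, and the wrapper assumes it runs as root on the Client node.

```shell
# Hypothetical helper: succeed only if the archive name ends in .tar.gz
# and has at least one character before the extension.
is_backup_archive() {
    case "$1" in
        ?*.tar.gz) return 0 ;;
        *)         return 1 ;;
    esac
}

# Hypothetical wrapper around the two restore commands for a Client node.
# It refuses to delete /mnt/data/db unless the archive name is valid
# and the file is readable.
restore_client_backup() {
    backup="$1"
    is_backup_archive "$backup" || { echo "not a .tar.gz archive: $backup" >&2; return 1; }
    [ -r "$backup" ] || { echo "cannot read: $backup" >&2; return 1; }
    rm -rf /mnt/data/db
    tar xvzf "$backup" --directory /
}
```

For example, restore_client_backup db-backup.tar.gz runs the same two commands as this step, but only after both checks pass.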
- Stop the Replica node.
curl -X POST 127.0.0.1:8546/api/node/management?action=stop
root@localhost [ ~ ]# curl -X POST 127.0.0.1:8546/api/node/management?action=stop
root@localhost [ ~ ]# docker ps -a
CONTAINER ID   IMAGE                                                 COMMAND                  CREATED        STATUS        PORTS                      NAMES
3b7135c677cf   vmwaresaas.jfrog.io/vmwblockchain/agent:1.2.0.1.91    "java -jar node-agen…"   20 hours ago   Up 20 hours   127.0.0.1:8546->8546/tcp   agent
- Repeat the stop operation on each Replica node in the Replica Network.
- Restore the backup data on each Replica node.
rm -rf /mnt/data/rocksdbdata/
tar xvzf <backup_name> --directory /
The <backup_name> must end in .tar.gz. For example, db-backup.tar.gz.
- (Optional) If you are applying the backup data to a different node whose configuration has been refreshed, sanitize the Replica node that you are restoring.
docker run -it --entrypoint="" --mount type=bind,source=/mnt/data/rocksdbdata,target=/concord/rocksdbdata <image_name> /concord/kv_blockchain_db_editor /concord/rocksdbdata removeMetadata
The <image_name> is the name of the Concord-core image in the blockchain. For example:
vmwaresaas.jfrog.io/vmwblockchain/concord-core:1.3.0.0.49
- Start all the Replica nodes.
curl -X POST 127.0.0.1:8546/api/node/management?action=start
- Verify that all the containers, such as daml_execution_engine and concord, are running.
docker ps -a
If the containers are not running, use the following commands to restart them.
docker ps -aq | grep -v $(docker ps -aq --filter='name=^/agent') | xargs docker rm -f
docker restart agent
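The container check in the step above can also be scripted rather than read from the docker ps -a output. The following is a minimal sketch, assuming the expected container names are passed as arguments. The function name missing_containers is hypothetical, and the list of names to check depends on the node type.

```shell
# Print each expected container name that is absent from the list of
# running container names read on standard input (one name per line).
missing_containers() {
    running="$(cat)"
    for name in "$@"; do
        # -F: fixed string, -x: whole-line match, -q: no output
        printf '%s\n' "$running" | grep -qxF -- "$name" || printf '%s\n' "$name"
    done
}

# Usage on a Replica node (names taken from the step above):
#   docker ps --format '{{.Names}}' | missing_containers concord daml_execution_engine
```

No output means that every listed container is running. Any name that is printed is a container that is missing or stopped.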
- Start all the Client nodes.
curl -X POST 127.0.0.1:8546/api/node/management?action=start
- Monitor the health of the deployed VMware Blockchain nodes for about five minutes, and use the logs and metrics to check whether new blocks are being added to the DAML Ledger.
docker exec -it telegraf curl -s http://concord:9891/metrics | grep -ia last_block | tail -1
docker exec -it concord sh -c './concord-ctl status get state-transfer' | grep Fetching
docker exec -it concord sh -c './concord-ctl status get replica' | grep -E 'lastStableSeqNum|curView'
docker logs --since 1m -f concord | grep -ia addBlock | cut -d '|' -f 3,10
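To confirm that new blocks are being added, one option is to take two samples of the last_block metric and compare them. The following is a minimal sketch, not an official tool. It assumes that the metrics line has the value as its last whitespace-separated field, which is not confirmed here. The function names metric_value and blocks_advanced are hypothetical.

```shell
# Extract the last whitespace-separated field of each input line.
# Assumption: the metrics line has the layout "name{labels} value".
metric_value() {
    awk '{ print $NF }'
}

# Succeed if the second sample is numerically greater than the first.
# awk does the comparison so that non-integer values also work.
blocks_advanced() {
    awk -v a="$1" -v b="$2" 'BEGIN { exit !(b > a) }'
}

# Usage on a Replica node (-it is omitted because no terminal is needed):
#   before="$(docker exec telegraf curl -s http://concord:9891/metrics | grep -ia last_block | tail -1 | metric_value)"
#   sleep 60
#   after="$(docker exec telegraf curl -s http://concord:9891/metrics | grep -ia last_block | tail -1 | metric_value)"
#   blocks_advanced "$before" "$after" && echo "new blocks are being added"
```

If the value does not increase over several samples while applications are submitting requests, check the state-transfer and replica status output from the commands above.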