You can optionally restore a failed Replica node from the backup data.

Prerequisites

  • Verify that the backup was created from a healthy Replica node. See Back-Up Replica Nodes on vSphere.

  • Verify that you have captured the IP addresses of all the Replica and Client node VMs and have access to them. You can find the information in the VMware Blockchain Orchestrator descriptor file.
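
Before you start, it can also help to confirm that the backup archive is named as the procedure expects and is readable. The following is a minimal sketch, not part of the product: check_backup is a hypothetical helper name, and the check only catches a misnamed, missing, or corrupt archive. It does not prove that the source Replica node was healthy.

```shell
# Hypothetical helper (not a VMware Blockchain command): sanity-check a
# backup archive before restoring from it.
check_backup() {
  # The procedure expects the backup name to end with .tar.gz.
  case "$1" in
    *.tar.gz) ;;
    *) echo "name does not end with .tar.gz: $1" >&2; return 1 ;;
  esac

  # The file must exist and be readable by the current user.
  [ -r "$1" ] || { echo "cannot read: $1" >&2; return 1; }

  # Listing the archive makes tar read it end to end, which detects
  # truncated or corrupt gzip data without extracting anything.
  tar -tzf "$1" > /dev/null || { echo "archive is corrupt: $1" >&2; return 1; }

  echo "ok: $1"
}
```

For example, check_backup /mnt/data/db-backup.tar.gz prints an "ok" line and returns 0 only when all three checks pass.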

Procedure

  1. Stop all the applications that invoke connection requests from the Daml Ledger.
  2. Stop the Replica nodes.
    curl -X POST "127.0.0.1:8546/api/node/management?action=stop"
  3. Change the owner of /mnt/data/db to the VMware Blockchain user on the new Replica node.
    sudo chown vmbc:users /mnt/data
  4. Remove any contents of the database directory.
    sudo rm -rf /mnt/data/rocksdbdata/*
  5. For a small database, copy the backup data from the old Replica node to the new Replica node.

    The backup file name must end with .tar.gz.

    rsync -avh /mnt/data/<BackupName> <destination>

    Example command:

    rsync -avh /mnt/data/db-backup.tar.gz vmbc@<New Replica IP>:/mnt/data
  6. For a large database, create a password.txt file that contains the VMware Blockchain user password.

    The user password is available in the output files generated by the initial deployment operation.

    1. Run the following command on the old Replica node to copy the backup archive to the new Replica node.
      sudo nohup sshpass -f ./password.txt scp -v -o IdentitiesOnly=yes -o StrictHostKeyChecking=no -r /mnt/data/db-backup.tar.gz vmbc@<New Replica IP>:/mnt/data &
    2. Monitor the nohup.out file and wait for the copy process to complete.
      sudo tail -f nohup.out
    3. Remove the password file after the copy operation completes.
  7. Extract the backup data on the new Replica node.
    #For small data
    sudo tar xvzpf <BackupName> --directory /
    #For large data
    cd /mnt/data
    sudo nohup tar xvzpf db-backup.tar.gz --directory . &

    Extracting a large backup might take some time.

  8. (Optional) If you use the backup data to restore a different node whose configuration has been refreshed, sanitize the Replica node that you are restoring.
    1. Remove the metadata.
      image=$(docker images --format "{{.Repository}}:{{.Tag}}" | grep "concord");docker run -it --rm --entrypoint="" --mount type=bind,source=/mnt/data/rocksdbdata,target=/concord/rocksdbdata $image /concord/kv_blockchain_db_editor /concord/rocksdbdata removeMetadata

      The image variable resolves to the Concord-core image name in the blockchain, for example:

      vmwaresaas.jfrog.io/vmwblockchain/concord-core:1.8.0.0.53

    2. Delete data in the /config/concord/config-generated/gen* directory.
      sudo rm -f /config/concord/config-generated/gen*

    3. Remove the RocksDB lock file.
      sudo rm /mnt/data/rocksdbdata/LOCK
  9. Start all the Replica nodes.
    curl -X POST "127.0.0.1:8546/api/node/management?action=start"
  10. Verify that all the containers, such as daml_execution_engine and concord, are running.

    sudo docker ps -a

    If the containers are not running, use the following commands to remove all containers except the agent and then restart the agent.

    sudo docker ps -aq | grep -v $(sudo docker ps -aq --filter='name=^/agent') | xargs sudo docker rm -f
    sudo docker restart agent
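
The container check in step 10 can be scripted rather than read by eye. The following is a minimal sketch under stated assumptions: missing_containers is a hypothetical helper, not a product command, and it assumes that the container names on your node match the names you pass in exactly (for example, concord and daml_execution_engine).

```shell
# Hypothetical helper: report which required containers are not running.
# Reads running container names (one per line) on stdin and prints every
# required name, given as an argument, that is absent from that list.
missing_containers() {
  running=$(cat)
  for name in "$@"; do
    # -x matches whole lines only, -F treats the name as a literal string,
    # -q suppresses output so that only the exit status is used.
    printf '%s\n' "$running" | grep -qxF -- "$name" || printf '%s\n' "$name"
  done
}

# Example invocation on a Replica node (requires Docker):
#   sudo docker ps --format '{{.Names}}' | missing_containers concord daml_execution_engine
```

Empty output means that every listed container is running. Any printed name identifies a container to investigate before you restart the agent.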