Here are the steps for a sample scenario for vRealize Network Insight Disaster Recovery (DR):
Procedure
- Ensure that SRM is configured and up in both the protected and the recovery sites.
- Configure replication for each of the vRealize Network Insight nodes that are to be protected. While configuring the replication, provide an adequate Recovery Point Objective (RPO) for the vRealize Network Insight instance. For example, for a vRealize Network Insight deployment with a single platform and collector nodes (medium size), an RPO of 45 minutes is adequate. For a cluster whose nodes use large-size bricks, provide an RPO that is adequate for that deployment. The snapshot interval configuration is specific to the user environment and requirements.
- Create a protection group. Include the VMs that you want to protect under a specific protection group.
- Create a recovery plan that includes the respective protection groups.
- Perform test recovery. This is to ensure that your recovery plan works as expected.
- SRM recommends that users perform planned migration at regular intervals to validate the integrity of the existing DR plan.
- Suppose the recovery site has a network configuration that forces the vRealize Network Insight VMs to come up with new IP addresses. Recover the vRealize Network Insight VMs with a recovery plan that assumes no network change for the recovered VMs. After the recovery of the VMs is reported as successful in vRealize Network Insight, manually assign new IP addresses to the vRealize Network Insight nodes, apply new certificates, and re-initialize the cluster.
- Because IPv4 customization with SRM is not currently supported, as a workaround you can perform DR with vRealize Network Insight as if there is no network change.
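The RPO chosen in the replication step bounds how old the newest replica may be when the protected site fails. The following Python sketch illustrates that check under stated assumptions: it is not part of SRM or vRealize Network Insight, the function names are hypothetical, and the timestamps are made up for the example.

```python
from datetime import datetime, timedelta


def replica_age(last_replica_at: datetime, now: datetime) -> timedelta:
    """Age of the newest replica, i.e. the data that would be lost on failover now."""
    return now - last_replica_at


def within_rpo(last_replica_at: datetime, now: datetime, rpo: timedelta) -> bool:
    """Return True if the newest replica is no older than the configured RPO."""
    return replica_age(last_replica_at, now) <= rpo


# Illustrative values only: a 45-minute RPO, as in the medium-size example.
rpo = timedelta(minutes=45)
now = datetime(2024, 1, 1, 12, 0)
fresh_replica = datetime(2024, 1, 1, 11, 30)  # 30 minutes old
stale_replica = datetime(2024, 1, 1, 11, 0)   # 60 minutes old

print(within_rpo(fresh_replica, now, rpo))  # True
print(within_rpo(stale_replica, now, rpo))  # False
```

A check of this kind is only a way to reason about the setting; the actual RPO is configured in the replication settings for each protected node.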
To manually assign the network settings:
- Run the change-network-settings command simultaneously on all the platform nodes.
- Run the update-IP-change command on the nodes on Platform1, Platform2 and Platform3 consecutively.
- Run vrni-proxy set-platform --ip-or-fqdn <with-updated-ip-of-Platform1> on the collector node.
- Check the service status. If some of the services on the platform nodes are not running, reboot the nodes in the recommended order.
- Run the