If a disaster makes Region B unavailable, you perform failback as disaster recovery to Region A. You prepare the network and NSX components in Region A, update the vSAN Default Storage Policy, and fail back the Operations Management applications and the Cloud Management Platform.
Prerequisites
You perform failback as disaster recovery if the following conditions are met:
Procedure
Reconfigure the NSX Instance for the Management Cluster in Region A
In the event of a site failure, when Region B becomes unavailable, prepare the network layer in Region A for failback of management applications. Change the role of the NSX Manager in Region A to primary, redeploy the universal controller cluster, and synchronize the universal controller cluster configuration.
Recover the Control VM of the Universal Distributed Logical Router in Region A
During failback, dynamic routing in Region A is not available because of the failure in Region B. Deploy a Control VM for the universal distributed logical router sfo01m01udlr01 in Region A to recover dynamic routing in the environment. You then reconfigure the recovered Control VM to provide dynamic routing for the SDDC management applications that are failed back.
Reconfigure the Universal Distributed Logical Router and NSX Edges for Dynamic Routing in Region A
To support dynamic routing in Region A before you start disaster recovery from Region B, you configure the universal distributed logical router sfo01m01udlr01 and the NSX Edges sfo01m01esg01 and sfo01m01esg02. This configuration ensures that the management components of the SDDC continue to communicate using optimal routes in a fault-tolerant network.
Verify the Establishment of BGP for the Universal Distributed Logical Router in Region A
Verify that the UDLR for the management applications is successfully peering, and that BGP routing has been established in Region A. After you perform failback, the management applications can then continue communicating to keep the SDDC operational.
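The BGP verification can be scripted once you have captured the neighbor output from the UDLR Control VM. The following is a minimal sketch, not the procedure's own tooling. It assumes that the captured text contains lines of the form `BGP neighbor is <ip>,` and `BGP state = <state>`. The sample text, the IP addresses, and the AS number are hypothetical, and the format on a real appliance may differ.

```python
import re

# Hypothetical sample of captured CLI text; addresses and AS number are placeholders.
SAMPLE_OUTPUT = """\
BGP neighbor is 192.0.2.11, remote AS 65000,
BGP state = Established, up
BGP neighbor is 192.0.2.12, remote AS 65000,
BGP state = Active
"""

NEIGHBOR_RE = re.compile(r"BGP neighbor is (\S+?),")
STATE_RE = re.compile(r"BGP state = (\w+)")


def parse_bgp_states(text):
    """Return {neighbor_ip: state} from neighbor-style text in the assumed format."""
    states = {}
    current = None
    for line in text.splitlines():
        neighbor = NEIGHBOR_RE.search(line)
        if neighbor:
            current = neighbor.group(1)
        state = STATE_RE.search(line)
        if state and current is not None:
            states[current] = state.group(1)
    return states


def neighbors_not_established(text):
    """List neighbors whose session state is anything other than Established."""
    states = parse_bgp_states(text)
    return sorted(ip for ip, state in states.items() if state != "Established")


if __name__ == "__main__":
    print(neighbors_not_established(SAMPLE_OUTPUT))
```

An empty list means that every parsed neighbor reports an established session. Any address in the list is a peer to investigate before you continue with failback.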
Enable Network Connectivity for the NSX Load Balancer in Region A
Enable network connectivity on the sfo01m01lb01 load balancer to support high availability and distribute the network traffic load for vRealize Operations Manager and the Cloud Management Platform after disaster recovery to Region A.
Update the vSAN Default Storage Policy of the Management Cluster in Region A
In the event of a site failure in Region B, the witness VM that is part of Region B is not available. As a result, one fault domain of the vSAN stretched cluster is unavailable. To allow Site Recovery Manager to provision VMs, update the vSAN Default Storage Policy for the management cluster.
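The effect of the missing fault domain can be illustrated with a small model. For mirrored objects, vSAN needs 2n + 1 fault domains to tolerate n failures, so a policy that tolerates one failure needs three fault domains, and only two remain when the witness is lost. This overview does not state which policy rule you change. The `force_provisioning` flag below is an assumption that stands in for an override that permits non-compliant placement, and the function names are hypothetical.

```python
def required_fault_domains(failures_to_tolerate):
    """Mirrored vSAN objects need 2n + 1 fault domains to tolerate n failures."""
    return 2 * failures_to_tolerate + 1


def can_provision(available_fault_domains, failures_to_tolerate,
                  force_provisioning=False):
    """Simplified model of whether a new object can be placed.

    Assumption: a force-provisioning style override allows placement on
    fewer fault domains than the policy requires, as long as at least
    one fault domain can hold the data.
    """
    if available_fault_domains >= required_fault_domains(failures_to_tolerate):
        return True
    return force_provisioning and available_fault_domains >= 1


if __name__ == "__main__":
    # Two data sites plus the witness: the policy can be satisfied.
    print(can_provision(3, 1))
    # Witness lost, policy unchanged: provisioning is refused.
    print(can_provision(2, 1))
    # Witness lost, override enabled: provisioning proceeds non-compliant.
    print(can_provision(2, 1, force_provisioning=True))
```

The model ignores capacity, component placement, and other policy rules. It shows only why provisioning with an unchanged policy fails while the witness is unavailable.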
Initiate Disaster Recovery of the Operations Management Applications in Region A
In the event of a site failure in Region B, initiate disaster recovery of vRealize Suite Lifecycle Manager and of vRealize Operations Manager to keep the monitoring functionality of the SDDC running.
Initiate Disaster Recovery of the Cloud Management Platform in Region A
In the event of a site failure in Region B, initiate disaster recovery of vRealize Automation and vRealize Business in Region A to fail the Cloud Management Platform back to Region A.
Post-Failback Configuration of the SDDC Management Applications
After failback of the Operations Management applications and the Cloud Management Platform, you must perform certain tasks to ensure that the applications perform as expected.
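The procedure above can be tracked as an ordered runbook. The following sketch is an illustration only: it assumes that the steps must run strictly in the listed order, and the shortened step names and the helper functions are hypothetical.

```python
# Shortened names for the procedure steps, in the order they are listed.
FAILBACK_STEPS = [
    "Reconfigure the NSX instance for the management cluster",
    "Recover the UDLR Control VM",
    "Reconfigure the UDLR and NSX Edges for dynamic routing",
    "Verify BGP for the UDLR",
    "Enable network connectivity for the NSX load balancer",
    "Update the vSAN Default Storage Policy",
    "Recover the Operations Management applications",
    "Recover the Cloud Management Platform",
    "Post-failback configuration",
]


def next_step(completed):
    """Return the first step not yet completed, or None when all are done."""
    for step in FAILBACK_STEPS:
        if step not in completed:
            return step
    return None


def mark_done(completed, step):
    """Return a new list with the step recorded; refuse out-of-order steps."""
    expected = next_step(completed)
    if step != expected:
        raise ValueError(f"expected {expected!r}, got {step!r}")
    return completed + [step]


if __name__ == "__main__":
    done = []
    while next_step(done) is not None:
        done = mark_done(done, next_step(done))
    print(len(done), "steps completed")
```

Refusing out-of-order steps reflects the dependency that the network layer must be ready before the management applications are recovered.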