If Region B becomes unavailable because of a disaster, you fail back to Region A as disaster recovery. To do so, you prepare the network and NSX components in Region A, update the vSAN default storage policy, and fail back the operations management applications and the Cloud Management Platform.
Prerequisites
You perform failback as disaster recovery if the following conditions are met:
Procedure
Assign the Primary Role to the NSX Manager Instance for the Management Cluster for VMware Cloud Foundation in Region A
If a site failure occurs after you fail over the SDDC management applications to Region B, you must prepare the network layer in Region A for a failback of the management applications. First, change the role of the NSX Manager instance in Region A to primary so that you can recreate the virtual network infrastructure in Region A by using NSX Manager.
Redeploy the Control VM of the Universal Distributed Logical Router for VMware Cloud Foundation in Region A
During failover of the SDDC management applications to Region B, the universal NSX components for dynamic routing are deployed in Region B. If a site failure occurs in Region B, the management applications might lose connectivity if you fail them back to Region A right away. Deploy and configure a control VM for the universal distributed logical router sfo01m01udlr01 in Region A to restore dynamic routing for the management applications.
Reconfigure the Universal Distributed Logical Router and NSX Edge Nodes for Dynamic Routing for VMware Cloud Foundation in Region A
To support dynamic routing in Region A before you initiate disaster recovery from Region B, you configure the universal distributed logical router sfo01m01udlr01, and the NSX Edge nodes sfo01m01esg01 and sfo01m01esg02. This configuration ensures that the management components of the SDDC continue to communicate using optimal routes in a fault-tolerant network.
Verify the Establishment of BGP for the Universal Distributed Logical Router for VMware Cloud Foundation in Region A
Verify that the UDLR for the management applications is successfully peering, and that BGP routing has been established in Region A. After you perform failback for disaster recovery, the management applications can then continue communicating to keep the SDDC operational.
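One way to script this check is to capture the BGP neighbor listing from the UDLR control VM and confirm that every neighbor reports an established session. The following Python sketch is an illustration only, not the documented verification procedure. The sample text is a hypothetical capture with invented addresses and AS numbers, and the parser assumes that each neighbor line carries a `BGP state = ...` field; adjust the pattern to the output that your NSX version actually prints.

```python
import re

# Assumed line shape (hypothetical):
#   "BGP neighbor is <ip>, remote AS <n>, BGP state = <state>, ..."
NEIGHBOR_RE = re.compile(
    r"BGP neighbor is (?P<ip>\d+\.\d+\.\d+\.\d+),.*?BGP state = (?P<state>\w+)"
)


def bgp_neighbor_states(cli_output: str) -> dict[str, str]:
    """Map each neighbor IP address to the BGP state it reports."""
    return {m["ip"]: m["state"] for m in NEIGHBOR_RE.finditer(cli_output)}


def all_established(cli_output: str) -> bool:
    """Return True only if at least one neighbor exists and all are Established."""
    states = bgp_neighbor_states(cli_output)
    return bool(states) and all(s == "Established" for s in states.values())


# Hypothetical capture: addresses and AS numbers are invented for illustration.
SAMPLE_OUTPUT = """\
BGP neighbor is 192.168.10.1, remote AS 65003, BGP state = Established, up
BGP neighbor is 192.168.10.2, remote AS 65003, BGP state = Active, down
"""

if __name__ == "__main__":
    for ip, state in bgp_neighbor_states(SAMPLE_OUTPUT).items():
        print(f"{ip}: {state}")
    print("all established:", all_established(SAMPLE_OUTPUT))
```

Treating an empty neighbor list as a failure is a deliberate choice in this sketch: a router with no configured peers should not pass a peering check.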
Deploy the NSX Controllers for the NSX Instance for VMware Cloud Foundation in Region A
Deploy the three-node universal NSX Controller cluster in Region A for logical switching and routing in and across the clusters and regions in the SDDC after a failback.
Connect the Application NSX Load Balancer for VMware Cloud Foundation in Region A to the SDDC Network
Enable network connectivity on the sfo01m01lb01 load balancer to support high availability and distribute the network traffic load for the Operations Management applications and the Cloud Management Platform after failback for disaster recovery to Region A.
Update the vSAN Default Storage Policy of the Management Cluster for VMware Cloud Foundation in Region A
In an environment with multiple availability zones, if a site failure in Region B occurs, the witness appliance in Region B becomes inaccessible. As a result, one fault domain becomes unavailable for the vSAN stretched cluster. To continue provisioning virtual machines in Region A, configure the vSAN default storage policy to force-provision these virtual machines. The virtual machines remain non-compliant until the witness appliance rejoins Region A. You perform this operation only when multiple availability zones are configured in your environment.
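The effect of force provisioning can be illustrated with a simplified model. The Python sketch below is not vSAN's placement logic. It assumes a stretched cluster with three fault domains (two availability zones plus the witness) and a mirroring policy that tolerates one failure, which needs all three fault domains for a compliant object. The function and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProvisioningResult:
    allowed: bool    # can the virtual machine be created at all?
    compliant: bool  # does the object satisfy the storage policy?


def evaluate_provisioning(available_fault_domains: int,
                          failures_to_tolerate: int = 1,
                          force_provisioning: bool = False) -> ProvisioningResult:
    """Simplified model: tolerating n failures needs 2n + 1 fault domains."""
    required = 2 * failures_to_tolerate + 1
    compliant = available_fault_domains >= required
    # Force provisioning permits creation even when the policy cannot be met,
    # as long as some storage is still reachable.
    allowed = compliant or (force_provisioning and available_fault_domains >= 1)
    return ProvisioningResult(allowed=allowed, compliant=compliant)


if __name__ == "__main__":
    print("witness reachable:      ", evaluate_provisioning(3))
    print("witness lost, no force: ", evaluate_provisioning(2))
    print("witness lost, forced:   ", evaluate_provisioning(2, force_provisioning=True))
```

In this model, losing the witness leaves two of three fault domains, so provisioning is refused unless force provisioning is enabled, and the resulting virtual machine is still reported as non-compliant.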
Initiate Disaster Recovery of the Operations Management Applications for VMware Cloud Foundation in Region A
If a site failure in Region B occurs after you fail over the SDDC management applications, initiate disaster recovery of vRealize Suite Lifecycle Manager and of vRealize Operations Manager to keep the monitoring functionality of the SDDC running.
Initiate Disaster Recovery of the Cloud Management Platform for VMware Cloud Foundation in Region A
If a site failure in Region B occurs, initiate a disaster recovery of the vRealize Automation components to keep the workload provisioning functionality of the SDDC available.
Post-Failback Configuration of the SDDC Management Applications for VMware Cloud Foundation
After failback of the Operations Management applications and the Cloud Management Platform, you must perform certain tasks to ensure that the applications perform as expected.
Additional Post-Failback Configuration After Region B Is Available Again for VMware Cloud Foundation
When the protected region, Region B, is back online, you can fully transfer all features of the original SDDC configuration to Region A. Integrate the running management nodes in Region B into the main SDDC configuration that is failed back to Region A.
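The order of the tasks above matters: network preparation precedes application recovery, and the vSAN policy update applies only when multiple availability zones are configured. The following Python sketch captures that ordering as a checklist. The step labels are shortened from the task titles above, the `Step` type and the `multi_az` flag are hypothetical names chosen for illustration, and nothing here calls a VMware API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    title: str
    multi_az_only: bool = False  # run only with multiple availability zones


# Steps in the order given by the procedure.
FAILBACK_STEPS = [
    Step("Assign the primary role to the NSX Manager in Region A"),
    Step("Redeploy the UDLR control VM in Region A"),
    Step("Reconfigure the UDLR and NSX Edge nodes for dynamic routing"),
    Step("Verify that BGP is established for the UDLR"),
    Step("Deploy the NSX Controllers in Region A"),
    Step("Connect the application load balancer to the SDDC network"),
    Step("Update the vSAN default storage policy", multi_az_only=True),
    Step("Initiate disaster recovery of the operations management applications"),
    Step("Initiate disaster recovery of the Cloud Management Platform"),
    Step("Perform post-failback configuration"),
    Step("Perform additional configuration after Region B is available again"),
]


def plan(multi_az: bool) -> list[str]:
    """Return the ordered step titles that apply to this environment."""
    return [s.title for s in FAILBACK_STEPS if multi_az or not s.multi_az_only]


if __name__ == "__main__":
    for number, title in enumerate(plan(multi_az=False), start=1):
        print(f"{number}. {title}")
```

Calling `plan(multi_az=True)` returns all eleven steps, while `plan(multi_az=False)` omits the vSAN policy update.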