Use the following checklist to verify that you have fulfilled all the requirements to initiate disaster recovery or planned migration of the SDDC management applications, and to complete the configuration of these applications afterward.

Table 1. Checklist for Failover and Failback in a Validated SDDC

Checklist

Tasks

Activation and Assessment

  • Verify that the disaster failover or failback is required:

    • For example, an application failure alone might not justify a failover or failback, while an extended region outage is a valid cause.

  • Plan for business continuity events such as scheduled building maintenance or the probability of a natural disaster.

Approval

  • Submit the required documentation for approval to the following roles:

    • IT management staff

    • CTO

    • Business users

    • Other stakeholders

Activation Logistics

  • Verify that all the required facilities and personnel are available for the complete duration of the disaster recovery process.

  • Verify that Site Recovery Manager is available in the recovery region.

  • Verify the replication status of the applications.

  • Verify the state of the NSX Edge in the recovery region:

    • Verify that the NSX Edges are available.

    • Verify that the IP addresses for the VXLAN-backed networks are correct.

    • Verify that the load balancer on the NSX Edge is correctly configured according to the design.

    • Verify that the firewall on the NSX Edge is correctly configured according to the design.
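
The IP address check for the VXLAN-backed networks can be partly scripted. The following is a minimal sketch, assuming you maintain a mapping of network names to the subnets defined in your design. The network names, the subnets (taken from documentation-only address ranges), and the `find_misplaced_addresses` helper are hypothetical and are not part of the validated design. How you collect the observed addresses from the NSX Edge is not shown.

```python
import ipaddress

# Hypothetical expected subnets per VXLAN-backed network.
# Replace these placeholder values with the subnets from your own design.
EXPECTED_SUBNETS = {
    "example-network-a": "192.0.2.0/24",
    "example-network-b": "198.51.100.0/24",
}


def find_misplaced_addresses(observed, expected=EXPECTED_SUBNETS):
    """Return (network, address) pairs that do not match the expected subnet.

    observed maps a network name to the IP address found on that network.
    A network that is missing from the expected mapping is also reported.
    """
    problems = []
    for network, address in observed.items():
        subnet = expected.get(network)
        if subnet is None:
            problems.append((network, address))
            continue
        if ipaddress.ip_address(address) not in ipaddress.ip_network(subnet):
            problems.append((network, address))
    return problems
```

An empty result means that every observed address falls inside its expected subnet. Any returned pair should be investigated before you initiate the failover or failback.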

Communication, Initiation, and Failover or Failback Validation

  • In case of a planned migration:

    • Notify all stakeholders of the planned outage and the expected duration of the maintenance window.

    • At the scheduled time, initiate the failover or failback process.

  • In case of a disaster recovery failover or failback:

    • Before initiating a failover or a failback, notify all stakeholders of the event.

  • After completing a failover or a failback:

    • Test application availability.

    • Notify all stakeholders of the completed event.

Multiple Availability Zones

If your environment consists of multiple availability zones, perform the following additional configuration for disaster recovery failback.

  • In a disaster recovery failback in which Region B remains unavailable, the witness vSAN appliance is also unavailable, which might impact storage policies. To allow recovery of the vRealize components, set the force provisioning setting of the storage policy to Yes (sets FTT=0 for all newly provisioned VMs). Revert the storage policy after the recovery is complete.

  • In a planned migration in which Region A and Region B are still operational, you do not need to update the storage policy because the witness vSAN appliance remains available.
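
The two cases above can be expressed as a small decision helper for a runbook script. This is a minimal sketch of the decision only: the function name and the event type strings are hypothetical, and the sketch does not change any storage policy. Combinations that the checklist does not describe raise an error instead of guessing.

```python
def needs_force_provisioning(event_type, region_b_available):
    """Decide whether to set force provisioning to Yes before recovery.

    event_type is a hypothetical label, either "disaster_recovery_failback"
    or "planned_migration". region_b_available states whether Region B,
    and with it the witness vSAN appliance, is still operational.
    """
    if event_type == "disaster_recovery_failback" and not region_b_available:
        # The witness appliance is unavailable, so update the storage policy
        # and remember to revert it after the recovery is complete.
        return True
    if event_type == "planned_migration" and region_b_available:
        # The witness appliance remains available, so no update is needed.
        return False
    # The checklist does not cover any other combination.
    raise ValueError(
        f"No guidance for {event_type!r} with region_b_available={region_b_available}"
    )
```

For example, `needs_force_provisioning("disaster_recovery_failback", False)` returns `True`, which corresponds to the first case in the list.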

Configuration After Failover or Failback

In case of disaster recovery failover or failback, perform the following additional configuration:

  • Update the backup jobs to include the applications that are now running in Region B. For information about the configured backup jobs, see the VMware Validated Design Backup and Restore documentation.

  • Configure the NSX Controllers and the UDLR Control VM to forward events to vRealize Log Insight in the recovery region.

  • Redirect the log data from the failed-over or failed-back applications to vRealize Log Insight in the recovery region.

  • Complete a post-recovery assessment:

    • Note which items worked and which did not work, and identify improvements that you can include in the recovery plan.