When you run a recovery plan for failover, a running instance of the plan's recovery steps launches. The plan continues until it pauses for user input, encounters an error (if it is configured to stop on errors), or you cancel, roll back, or end the plan.

As a general rule, once you start a failover plan, do not make any inventory changes on the protected site until the plan completes. You cannot edit a recovery plan once the plan has been started.

If you need to make any protected site vCenter inventory changes, such as renaming a datacenter or moving VMs to different folders or resource pools, do so before you initiate a recovery plan for failover.
Note: For successful failover operations, ensure that any vSphere tags and tag categories associated with your protected VMs also exist on the target recovery SDDC, or the failover will not succeed. The recovery plan compliance report checks that these tags and tag categories exist on the destination recovery SDDC.
Note: Failing over VMs with Trusted Platform Module (TPM) is not supported.

Also, make sure that the plan’s compliance check succeeds before you run the plan.
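
The tag requirement in the note above amounts to a set comparison between the two sites. The following is a minimal sketch of that idea only, not the product's compliance check: the `missing_tags` function and the sample inventories are hypothetical, and in practice the tag data would have to be gathered from each site's vCenter inventory by some other means.

```python
from typing import Dict, Set

# Hypothetical inventory shape: tag category name -> set of tag names.
TagInventory = Dict[str, Set[str]]


def missing_tags(protected: TagInventory, recovery: TagInventory) -> TagInventory:
    """Return categories and tags used on the protected site that the
    recovery SDDC lacks. An empty result means the requirement is met."""
    missing: TagInventory = {}
    for category, tags in protected.items():
        # A category absent from the recovery site counts as having no tags.
        absent = tags - recovery.get(category, set())
        if absent:
            missing[category] = absent
    return missing


# Illustrative data only (assumed names, not taken from a real environment).
protected_site = {"Tier": {"gold", "silver"}, "Owner": {"finance"}}
recovery_sddc = {"Tier": {"gold"}}

gaps = missing_tags(protected_site, recovery_sddc)
for category, tags in sorted(gaps.items()):
    print(f"{category}: missing {sorted(tags)}")
```

With the sample data, the sketch reports the `Owner` category and the `silver` tag in `Tier` as missing, which is the kind of gap you would close before running the plan.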

Procedure

  1. From the left navigation, select Recovery plans.
  2. In the list of plans, click the plan that you want to run. The plan you select must be in the Ready state.
  3. Next, click the DR Failover button.
  4. In the Compliance check page of the Failover wizard, you can view the Compliance check for the plan. Click Next.
    Warning: Running a failover when the plan compliance check is not all green will likely cause the failover to fail.
  5. In the Snapshots page, you can verify that the failover plan is using the snapshot you want when it fails over. By default, the most recent snapshot is selected, but if you want to select a different snapshot, click Use different snapshot.
  6. In the Select protection group snapshot dialog box, select a snapshot you want to use for the failover operation, and then click OK.
  7. Click Next. In the Runtime settings page, under Error Handling select one of the following two options:
    • Ignore all errors. Select 'Ignore all errors' to run the failover in unattended mode and to allow the failover operation to continue running, even when it encounters errors. The system automatically ignores all errors by default. You can still fix those errors at the end of the failover operation if the failover completes with partial success, by clicking Retry all errors.
    • Stop on every error. Select this option to run the failover in attended mode. In this mode, the plan stops on every error and waits for you to click Retry or Ignore and continue. This option is useful if you are running this plan as a test failover.
  8. Click Next, and on the VM Storage page, select how you want the VMs stored once they are failed over:
    • Run VMs live on the cloud file system. After failover, VMs run live directly on the cloud file system, which offers a faster failover time for better RTO. Another benefit of running VMs on the cloud file system is that subsequent failback operations are also faster, resulting in less downtime. Some VMs recovered on the cloud file system might require performance that is better suited to vSAN. After a recovery plan operation completes, you can selectively Storage vMotion workloads to the vSAN datastore to improve performance. If you Storage vMotion the VMs manually, it can cause a longer failback process for those VMs.

      Another benefit of using the cloud file system for disaster recovery operations is that you will likely require fewer, and potentially less expensive, hosts to operate during disaster recovery. You only have to size and scale your SDDC for CPU and memory, and you avoid adding hosts just to meet vSAN capacity requirements, which are often the constraint when sizing an SDDC.

    • Full storage migration to recovery SDDC. Performs a full Storage vMotion migration from the staging datastore to the SDDC vSAN datastore as the final step of running a plan.

      Using this option increases RTO, as the plan cannot be committed or finished until all Storage vMotion operations are complete. At scale, this can take hours or days. Because a failover plan must be committed before you can fail back, you cannot run a failback operation until the initial Storage vMotion is complete, even if all VMs are up and running. Failback also involves a longer outage for workloads that have been migrated to vSAN. Fully migrated VMs provide higher IOPS performance, which is suitable for VMs that require higher performance, such as database VMs. This option might require more hosts on the cluster, depending on the size of the VMs.

  9. Click Next, and in the Preview page, you can view the steps that are taken when you finally run the plan. You defined these steps in the Recovery Plan Recovery steps page. To achieve low RTO, VMs are first recovered on the staging datastore. This recovery involves no data copy. VMs are powered on using the stored backups directly.
    If you have selected 'Full storage migration to recovery SDDC', VMware Live Cyber Recovery adds automatic migration tasks to the plan, which will run and must complete prior to plan completion and commit.
  10. Click Next. In the Confirmation page, enter FAILOVER in all uppercase letters in the confirmation text box to confirm that you want to run this failover plan.
  11. Click Finish to run the failover operation.
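
The two Error Handling options in step 7 differ only in what happens when a step fails. The sketch below models that difference in plain Python. It is an illustration under stated assumptions, not how the product is implemented: `run_steps`, the step names, and the `decide` callback (standing in for the operator clicking Retry or Ignore and continue) are all hypothetical, and the Retry all errors action is not modeled.

```python
from typing import Callable, List, Optional, Tuple

Step = Tuple[str, Callable[[], None]]
Decision = Callable[[str, Exception], str]  # returns "retry" or "ignore"


def run_steps(steps: List[Step], stop_on_error: bool,
              decide: Optional[Decision] = None) -> List[str]:
    """Run steps in order; return the names of steps whose errors were ignored.

    stop_on_error=False models 'Ignore all errors' (unattended mode).
    stop_on_error=True models 'Stop on every error' (attended mode).
    """
    ignored: List[str] = []
    for name, action in steps:
        while True:
            try:
                action()
                break
            except Exception as err:
                if stop_on_error and decide is not None and decide(name, err) == "retry":
                    continue  # retry the same step (loops while the operator keeps retrying)
                ignored.append(name)  # record the error and move on
                break
    return ignored


# Hypothetical step that fails on its first attempt only.
attempts = {"power_on": 0}


def power_on() -> None:
    attempts["power_on"] += 1
    if attempts["power_on"] == 1:
        raise RuntimeError("transient failure")


steps: List[Step] = [("recover_vm", lambda: None), ("power_on", power_on)]

unattended = run_steps(steps, stop_on_error=False)
attempts["power_on"] = 0  # reset the simulated failure for the second run
attended = run_steps(steps, stop_on_error=True, decide=lambda name, err: "retry")
print(unattended, attended)
```

In the unattended run the failed step is recorded and the plan carries on, which corresponds to a failover that completes with partial success. In the attended run the simulated operator retries the step, and nothing is left to fix afterward.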

Results

You can monitor the failover process in the VMware Live Cyber Recovery UI by clicking the plan to view its details. You can also monitor the process in the recovery SDDC. After failover, once the VMs have been powered on, they are either migrated by Storage vMotion to the recovery SDDC vSAN datastore or continue to run live on the cloud file system, depending on the VM storage option you selected.

After a failover operation finishes, you must commit the failover to make the effects permanent. When you commit a completed failover plan, the plan transitions to the committed state. You cannot start a failback to the source site until the plan is committed.

Until an administrator explicitly commits the completed failover operation, it can be rolled back, even after it completes successfully. After you commit a plan, it can no longer be rolled back.
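
The commit and rollback rules above can be read as a small state machine. The sketch below encodes one such reading. Only the Ready and committed states are named in this topic; the other state names, the `TRANSITIONS` table, and the helper functions are assumptions made for illustration, and what happens to a plan after a rollback is not covered here, so it is not modeled.

```python
from enum import Enum


class PlanState(Enum):
    READY = "ready"
    RUNNING = "running"          # assumed name
    FINISHED = "finished"        # assumed name: failover done, not yet committed
    COMMITTED = "committed"
    ROLLED_BACK = "rolled back"  # assumed name


# Allowed transitions, as read from the text above.
TRANSITIONS = {
    PlanState.READY: {PlanState.RUNNING},
    PlanState.RUNNING: {PlanState.FINISHED, PlanState.ROLLED_BACK},
    PlanState.FINISHED: {PlanState.COMMITTED, PlanState.ROLLED_BACK},
    PlanState.COMMITTED: set(),    # no rollback after commit
    PlanState.ROLLED_BACK: set(),  # follow-on behavior not modeled
}


def advance(current: PlanState, target: PlanState) -> PlanState:
    """Move to the target state, or raise if the transition is not allowed."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"cannot go from {current.value} to {target.value}")
    return target


def can_start_failback(state: PlanState) -> bool:
    """Failback to the source site requires a committed plan."""
    return state is PlanState.COMMITTED


state = advance(PlanState.READY, PlanState.RUNNING)
state = advance(state, PlanState.FINISHED)
print(can_start_failback(state))  # False: finished but not committed
state = advance(state, PlanState.COMMITTED)
print(can_start_failback(state))  # True
```

Attempting `advance(PlanState.COMMITTED, PlanState.ROLLED_BACK)` raises `ValueError` in this sketch, mirroring the rule that a committed plan cannot be rolled back.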

For more information, see #GUID-B3301FF8-60B6-42D1-BA3F-44228142D505.