From cloud sites backed by VMware Cloud Director, orchestrate complex failover or migration to and from paired cloud and on-premises sites by using recovery plans. These plans attach existing replications to ordered steps with optional delay or prompt attributes. Prioritize which workloads fail over or migrate and power on first, followed by workloads that wait for specific conditions before recovering or migrating and powering on.

VMware Cloud Director Availability 4.3 introduces recovery plans. These plans orchestrate, step by step, the failover or migration of already created vApp and virtual machine incoming replications to the disaster recovery site. Sequence and organize the disaster recovery or migration of each workload by priority, with available delays and prompts.
Recovery Plans
Each recovery plan consists of sequential actions, called steps. The plans can contain an unlimited number of ordered steps.
  • Recovery plans contain steps that perform only test failover or failover of the protected workloads.
  • Migration plans allow scheduling of the synchronization and contain steps that perform only migrations.
Steps
Each step in the recovery plan can perform multiple existing replication tasks such as a test failover, a failover, or a migration of the workload with optional attributes after the step completes, like a delay or a prompt.
Delay
This step attribute allows configuring a waiting time before executing the next step. The delay applies after completing all replication tasks in the current step.
Prompt
This step attribute allows configuring a user prompt message. After the current step completes, the plan suspends and does not start the next step until the prompt is approved.

Scheduling Migrations Synchronization

Scheduling the initial synchronization of a migration
You can schedule the initial synchronization time when creating any migration.
Then the initial synchronization of the migration waits for the scheduled time, while the replication remains paused.
Scheduling the migrations auto synchronization of a plan
You can schedule the migrations auto synchronization when creating or editing a migration plan.
Then all the plan migrations automatically synchronize at this scheduled time, regardless of their previous synchronizations.
Delayed synchronization
At the migration plan scheduled time, if a migration started paused, meaning the virtual machine is not running or the initial synchronization of the migration is scheduled later than the plan scheduled time, the migration performs its initial synchronization.
Synchronize before migrate
At the migration plan scheduled time, if a migration is already synchronized, meaning the virtual machine is running and either no initial synchronization time is scheduled for the migration or the scheduled initial synchronization already passed, the migration performs a subsequent synchronization. This synchronization reduces the Recovery Time Objective (RTO) close to the actual migration.
Note: Scheduling the auto synchronization in a migration plan overwrites the initial synchronization schedule of all its migrations.
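The two cases above can be summarized as a small decision function. The following is a minimal sketch in Python, not the product API: the `Migration` class, its fields, and the `sync_at_plan_time` function are hypothetical names introduced only to model the rules described in this section.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Migration:
    """Hypothetical model of a migration attached to a migration plan."""
    vm_running: bool
    # Scheduled initial synchronization time, or None if never scheduled.
    initial_sync_at: Optional[datetime] = None


def sync_at_plan_time(migration: Migration, plan_time: datetime) -> str:
    """Classify the synchronization performed at the plan scheduled time."""
    scheduled_later = (
        migration.initial_sync_at is not None
        and migration.initial_sync_at > plan_time
    )
    if not migration.vm_running or scheduled_later:
        # Delayed synchronization: the migration started paused, so the
        # plan schedule triggers its initial synchronization.
        return "initial"
    # Synchronize before migrate: the migration is already synchronized,
    # so the plan schedule triggers a subsequent synchronization.
    return "subsequent"
```

In this model, a migration whose own initial synchronization is scheduled after the plan time yields `"initial"`, which matches the note that the plan schedule overwrites the initial synchronization schedule of its migrations.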

Step and Recovery Plan Execution

In VMware Cloud Director Availability 4.4 and later, selecting a step shows its detailed execution sequence, highlighting the currently performed activity when the step executes. Also, while executing a recovery plan, the active Follow plan toggle selects the currently executed step, showing its detailed execution sequence with the currently performed activity. Then, after the currently executed step completes, the view automatically follows the next executed step and keeps showing the currently active step details as the plan proceeds. Selecting another step deactivates the Follow plan toggle and keeps showing the selected step details without advancing while the recovery plan completes its steps. For example, consider a recovery plan that consists of the following steps, shown with their detailed execution sequence:
  • Not Started > Delay (wait X seconds) > Completed
  • Not Started > Failover & Delay (Failover X vApps and Y VMs, then wait Z seconds) > Completed
  • Not Started > Delay (Wait X seconds) > Prompt (Message) > Completed
  • Not Started > Failover (Failover X vApps and Y VMs) > Completed

For each step, the execution of a recovery plan repeats the following fixed sequence, according to the configured attributes.

  1. Execute and complete the step of the plan. In parallel, for each vApp or virtual machine in the step:
    1. First, perform the replication task, like test failover or failover, by using the latest available instance for the replication. Migrate tasks perform at least one synchronization before failing over.
    2. After the replication task completes, power on the workload.
  2. If the step has a delay configured, wait. Otherwise, skip to #3.
    1. The step waits for the configured seconds or minutes.
    2. After the delay, the plan resumes executing #3.
  3. If the step has a prompt configured, suspend. Otherwise, skip to #4.
    1. Suspend the plan after completing the current step, until the prompt is approved.
    2. Prompt the user. Approving the prompt resumes executing #4.
  4. Repeat this sequence with the next step in the plan, if any more, executing from #1.
  • After the last step, the recovery plan completes with a Completed Failover or a Completed Migrate state, regardless of whether certain replication operations completed with a warning.
  • Alternatively, the recovery plan suspends with a Suspended... state on a prompt, or when clicking Suspend, or at any step where the replication operation fails with an error message.

    For example, any recovery plan suspends at a migration step that requires authentication with the remote site.
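The fixed execution sequence above can be sketched as a short simulation. This is an illustrative model only, under stated assumptions: `Step`, `run_plan`, and the `recover`, `approve`, and `sleep` callbacks are hypothetical names, not part of VMware Cloud Director Availability, and the returned state strings are simplified.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Step:
    """Hypothetical model of a plan step with its optional attributes."""
    name: str
    replications: List[str] = field(default_factory=list)
    delay_seconds: float = 0
    prompt: Optional[str] = None


def run_plan(
    steps: List[Step],
    recover: Callable[[str], None],
    approve: Callable[[str], bool],
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Run the steps in order and return a simplified final plan state."""
    for step in steps:
        # 1. Recover and power on every workload of the step in parallel.
        if step.replications:
            with ThreadPoolExecutor() as pool:
                futures = [pool.submit(recover, r) for r in step.replications]
            # Leaving the "with" block waits for all tasks to finish.
            if any(f.exception() is not None for f in futures):
                return f"Suspended at {step.name}"
        # 2. Wait only when a delay is configured.
        if step.delay_seconds:
            sleep(step.delay_seconds)
        # 3. Suspend on a configured prompt until it is approved.
        if step.prompt is not None and not approve(step.prompt):
            return f"Suspended at {step.name}"
        # 4. Continue with the next step, if any.
    return "Completed"
```

Here `recover` stands for the replication task followed by powering on the workload, and `approve` returns `True` when the user approves the prompt. An empty step passes through all three stages without performing any operation, so a plan that contains only empty steps still returns `"Completed"`.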

Recovery Plan States

The allowed operations on a recovery plan depend on its current state and on the last operation in the plan.

Not started recovery plans
The Not started state persists before executing any recovery plan operation, or after executing a test cleanup operation. Not started recovery plans allow all operations, like test failover, failover or migrate, editing and modifying the steps, and attaching and detaching replications.
Running recovery plans
While running, recovery plans only allow clicking Suspend, suspending the plan after the current step executes. Running recovery plans do not allow any other replication operation, nor modifying the steps, nor their order, nor attaching and detaching replications.
Suspended recovery plans
  • Recovery plans suspended on a prompt resume after clicking Approve Prompt. Alternatively, they resume by using failover or migrate.
  • Recovery plans suspended after a test or cleanup step allow resuming by using test failover, test cleanup, failover, or migrate.
  • Recovery plans suspended after a failover or a migrate step allow resuming by using failover or migrate.
  • All suspended recovery plans allow editing and modifying the steps and attaching and detaching replications. For example, detaching the replications that suspend the recovery plan allows resuming the plan execution.
  • After modifying the order of existing steps, resuming uses the step order from before the modification. New steps execute according to their position. For example, after adding a step and moving it before the currently suspended step, execution resumes with the new step first.
Completed recovery plans
  • Completed failover and completed migrate plans only allow deleting or cloning in a new plan. Such plans allow neither editing or modifying their steps, nor attaching and detaching replications.
  • Completed test failover plans allow test cleanup, failover, migrate, and editing, but do not allow attaching and detaching replications.
  • Migration plans migrate their workloads and complete. Similarly, failover plans perform failover and complete.
  • Empty steps execute and complete, performing no operations and continue with the next step.
  • Empty recovery plans without steps or with empty steps execute without performing any tasks and have a Completed state.
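The state rules in this section can be read as a lookup table from plan state to permitted operations. The sketch below is a simplified model under stated assumptions: `PlanState`, `ALLOWED`, `is_allowed`, and the operation names are hypothetical, and the table lists only the operations this section names for each state. The extra resume options of suspended plans (test failover, test cleanup, or approving a prompt) depend on the last operation and are left out.

```python
from enum import Enum


class PlanState(Enum):
    """Hypothetical enumeration of the plan states described above."""
    NOT_STARTED = "Not started"
    RUNNING = "Running"
    SUSPENDED = "Suspended"
    COMPLETED_TEST = "Completed Test"
    COMPLETED_FAILOVER = "Completed Failover"
    COMPLETED_MIGRATE = "Completed Migrate"


# Operations named in this section for each state (simplified).
ALLOWED = {
    PlanState.NOT_STARTED: {
        "test", "failover", "migrate", "edit",
        "modify_steps", "attach", "detach",
    },
    PlanState.RUNNING: {"suspend"},
    PlanState.SUSPENDED: {
        "failover", "migrate", "edit", "modify_steps", "attach", "detach",
    },
    PlanState.COMPLETED_TEST: {"test_cleanup", "failover", "migrate", "edit"},
    PlanState.COMPLETED_FAILOVER: {"delete", "clone"},
    PlanState.COMPLETED_MIGRATE: {"delete", "clone"},
}


def is_allowed(state: PlanState, operation: str) -> bool:
    """Return True when the modeled state permits the operation."""
    return operation in ALLOWED[state]
```

For example, `is_allowed(PlanState.RUNNING, "failover")` is `False` because a running plan only allows suspending, and `is_allowed(PlanState.COMPLETED_TEST, "attach")` is `False` because completed test failover plans do not allow attaching replications.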

Replications Implications

  • Steps can only use existing replications and do not create new replications.
  • The recovery plan steps treat the replicated workload similarly, regardless of whether it is a vApp replication or a virtual machine replication.
  • One replication task can be part of multiple recovery plans but not in multiple steps in the same plan.

    When using the same replication task in more than one recovery plan and several plans using this task start simultaneously, the plan that first starts the replication task completes its steps. The steps of the remaining recovery plans also complete, skipping this reused replication task as already performed, provided the step that performed it has completed. If that step is still in progress, the remaining recovery plans can fail.

    For example, two recovery plans contain steps with replication tasks for the same workload. The first plan executes a step performing a failover task, then the second plan executes a step performing a test failover task. As a result, the recovery plan executing the test failover task fails at the step containing the already failed over replication.

  • Deleting a replication while used in a recovery plan detaches the replication from the step where it is attached, without causing the plan to fail.
  • To change advanced replication settings, like network settings or disk settings, directly modify the replication settings. After the modifications, all plans using the modified replication execute by using the updated replication settings.
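The rule that a replication can belong to several plans but to only one step within a plan can be checked with a few lines of code. This is a minimal sketch, assuming a plan is represented as a hypothetical mapping of step names to lists of replication identifiers. The function name `duplicated_replications` is illustrative and not part of the product.

```python
from collections import Counter
from typing import Dict, List


def duplicated_replications(plan_steps: Dict[str, List[str]]) -> List[str]:
    """Return replication IDs attached to more than one step of one plan."""
    # Count each replication once per step, then keep those seen in 2+ steps.
    counts = Counter(
        replication
        for replications in plan_steps.values()
        for replication in set(replications)
    )
    return sorted(r for r, n in counts.items() if n > 1)


plan = {"step-1": ["repl-a", "repl-b"], "step-2": ["repl-b", "repl-c"]}
violations = duplicated_replications(plan)  # "repl-b" appears in two steps
```

The check runs per plan, so the same replication identifier appearing in two different plans is not reported.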

Recovery Plans Operations

After logging in to the cloud site, in the left pane, under the Replications section, click Recovery Plans.

Note: Recovery plans are available only from the cloud site and are not available from on-premises sites. On-premises workloads can be part of the plans and are managed from the cloud site.
New recovery plan
Allows entering a name and optional description then creates a blank recovery plan for adding steps that perform protections.
New migration plan
Allows entering a name, an optional description, and an optional synchronization schedule of the migrations, then creates a blank migration plan for adding steps that perform migrations. Scheduling the synchronization in the plan overrides the initial synchronization schedule of its migrations.
Selecting an existing plan that is in a Not started or in a Suspended... state allows the following actions.
New step
Adds a step in the selected plan. For information about the actions of the steps, see the next section.
Edit
Editing allows modifying the selected plan name and description and for migration plans modifying the automatic synchronization schedule. Editing is available for plans in a Not started, or Completed Test, or Suspended... state.
Delete
Prompts for confirmation before removing the selected plan. Deleting is available for suspended, completed, and not started plans, and is unavailable only for plans in a Running state.

Deleting a recovery plan also removes all of its reports.

Suspend
Requests pausing the execution of the selected plan after completing the currently running replication task in the current step. Suspending is available for any plan in a Running state. While suspended, the plan allows attaching and detaching replications, re-ordering the steps, and adding or removing steps. After modifying the steps or their order, the plan resumes execution from the first step and skips the completed steps, where an already approved prompt means a completed step. When a prompt suspends the step, reordering the steps and then approving the prompt resumes with the original next step, as before the reordering, and the plan completes.
Test
Performs a test failover task for all workloads in the selected plan. Testing is inactive after a test or after a failover or a migrate task completes.
Test Cleanup
Performs a cleanup of the test failover tasks for all workloads in the selected plan. Cleanup is inactive until a test completes.
Failover
Performs a failover task for all workloads in the selected plan. Failover is inactive after a failover or a migrate task completes. Failover is available for plans in a Not started, or Completed Test, or Suspended state.
Migrate
Performs a migration task for all workloads in the selected plan. Migrate suspends unless authenticated with the remote site. Migrate is inactive after a failover or a migrate task completes. Similar to failover, migrate is available for plans in a Not started, or in a Completed Test, or in a Suspended... state.
Monitor tasks
Opens Replication Tasks, filtered to only display the tasks of the selected plan.
Other actions
  • Change owner - allows selecting a new owner organization for the selected plan. The ownership and the visibility of a plan belong to the user who initially created it. For example, plans created by the service provider are not visible to a tenant user until changing the owner. Change owner is inactive after failover or migrate completes.
  • Clone - prompts for a name of the duplicate plan and copies the steps of the selected plan in the duplicate plan. Optionally, cloning allows detaching all replications from the steps of the duplicate plan, while preserving the steps. Cloning a recovery plan creates a recovery plan duplicate. Similarly, cloning a migration plan creates a migration plan duplicate. Both completed and suspended plans allow cloning. The cloned plan is in a Not started state with Not started steps, regardless of whether any steps completed in the source plan.
  • Reports - new for VMware Cloud Director Availability 4.4, shows the Recovery Plan Reports window for the selected recovery plan. This page contains entries for the performed operation of each completed plan execution, the start and end timestamps and the result of each execution. For example, the following recovery plan executed four times, with the latest performed operation on top:
    Table 1. Recovery Plan Reports
    Operation   Start Date          End Date            Result
    Failover    d/m/yyyy, h:mm:ss   d/m/yyyy, h:mm:ss   Success
    Test        d/m/yyyy, h:mm:ss   d/m/yyyy, h:mm:ss   Error
    Cleanup     d/m/yyyy, h:mm:ss   d/m/yyyy, h:mm:ss   Success
    Test        d/m/yyyy, h:mm:ss   d/m/yyyy, h:mm:ss   Error
    Selecting any of these performed and completed operations activates View Report File for the selected operation. Clicking View Report File opens its Recovery Plan Execution Report.

Recovery Plan Execution Report

VMware Cloud Director Availability 4.4 and later show reports for each execution of a recovery plan by selecting the plan and clicking Other Actions > Reports.

In the Recovery Plan Reports window, selecting an operation and clicking View Report File opens a new Recovery Plan Execution Report page that contains the following information:
  • Plan: name.
  • Type: RECOVERY or MIGRATION.
  • Site: name.
  • Owner: org@site.
  • Steps: X executed of Y total.
  • Duration: start date - end date.
  • Operation: Test, or Cleanup, or Failover, or Migrate, with operation state Completed or Failed.

    Operations with the message "Operation suspended by user." show as Failed.

  • Step information: Step name, Delay if exists, Duration, Outcome, Prompt if exists.
    • Recovery information: VM/vApp Name, Source site, State, Duration if executed, Outcome.
    The Outcome can be Completed, Failed, or Not Started when the recovery plan failed or suspended before that step.
Recovery executions are nested into steps, similar to how replications associate to the steps in the plan. The report can contain multiple steps and multiple recovery executions within each of those steps.

Steps Operations

Selecting an existing recovery plan that is in a Not started or in a Suspended... state allows adding steps in the plan.
New step
  • For recovery plans, completing the New Recovery Step wizard allows attaching multiple vApp or virtual machine protections for recovery in the step and creates a recovery step.
  • For migration plans, completing the New Migration Step wizard allows attaching multiple vApp or virtual machine migrations for migration in the step and creates a migration step.
Completing each of the New Step wizards allows selecting an optional delay and an optional prompt that suspends that step unless approved.
Selecting an existing and not executed step from an existing recovery plan that is in a Not started or in a Suspended state allows the following actions for the step.
Edit
Allows modifying the name, the optional delay, and the optional prompt of the selected step.
Delete
Prompts a confirmation for removing the selected step from the current plan. Deleting is available for completed steps but not while a step is running.
Attach
The Attach replications window allows selecting vApp or virtual machine replications for attaching in the selected step.
  • When attaching a vApp replication, changing the number of replicated virtual machines in that vApp replication affects the recovery plan. Adding virtual machine replications for that vApp attaches the new virtual machine replications to the step with the attached vApp. Similarly, removing virtual machine replications from the vApp detaches them from the step.
  • Alternatively, attaching all the virtual machine replications of a vApp replication in the step permanently fixes those virtual machine replications as part of the step. Adding or removing virtual machine replications to the same vApp replication does not affect the step or the recovery plan.
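The difference between attaching a vApp replication and attaching its virtual machine replications individually can be illustrated with a small model. This is a sketch under stated assumptions: `step_members` and the `inventory` mapping of a vApp replication to its current virtual machine replications are hypothetical and only mirror the behavior described above.

```python
from typing import Dict, Iterable, List, Set


def step_members(
    attached_vapps: Iterable[str],
    attached_vms: Iterable[str],
    inventory: Dict[str, List[str]],
) -> Set[str]:
    """Resolve the VM replications that a step would recover right now."""
    members = set(attached_vms)        # individually attached VMs stay fixed
    for vapp in attached_vapps:        # vApps resolve to their current VMs
        members.update(inventory.get(vapp, []))
    return members


inventory = {"vapp-web": ["vm-1", "vm-2"]}
by_vapp = step_members(["vapp-web"], [], inventory)
by_vms = step_members([], ["vm-1", "vm-2"], inventory)

inventory["vapp-web"].append("vm-3")   # a VM replication joins the vApp
after_vapp = step_members(["vapp-web"], [], inventory)
after_vms = step_members([], ["vm-1", "vm-2"], inventory)
```

Both steps start with the same two members. After the vApp replication gains `vm-3`, only the step with the attached vApp includes it, while the step with individually attached virtual machine replications is unchanged.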
Selecting an already attached replication in a step allows the following actions for the replication.
Detach
Prompts a confirmation for removing the selected replication from the current step.
Move to step
Prompts for selecting a destination step for the selected replication. Moving is inactive when the plan contains only one step.
Dragging and dropping a step in a recovery plan rearranges the order of the plan steps.