Continuous Virtual Machine Clone

Clone is one of the long-running virtual machine operations, so it can interfere with, or be interrupted by, a server reboot. As of vSphere 8.0 U3, continuous VM clone restarts an interrupted clone automatically, which streamlines maintenance mode operations.

Continuous virtual machine clone ensures that:

The VM operation restarts automatically, without customer intervention, after vCenter restarts.
From a user perspective, the restarted operation is the same as the one initially requested. Customers can use existing scripts and interfaces to manage the operation.
Incoming operations are not affected while the system is recovering an interrupted operation.

Going forward, continuous virtual machine clone is the default, but it can be deactivated with a vpxd configuration option. For example, if instructed by technical support personnel, administrators can set config.provisioning.continuousop.enabled to false in the vpxd.cfg configuration file. This can be done with the vSphere Client; see "Configure Advanced Settings" in the VMware documentation.
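As a rough illustration of what the setting looks like on disk, the Python sketch below expands the dotted key into nested XML elements. It assumes that vpxd.cfg is an XML file rooted at a config element and that each dot-separated segment of the key names one nested element. The helper set_dotted_option and the SAMPLE content are hypothetical stand-ins, not part of any VMware tool, and the supported way to change the option remains the vSphere Client procedure mentioned above.

```python
import xml.etree.ElementTree as ET


def set_dotted_option(root, dotted_key, value):
    """Set a dotted option key on an XML tree (hypothetical helper).

    The first key segment must match the root tag; every further segment
    is looked up, or created if missing, as a nested child element.
    """
    parts = dotted_key.split(".")
    if parts[0] != root.tag:
        raise ValueError(f"key must start with '{root.tag}': {dotted_key}")
    node = root
    for name in parts[1:]:
        child = node.find(name)
        if child is None:
            child = ET.SubElement(node, name)
        node = child
    node.text = value
    return node


# Hypothetical, stripped-down stand-in for the contents of vpxd.cfg.
SAMPLE = "<config><provisioning/></config>"

root = ET.fromstring(SAMPLE)
set_dotted_option(root, "config.provisioning.continuousop.enabled", "false")

# Serialize the modified tree so the nesting can be inspected.
result = ET.tostring(root, encoding="unicode")
print(result)
```

The sketch only manipulates an in-memory string. It does not read or write a real vpxd.cfg, and it makes no claim about what else that file contains.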

Here are the steps for continuous operation, grouped by when the interruption occurs. Pre-workflow (1-3): the operation was interrupted before the workflow was executed. Mid-workflow (4-5): the operation was interrupted in the middle of workflow execution. Post-workflow (6): the operation was interrupted after the workflow finished but before the task was updated.

1. Session preservator saves the user session. Task management associates the task with the returned UUID corresponding to the preserved session.
2. Session preservator acquires a persistent token (or an equivalent) to preserve the identity.
3. The workflow is journalled in VPX_JOURNAL and associated tables. The first action in the workflow is RedoableAction, which is responsible for restarting the operation during recovery. Transaction manager loads the workflow journals from the database. For each pending continuous operation, it flags the corresponding managed entities as in recovery so that the entities do not participate in conflicting operations while recovery is in progress.
4. Restart begins. Recovery of the long-running operation revives the persisted session. The session becomes the current user session for later privilege checks. Recovery cleans up the previous operation that was interrupted, publishing any cleanup issues encountered in events or tasks.
5. RedoableAction checks whether the operation should be restarted. If yes, it continues with the rest of recovery; otherwise it sets the task error status. The conditions under which an operation is restarted are an implementation detail. For example, the operation might not restart if the maximum number of retries was exceeded.
6. If the interruption occurs right before the task status is updated, the situation is handled by performing the workflow commit and the task update in a single transaction.

As implemented, the managed object RedoableAction extends UndoableAction. In the vSphere Web Services API Reference, see the data object TaskInfo for details about continuous operations and VM_CLONE_CONTINUOUS.