You can use vMotion to perform a live migration of NVIDIA vGPU-powered virtual machines without causing downtime or data loss.

In vSphere 6.7 Update 1 and later, vGPU vMotion is supported for vGPU profiles with up to 12 GB of frame buffer. The 12 GB limit applies to a single vGPU device attached to the VM, regardless of the GPU model or vGPU profile. Migrating a VM whose vGPU frame buffer exceeds this limit might exceed the 100-second limit on vSphere vMotion stun time, in which case the migration fails with a timeout.

While the migration is in progress, you cannot access the VM, desktop, or application. After the migration completes, access to the VM resumes and all applications continue from their previous state. If the migration fails, the VM remains on the source host. To preserve the application and GPU state when moving a VM with a vGPU frame buffer larger than 12 GB, suspend the VM, cold migrate it, and resume it on a compatible destination host. For information on frame buffer sizes in vGPU profiles, refer to the NVIDIA Virtual GPU documentation.
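The choice between a live migration and the suspend, cold migrate, resume sequence can be sketched as a small helper. This is an illustrative sketch only: the function name `plan_migration` and the step labels are hypothetical and do not correspond to any vSphere API. Only the 12 GB threshold comes from the text above.

```python
from typing import List

# Limit stated in the text: vGPU vMotion is supported for up to 12 GB of frame buffer.
MAX_LIVE_VMOTION_FRAME_BUFFER_GB = 12


def plan_migration(frame_buffer_gb: float) -> List[str]:
    """Return the ordered steps for moving a vGPU VM (hypothetical helper).

    The step names are labels for illustration, not vSphere commands.
    """
    if frame_buffer_gb <= 0:
        raise ValueError("frame buffer size must be positive")
    if frame_buffer_gb <= MAX_LIVE_VMOTION_FRAME_BUFFER_GB:
        # Within the supported limit: a live vMotion is expected to work.
        return ["vmotion"]
    # Above the limit: preserve application and GPU state by suspending first.
    return ["suspend", "cold_migrate", "resume"]


if __name__ == "__main__":
    for size_gb in (8, 12, 24):
        print(size_gb, "GB ->", plan_migration(size_gb))
```

The helper encodes only the size rule. It does not check GPU model or host driver compatibility, which must also be satisfied before either path is attempted.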

The expected VM stun times (the time during which the VM is inaccessible to users during vMotion) are listed in the following table. These stun times were measured over a 10 Gb network with NVIDIA Tesla P40 GPUs:

Table 1. Expected Stun Times for vMotion of vGPU VMs
Used vGPU Frame Buffer (GB)    VM Stun Time (sec)
1                              9.0
2                              16.5
4                              31.4
8                              61.3
10                             76.3
12                             91.2
16                             100+ (vMotion timeout)
24                             100+ (vMotion timeout)
Note: The configured vGPU profile sets an upper bound on the used vGPU frame buffer. In many VDI and graphics use cases, the amount of vGPU frame buffer that the VM uses at any given time is below the amount assigned by the profile. Treat the times in the table as worst-case stun times, which apply when the entire assigned vGPU memory is in use at the time of the migration. For example, an M60-8A vGPU profile allocates 8 GB of vGPU frame buffer to the VM, but the VM might be using anywhere from 0 to 8 GB of it during the migration. The stun time can therefore range from less than 1 second to 61.3 seconds.
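For frame buffer usage that falls between the table rows, a rough estimate can be obtained by interpolating linearly between the measured points. The following is a minimal sketch under stated assumptions: linear interpolation (and extrapolation beyond the measured range) is an assumption made here for illustration, not a documented VMware formula, and the function names are hypothetical. The data points and the 100-second limit are taken from the table and text above, and apply to the tested setup (10 Gb network, Tesla P40 GPUs).

```python
from bisect import bisect_left

# (used frame buffer in GB, measured stun time in seconds), from Table 1.
MEASURED = [(1, 9.0), (2, 16.5), (4, 31.4), (8, 61.3), (10, 76.3), (12, 91.2)]

# vMotion stun time limit stated in the text.
STUN_TIMEOUT_SEC = 100.0


def estimate_stun_time(used_gb: float) -> float:
    """Estimate stun time by linear interpolation between table rows.

    Values outside the measured 1-12 GB range are extrapolated from the
    nearest segment, which is an assumption and not a measured result.
    """
    if used_gb < 0:
        raise ValueError("used frame buffer cannot be negative")
    sizes = [gb for gb, _ in MEASURED]
    # Index of the upper end of the segment containing used_gb,
    # clamped so that out-of-range inputs use the first or last segment.
    hi = min(max(bisect_left(sizes, used_gb), 1), len(sizes) - 1)
    (x0, y0), (x1, y1) = MEASURED[hi - 1], MEASURED[hi]
    return y0 + (y1 - y0) * (used_gb - x0) / (x1 - x0)


def fits_within_timeout(used_gb: float) -> bool:
    """Return True if the estimated stun time stays under the timeout."""
    return estimate_stun_time(used_gb) < STUN_TIMEOUT_SEC


if __name__ == "__main__":
    for gb in (6, 12, 16):
        print(f"{gb} GB: ~{estimate_stun_time(gb):.1f} s, "
              f"within timeout: {fits_within_timeout(gb)}")
```

With these data points, a VM using 6 GB of frame buffer lands roughly halfway between the 4 GB and 8 GB rows, at about 46 seconds. Estimates for other networks or GPU models would need their own measurements.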

VMware vSphere vMotion is supported only between compatible NVIDIA GPU device models and NVIDIA GRID host driver versions, as defined and supported by NVIDIA. For compatibility information, refer to the NVIDIA Virtual GPU User Guide.

To check compatibility between NVIDIA vGPU host drivers, vSphere, and Horizon, refer to the VMware Compatibility Matrix.