GPU Device in PCI Passthrough

PCI passthrough for GPU cards in VMware ESXi allows a virtual machine to access a physical GPU card directly, bypassing the hypervisor layer. The virtual machine can then fully utilize the capabilities of the GPU, which makes this mode suitable for compute-intensive tasks like machine learning, high-performance computing, and computer vision. This section walks through setting up PCI passthrough for GPU cards in VMware ESXi.

VMDirectPath I/O, also called PCI Passthrough, allows the guest operating system on a virtual machine to access the GPU device directly. This method is the simplest way to consume a GPU, but each GPU card can be dedicated to only one virtual machine. A single virtual machine can, however, make use of multiple GPUs in passthrough mode.

The deployment details of a virtual machine leveraging GPU in PCI Passthrough are available in this blog post. They can be summarized into the following steps:

Prerequisites

vCenter access with read and write privileges for PCI Devices.

Procedure

Configure the vSphere host BIOS settings to support a GPU that requires 16GB or more of memory-mapped I/O (MMIO).
Enable PCI Passthrough for the GPU device. Go to “Configure” -> “Hardware” -> “PCI Devices” and click on “Toggle Passthrough” if the GPU has not been previously enabled for DirectPath I/O.

Figure 2. Enable PCI Passthrough
Reboot the host.
Create the virtual machine that will be using the GPU and configure EFI Boot Mode under Boot Options. The default boot option is BIOS.

Figure 3. Configure EFI boot mode
For GPU models requiring MMIO space greater than 16GB in size, add and configure the following two parameters in the virtual machine’s Advanced configuration parameters. The value for “pciPassthru.64bitMMIOSizeGB” depends on the GPU model deployed; the MMIO space required for each model is listed in the table in these release notes. For example, the NVIDIA A30 GPU requires 64GB of MMIO space.
- pciPassthru.use64bitMMIO = "TRUE"
- pciPassthru.64bitMMIOSizeGB = "64"
Figure 4. Configure MMIO memory parameters
Install a UEFI-capable operating system on the virtual machine or deploy a virtual machine template that can boot in UEFI mode.
Assign GPU to the virtual machine by adding a new PCI Device with the “Dynamic DirectPath I/O” option for HA and DRS support.

Figure 5. Assign GPU to the virtual machine
Configure memory reservation by checking the “Reserve all guest memory” option under virtual machine settings.

Figure 6. Virtual machine memory reservation
Power on the virtual machine and deploy applications using the GPU in Passthrough mode.
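
The two MMIO parameters from the procedure can also be generated programmatically, for example when preparing settings for several virtual machines. The following is a minimal Python sketch, not an official tool: the function name mmio_vmx_lines is hypothetical, and the MMIO size passed in is assumed to have been looked up in the release notes table for the GPU model in use.

```python
def mmio_vmx_lines(mmio_size_gb: int) -> list[str]:
    """Render the two advanced configuration parameters for 64-bit MMIO.

    mmio_size_gb is the MMIO space the GPU model requires, in GB.
    This helper (a hypothetical name) only formats the values; it does
    not look up or validate the size for any particular GPU.
    """
    if mmio_size_gb <= 0:
        raise ValueError("MMIO size must be a positive number of GB")
    return [
        'pciPassthru.use64bitMMIO = "TRUE"',
        f'pciPassthru.64bitMMIOSizeGB = "{mmio_size_gb}"',
    ]


if __name__ == "__main__":
    # 64 GB is the value the procedure gives for the NVIDIA A30.
    for line in mmio_vmx_lines(64):
        print(line)
```

The output lines match the key and value pairs entered in the Advanced configuration parameters dialog.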

Results

The GPU card is now assigned in PCI Passthrough to a specific virtual machine, which gains exclusive access to the card.

Note:

To deploy a GPU-enabled workload cluster in Tanzu Kubernetes Grid (TKG) 1.6, the GPU must be configured in PCI Passthrough. When using PCI passthrough mode, each GPU device is dedicated to a single worker node virtual machine in the workload cluster. GPU in PCI Passthrough is the only mode supported in TKG version 1.6. You can follow this documentation for step-by-step deployment.

What to do next

Install the GPU driver in the guest operating system. With this setup, the virtual machine uses its own GPU driver and can access the full performance capabilities of the GPU.
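
Before installing the driver, it can be useful to confirm that the guest operating system sees the passthrough device. The sketch below is one possible check, assuming a Linux guest with the lspci utility (from the pciutils package) installed; the function names are hypothetical. It filters lspci output for the PCI class names under which GPUs typically appear.

```python
import subprocess

# PCI class names under which lspci typically lists GPUs.
# Data center GPUs without display outputs usually show up as "3D controller".
GPU_CLASSES = ("VGA compatible controller", "3D controller")


def find_gpu_lines(lspci_output: str) -> list[str]:
    """Return the lines of lspci output that describe GPU-class devices."""
    return [
        line
        for line in lspci_output.splitlines()
        if any(pci_class in line for pci_class in GPU_CLASSES)
    ]


def list_guest_gpus() -> list[str]:
    """Run lspci inside the guest and return its GPU-class lines.

    Assumes lspci is on the PATH; raises FileNotFoundError otherwise.
    """
    result = subprocess.run(
        ["lspci"], capture_output=True, text=True, check=True
    )
    return find_gpu_lines(result.stdout)
```

Calling list_guest_gpus() inside the virtual machine returns one line per matching device. Note that the virtual display adapter of the virtual machine may also be listed as a VGA compatible controller, so the passthrough GPU should be identified by its vendor name in the returned line.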