In these VM classes, as a cloud administrator, you set the compute requirements and a vGPU profile for an NVIDIA GRID vGPU device according to the vGPU devices configured on the ESXi hosts in the Supervisor cluster.

Note: This documentation is based on VMware Cloud Foundation 5.2.1. For information on the VMware Private AI Foundation with NVIDIA functionality in VMware Cloud Foundation 5.2, see VMware Private AI Foundation with NVIDIA Guide for VMware Cloud Foundation 5.2.

Prerequisites

Procedure

  1. For a VMware Cloud Foundation 5.2.1 instance, log in to the vCenter Server instance for the management domain at https://<vcenter_server_fqdn>/ui.
  2. In the vSphere Client side panel, click Private AI Foundation.
  3. In the Private AI Foundation workflow, click the Set Up a Workload Domain section.
  4. Create the VM classes with NVIDIA vGPUs.

    The wizard in the guided deployment workflow has the same options as the analogous wizard in the Workload Management area of the vSphere Client.

    Set the following additional settings in the VM class dialog box according to the contents of the deep learning VM.

    Use Case: Deep learning VMs with NVIDIA RAG workloads

    VM Class Additional Settings:

    • Select the full-sized vGPU profile for time-slicing mode or a MIG profile. For example, for an NVIDIA A100 40 GB card in vGPU time-slicing mode, select nvidia_a100-40c.
    • On the Virtual Hardware tab, allocate more than 16 virtual CPU cores and 64 GB of virtual memory.
    • On the Advanced Parameters tab, set the pciPassthru<vgpu-id>.cfg.enable_uvm parameter to 1.

      where <vgpu-id> identifies the vGPU assigned to the virtual machine. For example, if two vGPUs are assigned to the virtual machine, you set pciPassthru0.cfg.enable_uvm = 1 and pciPassthru1.cfg.enable_uvm = 1.

      Important: This configuration turns off vSphere vMotion migration for the deep learning VM.

    Use Case: Deep learning VMs using Triton Inference Server with the TensorRT backend

    VM Class Additional Settings:

    • On the Advanced Parameters tab, set the pciPassthru<vgpu-id>.cfg.enable_uvm parameter to 1.

      where <vgpu-id> identifies the vGPU assigned to the virtual machine. For example, if two vGPUs are assigned to the virtual machine, you set pciPassthru0.cfg.enable_uvm = 1 and pciPassthru1.cfg.enable_uvm = 1.

      Important: This configuration turns off vSphere vMotion migration for the deep learning VM.
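The per-vGPU naming of the advanced parameter can be illustrated with a short Python sketch. It is not part of the product or of any VMware API. The helper name uvm_advanced_parameters is hypothetical, and the sketch assumes that the vGPU IDs are numbered consecutively from 0, as in the two-vGPU example above. It only produces the key and value pairs that you would then enter manually on the Advanced Parameters tab.

```python
from typing import Dict


def uvm_advanced_parameters(vgpu_count: int) -> Dict[str, str]:
    """Return one enable_uvm entry per vGPU assigned to the VM class.

    Hypothetical helper: assumes vGPU IDs start at 0 and are consecutive.
    """
    if vgpu_count < 1:
        raise ValueError("vgpu_count must be at least 1")
    # Build pciPassthru<vgpu-id>.cfg.enable_uvm = 1 for each vGPU ID.
    return {
        f"pciPassthru{vgpu_id}.cfg.enable_uvm": "1"
        for vgpu_id in range(vgpu_count)
    }


if __name__ == "__main__":
    # Print the attribute/value pairs for a VM class with two vGPUs.
    for attribute, value in uvm_advanced_parameters(2).items():
        print(f"{attribute} = {value}")
```

For a VM class with two vGPUs, the sketch lists pciPassthru0.cfg.enable_uvm and pciPassthru1.cfg.enable_uvm, each with the value 1.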