You can deploy artificial intelligence (AI) and machine learning (ML) workloads on clusters provisioned by Tanzu Kubernetes Grid. Deploying these workloads requires some initial setup by service providers, and some configuration by organization administrators and tenant users during the cluster creation workflow.
To prepare the VMware Cloud Director environment to provision clusters that can handle AI and ML workloads, service providers must create a vGPU policy and add that policy to an organization VDC. For instructions on how to perform these tasks, refer to Creating and Managing vGPU Policies. After service providers complete these steps, tenant users can deploy AI and ML workloads to their Tanzu Kubernetes Grid clusters.
To create Tanzu Kubernetes Grid clusters with vGPU functionality, see Create a Tanzu Kubernetes Grid Cluster. If you are using a Tanzu Kubernetes Grid version 2.1 or later that is interoperable with VMware Cloud Director Container Service Extension, the following sections do not apply, and you can proceed to the cluster creation workflow.
BIOS Firmware Limitations
VMware Cloud Director Container Service Extension Tanzu Kubernetes Grid templates are built with BIOS firmware, and you cannot change this firmware configuration. BAR1 memory on this firmware cannot exceed 256 MB, so NVIDIA Grid cards with more than 256 MB of BAR1 memory require EFI firmware. For more information on firmware limitations, refer to VMware vSphere: NVIDIA Virtual GPU Software Documentation.
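The limit above reduces to a single comparison. The following minimal sketch illustrates it. The function name and the sample BAR1 sizes are hypothetical and do not come from any VMware or NVIDIA tool. Look up the actual BAR1 size of your card in the NVIDIA documentation.

```python
# Limit stated above for templates built with BIOS firmware.
BIOS_BAR1_LIMIT_MB = 256


def requires_efi_firmware(bar1_memory_mb: int) -> bool:
    """Return True if a card's BAR1 memory exceeds the BIOS firmware limit.

    Hypothetical helper for illustration only.
    """
    if bar1_memory_mb <= 0:
        raise ValueError("BAR1 memory size must be a positive number of MB")
    return bar1_memory_mb > BIOS_BAR1_LIMIT_MB


if __name__ == "__main__":
    # Illustrative sizes only, not taken from any specific card.
    for size_mb in (128, 256, 512):
        firmware = "EFI" if requires_efi_firmware(size_mb) else "BIOS"
        print(f"BAR1 {size_mb} MB -> {firmware} firmware is sufficient")
```

A card at exactly 256 MB stays within the limit. Only a larger BAR1 size calls for the custom EFI image described in the next section.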
Create a Custom Image with EFI Firmware
To overcome the BIOS firmware limitations of Tanzu Kubernetes Grid templates, you can create a custom image with EFI firmware in vSphere. For instructions, refer to the Linux Custom Machine Images sections in the archived Tanzu Kubernetes Grid 1.6 documentation. To access the archived documentation, see VMware Tanzu Kubernetes Grid Documentation > Unsupported Releases.
Inputs | Description |
---|---|
customizations.json | To build an image for a vGPU-enabled cluster for vSphere, create a file named customizations.json and add the following: { "vmx_version": "17" } |
metadata.json | VERSION must exactly match an established version of a Tanzu Kubernetes Grid template, because the Kubernetes Container Clusters UI plug-in does not recognize the OVA file if the version number differs from that of the template. The following example outlines the recommended file naming convention: |
build-node-ova-vsphere-ubuntu-2004-efi | Use this command to run the image builder for vGPU-enabled clusters. This command builds the custom image with EFI firmware. |
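As a minimal sketch of the first two inputs, the following Python snippet generates the customizations.json content shown in the table and checks a VERSION value against a template version. The helper names are hypothetical, and the snippet assumes that VERSION is a top-level key in metadata.json. Confirm the actual file layout in the archived Tanzu Kubernetes Grid documentation before relying on it.

```python
import json
from pathlib import Path


def render_customizations(vmx_version: str = "17") -> str:
    """Return the customizations.json content from the table above."""
    return json.dumps({"vmx_version": vmx_version}, indent=2) + "\n"


def write_customizations(directory: str, vmx_version: str = "17") -> Path:
    """Write customizations.json into the given directory and return its path."""
    path = Path(directory) / "customizations.json"
    path.write_text(render_customizations(vmx_version), encoding="utf-8")
    return path


def version_matches(metadata: dict, template_version: str) -> bool:
    """Check that VERSION equals the template version exactly.

    Assumption: VERSION is a top-level key in metadata.json.
    """
    return metadata.get("VERSION") == template_version
```

The comparison is a strict string equality on purpose, because the table states that the UI plug-in does not recognize the OVA file if the version number differs from that of the template.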