This page explains how to create TKGI clusters on vSphere that run NVIDIA GPU worker nodes. Applications hosted on the GPU clusters access GPU functionality via Compute Unified Device Architecture (CUDA).
VMware ESXi hosts let VMs directly access plugged-in GPU devices via PCI passthrough as described in GPU Device in PCI Passthrough in the VMware Edge documentation.
To create a CUDA-enabled GPU cluster with TKGI on vSphere, you:
Prepare the GPU hardware by enabling PCI passthrough on your ESXi hosts and recording the vendor_id and device_id values that identify the GPUs.
Configure the cluster's GPU worker nodes by listing those IDs in a VM extension's pci_passthroughs.
Create the cluster and install the NVIDIA GPU Operator.
To prepare GPU hardware for supporting TKGI clusters with CUDA:
Plug the GPU cards into your ESXi hosts.
Enable PCI passthrough and record the GPU IDs:
Follow GPU Device in PCI Passthrough in the VMware Edge documentation to enable passthrough for each GPU device that you want to use in a GPU cluster.
For each target GPU, record the vendor ID and device ID values shown in the vSphere Client.
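As an alternative to the vSphere Client, you can read the same IDs from the ESXi host's shell. This is only a sketch: it assumes SSH access to the host is enabled and that the busybox grep on your ESXi build supports context flags; adjust the pattern for your card's name.
# List PCI devices on the ESXi host; the matching entry includes the GPU's
# vendor ID and device ID fields, which are the values to record for pci_passthroughs.
esxcli hardware pci list | grep -i -B 2 -A 12 nvidia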
You configure a Kubernetes cluster to have GPU-based workers by defining an instance group with VM extension properties pci_passthroughs.vendor_id and pci_passthroughs.device_id set to your GPU's vendor and device ID values. See Using BOSH VM Extensions for how to create the VM extension.
The instance group's name value must start with worker- to specify that it applies to worker nodes.
You can define the instance groups using either YAML or JSON format. The formats differ in how you set the ID values:
YAML: hexadecimal, for example 0x10de; prepend 0x to the vSphere client listing.
JSON: decimal, for example 4318; convert from the vSphere client listing.
For example:
YAML:
---
instance_groups:
- name: master
  vm_extension:
    vmx_options:
      disk.enableUUID: '1'
- name: worker-gpu-pool
  vm_extension:
    cpu: 8
    ram: 16384
    pci_passthroughs:
    - vendor_id: 0x10de
      device_id: 0x1db6
    vmx_options:
      disk.enableUUID: '1'
      pciPassthru.use64bitMMIO: 'TRUE'
      pciPassthru.64bitMMIOSizeGB: 128
JSON:
{
"instance_groups": [
{
"name": "master",
"vm_extension": {
"vmx_options": {
"disk.enableUUID": "1"
}
}
},
{
"name": "worker-gpu-pool",
"vm_extension": {
"cpu": 8,
"ram": 16384,
"pci_passthroughs": [
{
"vendor_id": 4318,
"device_id": 7606
}
],
"vmx_options": {
"disk.enableUUID": "1",
"pciPassthru.use64bitMMIO": "TRUE",
"pciPassthru.64bitMMIOSizeGB": 128
}
}
}
]
}
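Both examples describe the same GPU: 0x10de and 0x1db6 in the YAML are the hexadecimal forms of 4318 and 7606 in the JSON. If you need to convert between the vSphere Client's hexadecimal listing and the decimal values used in JSON, one quick way, assuming a bash shell on your workstation:
# Hexadecimal (YAML form) to decimal (JSON form), and back, using bash's printf:
printf '%d\n' 0x10de    # prints 4318
printf '%d\n' 0x1db6    # prints 7606
printf '0x%x\n' 4318    # prints 0x10de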
Configure the pci_passthroughs and vmx_options sections as described below.
pci_passthroughs
To support the GPU worker nodes, you need a sufficient number of GPUs:
Total GPUs needed = Number of GPUs in the vm_extension * Number of workers in the GPU node pool
For example, if you have two GPUs on every ESXi host that is hosting GPU workers, you can set pci_passthroughs to specify both of them, using the vendor and device ID for each:
pci_passthroughs:
- vendor_id: 0x10de
  device_id: 0x1db6
- vendor_id: 0x10de
  device_id: 0x1db6
The IDs are the same for identical GPU boards, but you need to list them by the correct count.
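For instance, listing two GPUs per worker as above with, say, three worker instances in the GPU node pool means the hosts backing that pool must provide 2 x 3 = 6 passthrough-enabled GPUs in total.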
vmx_options
The vmx_options section sets extra properties for the GPU worker, for example:
pciPassthru.use64bitMMIO: 'TRUE' - set this for GPUs that require 16GB or more of memory mapping.
pciPassthru.64bitMMIOSizeGB: 128 - set this option to the total amount of memory mapped I/O (MMIO) needed by your GPU cards, which is at minimum their combined framebuffer memory; the example above sets 64bitMMIOSizeGB to 128GB.
For more information about the 64bitMMIOSizeGB setting, see Calculating the value for 64bitMMIOSizeGB.
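If you want a rough starting point before working through that calculation, one common approach is to round the combined framebuffer memory of the passed-through GPUs up to the next power of two. The sketch below assumes that rule and a bash shell; the 96GB figure is only an illustrative placeholder, and the linked calculation remains the authoritative procedure.
# Illustrative only: round combined GPU framebuffer (GB) up to the next power of two.
# Replace 96 with the sum of framebuffer memory across all GPUs passed to one worker.
fb_total_gb=96
size=1
while [ "$size" -lt "$fb_total_gb" ]; do
  size=$((size * 2))
done
echo "pciPassthru.64bitMMIOSizeGB: $size"   # prints 128 for 96 GB of framebuffer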
To create a Kubernetes cluster with both GPU and non-GPU worker nodes, configure a compute profile and custom AZs that define separate node pools, one for each worker type, as described in Create a Compute Profile.
Without a compute profile, the cluster you create will only have GPU workers.
For example, to use a node pool gpu-pool in AZ gpu-az, create a compute profile spec gpu-compute-profile.json with:
{
  "name": "gpu-compute-profile",
  "description": "gpu-compute-profile",
  "parameters": {
    "azs": [
      {
        "name": "gpu-az",
        [...]
      }
    ],
    "cluster_customization": {
      "node_pools": [
        {
          "name": "normal-pool",
          "instances": 3,
          "max_worker_instances": 5
        },
        {
          "name": "gpu-pool",
          "az_names": ["gpu-az"],
          "instances": 3,
          "max_worker_instances": 5
        }
      ]
    }
  }
}
Where the node_pools.name value is the name of the worker instance group in the VM extension file, without the worker- prefix.
How you create the cluster depends on whether you defined a compute profile:
With compute profile:
Create the compute profile:
tkgi create-compute-profile ~/work/x/tkgi-gpu/src/tkgi/gpu-compute-profile.json
Create the cluster with the profile:
tkgi create-cluster my-gpu-cluster \
--external-hostname my-gpu-cluster.example.com \
--plan small \
--compute-profile <name of the compute profile defined above> \
--config-file <path to the vm_extension file saved above>
No compute profile:
Create the GPU-only cluster:
tkgi create-cluster my-gpu-cluster \
--external-hostname my-gpu-cluster.example.com \
--plan small \
--config-file <path to the vm_extension file saved above>
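After either command, you can watch provisioning and confirm the workers once the cluster is ready. A minimal check, assuming the cluster name my-gpu-cluster from the examples above:
# Monitor provisioning until "last action" reports succeeded
tkgi cluster my-gpu-cluster
# Fetch kubeconfig credentials for the new cluster, then list its nodes
tkgi get-credentials my-gpu-cluster
kubectl get nodes -o wide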
To enable GPU integration with the Kubernetes environment, NVIDIA provides a GPU Operator Helm chart for managing GPUs. This Kubernetes operator handles GPU driver lifecycle management, node labeling, container-toolkit installation, etc.
See Supported NVIDIA Data Center GPUs and Systems in the NVIDIA documentation to determine whether the GPU Operator supports your hardware and environment.
Note: Broadcom does not support NVIDIA software.
To install the GPU Operator in your TKGI GPU cluster, see Installing the NVIDIA GPU Operator in the NVIDIA documentation.
For Helm chart customization options, see Common Chart Customization Options.
In a typical installation, for example, you might run the following on the local workstation where you have kubectl installed:
Install Helm, if not already installed:
curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/master/scripts/get-helm-3 \
&& chmod 700 get_helm.sh \
&& ./get_helm.sh
Add the NVIDIA Helm repository:
helm repo add nvidia https://helm.ngc.nvidia.com/nvidia \
&& helm repo update
Install the GPU Operator:
helm install --wait --generate-name \
-n gpu-operator --create-namespace \
nvidia/gpu-operator \
--set driver.enabled=true \
--set toolkit.enabled=true \
--set toolkit.env[0].name=CONTAINERD_CONFIG \
--set toolkit.env[0].value=/var/vcap/jobs/containerd/config/config.toml \
--set toolkit.env[1].name=CONTAINERD_SOCKET \
--set toolkit.env[1].value=/var/vcap/sys/run/containerd/containerd.sock \
--set toolkit.env[2].name=CONTAINERD_RUNTIME_CLASS \
--set toolkit.env[2].value=nvidia \
--set toolkit.env[3].name=CONTAINERD_SET_AS_DEFAULT \
--set-string toolkit.env[3].value="true"
The values /var/vcap/jobs/containerd/config/config.toml and /var/vcap/sys/run/containerd/containerd.sock are specific to TKGI.
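Once the chart is installed, you can verify that the operator components are running and that GPU capacity is advertised on the worker nodes. A quick check; the namespace matches the helm install above, and the grep is just a convenience:
# All gpu-operator pods should reach Running or Completed
kubectl get pods -n gpu-operator
# Worker nodes with passthrough GPUs should advertise nvidia.com/gpu capacity
kubectl describe nodes | grep -i "nvidia.com/gpu"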
If the default GPU driver does not work or suit your needs, you can install a custom one as described in Running a Custom Driver Image in the NVIDIA documentation.
If you use a custom driver, you need to add the driver.repository and driver.version options when you install the gpu-operator.
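For example, a sketch of what that might look like, added to the helm install command shown earlier; YOUR-REGISTRY/driver and YOUR-DRIVER-VERSION are placeholders for your custom driver image repository and tag, not values from this documentation:
helm install --wait --generate-name \
  -n gpu-operator --create-namespace \
  nvidia/gpu-operator \
  --set driver.enabled=true \
  --set driver.repository=YOUR-REGISTRY/driver \
  --set driver.version=YOUR-DRIVER-VERSION \
  --set toolkit.enabled=true \
  --set toolkit.env[0].name=CONTAINERD_CONFIG \
  --set toolkit.env[0].value=/var/vcap/jobs/containerd/config/config.toml \
  --set toolkit.env[1].name=CONTAINERD_SOCKET \
  --set toolkit.env[1].value=/var/vcap/sys/run/containerd/containerd.sock \
  --set toolkit.env[2].name=CONTAINERD_RUNTIME_CLASS \
  --set toolkit.env[2].value=nvidia \
  --set toolkit.env[3].name=CONTAINERD_SET_AS_DEFAULT \
  --set-string toolkit.env[3].value="true"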