Refer to the example YAML provided here to provision a TanzuKubernetesCluster that uses the Ubuntu OS for its cluster nodes. Such a cluster can be used for vGPU workloads.
v1alpha3 Example: TKC with Ubuntu TKR
By default, the PhotonOS edition of the named TKR is used for TKG cluster nodes. If the referenced TKR supports the OSImage format and has an Ubuntu OS edition available, use the run.tanzu.vmware.com/resolve-os-image: os-name=ubuntu annotation to specify the Ubuntu OS edition of the TKR. For more information on the OSImage format, see TKr Operating System Image Format.
The Ubuntu TKR is required for AI/ML workloads. In this example, each worker node pool defines two separate volumes, one for the containerd runtime and one for the kubelet, each with a capacity of 70 GiB. Providing separate volumes of this size is recommended for container-based AI/ML workloads.
apiVersion: run.tanzu.vmware.com/v1alpha3
kind: TanzuKubernetesCluster
metadata:
  name: tkc-ubuntu-gpu
  namespace: tkg-cluster-ns
  annotations:
    run.tanzu.vmware.com/resolve-os-image: os-name=ubuntu
spec:
  topology:
    controlPlane:
      replicas: 3
      storageClass: tkg-storage-policy
      vmClass: guaranteed-large
      tkr:
        reference:
          name: v1.25.7---vmware.3-fips.1-tkg.1
    nodePools:
    - name: nodepool-a100-primary
      replicas: 3
      storageClass: tkg-storage-policy
      vmClass: vgpu-a100
      tkr:
        reference:
          name: v1.25.7---vmware.3-fips.1-tkg.1
      volumes:
      - name: containerd
        mountPath: /var/lib/containerd
        capacity:
          storage: 70Gi
      - name: kubelet
        mountPath: /var/lib/kubelet
        capacity:
          storage: 70Gi
    - name: nodepool-a100-secondary
      replicas: 3
      storageClass: tkg-storage-policy
      vmClass: vgpu-a100
      tkr:
        reference:
          name: v1.25.7---vmware.3-fips.1-tkg.1
      volumes:
      - name: containerd
        mountPath: /var/lib/containerd
        capacity:
          storage: 70Gi
      - name: kubelet
        mountPath: /var/lib/kubelet
        capacity:
          storage: 70Gi
  settings:
    storage:
      defaultClass: tkg-storage-policy
    network:
      cni:
        name: antrea
      services:
        cidrBlocks: ["198.51.100.0/12"]
      pods:
        cidrBlocks: ["192.0.2.0/16"]
      serviceDomain: cluster.local
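Before applying a manifest like this, it can help to sanity-check two things: how much capacity the additional node volumes request in total, and whether the services and pods CIDR blocks overlap. The following Python sketch is illustrative only and is not part of the product. It copies the node pool and network values from the example by hand, and the helper names (extra_volume_gib, cidrs_overlap) are hypothetical. It counts only the capacity requested in the volumes sections. Actual datastore consumption depends on the storage policy and on the base disks defined by the VM class.

```python
import ipaddress

# Values copied by hand from the example manifest; adjust them to match your cluster.
NODE_POOLS = [
    {"name": "nodepool-a100-primary", "replicas": 3,
     "volumes_gib": {"containerd": 70, "kubelet": 70}},
    {"name": "nodepool-a100-secondary", "replicas": 3,
     "volumes_gib": {"containerd": 70, "kubelet": 70}},
]
SERVICES_CIDR = "198.51.100.0/12"
PODS_CIDR = "192.0.2.0/16"


def extra_volume_gib(node_pools):
    """Sum the requested capacity of the additional volumes over all worker nodes (GiB)."""
    return sum(
        pool["replicas"] * sum(pool["volumes_gib"].values())
        for pool in node_pools
    )


def cidrs_overlap(cidr_a, cidr_b):
    """Return True if the two CIDR blocks share any addresses."""
    # strict=False accepts a block whose host bits are set and
    # normalizes it to the enclosing network.
    net_a = ipaddress.ip_network(cidr_a, strict=False)
    net_b = ipaddress.ip_network(cidr_b, strict=False)
    return net_a.overlaps(net_b)


if __name__ == "__main__":
    print("Additional volume capacity (GiB):", extra_volume_gib(NODE_POOLS))
    print("Services/pods CIDR overlap:", cidrs_overlap(SERVICES_CIDR, PODS_CIDR))
```

With the values from the example, the additional volumes request 840 GiB across the six worker nodes, and the two CIDR blocks do not overlap.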