This topic describes ways of configuring Tanzu Kubernetes Grid (TKG) workload clusters to use features that are specific to Microsoft Azure, and that are not entirely configurable in the cluster’s flat configuration file or Kubernetes-style object spec.
For information about how to configure workload clusters on Azure using configuration files and object specs, see Azure Cluster Configuration Files.
Important: Tanzu Kubernetes Grid v2.4.x is the last version of TKG that supports the creation of TKG workload clusters on Azure. The ability to create TKG workload clusters on Azure will be removed in the Tanzu Kubernetes Grid v2.5 release.
Going forward, VMware recommends that you use Tanzu Mission Control to create native Azure AKS clusters instead of creating new TKG workload clusters on Azure. For information about how to create native Azure AKS clusters with Tanzu Mission Control, see Managing the Lifecycle of Azure AKS Clusters in the Tanzu Mission Control documentation.
For more information, see Deprecation of TKG Management and Workload Clusters on AWS and Azure in the VMware Tanzu Kubernetes Grid v2.4 Release Notes.
By default, Azure management and workload clusters are public. But you can also configure them to be private, which means their API server uses an Azure internal load balancer (ILB) and is therefore only accessible from within the cluster’s own VNet or peered VNets.
To make an Azure cluster private, include the following in its configuration file:
Set AZURE_ENABLE_PRIVATE_CLUSTER to true.
(Optional) Set AZURE_FRONTEND_PRIVATE_IP to an internal address for the cluster’s load balancer, for example 10.0.0.100.
Set AZURE_VNET_NAME, AZURE_VNET_CIDR, AZURE_CONTROL_PLANE_SUBNET_NAME, AZURE_CONTROL_PLANE_SUBNET_CIDR, AZURE_NODE_SUBNET_NAME, and AZURE_NODE_SUBNET_CIDR to the VNet and subnets that you use for other Azure private clusters.
(Optional) Set AZURE_ENABLE_CONTROL_PLANE_OUTBOUND_LB and AZURE_ENABLE_NODE_OUTBOUND_LB to true if you require the control plane and worker nodes to be able to access the internet via an Azure internet connection.
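For example, a private-cluster excerpt of a configuration file might look like the following sketch; the VNet name, subnet names, CIDR ranges, and frontend IP are placeholder values that you replace with your own:
AZURE_ENABLE_PRIVATE_CLUSTER: "true"
AZURE_FRONTEND_PRIVATE_IP: 10.0.0.100
AZURE_VNET_NAME: my-private-vnet
AZURE_VNET_CIDR: 10.0.0.0/16
AZURE_CONTROL_PLANE_SUBNET_NAME: my-private-cp-subnet
AZURE_CONTROL_PLANE_SUBNET_CIDR: 10.0.0.0/24
AZURE_NODE_SUBNET_NAME: my-private-node-subnet
AZURE_NODE_SUBNET_CIDR: 10.0.1.0/24
AZURE_ENABLE_CONTROL_PLANE_OUTBOUND_LB: "true"
AZURE_ENABLE_NODE_OUTBOUND_LB: "true"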
By default, Azure Private Clusters create a Public IP address for each Kubernetes service of type Load Balancer. To configure the load balancer service to instead use a private IP address, add the following annotation to your deployment manifest:
---
metadata:
  annotations:
    service.beta.kubernetes.io/azure-load-balancer-internal: "true"
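For example, a complete Service of type LoadBalancer that receives a private IP address might look like the following sketch, where the service name, ports, and selector are placeholder values:
apiVersion: v1
kind: Service
metadata:
  name: my-internal-service
  annotations:
    service.beta.kubernetes.io/azure-load-balancer-internal: "true"
spec:
  type: LoadBalancer
  selector:
    app: my-app
  ports:
  - port: 80
    targetPort: 8080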
For more information, see API Server Endpoint in the Cluster API Provider Azure documentation.
Tanzu Kubernetes Grid can run workload clusters on multiple target platform accounts, for example to split cloud usage among different teams or apply different security profiles to production, staging, and development workloads.
To deploy workload clusters to an alternative Azure Service Principal account, different from the one used to deploy their management cluster, do the following:
Create the alternative Azure account. You use the details of this account to create an AzureClusterIdentity in a later step. For information about creating an Azure Service Principal Account, see How to: Use the portal to create an Azure AD application and service principal that can access resources in the Azure documentation.
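As a sketch, one way to create such a Service Principal is with the Azure CLI; the display name, role, and subscription scope below are example values, and your organization may require a different role assignment:
az ad sp create-for-rbac --name "tkg-workload-sp" --role Contributor --scopes "/subscriptions/MY-SUBSCRIPTION-ID"
The command output includes the appId (client ID), password (client secret), and tenant values that you use in the steps that follow.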
Set the context of kubectl to your management cluster:
kubectl config use-context MY-MGMT-CLUSTER@MY-MGMT-CLUSTER
Where MY-MGMT-CLUSTER is the name of your management cluster.
Create a secret.yaml file with the following contents:
apiVersion: v1
kind: Secret
metadata:
  name: SECRET-NAME
type: Opaque
data:
  clientSecret: CLIENT-SECRET
Where:
SECRET-NAME is the secret name for the client password.
CLIENT-SECRET is the client secret of your Service Principal Identity. The client secret must be base64-encoded.
Use the file to create the Secret object:
kubectl apply -f secret.yaml
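The clientSecret value in secret.yaml must be base64-encoded. As a sketch, you can either encode the raw client secret yourself or let kubectl create the Secret for you; the secret value shown is a placeholder:
# Encode the raw client secret for pasting into secret.yaml
echo -n 'MY-RAW-CLIENT-SECRET' | base64
# Alternatively, create the Secret directly; kubectl base64-encodes the value for you
kubectl create secret generic SECRET-NAME --from-literal=clientSecret='MY-RAW-CLIENT-SECRET'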
Create an identity.yaml file with the following contents:
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: AzureClusterIdentity
metadata:
  name: EXAMPLE-IDENTITY
  namespace: EXAMPLE-NAMESPACE
spec:
  type: ManualServicePrincipal
  tenantID: AZURE-TENANT-ID
  clientID: CLIENT-ID
  clientSecret: {"name":"SECRET-NAME","namespace":"default"}
  allowedNamespaces:
    list:
    - CLUSTER-NAMESPACE-1
    - CLUSTER-NAMESPACE-2
Where:
EXAMPLE-IDENTITY is the name to use for the AzureClusterIdentity.
EXAMPLE-NAMESPACE is the namespace for your AzureClusterIdentity.
AZURE-TENANT-ID is your Azure tenant ID.
CLIENT-ID is the client ID (also known as an AppID) for the Azure AD application.
SECRET-NAME is the secret name for the client password.
CLUSTER-NAMESPACE-1 and CLUSTER-NAMESPACE-2 are the Kubernetes namespaces that the clusters are allowed to use identities from. You specify these namespaces as an array.
Use the file to create the AzureClusterIdentity object:
kubectl apply -f identity.yaml
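To confirm that the identity exists, you can list AzureClusterIdentity objects in the namespace that you used, for example:
kubectl get azureclusteridentity -n EXAMPLE-NAMESPACE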
The management cluster can now deploy workload clusters to the alternative account by using the new AzureClusterIdentity object.
To create workload clusters that use the alternative Azure account, include the following variables in the cluster configuration file:
AZURE_IDENTITY_NAME: EXAMPLE-IDENTITY
AZURE_IDENTITY_NAMESPACE: EXAMPLE-NAMESPACE
Where:
EXAMPLE-IDENTITY is the name of your AzureClusterIdentity.
EXAMPLE-NAMESPACE is the namespace for your AzureClusterIdentity.
After you create the workload cluster, sign in to the Azure Portal using the alternative account, and you should see the cluster running.
There are two ways of deploying NVIDIA GPU-enabled workload clusters on Azure:
Deploy a workload cluster and configure it manually to use GPU-enabled VMs.
Install a ClusterResourceSet (CRS) on the management cluster to create one or more GPU-enabled workload clusters automatically.
The subsections below explain these two approaches, and how to test the GPU-enabled clusters.
To deploy a workload cluster and configure it manually to take advantage of NVIDIA GPU VMs available on Azure:
In the configuration file for the cluster, set AZURE_NODE_MACHINE_TYPE for worker nodes to a GPU-compatible VM type, such as Standard_NC4as_T4_v3.
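For example, the relevant line in the configuration file might read:
AZURE_NODE_MACHINE_TYPE: Standard_NC4as_T4_v3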
Deploy the cluster with the cluster configuration file:
tanzu cluster create MY-GPU-CLUSTER -f MY-GPU-CONFIG
Where MY-GPU-CLUSTER is a name that you give to the cluster and MY-GPU-CONFIG is the cluster configuration file.
Install a GPU cluster policy and GPU operator on the cluster:
Set the kubectl context to the cluster, if it is not already the current context.
Download the required NVIDIA GPU resources from the Cluster API Provider Azure repository, and save them to your current directory:
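For example, assuming the cluster policy and GPU operator files are named clusterpolicy-crd.yaml and gpu-operator-components.yaml as in the steps that follow, the download might look like the following sketch, where the URL is a placeholder for the actual raw-file path in the Cluster API Provider Azure repository:
wget https://CAPZ-REPO-RAW-URL/clusterpolicy-crd.yaml
wget https://CAPZ-REPO-RAW-URL/gpu-operator-components.yaml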
Apply the cluster policy:
kubectl apply -f clusterpolicy-crd.yaml
Apply the GPU operator:
kubectl apply -f gpu-operator-components.yaml
Run kubectl get pods -A. You should see listings for gpu-operator- pods in the default namespace, and nvidia- pods in the gpu-operator-resources namespace.
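As a quick check, you can filter the pod listing for the GPU-related pods, for example:
kubectl get pods -A | grep -E 'gpu-operator|nvidia'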
Note: This feature is in the unsupported Technical Preview state; see TKG Feature States.
You can configure the management cluster to create GPU-enabled workload clusters automatically whenever you add gpu: nvidia to the labels in the cluster manifest. To do this, you install a ClusterResourceSet (CRS) and activate it as follows:
To configure the management cluster to create GPU clusters:
Search the Broadcom Communities for GPU CRS for TKG and download the gpu-crs.yaml file for Tanzu Kubernetes Grid v1.4.
Set the context of kubectl to the context of your management cluster:
kubectl config use-context my-management-cluster-admin@my-management-cluster
Apply the CRS file to the management cluster, using the --server-side option to handle the large size of ConfigMap data:
kubectl apply -f gpu-crs.yaml --server-side
To create a GPU workload cluster:
In the configuration file for the cluster, set AZURE_NODE_MACHINE_TYPE for worker nodes to a GPU-compatible VM type, such as Standard_NC4as_T4_v3.
Use tanzu cluster create with the --dry-run option to generate a deployment manifest from the cluster configuration file:
tanzu cluster create MY-GPU-CLUSTER -f MY-GPU-CONFIG --dry-run > MY-GPU-CLUSTER-MANIFEST
Where MY-GPU-CLUSTER is a name that you give to the cluster and MY-GPU-CONFIG is the cluster configuration file.
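If the generated manifest does not already include it, add the gpu: nvidia label described above to the Cluster object so that the CRS is applied. A sketch of the relevant portion of the manifest, with a placeholder cluster name:
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: my-gpu-cluster
  labels:
    gpu: nvidia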
Create the cluster by passing the manifest to kubectl apply:
kubectl apply -f MY-GPU-CLUSTER-MANIFEST
Run kubectl get pods -A. You should see listings for gpu-operator- pods in the default namespace, and nvidia- pods in the gpu-operator-resources namespace.
To test a GPU-enabled cluster:
Test GPU processing by running the CUDA VectorAdd vector addition test in the NVIDIA documentation; a sample pod spec for this test is sketched after this procedure.
Test the GPU operator:
Scale up the workload cluster’s worker node count:
tanzu cluster scale MY-GPU-CLUSTER -w 2
Run kubectl get pods -A again. You should see additional gpu-operator- and nvidia- pods listed for the added nodes.
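For the CUDA VectorAdd test referenced above, a minimal pod spec commonly used for this purpose looks like the following sketch; the image reference may differ from the one in the current NVIDIA documentation:
apiVersion: v1
kind: Pod
metadata:
  name: cuda-vector-add
spec:
  restartPolicy: OnFailure
  containers:
  - name: cuda-vector-add
    image: "k8s.gcr.io/cuda-vector-add:v0.1"
    resources:
      limits:
        nvidia.com/gpu: 1
If the GPU and operator are working, the pod completes and its logs report that the test passed.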