To enable multiple availability zones for Tanzu Kubernetes clusters on vSphere, two new custom resource definitions (CRDs) have been introduced in Cluster API Provider vSphere (CAPV).

  • The VSphereFailureDomain CRD captures the region- and zone-specific tagging information and the topology definition, which includes the vSphere datacenter, cluster, host, and datastore information.
  • The VSphereDeploymentZone CRD captures the association of a VSphereFailureDomain with placement constraint information for the Kubernetes nodes.
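
Because these CRDs are part of CAPV, which runs in the management cluster, you can confirm that they are available before you create the custom resources. A minimal check, assuming kubectl is pointed at the management cluster context:

    $ kubectl get crds | grep -E 'vspherefailuredomains|vspheredeploymentzones'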

IMPORTANT: Support for multiple availability zones on vSphere is an experimental feature in Tanzu Kubernetes Grid 1.4.0, intended for testing and proof-of-concept purposes. VMware welcomes your feedback on this experimental feature, but its use is not supported in production environments in this release.

The configurations in this topic spread the Kubernetes control plane and worker nodes across vSphere objects, namely hosts, compute clusters, and datacenters.

Spread Nodes Across Multiple Compute Clusters in a Datacenter

The example in this section shows how to achieve multiple availability zones by spreading nodes across multiple compute clusters.

  1. Create the custom resources for defining the region and zones.

    To spread the Kubernetes nodes for a Tanzu Kubernetes cluster across multiple compute clusters within a datacenter, you must create custom resources. This example defines three deployment zones, named us-west-1a, us-west-1b, and us-west-1c, each of which is a compute cluster with its own network and storage parameters.

    ---
    apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
    kind: VSphereFailureDomain
    metadata:
     name: us-west-1a
    spec:
     region:
       name: us-west-1
       type: datacenter
       tagCategory: k8s-region
     zone:
       name: us-west-1a
       type: ComputeCluster
       tagCategory: k8s-zone
     topology:
       datacenter: dc0
       computeCluster: cluster1
       datastore: ds-c1
       networks:
       - net1
       - net2
    ---
    apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
    kind: VSphereFailureDomain
    metadata:
     name: us-west-1b
    spec:
     region:
       name: us-west-1
       type: datacenter
       tagCategory: k8s-region
     zone:
       name: us-west-1b
       type: ComputeCluster
       tagCategory: k8s-zone
     topology:
       datacenter: dc0
       computeCluster: cluster2
       datastore: ds-c2
       networks:
       - net3
    ---
    apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
    kind: VSphereFailureDomain
    metadata:
     name: us-west-1c
    spec:
     region:
       name: us-west-1
       type: datacenter
       tagCategory: k8s-region
     zone:
       name: us-west-1c
       type: ComputeCluster
       tagCategory: k8s-zone
     topology:
       datacenter: dc0
       computeCluster: cluster3
       datastore: ds-c3
       networks:
       - net4
    ---
    apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
    kind: VSphereDeploymentZone
    metadata:
     name: us-west-1a
    spec:
     server: vcenter.sddc-54-70-161-229.com
     failureDomain: us-west-1a
     placementConstraint:
       resourcePool: pool1
       folder: foo
    ---
    apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
    kind: VSphereDeploymentZone
    metadata:
     name: us-west-1b
    spec:
     server: vcenter.sddc-54-70-161-229.com
     failureDomain: us-west-1b
     placementConstraint:
       resourcePool: pool2
       folder: bar
    ---
    apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
    kind: VSphereDeploymentZone
    metadata:
     name: us-west-1c
    spec:
     server: vcenter.sddc-54-70-161-229.com
     failureDomain: us-west-1c
     placementConstraint:
       resourcePool: pool3
       folder: baz 
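
    After you define these custom resources, apply them to the management cluster. A minimal sketch, assuming the manifests above are saved in a file named vsphere-zones.yaml and kubectl is pointed at the management cluster context:

     $ kubectl apply -f vsphere-zones.yaml

     $ kubectl get vspherefailuredomains,vspheredeploymentzones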
    
  2. Tag the vSphere objects.

    From the first VSphereFailureDomain CR, named us-west-1a, use govc to apply the following tags to the datacenter dc0 and the compute cluster cluster1.

    $ govc tags.attach -c k8s-region us-west-1 /dc0
    
    $ govc tags.attach -c k8s-zone us-west-1a /dc0/host/cluster1
    

    Similarly, perform the following tagging operations for the other compute clusters.

    $ govc tags.attach -c k8s-zone us-west-1b /dc0/host/cluster2
    
    $ govc tags.attach -c k8s-zone us-west-1c /dc0/host/cluster3
    

    You can skip this step if spec.region.autoConfigure and spec.zone.autoConfigure are set to true when creating the VSphereFailureDomain CRs.
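
    The tags.attach commands assume that the k8s-region and k8s-zone tag categories, and the tags themselves, already exist in vCenter Server. If they do not, you can create them first with govc; a minimal sketch using the names from this example:

     $ govc tags.category.create k8s-region

     $ govc tags.category.create k8s-zone

     $ govc tags.create -c k8s-region us-west-1

     $ govc tags.create -c k8s-zone us-west-1a

     $ govc tags.create -c k8s-zone us-west-1b

     $ govc tags.create -c k8s-zone us-west-1c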

For the next steps to deploy the cluster, see Deploy a Workload Cluster with Nodes Spread Across the Availability Zones.

Spread Nodes Across Multiple Hosts in a Single Compute Cluster

The example in this section spreads workload cluster nodes across three host groups within a single compute cluster.

  1. In vCenter Server, create a host group, for example rack1, and a VM group, for example rack1-vm-group, for each failure domain.
  2. Add an affinity rule between each VM group and its host group so that the VMs in the VM group must run on the hosts in the corresponding host group. You can create the groups and the rule in the vSphere Client or with govc, as in the sketch below.
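
    A minimal govc sketch for rack1, assuming hypothetical host names esx-01 and esx-02; depending on your vCenter Server version, you might need to add at least one VM when creating the VM group:

     $ govc cluster.group.create -cluster cluster1 -name rack1 -host esx-01 esx-02

     $ govc cluster.group.create -cluster cluster1 -name rack1-vm-group -vm

     $ govc cluster.rule.create -cluster cluster1 -name rack1-vm-host-rule -enable -mandatory -vm-host -vm-group rack1-vm-group -host-affine-group rack1

    Repeat these commands for rack2 and rack3 with their respective hosts.
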
  3. Create the VSphereFailureDomain and VSphereDeploymentZone CRs.

    The following manifests define three zones, each of which is a rack of hosts within the same compute cluster.

    apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
    kind: VSphereFailureDomain
    metadata:
     name: rack1
    spec:
     region:
       name: room1
       type: ComputeCluster
       tagCategory: k8s-region
     zone:
       name: rack1
       type: HostGroup
       tagCategory: k8s-zone
     topology:
       datacenter: dc0
       computeCluster: cluster1
       hosts:
         vmGroupName: rack1-vm-group
         hostGroupName: rack1
       datastore: ds-r1
       networks:
       - net1
    ---
    apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
    kind: VSphereFailureDomain
    metadata:
     name: rack2
    spec:
     region:
       name: room1
       type: ComputeCluster
       tagCategory: k8s-region
     zone:
       name: rack2
       type: HostGroup
       tagCategory: k8s-zone
     topology:
       datacenter: dc0
       computeCluster: cluster1
       hosts:
         vmGroupName: rack2-vm-group
         hostGroupName: rack2
       datastore: ds-r2
       networks:
       - net2
    ---
    apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
    kind: VSphereFailureDomain
    metadata:
     name: rack3
    spec:
     region:
       name: room1
       type: ComputeCluster
       tagCategory: k8s-region
     zone:
       name: rack3
       type: HostGroup
       tagCategory: k8s-zone
     topology:
       datacenter: dc0
       computeCluster: cluster1
       hosts:
         vmGroupName: rack3-vm-group
         hostGroupName: rack3
       datastore: ds-c3
       networks:
       - net3
    ---
    apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
    kind: VSphereDeploymentZone
    metadata:
     name: rack1
    spec:
     server: vcenter.sddc-54-70-161-229.com
     failureDomain: rack1
     placementConstraint:
       resourcePool: pool1
       folder: foo
    ---
    apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
    kind: VSphereDeploymentZone
    metadata:
     name: rack2
    spec:
     server: vcenter.sddc-54-70-161-229.com
     failureDomain: rack2
     placementConstraint:
       resourcePool: pool2
       folder: bar
    ---
    apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
    kind: VSphereDeploymentZone
    metadata:
     name: rack3
    spec:
     server: vcenter.sddc-54-70-161-229.com
     failureDomain: rack3
     placementConstraint:
       resourcePool: pool3
       folder: baz
    
  4. Tag the cluster and host objects in vCenter Server.

    From the first VSphereFailureDomain CR, named rack1, use govc to apply the following tags to the compute cluster cluster1 and to the hosts in the HostGroup rack1.

     $ govc tags.attach -c k8s-region room1 /dc0/host/cluster1

     $ govc tags.attach -c k8s-zone rack1 <path-to-the-host>
    

    You must repeat the second command for every host in rack1.
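
    For example, if the hosts in rack1 were named esx-01, esx-02, and esx-03 (hypothetical names), you could tag them all in a loop:

     $ for host in esx-01 esx-02 esx-03; do govc tags.attach -c k8s-zone rack1 /dc0/host/cluster1/$host; done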

  5. Apply the zone tags to the hosts in the other host groups, repeating the command for every host in each group.

     $ govc tags.attach -c k8s-zone rack2 <path-to-the-host-in-rack2>

     $ govc tags.attach -c k8s-zone rack3 <path-to-the-host-in-rack3>
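
    To confirm which objects each tag is attached to, you can list them with govc; a sketch:

     $ govc tags.attached.ls room1

     $ govc tags.attached.ls rack1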
    

For the next steps to deploy the cluster, see Deploy a Workload Cluster with Nodes Spread Across the Availability Zones.

Deploy a Workload Cluster with Nodes Spread Across the Availability Zones

After you have performed the steps in Spread Nodes Across Multiple Compute Clusters in a Datacenter or Spread Nodes Across Multiple Hosts in a Single Compute Cluster, you can deploy a workload cluster with its nodes spread across multiple availability zones.

  1. To spread the worker nodes across the zones, update the VSphereMachineTemplate object to include the name of the failure domain in the spec definition.

    To do this, edit the ~/.config/tanzu/tkg/providers/infrastructure-vsphere/ytt/vsphere-overlay.yaml file to include the following:

    #! Please add any overlays specific to vSphere provider under this file.
    #@ load("@ytt:overlay", "overlay")
    #@ load("@ytt:data", "data")
    
    #@overlay/match by=overlay.subset({"kind":"MachineDeployment", "metadata":{"name": "{}-md-0".format(data.values.CLUSTER_NAME)}})
    ---
    spec:
     template:
       spec:
         #@overlay/match missing_ok=True
         failureDomain: <name-of-first-failure-domain>
    
    ---
    apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
    kind: VSphereMachineTemplate
    metadata:
     name: #@ "{}-md-1".format(data.values.CLUSTER_NAME)
    spec:
     template:
       spec:
         cloneMode:  #@ data.values.VSPHERE_CLONE_MODE
         datacenter: #@ data.values.VSPHERE_DATACENTER
         datastore: #@ data.values.VSPHERE_DATASTORE
         storagePolicyName: #@ data.values.VSPHERE_STORAGE_POLICY_ID
         diskGiB: #@ data.values.VSPHERE_WORKER_DISK_GIB
         folder: #@ data.values.VSPHERE_FOLDER
         memoryMiB: #@ data.values.VSPHERE_WORKER_MEM_MIB
         network:
           devices:
           #@ if data.values.TKG_IP_FAMILY == "ipv6":
           #@overlay/match by=overlay.index(0)
           #@overlay/replace
           - dhcp6: true
             networkName: #@ data.values.VSPHERE_NETWORK
           #@ else:
           #@overlay/match by=overlay.index(0)
           #@overlay/replace
           - dhcp4: true
             networkName: #@ data.values.VSPHERE_NETWORK
           #@ end
         numCPUs: #@ data.values.VSPHERE_WORKER_NUM_CPUS
         resourcePool: #@ data.values.VSPHERE_RESOURCE_POOL
         server: #@ data.values.VSPHERE_SERVER
         template: #@ data.values.VSPHERE_TEMPLATE
         failureDomain: <name-of-second-failure-domain>
    
    ---
    apiVersion: bootstrap.cluster.x-k8s.io/v1alpha3
    kind: KubeadmConfigTemplate
    metadata:
     name: #@ "{}-md-1".format(data.values.CLUSTER_NAME)
     namespace: '${ NAMESPACE }'
    spec:
     template:
       spec:
         useExperimentalRetryJoin: true
         joinConfiguration:
           nodeRegistration:
             criSocket: /var/run/containerd/containerd.sock
             kubeletExtraArgs:
               cloud-provider: external
               tls-cipher-suites: TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384
             name: '{{ ds.meta_data.hostname }}'
         preKubeadmCommands:
         - hostname "{{ ds.meta_data.hostname }}"
         - echo "::1         ipv6-localhost ipv6-loopback" >/etc/hosts
         - echo "127.0.0.1   localhost" >>/etc/hosts
         - echo "127.0.0.1   {{ ds.meta_data.hostname }}" >>/etc/hosts
         - echo "{{ ds.meta_data.hostname }}" >/etc/hostname
         files: []
         users:
         - name: capv
           sshAuthorizedKeys:
           - '${ VSPHERE_SSH_AUTHORIZED_KEY }'
           sudo: ALL=(ALL) NOPASSWD:ALL
    
    ---
    apiVersion: cluster.x-k8s.io/v1alpha3
    kind: MachineDeployment
    metadata:
     labels:
       cluster.x-k8s.io/cluster-name: #@ data.values.CLUSTER_NAME
     name: #@ "{}-md-1".format(data.values.CLUSTER_NAME)
    spec:
     clusterName: #@ data.values.CLUSTER_NAME
     replicas: #@ data.values.WORKER_MACHINE_COUNT
     selector:
       matchLabels:
         cluster.x-k8s.io/cluster-name: #@ data.values.CLUSTER_NAME
     template:
       metadata:
         labels:
           cluster.x-k8s.io/cluster-name: #@ data.values.CLUSTER_NAME
           node-pool: #@ "{}-worker-pool".format(data.values.CLUSTER_NAME)
       spec:
         bootstrap:
           configRef:
             apiVersion: bootstrap.cluster.x-k8s.io/v1alpha3
             kind: KubeadmConfigTemplate
             name: #@ "{}-md-1".format(data.values.CLUSTER_NAME)
         clusterName: #@ data.values.CLUSTER_NAME
         infrastructureRef:
           apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
           kind: VSphereMachineTemplate
           name: #@ "{}-md-1".format(data.values.CLUSTER_NAME)
         version: #@ data.values.KUBERNETES_VERSION
    
    ---
    apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
    kind: VSphereMachineTemplate
    metadata:
     name: #@ "{}-md-2".format(data.values.CLUSTER_NAME)
    spec:
     template:
       spec:
         cloneMode:  #@ data.values.VSPHERE_CLONE_MODE
         datacenter: #@ data.values.VSPHERE_DATACENTER
         datastore: #@ data.values.VSPHERE_DATASTORE
         storagePolicyName: #@ data.values.VSPHERE_STORAGE_POLICY_ID
         diskGiB: #@ data.values.VSPHERE_WORKER_DISK_GIB
         folder: #@ data.values.VSPHERE_FOLDER
         memoryMiB: #@ data.values.VSPHERE_WORKER_MEM_MIB
         network:
           devices:
           #@ if data.values.TKG_IP_FAMILY == "ipv6":
           #@overlay/match by=overlay.index(0)
           #@overlay/replace
           - dhcp6: true
             networkName: #@ data.values.VSPHERE_NETWORK
           #@ else:
           #@overlay/match by=overlay.index(0)
           #@overlay/replace
           - dhcp4: true
             networkName: #@ data.values.VSPHERE_NETWORK
           #@ end
         numCPUs: #@ data.values.VSPHERE_WORKER_NUM_CPUS
         resourcePool: #@ data.values.VSPHERE_RESOURCE_POOL
         server: #@ data.values.VSPHERE_SERVER
         template: #@ data.values.VSPHERE_TEMPLATE
         failureDomain: <name-of-third-failure-domain>
    
    ---
    apiVersion: bootstrap.cluster.x-k8s.io/v1alpha3
    kind: KubeadmConfigTemplate
    metadata:
     name: #@ "{}-md-2".format(data.values.CLUSTER_NAME)
     namespace: '${ NAMESPACE }'
    spec:
     template:
       spec:
         useExperimentalRetryJoin: true
         joinConfiguration:
           nodeRegistration:
             criSocket: /var/run/containerd/containerd.sock
             kubeletExtraArgs:
               cloud-provider: external
               tls-cipher-suites: TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384
             name: '{{ ds.meta_data.hostname }}'
         preKubeadmCommands:
         - hostname "{{ ds.meta_data.hostname }}"
         - echo "::1         ipv6-localhost ipv6-loopback" >/etc/hosts
         - echo "127.0.0.1   localhost" >>/etc/hosts
         - echo "127.0.0.1   {{ ds.meta_data.hostname }}" >>/etc/hosts
         - echo "{{ ds.meta_data.hostname }}" >/etc/hostname
         files: []
         users:
         - name: capv
           sshAuthorizedKeys:
           - '${ VSPHERE_SSH_AUTHORIZED_KEY }'
           sudo: ALL=(ALL) NOPASSWD:ALL
    
    ---
    apiVersion: cluster.x-k8s.io/v1alpha3
    kind: MachineDeployment
    metadata:
     labels:
       cluster.x-k8s.io/cluster-name: #@ data.values.CLUSTER_NAME
     name: #@ "{}-md-2".format(data.values.CLUSTER_NAME)
    spec:
     clusterName: #@ data.values.CLUSTER_NAME
     replicas: #@ data.values.WORKER_MACHINE_COUNT
     selector:
       matchLabels:
         cluster.x-k8s.io/cluster-name: #@ data.values.CLUSTER_NAME
     template:
       metadata:
         labels:
           cluster.x-k8s.io/cluster-name: #@ data.values.CLUSTER_NAME
           node-pool: #@ "{}-worker-pool".format(data.values.CLUSTER_NAME)
       spec:
         bootstrap:
           configRef:
             apiVersion: bootstrap.cluster.x-k8s.io/v1alpha3
             kind: KubeadmConfigTemplate
             name: #@ "{}-md-2".format(data.values.CLUSTER_NAME)
         clusterName: #@ data.values.CLUSTER_NAME
         infrastructureRef:
           apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
           kind: VSphereMachineTemplate
           name: #@ "{}-md-2".format(data.values.CLUSTER_NAME)
         version: #@ data.values.KUBERNETES_VERSION
    
    #@overlay/match by=overlay.subset({"kind":"KubeadmConfigTemplate"}), expects="1+"
    ---
    spec:
     template:
       spec:
         users:
         #@overlay/match by=overlay.index(0)
         #@overlay/replace
         - name: capv
           sshAuthorizedKeys:
           - #@ data.values.VSPHERE_SSH_AUTHORIZED_KEY
           sudo: ALL=(ALL) NOPASSWD:ALL
    

    This overlay adds the names of the VSphereFailureDomain objects defined in step 1 to the VSphereMachineTemplate objects that create the worker nodes. Because there are three failure domains, the example adds two extra sets of MachineDeployment, VSphereMachineTemplate, and KubeadmConfigTemplate objects. Each VSphereMachineTemplate object references the name of a failure domain, which controls the placement of its machines.

  2. For the {CLUSTER_NAME}-vsphere-cpi-addon secret object, update the values of the region and zone keys under vsphereCPI to point to the tag categories used in the VSphereFailureDomain objects:

    region => VSphereFailureDomain.spec.region.tagCategory
    
    zone => VSphereFailureDomain.spec.zone.tagCategory
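
    With the tag categories used in the examples in this topic, the relevant portion of the values.yaml data in the secret would look similar to the following sketch; confirm the exact structure against the secret in your environment:

     vsphereCPI:
       region: k8s-region
       zone: k8s-zone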
    
  3. Apply the YAML to create a workload cluster with its control plane and worker nodes spread across the three failure domains.
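
    For example, assuming the standard Tanzu CLI workflow, a hypothetical cluster name, and a cluster configuration file named my-cluster-config.yaml, you could create the cluster and then, after switching kubectl to the new cluster's context, confirm that the vSphere cloud provider has labeled the nodes with the expected topology labels:

     $ tanzu cluster create my-cluster --file my-cluster-config.yaml

     $ tanzu cluster kubeconfig get my-cluster --admin

     $ kubectl get nodes -L topology.kubernetes.io/region -L topology.kubernetes.io/zone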
