This topic describes how to deploy new Tanzu Kubernetes Grid (TKG) workload clusters that run in multiple availability zones (AZs), and how to change existing management and workload clusters to run in multiple or different AZs.
To use the installer interface to configure a new standalone management cluster that runs across multiple AZs, see Configure vSphere Resources in Deploy Management Clusters with the Installer Interface.
Note: This topic applies to TKG with a standalone management cluster. For running TKG with a Supervisor across availability zones, see Create vSphere Zones for a Multi-Zone Supervisor Deployment in the vSphere 8.0 Documentation.
In TKG on vSphere, you can optionally define regions and AZs in Kubernetes for hosting standalone management clusters and their workload clusters, and then tag them in vSphere to associate them with vSphere clusters, host groups, or datacenters.
This setup lets TKG support workload distribution and redundancy on vSphere similar to the way it works with regions and AZs on other infrastructures.
Without these constructs, you can place TKG clusters at the vSphere level by referencing vSphere objects directly, but then TKG and Kubernetes are unable to manage their placement.
To define AZs in TKG on vSphere, you create Kubernetes objects and associate them with vSphere objects depending on how the AZs are scoped:
Kubernetes Objects: To enable multiple availability zones for clusters on vSphere, Cluster API Provider vSphere (CAPV) uses two custom resource definitions (CRDs):

- The VSphereFailureDomain CRD captures the region/zone-specific tagging information and the topology definition, which includes the vSphere datacenter, cluster, host, and datastore information.
- The VSphereDeploymentZone CRD captures the association of a VSphereFailureDomain with placement constraint information for the Kubernetes node.

To create these objects, you define them in an object spec such as a vsphere-zones.yaml file, which you can then use to create the objects at different times, as follows:

- When you first deploy a management cluster, pass the file to the --az-file option of tanzu mc create.
  - Set SKIP_MULTI_AZ_VERIFY=true as an env variable to skip vSphere validation, as described in Validation Checks under Run the tanzu mc create Command, because those additional AZs might not yet have all vSphere configurations in place.
- After the management cluster is deployed, pass the file to the -f option of the tanzu mc az set or kubectl apply command.
- When you deploy a workload cluster, pass the file to the --az-file option of tanzu cluster create.

To list AZ objects that have already been created, run:
kubectl get VSphereFailureDomain,VSphereDeploymentZone -a
Kubernetes to vSphere Association: The VSphereFailureDomain and VSphereDeploymentZone object definitions define regions and AZs with the following settings:

- spec.region
- spec.zone

The vSphere tags k8s-region and k8s-zone associate regions and AZs in Kubernetes with their underlying objects in vSphere.
AZ Scope: You can scope AZs and regions at different levels in vSphere by associating them with vSphere objects as follows:
AZ Scope | Zone / AZ | Region | Multi-AZ Use |
---|---|---|---|
Cluster AZs | vSphere Cluster | vSphere Datacenter | Spread nodes across multiple clusters in a datacenter |
Host Group AZs | vSphere Host Group | vSphere Cluster | Spread nodes across multiple hosts in a single cluster |
The configurations in this topic spread the TKG cluster control plane and worker nodes across vSphere objects, namely vSphere datacenters, clusters, and hosts, based on how the objects are tagged in vSphere and referenced in the VSphereFailureDomain and VSphereDeploymentZone definitions in Kubernetes.

With VSphereFailureDomain and VSphereDeploymentZone objects defined for AZs in Kubernetes, you can configure how clusters use them via either configuration file variables or Cluster object properties.
Configuration Variables: To use cluster configuration variables, set VSPHERE_AZ_0, VSPHERE_AZ_1, VSPHERE_AZ_2, VSPHERE_AZ_CONTROL_PLANE_MATCHING_LABELS, VSPHERE_REGION, and VSPHERE_ZONE as described under vSphere in the Configuration File Variable Reference.

Object Properties: For Cluster object properties, the object spec configures AZs for its control plane and worker nodes in different ways, matching different properties of the VSphereDeploymentZone objects that define the AZs:
Node type | Property under spec.topology | To match VSphereDeploymentZone properties | Example |
---|---|---|---|
Control plane nodes | variables.controlPlaneZoneMatchingLabels | List of metadata.labels pairs | {"environment": "staging", "region": "room1"} |
Worker nodes | machineDeployments.MD-INDEX.failureDomain for each machine deployment | List of metadata.name values | [rack1,rack2,rack3] |
Because control plane nodes are assigned to AZs based on label-matching, you need to create a label distinguishing each combination of AZs that cluster control plane nodes might use.
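For example, a class-based Cluster object spec might set these properties as follows. This is a minimal sketch: the cluster name, class, and version values are placeholders for your environment, and the labels and failure domain names assume the host group AZ examples used later in this topic.

apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: my-cluster          # placeholder name
  namespace: default
spec:
  topology:
    class: tkg-vsphere-default-v1.0.0   # placeholder ClusterClass
    version: v1.26.5                    # placeholder Kubernetes version
    variables:
    # Control plane nodes can be placed in any AZ whose VSphereDeploymentZone labels match
    - name: controlPlaneZoneMatchingLabels
      value:
        environment: "staging"
        region: "room1"
    workers:
      machineDeployments:
      # Each machine deployment pins its workers to one AZ by VSphereDeploymentZone name
      - class: tkg-worker
        name: md-0
        replicas: 1
        failureDomain: rack1
      - class: tkg-worker
        name: md-1
        replicas: 1
        failureDomain: rack2
      - class: tkg-worker
        name: md-2
        replicas: 1
        failureDomain: rack3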
Prerequisites for deploying or changing TKG clusters to run in multiple or different AZs include:
You can deploy a workload cluster to run its control plane or worker nodes in multiple availability zones (AZs) in three basic steps, as described in the following sections:

1. Prepare Regions and AZs in vSphere
2. Create FailureDomain and DeploymentZone Objects in Kubernetes
3. Deploy the workload cluster across the AZs

Prepare Regions and AZs in vSphere

To prepare vSphere to support regions and AZs in TKG:
Identify or create the vSphere objects for the regions and AZs where your TKG cluster nodes will be hosted.
Host Group AZs: If you are using vSphere host groups as AZs, you need to create one host group and a corresponding VM group for each AZ that you plan to use:
Create host group and VM group objects in one of the following ways:
In vCenter, create a host group and a VM group from Configure > VM/Host Groups > Add…
With the govc CLI, run govc commands similar to the following. For example, to create a host group rack1 and a VM group rack1-vm-group:
govc cluster.group.create -cluster=RegionA01-MGMT -name=rack1 -host esx-01a.corp.tanzu esx-02a.corp.tanzu
govc cluster.group.create -cluster=RegionA01-MGMT -name=rack1-vm-group -vm
Add affinity rules between the created VM groups and Host groups so that the VMs in the VM group must run on the hosts in the created Host group:
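For example, with the govc CLI you might create a mandatory VM-to-host rule along the following lines. The rule name pod1-rack1 is illustrative, and the group names assume the rack1 groups created above; you can also create the rule in vCenter from Configure > VM/Host Rules > Add.

# Require VMs in rack1-vm-group to run on hosts in the rack1 host group
govc cluster.rule.create -cluster=RegionA01-MGMT -name=pod1-rack1 -enable -mandatory -vm-host -vm-group=rack1-vm-group -host-affine-group=rack1

Repeat a rule like this for each AZ, pairing each VM group with its host group.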
Tag the vSphere objects as follows, depending on whether you are configuring vSphere cluster AZs or host group AZs. These examples use the govc CLI, but you can also use the Tags & Custom Attributes pane in vCenter:
Cluster AZs:
For each AZ, use govc to create and attach a k8s-region category tag to the datacenter and a k8s-zone category tag to each vSphere cluster. For example, to tag datacenter dc0 as a region us-west-1 and its clusters cluster1 etc. as AZs us-west-1a etc.:
govc tags.category.create -t Datacenter k8s-region
govc tags.category.create -t ClusterComputeResource k8s-zone
govc tags.create -c k8s-region us-west-1
govc tags.create -c k8s-zone us-west-1a
govc tags.create -c k8s-zone us-west-1b
govc tags.create -c k8s-zone us-west-1c
govc tags.attach -c k8s-region us-west-1 /dc0
govc tags.attach -c k8s-zone us-west-1a /dc0/host/cluster1
govc tags.attach -c k8s-zone us-west-1b /dc0/host/cluster2
govc tags.attach -c k8s-zone us-west-1c /dc0/host/cluster3
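To confirm that the tags are attached as expected, you can list the objects attached to each tag; for example, assuming the us-west-1 tags created above:

govc tags.attached.ls us-west-1
govc tags.attached.ls us-west-1a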
Host Group AZs:
For each AZ, use govc to create and attach a k8s-region category tag to the vSphere cluster and a k8s-zone category tag to each host. For example, to tag cluster /dc1/host/room1-mgmt as a region room1 and the hosts in that group /dc1/host/room1-mgmt/esx-01a.corp.tanzu etc. as AZs rack1 etc.:
govc tags.category.create -t ClusterComputeResource k8s-region
govc tags.category.create -t HostSystem k8s-zone
govc tags.create -c k8s-region room1
govc tags.create -c k8s-zone rack1
govc tags.create -c k8s-zone rack2
govc tags.create -c k8s-zone rack3
govc tags.attach -c k8s-region room1 /dc1/host/room1-mgmt
govc tags.attach -c k8s-zone rack1 /dc1/host/room1-mgmt/esx-01a.corp.tanzu
govc tags.attach -c k8s-zone rack1 /dc1/host/room1-mgmt/esx-01b.corp.tanzu
govc tags.attach -c k8s-zone rack1 /dc1/host/room1-mgmt/esx-01c.corp.tanzu
Identify or create the vSphere ResourcePools and Folders to be used to place the VMs for each of the AZs. These examples use the govc CLI, but you can also do this in the Inventory panes in vCenter:

Cluster AZs:

For each AZ, use govc to create a resource pool object on each of the vSphere clusters matching the 3 AZs, and 3 VM folders:
govc pool.create /dc0/host/cluster1/pool1
govc pool.create /dc0/host/cluster2/pool2
govc pool.create /dc0/host/cluster3/pool3
govc folder.create /dc0/vm/folder1
govc folder.create /dc0/vm/folder2
govc folder.create /dc0/vm/folder3
Host Group AZs:
For each AZ, use govc to create 3 resource pool objects and 3 VM folders:
govc pool.create /dc1/host/cluster1/pool1
govc pool.create /dc1/host/cluster1/pool2
govc pool.create /dc1/host/cluster1/pool3
govc folder.create /dc1/vm/folder1
govc folder.create /dc1/vm/folder2
govc folder.create /dc1/vm/folder3
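As a quick sanity check, you can list the pools and folders you just created with govc; for example, for the host group layout above:

govc find /dc1 -type p
govc ls /dc1/vm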
Create FailureDomain and DeploymentZone Objects in Kubernetes

Before deploying a cluster to multiple availability zones, you need to define the Kubernetes objects FailureDomain and DeploymentZone for the region and zones, as described in Defining AZs above.
Every setting under spec.region, spec.zone, and spec.topology must match the object paths and tags configured in vCenter:

- In the VSphereDeploymentZone objects, the spec.failuredomain value must match one of the metadata.name values of the VSphereFailureDomain definitions.
- The spec.server value in the VSphereDeploymentZone objects must match the vCenter server address (IP or FQDN) entered for VCENTER SERVER in the installer interface IaaS Provider pane or the VSPHERE_SERVER setting in the management cluster configuration file.
- metadata.name values must be all lowercase.

Create FailureDomain and DeploymentZone object definitions as follows, depending on whether you are configuring vSphere cluster AZs or host group AZs.
Cluster AZs:
As an example of how to spread workload cluster nodes across multiple vSphere clusters within a datacenter, the following code defines the objects needed for three deployment zones named us-west-1a, us-west-1b, and us-west-1c, with each one being a vSphere cluster that has its own network and storage parameters:
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: VSphereFailureDomain
metadata:
name: us-west-1a
spec:
region:
name: us-west-1
type: Datacenter
tagCategory: k8s-region
zone:
name: us-west-1a
type: ComputeCluster
tagCategory: k8s-zone
topology:
datacenter: dc0
computeCluster: cluster1
datastore: ds-c1
networks:
- net1
- net2
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: VSphereFailureDomain
metadata:
name: us-west-1b
spec:
region:
name: us-west-1
type: Datacenter
tagCategory: k8s-region
zone:
name: us-west-1b
type: ComputeCluster
tagCategory: k8s-zone
topology:
datacenter: dc0
computeCluster: cluster2
datastore: ds-c2
networks:
- net3
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: VSphereFailureDomain
metadata:
name: us-west-1c
spec:
region:
name: us-west-1
type: Datacenter
tagCategory: k8s-region
zone:
name: us-west-1c
type: ComputeCluster
tagCategory: k8s-zone
topology:
datacenter: dc0
computeCluster: cluster3
datastore: ds-c3
networks:
- net4
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: VSphereDeploymentZone
metadata:
name: us-west-1a
labels:
environment: "staging"
region: "us-west-1"
spec:
server: VSPHERE_SERVER
failureDomain: us-west-1a
placementConstraint:
resourcePool: pool1
folder: folder1
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: VSphereDeploymentZone
metadata:
name: us-west-1b
labels:
environment: "staging"
region: "us-west-1"
spec:
server: VSPHERE_SERVER
failureDomain: us-west-1b
placementConstraint:
resourcePool: pool2
folder: folder2
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: VSphereDeploymentZone
metadata:
name: us-west-1c
labels:
environment: "staging"
region: "us-west-1"
spec:
server: VSPHERE_SERVER
failureDomain: us-west-1c
placementConstraint:
resourcePool: pool3
folder: folder3
Where VSPHERE_SERVER is the IP address or FQDN of your vCenter server.

If different vSphere clusters have identically named resource pools, set the VSphereDeploymentZone objects' spec.placementConstraint.resourcePool to a full resource path, not just the name.
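For example, the us-west-1a deployment zone above might use full paths like the following; the paths are illustrative and assume the dc0 inventory used in this example:

placementConstraint:
  resourcePool: /dc0/host/cluster1/Resources/pool1
  folder: /dc0/vm/folder1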
Host Group AZs:
As an example of how to spread workload cluster nodes across three host groups in a single vSphere cluster, the following code defines the objects needed for three AZs, rack1, rack2, and rack3, each of which represents a rack of hosts within the same vSphere cluster, defined as region room1:
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: VSphereFailureDomain
metadata:
name: rack1
spec:
region:
name: room1
type: ComputeCluster
tagCategory: k8s-region
zone:
name: rack1
type: HostGroup
tagCategory: k8s-zone
topology:
datacenter: dc0
computeCluster: cluster1
hosts:
vmGroupName: rack1-vm-group
hostGroupName: rack1
datastore: ds-r1
networks:
- net1
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: VSphereFailureDomain
metadata:
name: rack2
spec:
region:
name: room1
type: ComputeCluster
tagCategory: k8s-region
zone:
name: rack2
type: HostGroup
tagCategory: k8s-zone
topology:
datacenter: dc0
computeCluster: cluster1
hosts:
vmGroupName: rack2-vm-group
hostGroupName: rack2
datastore: ds-r2
networks:
- net2
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: VSphereFailureDomain
metadata:
name: rack3
spec:
region:
name: room1
type: ComputeCluster
tagCategory: k8s-region
zone:
name: rack3
type: HostGroup
tagCategory: k8s-zone
topology:
datacenter: dc0
computeCluster: cluster1
hosts:
vmGroupName: rack3-vm-group
hostGroupName: rack3
datastore: ds-c3
networks:
- net3
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: VSphereDeploymentZone
metadata:
name: rack1
labels:
region: room1
spec:
server: VSPHERE_SERVER
failureDomain: rack1
placementConstraint:
resourcePool: pool1
folder: folder1
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: VSphereDeploymentZone
metadata:
name: rack2
labels:
region: room1
spec:
server: VSPHERE_SERVER
failureDomain: rack2
placementConstraint:
resourcePool: pool2
folder: folder2
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: VSphereDeploymentZone
metadata:
name: rack3
labels:
region: room1
spec:
server: VSPHERE_SERVER
failureDomain: rack3
placementConstraint:
resourcePool: pool3
folder: folder3
Where VSPHERE_SERVER is the IP address or FQDN of your vCenter server.

After creating the FailureDomain and DeploymentZone object definitions for your AZs, continue based on the type of cluster that you are deploying:
After you have performed the steps in Prepare Regions and AZs in vSphere and Create FailureDomain and DeploymentZone Objects in Kubernetes, you can deploy a workload cluster with its nodes spread across multiple AZs.

The following steps use vsphere-zones.yaml as the file that contains the FailureDomain and DeploymentZone object definitions.
Follow vSphere with Standalone Management Cluster Configuration Files to create the cluster configuration file for the workload cluster you are deploying.
Check or modify the AZ variables in your cluster configuration file to match your AZ object definitions (example settings follow this list):

- Set VSPHERE_REGION and VSPHERE_ZONE to the region and zone tag categories, k8s-region and k8s-zone.
- Set VSPHERE_AZ_0, VSPHERE_AZ_1, and VSPHERE_AZ_2 with the names of the VsphereDeploymentZone objects where the machines need to be deployed.
  - The VsphereDeploymentZone associated with VSPHERE_AZ_0 is the VSphereFailureDomain in which the machine deployment ending with md-0 gets deployed; similarly, VSPHERE_AZ_1 is the VSphereFailureDomain in which the machine deployment ending with md-1 gets deployed, and VSPHERE_AZ_2 is the VSphereFailureDomain in which the machine deployment ending with md-2 gets deployed.
- WORKER_MACHINE_COUNT sets the total number of workers for the cluster. The total number of workers is distributed in a round-robin fashion across the number of AZs specified.
- VSPHERE_AZ_CONTROL_PLANE_MATCHING_LABELS sets key/value selector labels for the AZs that cluster control plane nodes may deploy to.
  - Set this variable if VSPHERE_REGION and VSPHERE_ZONE are set.
  - The labels must exist in the VSphereDeploymentZone resources that you create.
  - For example, "region=us-west-1,environment=staging".

Cluster configuration variables for AZs work the same way for standalone management clusters and workload clusters. For the full list of options that you must specify when deploying workload clusters to vSphere, see the Configuration File Variable Reference.
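For example, a cluster configuration file that uses the host group AZs defined in this topic might include settings like the following; all values are illustrative and must match your own VSphereDeploymentZone names and labels:

VSPHERE_REGION: k8s-region
VSPHERE_ZONE: k8s-zone
VSPHERE_AZ_0: rack1
VSPHERE_AZ_1: rack2
VSPHERE_AZ_2: rack3
VSPHERE_AZ_CONTROL_PLANE_MATCHING_LABELS: "region=room1,environment=staging"
WORKER_MACHINE_COUNT: 3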
Run tanzu cluster create to create the workload cluster. For more information, see Create Workload Clusters.
To create the AZ objects separately from creating the cluster, log in to the management cluster with tanzu login and run the following before you run tanzu cluster create:
tanzu mc az set -f vsphere-zones.yaml
Or, you can run kubectl apply -f vsphere-zones.yaml.
To use the AZ object definitions with a flat cluster configuration file and create the AZ and cluster objects together, pass the vsphere-zones.yaml file to the --az-file option of tanzu cluster create:
tanzu cluster create --file cluster-config-file.yaml --az-file vsphere-zones.yaml
To combine the AZ object definitions into a cluster manifest, create the cluster manifest by following step 1 of the two-step process described in Create a Class-Based Cluster, append the contents of vsphere-zones.yaml to the manifest, and then run tanzu cluster create as described in step 2, as sketched below.
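A sketch of that sequence, assuming a cluster configuration file named my-cluster-config.yaml and the --dry-run option used by the two-step process in Create a Class-Based Cluster, might look like this:

# Step 1: generate the class-based cluster manifest without creating the cluster
tanzu cluster create my-cluster --file my-cluster-config.yaml --dry-run > my-cluster-manifest.yaml
# Append the AZ definitions; vsphere-zones.yaml begins with a "---" document separator, so it can be appended directly
cat vsphere-zones.yaml >> my-cluster-manifest.yaml
# Step 2: create the cluster and the AZ objects from the combined manifest
tanzu cluster create -f my-cluster-manifest.yaml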
During the cluster creation process, you can see its VMs and other resources appear in vCenter.
You can update an already-deployed management or workload cluster to run its control plane or worker nodes in multiple availability zones (AZs) or to change the AZs that the nodes run in.
You can assign AZs to a cluster’s control plane or worker nodes as a whole, or else set AZs for underlying machine deployments, to customize vSphere machine settings along with AZs for the machine deployment set.
After you update an existing workload cluster’s AZs, you need to update its container storage interface (CSI) and Cloud Provider Interface (CPI) to reflect the change, as described in Update CPI and CSI for AZ Changes.
The following sections explain how to update existing cluster AZ configurations for different scenarios.
To expand an existing cluster whose control plane nodes run in a single AZ so that its control plane runs in multiple AZs:
Prepare a configuration file that defines a VSphereFailureDomain and VSphereDeploymentZone object for each new AZ. The example below, vsphere-3-zones.yaml, defines AZs rack1, rack2, and rack3 with region room1:
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: VSphereFailureDomain
metadata:
  name: rack1
spec:
  region:
    name: room1
    type: ComputeCluster
    tagCategory: k8s-region
  zone:
    name: rack1
    type: HostGroup
    tagCategory: k8s-zone
  topology:
    datacenter: dc0
    computeCluster: cluster0
    hosts:
      vmGroupName: rack1-vm-group
      hostGroupName: rack1
    datastore: ds1
    networks:
    - "VM Network"
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: VSphereFailureDomain
metadata:
  name: rack2
spec:
  region:
    name: room1
    type: ComputeCluster
    tagCategory: k8s-region
  zone:
    name: rack2
    type: HostGroup
    tagCategory: k8s-zone
  topology:
    datacenter: dc0
    computeCluster: cluster0
    hosts:
      vmGroupName: rack2-vm-group
      hostGroupName: rack2
    datastore: ds2
    networks:
    - "VM Network"
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: VSphereFailureDomain
metadata:
  name: rack3
spec:
  region:
    name: room1
    type: ComputeCluster
    tagCategory: k8s-region
  zone:
    name: rack3
    type: HostGroup
    tagCategory: k8s-zone
  topology:
    datacenter: dc0
    computeCluster: cluster0
    hosts:
      vmGroupName: rack3-vm-group
      hostGroupName: rack3
    datastore: ds3
    networks:
    - "VM Network"
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: VSphereDeploymentZone
metadata:
name: rack1
labels:
environment: "staging"
region: "room1"
spec:
server: VSPHERE_SERVER
failureDomain: rack1
placementConstraint:
resourcePool: rp0
folder: folder0
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: VSphereDeploymentZone
metadata:
name: rack2
labels:
environment: "staging"
region: "room1"
spec:
server: VSPHERE_SERVER
failureDomain: rack2
placementConstraint:
resourcePool: rp0
folder: folder0
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: VSphereDeploymentZone
metadata:
name: rack3
labels:
environment: "staging"
region: "room1"
spec:
server: VSPHERE_SERVER
failureDomain: rack3
placementConstraint:
resourcePool: rp0
folder: folder0
Where VSPHERE_SERVER is the IP address or FQDN of your vCenter server.

Notes:

- Set spec.placementConstraint.resourcePool in VSphereDeploymentZone to an existing resource pool. If there is no user-created resource pool for the cluster, set the value to the default resource pool of the cluster, whose path is /dc0/host/cluster1/Resources.
- For VSphereFailureDomain objects, spec.region.autoConfigure and spec.zone.autoConfigure are no longer supported.

Create the VSphereFailureDomain and VSphereDeploymentZone objects, for example:
tanzu mc az set -f vsphere-3-zones.yaml
Get the KubeadmControlPlane of the target cluster. In our example, the target is the management cluster tkg-mgmt-vc, but it can also be a workload cluster:
kubectl get kcp --selector cluster.x-k8s.io/cluster-name=tkg-mgmt-vc -n tkg-system -o=name
kubeadmcontrolplane.controlplane.cluster.x-k8s.io/tkg-mgmt-vc-cpkxj
Update the cluster AZ selector, for example controlPlaneZoneMatchingLabels: {"environment": "staging", "region": "room1"}:
kubectl get cluster tkg-mgmt-vc -n tkg-system -o json | jq '.spec.topology.variables |= map(if .name == "controlPlaneZoneMatchingLabels" then .value = {"environment": "staging", "region": "room1"} else . end)'| kubectl apply -f -
cluster.cluster.x-k8s.io/tkg-mgmt-vc replaced
Check that the cluster’s failure domain has been updated as expected:
kubectl get cluster tkg-mgmt-vc -n tkg-system -o json | jq -r '.status.failureDomains | to_entries[].key'
Patch the KubeadmControlPlane with rolloutAfter to trigger an update of the control plane nodes.
kubectl patch kcp tkg-mgmt-vc-cpkxj -n tkg-system --type merge -p "{\"spec\":{\"rolloutAfter\":\"$(date +'%Y-%m-%dT%TZ')\"}}"
Verify that the control plane nodes have moved to the new AZs by checking the nodes' host and datastore in vCenter, or by running kubectl get node or govc vm.info commands like the following:
kubectl get node NODE-NAME -o=jsonpath='{.metadata.labels.node.cluster.x-k8s.io/esxi-host}' --context tkg-mgmt-vc-admin@tkg-mgmt-vc
govc vm.info -json NODE-NAME | jq -r '.VirtualMachines[].Config.Hardware.Device[] | select(.DeviceInfo.Label == "Hard disk 1") | .Backing.FileName'
Selecting AZs with selector labels means specifying the VSphereDeploymentZone by its metadata.labels rather than its metadata.name. This lets you configure a cluster's control plane nodes, for example, to run in all of the AZs in a specified region and environment without listing the AZs individually: "region=us-west-1,environment=staging". It also means that you can update a cluster control plane's AZs without having to change AZ names for the control plane nodes.
To use selector labels to specify new AZs for an existing cluster’s control plane nodes:
Prepare a configuration file that defines a VSphereFailureDomain and VSphereDeploymentZone object for each new AZ. The example below, vsphere-labeled-zones.yaml, defines an AZ rack4 with selector labels in its metadata:
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: VSphereFailureDomain
metadata:
name: rack4
spec:
region:
name: room1
type: ComputeCluster
tagCategory: k8s-region
zone:
name: rack4
type: HostGroup
tagCategory: k8s-zone
topology:
datacenter: dc0
computeCluster: cluster0
hosts:
vmGroupName: rack4-vm-group
hostGroupName: rack4
datastore: vsanDatastore
networks:
- "VM Network"
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: VSphereDeploymentZone
metadata:
name: rack4
labels:
environment: staging
region: room1
spec:
server: VSPHERE_SERVER
failureDomain: rack4
placementConstraint:
resourcePool: rp0
folder: folder0
Create the VSphereFailureDomain and VSphereDeploymentZone objects, for example:
tanzu mc az set -f vsphere-labeled-zones.yaml
Update the cluster with the AZ selector label. The example here uses the AZ selector controlPlaneZoneMatchingLabels: {"environment": "staging", "region": "room1"} for the management cluster tkg-mgmt-vc, but it can also be a workload cluster:
kubectl get cluster tkg-mgmt-vc -n tkg-system -o json | jq '.spec.topology.variables |= map(if .name == "controlPlaneZoneMatchingLabels" then .value = {"environment": "staging", "region": "room1"} else . end)' | kubectl apply -f -
cluster.cluster.x-k8s.io/tkg-mgmt-vc replaced
Check the cluster status to ensure that the failure domain has been updated as expected:
kubectl get cluster tkg-mgmt-vc -n tkg-system -o json | jq -r '.status.failureDomains | to_entries[].key'
Patch the KubeadmControlPlane with rolloutAfter to trigger an update of the control plane nodes.
kubectl patch kcp tkg-mgmt-vc-cpkxj -n tkg-system --type merge -p "{\"spec\":{\"rolloutAfter\":\"$(date +'%Y-%m-%dT%TZ')\"}}"
Verify that the control plane nodes have moved to the new AZs selected by the selector in controlPlaneZoneMatchingLabels by checking the nodes' host and datastore in vCenter, or by running kubectl get node or govc vm.info commands like the following. In our example, the new AZ is rack4:
kubectl get node NODE-NAME -o=jsonpath='{.metadata.labels.node.cluster.x-k8s.io/esxi-host}' --context tkg-mgmt-vc-admin@tkg-mgmt-vc
govc vm.info -json NODE-NAME | jq -r '.VirtualMachines[].Config.Hardware.Device[] | select(.DeviceInfo.Label == "Hard disk 1") | .Backing.FileName'
To change an AZ configuration in an existing cluster, patch its nodes' underlying MachineDeployment configurations with the new AZ value.

For example, if the cluster configuration file set VSPHERE_AZ_0 to rack1 and you want to move its worker nodes to rack2:
Query the current AZs used for the cluster. This example uses a workload cluster tkg-wc, but it can also be a management cluster:
kubectl get cluster tkg-wc -o json | jq -r '.spec.topology.workers.machineDeployments[0].failureDomain'
List all available AZs.
kubectl get vspheredeploymentzones -o=jsonpath='{range .items[?(@.status.ready == true)]}{.metadata.name}{"\n"}{end}'
rack1
rack2
Patch the tkg-wc cluster's spec.topology.workers.machineDeployments configuration to set its zone VSphereFailureDomain to rack2. This example assumes that tkg-wc is a single-node, dev plan cluster. For a prod plan cluster, you would need to patch all three MachineDeployment object configurations in the cluster.
kubectl patch cluster tkg-wc --type=json -p='[{"op": "replace", "path": "/spec/topology/workers/machineDeployments/0/failureDomain", "value": "rack2"}]'
cluster.cluster.x-k8s.io/tkg-wc patched
Verify that the cluster is updated with VSphereFailureDomain rack2.
kubectl get cluster tkg-wc -o=jsonpath='{.spec.topology.workers.machineDeployments[?(@.name=="md-0")].failureDomain}'
rack2
Verify that the worker nodes are now deployed in VSphereFailureDomain rack2.
To configure new AZs for use by TKG clusters and then use them in an existing cluster:
Prepare a configuration file that defines a VSphereFailureDomain and VSphereDeploymentZone object for each new AZ. Use the vsphere-3-zones.yaml example in Add AZs for Control Plane Nodes above, which defines AZs rack1, rack2, and rack3 with region room1.
Create the VSphereFailureDomain and VSphereDeploymentZone objects.
tanzu mc az set -f vsphere-3-zones.yaml
Or, you can run kubectl apply -f vsphere-3-zones.yaml
Patch the cluster tkg-wc with VSphereFailureDomain rack1, rack2, and rack3. In this example, tkg-wc is a prod plan cluster with three MachineDeployment configurations. With a dev plan cluster, you only need to update one MachineDeployment in the cluster's spec.topology.workers.machineDeployments.
kubectl patch cluster tkg-wc --type=json -p='[{"op": "replace", "path": "/spec/topology/workers/machineDeployments/0/failureDomain", "value": "rack1"}, {"op": "replace", "path": "/spec/topology/workers/machineDeployments/1/failureDomain", "value": "rack2"}, {"op": "replace", "path": "/spec/topology/workers/machineDeployments/2/failureDomain", "value": "rack3"}]'
Verify that the cluster is updated with the new AZs.
kubectl get cluster tkg-wc -o=jsonpath='{range .spec.topology.workers.machineDeployments[*]}{"Name: "}{.name}{"\tFailure Domain: "}{.failureDomain}{"\n"}{end}'
Verify that its worker nodes are now deployed in VSphereFailureDomain rack1, rack2, and rack3.
To configure both new AZs and new MachineDeployment objects for use by TKG clusters and then use them in an existing cluster:

Prepare a configuration file that defines a VSphereFailureDomain and VSphereDeploymentZone object for each new AZ. The example below, vsphere-1-zone.yaml, defines a new AZ rack2 with region room1:
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: VSphereFailureDomain
metadata:
  name: rack2
spec:
  region:
    name: room1
    type: ComputeCluster
    tagCategory: k8s-region
  zone:
    name: rack2
    type: HostGroup
    tagCategory: k8s-zone
  topology:
    datacenter: dc0
    computeCluster: cluster0
    hosts:
      vmGroupName: rack2-vm-group
      hostGroupName: rack2
    datastore: ds-r2
    networks:
    - "VM Network"
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: VSphereDeploymentZone
metadata:
name: rack2
spec:
server: VSPHERE_SERVER
failureDomain: rack2
placementConstraint:
resourcePool: rp0
folder: folder0
Create the VSphereFailureDomain and VSphereDeploymentZone objects.
tanzu mc az set -f vsphere-1-zone.yaml
Or, you can run kubectl apply -f vsphere-1-zone.yaml.
Prepare a configuration file for the new machine deployment. The example below, md-1.yaml, defines a new machine deployment md-1 with its az property set to rack2:
name: md-1
replicas: 1
az: rack2
nodeMachineType: t3.large
workerClass: tkg-worker
tkrResolver: os-name=ubuntu,os-arch=amd64
Use the Tanzu CLI to create the new node pool. This example uses the workload cluster wl-antrea, but it can also be a management cluster:
tanzu cluster node-pool set wl-antrea -f md-1.yaml
Cluster update for node pool 'md-1' completed successfully
Get the machine deployment name in the newly created node pool:
kubectl get machinedeployments -l topology.cluster.x-k8s.io/deployment-name=md-1 -o=jsonpath='{.items[*].metadata.name}'
wl-antrea-md-1-pd9vj
Verify that the machine deployment is updated with VSphereFailureDomain rack2:
kubectl get machinedeployments wl-antrea-md-1-pd9vj -o json | \
jq -r '.spec.template.spec.failureDomain'
rack2
Verify that the worker node of md-1 is deployed in rack2.
After you change a workload cluster’s AZ configuration as described in any of the sections above, you need to update its CPI and CSI add-on configurations and then re-create the add-ons to reflect the changes. The procedures below explain how to do this.
Limitations:

- The tagCategory settings in different regions and zones in VSphereFailureDomain should match.

To update a cluster's CPI add-on configuration to reflect an AZ change, and then delete the corresponding package installer to re-create the add-on with the changes:
Retrieve the name of the cluster's vsphereCPIConfig using the cb reference. For example, with a workload cluster named wl:
kubectl -n default get cb wl -o json | jq -r '.spec.cpi.valuesFrom.providerRef.name'
Edit the cluster's vsphereCPIConfig spec to set its region and zone to the tagCategory fields that you set for the AZ's region and zone in vSphere and in the vsphereFailureDomain spec. For example:
apiVersion: cpi.tanzu.vmware.com/v1alpha1
kind: VSphereCPIConfig
metadata:
  name: wl
  namespace: default
spec:
  vsphereCPI:
    mode: vsphereCPI
    region: k8s-region
    tlsCipherSuites: TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384
    vmNetwork:
      excludeExternalSubnetCidr: 10.215.10.79/32
      excludeInternalSubnetCidr: 10.215.10.79/32
    zone: k8s-zone
Apply the changes and wait for Reconcile succeeded.

Confirm that the CPI package installer (pkgi) has reinstalled:
kubectl -n tkg-system get pkgi wl-vsphere-cpi --context wl-admin@wl
To update a cluster's CSI add-on configuration to reflect an AZ change, and then delete the csinodetopology and corresponding package installer to re-create the add-on with the changes:

Retrieve the name of the cluster's vsphereCSIConfig using the cb reference. For example, with a workload cluster named wl:
kubectl -n default get cb wl -o json | jq -r '.spec.csi.valuesFrom.providerRef.name'
Edit the cluster's vsphereCSIConfig spec to set its region and zone to the tagCategory fields that you set for the AZ's region and zone in vSphere and in the vsphereFailureDomain spec. For example:
apiVersion: csi.tanzu.vmware.com/v1alpha1
kind: VSphereCSIConfig
metadata:
name: wl
namespace: default
spec:
vsphereCSI:
config:
datacenter: /dc0
httpProxy: ""
httpsProxy: ""
insecureFlag: true
noProxy: ""
region: k8s-region
tlsThumbprint: ""
useTopologyCategories: true
zone: k8s-zone
mode: vsphereCSI
Apply the changes.

Delete the csinode and csiNodeTopology objects so that they are re-created. csinodetopology does not update automatically:
kubectl delete csinode --all --context wl-admin@wl
kubectl delete csinodetopology --all --context wl-admin@wl
Delete the cluster's vsphere-csi package installer and wait for Reconcile succeeded.
kubectl delete pkgi -n tkg-system wl-vsphere-csi --context wl-admin@wl
Verify that all csinodes objects include the topologyKeys parameter, for example:
kubectl get csinodes -o jsonpath='{range .items[*]}{.metadata.name} {.spec}{"\n"}{end}'
k8s-control-1 {"drivers":[{"name":"csi.vsphere.vmware.com","nodeID":"k8s-control-1","topologyKeys":["topology.csi.vmware.com/k8s-region","topology.csi.vmware.com/k8s-zone"]}]}
k8s-control-2 {"drivers":[{"name":"csi.vsphere.vmware.com","nodeID":"k8s-control-2","topologyKeys":["topology.csi.vmware.com/k8s-region","topology.csi.vmware.com/k8s-zone"]}]}
k8s-control-3 {"drivers":[{"name":"csi.vsphere.vmware.com","nodeID":"k8s-control-3","topologyKeys":["topology.csi.vmware.com/k8s-region","topology.csi.vmware.com/k8s-zone"]}]}
k8s-node-1 {"drivers":[{"name":"csi.vsphere.vmware.com","nodeID":"k8s-node-1","topologyKeys":["topology.csi.vmware.com/k8s-region","topology.csi.vmware.com/k8s-zone"]}]}
k8s-node-2 {"drivers":[{"name":"csi.vsphere.vmware.com","nodeID":"k8s-node-2","topologyKeys":["topology.csi.vmware.com/k8s-region","topology.csi.vmware.com/k8s-zone"]}]}
k8s-node-3 {"drivers":[{"name":"csi.vsphere.vmware.com","nodeID":"k8s-node-3","topologyKeys":["topology.csi.vmware.com/k8s-region","topology.csi.vmware.com/k8s-zone"]}]}
k8s-node-4 {"drivers":[{"name":"csi.vsphere.vmware.com","nodeID":"k8s-node-4","topologyKeys":["topology.csi.vmware.com/k8s-region","topology.csi.vmware.com/k8s-zone"]}]}
k8s-node-5 {"drivers":[{"name":"csi.vsphere.vmware.com","nodeID":"k8s-node-5","topologyKeys":["topology.csi.vmware.com/k8s-region","topology.csi.vmware.com/k8s-zone"]}]}
k8s-node-6 {"drivers":[{"name":"csi.vsphere.vmware.com","nodeID":"k8s-node-6","topologyKeys":["topology.csi.vmware.com/k8s-region","topology.csi.vmware.com/k8s-zone"]}]}
Verify that all nodes’ topology labels reflect the correct AZ regions and zones, for example:
kubectl get nodes --show-labels
NAME STATUS ROLES AGE VERSION LABELS
k8s-control-1 Ready control-plane 1d v1.21.1 topology.csi.vmware.com/k8s-region=region-1,topology.csi.vmware.com/k8s-zone=zone-A
k8s-control-2 Ready control-plane 1d v1.21.1 topology.csi.vmware.com/k8s-region=region-1,topology.csi.vmware.com/k8s-zone=zone-B
k8s-control-3 Ready control-plane 1d v1.21.1 topology.csi.vmware.com/k8s-region=region-1,topology.csi.vmware.com/k8s-zone=zone-C
k8s-node-1 Ready <none> 1d v1.21.1 topology.csi.vmware.com/k8s-region=region-1,topology.csi.vmware.com/k8s-zone=zone-A
k8s-node-2 Ready <none> 1d v1.21.1 topology.csi.vmware.com/k8s-region=region-1,topology.csi.vmware.com/k8s-zone=zone-B
k8s-node-3 Ready <none> 1d v1.21.1 topology.csi.vmware.com/k8s-region=region-1,topology.csi.vmware.com/k8s-zone=zone-B
k8s-node-4 Ready <none> 1d v1.21.1 topology.csi.vmware.com/k8s-region=region-1,topology.csi.vmware.com/k8s-zone=zone-C
k8s-node-5 Ready <none> 1d v1.21.1 topology.csi.vmware.com/k8s-region=region-2,topology.csi.vmware.com/k8s-zone=zone-D
k8s-node-6 Ready <none> 1d v1.21.1 topology.csi.vmware.com/k8s-region=region-2,topology.csi.vmware.com/k8s-zone=zone-D
Use the tanzu mc az list command to list the AZs that are defined in the standalone management cluster or used by a workload cluster:
To list availability zones that are currently being used by a management cluster and its workload clusters:
tanzu management-cluster available-zone list
To list all availability zones defined in the management cluster, and therefore available for workload cluster nodes:
tanzu management-cluster available-zone list -a
To list the availability zones currently being used by the workload cluster CLUSTER-NAME:
tanzu management-cluster available-zone list -c CLUSTER-NAME
The tanzu mc az commands are aliased from tanzu management-cluster available-zone.
Example output:
AZNAME ZONENAME ZONETYPE REGIONNAME REGIONTYPE DATASTORE NETWORK OWNERCLUSTER STATUS
us-west-1a us-west-1a ComputeCluster us-west-1 Datacenter sharedVmfs-0 VM Network az-1 ready
us-west-1b us-west-1b ComputeCluster us-west-1 Datacenter sharedVmfs-0 VM Network az-1 ready
us-west-1c us-west-1c ComputeCluster us-west-1 Datacenter sharedVmfs-0 VM Network az-1 ready
The output lists:

- AZNAME, ZONENAME: The name of the AZ
- ZONETYPE: The vSphere object type scoped to the AZ, ComputeCluster or HostGroup
- REGIONNAME: The name of the region that contains the AZ
- REGIONTYPE: The vSphere object type scoped to the region, Datacenter or ComputeCluster
- DATASTORE: The datastore that hosts VMs in the region
- NETWORK: The network serving VMs in the region
- OWNERCLUSTER: TKG cluster or clusters that run in the AZ
- STATUS: Current AZ status

The tanzu mc az command group is aliased from tanzu management-cluster available-zone.
Use the tanzu mc az delete command to delete an unused AZ, for example:
tanzu mc az delete AZNAME
Where AZNAME is the name of the AZ as listed by tanzu mc az list.
You can only delete an AZ if it is not currently hosting TKG cluster nodes, as shown by tanzu mc az list listing no OWNERCLUSTER for the AZ.