Upgrading Tanzu Kubernetes Grid

To upgrade Tanzu Kubernetes Grid (TKG), you download and install the new version of the Tanzu CLI on the machine that you use as the bootstrap machine. Depending on whether you are upgrading clusters that you previously deployed to vSphere, Amazon Web Services (AWS), or Azure, you might also need to download and install new base image templates.

Note

In the TKG upgrade path, v2.3 immediately follows v2.2.

After you have installed the new versions of the components, you use the tanzu mc upgrade and tanzu cluster upgrade CLI commands to upgrade management clusters and workload clusters.
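
For example, once the new CLI and base images are in place, the upgrade commands follow this general pattern, where my-workload-cluster is a placeholder name:

tanzu mc upgrade
tanzu cluster upgrade my-workload-cluster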

The following sections describe the overall steps required to upgrade Tanzu Kubernetes Grid. This procedure assumes that you are upgrading to Tanzu Kubernetes Grid v2.3.1.

Some steps are only required if you are performing a major upgrade from Tanzu Kubernetes Grid v2.2.x to v2.3.x and are not required if you are performing a patch upgrade from Tanzu Kubernetes Grid v2.3.x to v2.3.y.

Important

Tanzu Kubernetes Grid v2.4.x is the last version of TKG that supports upgrading existing standalone TKG management clusters and TKG workload clusters on AWS and Azure. The ability to upgrade standalone TKG management clusters and TKG workload clusters on AWS and Azure will be removed in the Tanzu Kubernetes Grid v2.5 release.

Going forward, VMware recommends that you use Tanzu Mission Control to create native AWS EKS and Azure AKS clusters instead. However, upgrading existing standalone TKG management clusters and TKG workload clusters on AWS and Azure remains fully supported for all TKG releases up to and including TKG v2.4.x.

For more information, see Deprecation of TKG Management and Workload Clusters on AWS and Azure in the VMware Tanzu Kubernetes Grid v2.4 Release Notes.

Prerequisites

Before you upgrade to TKG v2.3.x, ensure that your current deployment is TKG v2.2.x or an earlier v2.3 patch version. To upgrade to TKG v2.3.x from versions earlier than v2.2, you must first upgrade to v2.2.x by using a v2.2.x version of the Tanzu CLI.

Unset environment variables

If you have used environment variables for cluster creation, you must unset some of them before starting the upgrade process. If these variables are set in the environment where you run the upgrade, the upgrade process will treat them as reconfiguration options and the upgrade might fail.

Run the following commands to unset the relevant variables:

unset VSPHERE_DATACENTER
unset VSPHERE_DATASTORE
unset VSPHERE_FOLDER
unset VSPHERE_NETWORK
unset VSPHERE_RESOURCE_POOL
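
To confirm that none of these variables remain set before you continue, you can, for example, run the following command; empty output means that they are all unset:

env | grep ^VSPHERE_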

Download and Install the New Version of the Tanzu CLI

This step is required for both major v2.2.x to v2.3.x and patch v2.3.x to v2.3.y upgrades.

To download and install the new version of the Tanzu CLI and plugins, perform the following steps.

  1. Delete the ~/.config/tanzu/tkg/compatibility/tkg-compatibility.yaml file.

    If you do not delete this file, the new version of the Tanzu CLI will continue to use the Bill of Materials (BOM) for the previous release. Deleting this file causes the Tanzu CLI to pull the updated BOM. You must perform this step both when upgrading from 2.2.x to 2.3.x, and when upgrading from 2.3.x to 2.3.y.
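
    For example, on Linux or macOS:

    rm ~/.config/tanzu/tkg/compatibility/tkg-compatibility.yaml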

  2. Follow the instructions in Install the Tanzu CLI and Kubernetes CLI for Use with Standalone Management Clusters to download and install the Tanzu CLI, the Tanzu CLI plugins, and kubectl on the machine where you currently run your tanzu commands. If any of your standalone management clusters are configured to use an LDAP identity provider, perform the steps in (LDAP Only) Update LDAP Settings after you install the Tanzu CLI and before you update your CLI plugins to Tanzu Kubernetes Grid v2.3.

  3. Make sure to run tanzu plugin install to install the current vmware-tkg plugin group.
  4. After you install the new version of the Tanzu CLI and plugins, run tanzu version to check that the correct version of the Tanzu CLI is properly installed. For a list of CLI versions compatible with Tanzu Kubernetes Grid v2.3, see Product Interoperability Matrix.
  5. After you install kubectl, run kubectl version to check that the correct version of kubectl is properly installed.
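
For example, the following commands confirm the installed CLI, plugin, and kubectl versions:

tanzu version
tanzu plugin list
kubectl version --client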

For information about Tanzu CLI commands and options that are available, see the Tanzu CLI Command Reference.

Prepare to Upgrade Clusters

Before you can upgrade management and workload clusters, you must perform preparatory steps, depending on whether you deployed clusters on vSphere, AWS, or Azure. This step is required for both major v2.2.x to v2.3.x and patch v2.3.x to v2.3.y upgrades.

vSphere
Before you can upgrade a Tanzu Kubernetes Grid deployment on vSphere, you must import into vSphere new versions of the base image templates that the upgraded management and workload clusters will run. VMware publishes base image templates in OVA format for each supported OS and Kubernetes version. After importing the OVAs, you must convert the resulting VMs into VM templates.

This procedure assumes that you are upgrading to Tanzu Kubernetes Grid v2.3.x.

  1. Go to the Broadcom Support Portal and log in with your VMware customer credentials.
  2. Go to the Tanzu Kubernetes Grid downloads page.
  3. Select v2.3.1.
  4. Download the latest Tanzu Kubernetes Grid OVAs for the OS and Kubernetes version lines that your management and workload clusters are running.

    For example, for Photon v3 images:

    • Kubernetes v1.26.8: Photon v3 Kubernetes v1.26.8 OVA
    • Kubernetes v1.25.13: Photon v3 Kubernetes v1.25.13 OVA
    • Kubernetes v1.24.17: Photon v3 Kubernetes v1.24.17 OVA

    For Ubuntu 20.04 images:

    • Kubernetes v1.26.8: Ubuntu 2004 Kubernetes v1.26.8 OVA
    • Kubernetes v1.25.13: Ubuntu 2004 Kubernetes v1.25.13 OVA
    • Kubernetes v1.24.17: Ubuntu 2004 Kubernetes v1.24.17 OVA
    Important

    Make sure that you download the most recent OVA base image templates, in case there have been security patch releases. You can find updated base image templates that include security patches on the Tanzu Kubernetes Grid product download page.

  5. In the vSphere Client, right-click an object in the vCenter Server inventory and select Deploy OVF template.
  6. Select Local file, click the button to upload files, and navigate to a downloaded OVA file on your local machine.
  7. Follow the installer prompts to deploy a VM from the OVA.

    • Accept or modify the appliance name.
    • Select the destination datacenter or folder.
    • Select the destination host, cluster, or resource pool.
    • Accept the end user license agreements (EULA).
    • Select the disk format and destination datastore.
    • Select the network for the VM to connect to.
  8. Click Finish to deploy the VM.
  9. When the OVA deployment finishes, right-click the VM and select Template > Convert to Template.
  10. In the VMs and Templates view, right-click the new template, select Add Permission, and assign your Tanzu Kubernetes Grid user, for example, tkg-user, to the template with the Tanzu Kubernetes Grid role, for example, TKG. You created this user and role in Prepare to Deploy Management Clusters to vSphere.

Repeat the procedure for each of the Kubernetes versions for which you have downloaded the OVA file.
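
If you prefer to script the import and conversion rather than use the vSphere Client, the following sketch uses the open-source govc CLI, which is not part of this procedure; the datastore, folder, file, and inventory path names are placeholders for your environment:

govc import.ova -ds=datastore1 -folder=tkg-templates ./photon-3-kube-v1.26.8.ova
govc vm.markastemplate /dc0/vm/tkg-templates/photon-3-kube-v1.26.8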

VMware Cloud on AWS SDDC Compatibility

If you are upgrading workload clusters that are deployed on VMware Cloud on AWS, verify that the underlying Software-Defined Datacenter (SDDC) version used by your existing deployment is compatible with the version of Tanzu Kubernetes Grid you are upgrading to.

To view the version of an SDDC, select View Details on the SDDC tile in the VMware Cloud Console and click the Support pane.

To validate compatibility with Tanzu Kubernetes Grid, see the VMware Product Interoperability Matrix.

AWS
After you install the new version of the Tanzu CLI and other tools, but before you upgrade a management cluster, you must reset the permissions in your AWS account by running the tanzu mc permissions aws set command.
tanzu mc permissions aws set

This step is required for both major v2.2.x to v2.3.x and patch v2.3.x to v2.3.y upgrades. For more information about the AWS permissions that the command sets, see Required AWS Permissions.

Amazon Linux 2 Amazon Machine Images (AMIs) that include the supported Kubernetes versions are publicly available to all AWS users, in all supported AWS regions. Tanzu Kubernetes Grid automatically uses the appropriate AMI for the Kubernetes version that you specify during the upgrade.
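
If you want to confirm which public TKG AMIs are available in your region before you upgrade, you can, for example, query them with the AWS CLI; the name filter below is an assumption and might need adjusting for your release:

aws ec2 describe-images --region ${AWS_REGION} --filters "Name=name,Values=*tkg*" --query "Images[].Name"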

Azure
Before upgrading a Tanzu Kubernetes Grid deployment on Azure, you must accept the terms for the new default VM image and for each non-default VM image that you plan to use for your cluster VMs. You need to accept these terms once per subscription.

To accept the terms:

  1. List all available VM images for Tanzu Kubernetes Grid in the Azure Marketplace:

    az vm image list --publisher vmware-inc --offer tkg-capi --all
    
  2. Accept the terms for the new default VM image:

    az vm image terms accept --urn publisher:offer:sku:version
    

    For example, to accept the terms for the default VM image in Tanzu Kubernetes Grid v2.3.1, k8s-1dot26dot8-ubuntu-2004, run:

    az vm image terms accept --urn vmware-inc:tkg-capi:k8s-1dot26dot8-ubuntu-2004:2021.05.17
    
  3. If you plan to upgrade any of your workload clusters to a non-default Kubernetes version, such as v1.25.13 or v1.24.17, accept the terms for each non-default version that you want to use for your cluster VMs.
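
    For example, following the URN pattern above, accepting the terms for a hypothetical Kubernetes v1.25.13 image could look like the following; confirm the exact URN with the az vm image list command from step 1 first:

    az vm image terms accept --urn vmware-inc:tkg-capi:k8s-1dot25dot13-ubuntu-2004:VERSION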


Upgrade Standalone Management Clusters

This step is only required for TKG with a standalone management cluster. If you are running TKG with a vSphere with Tanzu Supervisor, you upgrade the Supervisor as part of vSphere and update the Supervisor’s Kubernetes version by upgrading its TKrs (Tanzu Kubernetes releases).

This step is required for both major v2.2.x to v2.3.x and patch v2.3.x to v2.3.y upgrades.

To upgrade Tanzu Kubernetes Grid, you must upgrade all management clusters in your deployment. You cannot upgrade workload clusters until you have upgraded the management clusters that manage them.

Follow the procedure in Upgrade Standalone Management Clusters to upgrade your management clusters.

Upgrade Workload Clusters

This step is required for both major v2.2.x to v2.3.x and patch v2.3.x to v2.3.y upgrades.

Follow the procedure in Upgrade Workload Clusters to upgrade the workload clusters that are running your workloads.
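
For example, to list your workload clusters and then upgrade one of them, where my-workload-cluster is a placeholder name and the optional --tkr flag, shown with a placeholder TKr name, selects a specific Tanzu Kubernetes release:

tanzu cluster list
tanzu cluster upgrade my-workload-cluster --tkr TKR-NAME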

Post-Upgrade Operations

After you have upgraded your clusters, there are additional steps to perform to complete the upgrade process.

Sync Package Versions Older Than n-2

Some packages that are installed by default in the management cluster, for example cert-manager, can also be installed as CLI-managed packages in workload clusters and the shared services cluster. When the management cluster is upgraded to the latest Tanzu Kubernetes Grid release, its default packages are automatically updated.

You can run different versions of the CLI-managed packages in different workload clusters. In a workload cluster, you can run either the latest supported version of a CLI-managed package or the package versions that shipped with your last two previously installed versions of Tanzu Kubernetes Grid. For example, if the latest packaged version of cert-manager is v1.11.1 and your previous two Tanzu Kubernetes Grid installations shipped cert-manager v1.10.1 and v1.7.2, then you can run cert-manager versions v1.11.1, v1.10.1, and v1.7.2 in workload clusters.

For any workload clusters that are running package versions from more than two previously installed Tanzu Kubernetes Grid versions (n-2) behind the package versions in the management cluster, you must update the package repository (see Update a Package Repository) and then upgrade the packages in the workload clusters (see Update a Package). If you do not upgrade the packages, you might not be able to update the package configuration, because the package repository might no longer include versions of the package that are older than n-2.
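
For example, a sketch of these two operations, where the repository URL, package version, and namespaces are placeholders that you replace with values for your environment:

tanzu package repository update tanzu-standard --url PACKAGE-REPOSITORY-URL --namespace tkg-system
tanzu package installed update cert-manager --version PACKAGE-VERSION --namespace PACKAGE-NAMESPACE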

Important

If you have installed Prometheus on a workload cluster and you upgrade the workload cluster to Kubernetes v1.25, you must upgrade Prometheus to at least version 2.37.0+vmware.3-tkg.1. Earlier versions of the Prometheus package, for example version 2.37.0+vmware.1-tkg.1, are not compatible with Kubernetes v1.25.
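
For example, with the Tanzu CLI pointed at the workload cluster, the Prometheus update could look like the following sketch; the namespace is a placeholder:

tanzu package installed update prometheus --version 2.37.0+vmware.3-tkg.1 --namespace PACKAGE-NAMESPACE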

Provider-Specific Post-Upgrade Operations

Depending on whether your clusters are running on vSphere, AWS, or Azure, there are operations that you must perform after you have upgraded the clusters.

vSphere
Upgrade NSX Advanced Load Balancer Configuration After Tanzu Kubernetes Grid Upgrade

If NSX ALB was not enabled in your TKG v2.2 installation, see Install and Configure NSX Advanced Load Balancer for information on how to install NSX ALB.

If NSX ALB was enabled in your TKG v2.2 installation, see the Tanzu Kubernetes Grid v2.3 Release Notes for the Avi Controller versions that are supported in this release and, if needed, upgrade the Avi Controller to a compatible version. For information about how to upgrade the Avi Controller, see Flexible Upgrades for Avi Vantage.

AWS
Install AWS EBS CSI Driver after Tanzu Kubernetes Grid Upgrade

TKG v2.2 and later automatically install the AWS EBS CSI driver on newly-created workload clusters, but to run AWS EBS CSI on clusters upgraded from v2.1, the driver must be installed manually. Follow this procedure to install the AWS EBS CSI driver manually on a cluster that was created in TKG v2.1 or earlier and has never had the AWS EBS CSI Driver installed.

  1. Grant permissions for the AWS EBS CSI driver:

    export AWS_REGION={YOUR_AWS_REGION}
    tanzu mc permissions aws set
    
  2. For each workload cluster that uses CSI storage:

    1. Export the following environment variables and set feature flag:

      export _TKG_CLUSTER_FORCE_ROLE="management"
      export FILTER_BY_ADDON_TYPE="csi/aws-ebs-csi-driver"
      export NAMESPACE="tkg-system"
      export DRY_RUN_MODE="legacy"
      tanzu config set features.cluster.allow-legacy-cluster true
      

      Set NAMESPACE to the cluster’s namespace, tkg-system in the example above.

    2. Generate the CSI driver manifest:

      tanzu cluster create ${TARGET_CLUSTER_NAME} --dry-run -f ~/MANAGEMENT_CLUSTER_CREATE_CONFIG.yaml > csi-driver-addon-manifest.yaml
      

      Where TARGET_CLUSTER_NAME is the name of the cluster on which you are installing the CSI driver.

    3. In csi-driver-addon-manifest.yaml, update the namespace in the metadata of the secret to the namespace of the workload cluster. Use the command kubectl get cluster -A to view the namespace of the cluster.
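
      For example, a sketch using the third-party yq tool (v4); CLUSTER-NAMESPACE is a placeholder for the namespace that kubectl get cluster -A reports:

      yq -i '(select(.kind == "Secret") | .metadata.namespace) = "CLUSTER-NAMESPACE"' csi-driver-addon-manifest.yaml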

    4. Apply the changes in the management cluster’s context:

      kubectl apply -f csi-driver-addon-manifest.yaml
      
    5. Unset the following environment variables and feature flag:

      unset _TKG_CLUSTER_FORCE_ROLE
      unset FILTER_BY_ADDON_TYPE
      unset NAMESPACE
      unset DRY_RUN_MODE
      tanzu config set features.cluster.allow-legacy-cluster false
      
  3. For a management cluster that uses CSI storage:

    1. Export the following environment variables:

      export _TKG_CLUSTER_FORCE_ROLE="management"
      export FILTER_BY_ADDON_TYPE="csi/aws-ebs-csi-driver"
      export NAMESPACE="tkg-system"
      export DRY_RUN_MODE="legacy"
      tanzu config set features.cluster.allow-legacy-cluster true
      

      Set NAMESPACE to the cluster’s namespace, tkg-system in the example above.

    2. Generate the CSI driver manifest:

      tanzu mc create ${MANAGEMENT_CLUSTER_NAME} --dry-run -f ~/MANAGEMENT_CLUSTER_CREATE_CONFIG.yaml > csi-driver-addon-manifest.yaml
      

      Where MANAGEMENT_CLUSTER_NAME is the name of the management cluster.

    3. In csi-driver-addon-manifest.yaml, update the namespace in the metadata of the secret to the namespace of the management cluster. Use the command kubectl get cluster -A to view the namespace of the cluster.

    4. Apply the changes in the management cluster’s context:

      kubectl apply -f csi-driver-addon-manifest.yaml
      
    5. Unset the following environment variables and feature flag:

      unset _TKG_CLUSTER_FORCE_ROLE
      unset FILTER_BY_ADDON_TYPE
      unset NAMESPACE
      unset DRY_RUN_MODE
      tanzu config set features.cluster.allow-legacy-cluster false
      
Azure
Install Azure Disk CSI Driver after Tanzu Kubernetes Grid Upgrade

TKG v2.1 and later automatically install the Azure Disk CSI driver on newly-created workload clusters, but to run Azure Disk CSI on clusters upgraded from v1.6, the driver must be installed manually. Follow this procedure to install the Azure Disk CSI driver manually on a cluster that was created in TKG v1.6 or earlier and has never had the Azure Disk CSI Driver installed.

  1. Export the following environment variables and set feature flag:

    export _TKG_CLUSTER_FORCE_ROLE="management"
    export FILTER_BY_ADDON_TYPE="csi/azuredisk-csi-driver"
    export NAMESPACE="tkg-system"
    export DRY_RUN_MODE="legacy"
    tanzu config set features.cluster.allow-legacy-cluster true
    

    Set NAMESPACE to the cluster’s namespace, tkg-system in the example above.

  2. For each workload cluster that uses CSI storage:

    1. Generate the CSI driver manifest:

      tanzu cluster create ${TARGET_CLUSTER_NAME} --dry-run -f ~/MANAGEMENT_CLUSTER_CREATE_CONFIG.yaml > csi-driver-addon-manifest.yaml
      

      Where TARGET_CLUSTER_NAME is the name of the cluster on which you are installing the CSI driver.

    2. In csi-driver-addon-manifest.yaml, update the namespace in the metadata of the secret to the namespace of the workload cluster. Use the command kubectl get cluster -A to view the namespace of the cluster.

    3. Apply the changes in the management cluster’s context:

      kubectl apply -f csi-driver-addon-manifest.yaml
      
    4. Unset the following environment variables and feature flag:

      unset _TKG_CLUSTER_FORCE_ROLE
      unset FILTER_BY_ADDON_TYPE
      unset NAMESPACE
      unset DRY_RUN_MODE
      tanzu config set features.cluster.allow-legacy-cluster false
      
  3. For a management cluster that uses CSI storage:

    1. Export the following environment variables:

      export _TKG_CLUSTER_FORCE_ROLE="management"
      export FILTER_BY_ADDON_TYPE="csi/azuredisk-csi-driver"
      export NAMESPACE="tkg-system"
      export DRY_RUN_MODE="legacy"
      tanzu config set features.cluster.allow-legacy-cluster true
      

      Set NAMESPACE to the cluster’s namespace, tkg-system in the example above.

    2. Generate the CSI driver manifest:

      tanzu mc create ${MANAGEMENT_CLUSTER_NAME} --dry-run -f ~/MANAGEMENT_CLUSTER_CREATE_CONFIG.yaml > csi-driver-addon-manifest.yaml
      

      Where MANAGEMENT_CLUSTER_NAME is the name of the management cluster.

    3. In csi-driver-addon-manifest.yaml, update the namespace in the metadata of the secret to the namespace of the management cluster. Use the command kubectl get cluster -A to view the namespace of the cluster.

    4. Apply the changes in the management cluster’s context:

      kubectl apply -f csi-driver-addon-manifest.yaml
      
    5. Unset the following environment variables and feature flag:

      unset _TKG_CLUSTER_FORCE_ROLE
      unset FILTER_BY_ADDON_TYPE
      unset NAMESPACE
      unset DRY_RUN_MODE
      tanzu config set features.cluster.allow-legacy-cluster false
      

Install Azure File CSI Driver after Tanzu Kubernetes Grid Upgrade

If a cluster does not already have the Azure File CSI driver installed, follow this procedure to install it after upgrading your Tanzu Kubernetes Grid installation to v2.3 or later.

  1. Export the following environment variables and set feature flag:

    export _TKG_CLUSTER_FORCE_ROLE="management"
    export FILTER_BY_ADDON_TYPE="csi/azurefile-csi-driver"
    export NAMESPACE="tkg-system"
    export DRY_RUN_MODE="legacy"
    tanzu config set features.cluster.allow-legacy-cluster true
    

    Set NAMESPACE to the cluster’s namespace, tkg-system in the example above.

  2. For each workload cluster that uses CSI storage:

    1. Generate the CSI driver manifest:

      tanzu cluster create ${TARGET_CLUSTER_NAME} --dry-run -f ~/MANAGEMENT_CLUSTER_CREATE_CONFIG.yaml > csi-driver-addon-manifest.yaml
      

      Where TARGET_CLUSTER_NAME is the name of the cluster on which you are installing the CSI driver.

    2. In csi-driver-addon-manifest.yaml, update the namespace in the metadata of the secret to the namespace of the workload cluster. Use the command kubectl get cluster -A to view the namespace of the cluster.

    3. Apply the changes in the management cluster’s context:

      kubectl apply -f csi-driver-addon-manifest.yaml
      
    4. Unset the following environment variables and feature flag:

      unset _TKG_CLUSTER_FORCE_ROLE
      unset FILTER_BY_ADDON_TYPE
      unset NAMESPACE
      unset DRY_RUN_MODE
      tanzu config set features.cluster.allow-legacy-cluster false
      
  3. For a management cluster that uses CSI storage:

    1. Export the following environment variables:

      export _TKG_CLUSTER_FORCE_ROLE="management"
      export FILTER_BY_ADDON_TYPE="csi/azurefile-csi-driver"
      export NAMESPACE="tkg-system"
      export DRY_RUN_MODE="legacy"
      tanzu config set features.cluster.allow-legacy-cluster true
      

      Set NAMESPACE to the cluster’s namespace, tkg-system in the example above.

    2. Generate the CSI driver manifest:

      tanzu mc create ${MANAGEMENT_CLUSTER_NAME} --dry-run -f ~/MANAGEMENT_CLUSTER_CREATE_CONFIG.yaml > csi-driver-addon-manifest.yaml
      

      Where MANAGEMENT_CLUSTER_NAME is the name of the management cluster.

    3. In csi-driver-addon-manifest.yaml, update the namespace in the metadata of the secret to the namespace of the management cluster. Use the command kubectl get cluster -A to view the namespace of the cluster.

    4. Apply the changes in the management cluster’s context:

      kubectl apply -f csi-driver-addon-manifest.yaml
      
    5. Unset the following environment variables and feature flag:

      unset _TKG_CLUSTER_FORCE_ROLE
      unset FILTER_BY_ADDON_TYPE
      unset NAMESPACE
      unset DRY_RUN_MODE
      tanzu config set features.cluster.allow-legacy-cluster false
      


Upgrade Crash Recovery and Diagnostics

This step is required for both major v2.2.x to v2.3.x and patch v2.3.x to v2.3.y upgrades.

For information about how to upgrade Crash Recovery and Diagnostics, see Install or Upgrade the Crash Recovery and Diagnostics Binary.

What to Do Next

Examine your upgraded management clusters or register them in Tanzu Mission Control. See Examine and Register a Newly-Deployed Standalone Management Cluster.
