VMware Tanzu Kubernetes Grid v2.2 Release Notes

Except where noted, these release notes apply to all v2.2.x patch versions of Tanzu Kubernetes Grid (TKG).

TKG v2.2 is distributed as a downloadable Tanzu CLI package that deploys a versioned TKG standalone management cluster. TKG v2.2 supports creating and managing workload clusters with a standalone management cluster that can run on multiple infrastructures, including vSphere 6.7, 7, and 8, AWS, and Azure.

Tanzu Kubernetes Grid v2.0, v2.2, and vSphere with Tanzu Supervisor in vSphere 8

Important

The vSphere with Tanzu Supervisor in vSphere 8.0.1c and later runs TKG v2.2. Earlier versions of vSphere 8 run TKG v2.0, which was not released independently of Supervisor. Standalone management clusters that run TKG 2.x are available from TKG 2.1 onwards. Later TKG releases will be embedded in Supervisor in future vSphere update releases. Consequently, the version of TKG that is embedded in the latest vSphere with Tanzu version at a given time might not be the same as the standalone version of TKG that you are using. However, the versions of the Tanzu CLI that are compatible with all TKG v2.x releases are fully supported for use with Supervisor in all releases of vSphere 8.

Tanzu Kubernetes Grid v2.2 and vSphere with Tanzu in vSphere 7

Caution

The versions of the Tanzu CLI that are compatible with TKG 2.x and with the vSphere with Tanzu Supervisor in vSphere 8 are not compatible with the Supervisor Cluster in vSphere 7. To use the Tanzu CLI with a vSphere with Tanzu Supervisor Cluster on vSphere 7, use the Tanzu CLI version from TKG v1.6. To use the versions of the Tanzu CLI that are compatible with TKG 2.x with Supervisor, upgrade to vSphere 8. You can deploy a standalone TKG 2.x management cluster to vSphere 7 if a vSphere with Tanzu Supervisor Cluster is not present. For information about compatibility between the Tanzu CLI and VMware products, see the Tanzu CLI Documentation.

What’s New

Tanzu Kubernetes Grid v2.2.x includes the following new features:

  • Support for Kubernetes v1.25.7, 1.24.11, and 1.23.17. See Supported Kubernetes Versions in Tanzu Kubernetes Grid v2.2 below.
  • You can install the FIPS version of TKG v2.2.0. For more information, see FIPS-Enabled Versions in VMware Tanzu Compliance.
  • For standalone management clusters and class-based workload clusters, you can configure trust for multiple private image registries using ADDITIONAL_IMAGE_REGISTRY* variables in a cluster configuration file or additionalImageRegistries settings in a Cluster object specification. See Trusted Registries for a Class-Based Cluster, and the configuration sketch after this list.
  • Removes the jobservice.scandataExports persistent volume claim from Harbor v2.7.1. If you previously applied the Harbor Scandata Volume EmptyDir Overlay to the Harbor package, see Update a Running Harbor Deployment before updating the Harbor package to v2.7.1.
  • From TKG 2.2.0, VMware offers Runtime Level Support for VMware Supported Packages such as Harbor, Contour, and Velero when they are deployed on Tanzu Kubernetes Grid.
  • vSphere CSI package supports vsphereCSI.netpermissions configuration.
  • (AWS) AWS EBS CSI driver is automatically installed on newly-created workload clusters.
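
An illustrative cluster configuration file fragment for the multiple-registry trust feature above (a sketch only: it assumes variable names of the form ADDITIONAL_IMAGE_REGISTRY_n, with companion _SKIP_TLS_VERIFY and _CA_CERTIFICATE settings, and placeholder registry addresses; see Trusted Registries for a Class-Based Cluster for the authoritative variable list):

  ADDITIONAL_IMAGE_REGISTRY_1: registry1.example.com
  ADDITIONAL_IMAGE_REGISTRY_1_SKIP_TLS_VERIFY: false
  ADDITIONAL_IMAGE_REGISTRY_1_CA_CERTIFICATE: BASE64-ENCODED-CA-CERTIFICATE
  ADDITIONAL_IMAGE_REGISTRY_2: registry2.example.com
  ADDITIONAL_IMAGE_REGISTRY_2_SKIP_TLS_VERIFY: true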

Supported Kubernetes and TKG Versions

With TKG v2.2, VMware’s support policy changes for older patch versions of TKG and TKrs, which package Kubernetes versions for TKG. Support policies for TKG v2.1 and older minor versions of TKG do not change.

The sections below summarize support for all currently-supported versions of TKG and TKrs, under the support policies that apply to each.

Supported Kubernetes Versions

Each version of Tanzu Kubernetes Grid adds support for the Kubernetes version of its management cluster, plus additional Kubernetes versions, distributed as Tanzu Kubernetes releases (TKrs), except where noted as a Known Issue.

Minor versions: VMware supports TKG v2.2 with Kubernetes v1.25, v1.24, and v1.23 at time of release and for as long as TKG v2.1 is also supported. Once TKG v2.1 reaches its End of General Support milestone, VMware will no longer support Kubernetes v1.23 and v1.24 with TKG.

Patch versions: After VMware publishes a new TKr patch version for a minor line, it retains support for older patch versions for two months. This gives customers a 2-month window to upgrade to new TKr patch versions. As of TKG v2.2, VMware does not support all TKr patch versions from previous minor lines of Kubernetes.

Supported TKG and TKr Versions

Currently-supported TKG patch versions support TKr patch versions as listed below.

Tanzu Kubernetes Grid Version | Management Cluster Kubernetes Version | Provided Kubernetes (TKr) Versions
2.2.0 | 1.25.7 | 1.25.7, 1.24.11, 1.23.17
2.1.1 | 1.24.10 | 1.24.10, 1.23.16, 1.22.17
2.1.0 | 1.24.9 | 1.24.9, 1.23.15, 1.22.17
1.6.1 | 1.23.10 | 1.23.10, 1.22.13, 1.21.14
1.6.0 | 1.23.8 | 1.23.8, 1.22.11, 1.21.14

TKr Versions Supported with TKG v2.2.0

TKG v2.2.0 supports TKr patch versions as listed in the table below, based on the following release dates:

  • v2.2.0: May 18, 2023
  • v2.1.1: March 21, 2023

Kubernetes minor | Kubernetes patch | Released with TKG | Support end date (if not latest)
v1.25 | v1.25.7 | v2.2.0 | Latest supported
v1.24 | v1.24.11 | v2.2.0 | Latest supported
v1.24 | v1.24.10 | v2.1.1 | July 11, 2023
v1.24 | v1.24.9 | v2.1.0 | May 21, 2023
v1.23 | v1.23.17 | v2.2.0 | Latest supported
v1.23 | v1.23.16 | v2.1.1 | July 11, 2023
v1.23 | v1.23.15 | v2.1.0 | May 21, 2023

Supported Tanzu Kubernetes Grid Versions

VMware supports TKG versions as follows:

Minor versions: VMware supports TKG following the N-2 Lifecycle Policy, which applies to the latest and previous two minor versions of TKG. With the release of TKG v2.2.0, TKG v1.5 is no longer supported. See the VMware Product Lifecycle Matrix for more information.

Patch versions: VMware does not support all previous TKG patch versions. After VMware releases a new patch version of TKG, it retains support for the older patch version for two months. This gives customers a 2-month window to upgrade to new TKG patch versions.

  • For example, support for TKG v2.2.0 would end two months after the general availability of TKG v2.2.1.

Tanzu Standard Repository Package Support Clarification

VMware provides the following support for the optional packages that are provided in the VMware Tanzu Standard Repository:

  • VMware provides installation and upgrade validation for the packages that are included in the optional VMware Tanzu Standard Repository when they are deployed on Tanzu Kubernetes Grid. This validation is limited to the installation and upgrade of the package but includes any available updates required to address CVEs. Any bug fixes, feature enhancements, and security fixes are provided in new package versions when they are available in the upstream package project.
  • VMware does not provide Runtime Level Support for the components provided by the Tanzu Standard Repository. VMware does not provide debugging of configuration or performance-related issues, or debugging and fixing of the package itself.

For more information about VMware support for Tanzu Standard packages, see What’s New above and Future Behavior Change Notices below.

Product Snapshot for Tanzu Kubernetes Grid v2.2

Tanzu Kubernetes Grid v2.2 supports the following infrastructure platforms and operating systems (OSs), as well as cluster creation and management, networking, storage, authentication, backup and migration, and observability components. The component versions listed in parentheses are included in Tanzu Kubernetes Grid v2.2.0. For more information, see Component Versions.

Infrastructure platform
  • vSphere: vSphere 6.7U3, vSphere 7, vSphere 8, VMware Cloud on AWS**, Azure VMware Solution, Oracle Cloud VMware Solution (OCVS), Google Cloud VMware Engine (GCVE)
  • AWS: Native AWS
  • Azure: Native Azure

CLI, API, and package infrastructure: Tanzu Framework v0.29.0

Cluster creation and management
  • vSphere: Core Cluster API (v1.2.8), Cluster API Provider vSphere (v1.5.3)
  • AWS: Core Cluster API (v1.2.8), Cluster API Provider AWS (v2.0.2)
  • Azure: Core Cluster API (v1.2.8), Cluster API Provider Azure (v1.6.3)

Kubernetes node OS distributed with TKG
  • vSphere: Photon OS 3, Ubuntu 20.04
  • AWS: Amazon Linux 2, Ubuntu 20.04
  • Azure: Ubuntu 18.04, Ubuntu 20.04

Build your own image
  • vSphere: Photon OS 3, Red Hat Enterprise Linux 7*** and 8, Ubuntu 18.04, Ubuntu 20.04, Windows 2019
  • AWS: Amazon Linux 2, Ubuntu 18.04, Ubuntu 20.04
  • Azure: Ubuntu 18.04, Ubuntu 20.04

Container runtime: Containerd (v1.6.18)

Container networking: Antrea (v1.9.0), Calico (v3.24.1)

Container registry: Harbor (v2.7.1)

Ingress
  • vSphere: NSX Advanced Load Balancer Essentials and Avi Controller**** (v21.1.5-v21.1.6, v22.1.2-v22.1.3), Contour (v1.23.5)
  • AWS: Contour (v1.23.5)
  • Azure: Contour (v1.23.5)

Storage
  • vSphere: vSphere Container Storage Interface (v2.7.1*) and vSphere Cloud Native Storage
  • AWS: Amazon EBS CSI driver (v1.16.0) and in-tree cloud providers
  • Azure: Azure Disk CSI driver (v1.27.0), Azure File CSI driver (v1.26.1), and in-tree cloud providers

Authentication: OIDC via Pinniped (v0.12.1), LDAP via Pinniped (v0.12.1) and Dex

Observability: Fluent Bit (v1.9.5), Prometheus (v2.37.0)*****, Grafana (v7.5.17)

Backup and migration: Velero (v1.9.7)

* Version of vsphere_csi_driver. For a full list of vSphere Container Storage Interface components included in the Tanzu Kubernetes Grid v2.2 release, see Component Versions.

** For a list of VMware Cloud on AWS SDDC versions that are compatible with this release, see the VMware Product Interoperability Matrix.

*** Tanzu Kubernetes Grid v1.6 is the last release that supports building Red Hat Enterprise Linux 7 images.

**** On vSphere 8, to use NSX Advanced Load Balancer with a TKG standalone management cluster and its workload clusters, you need NSX ALB v22.1.2 or later and TKG v2.1.1 or later.

***** If you upgrade a cluster to Kubernetes v1.25, you must upgrade Prometheus to version 2.37.0+vmware.3-tkg.1. Earlier versions of the Prometheus package, for example version 2.37.0+vmware.1-tkg.1, are not compatible with Kubernetes 1.25.

For a full list of Kubernetes versions that ship with Tanzu Kubernetes Grid v2.2, see Supported Kubernetes Versions in Tanzu Kubernetes Grid v2.2 above.

Component Versions

The Tanzu Kubernetes Grid v2.2.x releases include the following software component versions:

Component TKG v2.2
aad-pod-identity v1.8.15+vmware.1*
addons-manager v2.2+vmware.1*
ako-operator v1.8.0_vmware.1*
alertmanager v0.25.0_vmware.1*
antrea v1.9.0_vmware.2-tkg.1-advanced*
aws-ebs-csi-driver v1.16.0_vmware.1-tkg.1*
azuredisk-csi-driver v1.27.0_vmware.2-tkg.1*
azurefile-csi-driver v1.26.1_vmware.2-tkg.1*
calico v3.24.1_vmware.1-tkg.2*
capabilities-package v0.29.0-dev-capabilities*
carvel-secretgen-controller v0.11.2+vmware.1
cloud-provider-azure v1.1.26+vmware.1,
v1.23.23+vmware.1,
v1.24.10+vmware.1
cloud_provider_vsphere v1.25.1+vmware.2*
cluster-api-provider-azure v1.6.3_vmware.2*
cluster_api v1.2.8+vmware.2*
cluster_api_aws v2.0.2+vmware.2*
cluster_api_vsphere v1.5.3+vmware.2*
cni_plugins v1.1.1+vmware.20*
configmap-reload v0.7.1+vmware.2
containerd v1.6.18+vmware.1*
contour v1.23.5+vmware.1-tkg.1*
coredns v1.9.3_vmware.8*
crash-diagnostics v0.3.7+vmware.6
cri_tools v1.24.2+vmware.8*
csi_attacher v3.5.0+vmware.1,
v3.4.0+vmware.1,
v3.3.0+vmware.1
csi_livenessprobe v2.7.0+vmware.1,
v2.6.0+vmware.1,
v2.5.0+vmware.1,
v2.4.0+vmware.1
csi_node_driver_registrar v2.5.1+vmware.1,
v2.5.0+vmware.1,
v2.3.0+vmware.1
csi_provisioner v3.2.1+vmware.1,
v3.1.0+vmware.2,
v3.0.0+vmware.1
dex v2.35.3_vmware.3*
envoy v1.24.5_vmware.1*
external-dns v0.12.2+vmware.5*
external-snapshotter v6.0.1+vmware.1,
v5.0.1+vmware.1
etcd v3.5.6+vmware.9*
fluent-bit v1.9.5+vmware.1
gangway v3.2.0+vmware.2
grafana v7.5.17+vmware.2
guest-cluster-auth-service v1.3.0*
harbor v2.7.1+vmware.1*
image-builder v0.1.13+vmware.3*
image-builder-resource-bundle v1.25.7+vmware.2-tkg.1*
imgpkg v0.31.1+vmware.1
jetstack_cert-manager v1.10.2+vmware.1*
k8s-sidecar v1.15.6+vmware.5*,
v1.12.1+vmware.6*
k14s_kapp v0.53.2+vmware.1
k14s_ytt v0.43.1+vmware.1
kapp-controller v0.41.7_vmware.1-tkg.1*
kbld v0.35.1+vmware.1
kube-state-metrics v2.7.0+vmware.2*
kube-vip v0.5.7+vmware.2*
kube-vip-cloud-provider* v0.0.4+vmware.4*
kube_rbac_proxy v0.11.0+vmware.2
kubernetes v1.25.7+vmware.2*
kubernetes-csi_external-resizer v1.4.0+vmware.1,
v1.3.0+vmware.1
kubernetes-sigs_kind v1.25.7+vmware.2-tkg.2_v0.17.0*
kubernetes_autoscaler v1.25.0+vmware.1*
load-balancer-and-ingress-service (AKO) v1.9.3_vmware.1-tkg.1*
metrics-server v0.6.2+vmware.1
multus-cni v3.8.0+vmware.3*
pinniped v0.12.1_vmware.3-tkg.4*
pinniped-post-deploy v0.12.1+vmware.2-tkg.3
prometheus v2.37.0+vmware.3*
prometheus_node_exporter v1.4.0+vmware.3*
pushgateway v1.4.3+vmware.3*
sonobuoy v0.56.16+vmware.1*
standalone-plugins-package v0.29.0-dev-standalone-plugins*
tanzu-framework v0.29.0*
tanzu-framework-addons v0.29.0*
tanzu-framework-management-packages v0.29.0-tf*
tkg-bom v2.2.0*
tkg-core-packages v1.25.7+vmware.2-tkg.1*
tkg-standard-packages v2.2.0*
tkg-storageclass-package v0.29.0-tkg-storageclass*
tkg_telemetry v2.2.0+vmware.1*
velero v1.9.7+vmware.1*
velero-mgmt-cluster-plugin* v0.1.0+vmware.1
velero-plugin-for-aws v1.5.5+vmware.1*
velero-plugin-for-csi v0.3.5+vmware.1*
velero-plugin-for-microsoft-azure v1.5.5+vmware.1*
velero-plugin-for-vsphere v1.4.3+vmware.1*
vendir v0.30.1+vmware.1
vsphere_csi_driver v2.7.1+vmware.2*
whereabouts v0.5.4+vmware.2*

* Indicates a new component or version bump since the previous release. TKG v2.1.1 is the release previous to v2.2.0.

For a complete list of software component versions that ship with TKG v2.2, use imgpkg to pull the repository bundle and then list its contents. For TKG v2.2.0, for example:

imgpkg pull -b projects.registry.vmware.com/tkg/packages/standard/repo:v2.2.0 -o standard-2.2.0
cd standard-2.2.0/packages
tree

Local BOM files such as the following also list package versions, but may not be current:

  • ~/.config/tanzu/tkg/bom/tkg-bom-v2.2.0.yaml
  • ~/.config/tanzu/tkg/bom/tkr-bom-v1.25.7+vmware.1-tkg.1.yaml

Supported Upgrade Paths

In the TKG upgrade path, v2.2 immediately follows v2.1.1.

You can only upgrade to Tanzu Kubernetes Grid v2.2.x from v2.1.x. If you want to upgrade to Tanzu Kubernetes Grid v2.2.x from a version earlier than v2.1.x, you must upgrade to v2.1.x first.

When upgrading Kubernetes versions on workload clusters, you cannot skip minor versions. For example, you cannot upgrade a Tanzu Kubernetes cluster directly from v1.23.x to v1.25.x. You must upgrade a v1.23.x cluster to v1.24.x before upgrading the cluster to v1.25.x.
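
For example, a hedged sketch of stepping a workload cluster through minor versions with the Tanzu CLI (the cluster name and TKr names below are illustrative placeholders; list the TKrs available in your environment with tanzu kubernetes-release get):

  # List available Tanzu Kubernetes releases (TKrs) and their compatibility.
  tanzu kubernetes-release get
  # Upgrade from v1.23.x to v1.24.x first (TKr name is illustrative).
  tanzu cluster upgrade my-cluster --tkr v1.24.11---vmware.2-tkg.1
  # Then upgrade from v1.24.x to v1.25.x.
  tanzu cluster upgrade my-cluster --tkr v1.25.7---vmware.2-tkg.1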

Release Dates

Tanzu Kubernetes Grid v2.2 release dates are:

  • v2.2.0: May 9, 2023

Behavior Changes in Tanzu Kubernetes Grid v2.2

Tanzu Kubernetes Grid v2.2 does not introduce any behavior changes compared with v2.1.1, which is the latest previous release.

Future Behavior Change Notices

This section provides advance notice of behavior changes that will take effect in future releases, after the TKG v2.2 release.

VMware Tanzu Standard Repository

Important

Tanzu Kubernetes Grid v2.2.x is the final minor release of TKG in which the VMware Tanzu Standard repository is packaged as part of the release. TKG v2.2.x and previous releases include a set of optional packages in the Tanzu Standard Repository that you can deploy on clusters to add functionality such as log forwarding, ingress control, a container registry, and so on. In a future TKG v2.x minor release, the Tanzu Standard Repository will not be automatically downloaded when you install the Tanzu CLI and deploy a management cluster. To use this optional set of packages, you will use the Tanzu CLI to download and add them manually. Separating the optional packages from TKG releases will allow VMware to provide incremental package updates outside of TKG releases and to be more responsive to CVEs.

Deprecation Notices

In a future release of TKG, the Dex component will be removed because it is no longer needed for Pinniped to work with LDAP providers. With this change, the following cluster configuration variables for LDAP authentication will become required: LDAP_BIND_DN and LDAP_BIND_PASSWORD. For more information, see Identity Providers - LDAP in the Configuration File Variable Reference.

User Documentation

Deploying and Managing TKG 2.2 Standalone Management Clusters includes topics specific to standalone management clusters that are not relevant to using TKG with a vSphere with Tanzu Supervisor.

For more information, see Find the Right TKG Docs for Your Deployment on the VMware Tanzu Kubernetes Grid Documentation page.

Resolved Issues

The following issues that were documented as Known Issues in earlier Tanzu Kubernetes Grid releases are resolved in Tanzu Kubernetes Grid v2.2.0.

  • Creating a ClusterClass config file from a legacy config file and --dry-run includes empty Antrea configuration

    Creating a ClusterClass config file by using tanzu cluster create --dry-run -f with a legacy config file that includes an ANTREA_NODEPORTLOCAL entry results in an autogenerated Antrea configuration that does not include any labels, which causes Antrea not to reconcile successfully.

  • Packages do not comply with default baseline PSA profile

    With PSA controllers on TKG, which are in an unsupported Technical Preview state, some TKG packages do not comply with the default baseline profile.

  • Validation error when running tanzu cluster create

    By default, when you pass a flat key-value configuration file to the --file option of tanzu cluster create, the command converts the configuration file into a Kubernetes-style object spec file and then exits. This behavior is controlled by the auto-apply-generated-clusterclass-based-configuration feature, which is set to false by default. In some cases, when you pass the Kubernetes-style object spec file generated by the --file option to tanzu cluster create, the command fails with an error similar to the following:

    Error: workload cluster configuration validation failed...
    

    This error may also occur when you pass a Kubernetes-style object spec file generated by the --dry-run option to tanzu cluster create.

  • tanzu cluster create does not correctly validate generated cluster specs with non-default Kubernetes versions

    When you create a class-based workload cluster from a configuration file using one of the two-step processes described in Create a Class-Based Cluster and specify a --tkr value in the first step to base the cluster on a non-default version of Kubernetes, the second step may fail with validation errors.

Known Issues

The following are known issues in Tanzu Kubernetes Grid v2.2.x. Any known issues that were present in v2.2.0 that have been resolved in a subsequent v2.2.x patch release are listed under the Resolved Issues for the patch release in which they were fixed.

You can find additional solutions to frequently encountered issues in Troubleshooting Management Cluster Issues and Troubleshooting Workload Cluster Issues, or in Broadcom Communities.

Upgrade

  • You cannot upgrade multi-OS clusters

    You cannot use the tanzu cluster upgrade command to upgrade clusters with Windows worker nodes as described in Deploy a Multi-OS Workload Cluster.

  • Before upgrade, you must manually update a changed Avi certificate in the tkg-system package values

    Management cluster upgrade fails if you have rotated an Avi Controller certificate, even if you have updated its value in the management cluster’s secret/avi-controller-ca as described in Modify the Avi Controller Credentials.

    Failure occurs because updating secret/avi-controller-ca does not copy the new value into the management cluster’s tkg-system package values, and TKG uses the certificate value from those package values during upgrade.

    For legacy management clusters created in TKG v1.x, the new value is also not copied into the ako-operator-addon secret.

    Workaround: Before upgrading TKG, check if the Avi certificate in tkg-pkg-tkg-system-values is up-to-date, and patch it if needed:

    1. In the management cluster context, get the certificate from avi-controller-ca:
      kubectl get secret avi-controller-ca -n tkg-system-networking -o jsonpath="{.data.certificateAuthorityData}"
      
    2. In the tkg-pkg-tkg-system-values secret, get and decode the package values string:
      kubectl get secret tkg-pkg-tkg-system-values -n tkg-system -o jsonpath="{.data.tkgpackagevalues\.yaml}" | base64 --decode
      
    3. In the decoded package values, check the value for avi_ca_data_b64 under akoOperatorPackage.akoOperator.config. If it differs from the avi-controller-ca value, update tkg-pkg-tkg-system-values with the new value:

      1. In a copy of the decoded package values string, paste in the new certificate from avi-controller-ca as the avi_ca_data_b64 value under akoOperatorPackage.akoOperator.config.
      2. Run base64 to re-encode the entire package values string.
      3. Patch the tkg-pkg-tkg-system-values secret with the new, encoded string:
        kubectl patch secret/tkg-pkg-tkg-system-values -n tkg-system -p '{"data": {"tkgpackagevalues.yaml": "BASE64-ENCODED STRING"}}'
        
    4. For management clusters created before TKG v2.1, if you updated tkg-pkg-tkg-system-values in the previous step, also update the ako-operator-addon secret:

      1. If needed, run the following to check whether your management cluster was created in TKG v1.x:
        kubectl get secret -n tkg-system ${MANAGEMENT_CLUSTER_NAME}-ako-operator-addon
        

        If the command outputs an ako-operator-addon object, the management cluster was created in v1.x and you need to update its secret as follows.

      2. In the ako-operator-addon secret, get and decode the values string:
        kubectl get secret ${MANAGEMENT_CLUSTER_NAME}-ako-operator-addon -n tkg-system -o jsonpath="{.data.values\.yaml}" | base64 --decode
        
      3. In a copy of the decoded values string, paste in the new certificate from avi-controller-ca as the avi_ca_data_b64 value.
      4. Run base64 to re-encode the entire ako-operator-addon values string.
      5. Patch the ako-operator-addon secret with the new, encoded string:
        kubectl patch secret/${MANAGEMENT_CLUSTER_NAME}-ako-operator-addon -n tkg-system -p '{"data": {"values.yaml": "BASE64-ENCODED STRING"}}'
        
  • Offline Upgrade Cannot Find Kubernetes v1.25.7 Packages, Fails

    During upgrade to TKG v2.2 in an internet-restricted environment using tanzu isolated-cluster commands, the procedure fails with an error such as could not resolve TKR/OSImage [...] 'v1.25.7+vmware.2-tkg.1', as described in the Knowledge Base article TKG 2.1 upgrade to TKG 2.2 fails when searching for 1.25.7 Tanzu Kubernetes Release on Air Gapped environment.

    Failure occurs because the tanzu isolated-cluster upload-bundle command does not upload two packages needed for TKr v1.25.7.

    Workaround: Manually upload the tkr-vsphere-nonparavirt-v1.25.7_vmware.2-tkg.1 and tkg-vsphere-repository-nonparavir-1.25.7_vmware.2-tkg.1 packages to your Harbor registry as described in the KB article linked above.
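
    A generic imgpkg pattern for moving an image bundle across an air gap (a hedged sketch only; the KB article linked above gives the exact bundle names, source locations, and steps for these two packages):

      # On a machine with internet access, save the bundle to a tar file.
      imgpkg copy -b SOURCE-REGISTRY/BUNDLE:TAG --to-tar bundle.tar
      # Transfer bundle.tar into the internet-restricted environment, then push it to Harbor.
      imgpkg copy --tar bundle.tar --to-repo HARBOR-FQDN/PROJECT/BUNDLE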

Packages

  • Grafana installation fails on vSphere 6.7 with NSX ALB v21.1.4 or prior

    You cannot install the Grafana v7.5.17 package to TKG v2.2 workload clusters on vSphere v6.7U3 that use NSX ALB v21.1.4 or prior as a load balancer service.

    Workaround: If your workload clusters use Grafana and NSX ALB, upgrade vSphere to v7.0+ and NSX ALB to v21.1.5+ before upgrading TKG to v2.2.

  • Harbor CVE export may fail when execution ID exceeds 1000000+

    Harbor v2.7.1, which is the version packaged for TKG v2.2, has a known issue in which CVE report exports fail with the error “404 page not found” when the execution primary key auto-increment ID grows to 1000000 or higher.

    This Harbor issue is resolved in later versions of Harbor that are slated for inclusion in later versions of TKG.

  • Golang vulnerability in the Notary component of Harbor

    CVE-2021-33194 is present in Notary. The Notary component is deprecated in Harbor v2.6+ and is scheduled to be removed in a future release as noted in the Harbor v2.6.0 release notes. VMware recommends using Sigstore Cosign instead of Notary for container signing and verification.

  • Adding standard repo fails for single-node clusters

    Running tanzu package repository add to add the tanzu-standard repo to a single-node cluster of the type described in Single-Node Clusters on vSphere (Technical Preview) may fail.

    This happens because single-node clusters boot up with cert-manager as a core add-on, which conflicts with the different cert-manager package in the tanzu-standard repo.

    Workaround: Before adding the tanzu-standard repo, patch the cert-manager package annotations as described in Install cert-manager.

Cluster Operations

  • Cannot create new workload clusters based on non-current TKr versions with Antrea CNI

    You cannot create a new workload cluster that uses Antrea CNI and runs Kubernetes versions shipped with prior versions of TKG, such as Kubernetes v1.23.10, which was the default Kubernetes version in TKG v1.6.1 as listed in Supported Kubernetes Versions in Tanzu Kubernetes Grid v2.2.

    Workaround: Create a workload cluster that runs Kubernetes 1.25.7, 1.24.11, or 1.23.17. The Kubernetes project recommends that you run components on the most recent patch version of any current minor version.

  • Autoscaler for class-based clusters requires manual annotations

    Due to a label propagation issue in Cluster API, AUTOSCALER_MIN_SIZE_* and AUTOSCALER_MAX_SIZE_* settings in the cluster configuration file for class-based workload clusters are not set in the cluster’s MachineDeployment objects.

    Workaround: After creating a class-based workload cluster with Cluster Autoscaler enabled, manually add the min- and max- machine count setting for each AZ as described in Manually Add Min and Max Size Annotations.
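
    A minimal sketch of adding the annotations with kubectl, assuming the standard Cluster API autoscaler annotation keys (the MachineDeployment and namespace names are placeholders; see Manually Add Min and Max Size Annotations for the documented procedure):

      kubectl annotate machinedeployment MACHINE-DEPLOYMENT-NAME -n NAMESPACE \
        cluster.x-k8s.io/cluster-api-autoscaler-node-group-min-size="1" \
        cluster.x-k8s.io/cluster-api-autoscaler-node-group-max-size="5"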

  • Node pool labels and other configuration properties cannot be changed

    You cannot add to or otherwise change an existing node pool’s labels, az, nodeMachineType or vSphere properties, as listed in Configuration Properties.

    Workaround: Create a new node pool in the cluster with the desired properties, migrate workloads to the new node pool, and delete the original.
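
    A hedged outline of that sequence with the Tanzu CLI (cluster, file, and pool names are placeholders; see the node pool documentation for the definition file format):

      # Create a new node pool from a definition file with the desired properties.
      tanzu cluster node-pool set CLUSTER-NAME -f new-node-pool.yaml
      # After migrating workloads, confirm the pools and delete the original.
      tanzu cluster node-pool list CLUSTER-NAME
      tanzu cluster node-pool delete CLUSTER-NAME -n ORIGINAL-POOL-NAME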

  • You cannot scale management cluster control plane nodes to an even number

    If you run tanzu cluster scale on a management cluster and pass an even number to the --controlplane-machine-count option, TKG does not scale the control plane nodes, and the CLI does not output an error. To maintain quorum, control plane node counts should always be odd.

    Workaround: Do not scale control plane node counts to an even number.
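
    For example, a valid invocation that keeps an odd control plane count (the cluster name is a placeholder; add --namespace if the cluster object is not in the default namespace):

      tanzu cluster scale CLUSTER-NAME --controlplane-machine-count 3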

  • Class-based cluster names have 25 character limit with NSX ALB as load balancer service or ingress controller

    When NSX Advanced Load Balancer (ALB) is used as a class-based cluster’s load balancer service or ingress controller with a standalone management cluster, its application names include both the cluster name and load-balancer-and-ingress-service, the internal name for the AKO package. When the combined name exceeds the 64-character limit for Avi Controller apps, the tanzu cluster create command may fail with an error that the avi-system namespace was not found.

    Workaround: Limit class-based cluster name length to 25 characters or less when using NSX ALB as a load balancer or ingress controller.

  • Orphan vSphereMachine objects after cluster upgrade or scale

    Due to a known issue in the cluster-api-provider-vsphere (CAPV) project, standalone management clusters on vSphere may leave orphaned VSphereMachine objects behind after cluster upgrade or scale operations.

    This issue is fixed in newer versions of CAPV, which future patch versions of TKG will incorporate.

    Workaround: To find and delete orphaned CAPV VM objects:

    1. List all VSphereMachine objects and identify the orphaned ones, which do not have any PROVIDERID value:
      kubectl get vspheremachines -A
      
    2. For each orphaned VSphereMachine object:

      1. List the object and retrieve its machine ID:
        kubectl get vspheremachines VSPHEREMACHINE -n NAMESPACE -o yaml
        

        Where VSPHEREMACHINE is the machine NAME and NAMESPACE is its namespace.

      2. Check if the VSphereMachine has an associated Machine object:
        kubectl get machines -n NAMESPACE |  grep MACHINE-ID
        

        Run kubectl delete machine to delete any Machine object associated with the VSphereMachine.

      3. Delete the VSphereMachine object:
        kubectl delete vspheremachines VSPHEREMACHINE -n NAMESPACE
        
      4. From vCenter, check if the VSphereMachine VM still appears; it may be present, but powered off. If so, delete it in vCenter.
      5. If the deletion hangs, patch its finalizer:
        kubectl patch vspheremachines VSPHEREMACHINE -n NAMESPACE -p '{"metadata": {"finalizers": null}}' --type=merge
        

  • On AWS and Azure, creating workload cluster with object spec fails with zone/region error

    By default, on AWS or Azure, running tanzu cluster create with a class-based cluster object spec passed to --file causes the Tanzu CLI to perform region and zone verification that is only relevant to vSphere availability zones.

    Workaround: When creating a class-based cluster on AWS or Azure, do either of the following, based on whether you use the one-step or the two-step process described in Create a Class-Based Cluster:

    • One-step: Follow the one-step process as described, by setting features.cluster.auto-apply-generated-clusterclass-based-configuration to true and not passing --dry-run to the tanzu cluster create command.

    • Two-step: Before running tanzu cluster create with the object spec as the second step, set SKIP_MULTI_AZ_VERIFY to true in your local environment:

      export SKIP_MULTI_AZ_VERIFY=true
      
  • Components fail to schedule when using clusters with limited capacity

    For management clusters and workload clusters, if you deploy clusters with a single control plane node, a single worker node, or small or medium clusters, you might encounter resource scheduling contention.

    Workaround: Use either single-node clusters or clusters with a total of three or more nodes.

Networking

Note

For v4.0+, VMware NSX-T Data Center is renamed to “VMware NSX.”

  • IPv6 networking is not supported on vSphere 8

    TKG v2.2 does not support IPv6 networking on vSphere 8, although it supports single-stack IPv6 networking using Kube-Vip on vSphere 7 as described in IPv6 Networking.

    Workaround: If you need or are currently using TKG in an IPv6 environment on vSphere, do not install or upgrade to vSphere 8.

  • NSX ALB NodePortLocal ingress mode is not supported for management cluster

    In TKG v2.2, you cannot run NSX Advanced Load Balancer (ALB) as a service type with ingress mode NodePortLocal for traffic to the management cluster.

    This issue does not affect support for NodePortLocal ingress to workload clusters, as described in L7 Ingress in NodePortLocal Mode.

    Workaround: Configure management clusters with AVI_INGRESS_SERVICE_TYPE set to either NodePort or ClusterIP. Default is NodePort.
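
    For example, in the management cluster configuration file (a minimal illustration):

      AVI_INGRESS_SERVICE_TYPE: NodePort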

  • Management cluster creation fails or performance is slow with older NSX-T versions and Photon 3 or Ubuntu with Linux kernel 5.8 VMs

    Deploying a management cluster with the following infrastructure and configuration may fail or result in restricted traffic between pods:

    • vSphere with any of the following versions of NSX-T:
      • NSX-T v3.1.3 with Enhanced Datapath enabled
      • NSX-T v3.1.x lower than v3.1.3
      • NSX-T v3.0.x lower than v3.0.2 hot patch
      • NSX-T v2.x. This includes Azure VMware Solution (AVS) v2.0, which uses NSX-T v2.5
    • Base image: Photon 3 or Ubuntu with Linux kernel 5.8

    This combination exposes a checksum issue between older versions of NSX-T and Antrea CNI.

    TMC: If the management cluster is registered with Tanzu Mission Control (TMC), there is no workaround to this issue. Otherwise, see the workarounds below.

    Workarounds:

    • Deploy workload clusters configured with ANTREA_DISABLE_UDP_TUNNEL_OFFLOAD set to "true", as illustrated after this list. This setting deactivates Antrea’s UDP checksum offloading, which avoids the known issues with some underlay network and physical NIC network drivers.
    • Upgrade to NSX-T v3.0.2 Hot Patch, v3.1.3, or later, without Enhanced Datapath enabled.
    • Use an Ubuntu base image with Linux kernel 5.9 or later.
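
    A minimal illustration of the first workaround, in the workload cluster configuration file (note that the value is the quoted string "true"):

      ANTREA_DISABLE_UDP_TUNNEL_OFFLOAD: "true"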

Identity and Access Management

  • Re-creating standalone management cluster does not restore Pinniped authentication

    After you re-create a standalone management cluster as described in Back Up and Restore Management and Workload Cluster Infrastructure (Technical Preview), users cannot log in to workload clusters via Pinniped authentication.

    Workaround: After re-creating the management cluster, re-configure identity management as described in Enable and Configure Identity Management in an Existing Deployment.

  • Golang vulnerability in the Pinniped component

    CVE-2022-41723 is present in Pinniped v0.12.1. The exploitation of this vulnerability could result in a DDoS attack of the Pinniped service. To reduce the exploitation probability to low, you can use ingress filtering to allow traffic only from known IP addresses, such as a corporate VPN CIDR. This vulnerability will be resolved in a future release of Tanzu Kubernetes Grid.

  • Generating and using kubeconfig for Supervisor-deployed workload cluster causes “unknown flag” error

    In Tanzu CLI v0.25.0 and prior, the cluster plugin generates kubeconfig files that may contain the argument --concierge-is-cluster-scoped. The Tanzu CLI v0.29.0 pinniped-auth plugin does not recognize this argument. This temporary regression has been fixed in subsequent versions.

    Because vSphere 8.0 embeds the v0.25.0 cluster plugin, connecting to a Supervisor-deployed cluster from CLI v0.29.0, the version in TKG v2.2, may generate this error:

    Error: unknown flag: --concierge-is-cluster-scoped
    Unable to connect to the server: getting credentials: exec: executable tanzu failed with exit code 1
    

    Workaround: In your kubeconfig file, remove the line under args that sets --concierge-is-cluster-scoped.
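
    An illustrative fragment of the affected kubeconfig section (other arguments and values will differ in your file; only the flagged line needs to be removed, and the exact form of the flag may vary):

      users:
      - name: USER-NAME
        user:
          exec:
            command: tanzu
            args:
            - pinniped-auth
            - login
            # ... other arguments ...
            - --concierge-is-cluster-scoped=true   # remove this line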

Storage

  • Workload cluster cannot distribute storage across multiple datastores

    You cannot enable a workload cluster to distribute storage across multiple datastores as described in Deploy a Cluster that Uses a Datastore Cluster. If you tag multiple datastores in a datastore cluster as the basis for a workload cluster’s storage policy, the workload cluster uses only one of the datastores.

    Workaround: None

Tanzu CLI

  • Non-alphanumeric characters cannot be used in HTTP/HTTPS proxy passwords

    When deploying management clusters with the CLI, the non-alphanumeric characters # ` ^ | / ? % ^ { [ ] } \ " < > cannot be used in HTTP/HTTPS proxy passwords. Also, no non-alphanumeric characters can be used in HTTP/HTTPS proxy passwords when deploying a management cluster with the UI.

    Workaround: You can use non-alphanumeric characters other than # ` ^ | / ? % ^ { [ ] } \ " < > in HTTP/HTTPS proxy passwords when deploying a management cluster with the CLI.

  • Tanzu CLI does not work on macOS machines with ARM processors

    Tanzu CLI v0.11.6 does not work on macOS machines with ARM (Apple M1) chips, as identified under Finder > About This Mac > Overview.

    Workaround: Use a bootstrap machine with a Linux or Windows OS, or a macOS machine with an Intel processor.

  • Tanzu CLI lists tanzu management-cluster osimage

    The management-cluster command group lists tanzu management-cluster osimage. This feature is currently in development and reserved for future use.

    Workaround: Do not use tanzu management-cluster osimage.

  • --default-values-file-output option of tanzu package available get outputs an incomplete configuration template file for the Harbor package

    Running tanzu package available get harbor.tanzu.vmware.com/PACKAGE-VERSION --default-values-file-output FILE-PATH creates an incomplete configuration template file for the Harbor package. To get a complete file, use the imgpkg pull command as described in Install Harbor for Service Registry.
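
    A hedged sketch of the imgpkg approach (the package version, namespace, and paths are illustrative; follow Install Harbor for Service Registry for the documented steps):

      # Look up the image bundle for the Harbor package version you plan to install.
      kubectl get package harbor.tanzu.vmware.com.2.7.1+vmware.1-tkg.1 -n tkg-system \
        -o jsonpath='{.spec.template.spec.fetch[0].imgpkgBundle.image}'
      # Pull that bundle and copy its default values file as a complete starting point.
      imgpkg pull -b IMAGE-FROM-PREVIOUS-COMMAND -o /tmp/harbor-package
      cp /tmp/harbor-package/config/values.yaml harbor-data-values.yaml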

vSphere

  • Node pools created with small nodes may stall at Provisioning

    Node pools created with node SIZE configured as small may become stuck in the Provisioning state and never proceed to Running.

    Workaround: Configure node pools with nodes of at least medium size.

AWS

  • CAPA resource tagging issue causes reconciliation failure during AWS management cluster deploy and upgrade.

    Due to a resource tagging issue in upstream Cluster API Provider AWS (CAPA), offline deployments cannot access the ResourceTagging API, causing reconciliation failures during management cluster creation or upgrade.

    Workaround: In an offline AWS environment, set EXP_EXTERNAL_RESOURCE_GC=false in your local environment or in the management cluster configuration file before running tanzu mc create or tanzu mc upgrade.
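
    For example, a minimal illustration of the two options:

      # In the shell, before running tanzu mc create or tanzu mc upgrade:
      export EXP_EXTERNAL_RESOURCE_GC=false

      # Or, as a line in the management cluster configuration file:
      EXP_EXTERNAL_RESOURCE_GC: false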

  • Workload cluster node pools on AWS must be in the same availability zone as the standalone management cluster.

    When creating a node pool configured with an az that is different from where the management cluster is located, the new node pool may remain stuck with status ScalingUp, as listed by tanzu cluster node-pool list, and never reach the Ready state.

    Workaround: Only create node pools in the same AZ as the standalone management cluster.

  • Deleting cluster on AWS fails if cluster uses networking resources not deployed with Tanzu Kubernetes Grid.

    The tanzu cluster delete and tanzu management-cluster delete commands may hang with clusters that use networking resources created by the AWS Cloud Controller Manager independently from the Tanzu Kubernetes Grid deployment process. Such resources may include load balancers and other networking services, as listed in The Service Controller in the Kubernetes AWS Cloud Provider documentation.

    For more information, see the Cluster API issue Drain workload clusters of service Type=Loadbalancer on teardown.

    Workaround: Use kubectl delete to delete services of type LoadBalancer from the cluster, as sketched after the caution below. Or, if that fails, use the AWS console to manually delete any LoadBalancer and SecurityGroup objects created for this service by the Cloud Controller Manager.

    Caution

    Do not delete load balancers or security groups managed by Tanzu, which have the tags key: sigs.k8s.io/cluster-api-provider-aws/cluster/CLUSTER-NAME, value: owned.
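
    A minimal sketch of that kubectl cleanup (service and namespace names are placeholders; heed the caution above about Tanzu-managed resources):

      # List services of type LoadBalancer across all namespaces.
      kubectl get services --all-namespaces | grep LoadBalancer
      # Delete each such service that was created independently of TKG.
      kubectl delete service SERVICE-NAME -n NAMESPACE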

Azure

  • Cluster delete fails when storage volume uses account with private endpoint

    With an Azure workload cluster in an unmanaged resource group, when the Azure CSI driver creates a persistent volume (PV) that uses a storage account with a private endpoint, it creates privateEndpoint and vNet resources that are not deleted when the PV is deleted. As a result, deleting the cluster fails with an error like subnets failed to delete. err: failed to delete resource ... Subnet management-cluster-node-subnet is in use.

    Workaround: Before deleting the Azure cluster, manually delete the network interface for the storage account private endpoint:

    1. From a browser, log in to Azure Resource Explorer.
    2. Click subscriptions at left, and expand your subscription.
    3. Under your subscription, expand resourceGroups at left, and expand your TKG deployment’s resource group.
    4. Under the resource group, expand providers > Microsoft.Network > networkinterfaces.
    5. Under networkinterfaces, select the NIC resource that is failing to delete.
    6. Click the Read/Write button at the top, and then the Actions(POST, DELETE) tab just underneath.
    7. Click Delete.
    8. Once the NIC is deleted, delete the Azure cluster.

Windows

  • Windows workers not supported in internet-restricted environments

    VMware does not support TKG workload clusters with Windows worker nodes in proxied or air-gapped environments.

    Workaround: Please contact your VMware representative. Some TKG users have built Windows custom images and run workload clusters with Windows workers in offline environments, for example as described in this unofficial repo.

Image-Builder

  • Ignorable goss test failures during image-build process

    When you run Kubernetes Image Builder to create a custom Linux machine image, the goss tests python-netifaces, python-requests, and ebtables fail. Command output reports the failures. The errors can be ignored; they do not prevent a successful image build.

AVS

  • vSphere CSI volume deletion may fail on AVS

    On Azure VMware Solution (AVS), deletion of vSphere CSI Persistent Volumes (PVs) may fail. Deleting a PV requires the cns.searchable permission. The default admin account for AVS is not created with this permission. For more information, see vSphere Roles and Privileges.

    Workaround: To delete a vSphere CSI PV on AVS, contact Azure support.
