VMware Tanzu Kubernetes Grid 1.5.4 Release Notes

What’s New

VMware recommends that you install or upgrade to Tanzu Kubernetes Grid v1.5.4, rather than earlier v1.5 patch versions.

Except where noted, these release notes apply to all patch versions of Tanzu Kubernetes Grid, v1.5.0 through v1.5.4.

  • You can register management and workload clusters with Tanzu Mission Control. See Register Your Management Cluster with Tanzu Mission Control.
  • You can install Tanzu Application Platform (TAP) v1.0.1 on workload clusters created by Tanzu Kubernetes Grid, use TAP to deploy apps to those clusters, and manage the TAP-deployed applications with tanzu apps commands.
  • (vSphere) You can create workload clusters with Windows OS worker nodes. See Deploying a Windows Cluster.
  • Different workload clusters can run different versions of the same CLI-managed package, including the latest version and the versions of the package in your two previous installations of Tanzu Kubernetes Grid. See CLI-Managed Packages.
  • New pinniped.supervisor_svc_external_dns configuration setting supports using an FQDN as the callback URL for a Pinniped Supervisor (in v1.5.3+).
  • (vSphere) Clusters that have NSX Advanced Load Balancer (ALB) as their control plane API endpoint server can use an external identity provider for login authentication, via Pinniped.
  • (vSphere) Supports Antrea NodePortLocal mode. See L7 Ingress in NodePortLocal Mode.
  • Crash Diagnostics (Crashd) collects information about clusters and infrastructure on Microsoft Azure as well as on vSphere and Amazon EC2.
  • (Azure) Supports NVIDIA GPU machine types based on Azure NC-, NV-, and NVv3-series VMs for control plane and worker VMs. See GPU-Enabled Clusters in the Cluster API Provider Azure documentation.
  • (Azure) You can now configure Azure management and workload clusters to be private, which means their API server uses an Azure internal load balancer (ILB) and is therefore only accessible from within the cluster’s own VNET or peered VNETs. See Azure Private Clusters.
  • Tanzu CLI:
    • tanzu secret registry commands manage secrets to enable cluster access to a private container registry. See Configure Authentication to a Private Container Registry.
    • tanzu config set and tanzu config unset commands activate and deactivate CLI features and manage persistent environment variables. See Tanzu CLI Configuration.
    • tanzu plugin sync command discovers and downloads new CLI plugins that are associated with either a newer version of Tanzu Kubernetes Grid, or a package installed on your management cluster that your local CLI does not know about, for example if another user installed it. See Sync New Plugins.
    • mc alias for management-cluster. See Tanzu CLI Command Reference.
    • -v and -f flags to tanzu package installed update enable updating package configuration without updating version.
    • -p flag to tanzu cluster scale lets you specify a node pool when scaling node-pool nodes. See Update Node Pools.
    • The --machine-deployment-base option to tanzu cluster node-pool set specifies a base MachineDeployment object from which to create a new node pool.
    • (Amazon EC2) tanzu management-cluster permissions aws generate-cloudformation-template command retrieves the CloudFormation template to create the IAM resources required by Tanzu Kubernetes Grid’s account on AWS. See Permissions Set by Tanzu Kubernetes Grid.
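A few of the CLI additions above can be sketched as follows; the cluster name, node pool name, and proxy value are hypothetical examples, not defaults:

```shell
# Persist an environment variable in the Tanzu CLI configuration (illustrative value).
tanzu config set env.HTTP_PROXY http://10.0.0.1:3128

# Discover and download CLI plugins that the local CLI does not yet know about.
tanzu plugin sync

# Scale the workers of a specific node pool ("my-cluster" and "my-node-pool" are examples).
tanzu cluster scale my-cluster -p my-node-pool -w 3
```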
  • Installer interface:
    • Export Configuration button in the installer interface lets you save the management cluster configuration file to a location of your choice before you deploy the cluster.
    • Disable Verification checkbox disables certificate thumbprint verification when configuring access to your vCenter server, equivalent to setting VSPHERE_INSECURE: true in the cluster configuration file.
    • Browse File button supports specifying an SSH public key file, as an alternative to pasting the key into the text box.
    • (Amazon EC2) Automate creation of AWS CloudFormation Stack option retrieves the CloudFormation template to create the IAM resources required by Tanzu Kubernetes Grid’s account on AWS. See Permissions Set by Tanzu Kubernetes Grid.
  • Configuration variables:
    • AVI_DISABLE_STATIC_ROUTE_SYNC disables the static routing sync for AKO.
    • AVI_MANAGEMENT_CLUSTER_SERVICE_ENGINE_GROUP specifies the group name of the service engine that is to be used by AKO in the management cluster.
    • AVI_INGRESS_NODE_NETWORK_LIST describes the details of the network and the CIDRs that are used in the pool placement network for vCenter Cloud.
    • AVI_CONTROLLER_VERSION sets the NSX Advanced Load Balancer (ALB) version for NSX ALB v21.1.x deployments in Tanzu Kubernetes Grid.
    • CONTROL_PLANE_MACHINE_COUNT and WORKER_MACHINE_COUNT configuration variables customize management clusters, in addition to workload clusters.
    • CLUSTER_API_SERVER_PORT sets the port number of the Kubernetes API server, overriding default 6443, for deployments without NSX Advanced Load Balancer.
    • ANTREA_DISABLE_UDP_TUNNEL_OFFLOAD disables Antrea’s UDP checksum offloading, to avoid known issues with underlay network and physical NIC network drivers. See Antrea CNI Configuration.
    • (Azure) Configuration variable AZURE_ENABLE_ACCELERATED_NETWORKING toggles Azure accelerated networking. It defaults to true; you can set it to false on VMs with more than 4 CPUs.
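As a sketch, configuration variables such as those above can be set in the cluster configuration file or exported as environment variables before deploying a cluster; the values below are illustrative only:

```shell
# Illustrative values; adjust for your environment.
export CLUSTER_API_SERVER_PORT=6443
export ANTREA_DISABLE_UDP_TUNNEL_OFFLOAD=true
export AVI_DISABLE_STATIC_ROUTE_SYNC=true

# The variables are read when the cluster is deployed.
tanzu management-cluster create --ui
```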
  • Addresses security vulnerabilities:
    • CVE-2022-0847 (in v1.5.3+)
    • CVE-2022-0492 (in v1.5.3+)
  • New Kubernetes versions, listed in Supported Kubernetes Versions in Tanzu Kubernetes Grid, below.
  • The version numbering scheme for Tanzu Framework changed when the project became open-source. Previously, Tanzu Framework version numbers matched the Tanzu Kubernetes Grid versions that included them. Tanzu Kubernetes Grid v1.5.x uses Tanzu Framework v0.11.x.

Supported Kubernetes Versions in Tanzu Kubernetes Grid

Each version of Tanzu Kubernetes Grid adds support for the Kubernetes version of its management cluster, plus additional Kubernetes versions, distributed as Tanzu Kubernetes releases (TKrs).

Any version of Tanzu Kubernetes Grid supports all TKr versions from the previous two minor lines of Kubernetes. For example, TKG v1.5.4 supports the Kubernetes versions v1.22.x, v1.21.x, and v1.20.x listed below, but not v1.19.x, v1.18.x, or v1.17.x.

Tanzu Kubernetes Grid Version   Kubernetes Version of Management Cluster   Provided Kubernetes (TKr) Versions
1.5.4                           1.22.9                                     1.22.9, 1.20.15
1.5.3                           1.22.8                                     1.22.8, 1.21.11
1.5.2, 1.5.1, 1.5.0             1.22.5                                     1.22.5, 1.21.8
1.4.2                           1.21.8                                     1.21.8, 1.20.14, 1.19.16
1.4.0, 1.4.1                    1.21.2                                     1.21.2, 1.20.8, 1.19.12
1.3.1                           1.20.5                                     1.20.5, 1.19.9, 1.18.17
1.3.0                           1.20.4                                     1.20.4, 1.19.8, 1.18.16, 1.17.16

Product Snapshot for Tanzu Kubernetes Grid v1.5

Tanzu Kubernetes Grid v1.5 supports the following infrastructure platforms and operating systems (OSs), as well as cluster creation and management, networking, storage, authentication, backup and migration, and observability components. The component versions listed in parentheses are included in Tanzu Kubernetes Grid v1.5. For more information, see Component Versions.

Infrastructure platform
  • vSphere: vSphere 6.7U3, vSphere 7, VMware Cloud on AWS***, Azure VMware Solution
  • Amazon EC2: Native AWS
  • Azure: Native Azure
CLI, API, and package infrastructure (all platforms): Tanzu Framework v0.11.6
Cluster creation and management: Core Cluster API (v1.0.1) on all platforms, with Cluster API Provider vSphere (v1.0.2), Cluster API Provider AWS (v1.2.0), or Cluster API Provider Azure (v1.0.1)
Kubernetes node OS distributed with TKG
  • vSphere: Photon OS 3, Ubuntu 20.04
  • Amazon EC2: Amazon Linux 2, Ubuntu 20.04
  • Azure: Ubuntu 18.04, Ubuntu 20.04
Build your own image
  • vSphere: Photon OS 3, Red Hat Enterprise Linux 7, Ubuntu 18.04, Ubuntu 20.04, Windows 2019
  • Amazon EC2: Amazon Linux 2, Ubuntu 18.04, Ubuntu 20.04
  • Azure: Ubuntu 18.04, Ubuntu 20.04
Container runtime (all platforms): Containerd (v1.5.7)
Container networking (all platforms): Antrea (v1.2.3), Calico (v3.19.1)
Container registry (all platforms): Harbor (v2.3.3)
Ingress
  • vSphere: NSX Advanced Load Balancer Essentials and Avi Controller (v20.1.3, v20.1.6, v20.1.7, v21.1.3, and v21.1.4)*, Contour (v1.18.2, v1.17.2)
  • Amazon EC2 and Azure: Contour (v1.18.2, v1.17.2)
Storage
  • vSphere: vSphere Container Storage Interface (v2.4.1**) and vSphere Cloud Native Storage
  • Amazon EC2 and Azure: In-tree cloud providers only
Authentication (all platforms): OIDC via Pinniped (v0.12.1), LDAP via Pinniped (v0.12.1) and Dex
Observability (all platforms): Fluent Bit (v1.7.5), Prometheus (v2.27.0), Grafana (v7.5.7)
Backup and migration (all platforms): Velero (v1.8.1)

NOTES:

  • * NSX Advanced Load Balancer Essentials is supported on vSphere 6.7U3, vSphere 7, and VMware Cloud on AWS. You can download it from the Download VMware Tanzu Kubernetes Grid page.
  • ** Version of vsphere_csi_driver. For a full list of vSphere Container Storage Interface components included in the Tanzu Kubernetes Grid v1.5 release, see Component Versions.
  • *** For a list of VMware Cloud on AWS SDDC versions that are compatible with this release, see the VMware Product Interoperability Matrix.

For a full list of Kubernetes versions that ship with Tanzu Kubernetes Grid v1.5, see Supported Kubernetes Versions in Tanzu Kubernetes Grid above.

Component Versions

The Tanzu Kubernetes Grid v1.5 patch releases include the following software component versions. A blank cell indicates that the component version is the same as the one listed to its left for a later patch release.

Component TKG v1.5.4 TKG v1.5.3 TKG v1.5.2 TKG v1.5.1
aad-pod-identity v1.8.0+vmware.1*
addons-manager v1.5.0_vmware.1-tkg.5 v1.5.0_vmware.1-tkg.4 v1.5.0_vmware.1-tkg.3
ako-operator v1.5.0_vmware.6-tkg.1 v1.5.0_vmware.5 v1.5.0_vmware.4*
alertmanager v0.22.2+vmware.1
antrea v1.2.3+vmware.4*
cadvisor v0.39.1+vmware.1
calico_all v3.19.1+vmware.1*
carvel-secretgen-controller v0.7.1+vmware.1*
cloud-provider-azure v0.7.4+vmware.1
cloud_provider_vsphere v1.22.4+vmware.1*
cluster-api-provider-azure v1.0.2+vmware.1 v1.0.1+vmware.1*
cluster_api v1.0.1+vmware.1*
cluster_api_aws v1.2.0+vmware.1*
cluster_api_vsphere v1.0.3+vmware.1 v1.0.2+vmware.1*
cni_plugins v1.1.1+vmware.2 v0.9.1+vmware.8*
configmap-reload v0.5.0+vmware.2*
containerd v1.5.11+vmware.1 v1.5.7+vmware.1
contour v1.18.2+vmware.1,
v1.17.2+vmware.1*
coredns v1.8.4_vmware.9 v1.8.4_vmware.7*
crash-diagnostics v0.3.7+vmware.5 v0.3.7+vmware.3
cri_tools v1.21.0+vmware.7*
csi_attacher v3.3.0+vmware.1*
csi_livenessprobe v2.4.0+vmware.1*
csi_node_driver_registrar v2.3.0+vmware.1*
csi_provisioner v3.0.0+vmware.1*
dex v2.30.2+vmware.1*
envoy v1.19.1+vmware.1,
v1.18.4+vmware.1
external-dns v0.10.0+vmware.1*
etcd v3.5.4_vmware.2 v3.5.2_vmware.3 v3.5.0+vmware.7*
fluent-bit v1.7.5+vmware.2*
gangway v3.2.0+vmware.2
grafana v7.5.7+vmware.2*
harbor v2.3.3+vmware.1*
image-builder v0.1.11+vmware.3
imgpkg v0.22.0+vmware.1 v0.18.0+vmware.1*
jetstack_cert-manager v1.5.3+vmware.2*
k8s-sidecar v1.12.1+vmware.2*
k14s_kapp v0.42.0+vmware.2 v0.42.0+vmware.1*
k14s_ytt v0.37.0+vmware.1 v0.35.1+vmware.1*
kapp-controller v0.30.1_vmware.1-tkg.2 v0.30.0+vmware.1-tkg.2 v0.30.0+vmware.1-tkg.1*
kbld v0.31.0+vmware.1*
kube-state-metrics v1.9.8+vmware.1
kube-vip v0.3.3+vmware.1
kube_rbac_proxy v0.8.0+vmware.1
kubernetes v1.22.9+vmware.1 v1.22.8+vmware.1 v1.22.5+vmware.1-tkg.4 v1.22.5+vmware.1-tkg.3*
kubernetes-csi_external-resizer v1.3.0+vmware.1*
kubernetes-sigs_kind v1.22.9+vmware.1-tkg.1_v0.11.1 v1.22.8+vmware.1-tkg.1_v0.11.1 v1.22.5+vmware.1_v0.11.1*
kubernetes_autoscaler v1.22.0+vmware.1*
load-balancer-and-ingress-service (AKO) v1.6.1+vmware.4 v1.6.1+vmware.2*
metrics-server v0.5.1+vmware.1*
multus-cni** v3.7.1_vmware.2*
pinniped v0.12.1+vmware.1 v0.12.0+vmware.1*
prometheus v2.27.0+vmware.1
prometheus_node_exporter v1.1.2+vmware.1
pushgateway v1.4.0+vmware.1
standalone-plugins-package v0.11.6-1-standalone-plugins v0.11.4-1-standalone-plugins v0.11.2-standalone-plugins v0.11.1-standalone-plugins*
sonobuoy v0.54.0+vmware.1
tanzu-framework v0.11.6-1 v0.11.4-1 v0.11.2 v0.11.1*
tanzu-framework-addons v0.11.6-1 v0.11.4-1 v0.11.2 v0.11.1*
tanzu-framework-management-packages v0.11.6-1 v0.11.4-1 v0.11.2 v0.11.1*
tkg-bom v1.5.4 v1.5.3 v1.5.2 v1.5.1*
tkg-core-packages v1.22.9+vmware.1-tkg.1 v1.22.8+vmware.1-tkg.1 v1.22.5+vmware.1-tkg.4 v1.22.5+vmware.1-tkg.3*
tkg-standard-packages v1.5.4 v1.5.3 v1.5.2 v1.5.1*
tkg_telemetry v1.5.0+vmware.1*
velero v1.8.1+vmware.1 v1.7.0+vmware.1*
velero-plugin-for-aws v1.4.1+vmware.1 v1.3.0+vmware.1*
velero-plugin-for-microsoft-azure v1.4.1+vmware.1 v1.3.0+vmware.1*
velero-plugin-for-vsphere v1.3.1+vmware.1 v1.3.0+vmware.1*
vendir v0.23.1+vmware.1 v0.23.0+vmware.1*
vsphere_csi_driver v2.4.1+vmware.1*
windows-resource-bundle v1.22.9+vmware.1-tkg.1 v1.22.8+vmware.1-tkg.1 v1.22.5+vmware.1-tkg.1 v1.22.5+vmware.1-tkg.1

* Indicates a version bump or new component since v1.4.2, which is the latest release prior to v1.5.1.

The version numbering scheme for Tanzu Framework changed when the project became open-source. Previously, Tanzu Framework version numbers matched the Tanzu Kubernetes Grid versions that included them.

For a complete list of software component versions that ship with Tanzu Kubernetes Grid v1.5.4, see ~/.config/tanzu/tkg/bom/tkg-bom-v1.5.4.yaml and ~/.config/tanzu/tkg/bom/tkr-bom-v1.22.9+vmware.1-tkg.1.yaml. For component versions in previous releases, see the tkg-bom- and tkr-bom- YAML files that install with those releases.

Supported Upgrade Paths

Caution: VMware recommends not installing or upgrading to Tanzu Kubernetes Grid v1.5.0-v1.5.3, due to a data inconsistency bug in the etcd versions shipped with the Kubernetes versions that those releases use. Tanzu Kubernetes Grid v1.5.4 resolves this problem by incorporating a fixed version of etcd. For more information, see Resolved Issues below.

You can only upgrade to Tanzu Kubernetes Grid v1.5.x from v1.4.x. If you want to upgrade to Tanzu Kubernetes Grid v1.5.x from a version earlier than v1.4.x, you must upgrade to v1.4.x first.

When upgrading Kubernetes versions on Tanzu Kubernetes clusters, you cannot skip minor versions. For example, you cannot upgrade a Tanzu Kubernetes cluster directly from v1.20.x to v1.22.x. You must upgrade a v1.20.x cluster to v1.21.x before upgrading the cluster to v1.22.x.
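For example, stepping a workload cluster up one minor version at a time might look like the following sketch, where the cluster name and TKr names are placeholders:

```shell
# List the available Tanzu Kubernetes releases (TKrs) and their compatibility.
tanzu kubernetes-release get

# Upgrade one minor version at a time; TKR-V1.21-NAME and TKR-V1.22-NAME are
# placeholders for TKr names returned by the previous command.
tanzu cluster upgrade my-cluster --tkr TKR-V1.21-NAME
tanzu cluster upgrade my-cluster --tkr TKR-V1.22-NAME
```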

Behavior Changes Between Tanzu Kubernetes Grid v1.4 and v1.5

Tanzu Kubernetes Grid v1.5 introduces the following new behaviors compared with v1.4.

  • After installing the Tanzu CLI, you need to run tanzu init before deploying a management cluster.
  • You cannot use the Tanzu CLI to register management clusters with Tanzu Mission Control.
  • After you have installed the Tanzu CLI but before a management cluster has been deployed or upgraded, tanzu kubernetes-release and tanzu cluster commands, such as tanzu cluster list, are unavailable.
  • After a management cluster has been deployed in Tanzu Kubernetes Grid v1.5.0, v1.5.1, v1.5.2, or v1.5.3, you need to run tanzu plugin sync to install Tanzu CLI plugins for Tanzu Kubernetes Grid.

User Documentation

The Tanzu Kubernetes Grid v1.5 documentation applies to all of the v1.5.x releases. It includes information about the following subjects:

  • This Release Notes topic covers the new features and other information specific to 1.5.x patch versions.
    • In v1.4 and earlier, each patch release had a separate Release Notes.
  • Concepts and References introduces the key components of Tanzu Kubernetes Grid and describes how you use them and what they do.
  • Prepare to Deploy Management Clusters describes how to install the Tanzu CLI as well as the prerequisites for deploying Tanzu Kubernetes Grid on vSphere, AWS, and Microsoft Azure.
  • Deploy Management Clusters describes how to deploy Tanzu Kubernetes Grid management clusters to vSphere, AWS, and Microsoft Azure.
  • Deploy Tanzu Kubernetes Clusters describes how to use the Tanzu Kubernetes Grid CLI to deploy Tanzu Kubernetes clusters from your management cluster.
  • Manage Clusters describes how to manage the lifecycle of management and workload clusters.
  • Install and Configure Packages describes how to set up local shared services in your Tanzu Kubernetes clusters, such as authentication and authorization, logging, networking, and ingress control.
  • Build Machine Images describes how to build and use your own base OS images for cluster nodes.
  • Upgrade Tanzu Kubernetes Grid describes how to upgrade to this version.
  • Identity and Access Management explains how to integrate an external identity provider (IDP) and configure role-based access control (RBAC).
  • Networking includes how to configure container networking, and on vSphere, NSX Advanced Load Balancer and IPv6.
  • Security and Compliance explains how Tanzu Kubernetes Grid maintains security, and covers NIST controls assessment, audit logging, and a FIPS-capable product version.
  • Logs and Troubleshooting includes tips to help you to troubleshoot common problems that you might encounter when installing Tanzu Kubernetes Grid and deploying Tanzu Kubernetes clusters.

Resolved Issues

The following issues are resolved in Tanzu Kubernetes Grid v1.5 patch versions as indicated.

Resolved in v1.5.4

The following issues are resolved in Tanzu Kubernetes Grid v1.5.4.

  • Host network pods and node use the wrong IP in IPv6 clusters

    The issue described in Host network pods and node use the wrong IP in IPv6 clusters below has been resolved in TKG v1.5.4 for IPv6 clusters based on Kubernetes v1.22.x.

    It is still a Known Issue in TKG v1.5.4 for clusters based on Kubernetes v1.20.x and v1.21.x.

  • etcd v3.5.0-v3.5.2 data inconsistency issue in Kubernetes v1.22.0-v1.22.8

    Kubernetes versions 1.22.0-1.22.8, which are included in Tanzu Kubernetes Grid v1.5.0-v1.5.3, use etcd versions 3.5.0-3.5.2. These versions of etcd have a bug that can result in data corruption. VMware recommends not installing or upgrading to Tanzu Kubernetes Grid v1.5.0-v1.5.3. Fixes for this bug are incorporated into Tanzu Kubernetes Grid v1.5.4.

    If you encounter etcd data inconsistency in Tanzu Kubernetes Grid v1.5.0-v1.5.3 or for additional information and a diagnostic procedure, see the VMware Knowledge Base article etcd v3.5.0-3.5.2 can corrupt data in TKG v1.5.0-1.5.3.

  • You must run the tanzu plugin sync command after deploying a management cluster

    After a management cluster has been deployed in Tanzu Kubernetes Grid v1.5.0, v1.5.1, v1.5.2, or v1.5.3, you need to run tanzu plugin sync to install Tanzu CLI plugins.

Resolved in v1.5.3

The following issues are resolved in Tanzu Kubernetes Grid v1.5.3.

  • Unexpected VIP network separation after upgrading TKG management cluster to v1.5.2

    For management clusters that were created with AVI_MANAGEMENT_CLUSTER_VIP_NETWORK_NAME and _CIDR set to values different from AVI_DATA_NETWORK and _CIDR, upgrading to v1.5.2 changes internal settings that cause the control plane VIP networks of subsequently created workload clusters to differ from the management cluster’s VIP network. This issue causes no loss of connectivity.

    Workaround: Specify the correct control plane network for NSX ALB to discover by running the following, in the management cluster kubeconfig context:

    kubectl patch akodeploymentconfig install-ako-for-all --type "json" -p '[
      {"op":"replace","path":"/spec/controlPlaneNetwork/name","value":"AVI_MANAGEMENT_CLUSTER_VIP_NETWORK_NAME"},
      {"op":"replace","path":"/spec/controlPlaneNetwork/cidr","value":"AVI_MANAGEMENT_CLUSTER_VIP_NETWORK_CIDR"}]'
    

    Where AVI_MANAGEMENT_CLUSTER_VIP_NETWORK_NAME and _CIDR are the name and CIDR range of the control plane network that you want to assign to your management cluster’s load balancers.

  • You cannot deploy or upgrade to TKG v1.5 if you are using a registry with a self-signed certificate

    Management cluster deployment or upgrade fails when accessing an image registry with a custom certificate, such as in an internet-restricted environment.

  • Management and workload cluster upgrades fail on Azure

    On Azure, running tanzu management-cluster upgrade and tanzu cluster upgrade fails if AZURE_CLIENT_SECRET is not set as an environment variable.

    Workaround: Before running tanzu management-cluster upgrade or tanzu cluster upgrade, set the AZURE_CLIENT_SECRET environment variable. For more information, see this VMware Knowledge Base article.

  • load-balancer-and-ingress-service (AKO) package fails to reconcile on newly created Windows workload cluster

    After creating a Windows workload cluster that uses NSX Advanced Load Balancer, you may see the following error message:

    Reconcile failed: Error (see .status.usefulErrorMessage for details)
    avi-system ako-0 - Pending: Unschedulable (message: 0/2 nodes are available: 1 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate, 1 node(s) had taint {os: windows}, that the pod didn't tolerate.)
    

    Workaround: Add a tolerations setting to the AKO pod specifications by following the procedure in Add AKO Overlay in Windows Custom Machine Images.

  • Azure private workload cluster upgrades fail

    Upgrading private workload clusters on Azure from v1.4.x fails with error failed to pull and unpack image because the upgrade process does not correctly reattach the control plane nodes to the load balancer, preventing external network access for the upgraded nodes.

  • Cannot delete upgraded workload cluster

    When you delete a workload cluster that has been upgraded from v1.4.x, a finalizer on the VSphereIdentitySecret is not deleted, resulting in the workload cluster not being deleted.

  • Unstable file naming convention prevents automated retrieval of image binaries

    Automated retrieval of signed image binaries fails due to unstable file naming convention.

  • Running Tanzu commands on Windows fails with a certificate error

    When using the Tanzu CLI on Windows OS, all registry operations fail with an x509: certificate signed by unknown authority error.

    Workaround:

    1. Obtain a correct base64 encoded root certificate in PEM format from DigiCert Trusted Root Authority Certificates.

    2. Supply it as an environment variable override via the TKG_CUSTOM_IMAGE_REPOSITORY_CA_CERTIFICATE configuration variable. To do so, run the following command before running tanzu init or tanzu plugin commands:

        $env:TKG_CUSTOM_IMAGE_REPOSITORY_CA_CERTIFICATE="LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSURyekNDQXBlZ0F3SUJBZ0lRQ0R2Z1ZwQkNSckdoZFdySldaSEhTakFOQmdrcWhraUc5dzBCQVFVRkFEQmgKTVFzd0NRWURWUVFHRXdKVlV6RVZNQk1HQTFVRUNoTU1SR2xuYVVObGNuUWdTVzVqTVJrd0Z3WURWUVFMRXhCMwpkM2N1WkdsbmFXTmxjblF1WTI5dE1TQXdIZ1lEVlFRREV4ZEVhV2RwUTJWeWRDQkhiRzlpWVd3Z1VtOXZkQ0JEClFUQWVGdzB3TmpFeE1UQXdNREF3TURCYUZ3MHpNVEV4TVRBd01EQXdNREJhTUdFeEN6QUpCZ05WQkFZVEFsVlQKTVJVd0V3WURWUVFLRXd4RWFXZHBRMlZ5ZENCSmJtTXhHVEFYQmdOVkJBc1RFSGQzZHk1a2FXZHBZMlZ5ZEM1agpiMjB4SURBZUJnTlZCQU1URjBScFoybERaWEowSUVkc2IySmhiQ0JTYjI5MElFTkJNSUlCSWpBTkJna3Foa2lHCjl3MEJBUUVGQUFPQ0FROEFNSUlCQ2dLQ0FRRUE0anZoRVhMZXFLVFRvMWVxVUtLUEMzZVF5YUtsN2hMT2xsc0IKQ1NETUFaT25UakMzVS9kRHhHa0FWNTNpalNMZGh3WkFBSUVKenM0Ymc3L2Z6VHR4UnVMV1pzY0ZzM1luRm85NwpuaDZWZmU2M1NLTUkydGF2ZWd3NUJtVi9TbDBmdkJmNHE3N3VLTmQwZjNwNG1WbUZhRzVjSXpKTHYwN0E2RnB0CjQzQy9keEMvL0FIMmhkbW9SQkJZTXFsMUdOWFJvcjVINGlkcTlKb3orRWtJWUl2VVg3UTZoTCtocWtwTWZUN1AKVDE5c2RsNmdTemVSbnR3aTVtM09GQnFPYXN2K3piTVVaQmZIV3ltZU1yL3k3dnJUQzBMVXE3ZEJNdG9NMU8vNApnZFc3alZnL3RSdm9TU2lpY05veEJOMzNzaGJ5VEFwT0I2anRTajFldFgramtNT3ZKd0lEQVFBQm8yTXdZVEFPCkJnTlZIUThCQWY4RUJBTUNBWVl3RHdZRFZSMFRBUUgvQkFVd0F3RUIvekFkQmdOVkhRNEVGZ1FVQTk1UU5WYlIKVEx0bThLUGlHeHZEbDdJOTBWVXdId1lEVlIwakJCZ3dGb0FVQTk1UU5WYlJUTHRtOEtQaUd4dkRsN0k5MFZVdwpEUVlKS29aSWh2Y05BUUVGQlFBRGdnRUJBTXVjTjZwSUV4SUsrdDFFbkU5U3NQVGZyZ1QxZVhrSW95UVkvRXNyCmhNQXR1ZFhIL3ZUQkgxakx1RzJjZW5Ubm1DbXJFYlhqY0tDaHpVeUltWk9Na1hEaXF3OGN2cE9wLzJQVjVBZGcKMDZPL25Wc0o4ZFdPNDFQMGptUDZQNmZidEdiZlltYlcwVzVCamZJdHRlcDNTcCtkV09JcldjQkFJKzB0S0lKRgpQbmxVa2lhWTRJQklxRGZ2OE5aNVlCYmVyT2dPelc2c1JCYzRMMG5hNFVVK0tyazJVODg2VUFiM0x1akVWMGxzCllTRVkxUVN0ZUR3c09vQnJwK3V2RlJUcDJJbkJ1VGhzNHBGc2l2OWt1WGNsVnpEQUd5U2o0ZHpwMzBkOHRiUWsKQ0FVdzdDMjlDNzlGdjFDNXFmUHJtQUVTcmNpSXhwZzBYNDBLUE1icDFaV1ZiZDQ9Ci0tLS0tRU5EIENFUlRJRklDQVRFLS0tLS0KCg=="
    
  • Tanzu CLI secret and package commands return an error

    If you deployed clusters using an alternative to Tanzu Kubernetes Grid, such as Spectro Cloud or Tanzu Kubernetes Grid Integrated Edition, you may see the following error when running Tanzu CLI package and secret commands on an authenticated package repository:

    Error: Unable to set up rest mapper: no Auth Provider found for name "oidc"
    
  • With multiple vCenters, recreating a node can lose its IP address

    In vSphere deployments with multiple datacenters, shutting down and restarting a node from the vCenter UI sometimes lost the node’s IP address, causing its Antrea pod to crash.

    Workaround: Delete the vsphere-cloud-controller-manager pod so that it is re-created. Retrieve the pod name with kubectl get pods --namespace=kube-system and delete it with kubectl delete pod.
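The workaround above can be sketched as follows, assuming the affected cluster’s kubeconfig context is active; POD-NAME is a placeholder for the name returned by the first command:

```shell
# Find the vsphere-cloud-controller-manager pod.
kubectl get pods --namespace=kube-system | grep vsphere-cloud-controller-manager

# Delete it so that it is re-created; POD-NAME is the name found above.
kubectl delete pod POD-NAME --namespace=kube-system
```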

  • Pinniped remote authentication does not support Chrome 98 browser

    Running the procedure Authenticate Users on a Machine Without a Browser with a local machine that has Chrome 98 as the default browser generates an error instead of opening up an IdP login page.

  • Credentials update needed for vSphere passwords that contain the single-quote (') character

    Upgrading TKG on a vSphere account with a password containing the single-quote character requires running tanzu mc credentials update before running tanzu mc upgrade.

Resolved in v1.5.2

The following issues are resolved in Tanzu Kubernetes Grid v1.5.2.

  • NSX ALB setup lists identical names when multiple clusters share networks

    The Avi Controller retrieves port group information from vCenter, which does not include the port groups’ T1 router and segment associations that are set in the NSX-T dashboard. When setting up load balancing for deployments where multiple clusters share networks via NSX-T, this can mean having to choose port groups from lists with identical names in the Avi Controller UI.

    Tanzu Kubernetes Grid v1.5.2+ supports enabling the Avi Controller to retrieve network information from NSX-T. This lets users disambiguate port groups that have identical names but are attached to different T1 routers.

  • Node pool operations do not work in proxied environments

    Tanzu CLI tanzu cluster node-pool commands do not work in proxied environments.

  • Commands tanzu cluster node-pool scale and tanzu cluster node-pool delete target the wrong node pool.

    Because of an internal regex mismatch, the commands tanzu cluster node-pool scale and tanzu cluster node-pool delete sometimes operate on a node pool other than the one specified in the command.

  • Tanzu Standard repository v1.5.0 packages do not work with vSphere with Tanzu

    You cannot install Tanzu Standard repository v1.5.0 packages into Tanzu Kubernetes clusters created by using vSphere with Tanzu. The packages have not been validated for vSphere with Tanzu clusters.

Resolved in v1.5.1

The following issues are resolved in Tanzu Kubernetes Grid v1.5.1.

  • Tanzu CLI does not support node-pool commands for Tanzu Kubernetes Grid service (TKGS) clusters.

    Tanzu CLI tanzu cluster node-pool commands are not supported for Tanzu Kubernetes clusters created by using the Tanzu Kubernetes Grid service on vSphere 7.0 U3.

  • Management cluster installation and upgrade fail in airgapped environment

    In an airgapped environment, running tanzu management-cluster create or tanzu management-cluster upgrade fails when the kind process attempts to retrieve a pause v3.5 image from k8s.gcr.io.

  • Management cluster upgrade fails on AWS with Ubuntu v20.04

    On Amazon EC2, with a management cluster based on Ubuntu v20.04 nodes, running tanzu management-cluster upgrade fails after the kind process retrieves an incompatible pause version (v3.6) image from k8s.gcr.io.

  • Editing cluster resources on AWS with Calico CNI produces errors.

    Known Issue In: v1.4.0, v1.4.1

    Adding or removing a resource for a workload cluster on AWS on a TKG deployment that uses the Calico CNI produces errors if you do not manually add an ingress role.

  • CAPV controller parses datacenter incorrectly in multi-datacenter vSphere environment.

    Known Issue In: v1.4.0, v1.4.1

    During upgrade to Tanzu Kubernetes Grid v1.4.1 on vSphere, if you have multiple datacenters running within a single vCenter, the CAPV controller failed to find datacenter contents, causing upgrade failure and possible loss of data.

  • Management cluster create fails with Linux or MacOS bootstrap machines running cgroups v2 in their Linux kernel, and Docker Desktop v4.3.0 or later.

    Known Issue In: v1.4.0, v1.4.1

    Due to the version of kind that the v1.4.0 CLI uses to build its container image, bootstrap machines running cgroups v2 fail to run the image.

  • Upgrading management cluster does not automatically create tanzu-package-repo-global namespace.

    Known Issue In: v1.4.x

    When you upgrade to Tanzu Kubernetes Grid v1.4.x, you need to manually create the tanzu-package-repo-global namespace and associated package repository.

  • Changing name or location of virtual machine template for current Kubernetes version reprovisions cluster nodes when running tanzu cluster upgrade.

    Known Issue In: v1.3.x, v1.4.x

    Moving or renaming the virtual machine template in your vCenter and then running tanzu cluster upgrade causes cluster nodes to be reprovisioned with new IP addresses.

  • Unstable file naming convention prevents automated retrieval of image binaries

    Known Issue In: v1.5.0, v1.5.1, v1.5.2, v1.5.3

    Automated retrieval of signed image binaries fails due to unstable file naming convention.

Known Issues

The following are known issues in Tanzu Kubernetes Grid v1.5.4. See Resolved Issues for issues that are resolved in v1.5.4, but that applied to earlier patch versions of TKG v1.5.

Internet-Restricted Environments

  • kapp-controller crashes when upgrading a management cluster from TKG v1.5.1-2 to v1.5.3.

    In TKG v1.5.1 and v1.5.2 there is a known issue with custom registry certificates in airgapped environments that causes kapp-controller to enter a crashLoopBackoff state when upgrading a management cluster to v1.5.3. To resolve this issue, you must perform the following workaround before upgrading from v1.5.1 or v1.5.2 to v1.5.3.

    Workaround: See step 3 in Upgrade Management Cluster.

Packages

  • kapp-controller generates ctrl-change ConfigMap objects, even if there is no change

    The CustomResourceDefinition objects that define configurations for Calico, AKO Operator, and other packages include a status field. When the kapp-controller reconciles these CRD objects every five minutes, it interprets their status as having changed even when the package configuration did not change. This causes the kapp-controller to generate unnecessary, duplicate ctrl-change ConfigMap objects, which soon overrun their history buffer because each package saves a maximum of 200 ctrl-change ConfigMap records.

    Workaround: None

  • Tanzu Standard repository v1.5.0 packages do not work with TKGS

    You cannot install Tanzu Standard repository v1.5.0 packages into Tanzu Kubernetes clusters created by using the Tanzu Kubernetes Grid service; the packages have not been validated for Tanzu Kubernetes Grid service clusters. Tanzu Kubernetes Grid v1.5.0 and v1.5.1 both use the Tanzu Standard repository versioned as v1.5.0, projects.registry.vmware.com/tkg/packages/standard/repo:v1.5.0.

    Workaround: If you are running Tanzu Kubernetes Grid v1.4 packages on Tanzu Kubernetes clusters created by using the Tanzu Kubernetes Grid service, do not upgrade to v1.5 until a fix is available.

  • Shared Services Cluster Does Not Work with TKGS

    Tanzu Kubernetes Grid Service (TKGS) does not support deploying packages to a shared services cluster. Workload clusters deployed by TKGS can only use packaged services deployed to the workload clusters themselves.

    Workaround: None

Upgrade

  • Workload cluster upgrade may hang or fail due to undetached persistent volumes

    If you are upgrading your Tanzu Kubernetes clusters from Tanzu Kubernetes Grid v1.4.x to v1.5.x and you have applications on the cluster that use persistent volumes, the volumes may fail to detach and re-attach during upgrade, causing the upgrade process to hang or fail.

    The same issue may manifest when you try to scale a cluster up or down, as described in Workload cluster scaling may hang or fail due to undetached persistent volumes.

    Workaround: Follow the steps in Persistent volumes cannot attach to a new node if previous node is deleted (85213) in the VMware Knowledge Base.

  • Pinniped authentication error on workload cluster after upgrading management cluster

    When attempting to authenticate to a workload cluster associated with the upgraded management cluster, you receive an error message similar to the following:

    Error: could not complete Pinniped login: could not perform OIDC discovery for "https://IP:PORT": Get "https://IP:PORT/.well-known/openid-configuration": x509: certificate signed by unknown authority
    

    Workaround: See Pinniped Authentication Error on Workload Cluster After Management Cluster Upgrade.

  • Kapp Controller crashLoopBackoff state requires recreating Carvel API after offline upgrade from v1.5.1 or v1.5.2

    The kapp-controller component may enter a crashLoopBackoff state when upgrading TKG from v1.5.1 or v1.5.2 in an internet-restricted environment with a private registry that uses a custom certificate.

    Workaround: Run kubectl delete apiservice v1alpha1.data.packaging.carvel.dev and then run the tanzu mc upgrade command again.

  • Upgrade ignores custom bootstrap token duration

    Management cluster upgrade ignores the CAPBK_BOOTSTRAP_TOKEN_TTL setting, configured in TKG v1.4.2+ to extend the bootstrap token TTL during cluster initialization. This can cause the upgrade to time out. The default TTL is 15m.

    If you no longer have the management cluster’s configuration file, you can use kubectl to determine its CAPBK_BOOTSTRAP_TOKEN_TTL setting:

    1. In the management cluster context, run kubectl get pods -n capi-kubeadm-bootstrap-system to list the bootstrap pods.
    2. Find the name of the capi-kubeadm-bootstrap-controller-manager pod and output its description in YAML. For example: kubectl get pod -n capi-kubeadm-bootstrap-system capi-kubeadm-bootstrap-controller-manager-7ffb6dc8fc-hzm7l -o yaml
    3. Check the arguments under spec.containers[0].args. If the --bootstrap-token-ttl argument is present and set to something other than 15m (the default value), the value was customized and requires the workaround below.
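    The check in step 3 can also be scripted. This is a minimal sketch; the sample array is a hypothetical stand-in for the output of kubectl get pod -n capi-kubeadm-bootstrap-system POD-NAME -o jsonpath='{.spec.containers[0].args}':

```shell
# Extract the --bootstrap-token-ttl value from the container args.
# The sample stands in for real jsonpath output (hypothetical args).
args='["--leader-elect","--metrics-bind-addr=localhost:8080","--bootstrap-token-ttl=30m"]'
ttl=$(printf '%s' "$args" | grep -o -- '--bootstrap-token-ttl=[^",]*' | cut -d= -f2)
echo "bootstrap token TTL: ${ttl:-15m (default)}"   # → bootstrap token TTL: 30m
```

    If the argument is absent, the last line reports the 15m default instead.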

    Workaround: Before running tanzu mc upgrade, set CAPBK_BOOTSTRAP_TOKEN_TTL as an environment variable. For example:

    export CAPBK_BOOTSTRAP_TOKEN_TTL=30m
    
  • After upgrading a cluster with fewer than 3 control plane nodes, such as dev plan clusters, kapp-controller fails to reconcile its CSI package

    When viewing the status of the CSI PackageInstall resource, you may see Reconcile failed:

    kubectl get pkgi -A
    

    In CSI v2.4, the default number of CSI replicas changed from 1 to 3. For clusters with fewer than 3 control plane nodes, such as dev plan clusters with a single control plane node, this change prevents kapp-controller from matching the current number of CSI replicas to the desired state.

    Workaround: Patch deployment_replicas to match the number of control plane nodes:

    1. Retrieve the data values file corresponding to the secret for vsphere-csi-addon in the management cluster. For example:

      kubectl get secrets vsphere-csi-data-values -n tkg-system -o jsonpath={.data.values\\.yaml} | base64 -d > values.yaml
      
    2. Add this line to the end of the new values.yaml file:

      deployment_replicas: 1
      
    3. Encode the values.yaml file into base64, for example:

      cat values.yaml | base64 -w 0
      

      The example above uses -w 0 to prevent base64 from wrapping the encoded output across multiple lines. On macOS, base64 does not wrap its output and does not support -w, so omit that option.
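      As a quick sanity check, you can round-trip the value locally. This minimal sketch assumes GNU base64 (Linux); on macOS, drop the -w 0 option:

```shell
# Encode a one-line data value without line wrapping, then decode it back
# to confirm the encoding is correct before pasting it into the secret.
encoded=$(printf 'deployment_replicas: 1\n' | base64 -w 0)
echo "$encoded"
printf '%s' "$encoded" | base64 -d   # → deployment_replicas: 1
```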

    4. Edit the vsphere-csi-addon secret for the management cluster, named tkg-mgmt-vsphere-csi-addon in this example:

      kubectl edit secret -n tkg-system tkg-mgmt-vsphere-csi-addon
      
    5. In the object spec, paste in the base64-encoded output above as the value of data.values.yaml.

    6. Save and exit from kubectl edit to apply the change.

    7. Update the vsphere-csi installed package to use the new values file:

      tanzu package installed update vsphere-csi -p vsphere-csi.tanzu.vmware.com --version 2.4.1+vmware.1-tkg.1 --values-file values.yaml -n tkg-system
      
    8. Make sure the vsphere-csi PackageInstall has reconciled by getting its details and confirming its status as Reconcile succeeded:

      kubectl get pkgi vsphere-csi -n tkg-system
      
    9. Make sure the vsphere-csi app has reconciled by getting its details and confirming its status as Reconcile succeeded:

      kubectl get apps vsphere-csi -n tkg-system
      

CLI

  • Workload cluster scaling may hang or fail due to undetached persistent volumes

    If you scale a workload cluster up or down, and you have applications on the cluster that use persistent volumes, the volumes may fail to detach and re-attach during the scaling process, causing an AttachVolume.Attach failed error and timeout.

    The same issue may manifest when you try to upgrade a cluster, as described in Workload cluster upgrade may hang or fail due to undetached persistent volumes.

    Workaround: Follow the steps in Persistent volumes cannot attach to a new node if previous node is deleted (85213) in the VMware Knowledge Base.

  • Non-alphanumeric characters cannot be used in HTTP/HTTPS proxy passwords

    When deploying management clusters with the CLI, the non-alphanumeric characters # ` ^ | / ? % { [ ] } \ " < > cannot be used in HTTP/HTTPS proxy passwords. In addition, no non-alphanumeric character can be used in HTTP/HTTPS proxy passwords when deploying a management cluster with the UI.

    Workaround: You can use non-alphanumeric characters other than # ` ^ | / ? % { [ ] } \ " < > in HTTP/HTTPS proxy passwords when deploying a management cluster with the CLI.

  • Tanzu CLI does not work on macOS machines with ARM processors

    Tanzu CLI v0.11.6 does not work on macOS machines with ARM (Apple M1) chips, as identified under Finder > About This Mac > Overview.

    Workaround: Use a bootstrap machine with a Linux or Windows OS, or a macOS machine with an Intel processor.

  • Windows CMD: Extraneous characters in CLI output column headings

    In the Windows command prompt (CMD), Tanzu CLI command output that is formatted in columns includes extraneous characters in column headings.

    The issue does not occur in Windows Terminal or PowerShell.

    Workaround: On Windows bootstrap machines, run the Tanzu CLI from Windows Terminal.

  • Ignorable AKODeploymentConfig error during management cluster creation

    Running tanzu management-cluster create to create a management cluster with NSX ALB outputs the following error: no matches for kind “AKODeploymentConfig” in version “networking.tkg.tanzu.vmware.com/v1alpha1”. The error can be ignored. For more information, see this article in the KB.

  • Ignorable machinehealthcheck and clusterresourceset errors during workload cluster creation on vSphere

    When a workload cluster is deployed to vSphere by using the tanzu cluster create command through vSphere with Tanzu, the output might include errors related to running machinehealthcheck and accessing the clusterresourceset resources, as shown below:

    Error from server (Forbidden): error when creating "/tmp/kubeapply-3798885393": machinehealthchecks.cluster.x-k8s.io is forbidden: User "sso:Administrator@vsphere.local" cannot create resource "machinehealthchecks" in API group "cluster.x-k8s.io" in the namespace "tkg"
    ...
    Error from server (Forbidden): error when retrieving current configuration of: Resource: "addons.cluster.x-k8s.io/v1alpha3, Resource=clusterresourcesets", GroupVersionKind: "addons.cluster.x-k8s.io/v1alpha3, Kind=ClusterResourceSet"
    ...
    

    The workload cluster is successfully created. You can ignore the errors.

  • CLI temporarily misreports status of recently deleted nodes when MHCs are disabled

    When machine health checks (MHCs) are disabled, Tanzu CLI commands such as tanzu cluster status may not report up-to-date node state while infrastructure is being recreated.

    Workaround: None

vSphere

  • Node pools created with small nodes may stall at Provisioning

    Node pools created with node SIZE configured as small may become stuck in the Provisioning state and never proceed to Running.

    Workaround: Configure node pools with nodes of size medium or larger.

  • Host network pods and node use the wrong IP in IPv6 clusters

    This issue is resolved in Tanzu Kubernetes Grid v1.5.4 for IPv6 clusters based on Kubernetes v1.22.x.

    When you deploy IPv6 clusters based on Kubernetes v1.20.x or v1.21.x with multiple control plane nodes on vSphere, one of your nodes, as well as the etcd, kube-apiserver, and kube-proxy pods, may take on the IP you set for VSPHERE_CONTROL_PLANE_ENDPOINT instead of an IP of its own. You might not see an error, but this could cause networking problems for these pods and prevent the control plane nodes from failing over properly. To confirm this is your issue:

    1. Connect to the cluster and run kubectl get pods -A -o wide.
    2. Note the IPs for the etcd, kube-apiserver, and kube-proxy pods.
    3. Run kubectl get nodes -o wide.
    4. Note the IP for the first node in the output.
    5. Compare the IPs for the pods and node to see if they match the VSPHERE_CONTROL_PLANE_ENDPOINT you set in the cluster configuration file.
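    The comparison in step 5 can be sketched as a filter that flags pods whose IP matches the endpoint. The names and addresses below are hypothetical stand-ins for the NAME and IP columns of kubectl get pods -A -o wide:

```shell
# Flag pods that took on the VSPHERE_CONTROL_PLANE_ENDPOINT address.
endpoint='fd00:100:64::1'   # hypothetical VSPHERE_CONTROL_PLANE_ENDPOINT
sample='etcd-cp-node-1            fd00:100:64::1
kube-apiserver-cp-node-1  fd00:100:64::1
kube-proxy-x2x4z          fd00:100:96::12'
# Prints one line per pod that collides with the endpoint IP:
printf '%s\n' "$sample" | awk -v ep="$endpoint" '$2 == ep {print $1 " uses the endpoint IP"}'
```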

    Workaround: Use TKG v1.5.4 with clusters based on Kubernetes v1.22.x.

  • When AVI_LABELS is set, ako-operator causes high latency on the AVI Controller

    Due to a bug in the ako-operator package, setting the AVI_LABELS variable or configuring Cluster Labels (Optional) in the Configure VMware NSX Advanced Load Balancer section of the installer interface when creating the management cluster results in the package attempting to reconcile indefinitely. This generates a high volume of events on the AVI Controller.

    Workaround: If you are experiencing this issue, follow the steps below:

    1. Pause the reconciliation of the ako-operator package:

      kubectl patch pkgi ako-operator -n tkg-system --type "json" -p '[{"op":"replace","path":"/spec/paused","value":true}]'
      
    2. Remove the cluster selector in the default AKODeploymentConfig custom resource:

      kubectl patch adc install-ako-for-all --type "json" -p='[{"op":"remove","path":"/spec/clusterSelector"}]'
      
    3. Remove the labels that you defined in AVI_LABELS or Cluster Labels (Optional) from each affected workload cluster:

      kubectl label cluster CLUSTER-NAME YOUR-AVI-LABELS-
      

      For example:

      kubectl label cluster my-workload-cluster tkg.tanzu.vmware.com/ako-enabled-
      

    The ako-operator package must remain in the paused state to persist this change.

  • With NSX ALB, cannot create cluster in NAMESPACE that has name beginning with numeric character

    On vSphere with NSX Advanced Load Balancer, creating a workload cluster from Tanzu Mission Control or by running tanzu cluster create fails if its management namespace, set by the NAMESPACE configuration variable, begins with a numeric character (0-9).

    Workaround: Deploy workload clusters to management namespaces that do not start with numeric characters.

  • With NSX ALB, cannot create clusters with identical names

    If you are using NSX Advanced Load Balancer for workloads (AVI_ENABLE) or the control plane (AVI_CONTROL_PLANE_HA_PROVIDER), the Avi Controller may fail to distinguish between identically-named clusters.

    Workaround: Set a unique CLUSTER_NAME value for each cluster:

    • Management clusters: Do not create multiple management clusters with the same CLUSTER_NAME value, even from different bootstrap machines.

    • Workload clusters: Do not create multiple workload clusters that have the same CLUSTER_NAME and are also in the same management cluster namespace, as set by their NAMESPACE value.

AWS

  • Deleting a cluster on AWS fails if the cluster uses networking resources not deployed with Tanzu Kubernetes Grid

    The tanzu cluster delete and tanzu management-cluster delete commands may hang with clusters that use networking resources created by the AWS Cloud Controller Manager independently from the Tanzu Kubernetes Grid deployment process. Such resources may include load balancers and other networking services, as listed in The Service Controller in the Kubernetes AWS Cloud Provider documentation.

    For more information, see the Cluster API issue Drain workload clusters of service Type=Loadbalancer on teardown.

    Workaround: Use kubectl delete to delete services of type LoadBalancer from the cluster. If that fails, use the AWS console to manually delete any LoadBalancer and SecurityGroup objects that the Cloud Controller Manager created for the service. Warning: Do not delete load balancers or security groups managed by Tanzu, which have the tag key sigs.k8s.io/cluster-api-provider-aws/cluster/CLUSTER-NAME with value owned.
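    Before deleting the cluster, you can list the Service objects of type LoadBalancer that need cleanup. This is a minimal sketch; the sample lines are hypothetical stand-ins for the NAMESPACE, NAME, and TYPE columns of kubectl get services -A:

```shell
# Print NAMESPACE/NAME for every Service of type LoadBalancer, so each can
# then be removed with: kubectl delete service -n NAMESPACE NAME
sample='default       web-lb     LoadBalancer
default       api        ClusterIP
kube-system   kube-dns   ClusterIP'
printf '%s\n' "$sample" | awk '$3 == "LoadBalancer" {print $1 "/" $2}'   # → default/web-lb
```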

Windows Workload Clusters

  • Pinniped fails to reconcile on newly created Windows workload cluster

    After creating a Windows workload cluster that uses an external identity provider, you may see the following error message:

    Reconcile failed: Error (see .status.usefulErrorMessage for details)
    pinniped-supervisor pinniped-post-deploy-job - Waiting to complete (1 active, 0 failed, 0 succeeded) pinniped-post-deploy-job--1-kfpr5 - Pending: Unschedulable (message: 0/2 nodes are available: 1 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate, 1 node(s) had taint {os: windows}, that the pod didn't tolerate.)
    

    Workaround: Add a tolerations setting to the Pinniped secret by following the procedure in Add Pinniped Overlay in Windows Custom Machine Images.

NSX

  • Management cluster create fails or performance slow with older NSX-T versions and Photon 3 or Ubuntu with Linux kernel 5.8 VMs

    Deploying a management cluster with the following infrastructure and configuration may fail or result in restricted traffic between pods:

    • vSphere with any of the following versions of NSX-T:
      • NSX-T v3.1.3 with Enhanced Datapath enabled
      • NSX-T v3.1.x lower than v3.1.3
      • NSX-T v3.0.x lower than v3.0.2 hot patch
      • NSX-T v2.x. This includes Azure VMware Solution (AVS) v2.0, which uses NSX-T v2.5
    • Base image: Photon 3 or Ubuntu with Linux kernel 5.8

    This combination exposes a checksum issue between older versions of NSX-T and Antrea CNI.

    TMC: If the management cluster is registered with Tanzu Mission Control (TMC) there is no workaround to this issue. Otherwise, see the workarounds below.

    Workarounds:

    • Deploy workload clusters configured with ANTREA_DISABLE_UDP_TUNNEL_OFFLOAD set to "true". This setting disables Antrea’s UDP checksum offloading, which avoids the known issues with some underlay network and physical NIC network drivers.
    • Upgrade to NSX-T v3.0.2 Hot Patch, v3.1.3, or later, without Enhanced Datapath enabled
    • Use an Ubuntu base image with Linux kernel 5.9 or later.
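    For the first workaround, the setting goes in the workload cluster configuration file before you run tanzu cluster create; a minimal excerpt (the cluster name is hypothetical):

```yaml
# Workload cluster configuration file excerpt: disable Antrea's UDP
# checksum offloading to work around the NSX-T checksum issue.
CLUSTER_NAME: my-workload-cluster   # hypothetical name
ANTREA_DISABLE_UDP_TUNNEL_OFFLOAD: "true"
```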

AVS

  • vSphere CSI volume deletion may fail on AVS

    On Azure VMware Solution (AVS), deletion of vSphere CSI persistent volumes (PVs) may fail. Deleting a PV requires the cns.searchable permission. The default admin account for AVS, cloudadmin@vsphere.local, is not created with this permission. For more information, see vSphere Roles and Privileges.

    Workaround: To delete a vSphere CSI PV on AVS, contact Azure support.

Harbor

  • No Harbor proxy cache support

    You cannot use Harbor in proxy cache mode for running Tanzu Kubernetes Grid in an internet-restricted environment. Prior versions of Tanzu Kubernetes Grid supported the Harbor proxy cache feature.

    Workaround: None
