VMware Tanzu Kubernetes Grid 1.6 Release Notes

What’s New

Tanzu Kubernetes Grid v1.6 includes the following new features:

  • On vSphere, you can create clusters of ESXi hosts with two types of NVIDIA GPU cards using PCI passthrough. See Deploy a GPU-Enabled Workload Cluster.
  • Support for edge devices. See Deploy a Workload Cluster to an Edge Device.
  • Support for CSI storage with Amazon EBS CSI driver and Azure Disk CSI driver for Kubernetes.
  • You can now use Whereabouts, an IP Address Management (IPAM) CNI plugin, with Multus CNI to dynamically assign IP addresses to pods across the cluster. For more information, see Implement Multiple Pod Network Interfaces with Multus and Whereabouts.
  • On vSphere, with NSX Advanced Load Balancer as the control plane endpoint, you can create clusters with VSPHERE_CONTROL_PLANE_ENDPOINT set to an FQDN instead of an IP address without having to apply an overlay.
  • Tanzu CLI:

    • tanzu feature activate, tanzu feature deactivate, and tanzu feature list manage features that are available in your target management cluster. For more information, see tanzu feature.
    • tanzu config set edition sets the Tanzu CLI edition. For more information, see CLI Configuration.
    • The --generate-default-values-file flag of the tanzu package available get command creates a configuration file with default values for the specified package. See Get the Details of an Available Package.
    • --local flag to tanzu plugin install lets you install plugins to your local machine for use in airgapped environments.
    • kctrl-based tanzu package commands:
      • The Tanzu CLI supports package commands based on the kapp-controller’s native CLI, kctrl, described in the Carvel docs.
      • kctrl mode is disabled by default. With kctrl mode enabled, tanzu package commands work identically to kctrl package commands.
      • kctrl mode adds the following commands, extending observability and debugging functionality:
        • package installed pause pauses the reconciliation of a package install.
        • package installed kick triggers the reconciliation of a package install.
        • package [...] status appended to the install, installed create, and installed update commands tails the status of the command operation.
      • Enable kctrl mode for tanzu package commands by running tanzu config set features.package.kctrl-package-command-tree true.
      • There are some breaking changes to the tanzu package command group when the kctrl mode is activated. For more information, see tanzu package with kctrl.
    • The Tanzu CLI telemetry plugin replaces tanzu mc ceip-participation commands with tanzu telemetry commands. See Manage Participation in CEIP for details.
    • The following commands now show Tanzu Kubernetes release (TKr) name in the output:
      • tanzu cluster list
      • tanzu cluster get
      • tanzu mc get
  • Cluster configuration variables:

    • (vSphere) If you are using NSX ALB, you can set VSPHERE_CONTROL_PLANE_ENDPOINT to an FQDN without needing a custom overlay.
    • The AVI_LABELS variable is supported for workload clusters. For more information, see NSX Advanced Load Balancer.
    • The AVI_CONTROLLER_VERSION cluster configuration variable is not needed because the AKO operator automatically detects the Avi Controller version that is in use.

    • Cluster configuration variables expand control of Antrea behavior: ANTREA_EGRESS, ANTREA_EGRESS_EXCEPT_CIDRS, ANTREA_ENABLE_USAGE_REPORTING, ANTREA_FLOWEXPORTER, ANTREA_FLOWEXPORTER_ACTIVE_TIMEOUT, ANTREA_FLOWEXPORTER_COLLECTOR_ADDRESS, ANTREA_FLOWEXPORTER_IDLE_TIMEOUT, ANTREA_FLOWEXPORTER_POLL_INTERVAL, ANTREA_IPAM, ANTREA_KUBE_APISERVER_OVERRIDE, ANTREA_MULTICAST, ANTREA_MULTICAST_INTERFACES, ANTREA_NETWORKPOLICY_STATS, ANTREA_NODEPORTLOCAL_ENABLED, ANTREA_NODEPORTLOCAL_PORTRANGE, ANTREA_PROXY_ALL, ANTREA_PROXY_LOAD_BALANCER_IPS, ANTREA_PROXY_NODEPORT_ADDRS, ANTREA_PROXY_SKIP_SERVICES, ANTREA_SERVICE_EXTERNALIP, ANTREA_TRANSPORT_INTERFACE, ANTREA_TRANSPORT_INTERFACE_CIDRS, ANTREA_TRAFFIC_ENCRYPTION_MODE, and ANTREA_WIREGUARD_PORT. For more information, see Antrea CNI Configuration.

    • Cluster configuration variables allow setting up GPU-enabled clusters: VSPHERE_CONTROL_PLANE_CUSTOM_VMX_KEYS, VSPHERE_CONTROL_PLANE_PCI_DEVICES, VSPHERE_IGNORE_PCI_DEVICES_ALLOW_LIST, VSPHERE_WORKER_PCI_DEVICES, VSPHERE_WORKER_CUSTOM_VMX_KEYS, WORKER_ROLLOUT_STRATEGY. For more information, see GPU-Enabled Clusters.

  • Package configuration variables:

    • Calico package added a new parameter calico.config.skipCNIBinaries that, if set to true, prevents Calico from overwriting the settings of existing CNI plugins during cluster upgrading. For more information, see Updating Package Configuration.
  • New Kubernetes versions, listed in Supported Kubernetes Versions in Tanzu Kubernetes Grid, below.
  • Security
    • Ubuntu 20.04 machine images for vSphere, AWS and Azure are hardened to Center for Internet Security (CIS) standards by default, with AppArmor enabled.
    • Photon OS 3 machine images are hardened to Security Technical Implementation Guides (STIG) standards by default.
    • Kernel upgrade addresses CVE-2022-0492 and CVE-2022-0847.
    • golang upgrade addresses CVE-2022-23806, CVE-2022-23772, and CVE-2022-23773.
  • Custom image building for Linux machine images adds support for RHEL 8. RHEL 7 is deprecated; Tanzu Kubernetes Grid v1.6 is the last release that supports building RHEL 7 images.
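The kctrl-mode workflow described above can be illustrated with a short session (a sketch only; the cert-manager package name and tkg-system namespace are illustrative):

```shell
# Enable kctrl mode for the tanzu package command group:
tanzu config set features.package.kctrl-package-command-tree true

# Pause reconciliation of an installed package while you debug it:
tanzu package installed pause cert-manager -n tkg-system

# Trigger reconciliation again when you are done:
tanzu package installed kick cert-manager -n tkg-system
```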

Supported Kubernetes Versions in Tanzu Kubernetes Grid v1.6

Each version of Tanzu Kubernetes Grid adds support for the Kubernetes version of its management cluster, plus additional Kubernetes versions, distributed as Tanzu Kubernetes releases (TKrs).

Any version of Tanzu Kubernetes Grid supports all TKr versions from the previous two minor lines of Kubernetes. For example, TKG v1.6.0 supports the Kubernetes versions v1.23.x, v1.22.x, and v1.21.x listed below, but not v1.20.x, v1.19.x, or v1.18.x.

Tanzu Kubernetes Grid Version | Kubernetes Version of Management Cluster | Provided Kubernetes (TKr) Versions
1.6.0                         | 1.23.8 | 1.23.8, 1.22.11, 1.21.14
1.5.4                         | 1.22.9 | 1.22.9, 1.21.11, 1.20.15
1.5.3                         | 1.22.8 | 1.22.8, 1.21.11, 1.20.15
1.5.2, 1.5.1, 1.5.0           | 1.22.5 | 1.22.5, 1.21.8, 1.20.14
1.4.2                         | 1.21.8 | 1.21.8, 1.20.14, 1.19.16
1.4.0, 1.4.1                  | 1.21.2 | 1.21.2, 1.20.8, 1.19.12

Product Snapshot for Tanzu Kubernetes Grid v1.6

Tanzu Kubernetes Grid v1.6 supports the following infrastructure platforms and operating systems (OSs), as well as cluster creation and management, networking, storage, authentication, backup and migration, and observability components. The component versions listed in parentheses are included in Tanzu Kubernetes Grid v1.6. For more information, see Component Versions.

Infrastructure platform:
  • vSphere: vSphere 6.7U3; vSphere 7; VMware Cloud on AWS***; Azure VMware Solution; Oracle Cloud VMware Solution (OCVS); Google Cloud VMware Engine (GCVE)
  • AWS: Native AWS
  • Azure: Native Azure
CLI, API, and package infrastructure: Tanzu Framework v0.25.0
Cluster creation and management: Core Cluster API (v1.1.5), with Cluster API Provider vSphere (v1.3.1), Cluster API Provider AWS (v1.2.0), or Cluster API Provider Azure (v1.4.0)
Kubernetes node OS distributed with TKG:
  • vSphere: Photon OS 3, Ubuntu 20.04
  • AWS: Amazon Linux 2, Ubuntu 20.04
  • Azure: Ubuntu 18.04, Ubuntu 20.04
Build your own image:
  • vSphere: Photon OS 3, Red Hat Enterprise Linux 7**** and 8, Ubuntu 18.04, Ubuntu 20.04, Windows 2019
  • AWS: Amazon Linux 2, Ubuntu 18.04, Ubuntu 20.04
  • Azure: Ubuntu 18.04, Ubuntu 20.04
Container runtime: Containerd (v1.6.6)
Container networking: Antrea (v1.5.3), Calico (v3.22.1)
Container registry: Harbor (v2.5.3)
Ingress:
  • vSphere: NSX Advanced Load Balancer Essentials and Avi Controller (v20.1.6, v20.1.7, v20.1.8, v20.1.9, v21.1.4)*, Contour (v1.20.2)
  • AWS: Contour (v1.20.2)
  • Azure: Contour (v1.20.2)
Storage:
  • vSphere: vSphere Container Storage Interface (v2.5.2**) and vSphere Cloud Native Storage
  • AWS: Amazon EBS CSI driver (v1.8.0) and in-tree cloud providers
  • Azure: Azure Disk CSI driver for Kubernetes (v1.19.0) and in-tree cloud providers
Authentication: OIDC via Pinniped (v0.12.1); LDAP via Pinniped (v0.12.1) and Dex
Observability: Fluent Bit (v1.8.15), Prometheus (v2.36.2), Grafana (v7.5.16)
Backup and migration: Velero (v1.8.1)

NOTES:

  • * NSX Advanced Load Balancer Essentials is supported on vSphere 6.7U3, vSphere 7, and VMware Cloud on AWS. You can download it from the Download VMware Tanzu Kubernetes Grid page.
  • ** Version of vsphere_csi_driver. For a full list of vSphere Container Storage Interface components included in the Tanzu Kubernetes Grid v1.6 release, see Component Versions.
  • *** For a list of VMware Cloud on AWS SDDC versions that are compatible with this release, see the VMware Product Interoperability Matrix.
  • **** Tanzu Kubernetes Grid v1.6 is the last release that supports building Red Hat Enterprise Linux 7 images.

For a full list of Kubernetes versions that ship with Tanzu Kubernetes Grid v1.6, see Supported Kubernetes Versions in Tanzu Kubernetes Grid v1.6 above.

Component Versions

The Tanzu Kubernetes Grid v1.6.0 release includes the following software component versions:

Component TKG v1.6.0
aad-pod-identity v1.8.0+vmware.1
addons-manager v1.5.0_vmware.1-tkg.5
ako-operator v1.6.0+vmware.16*
alertmanager v0.24.0+vmware.1*
antrea v1.5.3_tkg.1*
aws-ebs-csi-driver* v1.8.0+vmware.1
azuredisk-csi-driver* v1.19.0+vmware.1
byoh-k8s-ubuntu-2004* v1.23.8+vmware.2-tkg.1
calico_all v3.22.1+vmware.1*
capabilities-package* v0.25.0-23-g6288c751-capabilities
carvel-secretgen-controller v0.9.1+vmware.1*
cloud-provider-azure v0.7.4+vmware.1
cloud_provider_vsphere v1.23.1+vmware.1*
cluster-api-provider-azure v1.4.0+vmware.2*
cluster_api v1.1.5+vmware.1*
cluster_api_aws v1.2.0+vmware.1
cluster_api_vsphere v1.3.1+vmware.1*
cni_plugins v1.1.1+vmware.6*
configmap-reload v0.7.1+vmware.1*
containerd v1.6.6+vmware.2*
contour v1.20.2+vmware.1*, v1.18.2+vmware.1, v1.17.2+vmware.1
coredns v1.8.6+vmware.7*
crash-diagnostics v0.3.7+vmware.5
cri_tools v1.22.0+vmware.8*
csi_attacher v3.4.0+vmware.1*, v3.3.0+vmware.1
csi_livenessprobe v2.6.0+vmware.1*, v2.5.0+vmware.1*, v2.4.0+vmware.1
csi_node_driver_registrar v2.5.1+vmware.1*, v2.5.0+vmware.1*, v2.3.0+vmware.1
csi_provisioner v3.0.0+vmware.1
dex v3.1.0+vmware.2*, v2.30.2+vmware.1
envoy v1.21.3_vmware.1*, v1.19.1+vmware.1, v1.18.4+vmware.1
external-dns v0.10.0+vmware.1
external-snapshotter* v6.0.1+vmware.1*, v5.0.1+vmware.1
etcd v3.5.4_vmware.6*
fluent-bit v1.8.15+vmware.1*
gangway v3.2.0+vmware.2
grafana v7.5.16+vmware.1*
guest-cluster-auth-service* v1.0.0
harbor v2.5.3+vmware.1*
image-builder v0.1.12+vmware.2*
image-builder-resource-bundle* v1.23.8+vmware.2-tkg.1
imgpkg v0.29.0+vmware.1*
jetstack_cert-manager v1.5.3+vmware.4*
k8s-sidecar v1.15.6+vmware.1*
k14s_kapp v0.49.0+vmware.1*
k14s_ytt v0.41.1+vmware.1*
kapp-controller v0.38.4+vmware.1*
kbld v0.34.0+vmware.1*
kube-state-metrics v2.5.0+vmware.1*
kube-vip v0.4.2+vmware.1*
kube_rbac_proxy v0.11.0+vmware.2*
kubernetes v1.23.8+vmware.2*
kubernetes-csi_external-resizer v1.4.0+vmware.1*, v1.3.0+vmware.1
kubernetes-sigs_kind v1.23.8+vmware.2-tkg.1_v0.25.0*
kubernetes_autoscaler v1.23.0+vmware.1*
load-balancer-and-ingress-service (AKO) v1.7.2+vmware.2*
metrics-server v0.6.1+vmware.1*
multus-cni v3.8.0+vmware.1*
pinniped v0.12.1+vmware.1-tkg.1
pinniped-post-deploy* v0.12.1+vmware.1-tkg.1
prometheus v2.36.2+vmware.1*
prometheus_node_exporter v1.3.1+vmware.1*
pushgateway v1.4.3+vmware.1*
standalone-plugins-package v0.25.0-standalone-plugins*
sonobuoy v0.56.6+vmware.1*
tanzu-framework v0.25.0*
tanzu-framework-addons v0.25.0-23-g6288c751*
tanzu-framework-management-packages v0.25.0*
tkg-bom v1.6.0*
tkg-core-packages v1.23.8+vmware.2-tkg.1*
tkg-standard-packages v1.6.0*
tkg-storageclass-package v0.25.0-23-g6288c751-tkg-storageclass*
tkg_telemetry v1.6.0+vmware.1*
velero v1.8.1+vmware.1
velero-plugin-for-aws v1.4.1+vmware.1
velero-plugin-for-microsoft-azure v1.4.1+vmware.1
velero-plugin-for-vsphere v1.3.1+vmware.1
vendir v0.27.0+vmware.1*
vsphere_csi_driver v2.5.2+vmware.1*
whereabouts* v0.5.1+vmware.2

* Indicates a new component or version bump since v1.5.4, which is the latest previous release.

For a complete list of software component versions that ship with Tanzu Kubernetes Grid v1.6.0, see ~/.config/tanzu/tkg/bom/tkg-bom-v1.6.0.yaml and ~/.config/tanzu/tkg/bom/tkr-bom-v1.23.8+vmware.1-tkg.1.yaml. For component versions in previous releases, see the tkg-bom- and tkr-bom- YAML files that install with those releases.

Supported Upgrade Paths

You can only upgrade to Tanzu Kubernetes Grid v1.6.x from v1.5.x. If you want to upgrade to Tanzu Kubernetes Grid v1.6.x from a version earlier than v1.5.x, you must upgrade to v1.5.x first.

When upgrading Kubernetes versions on workload clusters, you cannot skip minor versions. For example, you cannot upgrade a Tanzu Kubernetes cluster directly from v1.21.x to v1.23.x. You must upgrade a v1.21.x cluster to v1.22.x before upgrading the cluster to v1.23.x.
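A sketch of the two-step upgrade described above (cluster and TKr names are illustrative; run tanzu kubernetes-release get to list the TKr names available in your environment):

```shell
# Upgrading a workload cluster from Kubernetes v1.21.x to v1.23.x
# requires stepping through v1.22.x; skipping a minor version fails.
tanzu kubernetes-release get                       # list available TKrs
tanzu cluster upgrade my-cluster --tkr v1.22.11---vmware.2-tkg.1
# After the first upgrade completes:
tanzu cluster upgrade my-cluster --tkr v1.23.8---vmware.2-tkg.1
```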

Release Dates

Tanzu Kubernetes Grid v1.6 release dates are:

  • v1.6.0: 1 Sep 2022

Behavior Changes Between Tanzu Kubernetes Grid v1.5.4 and v1.6

Tanzu Kubernetes Grid v1.6.0 introduces the following new behavior compared with v1.5.4, which is the latest previous release.

  • The Tanzu CLI telemetry plugin replaces tanzu mc ceip-participation commands with tanzu telemetry commands. See Manage Participation in CEIP for the new commands.

User Documentation

A new publication VMware Tanzu CLI Reference describes the Tanzu CLI and includes a command reference organized by Tanzu CLI command group. Much of this content was previously published in the Tanzu Kubernetes Grid product documentation and the Tanzu Application Platform product documentation.

The Tanzu Kubernetes Grid 1.6 documentation applies to all of the 1.6.x releases. It includes information about the following subjects:

  • This Release Notes topic covers the new features and will include other information specific to 1.6.x patch versions.
  • Concepts and References introduces the key components of Tanzu Kubernetes Grid and describes how you use them and what they do.
  • Prepare to Deploy Management Clusters describes how to install the Tanzu CLI as well as the prerequisites for deploying Tanzu Kubernetes Grid on vSphere, AWS, and Microsoft Azure.
  • Deploy Management Clusters describes how to deploy Tanzu Kubernetes Grid management clusters to vSphere, AWS, and Microsoft Azure.
  • Deploy Workload Clusters describes how to use the Tanzu Kubernetes Grid CLI to deploy workload clusters from your management cluster.
  • Manage Clusters describes how to manage the lifecycle of management and workload clusters.
  • Install and Configure Packages describes how to set up local shared services in your workload clusters, such as authentication and authorization, logging, networking, and ingress control.
  • Build Machine Images describes how to build and use your own base OS images for cluster nodes.
  • Upgrade Tanzu Kubernetes Grid describes how to upgrade to this version.
  • Identity and Access Management explains how to integrate an external identity provider (IDP) and configure role-based access control (RBAC).
  • Networking includes how to configure container networking, and on vSphere, NSX Advanced Load Balancer and IPv6.
  • Security and Compliance explains how Tanzu Kubernetes Grid maintains security, and covers NIST controls assessment, audit logging, and a FIPS-capable product version.
  • Logs and Troubleshooting includes tips to help you to troubleshoot common problems that you might encounter when installing Tanzu Kubernetes Grid and deploying workload clusters.

Resolved Issues

The following issues that were documented as Known Issues in Tanzu Kubernetes Grid v1.5.4 are resolved in Tanzu Kubernetes Grid v1.6.0. For details of issues that were resolved in 1.5.x patch releases up to and including v1.5.4, see the v1.5.x Release Notes.

  • kapp-controller generates ctrl-change ConfigMap objects, even if there is no change

    The CustomResourceDefinition objects that define configurations for Calico, AKO Operator, and other packages include a status field. When the kapp-controller reconciles these CRD objects every five minutes, it interprets their status as having changed even when the package configuration did not change. This causes the kapp-controller to generate unnecessary, duplicate ctrl-change ConfigMap objects, which soon overrun their history buffer because each package saves a maximum of 200 ctrl-change ConfigMap records.

    Workaround: None

  • Host network pods and node use the wrong IP in IPv6 clusters

    When you deploy IPv6 clusters with multiple control plane nodes on vSphere and the clusters use Kubernetes 1.20.x or 1.21.x, one of your nodes, as well as the etcd, kube-apiserver, and kube-proxy pods, may take on the IP address that you set for VSPHERE_CONTROL_PLANE_ENDPOINT instead of an IP of its own. You might not see an error, but this could cause networking problems for these pods and prevent the control plane nodes from failing over properly.

  • When AVI_LABELS is set, ako-operator causes high latency on the AVI Controller

    Due to a bug in the ako-operator package, setting the AVI_LABELS variable or configuring Cluster Labels (Optional) in the Configure VMware NSX Advanced Load Balancer section of the installer interface when creating the management cluster results in the package attempting to reconcile indefinitely. This generates a high volume of events on the AVI Controller.

    Workaround: If you are experiencing this issue, follow the steps below:

    1. Pause the reconciliation of the ako-operator package:

      kubectl patch pkgi ako-operator -n tkg-system --type "json" -p '[{"op":"replace","path":"/spec/paused","value":true}]'
      
    2. Remove the cluster selector in the default AKODeploymentConfig custom resource:

      kubectl patch adc install-ako-for-all --type "json" -p='[{"op":"remove","path":"/spec/clusterSelector"}]'
      
    3. Remove the labels that you defined in AVI_LABELS or Cluster Labels (Optional) from each affected workload cluster:

      kubectl label cluster CLUSTER-NAME YOUR-AVI-LABELS-
      

      For example:

      kubectl label cluster my-workload-cluster tkg.tanzu.vmware.com/ako-enabled-
      

    The ako-operator package must remain in the paused state to persist this change.

  • With NSX ALB, cannot create cluster in NAMESPACE that has name beginning with numeric character

    On vSphere with NSX Advanced Load Balancer, creating a workload cluster from Tanzu Mission Control or by running tanzu cluster create fails if its management namespace, set by the NAMESPACE configuration variable, begins with a numeric character (0-9).

Known Issues

The following are known issues in Tanzu Kubernetes Grid v1.6.0.

Upgrade

  • Upgrade fails for clusters created with the wildcard character (*) in TKG_NO_PROXY setting

    TKG v1.6 does not allow the wildcard character (*) in cluster configuration file settings for TKG_NO_PROXY. Clusters created by previous TKG versions with this setting require special handling before upgrading, in order to avoid the error workload cluster configuration validation failed: invalid string '*' in TKG_NO_PROXY.

    Workaround: Depending on the type of cluster you are upgrading:

    • Management cluster:

      1. Switch to management cluster kubectl context.
      2. Edit the configMap kapp-controller-config:

        kubectl edit cm kapp-controller-config -n tkg-system
        
      3. Find the data.noProxy field and change its wildcard hostname by removing *. For example, change *.vmware.com to .vmware.com.

      4. Save and exit. The cluster is ready to upgrade.

    • Workload cluster:

      1. Switch to the workload cluster's kubectl context.
      2. Set environment variables for your cluster name and namespace, for example:

        CLUSTER_NAME=my-test-cluster
        NS=my-test-namespace
        
      3. Obtain and decode the kapp controller data values for the workload cluster:

        kubectl get secret "${CLUSTER_NAME}-kapp-controller-data-values" -n $NS -o json | jq -r '.data."values.yaml"' | base64 -d > "${CLUSTER_NAME}-${NS}-kapp-controller-data-values"
        
      4. Edit the ${CLUSTER_NAME}-${NS}-kapp-controller-data-values file by removing * from its kappController.config.noProxy setting. For example, change *.vmware.com to .vmware.com.

      5. Save and quit.
      6. Re-encode the data values file ${CLUSTER_NAME}-${NS}-kapp-controller-data-values:

        cat "${CLUSTER_NAME}-${NS}-kapp-controller-data-values" | base64 -w 0
        
      7. Edit the ${CLUSTER_NAME}-${NS}-kapp-controller-data-values secret and update its data.values.yaml setting by pasting in the newly encoded data values string.

        kubectl edit secret "${CLUSTER_NAME}-kapp-controller-data-values" -n "${NS}"
        
      8. Save and exit. The cluster is ready to upgrade.
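    Steps 3 through 7 above can be rehearsed locally on a sample file. A minimal, self-contained sketch (the file contents and names are invented for illustration; the re-encoded output is what you would paste back into the secret):

```shell
set -eu

# Sample decoded kapp-controller data values (illustrative content only;
# in a real cluster this file comes from the kubectl get secret step).
cat > demo-kapp-controller-data-values <<'EOF'
kappController:
  config:
    noProxy: 10.0.0.0/8,*.vmware.com
EOF

# Remove every '*' so '*.vmware.com' becomes '.vmware.com':
sed -i.bak 's/\*//g' demo-kapp-controller-data-values

# Re-encode without line wrapping, ready to paste into the secret:
base64 < demo-kapp-controller-data-values | tr -d '\n' \
  > demo-kapp-controller-data-values.b64

grep noProxy demo-kapp-controller-data-values
```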

Packages

  • Multus CNI fails on medium and smaller pods with NSX Advanced Load Balancer

    On vSphere, workload clusters with medium or smaller worker nodes running the Multus CNI package with NSX ALB can fail with Insufficient CPU or other errors.

    Workaround: To use Multus CNI with NSX ALB, deploy workload clusters with worker nodes of size large or extra-large.

Storage

  • Cluster and pod operations that delete pods may fail if DaemonSet configured to auto-restore persistent volumes

    In installations where a DaemonSet uses persistent volumes (PVs), machine deletion may fail because, by default, the drain process ignores DaemonSets and the system waits indefinitely for the volumes to be detached from the node. Affected cluster operations include upgrade, scale down, and delete.

    Workaround: To address this issue, do one of the following to each worker node in the cluster before upgrading, scaling down, or deleting the cluster:

    • Set a spec.NodeDrainTimeout value for the node. This lets the machine controller delete the node once the timeout expires, even if it has volumes attached.

    • Manually delete each pod in the node.
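    With Cluster API, the first option can be sketched as a patch on the cluster's MachineDeployment (the deployment name, namespace, and 300s value are illustrative; the field is spec.template.spec.nodeDrainTimeout):

```shell
# Give node drain a 300-second deadline so machine deletion proceeds
# even while DaemonSet-mounted volumes remain attached.
kubectl patch machinedeployment my-cluster-md-0 -n default --type merge \
  -p '{"spec":{"template":{"spec":{"nodeDrainTimeout":"300s"}}}}'
```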

CLI

  • On vSphere with Tanzu, tanzu cluster list generates error for DevOps users

    When a user with the DevOps engineer role, as described in vSphere with Tanzu User Roles and Workflows, runs tanzu cluster list, they may see an error resembling Error: unable to retrieve combined cluster info: unable to get list of clusters. User cannot list resource "clusters" at the cluster scope.

    This happens because the tanzu cluster command without a -n option attempts to access all namespaces, some of which may not be accessible to a DevOps engineer user.

    Workaround: When running tanzu cluster list, include a --namespace value to specify a namespace that the user can access.
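    For example (the namespace name is illustrative):

```shell
# List only the clusters in a namespace that the DevOps user can access:
tanzu cluster list --namespace dev-team
```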

  • Non-alphanumeric characters cannot be used in HTTP/HTTPS proxy passwords

    When deploying management clusters with the CLI, the non-alphanumeric characters # ` ^ | / ? % ^ { [ ] } \ " < > cannot be used in passwords. Also, when deploying a management cluster with the UI, no non-alphanumeric characters can be used in HTTP/HTTPS proxy passwords.

    Workaround: You can use non-alphanumeric characters other than # ` ^ | / ? % ^ { [ ] } \ " < > in HTTP/HTTPS proxy passwords when deploying a management cluster with the CLI.

  • Tanzu CLI does not work on macOS machines with ARM processors

    Tanzu CLI v0.11.6 does not work on macOS machines with ARM (Apple M1) chips, as identified under Finder > About This Mac > Overview.

    Workaround: Use a bootstrap machine with a Linux or Windows OS, or a macOS machine with an Intel processor.

  • Windows CMD: Extraneous characters in CLI output column headings

    In the Windows command prompt (CMD), Tanzu CLI command output that is formatted in columns includes extraneous characters in column headings.

    The issue does not occur in Windows Terminal or PowerShell.

    Workaround: On Windows bootstrap machines, run the Tanzu CLI from Windows Terminal.

  • Ignorable AKODeploymentConfig error during management cluster creation

    Running tanzu management-cluster create to create a management cluster with NSX ALB outputs the following error: no matches for kind "AKODeploymentConfig" in version "networking.tkg.tanzu.vmware.com/v1alpha1". The error can be ignored. For more information, see this article in the KB.

  • Ignorable machinehealthcheck and clusterresourceset errors during workload cluster creation on vSphere

    When a workload cluster is deployed to vSphere by using the tanzu cluster create command through vSphere with Tanzu, the output might include errors related to running machinehealthcheck and accessing the clusterresourceset resources, as shown below:

    Error from server (Forbidden): error when creating "/tmp/kubeapply-3798885393": machinehealthchecks.cluster.x-k8s.io is forbidden: User "sso:Administrator@vsphere.local" cannot create resource "machinehealthchecks" in API group "cluster.x-k8s.io" in the namespace "tkg"
    ...
    Error from server (Forbidden): error when retrieving current configuration of: Resource: "addons.cluster.x-k8s.io/v1beta1, Resource=clusterresourcesets", GroupVersionKind: "addons.cluster.x-k8s.io/v1beta1, Kind=ClusterResourceSet"
    ...
    

    The workload cluster is successfully created. You can ignore the errors.

  • CLI temporarily misreports status of recently deleted nodes when MHCs are disabled

    When machine health checks (MHCs) are disabled, then Tanzu CLI commands such as tanzu cluster status may not report up-to-date node state while infrastructure is being recreated.

    Workaround: None

vSphere

  • Node pools created with small nodes may stall at Provisioning

    Node pools created with node SIZE configured as small may become stuck in the Provisioning state and never proceed to Running.

    Workaround: Configure node pool with at least medium size nodes.

  • With NSX ALB, cannot create clusters with identical names

    If you are using NSX Advanced Load Balancer for workloads (AVI_ENABLE) or the control plane (AVI_CONTROL_PLANE_HA_PROVIDER), the Avi Controller may fail to distinguish between identically named clusters.

    Workaround: Set a unique CLUSTER_NAME value for each cluster:

    • Management clusters: Do not create multiple management clusters with the same CLUSTER_NAME value, even from different bootstrap machines.

    • Workload clusters: Do not create multiple workload clusters that have the same CLUSTER_NAME and are also in the same management cluster namespace, as set by their NAMESPACE value.

  • Adding external identity management to an existing deployment may require setting dummy VSPHERE_CONTROL_PLANE_ENDPOINT value

    Integrating an external identity provider with an existing TKG deployment may require setting a dummy VSPHERE_CONTROL_PLANE_ENDPOINT value in the management cluster configuration file used to create the add-on secret, as described in Generate the Pinniped Add-on Secret for the Management Cluster

AWS

  • Deleting cluster on AWS fails if cluster uses networking resources not deployed with Tanzu Kubernetes Grid.

    The tanzu cluster delete and tanzu management-cluster delete commands may hang with clusters that use networking resources created by the AWS Cloud Controller Manager independently from the Tanzu Kubernetes Grid deployment process. Such resources may include load balancers and other networking services, as listed in The Service Controller in the Kubernetes AWS Cloud Provider documentation.

    For more information, see the Cluster API issue Drain workload clusters of service Type=Loadbalancer on teardown.

    Workaround: Use kubectl delete to delete services of type LoadBalancer from the cluster. If that fails, use the AWS console to manually delete any LoadBalancer and SecurityGroup objects created for the service by the Cloud Controller Manager. Warning: Do not delete load balancers or security groups managed by Tanzu, which have the tag key sigs.k8s.io/cluster-api-provider-aws/cluster/CLUSTER-NAME with value owned.
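    The cleanup can be sketched as follows (the service name and namespace in the delete command are illustrative):

```shell
# Find Service objects of type LoadBalancer in all namespaces:
kubectl get services --all-namespaces | grep LoadBalancer

# Delete each one so the Cloud Controller Manager releases its AWS
# resources, then retry the cluster delete:
kubectl delete service my-app-lb -n my-namespace
```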

Image-Builder

  • Ignorable goss test failures during image-build process

    When you run Kubernetes Image Builder to create a custom Linux machine image, the goss tests python-netifaces, python-requests, and ebtables fail. Command output reports the failures. The errors can be ignored; they do not prevent a successful image build.

Windows Workload Clusters

  • You cannot create a Windows machine image on a macOS machine

    Due to an issue with the open-source packer utility used by Kubernetes Image Builder, you cannot build a Windows machine image on a macOS machine as described in Windows Custom Machine Images.

    Workaround: Use a Linux machine to build your custom Windows machine images.

  • Pinniped fails to reconcile on newly created Windows workload cluster

    After creating a Windows workload cluster that uses an external identity provider, you may see the following error message:

    Reconcile failed: Error (see .status.usefulErrorMessage for details)
    pinniped-supervisor pinniped-post-deploy-job - Waiting to complete (1 active, 0 failed, 0 succeeded)
    pinniped-post-deploy-job--1-kfpr5 - Pending: Unschedulable (message: 0/2 nodes are available: 1 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate, 1 node(s) had taint {os: windows}, that the pod didn't tolerate.)
    

    Workaround: Add a tolerations setting to the Pinniped secret by following the procedure in Add Pinniped Overlay in Windows Custom Machine Images.

NSX

Note: For v4.0+, VMware NSX-T Data Center is renamed to “VMware NSX.”

  • Management cluster create fails or performance slow with older NSX-T versions and Photon 3 or Ubuntu with Linux kernel 5.8 VMs

    Deploying a management cluster with the following infrastructure and configuration may fail or result in restricted traffic between pods:

    • vSphere with any of the following versions of NSX-T:
      • NSX-T v3.1.3 with Enhanced Datapath enabled
      • NSX-T v3.1.x lower than v3.1.3
      • NSX-T v3.0.x lower than v3.0.2 hot patch
      • NSX-T v2.x. This includes Azure VMware Solution (AVS) v2.0, which uses NSX-T v2.5
    • Base image: Photon 3 or Ubuntu with Linux kernel 5.8

    This combination exposes a checksum issue between older versions of NSX-T and Antrea CNI.

    TMC: If the management cluster is registered with Tanzu Mission Control (TMC) there is no workaround to this issue. Otherwise, see the workarounds below.

    Workarounds:

    • Deploy workload clusters configured with ANTREA_DISABLE_UDP_TUNNEL_OFFLOAD set to "true". This setting disables Antrea’s UDP checksum offloading, which avoids the known issues with some underlay network and physical NIC network drivers.
    • Upgrade to NSX-T v3.0.2 Hot Patch, v3.1.3, or later, without Enhanced Datapath enabled
    • Use an Ubuntu base image with Linux kernel 5.9 or later.
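    For the first workaround, the relevant cluster configuration file lines might look like this (a fragment only, with illustrative values; all other required variables are omitted):

```yaml
# Workload cluster configuration file fragment (illustrative)
CNI: antrea
ANTREA_DISABLE_UDP_TUNNEL_OFFLOAD: "true"
```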
  • Setting AVI_CONTROLLER_VERSION may cause error ako operator webhook validation fail

    In TKG v1.6, the AVI_CONTROLLER_VERSION cluster configuration variable is not needed because the AKO operator automatically detects the Avi Controller version that is in use. See the Product Snapshot for compatible Avi Controller versions.

    If you set this variable, or if you include a spec.controllerVersion setting when customizing your AKO deployment, management cluster creation or AKO customization may fail with a webhook validation failed error.

    Workaround: Do not set AVI_CONTROLLER_VERSION in a management cluster configuration file, and if you customize your AKO deployment by running kubectl apply -f with an AKODeploymentConfig object spec, do not include a spec.controllerVersion field in the spec.

AVS

  • vSphere CSI volume deletion may fail on AVS

    On Azure VMware Solution (AVS), deletion of vSphere CSI persistent volumes (PVs) may fail. Deleting a PV requires the cns.searchable permission. The default admin account for AVS, cloudadmin@vsphere.local, is not created with this permission. For more information, see vSphere Roles and Privileges.

    Workaround: To delete a vSphere CSI PV on AVS, contact Azure support.

Harbor

  • No Harbor proxy cache support

    You cannot use Harbor in proxy cache mode for running Tanzu Kubernetes Grid in an internet-restricted environment. Prior versions of Tanzu Kubernetes Grid supported the Harbor proxy cache feature.

    Workaround: None
