VMware Tanzu Kubernetes Grid | 11 MAY 2021 | Build 17994759

Check for additions and updates to these release notes.

About VMware Tanzu Kubernetes Grid

VMware Tanzu Kubernetes Grid provides enterprise organizations with a consistent, upstream-compatible, regional Kubernetes substrate across SDDC, public cloud, and edge environments that is ready for end-user workloads and ecosystem integrations. Tanzu Kubernetes Grid builds on trusted upstream and community projects and delivers an engineered and supported Kubernetes platform for end users and partners.

Key features include:

  • The Tanzu Kubernetes Grid installer interface, a graphical installer that walks you through the process of deploying management clusters to vSphere, Amazon EC2, and Microsoft Azure.
  • The Tanzu CLI, providing simple commands that allow you to deploy CNCF conformant Kubernetes clusters to vSphere, Amazon EC2, and Microsoft Azure.
  • Binaries for Kubernetes and all of the components that you need in order to easily stand up an enterprise-class Kubernetes development environment. All binaries are tested and signed by VMware.
  • Extensions for your Tanzu Kubernetes Grid instance that provide authentication and authorization, logging, networking, monitoring, Harbor registry, and ingress control.
  • VMware support for your Tanzu Kubernetes Grid deployments.

New Features in Tanzu Kubernetes Grid v1.3.1

  • New Kubernetes versions:
    • 1.20.5
    • 1.19.9
    • 1.18.17
  • Addresses security vulnerabilities:
    • CVE-2021-30465
    • CVE-2021-28682
    • CVE-2021-28683
    • CVE-2021-29258
  • Workload clusters no longer use the Tanzu Mission Control Extension Manager.
  • OIDC authentication no longer uses Dex.
  • Running tanzu cluster create --dry-run generates a workload cluster template from a configuration file without requiring a management cluster (see the first sketch after this list).
  • Bill of Materials (BoM) handling supports custom registry sources for individual images, overriding default registry.
  • Users can upgrade add-ons independently of upgrading Tanzu Kubernetes Grid.
  • (vSphere) Supports routable, no-NAT IP addresses for workload cluster pods, enabling traceability and auditing.
  • (vSphere v6.7) Installer interface includes access and configuration options for NSX-T Advanced Load Balancer.
  • (vSphere) Supports deploying multiple MachineDeployment and KubeadmControlPlane objects without changing overlay file.
  • (Azure) New cluster configuration variables (see the second sketch after this list):
    • AZURE_CUSTOM_TAGS applies Azure tags to cluster resources.
    • AZURE_ENABLE_PRIVATE_CLUSTER and AZURE_FRONTEND_PRIVATE_IP run workload clusters as private clusters with internal load balancers.
    • AZURE_ENABLE_NODE_DATA_DISK optionally provisions a data disk for worker nodes.
    • AZURE_CONTROL_PLANE_ and AZURE_NODE_ variables for DATA_DISK_SIZE_GIB and OS_DISK_SIZE_GIB configure data and OS disk sizes for control plane and worker nodes.
    • AZURE_CONTROL_PLANE_OS_DISK_STORAGE_ACCOUNT_TYPE and AZURE_NODE_OS_DISK_STORAGE_ACCOUNT_TYPE specify storage account for control plane and worker node disks.
  • (Azure) Future-compatibility cluster configuration variable AZURE_ENABLE_ACCELERATED_NETWORKING enables Azure accelerated networking when TKRs support it. (Currently, Azure TKRs do not support Azure accelerated networking.)
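
For example, the following is a minimal sketch of --dry-run usage; the cluster name my-cluster and the file paths are placeholders:

    tanzu cluster create my-cluster --file ~/.tanzu/tkg/clusterconfigs/my-cluster-config.yaml --dry-run > my-cluster-manifest.yaml

The generated manifest can be reviewed or kept in version control before any management cluster exists.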
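
To illustrate the new Azure variables, the following cluster configuration snippet uses hypothetical values; only the variable names come from this release:

    AZURE_CUSTOM_TAGS: "team=platform, costcenter=1234"
    AZURE_ENABLE_PRIVATE_CLUSTER: "true"
    AZURE_FRONTEND_PRIVATE_IP: "10.0.0.100"
    AZURE_ENABLE_NODE_DATA_DISK: "true"
    AZURE_NODE_DATA_DISK_SIZE_GIB: "256"
    AZURE_CONTROL_PLANE_OS_DISK_SIZE_GIB: "128"
    AZURE_NODE_OS_DISK_STORAGE_ACCOUNT_TYPE: "Premium_LRS"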

Supported Kubernetes Versions in Tanzu Kubernetes Grid v1.3.1

Each version of Tanzu Kubernetes Grid adds support for new Kubernetes versions. In addition, each version continues to support the Kubernetes versions shipped with recent previous versions of Tanzu Kubernetes Grid, as listed below.

Tanzu Kubernetes Grid Version | Provided Kubernetes Versions | Supported in v1.3.1?
1.3.1                         | 1.20.5                       | YES
                              | 1.19.9                       | YES
                              | 1.18.17                      | YES
1.3.0                         | 1.20.4                       | YES
                              | 1.19.8                       | YES
                              | 1.18.16                      | YES
                              | 1.17.16*                     | NO
1.2.1                         | 1.19.3                       | YES
                              | 1.18.10                      | YES
                              | 1.17.13                      | NO
1.2                           | 1.19.1                       | YES
                              | 1.18.8                       | YES
                              | 1.17.11                      | NO
1.1.3                         | 1.18.6                       | YES
                              | 1.17.9                       | NO
1.1.2                         | 1.18.3                       | YES
                              | 1.17.6                       | NO
1.1.0                         | 1.18.2                       | YES
1.0.0                         | 1.17.3                       | NO

Product Snapshot for Tanzu Kubernetes Grid v1.3.1

Tanzu Kubernetes Grid v1.3.1 supports the following infrastructure platforms and operating systems (OSs), as well as cluster creation and management, networking, storage, authentication, backup and migration, and observability components. The component versions listed in parentheses are included in Tanzu Kubernetes Grid v1.3.1. For more information, see Component Versions.

Infrastructure platform:
  • vSphere: vSphere 6.7U3, vSphere 7, VMware Cloud on AWS****, Azure VMware Solution
  • Amazon EC2: Native AWS*
  • Azure: Native Azure

Cluster creation and management:
  • vSphere: Core Cluster API (v0.3.14), Cluster API Provider vSphere (v0.7.7)
  • Amazon EC2: Core Cluster API (v0.3.14), Cluster API Provider AWS (v0.6.4)
  • Azure: Core Cluster API (v0.3.14), Cluster API Provider Azure (v0.4.10)

Kubernetes node OS distributed with TKG:
  • vSphere: Photon OS 3, Ubuntu 20.04
  • Amazon EC2: Amazon Linux 2, Ubuntu 20.04
  • Azure: Ubuntu 18.04, Ubuntu 20.04

Bring your own image:
  • vSphere: Photon OS 3, Red Hat Enterprise Linux 7, Ubuntu 18.04, Ubuntu 20.04
  • Amazon EC2: Amazon Linux 2, Ubuntu 18.04, Ubuntu 20.04
  • Azure: Ubuntu 18.04, Ubuntu 20.04

Container runtime (all platforms): Containerd (v1.4.4)+

Container networking (all platforms): Antrea (v0.11.3), Calico (v3.11.3)

Container registry (all platforms): Harbor (v2.1.3)

Ingress:
  • vSphere: NSX Advanced Load Balancer Essentials (v20.1.3)**, Contour (v1.12.0)
  • Amazon EC2: Contour (v1.12.0)
  • Azure: Contour (v1.12.0)

Storage:
  • vSphere: vSphere Container Storage Interface (v2.1.0***) and vSphere Cloud Native Storage
  • Amazon EC2: In-tree cloud providers only
  • Azure: In-tree cloud providers only

Authentication (all platforms): OIDC via Pinniped (v0.4.1), LDAP via Pinniped (v0.4.1) and Dex

Observability (all platforms): Fluent Bit (v1.6.9), Prometheus (v2.18.1), Grafana (v7.3.5)

Backup and migration (all platforms): Velero (v1.5.4)

NOTES:

For a full list of Kubernetes versions that ship with Tanzu Kubernetes Grid v1.3.1, see Supported Kubernetes Versions in Tanzu Kubernetes Grid v1.3.1 above.

Component Versions

The Tanzu Kubernetes Grid v1.3.1 release includes the following software component versions:

  • ako-operator: v1.3.1+vmware.1
  • alertmanager: v0.20.0+vmware.1
  • antrea: v0.11.3+vmware.2
  • cadvisor: v0.36.0+vmware.1
  • calico_all: v3.11.3+vmware.1
  • cloud-provider-azure: v0.5.1+vmware.2
  • cloud_provider_vsphere: v1.18.1+vmware.1
  • cluster-api-provider-azure: v0.4.10+vmware.1
  • cluster_api: v0.3.14+vmware.2
  • cluster_api_aws: v0.6.4+vmware.1
  • cluster_api_vsphere: v0.7.7+vmware.1
  • configmap-reload: v0.3.0+vmware.1
  • contour: v1.12.0+vmware.1
  • crash-diagnostics: v0.3.2+vmware.3
  • csi_attacher: v3.0.0+vmware.1
  • csi_livenessprobe: v2.1.0+vmware.1
  • csi_node_driver_registrar: v2.0.1+vmware.1
  • csi_provisioner: v2.0.0+vmware.1
  • dex: v2.27.0+vmware.1
  • envoy: v1.17.3+vmware.1 +
  • external-dns: v0.7.4+vmware.1
  • fluent-bit: v1.6.9+vmware.1
  • gangway: v3.2.0+vmware.2
  • grafana: v7.3.5+vmware.2
  • harbor: v2.1.3+vmware.1
  • imgpkg: v0.5.0+vmware.1
  • jetstack_cert-manager: v0.16.1+vmware.1
  • k8s-sidecar: v0.1.144+vmware.2
  • k14s_kapp: v0.36.0+vmware.1
  • k14s_ytt: v0.31.0+vmware.1
  • kapp-controller: v0.16.0+vmware.1, v0.18.0+vmware.1*
  • kbld: v0.28.0+vmware.1
  • kube-state-metrics: v1.9.5+vmware.2
  • kube-vip: v0.3.3+vmware.1
  • kube_rbac_proxy: v0.4.1+vmware.2
  • kubernetes-csi_external-resizer: v1.0.0+vmware.1
  • kubernetes-sigs_kind: v1.20.5+vmware.1
  • kubernetes_autoscaler:
    v1.20.0+vmware.1, v1.19.1+vmware.1, v1.18.3+vmware.1, v1.17.4+vmware.1
  • load-balancer-and-ingress-service: v1.3.2+vmware.1
  • metrics-server: v0.4.0+vmware.1
  • pinniped: v0.4.1+vmware.1
  • prometheus: v2.18.1+vmware.1
  • prometheus_node_exporter: v0.18.1+vmware.1
  • pushgateway: v1.2.0+vmware.2
  • sonobuoy: v0.20.0+vmware.1
  • tanzu_core: v1.3.1
  • tkg-bom: v1.3.1
  • tkg_extensions: v1.3.1+vmware.1
  • tkg_telemetry: v1.3.1+vmware.1
  • velero: v1.5.4+vmware.1
  • velero-plugin-for-aws: v1.1.0+vmware.1
  • velero-plugin-for-microsoft-azure: v1.1.0+vmware.1
  • velero-plugin-for-vsphere: v1.1.0+vmware.1
  • vmware-private_tanzu-cli-tkg-plugins: v1.3.1
  • vsphere_csi_driver: v2.1.0+vmware.1

+ Indicates a version bump for a security fix. For more information, see Updated Base Image Files.

*The version of kapp-controller depends on the Kubernetes version that you are running and on the cloud provider on which the cluster is deployed.

For a complete list of software component versions that ship with Tanzu Kubernetes Grid v1.3.1, see ~/.tanzu/tkg/bom/bom-v1.3.1.yaml and ~/.tanzu/tkg/bom/tkr-bom-v1.20.5+vmware.2-tkg.1.yaml.

Supported AWS Regions

You can use Tanzu Kubernetes Grid v1.3.1 to deploy clusters to the following AWS regions:

  • ap-northeast-1
  • ap-northeast-2
  • ap-south-1
  • ap-southeast-1
  • ap-southeast-2
  • eu-central-1
  • eu-west-1
  • eu-west-2
  • eu-west-3
  • sa-east-1
  • us-east-1
  • us-east-2
  • us-gov-east-1
  • us-gov-west-1
  • us-west-2

Supported Upgrade Paths

You can only upgrade to Tanzu Kubernetes Grid v1.3.1 from v1.2.x or v1.3.0. To upgrade to Tanzu Kubernetes Grid v1.3.1 from a version earlier than v1.2.x, you must first upgrade to v1.2.x, and then upgrade to v1.3.1.

When upgrading Kubernetes versions on Tanzu Kubernetes clusters, you cannot skip minor versions. For example, you cannot upgrade a Tanzu Kubernetes cluster directly from v1.18.x to v1.20.x. You must upgrade a v1.18.x cluster to v1.19.x before upgrading the cluster to v1.20.x.
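
For example, a v1.18.x workload cluster reaches v1.20.5 in two steps. This is a minimal sketch: the cluster name my-cluster is a placeholder, and the exact Tanzu Kubernetes release (TKr) names are assumptions; list the TKrs available in your environment with tanzu kubernetes-release get.

    tanzu cluster upgrade my-cluster --tkr v1.19.9---vmware.2-tkg.1
    tanzu cluster upgrade my-cluster --tkr v1.20.5---vmware.2-tkg.1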

Updated Base Image Files

To address CVE-2021-30465, Tanzu Kubernetes Grid provides patched versions of its vSphere OVA, Amazon EC2 AMI, and Azure base image files. If you are installing Tanzu Kubernetes Grid v1.3.1 on vSphere, you must download and import the updated vSphere OVAs. Make sure to download the OVAs listed under Updated Kubernetes OVAs to address CVE-2021-30465 for VMware Tanzu Kubernetes Grid 1.3.1. Updated Amazon EC2 AMI and Azure base image files are available directly on those infrastructure platforms.

IMPORTANT: Follow the instructions in this Knowledge Base article (KB83781) to ensure that existing and new v1.3.1 Tanzu Kubernetes management clusters use the updated base images during cluster deployment and upgrade.

To address CVE-2021-28682, CVE-2021-28683, and CVE-2021-29258, each updated Tanzu Kubernetes Grid base image also includes a patched version of Envoy.

If you plan to deploy or have already deployed the Contour extension, see this Knowledge Base article (KB83761) for instructions on how to deploy or update to the patched version of Envoy in the Contour extension.

Behavior Changes Between Tanzu Kubernetes Grid v1.3.0 and v1.3.1

Tanzu Kubernetes Grid v1.3.1 introduces the following new behavior compared with v1.3.0.

  • Tanzu Kubernetes Grid v1.3.1 removes the Tanzu Mission Control extension manager from the extensions bundle. Extensions are no longer wrapped inside the extension resource. Instead of deploying and upgrading extensions with the Tanzu Mission Control extension manager, you deploy and upgrade them by using kapp-controller. As part of the upgrade to v1.3.1 extensions, you must remove the extension resource for each extension being upgraded on the cluster (see the sketch after this list). For more information, see Upgrade Tanzu Kubernetes Grid Extensions.
  • Tanzu CLI v1.3.1 does not allow you to deploy a new Tanzu Kubernetes cluster without upgrading the management cluster first.
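
As a sketch of the extension-resource removal step, the following command assumes the Contour extension deployed in its usual tanzu-system-ingress namespace; substitute the name and namespace of each extension you upgrade:

    kubectl delete extension contour -n tanzu-system-ingress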

For the major differences in behavior between 1.2.0 and 1.3.x, see the VMware Tanzu Kubernetes Grid 1.3 Release Notes.

User Documentation

The Tanzu Kubernetes Grid 1.3 documentation applies to all of the 1.3.x releases.

Resolved Issues

  • Tanzu Kubernetes Grid does not support Azure accelerated networking

    Tanzu Kubernetes Grid was incompatible with Azure accelerated networking, which is enabled by default on most VM sizes that have 4 or more vCPUs.

    Note: The default node VM size that Tanzu Kubernetes Grid creates on Azure is Standard_D2s_v3, which does not use accelerated networking. The issue only affected larger node sizes.

  • Tanzu Kubernetes Grid 1.3.0 extensions do not function on Tanzu Kubernetes Grid Service clusters when attached to Tanzu Mission Control

    Tanzu Kubernetes Grid extensions (Contour, Fluent Bit, Prometheus, Grafana) that were previously installed and functioning correctly on a Tanzu Kubernetes cluster created by using Tanzu Kubernetes Grid Service stop working when you attach that guest cluster to Tanzu Mission Control (TMC).

  • New Tanzu Kubernetes Grid v1.3.1 deployments fail on Azure due to missing base image

    The base image for Tanzu Kubernetes Grid v1.3.1 has been updated to address several CVEs; however, the base image may not be available immediately in the Azure Marketplace. Until the image is made available, new management cluster deployments on Azure fail.

    Workaround: Deploy Tanzu Kubernetes Grid v1.3.1 management clusters on Azure only after the base image becomes available in the Azure Marketplace.

  • Velero installer pulls incorrect container version

    If you download the Velero binary from the Tanzu Kubernetes Grid 1.3.0 downloads page and run velero install, Velero pulls the main tag of the container image instead of the correct version.

  • Thumbprints for expired vCenter Server certificates cannot be updated

    If your vCenter Server certificate expires, it is not possible to update the certificate thumbprint in cluster node VMs.

  • Option to skip TLS verification for private registries is ignored

    If you are deploying Tanzu Kubernetes Grid in an Internet-restricted environment and you set the TKG_CUSTOM_IMAGE_REPOSITORY_SKIP_TLS_VERIFY=true variable to skip TLS verification, this variable is ignored and the deployment of management clusters fails. If you examine one of the running apps in the cluster, for example by running kubectl describe app antrea, you see the following error:

    Error: Syncing directory '0': Syncing directory '.' with image contents: Imgpkg: exit status 1 
    (stderr: Error: Collecting images: Working with 
    <registry_address>/<project>/tanzu_core/addons/antrea-templates:v1.3.0: 
    Get https://172.50.0.10/v2/: x509: certificate signed by unknown authority
    

Known Issues

The known issues are grouped as follows.

vSphere Issues
  • Cannot use Velero to back up Kubernetes 1.20 clusters with persistent volumes on vSphere

    If you attempt to use Velero 1.5.3 to back up Kubernetes 1.20 clusters running on vSphere that have persistent volumes, the backup fails with the following error in the backup logs:

    time="2021-04-02T16:29:05Z" level=info msg="1 errors encountered backup up item" backup=velero/nginx-backup logSource="pkg/backup/ backup.go:427" name=nginx-deployment-66689547d-d7n6c time="2021-04-02T16:29:05Z" level=error msg="Error backing up item" backup=velero/nginx-backup error="error executing custom action (groupResource=persistentvolumeclaims, namespace=nginx-example, name=nginx-logs): rpc error: code = Unknown desc = Failed during IsObjectBlocked check: Could not translate selfLink to CRD name" logSource="pkg/backup/backup.go:431" name=nginx-deployment-66689547d-d7n6

    This occurs because Kubernetes 1.20 deprecated selfLink.

    Workaround: See https://kb.vmware.com/s/article/83314.

  • Creating a workload cluster with multiple control plane nodes stalls as VIP of the kube-apiserver is lost

    When creating a workload cluster with multiple control plane nodes on vSphere, the following happens:

    • The first control plane node starts successfully.
    • When the second control plane node starts, the VIP on the first control plane is lost.
    • No IP addresses appear on the first control plane node in the vSphere Client.
    • The following event appears in the logs:
      [2021-04-22T19:43:16.516Z] [ warning] [guestinfo] *** WARNING: GuestInfo collection interval longer than expected; actual=511 sec, expected=30 sec. ***
    • The node shows an alert about high CPU utilization in the vSphere Client
    • The first control plane node becomes intermittently responsive, with a high load average:
      root@nv8-wl-02-control-plane-q58z7 [ ~ ]# uptime
      17:58:05 up 26 min,  1 user,  load average: 22.74, 84.23, 69.62 

    Workaround: Tune the kube-vip leader election parameters by updating the vSphere configuration with the following ytt overlay.

    1. Open the file ~/.tanzu/tkg/providers/infrastructure-vsphere/ytt/vsphere-overlay.yaml in a text editor.
    2. Paste the following into vsphere-overlay.yaml:
      
      #@ load("@ytt:overlay", "overlay")
      #@ load("@ytt:data", "data")
      #@ load("lib/helpers.star", "get_bom_data_for_tkr_name", "get_default_tkg_bom_data", "kubeadm_image_repo", "get_image_repo_for_component", "get_vsphere_thumbprint")
      #@ load("@ytt:yaml", "yaml")
      
      #@ bomData = get_default_tkg_bom_data()
      
      
      #@ def kube_vip_pod():
      ---
      apiVersion: v1
      kind: Pod
      metadata:
        creationTimestamp: null
        name: kube-vip
        namespace: kube-system
      spec:
        containers:
        - args:
          - start
          env:
          - name: vip_arp
            value: "true"
          - name: vip_leaderelection
            value: "true"
          - name: address
            value: #@ data.values.VSPHERE_CONTROL_PLANE_ENDPOINT
          - name: vip_interface
            value: #@ data.values.VIP_NETWORK_INTERFACE
          #! Leader-election timing parameters tuned by this workaround:
          - name: vip_leaseduration
            value: "30"
          - name: vip_renewdeadline
            value: "20"
          - name: vip_retryperiod
            value: "4"
          image: #@ "{}/{}:{}".format(get_image_repo_for_component(bomData.components["kube-vip"][0].images.kubeVipImage), bomData.components["kube-vip"][0].images.kubeVipImage.imagePath, bomData.components["kube-vip"][0].images.kubeVipImage.tag)
          imagePullPolicy: IfNotPresent
          name: kube-vip
          resources: {}
          securityContext:
            capabilities:
              add:
              - NET_ADMIN
              - SYS_TIME
          volumeMounts:
          - mountPath: /etc/kubernetes/admin.conf
            name: kubeconfig
        hostNetwork: true
        volumes:
        - hostPath:
            path: /etc/kubernetes/admin.conf
            type: FileOrCreate
          name: kubeconfig
      status: {}
      #@ end
      
      #@overlay/match by=overlay.subset({"kind":"KubeadmControlPlane"})
      ---
      apiVersion: controlplane.cluster.x-k8s.io/v1alpha3
      kind: KubeadmControlPlane
      metadata:
        name: #@ "{}-control-plane".format(data.values.CLUSTER_NAME)
      spec:
        kubeadmConfigSpec:
          files:
          #@overlay/match by=overlay.index(0)
          - content: #@ yaml.encode(kube_vip_pod())
      
    3. Save and close the file.
    4. Attempt to deploy the workload cluster again.
  • On vSphere 7, offline volume expansion for vSphere CSI storage used by workload clusters does not work.

    The vSphere Container Storage Interface (CSI) deployment lacks the csi-resizer sidecar pod needed to resize storage volumes.

    Workaround: Add a csi-resizer sidecar pod to the cluster's CSI processes, as documented in Enable Offline Volume Expansion for vSphere CSI (vSphere 7).

  • Management Cluster creation fails if vSphere password starts with special characters

    If the password for vSphere starts with %, !, &, *, or #, deployment fails with the following error:

    Error: unable to set up management cluster: unable to build management cluster configuration: unable to get template: Extracting data value from KV: Deserializing value for key 'VSPHERE_PASSWORD': Deserializing YAML value: yaml: line 1: could not find expected directive name
    

    Workaround: Make sure that the vSphere password does not start with %, !, &, *, or #.

  • Management clusters that run Photon OS deploy workload clusters that run Ubuntu by default

    If you use a Photon OS OVA image when you deploy a management cluster to vSphere from the installer interface, the OS_NAME setting is not written into the configuration file. Consequently, if you use a copy of the management cluster configuration file to deploy workload clusters, the workload cluster OS defaults to Ubuntu, unless you explicitly set the OS_NAME variable to photon in the configuration file. If the Ubuntu image is not present in your vSphere inventory, deployment of workload clusters will fail.

    Workaround: To use Photon OS as the operating system for workload cluster nodes, always set the OS_NAME setting in the cluster configuration file:

    OS_NAME: photon

  • Cannot delete cluster if AKO agent pod is not running correctly

    If you use NSX Advanced Load Balancer, attempts to use tanzu cluster delete to delete a workload cluster fail if the AVI Kubernetes Operator (AKO) agent pod is in the CreateContainerConfigError status:

    kubectl get po -n avi-system
     NAME  READY STATUS                     RESTARTS AGE
     ako-0 0/1   CreateContainerConfigError 0        94s
    

    The deletion process waits indefinitely for the AKO agent to clean up its related items.  

    Workaround:

    1. Edit the cluster configuration:
      kubectl edit cluster cluster-name
    2. Under finalizers, remove the AKO-related entry ako-operator.networking.tkg.tanzu.vmware.com (line 18 in this example):
       16   finalizers:
       17   - cluster.cluster.x-k8s.io
       18   - ako-operator.networking.tkg.tanzu.vmware.com

    The cluster will be successfully removed after a short time.
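
    Equivalently, the finalizer can be removed non-interactively with a JSON patch. This is a sketch only: cluster-name is a placeholder, and the index 1 assumes the AKO finalizer is the second entry in the finalizers list, as in the example above.

      kubectl patch cluster cluster-name --type=json -p '[{"op": "remove", "path": "/metadata/finalizers/1"}]'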

AWS Issues
  • Telemetry for the Customer Experience Improvement Program (CEIP) does not run on AWS.

    Telemetry pods fail with an error like the following:

    "ERROR workspace/main.go:48 the individual labels are formed incorrectly. e.g. --labels=<key1>=<value1>,<key2>=<value2> with no ',' and '=' allowed in keys and values"

    This issue only affects management clusters created with CEIP participation enabled in the installer interface, or with ENABLE_CEIP_PARTICIPATION absent from the configuration file or set to true (the default).

    Workaround: After deploying a management cluster to AWS, run:

    tanzu management-cluster ceip-participation set true --labels='entitlement-account-number="ACCOUNT-NUMBER",env_type="ENV-TYPE"'

    Where:

    • ACCOUNT-NUMBER is your alphanumeric entitlement account number.
    • ENV-TYPE is production, development, or test.
Azure Issues

Upgrade Issues
    • List of clusters shows incorrect Kubernetes version after unsuccessful upgrade attempt

      If you attempt to upgrade a Tanzu Kubernetes cluster and the upgrade fails, and if you subsequently run tanzu cluster list or tanzu cluster get to see the list of deployed clusters and their versions, the cluster for which the upgrade failed shows the upgraded version of Kubernetes.

      Workaround: None

Deployment and Extensions Issues
    • May 2021 Linux security patch causes kind clusters to fail during management cluster creation

      If you run Tanzu CLI commands on a machine with a recent Linux kernel, for example Linux 5.11 or 5.12 on Fedora, kind clusters do not operate. This happens because kube-proxy attempts to change the nf_conntrack_max sysctl, which the May 2021 Linux security patch made read-only, so kube-proxy enters a CrashLoopBackoff state. The security patch is being backported to all LTS kernels from 4.9 onwards, so as operating system updates ship, including for Docker Machine on Mac OS and Windows Subsystem for Linux, kind clusters will fail, resulting in management cluster deployment failure.

      Workaround: Update your version of kind to at least v0.11.0, and run tanzu management-cluster create with the --use-existing-bootstrap-cluster option, as sketched below. For more information, see Use an Existing Bootstrap Cluster to Deploy Management Clusters.
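
      A minimal sketch of this workaround, assuming kind v0.11.0 or later is installed; the bootstrap cluster name tkg-bootstrap and the configuration file path are placeholders, and the exact flag usage is described in the linked documentation.

        kind create cluster --name tkg-bootstrap
        tanzu management-cluster create --file ~/.tanzu/tkg/clusterconfigs/mgmt-config.yaml --use-existing-bootstrap-cluster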

    • Running Tanzu commands on Windows fails with a certificate error

      Attempts to run the Tanzu CLI on Windows for the first time result in Error: unable to ensure tkg BOM file: failed to download default bom files from the registry: [...] certificate signed by unknown authority

      Workaround 1: If ~/.tanzu/tkg/config.yaml is present

      If you have already run Tanzu CLI commands on this machine, for example with v1.3.0, disable TLS verification to allow Tanzu Kubernetes Grid to pull images from the repository without checking the certificate:

      1. Open the ~/.tanzu/tkg/config.yaml file in a text editor. 
      2. Add the following line at the end of the file:
        TKG_CUSTOM_IMAGE_REPOSITORY_SKIP_TLS_VERIFY: true

      Alternatively, to manually add the certificate to your Tanzu CLI deployment, perform the following steps:

      1. Obtain the TLS certificate of the Tanzu Kubernetes Grid image repository:

        openssl s_client -showcerts -connect projects.registry.vmware.com:443 </dev/null 2>/dev/null|openssl x509 -outform PEM >mycertfile.pem

      2. Obtain the base64 encoded value of the certificate:

        cat mycertfile.pem | base64

      3. Open your ~/.tanzu/tkg/config.yaml file in a text editor.
      4. Add the following line at the bottom of the file, setting its value to the base64-encoded certificate:

        TKG_CUSTOM_IMAGE_REPOSITORY_CA_CERTIFICATE: LS0tLS1[...]RS0tLS0tDQo=

      NOTE: If this is the first time that you are running Tanzu CLI v1.3.1 commands on this machine, you must also perform the steps described in Updated Base Image Files.

      Workaround 2: If ~/.tanzu/tkg/config.yaml is not present

      If you have not already run Tanzu CLI commands with a previous version, and the ~/.tanzu/tkg/config.yaml is not yet present on your machine, perform the following steps:

      1. Set environment variables either to use the certificate or to disable TLS verification.

        To use the certificate:

        1. Obtain the TLS certificate of the Tanzu Kubernetes Grid image repository:

          openssl s_client -showcerts -connect projects.registry.vmware.com:443 </dev/null 2>/dev/null|openssl x509 -outform PEM >mycertfile.pem

        2. Obtain the base64 encoded value of the certificate:

          cat mycertfile.pem | base64

        3. Set TKG_CUSTOM_IMAGE_REPOSITORY_CA_CERTIFICATE as an environment variable:

          set TKG_CUSTOM_IMAGE_REPOSITORY_CA_CERTIFICATE=LS0tLS1[...]RS0tLS0tDQo=

        To disable TLS verification, set the TKG_CUSTOM_IMAGE_REPOSITORY_SKIP_TLS_VERIFY environment variable:

        set TKG_CUSTOM_IMAGE_REPOSITORY_SKIP_TLS_VERIFY=true

      2. Run tanzu management-cluster create with no options.

        The command fails with an error, but it creates the ~/.tanzu folder and its contents.

      3. Open the ~/.tanzu/tkg/config.yaml file in a text editor.
      4. Add one of the following lines at the bottom of the file, to either use the certificate or to disable TLS verification.

        TKG_CUSTOM_IMAGE_REPOSITORY_CA_CERTIFICATE: LS0tLS1[...]RS0tLS0tDQo=

        Or

        TKG_CUSTOM_IMAGE_REPOSITORY_SKIP_TLS_VERIFY: true

      NOTE: If this is the first time that you are running Tanzu CLI v1.3.1 commands on this machine, you must also perform the steps described in Updated Base Image Files.

    • Worker nodes cannot join cluster if cluster name contains period (.)

      If you deploy a Tanzu Kubernetes cluster and specify a name that includes the period character (.), the cluster appears to be created but only the control plane nodes are visible. Worker nodes are unable to join the cluster, and their names are truncated to exclude any text included after the period.

      Workaround: Do not include period characters in cluster names.

    • Deleting shared services cluster without removing registry webhook causes cluster deletion to stop indefinitely

      If you created a shared services cluster and deployed Harbor as a shared service with the Tanzu Kubernetes Grid Connectivity API, and then created one or more Tanzu Kubernetes clusters, attempting to delete both the shared services cluster and the Tanzu Kubernetes clusters results in the machines being deleted, but both clusters remain indefinitely in the deleting status.

      Workaround: Delete the registry admission webhook so that the cluster deletion process can complete. 

    • The Tanzu CLI truncates workload cluster names or does not perform cluster operations

      Workload cluster names must be 42 characters or fewer. For reference, the name this-workload-cluster-name-is-far-too-long is exactly 42 characters long; avoid anything longer.

    • Cannot use tanzu login selector in Git Bash on Windows

      If you use Git Bash to run the tanzu login command on Windows systems, you see Error: Incorrect function and you cannot use the arrow keys to select a management cluster.

      Workaround: Run the following command in Git Bash before you run any Tanzu CLI commands:

      alias tanzu='winpty -Xallow-non-tty tanzu'

    • Management cluster fails to deploy when the Tanzu CLI is run on a macOS system

      A management cluster fails to deploy when the Tanzu CLI is invoked from a macOS system in the following circumstances:

      • The macOS system where the Tanzu CLI is launched is running Docker Desktop version 3.3.1 or earlier.
      • You see messages similar to the following in the capv-controller-manager logs in the bootstrap cluster:

      E0510 16:16:51.320061 1 controller.go:257] controller-runtime/controller "msg"="Reconciler error" "error"="failed to create vSphere session: Post https://192.168.110.22/sdk: EOF" "controller"="vspherevm" "name"="cluster-name-control-plane-pqrjq" "namespace"="tkg-system"

      Workaround: Upgrade Docker Desktop to version 3.3.3 or higher.
