
What's in the Release Notes

The release notes cover what's new in each monthly patch, build information, a services overview, compatibility and installation requirements, resolved issues, and known issues.

What's New

Updated on: July 30, 2020

VMware vSphere with Kubernetes receives monthly patches that introduce new features and capabilities, update Kubernetes and other services, keep up with upstream releases, and resolve reported issues. This section documents what each monthly patch introduces.

 

What's New July 30, 2020 

July 30, 2020 Build Information

ESXi 7.0 | 23 JUN 2020 | ISO Build 16324942

vCenter Server 7.0 | 30 JUL 2020 | ISO Build 16620007

New Features

  • Supervisor cluster: new version of Kubernetes, support for custom certificates and PNID changes
    • The Supervisor cluster now supports Kubernetes 1.18.2 (along with 1.16.7 and 1.17.4)
    • Replacing machine SSL certificates with custom certificates is now supported
    • vCenter PNID update is now supported when there are Supervisor clusters in the vCenter
  • Tanzu Kubernetes Grid Service for vSphere: new features added for cluster scale-in, networking and storage
    • Cluster scale-in operation is now supported for Tanzu Kubernetes Grid service clusters
    • Ingress firewall rules are now enforced by default for all Tanzu Kubernetes Grid service clusters
    • New versions of Kubernetes now ship regularly and asynchronously from vSphere patches; the current versions are 1.16.8, 1.16.12, 1.17.7, and 1.17.8
  • Network service: new version of NCP
    • SessionAffinity is now supported for ClusterIP services (see the example after this list)
    • IngressClass, PathType, and Wildcard domain are supported for Ingress in Kubernetes 1.18
    • Client Auth is now supported in Ingress Controller
  • Registry service: new version of Harbor
    • The Registry service is now upgraded to Harbor 1.10.3
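
As an illustration of the SessionAffinity support noted in the Network service item above, a minimal ClusterIP Service manifest might look like the following sketch; the Service name, selector, and ports are placeholders.

    apiVersion: v1
    kind: Service
    metadata:
      name: example-clusterip-svc      # placeholder name
    spec:
      type: ClusterIP
      selector:
        app: example-app               # placeholder selector
      sessionAffinity: ClientIP        # route a given client's requests to the same backend
      sessionAffinityConfig:
        clientIP:
          timeoutSeconds: 10800        # default stickiness timeout
      ports:
        - port: 80
          targetPort: 8080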

For more information and instructions on how to upgrade, refer to the Updating vSphere with Kubernetes Clusters documentation.

Resolved Issues

  • Tanzu Kubernetes Grid Service cluster NTP sync issue

 

What's New June 23, 2020 

June 23, 2020 Build Information

ESXi 7.0 | 23 JUN 2020 | ISO Build 16324942

vCenter Server 7.0 | 23 JUN 2020 | ISO Build 16386292

Tanzu Kubernetes clusters OVA: v1.16.8+vmware.1-tkg.3.60d2ffd

New Features

  • None. This is a bug-fix release.

Resolved Issues

  • Tanzu Kubernetes Grid Service cluster upgrade failure
    • We have resolved an issue where upgrading a Tanzu Kubernetes Grid service cluster could fail with the error "Error: unknown previous node"
  • Supervisor cluster upgrade failure
    • We have resolved an issue where a Supervisor cluster update may get stuck if the embedded Harbor is in a failed state

What's New May 19, 2020 

May 19, 2020 Build Information

ESXi 7.0 | 2 APR 2020 | ISO Build 15843807

vCenter Server 7.0 | 19 MAY 2020 | ISO Build 16189094

Tanzu Kubernetes clusters OVA: v1.16.8+vmware.1-tkg.3.60d2ffd

New Features

  • Tanzu Kubernetes Grid Service for vSphere: rolling upgrade and services upgrade
    • Customers can now perform rolling upgrades over their worker nodes and control plane nodes for the Tanzu Kubernetes Grid Service for vSphere, and upgrade the pvCSI, Calico, and authsvc services. This includes pre-checks and upgrade compatibility for this matrix of services.
    • Rolling upgrades can also be used to vertically scale worker nodes, that is, to change the VM class of your worker nodes to a smaller or larger size (see the example after this list).
  • Supervisor cluster: new versions of Kubernetes, upgrade supported
    • The Supervisor cluster now supports Kubernetes 1.17.4
    • The Supervisor cluster now supports upgrading from Kubernetes 1.16.x to 1.17.x
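
A hedged sketch of how a vertical scale of worker nodes might be expressed in a TanzuKubernetesCluster manifest: changing the worker VM class and re-applying the manifest drives the rolling upgrade described above. The cluster name, namespace, counts, VM classes, and storage class below are placeholders; the distribution version is the OVA version listed in the build information.

    apiVersion: run.tanzu.vmware.com/v1alpha1
    kind: TanzuKubernetesCluster
    metadata:
      name: my-tkg-cluster              # placeholder cluster name
      namespace: my-namespace           # placeholder Supervisor namespace
    spec:
      distribution:
        version: v1.16.8+vmware.1-tkg.3.60d2ffd
      topology:
        controlPlane:
          count: 3
          class: best-effort-small      # example VM class
          storageClass: my-storage-class
        workers:
          count: 3
          class: best-effort-large      # change this VM class to vertically scale workers
          storageClass: my-storage-class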

 

Resolved Issues

  • Naming conflict for deleted namespaces
    • We have resolved an issue where deleting a vSphere namespace and then creating a new vSphere namespace with the same name caused a naming collision that prevented the creation of Tanzu Kubernetes clusters.
  • Improved distribution names
    • We have made clearer which version of Kubernetes you are running by moving OVF versioning information to a separate column.

Build Information for Previous Releases

April 2, 2020 Build Information

ESXi 7.0 | 2 APR 2020 | ISO Build 15843807

vCenter Server 7.0 | 2 APR 2020 | ISO Build 15952498

Tanzu Kubernetes clusters OVA: v1.16.8+vmware.1-tkg.3.60d2ffd

Note: Tanzu Kubernetes cluster OVA builds listed here supersede build numbers in the vSphere with Kubernetes Configuration and Management guide.

Check for additions and updates to these release notes.

Services Overview

This release of VMware vSphere with Kubernetes for VMware vSphere 7.0 requires VMware ESXi 7.0 and VMware vCenter Server 7.0.

vSphere with Kubernetes for vSphere 7.0 includes:

  • VMware vSphere Namespaces

    Create Namespaces within the vCenter UI and attach compute, networking, storage, and access policy. Give access to these Namespaces to DevOps Engineers, who can then self-provision vSphere Pods and Tanzu Kubernetes clusters in their Namespaces via the kubectl plugin for vSphere (a login example follows this list).

  • VMware Tanzu Kubernetes Grid Service

    The Tanzu Kubernetes Grid service enables DevOps Engineers to create fully conformant upstream Kubernetes clusters on demand via kubectl, within their vSphere Namespaces, for all of their Kubernetes native workloads.

  • VMware vSphere Pod Service

    vSphere Pods are automated to run efficiently and securely, directly on the hypervisor. DevOps Engineers can use this service to deploy containerized workloads in their vSphere Namespaces.

  • Storage service

    The Storage service enables DevOps engineers to create and manage persistent volumes within their vSphere Namespaces, following the paravirtualized architecture of vSphere with Kubernetes. These volumes can be attached to and detached from both vSphere Pods and Tanzu Kubernetes cluster pods.

  • Network service

    The Network service automatically configures vSphere Pod networking and node and service type load balancer networking for Tanzu Kubernetes clusters.

  • Registry service

    Every cluster enabled with vSphere with Kubernetes also has a Registry service enabled which includes a Harbor cloud native repository (https://goharbor.io/). Every vSphere Namespace created on a vSphere with Kubernetes cluster gets a unique project created in Harbor for users of that Namespace to leverage.
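
As a hedged illustration of the DevOps workflow described in this list, a typical session with the kubectl plugin for vSphere might look like the following; the server address, user name, and namespace are placeholders.

    # Log in to the Supervisor Cluster with the kubectl plugin for vSphere
    kubectl vsphere login --server=<supervisor-cluster-ip> \
        --vsphere-username devops-user@vsphere.local \
        --insecure-skip-tls-verify

    # Switch to a vSphere Namespace you have access to and work inside it
    kubectl config use-context my-namespace
    kubectl get virtualmachineimages    # images available for provisioning Tanzu Kubernetes clusters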

Internationalization

vSphere with Kubernetes for vSphere 7.0 is available in the following languages:

  • English
  • French
  • German
  • Spanish
  • Japanese
  • Korean
  • Simplified Chinese
  • Traditional Chinese

Components of vSphere with Kubernetes for vSphere 7.0, including vCenter Server, ESXi, the vSphere Client, and the vSphere Host Client, do not accept non-ASCII input.

Compatibility

vSphere with Kubernetes for vSphere 7.0 will work on greenfield or brownfield installations of the following products:

  • vSphere 7.0 and later
  • NSX-T Advanced 3.0 and later

Before You Begin

vSphere with Kubernetes for vSphere 7.0 requires a special license in addition to the regular vSphere license. It is available as a part of VMware Cloud Foundation 4.0 and later.

Installation and Upgrades for This Release

To use vSphere with Kubernetes for vSphere 7.0, the following installations are required:

  • ESXi 7.0 or later
  • vCenter Server 7.0 or later
  • NSX-T Advanced 3.0 or later

vSphere with Kubernetes for vSphere 7.0 has the same hardware requirements as vSphere 7.0. However, vSphere with Kubernetes for vSphere 7.0 requires NSX-T Edge virtual machines, and those VMs support a smaller subset of compatible CPUs. See the NSX-T Data Center Installation Guide for more information.

For more information on getting started with vSphere with Kubernetes for vSphere 7.0, see the vSphere with Kubernetes Configuration and Management documentation.

Open Source Components for vSphere 7.0

The copyright statements and licenses applicable to the open source software components distributed in vSphere 7.0 are available at http://www.vmware.com. You need to log in to your My VMware account. Then, from the Downloads menu, select vSphere. On the Open Source tab, you can also download the source files for any GPL, LGPL, or other similar licenses that require the source code or modifications to source code to be made available for the most recent available release of vSphere.

Product Support Notices

  • VMware vSAN is included but not required. vSphere with Kubernetes for vSphere 7.0 is compatible with all vSphere storage partners.
  • VMware vSphere Virtual Volumes support is not included with vSAN for this release.
  • Users may experience some limitations when working with Tanzu Kubernetes clusters provisioned by the Tanzu Kubernetes Grid Service. For more information, see "Known Limitations for Tanzu Kubernetes Clusters" in the vSphere with Kubernetes Configuration and Management documentation.
  • vSphere with Kubernetes for vSphere 7.0 does not support vCenter Server 7.0 instances deployed with custom HTTP and HTTPS ports at this time.

 

Resolved Issues

  • Supervisor Namespace deletion is stuck in a "removing" state.

    After deleting a Supervisor Namespace, deletion is stuck in a "removing" state.

    Workaround: Do not attempt to delete a Supervisor Namespace until all Tanzu Kubernetes clusters in that namespace are deleted.

Known Issues

The known issues are grouped as follows.

Supervisor Cluster
  • Pod creation sometimes fails on a Supervisor Cluster when DRS is set to Manual mode

    Clusters where you enable workload management also must have HA and automated DRS enabled. Enabling workload management on clusters where HA and DRS are not enabled or where DRS is running in manual mode can lead to inconsistent behavior and Pod creation failures.

    Workaround: Enable DRS on the cluster and set it to Fully Automate or Partially Automate. Also ensure that HA is enabled on the cluster.

  • Storage class appears when you run kubectl get sc even after you remove the corresponding storage policy

    If you run kubectl get sc after you create a storage policy, add the policy to a namespace, and then remove the policy, the command response still lists the corresponding storage class.

    Workaround: Run kubectl describe namespace to see the storage classes actually associated with the namespace.

  • All storage classes returned when you run kubectl describe storage-class or kubectl get storage-class on a Supervisor Cluster instead of just the ones for the Supervisor namespace

    When you run the kubectl describe storage-class or kubectl get storage-class command on a Supervisor Cluster, the command returns all storage classes instead of just the ones for the Supervisor namespace.

    Workaround: Infer the storage class names associated with the namespace from the verbose name of the quota.
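
    A hedged illustration of this lookup, with a placeholder namespace name; the quota entries follow the standard Kubernetes per-storage-class quota naming, so the prefix before ".storageclass.storage.k8s.io" is the storage class name.

    kubectl describe namespace my-namespace
    # In the resource quota section, look for entries of the form
    #   <storage-class-name>.storageclass.storage.k8s.io/requests.storage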

  • Permissions in the Supervisor namespace do not match permissions in the container registry

    When a user with view permission to a namespace also belongs to a group that has edit permission to that namespace, a permission inconsistency results: the group edit permission lets the user deploy pods to the namespace, but the user cannot push images to the container registry because the group permission for the container registry is not inherited.

    Workaround: Assign edit permission to affected users, or do not add affected users to groups with edit permissions.

  • Share Kubernetes API endpoint button ignores FQDN even if it is configured

    Even if FQDN is configured for the Kubernetes control plane IP for Supervisor Cluster namespace, the share namespace button gives the IP address instead of the FQDN.

    Workaround: Manually share Supervisor Cluster namespace with FQDN.

  • Change of Primary Network Identifier (PNID) is not supported

    Once Workload Management is enabled on a cluster, changing the vCenter Server PNID may cause issues.

    Workaround: Do not change the vCenter Server PNID.

  • During Supervisor cluster upgrade, extra vSphere Pods might be created and stuck at pending status if Daemon set is used

    During Supervisor cluster upgrade, Daemon set controller creates extra vSphere Pods for each Supervisor control plane node. This is caused by an upstream Kubernetes issue.

    Workaround: Add NodeSelector/NodeAffinity to vSphere Pod spec, so the Daemon set controller can skip the control plane nodes for pods creation.
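
    A minimal sketch of this workaround, assuming the Supervisor control plane nodes carry the standard node-role.kubernetes.io/master label; the DaemonSet name, labels, and image are placeholders.

    apiVersion: apps/v1
    kind: DaemonSet
    metadata:
      name: example-ds                  # placeholder name
    spec:
      selector:
        matchLabels:
          app: example-ds
      template:
        metadata:
          labels:
            app: example-ds
        spec:
          affinity:
            nodeAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                nodeSelectorTerms:
                  - matchExpressions:
                      # Assumption: skip nodes carrying the control plane label
                      - key: node-role.kubernetes.io/master
                        operator: DoesNotExist
          containers:
            - name: example
              image: example/image:1.0  # placeholder image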

  • Unable to access the load balancer via kubectl vSphere login

    You cannot access the api server via kubectl vSphere login when using a load balanced endpoint.

    Workaround: This issue can manifest in two ways.

    1. Check whether the api server is accessible through the control plane virtual IP: curl -k https://vip:6443 (or 443).

      1. If the api server is not reachable through the load balancer virtual IP, the api server may not be up yet.

      2. Workaround: Wait a few minutes for the api server to become accessible.

    2. Check if the edge virtual machine node status is up.

      1. Log in to the NSX Manager.

      2. Go to System > Fabric > Nodes > Edge Transport Nodes. The node status should be up.

      3. Go to Networking > Load Balancers > Virtual Servers. Find the vips that end with kube-apiserver-lb-svc-6443 and kube-apiserver-lb-svc-443. If their status is not up, use the following workaround.

      4. Workaround: Reboot the edge VM. The edge VM should reconfigure after the reboot.

  • Cluster configuration of vSphere with Kubernetes show timeout errors during configuration

    During the configuration of the cluster, you may see the following error messages:

    Api request to param0 failed

    or

    Config operation for param0 node VM timed out

    Workaround: None. Enabling vSphere with Kubernetes can take from 30 to 60 minutes. If you see these or similar param0 timeout messages, they are not errors and can be safely ignored.

  • Enabling the container registry fails with error

    When the user enables the container registry from the UI, the enable action fails after 10 minutes with a timeout error.

    Workaround: Disable the container registry and retry enabling it. Note that the timeout error may occur again.

  • Enabling a cluster after disabling it fails with error

    Enabling a cluster shortly after disabling the cluster may create a conflict in the service account password reset process. The enable action fails with an error.

    Workaround: Restart the workload management service with the command vmon-cli --restart wcp.

  • Container images on a failed embedded container registry in a vSphere Namespace-enabled Kubernetes cluster cannot be restored

    Backup and restore of container images on the embedded container registry is not supported. If an embedded container registry in a vSphere Namespace-enabled Kubernetes cluster fails and is not recovered after a restart, the container images on it cannot be restored.

    Workaround: Perform periodic backup of container images on an embedded container registry to an external container registry by pulling images from the embedded container registry and pushing to the external container registry.
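
    A hedged shell sketch of such a periodic backup using docker pull, tag, and push; the registry addresses, project, and image names are placeholders.

    # Pull the image from the embedded registry
    docker pull <embedded-registry-address>/my-project/my-image:1.0

    # Re-tag it for the external registry
    docker tag <embedded-registry-address>/my-project/my-image:1.0 <external-registry-address>/my-project/my-image:1.0

    # Push the re-tagged image to the external registry
    docker push <external-registry-address>/my-project/my-image:1.0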
     

  • Deleting a container image tag in an embedded container registry might delete all image tags that share the same physical container image

    Multiple images with different tags can be pushed to a project in an embedded container registry from the same container image. If one of those images is deleted from the project, all other tags that were pushed from the same underlying image are also deleted.

    Workaround: The operation cannot be undone. Push the image to the project again.

  • The image pull count for a container image in a project on an embedded container registry in vSphere might be incorrect

    When a pod is deployed on a Kubernetes cluster in vSphere and the pod uses a container image in a project on the embedded container registry of the cluster, the image pull count for the image may show 2 instead of 1 on the container registry UI after the pod is deployed. The image pull count will be updated correctly if the image is pulled from a docker client outside the cluster.

    Workaround: None.

  • Some pods in the system namespace for the embedded container registry of a Kubernetes cluster on vSphere might fail and restart

    There are 7 pods running in the system namespace for the embedded container registry of a Kubernetes cluster on vSphere. If the storage space in a pod is filled up with logs, the pod will fail and restart. After the pod restarts, the container registry should work again as normal.

    Workaround: None.

  • The embedded container registry of a Kubernetes cluster on vSphere might fail to enable with error

    Some pods in the embedded container registry namespace might attach PVC (persistent volume claim) volumes during pod startup. When such a pod fails, new pods might not be able to start because they cannot attach the PVC volume. In these cases, you will see an error message in the pod events, such as Failed to attach cns volume or The resource 'volume' is in use. This can happen when pods fail during or after the embedded container registry enablement.

    Workaround: Delete all failed pods in the container registry namespace.

  • Failed purge operation on a registry project results in project being in 'error' state

    When you perform a purge operation on a registry project, the project temporarily displays as being in an error state. You cannot push or pull images from such a project. At regular intervals, projects are checked, and any project in an error state is deleted and recreated. When this happens, all previous project members are added back to the recreated project, and all repositories and images that previously existed in the project are deleted, effectively completing the purge operation.

    Workaround: None.

  • Container registry enablement fails when the storage capacity is less than 2000 mebibytes

    There is a minimum total storage capacity requirement for the container registry, expressed as the "limit" field in VMODL, because some Kubernetes pods need enough storage space to work properly. For container registry functionality, a minimum capacity of 5 gigabytes is required. Note that this minimum offers no guarantee of improved performance or of an increased number or size of supported images.

    Workaround: This issue can be avoided by deploying the container registry with a larger total capacity. The recommended storage volume is no less than 5 gigabytes.

  • If you replace the TLS certificate of the NSX load balancer for the Kubernetes cluster, you might fail to log in to the embedded Harbor registry from a docker client or the Harbor UI

    To replace the TLS certificate of the NSX load balancer for the Kubernetes cluster, from the vSphere UI navigate to Configure > Namespaces > Certificates > NSX Load Balancer > Actions and click Replace Certificate. When you replace the NSX certificate, the login operation to the embedded Harbor registry from a docker client or the Harbor UI might fail with the unauthorized: authentication required or Invalid user name or password error.

    Workaround: Restart the registry agent pod in the vmware-system-registry namespace:

    1. Run the kubectl get pod -n vmware-system-registry command.
    2. Delete the pod listed in the output by running the kubectl delete pod vmware-registry-controller-manager-xxxxxx -n vmware-system-registry command.
    3. Wait until the pod restarts.
  • Pods deployed with DNSDefault will use the clusterDNS settings

    Any vSphere Pod deployed in a Supervisor cluster that uses the DNSDefault policy falls back to the clusterDNS settings configured for the cluster.

    Workaround: None.
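
    A minimal sketch, assuming DNSDefault corresponds to the standard pod dnsPolicy value Default; the pod name and image are placeholders.

    apiVersion: v1
    kind: Pod
    metadata:
      name: example-pod               # placeholder name
    spec:
      dnsPolicy: Default              # assumption: the DNSDefault policy referenced above
      containers:
        - name: example
          image: example/image:1.0    # placeholder image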

  • All hosts in a cluster might be updated simultaneously when upgrading a Supervisor Cluster

    In certain cases, all hosts in a cluster will be updated in parallel during the Supervisor Cluster upgrade process. This will cause downtime for all pods running on this cluster.

    Workaround: During Supervisor Cluster upgrade, don't restart wcpsvc or remove/add hosts.

  • Supervisor Cluster upgrade can be stuck indefinitely if VMCA is used as an intermediate CA

    Supervisor Cluster upgrade can be stuck indefinitely in "configuring" if VMCA is being used as an intermediate CA.

    Workaround: Switch to a non-intermediate CA for VMCA and delete any control plane VMs stuck in "configuring".

  • vSphere Pod deployment fails if a Storage Policy with encryption enabled is assigned for Pod Ephemeral Disks

    If a Storage Policy with encryption enabled is used for Pod Ephemeral Disks, vSphere Pod creation fails with an “AttachVolume.Attach failed for volume” error.

    Workaround: Use a storage policy with no encryption for Pod Ephemeral Disks.

  • IP address is not shown for vSphere Pods in vSphere Client (HTML5)

    IP address is not shown for vSphere Pods in vSphere Client (HTML5).

    Workaround: None

Networking
  • NSX Edge virtual machine deployment fails on slow networks

    There is a combined 60 minute timeout for NSX Edge OVF deployment and NSX Edge VM registration. In slower networks or environments with slower storage, if the time elapsed for Edge deployment and registration exceeds this 60 minute timeout, the operation will fail.

    Workaround: Clean up edges and restart the deployment.

  • You must have two ESXi hosts minimum to perform cluster configuration

    Single ESXi host clusters are not supported at this time. Preflight checks will prevent you from proceeding with configuring the NSX components on the cluster if only one ESXi host is provided.

    Workaround: None.

  • NSX Edges are not updated if vCenter Server DNS, NTP, or Syslog settings are changed after cluster configuration

    DNS, NTP, and Syslog settings are copied from vCenter Server to NSX Edge virtual machines during cluster configuration. If any of these vCenter Server settings are changed after configuration, the NSX Edges are not updated.

    Workaround: Use the NSX Manager APIs to update the DNS, NTP, and Syslog settings of your NSX Edges.

  • NSX Edge Management Network Configuration only provides subnet and gateway configuration on select portgroups

    The NSX Edge management network compatibility drop down list will show subnet and gateway information only if there are ESXi VMKnics configured on the host that are backed by a DVPG on the selected VDS. If you select a Distributed Portgroup without a VMKnic attached to it, you must provide a subnet and gateway for the network configuration.

    Workaround: Use one of the following configurations:

    • Discrete Portgroup: This is where no VMKs currently reside. You must supply the appropriate subnet and gateway information for this portgroup.

    • Shared Management Portgroup: This is where the ESXi hosts' Management VMK resides. Subnet and gateway information will be pulled automatically.

  • Unable to use VLAN 0 during cluster configuration

    When attempting to use VLAN 0 for overlay Tunnel Endpoints or uplink configuration, the operation fails with the message:

    Argument 'uplink_network vlan' is not a valid VLAN ID for an uplink network. Please use a VLAN ID between 1-4094

    Workaround: Manually enable VLAN 0 support using one of the following processes:

    1. SSH into your deployed VC (root/vmware).

    2. Open /etc/vmware/wcp/nsxdsvc.yaml. It will have content similar to:

    logging: 
      level: debug
      maxsizemb: 10 

    a. To enable VLAN0 support for NSX Cluster Overlay Networks, append the following lines to /etc/vmware/wcp/nsxdsvc.yaml and save the file.

    experimental:
     supportedvlan: 
      hostoverlay: 
        min: 0 
        max: 4094 
      edgeoverlay: 
        min: 1 
        max: 4094 
      edgeuplink: 
        min: 1 
        max: 4094 

    b. To enable VLAN0 support for NSX Edge Overlay Networks, append the following lines to /etc/vmware/wcp/nsxdsvc.yaml and save the file.

    experimental: 
     supportedvlan: 
      hostoverlay: 
        min: 1 
        max: 4094 
      edgeoverlay: 
        min: 0 
        max: 4094 
      edgeuplink: 
        min: 1 
        max: 4094 

    c. To enable VLAN0 support for NSX Edge Uplink Networks, append the following lines to /etc/vmware/wcp/nsxdsvc.yaml and save the file.

    experimental: 
     supportedvlan: 
      hostoverlay: 
        min: 1 
        max: 4094 
      edgeoverlay: 
        min: 1 
        max: 4094 
      edgeuplink: 
        min: 0 
        max: 4094 

    3. Restart the workload management service with vmon-cli --restart wcp.

  • Error occurs when a large number of vSphere Pods are created simultaneously 

    When a large number of vSphere Pods are created simultaneously, some vSphere Pods can fail and enter the ErrImagePull state.

    Workaround: Redeploy these vSphere Pods.

VMware Tanzu Kubernetes Grid Service for vSphere
  • Tanzu Kubernetes cluster continues to access removed storage policy

    When a VI Admin deletes a storage class from the vCenter Server namespace, access to that storage class is not removed for any Tanzu Kubernetes cluster that is already using it.

    Workaround:

    1. As VI Admin, after deleting a storage class from the vCenter Server namespace, create a new storage policy with the same name.

    2. Re-add the existing storage policy or the one you just recreated to the supervisor namespace. TanzuKubernetesCluster instances using this storage class should now be fully-functional.

    3. For each TanzuKubernetesCluster resource using the storage class you wish to delete, create a new TanzuKubernetesCluster instance using a different storage class and use Velero to migrate workloads into the new cluster.

    4. Once no TanzuKubernetesCluster or PersistentVolume uses the storage class, it can be safely removed.
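
    A hedged sketch of the Velero migration in step 3, assuming Velero is installed in both the old and the new cluster with a shared backup storage location; the backup, restore, and namespace names are placeholders.

    # In the old cluster: back up the workload namespaces
    velero backup create my-migration-backup --include-namespaces my-app-namespace

    # In the new cluster: restore from that backup
    velero restore create my-migration-restore --from-backup my-migration-backup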

     

  • Supervisor containers restart due to leader election timeouts

    Storage I/O latency for etcd might cause Supervisor container restarts due to leader election timeouts.

    Workaround: None.

  • IP address is appended to the Tanzu Kubernetes cluster name

    If you define a Tanzu Kubernetes cluster name identical to the name of the Supervisor namespace where that cluster is deployed, the system automatically appends the control plane IP address to the Tanzu Kubernetes cluster name to avoid naming conflicts. For example, if you name a Tanzu Kubernetes cluster tkg-cluster_01 and the Supervisor namespace is also named tkg-cluster_01, running kubectl config get-contexts will show the Tanzu Kubernetes cluster name as tkg-cluster_01-10.174.4.33, where 10.174.4.33 is the control plane IP address.

    Workaround: If you do not want the IP address to be part of the Tanzu Kubernetes cluster name, define a name different than the Supervisor namespace where the Tanzu Kubernetes cluster is deployed.

  • Certificate refresh results in certificate rotation error

    Users may receive an error about certificate rotation when they attempt to refresh certificates. Certificates are refreshed after the Tanzu Kubernetes cluster receives the older certificate.

    Workaround: Retry certificate refresh.

  • Adding nodes to a Tanzu Kubernetes cluster might cause a virtual machine creation error in vCenter Server

    When a user scales out by adding nodes to a Tanzu Kubernetes cluster, the VM Operator attempts to recreate nodes that already exist. This attempt fails with a failed task message in the vCenter Server task list.

    Workaround: None.

  • Creation of Tanzu Kubernetes clusters fails with error when a content library is not associated with the Supervisor cluster

    A vCenter Server content library must be associated with a Supervisor cluster in order for the Tanzu Kubernetes Grid Service to function on that cluster. If this has not been configured, cluster creation will fail with the following or similar error:

    Error from server (storage class is not valid for control plane VM: 
    Mandatory StorageClass is not specified for TanzuKubernetesCluster '' , 
    storage class is not valid worker VMs: Mandatory StorageClass is not specified for TanzuKubernetesCluster '' , 
    could not find spec.distribution.version "v1.15.5+vmware.1.66-tkg.1.1034"): error when creating "test-cluster.yaml": 
    admission webhook "default.tanzukubernetescluster.kb.io" denied the request: 
    storage class is not valid for control plane VM: 
    Mandatory StorageClass is not specified for TanzuKubernetesCluster '' , 
    storage class is not valid worker VMs: Mandatory StorageClass is not specified for TanzuKubernetesCluster '' , 
    could not find spec.distribution.version "v1.15.5+vmware.1.66-tkg.1.1034"

    Workaround: Have your VI Administrator connect the Supervisor cluster to a content library in vCenter Server.

  • Cannot scale out a Tanzu Kubernetes cluster due to a deleted storage class

    If you try to scale out a Tanzu Kubernetes cluster that is dependent on a deleted storage class, the system will not be able to attach new disks using the deleted storage class and the action will fail.

    Workaround: In vCenter Server, create a new storage policy with the same name as the deleted storage policy. Add the recreated storage policy to the Supervisor namespace. TanzuKubernetesCluster instances using this storage class should now be fully-functional. For each TanzuKubernetesCluster resource using the storage class you want to delete, create a new TanzuKubernetesCluster instance using a different storage class. Use Velero to migrate workloads into the new cluster. Once no TanzuKubernetesCluster or PersistentVolume uses the storage class you want to delete, it can be safely removed.

  • The embedded container registry SSL certificate is not copied to Tanzu Kubernetes cluster nodes

    When the embedded container registry is enabled for a Supervisor Cluster, the Harbor SSL certificate is not included in any Tanzu Kubernetes cluster nodes created on that Supervisor Cluster, and you cannot connect to the registry from those nodes.

    Workaround: Copy and paste the SSL certificate from the Supervisor Cluster control plane to the Tanzu Kubernetes cluster worker nodes.

  • DNS server updates are not propagated to Tanzu Kubernetes clusters.

    If the vSphere Administrator updates the DNS Server supporting a vSphere with Kubernetes installation, the updated record is propagated to the Supervisor Cluster, but not to existing Tanzu Kubernetes clusters provisioned using the Tanzu Kubernetes Grid service. All existing Tanzu Kubernetes cluster nodes will continue to use previous DNS settings. Any new Tanzu Kubernetes cluster will inherit the new DNS record.

    Workaround: Existing Tanzu Kubernetes clusters that are inaccessible due to DNS changes must be upgraded or deleted and recreated.

  • Tanzu Kubernetes cluster​ virtual machine disks are thick-provisioned

    The disks for Tanzu Kubernetes cluster VMs are thick-provisioned regardless of the datastore policy.

    Workaround: None.

  • The name of a Tanzu Kubernetes cluster cannot exceed 31 bytes if a Kubernetes service of type LoadBalancer will be created for that cluster

    If you are creating a Tanzu Kubernetes cluster that will host one or more Kubernetes services of type LoadBalancer, the cluster name cannot exceed 31 bytes. For single-byte character sets this translates to approximately 30 characters. For double-byte character sets, it is 15.

    Workaround: None.

  • Post upgrade from Tanzu Kubernetes Grid 1.16.8 to 1.17.4, the "guest-cluster-auth-svc" pod on one of the control plane nodes is stuck at "Container Creating" state.

    After updating a Tanzu Kubernetes Cluster from Tanzu Kubernetes Grid Service 1.16.8 to 1.17.4, the "guest-cluster-auth-svc" pod on one of the cluster control plane nodes is stuck at "Container Creating" state.

    Workaround:

    1. SSH to one of the Tanzu Kubernetes cluster control plane nodes by following the instructions in the documentation topic titled "SSH to Tanzu Kubernetes Cluster Nodes as the System User."

    2. Once you are logged in as the `vmware-system-user` user, run the command "sudo su -" to switch to the root user.

    3. Run the following command: "KUBECONFIG=/etc/kubernetes/admin.conf /usr/lib/vmware-wcpgc-manifests/generate_key_and_csr.sh"

    4. After a few minutes, all authsvc pods should be running.

  • User is unable to manage existing pods on a Tanzu Kubernetes cluster during or after performing a cluster update.

    User is unable to manage existing pods on a Tanzu Kubernetes cluster during or after performing a cluster update.

    Workaround:

    1. SSH to one of the Tanzu Kubernetes cluster control plane nodes by following the instructions in the documentation topic titled "SSH to Tanzu Kubernetes Cluster Nodes as the System User."

    2. Once you are logged in as the `vmware-system-user` user, run the command "sudo su -" to switch to the root user.

    3. Run the following command: "KUBECONFIG=/etc/kubernetes/admin.conf /usr/lib/vmware-wcpgc-manifests/generate_key_and_csr.sh"

    4. After a few minutes, all authsvc pods should be running.

  • Virtual machine images are not available from the Content Library.

    When multiple vCenter Servers are configured in an Embedded Linked Mode setup, the UI allows the user to select a Content Library created on a different vCenter Server. Selecting such a library results in virtual machine images not being available to DevOps users for provisioning Tanzu Kubernetes clusters, and `kubectl get virtualmachineimages` does not return any results.

    Workaround: When you associate a Content Library with the Supervisor cluster for Tanzu Kubernetes cluster VM images, choose a library that is created in the same vCenter server where the Supervisor cluster resides. Alternatively, create a local content library which also supports air-gapped provisioning of Tanzu Kubernetes clusters.

  • LoadBalancer type Services deployed immediately after a Supervisor cluster update do not receive external IP

    LoadBalancer type Services deployed in Tanzu Kubernetes clusters immediately after a Supervisor cluster update do not receive an external IP right away. The Services eventually get an external IP address within a few minutes after the Supervisor update is complete.

    Workaround: None

  • Unable to create a Tanzu Kubernetes cluster with a name that begins with a numeral or contains periods.

    Tanzu Kubernetes clusters with names that begin with a numeral or contain periods will not be successfully created.

    Workaround: None

  • After a successful Tanzu Kubernetes cluster update, older cluster nodes are not removed from the cluster.

    After a successful Tanzu Kubernetes cluster update, older cluster nodes may not be removed from the cluster.

    Workaround: These nodes can be manually deleted by issuing the command "kubectl delete node <node_name>". The pods that are in the CrashLoopBackOff state can then be deleted and should return to the Running state.
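
    A hedged sketch of this cleanup against the Tanzu Kubernetes cluster context; the node, pod, and namespace names are placeholders.

    # List nodes and identify the stale ones left over from the update
    kubectl get nodes

    # Delete a stale node
    kubectl delete node old-worker-node-1

    # Find pods stuck in CrashLoopBackOff and delete them so they are recreated
    kubectl get pods --all-namespaces | grep CrashLoopBackOff
    kubectl delete pod my-stuck-pod -n my-namespace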

  • Tanzu Kubernetes cluster Upgrade Job fails with "timed out waiting for etcd health check to pass."

    The upgrade job in the vmware-system-tkg namespace associated with the upgrade of a Tanzu Kubernetes cluster fails with the following error message "timed out waiting for etcd health check to pass." The issue is caused by the missing PodIP addresses for the etcd pods.

    Workaround:

    Restart kubelet on the affected nodes, causing the etcd pods to restart and receive a PodIP. Then, run the following recovery steps to recover from a failed upgrade. Before attempting these steps, contact VMware support for guidance.

    1) For any Machine that was upgraded successfully but whose original Machine was not removed:

    • Remove the original Machine's node reference from etcd's member list
    • Delete the original Machine (leaving the newly upgraded one) 

    2) For any Machine that is unhealthy:

    • Retrieve the TanzuKubernetesCluster's resource version (.metadata.resourceVersion).
    • Retrieve the list of Machines with the annotation: "upgrade.cluster-api.vmware.com/id". These are the upgraded nodes from the previous upgrade attempt.
    • Update the annotation to match the resource version (not required if there's no difference).
    • Delete the upgrade Job belonging to the cluster.
    • Verify that the upgrade resumes.
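
    A hedged kubectl sketch of the lookups and deletion in step 2; the cluster, Machine, namespace, and Job names are placeholders.

    # Retrieve the TanzuKubernetesCluster resource version
    kubectl get tanzukubernetescluster my-cluster -n my-namespace -o jsonpath='{.metadata.resourceVersion}'

    # List Machines, then inspect a Machine's annotations for upgrade.cluster-api.vmware.com/id
    kubectl get machines -n my-namespace
    kubectl get machine my-machine -n my-namespace -o jsonpath='{.metadata.annotations}'

    # Delete the upgrade Job that belongs to the cluster in the vmware-system-tkg namespace
    kubectl get jobs -n vmware-system-tkg
    kubectl delete job my-cluster-upgrade-job -n vmware-system-tkg
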
  • Tanzu Kubernetes cluster creation fails when the Supervisor namespace is associated with a VM Encryption Storage Policy.

    Creation of Tanzu Kubernetes clusters fails if the Supervisor namespace is associated with a VM Encryption policy. The Tanzu Kubernetes cluster control plane VM does not get provisioned in vCenter Server.

    Workaround: None. VM Encryption is not supported with the Tanzu Kubernetes Grid Service. Select a different storage policy.

  • Guest cluster creation fails if POD CIDRs conflict across supervisor clusters on the same T0 router

    Guest cluster creation fails if POD CIDRs conflict across Supervisor clusters on the same T0 router. The guest clusters remain stuck in the "creating" stage.

    Workaround: None

NSX-T
  • vSphere with Kubernetes and NSX-T cannot be enabled on a cluster where vSphere Lifecycle Manager Image is enabled

    vSphere with Kubernetes and NSX-T are not compatible with vSphere Lifecycle Manager Image. They are only compatible with vSphere Lifecycle Manager Baselines. When vSphere Lifecycle Manager Image is enabled on a cluster, you cannot enable vSphere with Kubernetes or NSX-T on that cluster.

    Workaround: Move hosts to a cluster where vSphere Lifecycle Manager Image is disabled. You must use a cluster with vSphere Lifecycle Manager Baselines. Once the hosts are moved, you can enable NSX-T and then vSphere with Kubernetes on that new cluster.

  • "LoadBalancerIP" is not supported.

    For Kubernetes service of type LoadBalancer, the "LoadBalancerIP" configuration is not supported for Tanzu Kubernetes clusters.

    Workaround: None

  • "ExternalTrafficPolicy: local" not supported.

    For Kubernetes service of type LoadBalancer, the "ExternalTrafficPolicy: local" configuration is not supported.

    Workaround: None.

  • The number of services of type LoadBalancer that a Tanzu Kubernetes cluster can support is limited by the NodePort range of the Supervisor Cluster

    Each VirtualMachineService of type LoadBalancer is translated to one Kubernetes service of type LoadBalancer and one Kubernetes endpoint. The maximum number of Kubernetes services of type LoadBalancer that can be created in a Supervisor Cluster is 2767; this includes those created on the Supervisor Cluster itself and those created in Tanzu Kubernetes clusters.

    Workaround: None.