This topic contains release notes for Tanzu Kubernetes Grid Integrated Edition (TKGI) v1.12.

Warning: Before installing or upgrading to Tanzu Kubernetes Grid Integrated Edition v1.12, review the Breaking Changes below.

TKGI v1.12.8

Release Date: August 3, 2022

Product Snapshot

Release Details
Version v1.12.8
Release date August 3, 2022
Component Version
Antrea v1.2.2-0.13.3
cAdvisor v0.39.1
Containerd for Linux v1.5.13*
CoreDNS v1.8.6+vmware.6*
CSI Driver for vSphere v2.3.2 Release Notes
Docker Linux: v20.10.9
Windows: v20.10.9*
etcd v3.4.18
Harbor v2.5.3* Release Notes
Kubernetes v1.21.14* Release Notes
Metrics Server v0.3.6
NCP v3.1.2.6 Release Notes
Percona XtraDB Cluster (PXC) v0.43.0*
UAA v74.5.46*
Velero v1.6.2 Release Notes
VMware Cloud Foundation (VCF) v4.3.1
Wavefront Wavefront Collector: v1.6.0
Wavefront Proxy: v10.12
Compatibilities Versions
Ops Manager See VMware Tanzu Network.
NSX-T See VMware Product Interoperability Matrices**.
vSphere
Windows stemcells v2019.51 and later
Xenial stemcells See VMware Tanzu Network.

* Components marked with an asterisk have been updated.
** To use Policy API features, you must use NSX-T v3.1.3 or later.

Upgrade Path

The supported upgrade paths to Tanzu Kubernetes Grid Integrated Edition v1.12.8 are from TKGI v1.12.7 and earlier TKGI v1.12 patches, and from TKGI v1.11.10 and earlier TKGI v1.11 patches.
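
As a rough illustration of the rule above, the following shell sketch checks whether an installed TKGI version falls inside the supported upgrade paths to v1.12.8. The function names are hypothetical and not part of any TKGI tooling, and the comparison assumes a sort command that supports the -V option (GNU coreutils).

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of TKGI. The version bounds are taken from
# the upgrade path statement for TKGI v1.12.8.

# version_le A B: succeeds when version A is lower than or equal to version B.
# Relies on the version sort provided by GNU "sort -V".
version_le() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# can_upgrade_to_1_12_8 VERSION: succeeds when VERSION is a supported source
# version for a direct upgrade to TKGI v1.12.8.
can_upgrade_to_1_12_8() {
  local v="${1#v}"   # accept "v1.12.7" as well as "1.12.7"
  case "$v" in
    1.12.*) version_le "$v" "1.12.7" ;;
    1.11.*) version_le "$v" "1.11.10" ;;
    *)      return 1 ;;
  esac
}
```

For example, can_upgrade_to_1_12_8 v1.11.10 succeeds, while can_upgrade_to_1_12_8 v1.12.8 and can_upgrade_to_1_12_8 v1.10.9 fail.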

Enhancements and Resolved Issues

TKGI v1.12.8 has the following resolved issues:

Deprecations

For information about upcoming deprecations, see Upcoming Deprecations in the TKGI v1.12.0 Release Notes below.

Known Issues

Except where noted, the known issues in Tanzu Kubernetes Grid Integrated Edition v1.12.7 are also in Tanzu Kubernetes Grid Integrated Edition v1.12.8. See the TKGI v1.12.7 Known Issues below.

For Known Issues in NCP v3.1.2.5, see NSX Container Plugin 3.1.2 Release Notes.


TKGI Management Console v1.12.8

Release Date: August 3, 2022

Note: Tanzu Kubernetes Grid Integrated Edition Management Console provides an opinionated installation of TKGI. The supported versions may differ from or be more limited than what is generally supported by TKGI.

Product Snapshot

Element Details
Version v1.12.8
Release date August 3, 2022
Installed TKGI version v1.12.8
Installed Ops Manager version v2.10.45 Release Notes
Component Version
Installed Kubernetes version v1.21.14* Release Notes
Installed Harbor Registry version v2.5.3* Release Notes
Linux stemcell v621.256*
Windows stemcells v2019.51 and later*
* Components marked with an asterisk have been updated.

Upgrade Path

The supported upgrade paths to Tanzu Kubernetes Grid Integrated Edition Management Console v1.12.8 are from TKGI MC v1.12.7 and earlier TKGI MC v1.12 patches, and TKGI MC v1.11.10 and earlier TKGI MC v1.11 patches.

Features and Resolved Issues

This release of the TKGI Management Console includes no new features or resolved issues.

Deprecations

For information about upcoming deprecations, see Deprecations in the TKGI MC v1.12.0 Release Notes below.

Known Issues

Except where noted, the known issues in Tanzu Kubernetes Grid Integrated Edition Management Console v1.12.7 are also in Tanzu Kubernetes Grid Integrated Edition Management Console v1.12.8. See the TKGI MC v1.12.7 Known Issues below.


TKGI v1.12.7

Release Date: June 22, 2022

Product Snapshot

Release Details
Version v1.12.7
Release date June 22, 2022
Component Version
Antrea v1.2.2-0.13.3
cAdvisor v0.39.1
Containerd for Linux v1.5.11*
CoreDNS v1.8.4+vmware.9*
CSI Driver for vSphere v2.3.2* Release Notes
Docker Linux: v20.10.9
Windows: v20.10.7
etcd v3.4.18*
Harbor v2.5.1* Release Notes
Kubernetes v1.21.12* Release Notes
Metrics Server v0.3.6
NCP v3.1.2.6* Release Notes
Percona XtraDB Cluster (PXC) v0.42.0*
UAA v74.5.43*
Velero v1.6.2 Release Notes
VMware Cloud Foundation (VCF) v4.3.1
Wavefront Wavefront Collector: v1.6.0
Wavefront Proxy: v10.12
Compatibilities Versions
Ops Manager See VMware Tanzu Network.
NSX-T See VMware Product Interoperability Matrices**.
vSphere
Windows stemcells v2019.46 and later
Xenial stemcells See VMware Tanzu Network.

* Components marked with an asterisk have been updated.
** To use Policy API features, you must use NSX-T v3.1.3 or later.

Upgrade Path

The supported upgrade paths to Tanzu Kubernetes Grid Integrated Edition v1.12.7 are from TKGI v1.12.6 and earlier TKGI v1.12 patches, and from TKGI v1.11.10 and earlier TKGI v1.11 patches.

Features and Enhancements

  • Supports configuring the TKGI API Operation Timeout length. For more information, see Networking in Installing TKGI on vSphere with NSX-T.
  • Supports accessing images in a private Docker registry from Linux clusters with containerd container runtimes. For more information, see Configuring Cluster Access to Private Docker Registries (Beta).
  • Fluent Bit has been upgraded from v1.8.10 to v1.9.0. For more information, see Upgrade Notes in the Fluent Bit documentation.

Enhancements and Resolved Issues

TKGI v1.12.7 has the following resolved issues:

Deprecations

For information about upcoming deprecations, see Upcoming Deprecations in the TKGI v1.12.0 Release Notes below.

Known Issues

Except where noted, the known issues in Tanzu Kubernetes Grid Integrated Edition v1.12.6 are also in Tanzu Kubernetes Grid Integrated Edition v1.12.7. See the TKGI v1.12.6 Known Issues below.

For Known Issues in NCP v3.1.2.5, see NSX Container Plugin 3.1.2 Release Notes.


TKGI Management Console v1.12.7

Release Date: June 22, 2022

Note: Tanzu Kubernetes Grid Integrated Edition Management Console provides an opinionated installation of TKGI. The supported versions may differ from or be more limited than what is generally supported by TKGI.

Product Snapshot

Element Details
Version v1.12.7
Release date June 22, 2022
Installed TKGI version v1.12.7
Installed Ops Manager version v2.10.43 Release Notes
Component Version
Installed Kubernetes version v1.21.12* Release Notes
Installed Harbor Registry version v2.5.1* Release Notes
Linux stemcell v621.251*
Windows stemcells v2019.46 and later
* Components marked with an asterisk have been updated.

Upgrade Path

The supported upgrade paths to Tanzu Kubernetes Grid Integrated Edition Management Console v1.12.7 are from TKGI MC v1.12.6 and earlier TKGI MC v1.12 patches, and TKGI MC v1.11.10 and earlier TKGI MC v1.11 patches.

Features and Resolved Issues

TKGI Management Console v1.12.7 has the following features and enhancements:

Deprecations

For information about upcoming deprecations, see Deprecations in the TKGI MC v1.12.0 Release Notes below.

Known Issues

Except where noted, the known issues in Tanzu Kubernetes Grid Integrated Edition Management Console v1.12.6 are also in Tanzu Kubernetes Grid Integrated Edition Management Console v1.12.7. See the TKGI MC v1.12.6 Known Issues below.


TKGI v1.12.6

Release Date: April 29, 2022

Warning: VMware recommends that you upgrade to TKGI v1.12.6 or later as soon as possible to mitigate CVE-2022-22965, the Spring application remote execution vulnerability. For more information, see Spring Application Remote Code Execution Vulnerability CVE-2022-22965.

Product Snapshot

Release Details
Version v1.12.6
Release date April 29, 2022
Component Version
Antrea v1.2.2-0.13.3
cAdvisor v0.39.1
Containerd for Linux v1.5.9
CoreDNS v1.8.0+vmware.11
CSI Driver for vSphere v2.3.1 Release Notes
Docker Linux: v20.10.9
Windows: v20.10.7
etcd v3.4.13
Harbor v2.4.2 Release Notes
Kubernetes v1.21.9 Release Notes
Metrics Server v0.3.6
NCP v3.1.2.5 Release Notes
Percona XtraDB Cluster (PXC) v0.41.0
UAA v74.5.37
Velero v1.6.2 Release Notes
VMware Cloud Foundation (VCF) v4.3.1
Wavefront Wavefront Collector: v1.6.0
Wavefront Proxy: v10.12
Compatibilities Versions
Ops Manager See VMware Tanzu Network.
NSX-T See VMware Product Interoperability Matrices**.
vSphere
Windows stemcells v2019.46 and later
Xenial stemcells See VMware Tanzu Network.

* Components marked with an asterisk have been updated.
** To use Policy API features, you must use NSX-T v3.1.3 or later.

Upgrade Path

The supported upgrade paths to Tanzu Kubernetes Grid Integrated Edition v1.12.6 are from TKGI v1.12.4 and earlier TKGI v1.12 patches, and from TKGI v1.11.10 and earlier TKGI v1.11 patches.

Breaking Changes

TKGI v1.12.6 has the following breaking changes:

  • Docker commands on worker VMs no longer work:
    Clusters using the Docker container runtime no longer support Docker command-line commands by default. For more information, see Docker Commands No Longer Work on Worker VMs below.

Enhancements and Resolved Issues

TKGI v1.12.6 has the following resolved issues:

Deprecations

For information about upcoming deprecations, see Upcoming Deprecations in the TKGI v1.12.0 Release Notes below.

Known Issues

Except where noted, the known issues in Tanzu Kubernetes Grid Integrated Edition v1.12.5 are also in Tanzu Kubernetes Grid Integrated Edition v1.12.6. See the TKGI v1.12.5 Known Issues below.

For Known Issues in NCP v3.1.2.5, see NSX Container Plugin 3.1.2 Release Notes.

Warning: VMware recommends that you upgrade to TKGI v1.12.6 or later as soon as possible to mitigate CVE-2022-22965, the Spring application remote execution vulnerability. For more information, see Spring Application Remote Code Execution Vulnerability CVE-2022-22965.


Docker Commands No Longer Work on Worker VMs

This issue is fixed in TKGI v1.12.7.

Clusters using the Docker container runtime no longer support Docker command-line commands by default.

Supporting containerd as the default container runtime requires that Docker-specific features incompatible with containerd be disabled by default.

Workaround

In an environment where you want to run Docker commands, complete one of the following:

  • Export the Docker environment variables before using Docker commands:

    source /var/vcap/jobs/docker/bin/envrc
    

    For example:

    source /var/vcap/jobs/docker/bin/envrc 
    
    docker images
    REPOSITORY                                                                   TAG                                        IMAGE ID       CREATED         SIZE
    ...                                                                          1a3337bb81890b6bb1848b5dd4565dfa5d124f38   ffb57751a939   3 months ago    182MB
    
  • Use absolute Docker paths when referencing Docker:

    /var/vcap/packages/docker/bin/docker --host unix:///var/vcap/sys/run/docker/docker.sock
    

    For example:

    /var/vcap/packages/docker/bin/docker --host unix:///var/vcap/sys/run/docker/docker.sock images
    REPOSITORY                                                                   TAG                                        IMAGE ID       CREATED         SIZE
    ...                                                                          1a3337bb81890b6bb1848b5dd4565dfa5d124f38   ffb57751a939   3 months ago    182MB
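
If you prefer not to source envrc, the absolute-path form can be wrapped in a small shell function. This is a convenience sketch only: the function name tkgi_docker and the DOCKER_BIN and DOCKER_SOCK override variables are hypothetical, while the default paths are the ones shown in the workaround above.

```shell
# Hypothetical wrapper around the absolute Docker path and socket.
# DOCKER_BIN and DOCKER_SOCK are illustrative overrides; by default the
# function uses the paths documented for TKGI worker VMs.
tkgi_docker() {
  local bin="${DOCKER_BIN:-/var/vcap/packages/docker/bin/docker}"
  local sock="${DOCKER_SOCK:-unix:///var/vcap/sys/run/docker/docker.sock}"
  # Pass every argument through unchanged, for example: tkgi_docker images
  "$bin" --host "$sock" "$@"
}
```

After defining the function on a worker VM, tkgi_docker images runs the same command as the absolute-path example.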
    


Some Windows Pods Become Unreachable

On occasion, the IP addresses of one or more running Pods in a Windows cluster become unreachable. Pinging from within an unreachable Pod also fails and returns Request timed out.

The unreachable Pods might also enter a CrashLoopBackOff state.

Explanation

CNI requests within the Pod have entered a race condition. Afterward, networking for the Pod is unreachable.


TKGI Management Console v1.12.6

Release Date: April 29, 2022

Note: Tanzu Kubernetes Grid Integrated Edition Management Console provides an opinionated installation of TKGI. The supported versions may differ from or be more limited than what is generally supported by TKGI.

Product Snapshot

Element Details
Version v1.12.6
Release date April 29, 2022
Installed TKGI version v1.12.6
Installed Ops Manager version v2.10.37 Release Notes
Component Version
Installed Kubernetes version v1.21.9 Release Notes
Installed Harbor Registry version v2.4.2 Release Notes
Linux stemcell v621.224
Windows stemcells v2019.46 and later
* Components marked with an asterisk have been updated.

Upgrade Path

The supported upgrade paths to Tanzu Kubernetes Grid Integrated Edition Management Console v1.12.6 are from TKGI MC v1.12.4 and earlier TKGI MC v1.12 patches, and TKGI MC v1.11.10 and earlier TKGI MC v1.11 patches.

Features and Resolved Issues

This release of the TKGI Management Console includes no new features or resolved issues.

Deprecations

For information about upcoming deprecations, see Deprecations in the TKGI MC v1.12.0 Release Notes below.

Known Issues

Except where noted, the known issues in Tanzu Kubernetes Grid Integrated Edition Management Console v1.12.5 are also in Tanzu Kubernetes Grid Integrated Edition Management Console v1.12.6. See the TKGI MC v1.12.5 Known Issues below.


TKGI v1.12.5 - Withdrawn

Warning: This release has been removed from VMware Tanzu Network because of an upgrade issue.

Release Date: April 13, 2022

Warning: VMware recommends that you upgrade to TKGI v1.12.6 or later as soon as possible to mitigate CVE-2022-22965, the Spring application remote execution vulnerability. For more information, see Spring Application Remote Code Execution Vulnerability CVE-2022-22965.

Product Snapshot

Release Details
Version v1.12.5
Release date April 13, 2022
Component Version
Antrea v1.2.2-0.13.3
cAdvisor v0.39.1
Containerd for Linux v1.5.9
CoreDNS v1.8.0+vmware.11
CSI Driver for vSphere v2.3.1 Release Notes
Docker Linux: v20.10.9
Windows: v20.10.7
etcd v3.4.13
Harbor v2.4.2* Release Notes
Kubernetes v1.21.9 Release Notes
Metrics Server v0.3.6
NCP v3.1.2.5 Release Notes
Percona XtraDB Cluster (PXC) v0.41.0
UAA v74.5.37*
Velero v1.6.2 Release Notes
VMware Cloud Foundation (VCF) v4.3.1
Wavefront Wavefront Collector: v1.6.0
Wavefront Proxy: v10.12
Compatibilities Versions
Ops Manager See VMware Tanzu Network.
NSX-T See VMware Product Interoperability Matrices**.
vSphere
Windows stemcells v2019.46 and later
Xenial stemcells See VMware Tanzu Network.

* Components marked with an asterisk have been updated.
** To use Policy API features, you must use NSX-T v3.1.3 or later.

Upgrade Path

The supported upgrade paths to Tanzu Kubernetes Grid Integrated Edition v1.12.5 are from TKGI v1.12.4 and earlier TKGI v1.12 patches, and from TKGI v1.11.9 and earlier TKGI v1.11 patches.

Enhancements and Resolved Issues

TKGI v1.12.5 has the following features and enhancements:

TKGI v1.12.5 has the following resolved issues:

Deprecations

For information about upcoming deprecations, see Upcoming Deprecations in the TKGI v1.12.0 Release Notes below.

Known Issues

Except where noted, the known issues in Tanzu Kubernetes Grid Integrated Edition v1.12.4 are also in Tanzu Kubernetes Grid Integrated Edition v1.12.5. See the TKGI v1.12.4 Known Issues below.

For Known Issues in NCP v3.1.2.5, see NSX Container Plugin 3.1.2 Release Notes.

Warning: VMware recommends that you upgrade to TKGI v1.12.6 or later as soon as possible to mitigate CVE-2022-22965, the Spring application remote execution vulnerability. For more information, see Spring Application Remote Code Execution Vulnerability CVE-2022-22965.


CoreDNS Pods Fail While Upgrading to TKGI v1.12.5

This issue is fixed in TKGI v1.12.6.

CoreDNS worker Pods fail while upgrading from TKGI v1.12.3 or v1.12.4 to TKGI v1.12.5.

Symptoms

While upgrading clusters to TKGI v1.12.5, CoreDNS worker Pods fail and log errors similar to the following:

...start failed in pod codedns-... ErrImagePull: rpc error code = Unknown desc = Error response from daemon: unknown: artifact tkg/coredns:v1.8.0_vmware.9 not found
...12766 pod_workers.go:190] "Error syncing pod, skipping" err="failed to \"StartContainer\" for \"coredns\" with ImagePullBackOff: 
\"Back-off pulling image \\"projects.registry.vmware.com/tkg/coredns:v1.8.0_vmware.9\\"\" pod="kube-system/coredns-...

Explanation

During TKGI v1.12.5 cluster upgrades, the apply-addons errand updates the CoreDNS deployment definition from coredns:v1.8.0_vmware.9 to coredns:v1.8.0_vmware.11. Before the apply-addons errand has run, there is a window in which the deployment still points to coredns:v1.8.0_vmware.9 but CoreDNS nodes are being recreated with a local CoreDNS image that includes coredns:v1.8.0_vmware.11.

The kubelet does not find coredns:v1.8.0_vmware.9 locally and tries to pull that older version from the remote VMware registry. The CoreDNS Pods then fail with ImagePullBackOff or ErrImagePull errors.
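
To see which image the CoreDNS deployment currently references during this window, a check along the following lines may help. It is a sketch that assumes the deployment is named coredns in the kube-system namespace, as the Pod names in the log excerpt suggest. The function name and the KUBECTL override variable are hypothetical.

```shell
# Hypothetical helper: print the container image(s) referenced by the
# CoreDNS deployment. Assumes a deployment named "coredns" in "kube-system".
# KUBECTL is an illustrative override for the kubectl binary.
coredns_deployment_image() {
  "${KUBECTL:-kubectl}" get deployment coredns -n kube-system \
    -o jsonpath='{.spec.template.spec.containers[*].image}'
}
```

If the output still shows an image tagged v1.8.0_vmware.9 while worker nodes are being recreated, the apply-addons errand has probably not yet updated the deployment.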


Docker Service Fails While Upgrading Clusters to TKGI v1.12.5

This issue is fixed in TKGI v1.12.6.

The Docker service might fail for some Docker container runtime clusters while upgrading the clusters from TKGI v1.12.4 to TKGI v1.12.5.

Symptoms

While upgrading Docker container runtime clusters to TKGI v1.12.5, the cluster upgrade fails and logs errors similar to the following:

...operation: update, error-message: 'worker-... (1)' is not running after update. Review logs for failed jobs: docker...
...worker/...:~# monit summary The Monit daemon 5.2.5 uptime: 1m Process 'docker' Execution failed
...msg="Handler for POST /v1.41/images/create returned error: Get \"https://registry.tkg.vmware.run/v2/\": dial tcp: lookup registry.tkg.vmware.run on ...: no such host..."

Workaround

If upgrading a Docker container runtime cluster to TKGI v1.12.5 has failed:

  1. Restart the Docker process on that cluster.
  2. If any Pods in the cluster return Warning FailedCreatePodSandBox, you must recreate the affected worker node.


TKGI Management Console v1.12.5 - Withdrawn

Warning: This release has been removed from VMware Tanzu Network because of an upgrade issue.

Release Date: April 13, 2022

Note: Tanzu Kubernetes Grid Integrated Edition Management Console provides an opinionated installation of TKGI. The supported versions may differ from or be more limited than what is generally supported by TKGI.

Product Snapshot

Element Details
Version v1.12.5
Release date April 13, 2022
Installed TKGI version v1.12.5
Installed Ops Manager version v2.10.37 Release Notes
Component Version
Installed Kubernetes version v1.21.9 Release Notes
Installed Harbor Registry version v2.4.2* Release Notes
Linux stemcell v621.224*
Windows stemcells v2019.46 and later*
* Components marked with an asterisk have been updated.

Upgrade Path

The supported upgrade paths to Tanzu Kubernetes Grid Integrated Edition Management Console v1.12.5 are from TKGI MC v1.12.4 and earlier TKGI MC v1.12 patches, and TKGI MC v1.11.9 and earlier TKGI MC v1.11 patches.

Features and Resolved Issues

TKGI Management Console v1.12.5 has the following resolved issues:

Deprecations

For information about upcoming deprecations, see Deprecations in the TKGI MC v1.12.0 Release Notes below.

Known Issues

Except where noted, the known issues in Tanzu Kubernetes Grid Integrated Edition Management Console v1.12.4 are also in Tanzu Kubernetes Grid Integrated Edition Management Console v1.12.5. See the TKGI MC v1.12.4 Known Issues below.


TKGI v1.12.4

Release Date: March 1, 2022

Warning: VMware recommends that you upgrade to TKGI v1.12.6 or later as soon as possible to mitigate CVE-2022-22965, the Spring application remote execution vulnerability. For more information, see Spring Application Remote Code Execution Vulnerability CVE-2022-22965.

Product Snapshot

Release Details
Version v1.12.4
Release date March 1, 2022
Component Version
Antrea v1.2.2-0.13.3
cAdvisor v0.39.1
Containerd for Linux v1.5.9*
CoreDNS v1.8.0+vmware.11*
CSI Driver for vSphere v2.3.1* Release Notes
Docker Linux: v20.10.9
Windows: v20.10.7
etcd v3.4.13
Harbor v2.4.1 Release Notes
Kubernetes v1.21.9* Release Notes
Metrics Server v0.3.6
NCP v3.1.2.5 Release Notes
Percona XtraDB Cluster (PXC) v0.41.0*
UAA v74.5.34*
Velero v1.6.2 Release Notes
VMware Cloud Foundation (VCF) v4.3.1
Wavefront Wavefront Collector: v1.6.0
Wavefront Proxy: v10.12
Compatibilities Versions
Ops Manager See VMware Tanzu Network.
NSX-T See VMware Product Interoperability Matrices**.
vSphere
Windows stemcells v2019.44 and later
Xenial stemcells See VMware Tanzu Network.

* Components marked with an asterisk have been updated.
** To use Policy API features, you must use NSX-T v3.1.3 or later.

Upgrade Path

The supported upgrade paths to Tanzu Kubernetes Grid Integrated Edition v1.12.4 are from TKGI v1.12.3 and earlier TKGI v1.12 patches, and from TKGI v1.11.6 and earlier TKGI v1.11 patches.

Resolved Issues

TKGI v1.12.4 has the following resolved issues:

Deprecations

For information about upcoming deprecations, see Upcoming Deprecations in the TKGI v1.12.0 Release Notes below.

Known Issues

Except where noted, the known issues in Tanzu Kubernetes Grid Integrated Edition v1.12.3 are also in Tanzu Kubernetes Grid Integrated Edition v1.12.4. See the TKGI v1.12.3 Known Issues below.

Warning: VMware recommends that you upgrade to TKGI v1.12.6 or later as soon as possible to mitigate CVE-2022-22965, the Spring application remote execution vulnerability. For more information, see Spring Application Remote Code Execution Vulnerability CVE-2022-22965.

For Known Issues in NCP v3.1.2.5, see NSX Container Plugin 3.1.2 Release Notes.


Fluent Bit Does Not Merge Containerd Runtime Cluster Multi-Line Entries

The Fluent Bit Docker, CRI, Go, Java, and Python multi-line parsers do not merge containerd runtime cluster log entries that belong to the same context into a single log entry.


Fluent Bit DNS Resolution Timeout Failure

This issue is fixed in TKGI v1.12.7.

Fluent Bit v1.8.5 and later might fail to forward Pod log entries to the desired logging destination and instead return a DNS timeout error.

Symptoms

When Fluent Bit encounters this DNS resolution issue, it logs errors similar to the following:

[ warn] [engine] chunk '...' cannot be retried: task_id=1, input=tail.0 > output=splunk.0

[ warn] [net] getaddrinfo(host='...', err=11): Could not contact DNS servers

[ warn] [engine] failed to flush chunk '...', retry in 6 seconds: task_id=11, input=tail.0 > output=splunk.0 (out_id=0)

[ warn] [net] getaddrinfo(host='...', err=11): Could not contact DNS servers

Workaround

To resolve the Fluent Bit DNS timeout problem:

  1. Open the Fluent Bit ConfigMap in a text editor.
  2. Add the following to the OUTPUT section in the ConfigMap:

    net.dns.mode   TCP
    

    For more information on configuring a Fluent Bit ConfigMap, see Installation in the Fluent Bit Official Manual or the example ConfigMap files: kafka/fluent-bit-configmap.yaml and elasticsearch/fluent-bit-configmap.yaml in the Fluent Bit GitHub repository.

  3. Perform a rollout restart of Fluent Bit logging, then follow the container logs to confirm that the DNS errors no longer appear:

    kubectl logs POD-NAME -n NAMESPACE -c CONTAINER-NAME --follow
    

    Where:

    • POD-NAME is the name of your Pod.
    • NAMESPACE is the namespace for your Pod.
    • CONTAINER-NAME is the name of your Pod container.

    For more information on restarting Fluent Bit logging, see Interacting with running Pods in kubectl Cheat Sheet in the Kubernetes documentation.

For more information, see DNS resolution timeout/failure in >= 1.8.5 #4050 and Malformed HTTP response from splunk and cannot increase buffer on fluentBit v1.8.9 #4723 in the Fluent Bit GitHub repository.
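
Step 3 above refers to a rollout restart. One way to perform it with standard kubectl subcommands is sketched below, assuming Fluent Bit runs as a DaemonSet. The function name and the KUBECTL override variable are hypothetical, and DAEMONSET-NAME and NAMESPACE must be replaced with the values from your environment.

```shell
# Hypothetical helper: restart a Fluent Bit DaemonSet and wait for the
# rollout to finish. Both arguments are required because the DaemonSet name
# and namespace depend on your environment.
# KUBECTL is an illustrative override for the kubectl binary.
restart_fluent_bit() {
  local ds="${1:?usage: restart_fluent_bit DAEMONSET-NAME NAMESPACE}"
  local ns="${2:?usage: restart_fluent_bit DAEMONSET-NAME NAMESPACE}"
  # Trigger the restart, then block until the new Pods are rolled out.
  "${KUBECTL:-kubectl}" rollout restart "daemonset/${ds}" -n "${ns}" &&
    "${KUBECTL:-kubectl}" rollout status "daemonset/${ds}" -n "${ns}"
}
```

Once the rollout completes, the kubectl logs command from step 3 can be used to confirm that the getaddrinfo warnings have stopped.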


TKGI Management Console v1.12.4

Release Date: March 1, 2022

Note: Tanzu Kubernetes Grid Integrated Edition Management Console provides an opinionated installation of TKGI. The supported versions may differ from or be more limited than what is generally supported by TKGI.

Product Snapshot

Element Details
Version v1.12.4
Release date March 1, 2022
Installed TKGI version v1.12.4
Installed Ops Manager version v2.10.30 Release Notes
Component Version
Installed Kubernetes version v1.21.9* Release Notes
Installed Harbor Registry version v2.4.1 Release Notes
Linux stemcell v621.208*
Windows stemcells v2019.44 and later
* Components marked with an asterisk have been updated.

Upgrade Path

The supported upgrade paths to Tanzu Kubernetes Grid Integrated Edition Management Console v1.12.4 are from TKGI MC v1.12.3 and earlier TKGI MC v1.12 patches, and TKGI MC v1.11.7 and earlier TKGI MC v1.11 patches.

Features and Resolved Issues

TKGI Management Console v1.12.4 has the following resolved issues:

Deprecations

For information about upcoming deprecations, see Deprecations in the TKGI MC v1.12.0 Release Notes below.

Known Issues

Except where noted, the known issues in Tanzu Kubernetes Grid Integrated Edition Management Console v1.12.3 are also in Tanzu Kubernetes Grid Integrated Edition Management Console v1.12.4. See the TKGI MC v1.12.3 Known Issues below.


TKGI v1.12.3

Release Date: December 20, 2021

Warning: VMware recommends that you upgrade to TKGI v1.12.6 or later as soon as possible to mitigate CVE-2022-22965, the Spring application remote execution vulnerability. For more information, see Spring Application Remote Code Execution Vulnerability CVE-2022-22965.

Product Snapshot

Release Details
Version v1.12.3
Release date December 20, 2021
Component Version
Antrea v1.2.2-0.13.3
cAdvisor v0.39.1
Containerd for Linux v1.4.11
CoreDNS v1.8.0+vmware.9
CSI Driver for vSphere v2.3.0 Release Notes
Docker Linux: v20.10.9
Windows: v20.10.7
etcd v3.4.13
Harbor v2.4.1* Release Notes
Kubernetes v1.21.6 Release Notes
Metrics Server v0.3.6
NCP v3.1.2.5* Release Notes
Percona XtraDB Cluster (PXC) v0.40.0*
UAA v74.5.29*
Velero v1.6.2 Release Notes
VMware Cloud Foundation (VCF) v4.3.1
Wavefront Wavefront Collector: v1.6.0
Wavefront Proxy: v10.12*
Compatibilities Versions
Ops Manager v2.10.24, v2.10.25, v2.10.26, and v2.10.29 and later versions. For more information, see VMware Tanzu Network or the Ops Manager Release Notes.
NSX-T See VMware Product Interoperability Matrices**.
vSphere
Windows stemcells v2019.44 and later***
Xenial stemcells See VMware Tanzu Network.

* Components marked with an asterisk have been updated.
** To use Policy API features, you must use NSX-T v3.1.3 or later.
*** See Deployments Fail on TKGI Windows Worker-based Kubernetes Clusters after the January 2022 Microsoft Windows Security Patch.

Upgrade Path

The supported upgrade paths to Tanzu Kubernetes Grid Integrated Edition v1.12.3 are from TKGI v1.12.2 and earlier TKGI v1.12 patches, and from TKGI v1.11.6 and earlier TKGI v1.11 patches.

Resolved Issues

TKGI v1.12.3 has the following resolved issues:

Deprecations

For information about upcoming deprecations, see Upcoming Deprecations in the TKGI v1.12.0 Release Notes below.

Known Issues

Except where noted, the known issues in Tanzu Kubernetes Grid Integrated Edition v1.12.2 are also in Tanzu Kubernetes Grid Integrated Edition v1.12.3. See the TKGI v1.12.2 Known Issues below.

Warning: VMware recommends that you upgrade to TKGI v1.12.6 or later as soon as possible to mitigate CVE-2022-22965, the Spring application remote execution vulnerability. For more information, see Spring Application Remote Code Execution Vulnerability CVE-2022-22965.

For Known Issues in NCP v3.1.2.5, see NSX Container Plugin 3.1.2 Release Notes.


Harbor Private Projects Are Inaccessible after Upgrading to TKGI v1.12.3

If LDAP is enabled, Harbor private projects are inaccessible after upgrading to TKGI v1.12.3. For more information, see Private projects become inaccessible after upgrading Harbor for TKGI to v2.4.x with LDAP feature enabled in the VMware Tanzu Knowledge Base.


TKGI Management Console v1.12.3

Release Date: December 20, 2021

Note: Tanzu Kubernetes Grid Integrated Edition Management Console provides an opinionated installation of TKGI. The supported versions may differ from or be more limited than what is generally supported by TKGI.

Product Snapshot

Element Details
Version v1.12.3
Release date December 20, 2021
Installed TKGI version v1.12.3
Installed Ops Manager version v2.10.24 Release Notes
Component Version
Installed Kubernetes version v1.21.6 Release Notes
Installed Harbor Registry version v2.4.1* Release Notes
Linux stemcell v621.183*
Windows stemcells v2019.42 and later
* Components marked with an asterisk have been updated.

Upgrade Path

The supported upgrade paths to Tanzu Kubernetes Grid Integrated Edition Management Console v1.12.3 are from TKGI MC v1.12.2 and earlier TKGI MC v1.12 patches, and TKGI MC v1.11.7 and earlier TKGI MC v1.11 patches.

Features and Resolved Issues

This release of the TKGI Management Console includes no new features or resolved issues.

Deprecations

For information about upcoming deprecations, see Deprecations in the TKGI MC v1.12.0 Release Notes below.

Known Issues

Except where noted, the known issues in Tanzu Kubernetes Grid Integrated Edition Management Console v1.12.2 are also in Tanzu Kubernetes Grid Integrated Edition Management Console v1.12.3. See the TKGI MC v1.12.2 Known Issues below.


TKGI v1.12.2

Release Date: December 9, 2021

Warning: VMware recommends that you upgrade to TKGI v1.12.6 or later as soon as possible to mitigate CVE-2022-22965, the Spring application remote execution vulnerability. For more information, see Spring Application Remote Code Execution Vulnerability CVE-2022-22965.

Product Snapshot

Release Details
Version v1.12.2
Release date December 9, 2021
Component Version
Antrea v1.2.2-0.13.3
cAdvisor v0.39.1
Containerd for Linux v1.4.11*
CoreDNS v1.8.0+vmware.9*
CSI Driver for vSphere v2.3.0 Release Notes
Docker Linux: v20.10.9*
Windows: v20.10.7
etcd v3.4.13
Harbor v2.4.0* Release Notes
Kubernetes v1.21.6* Release Notes
Metrics Server v0.3.6
NCP v3.1.2.5* Release Notes
Percona XtraDB Cluster (PXC) v0.39.0
UAA v74.5.27*
Velero v1.6.2 Release Notes
VMware Cloud Foundation (VCF) v4.3.1
Wavefront Wavefront Collector: v1.6.0
Wavefront Proxy: v10.7
Compatibilities Versions
Ops Manager v2.10.24, v2.10.25, v2.10.26, and v2.10.29 and later versions. For more information, see VMware Tanzu Network or the Ops Manager Release Notes.
NSX-T See VMware Product Interoperability Matrices**.
vSphere
Windows stemcells v2019.44 and later***
Xenial stemcells See VMware Tanzu Network.

* Components marked with an asterisk have been updated.
** To use Policy API features, you must use NSX-T v3.1.3.
*** See Deployments Fail on TKGI Windows Worker-based Kubernetes Clusters after the January 2022 Microsoft Windows Security Patch.

Upgrade Path

The supported upgrade paths to Tanzu Kubernetes Grid Integrated Edition v1.12.2 are from TKGI v1.12.0 and v1.12.1, and from TKGI v1.11.6 and earlier TKGI v1.11 patches.

Resolved Issues

TKGI v1.12.2 has the following resolved issues:

Deprecations

For information about upcoming deprecations, see Upcoming Deprecations in the TKGI v1.12.0 Release Notes below.

Known Issues

Except where noted, the known issues in Tanzu Kubernetes Grid Integrated Edition v1.12.0 are also in Tanzu Kubernetes Grid Integrated Edition v1.12.2. See the TKGI v1.12.0 Known Issues below.

Warning: VMware recommends that you upgrade to TKGI v1.12.6 or later as soon as possible to mitigate CVE-2022-22965, the Spring application remote execution vulnerability. For more information, see Spring Application Remote Code Execution Vulnerability CVE-2022-22965.

For Known Issues in NCP v3.1.2.5, see NSX Container Plugin 3.1.2 Release Notes.


Harbor Private Projects Are Inaccessible after Upgrading to TKGI v1.12.2

If LDAP is enabled, Harbor private projects are inaccessible after upgrading to TKGI v1.12.2. For more information, see Private projects become inaccessible after upgrading Harbor for TKGI to v2.4.x with LDAP feature enabled in the VMware Tanzu Knowledge Base.


TKGI Management Console v1.12.2

Release Date: December 9, 2021

Note: Tanzu Kubernetes Grid Integrated Edition Management Console provides an opinionated installation of TKGI. The supported versions may differ from or be more limited than what is generally supported by TKGI.

Product Snapshot

Element Details
Version v1.12.2
Release date December 9, 2021
Installed TKGI version v1.12.2
Installed Ops Manager version v2.10.21* Release Notes
Component Version
Installed Kubernetes version v1.21.6* Release Notes
Installed Harbor Registry version v2.4.0* Release Notes
Linux stemcell v621.176*
Windows stemcells v2019.42* and later
* Components marked with an asterisk have been updated.

Upgrade Path

The supported upgrade paths to Tanzu Kubernetes Grid Integrated Edition Management Console v1.12.2 are from TKGI MC v1.12.1 and TKGI MC v1.11.5 and earlier TKGI MC v1.11 patches.

Features and Resolved Issues

TKGI Management Console v1.12.2 has the following resolved issues:

Deprecations

For information about upcoming deprecations, see Deprecations in the TKGI MC v1.12.0 Release Notes below.

Known Issues

Except where noted, the known issues in Tanzu Kubernetes Grid Integrated Edition Management Console v1.12.0 are also in Tanzu Kubernetes Grid Integrated Edition Management Console v1.12.2. See the TKGI MC v1.12.0 Known Issues below.


TKGI v1.12.1

Release Date: November 30, 2021

Warning: VMware recommends that you upgrade to TKGI v1.12.6 or later as soon as possible to mitigate CVE-2022-22965, the Spring application remote code execution vulnerability. For more information, see Spring Application Remote Code Execution Vulnerability CVE-2022-22965.

Product Snapshot

Release Details
Version v1.12.1
Release date November 30, 2021
Component Version
Antrea v1.2.2-0.13.3
cAdvisor v0.39.1
Containerd for Linux v1.4.6
CoreDNS v1.8.0+vmware.8*
CSI Driver for vSphere v2.3.0 Release Notes
Docker Linux: v20.10.7
Windows: v20.10.7
etcd v3.4.13
Harbor v2.3.2* Release Notes
Kubernetes v1.21.5* Release Notes
Metrics Server v0.3.6
NCP v3.1.2.3 Release Notes
Percona XtraDB Cluster (PXC) v0.39.0*
UAA v74.5.26*
Velero v1.6.2 Release Notes
VMware Cloud Foundation (VCF) v4.3.1
Wavefront Wavefront Collector: v1.6.0
Wavefront Proxy: v10.7
Compatibilities Versions
Ops Manager v2.10.24, v2.10.25, v2.10.26, and v2.10.29 and later versions. For more information, see VMware Tanzu Network or the Ops Manager Release Notes.
NSX-T See VMware Product Interoperability Matrices**.
vSphere
Windows stemcells v2019.44 and later***
Xenial stemcells See VMware Tanzu Network.

* Components marked with an asterisk have been updated.
** To use Policy API features, you must use NSX-T v3.1.3.
*** See Deployments Fail on TKGI Windows Worker-based Kubernetes Clusters after the January 2022 Microsoft Windows Security Patch.

Upgrade Path

The supported upgrade paths to Tanzu Kubernetes Grid Integrated Edition v1.12.1 are from TKGI v1.12.0 and TKGI v1.11.4 and earlier TKGI v1.11 patches.

Resolved Issues

TKGI v1.12.1 has the following resolved issues:

Deprecations

For information about upcoming deprecations, see Upcoming Deprecations in the TKGI v1.12.0 Release Notes below.

Known Issues

Except where noted, the known issues in Tanzu Kubernetes Grid Integrated Edition v1.12.0 are also in Tanzu Kubernetes Grid Integrated Edition v1.12.1. See the TKGI v1.12.0 Known Issues below.

Warning: VMware recommends that you upgrade to TKGI v1.12.6 or later as soon as possible to mitigate CVE-2022-22965, the Spring application remote code execution vulnerability. For more information, see Spring Application Remote Code Execution Vulnerability CVE-2022-22965.

For Known Issues in NCP v3.1.2.3, see NSX Container Plugin 3.1.2 Release Notes.


TKGI Management Console v1.12.1

Release Date: October 26, 2021

Note: Tanzu Kubernetes Grid Integrated Edition Management Console provides an opinionated installation of TKGI. The supported versions may differ from or be more limited than what is generally supported by TKGI.

Product Snapshot

Element Details
Version v1.12.1
Release date October 26, 2021
Installed TKGI version v1.12.1
Installed Ops Manager version v2.10.19 Release Notes
Component Version
Installed Kubernetes version v1.21.5* Release Notes
Installed Harbor Registry version v2.3.2* Release Notes
Linux stemcell v621.160*
Windows stemcells v2019.37 and later
* Components marked with an asterisk have been updated.

Upgrade Path

The supported upgrade paths to Tanzu Kubernetes Grid Integrated Edition Management Console v1.12.1 are from TKGI MC v1.12.0 and TKGI MC v1.11.5 and earlier TKGI MC v1.11 patches.

Features and Resolved Issues

This release of the TKGI Management Console includes no new features or resolved issues.

Deprecations

For information about upcoming deprecations, see Deprecations in the TKGI MC v1.12.0 Release Notes below.

Known Issues

Except where noted, the known issues in Tanzu Kubernetes Grid Integrated Edition Management Console v1.12.0 are also in Tanzu Kubernetes Grid Integrated Edition Management Console v1.12.1. See the TKGI MC v1.12.0 Known Issues below.


TKGI v1.12.0

Release Date: September 9, 2021

Warning: VMware recommends that you upgrade to TKGI v1.12.6 or later as soon as possible to mitigate CVE-2022-22965, the Spring application remote code execution vulnerability. For more information, see Spring Application Remote Code Execution Vulnerability CVE-2022-22965.

Product Snapshot

Release Details
Version v1.12.0
Release date September 9, 2021
Component Version
Antrea v1.2.2-0.13.3*
cAdvisor v0.39.1*
Containerd for Linux v1.4.6*
CoreDNS v1.8.0+vmware.6*
CSI Driver for vSphere v2.3.0* Release Notes
Docker Linux: v20.10.7*
Windows: v20.10.7*
etcd v3.4.13
Harbor v2.3.1* Release Notes
Kubernetes v1.21.3* Release Notes
Metrics Server v0.3.6
NCP v3.1.2.3* Release Notes
Percona XtraDB Cluster (PXC) v0.37.0*
UAA v74.5.25*
Velero v1.6.2* Release Notes
VMware Cloud Foundation (VCF) v4.3.1*
Wavefront Wavefront Collector: v1.6.0*
Wavefront Proxy: v10.7*
Compatibilities Versions
Ops Manager v2.10.24, v2.10.25, v2.10.26, and v2.10.29 and later versions. For more information, see VMware Tanzu Network or the Ops Manager Release Notes.
NSX-T See VMware Product Interoperability Matrices**.
vSphere
Windows stemcells v2019.44 and later***
Xenial stemcells See VMware Tanzu Network.

* Components marked with an asterisk have been updated.
** To use Policy API features, you must use NSX-T v3.1.3.
*** See Deployments Fail on TKGI Windows Worker-based Kubernetes Clusters after the January 2022 Microsoft Windows Security Patch.

Upgrade Path

The supported upgrade paths to Tanzu Kubernetes Grid Integrated Edition v1.12.0 are from TKGI v1.11.3 and earlier TKGI v1.11 patches.

Breaking Changes

TKGI v1.12.0 has the following breaking changes:

  • Fluent Bit has been upgraded to Fluent Bit v1.5.7:

    • Previous versions of TKGI have used Fluent Bit v1.3.4. For information on the differences between the Fluent Bit v1.3.4 and v1.5.7 releases, see Upgrade Notes in the Fluent Bit documentation.

    • By default, Fluent Bit v1.5.7 limits the size of Kubernetes log metadata to 32 KB. For more information, see Some Kubernetes Logs Are Not Forwarded after Upgrading in the TKGI v1.12.0 Known Issues below.

  • TKGI uses a new sink-resources image path:

    With TKGI v1.12.0, the sink-resources image path has changed from epks-docker-local.artifactory.eng.vmware.com/oratos/ to gcr.io/cf-pks-releng-environments/oratos/.

  • Newly created clusters use the containerd runtime by default instead of the Docker runtime:

    The logging prefix on containerd runtime cluster log entries differs from the prefix on Docker runtime cluster log entries. Cluster log entries read by Fluent Bit are prefixed as follows:

    • TIMESTAMP (stdout|stderr) [EP] in Docker runtime cluster output.
    • TIMESTAMP stdout F in containerd runtime cluster output.
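
If you parse forwarded log entries downstream, the prefix change might need handling on your side. The following minimal Python sketch is not part of TKGI. It classifies a log line using only the two prefix shapes listed above, and assumes that TIMESTAMP is a single token without whitespace and that fields are separated by single spaces. The name classify_prefix is hypothetical.

```python
import re
from typing import Optional

# Prefix shapes as listed above. TIMESTAMP is assumed to be one
# whitespace-free token; this is an assumption, not a TKGI guarantee.
CONTAINERD_PREFIX = re.compile(r"^\S+ stdout F(?: |$)")
DOCKER_PREFIX = re.compile(r"^\S+ (?:stdout|stderr) [EP](?: |$)")


def classify_prefix(line: str) -> Optional[str]:
    """Return 'containerd', 'docker', or None for an unrecognized prefix."""
    if CONTAINERD_PREFIX.match(line):
        return "containerd"
    if DOCKER_PREFIX.match(line):
        return "docker"
    return None
```

A parser that accepts both shapes can process logs from clusters created before and after the upgrade, because existing clusters keep their current runtime.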

Features

TKGI v1.12.0 has the following features and enhancements:

Supports NSX-T Policy API

(Beta) Supports NSX-T Policy API for new installations. See Considerations for Using the NSX-T Policy API with TKGI and Networking in Installing Tanzu Kubernetes Grid Integrated Edition on vSphere with NSX-T.

Warning: Support for the NSX-T Policy API is currently in beta and is intended for evaluation and test purposes only. Do not use this feature in a production environment.

Supports Cluster Cert Rotation Using the CLI

Supports Kubernetes cluster certificate rotation using the TKGI CLI. See Rotate Kubernetes Cluster Certificates.

Supports Persistent Redirect URIs

Supports assigning persistent UAA cluster_client redirect_uri URIs to clusters on the TKGI tile. UAA redirect URIs configured on the TKGI tile persist through cluster updates and TKGI upgrades. For more information on configuring persistent UAA redirect URIs, see UAA in Installing TKGI.

Supports Clusters That Use the containerd Runtime

Supports creating and managing clusters that use the containerd runtime. Newly created clusters use the containerd runtime by default. You can optionally create clusters that use the Docker runtime. For more information, see Creating Clusters.

Note: You cannot manually change the container runtime for existing clusters. TKGI does not automatically update existing clusters to use the containerd runtime.

Configure Clusters to Use NSX-T Client-Side SSL Profiles

Supports using Network Profiles to configure a cluster to use an NSX-T client-side SSL profile. For information about using a Network Profile to configure a cluster to use an NSX-T client-side SSL profile, see Network Profile Parameters or Confirm the Network Profile Property Supports Updates in Creating and Deleting Network Profiles (NSX-T Only).

Configure Clusters to Use NSX-T Load Balancer TCP Multiplexing

Supports using Network Profiles to configure a cluster to use NSX-T load balancer TCP multiplexing and to limit the number of allowed TCP multiplexing connections. For information about using a Network Profile to configure a cluster to use NSX-T load balancer TCP multiplexing, see Network Profile Parameters or Confirm the Network Profile Property Supports Updates in Creating and Deleting Network Profiles (NSX-T Only).

Migrate from In-Tree vSphere Storage Volumes to the Automatically Installed vSphere Container Storage Plug-in

Supports migrating an existing Linux cluster from an in-tree vSphere Storage Volume to the automatically installed vSphere CSI Driver. For more information about migrating existing clusters to the vSphere Container Storage Plug-in, see Migrate an In-Tree vSphere Storage Volume to the Automatically Installed vSphere CSI Driver in Deploying and Managing Cloud Native Storage (CNS) on vSphere.

SAML IdP Enhancements

Includes the following SAML IdP enhancements:

  • Supports users directly authenticating with the configured external identity provider.
  • Supports automatically bypassing the scope approval screen for the tkgi get-credentials CLI command and when logging in to the TKGI CLI.

Supports Securing Log Forwarding with a Custom CA Certificate

Supports optionally securing syslog ClusterLogSink and LogSink log forwarding with a custom CA certificate. For more information, see Create a Syslog ClusterLogSink or LogSink Resource.

Additional Features

TKGI v1.12.0 includes the following additional features:

  • Supports configuring the MetricSink telegraf interval setting. For more information, see Create a ClusterMetricSink or MetricSink Resource in Creating and Managing Sink Resources.
  • Supports NetApp Trident without manually sharing the root filesystem on each worker node. For more information about Trident, see Trident for Kubernetes in the NetApp documentation.
  • Supports using logsink to forward logs with TLS enabled in addition to the existing support for forwarding logs with TLS disabled.
  • Supports third-party agents that previously failed to start and returned the following error:

    Error: failed to start container "agent": Error response from daemon:
    path / is mounted on / but it is not a shared or slave mount.
    

Resolved Issues

TKGI v1.12.0 has the following resolved issues:

Deprecations

The following TKGI features have been deprecated in or removed from TKGI v1.12:

  • Support for DenyEscalatingExec has been removed from Kubernetes v1.21 and can no longer be configured in the TKGI tile. For more information, see DenyEscalatingExec in the Kubernetes documentation.

Upcoming Deprecations

The following TKGI features will be deprecated or removed in future versions of TKGI:

  • Flannel Support: Support for the Flannel Container Networking Interface (CNI) is deprecated. VMware recommends that you upgrade your Flannel CNI-configured clusters to the Antrea CNI. For more information about Flannel CNI deprecation, see About Upgrading from the Flannel CNI to the Antrea CNI in About Tanzu Kubernetes Grid Integrated Edition Upgrades.

  • Docker Support: Kubernetes support for the Docker container runtime has been deprecated, and support for the Docker container runtime will be entirely removed in Kubernetes v1.24. TKGI v1.12 supports both the Docker and containerd container runtimes.

  • In-Tree vSphere Storage Volume Support: In-tree vSphere storage volume support has been deprecated, and support will be entirely removed in TKGI v1.15. For information on how to manually migrate volumes on existing TKGI clusters from the in-tree vSphere Storage Driver to the automatically installed vSphere CSI Driver, see Migrate an In-Tree vSphere Storage Volume to the vSphere CSI Driver in Deploying and Managing Cloud Native Storage (CNS) on vSphere.

  • Manual vSphere CSI Driver Installation Support: Support for manually installing the vSphere CSI Driver will be deprecated in TKGI v1.13 and will be entirely removed in a future TKGI version. For information on automatic vSphere CSI Driver installation, see Deploying and Managing Cloud Native Storage (CNS) on vSphere.

Known Issues

TKGI v1.12.0 has the following known issues.

Warning: VMware recommends that you upgrade to TKGI v1.12.6 or later as soon as possible to mitigate CVE-2022-22965, the Spring application remote code execution vulnerability. For more information, see Spring Application Remote Code Execution Vulnerability CVE-2022-22965.

For Known Issues in NCP v3.1.2.3, see NSX Container Plugin 3.1.2 Release Notes.

Warning: Do not use TKGI with Ops Manager v2.10.17 with vSphere CPIv2. For more information, see v2.10.17 in Ops Manager v2.10 Release Notes.


TKGI MC Unable to Manage TKGI after Restoring the TKGI Control Plane from Backup

Symptom

After you restore Ops Manager and the TKGI API VM from backup, TKGI functions normally, but your TKGI MC tabs include the following error: “…product ‘pivotal-container service’ is not deployed…”.

Explanation

TKGI MC is associated with an Ops Manager that has a specific name. If you give Ops Manager a new name while restoring it, TKGI MC does not recognize the restored Ops Manager and cannot manage it.


Kubernetes Pods on NSX-T Become Stuck in a Creating State

Symptom

The pods in your TKGI Kubernetes clusters on NSX-T become stuck in a creating state. The connections between nsx-node-agent and hyperbus repeatedly close, log Couldn't connect to 'tcp://...' (error: 111-Connection refused), and have a status of COMMUNICATION_ERROR.

Explanation

For information and workaround steps for this Known Issue, see Issue 2795268: Connection between nsx-node-agent and hyperbus flips and Kubernetes pod is stuck at creating state in NSX Container Plugin 3.1.2 Release Notes in the VMware documentation.


Pods Stop After Upgrading From NSX-T v3.0.2 to v3.1.0 on vSphere 7.0 and 7.0.1

This issue is fixed in NSX-T v3.0.3 and NSX-T v3.1.3.

Symptom

Your TKGI-provisioned Pods stop after upgrading from NSX-T v3.0.2 to NSX-T v3.1.0 on vSphere 7.0 and 7.0.1.

Explanation

For information, see Issue 2603550: Some VMs are vMotioned and lose network connectivity during UA nodes upgrade in the VMware NSX-T Data Center 3.1.1 Release Notes.

Workaround

To avoid the loss of network connectivity during UA node upgrade, ensure DRS is set to manual mode during your upgrade from NSX-T v3.0.2 to v3.1.0.

If you upgraded to NSX-T v3.1.0 with DRS in automation mode, run the following on the affected Pods’ control plane VMs to restore Pod connectivity:

monit restart ncp

For more information about upgrading NSX-T v3.0.2 to NSX-T v3.1.0, see Upgrade NSX-T Data Center to v3.0 or v3.1.


Error: Could Not Execute “Apply-Changes” in Azure Environment

Symptom

After clicking Apply Changes on the TKGI tile in an Azure environment, you experience an error ‘…could not execute “apply-changes”…’ with either of the following descriptions:

  • {"errors":{"base":["undefined method `location' for nil:NilClass"]}}
  • FailedError.new("Resource Groups in region '#{location}' do not support Availability Zones"))

For example:

INFO | 2020-09-21 03:46:49 +0000 | Vessel::Workflows::Installer#run | Install product (apply changes)
2020/09/21 03:47:02 could not execute "apply-changes": installation failed to trigger: request failed: unexpected response from /api/v0/installations:
HTTP/1.1 500 Internal Server Error
Transfer-Encoding: chunked
Cache-Control: no-cache, no-store
Connection: keep-alive
Content-Type: application/json; charset=utf-8
Date: Mon, 21 Sep 2020 17:51:50 GMT
Expires: Fri, 01 Jan 1990 00:00:00 GMT
Pragma: no-cache
Referrer-Policy: strict-origin-when-cross-origin
Server: Ops Manager
Strict-Transport-Security: max-age=31536000; includeSubDomains
X-Content-Type-Options: nosniff
X-Download-Options: noopen
X-Frame-Options: SAMEORIGIN
X-Permitted-Cross-Domain-Policies: none
X-Request-Id: f5fc99c1-21a7-45c3-7f39
X-Runtime: 9.905591
X-Xss-Protection: 1; mode=block

44
{"errors":{"base":["undefined method `location' for nil:NilClass"]}}
0

Explanation

The Azure CPI endpoint used by Ops Manager has been changed and your installed version of Ops Manager is not compatible with the new endpoint.

Workaround

Run the following Ops Manager CLI command:

om --skip-ssl-validation --username USERNAME --password PASSWORD --target https://OPSMAN-API curl --silent --path /api/v0/staged/director/verifiers/install_time/IaasConfigurationVerifier -x PUT -d '{ "enabled": false }'

Where:

  • USERNAME is the account to use to run Ops Manager API commands.
  • PASSWORD is the password for the account.
  • OPSMAN-API is the IP address for the Ops Manager API.
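
If you run the workaround from automation, building the command as an argument list avoids shell quoting problems with the JSON payload. The following Python sketch only assembles the om command shown above from the three placeholders. The function name build_disable_verifier_command is hypothetical, and the sketch does not run the command.

```python
import json
import shlex


def build_disable_verifier_command(username: str, password: str, opsman_api: str) -> list:
    """Build the argument list for the om command shown above."""
    return [
        "om", "--skip-ssl-validation",
        "--username", username,
        "--password", password,
        "--target", f"https://{opsman_api}",
        "curl", "--silent",
        "--path", "/api/v0/staged/director/verifiers/install_time/IaasConfigurationVerifier",
        "-x", "PUT",
        # json.dumps renders Python False as the JSON literal false.
        "-d", json.dumps({"enabled": False}),
    ]


if __name__ == "__main__":
    # Placeholder values; replace them with your own before use.
    argv = build_disable_verifier_command("USERNAME", "PASSWORD", "OPSMAN-API")
    print(shlex.join(argv))
```

You can pass the returned list to subprocess.run without shell=True once you have reviewed the values.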

For more information, see Error ‘undefined method location’ is received when running Apply Change on Azure in the VMware Tanzu Knowledge Base.


VMware vRealize Operations Does Not Support Windows Worker-Based Kubernetes Clusters

VMware vRealize Operations (vROps) does not support Windows worker-based Kubernetes clusters and cannot be used to manage TKGI-provisioned Windows workers.


TKGI Wavefront Requires Manual Installation for Windows Workers

To monitor Windows-based worker node clusters with a Wavefront collector and proxy, you must first install Wavefront on the clusters manually, using Helm. For instructions, see the Wavefront section of the Monitoring Windows Worker Clusters and Nodes topic.


Pinging Windows Worker Kubernetes Clusters Does Not Work

TKGI-provisioned Windows worker-based Kubernetes clusters inherit a Kubernetes limitation that prevents outbound ICMP communication from workers. As a result, pinging Windows workers does not work.

For information about this limitation, see Limitations > Networking in the Windows in Kubernetes documentation.


Velero Does Not Support Backing Up Stateful Windows Workloads

You can use Velero to back up stateless TKGI-provisioned Windows workers only. You cannot use Velero to back up stateful Windows applications. For more information, see Velero on Windows in Basic Install in the Velero documentation.


Tanzu Mission Control Integration Not Supported on GCP

TKGI on Google Cloud Platform (GCP) does not support Tanzu Mission Control (TMC) integration, which is configured in the Tanzu Kubernetes Grid Integrated Edition tile > the Tanzu Mission Control pane.

If you intend to run TKGI on GCP, skip this pane when configuring the Tanzu Kubernetes Grid Integrated Edition tile.


TMC Data Protection Feature Requires Privileged TKGI Containers

The TMC Data Protection feature supports privileged TKGI containers only. For more information, see Plans in the Installing TKGI topic for your IaaS.


Windows Worker Kubernetes Clusters with Group Managed Service Account Do Not Support Compute Profiles

Windows worker-based Kubernetes clusters integrated with group Managed Service Account (gMSA) cannot be managed using compute profiles.


Windows Worker Kubernetes Clusters on Flannel Do Not Support Compute Profiles

On vSphere with NSX-T networking you can use compute profiles with both Linux and Windows worker‑based Kubernetes clusters. On vSphere with Flannel networking, you can apply compute profiles only to Linux clusters.


TKGI CLI Does Not Prevent Reducing the Control Plane Node Count

The TKGI CLI does not prevent you from accidentally reducing a cluster’s control plane node count using a compute profile.

Warning: Reducing a cluster’s control plane node count can destroy the cluster. Do not scale out or scale in existing control plane nodes by reconfiguring the TKGI tile or by using a compute profile. Reducing a cluster’s number of control plane nodes may remove a control plane node and cause the cluster to become inactive.


Windows Cluster Nodes Not Deleted After VM Deleted

Symptom

After you delete a VM using the management console of your infrastructure provider, you notice a Windows worker node that had been on that VM is now in a NotReady state.

Solution

  1. To identify the leftover node:

    kubectl get no -o wide
    
  2. Locate nodes on the returned list that are in a NotReady state and have the same IP address as another node in the list.
  3. To manually delete a NotReady node:

    kubectl delete node NODE-NAME
    

    Where NODE-NAME is the name of the node in the NotReady state.
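
On clusters with many nodes, the comparison in step 2 can be scripted. The following Python sketch is one possible approach, not a TKGI tool. It assumes you pass it the parsed output of kubectl get nodes -o json rather than the -o wide table, and it relies on the standard Kubernetes Node fields status.conditions and status.addresses. The function names are hypothetical.

```python
from collections import Counter


def internal_ip(node: dict):
    """Return the node's InternalIP address, or None if it has none."""
    for addr in node.get("status", {}).get("addresses", []):
        if addr.get("type") == "InternalIP":
            return addr.get("address")
    return None


def is_ready(node: dict) -> bool:
    """True only if the node's Ready condition has status 'True'."""
    for cond in node.get("status", {}).get("conditions", []):
        if cond.get("type") == "Ready":
            return cond.get("status") == "True"
    return False


def find_leftover_nodes(node_list: dict) -> list:
    """Names of not-ready nodes whose InternalIP is shared with another node."""
    nodes = node_list.get("items", [])
    ip_counts = Counter(ip for ip in map(internal_ip, nodes) if ip)
    return [
        n["metadata"]["name"]
        for n in nodes
        if not is_ready(n) and ip_counts.get(internal_ip(n), 0) > 1
    ]
```

The sketch only reports candidate nodes. Review each name before you run kubectl delete node.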


502 Bad Gateway After OIDC Login

Symptom

You experience a “502 Bad Gateway” error from the NSX load balancer after you log in using OIDC.

Explanation

A large response header has exceeded your NSX-T load balancer maximum response header size. The default maximum response header size is 10,240 characters and should be resized to 50,000.

Workaround

If you experience this issue, manually reconfigure your NSX-T request_header_size and response_header_size to 50,000 characters. For information about configuring NSX-T default header sizes, see OIDC Response Header Overflow in the Knowledge Base.


Difficulty Changing Proxy for Windows Workers

You must configure a global proxy in the Tanzu Kubernetes Grid Integrated Edition tile > Networking pane before you create any Windows workers that use the proxy.

You cannot change the proxy configuration for Windows workers in an existing cluster.


Character Limitations in HTTP Proxy Password

For vSphere with NSX-T, the HTTP Proxy password field does not support the following special characters: & or ;.


Error After Modifying Your Harbor Storage Configuration

Symptom

You receive the following error after modifying your existing Harbor installation’s storage configuration:

Error response from daemon: manifest for ... not found: manifest unknown: manifest unknown

Explanation

Harbor does not support modifying an existing Harbor installation’s storage configuration.

Workaround

To modify your Harbor storage configuration, re-install Harbor. Before starting Harbor, configure the new Harbor installation with the desired configuration.


Ingress Controller Statefulset Fails to Start After Resizing Worker Nodes

Symptom

Permissions are removed from your cluster’s files and processes after resizing the persistent disk during a cluster upgrade. The ingress controller statefulset fails to start.

Explanation

When resizing a persistent disk, BOSH migrates the data from the old disk to the new disk but does not copy the files’ extended attributes.

Workaround

To resolve the problem, complete the steps in Ingress controller statefulset fails to start after resize of worker nodes with permission denied in the VMware Tanzu Knowledge Base.


Azure Default Security Group Is Not Automatically Assigned to Cluster VMs

Symptom

You experience issues when configuring a load balancer for a multi-control plane node Kubernetes cluster or creating a service of type LoadBalancer. Additionally, in the Azure portal, the VM > Networking page does not display any inbound and outbound traffic rules for your cluster VMs.

Explanation

As part of configuring the Tanzu Kubernetes Grid Integrated Edition tile for Azure, you enter Default Security Group in the Kubernetes Cloud Provider pane. When you create a Kubernetes cluster, Tanzu Kubernetes Grid Integrated Edition automatically assigns this security group to each VM in the cluster. However, on Azure the automatic assignment may not occur.

As a result, your inbound and outbound traffic rules defined in the security group are not applied to the cluster VMs.

Workaround

If you experience this issue, manually assign the default security group to each VM NIC in your cluster.


One Plan ID Longer than Other Plan IDs

Symptom

One of your plan IDs is one character longer than your other plan IDs.

Explanation

In TKGI, each plan has a unique plan ID. A plan ID is normally a UUID consisting of 32 alphanumeric characters and 4 hyphens. However, the Plan 4 ID consists of 33 alphanumeric characters and 4 hyphens.
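
If your tooling validates plan IDs, the difference described above is easy to check. The following Python sketch counts the alphanumeric characters and hyphens in an ID. The function names are hypothetical, and the sample ID is a generated UUID, not a real TKGI plan ID.

```python
import uuid


def plan_id_shape(plan_id: str) -> tuple:
    """Return (alphanumeric character count, hyphen count) for a plan ID."""
    alnum = sum(ch.isalnum() for ch in plan_id)
    hyphens = plan_id.count("-")
    return alnum, hyphens


def is_standard_length(plan_id: str) -> bool:
    """True if the ID has the usual UUID shape: 32 alphanumerics and 4 hyphens."""
    return plan_id_shape(plan_id) == (32, 4)


if __name__ == "__main__":
    example = str(uuid.uuid4())          # stand-in for a normal plan ID
    print(plan_id_shape(example))        # (32, 4)
    print(plan_id_shape(example + "0"))  # (33, 4), the shape described for Plan 4
```

Validation that enforces a strict UUID format rejects the Plan 4 ID, so a length-tolerant check like this one may be preferable.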

Solution

You can safely configure and use Plan 4. The length of the Plan 4 ID does not affect the functionality of Plan 4 clusters.

If you require all plan IDs to have identical length, do not activate or use Plan 4.


Database Cluster Stops After a Database Instance is Stopped

Symptom

After you stop one instance in a multiple-instance database cluster, the cluster stops, or communication between the remaining databases times out, and the entire cluster becomes unreachable.

The following might be in your UAA log:

WSREP has not yet prepared node for application use

Explanation

The database cluster is unable to recover automatically because a member is no longer available to reconcile quorum.


Velero Backup Fails for vSphere PVs Attached to Clusters on Kubernetes v1.20 and Later

Symptom

Backing up vSphere persistent volumes using Velero fails and your Velero backup log includes the following error:

rpc error: code = Unknown desc = Failed during IsObjectBlocked check: Could not translate selfLink to CRD name

Explanation

This is a known issue when backing up clusters on Kubernetes v1.20 and later using the Velero Plugin for vSphere v1.1.0 or earlier.

Workaround

To resolve the problem, complete the steps in Velero backups of vSphere persistent volumes fail on Kubernetes clusters version 1.20 or higher (83314) in the VMware Tanzu Knowledge Base.


Creating Two Windows Clusters at the Same Time Fails

Symptom

The first time that you try to create two Windows clusters at the same time, the creation of one of the clusters fails. If you run tkgi cluster CLUSTER-NAME to examine the last action taken on the cluster, you see the following:

Last Action: Create
Last Action State: failed
Last Action Description: Instance provisioning failed: There was a problem completing your request.

operation: create, error-message: Failed to acquire lock

locking task id is 111, description: ‘create deployment’

Explanation

This is a known issue that occurs the first time that you create two Windows clusters concurrently.

Workaround

Recreate the failed cluster. This issue only occurs the first time that you create two Windows clusters concurrently.


Deleted Clusters are Listed in Cluster Lists

Symptom

After running tkgi delete-cluster and cluster deletion has completed, the deleted cluster continues to be listed when running tkgi clusters.

Workaround

You must manually remove the deleted cluster using a customized version of the ncp_cleanup script. For more information, see Deleting a Tanzu Kubernetes Grid Integrated Edition cluster with “tkgi delete-cluster” stuck “in progress” status in the VMware Tanzu Knowledge Base.


Creating a New Cluster Fails With the Error ‘IP Is Already Allocated’

Symptom

Creating a new cluster fails while using the Antrea CNI. The CoreDNS add-on errand logs an error similar to the following:

The Service \"kube-dns\" is invalid: spec.clusterIPs: Invalid value: []string{\"10.100.200.2\"}: failed to allocated ip:10.100.200.2 with error:provided IP is already allocated\n"

Workaround

Delete the failed cluster and run tkgi create-cluster again.


BOSH Director Logs the Error ‘Duplicate vm extension name’

Symptom

After you uninstall TKGI, then reinstall TKGI in the same environment, BOSH Director logs errors similar to the following:

.../gems/bosh-director-0.0.0/lib/bosh/director/deployment_plan/cloud_manifest_parser.rb:120:in `parse_vm_extensions': Duplicate vm extension name 'disk_enable_uuid' (Bosh::Director::DeploymentDuplicateVmExtensionName)

Explanation

The pivotal-container-service cloud-config was not removed when you uninstalled the TKGI tile, and it remained active. When you reinstalled the TKGI tile, an additional pivotal-container-service cloud-config was created, causing the metrics_server to fall into a crash-loop state.

Workaround

You must manually remove the pivotal-container-service cloud-config after removing your TKGI deployment, including after removing the TKGI tile from Ops Manager.

For more information, see “Duplicate vm extension name” error when metrics_server runs on Director VM in Tanzu Kubernetes Grid Integrated Edition in the VMware Tanzu Community Knowledge Base.


The TKGI API FQDN Must Not Include Trailing Whitespace

Symptom

Your TKGI logs include the following error:

'uaa'. Errors are:- Error filling in template 'uaa.yml.erb' (line 59: Client redirect-uri is invalid: uaa.clients.pks_cli.redirect-uri Client redirect-uri is invalid: uaa.clients.pks_cluster_client.redirect-uri)

Explanation

The TKGI API fully-qualified domain name (FQDN) for your cluster contains leading or trailing whitespace.

Workaround

Do not include whitespace in the TKGI tile API Hostname (FQDN) field.
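
Leading or trailing whitespace is hard to see in a form field. If you generate tile configuration from scripts, a check like the following Python sketch can catch it before you apply changes. The function name has_whitespace and the sample hostname are hypothetical.

```python
def has_whitespace(hostname: str) -> bool:
    """True if the value contains any whitespace, including leading or trailing."""
    return any(ch.isspace() for ch in hostname)


if __name__ == "__main__":
    candidate = "api.tkgi.example.com "  # note the trailing space
    # repr() makes otherwise invisible whitespace visible in the output.
    print(repr(candidate), has_whitespace(candidate))
```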


Some Kubernetes Logs Are Not Forwarded after Upgrading

This issue is fixed in TKGI v1.12.1.

Symptom

After you upgrade TKGI to v1.12.0, some Kubernetes logs are no longer forwarded by Fluent Bit.

Explanation

In TKGI v1.12.0, Fluent Bit has been upgraded from v1.3.4 to v1.5.7. In Fluent Bit v1.5.7, the default Kubernetes annotation buffer size is 32 KB, limiting the size of Kubernetes log metadata to 32 KB by default. The Kubernetes log metadata in your environment exceeds this limit, and the log entries have been discarded.

Workaround

To reduce the length of the Kubernetes metadata applied to your log entries, reduce the total size of the Kubernetes annotation environment variables you have defined in your environment.
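
To estimate whether a workload is close to the limit, you can measure the size of its metadata. The following Python sketch is an approximation only: it measures the metadata as compact JSON, which might differ from the size that Fluent Bit actually buffers. It assumes you pass in the metadata object from kubectl get pod POD-NAME -o json. The function names are hypothetical.

```python
import json

LIMIT_BYTES = 32 * 1024  # the 32 KB default described above


def metadata_size_bytes(metadata: dict) -> int:
    """Approximate size of the metadata as compact, UTF-8 encoded JSON."""
    return len(json.dumps(metadata, separators=(",", ":")).encode("utf-8"))


def exceeds_default_limit(metadata: dict) -> bool:
    """True if the approximate size is larger than the 32 KB default."""
    return metadata_size_bytes(metadata) > LIMIT_BYTES
```

Treat a result near the limit as a signal to reduce annotations, because the measured value is an estimate.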


The TKGI CLI is Unable to Run Simultaneous Commands

This issue is fixed in TKGI v1.12.1.

Symptom

Your automated tasks occasionally attempt to run simultaneously, blocking some of the tasks.

Workaround

In TKGI v1.12.1 and later, increase the TKGI CLI timeout. For more information about the TKGI timeout, see tkgi login in the TKGI CLI documentation.


TKGI Does Not Support Updating Dedicated Tier 1 Clusters

This issue is fixed in TKGI v1.12.4.

TKGI does not support updating dedicated Tier 1 clusters. You can use a Network Profile to create a dedicated Tier 1 cluster, but you cannot use tkgi update-cluster to update the cluster afterward.


TKGI Does Not Support CVDS / NVDS Mixed Environments

TKGI does not support environments where there are multiple matching networks, such as a mixed CVDS/NVDS environment.

Symptom

TKGI logs errors similar to the following in an environment with multiple matching networks:

LastOperationstatus='failed', description='Instance provisioning failed: 
There was a problem completing your request. Please contact your operations team providing the following information: 
service: p.pks, service-instance-guid: ..., broker-request-id: ..., task-id: ..., operation: create, 
error-message: Unknown CPI error 'Unknown' with message 'undefined method `mob' for #<VimSdk::Vim::OpaqueNetwork:' in create_vm' CPI method

Explanation

TKGI cannot identify which of the matching networks you intend to use and has selected the wrong network.


tkgi clusters Returns the Incorrect Number of Worker Instances

This issue is fixed in TKGI v1.12.2.

Symptom

The tkgi clusters --json command returns an incorrect kubernetes_worker_instances worker instance count for clusters configured using a Compute Profile.


The Tier-1 Gateway and Static Routes for the LoadBalancer CRD Are Not Deleted While Deleting Policy API-Based Clusters

This issue is fixed in TKGI v1.12.4.

Symptom

After successfully deleting a Policy API-based cluster with LoadBalancer CRDs, you notice that the cluster’s Tier-1 gateway, static routes for the LoadBalancer CRD, and some TKGI NSX-T resources were not deleted, and the following error has been logged:

Found errors in the request. Please refer to the related errors for details.

[Routing] Removal of edge cluster from Tier1 logical router is not allowed as multicast is enabled on it.

Workaround

To work around this issue:

  • If the cluster has not been deleted:
    1. Delete all of the cluster’s LoadBalancer CRDs before deleting the cluster.
  • If the cluster has already been deleted:

    1. Upgrade TKGI to TKGI v1.12.4 or later.
    2. Log in to your PKS VM.
    3. Back up ncp_cleanup:

      cp /var/vcap/jobs/pks-nsx-t-osb-proxy/bin/ncp_cleanup /var/vcap/jobs/pks-nsx-t-osb-proxy/bin/ncp_cleanup.bak
      
    4. Open /var/vcap/jobs/pks-nsx-t-osb-proxy/bin/ncp_cleanup for editing.
    5. Modify the pks parameter to true:

      --pks=true \
      

      For example:

      $pksnsxcli cleanup \
        --api-type="${API_TYPE}" \
        --nsx-manager-host='192.168.111.101' \
        -c $nsx_manager_client_cert_file \
        -k $nsx_manager_client_key_file \
         \
        --nsx-ca-cert-path=$nsx_manager_ca_cert_file \
         \
        --insecure='false' \
        --cluster "$k8s_cluster_name" \
        --t0-router-id="$t0_router_id" \
        --pks=true \
        --read-only=false \
        --force=$force_delete
      
    6. Save your edit to ncp_cleanup.
    7. Run ncp_cleanup:

      /var/vcap/jobs/pks-nsx-t-osb-proxy/bin/ncp_cleanup CLUSTER-ID
      

      Where CLUSTER-ID is the cluster’s UUID.

    8. Restore ncp_cleanup to its original state:

      cp /var/vcap/jobs/pks-nsx-t-osb-proxy/bin/ncp_cleanup.bak /var/vcap/jobs/pks-nsx-t-osb-proxy/bin/ncp_cleanup
      


Apache Log4j Vulnerabilities CVE-2021-44228 and CVE-2021-45046

This issue is fixed in TKGI v1.12.3.

Due to the Apache Log4j vulnerabilities CVE-2021-44228 and CVE-2021-45046, VMware recommends that you take the following steps as soon as possible:

  1. Upgrade to Ops Manager v2.10.24 or later.
  2. Upgrade to TKGI v1.12.3 or later.

    Note: When upgrading TKGI to mitigate the Apache Log4j vulnerabilities, you must also upgrade all TKGI clusters.

If you cannot upgrade Ops Manager or TKGI, VMware recommends that you implement the workaround steps provided in the VMware Tanzu Knowledge Base.

For more information about the impact of the Log4j CVE-2021-44228 and CVE-2021-45046 vulnerabilities on VMware products, see VMSA-2021-0028 in Advisories in the VMware Security Solutions documentation.


Occasionally update-cluster Does Not Complete for Windows Workers

Occasionally, tkgi update-cluster hangs while updating a Windows worker node instance and the BOSH task cannot finish and exits.

Symptom

The ovsdb-server service has stopped but other processes report that it is running.

Explanation

The ovsdb-server.pid file contains the PID of a process that is not ovsdb-server.

To confirm that this is why tkgi update-cluster hangs:

  • To verify the ovsdb-server service has actually stopped, run the PowerShell Get-Service command on the Windows worker node.
  • To verify that other processes report the ovsdb-server service is still running:

    1. Review the ovsdb-server job-service-wrapper.err.log log file.

      The job-service-wrapper.err.log log file is located at:

      C:\var\vcap\sys\log\openvswitch-windows\ovsdb-server\job-service-wrapper.err.log
      
    2. Confirm that after the flushing processes, the log includes an error similar to the following:

      Pid-Guard : ovsdb-server is already runing, please stop it first
      At C:\var\vcap\jobs\openvswitch-windows\bin\ovsdb-server_ctl.ps1:30 char:5
      +     Pid-Guard $PIDFILE "ovsdb-server"
      +     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          + CategoryInfo          : NotSpecified: (:) [Write-Error], WriteErrorException
          + FullyQualifiedErrorId : Microsoft.PowerShell.Commands.WriteErrorException,Pid-Guard
      
  • To verify the root cause:

    1. Run the following PowerShell commands on the Windows worker node:

      $RUN_DIR = "C:\var\vcap\sys\run\openvswitch-windows"
      $PIDFILE = "$RUN_DIR\ovsdb-server.pid"
      $pid1 = Get-Content $PidFile -First 1
      echo $pid1
      $rst = Get-Process -Id $pid1 -ErrorAction SilentlyContinue
      echo $rst
      
    2. Confirm the returned ProcessName is not ovsdb-server.

Workaround

To resolve this issue for a single Windows worker:

  1. SSH to the affected worker node.
  2. Run the following:

    rm C:\var\vcap\sys\run\openvswitch-windows\ovsdb-server.pid
    
  3. Wait for the ovsdb-server process to start.
  4. Confirm the dependent services also start.


Using a New Compute Profile to Add and Remove Node Pools Deletes and Recreates the Node Pool VMs

This issue is fixed in TKGI v1.12.4.

When you update a cluster using a Compute Profile to remove existing node pools and create new node pools, BOSH deletes the VMs for the existing node pools and creates new VMs instead of renaming the existing VMs and updating them.


Rotate-Certificates Returns ‘Could Not Fetch VMs Info’ When Rotating Certificates for More than 1000 Clusters

This issue is fixed in TKGI v1.12.4.

Symptom

When you rotate certificates by running the pks rotate-certificates CLI command with the --only-nsx parameter, in an environment with more than one thousand clusters, the command returns the following error:

Error: Cluster ... not found.
ERROR 15217 — [nio-9021-exec-6] i.pivotal.pks.error.ApiExceptionHandler : Error: Status: 500; 
ErrorMessage: <nil>; Description: There was a problem completing your request. 
    Please contact your operations team providing the following information: 
    ...operation: bind - error-message: gathering binding info Could not fetch VMs info

Workaround

To rotate the certificates that did not automatically rotate, you must rotate the certificates manually. For more information, see How to rotate Tanzu Kubernetes Grid Integrated Edition tls-nsx-t cluster certificate in the VMware Tanzu Knowledge Base.


Deployments Fail on TKGI Windows Worker-based Kubernetes Clusters after the January 2022 Microsoft Windows Security Patch

Microsoft changed Microsoft Windows’ support for tar file commands in the January 2022 Microsoft Windows security patch.

Packaging scripts that use tar commands for Windows worker-based Kubernetes Cluster deployments can fail after the Microsoft tar command patch update has been applied.

The BOSH agent used by vSphere stemcells built by Stembuild v2019.43 and earlier uses tar commands that are no longer supported and fails if the Microsoft Windows security patch has been applied.

Workaround

Stembuild v2019.44 and later include a version of the BOSH agent that does not use unsupported tar commands.

If you use vSphere stemcells, use Stembuild v2019.44 or later to avoid the BOSH agent tar error.
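
The version requirement above can be checked in automation with a short comparison. The following Python sketch assumes that Stembuild versions are written as a year and a number separated by a period, with an optional leading v, as they appear in these notes. The function names are hypothetical.

```python
import re

# First Stembuild version that, per the workaround above, includes a BOSH
# agent that does not use the unsupported tar commands.
MIN_STEMBUILD = (2019, 44)


def parse_stembuild_version(text: str) -> tuple:
    """Parse strings such as 'v2019.44' or '2019.44' into (2019, 44)."""
    match = re.match(r"^v?(\d+)\.(\d+)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized Stembuild version: {text!r}")
    return (int(match.group(1)), int(match.group(2)))


def has_fixed_bosh_agent(text: str) -> bool:
    """Return True when the version is v2019.44 or later."""
    return parse_stembuild_version(text) >= MIN_STEMBUILD


if __name__ == "__main__":
    for version in ("v2019.43", "v2019.44"):
        print(version, has_fixed_bosh_agent(version))
```

Comparing the parsed numbers as a tuple avoids the ordering mistakes that plain string comparison can cause.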


Cluster Migration to the vSphere Container Storage Plug-in Does Not Support Air-Gapped Environments

This issue is fixed in TKGI v1.12.4.

Symptom

After you enable the vSphere CSI Driver and migrate your clusters from in-tree vSphere volumes to the vSphere Container Storage Plug-in, the validating webhook isn’t able to start and reports errors similar to the following:

Warning Failed: Failed to pull image "gcr.io/cloud-provider-vsphere/csi/release/syncer:v2.4.0": rpc error: code = Unknown desc = Error response from daemon: Get "https://gcr.io/v2/": dial tcp: lookup gcr.io on .... no such host

Warning Failed: Error: ErrImagePull

Warning Failed: ImagePullBackOff

Explanation

In air-gapped environments, the validating webhook is unable to pull the container image from gcr.io.

Workaround

Your environment must be able to access gcr.io during CSI migration.


Multiple CoreDNS Pods Might Run On the Same Worker Node after Upgrading a Cluster

This issue is fixed in TKGI v1.12.4.

Symptom

After upgrading a TKGI cluster, TKGI might run two CoreDNS Pods on the same worker node.


TKGI Reallocates Network Profile-Allocated FIP Pool Addresses

Symptom

The pre-start script for tkgi create-cluster fails and logs floating IP pool allocation errors in the pre-start.stderr.log similar to the following:

level=error msg="operation failed with [POST /pools/ip-pools/{pool-id}][409] allocateOrReleaseFromIpPoolConflict  &{RelatedAPIError:{Details: ErrorCode:5141 ErrorData:<nil> ErrorMessage:Requested IP Address ... is already allocated. ModuleName:id-allocation service} RelatedErrors:[]}\n"

level=warning msg="failed to allocate FIP from (pool: ...: [POST /pools/ip-pools/{pool-id}][409] allocateOrReleaseFromIpPoolConflict  &{RelatedAPIError:{Details: ErrorCode:5141 ErrorData:<nil> ErrorMessage:Requested IP Address ... is already allocated. ModuleName:id-allocation service} RelatedErrors:[]}\n"

Error: an error occurred during FIP allocation

Explanation

TKGI administrators can allocate floating IP pool IP Addresses in a Network Profile configuration. The TKGI control plane allocates IP Addresses from the floating IP pool without accounting for the IPs allocated using Network Profiles.

Workaround

TKGI allocates IP addresses starting from the beginning of a floating IP pool range. When configuring a Network Profile, allocate IP Addresses starting at the end of the floating IP pool range instead of those at the beginning.
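
The workaround above can be sketched with the Python standard library. The following example picks addresses from the end of an inclusive floating IP pool range for use in a Network Profile. The function name and the range shown in the usage example are hypothetical, and the sketch does not query NSX-T to confirm that the addresses are unallocated.

```python
import ipaddress


def addresses_from_pool_end(first_ip: str, last_ip: str, count: int) -> list:
    """Return `count` addresses taken from the end of an inclusive IP range."""
    first = ipaddress.ip_address(first_ip)
    last = ipaddress.ip_address(last_ip)
    if first > last:
        raise ValueError("first_ip must not be greater than last_ip")
    pool_size = int(last) - int(first) + 1
    if count < 0 or count > pool_size:
        raise ValueError("count must be between 0 and the pool size")
    # Walk backward from the last address so the selected addresses stay as
    # far as possible from the start of the range.
    return [str(last - offset) for offset in range(count)]


if __name__ == "__main__":
    # Hypothetical floating IP pool range.
    print(addresses_from_pool_end("192.168.160.0", "192.168.160.255", 3))
```

Choosing from the end of the range only delays a conflict. It does not prevent one if the pool is nearly exhausted.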


Spring Application Remote Code Execution Vulnerability CVE-2022-22965

This issue is fixed in TKGI v1.12.5.

TKGI v1.12.0 through v1.12.4 deploy a version of UAA affected by CVE-2022-22965, a Spring application exploit.

VMware has confirmed that the vulnerability affects applications running in TKGI environments and recommends that you implement the workaround steps below as soon as possible.

Explanation

Through the CVE-2022-22965 exploit, Spring MVC and Spring WebFlux JDK 9+ applications running on Tomcat as a WAR deployment are vulnerable to remote code execution (RCE) via data binding.

For information on CVE-2022-22965, see CVE-2022-22965: Spring Framework RCE via Data Binding on JDK 9+ or VMSA-2022-0010 in the VMware Security Advisories documentation.

Workaround

For information on how to manually mitigate CVE-2022-22965 in TKGI, see Workaround instructions to address CVE-2022-22965 in TKGI v1.11 ~ v1.13 in the VMware Tanzu Knowledge Base.


Cluster Upgrading Fails After Upgrading NSX-T and TKGI

This issue is fixed in TKGI v1.12.5.

If you patch upgrade NSX-T and then upgrade TKGI, your cluster upgrades will fail for clusters running the Docker container runtime.

Symptom

Errors similar to the following are logged when your Docker container runtime cluster upgrades fail:

worker/...:/var/vcap/sys/log/docker# cat docker.stderr.log
failed to start daemon: failed to dial "/run/containerd/containerd.sock": failed to dial "/run/containerd/containerd.sock": context deadline exceeded

Explanation

Docker and kubelet are unable to start because the vrops-cadvisor DaemonSet has detected that the /run/containerd/containerd.sock file does not exist and then automatically created it.

Workaround

If you have already upgraded NSX-T and TKGI:

  1. During cluster upgrading, monitor your worker nodes.
  2. On each worker node:
    1. Remove /run/containerd/containerd.sock.
    2. Drain and stop all running jobs.


If you have not upgraded NSX-T and TKGI:

  1. Upgrade TKGI before upgrading NSX-T.
  2. Upgrade clusters using your normal procedures.


Certificate Rotation Fails If the Cluster Has a Kubernetes Profile

This issue is fixed in TKGI v1.12.5.

If you run tkgi rotate-certificates on a cluster configured with a Kubernetes Profile, the certificate rotation process will halt before completing.

Symptom

tkgi rotate-certificates halts and logs errors similar to the following when rotating certificates on a cluster configured with a Kubernetes Profile:

org.hibernate.LazyInitializationException: could not initialize proxy io.pivotal.pks.profile.kubernetes.data.KubernetesProfileEntity#... - no Session
at org.hibernate.proxy.AbstractLazyInitializer.initialize(AbstractLazyInitializer.java:169) ~[hibernate-core-5.3.18.Final.jar!/:5.3.18.Final]
at org.hibernate.proxy.AbstractLazyInitializer.getImplementation(AbstractLazyInitializer.java:309) ~[hibernate-core-5.3.18.Final.jar!/:5.3.18.Final]


Automatic vSphere CSI Driver Integration Ignores Proxy Configuration

This issue is fixed in TKGI v1.12.7.

In an environment configured with a proxy, automatic vSphere CSI Driver Integration ignores the configured proxy and fails.

Symptom

In environments configured with a Proxy and automatic vSphere CSI Driver Integration enabled, the csi-controller and csi-syncer services log errors similar to the following and fail:

"caller":"vsphere/virtualcenter.go:154","msg":"failed to create new client with err: Post \"...\": dial tcp: lookup ...: no such host"

Workaround

To configure the vSphere CSI Driver to use your proxy:

  1. SSH to the TKGI Control Plane VM.

  2. Open the /var/vcap/jobs/csi-controller/bin/csi_controller_ctl configuration file for editing.

  3. Locate the start_csi_controller function in the configuration file:

    start_csi_controller()
    {
        ...
        export ...
        export ...
        export ...
        ...
    }
    
  4. Locate the export commands in the start_csi_controller function and add the following to the group of export commands:

    export HTTP_PROXY=HTTP-PROXY
    export HTTPS_PROXY=HTTPS-PROXY
    export NO_PROXY=NO-PROXY
    

    Where:

    • HTTP-PROXY is the HTTP Proxy that the vSphere CSI Driver must use.
    • HTTPS-PROXY is the HTTPS Proxy that the vSphere CSI Driver must use.
    • NO-PROXY is a list of host names that the vSphere CSI Driver should not use a proxy for.
  5. Restart the csi_controller:

    sudo monit restart csi_controller
    


Custom BOSH vm_extensions Configuration Settings Toggle On and Off

This issue is fixed in TKGI v1.12.7.

After a failed tkgi upgrade-cluster run, your custom BOSH vm_extensions configuration toggles between being present and absent depending on when you review the cluster manifest.

Explanation

The VMs that were updated by tkgi update-cluster before the failure have manifests with the desired BOSH vm_extensions configuration, while those updated afterward do not. Re-running tkgi update-cluster does not update the manifests in the remaining VMs because the cluster configuration has not changed.


Switching Your Default CNI to Antrea is Not Supported

This issue is fixed in TKGI v1.12.8.

You cannot switch your default CNI from Flannel to Antrea during TKGI upgrade if TKGI is running on Ops Manager v2.10.40 or later.


VMDKs Are Deleted during Migration from In-Tree Storage to CSI

This issue is fixed in TKGI v1.12.7.

While migrating a cluster from using the In-Tree vSphere Storage Driver to the automatically installed vSphere CSI Driver, VMDKs attached to worker VMs might be deleted.

For more information, see the vSphere CSI Driver (vsphere-csi-driver) Release Notes in GitHub.


TKGI Management Console v1.12.0

Release Date: September 9, 2021

Note: Tanzu Kubernetes Grid Integrated Edition Management Console provides an opinionated installation of TKGI. The supported versions may differ from or be more limited than what is generally supported by TKGI.

Product Snapshot

Element Details
Version v1.12.0
Release date September 9, 2021
Installed TKGI version v1.12.0
Installed Ops Manager version v2.10.16* Release Notes
Component Version
Installed Kubernetes version v1.21.3* Release Notes
Installed Harbor Registry version v2.3.1* Release Notes
Linux stemcell v621.141*
Windows stemcells v2019.37 and later*
* Components marked with an asterisk have been updated.

Upgrade Path

The supported upgrade paths to Tanzu Kubernetes Grid Integrated Edition Management Console v1.12.0 are from TKGI MC v1.11.3 and earlier TKGI MC v1.11 patches.

Features and Resolved Issues

This release of the TKGI Management Console includes no new features or resolved issues.

Deprecations

The following TKGI features have been deprecated or removed from TKGI Management Console v1.12:

  • Support for DenyEscalatingExec has been removed from Kubernetes v1.21 and can no longer be configured in the TKGI Management Console. For more information, see DenyEscalatingExec in the Kubernetes documentation.

Known Issues

The Tanzu Kubernetes Grid Integrated Edition Management Console v1.12.0 has the following known issues:


vRealize Log Insight Integration Does Not Support HTTPS Connections

Symptom

The Tanzu Kubernetes Grid Integrated Edition Management Console integration to vRealize Log Insight does not support connections to the HTTPS port on the vRealize Log Insight server.

Workaround

  1. Use SSH to log in to the Tanzu Kubernetes Grid Integrated Edition Management Console appliance VM.
  2. Open the file /lib/systemd/system/pks-loginsight.service in a text editor.
  3. Add -e LOG_SERVER_ENABLE_SSL_VERIFY=false.
  4. Set -e LOG_SERVER_USE_SSL=true.

    The resulting file should look like the following example:

    ExecStart=/bin/docker run --privileged --restart=always --network=pks
    -v /var/log/journal:/var/log/journal
    --name=pks-loginsight
    -e TYPE=gear2-vm
    -e LOG_SERVER_HOST=${LOGINSIGHT_HOST}
    -e LOG_SERVER_PORT=${LOGINSIGHT_PORT}
    -e LOG_SERVER_ENABLE_SSL_VERIFY=false
    -e LOG_SERVER_USE_SSL=true
    -e LOG_SERVER_AGENT_ID=${LOGINSIGHT_ID}
    pksoctopus/vrli-journald:v07092019
    
  5. Save the file and run systemctl daemon-reload.

  6. To restart the vRealize Log Insight service, run systemctl restart pks-loginsight.service.

Tanzu Kubernetes Grid Integrated Edition Management Console can now send logs to the HTTPS port on the vRealize Log Insight server.


vSphere HA Causes Management Console ovfenv Data Corruption

Symptom

If vSphere HA is enabled on a cluster and a host in that cluster that is running the TKGI Management Console appliance VM reboots, vSphere HA recreates the appliance VM on another host in the cluster. Due to an issue with vSphere HA, the ovfenv data for the newly created appliance VM is corrupted, and the new appliance VM does not boot up with the correct network configuration.

Workaround

  1. In the vSphere Client, right-click the appliance VM and select Power > Shut Down Guest OS.
  2. Right-click the appliance again and select Edit Settings.
  3. Select VM Options and click OK.
  4. Verify under Recent Tasks that a Reconfigure virtual machine task has run on the appliance VM.
  5. Power on the appliance VM.


Base64 Encoded File Arguments Are Not Decoded in Kubernetes Profiles

Symptom

Some file arguments in Kubernetes profiles are base64 encoded. When the management console displays the Kubernetes profile, some file arguments are not decoded.

Workaround

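To decode a base64-encoded file argument manually, run echo "$content" | base64 --decode, where $content is the encoded value displayed in the management console.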
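
The same decoding can be done in a script. The following Python sketch assumes that the decoded file argument is UTF-8 text. The function name and the sample value are hypothetical.

```python
import base64


def decode_file_argument(encoded: str) -> str:
    """Decode a base64-encoded file argument into text (UTF-8 assumed)."""
    # b64decode ignores characters outside the base64 alphabet by default,
    # so line breaks copied along with the value do not cause an error.
    return base64.b64decode(encoded).decode("utf-8")


if __name__ == "__main__":
    # Hypothetical encoded value.
    print(decode_file_argument("aGVsbG8="))
```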


Network Profiles Not Immediately Selectable

Symptom

If you create network profiles and then try to apply them in the Create Cluster page, the new profiles are not available for selection.

Workaround

Log out of the management console and log back in again.


Real-Time IP Information Not Displayed for Network Profiles

Symptom

On the cluster summary page, only the default IP pool, pod IP block, and node IP block values are displayed, rather than the real-time values from the associated network profile.

Workaround

None


Error After Modifying Your Harbor Storage Configuration

Symptom

You receive the following error after modifying your existing Harbor installation’s storage configuration:

Error response from daemon: manifest for ... not found: manifest unknown: manifest unknown

Explanation

Harbor does not support modifying an existing Harbor installation’s storage configuration.

Workaround

To modify your Harbor storage configuration, re-install Harbor. Before starting Harbor, configure the new Harbor installation with the desired configuration.


Windows Stemcells Must be Re-Imported After Upgrading Ops Manager

Symptom

After upgrading Ops Manager, your Management Console does not recognize a Windows stemcell imported when using the prior version of Ops Manager.

Workaround

If your Management Console does not recognize a Windows stemcell after upgrading Ops Manager:

  1. Re-import your previously imported Windows stemcell.
  2. Apply Changes to TKGI MC.


Your New Clusters Are Not Shown In Tanzu Mission Control

Symptom

After you create a cluster, Tanzu Mission Control does not include the cluster in cluster lists. A “Resource not found” error similar to the following appears in your BOSH logs:

Cluster Name in TMC: cluster-1
Cluster Name Prefix: tkgi-my-prefix-
Group Name in TMC: my-prefix-clusters
Cluster Description in TMC: VMware Enterprise PKS Attaching cluster ''tkgi-my-prefix-cluster-1'' to TMC 
Fetching token successful 
request POST:/v1alpha1/clusters, 
response 404 Not Found:{"error":"Resource not found - clustergroup(my-prefix-clusters) 
org id(d859dc9f-g622-426d-8c91-939a9f13dea9)",
"code":5,"message":"Resource not found - clustergroup(my-prefix-clusters)

Explanation

The cluster group you assign a cluster to must be defined in Tanzu Mission Control before you assign your cluster to the cluster group in the TKGI Management Console.

Workaround

To resolve the problem, complete the steps in Attaching a Tanzu Kubernetes Grid Integrated (TKGI) cluster to Tanzu Mission Control (TMC) fails with “Resource not found - clustergroup(cluster-group-name)” in the VMware Tanzu Knowledge Base.


TKGI MC Does Not Support Single-Edge Node Configurations in Automated NAT Mode

This issue is fixed in TKGI v1.12.2.

Symptom

The TKGI Configuration view within TKGI MC displays the following error:

Fail to configure NSX for TKGI: Traceback (most recent call last): File "scripts/nsx_automoaror.py", line 182, 
in <module> edgee_tzs |= extra_edge2_tzs TypeError: unsupported operand type(s) for |=: 'dict' and 'dict'


TKGI MC Validation Error If the Deployment DNS IP Address Is Reconfigured after Installing TKGI in a BYOT Environment

This issue is fixed in TKGI v1.12.4.

Symptom

In an NSX-T Data Center - Bring Your Own Topology environment, TKGI MC displays the following error if you reconfigure your TKGI deployment DNS after installing TKGI:

Input validation failed Validation errors found: Fail to parse network : invalid CIDR address:

Workaround

To reconfigure your TKGI Deployment DNS:

  1. Open the TKGI deployment YAML in the TKGI MC YAML Editor.
  2. To modify the TKGI Deployment DNS, modify the dep_dns field.
  3. To modify the TKGI Deployment CIDR, modify the dep_network_cidr field. Provide any CIDR that does not contain the Deployment DNS IP.

    For example:

    network:
      dep_dns: 192.168.111.155
      dep_network_cidr: "80.0.0.1/24"
    

    The new CIDR will not affect your original CIDR configuration.

  4. Click Apply Configuration.
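
The requirement that the Deployment CIDR must not contain the Deployment DNS IP can be checked before you apply the configuration. The following Python sketch uses the standard ipaddress module. The function name is hypothetical, and the values in the usage example are the ones from the example above.

```python
import ipaddress


def cidr_excludes_dns(dep_network_cidr: str, dep_dns: str) -> bool:
    """Return True when the DNS IP is outside the given CIDR."""
    # strict=False accepts values such as "80.0.0.1/24", whose host bits
    # are set, and treats them as the enclosing network.
    network = ipaddress.ip_network(dep_network_cidr, strict=False)
    return ipaddress.ip_address(dep_dns) not in network


if __name__ == "__main__":
    print(cidr_excludes_dns("80.0.0.1/24", "192.168.111.155"))
```

A result of False means that the CIDR still contains the DNS IP address and a different CIDR is needed.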

