This topic contains release notes for Tanzu Kubernetes Grid Integrated Edition (TKGI) v1.13.
Warning: Before installing or upgrading to Tanzu Kubernetes Grid Integrated Edition v1.13, review the Breaking Changes below.
Release Date: December 15, 2022
Release | Details | |
---|---|---|
Version | v1.13.10 | |
Release date | December 15, 2022 | |
Component | Version | |
Antrea | v1.3.1-1.2.3 | Release Notes |
cAdvisor | v0.39.1 | |
Containerd for Linux | v1.5.13 | |
CoreDNS | v1.8.4+vmware.10 | |
CSI Driver for vSphere | v2.4.2 | Release Notes |
Docker | Linux: v20.10.9<br>Windows: v20.10.9 | |
etcd | v3.5.6* | |
Harbor | v2.6.2* | Release Notes |
Kubernetes | v1.22.16* | Release Notes |
Metrics Server | v0.5.0 | |
NCP | v3.2.1.3 | Release Notes |
Percona XtraDB Cluster (PXC) | v0.44.0 | |
UAA | v74.5.55 | |
Velero | v1.6.3 | Release Notes |
Wavefront | Wavefront Collector: v1.7.1<br>Wavefront Proxy: v10.13 | |
Compatibilities | Versions | |
Ops Manager | See VMware Tanzu Network. | |
NSX-T | See VMware Product Interoperability Matrices**. | |
vSphere | | |
Windows stemcells | v2019.55* or later | |
Xenial stemcells | See VMware Tanzu Network. | |
* Components marked with an asterisk have been updated.
** To use Policy API features, you must use NSX-T v3.1.3.
The supported upgrade paths to Tanzu Kubernetes Grid Integrated Edition v1.13.10 are from TKGI v1.13.9 and earlier TKGI v1.13 patches, and from all TKGI v1.12 patches.
TKGI v1.13.10 has the following features and enhancements:
Upgrades the fluent-plugin-vmware-loginsight Fluentd output plugin to v1.3.1. fluent-plugin-vmware-loginsight forwards logs to VMware Log Insight.

TKGI v1.13.10 has the following resolved issues:
For information about upcoming deprecations, see Upcoming Deprecations in the TKGI v1.13.0 Release Notes below.
Except where noted, the known issues in Tanzu Kubernetes Grid Integrated Edition v1.13.9 are also in Tanzu Kubernetes Grid Integrated Edition v1.13.10. See the TKGI v1.13.9 Known Issues below.
Release Date: December 15, 2022
Note: Tanzu Kubernetes Grid Integrated Edition Management Console provides an opinionated installation of TKGI. The supported versions may differ from or be more limited than what is generally supported by TKGI.
Element | Details | |
---|---|---|
Version | v1.13.10 | |
Release date | December 15, 2022 | |
Installed TKGI version | v1.13.10 | |
Installed Ops Manager version | v2.10.50* | Release Notes |
Component | Version | |
Installed Kubernetes version | v1.22.16* | Release Notes |
Installed Harbor Registry version | v2.6.2* | Release Notes |
Linux stemcell | v621.359* | |
* Components marked with an asterisk have been updated.
The supported upgrade paths to Tanzu Kubernetes Grid Integrated Edition Management Console v1.13.10 are from TKGI MC v1.13.9 and earlier TKGI v1.13 patches, and from all TKGI v1.12 patches.
This release of the Tanzu Kubernetes Grid Integrated Edition Management Console includes no new features or resolved issues.
For information about upcoming deprecations, see Deprecations in the TKGI MC v1.13.0 Release Notes below.
Except where noted, the known issues in Tanzu Kubernetes Grid Integrated Edition Management Console v1.13.9 are also in Tanzu Kubernetes Grid Integrated Edition Management Console v1.13.10. See the TKGI MC v1.13.9 Known Issues below.
Release Date: November 1, 2022
Release | Details | |
---|---|---|
Version | v1.13.9 | |
Release date | November 1, 2022 | |
Component | Version | |
Antrea | v1.3.1-1.2.3 | Release Notes |
cAdvisor | v0.39.1 | |
Containerd for Linux | v1.5.13 | |
CoreDNS | v1.8.4+vmware.10 | |
CSI Driver for vSphere | v2.4.2 | Release Notes |
Docker | Linux: v20.10.9<br>Windows: v20.10.9 | |
etcd | v3.5.4 | |
Harbor | v2.6.1* | Release Notes |
Kubernetes | v1.22.15* | Release Notes |
Metrics Server | v0.5.0 | |
NCP | v3.2.1.3 | Release Notes |
Percona XtraDB Cluster (PXC) | v0.44.0 | |
UAA | v74.5.55* | |
Velero | v1.6.3 | Release Notes |
Wavefront | Wavefront Collector: v1.7.1<br>Wavefront Proxy: v10.13 | |
Compatibilities | Versions | |
Ops Manager | See VMware Tanzu Network. | |
NSX-T | See VMware Product Interoperability Matrices**. | |
vSphere | | |
Windows stemcells | v2019.51 or later | |
Xenial stemcells | See VMware Tanzu Network. | |
* Components marked with an asterisk have been updated.
** To use Policy API features, you must use NSX-T v3.1.3.
The supported upgrade paths to Tanzu Kubernetes Grid Integrated Edition v1.13.9 are from TKGI v1.13.8 and earlier TKGI v1.13 patches, and from all TKGI v1.12 patches.
This release of TKGI includes no new features.
TKGI v1.13.9 has the following resolved issues:
For information about upcoming deprecations, see Upcoming Deprecations in the TKGI v1.13.0 Release Notes below.
Except where noted, the known issues in Tanzu Kubernetes Grid Integrated Edition v1.13.8 are also in Tanzu Kubernetes Grid Integrated Edition v1.13.9. See the TKGI v1.13.8 Known Issues below.
Release Date: November 1, 2022
Note: Tanzu Kubernetes Grid Integrated Edition Management Console provides an opinionated installation of TKGI. The supported versions may differ from or be more limited than what is generally supported by TKGI.
Element | Details | |
---|---|---|
Version | v1.13.9 | |
Release date | November 1, 2022 | |
Installed TKGI version | v1.13.9 | |
Installed Ops Manager version | v2.10.47* | Release Notes |
Component | Version | |
Installed Kubernetes version | v1.22.15* | Release Notes |
Installed Harbor Registry version | v2.6.1* | Release Notes |
Linux stemcell | v621.305* | |
* Components marked with an asterisk have been updated.
The supported upgrade paths to Tanzu Kubernetes Grid Integrated Edition Management Console v1.13.9 are from TKGI MC v1.13.8 and earlier TKGI v1.13 patches, and from all TKGI v1.12 patches.
This release of the Tanzu Kubernetes Grid Integrated Edition Management Console includes no new features or resolved issues.
For information about upcoming deprecations, see Deprecations in the TKGI MC v1.13.0 Release Notes below.
Except where noted, the known issues in Tanzu Kubernetes Grid Integrated Edition Management Console v1.13.8 are also in Tanzu Kubernetes Grid Integrated Edition Management Console v1.13.9. See the TKGI MC v1.13.8 Known Issues below.
Release Date: September 6, 2022
Release | Details | |
---|---|---|
Version | v1.13.8 | |
Release date | September 6, 2022 | |
Component | Version | |
Antrea | v1.3.1-1.2.3 | Release Notes |
cAdvisor | v0.39.1 | |
Containerd for Linux | v1.5.13 | |
CoreDNS | v1.8.4+vmware.10 | |
CSI Driver for vSphere | v2.4.2 | Release Notes |
Docker | Linux: v20.10.9<br>Windows: v20.10.9 | |
etcd | v3.5.4 | |
Harbor | v2.5.3 | Release Notes |
Kubernetes | v1.22.12 | Release Notes |
Metrics Server | v0.5.0 | |
NCP | v3.2.1.3* | Release Notes |
Percona XtraDB Cluster (PXC) | v0.44.0 | |
UAA | v74.5.48* | |
Velero | v1.6.3 | Release Notes |
Wavefront | Wavefront Collector: v1.7.1<br>Wavefront Proxy: v10.13 | |
Compatibilities | Versions | |
Ops Manager | See VMware Tanzu Network. | |
NSX-T | See VMware Product Interoperability Matrices**. | |
vSphere | | |
Windows stemcells | v2019.51 or later | |
Xenial stemcells | See VMware Tanzu Network. | |
* Components marked with an asterisk have been updated.
** To use Policy API features, you must use NSX-T v3.1.3.
The supported upgrade paths to Tanzu Kubernetes Grid Integrated Edition v1.13.8 are from TKGI v1.13.7 and earlier TKGI v1.13 patches, and from TKGI v1.12.8 and earlier TKGI v1.12 patches.
This release of TKGI includes no new features.
TKGI v1.13.8 has the following resolved issues:
For information about upcoming deprecations, see Upcoming Deprecations in the TKGI v1.13.0 Release Notes below.
Except where noted, the known issues in Tanzu Kubernetes Grid Integrated Edition v1.13.7 are also in Tanzu Kubernetes Grid Integrated Edition v1.13.8. See the TKGI v1.13.7 Known Issues below.
Release Date: September 6, 2022
Note: Tanzu Kubernetes Grid Integrated Edition Management Console provides an opinionated installation of TKGI. The supported versions may differ from or be more limited than what is generally supported by TKGI.
Element | Details | |
---|---|---|
Version | v1.13.8 | |
Release date | September 6, 2022 | |
Installed TKGI version | v1.13.8 | |
Installed Ops Manager version | v2.10.46 | Release Notes |
Component | Version | |
Installed Kubernetes version | v1.22.12 | Release Notes |
Installed Harbor Registry version | v2.5.3 | Release Notes |
Linux stemcell | v621.265* | |
* Components marked with an asterisk have been updated.
The supported upgrade paths to Tanzu Kubernetes Grid Integrated Edition Management Console v1.13.8 are from TKGI MC v1.13.7 and earlier TKGI v1.13 patches, and from TKGI MC v1.12.8 and earlier TKGI v1.12 patches.
This release of the Tanzu Kubernetes Grid Integrated Edition Management Console includes no new features or resolved issues.
For information about upcoming deprecations, see Deprecations in the TKGI MC v1.13.0 Release Notes below.
Except where noted, the known issues in Tanzu Kubernetes Grid Integrated Edition Management Console v1.13.7 are also in Tanzu Kubernetes Grid Integrated Edition Management Console v1.13.8. See the TKGI MC v1.13.7 Known Issues below.
Release Date: August 10, 2022
Release | Details | |
---|---|---|
Version | v1.13.7 | |
Release date | August 10, 2022 | |
Component | Version | |
Antrea | v1.3.1-1.2.3 | Release Notes |
cAdvisor | v0.39.1 | |
Containerd for Linux | v1.5.13* | |
CoreDNS | v1.8.4+vmware.10* | |
CSI Driver for vSphere | v2.4.2 | Release Notes |
Docker | Linux: v20.10.9<br>Windows: v20.10.9* | |
etcd | v3.5.4 | |
Harbor | v2.5.3* | Release Notes |
Kubernetes | v1.22.12* | Release Notes |
Metrics Server | v0.5.0 | |
NCP | v3.2.1.2* | Release Notes |
Percona XtraDB Cluster (PXC) | v0.44.0* | |
UAA | v74.5.46* | |
Velero | v1.6.3 | Release Notes |
Wavefront | Wavefront Collector: v1.7.1<br>Wavefront Proxy: v10.13 | |
Compatibilities | Versions | |
Ops Manager | See VMware Tanzu Network. | |
NSX-T | See VMware Product Interoperability Matrices**. | |
vSphere | | |
Windows stemcells | v2019.51 or later | |
Xenial stemcells | See VMware Tanzu Network. | |
* Components marked with an asterisk have been updated.
** To use Policy API features, you must use NSX-T v3.1.3.
The supported upgrade paths to Tanzu Kubernetes Grid Integrated Edition v1.13.7 are from TKGI v1.13.6 and earlier TKGI v1.13 patches, and from TKGI v1.12.8 and earlier TKGI v1.12 patches.
TKGI v1.13.7 has the following features and enhancements:
Supports configuring Telegraf to publish metrics in the `metric_version=1` format instead of the default `metric_version=2` format. For more information, see Configure Telegraf in the Tile in Configuring Telegraf in TKGI. For more information on Telegraf output formats, see Example Output in the Telegraf GitHub documentation.
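As a hedged illustration of the resulting Telegraf output configuration (the `prometheus_client` stanza and listen address are assumptions; see Configuring Telegraf in TKGI for the supported settings):

```
[[outputs.prometheus_client]]
  listen = ":9273"     # illustrative listen address, not a TKGI default
  metric_version = 1   # emit the metric_version=1 format instead of the default 2
```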
TKGI v1.13.7 has the following resolved issues:
[Security Fix] Fixes the following CVEs:
Fixes Kubernetes Services Cannot Be Accessed from a Windows Pod in Some Windows Clusters.
For information about upcoming deprecations, see Upcoming Deprecations in the TKGI v1.13.0 Release Notes below.
Except where noted, the known issues in Tanzu Kubernetes Grid Integrated Edition v1.13.6 are also in Tanzu Kubernetes Grid Integrated Edition v1.13.7. See the TKGI v1.13.6 Known Issues below.
This issue is fixed in TKGI v1.13.8.
On occasion, the IP addresses for one or more running Pods in a Windows cluster become unreachable. Pinging from within an unreachable Pod also fails, returning `Request timed out`. The unreachable Pods might also enter a `CrashLoopBackOff` state.

Explanation

CNI requests within the Pod have entered a race condition, after which networking for the Pod is unreachable.
Release Date: August 10, 2022
Note: Tanzu Kubernetes Grid Integrated Edition Management Console provides an opinionated installation of TKGI. The supported versions may differ from or be more limited than what is generally supported by TKGI.
Element | Details | |
---|---|---|
Version | v1.13.7 | |
Release date | August 10, 2022 | |
Installed TKGI version | v1.13.7 | |
Installed Ops Manager version | v2.10.45 | Release Notes |
Component | Version | |
Installed Kubernetes version | v1.22.12* | Release Notes |
Installed Harbor Registry version | v2.5.3* | Release Notes |
Linux stemcell | v621.256* | |
* Components marked with an asterisk have been updated.
The supported upgrade paths to Tanzu Kubernetes Grid Integrated Edition Management Console v1.13.7 are from TKGI MC v1.13.6 and earlier TKGI v1.13 patches, and from TKGI MC v1.12.8 and earlier TKGI v1.12 patches.
TKGI Management Console v1.13.7 has the following features and enhancements:
For information about upcoming deprecations, see Deprecations in the TKGI MC v1.13.0 Release Notes below.
Except where noted, the known issues in Tanzu Kubernetes Grid Integrated Edition Management Console v1.13.6 are also in Tanzu Kubernetes Grid Integrated Edition Management Console v1.13.7. See the TKGI MC v1.13.6 Known Issues below.
Release Date: June 29, 2022
Release | Details | |
---|---|---|
Version | v1.13.6 | |
Release date | June 29, 2022 | |
Component | Version | |
Antrea | v1.3.1-1.2.3 | Release Notes |
cAdvisor | v0.39.1 | |
Containerd for Linux | v1.5.9 | |
CoreDNS | v1.8.4+vmware.7 | |
CSI Driver for vSphere | v2.4.2* | Release Notes |
Docker | Linux: v20.10.9<br>Windows: v20.10.7 | |
etcd | v3.5.4 | |
Harbor | v2.5.1 | Release Notes |
Kubernetes | v1.22.6 | Release Notes |
Metrics Server | v0.5.0 | |
NCP | v3.2.1.1 | Release Notes |
Percona XtraDB Cluster (PXC) | v0.42.0 | |
UAA | v74.5.44* | |
Velero | v1.6.3 | Release Notes |
Wavefront | Wavefront Collector: v1.7.1<br>Wavefront Proxy: v10.13 | |
Compatibilities | Versions | |
Ops Manager | See VMware Tanzu Network. | |
NSX-T | See VMware Product Interoperability Matrices**. | |
vSphere | | |
Windows stemcells | v2019.46 or later | |
Xenial stemcells | See VMware Tanzu Network. | |
* Components marked with an asterisk have been updated.
** To use Policy API features, you must use NSX-T v3.1.3.
The supported upgrade paths to Tanzu Kubernetes Grid Integrated Edition v1.13.6 are from TKGI v1.13.5 and earlier TKGI v1.13 patches, and from TKGI v1.12.7 and earlier TKGI v1.12 patches.
TKGI v1.13.6 has the following features:
Supports accessing images in a private Docker registry from Linux clusters with containerd container runtimes. For more information, see Configuring Cluster Access to Private Docker Registries (Beta).
Supports VMware NSX-T v3.2.1.
TKGI v1.13.6 has the following resolved issues:
For information about upcoming deprecations, see Upcoming Deprecations in the TKGI v1.13.0 Release Notes below.
Except where noted, the known issues in Tanzu Kubernetes Grid Integrated Edition v1.13.5 are also in Tanzu Kubernetes Grid Integrated Edition v1.13.6. See the TKGI v1.13.5 Known Issues below.
For Known Issues in NCP v3.2.1.1, see NSX Container Plugin 3.2.1.1 Release Notes.
This issue is fixed in TKGI v1.13.7.
When accessing Kubernetes services from a Windows Pod, the following error is logged in some Windows clusters:
curl: (7) Failed to connect to 192.168.1.1 port 8080: Timed out
The Windows Pod is unable to communicate with the Kubernetes service.
This issue is fixed in TKGI v1.13.8.
After upgrading TKGI to TKGI v1.13.6, the vROPs cAdvisor daemonset cannot be deployed on clusters with Pod Security Policy (PSP) enabled.
Symptom
The vROPs cAdvisor daemonset does not deploy on a cluster and errors similar to the following are logged:
Warning FailedCreate... daemonset-controller Error creating: pods "..." is forbidden: PodSecurityPolicy: unable to admit pod:
Workaround
To restart vROPs cAdvisor daemonset:
Configure `privileged: true` and `allowPrivilegeEscalation: true` in the `spec` settings.
For example:
```
---
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: a-vrops-psp
spec:
  privileged: true
  allowPrivilegeEscalation: true
....
```
Save your edits and update the cluster with the revised configuration file.
For more information on configuring PSP, see Enabling and Configuring Pod Security Policies.
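After updating the cluster, you can confirm that the policy carries the expected settings and watch the daemonset recover. A minimal sketch, reusing the `vrops-cadvisor` daemonset name from this issue; the `kube-system` namespace is an assumption:

```
# Confirm the PodSecurityPolicy now permits privileged containers
kubectl get psp a-vrops-psp -o yaml | grep -E 'privileged|allowPrivilegeEscalation'

# Watch the daemonset Pods get recreated and become ready (namespace assumed)
kubectl rollout status daemonset/vrops-cadvisor -n kube-system
```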
The CSI Nodes in Kubernetes clusters are created with Kubernetes `v1.22.12+vmware.1` instead of the Kubernetes version used in the other nodes in the clusters.
Release Date: June 29, 2022
Note: Tanzu Kubernetes Grid Integrated Edition Management Console provides an opinionated installation of TKGI. The supported versions may differ from or be more limited than what is generally supported by TKGI.
Element | Details | |
---|---|---|
Version | v1.13.6 | |
Release date | June 29, 2022 | |
Installed TKGI version | v1.13.6 | |
Installed Ops Manager version | v2.10.43 | Release Notes |
Component | Version | |
Installed Kubernetes version | v1.22.7* | Release Notes |
Installed Harbor Registry version | v2.5.1* | Release Notes |
Linux stemcell | v621.252* | |
* Components marked with an asterisk have been updated.
The supported upgrade paths to Tanzu Kubernetes Grid Integrated Edition Management Console v1.13.6 are from TKGI MC v1.13.5 and earlier TKGI v1.13 patches, and from TKGI MC v1.12.7 and earlier TKGI v1.12 patches.
TKGI Management Console v1.13.6 has the following features and enhancements:
For information about upcoming deprecations, see Deprecations in the TKGI MC v1.13.0 Release Notes below.
Except where noted, the known issues in Tanzu Kubernetes Grid Integrated Edition Management Console v1.13.5 are also in Tanzu Kubernetes Grid Integrated Edition Management Console v1.13.6. See the TKGI MC v1.13.5 Known Issues below.
Release Date: May 23, 2022
Release | Details | |
---|---|---|
Version | v1.13.5 | |
Release date | May 23, 2022 | |
Component | Version | |
Antrea | v1.3.1-1.2.3 | Release Notes |
cAdvisor | v0.39.1 | |
Containerd for Linux | v1.5.9 | |
CoreDNS | v1.8.4+vmware.7 | |
CSI Driver for vSphere | v2.4.1 | Release Notes |
Docker | Linux: v20.10.9<br>Windows: v20.10.7 | |
etcd | v3.5.4* | |
Harbor | v2.5.0* | Release Notes |
Kubernetes | v1.22.6 | Release Notes |
Metrics Server | v0.5.0 | |
NCP | v3.2.0.3 | Release Notes |
Percona XtraDB Cluster (PXC) | v0.42.0 | |
UAA | v74.5.39* | |
Velero | v1.6.3 | Release Notes |
Wavefront | Wavefront Collector: v1.7.1<br>Wavefront Proxy: v10.13 | |
Compatibilities | Versions | |
Ops Manager | See VMware Tanzu Network. | |
NSX-T | See VMware Product Interoperability Matrices**. | |
vSphere | | |
Windows stemcells | v2019.46 or later | |
Xenial stemcells | See VMware Tanzu Network. | |
* Components marked with an asterisk have been updated.
** To use Policy API features, you must use NSX-T v3.1.3.
The supported upgrade paths to Tanzu Kubernetes Grid Integrated Edition v1.13.5 are from TKGI v1.13.4 and earlier TKGI v1.13 patches, and from TKGI v1.12.6 and earlier TKGI v1.12 patches.
TKGI v1.13.5 has the following breaking changes:
The kube-apiserver `audit.log` file is in a new location. The file has moved from `/var/vcap/sys/log/kube-apiserver/audit.log` to `/var/vcap/sys/log/kube-apiserver/audit/log/audit.log`.
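To confirm where audit entries are written on a cluster's control plane node, you can tail the file over BOSH SSH. A minimal sketch, following the `bosh ssh` pattern used elsewhere in these notes; the `master` instance group name is an assumption:

```
# Tail the relocated kube-apiserver audit log on the first control plane node
bosh -d service-instance_CLUSTER-UUID ssh master/0 \
  "sudo tail -n 20 /var/vcap/sys/log/kube-apiserver/audit/log/audit.log"
```

Where `CLUSTER-UUID` is the BOSH deployment name of your cluster.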
TKGI v1.13.5 has the following features:
TKGI v1.13.5 has the following resolved issues:
* Fixes Switching Cluster Container Runtimes Two or More Times Fails on Clusters with vROps Enabled.
* Fixes kube-apiserver Logs Occupy More Disk Space than Allocated.
For information about upcoming deprecations, see Upcoming Deprecations in the TKGI v1.13.0 Release Notes below.
Except where noted, the known issues in Tanzu Kubernetes Grid Integrated Edition v1.13.4 are also in Tanzu Kubernetes Grid Integrated Edition v1.13.5. See the TKGI v1.13.4 Known Issues below.
For Known Issues in NCP v3.2.0.3, see NSX Container Plugin 3.2.0.3 Release Notes.
Release Date: May 23, 2022
Note: Tanzu Kubernetes Grid Integrated Edition Management Console provides an opinionated installation of TKGI. The supported versions may differ from or be more limited than what is generally supported by TKGI.
Element | Details | |
---|---|---|
Version | v1.13.5 | |
Release date | May 23, 2022 | |
Installed TKGI version | v1.13.5 | |
Installed Ops Manager version | v2.10.39 | Release Notes |
Component | Version | |
Installed Kubernetes version | v1.22.6 | Release Notes |
Installed Harbor Registry version | v2.5.0* | Release Notes |
Linux stemcell | v621.236* | |
* Components marked with an asterisk have been updated.
The supported upgrade paths to Tanzu Kubernetes Grid Integrated Edition Management Console v1.13.5 are from TKGI MC v1.13.4 and earlier TKGI v1.13 patches, and from TKGI MC v1.12.6 and earlier TKGI v1.12 patches.
TKGI Management Console v1.13.5 has the following features and enhancements:
Supports configuring the `nsx_feign_client_read_timeout` property in the TKGI MC Configuration File. For more information, see Generate Configuration File and Deploy Tanzu Kubernetes Grid Integrated Edition in Deploy TKGI by Using the Configuration Wizard.

For information about upcoming deprecations, see Deprecations in the TKGI MC v1.13.0 Release Notes below.
Except where noted, the known issues in Tanzu Kubernetes Grid Integrated Edition Management Console v1.13.4 are also in Tanzu Kubernetes Grid Integrated Edition Management Console v1.13.5. See the TKGI MC v1.13.4 Known Issues below.
Release Date: April 29, 2022
Release | Details | |
---|---|---|
Version | v1.13.4 | |
Release date | April 29, 2022 | |
Component | Version | |
Antrea | v1.3.1-1.2.3* | Release Notes |
cAdvisor | v0.39.1 | |
Containerd for Linux | v1.5.9 | |
CoreDNS | v1.8.4+vmware.7 | |
CSI Driver for vSphere | v2.4.1 | Release Notes |
Docker | Linux: v20.10.9<br>Windows: v20.10.7 | |
etcd | v3.5.3* | |
Harbor | v2.4.2* | Release Notes |
Kubernetes | v1.22.6 | Release Notes |
Metrics Server | v0.5.0 | |
NCP | v3.2.0.3 | Release Notes |
Percona XtraDB Cluster (PXC) | v0.42.0* | |
UAA | v74.5.37* | |
Velero | v1.6.3 | Release Notes |
Wavefront | Wavefront Collector: v1.7.1<br>Wavefront Proxy: v10.13 | |
Compatibilities | Versions | |
Ops Manager | See VMware Tanzu Network. | |
NSX-T | See VMware Product Interoperability Matrices**. | |
vSphere | | |
Windows stemcells | v2019.46 or later | |
Xenial stemcells | See VMware Tanzu Network. | |
* Components marked with an asterisk have been updated.
** To use Policy API features, you must use NSX-T v3.1.3.
The supported upgrade paths to Tanzu Kubernetes Grid Integrated Edition v1.13.4 are from TKGI v1.13.3 and earlier TKGI v1.13 patches, and from TKGI v1.12.5 and earlier TKGI v1.12 patches.
TKGI v1.13.4 has the following breaking changes:
Fluent Bit has been upgraded from v1.8.10 to v1.9.0. For more information, see Upgrade Notes in the Fluent Bit documentation.
OpenJDK has been upgraded to v11.0.14. For more information, see OpenJDK v11.0.14 Released in the OpenJDK notification archives.
Antrea has been upgraded from v1.3.0 to v1.3.1. For more information, see Integration of Antrea Container Clusters.
TKGI v1.13.4 has the following resolved issues:
For information about upcoming deprecations, see Upcoming Deprecations in the TKGI v1.13.0 Release Notes below.
Except where noted, the known issues in Tanzu Kubernetes Grid Integrated Edition v1.13.3 are also in Tanzu Kubernetes Grid Integrated Edition v1.13.4. See the TKGI v1.13.3 Known Issues below.
For Known Issues in NCP v3.2.0.3, see NSX Container Plugin 3.2.0.3 Release Notes.
This issue is fixed in TKGI v1.13.5.
Switching a cluster container runtime from Docker to containerd then back to Docker will fail if the cluster is configured with vROps enabled.
Symptom
Switching a cluster container runtime from containerd to Docker fails and logs errors similar to the following:
failed to load listeners: can't create unix socket /var/vcap/sys/run/docker/docker.sock: is a directory
Explanation
The vrops-cadvisor daemonset mounted `/var/vcap/sys/run/docker/docker.sock` as a directory because the socket file did not exist.
Workaround
To resolve this error:

Run the following command to remove the `/var/vcap/sys/run/docker/docker.sock` directory from worker nodes:

```
bosh -d service-instance_CLUSTER-UUID ssh worker "rm -rf /var/vcap/sys/run/docker/docker.sock"
```

Where `CLUSTER-UUID` is the BOSH deployment name of your cluster.

To switch your cluster's container runtime, re-run your `tkgi update-cluster` command.
This issue is fixed in TKGI v1.13.6.
Clusters using the Docker container runtime no longer support Docker command-line commands by default.

To support containerd as the default container runtime, Docker-specific features that are incompatible with containerd are deactivated by default.
Workaround
In an environment where you want to run Docker commands, complete one of the following:
Export the Docker environment variables before using Docker commands:

```
source /var/vcap/jobs/docker/bin/envrc
```

For example:

```
source /var/vcap/jobs/docker/bin/envrc
docker images
REPOSITORY   TAG                                        IMAGE ID       CREATED        SIZE
...          1a3337bb81890b6bb1848b5dd4565dfa5d124f38   ffb57751a939   3 months ago   182MB
```
Use absolute Docker paths when referencing Docker:

```
/var/vcap/packages/docker/bin/docker --host unix:///var/vcap/sys/run/docker/docker.sock
```

For example:

```
/var/vcap/packages/docker/bin/docker --host unix:///var/vcap/sys/run/docker/docker.sock images
REPOSITORY   TAG                                        IMAGE ID       CREATED        SIZE
...          1a3337bb81890b6bb1848b5dd4565dfa5d124f38   ffb57751a939   3 months ago   182MB
```
Release Date: April 29, 2022
Note: Tanzu Kubernetes Grid Integrated Edition Management Console provides an opinionated installation of TKGI. The supported versions may differ from or be more limited than what is generally supported by TKGI.
Element | Details | |
---|---|---|
Version | v1.13.4 | |
Release date | April 29, 2022 | |
Installed TKGI version | v1.13.4 | |
Installed Ops Manager version | v2.10.37 | Release Notes |
Component | Version | |
Installed Kubernetes version | v1.22.6 | Release Notes |
Installed Harbor Registry version | v2.4.2* | Release Notes |
Linux stemcell | v621.224* | |
* Components marked with an asterisk have been updated.
The supported upgrade paths to Tanzu Kubernetes Grid Integrated Edition Management Console v1.13.4 are from TKGI MC v1.13.3 and earlier TKGI v1.13 patches, and from TKGI MC v1.12.5 and earlier TKGI v1.12 patches.
TKGI Management Console v1.13.4 has the following resolved issues:
For information about upcoming deprecations, see Deprecations in the TKGI MC v1.13.0 Release Notes below.
Except where noted, the known issues in Tanzu Kubernetes Grid Integrated Edition Management Console v1.13.3 are also in Tanzu Kubernetes Grid Integrated Edition Management Console v1.13.4. See the TKGI MC v1.13.3 Known Issues below.
Release Date: March 11, 2022
Warning: The etcd maintainers do not recommend using etcd v3.5.0 for production environments due to a data corruption that might occur under heavy loads. For more information, see etcd v3.5.0 Data Corruption Under Heavy Loads in Known Issues below.
Warning: Spring MVC and Spring WebFlux JDK 9+ applications running on Tomcat as a WAR deployment are vulnerable to remote code execution via data binding. For more information, see Spring Application Remote Code Execution Vulnerability CVE-2022-22965.
Release | Details | |
---|---|---|
Version | v1.13.3 | |
Release date | March 11, 2022 | |
Component | Version | |
Antrea | v1.3.0-1.2.2 | Release Notes |
cAdvisor | v0.39.1 | |
Containerd for Linux | v1.5.9 | |
CoreDNS | v1.8.4+vmware.7 | |
CSI Driver for vSphere | v2.4.1 | Release Notes |
Docker | Linux: v20.10.9<br>Windows: v20.10.7 | |
etcd | v3.5.0 | |
Harbor | v2.4.1 | Release Notes |
Kubernetes | v1.22.6 | Release Notes |
Metrics Server | v0.5.0 | |
NCP | v3.2.0.3 | Release Notes |
Percona XtraDB Cluster (PXC) | v0.41.0 | |
UAA | v74.5.35* | |
Velero | v1.6.3 | Release Notes |
Wavefront | Wavefront Collector: v1.7.1<br>Wavefront Proxy: v10.13 | |
Compatibilities | Versions | |
Ops Manager | See VMware Tanzu Network. | |
NSX-T | See VMware Product Interoperability Matrices**. | |
vSphere | | |
Windows stemcells | v2019.45 or later | |
Xenial stemcells | See VMware Tanzu Network. | |
* Components marked with an asterisk have been updated.
** To use Policy API features, you must use NSX-T v3.1.3.
The supported upgrade paths to Tanzu Kubernetes Grid Integrated Edition v1.13.3 are from TKGI v1.13.2 and earlier TKGI v1.13 patches, and from TKGI v1.12.4 and earlier TKGI v1.12 patches.
TKGI v1.13.3 has the following features:
Kubernetes has deprecated Docker container runtime support and all Docker-runtime clusters must be switched to use the containerd container runtime prior to upgrading to TKGI v1.15.
The TKGI v1.14 cluster upgrade will automatically switch clusters to the containerd container runtime.
Warning: Cluster workloads will experience downtime while the cluster switches from using the Docker runtime to containerd.
In TKGI v1.13.3 and later, the `tkgi update-cluster` CLI command includes options to prepare your clusters for the removal of Docker-runtime support:
Supports switching an existing cluster between container runtimes:
For more information, see Switch a Cluster to a Different Container Runtime in Upgrading Clusters.
Supports locking a cluster to the Docker container runtime:
Locking a cluster to the Docker container runtime prevents the TKGI v1.14 cluster upgrade from automatically switching the cluster to the containerd-runtime.
For more information, see Lock a Cluster to the Docker Container Runtime in Upgrading Clusters and Create a Kubernetes Cluster in Creating Clusters.
Note: All Docker-runtime clusters must be switched to the containerd-runtime prior to upgrading to TKGI v1.15.
For more information, see Customize Cluster Container Runtimes Before Upgrading in Upgrading Clusters.
Warning: Clusters that are not locked to the Docker-runtime will automatically switch to the containerd container runtime during the TKGI v1.14 cluster upgrade.
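Before the TKGI v1.14 cluster upgrade, you can check which runtime each node currently reports; a minimal sketch using standard kubectl output:

```
# The CONTAINER-RUNTIME column reports docker://... or containerd://... for each node
kubectl get nodes -o wide
```

Nodes reporting `docker://...` belong to clusters that will be switched to containerd unless you lock them to the Docker runtime as described above.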
TKGI v1.13.3 has the following resolved issues:
For information about upcoming deprecations, see Upcoming Deprecations in the TKGI v1.13.0 Release Notes below.
Except where noted, the known issues in Tanzu Kubernetes Grid Integrated Edition v1.13.2 are also in Tanzu Kubernetes Grid Integrated Edition v1.13.3. See the TKGI v1.13.2 Known Issues below.
Warning: The etcd maintainers do not recommend using etcd v3.5.0 for production environments due to a data corruption that might occur under heavy loads. For more information, see etcd v3.5.0 Data Corruption Under Heavy Loads in Known Issues below.
Warning: Spring MVC and Spring WebFlux JDK 9+ applications running on Tomcat as a WAR deployment are vulnerable to remote code execution via data binding. For more information, see Spring Application Remote Code Execution Vulnerability CVE-2022-22965.
For Known Issues in NCP v3.2.0.3, see NSX Container Plugin 3.2.0.3 Release Notes.
This issue is fixed in TKGI v1.13.4.
The Docker service might fail for some Docker container runtime clusters while upgrading the clusters from TKGI v1.13.2 to TKGI v1.13.3.
Symptoms
While upgrading Docker container runtime clusters to TKGI v1.13.3, the cluster upgrade fails and logs errors similar to the following:

```
...operation: update, error-message: 'worker-... (1)' is not running after update. Review logs for failed jobs: docker...
...worker/...:~# monit summary The Monit daemon 5.2.5 uptime: 1m Process 'docker' Execution failed
...msg="Handler for POST /v1.41/images/create returned error: Get \"https://registry.tkg.vmware.run/v2/\": dial tcp: lookup registry.tkg.vmware.run on ...: no such host..."
```
Workaround
If upgrading a Docker container runtime cluster to TKGI v1.13.3 has failed and Pod events log `Warning FailedCreatePodSandBox`, you must recreate the affected worker node.

Release Date: March 11, 2022
Note: Tanzu Kubernetes Grid Integrated Edition Management Console provides an opinionated installation of TKGI. The supported versions may differ from or be more limited than what is generally supported by TKGI.
Element | Details | |
---|---|---|
Version | v1.13.3 | |
Release date | March 11, 2022 | |
Installed TKGI version | v1.13.3 | |
Installed Ops Manager version | v2.10.32 | Release Notes |
Component | Version | |
Installed Kubernetes version | v1.22.6 | Release Notes |
Installed Harbor Registry version | v2.4.1 | Release Notes |
Linux stemcell | v621.211* | |
Windows stemcells | v2019.44 and later |
* Components marked with an asterisk have been updated.
The supported upgrade paths to Tanzu Kubernetes Grid Integrated Edition Management Console v1.13.3 are from TKGI MC v1.13.2 and earlier TKGI v1.13 patches, and from TKGI MC v1.12.4 and earlier TKGI v1.12 patches.
TKGI Management Console v1.13.3 has the following resolved issues:
For information about upcoming deprecations, see Deprecations in the TKGI MC v1.13.0 Release Notes below.
Except where noted, the known issues in Tanzu Kubernetes Grid Integrated Edition Management Console v1.13.1 are also in Tanzu Kubernetes Grid Integrated Edition Management Console v1.13.3. See the TKGI MC v1.13.1 Known Issues below.
Release Date: February 17, 2022
Warning: The etcd maintainers do not recommend using etcd v3.5.0 for production environments due to a data corruption that might occur under heavy loads. For more information, see etcd v3.5.0 Data Corruption Under Heavy Loads in Known Issues below.
Warning: Spring MVC and Spring WebFlux JDK 9+ applications running on Tomcat as a WAR deployment are vulnerable to remote code execution via data binding. For more information, see Spring Application Remote Code Execution Vulnerability CVE-2022-22965.
Release | Details | |
---|---|---|
Version | v1.13.2 | |
Release date | February 17, 2022 | |
Component | Version | |
Antrea | v1.3.0-1.2.2 | Release Notes |
cAdvisor | v0.39.1 | |
Containerd for Linux | v1.5.9* | |
CoreDNS | v1.8.4+vmware.7* | |
CSI Driver for vSphere | v2.4.1* | Release Notes |
Docker | Linux: v20.10.9<br>Windows: v20.10.7 | |
etcd | v3.5.0 | |
Harbor | v2.4.1 | Release Notes |
Kubernetes | v1.22.6* | Release Notes |
Metrics Server | v0.5.0 | |
NCP | v3.2.0.3* | Release Notes |
Percona XtraDB Cluster (PXC) | v0.41.0* | |
UAA | v74.5.34* | |
Velero | v1.6.3 | Release Notes |
Wavefront | Wavefront Collector: v1.7.1<br>Wavefront Proxy: v10.13* | |
Compatibilities | Versions | |
Ops Manager | See VMware Tanzu Network. | |
NSX-T | See VMware Product Interoperability Matrices**. | |
vSphere | | |
Windows stemcells | v2019.44 or later*** | |
Xenial stemcells | See VMware Tanzu Network. | |
* Components marked with an asterisk have been updated.
** To use Policy API features, you must use NSX-T v3.1.3.
*** See Deployments Fail on TKGI Windows Worker-based Kubernetes Clusters after the January 2022 Microsoft Windows Security Patch.
The supported upgrade paths to Tanzu Kubernetes Grid Integrated Edition v1.13.2 are from TKGI v1.13.1, v1.13.0, and from TKGI v1.12.3 and earlier TKGI v1.12 patches.
TKGI v1.13.2 has the following resolved issues:
For information about upcoming deprecations, see Upcoming Deprecations in the TKGI v1.13.0 Release Notes below.
Except where noted, the known issues in Tanzu Kubernetes Grid Integrated Edition v1.13.1 are also in Tanzu Kubernetes Grid Integrated Edition v1.13.2. See the TKGI v1.13.1 Known Issues below.
Warning: The etcd maintainers do not recommend using etcd v3.5.0 for production environments due to a data corruption that might occur under heavy loads. For more information, see etcd v3.5.0 Data Corruption Under Heavy Loads in Known Issues below.
Warning: Spring MVC and Spring WebFlux JDK 9+ applications running on Tomcat as a WAR deployment are vulnerable to remote code execution via data binding. For more information, see Spring Application Remote Code Execution Vulnerability CVE-2022-22965.
For Known Issues in NCP v3.2.0.3, see NSX Container Plugin 3.2.0.3 Release Notes.
The Fluent Bit Docker, CRI, Go, Java, and Python multi-line parser does not merge containerd runtime cluster log entries belonging to the same context into a single log entry.
This issue is fixed in TKGI v1.13.4.
Fluent Bit v1.8.5 and later might fail to forward Pod log entries to the desired logging destination and instead return a DNS timeout error.
Symptoms
When Fluent Bit encounters this unexpected DNS resolution issue, it logs Fluent Bit errors similar to the following:

```
[ warn] [engine] chunk '...' cannot be retried: task_id=1, input=tail.0 > output=splunk.0
[ warn] [net] getaddrinfo(host='...', err=11): Could not contact DNS servers
[ warn] [engine] failed to flush chunk '...', retry in 6 seconds: task_id=11, input=tail.0 > output=splunk.0 (out_id=0)
[ warn] [net] getaddrinfo(host='...', err=11): Could not contact DNS servers
```
Workaround
To resolve the Fluent Bit DNS timeout problem:
Add the following to the `OUTPUT` section in the config map:

```
net.dns.mode TCP
```
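For orientation, a minimal sketch of the relevant ConfigMap section follows; the output plugin name matches the `output=splunk.0` seen in the errors above, while the match pattern and host are illustrative assumptions:

```
[OUTPUT]
    Name          splunk
    Match         *
    Host          splunk.example.com
    net.dns.mode  TCP
```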
For more information on configuring a Fluent Bit ConfigMap, see Installation in the Fluent Bit Official Manual or the example ConfigMap files: kafka/fluent-bit-configmap.yaml and elasticsearch/fluent-bit-configmap.yaml in the Fluent Bit GitHub repository.
Perform a rollout restart of Fluent Bit logging:

```
kubectl logs POD-NAME -n NAMESPACE -c CONTAINER-NAME --follow
```

Where:

* `POD-NAME` is the name of your Pod.
* `NAMESPACE` is the namespace for your Pod.
* `CONTAINER-NAME` is the name of your Pod container.

For more information on restarting Fluent Bit logging, see Interacting with running Pods in kubectl Cheat Sheet in the Kubernetes documentation.
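The command above follows the logs of a restarted Pod; the restart itself can be issued with `kubectl rollout restart`. A minimal sketch, assuming Fluent Bit runs as a DaemonSet named `fluent-bit` in the `pks-system` namespace (both names are assumptions):

```
# Recreate the Fluent Bit Pods so they pick up the updated ConfigMap
kubectl rollout restart daemonset/fluent-bit -n pks-system

# Wait until the restarted Pods are ready
kubectl rollout status daemonset/fluent-bit -n pks-system
```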
For more information, see DNS resolution timeout/failure in >= 1.8.5 #4050 and Malformed HTTP response from splunk and cannot increase buffer on fluentBit v1.8.9 #4723 in the Fluent Bit GitHub repository.
Release Date: February 17, 2022
Note: Tanzu Kubernetes Grid Integrated Edition Management Console provides an opinionated installation of TKGI. The supported versions may differ from or be more limited than what is generally supported by TKGI.
Element | Details | |
---|---|---|
Version | v1.13.2 | |
Release date | February 17, 2022 | |
Installed TKGI version | v1.13.2 | |
Installed Ops Manager version | v2.10.29 | Release Notes |
Component | Version | |
Installed Kubernetes version | v1.22.6* | Release Notes |
Installed Harbor Registry version | v2.4.1 | Release Notes |
Linux stemcell | v621.208* | |
Windows stemcells | v2019.44 and later | |
* Components marked with an asterisk have been updated.
The supported upgrade paths to Tanzu Kubernetes Grid Integrated Edition Management Console v1.13.2 are from TKGI MC v1.13.1, v1.13.0, and from TKGI MC v1.12.3 and earlier TKGI MC v1.12 patches.
This release of the Tanzu Kubernetes Grid Integrated Edition Management Console includes no new features or resolved issues.
For information about upcoming deprecations, see Deprecations in the TKGI MC v1.13.0 Release Notes below.
Except where noted, the known issues in Tanzu Kubernetes Grid Integrated Edition Management Console v1.13.1 are also in Tanzu Kubernetes Grid Integrated Edition Management Console v1.13.2. See the TKGI MC v1.13.1 Known Issues below.
Release Date: December 22, 2021
Warning: The etcd maintainers do not recommend using etcd v3.5.0 for production environments due to a data corruption that might occur under heavy loads. For more information, see etcd v3.5.0 Data Corruption Under Heavy Loads in Known Issues below.
Warning: Spring MVC and Spring WebFlux JDK 9+ applications running on Tomcat as a WAR deployment are vulnerable to remote code execution via data binding. For more information, see Spring Application Remote Code Execution Vulnerability CVE-2022-22965.
Release | Details | |
---|---|---|
Version | v1.13.1 | |
Release date | December 22, 2021 | |
Component | Version | |
Antrea | v1.3.0-1.2.2 | Release Notes |
cAdvisor | v0.39.1 | |
Containerd for Linux | v1.5.7 | |
CoreDNS | v1.8.4+vmware.4 | |
CSI Driver for vSphere | v2.4.0* | Release Notes |
Docker | Linux: v20.10.9<br>Windows: v20.10.7 | |
etcd | v3.5.0 | |
Harbor | v2.4.1* | Release Notes |
Kubernetes | v1.22.2 | Release Notes |
Metrics Server | v0.5.0 | |
NCP | v3.2.0 | Release Notes |
Percona XtraDB Cluster (PXC) | v0.40.0* | |
UAA | v74.5.29* | |
Velero | v1.6.3 | Release Notes |
Wavefront | Wavefront Collector: v1.7.1*<br>Wavefront Proxy: v10.12* | |
Compatibilities | Versions | |
Ops Manager | v2.10.24, v2.10.25, v2.10.26, and v2.10.29 and later versions. For more information, see VMware Tanzu Network or the Ops Manager Release Notes. | |
NSX-T | See VMware Product Interoperability Matrices**. | |
vSphere | | |
Windows stemcells | v2019.44 and later*** | |
Xenial stemcells | See VMware Tanzu Network. | |
* Components marked with an asterisk have been updated.
** To use Policy API features, you must use NSX-T v3.1.3.
*** See Deployments Fail on TKGI Windows Worker-based Kubernetes Clusters after the January 2022 Microsoft Windows Security Patch.
The supported upgrade paths to Tanzu Kubernetes Grid Integrated Edition v1.13.1 are from TKGI v1.13.0, and from TKGI v1.12.2 and earlier TKGI v1.12 patches.
TKGI v1.13.1 has the following resolved issues:
For information about upcoming deprecations, see Upcoming Deprecations in the TKGI v1.13.0 Release Notes below.
Except where noted, the known issues in Tanzu Kubernetes Grid Integrated Edition v1.13.0 are also in Tanzu Kubernetes Grid Integrated Edition v1.13.1. See the TKGI v1.13.0 Known Issues below.
Warning: The etcd maintainers do not recommend using etcd v3.5.0 for production environments due to a data corruption that might occur under heavy loads. For more information, see etcd v3.5.0 Data Corruption Under Heavy Loads in Known Issues below.
Warning: Spring MVC and Spring WebFlux JDK 9+ applications running on Tomcat as a WAR deployment are vulnerable to remote code execution via data binding. For more information, see Spring Application Remote Code Execution Vulnerability CVE-2022-22965.
For Known Issues in NCP v3.2.0.1, see NSX Container Plugin 3.2.0.1 Release Notes.
Warning: VMware recommends that you take steps to mitigate the Apache Log4j vulnerabilities CVE-2021-44228 and CVE-2021-45046 as soon as possible. For more information, see Apache Log4j Vulnerabilities CVE-2021-44228 and CVE-2021-45046 below.
If LDAP is enabled, Harbor private projects are inaccessible after upgrading to TKGI v1.13.1. For more information, see Private projects become inaccessible after upgrading Harbor for TKGI to v2.4.x with LDAP feature enabled in the VMware Tanzu Knowledge Base.
Release Date: December 22, 2021
Note: Tanzu Kubernetes Grid Integrated Edition Management Console provides an opinionated installation of TKGI. The supported versions may differ from or be more limited than what is generally supported by TKGI.
Element | Details | |
---|---|---|
Version | v1.13.1 | |
Release date | December 22, 2021 | |
Installed TKGI version | v1.13.1 | |
Installed Ops Manager version | v2.10.24 | Release Notes |
Component | Version | |
Installed Kubernetes version | v1.22.2 | Release Notes |
Installed Harbor Registry version | v2.4.1* | Release Notes |
Linux stemcell | v621.183* | |
Windows stemcells | v2019.42* and later | |

* Components marked with an asterisk have been updated.
The supported upgrade paths to Tanzu Kubernetes Grid Integrated Edition Management Console v1.13.1 are from TKGI MC v1.13.0, and from TKGI MC v1.12.2 and earlier TKGI MC v1.12 patches.
TKGI Management Console v1.13.1 has the following resolved issues:
For information about upcoming deprecations, see Deprecations in the TKGI MC v1.13.0 Release Notes below.
Except where noted, the known issues in Tanzu Kubernetes Grid Integrated Edition Management Console v1.13.0 are also in Tanzu Kubernetes Grid Integrated Edition Management Console v1.13.1. See the TKGI MC v1.13.0 Known Issues below.
Release Date: November 30, 2021
Warning: The etcd maintainers do not recommend using etcd v3.5.0 for production environments due to a data corruption that might occur under heavy loads. For more information, see etcd v3.5.0 Data Corruption Under Heavy Loads in Known Issues below.
Warning: Spring MVC and Spring WebFlux JDK 9+ applications running on Tomcat as a WAR deployment are vulnerable to remote code execution via data binding. For more information, see Spring Application Remote Code Execution Vulnerability CVE-2022-22965.
Release | Details | |
---|---|---|
Version | v1.13.0 | |
Release date | November 30, 2021 | |
Component | Version | |
Antrea | v1.3.0-1.2.2* | Release Notes |
cAdvisor | v0.39.1 | |
Containerd for Linux | v1.5.7* | |
CoreDNS | v1.8.4+vmware.4* | |
CSI Driver for vSphere | v2.4.0* | Release Notes |
Docker | Linux: v20.10.9*<br>Windows: v20.10.7 | |
etcd | v3.5.0* | |
Harbor | v2.4.0* | Release Notes |
Kubernetes | v1.22.2* | Release Notes |
Metrics Server | v0.5.0* | |
NCP | v3.2.0* | Release Notes |
Percona XtraDB Cluster (PXC) | v0.39.0 | |
UAA | v74.5.26* | |
Velero | v1.6.3 | Release Notes |
Wavefront | Wavefront Collector: v1.7.1*<br>Wavefront Proxy: v10.7 | |
Compatibilities | Versions | |
Ops Manager | v2.10.24, v2.10.25, v2.10.26, and v2.10.29 and later versions. For more information, see VMware Tanzu Network or the Ops Manager Release Notes. | |
NSX-T | See VMware Product Interoperability Matrices**. | |
vSphere | | |
Windows stemcells | v2019.44 and later*** | |
Xenial stemcells | See VMware Tanzu Network. | |
* Components marked with an asterisk have been updated.
** To use Policy API features, you must use NSX-T v3.1.3.
*** See Deployments Fail on TKGI Windows Worker-based Kubernetes Clusters after the January 2022 Microsoft Windows Security Patch.
The supported upgrade paths to TKGI v1.13.0 are from Tanzu Kubernetes Grid Integrated Edition v1.12.0 and later.
TKGI v1.13.0 has the following breaking changes:
Kubernetes has been upgraded to Kubernetes v1.22:
With Kubernetes v1.22, several deprecated API versions are no longer served. Notable removed API versions include:

* `admissionregistration.k8s.io/v1beta1` for Webhook resources.
* `apiextensions.k8s.io/v1beta1` for CustomResourceDefinition.
* `certificates.k8s.io/v1beta1` for CertificateSigningRequest.
* `extensions/v1beta1` and `networking.k8s.io/v1beta1` for Ingress.

For more information about the features and deprecations in Kubernetes 1.22, see Kubernetes API and Feature Removals In 1.22: Here’s What You Need To Know, Deprecated API Migration Guide: 1.22 or Kubernetes 1.22: Reaching New Peaks in the Kubernetes documentation.
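Manifests that still use the removed API versions must be rewritten against their served replacements before they can be applied to a v1.22 cluster. As an illustration with placeholder names, an Ingress previously written against the removed `networking.k8s.io/v1beta1` becomes:

```
apiVersion: networking.k8s.io/v1   # replaces the removed networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: example-ingress            # placeholder name
spec:
  rules:
  - host: app.example.com          # placeholder host
    http:
      paths:
      - path: /
        pathType: Prefix           # pathType is required in v1
        backend:
          service:                 # v1 nests the backend service name and port
            name: example-service
            port:
              number: 8080
```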
Telegraf has been upgraded to Telegraf v1.20.2 for host monitoring:
Previous versions of TKGI have used Telegraf v1.12.3 for host monitoring. For information on the differences between the Telegraf v1.12.3 and v1.20.2 releases, see Telegraf 1.20 release notes and the release notes for earlier Telegraf versions in the Telegraf documentation.
Note: Cluster monitoring continues to use Telegraf v1.13.2.
The Telegraf Prometheus Input Plugin is now configured with `metric_version=2`:

In TKGI v1.13.0, the Telegraf Prometheus Input plugin is configured with `metric_version=2`. If you use the Prometheus Output plugin, your Prometheus Client must also be configured with `metric_version=2`. For Telegraf Prometheus Output plugin configuration information, see Configuration in the Telegraf GitHub repository.
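For orientation, the corresponding Telegraf stanzas look like the following; the listen address is an illustrative assumption:

```
[[inputs.prometheus]]
  metric_version = 2     # matches the TKGI v1.13.0 input plugin setting

[[outputs.prometheus_client]]
  listen = ":9273"       # illustrative listen address
  metric_version = 2     # the output client must use the same metric_version
```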
TKGI uses a new `sink-resources` image path:

With TKGI v1.13.0, the `sink-resources` image path has changed from `gcr.io/cf-pks-releng-environments/oratos/` to `cnabu-docker-local.artifactory.eng.vmware.com/oratos/`.
TKGI v1.13.0 has the following features and enhancements:
[Security Fix] Component bumps fix the following CVEs:
[Security Feature] Passes additional CIS Kubernetes Benchmarks:
* `--audit-log-maxage` is set as appropriate.
* `--audit-log-maxbackup` is set as appropriate.
* `--audit-log-maxsize` is set as appropriate.

For more information, see CIS Kubernetes Benchmarks.
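These benchmarks correspond to the kube-apiserver audit-log flags. The values below are illustrative examples only, not the settings TKGI applies:

```
kube-apiserver \
  --audit-log-maxage=30 \      # days to retain old audit log files (example value)
  --audit-log-maxbackup=10 \   # rotated log files to retain (example value)
  --audit-log-maxsize=100      # size in MB before rotation (example value)
```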
`NodePortLocal`, `NetworkPolicyStats`, and `FlowExporter` configurations default to `TRUE` when using the Antrea CNI.

TKGI v1.13.0 has the following resolved issues:

* Fixes `error: kubectl: unbound variable` while creating multiple clusters simultaneously.
* Fixes `tkgi clusters` Returns the Incorrect Number of Worker Instances.

The following TKGI features have been deprecated or removed from TKGI v1.13:
Manual vSphere CSI Driver Installation Support: Support for manually installing the vSphere CSI Driver has been deprecated and support will be entirely removed in TKGI v1.14. For information on automatic vSphere CSI Driver installation, see Deploying and Managing Cloud Native Storage (CNS) on vSphere.
Flannel Support: Support for the Flannel Container Networking Interface (CNI) is deprecated. VMware recommends that you switch your Flannel CNI-configured clusters to the Antrea CNI. For more information about Flannel CNI deprecation, see About Switching from the Flannel CNI to the Antrea CNI in About Tanzu Kubernetes Grid Integrated Edition Upgrades.
Docker Support: Kubernetes support for the Docker container runtime has been deprecated, and support for the Docker container runtime will be entirely removed in Kubernetes v1.24. TKGI v1.13 supports both the Docker and containerd container runtimes.
VCP Volume Support: VCP volume support has been deprecated. VCP volume support will be entirely removed in TKGI v1.15. For information on how to manually migrate VCP volumes on existing TKGI clusters from VCP to the automatically installed vSphere CSI Driver, see Migrate from VCP to the vSphere CSI Driver in Deploying and Managing Cloud Native Storage (CNS) on vSphere.
Pod Security Policy Support: Kubernetes Pod Security Policy (PSP) support has been deprecated and PSP support will be entirely removed in Kubernetes v1.25. Kubernetes v1.23 and v1.24 provide beta support for Pod Security Admission. For more information, see Pod Security Admission and Enforce Pod Security Standards with Namespace Labels in the Kubernetes documentation.
TKGI v1.13.0 has the following known issues.
Warning: The etcd maintainers do not recommend using etcd v3.5.0 for production environments due to a data corruption that might occur under heavy loads. For more information, see etcd v3.5.0 Data Corruption Under Heavy Loads in Known Issues below.
Warning: Spring MVC and Spring WebFlux JDK 9+ applications running on Tomcat as a WAR deployment are vulnerable to remote code execution via data binding. For more information, see Spring Application Remote Code Execution Vulnerability CVE-2022-22965.
For Known Issues in NCP v3.2.0, see NSX Container Plugin 3.2.0 Release Notes.
Warning: VMware recommends that you take steps to mitigate the Apache Log4j vulnerabilities CVE-2021-44228 and CVE-2021-45046 as soon as possible. For more information, see Apache Log4j Vulnerabilities CVE-2021-44228 and CVE-2021-45046 below.
This issue is fixed in TKGI v1.13.2.
TKGI does not support updating dedicated Tier 1 clusters. You can use a Network Profile to create a dedicated Tier 1 cluster, but you cannot use `tkgi update-cluster` to update the cluster afterward.
Symptom
After you restore Ops Manager and the TKGI API VM from backup, TKGI functions normally, but your TKGI MC tabs include the following error: “…product ‘pivotal-container service’ is not deployed…”.
Explanation
TKGI MC is associated with an Ops Manager with a specific name. If you rename Ops Manager with a new name while restoring, your TKGI MC will not recognize the restored Ops Manager and cannot manage it.
Symptom
The pods in your TKGI Kubernetes clusters on NSX-T become stuck in a creating state. The connections between nsx-node-agent and hyperbus repeatedly close, log `Couldn't connect to 'tcp://...' (error: 111-Connection refused)`, and have a status of `COMMUNICATION_ERROR`.
Explanation
For information and workaround steps for this Known Issue, see Issue 2795268: Connection between nsx-node-agent and hyperbus flips and Kubernetes pod is stuck at creating state in NSX Container Plugin 3.1.2 Release Notes in the VMware documentation.
This issue is fixed in NSX-T v3.0.3 and NSX-T v3.1.3.
Symptom
Your TKGI-provisioned Pods stop after upgrading from NSX-T v3.0.2 to NSX-T v3.1.0 on vSphere 7.0 and 7.0.1.
Explanation
For information, see Issue 2603550: Some VMs are vMotioned and lose network connectivity during UA nodes upgrade in the VMware NSX-T Data Center 3.1.1 Release Notes.
Workaround
To avoid the loss of network connectivity during UA node upgrade, ensure DRS is set to manual mode during your upgrade from NSX-T v3.0.2 to v3.1.0.
If you upgraded to NSX-T v3.1.0 with DRS in automation mode, run the following on the affected Pods’ control plane VMs to restore Pod connectivity:
```
monit restart ncp
```
For more information about upgrading NSX-T v3.0.2 to NSX-T v3.1.0, see Upgrade NSX-T Data Center to v3.0 or v3.1.
Symptom
After clicking Apply Changes on the TKGI tile in an Azure environment, you experience an error ‘…could not execute “apply-changes”…’ with either of the following descriptions:
For example:
```
INFO | 2020-09-21 03:46:49 +0000 | Vessel::Workflows::Installer#run | Install product (apply changes)
2020/09/21 03:47:02 could not execute "apply-changes": installation failed to trigger: request failed: unexpected response from /api/v0/installations:
HTTP/1.1 500 Internal Server Error
Transfer-Encoding: chunked
Cache-Control: no-cache, no-store
Connection: keep-alive
Content-Type: application/json; charset=utf-8
Date: Mon, 21 Sep 2020 17:51:50 GMT
Expires: Fri, 01 Jan 1990 00:00:00 GMT
Pragma: no-cache
Referrer-Policy: strict-origin-when-cross-origin
Server: Ops Manager
Strict-Transport-Security: max-age=31536000; includeSubDomains
X-Content-Type-Options: nosniff
X-Download-Options: noopen
X-Frame-Options: SAMEORIGIN
X-Permitted-Cross-Domain-Policies: none
X-Request-Id: f5fc99c1-21a7-45c3-7f39
X-Runtime: 9.905591
X-Xss-Protection: 1; mode=block
44
{"errors":{"base":["undefined method `location' for nil:NilClass"]}}
0
```
Explanation
The Azure CPI endpoint used by Ops Manager has been changed and your installed version of Ops Manager is not compatible with the new endpoint.
Workaround
Run the following Ops Manager CLI command:
```
om --skip-ssl-validation --username USERNAME --password PASSWORD --target https://OPSMAN-API curl --silent --path /api/v0/staged/director/verifiers/install_time/IaasConfigurationVerifier -x PUT -d '{ "enabled": false }'
```
Where:

* `USERNAME` is the account to use to run Ops Manager API commands.
* `PASSWORD` is the password for the account.
* `OPSMAN-API` is the IP address for the Ops Manager API.

For more information, see Error ‘undefined method location’ is received when running Apply Change on Azure in the VMware Tanzu Knowledge Base.
VMware vRealize Operations (vROPs) does not support Windows worker-based Kubernetes clusters and cannot be used to manage TKGI-provisioned Windows workers.
To monitor Windows-based worker node clusters with a Wavefront collector and proxy, you must first install Wavefront on the clusters manually, using Helm. For instructions, see the Wavefront section of the Monitoring Windows Worker Clusters and Nodes topic.
TKGI-provisioned Windows worker-based Kubernetes clusters inherit a Kubernetes limitation that prevents outbound ICMP communication from workers. As a result, pinging Windows workers does not work.
For information about this limitation, see Limitations > Networking in the Windows in Kubernetes documentation.
You can use Velero to back up stateless TKGI-provisioned Windows workers only. You cannot use Velero to back up stateful Windows applications. For more information, see Velero on Windows in Basic Install in the Velero documentation.
TKGI on Google Cloud Platform (GCP) does not support Tanzu Mission Control (TMC) integration, which is configured in the Tanzu Mission Control pane of the Tanzu Kubernetes Grid Integrated Edition tile.
If you intend to run TKGI on GCP, skip this pane when configuring the Tanzu Kubernetes Grid Integrated Edition tile.
The TMC Data Protection feature supports privileged TKGI containers only. For more information, see Plans in the Installing TKGI topic for your IaaS.
Windows worker-based Kubernetes clusters integrated with group Managed Service Account (gMSA) cannot be managed using compute profiles.
On vSphere with NSX-T networking you can use compute profiles with both Linux and Windows worker‑based Kubernetes clusters. On vSphere with Flannel networking, you can apply compute profiles only to Linux clusters.
TKGI CLI does not prevent accidentally reducing a cluster’s control plane node count using a compute profile.
Warning: Reducing a cluster’s control plane node count can destroy the cluster. Do not scale out or scale in existing control plane nodes by reconfiguring the TKGI tile or by using a compute profile. Reducing a cluster’s number of control plane nodes may remove a control plane node and cause the cluster to become inactive.
Symptom
After you delete a VM using the management console of your infrastructure provider, you notice a Windows worker node that had been on that VM is now in a `notReady` state.
Solution
To identify the leftover node, run `kubectl get no -o wide` and identify the nodes that are in a `notReady` state and have the same IP address as another node in the list.

To manually delete a `notReady` node, run:

```
kubectl delete node NODE-NAME
```

Where `NODE-NAME` is the name of the node in the `notReady` state.
Symptom
You experience a “502 Bad Gateway” error from the NSX load balancer after you log in to OIDC.
Explanation
A large response header has exceeded your NSX-T load balancer maximum response header size. The default maximum response header size is 10,240 characters and should be resized to 50,000.
Workaround
If you experience this issue, manually reconfigure your NSX-T `request_header_size` and `response_header_size` to 50,000 characters. For information about configuring NSX-T default header sizes, see OIDC Response Header Overflow in the Knowledge Base.
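The Knowledge Base article above has the authoritative steps. As an unofficial sketch only, these limits live on the load balancer's HTTP application profile in the NSX-T Policy API; the manager host, credentials, and profile ID below are placeholders:

```
# Hypothetical example: raise request/response header limits to 50,000
# on an NSX-T LBHttpProfile. Replace NSX-MANAGER, the credentials, and
# the profile ID with values from your environment.
curl -k -u 'admin:NSX-PASSWORD' \
  -X PATCH 'https://NSX-MANAGER/policy/api/v1/infra/lb-app-profiles/example-http-profile' \
  -H 'Content-Type: application/json' \
  -d '{"resource_type": "LBHttpProfile", "request_header_size": 50000, "response_header_size": 50000}'
```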
You must configure a global proxy in the Tanzu Kubernetes Grid Integrated Edition tile > Networking pane before you create any Windows workers that use the proxy.
You cannot change the proxy configuration for Windows workers in an existing cluster.
For vSphere with NSX-T, the HTTP Proxy password field does not support the following special characters: `&` or `;`.
Symptom
You receive the following error after modifying your existing Harbor installation’s storage configuration:
Error response from daemon: manifest for ... not found: manifest unknown: manifest unknown
Explanation
Harbor does not support modifying an existing Harbor installation’s storage configuration.
Workaround
To modify your Harbor storage configuration, re-install Harbor. Before starting Harbor, configure the new Harbor installation with the desired configuration.
Symptom
Permissions are removed from your cluster’s files and processes after resizing the persistent disk during a cluster upgrade. The ingress controller statefulset fails to start.
Explanation
When resizing a persistent disk, BOSH migrates the data from the old disk to the new disk but does not copy the files’ extended attributes.
Workaround
To resolve the problem, complete the steps in Ingress controller statefulset fails to start after resize of worker nodes with permission denied in the VMware Tanzu Knowledge Base.
Symptom
You experience issues when configuring a load balancer for a multi-control plane node Kubernetes cluster or creating a service of type `LoadBalancer`. Additionally, in the Azure portal, the VM > Networking page does not display any inbound and outbound traffic rules for your cluster VMs.
Explanation
As part of configuring the Tanzu Kubernetes Grid Integrated Edition tile for Azure, you enter Default Security Group in the Kubernetes Cloud Provider pane. When you create a Kubernetes cluster, Tanzu Kubernetes Grid Integrated Edition automatically assigns this security group to each VM in the cluster. However, on Azure the automatic assignment may not occur.
As a result, your inbound and outbound traffic rules defined in the security group are not applied to the cluster VMs.
Workaround
If you experience this issue, manually assign the default security group to each VM NIC in your cluster.
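For example, with the Azure CLI, a NIC can be associated with the default security group along the following lines (a sketch; the resource group, NIC, and NSG names are placeholders):

```
# Attach the default network security group to one cluster VM NIC.
# Repeat for each NIC in the cluster.
az network nic update \
  --resource-group MY-RESOURCE-GROUP \
  --name MY-CLUSTER-VM-NIC \
  --network-security-group MY-DEFAULT-SECURITY-GROUP
```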
Symptom
One of your plan IDs is one character longer than your other plan IDs.
Explanation
In TKGI, each plan has a unique plan ID. A plan ID is normally a UUID consisting of 32 alphanumeric characters and 4 hyphens. However, the Plan 4 ID consists of 33 alphanumeric characters and 4 hyphens.
Solution
You can safely configure and use Plan 4. The length of the Plan 4 ID does not affect the functionality of Plan 4 clusters.
If you require all plan IDs to have identical length, do not activate or use Plan 4.
Symptom
After you stop one instance in a multiple-instance database cluster, the cluster stops, or communication between the remaining databases times out, and the entire cluster becomes unreachable.
The following might be in your UAA log:
WSREP has not yet prepared node for application use
Explanation
The database cluster is unable to recover automatically because a member is no longer available to reconcile quorum.
Symptom
Backing up vSphere persistent volumes using Velero fails and your Velero backup log includes the following error:
rpc error: code = Unknown desc = Failed during IsObjectBlocked check: Could not translate selfLink to CRD name
Explanation
This is a known issue when backing up clusters on Kubernetes v1.20 and later using the Velero Plugin for vSphere v1.1.0 or earlier.
Workaround
To resolve the problem, complete the steps in Velero backups of vSphere persistent volumes fail on Kubernetes clusters version 1.20 or higher (83314) in the VMware Tanzu Knowledge Base.
Symptom
The first time that you try to create two Windows clusters at the same time, the creation of one of the clusters fails. If you run `pks cluster CLUSTER-NAME` to examine the last action taken on the cluster, you see the following:

```
Last Action: Create
Last Action State: failed
Last Action Description: Instance provisioning failed: There was a problem completing your request. …
operation: create, error-message: Failed to acquire lock …
locking task id is 111, description: 'create deployment'
```
Explanation
This is a known issue that occurs the first time that you create two Windows clusters concurrently.
Workaround
Recreate the failed cluster. This issue only occurs the first time that you create two Windows clusters concurrently.
Symptom
After running `tkgi delete-cluster` and cluster deletion has completed, the deleted cluster continues to be listed when running `tkgi clusters`.
Workaround
You must manually remove the deleted cluster using a customized version of the ncp_cleanup script. For more information, see Deleting a Tanzu Kubernetes Grid Integrated Edition cluster with “tkgi delete-cluster” stuck “in progress” status in the VMware Tanzu Knowledge Base.
Symptom
After you uninstall TKGI, then reinstall TKGI in the same environment, BOSH Director logs errors similar to the following:
.../gems/bosh-director-0.0.0/lib/bosh/director/deployment_plan/cloud_manifest_parser.rb:120:in `parse_vm_extensions': Duplicate vm extension name 'disk_enable_uuid' (Bosh::Director::DeploymentDuplicateVmExtensionName)
Explanation
The `pivotal-container-service` cloud-config was not removed when you uninstalled the TKGI tile, and it remained active. When you reinstalled the TKGI tile, an additional `pivotal-container-service` cloud-config was created, causing the metrics_server to fall into a crash-loop state.
Workaround
You must manually remove the `pivotal-container-service` cloud-config after removing your TKGI deployment, including after removing the TKGI tile from Ops Manager.
For more information, see “Duplicate vm extension name” error when metrics_server runs on Director VM in Tanzu Kubernetes Grid Integrated Edition in the VMware Tanzu Community Knowledge Base.
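As a rough sketch of what the cleanup looks like with the BOSH CLI (verify the config name in your environment first; the Knowledge Base article above is authoritative):

```
# List cloud configs, then delete the leftover TKGI cloud config.
bosh configs --type=cloud
bosh delete-config --type=cloud --name=pivotal-container-service
```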
Symptom
Your TKGI logs include the following error:
'uaa'. Errors are:- Error filling in template 'uaa.yml.erb' (line 59: Client redirect-uri is invalid: uaa.clients.pks_cli.redirect-uri Client redirect-uri is invalid: uaa.clients.pks_cluster_client.redirect-uri)
Explanation
The TKGI API fully-qualified domain name (FQDN) for your cluster contains leading or trailing whitespace.
Workaround
Do not include whitespace in the TKGI tile API Hostname (FQDN) field.
The TMC Cluster Data Protection Backup fails in TKGI environments upgraded from an earlier version.
Symptom
The TMC Cluster Data Protection Backup fails to back up your existing clusters and logs the following error:
error executing custom action (groupResource=customresourcedefinitions.apiextensions.k8s.io, namespace=, name=ncpconfigs.nsx.vmware.com): rpc error: code = Unknown desc = error fetching v1beta1 version of ncpconfigs.nsx.vmware.com: the server could not find the requested resource
Explanation
Kubernetes v1.22 disallows the `spec.preserveUnknownFields: true` configuration in your existing clusters, and the creation of a v1 CustomResourceDefinitions configuration fails.
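To see whether any of your CRDs still carry the disallowed setting, a quick check along these lines can help (a sketch; it assumes `kubectl` and `jq` are available):

```
# Print the names of CRDs that still set spec.preserveUnknownFields: true.
kubectl get crds -o json \
  | jq -r '.items[] | select(.spec.preserveUnknownFields == true) | .metadata.name'
```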
The TMC Cluster Data Protection Restore operation can fail when restoring multiple Antrea resources.
Symptom
The TMC Cluster Data Protection Restore fails and logs errors that requests to restore the admission webhook have been denied.
Explanation
Velero has encountered a race condition while operating a resource. For more information, see Allow customizing restore order for Kubernetes controllers and their managed resources in the Velero GitHub repository.
TKGI does not support environments where there are multiple matching networks, such as a mixed CVDS/NVDS environment.
Symptom
TKGI logs errors similar to the following in an environment with multiple matching networks:
LastOperationstatus='failed', description='Instance provisioning failed:
There was a problem completing your request. Please contact your operations team providing the following information:
service: p.pks, service-instance-guid: ..., broker-request-id: ..., task-id: ..., operation: create,
error-message: Unknown CPI error 'Unknown' with message 'undefined method `mob' for <VimSdk::Vim::OpaqueNetwork:' in create_vm' CPI method
Explanation
TKGI cannot identify which of the matching networks you intend to use and has selected the wrong network.
This issue is fixed in TKGI v1.13.2.
Symptom
After successfully deleting a Policy API-based cluster with LoadBalancer CRDs, you notice that the cluster’s Tier-1 gateway, static routes for the LoadBalancer CRD, and some TKGI NSX-T resources were not deleted, and the following error has been logged:
Found errors in the request. Please refer to the related errors for details.
[Routing] Removal of edge cluster from Tier1 logical router is not allowed as multicast is enabled on it.
Workaround
To work around this issue:
If the cluster has already been deleted:
1. Back up `ncp_cleanup`:

    ```
    cp /var/vcap/jobs/pks-nsx-t-osb-proxy/bin/ncp_cleanup /var/vcap/jobs/pks-nsx-t-osb-proxy/bin/ncp_cleanup.bak
    ```

1. Open `/var/vcap/jobs/pks-nsx-t-osb-proxy/bin/ncp_cleanup` for editing.
1. Modify the `pks` parameter to `true`:

    ```
    --pks=true \
    ```

    For example:

    ```
    $pksnsxcli cleanup \
        --api-type="${API_TYPE}" \
        --nsx-manager-host='192.168.111.101' \
        -c $nsx_manager_client_cert_file \
        -k $nsx_manager_client_key_file \
        --nsx-ca-cert-path=$nsx_manager_ca_cert_file \
        --insecure='false' \
        --cluster "$k8s_cluster_name" \
        --t0-router-id="$t0_router_id" \
        --pks=true \
        --read-only=false \
        --force=$force_delete
    ```

1. Save `ncp_cleanup`.
1. Run `ncp_cleanup`:

    ```
    /var/vcap/jobs/pks-nsx-t-osb-proxy/bin/ncp_cleanup CLUSTER-ID
    ```

    Where `CLUSTER-ID` is the cluster’s UUID.

1. Restore `ncp_cleanup` to its original state:

    ```
    cp /var/vcap/jobs/pks-nsx-t-osb-proxy/bin/ncp_cleanup.bak /var/vcap/jobs/pks-nsx-t-osb-proxy/bin/ncp_cleanup
    ```
This issue is fixed in TKGI v1.13.1.
Due to the Apache Log4j vulnerabilities CVE-2021-44228 and CVE-2021-45046, VMware recommends that you take the following steps as soon as possible:
Upgrade to TKGI v1.13.1 or later.
Note: When upgrading TKGI to mitigate the Apache Log4j vulnerability you must also upgrade all TKGI clusters.
If you cannot upgrade Ops Manager or TKGI, VMware recommends that you implement the workaround steps provided in the VMware Tanzu Knowledge Base.
For more information about the impact of the Log4j CVE-2021-44228 and CVE-2021-45046 vulnerabilities on VMware products, see VMSA-2021-0028 in Advisories in the VMware Security Solutions documentation.
Occasionally, `tkgi update-cluster` hangs while updating a Windows worker node instance, and the BOSH task cannot finish and exits.
Symptom
The `ovsdb-server` service has stopped, but other processes report that it is running.
Explanation
The `ovsdb-server.pid` file uses the PID of a process that is not the ovsdb-server.

To confirm that this is the root cause for `tkgi update-cluster` to hang:

1. To verify the `ovsdb-server` service has actually stopped, run the PowerShell `Get-Service` command on the Windows worker node.
1. To verify that other processes report the `ovsdb-server` service is still running, review the ovsdb-server `job-service-wrapper.err.log` log file. The log file is located at:

    ```
    C:\var\vcap\sys\log\openvswitch-windows\ovsdb-server\job-service-wrapper.err.log
    ```

1. Confirm that after the flushing processes, the log includes an error similar to the following:
```
Pid-Guard : ovsdb-server is already runing, please stop it first
At C:\var\vcap\jobs\openvswitch-windows\bin\ovsdb-server_ctl.ps1:30 char:5
+     Pid-Guard $PIDFILE "ovsdb-server"
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : NotSpecified: ( [Write-Error], WriteErrorException
    + FullyQualifiedErrorId : Microsoft.PowerShell.Commands.WriteErrorException,Pid-Guard
```
To verify the root cause:

1. Run the following PowerShell commands on the Windows worker node:

    ```
    $RUN_DIR = "C:\var\vcap\sys\run\openvswitch-windows"
    $PIDFILE = "$RUN_DIR\ovsdb-server.pid"
    $pid1 = Get-Content $PidFile -First 1
    echo $pid1
    $rst = Get-Process -Id $pid1 -ErrorAction SilentlyContinue
    echo $rst
    ```

1. Confirm that the returned `ProcessName` is not `ovsdb-server`.
.Workaround
To resolve this issue for a single Windows worker:

1. Run the following:

    ```
    rm C:\var\vcap\sys\run\openvswitch-windows\ovsdb-server.pid
    ```

1. Wait for the `ovsdb-server` process to start.

This issue is fixed in TKGI v1.13.2.
When you update a cluster using a Compute Profile to remove existing node pools and create new node pools, BOSH deletes the VMs for the existing node pools and creates new VMs instead of renaming the existing VMs and updating them.
This issue is fixed in TKGI v1.13.2.
Symptom
When you rotate certificates by running the `pks rotate-certificates` CLI command with the `--only-nsx` parameter in an environment with more than one thousand clusters, the command returns the following error:
Error: Cluster ... not found.
ERROR 15217 — [nio-9021-exec-6] i.pivotal.pks.error.ApiExceptionHandler : Error: Status: 500;
ErrorMessage: <nil>; Description: There was a problem completing your request.
Please contact your operations team providing the following information:
...operation: bind - error-message: gathering binding info Could not fetch VMs info
Workaround
To rotate the certificates that did not automatically rotate, you must rotate the certificates manually. For more information, see How to rotate Tanzu Kubernetes Grid Integrated Edition tls-nsx-t cluster certificate in the VMware Tanzu Knowledge Base.
If LDAP is enabled, Harbor private projects are inaccessible after upgrading to TKGI v1.13.0. For more information, see Private projects become inaccessible after upgrading Harbor for TKGI to v2.4.x with LDAP feature enabled in the VMware Tanzu Knowledge Base.
This issue is fixed in TKGI v1.13.2.
On TKGI v1.13 and later, on vSphere with NSX-T, if you apply a load balancer ingress rule to a domain before applying ingress rules to subdomains, the domain ingress rule is applied to both the domain and the subdomains, and the subdomain ingress rules are ignored.
Explanation
With Kubernetes v1.22 and NCP v3.2.0, load balancer ingress rules are applied to domains and subdomains via regex pattern matching instead of exact-match pattern matching.
The load balancer subdomains match the specified domain ingress rule pattern and the subsequent subdomain ingress rules are ignored.
For more information, see Resolved Issues in the NSX Container Plugin 3.2.0.3 Release Notes.
Workaround
For newly created ingress rules, specify subdomain rules before rules for their associated domain. For example, specify the rules for `api.example.com` and `mail.example.com` before the rules for `example.com`, as shown in the sketch below.
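The following is an illustrative sketch of that ordering; the Ingress name, hosts, and backend services are placeholders, not values from your environment:

```
# Subdomain rules are listed before the parent-domain rule so the broader
# pattern cannot shadow them.
kubectl apply -f - <<'EOF'
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-ingress
spec:
  rules:
  - host: api.example.com        # subdomain rules first
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: api-svc
            port:
              number: 80
  - host: example.com            # parent domain last
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: web-svc
            port:
              number: 80
EOF
```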
Microsoft changed Microsoft Windows’ support for tar file commands in the January 2022 Microsoft Windows security patch.
Packaging scripts that use tar commands for Windows worker-based Kubernetes Cluster deployments can fail after the Microsoft tar command patch update has been applied.
The BOSH agent used by vSphere stemcells built by stembuild v2019.43 and earlier uses tar commands that are no longer supported and will fail if the Microsoft Windows security patch has been applied.
Workaround
Stembuild v2019.44 and later include a version of the BOSH agent that does not use unsupported tar commands.
If you use vSphere stemcells, use Stembuild 2019.44 or later to avoid the BOSH agent tar error.
This issue is fixed in TKGI v1.13.2.
Symptom
After you enable the vSphere CSI Driver and migrate your clusters from VCP to vSphere CSI Driver, the validating webhook isn’t able to start and reports errors similar to the following:
Warning Failed: Failed to pull image "gcr.io/cloud-provider-vsphere/csi/release/syncer:v2.4.0": rpc error: code = Unknown desc = Error response from daemon: Get "https://gcr.io/v2/": dial tcp: lookup gcr.io on .... no such host
Warning Failed: Error: ErrImagePull
Warning Failed: ImagePullBackOff
Explanation
In air-gapped environments, the validating webhook is unable to pull the container image from gcr.io.
Workaround
Your environment must be able to access gcr.io during CSI migration.
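A simple reachability probe from inside the environment can confirm this before you start the migration (a sketch only):

```
# Verify that gcr.io resolves and responds before starting CSI migration.
nslookup gcr.io && curl -sI https://gcr.io/v2/ | head -n 1
```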
This issue is fixed in TKGI v1.13.2.
Symptom
After upgrading a TKGI cluster, TKGI might run two CoreDNS Pods on the same worker node.
This issue is fixed in TKGI v1.13.3.
Migrating a cluster from Flannel to Antrea fails if the first octet in the cluster’s CIDR IP address is greater than `128`. For example: `172.0.0.0/17`.
Symptom
After your cluster fails to migrate from Flannel to Antrea:
An error similar to the following is logged in your cluster creation log:
"code":450001,"message":"Action Failed get_task: Task ... result: 1 of 6 post-start scripts failed.
Failed Jobs: deploy-antrea. Successful Jobs: bosh-dns, kube-apiserver, kubernetes-roles, etcd, deploy-proxy-agent."
And errors similar to the following are logged in your antrea-deploy folder:
/var/vcap/jobs/deploy-antrea/bin/post-start: line 45: 2.88676e+09: syntax error: invalid arithmetic operator (error token is ".88676e+09")
/var/vcap/jobs/deploy-antrea/bin/post-start: line 50: antrea_svc_ip_int: unbound variable
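Before attempting the migration, a pre-check along these lines can flag an affected CIDR (a hedged sketch; `CLUSTER_CIDR` is a placeholder for your cluster's CIDR):

```
# Warn if the first octet of the cluster CIDR exceeds 128, which triggers
# the Flannel-to-Antrea migration failure described above.
CLUSTER_CIDR="172.0.0.0/17"
first_octet="${CLUSTER_CIDR%%.*}"
if [ "$first_octet" -gt 128 ]; then
  echo "WARNING: first octet ${first_octet} is greater than 128; migration may fail"
fi
```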
Symptom
The pre-start script for `tkgi create-cluster` fails and logs floating IP pool allocation errors in the `pre-start.stderr.log` similar to the following:
level=error msg="operation failed with [POST /pools/ip-pools/{pool-id}][409] allocateOrReleaseFromIpPoolConflict &{RelatedAPIError:{Details: ErrorCode:5141 ErrorData:<nil> ErrorMessage:Requested IP Address ... is already allocated. ModuleName:id-allocation service} RelatedErrors:[]}\n"
level=warning msg="failed to allocate FIP from (pool: ...: [POST /pools/ip-pools/{pool-id}][409] allocateOrReleaseFromIpPoolConflict &{RelatedAPIError:{Details: ErrorCode:5141 ErrorData:<nil> ErrorMessage:Requested IP Address ... is already allocated. ModuleName:id-allocation service} RelatedErrors:[]}\n"
Error: an error occurred during FIP allocation
Explanation
TKGI administrators can allocate floating IP pool IP Addresses in a Network Profile configuration. The TKGI control plane allocates IP Addresses from the floating IP pool without accounting for the IPs allocated using Network Profiles.
Workaround
TKGI allocates IP addresses starting from the beginning of a floating IP pool range. When configuring a Network Profile, allocate IP Addresses starting at the end of the floating IP pool range instead of those at the beginning.
This issue is fixed in TKGI v1.13.4.
Explanation
The etcd maintainers have announced that etcd v3.5.0 is no longer recommended for production environments due to a data corruption that might occur under heavy loads.
The corruption might occur silently. VMware recommends that TKGI v1.13 and later administrators configure their clusters to detect if the etcd data corruption occurs.
If your Kubernetes cluster control plane nodes running etcd do not have memory pressure and etcd has not been interrupted by a SIGKILL, community analysis indicates that there is a very low possibility of data corruption occurring.
For more information, see How to detect etcd inconsistency issues in TKGI 1.13.x which could be affected by a known issue in etcd v3.5.x in the VMware Tanzu Knowledge Base, and Etcd v3.5.0-v3.5.2 is not recommended for production in the etcd-dev blog.
This issue is fixed in TKGI v1.13.4.
TKGI v1.13.0 through v1.13.3 deploy a version of UAA affected by CVE-2022-22965, a Spring application exploit.
VMware has confirmed that the vulnerability affects applications running in TKGI environments and recommends that you implement the workaround steps below as soon as possible.
Explanation
Through the CVE-2022-22965 exploit, Spring MVC and Spring WebFlux JDK 9+ applications running on Tomcat as a WAR deployment are vulnerable to remote code execution (RCE) via data binding.
For information on CVE-2022-22965, see CVE-2022-22965: Spring Framework RCE via Data Binding on JDK 9+ or VMSA-2022-0010 in the VMware Security Advisories documentation.
Workaround
For information on how to manually mitigate CVE-2022-22965 in TKGI, see Workaround instructions to address CVE-2022-22965 in TKGI v1.11 ~ v1.13 in the VMware Tanzu Knowledge Base.
This issue is fixed in TKGI v1.13.4.
If you apply an NSX-T patch upgrade and then upgrade TKGI, cluster upgrades fail for clusters running the Docker container runtime.
Symptom
Errors similar to the following are logged when your Docker container runtime cluster upgrades fail:
worker/...:/var/vcap/sys/log/docker# cat docker.stderr.log
failed to start daemon: failed to dial "/run/containerd/containerd.sock": failed to dial "/run/containerd/containerd.sock": context deadline exceeded
Explanation
Docker and kubelet are unable to start because the `vrops-cadvisor` DaemonSet has detected that the `/run/containerd/containerd.sock` file does not exist and has automatically created it.
Workaround
If you have already upgraded NSX-T and TKGI:
Remove the `/run/containerd/containerd.sock` file.
If you have not upgraded NSX-T and TKGI:
This issue is fixed in TKGI v1.13.4.
If you run `tkgi rotate-certificates` on a cluster configured with a Kubernetes Profile, the certificate rotation process halts before completing.
Symptom
`tkgi rotate-certificates` halts and logs errors similar to the following when rotating certificates on a cluster configured with a Kubernetes Profile:
org.hibernate.LazyInitializationException: could not initialize proxy io.pivotal.pks.profile.kubernetes.data.KubernetesProfileEntity#... - no Session
at org.hibernate.proxy.AbstractLazyInitializer.initialize(AbstractLazyInitializer.java:169) ~[hibernate-core-5.3.18.Final.jar!/:5.3.18.Final]
at org.hibernate.proxy.AbstractLazyInitializer.getImplementation(AbstractLazyInitializer.java:309) ~[hibernate-core-5.3.18.Final.jar!/:5.3.18.Final]
This issue is fixed in TKGI v1.13.5.
kube-apiserver logs are regularly compressed and stored for future reference.
Stale kube-apiserver logs should be deleted when the compressed logs occupy their maximum allocated disk space. Instead, the compressed kube-apiserver logs are being re-compressed, renamed, and moved to a different location.
This issue is fixed in TKGI v1.13.6.
In an environment configured with a proxy, automatic vSphere CSI Driver Integration ignores the configured proxy and fails.
Symptom
In environments configured with a proxy and automatic vSphere CSI Driver Integration enabled, the `csi-controller` and `csi-syncer` services log errors similar to the following and fail:
"caller":"vsphere/virtualcenter.go:154","msg":"failed to create new client with err: Post \"...\": dial tcp: lookup ...: no such host"
Workaround
To configure the vSphere CSI Driver to use your proxy:
SSH to TKGI Control Plane VM.
Open the `/var/vcap/jobs/csi-controller/bin/csi_controller_ctl` configuration file for editing.
Locate the `start_csi_controller` function in the configuration file:
start_csi_controller()
{
...
export ...
export ...
export ...
...
}
Locate the `export` commands in the `start_csi_controller` function and add the following to the group of export commands:
export HTTP_PROXY=HTTP-PROXY
export HTTPS_PROXY=HTTPS-PROXY
export NO_PROXY=NO-PROXY
Where:
* `HTTP-PROXY` is the HTTP proxy that the vSphere CSI Driver must use.
* `HTTPS-PROXY` is the HTTPS proxy that the vSphere CSI Driver must use.
* `NO-PROXY` is a list of host names for which the vSphere CSI Driver must not use a proxy.

Restart the `csi_controller`:

```
sudo monit restart csi_controller
```
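To confirm the service came back cleanly after the restart, a check such as the following can help (a sketch; monit is already present on the TKGI control plane VM):

```
# csi_controller should report a running status after the restart.
sudo monit summary | grep csi_controller
```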
This issue is fixed in TKGI v1.13.6.
After a failed `tkgi upgrade-cluster` run, your custom BOSH vm_extensions configuration toggles between being present and absent depending on when you review the cluster manifest.
Explanation
The VMs that were updated by `tkgi update-cluster` before the failure have manifests with the desired BOSH vm_extensions configuration, while those updated afterward do not. Re-running `tkgi update-cluster` does not update the manifests on the remaining VMs because the cluster configuration has not changed.
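To see which state a given cluster's manifest is currently in, you can inspect the deployment directly. A sketch, where `service-instance_UUID` is a placeholder for the cluster's BOSH deployment name:

```
# Check whether the current cluster manifest contains the custom
# vm_extensions configuration.
bosh -d service-instance_UUID manifest | grep -n "vm_extensions"
```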
You cannot switch your default CNI from Flannel to Antrea during TKGI upgrade if TKGI is running on Ops Manager v2.10.40 or later.
This issue is fixed in TKGI v1.13.6.
While migrating a cluster from using the In-Tree vSphere Storage Driver to the automatically installed vSphere CSI Driver, VMDKs attached to worker VMs might be deleted.
For more information, see the vSphere CSI Driver Release Notes in the vsphere-csi-driver repository in GitHub.
This issue is fixed in TKGI v1.13.8.
The Docker image is not automatically removed after successfully switching a cluster from the Docker container runtime to containerd.
Workaround
To remove a Docker image after switching a cluster’s container runtime, manually remove the `/var/vcap/store/docker` directory.

This issue is fixed in TKGI v1.13.9.
Switching a cluster’s container runtime from Docker to containerd can timeout and fail if the Docker directory contains many files.
Description
A timeout occurs during the `remove Docker` step while switching a cluster’s container runtime from Docker to containerd if the Docker directory contains too many files to delete within the 180-second timeout interval.
Workaround
To work around this issue:
Manually remove the `/var/vcap/store/docker` directory.

If a Pod is recreated in a new instance node, the persistent volume might remain attached to the old node.
Symptom
A persistent volume remains attached to an old node, and attachment errors similar to the following are logged:
Warning FailedMount... kubelet Unable to attach or mount volumes: unmounted volumes=..., unattached volumes=...: timed out waiting for the condition
Warning FailedMount... kubelet Unable to attach or mount volumes: unmounted volumes=..., unattached volumes=...: timed out waiting for the condition
Warning FailedAttachVolume... attachdetach-controller AttachVolume.Attach failed for volume...:
rpc error: code = Internal desc = failed to attach disk:... with node:... err failed to attach cns volume:... to node vm:....
fault: "(*types.LocalizedMethodFault)(0xc000c88d80)({\n DynamicData: (types.DynamicData)
For more information, see Persistent volume fails to be detached from a node in VMware vSphere Container Storage Plug-in 2.5 Release Notes.
This issue is fixed in TKGI v1.13.9.
Pods in a cluster that has been switched from the Docker container runtime to containerd might enter a CrashLoopBackOff state. If the container runtime switch is part of a cluster upgrade, the upgrade halts.
Symptom
The Pods that have entered the CrashLoopBackOff state log the following:
Warning FailedCreatePodSandBox... Failed to create pod sandbox: rpc error: code =
Unknown desc = failed to create containerd task: failed to start shim:
write /var/vcap/sys/run/containerd/io.containerd.runtime.v2.task/.../config.json:
no space left on device: unknown
The `/var/vcap/data/sys/run` directory on the instance node with Pods that have entered the CrashLoopBackOff state is full.
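To confirm the condition on a suspect worker node, a check like the following can be run over BOSH SSH (a sketch only):

```
# The filesystem backing /var/vcap/data/sys/run shows 100% use when
# the node is affected.
df -h /var/vcap/data/sys/run
```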
The VMware vSphere CSI Driver supports a limited set of VMware vSphere features. Before enabling the vSphere CSI Driver on a TKGI cluster, confirm the cluster and storage configuration are supported by the driver. For more information, see Unsupported Features and Limitations in Deploying and Managing Cloud Native Storage (CNS) on vSphere.
Release Date: November 11, 2021
Note: Tanzu Kubernetes Grid Integrated Edition Management Console provides an opinionated installation of TKGI. The supported versions may differ from or be more limited than what is generally supported by TKGI.
Element | Details | |
---|---|---|
Version | v1.13.0 | |
Release date | November 30, 2021 | |
Installed TKGI version | v1.13.0* | |
Installed Ops Manager version | v2.10.21 | Release Notes |
Component | Version | |
Installed Kubernetes version | v1.22.2* | Release Notes |
Installed Harbor Registry version | v2.4.0* | Release Notes |
Linux stemcell | v621.176* | |
Windows stemcells | v2019.37 and later |
The supported upgrade path to Tanzu Kubernetes Grid Integrated Edition Management Console v1.13.0 is from Tanzu Kubernetes Grid Integrated Edition v1.12.0 and later.
This release of the TKGI Management Console includes no new features or resolved issues.
The following TKGI features have been deprecated or removed from TKGI Management Console v1.13:
The Tanzu Kubernetes Grid Integrated Edition Management Console v1.13.0 has the following known issues:
This issue is fixed in TKGI MC v1.13.1.
You cannot use the TKGI MC to configure or deploy TKGI in an environment with multiple data centers.
Symptom
The Tanzu Kubernetes Grid Integrated Edition Management Console integration to vRealize Log Insight does not support connections to the HTTPS port on the vRealize Log Insight server.
Workaround
1. Open `/lib/systemd/system/pks-loginsight.service` in a text editor.
1. Set `-e LOG_SERVER_ENABLE_SSL_VERIFY=false`.
1. Set `-e LOG_SERVER_USE_SSL=true`.
The resulting file should look like the following example:
```
ExecStart=/bin/docker run --privileged --restart=always --network=pks
-v /var/log/journal:/var/log/journal
--name=pks-loginsight
-e TYPE=gear2-vm
-e LOG_SERVER_HOST=${LOGINSIGHT_HOST}
-e LOG_SERVER_PORT=${LOGINSIGHT_PORT}
-e LOG_SERVER_ENABLE_SSL_VERIFY=false
-e LOG_SERVER_USE_SSL=true
-e LOG_SERVER_AGENT_ID=${LOGINSIGHT_ID}
pksoctopus/vrli-journald:v07092019
```
1. Save the file and run `systemctl daemon-reload`.
1. Run `systemctl restart pks-loginsight.service`.

Tanzu Kubernetes Grid Integrated Edition Management Console can now send logs to the HTTPS port on the vRealize Log Insight server.
Symptom
If you enable vSphere HA on a cluster, if the TKGI Management Console appliance VM is running on a host in that cluster, and if the host reboots, vSphere HA recreates a new TKGI Management Console appliance VM on another host in the cluster. Due to an issue with vSphere HA, the `ovfenv` data for the newly created appliance VM is corrupted and the new appliance VM does not boot up with the correct network configuration.
Workaround
Reboot the appliance VM after the `Reconfigure virtual machine` task has run on the appliance VM.

Symptom
Some file arguments in Kubernetes profiles are base64 encoded. When the management console displays the Kubernetes profile, some file arguments are not decoded.
Workaround
To decode the file arguments, run `echo "$content" | base64 --decode`.
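For example, a minimal sketch, where the encoded value is read from a hypothetical file:

```
# Decode one base64-encoded file argument copied from the profile view.
content=$(cat encoded-file-argument.txt)
echo "$content" | base64 --decode
```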
Symptom
If you create network profiles and then try to apply them in the Create Cluster page, the new profiles are not available for selection.
Workaround
Log out of the management console and log back in again.
Symptom
In the cluster summary page, only the default IP pool, pod IP block, and node IP block values are displayed, rather than the real-time values from the associated network profile.
Workaround
None
Symptom
You receive the following error after modifying your existing Harbor installation’s storage configuration:
Error response from daemon: manifest for ... not found: manifest unknown: manifest unknown
Explanation
Harbor does not support modifying an existing Harbor installation’s storage configuration.
Workaround
To modify your Harbor storage configuration, re-install Harbor. Before starting Harbor, configure the new Harbor installation with the desired configuration.
Symptom
After upgrading Ops Manager, your Management Console does not recognize a Windows stemcell imported when using the prior version of Ops Manager.
Workaround
If your Management Console does not recognize a Windows stemcell after upgrading Ops Manager:
Symptom
After you create a cluster, Tanzu Mission Control does not include the cluster in cluster lists. You have a “Resource not found” error similar to the following in your BOSH logs:
Cluster Name in TMC: cluster-1
Cluster Name Prefix: tkgi-my-prefix-
Group Name in TMC: my-prefix-clusters
Cluster Description in TMC: VMware Enterprise PKS Attaching cluster ''tkgi-my-prefix-cluster-1'' to TMC
Fetching token successful
request POST:/v1alpha1/clusters,
response 404 Not Found:{"error":"Resource not found - clustergroup(my-prefix-clusters)
org id(d859dc9f-g622-426d-8c91-939a9f13dea9)",
"code":5,"message":"Resource not found - clustergroup(my-prefix-clusters)
Explanation
The cluster group you assign a cluster to must be defined in Tanzu Mission Control before you assign your cluster to the cluster group in the TKGI Management Console.
Workaround
To resolve the problem, complete the steps in Attaching a Tanzu Kubernetes Grid Integrated (TKGI) cluster to Tanzu Mission Control (TMC) fails with “Resource not found - clustergroup(cluster-group-name)” in the VMware Tanzu Knowledge Base.
This issue is fixed in TKGI MC v1.13.1.
Symptom
The TKGI Configuration view within TKGI MC displays the following error:
```
Fail to configure NSX for TKGI: Traceback (most recent call last): File "scripts/nsx_automator.py", line 182,
in <module> edge_tzs |= extra_edge_tzs TypeError: unsupported operand type(s) for |=: 'dict' and 'dict'
```
This issue is fixed in TKGI MC v1.13.3.
Symptom
In an NSX-T Data Center - Bring Your Own Topology environment, TKGI MC displays the following error if you reconfigure your TKGI deployment DNS after installing TKGI:
Input validation failed Validation errors found: Fail to parse network : invalid CIDR address:
Workaround
To reconfigure your TKGI Deployment DNS:

1. To modify the TKGI Deployment DNS, modify the `dep_dns` field.
1. To modify the TKGI Deployment CIDR, modify the `dep_network_cidr` field. Provide any CIDR that does not contain the Deployment DNS IP.
For example:
```
network:
  dep_dns: 192.168.111.155
  dep_network_cidr: "80.0.0.1/24"
```
The new CIDR will not affect your original CIDR configuration.
Click Apply Configuration.