TKG Service provides self-service life cycle management of Kubernetes workload clusters. TKG Service with Supervisor is optimized for the vSphere environment and integrates with the underlying infrastructure, including vCenter, ESXi, virtual networking, and cloud native storage. With TKG Service, you can provision conformant Kubernetes clusters and keep them current with upstream Kubernetes releases.

The TKG Service includes several components integrated with the vSphere IaaS control plane.
Figure 1. TKG Service Components

Workload Management

Workload Management is a VMware solution that provides a Kubernetes control plane on native vSphere infrastructure. Enabling Workload Management lets you deploy one or more Supervisors in your vSphere environment. Workload Management is bundled with vCenter Server but separately licensed.

Supervisor

Supervisor is a Kubernetes cluster that serves as the control plane to manage workload clusters. You configure Supervisor to support developers who provision and operate Tanzu Kubernetes Grid clusters.

Supervisor Deployment: vSphere Cluster

The traditional way to deploy Supervisor is on a single vSphere cluster. A vSphere cluster is a collection of ESXi hosts managed by a vCenter Server. A vSphere cluster must be configured with specific features to support Supervisor, including:
  • Multiple ESXi hosts connected by a vSphere Distributed Switch (VDS)
  • Shared storage, such as vSAN or NFS
  • vSphere HA and fully automated DRS enabled
  • vSphere Lifecycle Manager enabled
  • Networking configured with either NSX and its embedded load balancer, or VDS with an external load balancer
Architecturally, for production deployments of TKG on Supervisor, you should separate the management plane host or cluster where vCenter Server runs from the compute plane cluster where Supervisor will be enabled. If you are using a vSphere cluster to host vCenter Server, this cluster should not have DRS enabled. Refer to the vCenter documentation for details.

Supervisor Deployment: vSphere Zones

vSphere 8 introduces vSphere Zones. You can assign vSphere clusters to vSphere Zones to provide high availability and fault tolerance for Supervisor. Deploying Supervisor across vSphere Zones lets you provision TKG clusters in specific availability zones.

In a deployment of Supervisor on a single vSphere cluster, there is a one-to-one relationship between the Supervisor and the vSphere cluster. In a zoned Supervisor deployment, Supervisor stretches across three vSphere clusters to provide high availability and failure domains for TKG clusters on Supervisor.

A vSphere administrator creates three vSphere Zones and associates each zone with a vSphere cluster. A Supervisor deployment on three vSphere Zones has special requirements in addition to the vSphere cluster requirements, including:

  • Three vSphere Zones connected by a vSphere Distributed Switch (VDS)
  • Each vSphere Zone contains a single vSphere cluster
  • Supervisor must be deployed on exactly three vSphere Zones
  • A vSphere storage policy that is available on each vSphere Zone

vSphere Namespace

A vSphere Namespace is a namespace on Supervisor where one or more Tanzu Kubernetes Grid clusters are provisioned. For each vSphere Namespace, you configure role-based access control, persistent storage, resource limits, an image library, and virtual machine classes.

TKG Service

TKG Service is an implementation of the open source Cluster API project, which defines a set of custom resources and controllers to manage the life cycle of Kubernetes clusters. TKG Service is a component of Supervisor.

TKG Service has three layers of controllers to manage the life cycle of TKG clusters: the Virtual Machine Service, Cluster API, and the Cloud Provider Plugin.

VM Operator
The Virtual Machine Service controller provides a declarative, Kubernetes-style API for managing VMs and associated vSphere resources. The Virtual Machine Service introduces the concept of a virtual machine class, which represents an abstract, reusable hardware configuration. TKG Service uses the Virtual Machine Service to manage the life cycle of the control plane and worker node VMs hosting a workload cluster. An illustrative virtual machine class manifest follows this list of controllers.
Cluster API
The Cluster API controller provides declarative, Kubernetes-style APIs for cluster creation, configuration, and management. The inputs to Cluster API include a resource describing the cluster, a set of resources describing the virtual machines that make up the cluster, and a set of resources describing cluster add-ons.
Cloud Provider Plugin
TKG Service provisions workload clusters that include the components necessary to integrate with the underlying vSphere Namespace resources. These components include a Cloud Provider Plugin that integrates with the Supervisor. TKG uses the Cloud Provider Plugin to pass requests for persistent volumes to the Supervisor, which is integrated with VMware Cloud Native Storage (CNS).
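
The virtual machine class concept can be illustrated with a short manifest. The following is a minimal sketch that assumes the vmoperator.vmware.com/v1alpha1 API version and the hardware fields exposed by VM Operator; the class name and sizing values are placeholders, and the exact schema in your environment might differ.

  apiVersion: vmoperator.vmware.com/v1alpha1
  kind: VirtualMachineClass
  metadata:
    name: example-medium      # placeholder class name
  spec:
    hardware:
      cpus: 4                 # virtual CPUs assigned to each node VM
      memory: 16Gi            # memory assigned to each node VM

A cluster specification references a class such as this one by name, and TKG Service uses it when creating control plane and worker node VMs.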

Tanzu Kubernetes Releases

A Tanzu Kubernetes release provides the Kubernetes software distribution and add-ons signed and supported by VMware for use with Tanzu Kubernetes Grid clusters.

Each Tanzu Kubernetes release is distributed as a virtual machine template (OVA file). Tanzu Kubernetes Grid uses the OVA format to construct the virtual machine nodes for TKG clusters. Tanzu Kubernetes releases are versioned according to Kubernetes versioning and include OS customizations and optimizations for vSphere infrastructure.

For a list of Tanzu Kubernetes releases and compatibility with Supervisor, refer to the Tanzu Kubernetes releases Release Notes. See also the vSphere IaaS control plane Support Policy.

Content Library

Tanzu Kubernetes releases are made available to TKG clusters using a vCenter content library. You can create a subscribed content library and automatically receive TKRs when they are made available by VMware, or use a local content library and manually upload TKRs.

TKG Service Cluster Components

The components that run in a TKG Service cluster provide functionality for four areas of the product: authentication, storage, networking, and load balancing.

Authentication Webhook

The authentication webhook runs as a pod inside the cluster to validate user authentication tokens.

TKG clusters support authentication in two ways: using vCenter Single Sign-On and using an external identity provider that supports the OpenID Connect (OIDC) protocol.

TKG runs the Pinniped OIDC client on Supervisor and on TKG cluster nodes. Configuring an external OIDC provider for Supervisor automatically configures the Pinniped components.

TKG supports various clients for authenticating with Supervisor, including the vSphere Plugin for kubectl and the Tanzu CLI.

Container Storage Interface (CSI)

The paravirtual CSI plugin is a Kubernetes pod that runs inside a TKG cluster and integrates with VMware Cloud Native Storage (CNS) through Supervisor. A Kubernetes pod that runs inside a TKG cluster can mount three types of virtual disks: ephemeral, persistent volume, and container image.

Transient Storage

A pod requires transient storage to store ephemeral data for Kubernetes objects such as logs, volumes, and configuration maps. Transient storage lasts as long as the pod exists. Ephemeral data persists across container restarts, but once the pod is deleted, the virtual disk storing ephemeral data is removed as well.

Persistent Storage

TKG leverages the vSphere storage policy framework for defining storage classes and reserving persistent volumes. TKG clusters pass requests for persistent volumes to the Supervisor, which is integrated with VMware Cloud Native Storage (CNS). Persistent volumes for cluster workloads are provisioned either dynamically, using a storage class, or manually.
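
For example, a workload in a TKG cluster can request dynamically provisioned storage with a standard PersistentVolumeClaim. This is a minimal sketch; the claim and storage class names are placeholders, and the storage class must correspond to a vSphere storage policy made available to the cluster.

  apiVersion: v1
  kind: PersistentVolumeClaim
  metadata:
    name: example-pvc                         # placeholder claim name
  spec:
    accessModes:
      - ReadWriteOnce
    storageClassName: example-storage-policy  # placeholder; must match a storage class in the cluster
    resources:
      requests:
        storage: 5Gi                          # requested capacity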

Container Image Storage

Containers inside a Kubernetes pod use images that contain the software to be run. The pod mounts images used by its containers as image virtual disks. When the pod completes its life cycle, the image virtual disks are detached from the pod. Kubelet is responsible for pulling container images from the image registry and transforming them into virtual disks to run inside the pod.

Container Network Interface (CNI)

The Container Network Interface (CNI) plugin provides pod networking for the cluster.

TKG clusters support the following Container Network Interface (CNI) options: Antrea (default) and Calico. In addition, TKG provides the Antrea NSX Routed CNI to implement routable pod networking.

The following table summarizes TKG cluster networking features and their implementation.

Table 1. TKG Service Cluster Networking
  • Pod connectivity
    Provider: Antrea or Calico
    Description: Container network interface for pods. Antrea uses Open vSwitch. Calico uses the Linux bridge with BGP.
  • Service type: ClusterIP
    Provider: Antrea or Calico
    Description: Default Kubernetes service type that is accessible only from within the cluster.
  • Service type: NodePort
    Provider: Antrea or Calico
    Description: Allows external access through a port opened on each worker node by the Kubernetes network proxy.
  • Network policy
    Provider: Antrea or Calico
    Description: Controls what traffic is allowed to and from selected pods and network endpoints. Antrea uses Open vSwitch. Calico uses Linux iptables.
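
As an example of the network policy support listed above, the following standard Kubernetes NetworkPolicy restricts ingress to pods labeled app: backend so that only pods labeled app: frontend can reach them on TCP port 8080. This is an illustrative sketch; the namespace, labels, and port are placeholders.

  apiVersion: networking.k8s.io/v1
  kind: NetworkPolicy
  metadata:
    name: allow-frontend-to-backend
    namespace: example-app             # placeholder namespace
  spec:
    podSelector:
      matchLabels:
        app: backend                   # pods this policy applies to
    policyTypes:
      - Ingress
    ingress:
      - from:
          - podSelector:
              matchLabels:
                app: frontend          # only traffic from these pods is allowed
        ports:
          - protocol: TCP
            port: 8080
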
Cloud Provider Implementation
The Cloud Provider Implementation lets you create Kubernetes load balancer and ingress services.
Table 2. TKG Load Balancing
  • Service type: LoadBalancer
    Provider: NSX embedded load balancer (part of the NSX network stack), NSX Advanced Load Balancer (separate installation for use with VDS networking), or HAProxy (separate installation for use with VDS networking)
    Description: For the NSX embedded load balancer, one virtual server is created per service type definition. For the NSX Advanced Load Balancer and HAProxy, refer to the corresponding sections of this documentation. Note: Some load balancing features, such as static IPs, might not be available with every supported load balancer type.
  • Cluster ingress
    Provider: Third-party ingress controller
    Description: Provides routing for inbound pod traffic. You can use any third-party ingress controller, such as Contour.
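
A load balancer service is requested with the standard Kubernetes Service of type LoadBalancer; the configured provider (NSX embedded load balancer, NSX Advanced Load Balancer, or HAProxy) allocates the external address. The manifest below is a minimal sketch with placeholder names, labels, and ports.

  apiVersion: v1
  kind: Service
  metadata:
    name: example-lb              # placeholder service name
  spec:
    type: LoadBalancer            # handled by the configured load balancer provider
    selector:
      app: web                    # placeholder pod label
    ports:
      - port: 80                  # external port exposed by the load balancer
        targetPort: 8080          # container port on the selected pods
        protocol: TCP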

TKG Service Cluster APIs

TKG Service provides two APIs for provisioning and managing the life cycle of TKG clusters.
  • API version v1alpha3 for Tanzu Kubernetes clusters
  • API version v1beta1 for Clusters based on a ClusterClass

The v1alpha3 API lets you create conformant Kubernetes clusters of type TanzuKubernetesCluster. This type of cluster is pre-configured with common defaults for quick provisioning, and can be customized. The v1beta1 API lets you create conformant Kubernetes clusters of type Cluster based on a default ClusterClass provided by VMware.
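
The following is a minimal sketch of a v1alpha3 TanzuKubernetesCluster specification. The vSphere Namespace, VM class, storage class, and Tanzu Kubernetes release names are placeholders and must match objects that are configured and available in your environment.

  apiVersion: run.tanzu.vmware.com/v1alpha3
  kind: TanzuKubernetesCluster
  metadata:
    name: example-cluster
    namespace: example-namespace              # an existing vSphere Namespace
  spec:
    topology:
      controlPlane:
        replicas: 3                           # control plane node count
        vmClass: example-vm-class             # placeholder virtual machine class
        storageClass: example-storage-policy  # placeholder storage class
        tkr:
          reference:
            name: example-tkr                 # placeholder; use a Tanzu Kubernetes release from your content library
      nodePools:
        - name: worker-pool
          replicas: 3                         # worker node count
          vmClass: example-vm-class
          storageClass: example-storage-policy

You apply a manifest like this with kubectl while your context is set to the target vSphere Namespace.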

Note: To upgrade vSphere IaaS control plane from vSphere 7 to vSphere 8, the TKG cluster must be running the v1alpha2 API. The v1alpha2 API was introduced with v7.0 Update 3. The v1alpha1 API is deprecated. For more information, see: .

TKG Service Cluster Clients

TKG on vSphere 8 Supervisor supports various client interfaces for provisioning, monitoring, and managing TKG clusters.
  • vSphere Client for configuring Supervisor and monitoring deployed TKG clusters.
  • vSphere Plugin for kubectl for authenticating with Supervisor and TKG clusters using vCenter Single Sign-On.
  • kubectl to provision and manage the life cycle of TKG clusters declaratively, and to interact with Supervisor.
  • vSphere Docker Credential Helper to push and pull images to and from a container registry.
  • Tanzu CLI for provisioning clusters using commands, and for installing Tanzu packages.
  • Tanzu Mission Control web interface for managing TKG clusters.