vSphere with Tanzu transforms the vSphere cluster into a platform for running Kubernetes workloads in dedicated resource pools. When vSphere with Tanzu is enabled on a vSphere cluster, it creates a Kubernetes control plane directly in the hypervisor layer. You can then run containerized workloads by creating upstream Kubernetes clusters through the VMware Tanzu Kubernetes Grid Service and run your applications inside these clusters.
This document provides a reference design for deploying VMware Tanzu for Kubernetes Operations (informally known as TKO) on vSphere with Tanzu.
The following reference design is based on the architecture and components described in VMware Tanzu for Kubernetes Operations Reference Architecture.
Supervisor Cluster: When Workload Management is enabled on a vSphere cluster, it creates a Kubernetes layer within the ESXi hosts that are part of the cluster. A cluster that is enabled for Workload Management is called a Supervisor Cluster. You run containerized workloads by creating upstream Kubernetes clusters on the Supervisor Cluster through the Tanzu Kubernetes Grid Service.
The Supervisor Cluster runs on top of an SDDC layer that consists of ESXi for compute, vSphere Distributed Switch for networking, and vSAN or another shared storage solution.
vSphere Namespaces: A vSphere Namespace is a tenancy boundary within vSphere with Tanzu. A vSphere Namespace allows for sharing vSphere resources (compute, networking, storage) and enforcing resource limits on the underlying objects, such as Tanzu Kubernetes clusters. For each namespace, you configure role-based access control (policies and permissions), the image library, and virtual machine classes.
Tanzu Kubernetes Grid Service: Tanzu Kubernetes Grid Service (TKGS) allows you to create and manage ubiquitous Kubernetes clusters on a VMware vSphere infrastructure using the Kubernetes Cluster API. The Cluster API provides declarative, Kubernetes-style APIs for the creation, configuration, and management of the Tanzu Kubernetes Cluster.
Tanzu Kubernetes Grid Service also provides self-service lifecycle management of Tanzu Kubernetes clusters.
Tanzu Kubernetes Cluster (Workload Cluster): Tanzu Kubernetes clusters are Kubernetes workload clusters in which your application workloads run. These clusters can be attached to SaaS solutions such as Tanzu Mission Control, Tanzu Observability, and Tanzu Service Mesh, which are part of Tanzu for Kubernetes Operations.
VM Class in vSphere with Tanzu: A VM class is a template that defines CPU, memory, and reservations for VMs. VM classes are used for VM deployment in a Supervisor Namespace. VM classes can be used by standalone VMs that run in a Supervisor Namespace and by VMs hosting a Tanzu Kubernetes cluster.
VM classes in vSphere with Tanzu are broadly categorized into two groups: guaranteed and best effort. A guaranteed class fully reserves its configured CPU and memory, whereas a best effort class allows those resources to be overcommitted.
vSphere with Tanzu offers several default VM classes. You can use them as is or you can create new VM classes. The following screenshot shows the default VM classes that are available in vSphere with Tanzu.
Storage Classes in vSphere with Tanzu: A StorageClass provides a way for administrators to describe the classes of storage they offer. Different classes can map to quality-of-service levels, to backup policies, or to arbitrary policies determined by the cluster administrators.
You can deploy vSphere with Tanzu with an existing default StorageClass or the vSphere Administrator can define StorageClass objects (Storage policy) that let cluster users dynamically create PVC and PV objects with different storage types and rules.
The following table provides recommendations for configuring VM Classes/Storage Classes in a vSphere with Tanzu environment.
Decision ID | Design Decision | Design Justification | Design Implications |
---|---|---|---|
TKO-TKGS-001 | Create custom Storage Classes/Profiles/Policies | To provide different levels of QoS and SLA for prod and dev/test K8s workloads. To isolate Supervisor clusters from workload clusters. | The default storage policy might not be adequate if deployed applications have different performance and availability requirements. |
TKO-TKGS-002 | Create custom VM Classes | To facilitate deployment of K8s workloads with specific compute/storage requirements. | Default VM classes in vSphere with Tanzu are not adequate to run a wide variety of K8s workloads. |
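To illustrate TKO-TKGS-002, the following is a minimal sketch of how a custom VM class surfaces in the Supervisor Cluster as a VirtualMachineClass object. The class name and sizes are hypothetical, and custom VM classes are normally created from the vSphere Client (Workload Management > VM Classes) rather than applied directly with kubectl; field names can also vary by vSphere release.

```yaml
# Illustrative sketch only -- class name and sizes are hypothetical.
apiVersion: vmoperator.vmware.com/v1alpha1
kind: VirtualMachineClass
metadata:
  name: custom-medium-guaranteed   # hypothetical custom class name
spec:
  hardware:
    cpus: 8        # vCPUs allocated to node VMs that use this class
    memory: 32Gi   # memory allocated to node VMs that use this class
  # A "guaranteed" style class additionally carries full CPU and memory
  # reservations under spec.policies, while "best effort" classes do not.
```

Once a class is created and associated with a vSphere Namespace, it can be referenced by name in Tanzu Kubernetes cluster specifications.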
The following diagram shows a high-level architecture of vSphere with Tanzu.
The Supervisor Cluster consists of the following components:
Kubernetes control plane VM: Three Kubernetes control plane VMs in total are created on the hosts that are part of the Supervisor cluster. The three control plane VMs are load balanced, as each one of them has its own IP address.
Cluster API and Tanzu Kubernetes Grid Service: These modules run on the Supervisor cluster and enable the provisioning and management of Tanzu Kubernetes clusters.
The following diagram shows the general architecture of the Supervisor cluster.
After a Supervisor cluster is created, the vSphere administrator creates vSphere namespaces. When initially created, vSphere namespaces have unlimited resources within the Supervisor cluster. The vSphere administrator defines the limits for CPU, memory, and storage, as well as the number of Kubernetes objects, such as deployments, replica sets, and persistent volumes, that can run within the namespace. These limits are configured for each vSphere namespace.
For more information about the maximum supported number, see the vSphere with Tanzu Configuration Maximums guide.
To provide tenants access to namespaces, the vSphere administrator assigns permission to users or groups available within an identity source that is associated with vCenter Single Sign-On.
Once the permissions are assigned, tenants can access the namespace to create Tanzu Kubernetes Clusters using YAML files and the Cluster API.
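As an illustration of this workflow, the following is a minimal TanzuKubernetesCluster manifest that a tenant could apply with kubectl against their vSphere namespace on the Supervisor cluster. The namespace, VM class, storage class, and Tanzu Kubernetes release names are placeholders for values available in your environment, and the API version shown (run.tanzu.vmware.com/v1alpha3) can differ between vSphere releases.

```yaml
# Minimal example cluster; all names are placeholders for values in your environment.
apiVersion: run.tanzu.vmware.com/v1alpha3
kind: TanzuKubernetesCluster
metadata:
  name: tkc-dev-01                   # hypothetical cluster name
  namespace: dev-namespace           # hypothetical vSphere namespace
spec:
  topology:
    controlPlane:
      replicas: 1
      vmClass: best-effort-small     # VM class associated with the namespace
      storageClass: vsan-default-storage-policy   # storage class exposed to the namespace
      tkr:
        reference:
          name: v1.24.9---vmware.1-tkg.4   # placeholder Tanzu Kubernetes release name
    nodePools:
    - name: worker-pool-01
      replicas: 2
      vmClass: best-effort-medium
      storageClass: vsan-default-storage-policy
```

Applying the manifest while logged in to the Supervisor cluster context triggers the Tanzu Kubernetes Grid Service and Cluster API to provision the cluster nodes.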
Here are some recommendations for using namespaces in a vSphere with Tanzu environment.
Decision ID | Design Decision | Design Justification | Design Implications |
---|---|---|---|
TKO-TKGS-003 | Create namespaces to logically separate K8s workloads. | Create dedicated namespaces for the type of workloads (prod/dev/test) that you intend to run. | All Kubernetes clusters created under a namespace share the same access policy/quotas/network resources. |
TKO-TKGS-004 | Enable self-service namespaces. | Enable DevOps/Cluster admin users to provision namespaces in a self-service manner. | The vSphere administrator must publish a namespace template to the LDAP users/groups to enable them to create namespaces. |
TKO-TKGS-005 | Register external identity source (AD/LDAP) with vCenter. | Limit access to a namespace to authorized users/groups. | A prod namespace can be accessed by a handful of users, whereas a dev/test namespace can be exposed to a wider audience. |
Software Components | Version |
---|---|
Tanzu Kubernetes Release | 1.24.9 |
VMware vSphere ESXi | 8.0 U1 or later |
VMware vCenter (VCSA) | 8.0 U1 or later |
NSX Advanced Load Balancer | 22.1.3 |
vSphere with Tanzu integrates with shared datastores available in the vSphere infrastructure. The following types of shared datastores are supported: VMFS, NFS, vSAN, and vVols.
vSphere with Tanzu uses storage policies to integrate with shared datastores. The policies represent datastores and manage the storage placement of objects such as control plane VMs, container images, and persistent storage volumes.
Before you enable vSphere with Tanzu, create storage policies to be used by the Supervisor Cluster and namespaces. Depending on your vSphere storage environment, you can create several storage policies to represent different classes of storage.
vSphere with Tanzu is agnostic about which storage option you choose. For Kubernetes stateful workloads, vSphere with Tanzu installs the vSphere Container Storage Interface (vSphere CSI) to automatically provision Kubernetes persistent volumes for pods.
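For example, a stateful workload in a Tanzu Kubernetes cluster requests storage by referencing one of the storage classes exposed from the namespace's storage policies, and the vSphere CSI driver provisions the backing volume. The storage class name below is a placeholder for a class available in your cluster.

```yaml
# The storage class name is a placeholder; "kubectl get storageclass" lists
# the classes derived from the storage policies assigned to the namespace.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: vsan-gold-storage-policy   # hypothetical storage class
  resources:
    requests:
      storage: 20Gi
```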
A Tanzu Kubernetes cluster provisioned by the Tanzu Kubernetes Grid Service supports two Container Network Interface (CNI) options: Antrea and Calico.
The CNI options are open-source software that provide networking for cluster pods, services, and ingress.
When you deploy a Tanzu Kubernetes cluster using the default configuration of Tanzu CLI, Antrea CNI is automatically enabled in the cluster.
To provision a Tanzu Kubernetes cluster using Calico CNI, see Deploy Tanzu Kubernetes clusters with Calico.
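As a rough sketch of what that looks like (assuming the same TanzuKubernetesCluster API version as in the earlier example), overriding the default CNI amounts to adding a network setting to the cluster spec; the exact field layout can vary by API version.

```yaml
# Excerpt of a TanzuKubernetesCluster spec selecting Calico instead of the
# default Antrea CNI.
spec:
  settings:
    network:
      cni:
        name: calico
```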
Each CNI is suitable for a different use case. The following table lists some common use cases for the CNI options that Tanzu Kubernetes Grid supports. This table will help you select the most appropriate CNI for your Tanzu Kubernetes Grid implementation.
CNI | Use Case | Pros and Cons |
---|---|---|
Antrea | Enable Kubernetes pod networking with IP overlay networks using VXLAN or Geneve for encapsulation. Optionally encrypt node-to-node communication using IPsec packet encryption. Antrea supports advanced network use cases like kernel bypass and network service mesh. | Pros: Antrea leverages Open vSwitch as the networking data plane, and Open vSwitch supports both Linux and Windows. VMware supports the latest conformant Kubernetes and stable releases of Antrea. |
Calico | Calico is used in environments where factors like network performance, flexibility, and power are essential. For routing packets between nodes, Calico leverages the BGP routing protocol instead of an overlay network. This eliminates the need to wrap packets with an encapsulation layer, resulting in increased network performance for Kubernetes workloads. | Pros: Support for network policies, high network performance, and SCTP support. Cons: No multicast support. |
You can deploy vSphere with Tanzu on various networking stacks, including:
VMware NSX-T Data Center Networking.
vSphere Virtual Distributed Switch (VDS) Networking with NSX Advanced Load Balancer.
Note: The scope of this discussion is limited to vSphere Networking (VDS) with NSX Advanced Load Balancer.
In a vSphere with Tanzu environment, a Supervisor Cluster configured with vSphere networking uses distributed port groups to provide connectivity to Kubernetes control plane VMs, services, and workloads. All hosts in a cluster that is enabled for vSphere with Tanzu are connected to the distributed switch, which provides connectivity to Kubernetes workloads and control plane VMs.
You can use one or more distributed port groups as Workload Networks. The network that provides connectivity to the Kubernetes Control Plane VMs is called Primary Workload Network. You can assign this network to all the namespaces on the Supervisor Cluster, or you can use different networks for each namespace. The Tanzu Kubernetes clusters connect to the Workload Network that is assigned to the namespace.
The Supervisor Cluster leverages NSX Advanced Load Balancer (NSX ALB) to provide L4 load balancing for the Tanzu Kubernetes clusters control-plane HA. Users access the applications by connecting to the Virtual IP address (VIP) of the applications provisioned by NSX Advanced Load Balancer.
The following diagram shows a general overview for vSphere with Tanzu on vSphere Networking.
NSX Advanced Load Balancer is deployed in write access mode in a vSphere environment. This mode grants NSX Advanced Load Balancer Controllers full write access to the vCenter, which helps in automatically creating, modifying, and removing SEs and other resources as needed to adapt to changing traffic needs. The following are the core components of NSX Advanced Load Balancer:
NSX Advanced Load Balancer Controller: The NSX Advanced Load Balancer Controller manages Virtual Service objects and interacts with the vCenter Server infrastructure to manage the lifecycle of the Service Engines (SEs). It is the central repository for the configurations and policies related to services and management, and it provides the portal for viewing the health of virtual services and SEs and the associated analytics that NSX Advanced Load Balancer provides.
NSX Advanced Load Balancer Service Engine: NSX Advanced Load Balancer Service Engines (SEs) are lightweight VMs that handle all data plane operations by receiving and executing instructions from the controller. The SEs perform load balancing and all client and server-facing network interactions.
Avi Kubernetes Operator (AKO): Avi Kubernetes Operator is a Kubernetes operator that runs as a pod in the Supervisor Cluster. It provides ingress and load balancing functionality. Avi Kubernetes Operator translates the required Kubernetes objects to NSX Advanced Load Balancer objects and automates the implementation of ingresses/routes/services on the Service Engines (SE) via the NSX Advanced Load Balancer Controller.
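For example, exposing an application in a workload cluster with a standard Kubernetes Service of type LoadBalancer is all that is needed for the load balancer integration to allocate a VIP and program an L4 virtual service on the Service Engines; no NSX Advanced Load Balancer specific fields are required in the manifest (the service and selector names below are hypothetical).

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web-frontend        # hypothetical application service
spec:
  type: LoadBalancer        # realized as an L4 virtual service with a VIP from the data network
  selector:
    app: web-frontend
  ports:
    - port: 80
      targetPort: 8080
      protocol: TCP
```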
Each environment configured in NSX Advanced Load Balancer is referred to as a cloud. Each cloud in NSX Advanced Load Balancer maintains networking and NSX Advanced Load Balancer Service Engine settings. Each cloud is configured with one or more VIP networks to provide IP addresses to L4 load balancing virtual services created under that cloud.
Virtual services can span multiple Service Engines if the associated Service Engine Group is configured in Active/Active HA mode. A Service Engine can belong to only one Service Engine Group at a time.
IP address allocation for virtual services can be over DHCP or via the NSX Advanced Load Balancer built-in IPAM functionality. The VIP networks created or configured in NSX Advanced Load Balancer are associated with the IPAM profile.
To deploy vSphere with Tanzu, build separate networks for the Tanzu Kubernetes Grid management (Supervisor) cluster, Tanzu Kubernetes Grid workload clusters, NSX Advanced Load Balancer components, and the Tanzu Kubernetes Grid control plane HA.
The network reference design can be mapped into this general framework.
Note: The network/portgroup designated for the workload cluster carries both data and control traffic. Firewalls cannot be used to segregate traffic between workload clusters; instead, the underlying CNI must be employed as the main filtering mechanism. Antrea CNI provides Custom Resource Definitions (CRDs) for firewall rules that can be enforced before Kubernetes network policies are applied.
Based on your requirements, you can create additional networks for your workload clusters. These networks are also referred to as vSphere with Tanzu workload secondary networks.
This topology enables the following benefits:
Isolate and separate SDDC management components (vCenter, ESX) from the vSphere with Tanzu components. This reference design allows only the minimum connectivity between the Tanzu Kubernetes Grid clusters and NSX Advanced Load Balancer to the vCenter Server.
Isolate and separate the NSX Advanced Load Balancer management network from the supervisor cluster network and the Tanzu Kubernetes Grid workload networks.
Separate vSphere Admin and Tenant access to the supervisor cluster. This prevents tenants from attempting to connect to the supervisor cluster.
Allow tenants to access only their own workload cluster(s) and restrict access to this cluster from other tenants. This separation can be achieved by assigning permissions to the supervisor namespaces.
Depending on the workload cluster type and use case, multiple workload clusters may leverage the same workload network or new networks can be used for each workload cluster.
Network Requirements
As per the reference architecture, the list of required networks is as follows:
Network Type | DHCP Service | Description |
---|---|---|
NSX Advanced Load Balancer Management Network | Optional | NSX Advanced Load Balancer controllers and SEs will be attached to this network. |
TKG Management Network | Optional | Supervisor Cluster nodes will be attached to this network. |
TKG Workload Network (Primary) | Optional | Control plane and worker nodes of TKG workload clusters will be attached to this network. The second interface of the Supervisor nodes is also attached to this network. |
TKG Cluster VIP/Data Network | No | Virtual Services (L4) for Control plane HA of all TKG clusters (Supervisor and Workload). Reserve sufficient IPs depending on the number of TKG clusters planned to be deployed in the environment. |
For the purpose of demonstration, this document uses the following subnet CIDRs for the TKO deployment.
Network Type | Segment Name | Gateway CIDR | DHCP Pool | NSX Advanced Load Balancer IP Pool |
---|---|---|---|---|
NSX Advanced Load Balancer Mgmt Network | NSX-Advanced Load Balancer-Mgmt | 192.168.10.1/27 | NA | 192.168.10.14 - 192.168.10.30 |
Supervisor Cluster Network | TKG-Management | 192.168.40.1/28 | 192.168.40.2 - 192.168.40.14 | NA |
TKG Workload Primary Network | TKG-Workload-PG01 | 192.168.60.1/24 | 192.168.60.2 - 192.168.60.251 | NA |
TKG Cluster VIP/Data Network | TKG-Cluster-VIP | 192.168.80.1/26 | NA | SE Pool: 192.168.80.2 - 192.168.80.20 TKG Cluster VIP Range: 192.168.80.21 - 192.168.80.60 |
To prepare the firewall, gather the following information: client machine IP addresses, the vCenter Server IP, the TKG cluster VIP range, the TKG management and workload network CIDRs, the NSX Advanced Load Balancer management network and Controller node IPs, the image registry FQDN/IP (if private), and the DNS and NTP server IPs.
The following table provides a list of firewall rules based on the assumption that there is no firewall within a subnet/VLAN.
Source | Destination | Protocol:Port | Description |
---|---|---|---|
Client Machine | NSX Advanced Load Balancer Controller Nodes and VIP | TCP:443 | Access NSX Advanced Load Balancer portal for configuration. |
Client Machine | vCenter Server | TCP:443 | Access and configure WCP in vCenter. |
Client Machine | TKG Cluster VIP Range | TCP:6443, TCP:443, TCP:80 | TKG cluster access, HTTPS workload access, and HTTP workload access. |
Client Machine (optional) | *.tmc.cloud.vmware.com, console.cloud.vmware.com | TCP:443 | Access the TMC portal, and so on. |
TKG Management and Workload Cluster CIDR | DNS Server, NTP Server | TCP/UDP:53, UDP:123 | DNS service and time synchronization. |
TKG Management Cluster CIDR | vCenter IP | TCP:443 | Allow components to access vCenter to create VMs and Storage Volumes. |
TKG Management and Workload Cluster CIDR | NSX Advanced Load Balancer controller nodes | TCP:443 | Allow Avi Kubernetes Operator (AKO) and AKO Operator (AKOO) access to NSX Advanced Load Balancer Controller. |
TKG Management and Workload Cluster CIDR | TKG Cluster VIP Range | TCP:6443 | Allow Supervisor cluster to configure workload clusters. |
TKG Management and Workload Cluster CIDR | Image Registry (Harbor) (If Private) | TCP:443 | Allow components to retrieve container images. |
TKG Management and Workload Cluster CIDR | wp-content.vmware.com, *.tmc.cloud.vmware.com, projects.registry.vmware.com | TCP:443 | Sync content library, pull TKG binaries, and interact with TMC. |
TKG Management cluster CIDR | TKG Workload Cluster CIDR | TCP:6443 | VM Operator and TKC VM communication. |
TKG Workload Cluster CIDR | TKG Management Cluster CIDR | TCP:6443 | Allow the TKG workload cluster to register with the Supervisor cluster. |
NSX Advanced Load Balancer Management Network | vCenter and ESXi Hosts | TCP:443 | Allow NSX Advanced Load Balancer to discover vCenter objects and deploy SEs as required. |
NSX Advanced Load Balancer Controller Nodes | DNS Server, NTP Server | TCP/UDP:53, UDP:123 | DNS service and time synchronization. |
TKG Cluster VIP Range | TKG Management Cluster CIDR | TCP:6443 | To interact with the Supervisor cluster. |
TKG Cluster VIP Range | TKG Workload Cluster CIDR | TCP:6443, TCP:443, TCP:80 | To interact with workload clusters and Kubernetes applications. |
vCenter Server | TKG Management Cluster CIDR | TCP:443, TCP:6443, TCP:22 (optional) | |
Note: For TMC, if the firewall does not allow wildcards, all IP addresses of [account].tmc.cloud.vmware.com and extensions.aws-usw2.tmc.cloud.vmware.com need to be whitelisted.
Starting with vSphere 8, when you enable vSphere with Tanzu, you can configure either a one-zone Supervisor mapped to one vSphere cluster or a three-zone Supervisor mapped to three vSphere clusters.
A Supervisor deployed on a single vSphere cluster has three control plane VMs, which reside on the ESXi hosts that are part of the cluster. A single zone is created for the Supervisor automatically, or you can use a zone that is created in advance. In a single-zone deployment, cluster-level high availability is provided through vSphere HA, and you can scale the vSphere with Tanzu setup by adding physical hosts to the vSphere cluster that maps to the Supervisor. You can run workloads through vSphere Pods, Tanzu Kubernetes Grid clusters, and VMs when the Supervisor is enabled with the NSX networking stack.
Configure each vSphere cluster as an independent failure domain and map it to a vSphere zone. In a three-zone deployment, all three vSphere clusters become one Supervisor, providing high availability and protection against the failure of an entire vSphere cluster.
For more information, see Supervisor Architecture and Components.
A vSphere with Tanzu deployment starts with deploying the Supervisor Cluster (enabling Workload Management). The deployment is done directly from the vCenter user interface (UI). The Get Started page lists the prerequisites for the deployment.
In the current version, the vCenter UI lets you install vSphere with Tanzu with either the VDS networking stack or NSX-T Data Center as the networking solution.
This installation process takes you through the steps of deploying the Supervisor Cluster in your vSphere environment. Once the Supervisor Cluster is deployed, you can use either Tanzu Mission Control or the kubectl utility to deploy Tanzu Kubernetes shared services and workload clusters.
The following table provides recommendations for configuring NSX Advanced Load Balancer in a vSphere with Tanzu environment.
Decision ID | Design Decision | Design Justification | Design Implications |
---|---|---|---|
TKO-Advanced Load Balancer-001 | Deploy NSX Advanced Load Balancer controller cluster nodes on a network dedicated to NSX-Advanced Load Balancer. | To isolate NSX Advanced Load Balancer traffic from infrastructure management traffic and Kubernetes workloads. | Allows for ease of management of controllers. Additional Network (VLAN) is required. |
TKO-Advanced Load Balancer-002 | Deploy 3 NSX Advanced Load Balancer controllers nodes. | To achieve high availability for the NSX Advanced Load Balancer platform. In clustered mode, NSX Advanced Load Balancer availability is not impacted by an individual controller node failure. The failed node can be removed from the cluster and redeployed if recovery is not possible. | Clustered mode requires more compute and storage resources. |
TKO-Advanced Load Balancer-003 | Configure vCenter settings in Default-Cloud. | Using a non-default vCenter cloud is not supported with vSphere with Tanzu. | Using a non-default cloud can lead to deployment failures. |
TKO-Advanced Load Balancer-004 | Use static IPs for the NSX Advanced Load Balancer controllers if DHCP cannot guarantee a permanent lease. | The NSX Advanced Load Balancer Controller cluster uses management IP addresses to form and maintain quorum for the control plane cluster. Any changes would be disruptive. | The NSX Advanced Load Balancer control plane might go down if the management IPs of the controller nodes change. |
TKO-Advanced Load Balancer-005 | Use NSX Advanced Load Balancer IPAM for Service Engine data network and virtual service IP assignment. | Guarantees IP address assignment for Service Engine data NICs and virtual services. | Removes the corner-case scenario where the DHCP server runs out of leases or is down. |
TKO-Advanced Load Balancer-006 | Reserve an IP in the NSX Advanced Load Balancer management subnet to be used as the Cluster IP for the Controller Cluster. | NSX Advanced Load Balancer portal is always accessible over Cluster IP regardless of a specific individual controller node failure. | NSX Advanced Load Balancer administration is not affected by an individual controller node failure. |
TKO-Advanced Load Balancer-007 | Use default Service Engine Group for load balancing of TKG clusters control plane. | Using a non-default Service Engine Group for hosting L4 virtual service created for TKG control plane HA is not supported. | Using a non-default Service Engine Group can lead to Service Engine VM deployment failure. |
TKO-Advanced Load Balancer-008 | Share Service Engines for the same type of workload (dev/test/prod) clusters. | Minimize the licensing cost. | Each Service Engine contributes to the CPU core capacity associated with a license. Sharing Service Engines can help reduce the licensing cost. |
TKO-Advanced Load Balancer-009 | Configure anti-affinity rules for the NSX ALB controller cluster. | Ensures that no two controllers end up on the same ESXi host, avoiding a single point of failure. | Anti-affinity rules need to be created manually. |
TKO-Advanced Load Balancer-010 | Configure backup for the NSX ALB Controller cluster. | Backups are required if the NSX ALB Controller becomes inoperable or if the environment needs to be restored from a previous state. | To store backups, an SCP-capable backup location is needed. SCP is the only supported protocol currently. |
TKO-Advanced Load Balancer-011 | Perform the initial setup on only one of the three deployed NSX ALB controller VMs when creating the NSX ALB controller cluster. | The NSX ALB controller cluster is created from an initialized NSX ALB controller, which becomes the cluster leader. Follower NSX ALB controller nodes need to be uninitialized to join the cluster. | NSX ALB controller cluster creation fails if more than one NSX ALB controller is initialized. |
TKO-Advanced Load Balancer-012 | Configure remote logging for the NSX ALB Controller to send events to syslog. | For operations teams to centrally monitor NSX ALB and escalate alerts, events must be sent from the NSX ALB Controller. | Additional operational overhead. Additional infrastructure resources. |
TKO-Advanced Load Balancer-013 | Use LDAP/SAML-based authentication for NSX ALB. | Helps to maintain role-based access control. | Additional configuration is required. |
The following are the key network recommendations for a production-grade vSphere with Tanzu deployment:
Decision ID | Design Decision | Design Justification | Design Implications |
---|---|---|---|
TKO-NET-001 | Use separate networks for the Supervisor cluster and workload clusters. | To have flexible firewall and security policies. | Sharing the same network for multiple clusters can complicate the creation of firewall rules. |
TKO-NET-002 | Use distinct port groups for network separation of K8s workloads. | Isolate production Kubernetes clusters from dev/test clusters by placing them on distinct port groups. | Network mapping is done at the namespace level. All Kubernetes clusters created in a namespace connect to the same port group. |
TKO-NET-003 | Use routable networks for Tanzu Kubernetes clusters. | Allow connectivity between the TKG clusters and infrastructure components. | Networks that are used for Tanzu Kubernetes cluster traffic must be routable between each other and the Supervisor Cluster Management Network. |
Decision ID | Design Decision | Design Justification | Design Implications |
---|---|---|---|
TKO-TKGS-001 | Create a Subscribed Content Library. | A Subscribed Content Library can automatically pull the latest OVAs used by the Tanzu Kubernetes Grid Service to build cluster nodes. Using a Subscribed Content Library facilitates template management, as new versions can be pulled by initiating the library sync. | A Local Content Library would require manual upload of images and is suitable for air-gapped or Internet-restricted environments. |
TKO-TKGS-002 | Deploy Supervisor cluster control plane nodes in large form factor. | The large form factor should suffice to integrate the Supervisor cluster with TMC and a Velero deployment. | Consumes more resources from the infrastructure. |
TKO-TKGS-003 | Register the Supervisor cluster with Tanzu Mission Control. | Tanzu Mission Control automates the creation of Tanzu Kubernetes clusters and manages the life cycle of all clusters centrally. | Requires outbound connectivity to the internet for TMC registration. |
Note: SaaS endpoints here refer to Tanzu Mission Control, Tanzu Service Mesh, and Tanzu Observability.
Decision ID | Design Decision | Design Justification | Design Implications |
---|---|---|---|
TKO-TKC-001 | Deploy Tanzu Kubernetes clusters with prod plan and multiple worker nodes. | The prod plan provides high availability for the control plane. | Consumes more resources from the infrastructure. |
TKO-TKC-002 | Use guaranteed VM class for Tanzu Kubernetes clusters. | Guarantees compute resources are always available for containerized workloads. | Could prevent automatic migration of nodes by DRS. |
TKO-TKC-003 | Implement RBAC for Tanzu Kubernetes clusters. | To avoid the usage of administrator credentials for managing the clusters. | External AD/LDAP needs to be integrated with vCenter or SSO groups need to be created manually. |
TKO-TKC-004 | Deploy Tanzu Kubernetes clusters from Tanzu Mission Control. | Tanzu Mission Control provides life-cycle management for the Tanzu Kubernetes clusters and automatic integration with Tanzu Service Mesh and Tanzu Observability. | Only the Antrea CNI is supported on workload clusters created from the TMC portal. |
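Putting TKO-TKC-001 and TKO-TKC-002 together, a production-grade cluster topology might look like the following excerpt. The VM class names, storage class name, and replica counts are illustrative, and the field layout assumes the same TanzuKubernetesCluster API version as in the earlier examples.

```yaml
# Excerpt of a TanzuKubernetesCluster topology reflecting the recommendations above:
# an HA control plane and guaranteed VM classes that reserve CPU and memory.
spec:
  topology:
    controlPlane:
      replicas: 3                               # highly available control plane
      vmClass: guaranteed-medium                # guaranteed class for control plane nodes
      storageClass: vsan-gold-storage-policy    # hypothetical storage class
    nodePools:
    - name: workers
      replicas: 3
      vmClass: guaranteed-large                 # guaranteed class for workload nodes
      storageClass: vsan-gold-storage-policy
```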
vSphere with Tanzu does not ship with a default ingress controller. Any Tanzu-supported ingress controller can be used.
One example of an ingress controller is Contour, an open-source controller for Kubernetes ingress routing. Contour is part of a Tanzu package and can be installed on any Tanzu Kubernetes cluster. Deploying Contour is a prerequisite for deploying Prometheus, Grafana, and Harbor on a workload cluster.
For more information about Contour, see the Contour site and Implementing Ingress Control with Contour.
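As a sketch of what the package-based installation looks like, the tanzu CLI (tanzu package install) creates a kapp-controller PackageInstall object similar to the one below on the target cluster. The package version, namespace, service account, and values secret names are placeholders; check the package versions available in your environment before installing.

```yaml
# Hypothetical sketch of the PackageInstall that a "tanzu package install contour"
# command creates; names and version are placeholders.
apiVersion: packaging.carvel.dev/v1alpha1
kind: PackageInstall
metadata:
  name: contour
  namespace: tkg-packages                     # hypothetical namespace for package installs
spec:
  serviceAccountName: contour-package-sa      # account with rights to create Contour resources
  packageRef:
    refName: contour.tanzu.vmware.com
    versionSelection:
      constraints: "1.22.3+vmware.1-tkg.1"    # placeholder package version
  values:
  - secretRef:
      name: contour-values                    # optional data values secret
```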
Tanzu Service Mesh also offers an Ingress controller based on Istio.
Each ingress controller has its own pros and cons. The following table provides general recommendations on when to use a specific ingress controller for your Kubernetes environment.
Ingress Controller | Use Cases |
---|---|
Contour | Use Contour when only north-south traffic is needed in a Kubernetes cluster. You can apply security policies for the north-south traffic by defining the policies in the manifest file for the application. Contour is a reliable solution for simple Kubernetes workloads. |
Istio | Use Istio ingress controller when you need to provide security, traffic direction, and insight within the cluster (east-west traffic) and between the cluster and the outside world (north-south traffic). |
Regardless of NSX Advanced Load Balancer Controller configuration, each controller cluster can achieve up to 5,000 virtual services; 5,000 is a hard limit. For more information, see Avi Controller Sizing.
Controller Size | VM Configuration | Virtual Services | Avi SE Scale |
---|---|---|---|
Essentials | 4 vCPUS, 24 GB RAM | 0-50 | 0-10 |
Small | 6 vCPUS, 24 GB RAM | 0-200 | 0-100 |
Medium | 10 vCPUS, 32 GB RAM | 200-1000 | 100-200 |
Large | 16 vCPUS, 48 GB RAM | 1000-5000 | 200-400 |
See Sizing Service Engines for guidance on sizing your SEs.
Performance metric | 1 vCPU core |
---|---|
Throughput | 4 Gb/s |
Connections/s | 40k |
SSL Throughput | 1 Gb/s |
SSL TPS (RSA2K) | ~600 |
SSL TPS (ECC) | 2500 |
Multiple performance vectors or features may have an impact on performance. For example, to achieve 1 Gb/s of SSL throughput and 2000 TPS of SSL with EC certificates, NSX Advanced Load Balancer recommends two cores.
NSX Advanced Load Balancer Service Engines may be configured with as little as 1 vCPU core and 2 GB RAM, or up to 64 vCPU cores and 256 GB RAM. It is recommended for a Service Engine to have at least 4 GB of memory when GeoDB is in use.
VMware Tanzu for Kubernetes Operations using vSphere with Tanzu includes Harbor as a container registry. Harbor is an open-source, trusted, cloud-native container registry that stores, signs, and scans content.
The initial configuration and setup of the platform does not require any external registry because the required images are delivered through vCenter. Customers can choose any existing registry and, if required, can deploy a Harbor registry for storing images.
When vSphere with Tanzu is deployed on VDS networking, you can deploy an external container registry (Harbor) for Tanzu Kubernetes clusters.
You may use one of the following methods to install Harbor:
Tanzu Kubernetes Grid Package deployment - VMware recommends this installation method for general use cases. The Tanzu packages, including Harbor, must either be pulled directly from VMware or be hosted in an internal registry.
VM-based deployment using OVA - VMware recommends this installation method in cases where Tanzu Kubernetes Grid is being installed in an air-gapped or Internet-restricted environment, and no pre-existing image registry exists to host the Tanzu Kubernetes Grid system images. VM-based deployments are only supported by VMware Global Support Services to host the system images for air-gapped or Internet-restricted deployments. Do not use this method for hosting application images.
When deploying Harbor with self-signed certificates or certificates signed by internal CAs, it is necessary for the Tanzu Kubernetes cluster to establish trust with the registry’s certificate. To do so, follow the procedure in Trust Custom CA Certificates on Cluster Nodes.
The SaaS products in the VMware Tanzu portfolio are on the critical path for securing systems at the heart of your IT infrastructure. VMware Tanzu Mission Control provides a centralized control plane for Kubernetes, and Tanzu Service Mesh provides a global control plane for service mesh networks. Tanzu Observability features include Kubernetes monitoring, application observability, and service insights.
To learn more about Tanzu Kubernetes Grid integration with Tanzu SaaS, see Tanzu SaaS Services.
Tanzu Observability provides various out-of-the-box dashboards. You can customize the dashboards for your particular deployment. For information about customizing Tanzu Observability dashboards for Tanzu for Kubernetes Operations, see Customize Tanzu Observability Dashboard for Tanzu for Kubernetes Operations.
vSphere with Tanzu on hyper-converged hardware offers high-performance potential and convenience and addresses the challenges of creating, testing, and updating on-premises Kubernetes platforms in a consolidated production environment. This validated approach results in a production installation with all the application services needed to serve combined or uniquely separated workload types via a combined infrastructure solution.
This plan meets many Day-0 needs for quickly aligning product capabilities to full-stack infrastructure, including networking, configuring firewall rules, load balancing, workload compute alignment, and other capabilities.
For instructions on how to deploy this reference design, see Deploy Tanzu for Kubernetes Operations using vSphere with Tanzu.