You can tolerate failures at the pod, process, virtual machine, and availability zone layers in VMware Enterprise PKS.

VMware Enterprise PKS provides the following levels of high availability:

  • BOSH Director and Kubernetes

    • Pod level (managed by Kubernetes)

    • Process level (managed by BOSH Director)

    • Virtual machine level (managed by BOSH Director)

  • Availability zones

BOSH Director VM and Service Recovery

VMware Enterprise PKS uses BOSH Director to monitor and resurrect Kubernetes processes and underlying virtual machines.

BOSH Director provides a built-in feature, the VM Resurrector Plugin. You can enable this plug-in to monitor virtual machines proactively and to repair failed virtual machines automatically.
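
In Ops Manager, this feature is typically enabled by selecting the Enable VM Resurrector Plugin checkbox in the BOSH Director tile. At the BOSH CLI level, resurrection can also be toggled globally. A minimal sketch, assuming a configured BOSH environment alias named pks (the alias name is illustrative):

    # Turn global VM resurrection on for the BOSH Director
    bosh -e pks update-resurrection on

    # List recent Director tasks; automatic "scan and fix" tasks appear
    # here when the Resurrector recreates a failed virtual machine
    bosh -e pks tasks --recent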

The VM Resurrector Plugin consists of the VM Resurrector and the BOSH Agent Process Monitor.

Figure 1. Components for VM and Service Recovery in BOSH Director


VM Resurrector

The VM Resurrector component monitors and heals the following virtual machines in a VMware Enterprise PKS deployment (a verification sketch follows this list):

  • Kubernetes cluster virtual machines, including both master and worker nodes

  • VMware Enterprise PKS control plane virtual machine
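
You can verify the state of the monitored virtual machines from the BOSH CLI. A minimal sketch, again assuming an environment alias named pks:

    # List all VMs across deployments with their process state and vitals;
    # the Resurrector recreates VMs whose BOSH agent stops responding
    bosh -e pks vms --vitals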

BOSH Agent Process Monitor

The process monitor restarts failed processes by interacting with the BOSH agent on each virtual machine that VMware Enterprise PKS provisions.
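
The BOSH agent delegates per-process supervision to Monit on each virtual machine. A minimal sketch of inspecting and restarting a monitored process over SSH, assuming a Kubernetes cluster deployment named service-instance_12345 and an instance group named master (both names are illustrative):

    # Show the Monit status of every process on the first master node
    bosh -e pks -d service-instance_12345 ssh master/0 \
      -c 'sudo /var/vcap/bosh/bin/monit summary'

    # Restart a single failed process instead of recreating the whole VM
    bosh -e pks -d service-instance_12345 ssh master/0 \
      -c 'sudo /var/vcap/bosh/bin/monit restart kube-apiserver'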

The following processes are monitored on the Kubernetes master node:

kube-apiserver
  • Standard Kubernetes component.

  • Component on the master node that exposes the Kubernetes API.

  • Front end for the Kubernetes control plane.

kube-controller-manager
  • Standard Kubernetes component.

  • Component on the master node that runs controllers.

kube-scheduler
  • Standard Kubernetes component.

  • Component on the master node that watches newly created pods that have no node assigned and selects a node for them to run on.

etcd
  • Standard Kubernetes component.

  • Consistent and highly available key-value store that serves as the backing store for all Kubernetes cluster data.

etcd_consistency_checker
  • VMware Enterprise PKS add-on component.

  • Verifies that the etcd key-value store database is consistent.

blackbox
  • Open source add-on component.

  • Runs on master and worker nodes.

  • Allows probing of endpoints over HTTP, HTTPS, DNS, TCP, and ICMP (see the probe sketch after this list).

bosh-dns
  • VMware Enterprise PKS add-on component.

  • Runs on master and worker nodes.

  • Maintains the fully qualified domain names (FQDNs) for all nodes in a Kubernetes cluster.

bosh-dns-healthcheck
  • VMware Enterprise PKS add-on component.

  • Runs on master and worker nodes.

  • Maintains local DNS database consistency.
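
The blackbox process listed above matches the description of the open-source Prometheus blackbox exporter, which exposes an HTTP probe endpoint. A minimal probe sketch, assuming the exporter listens on its default port 9115 and has the common http_2xx module configured (port, module, and target are assumptions):

    # Ask the blackbox exporter on the node to probe the Kubernetes API
    # health endpoint over HTTPS; the response includes probe_success
    # and per-phase timing metrics
    curl 'http://127.0.0.1:9115/probe?module=http_2xx&target=https://127.0.0.1:8443/healthz'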

The following processes are monitored on the Kubernetes worker node (a status sketch follows this list):

docker
  • Standard Docker runtime engine.

kubelet
  • Standard Kubernetes component.

  • Agent that runs on each node in the cluster and ensures that containers are running in a pod.

kube-proxy
  • Standard Kubernetes component.

  • Enables Kubernetes service abstraction by maintaining network rules on the host and performing connection forwarding.

blackbox
  • Open source add-on component.

  • Runs on master and worker nodes.

  • Allows probing of endpoints over HTTP, HTTPS, DNS, TCP, and ICMP.

ovsdb-server
  • VMware Enterprise PKS add-on component.

  • A lightweight database server that ovs-vswitchd queries to obtain its configuration.

ovs-vswitchd
  • VMware Enterprise PKS add-on component.

  • The OVS daemon that implements the virtual switch on the worker node.

bosh-dns
  • VMware Enterprise PKS add-on component.

  • Runs on master and worker nodes.

  • Maintains the fully qualified domain names (FQDNs) for all nodes in a Kubernetes cluster.

bosh-dns-healthcheck
  • VMware Enterprise PKS add-on component.

  • Runs on master and worker nodes.

  • Maintains local DNS database consistency.
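
You can review the state of all of these monitored processes without logging in to the individual virtual machines. A minimal sketch, using the illustrative environment and deployment names from earlier:

    # List every instance in the cluster deployment together with the
    # Monit state of each monitored process (kubelet, docker, and so on)
    bosh -e pks -d service-instance_12345 instances --ps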

Table 1. Design Decisions on Using BOSH VM Resurrector

Decision ID: PKS-CS-AZ-001
Design Decision: Enable the VM Resurrector Plugin feature for BOSH Director.
Design Justification: Provides automatic monitoring and self-repair of failed virtual machines that VMware Enterprise PKS provisions.
Design Implications: None.

Availability Zones

As a fourth level of availability for Kubernetes, VMware Enterprise PKS supports availability zones. If a failure in an availability zone occurs, the Kubernetes cluster remains online.

Availability zones are defined in PCF Ops Manager, where each zone is associated with either a vSphere cluster or a resource pool under a vCenter Server instance. Using multiple availability zones together with load balancing improves the availability of Kubernetes cluster deployments and cloud native workloads.
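
Behind the scenes, each availability zone that you define in PCF Ops Manager becomes an entry in the BOSH cloud config. A minimal sketch of inspecting that mapping, where the zone, data center, cluster, and resource pool names are illustrative:

    # Download the current cloud config from the BOSH Director
    bosh -e pks cloud-config
    #
    # An excerpt of the resulting vSphere mapping might look like this:
    # azs:
    # - name: az-mgmt
    #   cloud_properties:
    #     datacenters:
    #     - name: sfo01-dc01
    #       clusters:
    #       - sfo01-w01-comp01:
    #           resource_pool: pks-management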

This design calls for a minimum of four availability zones. The first availability zone supports the VMware Enterprise PKS management workloads: PCF Ops Manager, BOSH Director, the VMware Enterprise PKS control plane, and VMware Harbor Registry. The remaining three availability zones support end-user workloads, that is, Kubernetes clusters including the master and worker node objects for cloud native workloads.

Kubernetes worker nodes handle workloads and can be scaled horizontally to support an increased load. The master node and its associated etcd key-value store maintain the state of the cluster, the API endpoints, and cluster management. If more than one instance of an application runs, the instances are balanced across the worker nodes in all availability zones that are assigned to the Kubernetes cluster.

Figure 2. Availability Design for VMware Enterprise PKS

In a typical Kubernetes cluster, a single active master-etcd pair exists. If the master node is down, you cannot deploy additional containers, scale the environment, or receive telemetry from the Kubernetes cluster.

VMware Enterprise PKS supports multi-master, or highly available, Kubernetes clusters: you can deploy multiple master-etcd nodes per cluster across multiple availability zones. Only a single master is active at a time. If the active master node or its availability zone fails, VMware Enterprise PKS elects one of the master nodes in another availability zone as active to restore the state of the cluster.
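
The number of master/etcd nodes and their placement across availability zones come from the plan that you configure in the VMware Enterprise PKS tile. A minimal sketch of creating a cluster from such a plan, where the plan name, hostname, and node count are illustrative:

    # Create a cluster from a plan configured with three master/etcd nodes
    # spread across the three compute availability zones
    pks create-cluster prod-cluster \
      --external-hostname prod-cluster.pks.example.com \
      --plan multi-master \
      --num-nodes 6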

Table 2. Design Decisions on Availability of VMware Enterprise PKS

Decision ID: PKS-CS-AZ-002
Design Decision: Define the availability zone for VMware Enterprise PKS management workloads as a resource pool in the Shared Edge and Compute vSphere cluster.
Design Justification: Allows for the consolidation of NSX-T Edge Transport Node VMs and VMware Enterprise PKS workloads in the same vSphere cluster, while ensuring adequate resources for processing network traffic through the NSX-T Edge Transport Node VMs if contention occurs.
Design Implication: During contention, workloads in a resource pool might lack resources and experience poor performance. Monitoring and capacity management must therefore be proactive activities.

Decision ID: PKS-CS-AZ-003
Design Decision: Define availability zones that map to the root of each of the three compute clusters for the Kubernetes master, worker, and pod deployments.
Design Justification: Defining each end-user-facing availability zone as a vSphere cluster aligns availability zones with logical compute failure domains. If an outage impacts all compute resources within one vSphere cluster, the other availability zones are not affected.
Design Implication: Delineating availability zones based on vSphere clusters significantly increases the number of ESXi hosts needed for a three availability zone deployment.