You can deploy a generic Kubernetes cluster and persistent volumes on vSAN stretched clusters. You can deploy multiple Kubernetes clusters with different storage requirements in the same vSAN stretched cluster.

vSAN stretched clusters support file volumes backed by vSAN file shares. For more information, see Provisioning File Volumes with vSphere Container Storage Plug-in.

Prerequisites

When you plan to configure a Kubernetes cluster on a vSAN stretched cluster, consider the following items:
  • A generic Kubernetes cluster does not enforce the same storage policy on the node VMs and on the persistent volumes. The vSphere administrator is responsible for correctly configuring, assigning, and using storage policies within the Kubernetes clusters.
  • Use a VM storage policy with the same replication and site affinity settings for all storage objects in the Kubernetes cluster. Apply the same storage policy to all node VMs, both control plane and worker nodes, and to all persistent volumes.
  • You cannot use the topology feature to provision a volume that belongs to a specific fault domain within the vSAN stretched cluster.

Procedure

  1. Set up your vSAN stretched cluster.
    1. Create a vSAN stretched cluster.
      For more information, search for vSAN stretched cluster on the VMware vSAN Documentation site.
    2. Turn on DRS on the stretched cluster.
    3. Turn on vSphere HA.
      Make sure to set up Host Monitoring.
    4. Configure the host failure response, the response for host isolation, and VM monitoring.
      Note: VMware recommends that you disable VM Component Protection (VMCP) when all node VMs and volumes are deployed on the vSAN datastore:
      • Set Datastore with PDL to Disabled.
      • Set Datastore with APD to Disabled.
  2. Create a VM storage policy compliant with the vSAN stretched cluster requirements.
    1. Configure Site disaster tolerance.
      Select Dual site mirroring to have data mirrored at both sites of the stretched cluster.
      The screenshot shows options available for Site disaster tolerance.
    2. Specify Failures to tolerate.
      For the stretched cluster, this setting defines the number of disk or host failures a storage object can tolerate for each site. To tolerate n failures with mirroring, each site requires 2n + 1 fault domains, that is, hosts within the site. For example, tolerating one failure requires three hosts for each site.

      RAID 1 mirroring provides better performance. RAID 5 and RAID 6 achieve failure tolerance using parity blocks, which provides better space efficiency. RAID 5 and RAID 6 are available only on all-flash clusters.

      The screenshot shows options available for Failures to tolerate.

    3. Enable Force provisioning.
      The screenshot shows the Force provisioning option on the Advanced Policy Rules tab.
  3. Create VM-Host affinity rules to place Kubernetes nodes on a specific primary or secondary site, such as Site-A.
    The screenshot shows the VM/Host Rule dialog box.
    For information about affinity rules, see Create a VM-Host Affinity Rule in the vSphere Resource Management documentation.
  4. Deploy Kubernetes VMs using the vSAN stretched cluster storage policy.
  5. Create a storage class that uses the vSAN stretched cluster storage policy, as shown in the sketch after this procedure.
  6. Deploy persistent volumes using the vSAN stretched cluster storage class.
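
The following is a minimal sketch of steps 5 and 6 for the vSphere Container Storage Plug-in. The names vsan-stretched-sc, stretched-block-pvc, and the policy name vsan-stretched-policy are assumptions for illustration; substitute the storage policy you created in step 2 and names that fit your environment.

# Storage class that references the vSAN stretched cluster storage policy.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: vsan-stretched-sc                      # assumed name
provisioner: csi.vsphere.vmware.com
parameters:
  storagepolicyname: "vsan-stretched-policy"   # VM storage policy from step 2 (assumed name)
---
# Persistent volume claim that provisions a block volume through the
# storage class above.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: stretched-block-pvc                    # assumed name
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
  storageClassName: vsan-stretched-sc

Apply both manifests with kubectl apply -f and verify that the claim reaches the Bound state before deploying workloads that use it.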

What to do next

Depending on your needs and environment, you can use one of the following scenarios to deploy your Kubernetes cluster and workloads on the vSAN stretched cluster.

Deployment 1

In this deployment, the control plane and worker nodes are placed on the primary site, but can fail over to the other site if the primary site fails. You deploy HA Proxy on the primary site. This is also known as an Active-Passive deployment because only one site of the stretched vSAN cluster is used to deploy VMs.

If you plan to use file volumes (RWX volumes), configure the vSAN file service domain to place file servers on the active, or preferred, site. This reduces cross-site traffic latency and delivers better performance for applications that use file volumes.
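
For example, an application can request a file volume with a persistent volume claim that uses the ReadWriteMany access mode, assuming the vSAN file service is enabled. This is a minimal sketch; stretched-file-pvc is an assumed name, and vsan-stretched-sc refers to a storage class backed by the stretched cluster storage policy, as in the procedure above.

# PVC with ReadWriteMany access; the vSphere Container Storage Plug-in
# provisions such claims as file volumes backed by vSAN file shares.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: stretched-file-pvc                # assumed name
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 5Gi
  storageClassName: vsan-stretched-sc     # assumed storage class name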

Requirements for Deployment 1

  • Node placement:
    • The control plane and worker nodes are on the primary site. They can fail over to the other site if the primary site fails.
    • HA Proxy is on the primary site.
  • Failures to tolerate: At least FTT=1
  • DRS: Enabled
  • Site disaster tolerance: Dual site mirroring
  • Storage policy force provisioning: Enabled
  • vSphere HA: Enabled

Potential Failover Scenarios for Deployment 1

The following failover scenarios might occur when you use the Deployment 1 model.

Several ESXi hosts fail on the primary site.
  • Kubernetes node VMs move from unavailable hosts to the available hosts within the primary site.
  • If the worker node needs to be restarted, pods running on that node can be rescheduled and recreated on another node.
  • If the control plane node needs to be restarted, the existing application workload does not get affected.
The entire primary site and all hosts on the site fail.
  • Kubernetes node VMs move from the primary site to the secondary site.
  • You experience complete downtime until the node VMs restart on the secondary site.
Several hosts fail on the secondary site.
  • The failure does not affect the Kubernetes cluster because the entire cluster is at the primary site.
The entire secondary site and all hosts on the site fail.
  • The failure does not affect the Kubernetes cluster because the entire cluster is at the primary site.
  • Replication for storage objects stops because the secondary site is not available.
Intersite network failure occurs.
  • The failure does not affect the Kubernetes cluster because the entire cluster is at the primary site.
  • Replication for storage objects stops because the secondary site is not available.

Deployment 2

In this model, you place the control plane nodes on the primary site, while worker nodes can be spread across the primary and secondary sites. You deploy HA Proxy on the primary site.

Requirements for Deployment 2

  • Node placement:
    • The control plane nodes are on the primary site.
    • Worker nodes are spread across the primary and secondary sites.
    • HA Proxy is on the primary site.
  • Failures to tolerate: At least FTT=1
  • DRS: Enabled
  • Site disaster tolerance: Dual site mirroring
  • Storage policy force provisioning: Enabled
  • vSphere HA: Enabled

Potential Failover Scenarios for Deployment 2

The following failover scenarios might occur when you deploy a Kubernetes cluster using the Deployment 2 model.

Several ESXi hosts fail on the primary site.
  • Kubernetes node VMs move from unavailable hosts to the available hosts within the same site. If resources are not available, they move to another site.
  • If the worker node needs to be restarted, pods running on that node might be rescheduled and recreated on another node.
  • If the control plane node needs to be restarted, the existing application workload does not get affected.
The entire primary site and all hosts on the site fail.
  • Kubernetes control plane node VMs and the worker nodes present on the primary site move to the secondary site.
  • Expect control plane downtime until the control plane nodes restart on the secondary site.
  • Expect partial downtime for pods running on the worker nodes on the primary site.
  • Pods deployed on the worker nodes on the secondary site are not affected.
Several hosts fail on the secondary site.
  • Node VMs and pods running on the node VMs restart on another host.
The entire secondary site and all hosts on the site fail.
  • Kubernetes control plane is unaffected.
  • Kubernetes worker nodes from the secondary site move to the primary site.
  • Pods deployed on the worker nodes on the secondary site are affected. They restart along with node VMs.
Intersite network failure occurs.
  • Kubernetes control plane is unaffected.
  • Kubernetes worker nodes move to the primary site.
  • Pods deployed on the worker nodes on the secondary site are affected. They restart along with node VMs.

Deployment 3

In this deployment model, you place two control plane nodes on the primary site and one control plane node on the secondary site. You deploy HA Proxy on the primary site. Worker nodes can run on either site.

Requirements for Deployment 3

You can use this deployment model if you have equal resources at both the primary, or preferred, fault domain and the secondary, non-preferred, fault domain, and you want to use hardware located at both fault domains. Because both fault domains run part of the workload, this model enables faster recovery in case of a complete site failure.

This model requires specific DRS policy rules: one rule to specify affinity between two control plane nodes and the primary site, and another rule to specify affinity between the third control plane node and the secondary site.

  • Node placement:
    • Two control plane nodes are on the primary site.
    • One control plane node is on the secondary site.
    • HA Proxy is on the primary site.
    • Worker nodes can be on either site.
  • Failures to tolerate: At least FTT=1
  • DRS: Enabled
  • Site disaster tolerance: Dual site mirroring
  • Storage policy force provisioning: Enabled
  • vSphere HA: Enabled

Potential Failover Scenarios for Deployment 3

The following failover scenarios might occur when you use the Deployment 3 model.

Several ESXi hosts fail on the primary site.
  • Affected nodes restart on the available hosts on the primary site.
  • If both control plane nodes on the primary site reside on failed hosts, the control plane is down until both control plane nodes recover on the available hosts on the primary site.
  • While nodes are restarting on available hosts, pods might be rescheduled and recreated on available nodes.
The entire primary site and all hosts on the site fail.
  • Node VMs move from the primary site to the secondary site.
  • Expect downtime until the node VMs restart on the secondary site.
Several hosts fail on the secondary site.
  • Node VMs and pods running on the node VMs restart on another host.
  • If a control plane node on the secondary site is affected, the Kubernetes control plane remains unaffected. Kubernetes remains accessible through the two control plane nodes on the primary site.
The entire secondary site and all hosts on the site fail.
  • The control plane node and worker nodes from the secondary site are migrated to the primary site.
  • Pods deployed on the worker nodes on the secondary site are affected. They restart along with the node VMs.
Intersite network failure occurs.
  • Kubernetes control plane is unaffected.
  • Kubernetes nodes move to the primary site.
  • Pods deployed on the worker nodes on the secondary site are affected. They restart along with the node VMs.

Upgrade Kubernetes and Persistent Volumes on vSAN Stretched Clusters

If you already have Kubernetes deployments on a vSAN datastore, you can upgrade your deployments after enabling vSAN stretched clusters on the datastore.

Procedure

  1. Edit the existing VM storage policy used for provisioning volumes and node VMs on the vSAN cluster to add stretched cluster parameters.
  2. Apply the updated storage policy to all objects.
  3. Apply the updated storage policy to the persistent volumes that have the Out of date status.