Stretched clusters extend the vSAN cluster from a single data site to two sites for a better level of availability and intersite load balancing. Stretched clusters are typically deployed in environments where the distance between data centers is limited, such as metropolitan or campus environments.
You can use stretched clusters to manage planned maintenance and avoid disaster scenarios, because maintenance or loss of one site does not affect the overall operation of the cluster. In a stretched cluster configuration, both data sites are active sites. If either site fails, vSAN uses the storage on the other site. vSphere HA restarts any VM that must be restarted on the remaining active site.
You must designate one site as the preferred site. The other site becomes a secondary, or nonpreferred, site. If the network connection between the two active sites is lost, vSAN continues operation with the preferred site, unless the preferred site is resyncing or has another issue. In that case, the site that provides maximum data availability is the one that remains in operation.
A vSAN stretched cluster can tolerate one link failure at a time without data becoming unavailable. A link failure is a loss of network connection between the two sites or between one site and the witness host. During a site failure or loss of network connection, vSAN automatically switches to fully functional sites.
vSAN 7.0 Update 3 and later stretched clusters can tolerate a witness host failure when one site is unavailable. Configure the storage policy Site disaster tolerance rule to Site mirroring - stretched cluster. If one site is down due to maintenance or failure and the witness host fails, objects become non-compliant but remain accessible.
For more information about working with stretched clusters, see the vSAN Stretched Cluster Guide.
Each stretched cluster consists of two data sites and one witness host. The witness host resides at a third site and contains the witness components of virtual machine objects. The witness host does not store customer data, only metadata, such as the size and UUID of vSAN objects and components.
The witness host serves as a tiebreaker when a decision must be made regarding availability of datastore components when the network connection between the two sites is lost. In this case, the witness host typically forms a vSAN cluster with the preferred site. But if the preferred site becomes isolated from the secondary site and the witness, the witness host forms a cluster using the secondary site. When the preferred site is online again, data is resynchronized to ensure that both sites have the latest copies of all data.
If the witness host fails, all corresponding objects become noncompliant but are fully accessible.
The witness host has the following characteristics:
- The witness host can use low bandwidth/high latency links.
- The witness host cannot run VMs.
- A single witness host can support only one vSAN stretched cluster. However, two-node vSAN clusters can share a single witness host.
- The witness host uses one VMkernel adapter for management and one VMkernel adapter for vSAN data traffic. The witness host can have only one VMkernel adapter dedicated to vSAN, and that adapter must have vSAN traffic enabled and connectivity to all hosts in the cluster.
- The witness host must be a standalone host dedicated to the stretched cluster. It cannot be added to any other cluster or moved in inventory through vCenter Server.
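As a sketch, the vSAN traffic tagging described above can be checked or applied from the host's ESXi Shell; the adapter name vmk1 is an assumption for illustration:

```shell
# List the VMkernel adapters currently tagged for vSAN traffic
esxcli vsan network list

# Tag a VMkernel adapter for vSAN traffic (vmk1 is an assumed adapter name)
esxcli vsan network ip add -i vmk1
```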
The witness host can be a physical host or an ESXi host running inside a VM. The VM witness host does not provide other types of functionality, such as storing or running VMs. Multiple witness hosts can run as VMs on a single physical server. For patching and basic networking and monitoring configuration, the VM witness host works in the same way as a typical ESXi host. You can manage it with vCenter Server, patch it and update it by using esxcli or vSphere Lifecycle Manager, and monitor it with standard tools that interact with ESXi hosts.
You can use a witness virtual appliance as the witness host in a stretched cluster. The witness virtual appliance is an ESXi host in a VM, packaged as an OVF or OVA. Different appliances and different options are available, based on the vSAN architecture and the size of the deployment.
Stretched Clusters and Fault Domains
Stretched clusters use fault domains to provide redundancy and failure protection across sites. Each site in a stretched cluster resides in a separate fault domain.
A stretched cluster requires three fault domains: the preferred site, the secondary site, and a witness host. Each fault domain represents a separate site. When the witness host fails or enters maintenance mode, vSAN considers it a site failure.
The following storage policy rules affect objects on a stretched cluster:
- Site disaster tolerance. For stretched clusters, this rule defines the failure tolerance method. Select Site mirroring - stretched cluster.
- Failures to tolerate (FTT). For stretched clusters, FTT defines the number of additional host failures that a virtual machine object can tolerate.
- Data locality. You can set this rule to None, Preferred, or Secondary. This rule enables you to restrict virtual machine objects to a selected site in the stretched cluster.
In a stretched cluster with local fault protection, even when one site is unavailable, the cluster can perform repairs on missing or broken components in the available site.
vSAN 7.0 and later continues to serve I/O if any disk group or disks on one site reach 96 percent full or 5 GB free capacity (whichever is less), while disks on the other site have free space available. Components on the affected site are marked absent, and vSAN continues to perform I/O to healthy object copies on the other site. When the disks on the affected site return to 94 percent full or 10 GB free capacity (whichever is less), the absent components become available again. vSAN resyncs the available components, and all objects become policy compliant.
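The capacity thresholds above can be illustrated with a small shell sketch. This is not vSAN code: the 400 GB disk size is an assumption, as is the interpretation that reaching either limit (96 percent full or 5 GB free) pauses I/O to that site's copy.

```shell
#!/bin/sh
# Illustration only: when does vSAN stop serving I/O to a disk on one site?
capacity_gb=400   # assumed disk capacity
used_gb=390       # assumed used space
free_gb=$((capacity_gb - used_gb))
pct_used=$((used_gb * 100 / capacity_gb))   # integer percent used

if [ "$pct_used" -ge 96 ] || [ "$free_gb" -le 5 ]; then
  echo "pause: components on this site marked absent"
else
  echo "serving I/O"
fi
```

With the assumed numbers, the disk is about 97 percent full, so the first condition triggers and the site's components would be marked absent until usage drops back below the resume thresholds.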
Stretched Cluster Design Considerations
Consider these guidelines when working with a vSAN stretched cluster.
- Configure DRS settings for the stretched cluster.
- DRS must be enabled on the cluster. If you place DRS in partially automated mode, you can control which VMs to migrate to each site. vSAN 7.0 Update 2 and later enable you to operate DRS in fully automated mode and recover gracefully from network partitions.
- Create two host groups, one for the preferred site and one for the secondary site.
- Create two VM groups, one to hold the VMs on the preferred site and one to hold the VMs on the secondary site.
- Create two VM-Host affinity rules that map VMs-to-host groups, and specify which VMs and hosts reside in the preferred site and which VMs and hosts reside in the secondary site.
- Configure VM-Host affinity rules to perform the initial placement of VMs in the cluster.
- Configure HA settings for the stretched cluster.
- HA rule settings should respect VM-Host affinity rules during failover.
- Disable HA datastore heartbeats.
- Use HA with Host Failure Monitoring and Admission Control enabled, and set FTT to the number of hosts in each site.
- Stretched clusters require on-disk format 2.0 or later. If necessary, upgrade the on-disk format before configuring a stretched cluster. See "Upgrade vSAN Disk Format" in Administering VMware vSAN.
- Configure the FTT to 1 for stretched clusters.
- vSAN stretched clusters support enabling Symmetric Multiprocessing Fault Tolerance (SMP-FT) VMs when FTT is set to 0 and the data locality rule is set to either Preferred or Secondary. vSAN does not support SMP-FT VMs on a stretched cluster with FTT set to 1 or more.
- When a host is disconnected or not responding, you cannot add or remove the witness host. This limitation ensures that vSAN collects enough information from all hosts before initiating reconfiguration operations.
- Using esxcli to add or remove hosts is not supported for stretched clusters.
- Do not create snapshots of the witness host or back it up. If the witness host fails, replace it by configuring a new witness host for the cluster.
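To verify the on-disk format requirement above, you can inspect the vSAN storage devices claimed by each host; the exact field labels in the output vary by release:

```shell
# Show the vSAN storage devices on this host, including the on-disk format version
esxcli vsan storage list
```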
Best Practices for Working with Stretched Clusters
When working with vSAN stretched clusters, follow these recommendations for proper performance.
- If one of the sites (fault domains) in a stretched cluster is inaccessible, new VMs can still be provisioned in the subcluster containing the other two sites. These new VMs are implicitly force provisioned and are non-compliant until the partitioned site rejoins the cluster. This implicit force provisioning is performed only when two of the three sites are available. A site here refers to either a data site or the witness host.
- If an entire site goes offline due to a power outage or loss of network connection, restart the site as soon as possible. Instead of restarting vSAN hosts one by one, bring all hosts online at approximately the same time, ideally within a span of 10 minutes. By following this process, you avoid resynchronizing a large amount of data across the sites.
- If a host is permanently unavailable, remove the host from the cluster before you perform any reconfiguration tasks.
- If you want to clone a VM witness host to support multiple stretched clusters, do not configure the VM as a witness host before cloning it. First deploy the VM from OVF, then clone the VM, and configure each clone as a witness host for a different cluster. Or you can deploy as many VMs as you need from the OVF, and configure each one as a witness host for a different cluster.
Stretched Clusters Network Design
All three sites in a stretched cluster communicate across the management network and across the vSAN network. The VMs in both data sites communicate across a common virtual machine network.
- The management network requires connectivity across all three sites, using a Layer 2 stretched network or a Layer 3 network.
- The vSAN network requires connectivity across all three sites. It must have independent routing and connectivity between the data sites and the witness host. vSAN supports both Layer 2 and Layer 3 between the two data sites, and Layer 3 between the data sites and the witness host.
- The VM network requires connectivity between the data sites, but not the witness host. Use a Layer 2 stretched network or a Layer 3 network between the data sites. In the event of a failure, the VMs do not require a new IP address to work on the remote site.
- The vMotion network requires connectivity between the data sites, but not the witness host. Use a Layer 2 stretched network or a Layer 3 network between the data sites.
Using Static Routes on ESXi Hosts
Each ESXi host contains a default TCP/IP stack that has a single default gateway. The default route is typically associated with the management network TCP/IP stack.
The management network and the vSAN network might be isolated from one another. For example, the management network might use vmk0 on physical NIC 0, while the vSAN network uses vmk2 on physical NIC 1 (separate network adapters for two distinct TCP/IP stacks). This configuration implies that the vSAN network has no default gateway.
In vSAN 7.0 and later, you can override the default gateway for the vSAN VMkernel adapter on each host, and configure a gateway address for the vSAN network.
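As a sketch of this override from the ESXi Shell, assuming the vSAN adapter is vmk2 and using example addresses (the --gateway option on this command requires a recent ESXi release):

```shell
# Assign a static IPv4 address to the vSAN VMkernel adapter and override
# its default gateway (vmk2 and all addresses are illustrative examples)
esxcli network ip interface ipv4 set -i vmk2 -t static \
  -I 192.168.10.11 -N 255.255.255.0 -g 192.168.10.1
```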
You also can use static routes to communicate across networks. Consider a vSAN network that is stretched over two data sites on a Layer 2 broadcast domain (for example, 192.168.10.0) while the witness host is on another broadcast domain (for example, 172.30.0.0). If the VMkernel adapters on a data site try to connect to the vSAN network on the witness host, the connection fails because the default gateway on the ESXi host is associated with the management network. There is no route from the management network to the vSAN network.
Define a new routing entry that indicates which path to follow to reach a particular network. For a vSAN network on a stretched cluster, you can add static routes to ensure proper communication across all hosts.
For example, you can add a static route to the hosts on each data site, so that requests to reach the 172.30.0.0 witness network are routed through the 192.168.10.0 interface. Also add a static route to the witness host so that requests to reach the 192.168.10.0 network for the data sites are routed through the 172.30.0.0 interface.
Use the esxcli network ip route command to add static routes.
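For example, assuming a 172.30.0.0/16 witness network and a 192.168.10.0/24 data-site vSAN network (all subnets and gateway addresses here are illustrative):

```shell
# On each data-site host: route witness-network traffic through the local vSAN gateway
esxcli network ip route ipv4 add -n 172.30.0.0/16 -g 192.168.10.1

# On the witness host: route data-site traffic through its local gateway
esxcli network ip route ipv4 add -n 192.168.10.0/24 -g 172.30.0.1

# Verify the resulting routing table on each host
esxcli network ip route ipv4 list
```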