This procedure describes how to stretch a VxRail cluster across two availability zones.

In this example, two availability zones, AZ1 and AZ2, are located in separate buildings on an office campus. Each availability zone has its own power supply and network. The management domain is in AZ1 and contains the default cluster, SDDC-Cluster1. This cluster contains four ESXi hosts.

Network settings in this example:

vSAN network
  • VLAN ID: 1623
  • MTU: 9000
  • Network: 172.16.23.0
  • Netmask: 255.255.255.0
  • Gateway: 172.16.23.253
  • IP range: 172.16.23.11 - 172.16.23.59

vMotion network
  • VLAN ID: 1622
  • MTU: 9000
  • Network: 172.16.22.0
  • Netmask: 255.255.255.0
  • Gateway: 172.16.22.253
  • IP range: 172.16.22.11 - 172.16.22.59
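
Before stretching, it can help to confirm that each network's gateway and IP range fall inside the stated subnet. The following is a minimal sketch that uses Python's standard ipaddress module with the vMotion values from this example. The function name check_network is illustrative and is not part of any VMware or Dell tooling.

```python
import ipaddress


def check_network(network, netmask, gateway, range_start, range_end):
    """Return a list of problems found in one network definition."""
    net = ipaddress.ip_network(f"{network}/{netmask}")
    problems = []
    # The gateway and both ends of the IP range must belong to the subnet.
    for label, addr in (("gateway", gateway),
                        ("range start", range_start),
                        ("range end", range_end)):
        if ipaddress.ip_address(addr) not in net:
            problems.append(f"{label} {addr} is outside {net}")
    if ipaddress.ip_address(range_start) > ipaddress.ip_address(range_end):
        problems.append("range start is greater than range end")
    return problems


# vMotion network values from this example
vmotion_problems = check_network(
    "172.16.22.0", "255.255.255.0",
    "172.16.22.253", "172.16.22.11", "172.16.22.59",
)
print(vmotion_problems or "vMotion network definition is consistent")
```

The same function can be called with the vSAN values, or with the values planned for AZ2, to catch a mistyped octet before the workflow is started.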

There are four ESXi hosts in AZ2 that are not in the VMware Cloud Foundation inventory yet.

We will stretch the default cluster SDDC-Cluster1 in the management domain from AZ1 to AZ2.

Figure 1. Stretch Cluster Example

To stretch a cluster for VMware Cloud Foundation on Dell EMC VxRail, perform the following steps:

Prerequisites

  • Verify that vCenter Server is operational.
  • Verify that you have completed the Planning and Preparation Workbook with the management domain or VI workload domain deployment option included.
  • Verify that your environment meets the requirements listed in the Prerequisite Checklist sheet in the Planning and Preparation Workbook.
  • Ensure that you have enough hosts to place an equal number in each availability zone, so that sufficient resources remain if one availability zone goes down completely.
  • Deploy and configure a vSAN witness host. See Deploy and Configure vSAN Witness Host.
  • If you are stretching a cluster in a VI workload domain, the default vSphere cluster in the management domain must already be stretched.
  • Download initiate_stretch_cluster_vxrail.py.
Important: You cannot deploy an NSX Edge cluster on a vSphere cluster that is stretched. If you plan to deploy an NSX Edge cluster, you must do so before you execute the stretch cluster workflow.
Note: You cannot stretch a cluster in the following conditions:
  • The cluster uses static IP addresses for the NSX-T Data Center Host Overlay Network TEPs.
  • The cluster has a vSAN remote datastore mounted on it.
  • The cluster shares a vSAN Storage Policy with any other clusters.
  • The cluster is enabled for Workload Management (vSphere with Tanzu).
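
The host-count prerequisite and the four blocking conditions above can be captured as a simple pre-flight checklist. The following is a minimal sketch only: the ClusterFacts class and its field names are hypothetical, and the values must be filled in by hand from your own inventory, because the sketch does not query vCenter Server or SDDC Manager.

```python
from dataclasses import dataclass


@dataclass
class ClusterFacts:
    """Hypothetical, hand-filled summary of the cluster to be stretched."""
    az1_hosts: int
    az2_hosts: int
    uses_static_tep_ips: bool = False
    has_remote_vsan_datastore: bool = False
    shares_vsan_storage_policy: bool = False
    workload_management_enabled: bool = False


def stretch_blockers(facts):
    """Return the reasons, if any, that the cluster cannot be stretched."""
    blockers = []
    if facts.az1_hosts != facts.az2_hosts:
        blockers.append("host counts differ between availability zones")
    if facts.uses_static_tep_ips:
        blockers.append("static IP addresses are used for Host Overlay Network TEPs")
    if facts.has_remote_vsan_datastore:
        blockers.append("a vSAN remote datastore is mounted on the cluster")
    if facts.shares_vsan_storage_policy:
        blockers.append("the vSAN Storage Policy is shared with other clusters")
    if facts.workload_management_enabled:
        blockers.append("Workload Management is enabled on the cluster")
    return blockers


# The example in this topic: four hosts in AZ1 and four hosts in AZ2.
example = ClusterFacts(az1_hosts=4, az2_hosts=4)
print(stretch_blockers(example) or "No blockers found")
```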

Procedure

  1. Using an SSH File Transfer tool, copy initiate_stretch_cluster_vxrail.py to the /home/vcf/ directory on the SDDC Manager appliance.
  2. Using SSH, log in to the SDDC Manager appliance with the user name vcf and the password you specified in the deployment parameter workbook.
  3. Run the script with the -h option for details about the script options.
    python initiate_stretch_cluster_vxrail.py -h 
  4. Run the following command to prepare the cluster to be stretched. The command creates affinity rules for the VMs to run on the preferred site:
    python initiate_stretch_cluster_vxrail.py --workflow prepare-stretch --sc-domain <SDDC-valid-domain-name> --sc-cluster <valid-cluster-name> 
    Replace <SDDC-valid-domain-name> and <valid-cluster-name> with the correct values for your environment. For example:
    python initiate_stretch_cluster_vxrail.py --workflow prepare-stretch --sc-domain wdc1-workflowspec-vxrail --sc-cluster VxRail-Virtual-SAN-Cluster-8d2c9f37-e230-4238-ab35-cafd5033a59e 
    Enter the SSO user name and password when prompted to do so.
    Once the workflow is triggered, track the task status in the SDDC Manager UI. If the task fails, debug and fix the issue and retry the task from the SDDC Manager UI. Do not run the script again.
  5. Use the VxRail vCenter plug-in to add the hosts in Availability Zone 2 to the cluster by performing the VxRail Manager cluster expansion workflow.
  6. Run the following command to stretch the cluster:
    python initiate_stretch_cluster_vxrail.py --workflow stretch-vsan --sc-domain <SDDC-valid-domain-name> --sc-cluster <valid cluster name which is a part of the domain to be stretched> --sc-hosts <valid host names> --witness-host-fqdn <witness host/appliance IP or fqdn> --witness-vsan-ip <witness vsan IP address> --witness-vsan-cidr <witness-vsan-network-IP-address-with-mask> 
    Replace <SDDC-valid-domain-name>, <valid cluster name which is a part of the domain to be stretched>, <valid host names>, <witness host/appliance IP or fqdn>, <witness vsan IP address>, and <witness-vsan-network-IP-address-with-mask> with the correct values for your environment. For example:
    python initiate_stretch_cluster_vxrail.py --workflow stretch-vsan --sc-domain wdc1-workflowspec-vxrail --sc-cluster VxRail-Virtual-SAN-Cluster-8d2c9f37-e230-4238-ab35-cafd5033a59e --sc-hosts wdc3-005-proxy.vxrail.local --witness-host-fqdn 172.16.10.235 --witness-vsan-ip 172.16.20.235 --witness-vsan-cidr 172.16.20.0/24 
  7. When prompted, enter the following information:
    • SSO user name and password
    • Root user password for ESXi hosts
    • vSAN gateway IP for the preferred (primary) and non-preferred (secondary) site
    • vSAN CIDR for the preferred (primary) and non-preferred (secondary) site
    • VLAN ID for the non-preferred site overlay VLAN
    • Confirmation of the SSH thumbprints for the hosts
    Once the workflow is triggered, track the task status in the SDDC Manager UI. If the task fails, debug and fix the issue and retry the task from the SDDC Manager UI. Do not run the script again.
  8. Monitor the progress of the AZ2 hosts being added to the cluster.
    1. In the SDDC Manager UI, click View All Tasks.
    2. Refresh the window to monitor the status.
  9. Validate that stretched cluster operations are working correctly by logging in to the vSphere Web Client.
    1. Verify vSAN Health.
      1. On the home page, click Hosts and Clusters and then select the stretched cluster.
      2. Click Monitor > vSAN > Skyline Health.
      3. Click Retest.
      4. Fix errors, if any.
    2. Verify the vSAN Storage Policy.
      1. On the home page, click Policies and Profiles > VM Storage Policies > vSAN Default Storage Policies.
      2. Select the policy associated with the vCenter Server for the stretched cluster and click Check Compliance.
      3. Click VM Compliance and check the Compliance Status column for each VM.
      4. Fix errors, if any.
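
The script invocations in steps 4 and 6 share most of their arguments and are long enough to mistype. As a minimal sketch, the helper below assembles both command lines from the flags shown in this procedure so that they can be reviewed before being pasted into the SSH session. The function names are illustrative and are not part of the script or SDDC Manager. The sketch only prints the commands, because the script prompts interactively for credentials. The --sc-hosts value is passed through unchanged, since the format for multiple hosts is not covered here (see the -h output).

```python
import shlex

SCRIPT = "initiate_stretch_cluster_vxrail.py"


def prepare_stretch_cmd(domain, cluster):
    """Build the step 4 command (prepare-stretch workflow)."""
    return ["python", SCRIPT, "--workflow", "prepare-stretch",
            "--sc-domain", domain, "--sc-cluster", cluster]


def stretch_vsan_cmd(domain, cluster, hosts,
                     witness_host, witness_vsan_ip, witness_vsan_cidr):
    """Build the step 6 command (stretch-vsan workflow)."""
    return ["python", SCRIPT, "--workflow", "stretch-vsan",
            "--sc-domain", domain, "--sc-cluster", cluster,
            "--sc-hosts", hosts,
            "--witness-host-fqdn", witness_host,
            "--witness-vsan-ip", witness_vsan_ip,
            "--witness-vsan-cidr", witness_vsan_cidr]


# Values taken from the examples in this procedure.
DOMAIN = "wdc1-workflowspec-vxrail"
CLUSTER = "VxRail-Virtual-SAN-Cluster-8d2c9f37-e230-4238-ab35-cafd5033a59e"

print(shlex.join(prepare_stretch_cmd(DOMAIN, CLUSTER)))
print(shlex.join(stretch_vsan_cmd(
    DOMAIN, CLUSTER, "wdc3-005-proxy.vxrail.local",
    "172.16.10.235", "172.16.20.235", "172.16.20.0/24",
)))
```

Building the command as a list and joining it with shlex.join keeps each value as a single argument, which matters if a domain or cluster name ever contains a space or other shell-sensitive character.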