Use this list of requirements and recommendations as a reference for the vSphere cluster configuration in an environment with one or more VMware Cloud Foundation instances. The design elements also account for whether an instance contains one or more availability zones.
For full design details, see Logical vSphere Cluster Design for VMware Cloud Foundation.

Requirement ID | Design Requirement | Justification | Implication |
---|---|---|---|
VCF-CLS-REQD-CFG-001 | Create a cluster in each workload domain for the initial set of ESXi hosts. | | Management of multiple clusters and vCenter Server instances increases operational overhead. |
VCF-CLS-REQD-CFG-002 | Allocate a minimum number of ESXi hosts according to the cluster type being deployed. | | To support redundancy, you must allocate additional ESXi host resources. |
VCF-CLS-REQD-CFG-003 | If using a consolidated workload domain, configure the following vSphere resource pools to control resource usage by management and customer workloads. | | You must manage the vSphere resource pool settings over time. |
VCF-CLS-REQD-CFG-004 | For vSAN clusters, except for vSAN Max clusters, configure the vSAN network gateway IP address as the isolation address for the cluster. | vSphere HA can validate if a host is isolated from the vSAN network. | You must allocate an additional IP address. |
VCF-CLS-REQD-CFG-005 | For vSAN clusters, except for vSAN Max clusters, set the advanced cluster setting | Ensures that vSphere HA uses the manual isolation addresses instead of the default management network gateway address. | None. |

Requirement ID | Design Requirement | Justification | Implication |
---|---|---|---|
VCF-CLS-REQD-CFG-006 | Configure the IP address of the vSAN network for the second availability zone as an additional isolation address for the cluster. | Enables vSphere HA to validate if a host is isolated from the vSAN network for hosts in both availability zones. | The IP address of the vSAN network gateway must be highly available and reply to ICMP requests. |
VCF-CLS-REQD-CFG-007 | Enable the Override default gateway for this adapter setting on the vSAN VMkernel adapters on all ESXi hosts. | Enables routing the vSAN data traffic through the vSAN network gateway rather than through the management gateway. | vSAN networks across availability zones must have a route to each other. |
VCF-CLS-REQD-CFG-008 | Create a host group for each availability zone and add the ESXi hosts in the zone to the respective group. | Makes it easier to manage which virtual machines run in which availability zone. | You must create and maintain VM-Host DRS group rules. |
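
The host group requirement (VCF-CLS-REQD-CFG-008) can be illustrated with a minimal Python sketch that derives one host group per availability zone from an inventory mapping. The host names, zone labels, and the `<zone>-hosts` naming pattern are hypothetical and are not prescribed by this design. The sketch only computes group membership and does not call any vSphere API.

```python
from collections import defaultdict


def build_host_groups(host_zones):
    """Group ESXi host names by availability zone.

    host_zones maps a host name to its zone label (both hypothetical).
    Returns a dict with one entry per zone: group name -> sorted host list.
    """
    groups = defaultdict(list)
    for host, zone in host_zones.items():
        # Assumed naming pattern for illustration: "<zone>-hosts".
        groups[f"{zone}-hosts"].append(host)
    return {name: sorted(members) for name, members in groups.items()}


# Hypothetical inventory of a stretched cluster with two availability zones.
hosts = {
    "esx01.example.local": "az1",
    "esx02.example.local": "az1",
    "esx03.example.local": "az2",
    "esx04.example.local": "az2",
}
host_groups = build_host_groups(hosts)
```

Deriving the groups from a single inventory mapping keeps group membership in step with host placement when hosts are added to a zone.
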

Requirement ID | Design Requirement | Justification | Implication |
---|---|---|---|
VCF-CLS-L3MR-REQD-CFG-001 | Configure the IP address of the vSAN network gateway for each rack accessible over Layer 3 as the isolation address for the nodes in that rack in the cluster. | Enables vSphere HA to validate if a host is isolated from the vSAN network. | The IP address of the vSAN network gateway must be highly available and reply to ICMP requests. |
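
As a rough planning aid for VCF-CLS-L3MR-REQD-CFG-001, the following Python sketch resolves the isolation address for each host from the vSAN gateway of its rack and fails early if a rack has no gateway defined. The rack names, host names, and IP addresses are hypothetical. The sketch does not configure vSphere HA; it only checks that the plan is complete.

```python
import ipaddress


def isolation_address_per_host(host_racks, rack_gateways):
    """Return {host: isolation address} using the vSAN gateway of each host's rack.

    host_racks maps host name -> rack label; rack_gateways maps rack label ->
    vSAN gateway IP address. All names and addresses are hypothetical.
    """
    result = {}
    for host, rack in host_racks.items():
        if rack not in rack_gateways:
            raise ValueError(f"no vSAN gateway defined for rack {rack!r}")
        # ip_address() raises ValueError if the address is malformed.
        result[host] = str(ipaddress.ip_address(rack_gateways[rack]))
    return result


rack_gateways = {"rack1": "172.16.21.1", "rack2": "172.16.22.1"}
host_racks = {
    "esx01.example.local": "rack1",
    "esx02.example.local": "rack1",
    "esx03.example.local": "rack2",
}
per_host = isolation_address_per_host(host_racks, rack_gateways)
```
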

Requirement ID | Design Requirement | Justification | Implication |
---|---|---|---|
VCF-CLS-MRE-REQD-ENV-001 | Deploy a minimum of two vSphere clusters for NSX edge nodes, with one vSphere cluster in each rack. | Provides availability if a failure of a rack or single vSphere cluster occurs. | Additional cluster required. |

Recommendation ID | Design Recommendation | Justification | Implication |
---|---|---|---|
VCF-CLS-RCMD-CFG-001 | Use vSphere HA to protect all virtual machines against failures. | vSphere HA supports a robust level of protection for both ESXi host and virtual machine availability. | You must provide sufficient resources on the remaining hosts so that virtual machines can be restarted on those hosts in the event of a host outage. |
VCF-CLS-RCMD-CFG-002 | For vSAN clusters, set host isolation response to Power Off and restart VMs in vSphere HA. | vSAN requires that the host isolation response be set to Power Off and to restart virtual machines on available ESXi hosts. | If a false positive event occurs, virtual machines are powered off and an ESXi host is declared isolated incorrectly. |
VCF-CLS-RCMD-CFG-003 | Configure admission control for 1 ESXi host failure and percentage-based failover capacity. | Using the percentage-based reservation works well in situations where virtual machines have varying and sometimes significant CPU or memory reservations. vSphere automatically calculates the reserved percentage according to the number of ESXi host failures to tolerate and the number of ESXi hosts in the cluster. | In a cluster of 4 ESXi hosts, the resources of only 3 ESXi hosts are available for use. |
VCF-CLS-RCMD-CFG-004 | Enable VM Monitoring for each cluster. | VM Monitoring provides in-guest protection for most VM workloads. The application or service running on the virtual machine must be capable of restarting successfully after a reboot; otherwise, restarting the virtual machine is not sufficient. | None. |
VCF-CLS-RCMD-CFG-005 | Set the advanced cluster setting | Enables triggering a restart of a management appliance when an OS failure occurs and heartbeats are not received from VMware Tools, instead of waiting additionally for the I/O check to complete. | If you want to specifically enable I/O monitoring, you must configure the das.iostatsinterval advanced setting. |
VCF-CLS-RCMD-CFG-006 | Enable vSphere DRS on all clusters, using the default fully automated mode with medium threshold. | Provides the best trade-off between load balancing and unnecessary migrations with vSphere vMotion. | If a vCenter Server outage occurs, the mapping from virtual machines to ESXi hosts might be difficult to determine. |
VCF-CLS-RCMD-CFG-007 | Enable Enhanced vMotion Compatibility (EVC) on all clusters in the management domain. | Supports cluster upgrades without virtual machine downtime. | You must enable EVC only if the clusters contain hosts with CPUs from the same vendor. You must enable EVC on the default management domain cluster during bringup. |
VCF-CLS-RCMD-CFG-008 | Set the cluster EVC mode to the highest available baseline that is supported for the lowest CPU architecture on the hosts in the cluster. | Supports cluster upgrades without virtual machine downtime. | None. |
VCF-CLS-RCMD-LCM-001 | Use images as the life cycle management method for all workload domains. | | |
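
The arithmetic behind the admission control implication (VCF-CLS-RCMD-CFG-003) can be sketched in Python as follows. This is a simplification that assumes all ESXi hosts in the cluster have identical capacity; vSphere performs the actual calculation, and the function names here are hypothetical.

```python
def failover_capacity_percent(total_hosts, host_failures_to_tolerate=1):
    """Approximate the share of cluster resources reserved for failover.

    Assumes identically sized hosts, so each host contributes an equal
    fraction of the cluster capacity.
    """
    if total_hosts < 2 or not 0 < host_failures_to_tolerate < total_hosts:
        raise ValueError("need at least 2 hosts and 0 < failures < total hosts")
    return 100 * host_failures_to_tolerate / total_hosts


def usable_hosts(total_hosts, host_failures_to_tolerate=1):
    """Number of hosts' worth of resources left for workloads."""
    return total_hosts - host_failures_to_tolerate


# The 4-host example from the table: tolerate 1 host failure.
reserved = failover_capacity_percent(4, 1)
available = usable_hosts(4, 1)
```

For the 4-host cluster in the table, the sketch reserves a quarter of the cluster capacity and leaves the resources of 3 hosts for workloads, which matches the stated implication.
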

Recommendation ID | Design Recommendation | Justification | Implication |
---|---|---|---|
VCF-CLS-RCMD-CFG-009 | Increase admission control percentage to half of the ESXi hosts in the cluster. | Allocating only half of a stretched cluster ensures that all VMs have enough resources if an availability zone outage occurs. | In a cluster of 8 ESXi hosts, the resources of only 4 ESXi hosts are available for use. If you add more ESXi hosts to the default management cluster, add them in pairs, one per availability zone. |
VCF-CLS-RCMD-CFG-010 | Create a virtual machine group for each availability zone and add the VMs in the zone to the respective group. | Ensures that virtual machines are located only in the assigned availability zone to avoid unnecessary vSphere vMotion migrations. | You must add virtual machines to the allocated group manually. |
VCF-CLS-RCMD-CFG-011 | Create a should-run-on-hosts-in-group VM-Host affinity rule to run each group of virtual machines on the respective group of hosts in the same availability zone. | Ensures that virtual machines are located only in the assigned availability zone to avoid unnecessary vSphere vMotion migrations. | You must manually create the rules. |
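
To make the pairing in VCF-CLS-RCMD-CFG-010 and VCF-CLS-RCMD-CFG-011 concrete, the following Python sketch models one should-run rule per availability zone that ties the VM group of a zone to the host group of the same zone. The `VmHostRule` class, the zone labels, and the group and rule naming patterns are hypothetical; the sketch describes the intended rule set and does not create rules in vCenter Server.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VmHostRule:
    """Hypothetical description of a VM-Host affinity rule."""
    name: str
    vm_group: str
    host_group: str
    mandatory: bool  # False models a preferential "should run on" rule


def should_run_rules(zones):
    """Build one should-run rule per zone, pairing same-zone VM and host groups."""
    return [
        VmHostRule(
            name=f"{zone}-vms-should-run-on-{zone}-hosts",
            vm_group=f"{zone}-vms",
            host_group=f"{zone}-hosts",
            mandatory=False,
        )
        for zone in zones
    ]


rules = should_run_rules(["az1", "az2"])
```

Modeling the rules as preferential rather than mandatory reflects the should-run wording of the recommendation.
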

Recommendation ID | Design Recommendation | Justification | Implication |
---|---|---|---|
VCF-CLS-L3MR-RCMD-CFG-001 | Deploy a multi-rack VI workload domain cluster with Layer 3 networking in a minimum of four racks with a Layer 3 boundary at the rack level. | Improves resiliency. When combined with vSAN fault domains, the minimum of four racks protects against a single rack failure. | |