Use this list of requirements and recommendations as a reference for the vSphere cluster configuration in an environment with one or more VMware Cloud Foundation instances. The design elements also take into account whether an instance contains one or multiple availability zones.
For full design details, see Logical vSphere Cluster Design for VMware Cloud Foundation.

Requirement ID | Design Requirement | Justification | Implication |
---|---|---|---|
VCF-CLS-REQD-CFG-001 | Create a cluster in each workload domain for the initial set of ESXi hosts. | | Management of multiple clusters and vCenter Server instances increases operational overhead. |
VCF-CLS-REQD-CFG-002 | Allocate a minimum number of ESXi hosts according to the cluster type being deployed. | | To support redundancy, you must allocate additional ESXi host resources. |
VCF-CLS-REQD-CFG-003 | If using a consolidated workload domain, configure the following vSphere resource pools to control resource usage by management and customer workloads. | | You must manage the vSphere resource pool settings over time. |
VCF-CLS-REQD-CFG-004 | Configure the vSAN network gateway IP address as the isolation address for the cluster. | Allows vSphere HA to validate if a host is isolated from the vSAN network. | None. |
VCF-CLS-REQD-CFG-005 | Set the advanced cluster setting | Ensures that vSphere HA uses the manual isolation addresses instead of the default management network gateway address. | None. |
VCF-CLS-REQD-LCM-001 | Use baselines as the life cycle management method for the management domain. | vSphere Lifecycle Manager images are not supported for the management domain. | You must manage the life cycle of firmware and vendor add-ons manually. |

Requirement ID | Design Requirement | Justification | Implication |
---|---|---|---|
VCF-CLS-REQD-CFG-006 | Configure the vSAN network gateway IP addresses for the second availability zone as additional isolation addresses for the cluster. | Allows vSphere HA to validate if a host is isolated from the vSAN network for hosts in both availability zones. | None. |
VCF-CLS-REQD-CFG-007 | Enable the Override default gateway for this adapter setting on the vSAN VMkernel adapters on all ESXi hosts. | Enables routing the vSAN data traffic through the vSAN network gateway rather than through the management gateway. | vSAN networks across availability zones must have a route to each other. |
VCF-CLS-REQD-CFG-008 | Create a host group for each availability zone and add the ESXi hosts in the zone to the respective group. | Makes it easier to manage which virtual machines run in which availability zone. | You must create and maintain VM-Host DRS group rules. |
VCF-CLS-REQD-LCM-002 | Create workload domains selecting baselines as the life cycle management method. | vSphere Lifecycle Manager images are not supported for vSAN stretched clusters. | You must manage the life cycle of firmware and vendor add-ons manually. |
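
The isolation-address requirements (VCF-CLS-REQD-CFG-004 through VCF-CLS-REQD-CFG-006) can be pictured as a small set of vSphere HA advanced options. The following Python sketch only builds that option set as a dictionary; it does not talk to vCenter Server. The option names `das.isolationaddressN` and `das.usedefaultisolationaddress` are not named in this text and are an assumption here, as are the function name and the gateway addresses, which are purely illustrative.

```python
from ipaddress import ip_address


def ha_isolation_options(vsan_gateways):
    """Return vSphere HA advanced options for manual isolation addresses.

    vsan_gateways: vSAN network gateway IPs, one per availability zone.
    The option names are assumed; they do not appear in the requirements table.
    """
    if not vsan_gateways:
        raise ValueError("at least one vSAN gateway address is required")

    # Assumed option: stop HA from using the management network gateway.
    options = {"das.usedefaultisolationaddress": "false"}

    for index, gateway in enumerate(vsan_gateways):
        # ip_address() raises ValueError for a malformed address.
        options[f"das.isolationaddress{index}"] = str(ip_address(gateway))
    return options


# Hypothetical gateways: first and second availability zone.
opts = ha_isolation_options(["172.16.13.1", "172.17.13.1"])
for name, value in sorted(opts.items()):
    print(f"{name} = {value}")
```

With a single availability zone, pass only one gateway; the second entry corresponds to the additional isolation address that the stretched-cluster requirement asks for.
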

Recommendation ID | Design Recommendation | Justification | Implication |
---|---|---|---|
VCF-CLS-RCMD-CFG-001 | Use vSphere HA to protect all virtual machines against failures. | vSphere HA supports a robust level of protection for both ESXi host and virtual machine availability. | You must provide sufficient resources on the remaining hosts so that virtual machines can be restarted on those hosts in the event of a host outage. |
VCF-CLS-RCMD-CFG-002 | Set host isolation response to Power Off and restart VMs in vSphere HA. | vSAN requires that the host isolation response be set to Power Off and to restart virtual machines on available ESXi hosts. | If a false positive event occurs, virtual machines are powered off and an ESXi host is declared isolated incorrectly. |
VCF-CLS-RCMD-CFG-003 | Configure admission control for 1 ESXi host failure and percentage-based failover capacity. | Using the percentage-based reservation works well in situations where virtual machines have varying and sometimes significant CPU or memory reservations. vSphere automatically calculates the reserved percentage according to the number of ESXi host failures to tolerate and the number of ESXi hosts in the cluster. | In a cluster of 4 ESXi hosts, the resources of only 3 ESXi hosts are available for use. |
VCF-CLS-RCMD-CFG-004 | Enable VM Monitoring for each cluster. | VM Monitoring provides in-guest protection for most VM workloads. The application or service running on the virtual machine must be capable of restarting successfully after a reboot; otherwise, restarting the virtual machine is not sufficient. | None. |
VCF-CLS-RCMD-CFG-005 | Set the advanced cluster setting | Enables triggering a restart of a management appliance when an OS failure occurs and heartbeats are not received from VMware Tools instead of waiting additionally for the I/O check to complete. | If you want to specifically enable I/O monitoring, then configure the |
VCF-CLS-RCMD-CFG-006 | Enable vSphere DRS on all clusters, using the default fully automated mode with medium threshold. | Provides the best trade-off between load balancing and unnecessary migrations with vSphere vMotion. | If a vCenter Server outage occurs, the mapping from virtual machines to ESXi hosts might be difficult to determine. |
VCF-CLS-RCMD-CFG-007 | Enable Enhanced vMotion Compatibility (EVC) on all clusters in the management domain. | Supports cluster upgrades without virtual machine downtime. | You must enable EVC only if the clusters contain hosts with CPUs from the same vendor. You must enable EVC on the default management domain cluster during bringup. |
VCF-CLS-RCMD-CFG-008 | Set the cluster EVC mode to the highest available baseline that is supported for the lowest CPU architecture on the hosts in the cluster. | Supports cluster upgrades without virtual machine downtime. | None. |
VCF-CLS-RCMD-LCM-001 | Use images as the life cycle management method for VI workload domains that do not require vSAN stretched clusters. | vSphere Lifecycle Manager images simplify the management of firmware and vendor add-ons. | You cannot add any stretched cluster to a VI workload domain that uses vSphere Lifecycle Manager images. |
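
The arithmetic behind VCF-CLS-RCMD-CFG-003 can be illustrated with a short Python sketch. It assumes identically sized ESXi hosts, so that the reserved share is simply the number of host failures to tolerate divided by the number of hosts; the percentage that vSphere actually calculates may differ when hosts are not uniform. The function name is hypothetical.

```python
def admission_control(host_count, failures_to_tolerate=1):
    """Estimate reserved failover capacity for a cluster of equal-sized hosts.

    Returns (reserved_percent, usable_host_equivalents).
    """
    if host_count < 2:
        raise ValueError("a cluster needs at least 2 hosts to tolerate a failure")
    if not 0 < failures_to_tolerate < host_count:
        raise ValueError("failures_to_tolerate must be between 1 and host_count - 1")

    # Share of cluster CPU and memory held back for failover.
    reserved_percent = 100 * failures_to_tolerate / host_count
    # Capacity left for workloads, expressed in whole hosts.
    usable_hosts = host_count - failures_to_tolerate
    return reserved_percent, usable_hosts


# The 4-host example from the table: one host failure to tolerate.
reserved, usable = admission_control(4)
print(f"reserved: {reserved:.0f}%, usable hosts: {usable}")
```

For the 4-host cluster in the table, this gives a 25% reservation and leaves the equivalent of 3 hosts for workloads.
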

Recommendation ID | Design Recommendation | Justification | Implication |
---|---|---|---|
VCF-CLS-RCMD-CFG-009 | Increase admission control percentage to half of the ESXi hosts in the cluster. | Allocating only half of a stretched cluster ensures that all VMs have enough resources if an availability zone outage occurs. | In a cluster of 8 ESXi hosts, the resources of only 4 ESXi hosts are available for use. If you add more ESXi hosts to the default management cluster, add them in pairs, one per availability zone. |
VCF-CLS-RCMD-CFG-010 | Create a virtual machine group for each availability zone and add the VMs in the zone to the respective group. | Ensures that virtual machines are located only in the assigned availability zone to avoid unnecessary vSphere vMotion migrations. | You must add virtual machines to the allocated group manually. |
VCF-CLS-RCMD-CFG-011 | Create a should-run-on-hosts-in-group VM-Host affinity rule to run each group of virtual machines on the respective group of hosts in the same availability zone. | Ensures that virtual machines are located only in the assigned availability zone to avoid unnecessary vSphere vMotion migrations. | You must manually create the rules. |
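
The relationship between host groups, virtual machine groups, and should-run rules (VCF-CLS-REQD-CFG-008, VCF-CLS-RCMD-CFG-010, and VCF-CLS-RCMD-CFG-011) can be sketched as plain data. The Python example below only plans the objects, one host group, one VM group, and one should-run rule per availability zone; it does not create anything in vCenter Server. All names, including the zone labels, host names, and the `-hosts`, `-vms`, and `-should-run` suffixes, are hypothetical.

```python
def plan_zone_rules(hosts_by_zone, vms_by_zone):
    """Plan one host group, VM group, and should-run rule per availability zone.

    hosts_by_zone / vms_by_zone: dicts mapping a zone label to a list of names.
    """
    plans = []
    for zone in sorted(hosts_by_zone):
        plans.append({
            "zone": zone,
            "host_group": f"{zone}-hosts",
            "vm_group": f"{zone}-vms",
            "hosts": sorted(hosts_by_zone[zone]),
            # A zone may have hosts but no VMs assigned yet.
            "vms": sorted(vms_by_zone.get(zone, [])),
            # "Should run" is a preference, not a hard constraint,
            # so the rule is modeled as non-mandatory.
            "rule": {"name": f"{zone}-should-run", "mandatory": False},
        })
    return plans


# Hypothetical inventory for a stretched cluster with two zones.
plans = plan_zone_rules(
    {"az1": ["esx02", "esx01"], "az2": ["esx03", "esx04"]},
    {"az1": ["vm-a", "vm-b"], "az2": ["vm-c"]},
)
for plan in plans:
    print(plan["rule"]["name"], plan["vm_group"], "->", plan["host_group"])
```

Modeling the rule as a preference rather than a hard constraint is a design choice that reflects the should-run wording: virtual machines stay in their zone during normal operation but can still be restarted in the other zone if their own zone fails.
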