When determining the vSAN deployment specification for a VI workload domain in VMware Cloud Foundation, you decide on the datastore size, the number of ESXi hosts per cluster, the number of disk groups per ESXi host, and the vSAN policy.

Sizing vSAN Storage

You size your vSAN datastore according to the requirements of your organization. The sizing is calculated for a specific node configuration based on your hardware specifications.

Note:

Although vSAN uses thin provisioning to conserve capacity, base your calculations on the full disk space, that is, thick provisioning. This approach prevents unexpected exhaustion of vSAN capacity in the VI workload domain.

Disk space usage distribution consists of the following components:

| Component | Description |
| --- | --- |
| Effective Raw Capacity | Space available for the vSAN datastore |
| Slack Space | Space reserved for vSAN-specific operations such as resyncs and rebuilds |
| Dedupe Overhead | Space reserved for deduplication and compression metadata such as hash, translation, and allocation maps |
| Disk Formatting Overhead | Reservation for file system metadata |
| Checksum Overhead | Space used for checksum information |
| Physical Reservation | Physical space or raw capacity consumed by these overheads |
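As a rough illustration of how these components reduce usable space, the following Python sketch subtracts assumed overhead percentages from the effective raw capacity. The percentage values are placeholders for illustration only; actual values depend on your hardware, vSAN version, and the services you enable.

```python
# Illustrative sketch: estimating usable vSAN capacity from raw capacity.
# The overhead percentages below are assumptions for illustration only;
# real values depend on hardware, vSAN version, and enabled services.

def usable_capacity_gb(raw_capacity_gb: float,
                       slack_pct: float = 0.25,           # assumed resync/rebuild reserve
                       dedupe_overhead_pct: float = 0.05,  # assumed dedupe/compression metadata
                       fs_overhead_pct: float = 0.01,      # assumed disk formatting overhead
                       checksum_pct: float = 0.01) -> float:
    """Subtract the overhead components from the effective raw capacity."""
    overhead = slack_pct + dedupe_overhead_pct + fs_overhead_pct + checksum_pct
    return raw_capacity_gb * (1.0 - overhead)

# Example: a cluster exposing 100 TB of effective raw capacity
print(round(usable_capacity_gb(100_000)))  # 68000 GB usable under these assumptions
```

Because the calculation assumes thick provisioning, the result is a conservative lower bound on what thin-provisioned workloads can actually store.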

Sizing vSAN Datastore

When sizing your vSAN datastore, consider the required storage capacity for the workloads you plan to run and the number of NSX Edge nodes that provide networking services to those workloads. Also account for the additional capacity needed for the file system and for any third-party management components you want to add, and weigh cost against availability to arrive at the appropriate size.

This design applies the shared edge and compute architecture to the vSphere clusters of a VI workload domain. In this architecture, the NSX Edge nodes and customer workloads share the resources of the same cluster, using resource pools to allocate CPU and memory to the NSX Edge appliances with high priority and to customer workloads according to their SLA. See High Availability Design for the NSX Edge Nodes for a Virtual Infrastructure Workload Domain.
Table 1. Design Decisions on the vSAN Datastore

| Decision ID | Design Decision | Design Justification | Design Implication |
| --- | --- | --- | --- |
| VCF-WLD-vSAN-CFG-003 | On all vSAN datastores, ensure that at least 30% of free space is always available. | When vSAN reaches 80% usage, a rebalance task is started, which can be resource-intensive. | Increases the amount of available storage needed. |
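The 30% free-space guardrail from decision VCF-WLD-vSAN-CFG-003 can be expressed as a simple check. This is an illustrative sketch, not a vSAN API:

```python
# Sketch of the 30% free-space guardrail (VCF-WLD-vSAN-CFG-003).
def has_required_slack(capacity_gb: float, used_gb: float,
                       min_free_ratio: float = 0.30) -> bool:
    """Return True if at least min_free_ratio of the datastore is still free."""
    free_ratio = (capacity_gb - used_gb) / capacity_gb
    return free_ratio >= min_free_ratio

print(has_required_slack(100_000, 65_000))  # 35% free -> True
print(has_required_slack(100_000, 75_000))  # 25% free -> False, below the 80% usage threshold
```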

Number of vSAN-Enabled ESXi Hosts Per Cluster

The number of ESXi hosts in the cluster depends on these factors:

  • The amount of available space on the vSAN datastore

  • The number of failures you can tolerate in the cluster

For example, if the vSAN cluster has only 3 ESXi hosts, only a single failure is supported. If a higher level of availability is required, you must add more hosts.
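The relationship between tolerated failures and host count can be sketched as follows. The 2 × FTT + 1 formula is the standard rule for RAID-1 mirroring; the extra maintenance host reflects the design guidance in the tables below:

```python
# Sketch: minimum host count for a given number of failures to tolerate (FTT)
# with RAID-1 mirroring. 2*FTT + 1 hosts are required, and one extra host is
# commonly added so that maintenance does not reduce the tolerated failures.

def min_hosts_raid1(ftt: int, maintenance_headroom: bool = True) -> int:
    hosts = 2 * ftt + 1
    return hosts + 1 if maintenance_headroom else hosts

print(min_hosts_raid1(1, maintenance_headroom=False))  # 3 hosts tolerate a single failure
print(min_hosts_raid1(1))                              # 4 hosts also allow one host in maintenance
```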

Table 2. Design Decisions on the vSAN Cluster Size in a VI Workload Domain with a Single Availability Zone

| Decision ID | Design Decision | Design Justification | Design Implication |
| --- | --- | --- | --- |
| VCF-WLD-vSAN-CFG-004 | The VI workload domain cluster requires a minimum of 4 ESXi hosts to support vSAN. | Having 4 ESXi hosts addresses the availability and sizing requirements. You can take an ESXi host offline for maintenance or upgrades without impacting the overall vSAN cluster health. | The availability requirements for the VI workload domain cluster might cause underutilization of the cluster's ESXi hosts. |

Table 3. Design Decisions on the vSAN Cluster Size in a VI Workload Domain with Multiple Availability Zones

| Decision ID | Design Decision | Design Justification | Design Implication |
| --- | --- | --- | --- |
| VCF-WLD-vSAN-CFG-005 | The VI workload domain cluster requires a minimum of 8 ESXi hosts (4 in each availability zone) to support a stretched vSAN configuration. | Having 8 ESXi hosts addresses the availability and sizing requirements. You can take an availability zone offline for maintenance or upgrades without impacting the overall vSAN cluster health. | The capacity of the additional 4 hosts is not added to the capacity of the cluster. They are used only to provide additional availability. |
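To illustrate why the second availability zone adds availability rather than capacity, the sketch below computes the raw space consumed when data is mirrored both across zones (Primary Failures to Tolerate) and within each zone (Secondary Failures to Tolerate). The (PFTT + 1) × (SFTT + 1) multiplier is an assumption that holds for mirroring-based policies:

```python
# Sketch: raw capacity multiplier in a stretched vSAN cluster with mirroring.
# With Primary FTT = 1 (one copy per availability zone) and Secondary FTT = 1
# (RAID-1 within each zone), each object is stored (pftt + 1) * (sftt + 1) times.

def raw_capacity_needed_gb(vm_data_gb: float, pftt: int = 1, sftt: int = 1) -> float:
    copies = (pftt + 1) * (sftt + 1)
    return vm_data_gb * copies

print(raw_capacity_needed_gb(10_000))  # 10 TB of VM data consumes 40 TB of raw capacity
```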

Number of vSAN Disk Groups per ESXi Host

Disk group sizing is an important factor during volume design because the number of disk groups can affect availability and performance. More ESXi hosts in the cluster means more failures can be tolerated, but this capability adds cost because additional hardware for the disk groups is required. More available disk groups can also increase the recoverability of vSAN during a failure. Consider these data points when deciding on the number of disk groups per ESXi host:

  • The amount of available space on the vSAN datastore.

  • The number of failures you can tolerate in the cluster.

  • The performance required when recovering vSAN objects.

The optimal number of disk groups is a balance between hardware and space requirements for the vSAN datastore. More disk groups increase space and provide higher availability. However, adding disk groups can be restricted by cost.

Table 4. Design Decision on the Disk Groups per ESXi Host

| Decision ID | Design Decision | Design Justification | Design Implication |
| --- | --- | --- | --- |
| VCF-WLD-vSAN-CFG-006 | Configure vSAN with a minimum of two disk groups per ESXi host. | Reduces the size of the fault domain and spreads the I/O load over more disks for better performance. | Multiple disk groups require more disks in each ESXi host. |

Sizing vSAN Disks Per ESXi Host

The size of the vSAN disks depends on the requirements for the datastore, the number of ESXi hosts in the vSAN cluster and the number of disk groups per host.
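A back-of-the-envelope estimate of total raw capacity from the host and disk-group layout might look like the following sketch. The disk counts and sizes are assumptions for illustration:

```python
# Sketch: estimating total raw vSAN capacity from the host and disk-group
# layout. Disk counts and sizes below are illustrative assumptions.

def raw_capacity_tb(hosts: int, disk_groups_per_host: int,
                    capacity_disks_per_group: int, disk_size_tb: float) -> float:
    # Cache-tier disks do not contribute to capacity; only capacity disks count.
    return hosts * disk_groups_per_host * capacity_disks_per_group * disk_size_tb

# 4 hosts, 2 disk groups each, 4 x 1.92 TB capacity disks per group
print(raw_capacity_tb(4, 2, 4, 1.92))  # 61.44 TB raw
```

Remember that the usable figure is lower once the overheads, the failures-to-tolerate multiplier, and the 30% free-space reserve are subtracted.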

For sizing the vSAN caching tier, see the Designing vSAN Disk groups – All Flash Cache Ratio Update post on VMware Blogs. For vSAN memory consumption by the ESXi hosts in the VI workload domain, see VMware Knowledge Base article 2113954.

Table 5. Design Decisions on the vSAN Disk Configuration

| Decision ID | Design Decision | Design Justification | Design Implication |
| --- | --- | --- | --- |
| VCF-WLD-vSAN-CFG-007 | Use a 600 GB or greater flash-based drive for the cache tier in each disk group. | Provides enough cache for both hybrid and all-flash vSAN configurations to buffer I/O and ensure disk group performance. | Larger flash disks can increase the initial host cost. |

vSAN Policy Design

After you enable and configure VMware vSAN, you can create storage policies that define the virtual machine storage characteristics. Storage characteristics specify different levels of service for different virtual machines.

The default storage policy tolerates a single failure and has a single disk stripe. Use the default policy unless you have specific requirements. If you configure a custom policy, vSAN guarantees its application where possible. If vSAN cannot guarantee a policy, you cannot provision a virtual machine that uses the policy unless you enable force provisioning.

Policy design starts with assessment of business needs and application requirements. Use cases for VMware vSAN must be assessed to determine the necessary policies. Start by assessing the following application requirements:

  • I/O performance and profile of your workloads on a per-virtual-disk basis

  • Working sets of your workloads

  • Hot-add of additional cache (requires repopulation of cache)

  • Specific application best practice (such as block size)

After assessment, configure the software-defined storage module policies for availability and performance in a conservative manner so that space consumed and recoverability properties are balanced. In many cases the default system policy is adequate and no additional policies are required unless specific requirements for performance or availability exist.

A storage policy includes several attributes. You can use them alone or combine them to provide different service levels. By using policies, you can customize any configuration according to the business requirements of the consuming application.

Before making design decisions, understand the policies and the objects to which they can be applied.

If you do not specify a user-configured policy, vSAN uses a default system policy of 1 failure to tolerate and 1 disk stripe for the virtual machine namespace and virtual disks. To ensure protection for critical virtual machine components, vSAN uses the policy set on the virtual machine namespace for swap files. vSAN also uses the policy set on a virtual disk for its associated snapshot delta disks, if they exist.

Configure policies according to the business requirements of the application. By using policies, vSAN can adjust the performance and the availability of a disk on the fly.

| Object | Policy | Comments |
| --- | --- | --- |
| Virtual machine namespace | User-configured storage policy | Can be any storage policy configured on the system. |
| Swap | Uses the virtual machine namespace policy | Same as the virtual machine namespace policy. |
| Virtual disks | User-configured storage policy | Can be any storage policy configured on the system. |
| Virtual disk snapshots | Uses the virtual disk policy | Same as the virtual disk policy. |
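The inheritance rules in the table above can be summarized in a small sketch. The object names and policy labels are illustrative, not a vSAN API:

```python
# Sketch of vSAN policy inheritance: swap files follow the VM namespace
# policy, and snapshot delta disks follow their virtual disk's policy.
# Object names and policy labels here are hypothetical for illustration.

def effective_policy(obj: str, namespace_policy: str, disk_policy: str) -> str:
    inherited = {
        "namespace": namespace_policy,
        "swap": namespace_policy,   # swap uses the namespace policy
        "disk": disk_policy,
        "snapshot": disk_policy,    # snapshots use the disk's policy
    }
    return inherited[obj]

print(effective_policy("swap", "default", "gold"))      # default
print(effective_policy("snapshot", "default", "gold"))  # gold
```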

Table 6. Design Decisions on the vSAN Storage Policy in a VI Workload Domain with a Single Availability Zone

| Decision ID | Design Decision | Design Justification | Design Implication |
| --- | --- | --- | --- |
| VCF-WLD-vSAN-CFG-008 | Use the default vSAN storage policy. | Provides the level of redundancy that is needed in the VI workload domain cluster. Provides a level of performance that is sufficient for the NSX Edge appliances and tenant workloads. | You might need additional policies for third-party virtual machines hosted in the VI workload domain cluster because their performance or availability requirements might differ from what the default vSAN policy supports. |

Table 7. Design Decisions on the vSAN Storage Policy in a VI Workload Domain with Multiple Availability Zones

| Decision ID | Design Decision | Design Justification | Design Implication |
| --- | --- | --- | --- |
| VCF-WLD-vSAN-CFG-009 | Add the following setting to the default vSAN storage policy: Secondary Failures to Tolerate = 1. | Provides the necessary protection for virtual machines in each availability zone, with the ability to recover from an availability zone outage. | You might need additional policies if third-party virtual machines are to be hosted in the VI workload domain cluster because their performance or availability requirements might differ from what the default vSAN policy supports. |
| VCF-WLD-vSAN-CFG-010 | Configure two fault domains, one for each availability zone. Assign each host to its respective availability zone fault domain. | Fault domains are mapped to availability zones to provide logical host separation and ensure that a copy of vSAN data is always available even when an availability zone goes offline. | Additional raw storage is required when the secondary failures to tolerate option and fault domains are enabled. |