This design uses VMware vSAN to implement software-defined storage as the primary storage type for the shared edge and workload cluster. By using vSAN, you have a high level of control over the storage subsystem.

All functional testing and validation of the design is on vSAN. Although VMware Validated Design uses vSAN, in particular for the clusters running tenant workloads, you can use any supported storage solution. If you select a storage solution other than vSAN, consider that all the design, deployment, and Day-2 guidance in VMware Validated Design applies under the context of vSAN and adjust appropriately. Your storage design must match or exceed the capacity and performance capabilities of the vSAN configuration in the design. For multiple availability zones, the vSAN configuration includes vSAN stretched clusters.

vSAN is hyper-converged storage software that is fully integrated with the hypervisor. vSAN creates a cluster of local ESXi host hard disk drives and solid-state drives, and presents a flash-optimized, highly resilient, shared storage datastore to ESXi hosts. By using vSAN storage policies, you can control capacity, performance, and availability on a per virtual machine basis.

vSAN Physical Requirements and Dependencies

The software-defined storage module has the following requirements and options.

Requirement Category

Requirements

Number of hosts

Minimum of three ESXi hosts providing storage resources to the vSAN cluster.

vSAN configuration

vSAN is configured as hybrid storage or all-flash storage.

  • A vSAN hybrid storage configuration requires both magnetic devices and flash caching devices.

  • An all-flash vSAN configuration requires flash devices for both the caching and capacity tiers.

Requirements for individual hosts that provide storage resources

  • Minimum of one flash device. The flash-based cache tier must be sized to handle the anticipated performance requirements. For sizing the vSAN caching tier, see the Design Considerations for Flash Caching Devices in vSAN in the VMware vSAN product documentation.

  • Minimum of two additional devices for capacity tier.

  • RAID controller that is compatible with vSAN.

  • Minimum 10 Gbps network for vSAN traffic.

  • vSphere High Availability host isolation response set to power off virtual machines. With this setting, you prevent split-brain conditions if isolation or network partition occurs. In a split-brain condition, the virtual machine might be powered on by two ESXi hosts by accident.
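The numeric requirements above can be checked mechanically against a host inventory. The following is a minimal sketch in Python, not a VMware tool: the `HostSpec` record, its field names, and the host names are hypothetical, and the check covers only the host count, device counts, and network speed. RAID controller compatibility and the vSphere HA isolation response still have to be verified separately.

```python
from dataclasses import dataclass


@dataclass
class HostSpec:
    """Hypothetical per-host inventory record (not a VMware API type)."""
    name: str
    cache_flash_devices: int   # flash devices intended for the cache tier
    capacity_devices: int      # devices intended for the capacity tier
    vsan_network_gbps: float   # link speed of the network carrying vSAN traffic


def check_vsan_requirements(hosts):
    """Return a list of requirement violations; an empty list means all checks pass."""
    problems = []
    if len(hosts) < 3:
        problems.append(f"cluster: {len(hosts)} host(s), minimum is 3")
    for host in hosts:
        if host.cache_flash_devices < 1:
            problems.append(f"{host.name}: no flash device for the cache tier")
        if host.capacity_devices < 2:
            problems.append(f"{host.name}: fewer than 2 capacity devices")
        if host.vsan_network_gbps < 10:
            problems.append(f"{host.name}: vSAN network below 10 Gbps")
    return problems


# Example inventory with made-up host names.
example_hosts = [
    HostSpec("esxi-01", 1, 2, 10),
    HostSpec("esxi-02", 1, 2, 25),
    HostSpec("esxi-03", 1, 1, 10),  # one capacity device short
]
print(check_vsan_requirements(example_hosts))
```

In this example only `esxi-03` is reported, because it has a single capacity device.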

vSAN Hardware Considerations

While VMware supports building your own vSAN cluster from compatible components, vSAN ReadyNodes are selected for this VMware Validated Design.

Build Your Own

Use hardware from the VMware Compatibility Guide for the following vSAN components:

  • Flash-based drives

  • Magnetic hard drives

  • I/O controllers, including vSAN certified driver and firmware combinations

Use VMware vSAN ReadyNodes

A vSAN ReadyNode is a server configuration that is validated in a tested, certified hardware form factor for vSAN deployment, jointly recommended by the server OEM and VMware. See the vSAN ReadyNode documentation. The vSAN Compatibility Guide for vSAN ReadyNodes documentation provides examples of standardized configurations, including supported numbers of VMs and estimated number of 4K IOPS delivered.

I/O Controllers for vSAN

The I/O controllers are as important to a vSAN configuration as the selection of disk drives. vSAN supports SAS, SATA, and SCSI adapters in either pass-through or RAID 0 mode. vSAN supports multiple controllers per ESXi host.

Select between a single-controller and a multi-controller configuration according to the following considerations:

  • Multiple controllers can improve performance and limit the impact of a controller or SSD failure to a smaller number of drives or vSAN disk groups.

  • With a single controller, all disks are controlled by one device. A controller failure impacts all storage, including the boot media (if configured).

Controller queue depth is possibly the most important aspect for performance. All I/O controllers in the VMware vSAN Hardware Compatibility Guide have a minimum queue depth of 256. When you evaluate a controller, account for regular day-to-day operations and for the increased I/O caused by virtual machine deployment operations or by re-sync activity after automatic or manual fault remediation.

Table 1. Design Decisions on the vSAN I/O Controller Configuration

Decision ID: SDDC-WLD-VI-SDS-001

Design Decision: Ensure that the storage I/O controller that runs the vSAN disk groups has a minimum queue depth of 256.

Design Justification: Storage controllers with lower queue depths can cause performance and stability problems when running vSAN. vSAN ReadyNode servers are configured with the right queue depths for vSAN.

Design Implication: Limits the number of compatible I/O controllers that can be used for storage.
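As an illustration of how the queue depth decision might be verified, the following Python sketch flags controllers that report a queue depth below 256. The inventory dictionary and the controller names are made up for the example. How the queue depth values are collected from the hosts is outside the scope of the sketch.

```python
MIN_QUEUE_DEPTH = 256  # minimum from design decision SDDC-WLD-VI-SDS-001


def controllers_below_minimum(queue_depths, minimum=MIN_QUEUE_DEPTH):
    """Return the names of controllers whose queue depth is below the minimum, sorted."""
    return sorted(name for name, depth in queue_depths.items() if depth < minimum)


# Made-up inventory: controller name -> reported queue depth.
example_queue_depths = {
    "esxi-01/vmhba0": 1024,
    "esxi-02/vmhba0": 256,
    "esxi-03/vmhba0": 128,
}
print(controllers_below_minimum(example_queue_depths))
```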

vSAN Flash Options

vSAN has two configuration options: all-flash and hybrid.

Hybrid Mode

In a hybrid storage architecture, vSAN pools server-attached capacity devices (in this case magnetic devices) and flash-based caching devices, typically SSDs or PCI-e devices, to create a distributed shared datastore.

All-Flash Mode

All-flash storage uses flash-based devices (SSD or PCI-e) as a write cache while other flash-based devices provide high endurance for capacity and data persistence.

Table 2. Design Decisions on the vSAN Mode

Decision ID: SDDC-WLD-VI-SDS-002

Design Decision: Configure vSAN in All-Flash mode in the shared edge and workload cluster.

Design Justification:

  • Provides support for vSAN deduplication and compression.

  • Meets the performance needs of the shared edge and workload cluster.

  • Using high-speed magnetic disks in a hybrid vSAN configuration can provide satisfactory performance and is supported.

Design Implication: All vSAN disks must be flash disks, which might cost more than magnetic disks.

Sizing Storage

You usually base sizing on the requirements of the IT organization. However, this design provides calculations that are based on a single-region implementation and are then applied on a per-region basis. In this way, you can handle storage in a dual-region deployment that has failover capabilities enabled.

This sizing is calculated according to a certain node configuration per region. Although VMware Validated Design has enough memory capacity to handle N-1 host failures and uses thin-provisioned swap for the vSAN configuration, the potential thin-provisioned swap capacity is factored into the calculation.

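Category | Quantity | Resource Type | Consumption
Physical Infrastructure (ESXi) | 4 | Memory | 1024 GB
NSX-T Edge Appliances | 2 | Disk | 400 GB
NSX-T Edge Appliances | 2 | Swap | 64 GB
Tenant Workloads (example, 8 GB memory and 120 GB disk per VM) | 60 | Disk | 7200 GB
Tenant Workloads (example, 8 GB memory and 120 GB disk per VM) | 60 | Swap | 480 GB
Total | 60 tenant workload VMs, 2 NSX-T Edge VMs, 4 ESXi hosts | Disk | 7600 GB
Total | 60 tenant workload VMs, 2 NSX-T Edge VMs, 4 ESXi hosts | Swap | 544 GB
Total | 60 tenant workload VMs, 2 NSX-T Edge VMs, 4 ESXi hosts | Memory | 1024 GB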
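The disk and swap totals in the table can be reproduced with a short calculation. The Python sketch below uses the tenant workload example from the table (8 GB memory and 120 GB disk per VM). The per-appliance NSX-T Edge figures of 200 GB disk and 32 GB memory are not stated in this design and are assumptions derived by dividing the table values by the two appliances. The sketch also assumes no memory reservations, so that each swap file is as large as the configured memory of the VM.

```python
# Per-VM figures. The tenant values come from the table's example workload;
# the edge values are assumptions derived by dividing the table totals by 2.
EDGE_COUNT, EDGE_DISK_GB, EDGE_MEMORY_GB = 2, 200, 32
TENANT_COUNT, TENANT_DISK_GB, TENANT_MEMORY_GB = 60, 120, 8


def workload_totals():
    """Return (disk_gb, swap_gb) for the edge appliances plus tenant workloads."""
    disk_gb = EDGE_COUNT * EDGE_DISK_GB + TENANT_COUNT * TENANT_DISK_GB
    # Assumption: no memory reservation, so each swap file equals configured memory.
    swap_gb = EDGE_COUNT * EDGE_MEMORY_GB + TENANT_COUNT * TENANT_MEMORY_GB
    return disk_gb, swap_gb


print(workload_totals())  # (7600, 544)
```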

The storage space that is required for the vSAN capacity tier is worked out using the following calculations. For sizing the vSAN caching tier, see Design Considerations for Flash Caching Devices in vSAN in the VMware vSAN product documentation. For vSAN memory consumption by ESXi hosts in the workload domain, see VMware Knowledge Base article 2113954.

Derive the consumption of storage space by the virtual machines in the shared edge and workload cluster according to the following calculations. See vSAN Design and Sizing Guide.

7,600 GB Disk + 544 GB Swap = 8,144 GB Virtual Machine Raw Capacity Requirements
8,144 GB Virtual Machine Raw Capacity Requirements * 2 (FTT=1, RAID1) = 16,288 GB Final Virtual Machine Raw Capacity Requirements
34 GB vSAN Memory Consumption + 16,288 GB VM Raw Capacity = 16,322 GB Total Raw Storage Capacity
16,322 GB Total Raw Storage Capacity + 30% Slack Overhead + 20% Estimated Growth ≈ 24,485 GB ≈ 24 TB Raw Unformatted Storage Capacity (with 20% Growth Capacity)
24 TB Raw Unformatted Storage Capacity / 3 ESXi hosts ≈ 8 TB Final Raw Storage Capacity per host (considering worst-case scenario with one host failure)
8 TB / 2 disk groups = 4 TB raw capacity per disk group
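
The chain of calculations above can be sketched in Python as follows. The function name, parameter names, and result keys are illustrative. The sketch reads the slack and growth percentages as additive (150% of the total raw capacity), which is an assumption: it yields 24,483 GB, within a few GB of the 24,485 GB quoted above, whereas compounding the two factors (1.3 × 1.2) would give roughly 25,462 GB. The conversion to TB divides by 1000 and rounds to the nearest whole number, mirroring the rounding in the text.

```python
def capacity_tier_sizing(
    disk_gb=7600,
    swap_gb=544,
    replicas=2,             # FTT=1 with RAID 1 keeps two copies of the data
    vsan_memory_gb=34,
    slack_pct=30,
    growth_pct=20,
    hosts_after_failure=3,  # worst case: one of the 4 hosts has failed
    disk_groups_per_host=2,
):
    """Reproduce the capacity tier sizing steps and return every intermediate value."""
    vm_raw_gb = disk_gb + swap_gb
    protected_gb = vm_raw_gb * replicas
    total_raw_gb = protected_gb + vsan_memory_gb
    # Assumption: slack and growth are added together, not compounded.
    with_headroom_gb = total_raw_gb * (100 + slack_pct + growth_pct) / 100
    total_tb = round(with_headroom_gb / 1000)
    per_host_tb = total_tb / hosts_after_failure
    per_disk_group_tb = per_host_tb / disk_groups_per_host
    return {
        "vm_raw_gb": vm_raw_gb,
        "protected_gb": protected_gb,
        "total_raw_gb": total_raw_gb,
        "with_headroom_gb": with_headroom_gb,
        "total_tb": total_tb,
        "per_host_tb": per_host_tb,
        "per_disk_group_tb": per_disk_group_tb,
    }


for key, value in capacity_tier_sizing().items():
    print(f"{key}: {value}")
```

Changing `replicas`, `hosts_after_failure`, or `disk_groups_per_host` shows how a different storage policy or host count would shift the per-disk-group figure.
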
Table 3. Design Decisions on the vSAN Disk Configuration

Decision ID: SDDC-WLD-VI-SDS-003

Design Decision: Use a 600 GB or greater flash-based drive for the cache tier in each disk group.

Design Justification: Provides enough cache for both hybrid and all-flash vSAN configurations to buffer I/O and ensure disk group performance.

Design Implication: Larger flash disks can increase initial host cost.

Decision ID: SDDC-WLD-VI-SDS-004

Design Decision: Have at least 4 TB of flash-based drives for the capacity tier in each disk group.

Design Justification: Provides enough capacity for the NSX-T Edge nodes and tenant workloads with a minimum of 10% caching, 30% of overhead, and 20% growth when the number of primary failures to tolerate is 1.

Design Implication: None.

vSAN Hardware Considerations

For the vSAN ReadyNode server hardware selected for this design, see Design Decisions on Server Hardware for ESXi.