This design uses VMware vSAN to implement software-defined storage for the management clusters. By using vSAN, you have a high level of control over the storage subsystem.

vSAN is hyper-converged storage software that is fully integrated with the hypervisor. vSAN creates a cluster of server hard disk drives and solid state drives, and presents a flash-optimized, highly resilient, shared storage datastore to ESXi hosts and virtual machines. By using vSAN storage policies, you can control capacity, performance, and availability on a per-virtual-machine basis.

Requirements and Dependencies

The software-defined storage module has the following requirements and options.

Requirement Category

Requirements

Number of hosts

  • Minimum of 3 ESXi hosts providing storage resources to the vSAN cluster.

vSAN configuration

vSAN is configured as hybrid storage or all-flash storage. 

  • A vSAN hybrid storage configuration requires both magnetic devices and flash caching devices.

  • An all-flash vSAN configuration requires flash devices for both the caching and capacity tiers.  

Requirements for individual hosts that provide storage resources

  • Minimum of one SSD. The SSD flash cache tier should be at least 10% of the size of the HDD capacity tier.

  • Minimum of two HDDs for a hybrid configuration, or two additional flash devices for an all-flash configuration.

  • RAID controller that is compatible with vSAN. 

  • 10 Gbps network for vSAN traffic.

  • vSphere High Availability host isolation response set to power off virtual machines. This setting prevents split-brain conditions if host isolation or a network partition occurs. In a split-brain condition, the same virtual machine might be powered on by two ESXi hosts at the same time.

    See design decision SDDC-VI-VC-012 for more details. 

Hybrid Mode and All-Flash Mode

vSAN has two modes of operation: all-flash and hybrid.

Hybrid Mode

In a hybrid storage architecture, vSAN pools server-attached capacity devices (in this case magnetic devices) and caching devices, typically SSDs or PCI-e devices, to create a distributed shared datastore.

All-Flash Mode

All-flash storage uses flash-based devices (SSD or PCI-e) as a write cache while other flash-based devices provide high endurance for capacity and data persistence.

Table 1. Design Decisions on the vSAN Mode

| Decision ID | Design Decision | Design Justification | Design Implication |
|---|---|---|---|
| SDDC-PHY-STO-001 | Configure vSAN in hybrid mode in the management cluster. | Provides performance that is good enough for the VMs in the management cluster that are hosted on vSAN. Management nodes do not require the performance or expense of an all-flash vSAN configuration. | vSAN hybrid mode does not provide the potential performance or additional capabilities such as deduplication of an all-flash configuration. |

Sizing Storage

You usually base sizing on the requirements of the IT organization. However, this design provides calculations that are based on a single-region implementation and are then applied on a per-region basis. In this way, you can handle storage in a dual-region deployment that has failover capabilities enabled.

This sizing is calculated according to a certain node configuration per region. Although VMware Validated Design has enough memory capacity to handle N-1 host failures, and uses thin-provisioned swap for the vSAN configuration, the potential thin-provisioned swap capacity is factored into the calculation.

Table 2. Management Layers and Hardware Sizes

| Category | Quantity | Resource Type | Capacity Consumption |
|---|---|---|---|
| Physical Infrastructure (ESXi) | 4 | Memory | 768 GB |
| Virtual Infrastructure | 16 | Disk | 1,200 GB |
| | | Swap | 108 GB |
| Operations Management | 10 | Disk | 6,272 GB |
| | | Swap | 170 GB |
| Cloud Management | 14 | Disk | 1,200 GB |
| | | Swap | 144 GB |
| Business Continuity | 2 | Disk | 58 GB |
| | | Swap | 8 GB |
| Total | 42 management virtual machines, 4 ESXi hosts | Disk | 8,730 GB |
| | | Swap | 430 GB |
| | | Memory | 768 GB |
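The totals in Table 2 can be cross-checked by summing the per-layer figures. The following is a minimal sketch; the dictionary layout and the function name `table_totals` are illustrative assumptions, and only the numbers come from the table.

```python
# Per-layer figures copied from Table 2 (capacities in GB).
# The dictionary layout and key names are illustrative, not part of the design.
layers = {
    "Virtual Infrastructure": {"vms": 16, "disk_gb": 1200, "swap_gb": 108},
    "Operations Management": {"vms": 10, "disk_gb": 6272, "swap_gb": 170},
    "Cloud Management": {"vms": 14, "disk_gb": 1200, "swap_gb": 144},
    "Business Continuity": {"vms": 2, "disk_gb": 58, "swap_gb": 8},
}


def table_totals(layers):
    """Sum the VM count, disk, and swap figures across all management layers."""
    return {
        key: sum(layer[key] for layer in layers.values())
        for key in ("vms", "disk_gb", "swap_gb")
    }


totals = table_totals(layers)
print(totals)  # {'vms': 42, 'disk_gb': 8730, 'swap_gb': 430}
```

The result agrees with the Total row: 42 management virtual machines, 8,730 GB of disk, and 430 GB of swap.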

Derive the storage space that is required for the capacity tier according to the following calculations. For vSAN memory consumption by management ESXi hosts, see VMware Knowledge Base article 2113954.

[Static Base Consumption + (Number of Disk Groups * (Static Disk Group Base Consumption + (Static Flash Device Memory Overhead Per GB * Flash Device Capacity))) + (Static Capacity Disk Base Consumption * Number of Capacity Disks)] * Host Quantity = vSAN Memory Consumption

[5426 MB + (2 Disk Groups * (636 MB + (8 MB * 150 GB Flash Storage))) + (70 MB * 3 Magnetic Disks)] * 4 ESXi Hosts

[5426 MB + (2 Disk Groups * (636 MB + 1200 MB)) + 210 MB] * 4 ESXi Hosts = [5426 MB + 3882 MB] * 4 ESXi Hosts * 10^-3 GB/MB ≈ 38 GB vSAN Memory Consumption
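The memory calculation above can be sketched as a small function. The function and parameter names are illustrative assumptions. The constants are the values used in the worked example, and rounding up to the next whole GB is an assumption made here to reproduce the 38 GB figure.

```python
import math


def vsan_memory_mb_per_host(disk_groups, flash_gb_per_group, capacity_disks,
                            base_mb=5426, disk_group_base_mb=636,
                            flash_overhead_mb_per_gb=8, capacity_disk_mb=70):
    """Per-host vSAN memory consumption in MB, following the formula above.

    Default constants are the values from the worked example in this section.
    """
    per_group_mb = disk_group_base_mb + flash_overhead_mb_per_gb * flash_gb_per_group
    return base_mb + disk_groups * per_group_mb + capacity_disk_mb * capacity_disks


per_host_mb = vsan_memory_mb_per_host(disk_groups=2, flash_gb_per_group=150,
                                      capacity_disks=3)
cluster_gb = per_host_mb * 4 / 1000   # 4 ESXi hosts, 1 GB taken as 1000 MB

print(per_host_mb)            # 9308
print(cluster_gb)             # 37.232
print(math.ceil(cluster_gb))  # 38
```

The unrounded cluster figure is about 37.2 GB, so the 38 GB used in the later steps corresponds to rounding up.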

Derive the consumption of storage space by the management virtual machines according to the following calculations. See VMware vSAN Design and Sizing Guide.

VM Raw Storage Requirements (without FTT) + VM Swap (without FTT) = Virtual Machine Raw Capacity Requirements

Virtual Machine Raw Capacity Requirements * FTT Multiplier (2 for FTT=1 with RAID 1) = Final Virtual Machine Raw Capacity Requirements

8,730 GB Disk + 430 GB Swap = 9,160 GB Virtual Machine Raw Capacity Requirements

9,160 GB Virtual Machine Raw Capacity Requirements * 2 (FTT=1, RAID1) = 18,320 GB Final Virtual Machine Raw Capacity Requirements
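As a rough sketch of the two steps above: the function name is an illustrative assumption, and the multiplier is modeled as FTT + 1 on the assumption that RAID 1 mirroring keeps one extra full copy per tolerated failure, which yields the factor of 2 used for FTT=1.

```python
def vm_raw_capacity_gb(disk_gb, swap_gb, ftt=1):
    """Raw capacity for the management VMs, including RAID 1 mirror copies.

    Assumes RAID 1 mirroring, where FTT failures to tolerate means
    FTT + 1 full copies of the data.
    """
    without_ftt = disk_gb + swap_gb
    return without_ftt * (ftt + 1)


print(vm_raw_capacity_gb(8730, 430, ftt=0))  # 9160  (no mirroring)
print(vm_raw_capacity_gb(8730, 430, ftt=1))  # 18320 (FTT=1, RAID 1)
```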

Derive the requirements for total storage space for the capacity tier according to the following calculations:

vSAN Memory Consumption + Final Virtual Machine Raw Capacity Requirements = Total Raw Storage Capacity

Total Raw Storage Capacity * (1 + 30% Slack Overhead) * (1 + 1% On-disk Format Overhead) * (1 + 0.12% Checksum Overhead) = Raw Unformatted Storage Capacity

OR

Total Raw Storage Capacity * (1 + 30% Slack Overhead) * (1 + 1% On-disk Format Overhead) * (1 + 0.12% Checksum Overhead) * (1 + 20% Estimated Growth) = Raw Unformatted Storage Capacity (with 20% Growth Capacity)

Raw Unformatted Storage Capacity / ESXi Quantity = Final Raw Storage Capacity per Host

Based on the calculations for the vSAN memory consumption and the management virtual machine consumption, calculate the final raw storage capacity that is required for the cluster and per ESXi host.

38 GB vSAN Memory Consumption + 18,320 GB VM Raw Capacity = 18,358 GB Total Raw Storage Capacity

18,358 GB Total Raw Storage Capacity * 1.30 (30% Slack Overhead) * 1.01 (1% On-disk Format Overhead) * 1.0012 (0.12% Checksum Overhead) ≈ 24,132 GB ≈ 24 TB Raw Unformatted Storage Capacity
24 TB Raw Unformatted Storage Capacity / 4 ESXi hosts ≈ 6 TB Final Raw Storage Capacity per host

18,358 GB Total Raw Storage Capacity * 1.30 (30% Slack Overhead) * 1.01 (1% On-disk Format Overhead) * 1.0012 (0.12% Checksum Overhead) * 1.20 (20% Estimated Growth) ≈ 28,959 GB ≈ 29 TB Raw Unformatted Storage Capacity (with 20% Growth Capacity)
29 TB Raw Unformatted Storage Capacity / 4 ESXi hosts ≈ 8 TB Final Raw Storage Capacity per host (rounded up)
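The capacity-tier arithmetic can be reproduced with the sketch below. It treats each overhead percentage as an added fraction, that is, a multiplication by 1 plus the percentage, which is the reading that yields the 24,132 GB and 28,959 GB figures. The function and parameter names are illustrative assumptions.

```python
def raw_unformatted_gb(total_raw_gb, slack=0.30, on_disk_format=0.01,
                       checksum=0.0012, growth=0.0):
    """Apply each overhead as a multiplier of (1 + fraction)."""
    return (total_raw_gb
            * (1 + slack)
            * (1 + on_disk_format)
            * (1 + checksum)
            * (1 + growth))


total_raw_gb = 38 + 18320  # vSAN memory consumption + VM raw capacity

base_gb = raw_unformatted_gb(total_raw_gb)
growth_gb = raw_unformatted_gb(total_raw_gb, growth=0.20)

print(total_raw_gb)    # 18358
print(int(base_gb))    # 24132
print(int(growth_gb))  # 28959
```

Dividing the rounded 24 TB and 29 TB figures by four hosts gives 6 TB and 7.25 TB per host; the 8 TB per-host value corresponds to rounding the latter up to a whole terabyte.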

Derive the storage space that is required for the caching tier according to the following calculation:

Raw Unformatted Storage Capacity * 50% * 10% = Total Flash Device Capacity

Total Flash Device Capacity / ESXi Quantity = Final Flash Device Capacity per Host

24 TB Raw Unformatted Storage Capacity * 50% * 10% Cache Required ≈ 1.2 TB Flash Device Capacity
1.2 TB Flash Device Storage Capacity / 4 ESXi Hosts ≈ 300 GB of Flash Device Capacity per Host

29 TB Raw Unformatted Storage Capacity (with 20% Growth Capacity) * 50% * 10% Cache Required ≈ 1.5 TB Flash Device Capacity
1.5 TB Flash Device Storage Capacity / 4 ESXi Hosts ≈ 400 GB of Flash Device Capacity per Host
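
The caching-tier calculation can be sketched in the same way. The names below are illustrative assumptions. In particular, `scaling_fraction` stands for the 50% factor in the formula, which this section does not name, and `cache_fraction` for the 10% cache requirement.

```python
def flash_capacity_tb(raw_unformatted_tb, scaling_fraction=0.50,
                      cache_fraction=0.10):
    """Total flash device capacity in TB for the caching tier."""
    return raw_unformatted_tb * scaling_fraction * cache_fraction


def per_host_gb(total_tb, hosts=4):
    """Split a cluster-wide TB figure evenly across hosts, in GB (1 TB = 1000 GB)."""
    return total_tb * 1000 / hosts


for raw_tb in (24, 29):  # without and with 20% growth capacity
    total_tb = flash_capacity_tb(raw_tb)
    print(raw_tb, round(total_tb, 2), round(per_host_gb(total_tb), 1))
```

For 24 TB this gives 1.2 TB in total and 300 GB per host. For 29 TB the unrounded values are 1.45 TB and about 362.5 GB per host, so the 1.5 TB and 400 GB figures above reflect rounding up at each step.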

Table 3. Design Decisions on the vSAN Disk Configuration

| Decision ID | Design Decision | Design Justification | Design Implication |
|---|---|---|---|
| SDDC-PHY-STO-002 | In the management cluster, for each host, allocate the following storage configuration: for the caching tier, provide 300 GB or more of SSD storage; for the capacity tier, provide 6 TB or more of magnetic HDD storage. | Provides enough capacity for the management VMs with a minimum of 10% flash-based caching and 30% of overhead. | When using only a single disk group, you limit the amount of striping (performance) capability and increase the size of the fault domain. |