You can size the capacity of a vSAN datastore to accommodate the virtual machine (VM) files in the cluster and to handle failures and maintenance operations.
Use this formula to determine the raw capacity of a vSAN datastore. Multiply the total number of disk groups in the cluster by the size of the capacity devices in those disk groups. Subtract the overhead required by the vSAN on-disk format.
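The raw capacity formula can be sketched in a few lines of Python. This is only an illustration: the function name, the parameter names, and the 2 percent default overhead are assumptions made for the example, not values taken from a vSAN API. The sketch also assumes that every disk group holds the same total size of capacity devices.

```python
def raw_capacity_gb(disk_groups, capacity_per_group_gb, overhead_fraction=0.02):
    """Estimate raw vSAN datastore capacity in GB.

    disk_groups           -- total number of disk groups in the cluster
    capacity_per_group_gb -- combined size of the capacity devices in one
                             disk group (assumed identical for every group)
    overhead_fraction     -- on-disk format overhead as a fraction of capacity
                             (assumed value; adjust for your format version)
    """
    total_gb = disk_groups * capacity_per_group_gb
    # Subtract the share of capacity consumed by the on-disk format.
    return total_gb - total_gb * overhead_fraction


# Hypothetical cluster: 8 disk groups, each with 8000 GB of capacity devices.
print(round(raw_capacity_gb(8, 8000)))
```

Cache devices are left out of the calculation because the formula counts only capacity devices.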
Primary Level of Failures to Tolerate
When you plan the capacity of the vSAN datastore, in addition to the number of virtual machines and the size of their VMDK files, you must consider the Primary level of failures to tolerate and the Failure tolerance method attributes of the virtual machine storage policies for the cluster.
The Primary level of failures to tolerate has an important role when you plan and size storage capacity for vSAN. Based on the availability requirements of a virtual machine, the setting might result in doubled consumption or more, compared with the consumption of a virtual machine and its individual devices.
For example, if the Failure tolerance method is set to RAID-1 (Mirroring) - Performance and the Primary level of failures to tolerate (PFTT) is set to 1, virtual machines can use about 50 percent of the raw capacity. If the PFTT is set to 2, the usable capacity is about 33 percent. If the PFTT is set to 3, the usable capacity is about 25 percent.
However, if the Failure tolerance method is set to RAID-5/6 (Erasure Coding) - Capacity and the PFTT is set to 1, virtual machines can use about 75 percent of the raw capacity. If the PFTT is set to 2, the usable capacity is about 67 percent. For more information about RAID-5/6, see Administering VMware vSAN.
For information about the attributes in a vSAN storage policy, see Administering VMware vSAN.
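The percentages in this section can be collected in a small lookup helper. The following Python sketch is illustrative only: the function name and the method labels "raid1" and "raid5_6" are hypothetical. The RAID-1 values follow from the number of replicas (PFTT + 1), and the RAID-5/6 values are copied from the approximate figures stated above.

```python
# Approximate usable fractions for RAID-5/6 (Erasure Coding), as stated in the text.
RAID5_6_USABLE = {1: 0.75, 2: 0.67}


def usable_fraction(method, pftt):
    """Return the approximate share of raw capacity that VMs can use.

    method -- "raid1" or "raid5_6" (hypothetical labels for this sketch)
    pftt   -- Primary level of failures to tolerate
    """
    if pftt < 0:
        raise ValueError("PFTT cannot be negative")
    if method == "raid1":
        # Mirroring keeps PFTT + 1 full replicas of each object.
        return 1 / (pftt + 1)
    if method == "raid5_6":
        if pftt not in RAID5_6_USABLE:
            raise ValueError("RAID-5/6 is described here only for PFTT 1 or 2")
        return RAID5_6_USABLE[pftt]
    raise ValueError(f"unknown failure tolerance method: {method}")


for pftt in (1, 2, 3):
    print("RAID-1, PFTT", pftt, "->", round(usable_fraction("raid1", pftt) * 100), "percent")
```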
Calculating Required Capacity
Plan the capacity required for the virtual machines in a cluster with RAID 1 mirroring based on the following criteria:
- Calculate the storage space that the virtual machines in the vSAN cluster are expected to consume.
expected overall consumption = number of VMs in the cluster * expected percentage of consumption per VMDK
- Consider the Primary level of failures to tolerate attribute configured in the storage policies for the virtual machines in the cluster. This attribute directly impacts the number of replicas of a VMDK file on hosts in the cluster.
datastore capacity = expected overall consumption * (PFTT + 1)
- Estimate the overhead requirement of the vSAN on-disk format.
  - On-disk format version 3.0 and later adds an extra overhead, typically no more than 1-2 percent capacity per device. Deduplication and compression with software checksum enabled require extra overhead of approximately 6.2 percent capacity per device.
  - On-disk format version 2.0 adds an extra overhead, typically no more than 1-2 percent capacity per device.
  - On-disk format version 1.0 adds an extra overhead of approximately 1 GB per capacity device.
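The criteria above can be combined into a rough worked calculation for RAID-1 mirroring. The Python sketch below is a simplified illustration, not an official sizing tool. It assumes that the expected consumption per VM is supplied directly in GB and that the on-disk format overhead is a percentage of capacity (format version 2.0 or later). All names are hypothetical.

```python
def required_datastore_capacity_gb(vm_count, consumption_per_vm_gb, pftt,
                                   overhead_fraction=0.0):
    """Estimate the datastore capacity needed with RAID-1 mirroring.

    vm_count              -- number of VMs in the cluster
    consumption_per_vm_gb -- expected consumption per VM in GB (assumed input)
    pftt                  -- Primary level of failures to tolerate
    overhead_fraction     -- on-disk format overhead as a fraction of capacity
    """
    expected_consumption = vm_count * consumption_per_vm_gb
    # Each VMDK is stored as PFTT + 1 replicas.
    replicated = expected_consumption * (pftt + 1)
    # Increase the result so that the overhead still fits on the devices.
    return replicated / (1 - overhead_fraction)


# Hypothetical cluster: 100 VMs, 50 GB expected consumption each, PFTT = 1.
print(required_datastore_capacity_gb(100, 50.0, 1))
print(round(required_datastore_capacity_gb(100, 50.0, 1, overhead_fraction=0.02)))
```

The result covers only VM data and format overhead. It does not include the free space recommended for rebalancing, rebuilds, or policy changes.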
Capacity Sizing Guidelines
- Keep at least 30 percent unused space to prevent vSAN from rebalancing the storage load. vSAN rebalances the components across the cluster whenever the consumption on a single capacity device reaches 80 percent or more. The rebalance operation might impact the performance of applications. To avoid these issues, keep storage consumption to less than 70 percent.
- Plan extra capacity to handle any potential failure or replacement of capacity devices, disk groups, and hosts. When a capacity device is not reachable, vSAN recovers the components from another device in the cluster. When a flash cache device fails or is removed, vSAN recovers the components from the entire disk group.
- Reserve extra capacity to make sure that vSAN recovers components after a host failure or when a host enters maintenance mode. For example, provision hosts with enough capacity so that you have sufficient free capacity left for components to rebuild after a host failure or during maintenance. This extra space is important when you have more than three hosts, so you have sufficient free capacity to rebuild the failed components. If a host fails, the rebuilding takes place on the storage available on another host, so that another failure can be tolerated. However, in a three-host cluster, vSAN does not perform the rebuild operation if the Primary level of failures to tolerate is set to 1 because when one host fails, only two hosts remain in the cluster. To tolerate a rebuild after a failure, you must have at least three surviving hosts.
- Provide enough temporary storage space for changes in the vSAN VM storage policy. When you dynamically change a VM storage policy, vSAN might create a new RAID tree layout of the object. When vSAN instantiates and synchronizes a new layout, the object might consume extra space temporarily. Keep some temporary storage space in the cluster to handle such changes.
- If you plan to use advanced features, such as software checksum or deduplication and compression, reserve extra capacity to handle the operational overhead.
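Two of these guidelines lend themselves to a quick arithmetic check: the 70 percent consumption ceiling and the free space needed to rebuild after a host failure. The Python sketch below is a deliberately simplified model with hypothetical names. It ignores component placement rules and only compares totals. The default of three surviving hosts reflects the case described above where the Primary level of failures to tolerate is set to 1.

```python
def min_capacity_for_slack(consumed_gb, max_utilization=0.70):
    """Smallest datastore capacity that keeps consumption at or below the ceiling."""
    return consumed_gb / max_utilization


def can_rebuild_after_host_failure(host_capacity_gb, host_used_gb, failed_host,
                                   min_surviving_hosts=3):
    """Check whether surviving hosts have enough free space to absorb the
    data of one failed host (simplified: totals only, no placement rules)."""
    surviving = [i for i in range(len(host_capacity_gb)) if i != failed_host]
    if len(surviving) < min_surviving_hosts:
        return False
    free_gb = sum(host_capacity_gb[i] - host_used_gb[i] for i in surviving)
    return free_gb >= host_used_gb[failed_host]


# Hypothetical four-host cluster, 10000 GB per host, 6000 GB used on each.
print(can_rebuild_after_host_failure([10000] * 4, [6000] * 4, failed_host=0))
# The same load on three hosts leaves only two survivors, so no rebuild.
print(can_rebuild_after_host_failure([10000] * 3, [6000] * 3, failed_host=0))
```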
Considerations for Virtual Machine Objects
When you plan the storage capacity in the vSAN datastore, consider the space required in the datastore for the VM home namespace objects, snapshots, and swap files.
- VM Home Namespace. You can assign a storage policy specifically to the home namespace object for a virtual machine. To prevent unnecessary allocation of capacity and cache storage, vSAN applies only the Primary level of failures to tolerate and the Force provisioning settings from the policy on the VM home namespace. Plan storage space to meet the requirements for a storage policy assigned to a VM Home Namespace whose Primary level of failures to tolerate is greater than 0.
- Snapshots. Delta devices inherit the policy of the base VMDK file. Plan extra space according to the expected size and number of snapshots, and to the settings in the vSAN storage policies.
The space that is required can vary. It depends on how often the virtual machine changes data and how long a snapshot is attached to the virtual machine.
- Swap files. vSAN uses an individual storage policy for the swap files of virtual machines. The policy tolerates a single failure, defines no striping and no read cache reservation, and enables force provisioning.