Virtual SAN can perform block-level deduplication and compression to save storage space. When you enable deduplication and compression on a Virtual SAN all-flash cluster, redundant data within each disk group is reduced.
Deduplication removes redundant data blocks, whereas compression removes additional redundant data within each data block. These techniques work together to reduce the amount of space required to store the data. Virtual SAN applies deduplication and then compression as it moves data from the cache tier to the capacity tier.
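The ordering described above (deduplicate first, then compress what remains) can be illustrated with a minimal conceptual sketch. This is not Virtual SAN's implementation: the 4 KB block size, the SHA-256 fingerprint, the zlib compressor, and every function name are assumptions chosen only to make the idea concrete.

```python
import hashlib
import zlib

BLOCK_SIZE = 4096  # assumed block size, for illustration only


def dedupe_then_compress(data: bytes, block_size: int = BLOCK_SIZE):
    """Split data into fixed-size blocks, keep one copy of each unique
    block, and compress only the blocks that are actually stored.

    Returns (store, refs): `store` maps a block fingerprint to its
    compressed bytes, `refs` is the ordered list of fingerprints needed
    to rebuild the original data.
    """
    store = {}
    refs = []
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        fingerprint = hashlib.sha256(block).hexdigest()
        if fingerprint not in store:
            # Step 1 (deduplication) let this block through as unique;
            # step 2 (compression) shrinks it before it is stored.
            store[fingerprint] = zlib.compress(block)
        refs.append(fingerprint)
    return store, refs


def restore(store, refs) -> bytes:
    """Rebuild the original data from the stored blocks."""
    return b"".join(zlib.decompress(store[f]) for f in refs)


def stored_size(store) -> int:
    """Total bytes held after deduplication and compression."""
    return sum(len(compressed) for compressed in store.values())
```

In this sketch, three identical blocks followed by one different block produce four references but only two stored blocks, and each stored block is further reduced by compression.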
You enable deduplication and compression as a cluster-wide setting, but Virtual SAN applies them on a per-disk-group basis. Redundant data within a particular disk group is reduced to a single copy.
You can enable deduplication and compression when you create a new Virtual SAN all-flash cluster or when you edit an existing Virtual SAN all-flash cluster. For more information about creating and editing Virtual SAN clusters, see Enabling Virtual SAN.
When you enable or disable deduplication and compression, Virtual SAN performs a rolling reformat of every disk group on every host. Depending on the data stored on the Virtual SAN datastore, this process might take a long time. Avoid performing these operations frequently. If you plan to disable deduplication and compression, you must first verify that enough physical capacity is available to hold your data at its full, unreduced size.
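The capacity check before disabling can be reasoned about with simple arithmetic: once space savings are gone, the data occupies its full logical size, so that size must fit in the physical capacity. The following is a minimal sketch of that check. The function name, the parameters, and the optional headroom fraction are hypothetical and are not part of any Virtual SAN tool or API.

```python
def has_capacity_to_disable(logical_gb: float,
                            physical_capacity_gb: float,
                            headroom_fraction: float = 0.0) -> bool:
    """Return True if data of logical size `logical_gb` would fit in
    `physical_capacity_gb` with no deduplication or compression.

    `headroom_fraction` is an assumed safety margin (for example 0.3
    keeps 30% of capacity free); it defaults to no margin.
    """
    if logical_gb < 0 or physical_capacity_gb < 0:
        raise ValueError("sizes must be non-negative")
    if not 0.0 <= headroom_fraction < 1.0:
        raise ValueError("headroom_fraction must be in [0, 1)")
    usable_gb = physical_capacity_gb * (1.0 - headroom_fraction)
    return logical_gb <= usable_gb
```

For example, 3000 GB of logical data fits in 4000 GB of physical capacity, but not if 30% of that capacity is held back as headroom.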
How to Manage Disks in a Cluster with Deduplication and Compression
Consider the following guidelines when managing disks in a cluster with deduplication and compression enabled.
Avoid adding disks to a disk group incrementally. For more efficient deduplication and compression, consider adding a new disk group to increase cluster storage capacity.
When you add a new disk group manually, add all of the capacity disks at the same time.
You cannot remove a single disk from a disk group. You must remove the entire disk group to make modifications.
A single disk failure causes the entire disk group to fail.
Verifying Space Savings from Deduplication and Compression
The amount of storage reduction from deduplication and compression depends on many factors, including the type of data stored and the number of duplicate blocks. Larger disk groups tend to provide a higher deduplication ratio. You can check the results of deduplication and compression by viewing the Deduplication and Compression Overview in the Virtual SAN Capacity monitor.
The Deduplication and Compression Overview appears in the vSphere Web Client when you monitor Virtual SAN capacity. The Used Before space indicates the logical space required before applying deduplication and compression, while the Used After space indicates the physical space used after applying them. The Used After space also displays an overview of the amount of space saved and the Deduplication and Compression ratio.
The Deduplication and Compression ratio compares the logical (Used Before) space required to store data before applying deduplication and compression with the physical (Used After) space required afterward. Specifically, the ratio is the Used Before space divided by the Used After space. For example, if the Used Before space is 3 GB and the Used After space is 1 GB, the Deduplication and Compression ratio is 3x.
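The ratio arithmetic can be expressed as a short sketch. The function names here are hypothetical helpers for illustration and do not correspond to any Virtual SAN API; the inputs are the Used Before and Used After values as read from the Capacity monitor.

```python
def dedup_compression_ratio(used_before_gb: float, used_after_gb: float) -> float:
    """Used Before (logical) divided by Used After (physical)."""
    if used_after_gb <= 0:
        raise ValueError("used_after_gb must be greater than zero")
    return used_before_gb / used_after_gb


def space_saved_gb(used_before_gb: float, used_after_gb: float) -> float:
    """Space saved is the difference between logical and physical usage."""
    return used_before_gb - used_after_gb


def format_ratio(used_before_gb: float, used_after_gb: float) -> str:
    """Render the ratio in the 'Nx' style used in the example above."""
    return f"{dedup_compression_ratio(used_before_gb, used_after_gb):g}x"
```

With the values from the example, `format_ratio(3, 1)` returns `"3x"` and `space_saved_gb(3, 1)` returns `2`.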
When deduplication and compression are enabled on the Virtual SAN cluster, it might take several minutes for capacity updates to be reflected in the Capacity monitor as disk space is reclaimed and reallocated.