What's in the Release Notes

The release notes cover the following topics:

What's New
VMware vSAN Community
Upgrades for This Release
Limitations
Known Issues

What's New

VMware vSAN 6.6.1 introduces the following new features and enhancements:

vSphere Update Manager build recommendations for vSAN. Update Manager can scan the vSAN cluster and recommend host baselines that include updates, patches, and extensions. It manages recommended baselines, validates the support status from vSAN HCL, and downloads the correct ESXi ISO images from VMware.

vSAN requires Internet access to generate build recommendations. If your vSAN cluster uses a proxy to connect to the Internet, vSAN can generate recommendations for patch upgrades, but not for major upgrades.

Performance diagnostics. The performance diagnostics tool analyzes previously executed benchmark tests. It detects issues, suggests remediation steps, and provides supporting performance graphs for further insight. Performance diagnostics requires participation in the Customer Experience Improvement Program (CEIP).

Increased support for locator LEDs on vSAN disks. Gen-9 HPE controllers in pass-through mode now support vSAN activation of locator LEDs. Blinking LEDs help to identify and isolate specific drives.

VMware vSAN Community

Use the vSAN Community Web site to provide feedback and request assistance with any problems you find while using vSAN.

Upgrades for This Release

For instructions about upgrading vSAN, see the VMware vSAN 6.6.1 documentation.

Upgrading the On-disk Format for Hosts with Limited Capacity

During an upgrade of the vSAN on-disk format, a disk group evacuation is performed. The disk group is removed and upgraded to on-disk format version 5.0, and the disk group is added back to the cluster. For two-node or three-node clusters, or clusters without enough capacity to evacuate each disk group, select Allow Reduced Redundancy from the vSphere Web Client. You also can use the following RVC command to upgrade the on-disk format: vsan.ondisk_upgrade --allow-reduced-redundancy

When you allow reduced redundancy, your VMs are unprotected for the duration of the upgrade, because this method does not evacuate data to the other hosts in the cluster. It removes each disk group, upgrades the on-disk format, and adds the disk group back to the cluster. All objects remain available, but with reduced redundancy.

If you enable deduplication and compression during the upgrade to vSAN 6.6.1, you can select Allow Reduced Redundancy from the vSphere Web Client.

Using vSphere Update Manager with Stretched Clusters

Using vSphere Update Manager to upgrade hosts in parallel might result in the witness host being upgraded in parallel with one of the data hosts in a stretched cluster. To avoid upgrade problems, do not configure Update Manager to upgrade a witness host in parallel with the data hosts in a stretched cluster. Upgrade the witness host after all data hosts have been successfully upgraded and have exited maintenance mode.
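
The ordering rule above can be sketched as a small shell helper. This is only an illustration, not an Update Manager feature: the function name upgrade_order and the host names are hypothetical, and the helper only prints an order in which every data host precedes the witness host.

```shell
# Print hosts in a safe upgrade order: all data hosts first, witness host last.
# Usage: upgrade_order WITNESS HOST...
upgrade_order() {
    witness=$1
    shift
    for host in "$@"; do
        # Skip the witness if it also appears in the host list.
        if [ "$host" != "$witness" ]; then
            printf '%s\n' "$host"
        fi
    done
    # The witness host always comes last.
    printf '%s\n' "$witness"
}

# Hypothetical host names, for illustration only.
upgrade_order witness.example.com esx01.example.com witness.example.com esx02.example.com
```

Upgrade the witness host only after every data host listed before it has been upgraded and has exited maintenance mode.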

Verifying Health Check Failures During Upgrade

During upgrades of the vSAN on-disk format, the Physical Disk Health – Metadata Health check can fail intermittently. These failures can occur if the destaging process is slow, most likely because vSAN must allocate physical blocks on the storage devices. Before you take action, verify the status of this health check after the period of high activity, such as multiple virtual machine deployments, is complete. If the health check is still red, the warning is valid. If the health check is green, you can ignore the previous warning. For more information, see Knowledge Base article 2108690.

Limitations

For information about maximum configuration limits for the vSAN 6.6.1 release, see the Configuration Maximums documentation.

Known Issues

The following issues are known to occur in vSAN 6.6.1:

Custom ISOs are not supported in vSAN build recommendations for Update Manager
vSAN 6.6.1 does not support custom ISOs in its build recommendations for vSphere Update Manager. You cannot use custom ISOs as part of a vSAN system baseline. Baselines that use custom ISOs are considered Non-Compliant.

Workaround: None
When reconfiguring a vSAN cluster in a new vCenter Server, error messages appear
When you add an existing vSAN host to a new cluster in a new vCenter Server, you might see one of the following error messages:

Required field "objectSet" not provided (not @optional)

The request refers to an unexpected or unknown type.

These transient messages appear only when you monitor the process from the vSAN health page (Monitor > vSAN > Health). You can ignore these messages.

Workaround: None
Reconfiguring an existing stretched cluster under a new vCenter Server causes vSAN to issue a health check warning
When rebuilding a current stretched cluster under a new vCenter Server, the vSAN cluster health check is red. The following message appears: vSphere cluster members match vSAN cluster members

Workaround: Use the following procedure to configure the stretched cluster.
1. Use SSH to log in to the witness host.
2. Decommission the disks on the witness host. Run the following command: esxcli vsan storage remove -s "SSD UUID"
3. Force the witness host to leave the cluster. Run the following command: esxcli vsan cluster leave
4. Reconfigure the stretched cluster from the new vCenter Server (Configure > vSAN > Fault Domains & Stretched Cluster).
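
As a rough sketch, steps 2 and 3 of the procedure above could be collected in a helper that prints the two commands for review instead of running them. The function name witness_leave_commands and the placeholder UUID are assumptions made for this example. The esxcli commands are the ones given in the steps.

```shell
# Print (do not run) the commands for steps 2 and 3 of the workaround.
# Usage: witness_leave_commands SSD_UUID
witness_leave_commands() {
    ssd_uuid=$1
    if [ -z "$ssd_uuid" ]; then
        echo "usage: witness_leave_commands SSD_UUID" >&2
        return 1
    fi
    # Step 2: decommission the disks on the witness host.
    printf 'esxcli vsan storage remove -s "%s"\n' "$ssd_uuid"
    # Step 3: force the witness host to leave the cluster.
    printf 'esxcli vsan cluster leave\n'
}

# Placeholder UUID; replace it with the SSD UUID from the witness host.
witness_leave_commands "SSD-UUID-PLACEHOLDER"
```

Review the printed commands, then run them in the SSH session on the witness host before you continue with step 4.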
During vCenter Server replacement, the esxcli vsan health cluster list command displays health issues
During replacement of vCenter Server, the following command incorrectly displays health issues: esxcli vsan health cluster list. The command might report issues with network connectivity, physical disk health retrieval, and vSAN CLOMD liveness. Health checks displayed in vCenter Server report no issues.

Workaround: After vCenter Server replacement is complete, go to Cluster > Monitor > vSAN > Health. Select Cluster > vCenter state is authoritative, and click Update ESXi configuration.
Reconfiguring a vSAN cluster under a new vCenter Server fails due to vSAN cluster UUID mismatch
When adding hosts to an empty vSAN cluster in parallel, the task might fail. The following error message appears: The vSAN host cannot be moved to the destination cluster: vSAN cluster UUID mismatch...

This problem can occur when you recover an existing vSAN cluster in a new vCenter Server.

Workaround: Add one host to the new vSAN cluster and wait for the task to complete. Then you can add the other hosts sequentially or in parallel.
On an encrypted vSAN cluster, Disk Format Conversion (DFC) occurs when vSAN health service remediates a failed shallow rekey
If a shallow rekey operation fails on an encrypted vSAN cluster, the cluster might be left in an inconsistent state where some hosts use the new KEK while others use the old KEK. The vSAN health service can detect this inconsistency and attempt to remediate it. vSAN performs a Disk Format Conversion (DFC) during the remediation. DFC can take a long time if the vSAN cluster has a large amount of data.

You can reduce the chance of a failed or interrupted shallow rekey operation.
- Make sure all hosts in the cluster are connected and operational. Hosts must not be disconnected, in maintenance mode, or powered off.
- Make sure the health check for KMS connection is green before you begin the shallow rekey.
Workaround: None.
Disk format upgrade fails while vSAN resynchronizes large objects
If the vSAN cluster contains very large objects, the disk format upgrade might fail while the object is resynchronized. You might see the following error message: Failed to convert object(s) on vSAN

vSAN cannot perform the upgrade until the object is resynchronized. You can check the status of the resynchronization (Monitor > vSAN > Resyncing Components) to verify when the process is complete.

Workaround: Wait until no resynchronization is pending, then retry the disk format upgrade.
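
The wait-then-retry pattern in this workaround might be sketched as a polling loop. Everything here is an assumption for illustration: wait_for_no_resync is a hypothetical helper, and pending_components stands in for whatever query you use to count components that are still resynchronizing. These release notes do not define such a command.

```shell
# Usage: wait_for_no_resync INTERVAL_SECONDS MAX_TRIES COMMAND [ARG...]
# COMMAND must print the number of components still resynchronizing.
# Returns 0 when COMMAND reports 0, 1 when the tries run out,
# and 2 when COMMAND itself fails.
wait_for_no_resync() {
    interval=$1
    max_tries=$2
    shift 2
    tries=0
    while [ "$tries" -lt "$max_tries" ]; do
        pending=$("$@") || return 2
        if [ "$pending" -eq 0 ]; then
            return 0
        fi
        tries=$((tries + 1))
        sleep "$interval"
    done
    return 1
}

# Hypothetical stand-in for a real query; it always reports 0 pending.
pending_components() { echo 0; }

# With a real query, a longer interval such as 60 seconds is more sensible.
if wait_for_no_resync 0 3 pending_components; then
    echo "No resynchronization pending; retry the disk format upgrade."
fi
```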
Updated vCenter Server shows deduplication and compression as Not Supported
If the vCenter Server is running 6.5 or earlier software and is in linked mode, the deduplication and compression feature might appear as Not Supported. You cannot enable the feature through vCenter Server. You might also see the following error message while cluster configuration is in progress:

Some elements could not be shown or their information could not be retrieved in time.

This problem is in the vCenter Server interface and does not affect the operation of your vSAN cluster, only the ability to configure deduplication and compression.

Workaround: You can upgrade the vCenter Server to the latest software release, or you can use another interface, such as PowerCLI, to configure deduplication and compression.
Cluster consistency health check fails during deep rekey operation
The deep rekey operation on an encrypted vSAN cluster can take several hours. During the rekey, the following health check might indicate a failure: Cluster configuration consistency. The cluster consistency check does not detect the deep rekey operation, and there might not be a problem.

Workaround: Retest the vSAN cluster consistency health check after the deep rekey operation is complete.
VM OVF deploy fails if DRS is disabled
If you deploy an OVF template on the vSAN cluster, the operation fails if DRS is disabled on the vSAN cluster. You might see a message similar to the following: The operation is not allowed in the current state.

Workaround: Enable DRS on the vSAN cluster before you deploy an OVF template.
vSAN stretched cluster configuration lost after you disable vSAN on a cluster
If you disable vSAN on a stretched cluster, the stretched cluster configuration is not retained. The stretched cluster, witness host, and fault domain configuration is lost.

Workaround: Reconfigure the stretched cluster parameters when you re-enable the vSAN cluster.
Orphaned or inaccessible VMs after total cluster failure
After total cluster failure, some powered off or suspended VMs might become orphaned or inaccessible, especially when vSAN encryption is enabled.

Workaround: Use the following procedure to re-register orphaned or inaccessible VMs.
1. Use RVC to connect to vCenter Server.
2. Navigate to the name of the cluster where orphaned VMs exist and re-register them. For example, if the name of the cluster is "vsan," run the following command: vsan.check_state -ref /localhost/Datacenter/computers/vsan
  Sample output:
  
  vsan.check_state -ref /localhost/Datacenter/computers/vsan
  2017-03-03 18:54:04 +0000: Step 1: Check for inaccessible vSAN objects
  2017-03-03 18:54:10 +0000: Step 1b: Check for inaccessible vSAN objects, again
  2017-03-03 18:54:11 +0000: Step 2: Check for invalid/inaccessible VMs
  2017-03-03 18:54:11 +0000: Step 2b: Check for invalid/inaccessible VMs again
  2017-03-03 18:54:11 +0000: Step 3: Check for VMs for which VC/hostd/vmx are out of sync
  Did not find VMs for which VC/hostd/vmx are out of sync
On-disk format version for witness host is later than version for data hosts
When you change the witness host during an upgrade to vSAN 6.6 and later, the new witness host receives the latest on-disk format version. The on-disk format version of the witness host might be later than the on-disk format version of the data hosts. In this case, the witness host cannot store components.

Workaround: Use the following procedure to change the on-disk format to an earlier version.
1. Delete the disk group on the new witness host.
2. Set the advanced parameter to enable formatting of disk groups with an earlier on-disk format. For more information, see Knowledge Base article 2146221.
3. Recreate a new disk group on the witness host with a vSAN on-disk format version that matches the data hosts.
Powered off VMs appear as inaccessible during witness host replacement
When you change a witness host in a stretched cluster, VMs that are powered off appear as inaccessible in the vSphere Web Client for a brief time. After the process is complete, powered off VMs appear as accessible. All running VMs appear as accessible throughout the process.

Workaround: None
Cannot place hosts in maintenance mode if they have faulty boot media
vSAN cannot place hosts with faulty boot media into maintenance mode. The task to enter maintenance mode might fail with an internal vSAN error, due to the inability to save configuration changes. You might see log events similar to the following: Lost Connectivity to the device xxx backing the boot filesystem

Workaround: Remove disk groups manually from each host, using the Full data evacuation option. Then place the host in maintenance mode.
Health check times out if a host fails
If one host in the cluster fails, the health check might time out. You might see the following message: a back-end task took more than 120 seconds. When the vSAN health service detects that the host has failed, it restarts. The health check automatically resumes after ten minutes.

Workaround: None
Health service does not work if vSAN cluster has ESXi hosts with vSphere 6.0 Update 1 or earlier
The vSAN 6.6 and later health service does not work if the cluster has ESXi hosts running vSphere 6.0 Update 1 or earlier releases.

Workaround: Do not add ESXi hosts with vSphere 6.0 Update 1 or earlier software to a vSAN 6.6 or later cluster.
After stretched cluster failover, VMs on the preferred site register alert: Failed to failover
If the secondary site in a stretched cluster fails, VMs fail over to the preferred site. VMs already on the preferred site might register the following alert: Failed to failover. Ignore this alert. It does not impact the behavior of the failover.

Workaround: None
During network partition, components in the active site appear to be absent
During a network partition in a two-host vSAN cluster or a stretched cluster, the vSphere Web Client might display a view of the cluster from the perspective of the non-active site. You might see active components in the primary site displayed as absent.

Workaround: Use RVC commands to query the state of objects in the cluster. For example: vsan.vm_object_info
vCenter Server Appliance Installer accepts cluster name greater than 80 characters
If you enter a vSAN cluster name that is more than 80 characters, the vCenter Server Appliance Installer accepts the name, but the configuration is invalid. The vCenter Server Appliance fails when it is booted.

Workaround: Enter a vSAN cluster name that is 80 characters or fewer.
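
Because the installer does not enforce the limit, you might check a candidate name before you enter it. The following is a minimal sketch, assuming the 80-character limit counts characters and the name is plain ASCII. The function name valid_cluster_name and the sample name are hypothetical.

```shell
# Return success only if the name is 1 to 80 characters long.
# Usage: valid_cluster_name NAME
valid_cluster_name() {
    name=$1
    [ "${#name}" -ge 1 ] && [ "${#name}" -le 80 ]
}

# Hypothetical cluster name, for illustration only.
if valid_cluster_name "vsan-cluster-01"; then
    echo "name length is acceptable"
else
    echo "name is empty or longer than 80 characters"
fi
```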
vCenter Server Appliance Installer accepts mix of flash and magnetic drives for capacity
The vCenter Server Appliance Installer allows you to select a mix of flash devices and magnetic disks for the capacity tier of a disk group in a new vSAN cluster. The capacity tier of each disk group can support either all-flash or all-magnetic devices.

Workaround: Do not mix flash devices and magnetic disks on the capacity tier of the vSAN cluster.
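
The rule that a capacity tier must be either all-flash or all-magnetic can be expressed as a simple check over a list of device types. This sketch is illustrative only: the function capacity_tier_ok and the labels "flash" and "magnetic" are assumptions, and you must supply the device types from your own inventory.

```shell
# Return success if every capacity device has the same type.
# Usage: capacity_tier_ok TYPE...   (each TYPE is "flash" or "magnetic")
# Returns 1 for a mixed or empty list and 2 for an unknown type.
capacity_tier_ok() {
    [ "$#" -ge 1 ] || return 1
    first=$1
    for dev_type in "$@"; do
        case $dev_type in
            flash|magnetic) ;;
            *) return 2 ;;
        esac
        # Any device that differs from the first one makes the tier mixed.
        [ "$dev_type" = "$first" ] || return 1
    done
    return 0
}

# Example: a disk group whose capacity tier mixes both device types.
if ! capacity_tier_ok flash flash magnetic; then
    echo "capacity tier mixes device types: not a valid configuration"
fi
```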
Temporary Update configuration tasks visible if hosts are disconnected when you change vSAN encryption configurations
When you change the configurations in an encrypted vSAN cluster (such as turning encryption on or off or changing the KMS key), an Update vSAN configuration task runs on each host every 3 seconds until all hosts reconnect or until 5 minutes have passed. These tasks are not harmful and rarely impact performance.

Workaround: None
Some objects are non-compliant after force repair
After you perform a force repair, some objects might not be repaired because the ownership of the objects was transferred to a different node during the process. The force repair might be delayed for those objects.

Workaround: Attempt the force repair operation again after all other objects are repaired and resynchronized. Alternatively, wait until vSAN repairs the objects.
When you move a host from one encrypted cluster to another, and then back to the original cluster, the task fails
When you move a host from an encrypted vSAN cluster to another encrypted vSAN cluster, then move the host back to the original encrypted cluster, the task might fail. You might see the following message: A general system error occurred: Invalid fault. This error occurs because vSAN cannot re-encrypt data on the host using the original encryption key. After a short time, vCenter Server restores the original key on the host, and all unmounted disks in the vSAN cluster are mounted.

Workaround: Reboot the host and wait for all disks to get mounted.
Stretched cluster imbalance after a site recovers
When you recover a failed site in a stretched cluster, sometimes hosts in the failed site are brought back sequentially over a long period of time. vSAN might overuse some hosts when it begins repairing the absent components.

Workaround: Recover all of the hosts in a failed site together within a short time window.
VM operations fail due to HA issue with stretched clusters
Under certain failure scenarios in stretched clusters, certain VM operations such as vMotion or powering on a VM might be impacted. These failure scenarios include a partial or complete site failure, or the failure of the high-speed network between the sites. This problem is caused by the dependency on VMware HA being available for normal operation of stretched cluster sites.

Workaround: Disable vSphere HA before performing vMotion, VM creation, or powering on VMs. Then re-enable vSphere HA.
Restoring or replacing vCenter Server can cause cluster partition
If the vCenter Server is replaced or recovered from backup, the host membership list might become out-of-date. This can cause ESXi hosts to become partitioned from the cluster.

Workaround: Use the following procedure to make sure all hosts are added to the vSAN cluster as the vCenter Server reboots.
1. Before you reboot vCenter Server, configure hosts to ignore cluster member list updates. Run the following command on each host in the vSAN cluster:
  esxcfg-advcfg -s1 /VSAN/IgnoreClusterMemberListUpdates
2. After vCenter Server is running and all hosts are present in the cluster, configure hosts to use cluster member list updates. Run the following command on each host in the cluster:
  esxcfg-advcfg -s0 /VSAN/IgnoreClusterMemberListUpdates
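
Because the same command must run on every host, it can help to generate the per-host command lines first and review them. The sketch below assumes that SSH access as root is enabled on each host. The function name member_list_commands and the host names are hypothetical. The helper only prints commands and does not run them.

```shell
# Print the per-host command that sets IgnoreClusterMemberListUpdates.
# Usage: member_list_commands VALUE HOST...
#   VALUE 1 = ignore member list updates (before vCenter Server reboots)
#   VALUE 0 = use member list updates again (after all hosts are present)
member_list_commands() {
    value=$1
    shift
    case $value in
        0|1) ;;
        *) echo "value must be 0 or 1" >&2; return 1 ;;
    esac
    for host in "$@"; do
        printf 'ssh root@%s esxcfg-advcfg -s%s /VSAN/IgnoreClusterMemberListUpdates\n' "$host" "$value"
    done
}

# Hypothetical host names, for illustration only.
member_list_commands 1 esx01.example.com esx02.example.com
```

Generate the commands with 1 for step 1 and with 0 for step 2.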
Disk decommission or disk unmount task fails
Disk decommission or disk unmount task might fail due to a conflict between the data write commit task and the virtual disk delete task. This problem might occur during upgrades that require a new vSAN on-disk format. You might see the following message in the VMkernel.log:

4724 2017-04-10T18:46:51.309Z cpu30:67232)LSOM: LSOMFreeMDDispatch:3797: Throttled: Waiting for component cleanup

Workaround: Reboot the host to clear the conflict and retry the operation.
Cannot perform deep rekey if a disk group is unmounted
Before vSAN performs a deep rekey, it performs a shallow rekey. The shallow rekey fails if an unmounted disk group is present. The deep rekey process cannot begin.

Workaround: Remount or remove the unmounted disk group.
Log entries state that firewall configuration has changed
A new firewall entry appears in the security profile when vSAN encryption is enabled: vsanEncryption. This rule controls how hosts communicate directly to the KMS. When it is triggered, log entries are added to /var/log/vobd.log. You might see the following messages:

Firewall configuration has changed. Operation 'addIP4' for rule set vsanEncryption succeeded.
Firewall configuration has changed. Operation 'removeIP4' for rule set vsanEncryption succeeded.

These messages can be ignored.

Workaround: None
Limited support for First Class Disks with vSAN datastores
vSAN 6.6 and later does not fully support First Class Disks in vSAN datastores. You might experience the following problems if you use First Class Disks in a vSAN datastore:
- vSAN health service does not display the health of First Class Disks correctly.
- The Used Capacity Breakdown includes the used capacity for First Class Disks in the following category: Other
- The health status of VMs that use First Class Disks is not calculated correctly.
Workaround: None
HA failover does not occur after setting Traffic Type option on a vmknic to support witness traffic
If you set the traffic type option on a vmknic to support witness traffic, vSphere HA does not automatically discover the new setting. You must manually disable and then re-enable HA so it can discover the vmknic. If you configure the vmknic and the vSAN cluster first, and then enable HA on the cluster, it does discover the vmknic.

Workaround: Manually disable vSphere HA on the cluster, and then re-enable it.
After you disable and delete the iSCSI target service, some iSCSI objects remain in the vSAN datastore
If you use the Web Client to remove all iSCSI targets and LUNs, and disable the iSCSI target service, the iSCSI home object still exists in the vSAN datastore.

Workaround: To delete the iSCSI home object and all metadata associated with the iSCSI target service, run the following command on any host in the cluster: esxcli vsan iscsi homeobject delete
iSCSI I/O operation might be interrupted during iSCSI target failover
During iSCSI target failover, the iSCSI I/O operations might be interrupted. A host failure or a host reboot might trigger an iSCSI target failover.

Workaround: Retry the session from the iSCSI initiator.
iSCSI MCS is not supported
vSAN iSCSI target service does not support Multiple Connections per Session (MCS).

Workaround: None
Any iSCSI initiator can discover iSCSI targets
vSAN iSCSI target service allows any initiator on the network to discover iSCSI targets.

Workaround: You can isolate your ESXi hosts from iSCSI initiators by placing them on separate VLANs.
After resolving network partition, some VM operations on linked clone VMs might fail
Some VM operations on linked clone VMs that are not producing I/O inside the guest operating system might fail. The operations that might fail include taking snapshots and suspending the VMs. This problem can occur after a network partition is resolved, if the parent base VM's namespace is not yet accessible. When the parent VM's namespace becomes accessible, HA is not notified to power on the VM.

Workaround: Power cycle VMs that are not actively running I/O operations.
When you log out of the Web client after using the Configure vSAN wizard, some configuration tasks might fail
The Configure vSAN wizard might require up to several hours to complete the configuration tasks. You must remain logged in to the Web client until the wizard completes the configuration. This problem usually occurs in clusters with many hosts and disk groups.

Workaround: If some configuration tasks failed, perform the configuration again.
New policy rules ignored on hosts with older versions of ESXi software
This might occur when you have two or more vSAN clusters, with one cluster running the latest software and another cluster running an older software version. The vSphere Web Client displays policy rules for the latest vSAN software, but those new policies are not supported on the older hosts. For example, RAID-5/6 (Erasure Coding) – Capacity is not supported on hosts running 6.0U1 or earlier software. You can configure the new policy rules and apply them to any VMs and objects, but they are ignored on hosts running the older software version.

Workaround: None
Snapshot memory objects are not displayed in the Used Capacity Breakdown of the vSAN Capacity monitor
For virtual machines created with hardware version lower than 10, the snapshot memory is included in the Vmem objects on the Used Capacity Breakdown.

Workaround: To view snapshot memory objects in the Used Capacity Breakdown, create virtual machines with hardware version 10 or higher.
Storage Usage reported in VM Summary page might appear larger after upgrading to vSAN 6.5 or later
In previous releases of vSAN, the value reported for VM Storage Usage was the space used by a single copy of the data. For example, if the guest wrote 1 GB to a thin-provisioned object with two mirrors, the Storage Usage was shown as 1 GB. In vSAN 6.5 and later, the Storage Usage field displays the actual space used, including all copies of the data. So if the guest writes 1 GB to a thin-provisioned object with two mirrors, the Storage Usage is shown as 2 GB. The reported storage usage on some VMs might appear larger after upgrading to vSAN 6.5, but the actual space consumed did not increase.

Workaround: None
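
The change in reporting can be reproduced with simple arithmetic. The sketch below is illustrative only: the function reported_usage_gb and the model labels "single" and "all" are made up for this example, and it covers whole gigabytes of mirrored data only.

```shell
# Usage: reported_usage_gb WRITTEN_GB COPIES MODEL
#   MODEL "single" = releases before vSAN 6.5 (one copy counted)
#   MODEL "all"    = vSAN 6.5 and later (every copy counted)
reported_usage_gb() {
    written=$1
    copies=$2
    model=$3
    case $model in
        single) echo "$written" ;;
        all)    echo $((written * copies)) ;;
        *)      return 1 ;;
    esac
}

# 1 GB written by the guest to an object with two mirrors:
reported_usage_gb 1 2 single   # prints 1
reported_usage_gb 1 2 all      # prints 2
```

In both cases the physical space consumed is the same. Only the reported value differs.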
Cannot place a witness host in Maintenance Mode
When you attempt to place a witness host in Maintenance Mode, the host remains in the current state and you see the following notification: A specified parameter was not correct.

Workaround: When placing a witness host in Maintenance Mode, choose the No data migration option.
Moving the witness host into and then out of a stretched cluster leaves the cluster in a misconfigured state
If you place the witness host in a vSAN-enabled vCenter cluster, an alarm notifies you that the witness host cannot reside in the cluster. But if you move the witness host out of the cluster, the cluster remains in a misconfigured state.

Workaround: Move the witness host out of the vSAN stretched cluster, and reconfigure the stretched cluster. For more information, see Knowledge Base article 2130587.
When a network partition occurs in a cluster which has an HA heartbeat datastore, VMs are not restarted on the other data site
When the preferred or secondary site in a vSAN cluster loses its network connection to the other sites, VMs running on the site that loses network connectivity are not restarted on the other data site, and the following error might appear: vSphere HA virtual machine HA failover failed.

This is expected behavior for vSAN clusters.

Workaround: Do not select HA heartbeat datastore while configuring vSphere HA on the cluster.
Unmounted vSAN disks and disk groups displayed as mounted in the vSphere Web Client Operational Status field
After vSAN disks or disk groups are unmounted, either by running the esxcli vsan storage diskgroup unmount command or by the vSAN Device Monitor service when disks show persistently high latencies, the vSphere Web Client incorrectly displays the Operational Status field as mounted.
Workaround: Use the Health field to verify disk status, instead of the Operational Status field.
On-disk format upgrade displays disks not on vSAN
When you upgrade the disk format, vSAN might incorrectly display disks that were removed from the cluster. The UI also might show the version status as mixed. This display issue usually occurs after one or multiple disks are manually unmounted from the cluster. It does not affect the upgrade process. Only the mounted disks are checked. The unmounted disks are ignored.

Workaround: None

All vSAN clusters share the same external proxy settings
All vSAN clusters share the same external proxy settings, even if you set the proxy at the cluster level. vSAN uses external proxies to connect to Support Assistant, the Customer Experience Improvement Program, and the HCL database, if the cluster does not have direct Internet access.

Workaround: None
VMs in a stretched cluster become inaccessible when preferred site is isolated, then regains connectivity only to the witness host
When the preferred site becomes unavailable or loses its network connection to the secondary site and the witness host, the secondary site forms a cluster with the witness host and continues storage operations. Data on the preferred site might become outdated over time. If the preferred site then reconnects to the witness host but not to the secondary site, the witness host leaves the cluster it is in and forms a cluster with the preferred site, and some VMs might become inaccessible because they do not have access to the most recent data in this cluster.
Workaround: Before you reconnect the preferred site to the cluster, mark the secondary site as the preferred site. After the sites are resynchronized, you can mark the site you want to use as the preferred site.
Storage Consumption Model for VM Storage Policy wizard shows incorrect information
If one or more hosts in a vSAN cluster is not running software version 6.0 Update 2 or later, the Storage Consumption Model for the VM Storage Policy wizard might show incorrect information when you select RAID 5/6 as the failure tolerance method.
Workaround: Upgrade all hosts to the latest software version.