vRealize Operations Manager generates an alert if a problem occurs with the components in the storage area network that the vSAN adapter is monitoring.

Alerts for the vSAN Cluster Object

Alerts on the vSAN Cluster object have health, risk, and efficiency impact.

Table 1. vSAN Cluster Object Health Alert Definitions

Alert

Alert Type

Alert Subtype

Description

Basic (unicast) connectivity check (normal ping) has failed on vSAN host.

Storage

Configuration

Triggered when basic (unicast) connectivity check (normal ping) has failed on the vSAN host due to network misconfiguration.

Check the free space on physical disks in the vSAN cluster.

Storage

Availability

Triggered when a check of free space on physical disks in the vSAN cluster results in an error or warning.

CLOMD process on the host has issues and impacting the functionality of vSAN cluster.

Storage

Availability

Triggered when the CLOMD process on the host has issues that are impacting the functionality of the vSAN cluster.

Disk load variance between some vSAN disks exceeded the threshold value.

Storage

Performance

Triggered when the disk load variance between some vSAN disks exceeds the threshold value.

vSAN cannot balance the load properly.

Host ESXi version and the vSAN disk format version is incompatible with the other hosts and disks in a vSAN cluster.

Storage

Configuration

Triggered when the host ESXi version and the vSAN disk format version are incompatible with the other hosts and disks in a vSAN cluster.

Host has invalid unicast agent and impacting the health of vSAN Stretched Cluster.

Storage

Configuration

Triggered when the host has an invalid unicast agent that is impacting the health of the vSAN Stretched Cluster.

An invalid unicast agent on the host can cause a communication malfunction with the witness host.

Host in a vSAN cluster does not have a VMkernel NIC configured for vSAN traffic.

Network

Configuration

Triggered when the host in a vSAN cluster does not have a VMkernel NIC configured for vSAN traffic.

Note:

An ESXi host that is part of the vSAN cluster must have a VMkernel NIC configured for vSAN traffic even if it is not contributing storage.

Host in a vSAN cluster has connectivity issues and vCenter Server does not know its state.

Network

Configuration

Triggered when the host in a vSAN cluster has connectivity issues and vCenter Server does not know its state.

Host in a vSAN cluster has IP multicast connectivity issue.

Network

Configuration

Triggered when the host in a vSAN cluster has an IP multicast connectivity issue. This means that multicast is most likely the root cause of a vSAN network partition.

Host is either running an outdated version of the vSAN Health Service VIB or it is not installed on the host.

Storage

Configuration

Triggered when the host is either running an outdated version of the vSAN Health Service VIB or the VIB is not installed on the host.

Network latency check of vSAN hosts failed. It requires < 1 ms RTT.

Network

Configuration

Triggered when the network latency between vSAN hosts is greater than or equal to 1 ms RTT.

One or more hosts in the vSAN cluster have misconfigured multicast addresses.

Network

Configuration

Triggered when one or more hosts in the vSAN cluster have misconfigured multicast addresses.

One or more physical disks on vSAN host is experiencing software state health issues.

Storage

Availability

Triggered when one or more physical disks on a vSAN host are experiencing software state health issues.

One or more vSAN enabled hosts are not in the same IP subnet

Network

Configuration

Triggered when one or more vSAN enabled hosts are not in the same IP subnet.

Overall health of the physical disks in a vSAN Cluster is impacted.

Storage

Availability

Triggered when overall health of the physical disks in a vSAN Cluster is impacted. See the health status of each physical disk individually on all the hosts.

Overall health of VMs residing on vSAN datastore is reporting issues.

Storage

Availability

Triggered when overall health of the VMs on a vSAN datastore is impacted.

Overall health of vSAN objects is reporting issues.

Storage

Availability

Triggered when overall health of vSAN objects is reporting issues.

Ping test with large packet size between all VMkernel adapters with vMotion traffic enabled has issues.

Network

Configuration

Triggered when the ping test with large packet size between all VMkernel adapters with vMotion traffic enabled has issues.

Ping test with small packet size between all VMkernel adapters with vMotion traffic enabled has issues.

Network

Configuration

Triggered when the ping test with small packet size between all VMkernel adapters with vMotion traffic enabled has issues.

Site latency between two fault domains and the witness host has exceeded the recommended threshold values in a vSAN Stretched cluster.

Storage

Performance

Triggered when the site latency between two fault domains and the witness host has exceeded the recommended threshold values in a vSAN Stretched cluster.

Statistics collection of vSAN performance service is not working correctly.

Storage

Availability

Triggered when statistics collection of vSAN performance service is not working correctly.

This means that statistics collection or writing statistics data to storage has failed for three consecutive intervals.

MTU check (ping with large packet size) has failed on vSAN host.

Storage

Configuration

Triggered when the MTU check (ping with large packet size) has failed in the vSAN environment due to an MTU misconfiguration in the vSAN network.

The preferred fault domain is not set for the witness host in a vSAN Stretched cluster.

Storage

Configuration

Triggered when the preferred fault domain is not set for the witness host in a vSAN Stretched cluster, which affects the operations of the vSAN Stretched cluster.

Unicast agent is not configured on the host and affecting operations of vSAN Stretched cluster.

Storage

Configuration

Triggered when the unicast agent is not configured on the host, which affects the operations of the vSAN Stretched cluster.

vCenter Server has lost connection to a host that is part of a vSAN cluster.

Storage

Availability

Triggered when the host that is part of a vSAN cluster is in disconnected state or not responding and vCenter Server does not know its state.

vSAN Cluster contains host whose ESXi version does not support vSAN Stretched Cluster.

Storage

Configuration

Triggered when the vSAN cluster contains a host whose ESXi version does not support vSAN Stretched Cluster.

vSAN cluster has issues in electing stats master of vSAN Performance service. This affects the functionality of vSAN Performance service.

Storage

Configuration

Triggered when vSAN cluster has issues in electing stats master of vSAN Performance service.

vSAN cluster has multiple network partitions.

Network

Configuration

Triggered when vSAN cluster has multiple network partitions due to a network issue.

vSAN Cluster has multiple Stats DB objects which are creating conflicts and affecting vSAN Performance Service

Storage

Configuration

Triggered when the vSAN cluster has multiple Stats DB objects that are creating conflicts.

This affects the functionality of the vSAN Performance Service.

vSAN disk group has incorrect deduplication and compression configuration

Storage

Configuration

Triggered when vSAN disk group has incorrect deduplication and compression configuration.

vSAN has encountered an issue while reading the metadata of a physical disk

Storage

Availability

Triggered when vSAN has encountered an issue while reading the metadata of a physical disk and cannot use this disk.

vSAN health service is not installed on the host

Storage

Configuration

Triggered when vSAN health service is not installed on the host.

vSAN host and its disks have inconsistent deduplication and compression configuration with the cluster

Storage

Configuration

Triggered when vSAN host and its disks have inconsistent deduplication and compression configuration with the cluster.

vSAN is unable to retrieve the physical disk information from host

Storage

Availability

Triggered when vSAN is unable to retrieve the physical disk information from host. vSAN Health Service may not be working properly on this host.

vSAN Performance Service is not enabled.

Storage

Configuration

Triggered when vSAN Performance Service is not enabled.

vSAN Performance Service is unable to communicate and retrieve statistics from host

Storage

Configuration

Triggered when vSAN Performance Service is unable to communicate and retrieve statistics from host.

vSAN Stretched cluster contains a witness host without a valid disk group.

Storage

Configuration

Triggered when vSAN Stretched cluster contains a witness host without a valid disk group.

If the witness host does not have any disk claimed by vSAN, its fault domain is not available.

vSAN Stretched cluster does not contain a valid witness host.

Storage

Configuration

Triggered when vSAN Stretched cluster does not contain a valid witness host.

This affects the operations of vSAN Stretched cluster.

vSAN Stretched cluster does not contain two valid fault domains.

Storage

Configuration

Triggered when vSAN Stretched cluster does not contain two valid fault domains.

vSAN Stretched cluster has inconsistent configuration for Unicast agent.

Storage

Configuration

Triggered when vSAN Stretched cluster contains multiple unicast agents.

This means multiple unicast agents were set on non-witness hosts.

vSAN witness host has an invalid preferred fault domain.

Storage

Configuration

Triggered when vSAN witness host has an invalid preferred fault domain.

Witness host is a part of vSAN Stretched cluster.

Storage

Configuration

Triggered when the witness host is a part of the vCenter cluster that forms the vSAN Stretched cluster.

Witness host resides in one of the data fault domains.

Storage

Configuration

Triggered when witness host resides in one of the data fault domains.

This affects the operations of vSAN Stretched cluster.
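Two of the health alerts in Table 1 state numeric conditions: the network latency check requires less than 1 ms RTT, and the statistics collection alert means three consecutive failed intervals. The following Python sketch only illustrates those two conditions as written in the table. The function and constant names are hypothetical and are not part of vRealize Operations Manager or the vSAN Health Service, and how the product actually samples these values is not described here.

```python
from typing import Iterable

# Values taken from the alert descriptions in Table 1.
LATENCY_LIMIT_MS = 1.0        # the latency check requires < 1 ms RTT
FAILED_INTERVALS_LIMIT = 3    # three consecutive failed intervals


def latency_check_failed(rtt_ms: float) -> bool:
    """Return True when the RTT is greater than or equal to 1 ms."""
    return rtt_ms >= LATENCY_LIMIT_MS


def stats_collection_failed(interval_ok: Iterable[bool]) -> bool:
    """Return True once three consecutive intervals have failed.

    interval_ok is a hypothetical sequence with one entry per collection
    interval: True if collection and writing succeeded, False otherwise.
    """
    consecutive_failures = 0
    for ok in interval_ok:
        consecutive_failures = 0 if ok else consecutive_failures + 1
        if consecutive_failures >= FAILED_INTERVALS_LIMIT:
            return True
    return False
```

For example, latency_check_failed(1.0) is True because the check requires a value strictly below 1 ms, and a successful interval between failures resets the count in stats_collection_failed.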

Table 2. vSAN Cluster Object Risk Alert Definitions

Alert

Alert Type

Alert Subtype

Description

After one additional host failure, vSAN Cluster will not have enough resources to rebuild all objects

Storage

Capacity

Triggered when the vSAN cluster would not have enough resources to rebuild all objects after one additional host failure.

Capacity disk used for vSAN is smaller than 255 GB (default max component size).

Storage

Performance

Triggered when a capacity disk used for vSAN is smaller than 255 GB (default max component size), so virtual machines that run on the vSAN Datastore might experience disk space issues.

Capacity disk used for vSAN is smaller than 255 GB (default max component size).

Storage

Availability

Triggered when a capacity disk used for vSAN is smaller than 255 GB (default max component size), so virtual machines that run on the vSAN Datastore might experience disk space issues.

Controller with pass-through and RAID disks has issues.

Storage

Configuration

Triggered when a controller with pass-through and RAID disks has issues.

Disk format version of one or more vSAN disks is out of date

Storage

Configuration

Triggered when the disk format version of one or more vSAN disks is out of date and is not compatible with other vSAN disks. This can lead to problems in creating or powering on VMs, performance degradation, and EMM failures.

ESXi host issues retrieving hardware info.

Storage

Configuration

Triggered when the ESXi host has issues retrieving hardware info.

Firmware provider does not have all its dependencies met or is not functioning as expected.

Storage

Configuration

Triggered when a firmware provider has not met all its dependencies or is not functioning as expected.

Host with inconsistent extended configurations is detected.

Storage

Configuration

Triggered when a host with inconsistent extended configurations is detected.

The vSAN cluster extended configurations are set as follows: object repair timer is 60 minutes, site read locality is Enabled, customized swap object is Enabled, and large scale cluster support is Disabled. For a host with inconsistent extended configurations, vSAN cluster remediation is recommended. For a host that does not support any extended configuration, an ESXi software upgrade is needed. A host reboot might be required for the cluster scalability configuration to take effect.

Inconsistent configuration (like dedup/compression, encryption) setup on hosts or disks with the cluster.

Storage

Configuration

Triggered when the configuration (such as deduplication and compression, or encryption) on hosts or disks is inconsistent with the cluster.

Network adapter driver is not VMware certified.

Storage

Configuration

Triggered when the network adapter driver is not VMware certified.

Network adapter firmware is not VMware certified.

Storage

Configuration

Triggered when the network adapter firmware is not VMware certified.

Network adapter is not VMware certified.

Storage

Configuration

Triggered when the network adapter is not VMware certified.

Network configuration of the vSAN iSCSI target service is not valid.

Storage

Availability

Triggered when the network configuration of the vSAN iSCSI target service is not valid.

This health check validates the presence of the default vmknic for the vSAN iSCSI target service, and verifies that all the existing targets have valid vmknic configurations.

Non-vSAN disks are used for VMFS or Raw Device Mappings (RDMs).

Storage

Availability

Triggered when non-vSAN disks are used for VMFS or Raw Device Mappings (RDMs).

Number of vSAN components on a disk is reaching or has reached its limit.

Storage

Capacity

Triggered when the number of vSAN components on a disk is reaching or has reached its limit. This will cause failure in the deployment of new Virtual Machines and also impact rebuild operations.

Number of vSAN components on a host is reaching or has reached its limit.

Storage

Capacity

Triggered when the number of vSAN components on a host is reaching or has reached its limit.

This will cause failure in the deployment of new Virtual Machines and also impact rebuild operations.

One or more ESXi hosts in the cluster do not support CPU AES-NI or have it disabled.

Storage

Availability

Triggered when one or more hosts in the cluster do not support CPU AES-NI or have it disabled. As a result, the system might use software encryption, which is significantly slower than AES-NI.

RAID controller configuration has issues.

Storage

Configuration

Triggered when the RAID controller configuration has issues.

Storage I/O controller driver is not VMware certified

Storage

Configuration

Triggered when the stability and integrity of vSAN may be at risk as the storage I/O controller driver is not VMware certified.

Storage I/O controller driver is not supported with the current version of ESXi running on the host

Storage

Configuration

Triggered when the stability and integrity of vSAN may be at risk as the storage I/O controller driver is not supported with the current version of ESXi running on the host.

Storage I/O Controller firmware is not VMware certified.

Storage

Configuration

Triggered when the storage I/O controller firmware is not VMware certified.

Storage I/O controller is not compatible with the VMware Compatibility Guide

Storage

Configuration

Triggered when the vSAN environment may be at risk because the storage I/O controllers on the ESXi hosts that are participating in a vSAN cluster are not compatible with the VMware Compatibility Guide.

The current status of the Customer Experience Improvement Program (CEIP) is not enabled.

Storage

Availability

Triggered when the Customer Experience Improvement Program (CEIP) is not enabled.

The Internet connectivity is not available for vCenter Server.

Storage

Availability

Triggered when internet connectivity is not available for vCenter Server.

The resync operations are throttled on any hosts.

Storage

Configuration

Triggered when resync operations are throttled on any host. Clear the limit unless you need it for particular cases, such as a potential cluster meltdown.

Time of hosts and VC are not synchronized within 1 minute.

Storage

Configuration

Triggered when the time on the hosts and vCenter Server is not synchronized within 1 minute.

Any difference larger than 60 seconds causes this check to fail. If the check fails, check the NTP server configuration.

vCenter Server or any of the ESXi hosts experience problems when connecting to Key Management Servers (KMS).

Storage

Availability

Triggered when the vCenter Server or any of the hosts experience problems when connecting to KMS.

vCenter server state was not pushed to ESXi due to vCenter server being out of sync.

Storage

Configuration

Triggered when the vCenter server state was not pushed to ESXi due to vCenter server being out of sync.

During normal operation, the vCenter server state is regarded as source of truth, and ESXi hosts are automatically updated with the latest host membership list. When vCenter server is replaced or recovered from backup, the host membership list in vCenter server may be out of sync. This health check detects such cases, and alerts if vCenter server state was not pushed to ESXi due to vCenter server being out of sync. In such cases, first fully restore the membership list in vCenter server, and then perform 'Update ESXi configuration' action if required.

vSAN and VMFS datastores are on the same Dell H730 controller with the lsi_mr3 driver.

Storage

Configuration

Triggered when the vSAN and VMFS datastores are on the same Dell H730 controller with the lsi_mr3 driver.

vSAN build recommendation based on the available releases and VCG compatibility guide.

Storage

Availability

Triggered when the vSAN build is not compatible with the available releases and the VMware Compatibility Guide (VCG).

This is the ESXi build that vSAN recommends as the most appropriate, given the hardware, its compatibility per the VMware Compatibility Guide and the available releases from VMware.

vSAN build recommendation engine has all its dependencies met and is functioning as expected.

Storage

Availability

Triggered when the vSAN build recommendation engine has issues.

The vSAN Build Recommendation Engine relies on the VMware compatibility guide and VMware release metadata for its recommendation. To provide build recommendations, it also requires VMware Update Manager service availability, internet connectivity, and valid credentials for my.vmware.com. This health check ensures that all dependencies are met and that the recommendation engine is functioning correctly.

vSAN Cluster disk space capacity is less than 5%

Storage

Capacity

Triggered when the disk usage in a vSAN cluster reaches 95% of capacity.

Cleared by removing virtual machines that are no longer in use or adding more disks to the cluster.

vSAN Cluster disk space usage is approaching capacity

Storage

Capacity

Triggered when the disk usage in a vSAN cluster reaches 80% of capacity.

Cleared by removing virtual machines that are no longer in use or adding more disks to the cluster.

vSAN cluster is reaching or has reached its limit for components, free disk space and read cache reservations.

Storage

Capacity

Triggered when the vSAN cluster is reaching or has reached its limit for components, free disk space and read cache reservations.

vSAN Cluster virtual disk count capacity is less than 5%.

Storage

Capacity

Triggered when the number of virtual disks per host in the vSAN cluster reaches 95% of capacity.

Cleared by adding more hosts to the cluster.

vSAN Cluster virtual disk count is approaching capacity.

Storage

Capacity

Triggered when the number of virtual disks per host in the vSAN cluster reaches 75% of capacity.

Cleared by adding more hosts to the cluster.

vSAN configuration for LSI 3108-based controller has issues.

Storage

Configuration

Triggered when the vSAN configuration for LSI 3108-based controller has issues.

vSAN disk group type (All-Flash or Hybrid) for the used SCSI controller is not VMware certified.

Storage

Configuration

Triggered when the vSAN disk group type (All-Flash or Hybrid) for the used SCSI controller is not VMware certified.

vSAN enabled hosts have inconsistent values for advanced configuration options.

Storage

Configuration

Triggered when some advanced configuration settings have different values on different hosts in the vSAN cluster.

vSAN firmware version recommendation based on the VCG.

Storage

Configuration

Triggered when the vSAN firmware version recommendation based on the VCG check has issues.

vSAN has encountered an integrity issue with the metadata of an individual component on a physical disk.

Storage

Availability

Triggered when vSAN has encountered an integrity issue with the metadata of an individual component on a physical disk.

vSAN HCL DB auto updater is not working properly.

Storage

Configuration

Triggered when the vSAN HCL DB auto updater is not working properly. This means that vSAN cannot download and update its HCL DB automatically.

vSAN HCL DB is not up-to-date.

Storage

Configuration

Triggered when the vSAN HCL DB is not up-to-date.

vSAN Health Service is not able to find the appropriate controller utility for the storage controller on the ESXi host.

Storage

Availability

Triggered when the vSAN Health Service is not able to find the appropriate controller utility for the storage controller on the ESXi host.

vSAN is running low on the vital memory pool (heaps) needed for the operation of physical disks.

Storage

Performance

Triggered when vSAN is running low on the vital memory pool (heaps) needed for the operation of physical disks.

This can lead to a variety of performance issues such as virtual machine storage performance degradation, operation failures, or even ESXi hosts becoming unresponsive.

vSAN is running low on the vital memory pool (slabs) needed for the operation of physical disks.

Storage

Performance

Triggered when vSAN is running low on the vital memory pool (slabs) needed for the operation of physical disks.

This can lead to a variety of performance issues such as virtual machine storage performance degradation, operation failures, or even ESXi hosts becoming unresponsive.

vSAN is using a physical disk which has high congestion value.

Storage

Performance

Triggered when vSAN is using a physical disk that has a high congestion value.

This can lead to a variety of performance issues such as virtual machine storage performance degradation, operation failures, or even ESXi hosts becoming unresponsive.

vSAN iSCSI target service home object has issues.

Storage

Availability

Triggered when the vSAN iSCSI target service home object has issues.

This health check verifies the integrity of the vSAN iSCSI target service home object. It also verifies that the configuration of the home object is valid.

vSAN iSCSI target service is not running properly or is not correctly enabled on the host.

Storage

Availability

Triggered when the vSAN iSCSI target service is not running properly or is not correctly enabled on the host.

This health check verifies the service runtime status of the vSAN iSCSI target service, and checks whether the service is correctly enabled on each host.

vSAN performance service statistics database object is reporting issues.

Storage

Availability

Triggered when the vSAN performance service statistics database object is reporting issues.

vSphere cluster members do not match vSAN cluster members.

Storage

Configuration

Triggered when the vSphere cluster members do not match vSAN cluster members.
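Several risk alerts in Table 2 are defined by fixed thresholds: disk usage at 80% and 95% of capacity, virtual disk count at 75% and 95% of capacity, a host and vCenter Server time difference larger than 60 seconds, and capacity disks smaller than 255 GB. The following Python sketch restates those thresholds as simple checks. It is an illustration only: all names are hypothetical, it does not call any vRealize Operations Manager or vSAN API, and the returned labels are shorthand rather than product alert names.

```python
from datetime import datetime, timedelta
from typing import Tuple

# Thresholds copied from the alert descriptions in Table 2.
DISK_USAGE_THRESHOLDS = (80.0, 95.0)    # percent of disk capacity used
VDISK_COUNT_THRESHOLDS = (75.0, 95.0)   # percent of virtual disk count used
MAX_CLOCK_DIFFERENCE = timedelta(seconds=60)
MIN_CAPACITY_DISK_GB = 255              # default max component size


def classify_usage(percent_used: float, thresholds: Tuple[float, float]) -> str:
    """Map a usage percentage to one of two alert levels, or 'ok'."""
    approaching, nearly_full = thresholds
    if percent_used >= nearly_full:
        return "less than 5% remaining"
    if percent_used >= approaching:
        return "approaching capacity"
    return "ok"


def clocks_out_of_sync(host_time: datetime, vcenter_time: datetime) -> bool:
    """Return True when the two clocks differ by more than 60 seconds."""
    return abs(host_time - vcenter_time) > MAX_CLOCK_DIFFERENCE


def capacity_disk_too_small(size_gb: float) -> bool:
    """Return True when a capacity disk is smaller than 255 GB."""
    return size_gb < MIN_CAPACITY_DISK_GB
```

With these definitions, classify_usage(82.0, DISK_USAGE_THRESHOLDS) falls in the "approaching capacity" band, while the same value passed with VDISK_COUNT_THRESHOLDS falls in the same band because it also exceeds 75%.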

Table 3. vSAN Cluster Object Efficiency Alert Definitions

Alert

Alert Type

Alert Subtype

Description

vSAN Cluster flash read cache is approaching capacity.

Storage

Capacity

Triggered when the Read Cache (RC) in the vSAN cluster reaches 80% of capacity.

Cleared by adding flash storage to the read cache.

vSAN Cluster flash read cache capacity is less than 5%.

Storage

Capacity

Triggered when the Read Cache (RC) in the vSAN cluster reaches 95% of capacity.

Cleared by adding flash storage to the read cache.

vSAN Adapter Instance Object Alert Definitions

Alerts on the vSAN Adapter Instance Object have health impact.

Alert

Alert Type

Alert Subtype

Description

Performance Service on vSAN cluster might be off or experience issues.

Storage

Configuration

Triggered when the vSphere Virtual SAN Performance Service is off or experiences issues for one of the vSAN-enabled cluster compute resources.

Cleared by enabling Virtual SAN performance service in vSphere.

vSAN adapter instance failed to collect data from vSAN Health Service. The health Service might have issues.

Storage

Configuration

Triggered when the vSAN adapter instance fails to collect data from the vSAN Health Service. The Health Service might have issues.

vSAN Disk Group Object Alert Definitions

Alerts on the vSAN Disk Group Object have efficiency impact.

Alert

Alert Type

Alert Subtype

Description

vSAN Disk Group read cache hit rate is less than 90%.

Storage

Performance

Triggered when the vSAN disk group read cache hit rate is less than 90%.

Cleared by adding more cache to accommodate the workload.

vSAN Disk Group read cache hit rate is less than 90% and write buffer free space is less than 10%.

Storage

Capacity

Triggered when the vSAN disk group read cache hit rate is less than 90% and the vSAN disk group write buffer free space is less than 10%.

Cleared by adding more flash capacity to the vSAN disk group.
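The two disk group alerts above differ only in whether the write buffer free space condition also holds. The following Python sketch evaluates both conditions independently and returns the names of every alert whose condition is met. The function name and parameters are hypothetical, and whether the product suppresses the simpler alert when the combined one fires is not stated in this document, so the sketch makes no attempt to model that.

```python
from typing import List

# Thresholds copied from the disk group alert descriptions above.
READ_CACHE_HIT_RATE_MIN = 90.0   # percent
WRITE_BUFFER_FREE_MIN = 10.0     # percent


def matching_disk_group_alerts(read_cache_hit_rate: float,
                               write_buffer_free: float) -> List[str]:
    """Return the alert names whose stated conditions are met.

    Both arguments are percentages. Each condition is checked on its own;
    no suppression or de-duplication between the two alerts is assumed.
    """
    alerts = []
    low_hit_rate = read_cache_hit_rate < READ_CACHE_HIT_RATE_MIN
    low_write_buffer = write_buffer_free < WRITE_BUFFER_FREE_MIN
    if low_hit_rate:
        alerts.append("vSAN Disk Group read cache hit rate is less than 90%.")
    if low_hit_rate and low_write_buffer:
        alerts.append(
            "vSAN Disk Group read cache hit rate is less than 90% "
            "and write buffer free space is less than 10%."
        )
    return alerts
```

A low write buffer on its own matches neither alert, because both definitions require the read cache hit rate to be below 90%.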