The Management Pack for Cohesity creates alerts (and in some cases provides recommended actions) based on various symptoms it detects in your Cohesity Environment. See the table below for the list of alerts available in the Management Pack.

Example Alert


alert_detail_protection_job_backup_failed

Alerts List

Name Description Symptom Recommendation
Cluster's Disk Await High This alert indicates that a Warning alert was raised in Cohesity. Disk Await High This alert is triggered when the Cohesity Cluster detects high latencies on a disk.
Cluster's Product Model is Incompatible This alert indicates that a Warning alert was raised in Cohesity. Product Model Incompatible
Cluster's Proxy Server is Unreachable This alert indicates that a Warning alert was raised in Cohesity. Proxy Server Unreachable
Cluster has Firmware Incompatibilities This alert indicates that a Warning alert was raised in Cohesity. Firmware Incompatible This alert is triggered if the firmware version of any component in the Node is not supported.
Cluster's Raid has Degraded This alert indicates that a Critical alert was raised in Cohesity. Raid Degraded This alert is typically triggered when a boot SSD or NVMe SSD is offline or has failed.
Cluster has Non Empty Unmounted Directory This alert indicates that a Warning alert was raised in Cohesity. Non Empty Unmounted Directory This alert can be caused by files and directories being written to the mount point or an internal error.
Cluster has VM Ware Mount Failure This alert indicates that a Warning alert was raised in Cohesity. VM Ware Mount Failure
Cluster's Rhino Unhealthy This alert indicates that a Warning alert was raised in Cohesity. Rhino Unhealthy
Cluster Skipped VM Volume Indexing This alert indicates that a Warning alert was raised in Cohesity. VM Volume Indexing Skipped
Cluster Upgrade Failed This alert indicates that a Critical alert was raised in Cohesity. Upgrade Failed After the Cohesity Cluster upgrade is initiated, if the first Node is unable to download and install the upgrade package, the upgrade is marked as failed and aborted. Downloading the upgrade package may fail for a number of reasons, such as network failures, download checksum mismatch, or mismatched versions.
Cluster has New Upgrade Packages Available This alert indicates that an Info alert was raised in Cohesity. New Upgrade Packages Available If your Cohesity Cluster has automatic polling enabled, the Cluster raises this alert whenever Cohesity releases a new upgrade package and publishes it to the Cohesity cloud upgrade server.
Cluster has Time Service Issues This alert indicates that a Warning alert was raised in Cohesity. Time Service

This alert can be triggered by one of the following conditions:

  • The configured NTP server is invalid or down.
  • The configured NTP server only implements a subset of the NTP protocol.

NOTE: There are known time synchronization issues with Windows NTP servers and time.nist.gov.

Cluster has Upgraded Successfully This alert indicates that an Info alert was raised in Cohesity. Upgrade Succeeded
Cluster Upgrade Started This alert indicates that an Info alert was raised in Cohesity. Upgrade Started This alert is triggered after an upgrade is initiated from the Cohesity Dashboard or the Cohesity CLI.
Cluster has Failed To Add A Node This alert indicates that a Warning alert was raised in Cohesity. Add Node Failed
Cluster's IPMI Is Unreachable This alert indicates that an Info alert was raised in Cohesity. IPMI Unreachable The IPMI username and password specified for the Node differ from the IPMI username and password specified in the configuration settings for the Cohesity Cluster. This can occur when the IPMI username and password were changed directly on the Node, not in the configuration settings for the Cohesity Cluster.
Cluster's Upgrade Is Stuck This alert indicates that a Warning alert was raised in Cohesity. Upgrade Stuck
Cluster has Service Versions Mismatched This alert indicates that an Info alert was raised in Cohesity. Service Versions Mismatched This alert is triggered if any two Nodes are running different versions of the same Cohesity service.
Cluster has Unexpected Virtual Appliance Resources This alert indicates that an Info alert was raised in Cohesity. Virtual Appliance Resources Unexpected
Cluster has Service Gflags Set This alert indicates that an Info alert was raised in Cohesity. Cluster Service Gflags Found
Cluster's View Box Space Usage High This alert indicates that a Critical alert was raised in Cohesity. View Box Space Usage High This alert is triggered when the Cohesity Cluster detects that a View Box is running out of space.
Cluster's Node Failure Is Not Tolerated This alert indicates that an Info alert was raised in Cohesity. Node Failure Is Not Tolerated If a Node in the Cohesity Cluster fails, its contents need to be copied to the other Nodes. If the Cohesity Cluster detects that the contents of a potentially failed Node cannot be copied to the remaining Nodes due to a lack of disk space, this alert is triggered to indicate that Node failures cannot be tolerated.
Cluster has Disk Offline This alert indicates that a Critical alert was raised in Cohesity. Disk Offline
Cluster has Invalid State This alert indicates that a Critical alert was raised in Cohesity. Invalid State This alert should not occur under normal usage.
Cluster had a Kernel Panic This alert indicates that a Critical alert was raised in Cohesity. Kernel Panic
Cluster Found New Disk This alert indicates that an Info alert was raised in Cohesity. New Disk Found This alert is triggered if a disk is added to a Node or if a disk is replaced in a Node.
Cluster has Low Logs Disk Space This alert indicates that an Info alert was raised in Cohesity. Logs Disk Space Low This alert indicates that the Cohesity log rotation service may have crashed, is unable to keep up with the logging rate, or is unable to remove some large files.
Cluster has a Disk Missing This alert indicates that a Warning alert was raised in Cohesity. Missing Disk

This alert can be caused by one of the following conditions:

  • The Cohesity Cluster was created with fewer disks than supported by the hardware model of the Node.
  • A Node with fewer disks than supported by the hardware model of the Node is added to the Cohesity Cluster. The disk(s) may be pulled out or not completely inserted.
Cluster has Node Chassis Changed This alert indicates that a Warning alert was raised in Cohesity. Node Chassis Changed This alert may be triggered if a Node is moved from one chassis to another chassis.
Cluster has an Unexpected Bonding Mode This alert indicates that an Info alert was raised in Cohesity. Bonding Mode Unexpected

This alert can be caused by one of the following conditions:

  • The bonding mode is changed directly on the bond interface of Node(s) but the bonding mode configured on the Cohesity Cluster is not updated.
  • The bonding mode configured on the Cohesity Cluster is changed but the bonding mode on the bond interface of a Node is not updated.
Cluster has an Unexpected MTU This alert indicates that a Warning alert was raised in Cohesity. MTU Unexpected

This alert can be caused by one of the following conditions:

  • The MTU is changed directly on the bond interface of Node(s), and the MTU configured on the Cohesity Cluster is not updated.
  • The MTU configured on the Cohesity Cluster is changed, but the MTU on the bond interface of a Node is not updated.
Cluster has an Unexpected Gateway This alert indicates that a Warning alert was raised in Cohesity. Gateway Unexpected

This alert can be caused by one of the following conditions:

  • The gateway is changed directly on the bond interface of Node(s), and the gateway configured on the Cohesity Cluster is not updated.
  • The gateway configured on the Cohesity Cluster is changed, but the gateway on the bond interface of a Node is not updated.
Cluster has Disk Marked Offline This alert indicates that a Critical alert was raised in Cohesity. Disk Marked Offline

This alert can be caused by one of the following conditions:

  • Disk read/write fails with an error.
  • Disk read/write access is hung and does not complete in a reasonable time.
Cluster has Disk With Too Many Delta Records This alert indicates that a Warning alert was raised in Cohesity. Disk With Too Many Delta Records

The clean up of the delta records is delayed due to the following conditions:

  • The NFS/SMB server process (called the Bridge) on the Cohesity Cluster is very busy.
  • The disks are very busy.
Cluster has Frequent Process Restarts This alert indicates that a Warning alert was raised in Cohesity. Frequent Process Restarts

A process can restart for one or more of the following reasons:

  • An unexpected condition (assert) was encountered.
  • An operation within the process is hung, which triggered the failfast mechanism to restart the process.
Cluster Disk Is Bad This alert indicates that a Warning alert was raised in Cohesity. Disk Is Bad This alert is triggered when the disk detects read/write errors.
Cluster Node Is Down This alert indicates that a Critical alert was raised in Cohesity. Node Is Down
Cluster has hit Write Limit This alert indicates that a Critical alert was raised in Cohesity. Write Limit The SSD is exceeding write thresholds, so it may need to be replaced soon.
Cluster has an IPMI Event This alert indicates that a Warning alert was raised in Cohesity. Ipmi Event
Cluster has Uncorrectable ECC Memory This alert indicates that a Critical alert was raised in Cohesity. Mem Uncorrectable Ecc
Cluster has Correctable ECC Memory This alert indicates that a Warning alert was raised in Cohesity. Mem Correctable Ecc
Cluster's Watchdog Triggered This alert indicates that a Critical alert was raised in Cohesity. Watchdog Triggered
Cluster has Temp Out Of High Range This alert indicates that a Critical alert was raised in Cohesity. Temp Out Of High Range
Cluster has Temp Out Of Low Range This alert indicates that a Critical alert was raised in Cohesity. Temp Out Of Low Range
Cluster has Volt Out Of High Range This alert indicates that a Critical alert was raised in Cohesity. Volt Out Of High Range
Cluster has Volt Out Of Low Range This alert indicates that a Critical alert was raised in Cohesity. Volt Out Of Low Range
Cluster HDD Removed This alert indicates that a Critical alert was raised in Cohesity. Hdd Drive Removed
Cluster HDD Fault This alert indicates that a Critical alert was raised in Cohesity. Hdd Drive Fault
Cluster has Power Supply Removed This alert indicates that a Critical alert was raised in Cohesity. Power Supply Removed
Cluster has Power Supply Inserted This alert indicates that an Info alert was raised in Cohesity. Power Supply Inserted
Cluster Node Removed This alert indicates that a Critical alert was raised in Cohesity. Node Removed
Cluster Node Inserted This alert indicates that an Info alert was raised in Cohesity. Node Inserted
Cluster Node Powered Off This alert indicates that a Critical alert was raised in Cohesity. Node Powered Off
Cluster Node Rebooted This alert indicates that an Info alert was raised in Cohesity. Node Rebooted
Cluster Link Is Up This alert indicates that an Info alert was raised in Cohesity. Link Is Up
Cluster Link Is Down This alert indicates that an Info alert was raised in Cohesity. Link Is Down
Cluster has IPMI SEL Cleared This alert indicates that an Info alert was raised in Cohesity. Ipmi Sel Cleared
Cluster has Thermal Trip Triggered This alert indicates that a Critical alert was raised in Cohesity. Thermal Trip Triggered This alert is triggered when the Node has detected a thermal issue, and the detected temperature is out of range.
Cluster has High Space Usage This alert indicates that a Critical alert was raised in Cohesity. Cluster Space Usage High This alert is triggered when the Cohesity Cluster detects that it is running out of disk space.
Cluster has Hung Task Progress This alert indicates that a Warning alert was raised in Cohesity. Task Progress Hung
Cluster's Health Affected This alert indicates that a Warning alert was raised in Cohesity. Cluster Health
Cluster's Health Affected This alert indicates that an Info alert was raised in Cohesity. Cluster Health A scan may not complete on schedule for many reasons. The issue may be benign and resolve itself.
Cluster's Health Affected This alert indicates that a Warning alert was raised in Cohesity. Cluster Health
Cluster's Health Affected This alert indicates that a Warning alert was raised in Cohesity. Cluster Health
Cluster's Search Cluster Health Affected This alert indicates that a Critical alert was raised in Cohesity. Search Cluster Health
Cluster Search Node Busy This alert indicates that a Warning alert was raised in Cohesity. Search Node Busy

This alert can be caused by the following conditions:

  • There is a high load on Elasticsearch.
  • Elasticsearch is consuming too much memory.
Cluster's Active Directory Not Reachable This alert indicates that a Critical alert was raised in Cohesity. Active Directory Not Reachable

This alert can be caused by one of the following conditions:

  • The Active Directory server is down.
  • Incorrect network settings on the Cohesity Cluster.
  • Incorrect DNS settings on the Cohesity Cluster.
Cluster's Metadata Size Exceeds Threshold This alert indicates that a Critical alert was raised in Cohesity. Metadata Size Exceeds Threshold Contact Cohesity Support. Backup policies may need to be adjusted or more Nodes need to be added.
Cluster's Encryption Key Created This alert indicates that an Info alert was raised in Cohesity. Encryption Key Created
Cluster's Key Rotation Policy Changed This alert indicates that an Info alert was raised in Cohesity. Key Rotation Policy Changed
Cluster has KMS Created This alert indicates that an Info alert was raised in Cohesity. KMS Created
Cluster KMS Destroyed This alert indicates that an Info alert was raised in Cohesity. KMS Destroyed
Cluster has Remote Replication Task Stuck This alert indicates that a Warning alert was raised in Cohesity. Remote Replication Task Stuck
Cluster had Remote Replication Task Fail This alert indicates that a Warning alert was raised in Cohesity. Remote Replication Task Failed
Cluster's Remote Replication File Skipped This alert indicates that a Warning alert was raised in Cohesity. Remote Replication File Skipped
Cluster has Audit Log This alert indicates that an Info alert was raised in Cohesity. Audit Log
External Target has Audit Log This alert indicates that an Info alert was raised in Cohesity. Audit Log
Protection Job has Missing VM Backup This alert indicates that a Critical alert was raised in Cohesity. Missing VM Backup
Protection Job's VM Cracking Skipped This alert indicates that a Warning alert was raised in Cohesity. VM Cracking Skipped

This alert can be caused by the following conditions:

  • The Cohesity Cluster is not able to mount the VMDK.
  • The VM Snapshot has an issue.
Protection Job's Backup Job Succeeded This alert indicates that an Info alert was raised in Cohesity. Backup Job Succeeded
Protection Job's Backup Job Failed This alert indicates that a Critical alert was raised in Cohesity. Backup Job Failed

A Job Run can fail for any of the following reasons:

  • There is an issue with the primary environment, such as a removed VM or a Snapshot failure.
  • The primary storage is full. (The primary storage contains the Objects that are backed up by the Cohesity Cluster.)
  • The Cohesity Agent is unreachable while attempting to back up physical servers.
  • The storage on the Cohesity Cluster is full.
Protection Job's Backup Job SLA Violated This alert indicates that a Warning alert was raised in Cohesity. Backup Job Sla Violated

A Job Run may take longer than the specified SLA for the following reasons:

  • The primary storage is slow.
  • The network is slow.
  • The Cohesity Cluster is overloaded.
  • The specified SLA is too short.
Protection Job Blackout Job Cancelled Blackout Window This alert indicates that a Warning alert was raised in Cohesity. Blackout Job Cancelled Blackout Window
Protection Job had Media Error During Archival This alert indicates that a Critical alert was raised in Cohesity. Media Error During Archival
Protection Job had Media Error During Restore This alert indicates that a Critical alert was raised in Cohesity. Media Error During Restore One or more tapes that are required to restore data are not available.
Protection Job had Integral Volume Reaching Max Capacity This alert indicates that a Warning alert was raised in Cohesity. Integral Volume Reaching Max Capacity
Protection Job's Archive Job Failed This alert indicates that a Warning alert was raised in Cohesity. Archive Job Failed

This alert can be caused by one of the following conditions:

  • An Archive task fails due to external target connectivity and/or credentials issues.
  • An Archive task fails due to an internal Cohesity issue.
Protection Job's Archive Job Stuck This alert indicates that a Warning alert was raised in Cohesity. Archive Job Stuck

This alert can be caused by one of the following conditions:

  • An Archive task does not make progress due to external target connectivity and/or credentials issues.
  • An Archive task does not make progress due to an internal Cohesity issue.
Protection Job's Restore Job Stuck This alert indicates that a Warning alert was raised in Cohesity. Restore Job Stuck

This alert can be caused by one of the following conditions:

  • A Restore task does not make progress due to external target connectivity and/or credentials issues.
  • A Restore task does not make progress due to an internal Cohesity issue.
Protection Job's Restore Job Failed This alert indicates that a Warning alert was raised in Cohesity. Restore Job Failed

This alert can be caused by one of the following conditions:

  • A Restore task fails due to external target connectivity and/or credentials issues.
  • A Restore task fails due to an internal Cohesity issue.
Protection Job has Audit Log This alert indicates that an Info alert was raised in Cohesity. Audit Log
Protection Policy has Audit Log This alert indicates that an Info alert was raised in Cohesity. Audit Log
Protection Source has Audit Log This alert indicates that an Info alert was raised in Cohesity. Audit Log
Restore Task has Audit Log This alert indicates that an Info alert was raised in Cohesity. Audit Log
View has Audit Log This alert indicates that an Info alert was raised in Cohesity. Audit Log
View Box has Audit Log This alert indicates that an Info alert was raised in Cohesity. Audit Log
Cloud and Local Disk over 70 percent Used The cloud and local disks for this Cluster are over 70% used. Cloud and Local Disk Capacity over 70% and Cloud and Local Disk Capacity under 80% Review the disk usage for this Cluster.
Local Disk over 70 percent Used The local disks for this Cluster are over 70% used. Local Disk Capacity over 70% and Local Disk Capacity less than 80% Review the disk usage for this Cluster.
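
The two capacity alerts at the end of the table each combine a pair of symptoms: usage over 70% and usage under 80%. The following Python sketch illustrates that band check. It is not Management Pack code: the function and constant names are hypothetical, and treating both boundaries as exclusive is an assumption based on the words "over" and "under" in the symptom names.

```python
# Hypothetical illustration of the 70%-80% symptom band; not Management Pack code.
LOWER_PCT = 70.0  # symptom: "Capacity over 70%"
UPPER_PCT = 80.0  # symptom: "Capacity under 80%"


def in_capacity_band(used_bytes: int, total_bytes: int) -> bool:
    """Return True when usage is strictly between 70% and 80% of capacity.

    Assumption: both boundaries are exclusive.
    """
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    used_pct = 100.0 * used_bytes / total_bytes
    return LOWER_PCT < used_pct < UPPER_PCT


if __name__ == "__main__":
    print(in_capacity_band(75, 100))  # True: inside the band
    print(in_capacity_band(70, 100))  # False: exactly 70% is not "over 70%"
    print(in_capacity_band(85, 100))  # False: above the band
```

Under these assumptions, a Cluster at 75% usage matches both symptoms, while one at 85% matches only the first and would not raise this particular alert.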