The Management Pack for HPE ProLiant creates alerts and provides recommended actions based on various symptoms (health status notifications and events) that it detects in your HPE ProLiant environment. See the table below for the list of alerts available in the Management Pack.

Name

Description

Symptom

Recommendation

Blade Health: Warning

Blade Health: Warning

Blade Health: Warning

Blade health is degraded. Review the health of this blade's related components to diagnose the problem.

Blade Health: Critical

Blade Health: Critical

Blade Health: Critical

Blade health has failed. Review the health of this blade's related components to diagnose the problem.

Blade Memory Health: Warning

Blade Memory Health: Warning

Blade Memory Health: Warning

Blade memory health is degraded. Recommended actions:

  • Be sure the memory meets the blade requirements and is installed as required by the blade.

  • Some blades may require that memory banks be populated fully or that all memory within a memory bank must be the same size, type, and speed. To determine if the memory is installed properly, see the blade documentation.

  • Check any blade LEDs that correspond to memory slots.

  • If you are unsure which DIMM has failed, test each bank of DIMMs by removing all other DIMMs. Then, isolate the failed DIMM by switching each DIMM in a bank with a known working DIMM.

  • Remove any third-party memory and run HP Insight Diagnostics.

Rack Memory Health: Warning

Rack Memory Health: Warning

Rack Memory Health: Warning

Rack memory health is degraded. Recommended actions:

  • Be sure the memory meets the rack requirements and is installed as required by the rack.

  • Some racks may require that memory banks be populated fully or that all memory within a memory bank must be the same size, type, and speed. To determine if the memory is installed properly, see the rack documentation.

  • Check any rack LEDs that correspond to memory slots.

  • If you are unsure which DIMM has failed, test each bank of DIMMs by removing all other DIMMs. Then, isolate the failed DIMM by switching each DIMM in a bank with a known working DIMM.

  • Remove any third-party memory and run HP Insight Diagnostics.

Blade Memory Health: Critical

Blade Memory Health: Critical

Blade Memory Health: Critical

Blade memory health has failed. Recommended actions:

  • Be sure the memory meets the blade requirements and is installed as required by the blade.

  • Some blades may require that memory banks be populated fully or that all memory within a memory bank must be the same size, type, and speed. To determine if the memory is installed properly, see the blade documentation.

  • Check any blade LEDs that correspond to memory slots.

  • If you are unsure which DIMM has failed, test each bank of DIMMs by removing all other DIMMs. Then, isolate the failed DIMM by switching each DIMM in a bank with a known working DIMM.

  • Remove any third-party memory and run HP Insight Diagnostics.

Rack Memory Health: Critical

Rack Memory Health: Critical

Rack Memory Health: Critical

Rack memory health has failed. Recommended actions:

  • Be sure the memory meets the rack requirements and is installed as required by the rack.

  • Some racks may require that memory banks be populated fully or that all memory within a memory bank must be the same size, type, and speed. To determine if the memory is installed properly, see the rack documentation.

  • Check any rack LEDs that correspond to memory slots.

  • If you are unsure which DIMM has failed, test each bank of DIMMs by removing all other DIMMs. Then, isolate the failed DIMM by switching each DIMM in a bank with a known working DIMM.

  • Remove any third-party memory and run HP Insight Diagnostics.

Blade Processor Health: Warning

Blade Processor Health: Warning

Blade Processor Health: Warning

Blade processor health is degraded. Recommended actions:

  • Be sure each processor is supported by the blade and is installed as directed in the blade documentation. The processor socket requires very specific installation steps and only supported processors should be installed. For processor requirements, see the blade documentation.

  • Be sure the blade ROM is current.

  • Be sure you are not mixing processor stepping, core speeds, or cache sizes if this is not supported on the blade. For more information, see the blade documentation. CAUTION: Removal of some processors and heatsinks requires special considerations for replacement, while other processors and heatsinks are integrated and cannot be reused once separated. For specific instructions for the blade you are troubleshooting, refer to processor information in the blade user guide.

  • If the blade has only one processor installed, reseat the processor. If the problem is resolved after you restart the blade, the processor was not installed properly.

  • If the blade has only one processor installed, replace it with a known functional processor. If the problem is resolved after you restart the blade, the original processor failed.

  • If the blade has multiple processors installed, test each processor:

  1. Remove all but one processor from the blade. Replace each with a processor terminator board or blank, if applicable to the blade.

  2. Replace the remaining processor with a known functional processor. If the problem is resolved after you restart the blade, a fault exists with one or more of the original processors. Install each processor one by one, restarting each time, to find the faulty processor or processors. At each step, be sure the blade supports the processor configurations.

Rack Processor Health: Warning

Rack Processor Health: Warning

Rack Processor Health: Warning

Rack processor health is degraded. Recommended actions:

  • Be sure each processor is supported by the rack and is installed as directed in the rack documentation. The processor socket requires very specific installation steps and only supported processors should be installed. For processor requirements, see the rack documentation.

  • Be sure the rack ROM is current.

  • Be sure you are not mixing processor stepping, core speeds, or cache sizes if this is not supported on the rack. For more information, see the rack documentation. CAUTION: Removal of some processors and heatsinks requires special considerations for replacement, while other processors and heatsinks are integrated and cannot be reused once separated. For specific instructions for the rack you are troubleshooting, refer to processor information in the rack user guide.

  • If the rack has only one processor installed, reseat the processor. If the problem is resolved after you restart the rack, the processor was not installed properly.

  • If the rack has only one processor installed, replace it with a known functional processor. If the problem is resolved after you restart the rack, the original processor failed.

  • If the rack has multiple processors installed, test each processor:

  1. Remove all but one processor from the rack. Replace each with a processor terminator board or blank, if applicable to the rack.

  2. Replace the remaining processor with a known functional processor. If the problem is resolved after you restart the rack, a fault exists with one or more of the original processors. Install each processor one by one, restarting each time, to find the faulty processor or processors. At each step, be sure the rack supports the processor configurations.

Blade Processor Health: Critical

Blade Processor Health: Critical

Blade Processor Health: Critical

Blade processor health has failed. Recommended actions:

  • Be sure each processor is supported by the blade and is installed as directed in the blade documentation. The processor socket requires very specific installation steps and only supported processors should be installed. For processor requirements, see the blade documentation.

  • Be sure the blade ROM is current.

  • Be sure you are not mixing processor stepping, core speeds, or cache sizes if this is not supported on the blade. For more information, see the blade documentation. CAUTION: Removal of some processors and heatsinks requires special considerations for replacement, while other processors and heatsinks are integrated and cannot be reused once separated. For specific instructions for the blade you are troubleshooting, refer to processor information in the blade user guide.

  • If the blade has only one processor installed, reseat the processor. If the problem is resolved after you restart the blade, the processor was not installed properly.

  • If the blade has only one processor installed, replace it with a known functional processor. If the problem is resolved after you restart the blade, the original processor failed.

  • If the blade has multiple processors installed, test each processor:

  1. Remove all but one processor from the blade. Replace each with a processor terminator board or blank, if applicable to the blade.

  2. Replace the remaining processor with a known functional processor. If the problem is resolved after you restart the blade, a fault exists with one or more of the original processors. Install each processor one by one, restarting each time, to find the faulty processor or processors. At each step, be sure the blade supports the processor configurations.

Rack Processor Health: Critical

Rack Processor Health: Critical

Rack Processor Health: Critical

Rack processor health has failed. Recommended actions:

  • Be sure each processor is supported by the rack and is installed as directed in the rack documentation. The processor socket requires very specific installation steps and only supported processors should be installed. For processor requirements, see the rack documentation.

  • Be sure the rack ROM is current.

  • Be sure you are not mixing processor stepping, core speeds, or cache sizes if this is not supported on the rack. For more information, see the rack documentation. CAUTION: Removal of some processors and heatsinks requires special considerations for replacement, while other processors and heatsinks are integrated and cannot be reused once separated. For specific instructions for the rack you are troubleshooting, refer to processor information in the rack user guide.

  • If the rack has only one processor installed, reseat the processor. If the problem is resolved after you restart the rack, the processor was not installed properly.

  • If the rack has only one processor installed, replace it with a known functional processor. If the problem is resolved after you restart the rack, the original processor failed.

  • If the rack has multiple processors installed, test each processor:

  1. Remove all but one processor from the rack. Replace each with a processor terminator board or blank, if applicable to the rack.

  2. Replace the remaining processor with a known functional processor. If the problem is resolved after you restart the rack, a fault exists with one or more of the original processors. Install each processor one by one, restarting each time, to find the faulty processor or processors. At each step, be sure the rack supports the processor configurations.

Network Adapter Health: Warning

Network Adapter Health: Warning

Network Adapter Health: Warning

Network adapter health is degraded. Recommended actions:

  • Reseat the network adapter and restart the server.

  • Review the signal backplane on the server or the midplane for damage.

  • Replace the adapter.

Network Adapter Health: Critical

Network Adapter Health: Critical

Network Adapter Health: Critical

Network adapter health has failed. Recommended actions:

  • Reseat the network adapter and restart the server.

  • Review the signal backplane on the server or the midplane for damage.

  • Replace the adapter.

Port Health: Warning

Port Health: Warning

Port Health: Warning

Port health is degraded. Recommended actions:

  • Review the signal backplane on the server or the midplane for damage.

  • Replace the network adapter.

Port Health: Critical

Port Health: Critical

Port Health: Critical

Port health has failed. Recommended actions:

  • Review the signal backplane on the server or the midplane for damage.

  • Replace the network adapter.

Rack Chassis Health: Warning

Rack Chassis Health: Warning

Rack Health: Warning

Rack chassis health is degraded. Review the health of this rack's related components to diagnose the problem.

Rack System Health: Warning

Rack System Health: Warning

Rack System Health: Warning

Rack system health is degraded. Review the health of this rack's related components to diagnose the problem.

Rack Chassis Health: Critical

Rack Chassis Health: Critical

Rack Health: Critical

Rack chassis health has failed. Review the health of this rack's related components to diagnose the problem.

Rack System Health: Critical

Rack System Health: Critical

Rack System Health: Critical

Rack system health has failed. Review the health of this rack's related components to diagnose the problem.

Power Supply Health: Warning

Power Supply Health: Warning

Power Supply Health: Warning

Power supply health is degraded. Recommended actions:

  • Be sure no loose connections exist.

  • Check the power source. If the power source is working properly, then replace the power supply.

  • Be sure the system has enough power, particularly if you recently added hardware, such as hard drives. Remove the newly added component; if the problem no longer occurs, additional power supplies are required. Check the system information from the IML.

  • If running a redundant configuration, be sure that all of the power supplies in the system have the same spare part number and are supported by the server.

Power Supply Health: Critical

Power Supply Health: Critical

Power Supply Health: Critical

Power supply health has failed. Recommended actions:

  • Be sure no loose connections exist.

  • Check the power source. If the power source is working properly, then replace the power supply.

  • Be sure the system has enough power, particularly if you recently added hardware, such as hard drives. Remove the newly added component; if the problem no longer occurs, additional power supplies are required. Check the system information from the IML.

  • If running a redundant configuration, be sure that all of the power supplies in the system have the same spare part number and are supported by the server.
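The "enough power" check in the recommendations above comes down to simple arithmetic. The following is a minimal sketch and not part of the Management Pack: the function name, the wattage figures, and the way redundancy is counted are all assumptions made for illustration. Take real ratings from the server documentation.

```python
def has_power_headroom(supply_watts, load_watts, redundant_supplies=0):
    """Return True if the non-reserved supplies can carry the total load.

    supply_watts: rated output of each installed power supply (hypothetical values).
    load_watts: estimated draw of each component (hypothetical values).
    redundant_supplies: number of supplies held in reserve (1 for N+1).
    """
    if redundant_supplies >= len(supply_watts):
        raise ValueError("redundancy must leave at least one active supply")
    # Conservative assumption: the largest supplies are the ones held in
    # reserve, so only the smallest ones count toward usable capacity.
    usable = sorted(supply_watts)[: len(supply_watts) - redundant_supplies]
    return sum(load_watts) <= sum(usable)


supplies = [800, 800]        # two 800 W supplies (illustrative)
base_load = [450, 120, 60]   # board, drives, fans (illustrative)
print(has_power_headroom(supplies, base_load, redundant_supplies=1))          # True
print(has_power_headroom(supplies, base_load + [200], redundant_supplies=1))  # False
```

In this example, adding a 200 W component pushes the load past what a single supply can deliver. That is the situation in which removing the new component clears the alert and additional power supplies are needed.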

Fan Health: Warning

Fan Health: Warning

Fan Health: Warning

Fan health is degraded. Recommended actions:

  • Be sure the fans are properly seated and working. Follow the procedures and warnings in the server documentation for removing the access panels and accessing and replacing fans. Unseat, and then reseat, each fan according to the proper procedures. Replace the access panels, and then attempt to restart the server.

  • Be sure the fan configuration meets the functional requirements of the server. See the server documentation.

  • Be sure no ventilation problems exist. If you have been operating the server for an extended period of time with the access panel removed, airflow may have been impeded, causing thermal damage to components. For further requirements, see the server documentation.

  • Be sure no POST error messages are displayed while booting the server that indicate temperature violation or fan failure information. For the temperature requirements for the server, see the server documentation.

  • Use iLO or an optional IML viewer to access the IML to see if any event list error messages relating to fans are listed.

  • In the iLO web interface, navigate to the Information > System Information page and verify the following information:

  1. Click the Fans tab and verify the fan status and fan speed.

  2. Click the Temperatures tab and verify the temperature reading for each location. If you find a hot spot, check the airflow path for blockage by cables and other material.

  • Replace any required non-functioning fans and restart the server. For specifications on fan requirements, see the server documentation.

  • Be sure all fan slots have fans or blanks installed. For requirements, see the server documentation.

  • Verify the fan airflow path is not blocked by cables or other material.
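The fan status and temperature checks described above can also be scripted once the readings are available as structured data. The sketch below assumes a dictionary shaped like a Redfish Thermal resource, with `Fans` and `Temperatures` arrays carrying `Status.Health`, `ReadingCelsius`, and `UpperThresholdCritical`. The sample payload, the function name, and the rule that a hot spot is a reading at or above the critical threshold are assumptions made for illustration. Field names may differ between iLO generations, and no network call is made here.

```python
def summarize_thermal(thermal):
    """Return (unhealthy fan names, hot-spot sensor names) from a Thermal-style dict."""
    bad_fans = [
        fan.get("Name", "unknown fan")
        for fan in thermal.get("Fans", [])
        # A missing health value is treated as "not installed", not as a fault.
        if fan.get("Status", {}).get("Health") not in (None, "OK")
    ]
    hot_spots = []
    for sensor in thermal.get("Temperatures", []):
        reading = sensor.get("ReadingCelsius")
        limit = sensor.get("UpperThresholdCritical")
        # Skip sensors that report no reading or no usable threshold.
        if reading is None or not limit:
            continue
        if reading >= limit:
            hot_spots.append(sensor.get("Name", "unknown sensor"))
    return bad_fans, hot_spots


# Hypothetical sample data, for illustration only.
sample = {
    "Fans": [
        {"Name": "Fan 1", "Status": {"Health": "OK"}},
        {"Name": "Fan 2", "Status": {"Health": "Warning"}},
    ],
    "Temperatures": [
        {"Name": "01-Inlet Ambient", "ReadingCelsius": 24, "UpperThresholdCritical": 42},
        {"Name": "02-CPU 1", "ReadingCelsius": 75, "UpperThresholdCritical": 70},
    ],
}
print(summarize_thermal(sample))  # (['Fan 2'], ['02-CPU 1'])
```

Any sensor that the function reports as a hot spot is a candidate for the airflow check in the recommendations.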

Fan Health: Critical

Fan Health: Critical

Fan Health: Critical

Fan health has failed. Recommended actions:

  • Be sure the fans are properly seated and working. Follow the procedures and warnings in the server documentation for removing the access panels and accessing and replacing fans. Unseat, and then reseat, each fan according to the proper procedures. Replace the access panels, and then attempt to restart the server.

  • Be sure the fan configuration meets the functional requirements of the server. See the server documentation.

  • Be sure no ventilation problems exist. If you have been operating the server for an extended period of time with the access panel removed, airflow may have been impeded, causing thermal damage to components. For further requirements, see the server documentation.

  • Be sure no POST error messages are displayed while booting the server that indicate temperature violation or fan failure information. For the temperature requirements for the server, see the server documentation.

  • Use iLO or an optional IML viewer to access the IML to see if any event list error messages relating to fans are listed.

  • In the iLO web interface, navigate to the Information > System Information page and verify the following information:

  1. Click the Fans tab and verify the fan status and fan speed.

  2. Click the Temperatures tab and verify the temperature reading for each location. If you find a hot spot, check the airflow path for blockage by cables and other material.

  • Replace any required non-functioning fans and restart the server. For specifications on fan requirements, see the server documentation.

  • Be sure all fan slots have fans or blanks installed. For requirements, see the server documentation.

  • Verify the fan airflow path is not blocked by cables or other material.

Rack Power Consumption: High

Rack Power Consumption: High

Rack Power Consumption: High

Rack power consumption is high. Recommended actions:

  • Be sure the power supplies are properly seated and operational.

  • Be sure the system has enough power, particularly if you recently added hardware, such as hard drives. Remove the newly added component; if the problem no longer occurs, additional power supplies are required.

Power Supply Average Output: High

Power Supply Average Output: High

Power Supply Average Output: High

Power supply average output is high. Recommended actions:

  • Be sure all the other power supplies are properly seated and operational.

  • Be sure the system has enough power, particularly if you recently added hardware, such as hard drives. Remove the newly added component; if the problem no longer occurs, additional power supplies are required.

Temperature Sensor Reading Exceeded Non-Critical Threshold

Temperature Sensor Reading Exceeded Non-Critical Threshold

Temperature Sensor Reading Exceeded Non-Critical Threshold

Temperature exceeded the non-critical threshold. Recommended actions:

  • Check the airflow path for blockage by cables and other material.

  • Replace any required non-functioning fans and restart the server. For specifications on fan requirements, see the server documentation.

  • Be sure all fan slots have fans or blanks installed. For requirements, see the server documentation.

Temperature Sensor Reading Exceeded Non-Critical Threshold

Temperature Sensor Reading Exceeded Non-Critical Threshold

Temperature Sensor Reading Exceeded Non-Critical Threshold

Temperature exceeded the non-critical threshold. Recommended actions:

  • Check the airflow path for blockage by cables and other material.

  • Replace any required non-functioning fans and restart the server. For specifications on fan requirements, see the server documentation.

  • Be sure all fan slots have fans or blanks installed. For requirements, see the server documentation.

Temperature Sensor Reading Exceeded Critical Threshold

Temperature Sensor Reading Exceeded Critical Threshold

Temperature Sensor Reading Exceeded Critical Threshold

Temperature exceeded the critical threshold. Recommended actions:

  • Check the airflow path for blockage by cables and other material.

  • Replace any required non-functioning fans and restart the server. For specifications on fan requirements, see the server documentation.

  • Be sure all fan slots have fans or blanks installed. For requirements, see the server documentation.

Temperature Sensor Reading Exceeded Critical Threshold

Temperature Sensor Reading Exceeded Critical Threshold

Temperature Sensor Reading Exceeded Critical Threshold

Temperature exceeded the critical threshold. Recommended actions:

  • Check the airflow path for blockage by cables and other material.

  • Replace any required non-functioning fans and restart the server. For specifications on fan requirements, see the server documentation.

  • Be sure all fan slots have fans or blanks installed. For requirements, see the server documentation.

Ambient Temperature Sensor Reading Exceeded Caution Threshold

Ambient Temperature Sensor Reading Exceeded Caution Threshold

Ambient Temperature Above Caution

Temperature exceeded the caution threshold. Recommended actions:

  • Check the airflow path for blockage by cables and other material.

  • Replace any required non-functioning fans and restart the server. For specifications on fan requirements, see the server documentation.

  • Be sure all fan slots have fans or blanks installed. For requirements, see the server documentation.

Ambient Temperature Sensor Reading Exceeded Critical Threshold

Ambient Temperature Sensor Reading Exceeded Critical Threshold

Ambient Temperature Above Critical

Temperature exceeded the critical threshold. Recommended actions:

  • Check the airflow path for blockage by cables and other material.

  • Replace any required non-functioning fans and restart the server. For specifications on fan requirements, see the server documentation.

  • Be sure all fan slots have fans or blanks installed. For requirements, see the server documentation.

CPU Temperature Sensor Reading Exceeded Caution Threshold

CPU Temperature Sensor Reading Exceeded Caution Threshold

CPU Temperature Above Caution

Temperature exceeded the caution threshold. Recommended actions:

  • Check the airflow path for blockage by cables and other material.

  • Replace any required non-functioning fans and restart the server. For specifications on fan requirements, see the server documentation.

  • Be sure all fan slots have fans or blanks installed. For requirements, see the server documentation.

CPU Temperature Sensor Reading Exceeded Critical Threshold

CPU Temperature Sensor Reading Exceeded Critical Threshold

CPU Temperature Above Critical

Temperature exceeded the critical threshold. Recommended actions:

  • Check the airflow path for blockage by cables and other material.

  • Replace any required non-functioning fans and restart the server. For specifications on fan requirements, see the server documentation.

  • Be sure all fan slots have fans or blanks installed. For requirements, see the server documentation.

Disk Temperature Sensor Reading Exceeded Caution Threshold

Disk Temperature Sensor Reading Exceeded Caution Threshold

Disk Temperature Above Caution

Temperature exceeded the caution threshold. Recommended actions:

  • Check the airflow path for blockage by cables and other material.

  • Replace any required non-functioning fans and restart the server. For specifications on fan requirements, see the server documentation.

  • Be sure all fan slots have fans or blanks installed. For requirements, see the server documentation.

Disk Temperature Sensor Reading Exceeded Critical Threshold

Disk Temperature Sensor Reading Exceeded Critical Threshold

Disk Temperature Above Critical

Temperature exceeded the critical threshold. Recommended actions:

  • Check the airflow path for blockage by cables and other material.

  • Replace any required non-functioning fans and restart the server. For specifications on fan requirements, see the server documentation.

  • Be sure all fan slots have fans or blanks installed. For requirements, see the server documentation.

I/O Temperature Sensor Reading Exceeded Caution Threshold

I/O Temperature Sensor Reading Exceeded Caution Threshold

I/O Temperature Above Caution

Temperature exceeded the caution threshold. Recommended actions:

  • Check the airflow path for blockage by cables and other material.

  • Replace any required non-functioning fans and restart the server. For specifications on fan requirements, see the server documentation.

  • Be sure all fan slots have fans or blanks installed. For requirements, see the server documentation.

I/O Temperature Sensor Reading Exceeded Critical Threshold

I/O Temperature Sensor Reading Exceeded Critical Threshold

I/O Temperature Above Critical

Temperature exceeded the critical threshold. Recommended actions:

  • Check the airflow path for blockage by cables and other material.

  • Replace any required non-functioning fans and restart the server. For specifications on fan requirements, see the server documentation.

  • Be sure all fan slots have fans or blanks installed. For requirements, see the server documentation.

Memory Temperature Sensor Reading Exceeded Caution Threshold

Memory Temperature Sensor Reading Exceeded Caution Threshold

Memory Temperature Above Caution

Temperature exceeded the caution threshold. Recommended actions:

  • Check the airflow path for blockage by cables and other material.

  • Replace any required non-functioning fans and restart the server. For specifications on fan requirements, see the server documentation.

  • Be sure all fan slots have fans or blanks installed. For requirements, see the server documentation.

Memory Temperature Sensor Reading Exceeded Critical Threshold

Memory Temperature Sensor Reading Exceeded Critical Threshold

Memory Temperature Above Critical

Temperature exceeded the critical threshold. Recommended actions:

  • Check the airflow path for blockage by cables and other material.

  • Replace any required non-functioning fans and restart the server. For specifications on fan requirements, see the server documentation.

  • Be sure all fan slots have fans or blanks installed. For requirements, see the server documentation.

Power Supply Temperature Sensor Reading Exceeded Caution Threshold

Power Supply Temperature Sensor Reading Exceeded Caution Threshold

Power Supply Temperature Above Caution

Temperature exceeded the caution threshold. Recommended actions:

  • Check the airflow path for blockage by cables and other material.

  • Replace any required non-functioning fans and restart the server. For specifications on fan requirements, see the server documentation.

  • Be sure all fan slots have fans or blanks installed. For requirements, see the server documentation.

Power Supply Temperature Sensor Reading Exceeded Critical Threshold

Power Supply Temperature Sensor Reading Exceeded Critical Threshold

Power Supply Temperature Above Critical

Temperature exceeded the critical threshold. Recommended actions:

  • Check the airflow path for blockage by cables and other material.

  • Replace any required non-functioning fans and restart the server. For specifications on fan requirements, see the server documentation.

  • Be sure all fan slots have fans or blanks installed. For requirements, see the server documentation.

System Temperature Sensor Reading Exceeded Caution Threshold

System Temperature Sensor Reading Exceeded Caution Threshold

System Temperature Above Caution

Temperature exceeded the caution threshold. Recommended actions:

  • Check the airflow path for blockage by cables and other material.

  • Replace any required non-functioning fans and restart the server. For specifications on fan requirements, see the server documentation.

  • Be sure all fan slots have fans or blanks installed. For requirements, see the server documentation.

System Temperature Sensor Reading Exceeded Critical Threshold

System Temperature Sensor Reading Exceeded Critical Threshold

System Temperature Above Critical

Temperature exceeded the critical threshold. Recommended actions:

  • Check the airflow path for blockage by cables and other material.

  • Replace any required non-functioning fans and restart the server. For specifications on fan requirements, see the server documentation.

  • Be sure all fan slots have fans or blanks installed. For requirements, see the server documentation.
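The temperature alerts above distinguish a caution threshold from a critical threshold. The source does not say how the Management Pack evaluates these thresholds, so the following is only a minimal sketch of the general idea. The function name and the threshold values are hypothetical, and "exceeded" is read as strictly greater than.

```python
def classify_temperature(reading_c, caution_c, critical_c):
    """Classify a reading against caution and critical thresholds (degrees Celsius)."""
    if caution_c > critical_c:
        raise ValueError("caution threshold must not exceed critical threshold")
    # Check the more severe condition first so it takes precedence.
    if reading_c > critical_c:
        return "critical"
    if reading_c > caution_c:
        return "caution"
    return "normal"


# Hypothetical thresholds: caution at 42 °C, critical at 50 °C.
for reading in (42, 45, 55):
    print(reading, classify_temperature(reading, caution_c=42, critical_c=50))
```

With these illustrative thresholds, a reading of 42 is still normal, 45 is above caution, and 55 is above critical.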