OpenStack Management Services Alert Definitions

The Management Pack for VMware Integrated OpenStack provides the following predefined alert definitions for the OpenStack Management Services object type.

Table 1. OpenStack Management Services Alert Definitions
Alert Name	Alert Description	Symptom	Recommendations	Impact	Severity
One or more keystone-all services are unavailable	One or more keystone-all services are unavailable, and may be down.	Self Metric: Service\|Keystone:Keystone-All\|Running (%) < 100	Restart any keystone services that are down. Check the log files for more information about what caused the failure. Verify that the End Point Operations agent is running on the controller.	Health	Warning
The majority of keystone-all services are unavailable	The majority of keystone-all services are unavailable, which may be impacting the ability of projects to authenticate.	Self Metric: Service\|Keystone:Keystone-All\|Running (%) <= 50	Restart any keystone services that are down. Check the log files for more information about what caused the failure. Verify that the End Point Operations agent is running on the controller.	Health	Immediate
All keystone-all services are unavailable	All keystone-all services are unavailable, which may be impacting the ability of projects to authenticate.	Self Metric: Service\|Keystone:Keystone-All\|Running (%) = 0	Restart any keystone services that are down. Check the log files for more information about what caused the failure. Verify that the End Point Operations agent is running on the controller.	Health	Critical
One or more mySQL services are unavailable	One or more MySQL services are unavailable, and may be down.	Self Metric: Service\|MySQL\|Running (%) < 100	Restart any application services that are down. Check the log files for more information about what caused the failure. Verify that the End Point Operations agent is running on the controller.	Health	Warning
The majority of mySQL services are unavailable	The majority of MySQL services are unavailable, which may be impacting the OpenStack infrastructure.	Self Metric: Service\|MySQL\|Running (%) <= 50	Restart any application services that are down. Check the log files for more information about what caused the failure. Verify that the End Point Operations agent is running on the controller.	Health	Immediate
All mySQL services are unavailable	All MySQL services are unavailable, which may be impacting the OpenStack infrastructure.	Self Metric: Service\|MySQL\|Running (%) = 0	Restart any application services that are down. Check the log files for more information about what caused the failure. Verify that the End Point Operations agent is running on the controller.	Health	Critical
One or more Apache services are unavailable	One or more Apache services are unavailable, and may be down.	Self Metric: Service\|Apache\|Running (%) < 100	Restart any application services that are down. Check the log files for more information about what caused the failure. Verify that the End Point Operations agent is running on the controller.	Health	Warning
The majority of Apache services are unavailable	The majority of Apache services are unavailable, which may be impacting the OpenStack infrastructure.	Self Metric: Service\|Apache\|Running (%) <= 50	Restart any application services that are down. Check the log files for more information about what caused the failure. Verify that the End Point Operations agent is running on the controller.	Health	Immediate
All Apache services are unavailable	All Apache services are unavailable, which may be impacting the OpenStack infrastructure.	Self Metric: Service\|Apache\|Running (%) = 0	Restart any application services that are down. Check the log files for more information about what caused the failure. Verify that the End Point Operations agent is running on the controller.	Health	Critical
One or more RabbitMQ services are unavailable	One or more RabbitMQ services are unavailable, and may be down.	Self Metric: Service\|RabbitMQ\|Running (%) < 100	Restart any application services that are down. Check the log files for more information about what caused the failure. Verify that the End Point Operations agent is running on the controller.	Health	Warning
The majority of RabbitMQ services are unavailable	The majority of RabbitMQ services are unavailable, which may be impacting the OpenStack infrastructure.	Self Metric: Service\|RabbitMQ\|Running (%) <= 50	Restart any application services that are down. Check the log files for more information about what caused the failure. Verify that the End Point Operations agent is running on the controller.	Health	Immediate
All RabbitMQ services are unavailable	All RabbitMQ services are unavailable, which may be impacting the OpenStack infrastructure.	Self Metric: Service\|RabbitMQ\|Running (%) = 0	Restart any application services that are down. Check the log files for more information about what caused the failure. Verify that the End Point Operations agent is running on the controller.	Health	Critical
One or more Memcached services are unavailable	One or more Memcached services are unavailable, and may be down.	Self Metric: Service\|Memcached\|Running (%) < 100	Restart any application services that are down. Check the log files for more information about what caused the failure. Verify that the End Point Operations agent is running on the controller.	Health	Warning
The majority of Memcached services are unavailable	The majority of Memcached services are unavailable, which may be impacting the OpenStack infrastructure.	Self Metric: Service\|Memcached\|Running (%) <= 50	Restart any application services that are down. Check the log files for more information about what caused the failure. Verify that the End Point Operations agent is running on the controller.	Health	Immediate
All Memcached services are unavailable	All Memcached services are unavailable, which may be impacting the OpenStack infrastructure.	Self Metric: Service\|Memcached\|Running (%) = 0	Restart any application services that are down. Check the log files for more information about what caused the failure. Verify that the End Point Operations agent is running on the controller.	Health	Critical
One or more heat-api services are unavailable	One or more heat-api services are unavailable, and may be down	Self Metric: Service\|Heat:Heat-Api\|Running (%) < 100	Restart any heat services that are down. Check the log files for more information about what caused the failure. Verify that the End Point Operations agent is running on the controller.	Health	Warning
The majority of heat-api services are unavailable	The majority of heat-api services are unavailable, and may be down	Self Metric: Service\|Heat:Heat-Api\|Running (%) <= 50	Restart any heat services that are down. Check the log files for more information about what caused the failure. Verify that the End Point Operations agent is running on the controller.	Health	Immediate
All heat-api services are unavailable	All heat-api services are unavailable, and may be down	Self Metric: Service\|Heat:Heat-Api\|Running (%) = 0	Restart any heat services that are down. Check the log files for more information about what caused the failure. Verify that the End Point Operations agent is running on the controller.	Health	Critical
One or more heat-api-cfn services are unavailable	One or more heat-api-cfn services are unavailable, and may be down	Self Metric: Service\|Heat:Heat-Api-Cfn\|Running (%) < 100	Restart any heat services that are down. Check the log files for more information about what caused the failure. Verify that the End Point Operations agent is running on the controller.	Health	Warning
The majority of heat-api-cfn services are unavailable	The majority of heat-api-cfn services are unavailable, and may be down	Self Metric: Service\|Heat:Heat-Api-Cfn\|Running (%) <= 50	Restart any heat services that are down. Check the log files for more information about what caused the failure. Verify that the End Point Operations agent is running on the controller.	Health	Immediate
All heat-api-cfn services are unavailable	All heat-api-cfn services are unavailable, and may be down	Self Metric: Service\|Heat:Heat-Api-Cfn\|Running (%) = 0	Restart any heat services that are down. Check the log files for more information about what caused the failure. Verify that the End Point Operations agent is running on the controller.	Health	Critical
One or more heat-engine services are unavailable	One or more heat-engine services are unavailable, and may be down	Self Metric: Service\|Heat:Heat-Engine\|Running (%) < 100	Restart any heat services that are down. Check the log files for more information about what caused the failure. Verify that the End Point Operations agent is running on the controller.	Health	Warning
The majority of heat-engine services are unavailable	The majority of heat-engine services are unavailable, and may be down	Self Metric: Service\|Heat:Heat-Engine\|Running (%) <= 50	Restart any heat services that are down. Check the log files for more information about what caused the failure. Verify that the End Point Operations agent is running on the controller.	Health	Immediate
All heat-engine services are unavailable	All heat-engine services are unavailable, and may be down	Self Metric: Service\|Heat:Heat-Engine\|Running (%) = 0	Restart any heat services that are down. Check the log files for more information about what caused the failure. Verify that the End Point Operations agent is running on the controller.	Health	Critical
One or more heat-api-cloudwatch services are unavailable	One or more heat-api-cloudwatch services are unavailable, and may be down	Self Metric: Service\|Heat:Heat-Api-Cloudwatch\|Running (%) < 100	Restart any heat services that are down. Check the log files for more information about what caused the failure. Verify that the End Point Operations agent is running on the controller.	Health	Warning
The majority of heat-api-cloudwatch services are unavailable	The majority of heat-api-cloudwatch services are unavailable, and may be down	Self Metric: Service\|Heat:Heat-Api-Cloudwatch\|Running (%) <= 50	Restart any heat services that are down. Check the log files for more information about what caused the failure. Verify that the End Point Operations agent is running on the controller.	Health	Immediate
All heat-api-cloudwatch services are unavailable	All heat-api-cloudwatch services are unavailable, and may be down	Self Metric: Service\|Heat:Heat-Api-Cloudwatch\|Running (%) = 0	Restart any heat services that are down. Check the log files for more information about what caused the failure. Verify that the End Point Operations agent is running on the controller.	Health	Critical
One or more jarvis services are unavailable	One or more jarvis services are unavailable, and may be down	Self Metric: Service\|OMS:Jarvis\|Running (%) < 100	Restart any management services that are down. Check the log files for more information about what caused the failure. Verify that the End Point Operations agent is running on the controller.	Health	Warning
The majority of jarvis services are unavailable	The majority of jarvis services are unavailable, and may be down	Self Metric: Service\|OMS:Jarvis\|Running (%) <= 50	Restart any management services that are down. Check the log files for more information about what caused the failure. Verify that the End Point Operations agent is running on the controller.	Health	Immediate
All jarvis services are unavailable	All jarvis services are unavailable, and may be down	Self Metric: Service\|OMS:Jarvis\|Running (%) = 0	Restart any management services that are down. Check the log files for more information about what caused the failure. Verify that the End Point Operations agent is running on the controller.	Health	Critical
One or more vPostGres services are unavailable	One or more vPostGres services are unavailable, and may be down	Self Metric: Service\|vPostgres\|Running (%) < 100	Restart any vPostgres services that are down. Check the log files for more information about what caused the failure. Verify that the End Point Operations agent is running on the controller.	Health	Warning
The majority of vPostGres services are unavailable	The majority of vPostGres services are unavailable, which may be impacting the OpenStack infrastructure	Self Metric: Service\|vPostgres\|Running (%) <= 50	Restart any vPostgres services that are down. Check the log files for more information about what caused the failure. Verify that the End Point Operations agent is running on the controller.	Health	Immediate
All vPostGres services are unavailable	All vPostGres services are unavailable, which may be impacting the OpenStack infrastructure	Self Metric: Service\|vPostgres\|Running (%) = 0	Restart any vPostgres services that are down. Check the log files for more information about what caused the failure. Verify that the End Point Operations agent is running on the controller.	Health	Critical
One or more tc-oms services are unavailable	One or more tc-oms services are unavailable, and may be down	Self Metric: Service\|OMS:Tc-Oms\|Running (%) < 100	Restart any management services that are down. Check the log files for more information about what caused the failure. Verify that the End Point Operations agent is running on the controller.	Health	Warning
The majority of tc-oms services are unavailable	The majority of tc-oms services are unavailable, which may be impacting the OpenStack infrastructure	Self Metric: Service\|OMS:Tc-Oms\|Running (%) <= 50	Restart any management services that are down. Check the log files for more information about what caused the failure. Verify that the End Point Operations agent is running on the controller.	Health	Immediate
All tc-oms services are unavailable	All tc-oms services are unavailable, which may be impacting the OpenStack infrastructure	Self Metric: Service\|OMS:Tc-Oms\|Running (%) = 0	Restart any management services that are down. Check the log files for more information about what caused the failure. Verify that the End Point Operations agent is running on the controller.	Health	Critical
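
Every service in Table 1 follows the same three-tier pattern on its Running (%) metric: below 100 is a Warning, 50 or below is Immediate, and exactly 0 is Critical. The following Python sketch illustrates how those symptom conditions overlap. It is not part of the management pack; the names SYMPTOMS, matching_severities, and highest_severity are hypothetical, and reporting only the most severe match is an assumption made for illustration, not a statement of how the product resolves overlapping alerts.

```python
from typing import List, Optional

# Symptom conditions from Table 1, ordered from least to most severe.
# The names below are illustrative and do not come from the product.
SYMPTOMS = [
    ("Warning", lambda pct: pct < 100),
    ("Immediate", lambda pct: pct <= 50),
    ("Critical", lambda pct: pct == 0),
]


def matching_severities(running_pct: float) -> List[str]:
    """Return every severity whose symptom condition holds for running_pct."""
    if not 0 <= running_pct <= 100:
        raise ValueError("Running (%) must be between 0 and 100")
    return [name for name, holds in SYMPTOMS if holds(running_pct)]


def highest_severity(running_pct: float) -> Optional[str]:
    """Return the most severe matching level, or None if all services run.

    Assumption: only the most severe match is of interest to the reader.
    """
    matches = matching_severities(running_pct)
    return matches[-1] if matches else None


if __name__ == "__main__":
    # Example: 4 service instances with 4, 3, 2, and 0 of them running.
    for running in (4, 3, 2, 0):
        pct = 100.0 * running / 4
        print(pct, matching_severities(pct), highest_severity(pct))
```

Because the conditions are nested, a service at 0% satisfies all three symptoms, and a service at exactly 50% satisfies both the Warning and the Immediate symptoms.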