You use the Event Catalog to view the definitions of all of the events that the SDDC Manager monitors and records as part of its event-driven problem detection capabilities.

From the Events page, open the Event Catalog by clicking Catalog. To reach the Events page from the SDDC Manager dashboard, navigate to the System Status page and click View Details in the Events area.

Expand an event to see its definition, which includes details such as its severity, description, resource hierarchy, categories, and type.





You can filter the displayed list by the event severity.
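As a rough illustration of what filtering by severity amounts to, the following sketch models a few catalog entries and selects those with a given severity. The `EventDefinition` record, its field names, and the `filter_by_severity` helper are hypothetical names chosen for this example; they are not an SDDC Manager API or schema. The severity values and sample entries are taken from the tables in this topic.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    # The four severity values that appear in the tables in this topic.
    INFO = "INFO"
    WARNING = "WARNING"
    ERROR = "ERROR"
    CRITICAL = "CRITICAL"


@dataclass(frozen=True)
class EventDefinition:
    # Hypothetical record; the field names are illustrative only.
    name: str
    severity: Severity
    description: str


# A few entries copied from the hardware operational events table.
CATALOG = [
    EventDefinition("BMC_NOT_REACHABLE", Severity.WARNING,
                    "The software is unable to communicate with the server's OOB management port."),
    EventDefinition("HMS_AGENT_DOWN", Severity.CRITICAL,
                    "A physical rack's Hardware Management Services agent is down."),
    EventDefinition("SERVER_DOWN", Severity.ERROR,
                    "Server is in the powered-down state."),
    EventDefinition("SERVER_UP", Severity.INFO,
                    "Server is in the powered-up state."),
]


def filter_by_severity(catalog, severity):
    """Return the definitions whose severity equals the requested one."""
    return [d for d in catalog if d.severity is severity]
```

For example, `filter_by_severity(CATALOG, Severity.CRITICAL)` returns only the `HMS_AGENT_DOWN` entry from this sample list.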

Hardware Operational Events

The software raises the following events for conditions related to hardware operations. An event is raised when the software determines that the event's condition exists. The event report includes identifying information about the hardware device for which the event was raised and about its containing physical devices, such as the name of the server in which the device resides and the name of the physical rack in which that server resides. Where appropriate for the particular event, the report also includes other relevant values, such as current temperature values for temperature-related events.

Table 1. Hardware Operational Events Raised in a Cloud Foundation Environment

| Event Name | Severity | Short Description |
| --- | --- | --- |
| BMC_AUTHENTICATION_FAILURE | WARNING | The software is unable to authenticate to the server's out-of-band (OOB) management port. |
| BMC_MANAGEMENT_FAILURE | WARNING | The software failed to perform a management operation using the server's OOB management port. |
| BMC_NOT_REACHABLE | WARNING | The software is unable to communicate with the server's OOB management port. |
| CPU_CAT_ERROR | ERROR | A CPU has shut down due to the processor's catastrophic error (CATERR) signal. |
| CPU_INITIALIZATION_ERROR | ERROR | The software detected that a CPU startup initialization error has occurred. |
| CPU_MACHINE_CHECK_ERROR | ERROR | Server CPU has failed due to CPU Machine Check Error. |
| CPU_POST_FAILURE | ERROR | Server CPU has shut down due to POST failure. |
| CPU_TEMPERATURE_ABOVE_UPPER_THRESHOLD | WARNING | CPU temperature has reached its maximum safe operating temperature. |
| CPU_TEMPERATURE_BELOW_LOWER_THRESHOLD | WARNING | CPU temperature has reached its minimum safe operating temperature. |
| CPU_THERMAL_TRIP | ERROR | Server CPU has shut down due to thermal error. |
| DIMM_ECC_ERROR | ERROR | The software detected an uncorrectable Error Correction Code (ECC) error for a server's memory. |
| DIMM_TEMPERATURE_ABOVE_UPPER_THRESHOLD | WARNING | Memory temperature has reached its maximum safe operating temperature. |
| DIMM_THERMAL_TRIP | ERROR | Memory has shut down due to thermal error. |
| HDD_DOWN | ERROR | Operational status is down for an HDD storage drive. |
| HDD_EXCESSIVE_READ_ERRORS | WARNING | Excessive read errors reported for an HDD storage drive. |
| HDD_EXCESSIVE_WRITE_ERRORS | WARNING | Excessive write errors reported for an HDD storage drive. |
| HDD_TEMPERATURE_ABOVE_THRESHOLD | WARNING | HDD storage drive temperature has reached its maximum safe operating temperature. |
| HDD_UP | INFO | Operational status is up for an HDD storage drive. |
| HDD_WEAROUT_ABOVE_THRESHOLD | WARNING | Wear-out state of an HDD storage drive is above its defined threshold. |
| HMS_AGENT_DOWN | CRITICAL | A physical rack's Hardware Management Services agent is down. |
| HMS_AGENT_UP | INFO | A physical rack's Hardware Management Services agent is operational. |
| MANAGEMENT_SWITCH_DOWN | CRITICAL | Operational status is down for a physical rack's management switch. |
| MANAGEMENT_SWITCH_PORT_DOWN | WARNING | Operational status is down for a switch port in a physical rack's management switch. |
| MANAGEMENT_SWITCH_PORT_UP | INFO | Operational status is up for a switch port in a physical rack's management switch. |
| MANAGEMENT_SWITCH_UP | INFO | Operational status is up for a physical rack's management switch. |
| NIC_LINK_DOWN | WARNING | Deprecated. The NIC_PORT_DOWN event is used instead. |
| NIC_PACKET_DROP_ABOVE_THRESHOLD | WARNING | A NIC's packet drop is above its defined threshold. |
| NIC_PORT_DOWN | ERROR | Operational status is down for a NIC port. |
| NIC_PORT_UP | INFO | Operational status is up for a NIC port. |
| PCH_TEMPERATURE_ABOVE_THRESHOLD | WARNING | Platform controller hub (PCH) temperature has reached its maximum safe operating temperature. |
| SERVER_DOWN | ERROR | Server is in the powered-down state. |
| SERVER_PCIE_ERROR | ERROR | A server's system has PCIe errors. |
| SERVER_POST_ERROR | ERROR | A server's system has POST failures. |
| SERVER_UP | INFO | Server is in the powered-up state. |
| SPINE_SWITCH_DOWN | ERROR | Operational status is down for a physical rack's spine switch. |
| SPINE_SWITCH_PORT_DOWN | WARNING | Operational status is down for a switch port in a physical rack's spine switch. |
| SPINE_SWITCH_PORT_UP | INFO | Operational status is up for a switch port in a physical rack's spine switch. |
| SPINE_SWITCH_UP | INFO | Operational status is up for a physical rack's spine switch. |
| SSD_DOWN | ERROR | Operational status is down for an SSD storage device. |
| SSD_EXCESSIVE_READ_ERRORS | WARNING | Excessive read errors reported for an SSD storage drive. |
| SSD_EXCESSIVE_WRITE_ERRORS | WARNING | Excessive write errors reported for an SSD storage drive. |
| SSD_TEMPERATURE_ABOVE_THRESHOLD | WARNING | SSD storage drive temperature has reached its maximum safe operating temperature. |
| SSD_UP | INFO | Operational status is up for an SSD storage device. |
| SSD_WEAROUT_ABOVE_THRESHOLD | WARNING | Wear-out state of an SSD storage drive is above its defined threshold. |
| STORAGE_CONTROLLER_DOWN | ERROR | Operational status is down for a storage adapter. |
| STORAGE_CONTROLLER_UP | INFO | Operational status is up for a storage adapter. |
| TOR_SWITCH_DOWN | ERROR | Operational status is down for a physical rack's ToR switch. |
| TOR_SWITCH_PORT_DOWN | WARNING | Operational status is down for a switch port in a physical rack's ToR switch. |
| TOR_SWITCH_PORT_UP | INFO | Operational status is up for a switch port in a physical rack's ToR switch. |
| TOR_SWITCH_UP | INFO | Operational status is up for a physical rack's ToR switch. |
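Many of the events in this table come in matching `_UP` and `_DOWN` pairs. The sketch below shows one way a consumer of these events might derive the most recent operational status of each resource from an ordered stream of event names. The tuple layout, the resource identifiers, and the `latest_status` function are assumptions made for illustration; the source does not describe how event reports are delivered or structured.

```python
def latest_status(events):
    """Map (component, resource_id) to "up" or "down".

    `events` is an iterable of (event_name, resource_id) tuples, oldest first.
    This tuple shape is a hypothetical stand-in for a real event report.
    Event names that do not end in _UP or _DOWN are ignored.
    """
    status = {}
    for name, resource_id in events:
        if name.endswith("_UP"):
            status[(name[: -len("_UP")], resource_id)] = "up"
        elif name.endswith("_DOWN"):
            status[(name[: -len("_DOWN")], resource_id)] = "down"
    return status


# Illustrative stream; the resource identifiers are made up.
sample = [
    ("HDD_DOWN", "hdd-0"),
    ("SERVER_UP", "node-1"),
    ("HDD_UP", "hdd-0"),
    ("CPU_THERMAL_TRIP", "node-1"),
    ("TOR_SWITCH_DOWN", "tor-1"),
]
current = latest_status(sample)
```

Because later events overwrite earlier ones for the same key, the result reflects only the last `_UP` or `_DOWN` event seen for each resource.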

Audit Events

In a Cloud Foundation environment, an audit event is an event raised for a user-initiated or system-generated action. The audit event is raised when the software has determined the event's related auditable condition exists. As appropriate for the particular event, when the event is raised, the event report includes information such as the user who initiated the event, the type of operation that was performed, whether the operation succeeded or failed, and so on.

Table 2. Audit Events Raised in a Cloud Foundation Environment

| Event Name | Severity | Short Description |
| --- | --- | --- |
| DOMAIN_ADD_FAILED | WARNING | Creation and deployment of a workload domain failed. |
| DOMAIN_ADD_SUCCEEDED | INFO | Creation and deployment of a workload domain succeeded. |
| DOMAIN_RETRY_ADD | INFO | User has initiated the restart workflow action on a workload-domain-related workflow. |
| DOMAIN_STATUS_UPDATE | INFO | A workload-domain-related workflow has changed status. |
| DOMAIN_TASK_ADDED | INFO | The software has added a new subtask to a workload-domain-related workflow. The software creates workflows for certain user actions and this event is raised when the software adds a new subtask to such workflows. |
| DOMAIN_TASK_FAILED | WARNING | A subtask within a workload-domain-related workflow has failed. |
| DOMAIN_TASK_STATUS_UPDATE | INFO | A subtask within a workload-domain-related workflow has changed status. |
| DOMAIN_TASK_SUCCEEDED | INFO | A subtask within a workload-domain-related workflow has completed successfully. |
| DOMAIN_VDI_ADD | INFO | User has initiated the operation to create a VDI workload domain in the environment. |
| DOMAIN_VIRTUAL_INFRASTRUCTURE_ADD | INFO | User has initiated the operation to create a Virtual Infrastructure workload domain in the environment. |
| PERMISSION_GRANT_FAILED | WARNING | The user-initiated action to assign a role granting permissions to a user has failed. |
| PERMISSION_GRANT_SUCCEEDED | INFO | The user-initiated action to assign a role granting permissions to a user has succeeded. |
| PERMISSION_REVOKE_FAILED | WARNING | The user-initiated action to remove a role from a user and revoke the user's permissions granted by that role has failed. |
| PERMISSION_REVOKE_SUCCEEDED | INFO | The user-initiated action to remove a role from a user and revoke the user's permissions granted by that role has succeeded. |
| PERMISSION_UPDATE_FAILED | WARNING | The user-initiated action to change a user's existing role to another role has failed. |
| PERMISSION_UPDATE_SUCCEEDED | INFO | The user-initiated action to change a user's existing role to another role has completed successfully. |
| ROLE_ADD_FAILED | WARNING | The user-initiated action to create a new role in the environment has failed. |
| ROLE_ADD_SUCCEEDED | INFO | The user-initiated action to create a new role in the environment has completed successfully. |
| ROLE_DELETE_FAILED | WARNING | The user-initiated action to delete a role has failed. |
| ROLE_DELETE_SUCCEEDED | WARNING | The user-initiated action to delete a role has completed successfully. |
| ROLE_NAME_CHANGE_FAILED | WARNING | The user-initiated action to change a role's name has failed. |
| ROLE_NAME_CHANGE_SUCCEEDED | INFO | The user-initiated action to change a role's name has completed successfully. |
| ROLE_PRIVILEGE_UPDATE_FAILED | WARNING | The user-initiated action to change the privileges associated with a role has failed. |
| ROLE_PRIVILEGE_UPDATE_SUCCEEDED | INFO | The user-initiated action to change the privileges associated with a role has completed successfully. |
| SERVER_POWER_CYCLE_FAILED | WARNING | The user-initiated action to power cycle a server has failed. |
| SERVER_POWER_CYCLE_SUCCEEDED | WARNING | The user-initiated action to power cycle a server has completed successfully. |
| SERVER_POWER_OFF_FAILED | WARNING | The user-initiated action to power off a server has failed. |
| SERVER_POWER_OFF_SUCCEEDED | WARNING | The user-initiated action to power off a server has completed successfully. |
| SERVER_POWER_ON_FAILED | WARNING | The user-initiated action to power on a server has failed. |
| SERVER_POWER_ON_SUCCEEDED | INFO | The user-initiated action to power on a server has completed successfully. |
| USER_LOG_IN_FAILED | WARNING | Log in to SDDC Manager failed for the user. |
| USER_LOG_IN_SUCCEEDED | INFO | Log in to SDDC Manager succeeded for the user. |
| USER_LOG_OUT_FAILED | WARNING | Log out from SDDC Manager failed for the user. |
| USER_LOG_OUT_SUCCEEDED | INFO | Log out from SDDC Manager succeeded for the user. |
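Most audit event names end in `_SUCCEEDED` or `_FAILED`, which mirrors the succeeded-or-failed outcome that the event report carries. The following sketch, written under that naming assumption, splits an event name into its action and outcome and counts failed logins per user. The `(event_name, user)` record shape and both helper functions are hypothetical and are not part of any documented SDDC Manager interface.

```python
from collections import Counter


def split_outcome(event_name):
    """Return (action, outcome) for names ending in _SUCCEEDED or _FAILED.

    Names with neither suffix are returned unchanged with an outcome of None.
    """
    for suffix, outcome in (("_SUCCEEDED", "succeeded"), ("_FAILED", "failed")):
        if event_name.endswith(suffix):
            return event_name[: -len(suffix)], outcome
    return event_name, None


def failed_logins_by_user(records):
    """Count USER_LOG_IN_FAILED events per user.

    `records` is an iterable of (event_name, user) pairs; this shape is an
    assumption made for the example, not a documented report format.
    """
    return Counter(user for name, user in records if name == "USER_LOG_IN_FAILED")


# Illustrative records; the user names are made up.
audit_sample = [
    ("USER_LOG_IN_FAILED", "alice"),
    ("USER_LOG_IN_SUCCEEDED", "bob"),
    ("USER_LOG_IN_FAILED", "alice"),
    ("ROLE_ADD_FAILED", "bob"),
]
failures = failed_logins_by_user(audit_sample)
```

Events such as `DOMAIN_RETRY_ADD` or `DOMAIN_STATUS_UPDATE` carry no outcome suffix, so `split_outcome` returns `None` for their outcome.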

Life Cycle Management Events

The software raises the following events for the life cycle management operations that are available in your Cloud Foundation environment. Where appropriate for the particular event, the event report includes information such as the type of operation that was performed, whether the operation succeeded or failed, and the condition for which the event was raised. For details about using the life cycle management features available in your environment, see Patching and Upgrading Cloud Foundation.

Table 3. Life Cycle Management Events Raised in a Cloud Foundation Environment

| Event Name | Severity | Short Description |
| --- | --- | --- |
| BUNDLE_DOWNLOAD_FAILURE | ERROR | The software failed to download a bundle from the remote source location. The exact cause of the failure could not be detected by the software. |
| BUNDLE_DOWNLOAD_FILESIZE_MISMATCH | ERROR | The downloaded bundle's file size is greater than the file size specified in the bundle manifest. |
| BUNDLE_DOWNLOAD_INVALID_TAR_MANIFEST | ERROR | An error occurred while parsing the manifest file inside the downloaded bundle retrieved from the remote download source. |
| BUNDLE_DOWNLOAD_SCHEDULED | INFO | A bundle download is scheduled. The scheduled time is provided in the event description. |
| BUNDLE_DOWNLOAD_STARTED | INFO | Downloading the bundle from the bundle's remote source location has started. |
| BUNDLE_DOWNLOAD_SUCCEEDED | INFO | The software successfully downloaded the bundle from the bundle's remote source location. |
| BUNDLE_DOWNLOAD_TIMEOUT | ERROR | The bundle download process timed out while downloading the bundle from the remote source location. |
| BUNDLE_MANIFEST_DOWNLOAD_SUCCEEDED | INFO | The software successfully downloaded the bundle's manifest from the remote source location. |
| BUNDLE_MANIFEST_DOWNLOAD_FAILURE | ERROR | The software failed to retrieve the bundle manifest file from the remote source location. The exact cause of the failure could not be detected by the software. |
| BUNDLE_MANIFEST_INVALID | ERROR | The software has determined that the bundle manifest which was retrieved from the remote source location and written to the local repository is invalid. |
| BUNDLE_MANIFEST_SIGNATURE_INVALID | ERROR | The signature for the bundle manifest is invalid. |
| BUNDLE_MANIFEST_SIGNATURE_NOT_FOUND | ERROR | The software cannot locate the bundle manifest's signature file in the expected location. The signature file is used for validating the bundle manifest file. |
| BUNDLE_REPO_FILE_NOT_FOUND | WARNING | The software cannot locate the specified bundle at the expected location within the bundle repository. |
| BUNDLE_REPO_WRITE_FAILURE | ERROR | Problems with the bundle repository are preventing bundle downloads from completing successfully. |
| PARTIAL_BUNDLE_DOWNLOAD | ERROR | A bundle was not fully downloaded from its remote source location. The number of bytes downloaded does not match the number of bytes stated in the bundle manifest. |
| UPGRADE_ABORTED | WARNING | The software has automatically cancelled a scheduled upgrade because a workflow is taking place, such as a workload domain creation or deletion workflow. |
| UPGRADE_CANCELLED | INFO | User has cancelled the upgrade. |
| UPGRADE_COMPLETION | WARNING | The life cycle management upgrade completed. The upgraded component and the completion status are provided in the event description. |
| UPGRADE_FAILED | WARNING | Upgrade operation has failed. |
| UPGRADE_NOT_NEEDED | INFO | The software has determined all of the environment's components have up-to-date versions and upgrading them is not needed. |
| UPGRADE_SCHEDULED | INFO | A bundle upgrade is scheduled. The scheduled time is provided in the event description. |
| UPGRADE_STARTED | INFO | Upgrade operation has started. |
| UPGRADE_SUCCEEDED | INFO | Upgrade operation has succeeded. |
| UPGRADE_TIMEDOUT | WARNING | Upgrade operation has timed out. |
| VMWARE_DEPOT_CONNECT_FAILURE | WARNING | The software failed to connect to the remote source location from which the upgrade bundles are downloaded. |
| VMWARE_DEPOT_INDEX_FILE_NOT_FOUND | ERROR | The software cannot locate an index file at the remote source location. |
| VMWARE_DEPOT_INSUFFICIENT_PERMISSION | ERROR | The software failed to download a bundle or bundle manifest from the remote source location because the user account used to connect to the remote location does not have read permission for the remote directory or file. |
| VMWARE_DEPOT_INDEX_INVALID | ERROR | The retrieved bundle index is invalid. |
| VMWARE_DEPOT_MANIFEST_FILE_NOT_FOUND | ERROR | The software cannot locate a manifest file at the remote source location. |
| VMWARE_DEPOT_MISSING_BUNDLE | ERROR | The software cannot locate a bundle available for downloading from the remote source location. |
| VMWARE_DEPOT_UNKNOWN_HOST | ERROR | The software cannot resolve the VMware Depot host of the configured remote source location for downloading upgrade bundles. |
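Two of the events above describe a downloaded bundle whose size disagrees with its manifest. The sketch below illustrates how the two conditions could be told apart, assuming that `BUNDLE_DOWNLOAD_FILESIZE_MISMATCH` applies when more bytes arrive than the manifest states and that `PARTIAL_BUNDLE_DOWNLOAD` applies when fewer arrive. The second half of that assumption is an interpretation of the table, and the `classify_download_size` function is hypothetical; the source does not document how the software performs this check.

```python
def classify_download_size(downloaded_bytes, manifest_bytes):
    """Return the event name suggested by a size comparison, or None.

    Assumed mapping, based only on the short descriptions in the table:
      more bytes than the manifest states  -> BUNDLE_DOWNLOAD_FILESIZE_MISMATCH
      fewer bytes than the manifest states -> PARTIAL_BUNDLE_DOWNLOAD
      equal sizes                          -> no size-related event
    """
    if downloaded_bytes > manifest_bytes:
        return "BUNDLE_DOWNLOAD_FILESIZE_MISMATCH"
    if downloaded_bytes < manifest_bytes:
        return "PARTIAL_BUNDLE_DOWNLOAD"
    return None
```

A matching size does not by itself imply `BUNDLE_DOWNLOAD_SUCCEEDED`, because the table lists other failure conditions, such as an invalid manifest or signature, that are unrelated to size.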