The Cloud Foundation system provides built-in capabilities to help you perform effective operations monitoring, troubleshooting, performance management, infrastructure capacity planning, and compliance monitoring and auditing.

You use the built-in monitoring capabilities for these typical scenarios.

| Scenario | Monitoring Area | Examples |
|---|---|---|
| Are the systems online? | Operations and incident monitoring | Alerts raised to notify about issues that might require human intervention. |
| Why did a storage drive fail? | Troubleshooting | Hardware-centric views spanning inventory, configuration, usage, and event history to provide for diagnosis and resolution. |
| Is the infrastructure meeting tenant service level agreements (SLAs)? | Performance management | Analysis of system and device-level metrics to identify causes and resolutions. |
| At what future time will the systems get overloaded? | Infrastructure capacity planning | Trend analysis of detailed system and device-level metrics, with summarized periodic reporting. |
| What person performed which action and when? | Compliance monitoring and auditing | Event history of secured user actions, with periodic reporting. Workflow task history of actions performed in the system. |

The monitoring capabilities involve these features:

Events

An event is a record of a system condition that is potentially significant or interesting to you, such as a degradation, failure, or user-initiated configuration change. Multiple events might be generated for the same condition.

Audit events

In Cloud Foundation, an audit event is an event raised for a user-initiated or system-generated action. The following lists show some examples of actions that raise audit events. They are not complete lists of the actions that result in audit events.

Examples of user-initiated actions that raise audit events:

  • Users logging in and out of SDDC Manager

  • Users performing actions involving workflows, such as creating a workload domain

  • User actions involving provisioning

  • Users granting or revoking a role from other users

  • Account password changes, including successful and failed actions

  • Users performing actions on physical resources, such as powering off a host

  • Users performing life cycle management actions for the Cloud Foundation software

Examples of system-generated actions that raise audit events:

  • Validation activity, such as during the bring-up process

  • All workflows and tasks, including successful and failed actions

  • All actions of Cloud Foundation that are performed to fulfill user-initiated actions, such as host configuration activities to fulfill a user-initiated action to expand a workload domain

  • Network interface configuration changes
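
The auditing question from the scenarios above ("What person performed which action and when?") can be illustrated with a minimal sketch. The `AuditEvent` record, its field names, and the sample values are assumptions made for illustration only; they are not the Cloud Foundation event schema or an SDDC Manager API.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class AuditEvent:
    """Hypothetical audit event record (field names are assumptions)."""
    timestamp: datetime
    actor: Optional[str]  # user name, or None for a system-generated action
    action: str
    succeeded: bool


def actions_by_user(events: Iterable[AuditEvent], user: str) -> List[AuditEvent]:
    """Return the audit events initiated by one user, oldest first."""
    return sorted((e for e in events if e.actor == user), key=lambda e: e.timestamp)


# Made-up sample data covering user-initiated and system-generated actions.
events = [
    AuditEvent(datetime(2024, 1, 5, 9, 30), "alice", "Create workload domain", True),
    AuditEvent(datetime(2024, 1, 5, 9, 0), "alice", "Log in to SDDC Manager", True),
    AuditEvent(datetime(2024, 1, 5, 9, 45), None, "Configure host for domain expansion", True),
    AuditEvent(datetime(2024, 1, 5, 10, 0), "bob", "Change account password", False),
]

for e in actions_by_user(events, "alice"):
    outcome = "succeeded" if e.succeeded else "failed"
    print(e.timestamp.isoformat(), e.actor, e.action, outcome)
```

Recording the outcome alongside the actor and time reflects that both successful and failed actions raise audit events.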

Alerts

An alert is a record of a known, detected problem. Cloud Foundation has a built-in capability for detecting problems by using events raised at the device level, and for generating alerts that warn you about problems that would affect workload Service Level Agreements (SLAs) or that require human intervention. Multiple alerts are not generated for the same problem. Each alert generates two events: one when the alert is raised and one when the alert is cleared.
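
The relationship between alerts and events can be sketched as follows. This is a simplified model, not Cloud Foundation's implementation: the `AlertTracker` class, the event kind strings, and the problem identifier are all hypothetical. It shows only the two behaviors stated here, namely one alert per problem and two events per alert.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass(frozen=True)
class Event:
    """Hypothetical event record (field names are assumptions)."""
    kind: str     # "ALERT_RAISED" or "ALERT_CLEARED"
    problem: str  # identifier of the detected problem


@dataclass
class AlertTracker:
    """Keeps at most one active alert per problem and records raise/clear events."""
    active: Set[str] = field(default_factory=set)
    events: List[Event] = field(default_factory=list)

    def raise_alert(self, problem: str) -> bool:
        # A problem that already has an active alert does not get a second one.
        if problem in self.active:
            return False
        self.active.add(problem)
        self.events.append(Event("ALERT_RAISED", problem))
        return True

    def clear_alert(self, problem: str) -> bool:
        # Only an active alert can be cleared.
        if problem not in self.active:
            return False
        self.active.remove(problem)
        self.events.append(Event("ALERT_CLEARED", problem))
        return True


tracker = AlertTracker()
tracker.raise_alert("host-1/drive-3 failed")
tracker.raise_alert("host-1/drive-3 failed")  # same problem: no second alert
tracker.clear_alert("host-1/drive-3 failed")
print([e.kind for e in tracker.events])
```

In this sketch, the alert's full life cycle leaves exactly two entries in the event history, one for the raise and one for the clear.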

Workflows and tasks

A task is a unit of work performed by SDDC Manager that changes the state of a resource. A workflow is a long-running group of tasks that together accomplish an overall goal, such as creating a workload domain.
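
As a rough illustration of how tasks group into a workflow, consider the sketch below. The `Task` and `Workflow` classes, the status names, the example task names, and the rule for deriving a workflow's status from its tasks are all assumptions made for this example; the source does not describe how SDDC Manager represents or aggregates them.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Status(Enum):
    PENDING = "pending"
    SUCCESSFUL = "successful"
    FAILED = "failed"


@dataclass
class Task:
    """A unit of work that changes the state of a resource."""
    name: str
    status: Status = Status.PENDING


@dataclass
class Workflow:
    """A group of tasks that together accomplish an overall goal."""
    goal: str
    tasks: List[Task] = field(default_factory=list)

    def status(self) -> Status:
        # Assumed roll-up rule: any failed task fails the workflow, and the
        # workflow succeeds only when every one of its tasks has succeeded.
        if any(t.status is Status.FAILED for t in self.tasks):
            return Status.FAILED
        if self.tasks and all(t.status is Status.SUCCESSFUL for t in self.tasks):
            return Status.SUCCESSFUL
        return Status.PENDING


workflow = Workflow(
    goal="Create workload domain",
    tasks=[Task("Validate hosts"), Task("Configure networking")],  # hypothetical tasks
)
workflow.tasks[0].status = Status.SUCCESSFUL
print(workflow.goal, workflow.status().value)
```

Here the workflow is still pending because one of its two tasks has not finished.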

vRealize Log Insight instance deployed by Cloud Foundation

Use of the vRealize Log Insight instance deployed by Cloud Foundation is licensed separately. When this instance is licensed for use in your environment, events and log content for the physical resources and the VMware SDDC virtual infrastructure are sent to it. As a result, when you log in to the vRealize Log Insight web interface, you get a unified view of event and syslog information to assist with troubleshooting. Data from the events and audit events raised by Cloud Foundation is also sent to vRealize Log Insight. You can use the search, query, and reporting features of vRealize Log Insight to create trend reports and auditing reports from the event history. See Using vRealize Log Insight Capabilities in Your Cloud Foundation System.

Note:

The vRealize Log Insight environment that SDDC Manager deploys is sized to monitor only the hardware and software of your Cloud Foundation installation. The default sizing accommodates the events and logs that the Cloud Foundation environment is expected to send. It might not accommodate the volume of events and logs coming from additional applications or VMs that reside outside of your Cloud Foundation environment. Therefore, configuring the vRealize Log Insight environment that SDDC Manager deploys to collect events and logs from such applications or VMs is not supported in this release.