Use Troubleshooting to Investigate a Reported Problem

To troubleshoot problems with the VPSALES4632 virtual machine, evaluate symptoms, examine timeline information and events, and create metric charts to find the root cause.

If a review of the alerts did not help you identify the cause of the problem reported for the virtual machine, use the following tabs to examine the history and current state of the virtual machine: Alerts > Symptoms, Events > Timeline, and All Metrics.

Prerequisites

Locate the object for which the problem was reported. See Search for a Specific Object.
Review the alerts for the virtual machine to determine if the problem is already identified and recommendations made. See Review Alerts Related to Reported Problems.

Procedure

From the left menu, click Environment > Object Browser, and then click Inventory and select VPSALES4632 from the tree.
The main pane updates to display the object Summary tab.
Click the Alerts tab, click the Symptoms tab, and review the symptoms to determine if one of the symptoms is related to the reported problem.
Depending on how your alerts are configured, some symptoms might be triggered but not sufficient to generate an alert.
1. Review symptom names to determine if one or more symptoms are related to the reported problem.
  The Information column provides the triggering condition, trend, and current value. What are the most common symptoms that affect response time? Do you see any symptoms related to CPU or memory use?
2. Sort by the Created On date so that you can focus on the time frame in which your customer reported the problem.
3. Click the Status: Active filter button to deactivate the filter so that you can review active and inactive symptoms.
The problem appears to be related to CPU or memory use, but you do not yet know whether the problem is with the virtual machine or with the host.
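The sorting and filtering in this step can also be sketched outside the user interface. The following minimal Python sketch assumes that you copied the symptom rows into a list by hand. The record fields, symptom names, dates, and the six-hour window are hypothetical and are not taken from the product.

```python
from datetime import datetime, timedelta

# Hypothetical symptom rows copied by hand from the Symptoms tab.
symptoms = [
    {"name": "CPU usage is high", "created_on": datetime(2024, 5, 14, 9, 5)},
    {"name": "Memory usage is high", "created_on": datetime(2024, 5, 14, 8, 40)},
    {"name": "Disk latency is high", "created_on": datetime(2024, 5, 14, 9, 10)},
    {"name": "Memory usage is high", "created_on": datetime(2024, 5, 1, 12, 0)},
]

def symptoms_near(records, reported_at, window=timedelta(hours=6),
                  keywords=("cpu", "memory")):
    """Return records created within `window` of `reported_at` whose
    name mentions one of `keywords`, newest first."""
    hits = [
        r for r in records
        if abs(r["created_on"] - reported_at) <= window
        and any(k in r["name"].lower() for k in keywords)
    ]
    return sorted(hits, key=lambda r: r["created_on"], reverse=True)

# Assumed time at which the customer reported the problem.
reported_at = datetime(2024, 5, 14, 10, 0)
recent = symptoms_near(symptoms, reported_at)
for r in recent:
    print(r["created_on"], r["name"])
```

The sketch keeps both active and inactive symptoms, which mirrors deactivating the Status: Active filter.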
Click the Events > Timeline tabs and review the alerts, symptoms, and change events that might help identify common trends that are contributing to the reported problem.
1. To determine if other virtual machines had symptoms triggered and alerts generated at the same time as your reported problem, click View From > Peer.
  Other virtual machine alerts are added to the timeline. If you see that multiple virtual machines triggered symptoms in the same time frame, then you can investigate parent objects.
2. Click View From and select Host System from the Parent list.
  The alerts and symptoms that are associated with the host on which the virtual machine is deployed are added to the timeline. Use the information to determine if a correlation exists between the reported problem and the alerts on the host.
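The peer comparison in this step amounts to asking whether more than one virtual machine triggered symptoms in the same time frame. The following sketch illustrates that check under stated assumptions: the peer names VPSALES4633 and VPSALES4634, the timestamps, and the two-hour window are hypothetical.

```python
from datetime import datetime, timedelta

def objects_with_symptoms(events, start, end):
    """Return the sorted names of objects that triggered a symptom
    between `start` and `end`, inclusive."""
    return sorted({name for name, ts in events if start <= ts <= end})

# Hypothetical (object name, symptom time) pairs read from the timeline.
events = [
    ("VPSALES4632", datetime(2024, 5, 14, 9, 5)),
    ("VPSALES4633", datetime(2024, 5, 14, 9, 20)),
    ("VPSALES4634", datetime(2024, 5, 10, 12, 0)),
]

# Assumed report time and look-back window.
reported_at = datetime(2024, 5, 14, 9, 30)
affected = objects_with_symptoms(
    events, reported_at - timedelta(hours=2), reported_at
)

# More than one affected virtual machine suggests checking the parent host.
investigate_host = len(affected) > 1
print(affected, investigate_host)
```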
Click the Events > Events tab to view changes in the collected metrics for the problematic virtual machine. Metrics might direct you toward the cause of the reported problem.
1. Manipulate the Date Controls to identify the approximate time when your customer reported the problem.
2. Use the Filters to filter on event criticality and status. Select Symptoms if you want to include symptoms in your analysis.
3. Click an Event to view the details about the event.
4. Click View From, select Host System under Parents, and repeat the analysis.
A comparison of the events on the virtual machine and the host indicates that CPU or memory problems are the likely cause of the reported problem.
If the problem relates to CPU or memory use, click All Metrics and create metric charts to identify whether it is CPU, memory, or both.
1. If the host is still the focus, begin by working with host metrics.
2. In the metric list, double-click the CPU Usage (%) and the Memory Usage (%) metrics to add them to the workspace on the right.
3. In the map, click the VPSALES4632 object.
  The metric list now displays the virtual machine metrics.
4. In the metric list, double-click the CPU Usage (%) and the Memory Usage (%) metrics to add them to the workspace on the right.
5. Review the host and virtual machine charts to see if you can identify a pattern that indicates the cause of the reported problem.
Comparing the four charts shows normal CPU use on both the host and the virtual machine, and normal memory use on the virtual machine. However, memory use on the host is consistently elevated three days before the reported problem on VPSALES4632.
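The pattern described here, where one metric stays elevated over several days while the others remain normal, can be expressed as a simple check. The following sketch is an illustration only: the daily values and the 85% threshold are assumptions, not product defaults or data from this example environment.

```python
def sustained_above(samples, threshold, min_run):
    """Return True if at least `min_run` consecutive samples
    exceed `threshold`."""
    run = 0
    for value in samples:
        run = run + 1 if value > threshold else 0
        if run >= min_run:
            return True
    return False

# Hypothetical daily average usage (%), oldest first, one list per chart.
charts = {
    "host CPU": [35, 40, 38, 42, 37],
    "host memory": [60, 62, 91, 93, 92],
    "VM CPU": [20, 25, 22, 30, 24],
    "VM memory": [55, 58, 57, 60, 59],
}

THRESHOLD = 85  # assumed level that counts as "elevated"
MIN_DAYS = 3    # assumed number of consecutive days

flagged = [
    name for name, samples in charts.items()
    if sustained_above(samples, THRESHOLD, MIN_DAYS)
]
print(flagged)
```

With these sample values, only the host memory series is flagged, which matches the conclusion drawn from the four charts.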

Results

The host memory is consistently elevated, which impacts virtual machine response time. The number of running virtual machines is well within the supported number, so the cause might be many process-intensive applications running on the virtual machines. To relieve the host, move some of the virtual machines to other hosts, distribute the workload, or power off idle virtual machines.

What to do next

In this example, use VMware Aria Operations to power off virtual machines on the host so that you can improve performance in the running virtual machines. See Run Actions.
If you want to use the combination of charts that you created on the All Metrics tab again, click Generate Dashboard.