You can create a log RCA (Root Cause Analysis) investigation to troubleshoot logs and identify the root cause of major issues in an environment. Each investigation variation automatically scans all the logs in a user-defined period to find the few significant ones.

Note: The Log RCA feature is being deprecated and will not be supported in future releases.

The algorithm detects anomalous logs by assessing their frequency, mean, and variance and groups them into log clusters based on their significance.
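
As a rough illustration only, the idea of flagging a log pattern whose frequency deviates from its baseline mean and variance can be sketched as a z-score check. This is not the product's actual algorithm, which is not described here. The function name, the threshold values, and the mapping from sensitivity level to threshold are all assumptions made for the example.

```python
from statistics import mean, pstdev

# Hypothetical mapping from sensitivity level to a z-score threshold.
# These numbers are assumptions for illustration, not product settings.
THRESHOLDS = {"high": 1.5, "low": 3.5}


def is_anomalous(baseline_counts, observed_count, sensitivity="low"):
    """Return True if observed_count deviates from the baseline mean by
    more than the threshold, measured in standard deviations."""
    mu = mean(baseline_counts)
    sigma = pstdev(baseline_counts)
    if sigma == 0:
        # Flat baseline: any change from the mean counts as a deviation.
        return observed_count != mu
    z = abs(observed_count - mu) / sigma
    return z > THRESHOLDS[sensitivity]


# Per-minute counts of one log pattern during normal operation (made-up data).
baseline = [10, 12, 11, 9, 10, 12, 11, 9]
print(is_anomalous(baseline, 13, "high"))  # minor deviation, flagged at high sensitivity
print(is_anomalous(baseline, 13, "low"))   # same deviation, ignored at low sensitivity
```

In this sketch, a lower threshold makes the check react to smaller deviations, which mirrors the way a higher sensitivity level surfaces more minor anomalies.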

Prerequisites

Verify that you are logged in to the VMware Aria Operations for Logs (SaaS) web user interface as an administrator.

Procedure

  1. Click the two arrows icon in the upper-left corner of the screen to expand the main menu.
  2. Navigate to Analytics > Log RCA.
  3. In the upper-right corner of the Log RCA page, click New Investigation.
    If you are using log RCA for the first time, click Get Started Now to activate the feature.
    Note:
    • The log RCA service needs a few minutes to process logs, and a longer time to produce meaningful results. The accuracy of log RCA increases with the amount of time the service runs.
    • You cannot run an RCA for an issue that occurred before you activated the log RCA feature.
  4. Enter a name and, optionally, a description for the investigation.
  5. Select the context for the investigation from the drop-down menu. The default context is the Org Level Context, which indicates the current region or organization.
    If there is only one context to choose from, the drop-down menu is not displayed.
  6. Enter a date and time for the investigation.
  7. Select a scan period for the investigation. This period is the duration before the incident date and time within which the logs are scanned. You can select a scan period of 2 minutes or 5 minutes.
    For example, if you select the period as 5 minutes and the investigation date and time is August 17, 2021, 11:10 am, the logs within the time frame from August 17, 2021, 11:05 am to August 17, 2021, 11:10 am will be scanned.
  8. Select a sensitivity level to detect anomalous logs.
    For example, select the High sensitivity level to focus on minor deviations from the baseline. Select the Low sensitivity level to focus on major deviations for more concise results.
  9. Click Start to begin the log RCA investigation.
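
The relationship between the investigation date and time and the scan period in steps 6 and 7 can be sketched as a small calculation. The function and constant names are hypothetical and exist only for this example. The sketch reproduces the August 17, 2021 example from step 7.

```python
from datetime import datetime, timedelta

# Scan periods offered in step 7 of the procedure.
ALLOWED_SCAN_MINUTES = (2, 5)


def scan_window(incident_time, scan_minutes):
    """Return the (start, end) of the time frame that is scanned:
    the scan period immediately before the incident date and time."""
    if scan_minutes not in ALLOWED_SCAN_MINUTES:
        raise ValueError(f"scan period must be one of {ALLOWED_SCAN_MINUTES} minutes")
    return incident_time - timedelta(minutes=scan_minutes), incident_time


start, end = scan_window(datetime(2021, 8, 17, 11, 10), 5)
print(start, "to", end)
```

With a 5-minute scan period and an investigation time of 11:10 am, the computed window runs from 11:05 am to 11:10 am on the same day.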

Results

The details of your investigation and its status appear in the Log RCA page. A green tick mark in the Variations and Status column indicates a successful RCA and a red exclamation mark indicates a failed RCA.

If you cannot locate your investigation in the page, use the search functionality and filters to find it.

Click the investigation name to view the significant log clusters. The clusters with the highest scores are displayed at the top and have a higher probability of being relevant to the root cause. In addition to the score, you can see the number of log messages and key terms in each cluster. The top activity for each cluster is also displayed. To view the logs in a cluster, click See All Activities. To view the logs in the Explore Logs page, click the three dots icon next to the log cluster and click View in Explore Logs. This information can help you identify the root cause of your issue.

All your investigations and their related data are retained for a period displayed at the bottom of the Log RCA page. Click Change to update the retention period. In the Investigation Retention Period dialog box, select a retention period and click Save. You can select a minimum retention period of 7 days and maximum retention period of 180 days. When you start using log RCA, the default retention period is 30 days.
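
The retention limits above can be summarized as a simple range check. This is a minimal sketch with hypothetical names; it models only the documented limits and is not a product API.

```python
# Limits stated for the Investigation Retention Period dialog box.
MIN_RETENTION_DAYS = 7
MAX_RETENTION_DAYS = 180
DEFAULT_RETENTION_DAYS = 30


def validate_retention(days=DEFAULT_RETENTION_DAYS):
    """Return days if it lies within the allowed range, else raise ValueError."""
    if not MIN_RETENTION_DAYS <= days <= MAX_RETENTION_DAYS:
        raise ValueError(
            f"retention period must be between {MIN_RETENTION_DAYS} "
            f"and {MAX_RETENTION_DAYS} days, got {days}"
        )
    return days


print(validate_retention())    # default retention period
print(validate_retention(90))  # a custom value within the limits
```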

To retain an investigation for a longer period, in the Log RCA page, click the three dots icon next to the investigation name and then click Save for Future.

To remove an investigation, click the three dots icon next to the investigation name and then click Delete Investigation.

Optionally, you can create a variation of an investigation. Click the investigation name and then click New Variation. Update the scan period and log sensitivity level, and click Start. The new variation is displayed when you open the investigation. Click each variation to view the corresponding log clusters and other information. The number of variations for each investigation is displayed in the Log RCA page, in the Variations and Status column.

What to do next

After viewing the results of your investigation and identifying the root cause, you can take the required steps to prevent similar issues in the future. You can also improve your monitoring by creating alert rules based on the discovered logs from the Explore Logs page.