The Self-Healing feature enables VMware SD-WAN Enterprise and Managed Service Provider (MSP) users to activate and configure Self-Healing capabilities at the Customer, Profile, and Edge levels.

After an Operator user enables the Self-Healing feature for an Enterprise in VMware SD-WAN Orchestrator, VMware Edge Network Intelligence (ENI) monitors the VMware SD-WAN network for systemic and application performance issues across Edges. ENI then gathers data about Self-Healing actions and sends remediation recommendations to SD-WAN users directly through the incident alert email.

Key Features

The following are the two key features of Self-Healing:

  • Incidents - An "incident" is created when the analytics engine detects a sudden drop in application performance compared with the preceding 30 minutes. This feature operates on the minutes timescale of the Self-Healing system.
  • Recommendations - A "recommendation" is created when the analytics engine identifies an Edge experiencing significantly poorer application performance than other Edges in the network over a longer time window. This feature operates on the days timescale of the Self-Healing system.
    Note: Currently, the Recommendations feature is not supported. It will be made available in a future release.
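
The incident rule above can be illustrated with a minimal Python sketch. It is not the ENI analytics engine, whose detection logic is not described here. The sketch assumes one latency sample per minute, and the class name LatencyBaseline and the DEGRADATION_FACTOR threshold are hypothetical names chosen for illustration. Only the 30-minute window comes from the text.

```python
from collections import deque
from statistics import mean

WINDOW_MINUTES = 30        # recent-history window named in the text
DEGRADATION_FACTOR = 2.0   # hypothetical threshold, not an ENI value


class LatencyBaseline:
    """Keeps one latency sample per minute for the last WINDOW_MINUTES."""

    def __init__(self, window=WINDOW_MINUTES):
        self.samples = deque(maxlen=window)

    def is_sudden_drop(self, latency_ms):
        """Return True if latency_ms is much worse than the recent average."""
        # No verdict until the window holds a full 30 minutes of history.
        if len(self.samples) < self.samples.maxlen:
            return False
        return latency_ms > DEGRADATION_FACTOR * mean(self.samples)

    def observe(self, latency_ms):
        """Check the new sample against the baseline, then record it."""
        degraded = self.is_sudden_drop(latency_ms)
        self.samples.append(latency_ms)
        return degraded


baseline = LatencyBaseline()
results = [baseline.observe(100.0) for _ in range(30)]  # steady history
spike = baseline.observe(2700.0)                        # sudden latency spike
```

In this sketch the first 30 samples only build the baseline, so none of them is flagged. The 2700 ms sample is flagged because it exceeds twice the 100 ms average of the preceding window.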

An incident outlines the following:

  • Impact - This denotes the number of Edges that have been affected by the application performance drop.
  • Flow stats - This identifies the specific flow metrics that show a significant change in value, for example, average TCP latency spiked to 2.7s.
  • Flow overlay route - This denotes the most common next-hop node on the SD-WAN overlay (or Direct) among the affected Edges.
  • Other impact - This identifies whether the issue is specific to that application or also affects other applications.
  • Remediation - This identifies the remediation action, which can be applied through a click or, if automatic remediation is enabled, has been applied by the Self-Healing system.

Currently, the remediation action for an incident is triggered manually from ENI. In Manual remediation, the Self-Healing system only suggests a remediation action and provides a UI workflow to trigger it. The remediation action can be triggered directly from the incident alert or through the ENI app.
Note: Currently, ENI supports only Manual remediation. Automatic remediation support is planned for future releases.
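
As a rough illustration of the incident fields and the Manual remediation flow described above, the following Python sketch models an incident as a simple record. It is not an ENI or Orchestrator API. The Incident class, its field names, the apply_remediation function, and all sample values except the 2.7s latency are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Incident:
    """Hypothetical incident record; field names are illustrative only."""
    application: str
    affected_edges: list          # basis for Impact
    flow_stats: dict              # Flow stats
    overlay_route: str            # Flow overlay route
    other_apps_affected: bool     # Other impact
    suggested_remediation: str    # Remediation
    remediation_applied: bool = False

    @property
    def impact(self):
        """Number of Edges affected by the performance drop."""
        return len(self.affected_edges)


def apply_remediation(incident, confirmed_by_user):
    """Manual remediation: act only on explicit user confirmation."""
    if not confirmed_by_user:
        return False
    incident.remediation_applied = True
    return True


incident = Incident(
    application="example-app",                 # placeholder name
    affected_edges=["edge-1", "edge-2"],       # placeholder Edge names
    flow_stats={"avg_tcp_latency_s": 2.7},     # value from the example above
    overlay_route="Direct",
    other_apps_affected=False,
    suggested_remediation="example-action",    # placeholder action
)
skipped = apply_remediation(incident, confirmed_by_user=False)
applied = apply_remediation(incident, confirmed_by_user=True)
```

The confirmed_by_user flag stands in for the click in the incident alert or the ENI app. Without it, the sketch leaves the incident untouched, which mirrors the suggestion-only behavior of Manual remediation.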

Benefits

  • Quickly identifies systemic issues in the network and proactively creates a recommendation when the analytics engine identifies an Edge experiencing significantly poorer application performance than other Edges in the network.
  • Detects and remediates sudden, significant degradation in end-user application performance with minimal user intervention.
  • Reduces time to find and fix application performance issues.