Summarizes the alerts and events generated within the VMware SASE Orchestrator at the Operator level.

This document provides details about all Operator-level Orchestrator events. Although these events are stored within the SASE Orchestrator and displayed on the Orchestrator UI, most of them are generated by an SD-WAN Gateway or one of its running components (MGD, PROCMON, and so on). A few are generated by the Orchestrator itself. You can configure notifications or alerts for these events only in the Orchestrator.

The following table provides an explanation for each of the columns in the "Operator-level Orchestrator Events" table:

Column name Details
EVENT Unique name of the event
DISPLAYED ON ORCHESTRATOR UI AS Specifies how the event is displayed on the Orchestrator.
SEVERITY The severity with which this event is usually generated.
GENERATED BY The VMware SD-WAN component generating the notification, which can be one of the following:
  • SASE Orchestrator
  • SD-WAN Gateway
GENERATED WHEN Technical reason(s) and circumstances under which this event is generated.
RELEASE ADDED IN The release in which this event was first added. If not specified, the event existed prior to release 2.5.
DEPRECATED Specifies if the event is deprecated from a specific release.
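
As an illustration of how the columns above fit together, the following is a minimal Python sketch of one table row as a record. The class name `EventDefinition`, its field names, and the `release_label` helper are hypothetical and are not part of any Orchestrator API; only the column meanings and the "prior to release 2.5" rule come from this document.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EventDefinition:
    """One row of the events table (hypothetical model, not an Orchestrator type)."""
    event: str                           # EVENT: unique name of the event
    displayed_as: str                    # DISPLAYED ON ORCHESTRATOR UI AS
    severity: str                        # SEVERITY, for example INFO or ERROR
    generated_by: str                    # "SASE Orchestrator" or "SD-WAN Gateway"
    generated_when: str = ""             # GENERATED WHEN (empty for some rows)
    release_added: Optional[str] = None  # RELEASE ADDED IN, if specified
    deprecated: Optional[str] = None     # DEPRECATED, if specified

    def release_label(self) -> str:
        # Per the column description, an unspecified release means the
        # event existed prior to release 2.5.
        return self.release_added or "prior to 2.5"


gateway_up = EventDefinition("GATEWAY_UP", "Gateway up", "INFO", "SASE Orchestrator")
external_ca = EventDefinition(
    "ENABLE_EXTERNAL_CA", "External CA Enabled", "CRITICAL",
    "SASE Orchestrator", release_added="4.3.0",
)
```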

Operator-level Orchestrator Events

EVENT DISPLAYED ON ORCHESTRATOR UI AS SEVERITY GENERATED BY GENERATED WHEN RELEASE ADDED IN DEPRECATED
GATEWAY_UP Gateway up INFO SASE Orchestrator A Gateway restores connectivity with the Orchestrator after losing it.
GATEWAY_DOWN Gateway down INFO SASE Orchestrator A Gateway fails to communicate after losing connectivity with the Orchestrator.
GATEWAY_LARGE_PKT_SIZE Packet size limit exceeded INFO SD-WAN Gateway A packet received from a Gateway's peer exceeds the packet size limit.
GATEWAY_SERVICE_FAILED Gateway service failed ERROR SD-WAN Gateway The GWD service on the Gateway fails.
GATEWAY_BFD_NEIGHBOR_UP BFD session established to Gateway neighbor INFO SD-WAN Gateway A Gateway BFD neighbor comes back up.
GATEWAY_BFD_NEIGHBOR_DOWN Gateway BFD neighbor unavailable INFO SD-WAN Gateway A Gateway BFD neighbor goes down.
GATEWAY_ICMP_PROBE_UNSTABLE SD-WAN Gateway: ICMP probe unstable ALERT SD-WAN Gateway The ICMP probe goes down on Partner Gateway.
GATEWAY_REBALANCE Gateways rebalanced INFO SASE Orchestrator
PROXY_ENABLE_OPERATOR_ACCESS Partner access delegated to operator INFO SASE Orchestrator
PROXY_DISABLE_OPERATOR_ACCESS Partner access revoked to operator INFO SASE Orchestrator
VRF_ROUTEMAP_RULES_MAX_LIMIT_HIT VRF route map rules limit exceeded WARNING SD-WAN Gateway The number of VRF inbound/outbound route map rules exceeds the maximum limit (32).
VRF_LIMIT_EXCEEDED VRF limit exceeded ALERT SD-WAN Gateway The number of configured VRF entries exceeds the maximum limit (1000).
ENABLE_EXTERNAL_CA External CA Enabled CRITICAL SASE Orchestrator The ca.external.enable property is set to true. 4.3.0
DISABLE_EXTERNAL_CA External CA Disabled CRITICAL SASE Orchestrator The ca.external.enable property is set to false. 4.3.0
INSERT_EXTERNAL_CA External CA Inserted CRITICAL SASE Orchestrator External CA is inserted into the VELOCLOUD_CERTIFICATE_AUTHORITY table and becomes a trusted issuer. 4.3.0
CREATE_COMPOSITE_ROLE Composite Role Created INFO SASE Orchestrator A composite role is created by an Enterprise, Partner, or Operator. 4.5.0
UPDATE_COMPOSITE_ROLE Composite Role Updated INFO SASE Orchestrator A composite role is updated by an Enterprise, Partner, or Operator. 4.5.0
DELETE_COMPOSITE_ROLE Composite Role Deleted INFO SASE Orchestrator A composite role is deleted by an Enterprise, Partner, or Operator. 4.5.0
CA_VALIDATION CA validation failure ALERT SASE Orchestrator The CA certificate attributes are rejected. 5.0.0
ENI_ACTIVATION_CONFIG_SENT ENI activation config sent INFO SASE Orchestrator The activation config has been successfully sent to the ENI server. 5.0.0
Auto_Rate_Limit_Enabled Auto Rate-Limit Enabled WARNING SD-WAN Gateway The auto rate-limit capability is activated on a Gateway when the Gateway detects that certain Edges are sending a large amount of traffic, which might cause the Gateway to become unstable and drop packets. The event message lists the Edges (Enterprise, Rate Limit Percentage) on which auto rate-limit is activated. 5.2.0
Auto_Rate_Limit_Disabled Auto Rate-Limit Disabled WARNING SD-WAN Gateway Gateway auto rate-limit condition is restored. 5.2.0
POLL_IDPS_SIGNATURE_FAIL Failure in poll job that queries and downloads signature file from GSM ERROR SASE Orchestrator The SASE Orchestrator backend poll job fails to retrieve or download the Suricata signature from GSM and update profiles with the new signature metadata. 5.2.0
IDPS_SIGNATURE_VCO_VERSION_CHECK_FAIL Querying existing signature version from local DB failed ERROR SASE Orchestrator The SASE Orchestrator backend poll job fails to retrieve the existing Suricata signature version from the Orchestrator's local database. 5.2.0
IDPS_SIGNATURE_GSM_VERSION_CHECK_FAIL Querying signature metadata from GSM failed ERROR SASE Orchestrator The SASE Orchestrator backend poll job fails to retrieve the existing Suricata signature metadata (which includes the signature version) from GSM. 5.2.0
IDPS_SIGNATURE_SKIP_DOWNLOAD_NO_UPDATE Skipping signature download due to no change in signature version INFO SASE Orchestrator The SASE Orchestrator backend poll job skips downloading the Suricata signature file because the signature file version has not changed. 5.2.0
IDPS_SIGNATURE_STORE_FAILURE_NO_PATH Filestore path not set to store signature file ERROR SASE Orchestrator The SASE Orchestrator backend poll job fails to store the Suricata signature file because the filestore path is not set. 5.2.0
IDPS_SIGNATURE_DOWNLOAD_SUCCESS Successfully downloaded signature file from GSM INFO SASE Orchestrator The SASE Orchestrator backend poll job successfully downloads the Suricata signature file from GSM. 5.2.0
IDPS_SIGNATURE_DOWNLOAD_FAILURE Failed to download signature file from GSM ERROR SASE Orchestrator The SASE Orchestrator backend poll job fails to download the Suricata signature file from GSM. 5.2.0
IDPS_SIGNATURE_STORE_SUCCESS Successfully stored the signature file in filestore INFO SASE Orchestrator The SASE Orchestrator backend poll job successfully stores the Suricata signature file in the local filestore. 5.2.0
IDPS_SIGNATURE_STORE_SIGNATURE_FAILURE Failed to store the signature file in filestore ERROR SASE Orchestrator The SASE Orchestrator backend poll job fails to store the Suricata signature file in the local filestore. 5.2.0
IDPS_SIGNATURE_METADATA_INSERT_SUCCESS Successfully added metadata of the signature file to local DB INFO SASE Orchestrator The SASE Orchestrator backend poll job successfully adds the metadata of the Suricata signature file to the local DB. 5.2.0
IDPS_SIGNATURE_METADATA_INSERT_FAILURE Failure to add metadata of the signature file to local DB ERROR SASE Orchestrator The SASE Orchestrator backend poll job fails to add the metadata of the Suricata signature file to the local DB. 5.2.0
SELF_HEALING_REPORT anomaly_type: <Remote Route inconsistency>, num_routes_recovered:<number>, shr state: <DONE> ALERT SD-WAN Gateway Routes are detected as missing from a customer enterprise connected to an SD-WAN Gateway, and the Gateway corrects the issue by using the Self-Healing Routing feature to recover the missing routes. 5.2

SD-WAN Gateway Capacity Events

Currently, SD-WAN Gateway assignment is based on geo-proximity and does not take SD-WAN Gateway capacity health metrics into account. To improve Edge-to-Gateway assignment, the capacity health metrics (Edge Count, Tunnel Count, PKI Activated Tunnel Count, Flow Count, NAT Count, Packet Queue Watermark, and Packet Drops) are monitored periodically against warning and critical thresholds. When any metric rises above its defined warning or critical threshold, a Gateway capacity event is generated and reported to the SASE Orchestrator. These events give Operators and Partners clear visibility into Gateway health so that they can make informed Gateway assignments.
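
The threshold logic described above can be sketched in Python as follows. This is a minimal illustration, not the Gateway's actual implementation: the `Threshold` class and `classify` function are hypothetical names, and the percentages and supported values are taken from the capacity events table that follows. The sketch returns the state a metric value maps to; when exactly the Gateway emits each event (for example, only on a state change) is not specified in this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    """Warning and critical limits as percentages of a supported value (hypothetical)."""
    supported: int     # supported maximum for the metric
    warning_pct: int   # warning threshold, percent of supported
    critical_pct: int  # critical threshold, percent of supported

    @property
    def warning(self) -> int:
        return self.supported * self.warning_pct // 100

    @property
    def critical(self) -> int:
        return self.supported * self.critical_pct // 100


def classify(value: int, t: Threshold) -> str:
    """Map a metric value to the capacity event name for its state."""
    if value > t.critical:
        return "GATEWAY_CRITICAL"
    if value > t.warning:
        return "GATEWAY_DEGRADED"
    return "GATEWAY_STABLE"


# Edge Count on a 4 CPU, 32G MEM Gateway: supported 2000, warning 90%, critical 95%.
edge_count_4cpu = Threshold(supported=2000, warning_pct=90, critical_pct=95)
# Flow Count: supported 1920000, warning 50%, critical 75%.
flow_count = Threshold(supported=1920000, warning_pct=50, critical_pct=75)
```

For example, `classify(1850, edge_count_4cpu)` falls between the warning threshold (1800) and the critical threshold (1900), so it maps to `GATEWAY_DEGRADED`.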

The following are the SD-WAN Gateway capacity events generated based on the capacity threshold limits.

Metric Trigger Event Severity Message Event Detail
Edge Count Above Warning Threshold. The Warning threshold value is 90% of the following supported values:
  • 4 CPU, 32G MEM - 2000
  • 8 CPU, 32G MEM - 4000
GATEWAY_DEGRADED NOTICE Over capacity alert due to high number of connected Edges as Gateway has crossed warning threshold.
  • 4 CPU: The number of connected Edges is above the warning threshold (1800).
  • 8 CPU: The number of connected Edges is above the warning threshold (3600).
Edge Count Above Critical Threshold. The Critical threshold value is 95% of the following supported values:
  • 4 CPU, 32G MEM - 2000
  • 8 CPU, 32G MEM - 4000
GATEWAY_CRITICAL NOTICE Over capacity alert due to high number of connected Edges as Gateway has crossed critical threshold.
  • 4 CPU: The number of connected Edges is above the critical threshold (1900).
  • 8 CPU: The number of connected Edges is above the critical threshold (3800).
Edge Count Below Warning Threshold GATEWAY_STABLE INFO Over capacity condition due to high number of connected Edges restored. The number of connected Edges is within the acceptable threshold.
Tunnel Count Above Warning Threshold. The Warning threshold value is 90% of the following supported values:
  • 4 CPU, 32G MEM - 3000
  • 8 CPU, 32G MEM - 6000
GATEWAY_DEGRADED NOTICE Over capacity alert due to high number of tunnels.
  • 4 CPU: The number of tunnels is above the warning threshold (2700).
  • 8 CPU: The number of tunnels is above the warning threshold (5400).
Tunnel Count Above Critical Threshold. The Critical threshold value is 95% of the following supported values:
  • 4 CPU, 32G MEM - 3000
  • 8 CPU, 32G MEM - 6000
GATEWAY_CRITICAL NOTICE Over capacity alert due to high number of tunnels.
  • 4 CPU: The number of tunnels is above the critical threshold (2850).
  • 8 CPU: The number of tunnels is above the critical threshold (5700).
Tunnel Count Below Warning Threshold GATEWAY_STABLE INFO Over capacity condition due to high number of tunnels restored. The number of tunnels is within the acceptable threshold.
Flow Count Above Warning Threshold. The Warning threshold value is 50% of the supported value 1920000. GATEWAY_DEGRADED NOTICE Over capacity alert due to high number of flows. The number of flows is above the warning threshold (960000).
Flow Count Above Critical Threshold. The Critical threshold value is 75% of the supported value 1920000. GATEWAY_CRITICAL NOTICE Over capacity alert due to high number of flows. The number of flows is above the critical threshold (1440000).
Flow Count Below Warning Threshold GATEWAY_STABLE INFO Over capacity condition due to high number of flows restored. The number of flows is within the acceptable threshold.
NAT Entries Count Above Warning Threshold. The Warning threshold value is 50% of the supported value 1920000. GATEWAY_DEGRADED NOTICE Over capacity alert due to high number of NAT entries. The number of NAT entries is above the warning threshold (960000).
NAT Entries Count Above Critical Threshold. The Critical threshold value is 75% of the supported value 1920000. GATEWAY_CRITICAL NOTICE Over capacity alert due to high number of NAT entries. The number of NAT entries is above the critical threshold (1440000).
NAT Entries Count Below Warning Threshold GATEWAY_STABLE INFO Over capacity condition due to high number of NAT entries restored. The number of NAT entries is within the acceptable threshold.
Packet Queue Watermark Above Critical Threshold GATEWAY_CRITICAL NOTICE Over capacity alert due to high packet queue watermark. The packet queue watermark is above the critical threshold (6000) for 5 consecutive seconds.
Packet Queue Watermark Above Warning Threshold GATEWAY_DEGRADED NOTICE Over capacity alert due to high packet queue watermark. The packet queue watermark is above the warning threshold (2000) for 10 consecutive seconds.
Packet Queue Watermark Below Warning Threshold GATEWAY_STABLE INFO Over capacity condition due to high packet queue watermark restored. The packet queue watermark is within the acceptable threshold.
Packet Drop Count Above Critical Threshold GATEWAY_CRITICAL NOTICE Over capacity alert due to high number of packet drops. The number of packet drops is above the critical threshold (2000) for 5 consecutive seconds.
Packet Drop Count Above Warning Threshold GATEWAY_DEGRADED NOTICE Over capacity alert due to high number of packet drops. The number of packet drops is above the warning threshold (500) for 10 consecutive seconds.
Packet Drop Count Below Warning Threshold GATEWAY_STABLE INFO Over capacity condition due to high number of packet drops restored. The number of packet drops is within the acceptable threshold.
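
The Packet Queue Watermark and Packet Drop Count rows differ from the other metrics in that the value must stay above the threshold for a number of consecutive seconds. A minimal Python sketch of that check follows. It assumes one sample per second, which this document does not state, and the function names `sustained_breach` and `packet_queue_state` are hypothetical; only the thresholds (6000 and 2000) and durations (5 and 10 seconds) come from the table above.

```python
from typing import Sequence


def sustained_breach(samples: Sequence[int], threshold: int, seconds: int) -> bool:
    """Return True if `samples` stay above `threshold` for `seconds` samples in a row.

    Assumes one sample per second (an assumption of this sketch).
    """
    run = 0
    for value in samples:
        # Extend the current run while above the threshold, otherwise reset it.
        run = run + 1 if value > threshold else 0
        if run >= seconds:
            return True
    return False


def packet_queue_state(samples: Sequence[int]) -> str:
    """Map per-second packet queue watermark samples to a capacity event name."""
    if sustained_breach(samples, threshold=6000, seconds=5):
        return "GATEWAY_CRITICAL"
    if sustained_breach(samples, threshold=2000, seconds=10):
        return "GATEWAY_DEGRADED"
    return "GATEWAY_STABLE"
```

The same `sustained_breach` helper would apply to Packet Drop Count with thresholds of 2000 (5 seconds) and 500 (10 seconds).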