VMware SASE 5.0.1 | 20 June 2024

  • VMware SASE™ Orchestrator Version R5017-20231111-GA

  • VMware SD-WAN™ Gateway Version R5015-20230922-GA

  • VMware SD-WAN™ Edge Version R5015-20230922-GA

Check for additions and updates to these release notes.

This release is recommended for all customers who require the features and functionality first made available in Release 5.0.0, as well as those customers impacted by the issues listed below which have been resolved since Release 5.0.0.

Compatibility

Release 5.0.1 Orchestrators, Gateways, and Hub Edges support all previous VMware SD-WAN Edge versions greater than or equal to Release 3.2.2. 

Note:

Release 5.0.1 is classified as a maintenance release, and maintenance releases undergo a subset of interoperability testing because the protocol is identical to the major/minor release that they are a part of. Please consult the VMware SASE 5.0.0 Release Notes for a list of other software versions this version of the protocol has been tested against.

The following SD-WAN interoperability combinations were explicitly tested:

Orchestrator    Gateway    Edge (Hub)    Edge (Branch/Spoke)
5.0.1           5.0.1      4.3.1         4.3.1
5.0.1           5.0.1      4.5.0         4.5.0
5.0.1           5.0.1      5.0.0         5.0.0
5.0.1           4.3.1      4.3.1         4.3.1
5.0.1           4.5.0      4.5.0         4.5.0
5.0.1           5.0.0      5.0.0         5.0.0
5.0.0           5.0.1      5.0.1         5.0.1

Note:

The above table is fully valid for customers using SD-WAN services only. Customers requiring access to VMware Cloud Web Security or VMware Secure Access need their Edges upgraded to Release 4.5.0 or later.
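As an illustration only, the tested combinations listed above can be encoded as a small lookup so that a planned deployment is checked against them. The sketch below is Python; the names TESTED_COMBINATIONS, is_tested_combination, and edge_supports_cloud_services are hypothetical and are not part of any VMware tool. A combination that is absent from the table is not necessarily unsupported; it simply was not explicitly tested for this maintenance release.

```python
from typing import Tuple

# Rows copied from the tested-combinations table:
# (Orchestrator, Gateway, Hub Edge, Branch/Spoke Edge)
TESTED_COMBINATIONS = {
    ("5.0.1", "5.0.1", "4.3.1", "4.3.1"),
    ("5.0.1", "5.0.1", "4.5.0", "4.5.0"),
    ("5.0.1", "5.0.1", "5.0.0", "5.0.0"),
    ("5.0.1", "4.3.1", "4.3.1", "4.3.1"),
    ("5.0.1", "4.5.0", "4.5.0", "4.5.0"),
    ("5.0.1", "5.0.0", "5.0.0", "5.0.0"),
    ("5.0.0", "5.0.1", "5.0.1", "5.0.1"),
}


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '4.5.0' into (4, 5, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def is_tested_combination(orchestrator: str, gateway: str, hub: str, spoke: str) -> bool:
    """True if this exact combination appears in the table."""
    return (orchestrator, gateway, hub, spoke) in TESTED_COMBINATIONS


def edge_supports_cloud_services(edge_version: str) -> bool:
    """Cloud Web Security and Secure Access need Edge Release 4.5.0 or later."""
    return parse_version(edge_version) >= (4, 5, 0)


if __name__ == "__main__":
    print(is_tested_combination("5.0.1", "5.0.1", "4.5.0", "4.5.0"))
    print(edge_supports_cloud_services("4.3.1"))
```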

Caution:

VMware SD-WAN Releases 3.2.x, 3.3.x, and 3.4.x have reached the End of Support.

  • Releases 3.2.x and 3.3.x reached End of General Support (EOGS) on December 15, 2021, and End of Technical Guidance (EOTG) on March 15, 2022.

  • Release 3.4.x for the Orchestrator and Gateway reached End of General Support (EOGS) on March 30, 2022, and End of Technical Guidance (EOTG) on September 30, 2022.

  • Release 3.4.x for the Edge reached End of General Support (EOGS) on December 31, 2022, and End of Technical Guidance (EOTG) on March 31, 2023.

  • For more information please consult the Knowledge Base article: Announcement: End of Support Life for VMware SD-WAN Release 3.x (84151)

Important:

VMware SD-WAN Release 4.0.x has reached End of Support; Releases 4.2.x, 4.3.x, and 4.5.x have reached End of Support for Gateways and Orchestrators.

  • Release 4.0.x reached End of General Support (EOGS) on September 30, 2022, and End of Technical Guidance (EOTG) on December 31, 2022.

  • Release 4.2.x Orchestrators and Gateways reached End of General Support (EOGS) on December 30, 2022, and End of Technical Guidance (EOTG) on March 30, 2023.

  • Release 4.2.x Edges reached End of General Support (EOGS) on June 30, 2023, and will reach End of Technical Guidance (EOTG) on September 30, 2025.

  • Release 4.3.x Orchestrators and Gateways reached End of General Support (EOGS) on June 30, 2023, and End of Technical Guidance (EOTG) on September 30, 2023.

  • Release 4.3.x Edges reached End of General Support (EOGS) on June 30, 2023, and will reach End of Technical Guidance (EOTG) on September 30, 2025.

  • Release 4.5.x Orchestrators and Gateways reached End of General Support (EOGS) on September 30, 2023, and End of Technical Guidance (EOTG) on December 31, 2023.

  • For more information please consult the Knowledge Base article: Announcement: End of Support Life for VMware SD-WAN Release 4.x (88319).

Note:

Release 3.x did not properly support AES-256-GCM, which meant that customers using AES-256 were always using their Edges with GCM deactivated (AES-256-CBC). If a customer is using AES-256, they must explicitly deactivate GCM from the Orchestrator prior to upgrading their Edges to a 4.x Release. Once all their Edges are running a 4.x release, the customer may choose between AES-256-GCM and AES-256-CBC.

Upgrade Paths for Orchestrator, Gateway, and Edge

The following lists the paths for customers wishing to upgrade their Orchestrator, Gateway, or Edge from an older release to Release 5.0.1.

Orchestrator

Due to infrastructure changes in the Orchestrator beginning in Release 4.0.0, any Orchestrator using a 3.x Release must first be upgraded to 4.0.0 before being upgraded to 5.0.1. Orchestrators using Release 4.0.0 or later can be upgraded directly to Release 5.0.1. Thus, the upgrade paths for the Orchestrator are as follows:

Orchestrator using Release 3.x → 4.0.0 → 5.0.1.

Orchestrator using Release 4.x → 5.0.1.

Gateway

Gateway upgrades directly from 3.x to 5.0.1 are not supported. In place of upgrading, a 4.x Gateway needs to be freshly deployed with the same VM attributes, and the old instance is then deprecated.

Upgrading a Gateway using Release 4.0.0 or later is fully supported for all Gateway types.

Note: When deploying a new Gateway using 5.0.1, the VMware ESXi instance must run a version from 6.7 Update 3 up to 7.0. Using an earlier ESXi instance will result in the Gateway's Dataplane Service failing when trying to run Release 5.0.0 or later.

Note: Prior to upgrading a Gateway to 5.0.1, the ESXi instance must be upgraded to a version from 6.7 Update 3 up to 7.0. Using an earlier ESXi instance will result in the Gateway's Dataplane Service failing when trying to run Release 5.0.1 or later.

Edge

An Edge can be upgraded directly to Release 5.0.1 from Release 4.3.1 or later.
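The Orchestrator and Edge upgrade rules above can be sketched as a short Python helper. This is a minimal illustration under stated assumptions: the function names are hypothetical, versions are assumed to be plain dotted numbers such as "4.2.1", and the Gateway is left out because a 3.x Gateway is redeployed rather than upgraded.

```python
from typing import List, Tuple

TARGET = "5.0.1"


def parse_version(version: str) -> Tuple[int, int, int]:
    """Turn '4.3' or '4.3.1' into a three-part tuple, padding with zeros."""
    parts = [int(part) for part in version.split(".")]
    parts += [0] * (3 - len(parts))
    return (parts[0], parts[1], parts[2])


def orchestrator_upgrade_path(current: str) -> List[str]:
    """Return the sequence of releases an Orchestrator passes through."""
    cur = parse_version(current)
    if cur >= parse_version(TARGET):
        raise ValueError(f"{current} is already at or beyond {TARGET}")
    if cur >= (4, 0, 0):
        # Release 4.0.0 or later upgrades straight to the target.
        return [current, TARGET]
    if cur[0] == 3:
        # 3.x must stop at 4.0.0 first.
        return [current, "4.0.0", TARGET]
    raise ValueError(f"no upgrade path is documented for {current}")


def edge_can_upgrade_directly(current: str) -> bool:
    """An Edge upgrades directly to the target from Release 4.3.1 or later."""
    return (4, 3, 1) <= parse_version(current) < parse_version(TARGET)


if __name__ == "__main__":
    print(" -> ".join(orchestrator_upgrade_path("3.4.4")))
    print(edge_can_upgrade_directly("4.2.1"))
```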

Important Notes

VMware Security Advisory 2024-0008

Mixing Wi-Fi Capable and Non-Wi-Fi Capable Edges in High Availability Is Not Supported 

Beginning in 2021, VMware SD-WAN introduced Edge models which do not include a Wi-Fi module: the Edge models 510N, 610N, 620N, 640N, and 680N. While these models appear identical to their Wi-Fi capable counterparts except for Wi-Fi, deploying a Wi-Fi capable Edge and a Non-Wi-Fi capable Edge of the same model (for example, an Edge 640 and an Edge 640N) as a High Availability pair is not supported. Customers should ensure that the Edges deployed as a High Availability pair are of the same type: both Wi-Fi capable, or both Non-Wi-Fi capable.
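As a rough pre-deployment check, the rule above can be expressed in a few lines of Python. This is only a sketch: the function names are hypothetical, the model strings are assumed to be written as in this note (for example, "640" and "640N"), and the check covers only the Wi-Fi mismatch, not any other High Availability requirement.

```python
# Non-Wi-Fi Edge models named in this note.
NON_WIFI_MODELS = frozenset({"510N", "610N", "620N", "640N", "680N"})


def is_non_wifi_variant(model: str) -> bool:
    """True if the model is one of the listed Non-Wi-Fi capable Edges."""
    return model.strip().upper() in NON_WIFI_MODELS


def ha_pair_mixes_wifi_variants(model_a: str, model_b: str) -> bool:
    """True if one Edge is a Non-Wi-Fi variant and the other is not."""
    return is_non_wifi_variant(model_a) != is_non_wifi_variant(model_b)


if __name__ == "__main__":
    # An Edge 640 paired with an Edge 640N is the unsupported case.
    print(ha_pair_mixes_wifi_variants("640", "640N"))
```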

Grafana No Longer Available on Orchestrator

Release 5.0.0 and later Orchestrators do not include the Grafana application due to license restrictions. Grafana is primarily used by customers and partners who run the Orchestrator on-premises to monitor the Orchestrator's performance. Going forward for such needs, a customer or partner would need to host their own Grafana application outside the Orchestrator and configure Telegraf on the Orchestrator to point to it.

VMware SASE Builds Include a Fourth Digit

Beginning with Release 5.0.0, the release build includes a fourth digit.

For software releases, VMware SASE follows an a.b.c numbering scheme where:  

  • a = Major (for example, 5.0.0) → A release with multiple large features and potentially significant architectural changes.

  • b = Minor (for example, 5.2.0) → A release with a handful of small features or a couple of large features and no significant architectural changes.

  • c = Maintenance (for example, 5.2.1) → A release with potentially a large number of fixes for field-found and internally found issues, and with no new features except potentially new hardware platform support.

With Release 5.0.0, a fourth digit is added to Edge, Gateway, and Orchestrator builds, so the numbering is a.b.c.d where:

  • d = Rollup Build (for example, 5.2.1.1) → A rollup is a cumulative aggregate of known customer found defect fixes or critical internal found defects.

Rollup Builds for 4.x and earlier are distinguished by the image name's GA date, which is not an optimal way of communicating the build version to a customer. Adding a fourth digit for 5.0.0 builds and later allows customers to more clearly see what software version is being used for a particular component.

This build numbering convention applies only to Release 5.0.0 and later. Releases 4.x and earlier continue to use three digits, with rollup builds identified in the existing manner by date.
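To make the a.b.c.d scheme concrete, the Python sketch below splits an image name such as R5017-20231111-GA into its four digits and its date stamp. It rests on an assumption drawn from the build names in these notes (for example, R5011-20221007-GA is referred to as 5.0.1.1): each of the four digits after "R" is taken to be one component. The regular expression and the names Build and parse_build are hypothetical, and names that do not follow this pattern are rejected rather than guessed at.

```python
import re
from datetime import date, datetime
from typing import NamedTuple, Optional

# Assumed pattern: R<a><b><c><d>-<YYYYMMDD>-GA, with an optional numeric suffix.
_BUILD_RE = re.compile(r"^R(\d)(\d)(\d)(\d)-(\d{8})-GA(?:-(\d+))?$")


class Build(NamedTuple):
    major: int
    minor: int
    maintenance: int
    rollup: int
    build_date: date
    suffix: Optional[str]

    @property
    def version(self) -> str:
        """The a.b.c.d form of the build."""
        return f"{self.major}.{self.minor}.{self.maintenance}.{self.rollup}"


def parse_build(name: str) -> Build:
    """Parse an image name into version digits and the date in the name."""
    match = _BUILD_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognized build name: {name!r}")
    a, b, c, d = (int(digit) for digit in match.group(1, 2, 3, 4))
    stamp = datetime.strptime(match.group(5), "%Y%m%d").date()
    return Build(a, b, c, d, stamp, match.group(6))


if __name__ == "__main__":
    build = parse_build("R5017-20231111-GA")
    print(build.version, build.build_date.isoformat())
```

Note that the date in the image name is not necessarily the release date; these notes list build R5015-20230922-GA as released on 09-27-2023.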

Accessing Cloud Web Security and Secure Access

A customer wishing to access VMware Cloud Web Security or VMware Secure Access must upgrade their Edges to Release 4.5.0 or later.  These services are inaccessible on Edges using a release earlier than 4.5.0.

BGPv4 Filter Configuration Delimiter Change for AS-PATH Prepending

Through Release 3.x, the VMware SD-WAN BGPv4 filter configuration for AS-PATH prepending supported both comma and space based delimiters. However, beginning in Release 4.0.0, VMware SD-WAN only supports a space based delimiter in an AS-PATH prepending configuration. Customers upgrading from 3.x to 4.x or 5.x need to edit their AS-PATH prepending configurations to replace commas with spaces prior to upgrade to avoid incorrect BGP best route selection.
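The edit itself is a simple string transformation, sketched below in Python for customers who want to review many prepend values before upgrading. The function name is hypothetical, and the sketch assumes each value is a list of plain numeric AS numbers; how the values are exported from or entered into the Orchestrator is not covered here.

```python
import re


def normalize_as_path_prepend(value: str) -> str:
    """Rewrite a comma or space delimited AS list using single spaces."""
    tokens = [token for token in re.split(r"[,\s]+", value.strip()) if token]
    for token in tokens:
        if not token.isdigit():
            # Refuse anything that is not a plain AS number rather than guess.
            raise ValueError(f"unexpected token in AS-PATH prepend: {token!r}")
    return " ".join(tokens)


if __name__ == "__main__":
    print(normalize_as_path_prepend("65001,65001,65001"))
```

A value that is already space delimited passes through unchanged, so the helper can be run over every configured value without special cases.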

Extended Upgrade Time for Edge 3x00 Models

Upgrades to this version may take longer than normal (3-5 minutes) on Edge 3x00 models (i.e., 3400, 3800 and 3810). This is due to a firmware upgrade which resolves issue 53676. If an Edge 3400 or 3800 had previously upgraded its firmware when on Release 3.4.5/3.4.6, 4.0.2, 4.2.1, 4.3.0, 4.5.0, or 5.0.0 then the Edge would upgrade as expected. For more information, please consult Fixed Issue 53676 in the respective release notes.

Limitation with BGP over IPsec on Edge and Gateway, and Azure Virtual WAN Automation

The BGP over IPsec on Edge and Gateway feature is not compatible with Azure Virtual WAN Automation from Edge or Gateway. Only static routes are supported when automating connectivity from an Edge or Gateway to an Azure vWAN.

Limitation When Deactivating Autonegotiation on VMware SD-WAN Edge Models 520, 540, 620, 640, 680, 3400, 3800, and 3810

When a user deactivates autonegotiation to hardcode speed and duplex on ports GE1 - GE4 on a VMware SD-WAN Edge model 620, 640 or 680; on ports GE3 or GE4 on an Edge 3400, 3800, or 3810; or on an Edge 520/540 when an SFP with a copper interface is used on ports SFP1 or SFP2, the user may find that even after a reboot the link does not come up.

This is caused by each of the listed Edge models using the Intel Ethernet Controller i350, which has a limitation that when autonegotiation is not used on both sides of the link, it is not able to dynamically detect the appropriate wires to transmit and receive on (auto-MDIX). If both sides of the connection are transmitting and receiving on the same wires, the link will not be detected. If the peer side also does not support auto-MDIX without autonegotiation, and the link does not come up with a straight cable, then a crossover Ethernet cable will be needed to bring the link up.

For more information please see the KB article Limitation When Deactivating Autonegotiation on VMware SD-WAN Edge Models 520, 540, 620, 640, 680, 3400, 3800, and 3810 (87208).

Available Languages

The VMware SASE Orchestrator using version 5.0.1 is localized into the following languages: Czech, English, European Portuguese, French, German, Greek, Italian, Spanish, Japanese, Korean, Simplified Chinese, and Traditional Chinese.

Document Revision History

June 20th, 2024. Fortieth Edition.

April 24th, 2024. Thirty-Ninth Edition.

  • Revised the wording for Fixed Issue #93237 from "1000 Object Groups" to "Large number of Object Groups". 1000 is not the exact threshold that triggers the issue, which could be encountered with a quantity of Object Groups in the high hundreds.

April 12th, 2024. Thirty-Eighth Edition.

  • Corrected the wording for Open Issue #118704 to change the workaround from a CLI action to an action on the Orchestrator UI to restart the Edge service to remediate the issue.

  • Added Open Issue #142366 to the Edge/Gateway Known Issues section.

April 4th, 2024. Thirty-Seventh Edition.

  • Added an Important Note regarding CVE-2024-22247, which details a missing authentication and protection mechanism vulnerability that impacts an SD-WAN Edge. VMware's response to this vulnerability is documented in VMSA-2024-0008. More information on mitigating this vulnerability is found in the KB article: VMware Response to CVE-2024-22247 (VMSA-2024-0008) (97391).

  • Corrected wording in the Upgrade Paths for Orchestrator, Gateway, and Edge section regarding upgrading a 3.x Gateway to 5.x.

    • Where it read: Gateway upgrades from 3.x to 5.0.1 are not supported. In place of upgrading, a 3.x Gateway needs to be freshly deployed with the same VM attributes, and the old instance is then deprecated.

    • It now reads: Gateway upgrades directly from 3.x to 5.0.1 are not supported. In place of upgrading, a 4.x Gateway needs to be freshly deployed with the same VM attributes, and the old instance is then deprecated.

March 26th, 2024. Thirty-Sixth Edition.

March 4th, 2024. Thirty-Fifth Edition.

  • Added Open Issue #97055 to the Orchestrator Known Issues section. This issue is fixed in Orchestrator version 5.1.0 and later.

February 26th, 2024. Thirty-Fourth Edition.

  • Moved Issue #115136 from the Edge/Gateway Known Issues section to the Edge/Gateway Resolved Issues section for Edge/Gateway rollup build R5015-20230922-GA. This issue should have been documented as fixed in the Thirtieth Edition of these Release Notes.

December 18th, 2023. Thirty-Third Edition.

December 6th, 2023. Thirty-Second Edition.

  • Removed Known Issue #83166 which read: "When a VMware SD-WAN Gateway is freshly deployed with a AWS c5.4xlarge instance type from the AWS Portal with IPv6 option selected, neither IPv6 or the default routes are configured and the AWS Gateway IPv6 management tunnels are not forming." This issue will not be fixed and instead user documentation will add a requirement to only use the static mode of IPv4/IPv6 address assignment on interfaces for a Gateway because VMware SD-WAN does not support DHCP on the Gateway side.

November 15th, 2023. Thirty-First Edition.

  • Added a new Orchestrator rollup build R5017-20231111-GA to the Orchestrator Resolved sections. This is the seventh Orchestrator rollup build and is the new Orchestrator GA build for Release 5.0.1.

  • Orchestrator build R5017-20231111-GA includes the fixes for issues #102121, #116531, and #131789, each of which is documented in this section.

September 27th, 2023. Thirtieth Edition.

  • Added a new Edge/Gateway rollup build R5015-20230922-GA to the Edge/Gateway Resolved section. This is the fifth Edge/Gateway rollup build and is the new Edge and Gateway GA build for Release 5.0.1.

    Edge and Gateway build R5015-20230922-GA includes the fixes for issues #93237, #95047, #95850, #97321, #98223, #101431, #103558, #103700, #106865, #109906, #110320, #110970, #111924, #112115, #115904, #116368, #117037, #118333, #118591, #119491, #121998, #123593, #124181, and #126336, each of which is documented in this section.

  • Added Open Issue #115136 and #125509 to the Edge and Gateway Known Issues section.

September 14th, 2023. Twenty-Ninth Edition.

  • Removed Open Issue #92676 because VMware Engineering determined that this issue is working as expected and a workaround is documented as a note in the Administration Guide under Configure BGP over IPsec From Gateways.

September 4th, 2023. Twenty-Eighth Edition.

  • Document Revision History reorganized to read from newest entries to oldest for an improved user experience.

  • Added Open Issue #117037 to the Edge and Gateway Known Issues section.

August 18th, 2023. Twenty-Seventh Edition.

August 3rd, 2023. Twenty-Sixth Edition.

  • Added a new Orchestrator rollup build R5016-20230801-GA to the Orchestrator Resolved sections. This is the sixth Orchestrator rollup build and is the new Orchestrator GA build for Release 5.0.1.

  • Orchestrator build R5016-20230801-GA includes the fixes for issues #64145, #116531, and #122271, each of which is documented in this section.

  • Added Open Issues #106865 and #121998 to the Edge and Gateway Known Issues section.

July 26th, 2023. Twenty-Fifth Edition.

  • Added Fixed Issue #103708 to the Edge/Gateway Resolved Issues section for the fourth Edge rollup build R5014-20230713-GA. This issue was omitted from the previous edition of the 5.0.1 Release Notes.

July 14th, 2023. Twenty-Fourth Edition.

June 29th, 2023. Twenty-Third Edition.

  • Added a new Orchestrator rollup build R5015-20230628-GA to the Orchestrator Resolved sections. This is the fifth Orchestrator rollup build and is the new Orchestrator GA build for Release 5.0.1.

  • Orchestrator build R5015-20230628-GA includes the fixes for issues #109710, #112605, #114291, and #114475, each of which is documented in this section.

  • Added Issue #107994 to the Edge and Gateway Known Issues section.

June 14th, 2023. Twenty-Second Edition.

April 25th, 2023. Twenty-First Edition.

  • Added Fixed Issue #93052 to the Edge/Gateway Resolved Issues section for original GA build R5010-20220729-GA. This issue was omitted in error from the first edition of the 5.0.1 Release Notes.

  • Updated the Compatibility section to mark all 3.x releases as having reached their End of Service Life (EOSL). Also updated the 4.x section to mark 4.2.x Orchestrators and Gateways as End of Service Life (EOSL).

April 12th, 2023. Twentieth Edition.

  • Added a new Orchestrator rollup build R5014-20230408-GA to the Orchestrator Resolved sections. This is the fourth Orchestrator rollup build and is the new Orchestrator GA build for Release 5.0.1.

  • Orchestrator build R5014-20230408-GA includes the fixes for issues #107766, #108363, #110946, #111946, #111957, and #112201, each of which is documented in this section.

  • Added Open Issues #94980 and #110564 to Edge/Gateway Known Issues.

  • Revised Fixed Ticket #89217 to reflect a revised Edge version (R5012-20230327-GA-107522) needed to resolve the issue. Edge version R5012-20230327-GA-107522 adds the ability to upgrade sites deployed in a High Availability topology automatically through the Orchestrator. The previous Edge version associated with resolving this issue, R5012-20230123-GA-103475, did not include that capability and is marked as Degraded on all hosted Orchestrators.

March 26th, 2023. Nineteenth Edition.

  • Added a new Edge and Gateway rollup build R5013-20230322-GA to the Edge/Gateway Resolved section. This is the third Edge/Gateway rollup build and is the new Edge and Gateway GA build for Release 5.0.1.

  • Edge and Gateway build  R5013-20230322-GA includes the fixes for issues #78050, #80149, #84593, #86994, #95603, #96880, #97404, #97559, #98782, #99676, #103527, #103529, #103983, #104141, #104183, #104487, #105360, #105744, #106627, #106700, #107302, #107309, #107356, and #109131, each of which is documented in this section.

March 15th, 2023. Eighteenth Edition.

  • Added a new Orchestrator rollup build R5013-20230310-GA to the Orchestrator Resolved sections. This is the third Orchestrator rollup build and is the new Orchestrator GA build for Release 5.0.1.

  • Orchestrator build R5013-20230310-GA includes the fixes for issues #105610, #106242, and #109595, each of which is documented in this section.

February 17th, 2023. Seventeenth Edition.

  • Removed Issue #39659 from the Edge/Gateway Known Issues section as this is a duplicate of another ticket, #39501 which was resolved in Release 4.3.0.

January 30th, 2023. Sixteenth Edition.

  • Revised Fixed Ticket #89217 to reflect a revised Edge version (R5012-20230123-GA-103475) and Platform Firmware version (R131-20221216-GA) needed to resolve the issue. The ticket also adds a link to the KB Article that covers #89217 and which includes step-by-step instructions for upgrading a 6x0 Edge.

  • In the Compatibility section, revised the Important Note regarding End of Support for 4.2.x and added Release 4.3.x to reflect newly revised dates for the SD-WAN Edge software.

December 16th, 2022. Fifteenth Edition.

  • Added a new Gateway rollup build R5012-20221214-GA to the Edge/Gateway Resolved sections. This is the second Gateway rollup build and is the new Gateway GA build for Release 5.0.1.

  • Gateway build R5012-20221214-GA includes the fixes for issues #96863, #97272, and #99650, each of which is documented in this section.

  • Important:

    Due to a build issue with the original 5.0.1.1 Gateway build (R5011-20221007-GA), Gateways cannot be upgraded to any other 5.0.1.1 Gateway build and must be upgraded from 5.0.1.1 directly to 5.0.1.2.

  • Added a new Orchestrator rollup build R5012-20221214-GA to the Orchestrator Resolved sections. This is the second Orchestrator rollup build and is the new Orchestrator GA build for Release 5.0.1.

  • Orchestrator build R5012-20221214-GA includes the fixes for issues #96538, #100133, #101835, and #102806, each of which is documented in this section.

November 30th, 2022. Fourteenth Edition.

  • Replaced Orchestrator rollup build R5011-20221117-GA with revised build R5011-20221129-GA which corrects an upgrade issue seen by the VMware Operations team when upgrading an Orchestrator to build R5011-20221117-GA. The upgrade issue was caused by a version mismatch in the upgrade package Manifest, and this new build adds no new functionality.

November 22nd, 2022. Thirteenth Edition.

  • Added a new Orchestrator rollup build R5011-20221117-GA to the Orchestrator Resolved sections. This is the first Orchestrator rollup build and is the new Orchestrator GA build for Release 5.0.1.

  • Orchestrator build R5011-20221117-GA includes the fixes for issues #80735, #88957, #97713, #98086, #98357, #98518, #98654, #99109, #99247, #99250, #100656, and #101449, each of which is documented in this section.

  • Added Fixed Issue #89873 to the Edge/Gateway Resolved Issues section. This issue was omitted in error from the first edition of the 5.0.1 Release Notes.

  • Added Open Issue #97559 to Edge/Gateway Known Issues.

November 14th, 2022. Twelfth Edition.

  • Added a new Edge rollup build R5012-20221107-GA to the Edge/Gateway Resolved section. This is the second Edge rollup build and the new Edge GA build for Release 5.0.1, and it is recommended for all customers running Edge Release 5.0.x.x.

  • Edge build R5012-20221107-GA includes the fixes for issues #96411, #96441, #96888, #97483, #98514, #100377, and #101049, each of which is documented in this section.

  • Important:

    Those using a previous Edge 5.0.x.x build should upgrade their Edges to 5.0.1.2.

October 31st, 2022. Eleventh Edition.

  • Added Fixed Issue #72491 to the Edge/Gateway Resolved section for the original 5.0.1 GA build R5010-20220729-GA. This issue was omitted in error from the first edition of the 5.0.1 Release Notes.

October 18th, 2022. Tenth Edition.

  • Added Fixed Issue #90876 to the Edge/Gateway Resolved section for the original 5.0.1 GA build R5010-20220729-GA. This issue was omitted in error from the first edition of the 5.0.1 Release Notes.

October 12th, 2022. Ninth Edition.

  • Added a new Edge/Gateway rollup build R5011-20221007-GA to the Edge/Gateway Resolved section. This is the first Edge/Gateway rollup build and is the new Edge/Gateway GA build for Release 5.0.1.

  • Edge/Gateway build R5011-20221007-GA includes the fixes for issues #89235, #94430, #95503, #96055, #96231, #98157, and #99188, each of which is documented in this section.

September 28th, 2022. Eighth Edition.

  • Added #98136 to the Edge/Gateway Known Issues section.

September 23rd, 2022. Seventh Edition.

  • Added Fixed Issue #96108 to the Orchestrator build R5010-20220912-GA in the Orchestrator Resolved Issues section. This issue was omitted in error from the Sixth Edition of these Release Notes.

  • Moved Fixed Issue #90749 from the Orchestrator build R5010-20220912-GA down to the original Orchestrator build R5010-20220803-GA in Orchestrator Resolved Issues section, as this is where the issue fix was actually added for Release 5.0.1.

  • Added Fixed Issue #87982 to the Edge/Gateway Resolved Issues section. This ticket was omitted in error from the first edition of the 5.0.1 Release Notes.

  • Added Open Issues #86098, #94204, #95565, #96441, and #96888 to the Edge/Gateway Known Issues section.

September 16th, 2022. Sixth Edition.

  • Added an updated Orchestrator build R5010-20220912-GA to the Orchestrator Resolved section. Build R5010-20220912-GA replaces the original Orchestrator build R5010-20220817-GA, and is the new Orchestrator GA build for Release 5.0.1.

  • This updated Orchestrator build includes the fixes for Issues #90749, #95847, and #96095, which are added to the Orchestrator Resolved section.

  • Added Fixed Issue #91875 to the Edge/Gateway Resolved Issues section. This ticket was omitted in error from the first edition of the 5.0.1 Release Notes.

  • Added Issues #96055 and #96231 to the Edge/Gateway Known Issues section.

September 9th, 2022. Fifth Edition.

  • Added Fixed Issues #87552, #90151, and #93383 to the Edge/Gateway Resolved Issues section. These tickets were omitted in error from the first edition of the 5.0.1 Release Notes.

  • Removed Open Issue #49712 from Edge/Gateway Known Issues as Engineering concluded it was caused by a configuration error versus a defect in the code.

  • Removed Open Issue #90065 from Edge/Gateway Known Issues as Engineering has not been able to replicate the issue and DR synchronization works as expected with the 5.0.1 Orchestrator Build.

August 18th, 2022. Fourth Edition.

  • Added an updated Orchestrator build R5010-20220817-GA to the Orchestrator Resolved section. Build R5010-20220817-GA replaces the original Orchestrator build R5010-20220803-GA, and is the new Orchestrator GA build for Release 5.0.1.

  • This updated Orchestrator build includes the fix for Issue #95613, which is added to the Orchestrator Resolved section.

August 15th, 2022. Third Edition.

  • Added Fixed Issue #89217 to the Edge/Gateway Resolved Issues section. This ticket was omitted in error from the first edition of the 5.0.1 Release Notes.

August 11th, 2022. Second Edition.

  • Added Fixed Issues #89346, #90067, #90128, #90540, #91054, #91720, and #92082 to the Orchestrator Resolved Issues section. These tickets were omitted in error from the first edition of the 5.0.1 Release Notes.

August 5th, 2022. First Edition.

Edge and Gateway Resolved Issues

Resolved in Edge/Gateway Version R5015-20230922-GA

Edge/Gateway build R5015-20230922-GA was released on 09-27-2023 and is the 5th Edge/Gateway rollup for Release 5.0.1.

This Edge/Gateway rollup build addresses the below critical issues since the 4th Edge/Gateway rollup build, R5014-20230713-GA.

  • Fixed Issue 93237: A VMware SD-WAN Edge configured with a large number of Object Groups entries may experience a Dataplane Service failure and restart to recover, which causes a 10-15 second customer traffic disruption.

    When a large number of Object Group entries (several hundred or more) are configured in the Configure > Business Policy page of the Orchestrator UI, the configuration that is pushed to the Edge can trigger an Edge memory corruption which causes the Edge service to fail, and then restart.

  • Fixed Issue 95047: When a security port scanning utility scans a VMware SD-WAN Edge where Edge Network Intelligence (Analytics) is not activated, the scan will report that Syslog Port 514 is closed, which means it could be accessible.

    Edge Network Intelligence listens on Port 514 (Syslog). If Analytics are not activated, Port 514 remains accessible, but it will not respond to requests. Therefore, a port scanner reports the port as "closed" (in other words, the port is accessible but there is no application listening on it).

  • Fixed Issue 95850: On a customer enterprise where OSPF is used, when a user generates a diagnostic bundle for a VMware SD-WAN Edge, the OSPF routes may flap during the bundle generation resulting in disrupted customer traffic.

    As part of the diagnostic bundle generation the commands vcdbgdump -r remote-routes and vcdbgdump -r remote_routes are run. As these commands take more than 40 seconds in a customer environment, the OSPF hellos queued to the event dispatcher thread are not processed. As a result, the OSPF neighborship flaps, causing a network outage.

    On an Edge without a fix for this issue a customer should either not generate a diagnostic bundle except in a maintenance window or reach out to VMware SD-WAN Support to generate the bundle as they have internal tools to prevent the issue from occurring on a temporary basis.

  • Fixed Issue 97321: From the time a user activates Edge Network Intelligence Analytics on a VMware SD-WAN Edge, the Edge can potentially trigger an Edge Service restart, each instance of which causes 10-15 seconds of customer traffic disruption.

    When Analytics is enabled on the Edge, the Edge can experience an out of memory condition followed by a "double free" memory state. The Edge restarts its service to restore memory.

    The symptoms for this issue can happen multiple times while Analytics are activated.

  • Fixed Issue 98223: When Edge Network Intelligence Analytics is activated on a VMware SD-WAN Edge, the Edge may lose contact with the VMware SASE Orchestrator and cause the Orchestrator to mark the Edge as down on the Orchestrator UI.

    When Analytics is activated, the Edge communication with the Analytics backend sometimes gets mixed with the Edge communication with the Orchestrator. This results in a loss of communication with the Orchestrator which causes the Orchestrator to declare that the Edge is down when it is not.

  • Fixed Issue 101431: For a customer who subscribes to Edge Network Intelligence, when the user activates Analytics on a VMware SD-WAN Edge, the dashboard may display the message "No Management IP Assigned" for the Edge.

    In rare cases, the Edge does not send the Management IP address to the Edge Network Intelligence backend, and this results in the above message.

  • Fixed Issue 103558: On a customer enterprise using Edge Network Intelligence, when Analytics is activated for a VMware SD-WAN Edge the ENI dashboard may display "No Management IP Assigned" for that Edge.

    When Analytics is enabled, in rare cases the Edge does not send the Management IP address to the Edge Network Intelligence backend.

  • Fixed Issue 103700: An application may get misclassified by SD-WAN and matched to the wrong Business Policy or Firewall Rule despite that application having a customized entry in the customer's application map.

    Applications in an application map with a mustNotPerformDpi tag can still be classified via SD-WAN's Deep Packet Inspection (DPI) engine. In a large scale deployment, a collision can occur while looking up the application classification via fast database cache. As a result, although an application is configured with mustNotPerformDpi, it can still be classified via DPI with a potentially unexpected classification.

  • Fixed Issue 106865: For a customer who uses the Edge Network Intelligence service and has activated Analytics on their enterprise, they may observe that non-IP traffic (for example, RADIUS authentication) is dropped.

    When Analytics is activated, if non-IP frames are received on a SD-WAN Edge interface, the Edge may mistakenly process these as IPv4 fragments and cause a fragment record leak. Over time, this can stop all fragment processing and all such packets are dropped. For a customer using RADIUS authentication, the result can be authentication becoming broken for all Edges at the same time.

    On an enterprise not using a fixed build, the only workaround is to reboot all Edges.

  • Fixed Issue 109906: A VMware SD-WAN Gateway may experience a Dataplane Service failure, generate a core, and restart to recover.

    This issue can be encountered when a corrupt out-of-band message is received. The message causes an array index overflow, which triggers an exception and the failure of the Gateway's service.

  • Fixed Issue 110320: For a customer subscribing to Edge Network Intelligence where Analytics is activated on their enterprise, when the name of a VMware SD-WAN Edge is changed on the VMware SASE Orchestrator, this change may not be reflected in the Edge Network Intelligence UI.

    The Edge Network Intelligence sub module in the Edge does not react to the Edge name change and the result is the name change is not reflected on the Edge Network Intelligence UI.

  • Fixed Issue 110970: For a customer who subscribes to Edge Network Intelligence and has one or more sites deployed with a High Availability topology, when Analytics is activated for an HA site, Analytics may not work.

    Due to multiple race conditions, the Edge Network Intelligence thread may assume that the Active Edge is in the Standby role, and this stops Analytics functionality.

  • Fixed Issue 111924: A customer may observe that across all their sites Multi Path traffic (in other words, traffic that traverses the VMware SD-WAN Gateway) is being dropped even though their VMware SD-WAN Edge's tunnels to the Gateway are up and stable.

    There is no limit on the maximum number of times a Gateway can re-transmit a VCMP packet (SD-WAN's management protocol), and such re-transmits can overwhelm low bandwidth links. These re-transmits will also cause packet build-up on the scheduler when the Edge has a low bandwidth link since the re-transmits cannot be drained fast enough. Eventually the scheduler queues become full and lead to the scheduler dropping packets from all Edges. Direct traffic that does not use the Gateway would not be affected by this issue.

    When this issue is encountered on a Gateway without a fix for this issue, the only remediation is for an Operator user to identify the Edges which are causing the packet buildup on the scheduler using the debug.py --qos_dump_net command and block them in the affected Gateway.
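
    The queue build-up described above can be illustrated with a small back-of-the-envelope model. This is a sketch only: the function name, the per-tick units, and the example numbers are assumptions chosen for illustration and do not reflect the Gateway scheduler's actual implementation or limits.

```python
from typing import Optional


def ticks_until_queue_full(arrivals_per_tick: int,
                           drained_per_tick: int,
                           capacity: int) -> Optional[int]:
    """Return how many ticks pass before a queue of `capacity` packets fills.

    Hypothetical model: each tick, `arrivals_per_tick` re-transmitted packets
    are enqueued and the low bandwidth link drains `drained_per_tick` packets.
    Returns None when the link keeps up and the queue never fills.
    """
    growth = arrivals_per_tick - drained_per_tick
    if growth <= 0:
        return None
    # Ceiling division: the tick on which the backlog first reaches capacity.
    return -(-capacity // growth)


# A link that drains 10 packets per tick cannot keep up with 30 re-transmits.
print(ticks_until_queue_full(30, 10, 1000))
# A link that drains at least as fast as packets arrive never fills the queue.
print(ticks_until_queue_full(10, 10, 1000))
```

    The point of the model is that without an upper bound on re-transmits, any sustained arrival rate above the drain rate fills the queue eventually, regardless of its capacity.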

  • Fixed Issue 112115: A VMware SD-WAN Edge under a high CPU load may experience a Dataplane Service failure and restart to recover.

    Under high CPU conditions, multiple service failures triggered by a mutex monitor can occur due to a lower priority thread acquiring the debug ring lock. The resolution to this issue is an enhancement to the Dataplane that makes that particular thread both lock-free and wait-free.

  • Fixed Issue 115136: A customer may observe a gradual memory usage increase on a VMware SD-WAN Edge in a customer enterprise that uses BGP for routing.

    The Edge's BGP daemon is causing a gradual memory leak on the Edge over several days and can do this even when BGP is not configured for that Edge. If the memory leak continues for a sufficient period to bring the Edge's memory usage beyond the critical threshold of 60% of available RAM for more than 90 seconds, the Edge will defensively restart its service to clear the leak which can result in customer traffic disruption for 10-15 seconds.

    The only remediation without an Edge fix is to restart the BGP process by terminating it, or preemptively perform an HA failover/Edge service restart in a suitable service window.
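
    The defensive restart condition described above (memory usage beyond 60% of available RAM for more than 90 seconds) can be sketched as follows. This is a minimal illustration, not the Edge's monitoring code: the function name, the sample format, and the sampling interval are assumptions.

```python
from typing import Iterable, Optional, Tuple


def defensive_restart_time(samples: Iterable[Tuple[float, float]],
                           threshold_pct: float = 60.0,
                           hold_seconds: float = 90.0) -> Optional[float]:
    """Return the timestamp at which a restart would be triggered, or None.

    `samples` is a hypothetical series of (timestamp_seconds, memory_usage_pct)
    readings in time order. A restart is triggered once usage has stayed above
    `threshold_pct` for more than `hold_seconds` without dipping back.
    """
    above_since: Optional[float] = None
    for timestamp, usage_pct in samples:
        if usage_pct > threshold_pct:
            if above_since is None:
                above_since = timestamp  # start of the current excursion
            elif timestamp - above_since > hold_seconds:
                return timestamp
        else:
            above_since = None  # usage recovered, reset the timer
    return None


# Usage climbs past 60% at t=30 and stays there: restart fires at t=150.
leak = [(0, 50), (30, 61), (60, 62), (90, 63), (120, 64), (150, 65)]
print(defensive_restart_time(leak))
```

    A brief dip below the threshold resets the timer in this sketch, which is why a slow leak over several days only triggers the restart once usage remains continuously above the threshold.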

  • Fixed Issue 115904: When a user triggers a diagnostic bundle for a VMware SD-WAN Edge using the VMware SASE Orchestrator, the Edge may experience a Dataplane Service failure, generate a core, and restart to recover.

    A user can generate an Edge diagnostic bundle on the SD-WAN > Diagnostics > Diagnostic Bundle page. When this action is taken, a race condition between dns_name_cache (addition and/or delete) and the DNS name cache can occur which causes the Edge service to try and access an in use or deleted element, which triggers a service failure with a SIGSEGV or SIGBUS reason.

  • Fixed Issue 116368: The routing logs on a VMware SD-WAN Gateway may reach capacity and not accumulate any additional entries.

    This issue is caused by the Gateway's routing software missing the log rotation configuration, whose purpose is to rotate the routing logs prior to reaching capacity so that new log entries can be added. Without this configuration, the routing logs do not rotate and result in Operators and Partners potentially missing critical log entries for a Gateway.

  • Fixed Issue 117037: For a customer using a Hub/Spoke topology where multiple WAN links are used to send and receive traffic from the Spoke Edge to the Hub Edge, customers may observe lower than expected performance for traffic that is steered by Business Policies because the bandwidth of the WAN links is not being aggregated.

    SD-WAN uses a counter for accounting the number of packets buffered in a resequencing queue. This counter is managed per peer and used to make sure only 4K packets are buffered per peer. Under some conditions, this counter can become negative. Prior to Release 4.2.x, when this counter became negative, the respective counter was immediately reset back to 0 after flushing the packets in the resequencing queue. However, starting in Release 4.3.x, this counter is updated automatically to ensure that the counter stays within expected bounds.

    This change in behavior can cause cases where the counter accounting is incorrect and the resequencing queue count stays at a very high value, to which SD-WAN reacts by flushing every single packet. This action not only prevents bandwidth aggregation but can also reduce the effectiveness of flows that would otherwise be on a single link.

    On Edges without a fix for this issue, the workaround is to configure business policies that steer matching traffic to a single mandatory link.

  • Fixed Issue 118333: For a customer site deployed with a High Availability topology where the HA Edge pair is either a model 520, 540, or 610, the customer may observe multiple HA failovers due to the site experiencing an active-active (split brain) condition.

    VMware SD-WAN Edge models 520, 540, and 610 use a switch made by Marvell which, when internet backhaul is configured, can trigger a situation where the Standby Edge also becomes Active while the Active Edge is not demoted. Active-Active states are resolved by rebooting the Standby Edge and this will be recorded in the Edge Events.

  • Fixed Issue 118591: On a customer site deployed with an Enhanced High Availability topology, a customer may observe that traffic using a WAN link on the Standby Edge is disrupted by multiple flaps of that Edge's WAN interface.

    A user would observe multiple Link Up/Down events in Monitor > Events. The event is triggered when either a high number of flows are sent (for that Edge model) or a high number of routes are installed.

  • Fixed Issue 119491: For a VMware SD-WAN Edge where Edge Network Intelligence Analytics is activated, the customer may observe a gradual increase in Memory Usage on the Edge.

    The specific scenario is an Edge where Analytics is activated and which is also receiving RADIUS traffic. In that instance, an Edge memory leak can happen. If the memory leak continues for a sufficient period to bring the Edge's memory usage beyond the critical threshold of 70% of available RAM, the Edge will defensively restart its service to clear the leak which can result in customer traffic disruption for 10-15 seconds.

  • Fixed Issue 121998: For a customer using the Stateful Firewall in a Hub/Spoke topology, traffic that matches a firewall rule configured for Spoke-to-Hub traffic where the rule includes a source VLAN may be dropped.

    When there is an application classification, business policy table, or firewall policy table version change, SD-WAN performs a firewall lookup for flows on its next packet. Due to a timing issue, that packet could be one from the management traffic (VCMP) side. As a result, during a firewall policy lookup key creation, SD-WAN swaps the Spoke Edge VLAN with the Hub Edge VLAN and this leads to not matching the rule and dropping that traffic.

    For an Edge without a fix for this issue, a customer can change the Source from an Edge VLAN to 'Any'.

  • Fixed Issue 123593: For a customer site using a High Availability topology where the customer is also using Edge Network Intelligence with Analytics turned on, in rare conditions, the VMware SD-WAN HA Edge may not be able to retrieve the Analytics configurations from the Edge Network Intelligence backend.

    It is possible for both the Active and Standby Edges to acquire the token from the Edge Network Intelligence backend. If the Standby Edge obtains the token after the Active Edge, the Active Edge's token will be stale, resulting in this scenario.

  • Fixed Issue 124181: For a customer site using a High Availability topology where the customer is also using Edge Network Intelligence with Analytics turned on, the user may observe an Exception Error in the logs if the HA Edge cannot reach the ENI endpoint.

    The error reads "NameError: global name 'pyutil' is not defined" when the HA Edge cannot reach the Edge Network Intelligence (ENI) backend. This is an exception that occurs when handling another exception. It will only happen if the ENI endpoint is not reachable, which means that ENI is not working. The exception does not end the Edge's management process and does not have a critical impact on the customer site. However, because it appears in the logs it needs to be addressed.

  • Fixed Issue 126336: When deploying a Partner Gateway, BGP neighborship may not come up between a provider edge (PE) and the Partner Gateway.

    When this issue occurs, the BGP neighborship does not establish between the PE and the Gateway. The PE remains stuck in a connect state and does not send the ACK for a TCP handshake.

Resolved in Edge Version R5014-20230713-GA

Edge/Gateway build R5014-20230713-GA was released on 07-14-2023 and is the 4th Edge rollup for Release 5.0.1.

This Edge/Gateway rollup build addresses the below critical issue since the 3rd Edge/Gateway rollup build, R5013-20230322-GA.

In addition, Edge build R5014-20230713-GA (and all succeeding 5.0.1.x Edge versions) is the recommended build for customers who need to upgrade the Platform Firmware on an Edge model 6x0 as outlined in Fixed Issue 89217.

  • Fixed Issue 103708: When new rules are added in a BGP filter configuration, there may be unexpected BGP routes received and sent by the VMware SD-WAN Edge.

    When new rules are added to the BGP filters from the Orchestrator, the prefix lists are added in the Edge's routing configuration without removing the old entries. This behavior results in stale route prefix lists and unexpected filtering behavior.

  • Fixed Issue 105160: Upgrading the software of a VMware SD-WAN Edge may fail and the Edge does not retry the upgrade.

    When the issue is encountered, there is an exception in the Edge upgrade process which causes the Edge to update its configuration version for an Edge software upgrade without the Edge actually upgrading. As a result, the Edge will think it upgraded to the target version and make no attempt to retry the actually failed upgrade.

    The only way to correct an Edge in this state is to change the Edge version (downgrade or upgrade at the customer's discretion) and then after that Edge software update, retry the desired upgrade for the Edge.

Resolved in Edge/Gateway Version R5013-20230322-GA

Edge/Gateway build R5013-20230322-GA was released on 03-25-2023 and is the 3rd Edge/Gateway rollup for Release 5.0.1.

This Edge/Gateway rollup build addresses the below critical issues since the 2nd Edge/Gateway rollup build, R5012-20221214-GA.

  • Fixed Issue 78050: A VMware SD-WAN Edge may experience a Dataplane Service failure when a PPTP server is present on the LAN side.

    When a PPTP server is present in the LAN side, and a PPTP client from the internet connects to it via an inbound firewall rule, the Edge service can fail due to a PPTP control channel lookup failure. This control channel lookup is needed to ensure the GRE data channel is sent out via the same link back to a PPTP client.

    On an Edge using a build without a fix for this issue, a customer's only alternative is to not use PPTP sessions.

  • Fixed Issue 80149: If Layer 7 (L7) Health Check is activated for a Non SD-WAN Destination (NSD) or Cloud Security Service (CSS) where there are redundant tunnels, a customer may experience both tunnels simultaneously being marked as down and then coming up intermittently if there are transmission issues on the Primary tunnel.

    With this issue, L7 probes for both the Primary and Secondary Tunnels are sent via the Primary Tunnel Interface. If the Primary Tunnel interface has packet transmission failures (for example, high latency), it would affect both the Primary and Secondary L7 Probe packets and the tunnels would both get torn down simultaneously, impacting customer traffic for that NSD or CSS.

  • Fixed Issue 84593: Tunnels may not come up for a VMware SD-WAN Virtual Edge with a KVM type.

    When packets are received with a Frame Check Sequence (FCS), which adds 4 extra trailer bytes to each packet, the DPDK IPsec process fails to decrypt the packets because the process uses the received frame length in its decryption operation instead of using the actual packet length. As a result, the KVM Edge cannot build IPsec tunnels to the Gateway or Orchestrator.
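
    The difference between the received frame length and the actual packet length can be shown with a small sketch. This is an illustration under stated assumptions, not the Edge's DPDK code: it assumes an untagged Ethernet II frame carrying IPv4, and the helper names are hypothetical. The only protocol facts relied on are the 14-byte Ethernet header, the 4-byte FCS trailer, and the IPv4 Total Length field at byte offset 2 of the IP header.

```python
import struct

ETH_HEADER_LEN = 14  # destination MAC + source MAC + EtherType
FCS_LEN = 4          # Frame Check Sequence trailer


def build_frame(payload: bytes, with_fcs: bool) -> bytes:
    """Build a dummy Ethernet/IPv4 frame (hypothetical helper for the demo)."""
    eth_header = bytes(12) + b"\x08\x00"  # zeroed MACs, EtherType IPv4
    total_length = 20 + len(payload)      # 20-byte IPv4 header, no options
    ip_header = struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, total_length,  # version/IHL, DSCP/ECN, Total Length
        0, 0,                   # identification, flags/fragment offset
        64, 50, 0,              # TTL, protocol 50 (ESP), checksum left at 0
        bytes(4), bytes(4),     # zeroed source and destination addresses
    )
    frame = eth_header + ip_header + payload
    if with_fcs:
        frame += bytes(FCS_LEN)  # placeholder FCS bytes
    return frame


def length_from_frame(frame: bytes) -> int:
    """Length derived from the received frame: includes any FCS trailer."""
    return len(frame) - ETH_HEADER_LEN


def length_from_ip_header(frame: bytes) -> int:
    """Actual packet length taken from the IPv4 Total Length field."""
    (total_length,) = struct.unpack_from("!H", frame, ETH_HEADER_LEN + 2)
    return total_length


frame = build_frame(b"x" * 100, with_fcs=True)
print(length_from_frame(frame), length_from_ip_header(frame))
```

    With the FCS present, the two lengths differ by exactly 4 bytes, so any operation that trusts the frame length would process 4 bytes that are not part of the packet.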

  • Fixed Issue 86994: On a customer enterprise where Dynamic Branch-to-Branch is activated, when attempting to troubleshoot a VMware SD-WAN Edge in this enterprise the dispcnt debugging command does not work.

    The dispcnt debug command does not provide all the counter values and fails with Domain (null) does not exist. This also fails when referring to the relevant logs in an Edge diagnostic bundle. This significantly hinders troubleshooting a customer network issue.

    This issue arises in enterprises where Dynamic Branch-to-Branch is activated due to the large quantity of tunnels that are created and torn down towards each peer. The counters to store various metrics of the peers are stored in a shared memory and over time, these shared memory segments get into a bad state due to a collision and the counters are not fetched by the dispcnt command.

    This issue can only be cleared by performing a service restart of the Edge.

  • Fixed Issue 95603: If a Zscaler server changes its IP address, the DNS lookup continues to use the old IP address which causes Non SD-WAN Destination (NSD) tunnel failure.

    If a remote server changes its IP address, the L7 health check fails and does not recover. The fix for this issue discovers the IP address change and flushes the L7 health check table.

    On a VMware SD-WAN Edge without a fix for this issue, rebooting the Edge reestablishes the tunnel.

  • Fixed Issue 96880: Remote IPv6 routes may not be present on a VMware SD-WAN Edge.

    The issue occurs if only one sub-interface has the IPv6 address in a segment but the parent routed interface does not have the IPv6 address.

    On an Edge without a fix for this issue, the workaround is to configure the IPv6 address on the parent interface.

  • Fixed Issue 97404: If a VMware SD-WAN Edge IPv6/IPv4 interface is moved to a different segment, the Edge does not learn the respective IPv6/IPv4 remote route on the new segment and the remote route persists on the old segment.

    Edge route learning fails in this scenario because when an interface is moved from one segment to another, the Edge process is expected to restart to complete the configuration change, and that does not happen for this issue (due to the Edge restart behavior change described in https://kb.vmware.com/s/article/60247).

    On an Edge without a fix for this issue, a user can run Remote Actions > Restart Service to complete the configuration change. This should be done in a suitable maintenance window.

  • Fixed Issue 97559: On a customer site deployed with an Enhanced High Availability topology, a WAN link connected to the VMware SD-WAN Edge in a Standby role may show as down on the VMware SASE Orchestrator and not pass customer traffic even though the Edge's WAN interface where the WAN link is connected is up.

    A user looking at a tcpdump or diagnostic bundle logging would observe ARP requests coming in and the Standby Edge not responding as a result of its port being blocked.

    In Enhanced HA, when an Edge assumes the role of Standby, the following events should occur in sequence:

    1. The Standby Edge blocks all ports.

    2. The Standby Edge then detects that it is deployed in Enhanced HA and unblocks its WAN ports to pass traffic.

    When this issue occurs, Event 1 (the initial port blocking) takes an unexpectedly long time to complete, and the follow-up Event 2 (the unblocking of all WAN ports) completes before Event 1 has finished. Event 1 then completes, and the final state is that all WAN ports are blocked on the Standby Edge.

    On an HA Edge without a fix for this issue, the workaround is to force an HA failover, which promotes the Standby Edge to Active and brings up the HA Edge's WAN link(s).
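
    The ordering problem described for this issue is a "last writer wins" race, which can be sketched as below. The step names and the function are hypothetical and only model the order in which the two events complete. They are not taken from the Edge software.

```python
def final_wan_port_state(completion_order):
    """Return the WAN port state after applying steps in completion order.

    Hypothetical model: each step overwrites the port state when it
    completes, so whichever step completes last decides the final state.
    """
    state = "unblocked"  # assumed state before the Edge becomes Standby
    for step in completion_order:
        if step == "block_all_ports":
            state = "blocked"
        elif step == "unblock_wan_ports":
            state = "unblocked"
        else:
            raise ValueError(f"unknown step: {step}")
    return state


# Expected sequence: blocking completes first, then WAN ports are unblocked.
print(final_wan_port_state(["block_all_ports", "unblock_wan_ports"]))
# Issue 97559: blocking completes late, after the unblock has already run.
print(final_wan_port_state(["unblock_wan_ports", "block_all_ports"]))
```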

  • Fixed Issue 98782: A VMware SD-WAN Gateway may experience a Dataplane Service failure during IPsec tunnel establishment, generating a core and restarting as a result.

    When a Gateway experiences this issue, the restart can result in a brief disruption of customer traffic for both Edges connected to that Gateway and Non SD-WAN Destinations using the Gateway for IPsec tunnels. The issue is caused by a race condition that occurs while the Gateway is establishing an IPsec tunnel and that triggers the Dataplane Service failure.

  • Fixed Issue 99676: For a customer using Wavefront to monitor a VMware SD-WAN Gateway, the output does not include core usage, network interface (netif), and per core metrics.

    Gateway Release 5.0.1.3 includes an enhancement to export Gateway core usage, netif, and per core metrics to the Wavefront monitoring service.

  • Fixed Issue 103527: A PPTP (Point-to-Point Tunneling Protocol) session is not reestablished after disconnecting.

    After a PPTP session is reconnected the Edge sees the call request/reply re-transmits, but on receiving the transmitted reply the Edge returns an error without clearing the GRE-NAT entry. Further connection attempts are dropped due to the existing GRE-NAT entry.

    On an Edge without a fix for this issue, the workaround is to clear the NAT database by running Remote Diagnostics > Flush NAT.

  • Fixed Issue 103529: For a customer site using a High Availability topology where one or more 1:1 NAT rules are configured, after an HA failover traffic using a 1:1 NAT rule may be dropped.

    For an Edge in an HA setup with 1:1 NAT configured, when the respective flows are synchronized to the Standby Edge, the wrong destination route is selected due to missing information in flow-sync table, and wrong route selection can cause drops.

    On an HA Edge pair without a fix for this issue, running Remote Diagnostics > Flush Flows will temporarily resolve the issue until the next HA failover.

  • Fixed Issue 103983: For a VMware SD-WAN Edge which has a Non SD-WAN Destination via Edge using redundant tunnels and has turned on the L7 Health Check feature, when the primary tunnel goes down, the backup tunnel also goes down, resulting in all traffic dropping that uses this NSD.

    The issue is caused by L7 probes going out the wrong path, which causes the Edge to view the secondary tunnel as also down along with the primary tunnel when the probes fail for the primary tunnel.

  • Fixed Issue 104141: Users behind a VMware SD-WAN Edge or customers connected to a VMware SD-WAN Gateway may experience significant issues for any traffic that is using that Edge, or traversing that Gateway to the point that no traffic may be forwarded.

    When the issue is encountered, the Edge or Gateway has an unbounded number of memory buffers (mbufs) being consumed by the jitter buffer queue due to increasing management tunnel time stamps received from a peer. This triggers integer underflow in the jitter calculation, causing packets to be buffered effectively indefinitely. At first this only affects buffered flows, but over a long enough period the number of mbufs consumed for the jitter buffer queue approaches the total of mbufs available and the SD-WAN device (Edge or Gateway) can become unable to forward all traffic entirely. If this affects a Gateway it would only affect multi-path traffic that traverses the Gateway and customer traffic going direct would not be affected.

    Another ticket, #105744 also addresses the symptoms found here but fixes a separate cause. The difference between the two tickets: the fix included in #104141 addresses the memory buffers being consumed by the jitter buffer queue due to the increasing management time stamps received by the peer. The fix included in #105744 restricts the jitter buffer count to 25% of the total memory buffers no matter what else happens to ensure that this issue cannot recur.

    Without a fix for this issue for either the Edge or Gateway, a user can monitor the memory buffer (mbuf) usage on the Orchestrator and look for increased mbuf usage due to packets being queued in the jitter buffer. If the user does observe the issue they can flush flows for the Edge (through Remote Diagnostics) or Gateway to temporarily alleviate the issue but the issue would eventually recur until the fix was applied.
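
    The effect of an integer underflow on a buffering calculation can be illustrated with a short sketch. The release note does not document the actual jitter calculation, so the following assumes, purely for illustration, an unsigned 32-bit subtraction of two time stamps. The function and parameter names are hypothetical.

```python
MASK32 = 0xFFFFFFFF  # emulate unsigned 32-bit arithmetic


def hold_time_u32(release_ts: int, peer_ts: int) -> int:
    """Hypothetical hold time: how long to keep a packet in the jitter buffer.

    Computed as an unsigned 32-bit difference. When `peer_ts` is ahead of
    `release_ts`, the subtraction wraps around instead of going negative.
    """
    return (release_ts - peer_ts) & MASK32


# Normal case: the packet is held for a short time.
print(hold_time_u32(1005, 1000))
# Peer time stamp has advanced past the release time: the result wraps to a
# value close to 2**32, so the packet is held effectively indefinitely.
print(hold_time_u32(1000, 1005))
```

    In this model a difference of only a few units in the wrong direction produces a hold time of roughly 4.29 billion units, which matches the described symptom of packets being buffered effectively indefinitely.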

  • Fixed Issue 104183: A VMware SD-WAN Edge which uses USB/LTE Modems may experience a Dataplane Service failure and generate a core file.

    An Edge can experience this issue when one or more USB modems go down and up (flap) multiple times.

  • Fixed Issue 104487: Customer sites whose VMware SD-WAN Edges use a particular VMware SD-WAN Gateway as their Primary Gateway may experience issues with user traffic destined for the Gateway because the Gateway cannot connect to the internet even though it shows as up on Orchestrator monitoring.

    When this issue occurs, the Gateway fails to transmit packets to the remote access service (RAS) due to these packets becoming stuck in the Gateway's transmit queue. As a result, Edges connected to this Gateway cannot build tunnels to it. This issue occurs only for data packets containing customer traffic and not for keepalive packets between the Gateway and the RAS, which is why the Gateway will continue to show as up on Orchestrator monitoring despite the issue occurring. Customer traffic tagged as Direct to the internet would not be affected by this issue as it does not use a Gateway to reach the Internet.

  • Fixed Issue 105360: A VMware SD-WAN Gateway using a 5.0.x Release may experience multiple Dataplane Service failures with a restart after each one.

    This issue can occur when the Gateway receives fragmented packets directly from an Edge to the Gateway's IP address (for example, ICMP packets from a ping). This issue does not occur if the fragmented packets are sent through a VCMP (SD-WAN Management) tunnel or an IPsec tunnel.

  • Fixed Issue 105744: Users behind a VMware SD-WAN Edge or customers connected to a VMware SD-WAN Gateway may experience significant issues for any traffic that is using that Edge, or traversing that Gateway to the point that no traffic may be forwarded.

    This ticket and Issue #104141 are directly related and have the same symptoms and cause which will be repeated here: when the issue is encountered, the Edge or Gateway has an unbounded number of memory buffers (mbufs) being consumed by the jitter buffer queue due to increasing management tunnel time stamps received from a peer. This triggers integer underflow in the jitter calculation, causing packets to be buffered effectively indefinitely. At first this only affects buffered flows, but over a long enough period the number of mbufs consumed for the jitter buffer queue approaches the total of mbufs available and the SD-WAN device (Edge or Gateway) can become unable to forward all traffic entirely. If this affects a Gateway it would only affect multi-path traffic that traverses the Gateway and customer traffic going direct would not be affected.

    The difference between the two tickets: the fix included in #104141 addresses the memory buffers being consumed by the jitter buffer queue due to the increasing management time stamps received by the peer. The fix included in #105744 restricts the jitter buffer count to 25% of the total memory buffers to ensure that this issue cannot recur.

    Without a fix for this issue for either the Edge or Gateway, a user can monitor both components on the Orchestrator and look for increased mbuf usage due to packets being queued in the jitter buffer. The user can then flush flows for the Edge or Gateway to temporarily alleviate the issue, though the issue would eventually recur until the fix was applied.
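
    The 25% restriction described for #105744 amounts to an admission check before a packet is queued in the jitter buffer. The following is a minimal sketch under that reading. The function name and the pool size used in the example are assumptions, and the actual enforcement in the Edge and Gateway software may differ.

```python
def may_enqueue_in_jitter_buffer(jitter_queued: int,
                                 total_mbufs: int,
                                 cap_percent: int = 25) -> bool:
    """Return True if another packet may be held in the jitter buffer.

    Hypothetical check: the jitter buffer may hold at most `cap_percent`
    of the total memory buffers (mbufs), whatever else is happening.
    """
    cap = total_mbufs * cap_percent // 100
    return jitter_queued < cap


# With a hypothetical pool of 1000 mbufs the cap is 250 queued packets.
print(may_enqueue_in_jitter_buffer(249, 1000))  # below the cap
print(may_enqueue_in_jitter_buffer(250, 1000))  # cap reached
```

    Because the cap is independent of the jitter calculation itself, it bounds mbuf consumption even if some other, unforeseen condition causes packets to be held for too long.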

  • Fixed Issue 106627: A customer who uses Layer 7 (L7) Health Check with a Non-SD-WAN Destination (NSD) or Cloud Security Service (CSS) where redundant tunnels are also configured may see that all tunnels show as down even though they are up.

    The issue is caused by the VMware SD-WAN Edge sending the L7 probes to the back-up tunnels instead of the primary tunnels and thus triggering a false indication that the tunnels are down.

  • Fixed Issue 106700: For users who have configured a loopback interface as the source interface for a Layer 7 (L7) Health Check on a VMware SD-WAN Edge, if a user changes any parameter of the loopback interface the L7 probes may fail and the IPsec tunnels associated with that L7 Health Check would report as down.

    When the loopback interface configuration is changed in any way the L7 probes can be sent to an interface designated as “None” with IP address 0.0.0.0 and thus the probes fail which results in the IPsec tunnel being marked as down.

  • Fixed Issue 107302: If the source interface chosen for Layer 7 (L7) Health Check probes is modified, including changing the IP address, the IPsec tunnel associated with the L7 Health Check may be marked as down for up to 30 seconds.

    It may take up to 30 seconds for the probes to be corrected. This may lead to the IPsec tunnel being marked down if enough probes fail before the configuration is corrected.

  • Fixed Issue 107309: When a customer configures the L7 Health Check for a Non SD-WAN Destination via Edge on a 4.x Orchestrator and the Orchestrator is upgraded to Release 5.x, if the customer attempts to modify the L7 probe retry value, the Edge does not apply the new value.

    For example, if the L7 Health Check probe retry value is 3 (the tunnel is marked as down on 3 failed probes) and the customer changes this value to 1, the L7 Health Check continues to use the original value of 3 retries before the tunnel is marked down. This issue is fixed on the Edge build as the Edge is not applying the new configuration it is receiving from the Orchestrator.
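
    The role of the retry value in the example above can be sketched as follows. This is an illustration only: it assumes that the failed probes must be consecutive, which the release note does not state explicitly, and the function name is hypothetical.

```python
from typing import Iterable


def tunnel_marked_down(probe_results: Iterable[bool], retry_value: int) -> bool:
    """Return True if the tunnel would be marked down for these probe results.

    `probe_results` holds one boolean per L7 probe (True = probe succeeded).
    Assumption: the tunnel is marked down once `retry_value` consecutive
    probes have failed, and a successful probe resets the count.
    """
    consecutive_failures = 0
    for succeeded in probe_results:
        consecutive_failures = 0 if succeeded else consecutive_failures + 1
        if consecutive_failures >= retry_value:
            return True
    return False


probes = [True, False, True, True]
# Configured value of 1: a single failed probe marks the tunnel down.
print(tunnel_marked_down(probes, retry_value=1))
# Stale value of 3 (the behavior in this issue): the tunnel stays up.
print(tunnel_marked_down(probes, retry_value=3))
```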

  • Fixed Issue 107356: For a customer who has deployed a Non SD-WAN Destination (NSD) with redundant tunnels and Layer 7 (L7) Health Checks activated, a secondary NSD tunnel may not come up for up to 30 seconds after the primary tunnel has gone down.

    For an NSD configured with redundant primary/secondary tunnels, the L7 state for the secondary tunnel is carried forward from instance to instance. If the secondary tunnel comes up and is then terminated, the L7 state may be marked as down to keep the secondary tunnel in the down state. When the secondary tunnel is later brought up again the L7 health check process may take up to 30 seconds to see the tunnel and resume sending L7 probes that verify the secondary tunnel is up.

  • Fixed Issue 109131: A PPTP (Point-to-Point Tunneling Protocol) session does not reconnect after a disconnect.

    The issue occurs because the GRE packets are dropped coming from the LAN side (the PPTP server) where the PPTP server is connected at the VMware SD-WAN Edge LAN side while the PPTP client is connected via the WAN (remotely).

    On an Edge without a fix for this issue, the workaround is to clear the NAT database by running Remote Diagnostics > Flush NAT.

Resolved in Gateway Version R5012-20221214-GA

Gateway build R5012-20221214-GA was released on 12-14-2022 and is the 2nd Gateway rollup for Release 5.0.1.

This Gateway rollup build addresses the below critical issues since Gateway build, R5011-20221007-GA.

Important:

Due to a build issue with the original 5.0.1.1 Gateway build (R5011-20221007-GA), Gateways cannot be upgraded to any other 5.0.1.1 Gateway build and must be upgraded from 5.0.1.1 directly to 5.0.1.2.

  • Fixed Issue 96863: A VMware SD-WAN Edge where the WAN links prefer IPv6 may experience a Dataplane Service failure, resulting in a brief disruption of customer traffic.

    The issue can occur on either an Edge or a VMware SD-WAN Gateway when IPv6 is activated, resulting in a service failure and a restart to recover.

  • Fixed Issue 97272: On a site with a High Availability topology where OSPF is used, when the site experiences a split brain condition (both VMware SD-WAN Edges are Active), the default route to the core router is removed and the HA site cannot reach peer sites in the network.

    The core router has its link-state advertisement (LSA) age synchronized with the Active Edge. When a HA split brain condition is experienced, the Standby Edge moves to active and sends a new LSA age to the Core Router. Since both the Active and Standby Edge have the same Router ID, a different LSA age is sent by the new Active Edge. This mismatch causes the LSA age to be set to a maximum value of 3600 in the Core router which also removes the core route to the HA site, resulting in a complete outage at the site.

  • Fixed Issue 99650: On a customer site where a Non SD-WAN Destination is configured with IKEv1, the tunnel may not form between a VMware SD-WAN Edge and the NSD peer and customer traffic would not be able to reach the peer site.

    During initial packet classification, fragmented ESP packets on the NSD tunnel are incorrectly classified as ordinary user packets and then sent to an unmonitored queue. The packets accumulate in that queue and are never processed, which results in a packet leak.

Resolved in Edge Version R5012-20221107-GA

Edge build R5012-20221107-GA was released on 11-14-2022 and is the 2nd Edge rollup for Release 5.0.1.

This Edge rollup build addresses the below critical issues since the 1st Edge rollup build, R5011-20221007-GA.

Important:

Those using a previous Edge 5.0.x.x build should upgrade their Edges to 5.0.1.2. Previous Edge 5.0.x.x builds have been marked as deprecated in VMware SASE hosted Orchestrators.

For more information see the Knowledge Base article https://kb.vmware.com/s/article/90042.

  • Fixed Issue 70311: A VMware SD-WAN Edge may experience a Dataplane Service Failure and restart as a result.

    During the Edge service restart, customer traffic would be disrupted for ~15-30 seconds. This issue occurs inconsistently, but when it does occur the Edge is tearing down an IKE security association (SA). This typically only occurs when: the SA timer (as configured on the VMware SD-WAN Orchestrator) expires; or the user modifies the IPSec configuration on the Orchestrator.

  • Fixed Issue 96411: A VMware SD-WAN Edge may experience a Dataplane Service failure and restart as a result, resulting in a 10-15 second interruption in customer traffic.

    The issue can occur on an Edge that experiences frequent link flaps (where a WAN link goes down and then returns in rapid succession). The issue is caused by memory corruption which results in a double free state and an Edge service failure.

  • Fixed Issue 96441: On a site using a High Availability Topology, the customer may observe frequent HA failovers.

    The issue is triggered by the HA interface being marked by the Edge as down and then coming back up within 500-1000ms which can trigger an HA failover. However, these interface down events are spurious and caused by a DPDK-enabled interface using polling with an interval of 500ms to determine interface status. Using this method, the underlying device driver can sometimes report a spurious interface down event and each event causes the Edge to mark the interface as down until the next poll of the interface status (in 500ms) reports that the interface is up.
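
    The relationship between the 500ms polling interval and the 500-1000ms down windows can be illustrated with a short sketch. This only models the timing described above. The function name and the sample poll results are assumptions, and the sketch does not represent the driver or the fix.

```python
from typing import List, Sequence, Tuple


def down_windows(polls: Sequence[bool],
                 interval_ms: int = 500) -> List[Tuple[int, int]]:
    """Return (start_ms, duration_ms) for each period the interface is down.

    `polls` holds one status per poll (True = driver reported the interface
    up). A down report keeps the interface marked down until a later poll
    reports up again. A down period still open at the end is not reported.
    """
    windows: List[Tuple[int, int]] = []
    down_since = None
    for index, is_up in enumerate(polls):
        now_ms = index * interval_ms
        if not is_up and down_since is None:
            down_since = now_ms
        elif is_up and down_since is not None:
            windows.append((down_since, now_ms - down_since))
            down_since = None
    return windows


# One spurious down report, then two in a row.
print(down_windows([True, False, True, True, False, False, True]))
```

    A single spurious report keeps the interface marked down for one full polling interval, and two in a row for 1000ms, which is consistent with the 500-1000ms range that can trigger an HA failover.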

  • Fixed Issue 96888: In certain load conditions, the routing protocols for either BGP or OSPF may randomly restart, leading to route re-convergence and traffic disruption.

    Under higher load conditions the BGP and OSPF routing protocol processes are made to wait longer than expected by the Edge CPU to get scheduled and this leads to a stall and restart of the routing protocol. The routing protocol delay is caused by insufficient CPU bandwidth allocation and can occur on any Edge model.

  • Fixed Issue 97483: For a site using a VMware SD-WAN Virtual Edge with 2 CPU cores, the throughput does not exceed 500 Mbps in the transmission (Tx) direction.

    The throughput of a 2 core Virtual Edge is soft-capped (in other words, capped through the Edge software) to 500 Mbps to prevent the CPU from being overloaded. However, enhancements in Edge software have made it possible for a 2 core Virtual Edge to handle much more traffic without the CPU being overloaded and the existing 500 Mbps cap is now too limiting. With this Edge build, the 2 core throughput cap is raised to 1000 Mbps.

  • Fixed Issue 98514: On a customer enterprise deployed with a High Availability topology, whenever a configuration change is applied to the VMware SD-WAN HA Edges, the user would observe an event stating "Management service failed" on the Standby Edge and that the management service is restarting as a result.

    Since this is the management service (which does not involve customer traffic), and on the Standby Edge, there is no negative impact to client users at the HA site when the Standby Edge's management service restarts. This is still a critical event recorded in Edge Events that would greatly concern the customer administrators.

  • Fixed Issue 100377: When a VMware SD-WAN Edge is upgraded to Release 5.0.x, LAN-side client users may observe that all customer traffic drops and they are unable to connect to other sites and the internet.

    This issue occurs randomly and affects LAN-side traffic. The Edge remains connected to both the Gateway and Orchestrator. On the Orchestrator when looking at Monitor > Edge > Health a user would observe an escalating level of handoff queue drops. The issue is caused by a change in behavior introduced in the 5.x Edge build that can lead to a packet leak: a particular packet connected with the change is no longer released and over time the packet buffer becomes overloaded and begins dropping all packets.

    On an Edge without a fix for this issue, a restart of the Edge service will clear the packet buffer, but only temporarily.

  • Fixed Issue 101049: If a customer enterprise uses both secure and non-secure routes, they may observe high path loss.

    An example of using both secure and non-secure routes would be an enterprise where a Partner Gateway is used and the Edge learns subnets from a BGP neighbor (non-secure) and then the Edge learns those same subnets from another Edge in the network (secure). In this scenario the secure route is preferred but if the secure route is revoked, the traffic is not switching to the non-secure route. The issue is caused by the Edge service not sequencing the management tunnels responsible for routing properly.
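
The expected route selection described under Fixed Issue 101049 can be sketched as follows. This is a minimal illustration of "prefer secure, fall back to non-secure", not the Edge implementation; the `Route` type, the `select_route` function, and the next-hop labels are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Route:
    prefix: str
    next_hop: str   # hypothetical label for where the route was learned
    secure: bool    # True if learned from another Edge (secure route)


def select_route(routes: List[Route], prefix: str) -> Optional[Route]:
    """Return the preferred route for a prefix: secure first, then non-secure."""
    candidates = [r for r in routes if r.prefix == prefix]
    # sorted() is stable, so secure routes (key False) move ahead of non-secure.
    candidates = sorted(candidates, key=lambda r: not r.secure)
    return candidates[0] if candidates else None


routes = [
    Route("10.1.0.0/16", "bgp-neighbor", secure=False),
    Route("10.1.0.0/16", "remote-edge", secure=True),
]

# With both routes present, the secure route is preferred.
best = select_route(routes, "10.1.0.0/16")

# After the secure route is revoked, traffic should switch to the non-secure route.
remaining = [r for r in routes if r != best]
fallback = select_route(remaining, "10.1.0.0/16")
```

In the failure described above, the second lookup effectively returned nothing usable, so traffic did not move to the non-secure route.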

Resolved in Edge/Gateway Version R5011-20221007-GA

Edge/Gateway build R5011-20221007-GA was released on 10-11-2022 and is the 1st Edge/Gateway rollup for Release 5.0.1.

This Edge/Gateway rollup build addresses the below critical issues since the original GA build, R5010-20220729-GA.

  • Fixed Issue 89235: On a customer enterprise which uses a Hub/Spoke topology and employs internet backhaul policies, backhaul traffic from a VMware SD-WAN Spoke Edge which is destined for the Internet may be dropped by the Hub Edge.

    When this issue is encountered, the client users would notice issues for traffic destined for the Internet. The issue occurs after one of the following: an Edge power cycle (for example after a power outage), an Edge service restart, or a configuration change and is caused by a timing issue between the backhaul traffic originating from a Spoke Edge and the route advertised from the Spoke Edge.

    When encountering this issue on a Spoke Edge without this fix, a user should flush the flows on the affected Spoke Edge to restore normal routing of backhaul traffic. This can be done on the Orchestrator through Remote Diagnostics > Flush Flows.

  • Fixed Issue 94430: For a customer enterprise that uses a Hub/Spoke topology where multiple Hubs are deployed, a user behind a VMware SD-WAN Spoke Edge may observe issues with traffic that is destined for a Hub Edge.

    Client traffic issues occur when the Spoke Edge forwards traffic towards a Hub different than the one expected to receive the traffic. The issue is caused by the AS path length for the remote BGP routes not being calculated properly in certain scenarios. Because of this, the routes from the Hubs that should have a lower routing preference instead end up having greater AS_PATH length and may be preferred.

    If encountering this issue without a fix, the customer can withdraw and re-advertise the route that is expected to be preferred.

  • Fixed Issue 95503: In rare instances a customer may observe that a VMware SD-WAN Edge model 610, 610N, or 610-LTE shows the same MAC address for all Ethernet interfaces.

    An Edge 610 (any type) may show an eth0 MAC address ending with 0xF*. In such cases, GE1 through GE6 ports receive the same MAC address due to an issue with the script that calculates and allocates MAC addresses.

    The fix corrects this script behavior and an affected Edge 610 type would properly calculate and allocate unique MAC addresses once the Edge is upgraded to a build that includes it.

  • Fixed Issue 96055: A VMware SD-WAN Gateway may experience a Dataplane Service failure with Signal 6 (SIGABRT) and generate a core.

    This issue can occur if a VMware SD-WAN Edge sends a packet which refers to an invalid Segment to the Edge's Primary Gateway. When the Gateway receives this packet, the Gateway's process fails instead of handling the situation gracefully by discarding such packets.

  • Fixed Issue 96231: When a customer deploys a Non SD-WAN Destination (NSD) via Gateway with a Palo Alto Networks type and also configures Palo Alto's “Prisma tunnel monitoring feature” for use on this NSD, the user may observe that while IPsec tunnels are established for this NSD, they are continuously torn down and rebuilt every 5-15 seconds, causing disruption for traffic using the NSD.

    When the Prisma tunnel monitoring is enabled, the Prisma application sends ICMP packets encrypted to the SD-WAN Gateway and once the Gateway responds back to the ICMP packet, Prisma confirms that the tunnel is established. In effect, Prisma tunnel monitoring is a kind of IPsec tunnel liveness check. However, the problem in this instance is that the Gateway is dropping Prisma's ICMP packet and thus Prisma marks the tunnel as down which triggers the tunnel teardown and rebuild.

    The issue is caused by the Gateway receiving the ICMP packet and checking if it is an echo request packet, but instead of checking the type field, the Gateway is incorrectly checking the code field in the ICMP header and this results in the Gateway discarding the ICMP packet which triggers Prisma to tear down the tunnel.

    On a Gateway that does not have a fix for this issue, the customer should not use Prisma for their Palo Alto type NSD.
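
The type-versus-code mix-up described for Fixed Issue 96231 can be illustrated with a short sketch. It assumes the standard ICMPv4 header layout (type in byte 0, code in byte 1, echo request is type 8 with code 0); the function names are hypothetical and this is not the Gateway's code.

```python
import struct

ICMP_ECHO_REQUEST = 8  # ICMPv4 type value for an echo request


def is_echo_request(icmp_packet: bytes) -> bool:
    """Correct check: compare the type field (byte 0 of the ICMP header)."""
    if len(icmp_packet) < 8:
        return False
    icmp_type, _code, _checksum, _ident, _seq = struct.unpack(
        "!BBHHH", icmp_packet[:8]
    )
    return icmp_type == ICMP_ECHO_REQUEST


def is_echo_request_buggy(icmp_packet: bytes) -> bool:
    """Illustrates the described mistake: compares the code field (byte 1)."""
    if len(icmp_packet) < 8:
        return False
    _type, icmp_code = struct.unpack("!BB", icmp_packet[:2])
    return icmp_code == ICMP_ECHO_REQUEST


# Echo request header: type=8, code=0; checksum left as 0 for brevity.
probe = struct.pack("!BBHHH", 8, 0, 0, 0x1234, 1)
```

Because an echo request carries code 0, the code-based check never matches and the probe is treated as something other than an echo request, which is consistent with the discard behavior described above.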

  • Fixed Issue 98157: A VMware SD-WAN Gateway may experience a Dataplane Service failure and restart as a result.

    In rare instances when the source port of an SD-WAN tunnel changes (for example, because an intermediate NAT device restarts), the Gateway's dataplane process can fail and then restart and generate a core.

  • Fixed Issue 99188: In some situations a BGP session may not come up for a customer enterprise.

    This issue occurs when an ASN value greater than 2147483648 is configured. In such a case the configuration is not applied and hence the BGP sessions do not come up.
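
The threshold quoted for Fixed Issue 99188 is 2^31. The release note does not state the root cause, so the sketch below only shows why values in that range deserve attention: they are inside the 4-octet ASN number space yet do not fit in a signed 32-bit integer. Treating signed 32-bit storage as the cause is an assumption, and the helper names are hypothetical.

```python
import struct

MAX_4_OCTET_ASN = 2**32 - 1  # upper bound of the 4-octet ASN number space


def in_asn_space(asn: int) -> bool:
    """True if the value lies within the 4-octet ASN number space."""
    return 0 <= asn <= MAX_4_OCTET_ASN


def fits_signed_32(asn: int) -> bool:
    """True if the value can be stored in a signed 32-bit integer."""
    try:
        struct.pack("!i", asn)
        return True
    except struct.error:
        return False


high_asn = 3000000000  # example value above 2147483648
low_asn = 65001        # example private-use 2-octet ASN
```

A configuration validator that accepts the full unsigned range, as `in_asn_space` does, would accept both example values.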

Resolved in Edge/Gateway Version R5010-20220729-GA

Edge and Gateway version R5010-20220729-GA was released on 08-05-2022 and resolves the following issues since Edge/Gateway version R5002-20220506-GA. This means that a fix for any Edge or Gateway issue listed in the 5.0.0 Release Notes is included in all Release 5.0.1 builds.

  • Fixed Issue 58791: A site deployed with a High Availability topology where BGP is used may encounter an issue where the VMware SD-WAN Edge repeatedly fails over.

    This issue affects HA sites configured within a Hub/Spoke topology where the HA site has greater than 512 BGPv4 filter prefixes configured.

    When BGP is used with multiple network commands configured, the Standby Edge parses all of the configurations symmetrically as it comes up and spawns vtysh for every network command, which prevents the verp thread from running. The delayed verp thread in turn delays heartbeat processing, which causes the Standby Edge to believe the Active Edge is down. The Standby Edge then becomes active, which leads to a split-brain state (active-active). To recover from the split-brain state, the Standby Edge restarts, which merely repeats the cycle.

    Without the fix the workaround is to reduce the number of BGP filter prefix configurations by aggregating them and getting the total number below 512 (256 Inbound, and 256 Outbound filters).

    Note:

    There is a similar issue where an HA site is disrupted when using BGP 'match and set' operations and this is tracked separately under Fixed Issue #84825, which is also fixed in this Edge build.

  • Fixed Issue 67458: When a VMware SD-WAN Hub Edge with a large number of Spoke Edges is upgraded to Release 4.2.1 or later, some tunnels to other Spoke Edges will not come up for the Hub Edge.

    A large number of Spoke Edges is understood as ~1000 or more. This issue is not consistent, but generally ~1/3rd of the VeloCloud Management Protocol (VCMP) tunnels are not established between the Hub Edge and the connected Spoke Edges. This is caused by the Hub Edge ignoring the MP_INITs as the number of half open TDs exceeds the Hub Edge's upper limit.

    When encountering this issue without the fix, restarting the Edge Service will restore full tunnel connectivity.

  • Fixed Issue 70129: When Syslog is activated on a large scale VMware SD-WAN Gateway, the /var/log folder may get filled up with syslog log files in a short period of time.

    Large scale is understood as a Gateway with ~4K peers and ~6K tunnels where there are usually ~100-150K flows and ~50-100K NAT entries. Short period can be as little as 24 hours with a syslog.log file of >3.2Gb in size. This is caused by some NAT logs being directed to the /var/log that should be directed to a different folder.

  • Fixed Issue 70586: When a routed interface on a VMware SD-WAN Edge is configured for 802.1x (uses RADIUS authentication), clients connected on that interface get silently de-authenticated whenever any other interface flaps (in other words, when any non-802.1x interface goes down and up in quick succession), and all of their traffic gets dropped until the client disconnects and then reconnects to the Edge.

    The Edge is not checking that the interface that flapped is actually the one that had 802.1x clients authenticated and thus treats any interface flap as if it were an 802.1x interface flap and acts accordingly.

    Without the fix, the only workaround is to force the client to physically disconnect and reconnect so that it is re-authenticated.

  • Fixed Issue 74291: A VMware SD-WAN Edge in a High-Availability topology may appear as offline after a failover despite having internet access and functional DNS.

    This issue can occur after a High-Availability failover and is caused by a token error on the newly promoted Active Edge which results in a heartbeat failure to the Orchestrator. Without the heartbeat, the Orchestrator marks the Edge as down.

    Without an Edge build with the fix, the way to remediate the issue is to locally force another failover either through the local UI or by power cycling the Active Edge.

  • Fixed Issue 72925: For a customer who uses SNMP polling for monitoring their enterprise and deploys lower model VMware SD-WAN Edges (for example, Edge models 510, 520, or 610) which are running a 4.x software release, SNMP polling takes exceptionally long to process and can even timeout.

    This issue significantly reduces the effectiveness of SNMP polling for network monitoring when using Edges in the 510, 5x0, and 6x0 series. This issue is caused by the Release 4.x SNMP agent taking an unnecessarily long amount of time in traversing the debug command list, which is not actually required for the SNMP process.

  • Fixed Issue 73830: System Center Configuration Manager (SCCM) application traffic is being misclassified by the VMware SD-WAN Edge as Business Intelligence Service (BITS) traffic and customers using Business Policy or Firewall Rules designed for SCCM traffic will find that traffic impacted.

    The Edge's Deep Packet Inspection (DPI) engine is misclassifying the SCCM application packets as BITS traffic and if there are Business Policy or Firewall Rules designed to steer that traffic or ensure that traffic is allowed by Firewall rules, the misclassification may result in SCCM traffic being blocked with a resulting disruption to the customer. The remediation for this issue involved amending the default 4.5.1/5.0.1 and later application maps to ensure that this misclassification is prevented.

  • Fixed Issue 74316: A VMware SD-WAN Spoke Edge may not connect to any or all of the assigned Hub Edge Clusters, even if the Edge has a service restart or a full reboot.

    There is an issue with the cluster reassignment logic which creates cluster assignment mapping without the cluster member’s endpoint information in a specific Cluster-member-to-Super-Gateway overlay flap scenario. As a result, Spoke Edges assigned to the Hub Cluster member subsequently fail to receive the endpoint information of the Hub Cluster member, leading to no overlays between Spoke Edges and Hub Clusters.

    Without the fix the only way to temporarily remediate the condition is for someone with Gateway access to trigger a cluster reassignment manually on the Super Gateways.

  • Fixed Issue 76690: A user may observe important logs missing when attempting to troubleshoot an issue for a VMware SD-WAN Edge because they have been crowded out by repeated entries of a less important event.

    In a diagnostic bundle, velocloud/log could have repeated logging of the event vc_peer_qos_update_cos_qlimits. The log level for this event is management plane and it can get logged repeatedly to the point that the log overflows and rolls over. In a troubleshooting scenario, this can result in important log messages being missed because they were rolled over and wiped out.

  • Fixed Issue 78276: On a VMware SD-WAN Gateway, running the command debug.py -qos_net fails if the VMware SD-WAN Edge's name includes non-ASCII characters.

    An example of this in the field was the use of Chinese characters but it applies to any non-ASCII characters and can be observed as follows: change an Edge name to include non-ASCII characters and reboot the Edge. Then, on a Gateway connected to the Edge, run the CLI command debug.py --list 3 to get the Edge's logical ID. Then run the Gateway CLI command debug.py -qos_net [logical ID] all stats and observe that the command fails.

  • Fixed Issue 78300: If a VMware SD-WAN Edge is using a WAN link configured to be a backup, a user may observe logs or Orchestrator Events which suggest that tunnels are coming up or going down for this link.

    By design, tunnels do not get established for backup links. But any tunnel request from a remote end (typically a dynamic Edge-to-Edge tunnel) might change the link status as it goes through the stack. In this fix, care has been taken so that no logs indicate that any tunnel formation or teardown is going on for the backup link.

  • Fixed Issue 78391: Traffic with the Speedtest application classification is not working properly.

    Both speedtest.net and fast.com have newly added speedtest server IP addresses that are missing in the default application map and as a result the Business Policy that deals with these applications is not being applied.

    If not upgraded to Release 4.5.1 or 5.0.1, an Operator could add the required speedtest IPs to an existing application map using the VMware SASE Orchestrator's Application Map Editor.

  • Fixed Issue 79261: Office 365 / Microsoft 365 application traffic is misclassified as Tencent Meeting (VooV Meeting) application traffic on the VMware SD-WAN Edge.

    This can be disruptive for customers who rely on Business or Firewall Policies for routing and prioritizing Office 365 / Microsoft 365 traffic only to have that traffic classified as Tencent Meeting and thus hitting a completely different rule. The issue is traced to incorrect application map subnets for Tencent that are corrected for the 4.3.2 default application map. A customer not using 4.3.2 can have this corrected by an Operator who edits their application map through the Orchestrator Application Map Editor to correct the Tencent subnets.

  • Fixed Issue 80010: For a customer enterprise using a Hub/Spoke topology where SD-WAN Reachable is also configured, the Spoke to Gateway path (using a public WAN link) via the Hub path does not come up if the Spoke-to-Hub path is point-to-point.

    The SD-WAN Reachable feature, which is a passthrough for a Spoke Edge to connect to a Gateway through a connected Hub, is not supported if the Spoke Edge and the Hub Edge are connected by a point-to-point link (in other words, the Spoke's IP address matches the connected route on the Hub). The fix for this issue adds this functionality.

  • Fixed Issue 80196: A VMware SD-WAN Gateway may experience a Dataplane Service failure with a SIGXCPU message and the Gateway restarts the service to recover with an impact to Gateway traffic for 15-30 seconds.

    This issue is seen at high throughput for that Gateway relative to its throughput capacity and thus is more likely to happen in large scale deployments (for example, 4K Edges and 6K tunnels). When traffic at a high rate hits the Gateway, in some instances the Gateway will experience a thread lock and generate a core while restarting. In the core the user would observe: Program terminated with signal SIGXCPU, CPU time limit exceeded.

  • Fixed Issue 80479: A VMware SD-WAN Gateway may experience a Dataplane Service failure with the Gateway restarting the service to recover which impacts Gateway traffic for 15-30 seconds.

    This issue can occur if a VMware SD-WAN Edge is connected to the Gateway with Edge-to-Edge (E2E) configured and a Loopback interface route advertised. When a user toggles off E2E for this Edge, this triggers a route initiation, but the loopback route is not deleted and only its profile flag is updated. Next, if the user removes the advertise setting for the Loopback route, the route is deleted from the FIB but remains as a stale entry in the E2E table on the Gateway. If the Loopback route is then readvertised, it is added to the FIB again. If the user then toggles Edge-to-Edge back on, this again just updates the flag and, because a stale entry for the route is already present in the Gateway's E2E table, the actual route ref_count is not correct. Finally, if the tunnel is torn down, this triggers the Dataplane Service failure on the Gateway.

    Without the fix, an Operator would need to make sure routes are withdrawn before a profile is changed for the Edge.

  • Fixed Issue 80496: Ping from a VMware SD-WAN Edge to a remote Edge's branch loopback IP address over an SD-WAN tunnel may not work.

    The issue is seen for a ping with a large enough packet size to cause fragmentation. When the ping is initiated with a large packet size from an Edge to a remote branch Edge's loopback IP address, the fragmented ICMP reply reaches the Edge initiating the ping but does not reach the ping application since the next fragment is dropped.

  • Fixed Issue 80721: Partners and operators monitoring a VMware SD-WAN Gateway using Telegraf may observe that the metrics do not resume should Telegraf experience a network timeout.

    Gateways experiencing this issue are using Telegraf version 1.17.3. This is in contrast to the Telegraf version the VMware SASE Orchestrator is using: 1.21.1. This version mismatch is causing the issue with Telegraf getting stuck in the event of a network timeout. The Gateways with a fix for this issue include Telegraf version 1.21.1 as would any future Gateway build in that release train (in other words: 4.5.1 or 5.0.1).

    On a Gateway that experiences this issue, the only remediation is to restart Telegraf to resume sending metrics.

  • Fixed Issue 80814: On a VMware SD-WAN Edge where a Standard Firewall Allow rule is configured which has a local Edge client Source IP address and a remote client as the Destination IP Address, and which also has a "Deny All" rule for other traffic, the traffic from the remote client to the local client is dropped.

    This issue is encountered when there is a VLAN IP address mismatch between the source and destination hosts. When the source and destination hosts are part of different VLANs, the SD-WAN service prefers the source/destination IP address of the first packet as it is in the Firewall lookup key. As a result, for overlay inbound flows, there is a mismatch and traffic hits the Deny All firewall rule.

    Without the fix, the workaround for this issue is to reverse the rule so that it follows the direction of the first IP packet of the flow, so that the packet is able to match the firewall rule.

  • Fixed Issue 80897: For a customer enterprise where VMware SD-WAN Edges are connected to VMware SD-WAN Partner Gateways, users may observe poor performance for customer traffic.

    The poor performance is the result of routing issues stemming from the Partner Gateway distributing routes to the Edges where preferred secure static routes are available but the Edge does not properly label these routes as secure. The result is the Edge potentially advertising non-preferred non-secure routes over secure routes since all routes are treated equally when the expected behavior is to always prefer secure routes over non-secure routes.

    Note:

    Both the Partner Gateway and customer Edges must be upgraded to a build that includes this fix to resolve the issue.

  • Fixed Issue 81221: If a customer configures a 1:1 NAT rule for a VMware SD-WAN Edge and that Edge is rebooted, the rule no longer works.

    After the reboot, the Edge assigns the NAT address as the Edge interface address where the NAT rule is being applied and thus no tunnels are being built for traffic matching that rule.

    Without the fix, the only remediation is to run the Remote Diagnostic > Flush NAT, which flushes the entire NAT table and reestablishes correct NAT rule operation.

  • Fixed Issue 81809: When a user attempts to SSH to a VLAN IP on a VMware SD-WAN Edge from a remote client sitting behind another Edge or even from a VMware SD-WAN Gateway, the SSH attempt fails.

    An SSH attempt from a LAN client to an Edge VLAN IP works properly. Originally the Edge's Management IP was used to SSH to the Edge. However, after the Edge Management IP was deprecated, there was no option for the user to SSH to the Edge (via overlay from a remote Edge client) as the Loopback IP still does not support SSH.

  • Fixed Issue 81859: When a VMware SD-WAN Edge 610-LTE is activated to Edge Release 5.0.0, the CELL interface may not come up after the Edge is upgraded to that release.

    This issue is not consistent but when it occurs can have a major impact if the Edge 610-LTE's only public link is the mobile CELL link as the Edge would be effectively down and intervention for this Edge would need to be local.

    If encountering this issue on an Edge without the fix, the user would need to restart the Edge service (or power cycle/reboot it if the Edge is inaccessible through the Orchestrator) or restart the Edge's modem to restore the CELL interface.

  • Fixed Issue 82182: For a VMware SD-WAN Edge Model 510 or 510-LTE which is running Edge Release 5.0.0, when a user attempts a service restart of the Edge, the Edge may also reboot.

    An Edge reboot would disrupt customer traffic for 2-3 minutes while the Edge was going through the reboot process. On an Edge 510/510-LTE, there is a Wi-Fi device hang monitoring script which may fail to stop during the Edge service restart and this triggers the reboot.

    Without the fix a user would need to restart the Edge service, but Edge service restarts for these models should only be done in a maintenance window or with the understanding this issue may arise.

  • Fixed Issue 82264: A VMware SD-WAN Virtual Edge which uses an AWS C4 instance cannot be upgraded to Release 5.0.0.

    An AWS C4 Virtual Edge upgraded to Release 5.0.0 does not recover and the only remediation is for the user to reactivate the Edge to a non-5.0.0 version.  No other AWS instances (for example, C5) are affected by this issue, but due to the critical nature of this defect an AWS Edge Upgrade software package is not available for Release 5.0.0.

    Edge Release 5.0.1 and later correct this issue and the AWS C4 instance can be successfully upgraded to Release 5.0.1 and later.

  • Fixed Issue 82463: For a site configured with a Cloud Security Service (CSS), the VMware SD-WAN Edge may drop traffic destined for the CSS.

    If the site is routing all internet traffic through a CSS, the impact of this issue can be significant. When the issue occurs, CSS packets are sent on the incorrect interface with the IP address of the actual interface as the source which leads to a failure in application access. The issue is caused by a potential race between the CSS context lookup thread and the outgoing interface selection thread which leads to the incorrect association of the outgoing interface with the flow and some flows on the CSS paths fail.

    Without the fix, when experiencing the issue the user can remediate it by starting a new flow, or flushing all flows on the Edge by using Remote Diagnostics > Flush Flows.

  • Fixed Issue 82485: On an entry level VMware SD-WAN Edge model (for example, Edge 510, 510-LTE, or 610) if a user runs the Remote Diagnostic "Route Table Dump", the Orchestrator UI page may time out and not return a result.

    The issue is encountered if there are more than 16000 routes as it takes the Edge more than 30 seconds to return the results. 30 seconds is the timeout limit for the page's WebSocket and so no result is returned. The fix for the issue optimizes the route table walk to ensure timeouts do not occur.

  • Fixed Issue 82522: When high throughput traffic hits a VMware SD-WAN Edge, there may be a drop in actual throughput observed on the Edge.

    Under high throughput, the Edge's NDP (Neighbor Discovery Protocol) thread acquires a lock twice, even for NDP entries which are marked reachable and require no further processing. These duplicate locks cause the throughput to decrease due to the additional processing.

  • Fixed Issue 82652: For a customer using a Cloud Security Service (CSS) where L7 Health Check is configured, the VMware SD-WAN Edge makes no attempt to recover an IPsec CSS tunnel that has been marked as down for more than five minutes.

    In the current implementation of the L7 Health Check, the Edge sends L7 probes on all CSS tunnels and if those probes fail a set number of times, the Edge marks that tunnel as Down and then continues to send L7 probes and waits for the tunnel to come up on its own. The issue is that no attempt is made to recover a tunnel once it has been in a Down state for more than five minutes while IKE remains up (if IKE is also down the IPsec tunnel is automatically reset after 20 seconds).

    The fix in this ticket enhances the L7 Health Check by including an additional step for IPsec-based CSS tunnels: if an IPsec-based CSS tunnel remains down for longer than five minutes (no successful L7 probes) while IKE for the tunnel remains up during the same period, the Edge will tear down the IPsec tunnel and reset the IKE in an attempt to recover the CSS tunnel. L7 probes would continue to be sent while this occurs and if successful would mark the tunnel as Up. If the tunnel remained down, the same step would be applied after an additional five minutes.

    This added behavior only applies to a CSS with IPsec tunnels and not ones using GRE tunnels.
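
The recovery step added for Fixed Issue 82652 can be sketched as a simple decision function. This is a minimal model built only from the description above, not the Edge implementation; the `CssTunnel` type, its fields, and `should_reset` are hypothetical names, and times are plain seconds.

```python
from dataclasses import dataclass
from typing import Optional

RECOVERY_INTERVAL_S = 5 * 60  # five minutes, per the description above


@dataclass
class CssTunnel:
    kind: str                           # "ipsec" or "gre"
    ike_up: bool                        # whether IKE for the tunnel is still up
    down_since: Optional[float] = None  # when L7 probes marked the tunnel Down
    last_reset: Optional[float] = None  # when the last recovery reset happened


def should_reset(tunnel: CssTunnel, now: float) -> bool:
    """Decide whether to tear down the IPsec tunnel and reset IKE."""
    if tunnel.kind != "ipsec":
        return False  # the added step does not apply to GRE tunnels
    if tunnel.down_since is None or not tunnel.ike_up:
        return False  # tunnel is Up, or IKE is down (handled separately)
    # Measure from the last reset if there was one, else from when it went Down.
    reference = tunnel.last_reset if tunnel.last_reset is not None else tunnel.down_since
    return now - reference > RECOVERY_INTERVAL_S
```

For example, a tunnel marked Down at time 0 with IKE up would not be reset at 200 seconds, would be reset at 301 seconds, and, if still Down, would be reset again once a further five minutes had passed since that reset.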

  • Fixed Issue 82790: On VMware SD-WAN Gateways deployed in Azure environments, the Gateway interface counters are not exported to the Wavefront monitoring service.

    Azure is an environment where DPDK is not configured for use, and only DPDK interface counters (throughput rate, PPS, and drop counters) are exported to the Wavefront service. This leads to reduced monitoring abilities in platforms like Azure where DPDK is not used.

  • Fixed Issue 82839: If a user performs an IPsec automation deletion action for a Zscaler Cloud Security Service tunnel on a VMware SASE Orchestrator, the action also deletes all the VPN credentials associated with the respective Zscaler location resulting in deletion of the location itself.

    The IPsec automation deletion action should only delete the VPN credential associated with it from the Zscaler Location. All other VPN credentials associated with the respective Zscaler Location should remain untouched.

  • Fixed Issue 83029: For either a standalone VMware SD-WAN Edge or a site deployed with a High-Availability topology where one or more PPPoE links are used, if the PPPoE endpoint IP changes after either an Edge interface flap for that PPPoE link or when an HA site experiences a failover, traffic would not pass on the affected PPPoE link(s).

    On a site that uses PPPoE links, along with a change in the PPPoE endpoint IP, the impact would mean no customer traffic would pass. The issue is caused due to the presence of a stale default route, which is a route using the old IP address of the PPPoE endpoint on the Edge that is not deleted after a new PPPoE endpoint IP address is received.

    Without the fix, an onsite user would need to either disconnect each PPPoE cable and reconnect it to force a renegotiation or reboot the Edge, which would also force a renegotiation.

  • Fixed Issue 83083: A VMware SD-WAN Gateway upgraded to Release 4.3.1 or later may experience a slow memory leak which can lead to the Gateway's service restarting to clear the memory.

    Gateway restarts can be disruptive to customer traffic for the 30-45 seconds it takes for the Gateway service to restart. Each time an Operator user runs the debug.py --flow_dump all all all command on the Gateway, the Gateway will leak some of its memory. Running this debug command a sufficient number of times will cause the Gateway's memory usage to reach a critical level and trigger a Gateway service restart to clear the memory.

    For a Gateway without the fix, an Operator must avoid running the debug.py --flow_dump all all all command on the Gateway. If using this debug command is unavoidable, monitor the memory usage and schedule maintenance windows to preemptively restart the service to clear the memory prior to an unscheduled restart.

  • Fixed Issue 83209: For customers using OSPF in their enterprise, OSPF routing may not work as expected.

    The issue occurs when there is a change in the OSPF router-id and the Edge service is restarted. Only loopback interfaces and interfaces with the 'Advertise' flag set are considered for router-id selection. When there is a new loopback interface configured with a higher IP address, upon restarting the Edge service, the new loopback IP address is selected as the router-id and if the Edge is elected as the DR (Designated Router) the issue is seen.

    Without the fix, the only workaround is to force the use of the old Router ID. To bring back the old Router ID, set the Advertise flag on the respective interface (an Edge service restart will be required).

  • Fixed Issue 83402: On a VMware SD-WAN Edge with multiple WAN links, one or more WAN links may stop passing traffic.

    On the WAN link(s) that stop passing traffic, the DHCP acquired address is not renewed and the WAN interface's address is lost. The issue occurs when there are multiple interfaces acquiring IP addresses using DHCP and the DHCP server is in a different network from the client. The outgoing interface of a DHCP renew unicast packet is determined through route lookup. Since there are multiple default routes with different metric values learned through different interfaces, the DHCP request packets might be sent out of a different interface.

    Without the fix, an onsite user would need to unplug and then plug back in the affected WAN link from the Edge to force it to get its IP address again.
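
    The route-lookup behavior described for this issue can be sketched as follows. This is a minimal illustration, not the Edge's implementation: the interface names (GE3, GE4), the metric values, and the rule that the lowest metric wins among default routes are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DefaultRoute:
    interface: str  # interface the default route was learned on (hypothetical names)
    metric: int     # assumed: the lower metric is preferred


def egress_by_route_lookup(routes):
    """Pick the egress interface the way a plain route lookup would:
    the default route with the lowest metric wins."""
    return min(routes, key=lambda r: r.metric).interface


def renew_egress(lease_interface, routes):
    """Return (egress interface, whether it matches the interface holding the lease)."""
    egress = egress_by_route_lookup(routes)
    return egress, egress == lease_interface


routes = [DefaultRoute("GE3", metric=10), DefaultRoute("GE4", metric=20)]
ge3_result = renew_egress("GE3", routes)
ge4_result = renew_egress("GE4", routes)
```

    In this sketch the renew for the lease held on GE4 leaves through GE3, which is the kind of mismatch the issue description refers to.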

  • Fixed Issue 83411: When High Availability is turned on with a newly activated VMware SD-WAN Edge, the HA Edge pair may go offline.

    When HA is turned on, all the Edge interface MAC addresses are changed to Virtual MAC addresses, and during the issue state the DPDK configuration is not updated with these VMAC addresses.  As a result, the heartbeat packets destined for the Orchestrator are dropped due to a destination MAC mismatch, and the Orchestrator marks the HA Edge pair as down.

  • Fixed Issue 83424: An SNMP walk may not work properly for interface and path related OIDs.

    When an snmpbulkwalk command is run for some Edge deployments, the SNMP walk may take too long and time out. The fix for this issue optimizes SNMP and ensures faster responses to SNMP walk requests. However, it should be noted that in rare instances the issue may still occur as the SNMP process remains a lower priority process on the Edge.

  • Fixed Issue 83428: On a customer enterprise using a Hub/Spoke topology, the static tunnels between a VMware Hub Edge and Spoke Edge may stop passing traffic while attempting to measure bandwidth on the tunnel.

    On the Hub Edge, there is no mechanism for handling a scenario where tunnel preference is updated in the middle of the tunnel establishment process. The bandwidth measurement process then puts the tunnel into a stalled state and traffic cannot pass on this tunnel. Customer traffic can reroute through the Gateway, but this may introduce latency into the Hub/Spoke traffic.

  • Fixed Issue 83432: For a site deployed with a High-Availability topology, when additional tunnels are added to the site the VMware SD-WAN Standby Edge may experience a Dataplane Service crash and generate a core.

    A common way tunnels are added is by adding WAN links to the HA Edges.  When the number of tunnels the Standby Edge needs to synchronize with the Active Edge exceeds 80, this triggers an exception and a Dataplane Service failure on the Standby. When this issue is encountered on a conventional HA topology the customer impact would be minimal as the Standby Edge does not pass customer traffic. On an Enhanced HA deployment, where the Standby Edge is also passing traffic, the reboot(s) would disrupt some customer traffic. 

  • Fixed Issue 83611: Customer may observe an unusually high number of EDGE_NEW_DEVICE events from a VMware SD-WAN Hub Edge on the VMware SASE Orchestrator UI.

    The issue can be encountered with the following topology: Client Device – Spoke Edge – Hub Edge – DHCP Server. With this topology, every time a client user behind a Spoke Edge sends a DHCP packet, the Spoke Edge properly triggers an EDGE_NEW_DEVICE event for this client device. But when the Hub Edge receives the DHCP Relay packet, the Hub Edge again triggers an EDGE_NEW_DEVICE event to the Orchestrator, and this event is incorrect.

  • Fixed Issue 83699: When a VMware SD-WAN Gateway is set to quiesced mode from the VMware SASE Orchestrator's New UI, when the user selects a new replacement Gateway, the Orchestrator does not allow any configuration changes to the replacement Gateway.

    This issue happens after activating the Non SD-WAN Destination migration process via the Orchestrator's New UI, part of which is selecting a New Gateway, which is the Gateway replacing the quiesced Gateway. Once that New Gateway is designated as the replacement Gateway, an Operator attempting to make any configuration change to the replacement Gateway would observe an error message similar to: GATEWAY_SERVICE_STATE_INVALID: Cannot change the state of the gateway to null, as it is already used as a replacement gateway.

  • Fixed Issue 83928: A VMware SD-WAN Edge may experience high CPU usage and poor customer traffic performance.

    Users would also be able to observe poor QoE scores when looking at the Orchestrator's Monitor > Edge > QoE screen for that Edge. The issue is caused by an ACL (Access Control List) rule being instantiated multiple times in the Edge. Processing this many ACL rules at once stresses the Edge's CPU capacity and results in the Edge being unable to process customer traffic properly.

  • Fixed Issue 83946: VMware SD-WAN Edge LAN-side clients may observe disruptions in traffic, and for a site using RADIUS authentication, client users may observe authentication failures.

    Large packets will be fragmented and these fragmented packets can be dropped by the Edge. The packets are dropped due to a memory leak during fragment IP identification translation during some error scenarios and if the Edge limit for fragmented packets is exceeded, then further fragmented packets will be dropped by the Edge.

    For customers using RADIUS, this can cause authentication failures when large packets are sent from a wireless client to an Edge using RADIUS authentication. For example, large packets from a wireless LAN controller (WLC) to a RADIUS server may be dropped.

  • Fixed Issue 84106: A VMware SD-WAN Edge may export NetFlow statistics at the incorrect time interval which would cause the receiving systems to be out of sync.

    NetFlow packets can have an additional 5 second delay from the configured interval. This is because the NetFlow exporter checks for export time only once every 5 seconds, and as a result the NetFlow packets can have a delay of 5 seconds between the configured interval and the actual export interval.
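
    The arithmetic behind this delay can be sketched as below. The sketch assumes (hypothetically) that the exporter's check ticks start at time zero and that intervals are whole seconds; under those assumptions the actual export lands on the first 5-second tick at or after the configured interval. The function names are illustrative only.

```python
CHECK_PERIOD_S = 5  # per the text, the exporter checks for export time once every 5 seconds


def actual_export_time(configured_interval_s, check_period_s=CHECK_PERIOD_S):
    """Return the first check tick at or after the configured interval has elapsed."""
    ticks = -(-configured_interval_s // check_period_s)  # integer ceiling division
    return ticks * check_period_s


def export_delay(configured_interval_s, check_period_s=CHECK_PERIOD_S):
    """Seconds between the configured export time and the actual export time."""
    return actual_export_time(configured_interval_s, check_period_s) - configured_interval_s


delay_for_61s = export_delay(61)
```

    With a configured interval of 61 seconds the export happens at the 65-second tick, a 4-second delay; an interval that is an exact multiple of 5 seconds sees no added delay in this model.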

  • Fixed Issue 84359: When a VMware SD-WAN Edge interface flaps, it is possible that multiple IPv4 addresses can be assigned to it.

    When an interface configured with a DHCP client flaps (goes down and up in rapid succession), the entire DHCP client process is carried out again and there could be scenarios where a different IP address is acquired each time. In this case, the older IP address is not cleared and becomes stale.

    Without this fix, the only way to remediate the issue is for a user to manually delete the IP addresses from the interface through the Linux shell using the ip address del command.

  • Fixed Issue 84501: For a customer enterprise using 802.1x authentication (for example, RADIUS, Cisco ISE), when the VMware SD-WAN Edges are upgraded to Release 4.3.1 or later, client devices connected to the Edge may fail authentication against the Network Access Server (NAS) that is hosted over the WAN.

    The NAS IP address is set as a loopback IP address by default in the RADIUS or ISE packets sent from the Edge (Authenticator) to the RADIUS or ISE Server and this can cause the authentication packets to not reach the NAS, causing the authentication failure. To remediate the issue, builds with this fix set the NAS IP address as the source interface IP address selected and configured with 802.1x Authentication settings. If 'Auto' is selected as source interface, the loopback IP will be set as NAS IP address by default.
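
    The selection rule described for builds with the fix can be sketched as follows. This is only an illustration of the stated behavior: the function name, the interface names, and the example addresses are hypothetical and do not come from the product.

```python
def select_nas_ip(source_interface, interface_ips, loopback_ip):
    """Per the text: 'Auto' keeps the loopback IP as the NAS IP address;
    otherwise the IP of the selected source interface is used."""
    if source_interface == "Auto":
        return loopback_ip
    return interface_ips[source_interface]


interface_ips = {"GE1": "192.0.2.10", "GE2": "198.51.100.10"}  # hypothetical addresses
nas_ip_auto = select_nas_ip("Auto", interface_ips, loopback_ip="10.255.0.1")
nas_ip_ge1 = select_nas_ip("GE1", interface_ips, loopback_ip="10.255.0.1")
```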

  • Fixed Issue 84825: When a large bulk routing configuration is applied on a VMware SD-WAN Edge in a single step, the Edge may experience repeated Dataplane Service failures resulting in repeated restarts of the Edge service to recover from each failure.

    When a standalone (non-HA) Edge encounters this issue, there is significant impact to customer traffic: while a single Edge service restart disrupts traffic for ~15 seconds, repeated Edge service restarts would result in disruptions of ~60 seconds or more. On a site with a High-Availability topology, the customer would observe repeated failovers resulting from the Edge service restarts, which would also disrupt customer traffic.

    This issue occurs when a bulk routing configuration involving a large number of neighbors and route-maps is applied on an Edge in a single step. The Edge system faces great stress while converting these configurations into command specifications and applying them on routing protocols in a short span of time and this causes the repeated Edge service failures and restarts.

    On an Edge build without the fix, to mitigate the risk of this issue a customer user would need to do the following:

    • Instead of applying a large configuration in a single step, the configuration should be broken into multiple smaller sections with each section applied separately.

    • The number of routing filters should be minimized.

    • The Edge should only be deliberately restarted in a maintenance window and Edge service restarts should be generally avoided if there are a number of routing filters configured, as the entire Edge configuration is applied at once during restart which would greatly increase the risk of encountering this issue.
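
    The first mitigation, breaking a large configuration into smaller sections, can be sketched as a simple batching helper. This is a minimal sketch: the neighbor names and the batch size are hypothetical, and how each section is actually applied (through the Orchestrator UI or otherwise) is outside the sketch.

```python
def split_into_batches(items, batch_size):
    """Yield consecutive slices of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


neighbors = [f"neighbor-{n}" for n in range(1, 8)]  # hypothetical configuration entries
batches = list(split_into_batches(neighbors, batch_size=3))
```

    Each batch would then be applied as its own configuration step, rather than applying all seven entries at once.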

  • Fixed Issue 84847: Customers deploying a USB-based LTE modem on a VMware SD-WAN Edge, or deploying a VMware SD-WAN Edge LTE model (510-LTE or 610-LTE) may experience intermittent issues with building tunnels from the CELL interface after the modem is reset.

    The issue occurs when the LTE modem is reset in one of the following scenarios:

    • On an Edge using a USB modem, by removing the modem from the USB port and plugging it back in.

    • On an Edge-LTE, after an Edge reboot or by resetting the CELL1 interface via Test & Troubleshoot > Remote Diagnostics > Reset USB Modem > CELL1.

    In either scenario the underlying network device changes from wwan0 to wwan1 and the Edge does not honor this new name because it appears to be a duplicate interface.

    Without the fix the workaround to restore the LTE interface is to restart the Edge Service through Remote Actions > Restart Service.

  • Fixed Issue 85369: For a site deployed with a High-Availability topology, the customer may observe customer traffic disruptions and possibly multiple reboots of the VMware SD-WAN Standby Edge.

    Multiple threads on the HA Edges are becoming suspended leading to various issues in HA, including but not limited to an HA Active-Active state. If the site does become Active-Active, a conventional HA setup would experience minimal traffic disruption since the Standby Edge does not pass traffic in this topology, but on an Enhanced HA deployment, where the Standby Edge is also passing traffic, the reboot(s) would disrupt some customer traffic. The other way multiple thread suspension can impact a customer is through path disruption which would also be observed as customer traffic disruption. Thus a customer HA site could encounter this issue without necessarily seeing the signs of an Active-Active scenario where the Standby Edge reboots.

    The root cause for multiple HA Edge threads getting suspended remains under investigation.

  • Fixed Issue 85375: Customers using either USB-based LTE modems on a VMware SD-WAN Edge or VMware SD-WAN Edge LTE models (510-LTE or 610-LTE) may experience disruptions to LTE traffic.

    A user would observe RX errors in the Edge logs which increment without any traffic passing through the LTE interface. The issue occurs only if the MTU for the LTE link is less than 1500.

  • Fixed Issue 85459: An attempt to SSH either from an Edge LAN-side client to an Edge, or from a remote branch Edge client to an Edge may not work after LAN side NAT rules are configured.

    SSH reply packets coming from the Edge's SSH process go through the Edge's dataplane service. Since LAN side NAT rules are configured, it is possible for the SSH reply packets to match those rules and go to a different destination than the original client that generated the SSH traffic, which causes an SSH attempt to an Edge to not work.

    On an Edge which lacks the fix, the only workaround is to remove the NAT rule.

  • Fixed Issue 86032: When upgrading a VMware SD-WAN Gateway from Release 4.3.x to 4.5.1 or 5.0.0, the Gateway will drop communication with the VMware SASE Orchestrator and eth0 and eth1 interfaces are removed.

    The core issue is that the Gateway's dataplane process stops after the upgrade. This is caused by the Telegraf service failing to start: since the Telegraf service is activated as part of the Gateway startup script, if Telegraf fails to start, the Gateway's service fails to start as well.

    If a Gateway is upgraded to a build without this fix, the only way to remediate the issue is to run vc_procmon restart for the Gateway along with a Telegraf service restart.

  • Fixed Issue 86103: For a customer enterprise that uses RADIUS authentication, client users at some sites may be unable to connect to VMware SD-WAN Edges and pass traffic.

    The issue is caused by the Edge incorrectly categorizing fragmented RADIUS packets with the DF (Don't Fragment) bit set in the IP header as non-fragmented. One or more of these packets fails to reach multiple Edges with the result that traffic that relies on RADIUS authentication will not pass for those Edges. This issue can occur in any topology including Hub/Spoke and simple Branch-to-Branch.

    Without the fix the only workaround is to configure the RADIUS server to not set the DF bit in the IP header while sending fragmented packets.
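
    For reference, whether an IPv4 packet is a fragment is determined by the More Fragments (MF) flag and the fragment offset, independently of the DF bit. The sketch below shows that check on a raw IPv4 header. It is illustrative only and is not the Edge's classification code; build_header is a hypothetical helper used just to construct sample input.

```python
import struct

DF_BIT = 0x4000       # Don't Fragment flag
MF_BIT = 0x2000       # More Fragments flag
OFFSET_MASK = 0x1FFF  # 13-bit fragment offset


def fragment_info(ipv4_header: bytes):
    """Return (is_fragment, df_set) from the flags/fragment-offset field,
    which occupies bytes 6-7 of the IPv4 header."""
    if len(ipv4_header) < 20:
        raise ValueError("IPv4 header must be at least 20 bytes")
    (field,) = struct.unpack("!H", ipv4_header[6:8])
    is_fragment = bool(field & MF_BIT) or (field & OFFSET_MASK) != 0
    return is_fragment, bool(field & DF_BIT)


def build_header(flags_and_offset: int) -> bytes:
    """Build a 20-byte header with only the flags/offset field populated (sample input)."""
    return bytes(6) + struct.pack("!H", flags_and_offset) + bytes(12)


df_and_mf = fragment_info(build_header(DF_BIT | MF_BIT))
```

    A packet with both DF and MF set is still reported as a fragment here, which is the case the issue description says was being miscategorized.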

  • Fixed Issue 86314: A VMware SD-WAN Edge may perform an incorrect Stateful Firewall rule lookup when a LAN-side NAT flow is initiated by a remote peer.

    When a user configures a LAN-side source NAT (for example, to hide an internal IP subnet behind the Edge) on an Edge where the Stateful Firewall is in use, and a flow is initiated by a remote peer, an erroneous firewall lookup is done for the first return packet.

    For example, suppose that an Edge has the following configuration:

    LAN-side NAT: [source] inside address: 10.0.2.25/32 outside address: 7.0.2.25/32

    Static route: 7.0.2.25/32 [advertise] next hop: 10.0.2.1

    A remote client sending a ping to 7.0.2.25 from 10.0.1.25 would result in two firewall rule lookups on the Edge. The first incoming packet would result in a firewall lookup for 10.0.1.25 > 7.0.2.25, and then the first return packet would result in a firewall rule lookup using the non-NAT IP for 10.0.2.25 > 10.0.1.25. This second firewall rule lookup is done in error.

    Without this fix, the user would need to create an additional firewall rule to allow the first return packet of the flow.
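
    The two lookup keys from the example can be compared with a short sketch. It only illustrates why the lookup using the non-NAT address does not correspond to the flow the firewall already evaluated; it does not describe how the fix is implemented, and the function names are hypothetical.

```python
NAT_INSIDE_TO_OUTSIDE = {"10.0.2.25": "7.0.2.25"}  # LAN-side source NAT from the example


def return_lookup_key(src, dst, apply_nat):
    """Build the (src, dst) lookup key for a return packet leaving the LAN host,
    optionally translating the source through the LAN-side NAT mapping."""
    if apply_nat:
        src = NAT_INSIDE_TO_OUTSIDE.get(src, src)
    return (src, dst)


def is_reverse_of(key, original):
    """True when key is the mirror image of the original flow's key."""
    return key == (original[1], original[0])


original_flow = ("10.0.1.25", "7.0.2.25")  # first incoming packet in the example
untranslated = return_lookup_key("10.0.2.25", "10.0.1.25", apply_nat=False)
translated = return_lookup_key("10.0.2.25", "10.0.1.25", apply_nat=True)
```

    The untranslated key (10.0.2.25 > 10.0.1.25) is not the reverse of the original flow, which is why an additional firewall rule was needed as a workaround.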

  • Fixed Issue 86617: A customer enterprise that uses a Loopback IP Address with Partner Gateways where BGP is configured may observe that traffic that should use the Loopback IP routes is getting dropped with a resulting disruption in that customer traffic.

    The Loopback IP Address routes are missing on the VMware SD-WAN Edge. This is caused by a scenario where BGP is configured for the Edge and Partner Gateway and a Loopback IP Address is sent over BGP to the Edge, but the Edge does not learn the Loopback IP route.

  • Fixed Issue 86740: When running the Remote Diagnostic "Interface Status", a GPON-type SFP module will not show up when it is deployed in a VMware SD-WAN Edge's SFP2 interface.

    The issue is caused by a flaw in the remote diagnostic back-end script which runs on the Edge and does not properly account for the Edge's SFP2 interface.

  • Fixed Issue 86808: Some BGP routes are advertised when they should not be as per BGP filters (or not advertised when they should be).

    For a given route-map rule, the Edge could either have a prefix-list or a community-list configuration for the Edge's routing based on the rule match-type. However, for route-map unapply functions, the Edge is trying to delete both the prefix-list and community-list for each rule, one of which must be non-existent.

    Previously, this did not cause any issues as the command for each non-existent prefix-list or community-list was sent to the Edge's routing process as a separate vtysh command, which would just end up being a no-op and would not impact other commands. At that time, this was a deliberate choice as it kept things simple in the route-map unapply functions.

    However, as part of the fix for Issue #84825, the Edge started batching multiple prefix-list/community-list removal vtysh commands together to be sent to the Edge's routing process. Now, trying to delete a non-existent prefix-list/community-list causes the whole command batch to fail and leaves stale prefix-list/community-list configuration in the Edge's routing process.

    On an Edge without a fix for this issue, a user needs to restart the Edge Service to ensure all BGP routes are properly advertised.
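
    The difference between sending each removal separately and sending them as one batch can be sketched with a toy model. The RoutingProcess class below is a hypothetical stand-in, not the Edge's routing process, and the all-or-nothing batch behavior is an assumption taken from the description above.

```python
class RoutingProcess:
    """Toy stand-in for the routing process; not the Edge's actual implementation."""

    def __init__(self, lists):
        self.lists = set(lists)

    def delete(self, name):
        """Delete one list; a missing name is a harmless no-op (returns False)."""
        if name in self.lists:
            self.lists.remove(name)
            return True
        return False

    def delete_batch(self, names):
        """Assumed batch semantics: any missing name rejects the whole batch."""
        if any(name not in self.lists for name in names):
            return False
        self.lists -= set(names)
        return True


removals = ["prefix-list-A", "community-list-A"]  # only the prefix-list exists

separate = RoutingProcess({"prefix-list-A"})
for name in removals:
    separate.delete(name)

batched = RoutingProcess({"prefix-list-A"})
batch_ok = batched.delete_batch(removals)
```

    With separate commands the existing prefix-list is removed and the missing community-list is ignored; with the batch, nothing is removed and the prefix-list is left behind as stale configuration.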

  • Fixed Issue 87304: If a user deactivates a LAN interface on a VMware SD-WAN Edge using the VMware SASE Orchestrator UI, the interface will still be reported as 'UP' by SNMP.

    The output of the key debug process for interfaces does not include the physical port details for Edge LAN interfaces (for example, GE1 or GE2). As a result, when SNMP polls those interfaces it always returns a result of UP regardless of how these interfaces are configured.

  • Fixed Issue 87552: When a VMware SD-WAN Edge is configured to use either a Non SD-WAN Destination (NSD) via Edge or a Cloud Security Service (CSS), the VMware SD-WAN Edge may periodically experience a Dataplane Service failure and restart when NSD or CSS tunnels are unstable.

    When the Edge tears down a tunnel to either an NSD or a CSS (IPsec or GRE), the incorrect release of a previously chosen tunnel is performed that triggers an exception in the Edge Dataplane Service and an Edge Service restart to restore the service. Restarting the Edge service will result in a 10-15 second disruption of customer traffic.

    On an Edge without a fix for this issue, associating an NSD via Edge or CSS to one WAN link will decrease the likelihood of this issue occurring. In other words, instead of configuring an NSD or CSS on multiple WAN links, choose one WAN link only. This removes redundancy but mitigates the impact of this issue.

  • Fixed Issue 87612: For a VMware SD-WAN Edge with VNF Insertion on one or more VLANs, client users on those VLANs are unable to obtain IP addresses from a DHCP Relay server.

    The Edge is not forwarding the DHCP relay packets and thus the client users are not receiving IP addresses.

    Without the fix, the only workaround is to disable VNF Insertion on the VLAN.

  • Fixed Issue 87982: A VMware SD-WAN Edge using a Metanoia-type SFP module with a private PPPoE WAN link may be unable to establish BGP peering and connect to other sites.

    VLAN tagged packets using a private PPPoE link are corrupted by the Edge and never reach their destination as a result. This issue does not affect public PPPoE links.

  • Fixed Issue 88757: A user running the Remote Diagnostic > Route Table Dump on the Orchestrator UI may find the attempt times out and the page returns no result.

    The Route Table Dump diagnostic times out because the WebSocket timeout is 30 seconds and for a site with a large number of routes the amount of time the debug command takes to deliver all the routes to the Orchestrator may exceed that. The fix lowers the timeout of the route dump process to less than 30 seconds and prevents the WebSocket from timing out before then, which ensures that the Route Table Dump will return a result.

  • Fixed Issue 88796: When deploying either a VMware SASE Orchestrator or a VMware SD-WAN Gateway and using an OVA on vSphere, the OVF properties set as part of the deployment (password, network information, etc.) are not applied to the image and the system cannot be accessed after deployment.

    This only affects new systems deployed from OVA using OVF/vApp properties (versus using ISO files). This issue is caused by upstream changes to cloud-init in recent updates.

    On a Gateway without the fix, the workaround is for the Operator to deploy the system using a cloud-init user-data ISO file.

    Note:

    This open ticket applies only to Gateway builds.  The Orchestrator issue is fixed with the Release R5002-20220517-GA build and later.

  • Fixed Issue 89217: A VMware SD-WAN Edge in the 6x0 model line (610, 610N, 610-LTE, 620, 620N, 640, 640N, 680, 680N) may suddenly power off for no reason.

    The 6x0 Edge would have all lights off, both the front status LED and the rear Ethernet port lights, and can only be recovered by manually power cycling the Edge.

    The cause of the issue is traced to a PIC microcontroller exclusive to the Edge 6x0 line which uses a PIC firmware version of v20M or earlier (v20L, v20K, v20J). This issue can only occur when the 6x0 Edge uses a PIC version of v20M or earlier, but even with this version the odds of experiencing the power off issue are low (approximately 1/1,000). The issue cannot occur on a 6x0 Edge with a PIC firmware version of v20N or later.
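
    The version rule above can be expressed as a small check. This sketch assumes that PIC versions always take the form "v20" followed by a single letter and that they order alphabetically by that letter, which matches the versions named in the text but is not otherwise confirmed; the function name is hypothetical.

```python
FIRST_UNAFFECTED = "v20N"  # per the text, v20N or later cannot hit the issue


def pic_is_affected(version):
    """Assumes versions share the 'v20' prefix and order by their trailing letter."""
    if not version.startswith("v20") or len(version) != 4:
        raise ValueError(f"unexpected PIC version format: {version!r}")
    return version[3].upper() < FIRST_UNAFFECTED[3]


affected_versions = [v for v in ("v20J", "v20K", "v20L", "v20M", "v20N") if pic_is_affected(v)]
```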

    Note:

    A 6x0 Edge's Firmware including PIC version can be determined on an Orchestrator using 5.x by going to the Monitor > Edge > Overview page for that Edge and clicking the dropdown information box next to the Edge name which includes the Edge Information, Device Version, and the Device Firmware.

    The issue is resolved by upgrading the 6x0 Edge to Platform Firmware 1.3.1 (R131-20221216-GA), which includes PIC version v20N. To do this the 6x0 Edge must be connected to a VMware SASE Orchestrator using Release 5.x (5.0.0 or later), and the 6x0 Edge must first be upgraded to Edge build R5014-20230713-GA. Once the 6x0 Edge is upgraded to R5014-20230713-GA, the user would then update the 6x0 Edge Platform Firmware to version R131-20221216-GA in the same way that an Edge's software version is modified.

    For more information and a step-by-step guide to upgrading a 6x0 Edge to Platform Firmware 1.3.1, see the KB Article: VMware SD-WAN 6X0 model Edges may power off with no LEDs and require a power cycle to come back to a working state (88970). This KB article was updated on April 4th, 2023 to reflect the new Edge and Platform Software needed to resolve the issue.

    For information on uploading a Platform Firmware bundle to an Orchestrator, consult the Platform Firmware and Factory Images with New Orchestrator UI section of the VMware SD-WAN Operator Guide.

    For information on updating a 6x0 Edge’s Platform Firmware, consult the View or Modify Edge Information section of the VMware SD-WAN Administration Guide.

    Note:

    If the user prefers to keep the Edge on a lower software release (for example, Release 4.3.1, or 4.5.1), the customer can temporarily upgrade the Edge to R5014-20230713-GA, perform the Platform Firmware upgrade to version 1.3.1 (R131-20221216-GA) so that the PIC version is v20N, and then downgrade the Edge’s software back to their preferred version. Downgrading the 6x0 Edge's software to an earlier version does not also downgrade the Edge's Platform Firmware and the Edge would continue to use Platform Firmware version 1.3.1. In this use case the customer Edges would need to be on an Orchestrator using Release 5.x.

    Note:

    If the 6x0 Edge is on an Orchestrator that does not use version 5.x and has experienced this issue and requires an update of its PIC firmware, the customer may reach out to VMware SD-WAN Support and they will manually update the Edge’s PIC version.

  • Fixed Issue 89873: A user may observe an increase in memory utilization on a VMware SD-WAN Edge resulting in a Memory Usage Warning Event on the Orchestrator and potentially an unscheduled Edge Service restart to recover the Edge's memory.

    This issue occurs when UDP flows with unique IP addresses and ports are processed at a high rate on the Edge. Flow creation is handled asynchronously on the Edge and when multiple packets of the same flow are enqueued to the flow creation service, the flow objects are leaked and result in an Edge memory leak. The impact is more commonly observed on entry level Edge models (for example, the 510, 610, or 620) which have smaller amounts of Edge memory, but over a long enough period every Edge model could reach a critical memory level (60% memory utilization for longer than 90 seconds) and restart. An unplanned Edge Service restart to clear the memory can cause a brief disruption in customer traffic.

    Without a fix for this issue, the only way to prevent this issue impacting a customer site is to monitor the memory. When memory utilization reaches 40% and the Orchestrator records a Memory Warning Event, schedule an Edge Service restart in a maintenance window to clear the memory and ensure minimal customer impact.
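
    The two thresholds mentioned for this issue (a warning at 40% and a critical level of 60% held for longer than 90 seconds) can be sketched as a simple classifier over memory samples. This is a monitoring sketch only: where the samples come from, and whether the product treats the thresholds as "at or above" or strictly "above", are assumptions.

```python
WARNING_PCT = 40      # Memory Warning Event threshold from the text
CRITICAL_PCT = 60     # critical memory level from the text
CRITICAL_HOLD_S = 90  # critical level must persist longer than this


def classify(samples):
    """samples: list of (timestamp_s, memory_pct) in time order.
    Returns 'critical', 'warning', or 'ok' for the latest sample."""
    if not samples:
        return "ok"
    latest_t, latest_pct = samples[-1]
    # Walk backwards to find when the current run at or above critical began.
    run_start = None
    for t, pct in reversed(samples):
        if pct < CRITICAL_PCT:
            break
        run_start = t
    if run_start is not None and latest_t - run_start > CRITICAL_HOLD_S:
        return "critical"
    if latest_pct >= WARNING_PCT:
        return "warning"
    return "ok"


state = classify([(0, 30), (30, 45)])
```

    A result of 'warning' is the point at which the text suggests scheduling a restart in a maintenance window, before the 'critical' condition triggers an unscheduled one.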

  • Fixed Issue 90151: For BGP over IPsec on Gateway, applying different BGP filters to primary and secondary neighbors may not work as expected.

    When different filters are applied to a Non SD-WAN Destination (NSD)-BGP on the VMware SD-WAN Gateway's primary and secondary neighbors, only one gets applied to both BGP neighbors.

    The cause of this issue is that for Partner Gateway (PG)-BGP, the SD-WAN service identifies BGP filters using a combination of enterprise_logical_id and segment_id. This was sufficient for Partner Gateway-BGP because, for a given enterprise/segment combination, SD-WAN can only have 1 PG-BGP neighbor.

    However, this method was inherited for NSD-BGP on the Gateway where there can be up to 2 BGP neighbors (Primary and Secondary) for the same enterprise-segment combination. As a result, the enterprise_logical_id and segment_id combination does not suffice for differentiating between filters of 2 different NSD-BGP neighbors.
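
    The key collision can be illustrated with a dictionary keyed the same way. The field names enterprise_logical_id and segment_id come from the text; the "role" field, the sample values, and the idea of widening the key with it are hypothetical and are not a description of the actual fix.

```python
def store_filters(neighbors, key_fn):
    """Store each neighbor's filter under key_fn(neighbor); later entries overwrite earlier ones."""
    table = {}
    for neighbor in neighbors:
        table[key_fn(neighbor)] = neighbor["filter"]
    return table


def legacy_key(neighbor):
    """Key inherited from PG-BGP: enterprise and segment only."""
    return (neighbor["enterprise_logical_id"], neighbor["segment_id"])


def wider_key(neighbor):
    """Hypothetical key that also distinguishes primary from secondary."""
    return (neighbor["enterprise_logical_id"], neighbor["segment_id"], neighbor["role"])


neighbors = [
    {"enterprise_logical_id": "ent-1", "segment_id": 0, "role": "primary", "filter": "filter-A"},
    {"enterprise_logical_id": "ent-1", "segment_id": 0, "role": "secondary", "filter": "filter-B"},
]

legacy = store_filters(neighbors, legacy_key)
wider = store_filters(neighbors, wider_key)
```

    With the legacy key both neighbors map to the same entry, so only one filter survives and is effectively applied to both neighbors.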

  • Fixed Issue 90283: A customer may experience poor audio and/or video quality for VoIP and videotelephony calls if Underlay Accounting is turned on for the WAN link being used on the VMware SD-WAN Edge.

    When checking the logs, the user would observe packets for bidirectional traffic where the traffic is asymmetrically routed, and one of the routes is via the underlay. In other words, when the routes for a flow are asymmetric such that in one direction the traffic takes an underlay route and in the reverse direction it takes an overlay path and where Underlay Accounting is toggled on for that WAN link, packet loss may be experienced on bidirectional flows which are typical of, but not limited to, VoIP and videotelephony calls.

  • Fixed Issue 90876: DNS fails on a Non-Global Segment for a one-hop away client who is connected to a VMware SD-WAN Edge either by a LAN interface or a routed sub-interface without a Gateway IP.

    The cause of the issue differs depending on which Edge interface type the one-hop away client is using:

    • If the one-hop away client is connected to an Edge via a LAN port, DNS resolution fails for the Non-Global Segment because the Edge routes the reply packet to the client via the VCE1 interface and the Edge process treats it as belonging to the Global Segment. As a result, the reply packet is dropped as a static route is available in the Global Segment routing table.

    • If the one-hop away client is connected to an Edge via a routed sub-interface port which does not have a Gateway IP address and a static route for the client on Orchestrator, then DNS resolution fails for the client as the Edge does not have a route for the client. The Edge matches the connected route and sends an ARP request for the destination IP itself; the ARP fails and the reply is not sent.

    For an Edge that does not have a fix for this issue, the workaround for a client using an Edge's LAN is to only use the Global Segment.  For a client using a routed sub-interface, the workaround is to provide a Gateway IP address and, if that is not possible, only use the Global Segment.

  • Fixed Issue 91875: For a customer who has configured a WAN link as a Backup on a VMware SD-WAN Edge, they may observe the backup WAN link becoming active intermittently even though the conditions requiring the link to become active are not present.

    The issue is caused by a race condition on an Edge process that leads the Edge to erroneously conclude the backup WAN link is needed and build a tunnel for that link. The Edge has no failsafe for detecting and tearing down this erroneous tunnel.

  • Fixed Issue 93052: Client users behind a VMware SD-WAN Edge may observe degraded traffic quality including high latency and slow throughput speeds.

    The immediate cause of the issue is a Path FSM (Finite State Machine) thread running with 100% Edge CPU usage. When an Edge CPU is running at 100% this will result in degraded path quality.

    The Path FSM thread maxes out the Edge CPU because unreliable counter values lead the thread to conclude that there are more messages in the queue it serves when there are actually none. As a result, the thread runs all the time without sleeping. The fix adds an API which checks the actual queue data structures to determine the state of the queue.
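
    The difference between trusting a counter and inspecting the queue itself can be sketched with a toy queue. The class and function names below are hypothetical and the model is deliberately simplified; it is not the Edge's Path FSM code.

```python
from collections import deque


class MessageQueue:
    """Toy queue with a separately maintained counter that can drift from reality."""

    def __init__(self):
        self.items = deque()
        self.pending_counter = 0  # may become unreliable

    def has_work_by_counter(self):
        return self.pending_counter > 0

    def has_work_by_inspection(self):
        return len(self.items) > 0


def should_sleep(queue, trust_counter):
    """A worker thread sleeps only when it believes the queue is empty."""
    if trust_counter:
        return not queue.has_work_by_counter()
    return not queue.has_work_by_inspection()


q = MessageQueue()
q.pending_counter = 3  # simulated drift: the counter claims work, the queue is empty

sleeps_with_counter = should_sleep(q, trust_counter=True)
sleeps_with_inspection = should_sleep(q, trust_counter=False)
```

    When the drifted counter is trusted the worker never sleeps and spins on an empty queue; checking the queue's actual contents lets it sleep.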

  • Fixed Issue 93383: A VMware SD-WAN Edge may suffer one or more Dataplane Service failures with a disruption in customer traffic.

    The issue is caused by a rare instance of a mismatch of the number of interfaces stored in the Edge in two different data structures which triggers an exception and results in the Edge service failing one or more times. The Edge service needs to restart to recover which, in a non-HA deployment, would cause a 10-15 second disruption of customer traffic for each restart. However, if the Edge service fails three consecutive times, the Edge will require a reboot or power cycle to recover.

Orchestrator Resolved Issues

Resolved in Orchestrator Version R5017-20231111-GA

Orchestrator build R5017-20231111-GA was released on 11-14-2023 and is the 7th Orchestrator rollup for Release 5.0.1.

This Orchestrator rollup build addresses the below issues since the 6th Orchestrator rollup build, R5016-20230801-GA.

  • Fixed Issue 102121: When an Edge's Secure Access configuration is updated multiple times without any updates to the Edge's Firewall configuration while using the Orchestrator UI, the Secure Access configuration updates may stop being sent to the Edges.

    The issue has been encountered most often in testing where Engineering deliberately forces a large number of Secure Access updates without any Firewall updates. However it can, in rare instances, be triggered by a customer in the field.

    If experiencing this issue on an Orchestrator without a fix, a user can manually update the Edge's Firewall configuration from the UI just once. After the manual Firewall update, the user can redo the Edge Secure Access configuration change from the UI and the Orchestrator would push the Edge Secure Access configuration change from the UI to the Edges.

  • Fixed Issue 116531: If a user attempts to generate a report on the VMware SASE Orchestrator where at least one Edge description includes a comma (,) the report may not format properly.

    When an Edge description includes a comma, the Orchestrator's report service treats the comma as a column separator and breaks out the text after each comma into the report's next column, instead of containing the entire text string in the Edge Description column as expected.

    So instead of Bytes Transmitted having the values associated with it, it shows the text after the first comma, Bytes Received would have the text after the second comma (if there was one), and so forth. The report would still include the data for Bytes Transmitted and Bytes Received, it would just be pushed farther to the right and not be aligned to the correct columns.

    On an Orchestrator without a fix for this issue, the user would need to ensure the Edge description used no commas.

    Note:

    The fix for this issue was first listed in Orchestrator Release 5.0.1.6. However, that fix did not resolve every instance of the issue, as exported reports that include Application names containing commas are still impacted. The issue is regarded as fully resolved in 5.0.1.7, where Application names are also fixed.
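
    The column shift described in this issue is the typical result of joining fields with commas without quoting them. The following is a minimal sketch, not the Orchestrator's report code: the column names and row values are hypothetical. It shows how Python's csv module quotes a field that contains a comma so that the remaining columns stay aligned.

```python
import csv
import io

# Hypothetical report row; column names and values are illustrative only.
header = ["Edge Name", "Edge Description", "Bytes Transmitted", "Bytes Received"]
row = ["edge-1", "Branch office, 2nd floor", 1024, 2048]

# Naive join: the comma inside the description produces an extra column.
naive_line = ",".join(str(value) for value in row)
naive_fields = naive_line.split(",")

# csv.writer quotes any field that contains the delimiter.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(header)
writer.writerow(row)

# Reading the quoted output back yields the original four columns.
buffer.seek(0)
parsed = list(csv.reader(buffer))
```

    In this sketch the naive join yields five fields for a four-column header, while the quoted output parses back into four fields with the description intact.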

  • Fixed Issue 131789: When configuring Single Sign On (SSO) for an organization, even though the role information is present in the Identity Provider (IdP) response, the users cannot log in to the Orchestrator.

    The Orchestrator cannot match the role of a user logging in via Single Sign On (SSO) if the IdP sends role information in a nested JSON structure. Starting in version 5.0.1.7, the Orchestrator can reference and match the role of an SSO user even if it is present in a nested JSON structure.

    If experiencing this issue on an Orchestrator without a fix for this issue, the workaround is to configure the IdP to send the role detail in the immediate level, and not in the nested structure.
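
    The difference between a role at the immediate level and a role in a nested structure can be sketched as follows. This is an illustration only: the attribute names (attributes, vco, role) and the dot-separated path syntax are assumptions, and the function is not the Orchestrator's implementation.

```python
from typing import Any, Optional


def find_role(payload: dict, path: str) -> Optional[Any]:
    """Walk a dot-separated path through nested dicts; return None if a key is missing."""
    current: Any = payload
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


# Hypothetical IdP responses: role at the immediate level versus nested.
flat_response = {"user": "alice", "role": "enterprise-admin"}
nested_response = {"user": "alice", "attributes": {"vco": {"role": "enterprise-admin"}}}

flat_role = find_role(flat_response, "role")
nested_role = find_role(nested_response, "attributes.vco.role")
missed_role = find_role(nested_response, "role")  # top-level lookup only
```

    A lookup that only inspects the immediate level (missed_role) finds nothing in the nested response, which corresponds to the failed login described above.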

Resolved in Orchestrator Version R5016-20230801-GA

Orchestrator build R5016-20230801-GA was released on 08-02-2023 and is the 6th Orchestrator rollup for Release 5.0.1.

This Orchestrator rollup build addresses the below critical issues since the 5th Orchestrator rollup build, R5015-20230628-GA.

  • Fixed Issue 64145: A customer may not be able to successfully configure Partner Gateway handoffs on the VMware SASE Orchestrator.

    As part of an earlier change to the Orchestrator, the handoff configurations under the "gatewayList" key can be unintentionally deleted as a byproduct of calling the "updateConfigurationModule" if the Partner Gateway was first deployed on an older Orchestrator build and had a legacy key inside of its handoff configuration that the Orchestrator UI no longer uses.

  • Fixed Issue 122271: When a customer adds additional LAN-side NAT rules to a profile using the VMware SASE Orchestrator's New UI, they may observe that all traffic matching these rules fails for the Edges using the profile.

    The New UI incorrectly calculates the LAN-side NAT outside mask from the inside address prefix. When the rules are not written such that the inside and outside prefix are the same (in other words: 1:1) the behavior of the rules changes and can become nonfunctional if a user modifies any LAN Side NAT rule from the New UI.

    On an Orchestrator without a fix for this issue, the user should edit LAN-side NAT rules using the Orchestrator's Classic UI.
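
    A short sketch of why deriving the outside mask from the inside prefix changes a rule that is not 1:1. The addresses and prefix lengths below are hypothetical, and the code only illustrates the arithmetic with Python's ipaddress module; it is not the New UI's logic.

```python
import ipaddress


def outside_network(outside_address: str, prefix_length: int) -> ipaddress.IPv4Network:
    """Build the outside network from an address and a prefix length."""
    # strict=False masks off host bits instead of raising ValueError.
    return ipaddress.ip_network(f"{outside_address}/{prefix_length}", strict=False)


# Hypothetical rule where the inside and outside prefixes differ.
inside_prefix = 24
configured_outside_prefix = 32

as_configured = outside_network("192.0.2.10", configured_outside_prefix)
derived_from_inside = outside_network("192.0.2.10", inside_prefix)
```

    With these example values the configured rule covers the single address 192.0.2.10/32, while the mask derived from the inside prefix widens it to 192.0.2.0/24.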

Resolved in Orchestrator Version R5015-20230628-GA

Orchestrator build R5015-20230628-GA was released on 06-29-2023 and is the 5th Orchestrator rollup for Release 5.0.1.

This Orchestrator rollup build addresses the below critical issues since the 4th Orchestrator rollup build, R5014-20230408-GA.

  • Fixed Issue 109710: A Partner Administrator may not be able to configure a Partner Hand-Off on the VMware SASE Orchestrator.

    Partner Administrator users receive a "cannot set property v6Detail of undefined" error when configuring static routes on a Partner Hand-Off that is configured at the per Gateway level. Operator users can make the change without a problem.

  • Fixed Issue 112605: A customer may observe that when attempting to assign a Hub Cluster through the Configure > Device > Cloud VPN > Edge to SD-WAN Sites, the profile is unresponsive, and the configuration cannot be saved.

    The Orchestrator creates duplicate configuration associations where there are multiple backhaul Business Policies, and the duplicate references trigger the failure to configure and assign Hub Clusters.

  • Fixed Issue 114291: When using the Orchestrator's New UI, if a Cloud Security Service (CSS) is configured on a profile, the user cannot switch between segments and there is no opportunity to save Device Settings after a CSS is changed on several different segments.

    This issue is only observed on the New UI and not when using the Classic UI.

  • Fixed Issue 114475: When an Operator attempts to upgrade a VMware SASE Orchestrator from Release 4.2.0 to 5.1.0, the Orchestrator may report that the upgrade failed.

    In the logs the Operator would observe this entry: Error while initializing CWS Server service Error: Too many connections. This issue is triggered by MySQL restarting before vco-db-schema is installed, which is caused by MySQL not setting the maximum number of connections. In addition, while the Orchestrator reports that the installation failed, in reality the installation did finish, and the Operator can restart the Orchestrator and all services will work as expected.

Resolved in Orchestrator Version R5014-20230408-GA

Orchestrator build R5014-20230408-GA was released on 04-11-2023 and is the 4th Orchestrator rollup for Release 5.0.1.

This Orchestrator rollup build addresses the below critical issues since the 3rd Orchestrator rollup build, R5013-20230310-GA.

  • Fixed Issue 107766: When a customer configures either a Non SD-WAN Destination via Edge or a Cloud Security Service (CSS) and also configures the Level 7 (L7) Health Check option, the customer may observe that the tunnels are unexpectedly marked as down or up compared with what should occur based on their L7 Health Check configuration value.

    The issue is that the Orchestrator pushes to the VMware SD-WAN Edge the default L7 Health Check parameters regardless of what the customer has configured. As a result, even if the conditions of the tunnel match what the customer configured the tunnel status can remain unchanged because it is adhering to the default L7 Health Check value.

  • Fixed Issue 108363: After a VMware SASE Orchestrator is upgraded to a 5.x Release, VMware SD-WAN Edges which deploy Cloud Security Services (CSS) like Zscaler and have a Level 7 (L7) Health Check also configured may experience a loss of traffic using that CSS for about 30 seconds.

    After the Orchestrator is upgraded, it triggers a configuration update to all Edges and this can cause some CSS sites with L7 Health Check configured to go down until the configuration is corrected.

    The issue is linked to Fixed Issue #107302 which addresses the issue on the Edge. The fix here prevents the Orchestrator from triggering configuration updates to Edges on an upgrade and thus protects Edges which do not have the fix for #107302.

  • Fixed Issue 110946: A VMware SD-WAN Orchestrator using release 4.2.x or earlier may fail when upgraded to become a SASE Orchestrator using release 4.3.x or later.

    A 4.2.x or earlier Orchestrator does not clean up the apt cache prior to running the apt update service routine when the Orchestrator is upgraded to 4.3.x or later, and as a result the MySQL database restarts during the upgrade and the upgrade fails.

    If upgrading to an Orchestrator version without a fix for this issue, the Operator can run the command rm -rf /var/lib/apt/lists/ prior to the upgrade.

  • Fixed Issue 111946: A user cannot see the paths on the Edge > Monitor > Paths tab on a VMware SASE Orchestrator when the peer list is greater than 100.

    When a user navigates to the Edge > Monitor > Paths tab, the Orchestrator's backend returns all records even if there are more than 100. This is because the backend omits the limit constraint, which is the maximum number of records that should be returned. The records that are returned after the limit count are unnormalized, meaning that they are not formatted in a way that is compatible with the UI. This causes an error in the UI. The Orchestrator should only return the records that are within the submitted limit.
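
    The expected behavior, returning no more records than the submitted limit, can be sketched as below. The function name, record shape, and default of 100 are assumptions made for illustration and do not represent the Orchestrator's backend.

```python
from typing import Dict, List


def fetch_paths(records: List[Dict], limit: int = 100) -> List[Dict]:
    """Return at most `limit` records, as requested by the caller."""
    if limit < 0:
        raise ValueError("limit must be non-negative")
    return records[:limit]


# Hypothetical peer list larger than the limit.
all_paths = [{"peer": f"peer-{i}"} for i in range(250)]
page = fetch_paths(all_paths, limit=100)
small_page = fetch_paths(all_paths[:5])
```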

  • Fixed Issue 111957: After an Operator upgrades a VMware SASE Orchestrator, users may observe errors related to failed long running schema updates. For example, newly learned routes (BGP or OSPF) may be missing from the OFC page. Errors will also be observed in the upload logs on an Orchestrator related to learned routes.

    A foreign key on VELOCLOUD_LEARNED_PARTNER_ROUTE_ASSOC is dropped and re-added on an Orchestrator upgrade from Release 4.2.x to Release 4.3.x or later as a long running schema update. This foreign key does not already exist on some Orchestrators that had their upgrade path through some older 2.1.x builds with an upgrade defect AND where a Gateway that was learning BGP routes was deleted from the Orchestrator. On such Orchestrators, the foreign key addition on the 4.2.x > 4.3.x+ upgrade fails if records that violate the foreign key are already present in the table.

    The fix for this issue corrects the root cause by deleting the records that violate the foreign key and then re-adding the foreign key.

    If upgrading to an Orchestrator without a fix for this issue, the workaround is to run the following query in the VeloCloud schema of MySQL: DELETE FROM VELOCLOUD_LEARNED_PARTNER_ROUTE_ASSOC WHERE gatewayId not in (select id from VELOCLOUD_GATEWAY) and then retrigger the long running schema updates.
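
    The effect of the workaround query can be previewed on a toy dataset. The sketch below uses Python's built-in sqlite3 module with an in-memory database rather than MySQL, and the two tables are simplified stand-ins that contain only the columns named in the query (the real tables have more columns). It counts the orphaned rows before deleting them.

```python
import sqlite3

# In-memory stand-in for the two tables; real schemas have more columns.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE VELOCLOUD_GATEWAY (id INTEGER PRIMARY KEY);
    CREATE TABLE VELOCLOUD_LEARNED_PARTNER_ROUTE_ASSOC (
        id INTEGER PRIMARY KEY,
        gatewayId INTEGER
    );
    INSERT INTO VELOCLOUD_GATEWAY (id) VALUES (1), (2);
    INSERT INTO VELOCLOUD_LEARNED_PARTNER_ROUTE_ASSOC (id, gatewayId)
        VALUES (10, 1), (11, 2), (12, 99);  -- gateway 99 no longer exists
    """
)

orphan_filter = "gatewayId NOT IN (SELECT id FROM VELOCLOUD_GATEWAY)"

# Count the orphaned rows first so the effect of the DELETE is known in advance.
orphans_before = conn.execute(
    f"SELECT COUNT(*) FROM VELOCLOUD_LEARNED_PARTNER_ROUTE_ASSOC WHERE {orphan_filter}"
).fetchone()[0]

# Same predicate as the workaround query in the release note.
conn.execute(f"DELETE FROM VELOCLOUD_LEARNED_PARTNER_ROUTE_ASSOC WHERE {orphan_filter}")
conn.commit()

remaining = conn.execute(
    "SELECT COUNT(*) FROM VELOCLOUD_LEARNED_PARTNER_ROUTE_ASSOC"
).fetchone()[0]
```

    Running an equivalent SELECT COUNT(*) with the same WHERE clause before the DELETE is a cautious way to confirm how many rows the workaround would remove.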

  • Fixed Issue 112201: If a user configures a Cloud Security Service (CSS) through an API and sets the CSS to None (Empty), under the VMware Edge's CSS configuration the VMware SASE Orchestrator will not show a configuration, but the Edge's database and API responses have the CSS up and working.

    The Orchestrator does not associate the CSS with the Edge Device Settings object through the CSS field of the segment, which the New Orchestrator UI uses to determine which CSS object is related to a specific segment. Only reference objects are present. A CSS in this state cannot be used as part of a Business Policy.

    On an Orchestrator without a fix for this issue, customers should configure a new CSS using the Orchestrator's UI and not with an API.

Resolved in Orchestrator Version R5013-20230310-GA

Orchestrator build R5013-20230310-GA was released on 03-13-2023 and is the 3rd Orchestrator rollup for Release 5.0.1.

This Orchestrator rollup build addresses the below critical issues since the 2nd Orchestrator rollup build, R5012-20221214-GA.

  • Fixed Issue 105610: When a user attempts to create a new IPv4 Object Group or attempts to update an existing IPv4 Object Group which includes a wild card mask that starts with '255' and ends with '0' (for example, 255.0.1.0), the VMware SASE Orchestrator does not allow this wild card mask and throws an error even though this is a valid wild card expression and it should be permitted.

    Beginning with 5.0.x, Orchestrators lack the validation for wild card masks in Object Groups, and as a result throw an error when a user configures a wild card mask for one.
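
    To illustrate why a mask such as 255.0.1.0 is acceptable as a wild card mask even though it would not pass as a subnet mask, the sketch below contrasts two checks. Both functions are hypothetical and are not the Orchestrator's validator: one accepts any dotted quad with octets from 0 to 255, the other additionally requires the contiguous run of leading 1 bits that a subnet mask has.

```python
def is_valid_wildcard_mask(mask: str) -> bool:
    """Accept any dotted quad whose four octets are integers in 0..255."""
    parts = mask.split(".")
    if len(parts) != 4:
        return False
    for part in parts:
        if not (part.isascii() and part.isdigit()):
            return False
        if int(part) > 255:
            return False
    return True


def is_contiguous_netmask(mask: str) -> bool:
    """Accept only masks made of leading 1 bits followed by 0 bits."""
    if not is_valid_wildcard_mask(mask):
        return False
    value = 0
    for part in mask.split("."):
        value = (value << 8) | int(part)
    inverted = ~value & 0xFFFFFFFF
    # A contiguous mask inverts to 2**n - 1, so adding 1 clears every set bit.
    return (inverted & (inverted + 1)) == 0
```

    Under these definitions, 255.0.1.0 passes the wild card check but fails the contiguous subnet mask check, while 255.255.255.0 passes both.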

  • Fixed Issue 106242: A user accessing the Diagnostics > Remote Diagnostics page on the VMware SASE Orchestrator may experience being unexpectedly logged out of the Remote Diagnostics page while performing any Edge diagnostic.

    When a user encounters this issue, it is because the Orchestrator has reached a limit to the number of connections possible, and the Orchestrator signs out Remote Diagnostics users to ensure normal functioning. The issue is caused by the Orchestrator erroneously not releasing database connections once they are no longer needed, causing the Orchestrator to trigger connection limit behaviors.
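
    The general pattern for avoiding this class of leak is to release a connection in a finally block so that it is returned even when the work fails. The sketch below uses a toy pool written for this example; the class and function names are hypothetical and it does not reflect the Orchestrator's database layer.

```python
from contextlib import contextmanager


class ConnectionPool:
    """Toy pool with a fixed capacity; a stand-in for a real database pool."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.in_use = 0

    def acquire(self) -> int:
        if self.in_use >= self.capacity:
            raise RuntimeError("connection limit reached")
        self.in_use += 1
        return self.in_use

    def release(self) -> None:
        self.in_use -= 1


@contextmanager
def connection(pool: ConnectionPool):
    """Acquire a connection and always release it, even if the body raises."""
    conn = pool.acquire()
    try:
        yield conn
    finally:
        pool.release()


pool = ConnectionPool(capacity=2)

# Ten sequential uses succeed with a capacity of two because each is released.
for _ in range(10):
    with connection(pool):
        pass

# A failure inside the block still releases the connection.
try:
    with connection(pool):
        raise ValueError("simulated diagnostic failure")
except ValueError:
    pass

leaked = pool.in_use
```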

  • Fixed Issue 109595: If an Operator tries to upgrade a VMware SASE Orchestrator from a 4.x version to a 5.x version, the upgrade may fail.

    If encountering this issue, the Operator would observe the error message: UPGRADE - ERROR - installer failed with return code 1. The issue is caused by the Orchestrator's docker software not creating a required folder when upgraded to a 5.x build.

Resolved in Orchestrator Version R5012-20221214-GA

Orchestrator build R5012-20221214-GA was released on 12-14-2022 and is the 2nd Orchestrator rollup for Release 5.0.1.

This Orchestrator rollup build addresses the below critical issues since the 1st Orchestrator rollup build, R5011-20221129-GA.

  • Fixed Issue 96538: The Remote Diagnostic 'Show BGP Neighbor Learned Routes' fails.

    An interoperability issue in the underlying API call causes validation errors when running the 'Show BGP Neighbor Learned Routes' Remote Diagnostics.

  • Fixed Issue 100133: The Orchestrator experiences performance issues on each Push to Edge configuration.

    When a customer configures a very large number of Business Policy rules by associating an Edge cluster, the Orchestrator experiences performance issues on each Push to Edge configuration.

  • Fixed Issue 101835: The Cloud VPN section is not available in the new Orchestrator UI, if the user selects a non-global segment where Cloud VPN is configured.

    In the new Orchestrator UI, on the Configure > Edge > Device Settings page, the Cloud VPN section is not available if the user selects a non-global segment where Cloud VPN is configured.

  • Fixed Issue 102806: Customer cannot edit the Partner Gateway Handoff configuration at per Gateway level.

    This issue occurs when a customer configures the Partner Gateway Handoff during an upgrade.

Resolved in Orchestrator Version R5011-20221129-GA

Orchestrator build R5011-20221129-GA was released on 11-29-2022 and is the 1st Orchestrator rollup for Release 5.0.1.

Orchestrator build R5011-20221129-GA replaces build R5011-20221117-GA and corrects an upgrade issue seen by the VMware Operations team when upgrading an Orchestrator to build R5011-20221117-GA. The upgrade issue was caused by a version mismatch in the upgrade package Manifest, and this new build adds no new functionality.

This Orchestrator rollup build also addresses the below critical issues since the original Orchestrator GA build, R5010-20220912-GA.

  • Fixed Issue 80735: When a user changes the configuration of a Profile that is also assigned to one or more VMware SD-WAN Edges, the BGP filters on the Profile level are removed.

    The user would observe "ERR: invalid filter ref" on the VMware SASE Orchestrator UI in the places where previously the user could see the details of the BGP filter. The impact to a customer's networking that relies on BGP would be significant and the only way to restore the BGP filters would be to manually restore them.

  • Fixed Issue 88957: A user's configuration of a VLAN with a /30 subnet is not implemented.

    After the user configures and applies a /30 subnet for a VLAN, the Orchestrator automatically sets the DHCP start range to x.x.x.1, instead of the correct x.x.x.2. Even when a user overrides this configuration by manually changing the DHCP start range back to .2, the Orchestrator changes it back to .1 every time the configuration is loaded.
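
    The arithmetic behind the correct value can be checked with Python's ipaddress module. A /30 has only two usable host addresses. The sketch assumes, as the .2 start range implies, that the first one is taken by the VLAN interface itself, which leaves the second as the only address DHCP can hand out. The 192.0.2.0/30 network is an example value.

```python
import ipaddress

# Example /30 network; the address is a documentation prefix, not a real deployment.
network = ipaddress.ip_network("192.0.2.0/30")

# hosts() excludes the network and broadcast addresses.
usable = list(network.hosts())

# Assumption: the VLAN interface takes the first usable address.
interface_address = usable[0]
dhcp_start = usable[1]
```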

  • Fixed Issue 97713: When a customer looks at the Monitor > Network Service > Edge Cluster table, they would observe no Edge Cluster Health statistical metrics on the VMware SASE Orchestrator UI.

    The issue is the result of Cluster Edges uploading their statistics to the Orchestrator and, while doing so, sending numbers the Orchestrator does not expect. The Orchestrator discards all of the health stats instead of storing them in its database.

  • Fixed Issue 98086: A Partner Administrator with a role of IT Specialist, Customer Support, or Enterprise cannot view path stats on the VMware SASE Orchestrator UI.

    The Orchestrator does not provide privileges for these Partner Administrator roles and these users are not allowed to see any graphs or metrics under the Path Stats tab.

  • Fixed Issue 98357: When a user tries to add an incremental ALLOW privilege to an existing role, it fails with an error.

    When a user tries to edit a customization package on a 5.x Orchestrator to add ALLOW privileges for actions like View Path Stats, the customization package is rejected by the Orchestrator.

    The issue is caused by an API validator which only processes deny privileges and does not process allow privileges at all.

  • Fixed Issue 98518: If a user removes a VMware SD-WAN Gateway from a Gateway Pool where no customers are assigned, customers who use Partner Gateways may observe that their Partner Handoff Configurations have also been removed for multiple Gateways.

    When a Gateway is removed from a Gateway Pool, the Orchestrator checks for handoffs and is erroneously viewing some handoffs as not in use when they are. This results in the Orchestrator unsetting and then overriding the handoff configurations for that Gateway because of the erroneous understanding that there are no handoff configurations in use.

  • Fixed Issue 98654: When a VMware SASE Orchestrator is upgraded to Release 5.0.0.0 or later and a VMware SD-WAN Edge is configured to CERTIFICATE REQUIRED mode, the Edge may lose communication with the Orchestrator and be marked as down after a certificate renewal.

    Orchestrator Release 5.0.0.0 introduced a new feature to verify Extended Key Usage of client certificates for Edges configured as CERTIFICATE REQUIRED. A defect in this verification process erroneously treated valid certificates as invalid and caused heartbeats to fail.

  • Fixed Issue 99109: When a VMware SASE Orchestrator is upgraded to 5.0.0 or later, users on a customer enterprise with an existing Zscaler deployment are unable to make changes to their Zscaler settings with the error "Cloud Subscription cannot be null and should match Cloud Name".

    Orchestrator Release 5.0.0 introduces a new Configure Cloud Subscription process for customers configuring a new Cloud Service like Zscaler. As part of this added mechanism, when the Orchestrator is upgraded to Release 5.0.0/5.0.1, existing deployments were expected to be detected by the Orchestrator and automatically migrated to the Cloud Subscription. However, in this issue that is not taking place and a customer user is forced to manually configure each Edge for Zscaler using the new Configure Cloud Subscription method to make any changes to their existing Zscaler configuration.

  • Fixed Issue 99247: For a customer enterprise using a Zscaler Cloud Security Service (CSS) where the user has configured a Business Policy to backhaul traffic using the CSS, when the VMware SASE Orchestrator is upgraded to Release 5.0.0/5.0.1, the customer would observe they can no longer make CSS configuration changes on their VMware SD-WAN Edges.

    When looking at Configure > Edge > Device Settings, the user would observe that "Cloud Subscription" is now locked and "None" is selected. Any attempts to make a configuration change are returned with an error that "Cloud Subscription cannot be NONE". The user is also unable to select a Cloud Subscription without first deleting the backhaul Business Policy.

  • Fixed Issue 99250: A VMware SD-WAN Edge's CPU core temperature is not properly graphed on the VMware SASE Orchestrator.

    When a user looks at the Monitor > Edge > System tab, they would observe that the CPU Core Temperature always shows as 0.

    The Edge is reporting the correct CPU core temperature to the Orchestrator but the number is being discarded due to a formatting issue and the Orchestrator displays a default value of 0 in the absence of any data.

  • Fixed Issue 100656: When a user goes to Monitor > Edge > QoE on the VMware SASE Orchestrator UI, they may observe large gaps in the QoE graph for the Edge.

    The issue is the result of the VMware SASE Orchestrator erroring out in a backfill function when it detects no QoE data in its query time period of 15 minutes.

  • Fixed Issue 101449: If a user configures more than 32 subinterfaces on a VMware SD-WAN Edge using Release 4.3.x or later, the Orchestrator will throw an error and prevent the configuration from being applied.

    The restriction is designed to protect a customer enterprise that has Edges running a release below 4.3.x (for example, 4.2.2 or 3.4.6), where more than 32 subinterfaces are not supported. With this change, the Orchestrator will allow more than 32 subinterfaces to be configured and the customer will be warned to only do this if the Edge is using Release 4.3.0 or later.

Resolved in Orchestrator Version R5010-20220912-GA

Orchestrator version R5010-20220912-GA was released on 09-13-2022 and is the updated Orchestrator GA build for Release 5.0.1.

The updated R5010-20220912-GA Orchestrator build resolves the below issues since Orchestrator build R5010-20220817.

  • Fixed Issue 96108: When a VMware SASE Orchestrator is upgraded to a 5.x build, a customer may observe missing memory usage statistics for their VMware SD-WAN Edges when looking at the Monitor > Edge pages of the UI.

    The issue is caused during the migration to a 5.x Orchestrator by older Edges sending a different name for their health statistics memory field (memPct) when the Orchestrator is expecting to receive the Edge's historic health statistics memory field using the current name (memoryPct). As a result, the Orchestrator ignores the Edge health statistics memory field value submitted with the unexpected memPct name, and the Orchestrator defaults the health statistics memory field value to zero.

    The fix for this issue resolves the other cause of missing Edge health statistics on the Orchestrator UI, with the first cause being fixed in #90749 on the original 5.0.1 GA build.
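
    One common way to tolerate a renamed field is to map the legacy name to the current one before the record is processed. The sketch below is an assumption about how such a mapping could look, not the Orchestrator's code: only memPct and memoryPct come from this note, and cpuPct is a hypothetical extra field.

```python
# Legacy field name -> current field name (from the issue description).
LEGACY_FIELD_NAMES = {"memPct": "memoryPct"}


def normalize_health_stats(stats: dict) -> dict:
    """Rename legacy fields; an existing current-name value takes precedence."""
    normalized = dict(stats)
    for legacy_name, current_name in LEGACY_FIELD_NAMES.items():
        if legacy_name in normalized:
            value = normalized.pop(legacy_name)
            normalized.setdefault(current_name, value)
    return normalized


# Hypothetical payloads from an older Edge and from one sending both names.
from_older_edge = normalize_health_stats({"memPct": 42, "cpuPct": 7})
with_both_names = normalize_health_stats({"memoryPct": 50, "memPct": 42})
```

    Without such a mapping, a consumer that reads only memoryPct would find no value in the older Edge's payload, which matches the zero default described above.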

  • Fixed Issue 96095: When a VMware SASE Orchestrator is configured for Disaster Recovery (DR), the IPv6 Orchestrator IP address is cleared from all the Operator Profiles and IPv6-only SD-WAN Edges are marked as down by the Orchestrator.

    If an Operator Profile is configured with Orchestrator IP addresses of both the IPv4 and IPv6 type, but DR is set up using only the IPv4 address, the IPv6 address is removed from the Operator Profile in the Orchestrator. This causes IPv6-only Edges to stop communicating with the Orchestrator, which will mark those Edges as down and stop pushing configuration changes to them.

    If this issue is encountered on an Orchestrator without the fix, post-upgrade the DR needs to be broken and set up again for the management IP addresses to be restored.

  • Fixed Issue 95847: When a VMware SASE Orchestrator is upgraded to version 5.0.1, the Operator performing the upgrade may observe the schema upgrade was not successful and must be manually rerun.

    When an Orchestrator is upgraded to a version with a new ClickHouse schema there is the potential for a race condition on the backend and the old schema version is not up and ready prior to being upgraded. As a result, the Operator needs to manually rerun the schema upgrade.

Resolved in Orchestrator Version R5010-20220817-GA

Orchestrator version R5010-20220817-GA was released on 08-17-2022 and is the updated Orchestrator GA build for Release 5.0.1.

This Orchestrator build replaces the original GA build R5010-20220803-GA, which included Issue #95613. This issue was discovered during an Orchestrator upgrade after this build was released on 08-05-2022. Customers must only use the R5010-20220817-GA build and not use R5010-20220803-GA.

  • Fixed Issue 95613: When a VMware SASE Orchestrator is upgraded to build R5010-20220803-GA, a customer connected to that Orchestrator may experience difficulty monitoring their Edges and, if they use API calls, observe that those calls fail. An Operator user would observe the API process fail and require a restart along with high CPU usage on the Orchestrator.

    This issue is triggered by a user configuring their enterprise to make API calls without any time interval gap (in other words, each API call is immediately followed by another). This activity causes the v2 API process that handles the API call to experience an exception and fail when it receives the API request. The failure of the v2 API process means Edge monitoring for data like link statistics (which relies on API calls) will not display accurate data and customer enterprises using API calls would find them failing as well. In addition, if the same enterprise continues making API calls with no time interval gap, the v2 API process will effectively be stuck in a cycle of failure and restart until those API calls can be stopped or modified to include a time interval. 

    The failure and restart of the v2 API process also causes high Orchestrator CPU consumption which would impact the overall performance of the Orchestrator beyond handling API calls.
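
    On the client side, the trigger described above is avoided by leaving a minimum interval between consecutive API calls. The following is a generic sketch of that idea. The function name and parameters are hypothetical, no Orchestrator API endpoint is called, and the sleep and clock functions are injectable so the behavior can be checked without real waiting.

```python
import time
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def call_with_interval(
    calls: Iterable[Callable[[], T]],
    min_interval: float,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> List[T]:
    """Run each call, waiting so that calls start at least min_interval apart."""
    results: List[T] = []
    last_start = None
    for call in calls:
        if last_start is not None:
            wait = min_interval - (clock() - last_start)
            if wait > 0:
                sleep(wait)
        last_start = clock()
        results.append(call())
    return results


# Demonstration with a frozen clock and a recording sleep (no real delay).
recorded_sleeps: List[float] = []
results = call_with_interval(
    [lambda: "a", lambda: "b", lambda: "c"],
    min_interval=1.0,
    sleep=recorded_sleeps.append,
    clock=lambda: 0.0,
)
```

    In real use the defaults (time.sleep and time.monotonic) apply, and each callable would wrap one API request.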

Resolved in Orchestrator Version R5010-20220803-GA

Orchestrator version R5010-20220803-GA was released on 08-05-2022 and resolves the following issues since Orchestrator version 5002-20220517-GA. This means that a fix for any Edge or Gateway issue listed in the 5.0.0 Release Notes is included in all Release 5.0.1 builds.

  • Fixed Issue 49535: On the Monitor > Network Services page, the VMware SASE Orchestrator does not immediately update the BGP neighbor state of a VMware SD-WAN Edge that has gone offline.

    The BGP Edge Neighbor State table will continue to show the offline Edge as "Established" and remain that way for hours after the Edge has gone offline. This impacts any user who relies on the Orchestrator UI to check these details.

  • Fixed Issue 68463: When looking at the Monitor > QoE graph on the VMware SASE Orchestrator where the graph section is listed as yellow/fair for quality, there may be a discrepancy between the reason the graph section is listed as yellow/fair when looking at the Classic UI versus the New UI.

    When encountering this issue, on the Classic UI if the pop-up box lists Latency as the reason for the fair QoE score, the New UI would have a pop up box that lists Jitter as the cause for the fair QoE score. The issue is caused by an incorrect mapping of the Latency and Jitter values on the New UI.

  • Fixed Issue 70005: When using VMware Cloud Web Security, a user can edit an existing Security Policy and rename it an empty or blank name and save it on the VMware SASE Orchestrator.

    A user cannot create a Security Policy with an empty/blank name but can edit an existing policy to configure the name to be blank and the Orchestrator permits the change and does not throw an error.

  • Fixed Issue 76036: Attempting to access either 'Partner Overview' page and/or a 'Configure > Customer' page for that Partner on a VMware SASE Orchestrator fails to load with an "An unexpected error has occurred" message.

    The Partner Overview page and/or a Configure > Customer page for a customer supported by that partner may fail to load because the enterpriseProxy /getEnterpriseProxyGatewayPools API times out. The trigger is a large number of Gateway pools and Gateways, which can cause the API call used on each of these pages to time out and the page to fail to load.

  • Fixed Issue 81835: The Monitor > Edge > QoE page of the VMware SASE Orchestrator UI may not accurately represent a WAN link's status (whether it is online, offline, or degraded) or accurately represent link metrics for the time period selected.

    Different time intervals can lead to the QoE graph showing different results for a WAN link status. And for a link's metrics, the QoE graph may present a particular QoE value (latency, loss, or jitter) that does not reflect the real metric value at that exact time.

    This issue is caused by multiple WAN links belonging to different enterprises being assigned the same link logical ID, which leads to a malfunction in the Orchestrator's link data backfill process. The Orchestrator erroneously assumes the WAN link logical ID to be unique even though it is not tied to a customer's enterprise ID. This allows for duplicate link logical IDs and the possibility of incorrect link metrics and status.

    The fix for this issue stores the link keys in the Orchestrator's database as a combination of the customer enterprise logical ID and the WAN link's logical ID, ensuring each WAN link is unique.
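
    The effect of keying on the link logical ID alone versus the combined key can be shown with a small dictionary example. The identifiers and latency values are hypothetical, and the code only illustrates the keying idea rather than the Orchestrator's storage layer.

```python
from typing import Dict, Tuple

LinkKey = Tuple[str, str]


def link_key(enterprise_logical_id: str, link_logical_id: str) -> LinkKey:
    """Combine the enterprise and link logical IDs into one unique key."""
    return (enterprise_logical_id, link_logical_id)


# Two enterprises whose WAN links share the same link logical ID.
samples = [
    ("enterprise-a", "link-1", 12.0),
    ("enterprise-b", "link-1", 85.0),
]

by_link_only: Dict[str, float] = {}
by_composite: Dict[LinkKey, float] = {}
for enterprise_id, link_id, latency_ms in samples:
    by_link_only[link_id] = latency_ms  # second sample overwrites the first
    by_composite[link_key(enterprise_id, link_id)] = latency_ms
```

    With the link ID alone, the two enterprises collapse into one entry. With the composite key, each enterprise keeps its own metric.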

  • Fixed Issue 82725: A VMware SASE Orchestrator may not generate the password reset link correctly.

    This issue occurs when the URL for the Orchestrator is not exactly https://domain/ or https://domain/operator/. For example, if the URL is https://domain/test/, the password reset link does not work and directs the user back to the login page.

    When encountering this issue on an Orchestrator without the fix, if the Orchestrator URL cannot be corrected to a URL as shown above, the only option is for a Superuser or Operator to manually enter a new password for the user and then share that with the affected user so that they could in turn reconfigure a different password for themselves once they were successfully logged back in.

  • Fixed Issue 82775: On a VMware SASE Orchestrator using Release 5.0.0, when a Zscaler type Cloud Security Service (CSS) is configured for a customer and associated with a VMware SD-WAN Edge, and then a Business Policy is configured with a CSS backhaul rule, a user is unable to change the CSS hash or encryption parameters for that CSS.

    The Orchestrator locks the user out from modifying the Zscaler CSS configuration once it is associated with a CSS Backhaul Business Policy.

    On an Orchestrator without a fix for this issue, the user needs to delete the CSS Backhaul Business Policy to modify the Zscaler CSS configuration and then recreate the same Business Policy.

  • Fixed Issue 82864: On a VMware SASE Orchestrator using Release 5.0.0, when a user is on the Configure > Profiles page and selects 'Modify', the user is redirected to the Profile > Overview page instead of the Profile > Device Settings page.

    The Configure > Profiles 'Modify' button is not mapping to the correct page.

  • Fixed Issue 83165: An Operator user is not able to transfer a Customer to a Partner on the VMware SASE Orchestrator with the reason that they do not have the same Gateway Pool, even though both do have the same Gateway Pool.

    This is caused by an API call network/getNetworkEnterpriseProxies not returning the Gateway Pool details and leading the Orchestrator to think the Partner and Customer do not have the same Gateway Pool and rejecting the assignment.

  • Fixed Issue 83538: For customers using the Secure Access service, when creating a Remote Access service, the Enterprise & Network Settings Screen shows internal error message keys on the VMware SASE Orchestrator.

    When creating a Remote Access service, if the user enters invalid data in the customer subnet or subnet bits fields, an untranslated error message is displayed below these fields. This error message is of no use to the user and does not point to resolving the actual issue regarding invalid data in either field.

  • Fixed Issue 83539: On a VMware SASE Orchestrator deployed with a Disaster Recovery (DR) configuration, when the Orchestrator is upgraded to a new software version, the DR synchronization fails.

    DR is running properly prior to the upgrade, but when an Operator user upgrades the Active and Standby Orchestrators, DR status will show as failed.

  • Fixed Issue 83582: When upgrading a VMware SASE Orchestrator from Release 4.5.0 to Release 5.0.0, the process takes much longer than expected and until the process completes all Orchestrator services are unavailable.

    During the upgrade, the schema update for the Edge Statistics table can take more than 15 minutes, when the LRQ schema should be updated instead, and this causes a major delay in completing the Orchestrator upgrade.

  • Fixed Issue 83822: For customers using VMware Cloud Web Security, when looking at Monitor > Logs > Web Logs on the VMware SASE Orchestrator, the user is only able to see a maximum of 100 logs and cannot load more pages to see additional logs.

    With this issue, the user is limited to a maximum of 100 logs on a single page, with no additional logs viewable, because pagination is broken for Web Logs on the Orchestrator UI. This is a major hindrance because a user who loads a large time period (for example, 30 days) cannot see all the logs for that period. The only workaround is to load short time periods that return 100 or fewer logs.

  • Fixed Issue 84152: When a customer generates a Top Talkers report for their enterprise, the Top Talker names may be listed as 'Unknown'.

    "Top Talkers" are the top sources from all the flows in a given time range. The Top Talker name may not show if the client device is not present for the (Source IP + MAC Address) unique pair. This happens because the client devices are saved based on which Visibility Mode (IP Address or MAC Address) is configured for the VMware SD-WAN Edge. For example, an Orchestrator may save a device for (IP Address 1, MAC Address 1) and then the (IP Address 2, MAC Address 2) record is not saved if Visibility Mode is set to IP Address. This would lead to the Top Talker corresponding to IP Address 2/MAC Address 1 being marked as 'Unknown'.

  • Fixed Issue 84214: When an Operator user is on the Gateways page of a VMware SASE Orchestrator UI, they may be unable to assign a particular Gateway for the role of Super Gateway.

    When a Gateway is already assigned the role of both Super and Alternate Super Gateway, and the Operator tries to edit the Super Gateway assignment of an enterprise from the Customer Usage list on the Gateways > Configure Gateways screen, the UI does not correctly find associated data about the Super Gateway and the Assign Super Gateway dialog does not show up while also throwing an error in the console. 

  • Fixed Issue 84969: When a VMware SD-WAN Edge running a 4.2.x Release which is also configured with an overridden non-default Management IP is upgraded to Release 4.3.x or higher on a VMware SD-WAN Orchestrator running 4.3.x or higher, the Edge may lose the configured overridden Management IP.

    An Orchestrator running 4.3.x or higher is not automatically creating the loopback interface while also retaining the overridden non-default Management IP for an Edge, when that Edge is upgraded from 4.2.x to a 4.3.x or later build.

  • Fixed Issue 86546: For customers using VMware Secure Access, a user may not be able to use Secure Access on some SASE PoPs, and some may even show as offline on the VMware SASE Orchestrator.

    VMware Gateways that are not configured for use with Secure Access (in other words, Gateways which do not have a geneve tunnel with the tunnel server on the PoP) are also given information about the Secure Access service by the Orchestrator. This leads to a broken route being picked in some instances for routing customer traffic. This issue can be encountered only when more than one Gateway is assigned per PoP per Gateway pool on a particular Orchestrator.

    On an Orchestrator that does not have the fix for this issue, the workaround is to add and keep only one Gateway per PoP in each Gateway Pool so that this Gateway always gets picked for Secure Access and the establishing of the correct route.

  • Fixed Issue 86848: When a customer administrator makes a failed login attempt using the Native (username/password) method to their customer enterprise on the VMware SASE Orchestrator, the Orchestrator does not log the failed attempt on the Monitor > Events page of the UI.

    The Orchestrator should log every login attempt, whether it is successful or not, to ensure proper accountability of all user accounts and to allow the administrators to detect unusual login activity. The issue is caused by the Orchestrator not including the enterpriseId metadata with a failed username/password authorization attempt. This only affects customer users using Native (username/password) authorization; customer enterprises using Single Sign On (SSO) are not impacted by this issue.

  • Fixed Issue 87111: When a VMware SASE Orchestrator is upgraded to 4.3.x or later, the VMware SD-WAN Edges connected to the Orchestrator which are configured to use BGP do not have the uplink flag configured.

    The BGP uplink flag is added as a configuration in the SD-WAN 4.3.0 Release and Edge Versions 4.3.0 and later are expecting an uplink flag to be present.  However, the Orchestrator is not pushing the configuration update to all Edges that are missing this flag after the Orchestrator is upgraded.

  • Fixed Issue 88621: A VMware SD-WAN Gateway being migrated is unable to have its configuration modified and saved on the VMware SASE Orchestrator.

    An Operator user cannot update the location for a production Gateway. When they attempt to save the configuration, the Orchestrator returns the error: GATEWAY_SERVICE_STATE_INVALID: Cannot change the state of the gateway to null, as it is already used as a replacement gateway.

  • Fixed Issue 89346: On a VMware SASE Orchestrator using build 5.0.0.2, when generating a New Report from the Monitor Customers screen, the newly generated report is always delivered in English, even if the Report Language was specified as a non-English language.

    The downloaded report should be displayed in the language specified under Report Language, but instead the language used is always English.

  • Fixed Issue 89800: When a user updates the Segment Property on the VMware SASE Orchestrator, the Edge tunnels to their Zscaler Cloud Security Service (CSS) go down and traffic routed to Zscaler is dropped.

    If a user has a CSS configured under Configure > Network Service (any CSS type) and then configures the FQDN and PSK authentication details at Configure > Edge > Device > Cloud Security Service using Edge Override, when a user updates any Segment in the Configure > Segment section of the Orchestrator, the Edge's CSS authentication configuration is deleted and the Edge can no longer connect to the Zscaler peer.

  • Fixed Issue 90128: On a customer enterprise which has a Cloud Security Services (CSS) configured, when the user changes the CSS configuration, the CSS event includes the PSK key of the CSS.

    While this behavior does not provide a direct vulnerability, the CSS PSK value is sensitive information that should not be included in a log file.

  • Fixed Issue 90540: On a VMware SASE Orchestrator using Release 5.0.0, when a VMware SD-WAN Edge using Edge Release 4.5.1 is upgraded to Release 5.0.0, the Edge loses DNS functionality and experiences a loss of connectivity with the internet.

    As part of the Edge upgrade to 5.0.0, the Orchestrator's role is to push an updated Edge configuration and the DNS part of that configuration was not compatible with a 4.5.x Edge build causing the DNS settings to be lost and preventing connectivity to the Internet. The Edge would continue to pass traffic to other locations (for example, the Orchestrator, other Edges, Hub Edges, and Non SD-WAN Destinations) where DNS is not a factor.

  • Fixed Issue 90067: When a VMware SASE Orchestrator is upgraded to 4.5.1 or 5.0.0, the Operator may observe high CPU usage and load issues.

    During the upgrade the Orchestrator loses a critical system property:  edge.learnedRoute.maxRoutePerCall. This property caps the number of routing protocol events that can be received by the Orchestrator at any one time. In the absence of this property, an Orchestrator could be flooded with routing protocol events that place it under a high load which can impact the Orchestrator's performance. The fix ensures that system property edge.learnedRoute.maxRoutePerCall persists over Orchestrator upgrades.
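
    The effect of a per-call cap can be illustrated with a minimal sketch. The function name, the event format, and the cap value of 500 are hypothetical and chosen only for illustration; this is not the Orchestrator's implementation or its default value for edge.learnedRoute.maxRoutePerCall.

```python
from typing import Iterator, List, Sequence


def batch_route_events(events: Sequence[dict], max_route_per_call: int) -> Iterator[List[dict]]:
    """Yield successive batches holding at most max_route_per_call events."""
    if max_route_per_call <= 0:
        raise ValueError("max_route_per_call must be a positive integer")
    for start in range(0, len(events), max_route_per_call):
        yield list(events[start:start + max_route_per_call])


# Hypothetical input: 1,200 learned-route events and an assumed cap of 500.
events = [{"route": f"10.0.{i // 256}.{i % 256}/32"} for i in range(1200)]
batch_sizes = [len(batch) for batch in batch_route_events(events, 500)]
print(batch_sizes)  # [500, 500, 200]
```

    Without a cap, all 1,200 events in this example would arrive in a single call, which is the kind of burst the system property is described as preventing.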

  • Fixed Issue 90749: When a VMware SASE Orchestrator is upgraded to a 5.x build, a customer may observe the loss of historic statistics for one or more of their VMware SD-WAN Edges when looking at the Monitor > Edge pages of the UI.

    In the Orchestrator logs, an Operator would observe "Error while migrating health stats" and "Error while writing data file to clickhouse" log messages with timestamps immediately after the Orchestrator being upgraded to a 5.x build. The issue is triggered during the Orchestrator upgrade by an Edge sending any invalid data (for example, an invalid tunnel count with a negative number) to the Orchestrator which results in the Orchestrator rejecting not only the invalid data, but the entire historic data batch for that particular Edge. As a result, the user observes large historic time gaps in the graphs for that Edge when looking at Monitor > Edge pages post Orchestrator upgrade. The issue does not uniformly impact all Edges connected to the Orchestrator, only the small number that send out invalid data.

    Note:

    There is a related Issue #96108 that also causes missing Edge health statistics that is fixed in Orchestrator build R5010-20220912-GA.
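
    The difference between rejecting a whole batch and rejecting only the invalid records can be sketched as follows. The record layout and function names are hypothetical, and the per-record variant is an assumption used for contrast; the release notes do not describe how the fix is implemented.

```python
from typing import Dict, List, Tuple

Record = Dict[str, int]


def is_valid(record: Record) -> bool:
    """Treat a negative tunnel count as invalid (the example given in the text)."""
    return record.get("tunnel_count", 0) >= 0


def ingest_all_or_nothing(batch: List[Record]) -> List[Record]:
    """Behavior described for the issue: one invalid record discards the batch."""
    return list(batch) if all(is_valid(r) for r in batch) else []


def ingest_per_record(batch: List[Record]) -> Tuple[List[Record], List[Record]]:
    """Assumed alternative: keep valid records and set aside only the invalid ones."""
    accepted = [r for r in batch if is_valid(r)]
    rejected = [r for r in batch if not is_valid(r)]
    return accepted, rejected


# Hypothetical historic batch for one Edge, with a single invalid sample.
batch = [
    {"ts": 1, "tunnel_count": 4},
    {"ts": 2, "tunnel_count": -1},
    {"ts": 3, "tunnel_count": 5},
]
print(len(ingest_all_or_nothing(batch)))  # 0: the whole batch is lost
print(len(ingest_per_record(batch)[0]))   # 2: only the invalid sample is dropped
```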

  • Fixed Issue 90835: For a customer using the VMware Cloud Web Security service, the user cannot configure Office 365 domain rules for web proxy in Cloud Web Security using the VMware SASE Orchestrator.

    The user cannot configure Office 365 (recently renamed to Microsoft 365) domain rules for web proxy in Cloud Web Security using the PAC file wizard.

  • Fixed Issue 91054: For a customer using VMware Cloud Web Security, a user may encounter multiple usability issues on the VMware SASE Orchestrator UI when attempting to configure Single Sign-On Authentication.

    The issues a user could encounter while configuring Single Sign-On in the Cloud Web Security service include:

    • Certificate errors showing on the main Authentication page instead of on the Certificate page.

    • A user can sometimes save an invalid certificate.

    • Changing a certificate can sometimes reset the other values on the Authentication form.

    • Individual fields do not show validation messages inline with the field.

    • When saving the Authentication page, the Orchestrator UI does not show a progress spinner.

    • The Verbose Debugging tooltip shows "t+2hrs" instead of an actual time.

    • In some languages, the Single Sign-On toggle label wraps to more than one line.

    • The Save Changes footer layout is incorrect on short screens.

    All of the listed issues are resolved on an Orchestrator that includes a fix for #91054.

  • Fixed Issue 91179: For a VMware SD-WAN Edge which has a WAN link configured as Hot Standby, if the Hot Standby link's status is standby, the VMware SASE Orchestrator's New UI displays the incorrect status for the Hot Standby link (Active).

    The Orchestrator's Classic UI shows the correct status for the link (Idle), so this is limited to the New UI only. The issue is caused by the New UI not getting the correct update on the change of status for a Hot Standby WAN link. 

  • Fixed Issue 91720: For a customer enterprise that uses a Hub/Spoke topology, a user can remove a VMware SD-WAN Hub Edge from the Backhaul Hub configuration even though that Hub is being used with a Business Policy configured to use internet backhaul.

    Once a Business Policy for backhauling Spoke Edge traffic through a Hub Edge has been configured, the expected behavior is that the VMware SASE Orchestrator "locks" that Hub Edge and prevents a user from removing it from the Backhaul Hub configuration in the Configure > Device Settings section. However, with this issue the user can remove the Hub Edge and cause significant customer traffic disruption. 

  • Fixed Issue 92082: For a customer using VMware Cloud Web Security, the customer may observe that the Content Filtering rules do not honor the configured domain.

    The Content Filtering rules override the configured domain if the user has also selected ALL for Categories. If the user selects NONE for Categories, the wizard defaults this choice to mean ALL Categories, so the domains are not honored in that case either. This is caused by an issue in the content filtering wizard and API. If the user configures at least one Category, the Domain is honored.

    On an Orchestrator without this fix, the user would need to configure specific categories along with domains, and then the Orchestrator would honor domains in content filtering.

Known Issues

Open Issues in Release 5.0.1.

Edge and Gateway Known Issues

  • Issue 14655:

    Plugging or unplugging an SFP adapter may cause the device to stop responding on the Edge 540, Edge 840, and Edge 1000 and require a physical reboot.

    Workaround: The Edge must be physically rebooted.  This may be done either on the Orchestrator using Remote Actions > Reboot Edge, or by power-cycling the Edge.

  • Issue 25504:

    Static route costs greater than 255 may result in unpredictable route ordering.  

    Workaround: Use a route cost between 0 and 255.
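
    A configuration export could be screened for costs outside the recommended range with a short check such as the one below. This is a minimal sketch; the route dictionary layout and the function name are hypothetical and do not correspond to an Orchestrator API.

```python
from typing import Dict, List


def find_out_of_range_costs(routes: List[Dict], low: int = 0, high: int = 255) -> List[Dict]:
    """Return the static routes whose cost falls outside the inclusive range [low, high]."""
    return [route for route in routes if not low <= route["cost"] <= high]


# Hypothetical static routes; only the second one exceeds 255.
routes = [
    {"subnet": "10.1.0.0/16", "cost": 10},
    {"subnet": "10.2.0.0/16", "cost": 256},
    {"subnet": "10.3.0.0/16", "cost": 255},
]
flagged = find_out_of_range_costs(routes)
print([route["subnet"] for route in flagged])  # ['10.2.0.0/16']
```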

  • Issue 25595:

    A restart may be required for changes to static SLA on a WAN overlay to work properly.  

    Workaround: Restart Edge after adding and removing Static SLA from WAN overlay.

  • Issue 25742:

    Underlay accounted traffic is capped at a maximum of the capacity towards the VMware SD-WAN Gateway, even if that is less than the capacity of a private WAN link which is not connected to the Gateway.

  • Issue 25758:

    USB WAN links may not update properly when switched from one USB port to another until the VMware SD-WAN Edge is rebooted.

    Workaround: Reboot the Edge after moving USB WAN links from one port to another.

  • Issue 25855:

    A large configuration update on the Partner Gateway (e.g. 200 BGP-configured VRFs) may cause latency to increase for approximately 2-3 seconds for some traffic via the VMware SD-WAN Gateway.

    Workaround: No workaround available.

  • Issue 25921:

    VMware SD-WAN Hub High Availability failover takes longer than expected (up to 15 seconds) when there are three thousand branch Edges connected to the Hub.

    Workaround: No workaround available.

  • Issue 25997:

    The VMware SD-WAN Edge may require a reboot to properly pass traffic on a routed interface that has been converted to a switched port.

    Workaround: Reboot the Edge after making the configuration change.

  • Issue 26421:

    The primary Partner Gateway for any branch site must also be assigned to a VMware SD-WAN Hub cluster for tunnels to the cluster to be established.

    Workaround: No workaround available.

  • Issue 28175:

    Business Policy NAT fails when the NAT IP overlaps with the VMware SD-WAN Gateway interface IP.

    Workaround: No workaround available.
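
    Although no workaround exists, the overlap condition itself can be detected before a Business Policy is saved. The sketch below uses Python's standard ipaddress module; the function name and the sample addresses (documentation ranges) are assumptions for illustration only.

```python
import ipaddress
from typing import Iterable


def nat_overlaps_interface(nat_ip: str, interface_ips: Iterable[str]) -> bool:
    """Return True if the NAT address or range overlaps any Gateway interface IP."""
    nat_net = ipaddress.ip_network(nat_ip, strict=False)
    return any(
        nat_net.overlaps(ipaddress.ip_network(ip, strict=False))
        for ip in interface_ips
    )


# Hypothetical Gateway interface addresses.
gateway_ips = ["203.0.113.10", "198.51.100.1"]
print(nat_overlaps_interface("203.0.113.10", gateway_ips))    # True: same address
print(nat_overlaps_interface("203.0.113.8/29", gateway_ips))  # True: range contains .10
print(nat_overlaps_interface("192.0.2.5", gateway_ips))       # False
```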

  • Issue 31210:

    VRRP: ARP is not resolved in the LAN client for the VRRP virtual IP address when the VMware SD-WAN Edge is primary with a non-global CDE segment running on the LAN interface. 

    Workaround: No workaround available.

  • Issue 32731:

    Conditional default routes advertised via OSPF may not be withdrawn properly when the route is deactivated.

    Workaround: Reactivating the route, followed by deactivating it again will retract it successfully. 

  • Issue 32960:

    Interface “Autonegotiation” and “Speed” status might be displayed incorrectly on the Local Web UI for activated VMware SD-WAN Edges.

    Workaround: Refer to the Orchestrator UI under Remote Diagnostics > Interface Status.

  • Issue 32981:

    Hard-coding speed and duplex on a DPDK-configured port may require a VMware SD-WAN Edge reboot for the configurations to take effect as it requires turning DPDK off.

  • Issue 34254:

    When a Zscaler CSS is created and the Global Segment has FQDN/PSK settings configured, these settings are copied to Non-Global Segments to form IPsec tunnels to a Zscaler CSS.

  • Issue 35778:

    When there are multiple user-defined WAN links on a single interface, only one of those WAN links can have a GRE tunnel to Zscaler. 

    Workaround: Use a different interface for each WAN link that needs to build GRE tunnels to Zscaler.

  • Issue 36923:

    Cluster name may not be updated properly in the NetFlow interface description for a VMware SD-WAN Edge which is connected to that Cluster as its Hub.

  • Issue 38682:

    A VMware SD-WAN Edge acting as a DHCP server on a DPDK-configured interface may not properly generate “New Client Device” events for all connected clients.

  • Issue 38767:

    When a WAN overlay that has GRE tunnels to Zscaler configured is changed from auto-detect to user-defined, stale tunnels may remain until the next restart.

    Workaround: Restart the Edge to clear the stale tunnel.

  • Issue 39134:

    The System health statistic “CPU Percentage” may not be reported correctly on Monitor > Edge > System for the VMware SD-WAN Edge, and on Monitor > Gateways for the VMware SD-WAN Gateway.

    Workaround: Users should use handoff queue drops, not CPU percentage, for monitoring Edge capacity.

  • Issue 39374:

    Changing the order of VMware SD-WAN Partner Gateways assigned to a VMware SD-WAN Edge may not properly set Gateway 1 as the local Gateway to be used for bandwidth testing.

  • Issue 39608:

    The output of the Remote Diagnostic “Ping Test” may display invalid content briefly before showing the correct results.

    Workaround: There is no workaround for this issue.

  • Issue 39624:

    Ping through a subinterface may fail when the parent interface is configured with PPPoE.

  • Issue 39753:

    Toggling Dynamic Branch-to-Branch VPN to off may cause existing flows currently being sent using Dynamic Branch-to-Branch to stall.

    Workaround: Only deactivate Dynamic Branch-to-Branch VPN in a maintenance window.

  • Issue 40421:

    Traceroute is not showing the path when passing through a VMware SD-WAN Edge with an interface configured as a switched port.

  • Issue 40096:

    If an activated VMware SD-WAN Edge 840 is rebooted, there is a chance an SFP module plugged into the Edge will stop passing traffic even though the link lights and the VMware SD-WAN Orchestrator will show the port as 'UP'. 

    Workaround: Unplug the SFP module and then replug it back into the port.

  • Issue 42278:

    For a specific type of peer misconfiguration, the VMware SD-WAN Gateway may continuously send IKE init messages to a Non-SD-WAN peer. This issue does not disrupt user traffic to the Gateway; however, the Gateway logs will be filled with IKE errors and this may obscure useful log entries.

  • Issue 42872:

    Activating Profile Isolation on a Hub profile where a Hub Cluster is associated does not revoke the Hub routes from the routing information base (RIB).

  • Issue 43373:

    When the same BGP route is learnt from multiple VMware SD-WAN Edges, if this route is moved from preferred to eligible exit in the Overlay Flow Control, the Edge is not removed from the advertising list and continues to be advertised.

    Workaround: Activate Distributed Cost Calculation (DCC) on the VMware SD-WAN Orchestrator.

  • Issue 44995:

    OSPF routes are not revoked from VMware SD-WAN Gateways and VMware SD-WAN Spoke Edges when the routes are withdrawn from the Hub Cluster.

  • Issue 45189:

    When source LAN side NAT is configured, the traffic from a VMware SD-WAN Spoke Edge to a Hub Edge is allowed even without the static route configuration for the NAT subnet.

  • Issue 45302:

    In a VMware SD-WAN Hub Cluster, if one Hub loses connectivity for more than 5 minutes to all of the VMware SD-WAN Gateways common between itself and its assigned Spoke Edges, the Spokes may in rare conditions be unable to retain the hub routes after 5 minutes. The issue resolves itself when the Hub regains contact with the Gateways.

  • Issue 46053:

    BGP preference does not get auto-corrected for overlay routes when its neighbor is changed to an uplink neighbor.

    Workaround: An Edge Service Restart will correct this issue.

  • Issue 46137:

    A VMware SD-WAN Edge running 3.4.x software does not initiate a tunnel with AES-GCM encryption even if the Edge is configured for GCM.

    Workaround: If a customer is using AES-256, they must explicitly disable GCM from the Orchestrator prior to upgrading their Edges to a 4.x Release. Once all their Edges are running a 4.x release, the customer may choose between AES-256-GCM and AES-256-CBC.

  • Issue 46216:

    On a Non SD-WAN Destinations via Gateway or Edge where the peer is an AWS instance, when the peer initiates Phase-2 re-key, the Phase-1 IKE is also deleted and forces a re-key.  This means the tunnel is torn down and rebuilt, causing packet loss during the tunnel rebuild.

    Workaround: To avoid tunnel destruction, configure the Non SD-WAN Destinations via Gateway/Edge or CSS IPsec rekey timer to less than 60 minutes.  This prevents AWS from initiating the re-key.

  • Issue 46391:

    For a VMware SD-WAN Edge 3800, the SFP1 and SFP2 interfaces each have issues with Multi-Rate SFPs (i.e. 1/10G) and should not be used in those ports.

    Workaround: Please use single rate SFPs per the KB article VMware SD-WAN Supported SFP Module List (79270). Multi-Rate SFPs may be used with SFP3 and SFP4.

  • Issue 47664:

    In a Hub and Spoke configuration where Branch-to-Branch via Hub VPN is not configured, trying to U-turn Branch-to-Branch traffic using a summary route on an L3 switch/router will cause routing loops.

    Workaround: Configure Cloud VPN to activate Branch-to-Branch VPN and select “Use Hubs for VPN”.

  • Issue 47681:

    When a host on the LAN side of a VMware SD-WAN Edge uses the same IP as that Edge’s WAN interface, the connection from the LAN host to the WAN does not work.

  • Issue 48530:

    VMware SD-WAN Edge 6x0 models do not perform autonegotiation for triple speed (10/100/1000 Mbps) copper SFPs.

    Workaround: Edge 520/540 supports triple speed copper SFPs but this model has been marked for End-of-Sale by Q1 2021.

  • Issue 48597: Multihop BGP neighborship does not stay up if one of the two paths to the peer goes down.

    If there is a Multihop BGP neighborship with a peer to which there are multiple paths and one of them goes down, the user will notice that the BGP neighborship goes down and does not come up using the other available path(s). This includes the Local IP-loopback neighborship case too.

    Workaround: There is no workaround for this issue.

  • Issue 50518:

    On a VMware SD-WAN Gateway where PKI is configured, if >6000 PKI tunnels attempt to connect to the Gateway, the tunnels may not all come up because inbound SAs do not get deleted.

    Note: Tunnels using pre-shared key (PSK) authentication do not have this issue.

  • Issue 51436: For a site using an Enhanced High-Availability topology while deploying a VMware SD-WAN Edge using an LTE modem, if the site gets into a "split-brain" state, the HA failover takes ~5-6 minutes.

    As part of the recovery from a split-brain state, the LAN ports are brought down on the Active Edge and this impacts LAN traffic during the time the ports are down and until the site can recover.

    Workaround: There is no workaround for this issue.

  • Issue 52955: DHCP decline is not sent from Edge and DHCP rebinding is not restarted after DAD failure in Stateful DHCP.

    If DHCPv6 server allocates an address which is detected as duplicate by the kernel during a DAD check then the DHCPv6 client does not send a decline. This will lead to traffic dropping as the interface address will be marked as DAD check failed and will not be used. This will not lead to any traffic looping in the network but traffic blackholing will be seen.

    Workaround: There is no workaround for this Issue.

  • Issue 53219: After a VMware SD-WAN Hub Cluster rebalances, a few Spoke Edges may not have their RPF interface/IIF set properly.

    On the affected Spoke Edges, multicast traffic will be impacted. After a cluster rebalance, some of the Spoke Edges fail to send a PIM join.

    Workaround: This issue will persist until the affected Spoke Edge has an Edge Service restart.

  • Issue 53337: Packet drops may be observed with an AWS instance of a VMware SD-WAN Gateway when the throughput is above 3200 Mbps.

    When traffic exceeds a throughput of 3200 Mbps with a packet size of 1300 bytes, packet drops are observed at RX and at IPv4 BH handoff.

    Workaround: There is no workaround for this issue.

  • Issue 53934: In an enterprise where a VMware SD-WAN Hub Cluster is configured, if the primary Hub has Multihop BGP neighborships on the LAN side, the customer may experience traffic drops on a Spoke Edge when there is a LAN side failure or when BGP is not configured on all segments.

    In a Hub cluster, the primary Hub has Multihop BGP neighborship with a peer device to learn routes. If the physical interface on the Hub by which BGP neighborship is established goes down, then BGP LAN routes may not become zero despite the BGP view being empty. This may cause Hub Cluster rebalancing to not happen. The issue may also be observed when BGP is not configured for all segments and when there are one or more Multihop BGP neighborships.

    Workaround: Restart the Hub which had the LAN-side failure (or BGP not activated).

  • Issue 57210: Even when a VMware SD-WAN Edge is working normally and is able to reach the internet, the LED in the Local UI's Overview page shows as "Red".

    The Edge's Local UI determines the Edge's connectivity by whether it can resolve a well-known name via Google's DNS resolver (8.8.8.8). If it cannot do so for any reason, then it thinks it is offline and shows the LED as red.

    Workaround: There is no workaround for this issue, except to ensure that DNS traffic to 8.8.8.8 can reach the destination and be resolved successfully.
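
    To confirm from a host behind the Edge that DNS traffic to 8.8.8.8 is answered, a manual check along the following lines can be used. This is a minimal sketch built on the Python standard library, not the Local UI's own code; the well-known name the Edge resolves is not stated here, so any name passed in (for example, example.com) is an assumption.

```python
import socket
import struct

QUERY_ID = 0x1234  # arbitrary identifier used to match the response to the query


def build_dns_query(name: str, query_id: int = QUERY_ID) -> bytes:
    """Build a minimal DNS query for an A record with recursion desired."""
    # Header: ID, flags (RD=1), QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    )
    # Question: QNAME ended by a zero byte, then QTYPE=A (1) and QCLASS=IN (1).
    return header + labels + b"\x00" + struct.pack("!HH", 1, 1)


def can_resolve(name: str, server: str = "8.8.8.8", timeout: float = 2.0) -> bool:
    """Return True if the server answers the query with at least one record."""
    query = build_dns_query(name)
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.settimeout(timeout)
            sock.sendto(query, (server, 53))
            response, _ = sock.recvfrom(512)
    except OSError:  # includes timeouts and unreachable-network errors
        return False
    if len(response) < 12:
        return False
    resp_id, flags, _, ancount, _, _ = struct.unpack("!HHHHHH", response[:12])
    return resp_id == QUERY_ID and (flags & 0x000F) == 0 and ancount > 0
```

    Calling can_resolve("example.com") returns False when the query is blocked, times out, or yields no answer, which corresponds to the condition under which the Local UI is described as showing the LED as red.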

  • Issue 61543: If more than one 1:1 NAT rule is configured on different interfaces with the same Inside IP, the inbound traffic can be received on one interface and the outbound packets of the same flow can be routed via a different interface.

    For the NAT flows from Outside to Inside, the 1:1 NAT rules will be matched against the Outside IP and the interface where the packets are received. For the outbound packets of the same flow, the VMware SD-WAN Edge will try to match the NAT rules again comparing the Inside IP and the outbound traffic can be routed via the interface configured in the first matching rule with "Outbound Traffic" configured.

    Workaround: There is no workaround for this issue outside of ensuring no more than one 1:1 NAT rule is configured with a particular Inside IP address.
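
    A rule set can be screened for this condition by grouping the 1:1 NAT rules by Inside IP. The following is a minimal sketch; the rule dictionary layout, interface names, and addresses are hypothetical and do not reflect the Orchestrator's configuration schema.

```python
from collections import defaultdict
from typing import Dict, List


def find_shared_inside_ips(rules: List[Dict[str, str]]) -> Dict[str, List[str]]:
    """Map each Inside IP used by more than one 1:1 NAT rule to its interfaces."""
    interfaces_by_ip: Dict[str, List[str]] = defaultdict(list)
    for rule in rules:
        interfaces_by_ip[rule["inside_ip"]].append(rule["interface"])
    return {ip: ifaces for ip, ifaces in interfaces_by_ip.items() if len(ifaces) > 1}


# Hypothetical rules: two interfaces map different Outside IPs to 10.0.1.10.
rules = [
    {"interface": "GE3", "outside_ip": "203.0.113.10", "inside_ip": "10.0.1.10"},
    {"interface": "GE4", "outside_ip": "198.51.100.10", "inside_ip": "10.0.1.10"},
    {"interface": "GE3", "outside_ip": "203.0.113.11", "inside_ip": "10.0.1.20"},
]
print(find_shared_inside_ips(rules))  # {'10.0.1.10': ['GE3', 'GE4']}
```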

  • Issue 62701: For a VMware SD-WAN Edge deployed as part of an Edge Hub Cluster, if Cloud VPN is not activated under the Global Segment but is activated under a Non-Global Segment, a control plane update sent by the Orchestrator may cause all the WAN links to flap on the Hub Edge.

    The Hub Edge's WAN links going down, then up in rapid succession (flap) will impact real time traffic like voice calls. This issue was observed on a customer deployment where Cloud VPN was not activated on the Hub Edge's Global segment, but the Cluster configuration was configured as on, which means this Hub Edge was part of a Cluster (and a Cluster configuration is applicable to all segments). When a configuration change is pushed to the Hub Edge, the Hub Edge's dataplane will start parsing data and will start with the Global Segment where it will see Cloud VPN not activated and the Hub Edge erroneously thinks clustering has been deactivated on this Global Segment. As a result, the Hub Edge will tear down all tunnels from the Hub's WAN link(s) which will cause link flaps on all that Edge's WAN links. For any such incident the WAN links only go down and recover a single time per control plane update.

    Workaround: The workaround is to activate Cloud VPN on all segments, meaning the Global Segment and all Non-Global Segments.

  • Issue 65560: Traffic from a customer to PE (Provider Edge) device fails.

    BGP neighborship between a Partner Gateway and Provider Edge does not get established when tag-type is selected as "none" on the handoff configuration. This is because ctag, stag values get picked from /etc/config/gatewayd instead of the handoff configuration on the Orchestrator when tag-type is "none".

    Workaround: Update the ctag, stag values to 0 each under vrf_vlan->tag_info in /etc/config/gatewayd. Do a vc_procmon restart.

  • Issue 67879: A Cloud Security Service (CSS) tunnel is deleted after a user changes a WAN Overlay setting from auto-detect to user-defined on a WAN interface setting.

    After saving the changes, the CSS tunnels do not come back up until the customer takes down and then puts back up the tunnel. Changing the WAN configuration will bring down the CSS tunnel and parse the CSS setup again. However, in some corner cases, the nvs_config->num_gre_links is 0 and the CSS tunnel fails to come up.

    Workaround: Deactivate the CSS setup, and then reactivate it and this will bring the CSS tunnel up.

  • Issue 68057: DHCPv6 release packet is not sent from the VMware SD-WAN Edge on the changing of a WAN interface address mode from DHCP stateful to static IPv6 address and the lease remains active till reaching its valid time.

    The DHCPv6 client possesses a lease which it does not release when the configuration change is done. The lease remains valid till its lifetime expires in the DHCPv6 server and is deleted.

    Workaround: There is no way of remediating this issue as the lease would remain active till valid lifetime.

  • Issue 68851: If a VMware SD-WAN Edge and VMware SD-WAN Gateway each have the same TCP syslog server configured, the TCP connection is not established from the Edge to the syslog server.

    If the Edge and Gateway each have the same TCP server and if the syslog packets from the Edge are routed via the Gateway, the syslog server sends a TCP reset to the Edge.

    Workaround: Send the syslog packets direct from the Edge instead of routing via a Gateway or configure a different syslog server for the Edge and Gateway.

  • Issue 69284: For a site using a High-Availability topology where the Edges deploy VNFs in an HA configuration and are using Release 4.x, if these HA Edges are downgraded to a 3.4.x Release where HA VNFs are not supported, and then upgraded to 4.5.0, when the HA VNFs are reactivated, the Standby Edge VNF will not come up.

    The VNF state on the Standby Edge is communicated as down via SNMP. The issue occurs if the HA VNF pair is downgraded from a version supporting VNF-HA (Release 4.0+) to a release which does not support it while VNF is still configured on the Orchestrator. The issue is then seen when the Edge is upgraded back to a version supporting VNF-HA and VNF is configured on the Orchestrator again.

    Workaround: VNF should first be deactivated in the case of an HA configuration if the Edge is being downgraded to a version which does not support it.

  • Issue 71719: PPTP Connection is not Established along Edge to Cloud path.

    Connection to the PPTP server behind the VMware SD-WAN Edge does not get established.

    Workaround: There is no workaround for this issue, not even an Edge restart or reboot.

  • Issue 72358: If the IP address of a VMware SD-WAN Orchestrator DNS name changes, the VMware SD-WAN Gateway's management plane process fails to resolve it properly and the Gateway will be unable to connect to the Orchestrator.

    The Gateway's management process periodically checks the DNS resolution of the Orchestrator's DNS name, to see if it has changed recently so that the Gateway can connect to the right host. The DNS resolution code has an issue in it so that all of these resolution checks fail, and the Gateway will keep using the old address and thus no longer be able to connect to the Orchestrator.

    Workaround: Until this issue is resolved, an Operator User should not change the IP address of the Orchestrator. If the Orchestrator's IP address must be changed, all Gateways connecting to that Orchestrator will have to be reactivated.

  • Issue 77541: When a USB modem that supports IPv6 is unplugged and then replugged into a VMware SD-WAN Edge USB interface, an IPv6 address may not be provisioned to the USB interface.

    This affects USB modems that are IP-based, versus being managed by the ModemManager application. Most Inseego modems are IP-based and this is important because Inseego is the modem manufacturer VMware SASE recommends. USB modems supporting IPv6 which use ModemManager versus being IP-based will be fine in a plug out and plug in scenario.

    Workaround: The Edge needs to be rebooted (or power-cycled) after the USB modem is replugged into the Edge's USB port. Post reboot, the Edge will retrieve the IPv6 address for the modem.

  • Fixed Issue 81181: The changes a user makes in Configure > Edge > Device may not be applied to the VMware SD-WAN Edge, even though the Edge is connected to the VMware SASE Orchestrator.

    Under heavy system load and with frequent rapid configuration changes, the Edge's management configuration thread can get stuck, and the Edge does not apply any further configuration changes.

    A user can restart the Edge service to clear a particular instance of the issue, but the issue can recur if the conditions are met at a later time.

  • Issue 81852: For a VMware SD-WAN Edge using a Zscaler type Cloud Security Service (CSS) over GRE tunnels with L7 Health Check turned on, when that Edge is upgraded to Release 5.0.0, in some instances the customer may observe L7 Health Check errors.

    This is typically seen during a software upgrade or at startup. When L7 Health Check for a CSS using GRE tunnels is turned on, error messages related to a socket getaddress error may be seen. The error is intermittent rather than consistent. Because of this error, L7 Health Check probe messages are not sent out.

    Workaround: Without the fix, to remediate the issue, a user needs to turn off and then turn back on the L7 Health Check configuration, and this feature would then work as expected.

  • Issue 82184: On a VMware SD-WAN Edge which is running Edge Release 5.0.0, when a traceroute or traceroute6 is run to the Edge's br-network IPv4/IPv6 address, the traceroute will not properly terminate when a UDP probe is used.

    Traceroute or traceroute6 to the Edge's br-network IPv4/IPv6 address will not properly terminate when Default Mode (in other words, UDP probe) is used.

    Workaround: Use the -I option in traceroute and traceroute6 to use an ICMP probe. A traceroute to the br-network IPv4/IPv6 address will then work as expected.

  • Issue 83227: For a VMware SD-WAN Edge running Release 5.0.0 which is configured with 128 Segments, the Edge's dnsmasq process will stop and exit.

    When IPv6 is activated on 128 segments and DHCPv6 servers are configured in the LAN of each segment, the dnsmasq process will stop because the limit on open file descriptors is exceeded. The dnsmasq process will continue for ~30 minutes before exiting, at which point the Edge's DHCP assignment of IP addresses will fail.

    Workaround: Rebooting the Edge restores the dnsmasq process for ~30 minutes but it will fail again. The only real workaround is to reduce the number of segments to less than 128.

  • Issue 84790: When a VMware SD-WAN Edge with any model type other than 510/510-LTE is rebooted, the Edge may erroneously report the critical event Unable to launch service wifihang to the VMware SASE Orchestrator.

    The wifihang event message is designed for use only with the Edge 510/510-LTE models and alerts a customer to a problem with that Edge model's Wi-Fi process. When this event message is observed on any other Edge model, whether that model uses Wi-Fi or not (for example: the Edge 3400), the event message is spurious and the event can be safely ignored.

    Workaround: A user can safely ignore the wifihang event message on any Edge other than an Edge 510 or 510-LTE as it is spurious.

  • Issue 86098: For a site using an Enhanced High-Availability topology where a PPPoE WAN link is used on the Standby Edge, a user may observe that the default proxy route is not installed in the Active Edge and traffic using that link fails.

    When an Enhanced HA Edge pair comes up, the PPPoE link synchronizes with the Standby Edge and provides a default route with a next hop of 0.0.0.0. As a result, this route is not installed on the Active Edge and traffic using this link is dropped.

    Workaround: There is no workaround for this issue.

  • Issue 90884: For a customer enterprise using a Hub Cluster/Spoke topology, when a Hub Edge in the Cluster is reassigned to one or more Spoke Edges, the users at those Spoke Edge locations may experience traffic failure.

    Hub Edges in a Cluster can be reassigned as part of a Cluster rebalance when an enterprise upgrades their Edge software, so this issue may be observed post-upgrade. When this issue is encountered, the VMware SD-WAN Gateway does not send the new Spoke Edge routes to the Hub Edge because the Gateway is expecting all Hub Edges to have all Spoke Edge routes, and thus these routes are not in the Hub Edge's routing table. As a result traffic between the Spoke Edges and the Hub Edge in the Cluster is impacted because the forwarding path is down.

    Workaround: If the issue is encountered in an enterprise not using Gateways with a fix for this issue, it can be temporarily resolved by performing a route reinit on the Hub Edge.

  • Issue 92481: If a WAN interface on a VMware SD-WAN Edge is deactivated on the VMware SASE Orchestrator, the interface will still be reported as 'UP' by SNMP.

    The key debug process for interfaces output does not include the physical port details for Edge WAN interfaces (for example, GE3 or GE4 on an Edge 6x0 or 3x00 model). As a result when SNMP polls those interfaces it always returns a result of UP regardless of how these interfaces are configured.

    Workaround: There is no workaround for this issue.

  • Issue 93062: When a user runs the Remote Diagnostic "Interface Status" on the VMware Orchestrator, the Orchestrator either returns an error for that test and does not complete or the test does not return results for routed interfaces.

    The error message seen is "error reading data for test". If the test does complete, the results for routed interfaces are empty with no information about speed or duplex. Either way the Interface Status is broken. The issue is related to the debug command that underlies Interface Status omitting DPDK activated ports.

    Workaround: The user would need to generate a diagnostic bundle for the Edge to see the status for routed interfaces.

  • Issue 93141: On a site deployed with a High Availability topology, a customer using an L2 switch upstream of the HA Edge pair may observe in the switch logs evidence of an L2 traffic loop, though there is no actual loop.

    The issue is caused by the HA Edge sending the HA interface heartbeat with the Virtual MAC address to the Orchestrator instead of the interface's actual MAC address, which is caused by the HA Edge storing the Virtual MAC address in its MAC file. As a result, the connected L2 switch detects traffic from the same source MAC coming from two different Edge interfaces and would log it as an L2 loop. This issue is cosmetic at the log level as there is no actual L2 loop and there is no customer traffic disruption or loss of contact with the Orchestrator arising from this issue.

    Workaround: The customer can ignore L2 loop detection events from upstream switches that arise out of the Edge's HA interface (usually GE1). 

  • Issue 94204: A user may observe that attempts to generate a diagnostic bundle for a VNF capable VMware SD-WAN Edge fail.

    A diagnostic bundle fails to complete on a VNF capable Edge because the Edge runs out of disk space. This can happen if the Edge has generated one or more cores and is caused by the Edge sending these cores to the /vnf/tmp folder. Each core is unpacked in the /vnf/tmp folder and, due to its unpacked size, quickly fills this folder, which causes the diagnostic bundle to fail.

    VNF (Virtual Network Function) capable Edges include the following models: 520v, 620, 640, 680, and 840.

    Workaround: There is no workaround for this issue.

  • Issue 94980: For a site deployed with a High Availability topology, the VMware SD-WAN Standby Edge may experience a Dataplane Service error and restart after a PPPoE WAN link is configured for the HA Edges.

    When examining the core generated by the Standby Edge, a user would see the message vc_is_use_cloud_gateway_set after the PPPoE link is configured.

    Workaround: There is no workaround for this issue beyond configuring PPPoE links in a maintenance window to manage the risk of this action.

  • Issue 95565: On a site using a High Availability topology, the VMware SD-WAN Active Edge may experience a Dataplane Service failure with a core generated and triggering a High Availability failover.

    The issue is triggered by the Active Edge's WAN links flapping one or more times (going down and then coming up rapidly) while SNMP is also in use with frequent SNMP queries. There is a timing issue where the interface coming back up and the SNMP query together can trigger a deadlock which causes the Dataplane Service to fail and generate a core. While only a single WAN link flap can cause this issue, the greater the frequency of WAN link flaps, the greater the potential for this issue to occur.

    Workaround: On an HA Edge pair that experiences this issue and does not have the fix, the workaround is to disable SNMP as this is a timing issue and this reduces the risk.

  • Issue 98136: For customer enterprises using a Hub/Spoke topology where Dynamic Branch To Branch VPN is configured, client users behind an SD-WAN Spoke Edge may observe that some traffic has unexpected latency resulting from the traffic using a sub-optimal path.

    Spoke Edge traffic that experiences this issue uses a route that was initially a non-uplink route for a Hub Edge not included in the Profile the Spoke Edge was using. A Dynamic Branch To Branch VPN tunnel can be formed from the Spoke Edge to the Hub Edge because of traffic being sent towards some other unrelated prefix and in this instance the non-uplink route is installed in the Spoke Edge.

    As a result of this non-uplink route, all traffic towards this prefix starts going through the Hub Edge and the non-uplink route becomes uplink (community change to uplink community) but the non-uplink route installed previously is not revoked and the traffic takes the Hub Edge path as long as the Dynamic Branch To Branch VPN tunnel remains up.

    Workaround: Wait for the Dynamic Branch To Branch VPN tunnel to tear down, after which the uplink route will not be installed in the Spoke Edge when a new Dynamic Branch To Branch VPN tunnel is formed towards the Hub Edge.

  • Issue 98694: When a customer enterprise is configured with redundant static routes, if the primary route goes down, the alternate route(s) is not advertised, and traffic is dropped.

    When an interface goes down on a VMware SD-WAN Edge, the alternate routes are not advertised to the VMware SD-WAN Gateway even though the routes via the interface are now unreachable. Routes for the prefix will not be present in the Gateway even though there are alternate routes through other interfaces for those prefixes on the Edge. The issue is because the SD-WAN service sends a route delete without checking if there is an alternate reachable static route while handling an interface down.

    Workaround: An Edge Service Restart through the Orchestrator would recover the issue temporarily.

  • Issue 106160: When a VMware SD-WAN Edge is configured as a DNS server and a next hop Gateway is defined for the interface on which clients query the DNS server, the clients receive no response.

    The DNS request packet is received by the DNS server, however the reply packet does a route table lookup based on IPtables connection tracking and finds the next hop Gateway IP address and resolves the MAC address. The end result is the DNS reply packet will use the MAC address of the Gateway, not the sender.

    Workaround: There is no workaround for this issue.

  • Issue 107994: On a customer enterprise where there are Secure Edge Access users with a privileged level, High Availability operations like running Remote Diagnostics > HA Info from the VMware SASE Orchestrator UI, or logging into the Standby Edge may fail.

    When a privileged level Secure Edge Access user is provisioned, the root account is completely blocked. The problem is that High Availability operations rely on communicating with the Standby Edge as root, and since the root account is completely blocked, the result is that any HA operations performed do not work. The issue exists regardless of the user role (Superuser, Standard, or any other role).

    Workaround: The customer has two options to work around this issue.

    1. They can switch back to password-based authentication.

    2. They can delete all privileged Secure Edge Access users or change them to basic users.

  • Issue 110564: For a customer site deployed in an Enhanced High Availability topology, the TCP session used to synchronize data between the Active and Standby Edge may go down with the result that WAN link traffic is not forwarded on the Standby Edge.

    There could be a scenario where a child process is using the port intended for TCP sessions between the Active and Standby Edge. In this scenario, the Active Edge cannot bring up the TCP server due to bind errors. As a result, the Standby Edge's interface state is not exchanged and its WAN link(s) cannot be used for forwarding traffic.

    Workaround: There is no workaround for this issue.

  • Issue 117565: Users on a customer enterprise configured with a Partner Gateway may observe that MultiPath traffic (traffic that traverses a VMware SD-WAN Gateway) drops.

    Traffic going direct to the internet/cloud or Hub-to-Spoke traffic is not affected as this traffic does not use a Gateway. The issue is triggered when the Partner Handoff is deactivated for the Gateway, which results in all Gateway IPsec (VCMP) tunnels going down for that Gateway to the customer enterprise. The issue is caused by the Gateway handoff IP address not being cleared after the handoff is deactivated and the Gateway continues to perform the same subnet check with this now invalid handoff IP address.

    Workaround: Rebooting the Gateway will resolve a particular instance of the issue but does not prevent recurrences under the same conditions.

  • Issue 118704: A user may observe abnormally high latency values for paths measured between SD-WAN Edges and SD-WAN Gateways even though actual Edge-to-Gateway packet latency is much lower.

    A race condition has been identified with clock synchronization resulting in latency values measured incorrectly. This issue is cosmetic and there is no performance impact to customer traffic but it does negatively impact a customer's ability to properly monitor Edge links and paths.

    Workaround: Clock synchronization can be reset by restarting the Edge Service. This can be done using the Orchestrator UI by navigating to the Diagnostics > Remote Actions page, and then checking the affected Edge and selecting the Restart Service option.

  • Issue 125509: A customer enterprise using lower end VMware SD-WAN Edge models may experience flaps for BFD, BGP, or OSPF, depending on the routing protocol being used.

    On entry level Edge platforms (510, 520, 540, 610, and 620) at a high flow scale and coupled with dynamic routing and/or High Availability configuration, OSPF/BGP routing flaps may be observed when aggressive Hello and Dead interval timers are configured. In addition, if the customer also uses Edge Network Intelligence with Analytics turned on, the potential to encounter this issue increases.

    Workaround: If experiencing this issue, the workaround is to revert to default interval timers for OSPF (10, 40) or BGP (60, 180) or disable BFD entirely.

  • Issue 134374: Inbound firewall rules associated with a VMware SD-WAN Edge CELL interface are not applied as expected in a sequence.

    If the Edge CELL interface has either not yet been created or is not up (for example, no SIM card inserted), when the Firewall configuration is parsed, then the CELL interface value does not get populated correctly for the Firewall rule and traffic does not match the rule.

    Workaround: Delete the rule and recreate it after the CELL interface is up.

  • Issue 142366: For a customer enterprise site connected to Partner Gateways where one or more static routes are configured, client users working behind an SD-WAN Edge may observe intermittent traffic loss if a static route via the Primary Partner Gateway is unreachable.

    When the same static route is reachable via two or more Partner Gateways, if the route via the Partner Gateway in the Primary role is unreachable, traffic from an Edge can experience intermittent traffic loss. This issue is the result of the Edge API failing to properly check for reachability on a route lookup which causes the Edge to continue to use the Primary Partner Gateway even though reachability is false.

    Workaround: The issue can be temporarily remediated by the Partner shutting down the Primary Partner Gateway until the static route becomes reachable again. Shutting down the Primary Partner Gateway prevents the Edge from including it in route reachability lookups and ensures traffic matching that static route uses a secondary Partner Gateway. However this can be disruptive for all customers using this Partner Gateway as their Primary Gateway and should be done in a maintenance window by the Partner if possible.

  • Issue 145393: A customer enterprise site deployed with an Edge model 620, 640, or 680 where firewall logging is configured may observe that the Edge no longer stores either firewall or standard debugging logs.

    When this issue is encountered, a 6x0 Edge's eMMC storage experiences an excessive level of wear due to the high volume of writes and rewrites that can be triggered by enabling logging for firewall rules which are matched by a large number of new connections per second in a high traffic customer environment. This issue results in the Edge defensively moving the file partition which hosts logging to a read-only state, and no additional logs are stored.

    Workaround: If a customer has an Edge 620, 640, or 680 model and is also using firewall logging, they should avoid enabling logging for firewall rules which can potentially match a large number of new connections in a high traffic environment. The excessive logging frequency that would result can cause undue wear on the Edge's storage and trigger this issue.
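
For Issue 82184 above, the workaround amounts to selecting traceroute or traceroute6 by address family and passing the -I option. The following is a minimal Python sketch of that selection, assuming a Linux host where both tools are installed. The function name icmp_traceroute_command is hypothetical and is not part of any VMware tooling, and the addresses in the usage note come from the reserved documentation ranges.

```python
import ipaddress
from typing import List


def icmp_traceroute_command(address: str) -> List[str]:
    """Build the argument list for an ICMP-probe traceroute.

    Hypothetical helper: picks traceroute6 for IPv6 targets and
    traceroute for IPv4 targets, then adds -I to request ICMP probes
    instead of the default UDP probes.
    """
    # ip_address() raises ValueError if the string is not a valid IP address.
    ip = ipaddress.ip_address(address)
    tool = "traceroute6" if ip.version == 6 else "traceroute"
    return [tool, "-I", str(ip)]
```

For example, icmp_traceroute_command("192.0.2.1") returns the arguments for traceroute -I 192.0.2.1, and icmp_traceroute_command("2001:db8::1") returns those for traceroute6 -I 2001:db8::1. The list can be handed to subprocess.run. ICMP probes may require elevated privileges on some systems.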

Orchestrator Known Issues

  • Issue 21342:

    When assigning Partner Gateways per-segment, the proper list of Gateway Assignments may not show under the Operator option "View" Gateways on the VMware SD-WAN Edge monitoring list.

  • Issue 24269:

    Monitor > Transport > Loss does not graph observed WAN link loss, while QoE graphs do reflect this loss.

  • Issue 25932:

    The VMware SD-WAN Orchestrator allows VMware SD-WAN Gateways to be removed from the Gateway Pool even when they are in use.

  • Issue 32335:

    The ‘End User Service Agreement’ (EUSA) page throws an error when a user is trying to accept the agreement.

    Workaround: Ensure no leading or trailing spaces are found in Enterprise Name.

  • Issue 32435:

    A VMware SD-WAN Edge override for a policy-based NAT configuration is permitted for tuples which are already configured at the profile level and vice versa.

  • Issue 32856:

    Though a business policy is configured to use the Hub cluster to backhaul internet traffic, the user can unselect the Hub cluster from a profile on a VMware SD-WAN Orchestrator that has been upgraded from Release 3.2.1 to Release 3.3.x.

  • Issue 35658:

    When a VMware SD-WAN Edge is moved from one profile to another which has a different CSS setting (e.g. IPsec in profile1 to GRE in profile2), the Edge level CSS settings will continue to use the previous CSS settings (e.g. IPsec versus GRE). 

    Workaround: At the Edge level, deactivate GRE, and then reactivate GRE to resolve the issue.

  • Issue 35667:

    When a VMware SD-WAN Edge is moved from one profile to another profile which has the same CSS setting but a different GRE CSS name (the same endpoints), some GRE tunnels will not show in monitoring.

    Workaround: At the Edge level, deactivate GRE and then reactivate GRE to resolve the issue.

  • Issue 36665:

    If the VMware SD-WAN Orchestrator cannot reach the internet, user interface pages that require accessing the Google Maps API may fail to load entirely.

  • Issue 32913:

    After activating High Availability, multicast details for the VMware SD-WAN Edge are not displayed on the Monitoring Page. A failover resolves the issue.

  • Issue 33026:

    The ‘End User Service Agreement’ (EUSA) page does not reload properly after deleting the agreement.

  • Issue 38056:

    The Edge-Licensing export.csv file does not show region data.

  • Issue 38843:

    When pushing an application map, there is no Operator event, and the Edge event is of limited utility.

  • Issue 39633:

    The Super Gateway hyperlink does not work after a user assigns the Alternate Gateway as the Super Gateway.

  • Issue 39790:

    The VMware SD-WAN Orchestrator allows a user to configure a VMware SD-WAN Edge’s routed interface to have greater than the supported 32 subinterfaces, creating the risk that a user can configure 33 or more subinterfaces on an interface which would cause a Dataplane Service Failure for the Edge.

  • Issue 41691:

    User cannot change the 'Number of addresses' field although the DHCP pool is not exhausted on the Configure > Edge > Device page.

  • Issue 43276:

    User cannot change the Segment type when a VMware SD-WAN Edge or Profile has a Partner Gateway configured.

    Workaround: Temporarily remove the Partner Gateway configuration from the Profile or Edge so that the Segment can be changed between private and regular. Alternatively, the user can remove the Segment from the profile and make the change from there.

  • Issue 47713:

    If a Business Policy Rule is configured while Cloud VPN is toggled off, the NAT configuration must be reconfigured upon turning on Cloud VPN.

  • Issue 47820:

    If a VLAN is configured with DHCP toggled off at the Profile level, while also having an Edge Override for this VLAN on that Edge with DHCP activated, and there is an entry for the DNS server field set to none (no IP configured), the user will be unable to make any changes on the Configure > Edge > Device page and will get an error message of ‘invalid IP address []’ that does not explain or point to the actual problem.

  • Issue 48085: The VMware SD-WAN Orchestrator allows a user to delete a VLAN which is associated with an interface.

    When encountering this issue, the user would see an error message similar to "VLAN ID [xx] cannot be removed, in use by edge [b1-edge1 (GEx-disabled]".

  • Issue 51722: On the VMware SASE Orchestrator, the time range selector is no greater than two weeks for any statistic in the Monitor > Edge tabs.

    The time range selector does not show options greater than "Past 2 Weeks" in Monitor > Edge tabs even if the retention period for a set of statistics is much longer than 2 weeks. For example, flow and link statistics are retained for 365 days by default (which is configurable), while path statistics are retained only for 2 weeks by default (also configurable). This issue is making all monitor tabs conform to the lowest retained type of statistic versus allowing a user to select a time period that is consistent with the retention period for that statistic.

    Workaround: A user may use the "Custom" option in the time range selector to see data for more than 2 weeks.

  • Issue 60522: On the VMware SD-WAN Orchestrator UI, the user observes a large number of error messages when they try to remove a segment.

    The issue can be observed when adding a segment to a profile and then associating the segment with multiple VMware SD-WAN Edges. When the user attempts to remove the added segment from the profile, they will see a large number of error messages.

    Workaround: There is no workaround for this issue.

  • Issue 82680: For customers using MT-GRE Tunnel Automation, when a user turns off the Cloud-to-Cloud Interconnect (CCI) flag on a VMware SD-WAN Gateway which is configured to use CCI, the Zscaler MT-GRE entries may not get deleted from the Zscaler portal consistently.

    After a CCI site has been deleted from the Gateway, the entries for this site should also be removed. This issue has only been seen during test automation and has not been reproduced manually, but remains a risk.

    Workaround: Manually delete the resource from Zscaler before retrying.

  • Issue 82681: For customers using MT-GRE Tunnel Automation, when a user turns off the Cloud-to-Cloud Interconnect (CCI) flag on a VMware SD-WAN Gateway which is configured to use CCI, and the user deactivates the CCI flag from a VMware SD-WAN Edge with CCI configured which is using a Zscaler Cloud Security Service, the Zscaler MT-GRE entries may not get deleted from the Edge or from the Zscaler portal.

    After a CCI site has been deleted from the Gateway, the entries for this site should also be removed. This issue has only been seen during test automation and has not been reproduced manually, but remains a risk.

    Workaround: Manually delete the resource from Zscaler before retrying.

  • Issue 97055: For a user logged onto the VMware SASE Orchestrator as either a Partner or an Enterprise Administrator, if that user goes to the Configure > Edges page of the New UI, they would observe that they do not have the Assign Software/Firmware Image option in the dropdown menu when one or more Edges is selected.

    The result is that a user cannot assign software or firmware to an individual or group of Edges when using the New UI. The issue is caused by the incorrect privileges being assigned for Partner and Enterprise users on the New UI only, as these privileges are correct on the Classic UI.

    Workaround: On a 5.0.1 Orchestrator, the user can optionally switch to the Classic UI and perform the task. This issue is fixed on any Orchestrator using version 5.1.0 or later.
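
For Issue 32335 above, the workaround depends on the Enterprise Name having no leading or trailing spaces. The following is a minimal Python sketch of such a check, which could be run against a name before it is entered. The function name has_surrounding_whitespace is hypothetical and does not correspond to any Orchestrator API. Note that str.strip() treats all whitespace (tabs and newlines as well as spaces) as removable, which is slightly broader than the spaces the workaround mentions.

```python
def has_surrounding_whitespace(enterprise_name: str) -> bool:
    """Return True if the name starts or ends with whitespace.

    Hypothetical helper: compares the name with its stripped form,
    so any leading or trailing whitespace makes the two differ.
    """
    return enterprise_name != enterprise_name.strip()
```

A name flagged by this check can be corrected with enterprise_name.strip() before the user attempts to accept the agreement again.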
