Updated 31 October 2022
VMware SASE™ Orchestrator Version R5010-20220912-GA
Check regularly for additions and updates to these release notes.
What's in the Release Notes
The release notes cover the following topics:
- Recommended Use
- Upgrade Paths for Orchestrator, Gateway, and Edge
- Important Notes
- Revision History
- Resolved Issues
- Known Issues
Recommended Use
This release is recommended for all customers who require the features and functionality first made available in Release 5.0.0, as well as those customers impacted by the issues listed below which have been resolved since Release 5.0.0.
Release 5.0.1 Orchestrators, Gateways, and Hub Edges support all previous VMware SD-WAN Edge versions greater than or equal to Release 3.2.2.
Note: Release 5.0.1 is classified as a maintenance release, and maintenance releases undergo a subset of interoperability testing because the protocol is identical to the major/minor release that they are a part of. Please consult the VMware SASE 5.0.0 Release Notes for a list of other software versions this version of the protocol has been tested against.
The following SD-WAN interoperability combinations were explicitly tested:
Note: The above table is fully valid for customers using SD-WAN services only. Customers requiring access to VMware Cloud Web Security or VMware Secure Access need their Edges upgraded to Release 4.5.0 or later.
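The two version floors stated above (Release 3.2.2 for Edge interoperability with 5.0.1, and Release 4.5.0 for Cloud Web Security or Secure Access) can be checked with a small helper. This is a minimal sketch only; the function names are hypothetical and are not part of any VMware tool.

```python
def parse_release(version: str) -> tuple:
    """Turn a dotted release string such as '4.5.0' into (4, 5, 0)."""
    return tuple(int(part) for part in version.split("."))

# Floors taken from these release notes.
MIN_EDGE_FOR_5_0_1 = parse_release("3.2.2")
MIN_EDGE_FOR_CLOUD_SERVICES = parse_release("4.5.0")

def edge_supported(version: str) -> bool:
    """True if a 5.0.1 Orchestrator, Gateway, or Hub Edge supports this Edge release."""
    return parse_release(version) >= MIN_EDGE_FOR_5_0_1

def edge_can_use_cloud_services(version: str) -> bool:
    """True if the Edge release can reach Cloud Web Security or Secure Access."""
    return parse_release(version) >= MIN_EDGE_FOR_CLOUD_SERVICES
```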
Warning: VMware SD-WAN Releases 3.2.x and 3.3.x have reached the End of Support.
- Releases 3.2.x and 3.3.x reached End of General Support (EOGS) on December 15, 2021, and End of Technical Guidance (EOTG) March 15, 2022.
Warning: VMware SD-WAN Release 3.4.x is approaching End of Support for the Orchestrator and Gateway.
- Release 3.4.x for the Orchestrator and Gateway reached End of General Support (EOGS) on March 30, 2022, and will reach End of Technical Guidance (EOTG) September 30, 2022.
- Note: This is for the Orchestrator and Gateway only. 3.4.x for the Edge is scheduled to enter its End of Support window beginning on December 31, 2022.
- For more information please consult the Knowledge Base article: Announcement: End of Support Life for VMware SD-WAN Release 3.x (84151)
Warning: VMware SD-WAN Releases 4.0.x and 4.2.x are approaching End of Support.
- Release 4.0.x will reach End of General Support (EOGS) on September 30, 2022, and End of Technical Guidance (EOTG) December 31, 2022.
- Release 4.2.x Orchestrators and Gateways will reach End of General Support (EOGS) on December 30, 2022, and End of Technical Guidance (EOTG) March 30, 2023.
- Release 4.2.x Edges will reach End of General Support (EOGS) on June 30, 2023, and End of Technical Guidance (EOTG) September 30, 2023.
- For more information please consult the Knowledge Base article: Announcement: End of Support Life for VMware SD-WAN Release 4.x (88319)
Note: Release 3.x did not properly support AES-256-GCM, which meant that customers using AES-256 were always using their Edges with GCM deactivated (AES-256-CBC). If a customer is using AES-256, they must explicitly deactivate GCM from the Orchestrator prior to upgrading their Edges to a 4.x Release. Once all their Edges are running a 4.x release, the customer may choose between AES-256-GCM and AES-256-CBC.
Upgrade Paths for Orchestrator, Gateway, and Edge
The following lists the paths for customers wishing to upgrade their Orchestrator, Gateway, or Edge from an older release to Release 5.0.1.
Due to infrastructure changes in the Orchestrator beginning in Release 4.0.0, any Orchestrator using a 3.x Release needs to be first upgraded to 4.0.0 prior to being upgraded to 5.0.1. Orchestrators using Release 4.0.0 or later can be upgraded to Release 5.0.1. Thus, the upgrade paths for the Orchestrator are as follows:
Orchestrator using Release 3.x → 4.0.0 → 5.0.1.
Orchestrator using Release 4.x → 5.0.1.
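The two Orchestrator paths above can be expressed as a short lookup. The sketch below is illustrative; the function name is hypothetical, and releases earlier than 3.x are rejected because these notes do not describe a path for them.

```python
def orchestrator_upgrade_path(current: str, target: str = "5.0.1") -> list:
    """Return the ordered list of releases an Orchestrator passes through."""
    major = int(current.split(".")[0])
    if major < 3:
        # Not covered by these release notes.
        raise ValueError(f"No documented upgrade path from {current}")
    if major == 3:
        # 3.x must first be upgraded to 4.0.0 because of infrastructure changes.
        return [current, "4.0.0", target]
    # 4.0.0 or later can go directly to the target release.
    return [current, target]
```

For example, `orchestrator_upgrade_path("3.4.6")` yields the three-step path through 4.0.0, while any 4.x input yields a direct upgrade.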
Gateway upgrades from 3.x to 5.0.1 are not supported. In place of upgrading, a 3.x Gateway needs to be freshly deployed with the same VM attributes, and the old instance is then deprecated.
Upgrading a Gateway using Release 4.0.0 or later is fully supported for all Gateway types.
Note: When deploying a new Gateway using 5.0.1 the VMware ESXi instance must be at least version 6.7, Update 3 up to version 7.0. Using an earlier ESXi instance will result in the Gateway's Dataplane Service failing when trying to run Release 5.0.0 or later.
Note: Prior to upgrading a Gateway to 5.0.1, the ESXi instance must be upgraded to at least version 6.7, Update 3 up to version 7.0. Using an earlier ESXi instance will result in the Gateway's Dataplane Service failing when trying to run Release 5.0.1 or later.
An Edge can be upgraded directly to Release 5.0.1 from any Release 3.x or later.
Important Notes
Mixing Wi-Fi Capable and Non-Wi-Fi Capable Edges in High Availability Is Not Supported
Beginning in 2021, VMware SD-WAN introduced Edge models which do not include a Wi-Fi module: the Edge models 510N, 610N, 620N, 640N, and 680N. While these models appear identical to their Wi-Fi capable counterparts except for Wi-Fi, deploying a Wi-Fi capable Edge and a Non-Wi-Fi capable Edge of the same model (for example, an Edge 640 and an Edge 640N) as a High-Availability pair is not supported. Customers should ensure that the Edges deployed as a High Availability pair are of the same type: both Wi-Fi capable, or both Non-Wi-Fi capable.
Grafana No Longer Available on Orchestrator
Release 5.0.0 and later Orchestrators do not include the Grafana application due to license restrictions. Grafana is primarily used by customers and partners who run the Orchestrator on-premises to monitor the Orchestrator's performance. Going forward for such needs, a customer or partner would need to host their own Grafana application outside the Orchestrator and configure Telegraf on the Orchestrator to point to it.
VMware SASE Builds Include a Fourth Digit
Beginning with Release 5.0.0 and going forward, the release build will now include a fourth digit.
For software releases, VMware SASE follows an a.b.c numbering scheme where:
- a = Major (for example, 5.0.0) → A release with multiple large features and potentially significant architectural changes.
- b = Minor (for example, 5.2.0) → A release with a handful of small features or a couple of large features and no significant architectural changes
- c = Maintenance (for example, 5.2.1) → A release with potentially a large number of fixes for field found issues and internally found issue fixes with no features except potentially new hardware platform support.
With Release 5.0.0 a fourth digit is added to Edge, Gateway, and Orchestrator builds, so the numbering is a.b.c.d where
- d = Rollup Build (for example, 5.2.1.1) → A rollup is a cumulative aggregate of known customer found defect fixes or critical internal found defects.
Rollup Builds for 4.x and earlier are distinguished by the image name's GA date, which is not an optimal way of communicating the build version to a customer. Adding a fourth digit for 5.0.0 builds and later allows customers to more clearly see what software version is being used for a particular component.
This build numbering convention applies only to Release 5.0.0 and later. Releases 4.x and earlier will continue to use three digits, with rollup builds identified in the existing manner by date.
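Build names that appear in these notes, such as R5011-20221007-GA, look as though they carry the a.b.c.d digits followed by a date. The sketch below parses a name under that assumption. The mapping of each of the four digits to one version component is inferred from the build names in this document, not from a published specification, and it would not hold if any component needed two digits.

```python
import re

# Assumed layout: R<a><b><c><d>-<YYYYMMDD>-GA, one digit per component.
BUILD_RE = re.compile(r"^R(\d)(\d)(\d)(\d)-(\d{8})-GA$")

def parse_build(name: str) -> dict:
    """Split a GA build name into release, rollup digit, and build date."""
    match = BUILD_RE.match(name)
    if match is None:
        raise ValueError(f"Unrecognized build name: {name}")
    a, b, c, d = (int(g) for g in match.groups()[:4])
    return {
        "release": f"{a}.{b}.{c}",
        "rollup": d,
        "version": f"{a}.{b}.{c}.{d}",
        "date": match.group(5),
    }
```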
Accessing Cloud Web Security and Secure Access
A customer wishing to access VMware Cloud Web Security or VMware Secure Access must upgrade their Edges to Release 4.5.0 or later. These services are inaccessible on Edges using a release earlier than 4.5.0.
BGPv4 Filter Configuration Delimiter Change for AS-PATH Prepending
Through Release 3.x, the VMware SD-WAN BGPv4 filter configuration for AS-PATH prepending supported both comma and space based delimiters. However, beginning in Release 4.0.0, VMware SD-WAN supports only a space based delimiter in an AS-PATH prepending configuration.
Customers upgrading from 3.x to 4.x or 5.x need to edit their AS-PATH prepending configurations to replace commas with spaces prior to the upgrade to avoid incorrect BGP best route selection.
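As an illustration of the edit described above, the following sketch rewrites a prepend value so that it uses only spaces. It operates on a plain string; how the value is exported from or written back to the Orchestrator is outside its scope, and the function name is hypothetical.

```python
import re

def to_space_delimited(prepend_value: str) -> str:
    """Replace comma (or mixed comma/space) delimiters with single spaces."""
    tokens = [t for t in re.split(r"[,\s]+", prepend_value.strip()) if t]
    return " ".join(tokens)
```

A value that is already space delimited passes through unchanged, so the helper can be applied to every configuration without first checking which delimiter it uses.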
Extended Upgrade Time for Edge 3x00 Models
Upgrades to this version may take longer than normal (3-5 minutes) on Edge 3x00 models (i.e., 3400, 3800 and 3810). This is due to a firmware upgrade which resolves issue 53676. If an Edge 3400 or 3800 had previously upgraded its firmware when on Release 3.4.5/3.4.6, 4.0.2, 4.2.1, 4.3.0, 4.5.0, or 5.0.0 then the Edge would upgrade as expected. For more information, please consult Fixed Issue 53676 in the respective release notes.
Limitation with BGP over IPsec on Edge and Gateway, and Azure Virtual WAN Automation
The BGP over IPsec on Edge and Gateway feature is not compatible with Azure Virtual WAN Automation from Edge or Gateway. Only static routes are supported when automating connectivity from an Edge or Gateway to an Azure vWAN.
Limitation When Deactivating Autonegotiation on VMware SD-WAN Edge Models 520, 540, 620, 640, 680, 3400, 3800, and 3810
When a user deactivates autonegotiation to hardcode speed and duplex on ports GE1 - GE4 on a VMware SD-WAN Edge model 620, 640 or 680; on ports GE3 or GE4 on an Edge 3400, 3800, or 3810; or on an Edge 520/540 when an SFP with a copper interface is used on ports SFP1 or SFP2, the user may find that even after a reboot the link does not come up.
This is caused by each of the listed Edge models using the Intel Ethernet Controller i350, which has a limitation that when autonegotiation is not used on both sides of the link, it is not able to dynamically detect the appropriate wires to transmit and receive on (auto-MDIX). If both sides of the connection are transmitting and receiving on the same wires, the link will not be detected. If the peer side also does not support auto-MDIX without autonegotiation, and the link does not come up with a straight cable, then a crossover Ethernet cable will be needed to bring the link up.
For more information please see the KB article Limitation When Deactivating Autonegotiation on VMware SD-WAN Edge Models 520, 540, 620, 640, 680, 3400, 3800, and 3810 (87208).
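The cable requirement described in this note can be reasoned about with a simplified two-pair model. In the sketch below, each port has a fixed pinout ("MDI" or "MDI-X"), a straight cable keeps pairs as they are, and a crossover cable swaps them. This is a teaching model under those assumptions, not a description of the Intel i350 internals.

```python
def link_comes_up(local_pinout: str, peer_pinout: str, cable: str,
                  local_auto_mdix: bool = False,
                  peer_auto_mdix: bool = False) -> bool:
    """Return True if transmit on one side lands on receive on the other."""
    if local_auto_mdix or peer_auto_mdix:
        # A side with working auto-MDIX adapts to whatever the other side uses.
        return True
    same_pinout = local_pinout == peer_pinout
    if cable == "straight":
        # Straight cable only works when the two pinouts differ.
        return not same_pinout
    if cable == "crossover":
        # Crossover cable swaps the pairs, so identical pinouts now line up.
        return same_pinout
    raise ValueError(f"Unknown cable type: {cable}")
```

With autonegotiation deactivated on the Edge, `local_auto_mdix` is effectively false, so the outcome depends on the peer and the cable, which matches the guidance to try a crossover cable when a straight cable does not bring the link up.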
Revision History
August 5th, 2022. First Edition.
August 11th, 2022. Second Edition.
- Added Fixed Issues #89346, #90067, #90128, #90540, #91054, #91720, and #92082 to the Orchestrator Resolved Issues section. These tickets were omitted in error from the first edition of the 5.0.1 Release Notes.
August 15th, 2022. Third Edition.
- Added Fixed Issue #89217 to the Edge/Gateway Resolved Issues section. This ticket was omitted in error from the first edition of the 5.0.1 Release Notes.
August 18th, 2022. Fourth Edition.
- Added an updated Orchestrator build R5010-20220817-GA to the Orchestrator Resolved section. Build R5010-20220817-GA replaces the original Orchestrator build R5010-20220803-GA, and is the new Orchestrator GA build for Release 5.0.1.
- This updated Orchestrator build includes the fix for Issue #95613, which is added to the Orchestrator Resolved section.
September 9th, 2022. Fifth Edition.
- Added Fixed Issues #87552, #90151, and #93383 to the Edge/Gateway Resolved Issues section. These tickets were omitted in error from the first edition of the 5.0.1 Release Notes.
- Removed Open Issue #49712 from Edge/Gateway Known Issues as Engineering concluded it was caused by a configuration error versus a defect in the code.
- Removed Open Issue #90065 from Edge/Gateway Known Issues as Engineering has not been able to replicate the issue and DR synchronization works as expected with the 5.0.1 Orchestrator Build.
September 16th, 2022. Sixth Edition.
- Added an updated Orchestrator build R5010-20220912-GA to the Orchestrator Resolved section. Build R5010-20220912-GA replaces the original Orchestrator build R5010-20220817-GA, and is the new Orchestrator GA build for Release 5.0.1.
- This updated Orchestrator build includes the fixes for Issues #90749, #95847, and #96095, which are added to the Orchestrator Resolved section.
- Added Fixed Issue #91875 to the Edge/Gateway Resolved Issues section. This ticket was omitted in error from the first edition of the 5.0.1 Release Notes.
- Added Issues #96055 and #96231 to the Edge/Gateway Known Issues section.
September 23rd, 2022. Seventh Edition.
- Added Fixed Issue #96108 to the Orchestrator build R5010-20220912-GA in the Orchestrator Resolved Issues section. This issue was omitted in error from the Sixth Edition of these Release Notes.
- Moved Fixed Issue #90749 from the Orchestrator build R5010-20220912-GA down to the original Orchestrator build R5010-20220803-GA in Orchestrator Resolved Issues section, as this is where the issue fix was actually added for Release 5.0.1.
- Added Fixed Issue #87982 to the Edge/Gateway Resolved Issues section. This ticket was omitted in error from the first edition of the 5.0.1 Release Notes.
- Added Open Issues #86098, #94204, #95565, #96441, and #96888 to the Edge/Gateway Known Issues section.
September 28th, 2022. Eighth Edition.
- Added #98136 to the Edge/Gateway Known Issues section.
October 12th, 2022. Ninth Edition.
- Added a new Edge/Gateway rollup build R5011-20221007-GA to the Edge/Gateway Resolved section. This is the first Edge/Gateway rollup build and is the new Edge/Gateway GA build for Release 5.0.1.
- Edge/Gateway build R5011-20221007-GA includes the fixes for issues #89235, #94430, #95503, #96055, #96231, #98157, and #99188 which are each documented in this section.
October 18th, 2022. Tenth Edition.
- Added Fixed Issue #90876 to the Edge/Gateway Resolved section for the original 5.0.1 GA build R5010-20220729-GA. This issue was omitted in error from the first edition of the 5.0.1 Release Notes.
October 31st, 2022. Eleventh Edition.
- Added Fixed Issue #72491 to the Edge/Gateway Resolved section for the original 5.0.1 GA build R5010-20220729-GA. This issue was omitted in error from the first edition of the 5.0.1 Release Notes.
The resolved issues are grouped as follows.
Edge/Gateway Resolved Issues
Resolved in Edge/Gateway Version R5011-20221007-GA
Edge/Gateway build R5011-20221007-GA was released on October 11, 2022, and is the first Edge/Gateway rollup for Release 5.0.1.
This Edge/Gateway rollup build addresses the below critical issues since the original GA build, R5010-20220729-GA.
- Fixed Issue 89235: On a customer enterprise which uses a Hub/Spoke topology and employs internet backhaul policies, backhaul traffic from a VMware SD-WAN Spoke Edge which is destined for the Internet may be dropped by the Hub Edge.
When this issue is encountered, the client users would notice issues for traffic destined for the Internet. The issue occurs after one of the following: an Edge power cycle (for example after a power outage), an Edge service restart, or a configuration change and is caused by a timing issue between the backhaul traffic originating from a Spoke Edge and the route advertised from the Spoke Edge.
When encountering this issue on a Spoke Edge without this fix, a user should flush the flows on the affected Spoke Edge to restore normal routing of backhaul traffic. This can be done on the Orchestrator through Remote Diagnostics > Flush Flows.
- Fixed Issue 94430: For a customer enterprise that uses a Hub/Spoke topology where multiple Hubs are deployed, a user behind a VMware SD-WAN Spoke Edge may observe issues with traffic that is destined for a Hub Edge.
Client traffic issues occur when the Spoke Edge forwards traffic towards a Hub different than the one expected to receive the traffic. The issue is caused by the AS path length for the remote BGP routes not being calculated properly in certain scenarios. Because of this, the routes from the Hubs that should have a lower routing preference instead end up having greater AS_PATH length and may be preferred.
If encountering this issue without a fix, the customer can withdraw and re-advertise the route that is expected to be preferred.
- Fixed Issue 95503: In rare instances a customer may observe that a VMware SD-WAN Edge model 610, 610N, or 610-LTE shows the same MAC address for all Ethernet interfaces.
An Edge 610 (any type) may show an eth0 MAC address ending with 0xF*. In such cases, GE1 through GE6 ports receive the same MAC address due to an issue with the script that calculates and allocates MAC addresses.
The fix corrects this script behavior and an affected Edge 610 type would properly calculate and allocate unique MAC addresses once the Edge is upgraded to a build that includes it.
- Fixed Issue 96055: A VMware SD-WAN Gateway may experience a Dataplane Service failure with Signal 6 (SIGABRT) and generate a core.
This issue can occur if a VMware SD-WAN Edge sends a packet which refers to an invalid Segment to the Edge's Primary Gateway. When the Gateway receives this packet, the Gateway's process fails instead of handling the situation gracefully by discarding such packets.
- Fixed Issue 96231: When a customer deploys a Non SD-WAN Destination via Gateway with a Palo Alto Networks type and also configures Palo Alto's “Prisma tunnel monitoring feature” for use on this NSD, the user may observe that while IPsec tunnels are established for this NSD, they are continuously torn down and rebuilt every 5-15 seconds, causing disruption for traffic using the NSD.
When Prisma tunnel monitoring is enabled, the Prisma application sends encrypted ICMP packets to the SD-WAN Gateway, and once the Gateway responds to the ICMP packet, Prisma confirms that the tunnel is established. In effect, Prisma tunnel monitoring is a kind of IPsec tunnel liveness check. However, the problem in this instance is that the Gateway drops Prisma's ICMP packet, so Prisma marks the tunnel as down, which triggers the tunnel teardown and rebuild.
The issue is caused by how the Gateway checks whether a received ICMP packet is an echo request: instead of checking the type field, the Gateway incorrectly checks the code field in the ICMP header. As a result, the Gateway discards the ICMP packet, which triggers Prisma to tear down the tunnel.
On a Gateway that does not have a fix for this issue, the customer should not use Prisma for their Palo Alto type NSD.
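The type-versus-code mix-up described for Issue 96231 is easy to see on a raw ICMP header. An ICMPv4 echo request has type 8 and code 0 (RFC 792), so a check against the code field never matches. The sketch below contrasts the two checks; the "buggy" function only illustrates the mistake as described in these notes and is not the Gateway's actual code.

```python
import struct

ICMP_ECHO_REQUEST = 8  # ICMPv4 type value for an echo request; its code is 0

def is_echo_request(icmp_packet: bytes) -> bool:
    """Correct check: compare the type field (first header byte)."""
    if len(icmp_packet) < 8:
        return False
    icmp_type, icmp_code, _checksum, _ident, _seq = struct.unpack(
        "!BBHHH", icmp_packet[:8]
    )
    return icmp_type == ICMP_ECHO_REQUEST and icmp_code == 0

def is_echo_request_buggy(icmp_packet: bytes) -> bool:
    """Illustrative mistake: compare the code field (second header byte)."""
    _icmp_type, icmp_code = struct.unpack("!BB", icmp_packet[:2])
    return icmp_code == ICMP_ECHO_REQUEST
```

Given a well-formed echo request header, the first function accepts it and the second rejects it, which corresponds to the monitoring probe being dropped.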
- Fixed Issue 98157: A VMware SD-WAN Gateway may experience a Dataplane Service failure and restart as a result.
In rare instances when the source port of a SD-WAN tunnel changes (for example, because an intermediate NAT device restarts), the Gateway's dataplane process can fail and then restart and generate a core.
- Fixed Issue 99188: In some situations a BGP session may not come up for a customer enterprise.
This issue occurs when an ASN value greater than 2147483648 is configured. In such a case the configuration is not applied and hence the BGP sessions do not come up.
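These notes do not state the root cause of Issue 99188. The threshold of 2147483648 equals 2^31, which is where a value stops fitting in a signed 32-bit integer, while 4-byte AS numbers (RFC 6793) use the full unsigned 32-bit range. The sketch below only demonstrates that distinction; treating signed 32-bit handling as the cause is an assumption, and the function names are hypothetical.

```python
MAX_ASN_4BYTE = 2**32 - 1  # largest value representable as a 4-byte AS number
INT32_MAX = 2**31 - 1      # largest value of a signed 32-bit integer

def is_representable_4byte_asn(asn: int) -> bool:
    """True if the value fits in the unsigned 32-bit ASN field."""
    return 0 <= asn <= MAX_ASN_4BYTE

def fits_signed_32(asn: int) -> bool:
    """True if the value also fits in a signed 32-bit integer."""
    return asn <= INT32_MAX
```

An ASN such as 4200000000, from the 4-byte private-use range, is representable as a 4-byte ASN but does not fit in a signed 32-bit integer.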
Edge/Gateway Resolved Issues
Resolved in Edge/Gateway Build R5010-20220729-GA
The below issues have been resolved since Edge/Gateway Build R5002-20220506-GA.
- Fixed Issue 58791: A site deployed with a High-Availability topology where BGP is used may encounter an issue where the VMware SD-WAN Edge repeatedly fails over.
This issue affects HA sites configured within a Hub/Spoke topology where the HA site has greater than 512 BGPv4 filter prefixes configured.
When BGP is used with multiple network commands configured, the Standby Edge parses all of the configurations symmetrically as it comes up and spawns vtysh for every network command, which prevents the verp thread from running. The delayed verp thread in turn delays heartbeat processing, which causes the Standby Edge to believe the Active Edge is down. The Standby Edge then becomes active, which leads to a split-brain state (active-active). To recover from the split-brain state, the Standby Edge restarts, which merely repeats the cycle.
Without the fix the workaround is to reduce the number of BGP filter prefix configurations by aggregating them and getting the total number below 512 (256 Inbound, and 256 Outbound filters).
Note: There is a similar issue where an HA site is disrupted when using BGP 'match and set' operations and this is tracked separately under Fixed Issue #84825, which is also fixed in this Edge build.
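The aggregation workaround for Issue 58791 can be prototyped offline with Python's standard `ipaddress` module before any filter is changed. This is a sketch under the assumption that the filters are plain IPv4 prefixes; whether a collapsed prefix is an acceptable substitute depends on the match semantics of each filter and must be reviewed case by case.

```python
import ipaddress

FILTER_PREFIX_LIMIT = 512  # threshold named in this issue

def aggregate_prefixes(prefixes):
    """Collapse adjacent or overlapping prefixes into the fewest networks."""
    networks = [ipaddress.ip_network(p) for p in prefixes]
    return [str(n) for n in ipaddress.collapse_addresses(networks)]

def under_limit(prefixes) -> bool:
    """True if the aggregated prefix count is below the limit."""
    return len(aggregate_prefixes(prefixes)) < FILTER_PREFIX_LIMIT
```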
- Fixed Issue 67458: When a VMware SD-WAN Hub Edge with a large number of Spoke Edges is upgraded to Release 4.2.1 or later, some tunnels to other Spoke Edges will not come up for the Hub Edge.
A large number of Spoke Edges is understood as ~1000 or more. This issue is not consistent, but generally ~1/3rd of the VeloCloud Management Protocol (VCMP) tunnels are not established between the Hub Edge and the connected Spoke Edges. This is caused by the Hub Edge ignoring the MP_INITs as the number of half open TDs exceeds the Hub Edge's upper limit.
When encountering this issue without the fix, restarting the Edge Service will restore full tunnel connectivity.
- Fixed Issue 70129: When Syslog is activated on a large scale VMware SD-WAN Gateway, the /var/log folder may get filled up with syslog log files in a short period of time.
Large scale is understood as a Gateway with ~4K peers and ~6K tunnels where there are usually ~100-150K flows and ~50-100K NAT entries. Short period can be as little as 24 hours with a syslog.log file of >3.2Gb in size. This is caused by some NAT logs being directed to the /var/log that should be directed to a different folder.
- Fixed Issue 70586: When a routed interface on a VMware SD-WAN Edge is configured for 802.1x (uses RADIUS authentication), clients connected on that interface get silently de-authenticated whenever any other interface flaps (in other words, when any non-802.1x interface goes down and up in quick succession), and all of their traffic gets dropped until the client disconnects and then reconnects to the Edge.
The Edge does not check that the interface that flapped is actually the one that had 802.1x clients authenticated, and thus treats any interface flap as if it were an 802.1x interface flap and acts accordingly.
Without the fix, the only workaround is to force the client to physically disconnect and reconnect to get re-authenticated again.
- Fixed Issue 72925: For a customer who uses SNMP polling for monitoring their enterprise and deploys lower model VMware SD-WAN Edges (for example, Edge models 510, 520, or 610) which are running a 4.x software release, SNMP polling takes exceptionally long to process and can even time out.
This issue significantly reduces the effectiveness of SNMP polling for network monitoring when using Edges in the 510, 5x0, and 6x0 series. This issue is caused by the Release 4.x SNMP agent taking an unnecessarily long time to traverse the debug command list, which is not actually required for the SNMP process.
- Fixed Issue 73830: System Center Configuration Manager (SCCM) application traffic is being misclassified by the VMware SD-WAN Edge as Background Intelligent Transfer Service (BITS) traffic and customers using Business Policy or Firewall Rules designed for SCCM traffic will find that traffic impacted.
The Edge's Deep Packet Inspection (DPI) engine is misclassifying the SCCM application packets as BITS traffic. If there are Business Policy or Firewall Rules designed to steer that traffic or ensure that it is allowed, the misclassification may result in SCCM traffic being blocked, with a resulting disruption to the customer. The remediation for this issue involved amending the default 4.5.1/5.0.1 and later application maps to ensure that this misclassification is prevented.
- Fixed Issue 74291: A VMware SD-WAN Edge in a High-Availability topology may appear as offline after a failover despite having internet access and functional DNS.
This issue can occur after a High-Availability failover and is caused by a token error on the newly promoted Active Edge which results in a heartbeat failure to the Orchestrator. Without the heartbeat, the Orchestrator marks the Edge as down.
Without an Edge build with the fix, the way to remediate the issue is to locally force another failover either through the local UI or by power cycling the Active Edge.
- Fixed Issue 74316: A VMware SD-WAN Spoke Edge may not connect to any or all of the assigned Hub Edge Clusters, even if the Edge has a service restart or a full reboot.
There is an issue with the cluster reassignment logic which creates cluster assignment mapping without the cluster member’s endpoint information in a specific Cluster-member-to-Super-Gateway overlay flap scenario. As a result, Spoke Edges assigned to the Hub Cluster member subsequently fail to receive the endpoint information of the Hub Cluster member, leading to no overlays between Spoke Edges and Hub Clusters.
Without the fix the only way to temporarily remediate the condition is for someone with Gateway access to trigger a cluster reassignment manually on the Super Gateways.
- Fixed Issue 76690: A user may observe important logs missing when attempting to troubleshoot an issue for a VMware SD-WAN Edge because they have been crowded out by repeated entries of a less important event.
In a diagnostic bundle, velocloud/log could have repeated logging of the event vc_peer_qos_update_cos_qlimits. The log level for this event is management plane and it can get logged repeatedly to the point that the log overflows and rolls over. In a troubleshooting scenario, this can result in important log messages being missed because they were rolled over and wiped out.
- Fixed Issue 78276: On a VMware SD-WAN Gateway, running the debug.py -qos_net fails if the VMware SD-WAN Edge's name includes non-ASCII characters.
An example of this in the field was the use of Chinese characters, but it applies to any non-ASCII characters and can be observed as follows: change an Edge name to include non-ASCII characters and reboot the Edge. Then, on a Gateway connected to the Edge, run the CLI command 'debug.py --list 3' to get the Edge's logical ID. Finally, run the Gateway CLI command 'debug.py -qos_net [logical ID] all stats' and observe that the command fails.
- Fixed Issue 78300: If a VMware SD-WAN Edge is using a WAN link configured to be a backup, a user may observe logs or Orchestrator Events which suggest that tunnels are coming up or going down for this link.
By design, tunnels do not get established for backup links. But any tunnel request from a remote end (typically a dynamic Edge-to-Edge tunnel) might change the link status as it goes through the stack. In this fix, care has been taken so that no logs indicate that any tunnel formation or teardown is going on for the backup link.
- Fixed Issue 78391: Traffic with the Speedtest application classification is not working properly.
Both speedtest.net and fast.com have newly added speedtest server IP addresses that are missing in the default application map, and as a result the Business Policy that deals with these applications is not being applied.
If not upgraded to Release 4.5.1 or 5.0.1, an Operator could add the required speedtest IPs to an existing application map using the VMware SASE Orchestrator's Application Map Editor.
- Fixed Issue 79261: Office 365 / Microsoft 365 application traffic is misclassified as Tencent Meeting (VooV Meeting) application traffic on the VMware SD-WAN Edge.
This can be disruptive for customers who rely on Business or Firewall Policies for routing and prioritizing Office 365 / Microsoft 365 traffic, only to have that traffic classified as Tencent Meeting and thus hit a completely different rule. The issue is traced to incorrect application map subnets for Tencent, which are corrected in the 4.5.1 and 5.0.1 default application map. A customer not using 4.5.1/5.0.1 can have this corrected by an Operator who edits their application map through the Orchestrator Application Map Editor to correct the Tencent subnets.
- Fixed Issue 80010: For a customer enterprise using a Hub/Spoke topology where SD-WAN Reachable is also configured, the Spoke to Gateway path (using a public WAN link) via the Hub path does not come up if the Spoke-to-Hub path is point-to-point.
The SD-WAN Reachable feature, which is a passthrough for a Spoke Edge to connect to a Gateway through a connected Hub, is not supported if the Spoke Edge and the Hub Edge are connected by a point-to-point link (in other words, the Spoke's IP address matches the connected route on the Hub). The fix for this issue adds this functionality.
- Fixed Issue 80196: A VMware SD-WAN Gateway may experience a Dataplane Service failure with a SIGXCPU message and the Gateway restarts the service to recover with an impact to Gateway traffic for 15-30 seconds.
This issue is seen at high throughput for that Gateway relative to its throughput capacity and thus is more likely to happen in large scale deployments (for example, 4K Edges and 6K tunnels). When traffic at a high rate hits the Gateway, in some instances the Gateway will experience a thread lock and generates a core while restarting. In the core the user would observe: "Program terminated with signal SIGXCPU, CPU time limit exceeded".
- Fixed Issue 80479: A VMware SD-WAN Gateway may experience a Dataplane Service failure with the Gateway restarting the service to recover which impacts Gateway traffic for 15-30 seconds.
This issue can occur if a VMware SD-WAN Edge is connected to the Gateway with Edge-to-Edge (E2E) configured and a Loopback interface route advertised. When a user toggles off E2E for this Edge, this triggers a route initiation, but the Loopback route is not deleted and only its profile flag is updated. Next, if the user stops advertising the Loopback route, the route is deleted from the FIB but a stale entry remains in the E2E table on the Gateway. If the Loopback route is then readvertised and added to the FIB, and the user afterwards toggles E2E back on (which again only updates the flag), the route is present in the Gateway's E2E table as a stale entry and its actual ref_count is incorrect. Finally, tearing down the tunnel in this state triggers the Dataplane Service failure on the Gateway.
Without the fix, an Operator would need to make sure routes are withdrawn before a profile is changed for the Edge.
- Fixed Issue 80496: Ping from a VMware SD-WAN Edge to a remote Edge's branch loopback IP address over a SD-WAN tunnel may not work.
The issue is seen for a ping with a packet size large enough to cause fragmentation. When the ping is initiated with a large packet size from an Edge to a remote branch Edge's loopback IP address, the fragmented ICMP reply reaches the Edge initiating the ping but does not reach the ping application because the next fragment is dropped.
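To judge whether a given ping size would be fragmented, and so could exercise this issue, the fragment count can be estimated from the path MTU. The sketch below assumes a 1500-byte MTU and a 20-byte IPv4 header without options; overlay tunnel overhead lowers the effective MTU, so real thresholds on an SD-WAN path are likely to be smaller.

```python
import math

IPV4_HEADER = 20  # bytes, assuming no IP options
ICMP_HEADER = 8   # bytes

def fragment_count(icmp_payload: int, mtu: int = 1500) -> int:
    """Estimate how many IPv4 fragments an ICMP echo of this payload size needs."""
    ip_payload = ICMP_HEADER + icmp_payload
    if IPV4_HEADER + ip_payload <= mtu:
        return 1  # fits in a single packet, no fragmentation
    # Every fragment except the last carries a multiple of 8 data bytes.
    per_fragment = (mtu - IPV4_HEADER) // 8 * 8
    return math.ceil(ip_payload / per_fragment)
```

Under these assumptions a 1472-byte payload is the largest that avoids fragmentation, and a 3000-byte payload is split into three fragments.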
- Fixed Issue 80721: Partners and operators monitoring a VMware SD-WAN Gateway using Telegraf may observe that the metrics do not resume should Telegraf experience a network timeout.
Gateways experiencing this issue are using Telegraf version 1.17.3. This is in contrast to the Telegraf version the VMware SASE Orchestrator is using: 1.21.1. This version mismatch is causing the issue with Telegraf getting stuck in the event of a network timeout. The Gateways with a fix for this issue include Telegraf version 1.21.1 as would any future Gateway build in that release train (in other words: 4.5.1 or 5.0.1).
On a Gateway that experiences this issue, the only remediation is to restart Telegraf to resume sending metrics.
- Fixed Issue 80814: On a VMware SD-WAN Edge where a Standard Firewall Allow rule is configured which has a local Edge client Source IP address and a remote client as the Destination IP Address, and which also has a "Deny All" rule for other traffic, the traffic from the remote client to the local client is dropped.
This issue is encountered when there is a VLAN IP address mismatch between the source and destination hosts. When the source and destination hosts are part of different VLANs, the SD-WAN service prefers the source/destination IP address of the first packet as it is in the Firewall lookup key. As a result, for overlay inbound flows, there is a mismatch and traffic hits the Deny All firewall rule.
Without the fix, the workaround for this issue is to revert the rule in the direction of the first IP packet of the flow, so that the packet is able to match the firewall rule.
- Fixed Issue 80897: For a customer enterprise where VMware SD-WAN Edges are connected to VMware SD-WAN Partner Gateways, users may observe poor performance for customer traffic.
The poor performance is the result of routing issues stemming from the Partner Gateway distributing routes to the Edges where preferred secure static routes are available but the Edge does not properly label these routes as secure. The result is the Edge potentially advertising non-preferred non-secure routes over secure routes since all routes are treated equally when the expected behavior is to always prefer secure routes over non-secure routes.
Note: Both the Partner Gateway and customer Edges must be upgraded to a build that includes this fix to resolve the issue.
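The expected preference described for Issue 80897 can be sketched as a sort key in which the secure flag outranks the metric. This is a minimal illustration, not the Edge's route selection code; the `Route` type, its fields, and the next-hop names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str
    next_hop: str
    secure: bool
    metric: int


def best_route(routes):
    """Prefer secure routes; use the metric only to break ties."""
    # False sorts before True, so "not secure" places secure routes first.
    return min(routes, key=lambda r: (not r.secure, r.metric))


if __name__ == "__main__":
    candidates = [
        Route("10.0.0.0/24", "non-secure-hop", secure=False, metric=1),
        Route("10.0.0.0/24", "secure-hop", secure=True, metric=10),
    ]
    print(best_route(candidates).next_hop)
```

If the secure label is lost, as in the issue, every route carries `secure=False` and the choice falls back to the metric alone, so a non-secure route can win.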
- Fixed Issue 81221: If a customer configures a 1:1 NAT rule for a VMware SD-WAN Edge and that Edge is rebooted, the rule no longer works.
After the reboot, the Edge assigns the NAT address as the Edge interface address where the NAT rule is being applied and thus no tunnels are being built for traffic matching that rule.
Without the fix, the only remediation is to run the Remote Diagnostic "Flush NAT", which flushes the entire NAT table and reestablishes correct NAT rule operation.
- Fixed Issue 81809: When a user attempts to SSH to a VLAN IP on a VMware SD-WAN Edge from a remote client sitting behind another Edge or even from a VMware SD-WAN Gateway, the SSH attempt fails.
An SSH attempt from a LAN client to an Edge VLAN IP works properly. Originally the Edge's Management IP was used to SSH to the Edge. However, after the Edge Management IP was deprecated, there was no option for the user to SSH to the Edge (via overlay from a remote Edge client) as the Loopback IP still does not support SSH.
- Fixed Issue 81859: When a VMware SD-WAN Edge 610-LTE is activated to Edge Release 5.0.0, the CELL interface may not come up after the Edge is upgraded to that release.
This issue is not consistent but when it occurs can have a major impact if the Edge 610-LTE's only public link is the mobile CELL link as the Edge would be effectively down and intervention for this Edge would need to be local.
If encountering this issue on an Edge without the fix, the user would need to restart the Edge service (or power cycle/reboot it if the Edge is inaccessible through the Orchestrator) or restart the Edge's modem to restore the CELL interface.
- Fixed Issue 82182: For a VMware SD-WAN Edge Model 510 or 510-LTE which is running Edge Release 5.0.0, when a user attempts a service restart of the Edge, the Edge may also reboot.
An Edge reboot would disrupt customer traffic for 2-3 minutes while the Edge was going through the reboot process. On an Edge 510/510-LTE, there is a Wi-Fi device hang monitoring script which may fail to stop during the Edge service restart and this triggers the reboot.
Without the fix a user would need to restart the Edge service, but Edge service restarts for these models should only be done in a maintenance window or with the understanding this issue may arise.
- Fixed Issue 82264: A VMware SD-WAN Virtual Edge which uses an AWS C4 instance cannot be upgraded to Release 5.0.0.
An AWS C4 Virtual Edge upgraded to Release 5.0.0 does not recover and the only remediation is for the user to reactivate the Edge to a non-5.0.0 version. No other AWS instances (for example, C5) are affected by this issue, but due to the critical nature of this defect an AWS Edge Upgrade software package is not available for Release 5.0.0.
Edge Release 5.0.1 and later correct this issue and the AWS C4 instance can be successfully upgraded to Release 5.0.1 and later.
- Fixed Issue 82463: For a site configured with a Cloud Security Service (CSS), the VMware SD-WAN Edge may drop traffic destined for the CSS.
If the site is routing all internet traffic through a CSS, the impact of this issue can be significant. When the issue occurs, CSS packets are sent on the incorrect interface with the IP address of the actual interface as the source which leads to a failure in application access. The issue is caused by a potential race between the CSS context lookup thread and the outgoing interface selection thread which leads to the incorrect association of the outgoing interface with the flow and some flows on the CSS paths fail.
Without the fix, when experiencing the issue the user can remediate it by starting a new flow, or flushing all flows on the Edge by using Remote Diagnostics > Flush Flows.
- Fixed Issue 82485: On an entry level VMware SD-WAN Edge model (for example, Edge 510, 510-LTE, or 610) if a user runs the Remote Diagnostic "Route Table Dump", the Orchestrator UI page may time out and not return a result.
The issue is encountered if there are more than 16000 routes as it takes the Edge more than 30 seconds to return the results. 30 seconds is the timeout limit for the page's WebSocket and so no result is returned. The fix for the issue optimizes the route table walk to ensure timeouts do not occur.
- Fixed Issue 82522: When high throughput traffic hits a VMware SD-WAN Edge, there may be a drop in actual throughput observed on the Edge.
Under high throughput, the Edge's NDP (Neighbor Discovery Protocol) thread acquires a lock twice, even for NDP entries which were marked reachable and for which further processing was not required. These duplicate locks cause the throughput to decrease due to the additional processing.
- Fixed Issue 82652: For a customer using a Cloud Security Service (CSS) where L7 Health Check is configured, the VMware SD-WAN Edge makes no attempt to recover an IPsec CSS tunnel that has been marked as down for more than five minutes.
In the current implementation of the L7 Health Check, the Edge sends L7 probes on all CSS tunnels and if those probes fail a set number of times, the Edge marks that tunnel as Down and then continues to send L7 probes and waits for the tunnel to come up on its own. The issue is that no attempt is made to recover a tunnel once it is in a Down state for more than five minutes where IKE remains up (if IKE is also down the IPsec tunnel is automatically reset after 20 seconds).
The fix in this ticket enhances the L7 Health Check by including an additional step for IPsec-based CSS tunnels: if an IPsec-based CSS tunnel remains down for longer than five minutes (no successful L7 probes) while IKE for the tunnel remains up during the same period, the Edge will tear down the IPsec tunnel and reset the IKE in an attempt to recover the CSS tunnel. L7 probes would continue to be sent while this occurs and if successful would mark the tunnel as Up. If the tunnel remained down, the same step would be applied after an additional five minutes.
This added behavior only applies to a CSS with IPsec tunnels and not ones using GRE tunnels.
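The enhanced recovery step can be sketched as a small state function. This is a simplified model under stated assumptions: every failed probe is treated as already past the "set number of failures" threshold, IKE state is checked only at decision time, and the 20-second reset that applies when IKE is down is not modeled. The `CssTunnel` type and `on_probe_result` function are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional

RECOVERY_AFTER_S = 300  # five minutes, per the release note


@dataclass
class CssTunnel:
    kind: str                           # "ipsec" or "gre"
    ike_up: bool = True
    down_since: Optional[float] = None  # when the tunnel went Down or was last reset
    resets: int = 0


def on_probe_result(tunnel: CssTunnel, ok: bool, now: float) -> str:
    """Return "up", "down", or "reset" for one L7 probe result."""
    if ok:
        tunnel.down_since = None
        return "up"
    if tunnel.down_since is None:
        tunnel.down_since = now
        return "down"
    overdue = now - tunnel.down_since >= RECOVERY_AFTER_S
    if tunnel.kind == "ipsec" and tunnel.ike_up and overdue:
        # Tear down IPsec and reset IKE, then restart the five-minute timer.
        tunnel.resets += 1
        tunnel.down_since = now
        return "reset"
    return "down"
```

In this model a GRE-based tunnel never reaches the reset branch, matching the statement that the added behavior applies only to IPsec tunnels.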
- Fixed Issue 82790: On VMware SD-WAN Gateways deployed in Azure environments, the Gateway interface counters are not exported to the Wavefront monitoring service.
Azure is an environment where DPDK is not configured for use, and only DPDK interface counters (throughput rate, PPS, and drop counters) are exported to the Wavefront service. This leads to reduced monitoring abilities in platforms like Azure where DPDK is not used.
- Fixed Issue 82839: If a user performs an IPsec automation deletion action for a Zscaler Cloud Security Service tunnel on a VMware SASE Orchestrator, the action also deletes all the VPN credentials associated with the respective Zscaler location resulting in deletion of the location itself.
The IPsec automation deletion action should only delete the VPN credential associated with it from the Zscaler Location. All other VPN credentials associated with the respective Zscaler Location should remain untouched.
- Fixed Issue 83029: For either a standalone VMware SD-WAN Edge or a site deployed with a High-Availability topology where one or more PPPoE links are used, if the PPPoE endpoint IP changes after either an Edge interface flap for that PPPoE link or when an HA site experiences a failover, traffic would not pass on the affected PPPoE link(s).
On a site that uses PPPoE links, along with a change in the PPPoE endpoint IP, the impact would mean no customer traffic would pass. The issue is caused due to the presence of a stale default route, which is a route using the old IP address of the PPPoE endpoint on the Edge that is not deleted after a new PPPoE endpoint IP address is received.
Without the fix, an onsite user would need to either disconnect each PPPoE cable and reconnect it to force a renegotiation or reboot the Edge, which would also force a renegotiation.
- Fixed Issue 83083: A VMware SD-WAN Gateway upgraded to Release 4.3.1 or later may experience a slow memory leak which can lead to the Gateway's service restarting to clear the memory.
Gateway restarts can be disruptive to customer traffic for the 30-45 seconds it takes for the Gateway service to restart. Each time an Operator user runs the debug.py --flow_dump all all all command on the Gateway, the Gateway will leak some of its memory. Running this debug command a sufficient number of times will cause the Gateway's memory usage to reach a critical level and trigger a Gateway service restart to clear the memory.
For a Gateway without the fix, an Operator must avoid running the debug.py --flow_dump all all all command on the Gateway. If using this debug command is unavoidable, monitor the memory usage and schedule maintenance windows to preemptively restart the service to clear the memory prior to an unscheduled restart.
- Fixed Issue 83209: For customers using OSPF in their enterprise, OSPF routing may not work as expected.
The issue occurs when there is a change in the OSPF router-id and the Edge service is restarted. Only loopback interfaces and interfaces with the 'Advertise' flag set are considered for router-id selection. When a new loopback interface is configured with a higher IP address, the new loopback IP address is selected as the router-id upon restarting the Edge service, and if the Edge is elected as the DR (Designated Router) the issue is seen.
Without the fix, the only workaround is to force the use of the old Router ID. To do this, bring back the old Router ID and set the Advertise flag on the respective interface (an Edge service restart will be required).
- Fixed Issue 83402: On a VMware SD-WAN Edge with multiple WAN links, one or more WAN links may stop passing traffic.
On the WAN link(s) that stop passing traffic, the DHCP acquired address is not renewed and the WAN interface's address is lost. The issue occurs when there are multiple interfaces acquiring IP addresses using DHCP and the DHCP server is in a different network from the client. The outgoing interface of a DHCP renew unicast packet is determined through route lookup. Since there are multiple default routes with different metric values learned through different interfaces, the DHCP request packets might get sent out of a different interface.
Without the fix, an onsite user would need to unplug and then plug back in the affected WAN link from the Edge to force it to get its IP address again.
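The failure mode for Issue 83402 can be illustrated with a toy route lookup. The sketch assumes the lowest-metric default route always wins and shows only the problem, not the fix, which the note does not describe. Interface names and the `DefaultRoute` type are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DefaultRoute:
    interface: str
    metric: int


def egress_by_route_lookup(routes):
    """Plain route lookup: the lowest-metric default route wins."""
    return min(routes, key=lambda r: r.metric).interface


def mismatched_leases(lease_interfaces, routes):
    """Interfaces whose unicast DHCP renew would leave through another interface."""
    egress = egress_by_route_lookup(routes)
    return [iface for iface in lease_interfaces if iface != egress]


if __name__ == "__main__":
    routes = [DefaultRoute("GE3", 1), DefaultRoute("GE4", 2)]
    print(mismatched_leases(["GE3", "GE4"], routes))
```

Here the renew for the lease held on GE4 is sent out of GE3, so the lease on GE4 may not be renewed and the address on that link is eventually lost.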
- Fixed Issue 83411: When High Availability is turned on with a newly activated VMware SD-WAN Edge, the HA Edge pair may go offline.
When HA is turned on, all the Edge interface MAC addresses are changed to Virtual MAC addresses, and during the issue state the DPDK configuration is not updated with these VMAC addresses. As a result, the heartbeat packets destined for the Orchestrator are dropped due to a destination MAC mismatch, and the Orchestrator marks the HA Edge pair as down.
- Fixed Issue 83424: An SNMP walk may not work properly for interface and path related OIDs.
When an snmpbulkwalk command is run for some Edge deployments, the SNMP walk may take too long and time out. The fix for this issue optimizes SNMP and ensures faster responses to SNMP walk requests. However, it should be noted that in rare instances the issue may still occur as the SNMP process remains a lower priority process on the Edge.
- Fixed Issue 83428: On a customer enterprise using a Hub/Spoke topology, the static tunnels between a VMware Hub Edge and Spoke Edge may stop passing traffic while attempting to measure bandwidth on the tunnel.
On the Hub Edge, there is no mechanism for handling a scenario where tunnel preference is updated in the middle of the tunnel establishment process. The bandwidth measurement process then puts the tunnel into a stalled state and traffic cannot pass on this tunnel. Customer traffic can reroute through the Gateway, but this may introduce latency into the Hub/Spoke traffic.
- Fixed Issue 83432: For a site deployed with a High-Availability topology, when additional tunnels are added to the site the VMware SD-WAN Standby Edge may experience a Dataplane Service crash and generate a core.
A common way tunnels are added is by adding WAN links to the HA Edges. When the number of tunnels the Standby Edge needs to synchronize with the Active Edge exceeds 80, this triggers an exception and a Dataplane Service failure on the Standby. When this issue is encountered on a conventional HA topology the customer impact would be minimal as the Standby Edge does not pass customer traffic. On an Enhanced HA deployment, where the Standby Edge is also passing traffic, the reboot(s) would disrupt some customer traffic.
- Fixed Issue 83611: Customer may observe an unusually high number of EDGE_NEW_DEVICE events from a VMware SD-WAN Hub Edge on the VMware SASE Orchestrator UI.
The issue can be encountered with the following topology: Client Device – Spoke Edge – Hub Edge – DHCP Server. With this topology, every time a client user behind a Spoke Edge sends a DHCP packet, the Spoke Edge properly triggers an EDGE_NEW_DEVICE event for this client device. But when the Hub Edge receives the DHCP Relay packet, the Hub Edge again triggers an EDGE_NEW_DEVICE event to the Orchestrator, and this event is incorrect.
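One standard way to tell a relayed DHCP packet from one sent directly by a client is the `giaddr` (relay agent address) field, which is all zeros unless a relay agent has filled it in. The sketch below uses that field to decide whether an Edge should raise the event. Whether the actual fix relies on `giaddr` is an assumption, and the `DhcpPacket` type and function name are hypothetical.

```python
from dataclasses import dataclass

UNRELAYED = "0.0.0.0"  # giaddr value when no relay agent has handled the packet


@dataclass(frozen=True)
class DhcpPacket:
    client_mac: str
    giaddr: str = UNRELAYED


def should_raise_new_device(pkt: DhcpPacket) -> bool:
    """Raise the event only for packets received directly from a client."""
    return pkt.giaddr == UNRELAYED


if __name__ == "__main__":
    direct = DhcpPacket("00:11:22:33:44:55")
    relayed = DhcpPacket("00:11:22:33:44:55", giaddr="10.0.2.1")
    print(should_raise_new_device(direct), should_raise_new_device(relayed))
```

With this check the Spoke Edge, which sees the client's own packet, still reports the device, while the Hub Edge, which only sees the relayed copy, does not.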
- Fixed Issue 83699: When a VMware SD-WAN Gateway is set to quiesced mode from the VMware SASE Orchestrator's New UI, when the user selects a new replacement Gateway, the Orchestrator does not allow any configuration changes to the replacement Gateway.
This issue happens after activating the Non SD-WAN Destination migration process via the Orchestrator's New UI, part of which is selecting a New Gateway, which is the Gateway replacing the quiesced Gateway. Once that New Gateway is designated as the replacement Gateway, when attempting to make any configuration change to the replacement Gateway the Operator would observe an error message thrown similar to: "GATEWAY_SERVICE_STATE_INVALID: Cannot change the state of the gateway to null, as it is already used as a replacement gateway".
- Fixed Issue 83928: A VMware SD-WAN Edge may experience high CPU usage and poor customer traffic performance.
Users would also be able to observe poor QoE scores when looking at the Orchestrator's Monitor > Edge > QoE screen for that Edge. The issue is caused by an ACL (Access Control List) rule being instantiated multiple times on the Edge. Processing this many ACL rules at once strains the Edge's CPU capacity, and as a result the Edge is unable to process customer traffic properly.
- Fixed Issue 83946: VMware SD-WAN Edge LAN-side clients may observe disruptions in traffic, and for a site using RADIUS authentication, client users may observe authentication failures.
Large packets will be fragmented and these fragmented packets can be dropped by the Edge. The packets are dropped due to a memory leak during fragment IP identification translation during some error scenarios and if the Edge limit for fragmented packets is exceeded, then further fragmented packets will be dropped by the Edge.
For customers using RADIUS where large packets from a wireless client to an Edge using RADIUS authentication are involved this can cause authentication failures. For example, large packets from a wireless LAN controller (WLC) to a RADIUS server may be dropped.
- Fixed Issue 84106: A VMware SD-WAN Edge may export NetFlow statistics at the incorrect time interval which would cause the receiving systems to be out of sync.
NetFlow packets can have an additional 5 second delay from the configured interval. This is because the NetFlow exporter checks for export time once every 5 seconds only, and as a result the NetFlow packets can have a delay of 5 seconds between the configured interval and the actual export interval.
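The timing effect can be worked through with a small model: an export that becomes due between two checks has to wait for the next check. The sketch assumes the checks fall on multiples of 5 seconds counted from the start of the interval, which is a simplification; function names are hypothetical.

```python
CHECK_PERIOD_S = 5  # the exporter checks once every 5 seconds, per the note


def actual_export_time(due_s: int, check_period_s: int = CHECK_PERIOD_S) -> int:
    """First check tick at or after the moment the export becomes due."""
    # Integer ceiling of due_s / check_period_s, scaled back to seconds.
    return -(-due_s // check_period_s) * check_period_s


def export_delay(interval_s: int, check_period_s: int = CHECK_PERIOD_S) -> int:
    """Extra seconds added on top of the configured export interval."""
    return actual_export_time(interval_s, check_period_s) - interval_s


if __name__ == "__main__":
    for interval in (60, 61, 62, 64):
        print(interval, export_delay(interval))
```

In this model an interval that is a multiple of 5 sees no extra delay, while any other interval is stretched to the next 5-second boundary. If the check ticks are not aligned with the start of the interval, the delay can approach the full 5 seconds.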
- Fixed Issue 84359: When a VMware SD-WAN Edge interface flaps, it is possible that multiple IPv4 addresses can be assigned to it.
When an interface configured with a DHCP client flaps (goes down and up in rapid succession), the entire DHCP client process is carried out again and there could be scenarios where a different IP address is acquired each time. In this case, the older IP address is not cleared and becomes stale.
Without this fix, the only way to remediate the issue is for a user to manually delete the stale IP addresses from the interface through the Linux shell using the "ip address del" command.
- Fixed Issue 84501: For a customer enterprise using 802.1x authentication (for example, RADIUS, Cisco ISE), when the VMware SD-WAN Edges are upgraded to Release 4.3.1 or later, client devices connected to the Edge may fail authentication against the Network Access Server (NAS) that is hosted over the WAN.
The NAS IP address is set as a loopback IP address by default in the RADIUS or ISE packets sent from the Edge (Authenticator) to the RADIUS or ISE Server and this can cause the authentication packets to not reach the NAS, causing the authentication failure. To remediate the issue, builds with this fix set the NAS IP address as the source interface IP address selected and configured with 802.1x Authentication settings. If 'Auto' is selected as source interface, the loopback IP will be set as NAS IP address by default.
- Fixed Issue 84825: When a large bulk routing configuration is applied on a VMware SD-WAN Edge in a single step, the Edge may experience repeated Dataplane Service failures resulting in repeated restarts of the Edge service to recover from each failure.
When a standalone (non-HA) Edge encounters this issue, there is significant impact to customer traffic because while a single Edge service restart disrupts traffic for ~15 seconds, repeated Edge service restarts would result in disruptions of ~60 seconds or more. On a site with a High-Availability topology, the customer would observe repeated failovers resulting from the Edge service restarts which would also disrupt customer traffic.
This issue occurs when a bulk routing configuration involving a large number of neighbors and route-maps is applied on an Edge in a single step. The Edge system faces great stress while converting these configurations into command specifications and applying them on routing protocols in a short span of time and this causes the repeated Edge service failures and restarts.
On an Edge build without the fix, to mitigate the risk of this issue a customer user would need to do the following:
- Instead of applying a large configuration in a single step, the configuration should be broken into multiple smaller sections with each section applied separately.
- The number of routing filters should be minimized.
- The Edge should only be deliberately restarted in a maintenance window and Edge service restarts should be generally avoided if there are a number of routing filters configured, as the entire Edge configuration is applied at once during restart which would greatly increase the risk of encountering this issue.
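The first mitigation step, splitting one large configuration into smaller sections, can be sketched as a simple chunking helper. How each section is then applied (through the Orchestrator UI or otherwise) is outside the sketch, and the section size of 10 is an arbitrary illustration rather than a recommended value.

```python
def chunked(items, size):
    """Yield consecutive sections of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


if __name__ == "__main__":
    # Hypothetical neighbor entries standing in for a bulk routing configuration.
    neighbors = [f"neighbor-{i}" for i in range(25)]
    for section in chunked(neighbors, 10):
        print(len(section), section[0], section[-1])
```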
- Fixed Issue 84847: Customers deploying a USB-based LTE modem on a VMware SD-WAN Edge, or deploying a VMware SD-WAN Edge LTE model (510-LTE or 610-LTE) may experience intermittent issues with building tunnels from the CELL interface after the modem is reset.
The issue occurs when the LTE modem is reset in one of the following scenarios:
- On an Edge using a USB modem, by unplugging the modem from the USB port and plugging it back in.
- On an Edge-LTE, after an Edge reboot or by resetting the CELL1 interface via Test & Troubleshoot > Remote Diagnostics > Reset USB Modem > CELL1.
In either scenario the underlying network device changes from wwan0 to wwan1 and the Edge does not honor this new name because it appears to be a duplicate interface.
Without the fix the workaround to restore the LTE interface is to restart the Edge Service through Remote Actions > Restart Service.
- Fixed Issue 85369: For a site deployed with a High-Availability topology, the customer may observe customer traffic disruptions and possibly multiple reboots of the VMware SD-WAN Standby Edge.
Multiple threads on the HA Edges are becoming suspended leading to various issues in HA, including but not limited to an HA Active-Active state. If the site does become Active-Active, a conventional HA setup would experience minimal traffic disruption since the Standby Edge does not pass traffic in this topology, but on an Enhanced HA deployment, where the Standby Edge is also passing traffic, the reboot(s) would disrupt some customer traffic. The other way multiple thread suspension can impact a customer is through path disruption which would also be observed as customer traffic disruption. Thus a customer HA site could encounter this issue without necessarily seeing the signs of an Active-Active scenario where the Standby Edge reboots.
The root cause for multiple HA Edge threads getting suspended remains under investigation.
- Fixed Issue 85375: Customers using either USB-based LTE modems on a VMware SD-WAN Edge or VMware SD-WAN Edge LTE models (510-LTE or 610-LTE) may experience disruptions to LTE traffic.
A user would observe RX errors in the Edge logs which increment without any traffic passing through the LTE interface. One aspect of the issue is that it occurs only if the MTU for the LTE link is less than 1500.
- Fixed Issue 85459: An attempt to SSH either from an Edge LAN-side client to an Edge, or from a remote branch Edge client to an Edge may not work after LAN side NAT rules are configured.
SSH reply packets coming from the Edge's SSH process go through the Edge's dataplane service. Because LAN side NAT rules are configured, the SSH reply packets may use those rules and go to a different destination than the original client that generated the SSH traffic, which causes the SSH attempt to the Edge to fail.
On an Edge which lacks the fix, the only workaround is to remove the NAT rule.
- Fixed Issue 86032: When upgrading a VMware SD-WAN Gateway from Release 4.3.x to 4.5.1 or 5.0.0, the Gateway will drop communication with the VMware SASE Orchestrator and eth0 and eth1 interfaces are removed.
The core issue is the Gateway's dataplane process stops after the upgrade. This is caused by the Telegraf service failing to start and since the Telegraf service is activated as part of the Gateway startup script, if Telegraf fails to start, the Gateway's service fails to start as well.
If a Gateway is upgraded to a build without this fix, the only way to remediate the issue is to run 'vc_procmon restart' for the Gateway along with a Telegraf service restart.
- Fixed Issue 86103: For a customer enterprise that uses RADIUS authentication, client users at some sites may be unable to connect to VMware SD-WAN Edges and pass traffic.
The issue is caused by the Edge incorrectly categorizing fragmented RADIUS packets with the DF (Don't Fragment) bit set in the IP header as non-fragmented. One or more of these packets fails to reach multiple Edges with the result that traffic that relies on RADIUS authentication will not pass for those Edges. This issue can occur in any topology including Hub/Spoke and simple Branch-to-Branch.
Without the fix the only workaround is to configure the RADIUS server to not set the DF bit in the IP header while sending fragmented packets.
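In the IPv4 header, whether a packet is a fragment is determined by the More Fragments (MF) bit and the fragment offset, independently of the DF bit. The sketch below shows that classification for a raw header; it is an illustration of the header semantics, not the Edge's code, and `make_header` is a hypothetical helper that builds a mostly zeroed 20-byte header for demonstration.

```python
import struct

DF = 0x4000           # Don't Fragment flag
MF = 0x2000           # More Fragments flag
OFFSET_MASK = 0x1FFF  # 13-bit fragment offset, in units of 8 bytes


def is_fragment(ipv4_header: bytes) -> bool:
    """True if the packet is any fragment of a larger datagram."""
    if len(ipv4_header) < 20:
        raise ValueError("IPv4 header must be at least 20 bytes")
    # Bytes 6-7 hold the 3 flag bits followed by the fragment offset.
    (flags_frag,) = struct.unpack_from("!H", ipv4_header, 6)
    return bool(flags_frag & MF) or (flags_frag & OFFSET_MASK) != 0


def make_header(flags_frag: int) -> bytes:
    """Demo helper: a zeroed 20-byte header with only the flags/offset field set."""
    return bytes(6) + struct.pack("!H", flags_frag) + bytes(12)
```

A first fragment sent with DF set (`DF | MF`) and a later fragment with DF set and a non-zero offset are both fragments under this test, whereas a check that looks at the DF bit alone would treat them as non-fragmented.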
- Fixed Issue 86314: A VMware SD-WAN Edge may perform an incorrect Stateful Firewall rule lookup when a LAN-side NAT flow is initiated by a remote peer.
When a user configures a LAN-side source NAT (for example, to hide an internal IP subnet behind the Edge) on an Edge where the Stateful Firewall is in use, and a flow is initiated by a remote peer, an erroneous firewall lookup is done for the first return packet.
For example, suppose that an Edge has the following configuration:
LAN-side NAT: [source] inside address: 10.0.2.25/32 outside address: 184.108.40.206/32
Static route: 184.108.40.206/32 [advertise] next hop: 10.0.2.1
A remote client sending a ping to 184.108.40.206 from 10.0.1.25 would result in two firewall rule lookups on the Edge. The first incoming packet would result in a firewall lookup for 10.0.1.25 > 184.108.40.206, and then the first return packet would result in a firewall rule lookup using the non-NAT IP for 10.0.2.25 > 10.0.1.25. This second firewall rule lookup is done in error.
Without this fix, the user would need to create an additional firewall rule to allow the first return packet of the flow.
- Fixed Issue 86617: A customer enterprise that uses a Loopback IP Address with Partner Gateways where BGP is configured may observe that traffic that should use the Loopback IP routes is getting dropped with a resulting disruption in that customer traffic.
The Loopback IP Address routes are missing on the VMware SD-WAN Edge. This is caused by a scenario where BGP is configured for the Edge and Partner Gateway and a Loopback IP Address is sent over BGP to the Edge, but the Edge does not learn the Loopback IP route.
- Fixed Issue 86740: When running the Remote Diagnostic "Interface Status", a GPON-type SFP module will not show up when it is deployed in a VMware SD-WAN Edge's SFP2 interface.
The issue is caused by a flaw in the remote diagnostic back-end script which runs on the Edge and does not properly account for the Edge's SFP2 interface.
- Fixed Issue 86808: Some BGP routes are advertised when they should not be as per BGP filters (or not advertised when they should be).
For a given route-map rule, the Edge could either have a prefix-list or a community-list configuration for the Edge's routing based on the rule match-type. However, for route-map unapply functions, the Edge is trying to delete both the prefix-list and community-list for each rule, one of which must be non-existent.
Previously, this did not cause any issues as the commands for non-existent prefix-lists and/or community-lists used to be sent to the Edge's routing process as a separate vtysh command, which would just end up being no-ops and would not impact other commands. At that time, this was a deliberate call as it kept things simple in the route-map unapply functions.
However, as part of the fix for Issue #84825, the Edge started batching multiple prefix-list/community-list removal vtysh commands together to be sent to the Edge's routing process. Now, trying to delete a non-existent prefix-list/community-list causes the whole command batch to fail and leaves a stale prefix-list/community-list configuration in the Edge's routing process.
On an Edge without a fix for this issue, a user needs to restart the Edge Service to ensure all BGP routes are properly advertised.
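The interaction between batching and non-existent lists can be modeled in a few lines. The sketch treats a batch as all-or-nothing, as the description implies, and contrasts a batch that deletes both list types for every rule with one that only deletes lists known to exist. The operation names are placeholders rather than real vtysh syntax, and filtering on existence is only one plausible remedy; the note does not say how the fix works.

```python
def naive_batch(rules):
    """Delete both list types for every rule, whether or not they exist."""
    batch = []
    for rule in rules:
        batch.append(("delete-prefix-list", rule))
        batch.append(("delete-community-list", rule))
    return batch


def filtered_batch(rules, prefix_lists, community_lists):
    """Only delete lists that are actually configured."""
    batch = []
    for rule in rules:
        if rule in prefix_lists:
            batch.append(("delete-prefix-list", rule))
        if rule in community_lists:
            batch.append(("delete-community-list", rule))
    return batch


def run_batch(batch, prefix_lists, community_lists):
    """All-or-nothing model: one invalid command fails the whole batch."""
    stores = {
        "delete-prefix-list": prefix_lists,
        "delete-community-list": community_lists,
    }
    if any(name not in stores[op] for op, name in batch):
        return False  # nothing is deleted, so stale lists remain
    for op, name in batch:
        stores[op].discard(name)
    return True
```

With one prefix-list rule and one community-list rule configured, the naive batch fails and leaves both lists in place, while the filtered batch removes them.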
- Fixed Issue 87304: If a user deactivates a LAN interface on a VMware SD-WAN Edge using the VMware SASE Orchestrator UI, the interface will still be reported as 'UP' by SNMP.
The key debug process for interfaces output does not include the physical port details for Edge LAN interfaces (for example, GE1 or GE2). As a result when SNMP polls those interfaces it always returns a result of UP regardless of how these interfaces are configured.
- Fixed Issue 87552: On a site using a Non SD-WAN Destination (NSD) via Edge, the VMware SD-WAN Edge may periodically experience a Dataplane Service failure and restart when Edge-to-NSD tunnels are unstable.
When an Edge-to-NSD tunnel is torn down, a previously chosen tunnel is released incorrectly, which triggers an exception in the Edge Dataplane Service, and a restart is required to restore the service. Restarting the Edge service will result in a 10-15 second disruption of customer traffic.
If the Edge uses a build without the fix, limiting a NSD via Edge to one WAN link will decrease the likelihood of this issue occurring.
- Fixed Issue 87612: For a VMware SD-WAN Edge with VNF Insertion on one or more VLANs, client users on those VLANs are unable to obtain IP addresses from a DHCP Relay server.
The Edge is not forwarding the DHCP relay packets and thus the client users are not receiving IP addresses.
Without the fix, the only workaround is to disable VNF Insertion on the VLAN.
- Fixed Issue 87982: A VMware SD-WAN Edge using a Metanoia-type SFP module with a private PPPoE WAN link may be unable to establish BGP peering and connect to other sites.
VLAN tagged packets using a private PPPoE link are corrupted by the Edge and never reach their destination as a result. This issue does not affect public PPPoE links.
- Fixed Issue 88757: A user running the Remote Diagnostic "Route Table Dump" on the Orchestrator UI may find the attempt times out and the page returns no result.
The Route Table Dump diagnostic times out because the WebSocket timeout is 30 seconds and for a site with a large number of routes the amount of time the debug command takes to deliver all the routes to the Orchestrator may exceed that. The fix here is to lower the time out of the route dump process to less than 30 seconds and prevent the WebSocket from timing out prior to that, which ensures that the Route Table Dump will return a result.
- Fixed Issue 88796: When deploying either a VMware SASE Orchestrator or a VMware SD-WAN Gateway and using an OVA on vSphere, the OVF properties set as part of the deployment (password, network information, etc.) are not applied to the image and the system cannot be accessed after deployment.
This only affects new systems deployed from OVA using OVF/vApp properties (versus using ISO files). This issue is caused by upstream changes to cloud-init in recent updates.
On a Gateway without the fix, the workaround is for the Operator to deploy the system using a cloud-init user-data ISO file.
Note: This open ticket applies only to Gateway builds. The Orchestrator issue is fixed with the Release R5002-20220517-GA build and later.
- Fixed Issue 89217: A VMware SD-WAN Edge in the 6x0 model line (610, 610N, 610-LTE, 620, 620N, 640, 640N, 680, 680N) may suddenly power off for no reason.
The 6x0 Edge would have all lights off, both the front status LED and the rear Ethernet port lights, and can only be recovered by manually power cycling the Edge.
The cause of the issue is traced to a PIC microcontroller exclusive to the Edge 6x0 line which uses a PIC firmware version of v20M or earlier (v20L, v20K, v20J). This issue can only occur when the 6x0 Edge uses a PIC version of v20M or earlier, but even with this version the odds of experiencing the power off issue are low (approximately 1/1,000). The issue cannot occur on a 6x0 Edge with a PIC firmware version of v20N or later.
Note: A 6x0 Edge's Firmware including PIC version can be determined on an Orchestrator using 5.x by going to the Monitor > Edge > Overview page for that Edge and clicking the dropdown information box next to the Edge name which includes the Edge Information, Device Version, and the Device Firmware.
The issue is resolved by upgrading the 6x0 Edge to Platform Firmware 1.3.0 (R130-20220328-GA), which includes PIC version v20N. To do this the 6x0 Edge must be connected to a VMware SASE Orchestrator using Release 5.x (5.0.0 or later), and the 6x0 Edge must also be using a Release 5.x build as well. The user would then update the 6x0 Edge Platform Firmware to version 1.3.0 (R130-20220328-GA) in the same way that an Edge's software version is modified.
For information on uploading a Platform Firmware bundle to an Orchestrator, consult the Platform Firmware and Factory Images with New Orchestrator UI section of the VMware SD-WAN Operator Guide.
For information on updating a 6x0 Edge’s Platform Firmware, consult the View or Modify Edge Information section of the VMware SD-WAN Administration Guide.
Note: If the user prefers to keep the Edge on a lower software release (for example, Release 4.3.1, or 4.5.1), the customer can temporarily upgrade the Edge to a 5.x build, perform the Platform Firmware upgrade to version 1.3.0 (R130-20220328-GA) so that the PIC version is v20N, and then downgrade the Edge’s software back to their preferred version. Downgrading the 6x0 Edge's software to an earlier version does not also downgrade the Edge's Platform Firmware and the Edge would continue to use Platform Firmware version 1.3.0 (R130-20220328-GA). In this use case the customer Edges would need to be on an Orchestrator using Release 5.x.
Note: If the 6x0 Edge is on an Orchestrator that does not use version 5.x and has experienced this issue and requires an update of its PIC firmware, the customer may reach out to VMware SD-WAN Support and they will manually update the Edge’s PIC version.
- Fixed Issue 90151: For BGP over IPsec on Gateway, applying different BGP filters to primary and secondary neighbors may not work as expected.
When different filters are applied to a Non SD-WAN Destination (NSD)-BGP on the VMware SD-WAN Gateway's primary and secondary neighbors, only one gets applied to both BGP neighbors.
The cause of this issue is that for Partner Gateway (PG)-BGP, the SD-WAN service identifies BGP filters using a combination of enterprise_logical_id and segment_id. This combination was sufficient for PG-BGP because a given enterprise-segment combination could have only one PG-BGP neighbor.
However, this method was inherited for NSD-BGP on the Gateway where there can be up to 2 BGP neighbors (Primary and Secondary) for the same enterprise-segment combination. As a result, the enterprise_logical_id and segment_id combination does not suffice for differentiating between filters of 2 different NSD-BGP neighbors.
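The key collision described above can be illustrated with a minimal Python sketch. This is not the Gateway's actual implementation: the `store_filters` helper, the `neighbor_role` field, and the filter names are hypothetical, and adding a neighbor identifier to the key is only one possible way to tell the two neighbors apart (the release notes do not describe how the fix is implemented).

```python
# Hypothetical model of filter lookup. Only enterprise_logical_id and
# segment_id come from the release notes; every other name is an assumption.

def store_filters(entries, key_fields):
    """Store each neighbor's filter under a key built from key_fields."""
    table = {}
    for entry in entries:
        key = tuple(entry[field] for field in key_fields)
        # On a key collision, the later entry overwrites the earlier one.
        table[key] = entry["filter"]
    return table

nsd_neighbors = [
    {"enterprise_logical_id": "ent-1", "segment_id": 0,
     "neighbor_role": "primary", "filter": "filter-A"},
    {"enterprise_logical_id": "ent-1", "segment_id": 0,
     "neighbor_role": "secondary", "filter": "filter-B"},
]

# Two-field key: both NSD-BGP neighbors map to the same entry.
two_field = store_filters(nsd_neighbors, ("enterprise_logical_id", "segment_id"))

# Three-field key (assumed): each neighbor keeps its own filter.
three_field = store_filters(
    nsd_neighbors, ("enterprise_logical_id", "segment_id", "neighbor_role")
)

print(len(two_field), len(three_field))
```

With the two-field key only one filter survives for both neighbors, which matches the symptom reported for this issue.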
- Fixed Issue 90283: A customer may experience poor audio and/or video quality for VoIP and videotelephony calls if Underlay Accounting is turned on for the WAN link being used on the VMware SD-WAN Edge.
When checking the logs, the user would observe packets for bidirectional traffic where the traffic is asymmetrically routed, and one of the routes is via the underlay. In other words, when the routes for a flow are asymmetric such that in one direction the traffic takes an underlay route and in the reverse direction it takes an overlay path and where Underlay Accounting is toggled on for that WAN link, packet loss may be experienced on bidirectional flows which are typical of, but not limited to, VoIP and videotelephony calls.
- Fixed Issue 90876: DNS fails on a Non-Global Segment for a one-hop away client who is connected to a VMware SD-WAN Edge either by a LAN interface or a routed sub-interface without a Gateway IP.
The cause of the issue differs depending on which Edge interface type the one-hop away client is using.
- If the one-hop away client is connected to an Edge via a LAN port, DNS resolution fails for the Non-Global Segment because the interface the Edge uses to route the reply packet to the client is the VCE1 interface, which the Edge process treats as belonging to the Global Segment. As a result, the reply packet is dropped as a static route is available in the Global Segment routing table.
- If the one-hop away client is connected to an Edge via a routed sub-interface port which does not have a Gateway IP address and a static route for the client on the Orchestrator, then DNS resolution fails for the client as the Edge does not have a route for the client. Instead, the Edge matches the connected route and sends an ARP request for the destination IP itself; the ARP fails and the reply is not sent.
For an Edge that does not have a fix for this issue, the workaround for a client using an Edge's LAN is to only use the Global Segment. For a client using a routed sub-interface, the workaround is to provide a Gateway IP address and, if that is not possible, only use the Global Segment.
- Fixed Issue 91875: For a customer who has configured a WAN link as a Backup on a VMware SD-WAN Edge, they may observe the backup WAN link becoming active intermittently even though the conditions requiring the link to become active are not present.
The issue is caused by a race condition in an Edge process that leads the Edge to erroneously conclude the backup WAN link is needed and to build a tunnel for that link. The Edge has no failsafe for detecting and tearing down this erroneous tunnel.
- Fixed Issue 93383: A VMware SD-WAN Edge may suffer one or more Dataplane Service failures with a disruption in customer traffic.
The issue is caused by a rare instance of a mismatch of the number of interfaces stored in the Edge in two different data structures which triggers an exception and results in the Edge service failing one or more times. The Edge service needs to restart to recover which, in a non-HA deployment, would cause a 10-15 second disruption of customer traffic for each restart. However, if the Edge service fails three consecutive times, the Edge will require a reboot or power cycle to recover.
Resolved in Orchestrator Version R5010-20220912-GA
Orchestrator version R5010-20220912-GA was released on 09-13-2022 and is the updated Orchestrator GA build for Release 5.0.1.
The updated R5010-20220912-GA Orchestrator build resolves the below issues since Orchestrator build R5010-20220817.
- Fixed Issue 95847: When a VMware SASE Orchestrator is upgraded to version 5.0.1, the Operator performing the upgrade may observe the schema upgrade was not successful and must be manually rerun.
When an Orchestrator is upgraded to a version with a new ClickHouse schema, there is the potential for a race condition on the backend in which the old schema version is not up and ready prior to being upgraded. As a result, the Operator needs to manually rerun the schema upgrade.
- Fixed Issue 96095: When a VMware SASE Orchestrator is configured for Disaster Recovery (DR), the IPv6 Orchestrator IP address is cleared from all the Operator Profiles and IPv6-only SD-WAN Edges are marked as down by the Orchestrator.
If an Operator Profile is configured with both an IPv4 and an IPv6 Orchestrator IP address, but DR is set up using only the IPv4 address, the IPv6 address is removed from the Operator Profile in the Orchestrator. This causes IPv6-only Edges to stop communicating with the Orchestrator, which then marks those Edges as down and stops pushing configuration changes to them.
If this issue is encountered on an Orchestrator without the fix, post-upgrade the DR needs to be broken and set up again for the management IP addresses to be restored.
- Fixed Issue 96108: When a VMware SASE Orchestrator is upgraded to a 5.x build, a customer may observe missing memory usage statistics for their VMware SD-WAN Edges when looking at the Monitor > Edge pages of the UI.
The issue is caused during the migration to a 5.x Orchestrator by older Edges sending a different name for their health statistics memory field (memPct) when the Orchestrator is expecting to receive the Edge's historic health statistics memory field using the current name (memoryPct). As a result, the Orchestrator ignores the Edge health statistics memory field value submitted with the unexpected memPct name, and the Orchestrator defaults the health statistics memory field value to zero.
The fix for this issue resolves the other cause of missing Edge health statistics on the Orchestrator UI, with the first cause being fixed in #90749 on the original 5.0.1 GA build.
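The field-name mismatch behind Issue 96108 can be sketched as follows. This is a minimal illustration, not the Orchestrator's code: only the field names memPct and memoryPct come from the description above, while the `normalize_health_stats` helper and the `cpuPct` field are hypothetical.

```python
def normalize_health_stats(record):
    """Return a copy of record with the legacy memPct field renamed to memoryPct.

    Hypothetical helper: an ingest path that reads only memoryPct would see
    no memory value (and fall back to zero) for records that still use memPct.
    """
    normalized = dict(record)
    if "memoryPct" not in normalized and "memPct" in normalized:
        normalized["memoryPct"] = normalized.pop("memPct")
    return normalized

# cpuPct is an illustrative field, not taken from the release notes.
legacy_record = {"cpuPct": 12.5, "memPct": 41.0}
current_record = {"cpuPct": 9.0, "memoryPct": 37.5}

print(normalize_health_stats(legacy_record))
print(normalize_health_stats(current_record))
```

Accepting both names at ingest time preserves the memory value from older Edges instead of defaulting it to zero.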
Resolved in Orchestrator Version R5010-20220817-GA
Orchestrator version R5010-20220817-GA was released on 08-17-2022 and is the updated Orchestrator GA build for Release 5.0.1.
This Orchestrator build replaces the original GA build R5010-20220803-GA, which included Issue #95613. This issue was discovered during an Orchestrator upgrade after this build was released on 08-05-2022. Customers must only use the R5010-20220817-GA build and not use R5010-20220803-GA.
- Fixed Issue 95613: When a VMware SASE Orchestrator is upgraded to build R5010-20220803-GA, a customer connected to that Orchestrator may experience difficulty monitoring their Edges and, if they use API calls, observe that those calls fail. An Operator user would observe the API process fail and require a restart along with high CPU usage on the Orchestrator.
This issue is triggered by a user configuring their enterprise to make API calls without any time interval gap (in other words, each API call is immediately followed by another). This activity causes the v2 API process that handles the API call to experience an exception and fail when it receives the API request. The failure of the v2 API process means Edge monitoring for data like link statistics (which relies on API calls) will not display accurate data and customer enterprises using API calls would find them failing as well. In addition, if the same enterprise continues making API calls with no time interval gap, the v2 API process will effectively be stuck in a cycle of failure and restart until those API calls can be stopped or modified to include a time interval.
The failure and restart of the v2 API process also causes high Orchestrator CPU consumption which would impact the overall performance of the Orchestrator beyond handling API calls.
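For customers who script API calls, one way to avoid back-to-back requests is to enforce a minimum interval between them on the client side. The following Python sketch is illustrative only: `call_with_min_interval` is a hypothetical helper, the callables stand in for real API requests, and the interval value is an arbitrary assumption rather than a VMware recommendation.

```python
import time

def call_with_min_interval(calls, min_interval_s,
                           clock=time.monotonic, sleep=time.sleep):
    """Run each callable in order, leaving at least min_interval_s
    between the start of one call and the start of the next."""
    results = []
    last_start = None
    for call in calls:
        if last_start is not None:
            remaining = min_interval_s - (clock() - last_start)
            if remaining > 0:
                sleep(remaining)
        last_start = clock()
        results.append(call())
    return results

# Stand-in callables; a real script would issue its API requests here.
responses = call_with_min_interval(
    [lambda: "call-1", lambda: "call-2", lambda: "call-3"],
    min_interval_s=0.01,
)
print(responses)
```

The `clock` and `sleep` parameters are injectable so the pacing logic can be exercised without real delays.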
Resolved in Orchestrator Build R5010-20220803-GA
The below issues have been resolved since Orchestrator Build R5002-20220517-GA.
- Fixed Issue 49535: On the Monitor > Network Services page, the VMware SASE Orchestrator does not immediately update the BGP neighbor state of a VMware SD-WAN Edge that has gone offline.
The BGP Edge Neighbor State table will continue to show the offline Edge as "Established" and remain that way for hours after the Edge has gone offline. This impacts any user who relies on the Orchestrator UI to check these details.
- Fixed Issue 68463: When looking at the Monitor > QoE graph on the VMware SASE Orchestrator where the graph section is listed as yellow/fair for quality, there may be a discrepancy between the reason the graph section is listed as yellow/fair when looking at the Classic UI versus the New UI.
When encountering this issue, on the Classic UI, if the pop-up box lists Latency as the reason for the fair score, the New UI would have a pop up box that lists Jitter as the cause for the fair score. The issue is caused by an incorrect mapping of the Latency and Jitter values on the New UI.
- Fixed Issue 70005: When using VMware Cloud Web Security, a user can edit an existing Security Policy and rename it an empty or blank name and save it on the VMware SASE Orchestrator.
A user cannot create a Security Policy with an empty/blank name but can edit an existing policy to configure the name to be blank and the Orchestrator permits the change and does not throw an error.
- Fixed Issue 76036: Attempting to access either the 'Partner Overview' page or a 'Configure > Customer' page for that Partner on a VMware SASE Orchestrator fails with an "An unexpected error has occurred" message.
The Partner Overview page and/or a Configure > Customer page for a customer supported by that partner may fail to load because the `enterpriseProxy/getEnterpriseProxyGatewayPools` API times out. The trigger is a page that includes a large number of Gateway Pools and Gateways, which may cause the API call used on the page to time out and the page to fail to load.
- Fixed Issue 81835: The Monitor > Edge > QoE page of the VMware SASE Orchestrator UI may not accurately represent a WAN link's status (whether it is online, offline, or degraded) or accurately represent link metrics for the time period selected.
Different time intervals can lead to the QoE graph showing different results for a WAN link status. And for a link's metrics, the QoE graph may present a particular QoE value (latency, loss, or jitter) that does not reflect the real metric value at that exact time.
This issue is caused by multiple WAN links belonging to different enterprises being assigned the same link logical ID, which leads to a malfunction in the Orchestrator's link data backfill process. The Orchestrator erroneously assumes the WAN link logical ID to be unique because it is not tied to a customer's enterprise ID. This allows for duplicate link logical IDs and the possibility of incorrect link metrics and status.
The fix for this issue stores the link keys in the Orchestrator's database as a combination of the customer enterprise logical ID and the WAN link's logical ID, ensuring each WAN link is unique.
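The effect of the composite key can be sketched in a few lines of Python. The sample identifiers, latency values, and the `link_key` helper are hypothetical and only illustrate the idea; they do not reflect the Orchestrator's database schema.

```python
from collections import defaultdict

def link_key(enterprise_logical_id, link_logical_id):
    """Build a composite key so links from different enterprises never collide."""
    return (enterprise_logical_id, link_logical_id)

# Two enterprises whose WAN links happen to share a link logical ID.
# Each tuple is (enterprise, link, latency in ms); all values are made up.
samples = [
    ("enterprise-A", "link-1", 20),
    ("enterprise-B", "link-1", 180),
]

by_link_only = defaultdict(list)
by_composite = defaultdict(list)
for enterprise, link, latency_ms in samples:
    by_link_only[link].append(latency_ms)
    by_composite[link_key(enterprise, link)].append(latency_ms)

print(dict(by_link_only))   # metrics from both enterprises are mixed together
print(dict(by_composite))   # each enterprise's link keeps its own metrics
```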
- Fixed Issue 82725: A VMware SASE Orchestrator may not generate the password reset link correctly.
This issue occurs when the URL for the Orchestrator is not exactly https://domain/ or https://domain/operator/. For example, if the URL is https://domain/test/, the password reset link does not work and directs you back to the login page.
When encountering this issue on an Orchestrator without the fix, if the Orchestrator URL cannot be corrected to a URL as shown above, the only option is for a Superuser or Operator to manually enter a new password for the user and then share that with the affected user so that they could in turn reconfigure a different password for themselves once they were successfully logged back in.
- Fixed Issue 83165: An Operator user is not able to transfer a Customer to a Partner on the VMware SASE Orchestrator with the reason that they do not have the same Gateway Pool, even though both do have the same Gateway Pool.
This is caused by an API call network/getNetworkEnterpriseProxies not returning the Gateway Pool details and leading the Orchestrator to think the Partner and Customer do not have the same Gateway Pool and rejecting the assignment.
- Fixed Issue 83538: For customers using the Secure Access service, when creating a Remote Access service, the Enterprise & Network Settings Screen shows internal error message keys on the VMware SASE Orchestrator.
When creating a Remote Access service, if the user enters invalid data in the customer subnet or subnet bits fields, an untranslated error message is displayed below these fields. This error message is of no use to the user and does not point to resolving the actual issue regarding invalid data in either field.
- Fixed Issue 83539: On a VMware SASE Orchestrator deployed with a Disaster Recovery (DR) configuration, when the Orchestrator is upgraded to a new software version, the DR synchronization fails.
DR is running properly prior to the upgrade, but when an Operator user upgrades the Active and Standby Orchestrators, DR status will show as failed.
- Fixed Issue 83582: When upgrading a VMware SASE Orchestrator from Release 4.5.0 to Release 5.0.0, the process takes much longer than expected and until the process completes all Orchestrator services are unavailable.
During the upgrade, the schema update for the Edge Statistics table can take more than 15 minutes, when the LRQ schema should be updated instead. This causes a major delay in the Orchestrator upgrade completing.
- Fixed Issue 83822: For customers using VMware Cloud Web Security, when looking at Monitor > Logs > Web Logs on the VMware SASE Orchestrator, the user is only able to see a maximum of 100 logs and cannot load more pages to see additional logs.
With this issue the user is limited to a maximum of 100 logs on a single page, with no additional logs viewable, as pagination is broken for Web Logs on the Orchestrator UI. This is a major hindrance for users because it means if they want to load a large time period (for example, 30 days) they would be unable to see all the logs for that period. The only workaround is to load short time periods that return 100 or fewer logs.
- Fixed Issue 84152: When a customer generates a Top Talkers report for their enterprise, the Top Talker names may be listed as 'Unknown'.
"Top Talkers" are the top sources from all the flows in a given time range. The Top Talker name may not show if the client device is not present for the (Source IP + MAC Address) unique pair. This happens because the client devices are saved based on which Visibility Mode (IP Address or MAC Address) is configured for the VMware SD-WAN Edge. For example, an Orchestrator may save a device for (IP Address 1, MAC Address 1) and then the (IP Address 2, MAC Address 2) record is not saved if Visibility Mode is set to IP Address. This would lead to the Top Talker corresponding to IP Address 2/MAC Address 1 being marked as 'Unknown'.
- Fixed Issue 84214: When an Operator user is on the Gateways page of a VMware SASE Orchestrator UI, they may be unable to assign a particular Gateway for the role of Super Gateway.
When a Gateway is already assigned the role of both Super and Alternate Super Gateway, and the Operator tries to edit the Super Gateway assignment of an enterprise from the Customer Usage list on the Gateways > Configure Gateways screen, the UI does not correctly find associated data about the Super Gateway and the Assign Super Gateway dialog does not show up while also throwing an error in the console.
- Fixed Issue 84969: When a VMware SD-WAN Edge running a 4.2.x Release which is also configured with an overridden non-default Management IP is upgraded to Release 4.3.x or higher on a VMware SD-WAN Orchestrator running 4.3.x or higher, the Edge may lose the configured overridden Management IP.
An Orchestrator running 4.3.x or higher is not automatically creating the loopback interface while also retaining the overridden non-default Management IP for an Edge, when that Edge is upgraded from 4.2.x to a 4.3.x or later build.
- Fixed Issue 86546: For customers using VMware Secure Access, a user may not be able to use Secure Access on some SASE PoPs, and some may even show as offline on the VMware SASE Orchestrator.
VMware Gateways that are not configured for use with Secure Access (in other words, Gateways which do not have a Geneve tunnel with the tunnel server on the PoP) are also given information about the Secure Access service by the Orchestrator. This leads to a broken route being picked in some instances for routing customer traffic. This issue can be encountered only when more than one Gateway is assigned per PoP per Gateway pool on a particular Orchestrator.
On an Orchestrator that does not have the fix for this issue, the workaround is to add and keep only one Gateway per PoP in each Gateway Pool so that this Gateway always gets picked for Secure Access and the establishing of the correct route.
- Fixed Issue 86848: When a customer administrator makes a failed login attempt using the Native (username/password) method to their customer enterprise on the VMware SASE Orchestrator, the Orchestrator does not log the failed attempt on the Monitor > Events page of the UI.
The Orchestrator should log every login attempt, whether it is successful or not, to ensure proper accountability of all user accounts and to allow administrators to detect unusual login activity. The issue is caused by the Orchestrator not including the 'enterpriseId' metadata in a failed username/password authorization attempt. This only affects customer users using Native (username/password) authorization; customer enterprises using Single Sign On (SSO) are not impacted by this issue.
- Fixed Issue 87111: When a VMware SASE Orchestrator is upgraded to 4.3.x or later, the VMware SD-WAN Edges connected to the Orchestrator which are configured to use BGP do not have the uplink flag configured.
The BGP uplink flag is added as a configuration in the SD-WAN 4.3.0 Release and Edge Versions 4.3.0 and later are expecting an uplink flag to be present. However, the Orchestrator is not pushing the configuration update to all Edges that are missing this flag after the Orchestrator is upgraded.
- Fixed Issue 88621: A VMware SD-WAN Gateway being migrated is unable to have its configuration modified and saved on the VMware SASE Orchestrator.
An Operator user cannot update the location for a production Gateway. When they attempt to save the configuration, the Orchestrator returns the error "GATEWAY_SERVICE_STATE_INVALID: Cannot change the state of the gateway to null, as it is already used as a replacement gateway".
- Fixed Issue 89346: On a VMware SASE Orchestrator using build 126.96.36.199, when generating a New Report from the Monitor Customers screen, the newly generated report is always delivered in English, even if the Report Language was specified as a non-English language.
The downloaded report should be displayed in the language specified under Report Language, but instead the language used is always English.
- Fixed Issue 89800: When a user updates the Segment Property on the VMware SASE Orchestrator, the Edge tunnels to their Zscaler Cloud Security Service (CSS) go down and traffic routed to Zscaler is dropped.
If a user has a CSS configured under Configure > Network Service (any CSS type) and then configures the FQDN and PSK authentication details at Configure > Edge > Device > Cloud Security Service using Edge Override, when a user updates any Segment in the Configure > Segment section of the Orchestrator, the Edge's CSS authentication configuration is deleted and the Edge can no longer connect to the Zscaler peer.
- Fixed Issue 90128: On a customer enterprise which has a Cloud Security Service (CSS) configured, when the user changes the CSS configuration, the CSS event includes the PSK of the CSS.
While this behavior does not provide a direct vulnerability, the CSS PSK value is sensitive information that should not be included in a log file.
- Fixed Issue 90540: On a VMware SASE Orchestrator using Release 5.0.0, when a VMware SD-WAN Edge using Edge Release 4.5.1 is upgraded to Release 5.0.0, the Edge loses DNS functionality and experiences a loss of connectivity with the internet.
As part of the Edge upgrade to 5.0.0, the Orchestrator's role is to push an updated Edge configuration and the DNS part of that configuration was not compatible with a 4.5.x Edge build causing the DNS settings to be lost and preventing connectivity to the Internet. The Edge would continue to pass traffic to other locations (for example, the Orchestrator, other Edges, Hub Edges, and Non SD-WAN Destinations) where DNS is not a factor.
- Fixed Issue 90067: When a VMware SASE Orchestrator is upgraded to 4.5.1 or 5.0.0, the Operator may observe high CPU usage and load issues.
During the upgrade the Orchestrator loses a critical system property: edge.learnedRoute.maxRoutePerCall. This property caps the number of routing protocol events that can be received by the Orchestrator at any one time. In the absence of this property, an Orchestrator could be flooded with routing protocol events that place it under a high load which can impact the Orchestrator's performance. The fix ensures that system property edge.learnedRoute.maxRoutePerCall persists over Orchestrator upgrades.
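The effect of a per-call cap can be sketched as follows. This is only a conceptual model written under assumptions: the `split_into_calls` helper and the cap value of 3 are hypothetical, and the sketch does not represent how the Orchestrator applies edge.learnedRoute.maxRoutePerCall internally.

```python
def split_into_calls(events, max_per_call):
    """Split a list of routing events into batches of at most max_per_call."""
    if max_per_call < 1:
        raise ValueError("max_per_call must be at least 1")
    return [events[i:i + max_per_call]
            for i in range(0, len(events), max_per_call)]

# Seven placeholder routing events, capped at three per call (arbitrary value).
routing_events = [f"route-{n}" for n in range(7)]
batches = split_into_calls(routing_events, max_per_call=3)

print([len(batch) for batch in batches])
```

Without a cap, all events would arrive in a single call, which corresponds to the flooding behavior described for this issue.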
- Fixed Issue 90749: When a VMware SASE Orchestrator is upgraded to a 5.x build, a customer may observe the loss of historic statistics for one or more of their VMware SD-WAN Edges when looking at the Monitor > Edge pages of the UI.
In the Orchestrator logs, an Operator would observe "Error while migrating health stats" and "Error while writing data file to clickhouse" log messages with timestamps immediately after the Orchestrator is upgraded to a 5.x build. The issue is triggered during the Orchestrator upgrade by an Edge sending any invalid data (for example, an invalid tunnel count with a negative number) to the Orchestrator, which results in the Orchestrator rejecting not only the invalid data, but the entire historic data batch for that particular Edge. As a result, the user observes large historic time gaps in the graphs for that Edge when looking at Monitor > Edge pages post Orchestrator upgrade. The issue does not uniformly impact all Edges connected to the Orchestrator, only the small number that send out invalid data.
Note: There is a related Issue #96108 that also causes missing Edge health statistics that is fixed in Orchestrator build R5010-20220912-GA.
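The difference between rejecting a whole batch and rejecting only the invalid records in Issue 90749 can be sketched in Python. This is an illustration under assumptions: the release notes do not describe how the fix is implemented, and the `tunnelCount` field name and all function names are hypothetical.

```python
def is_valid(record):
    """Treat a negative tunnel count as invalid (hypothetical validation rule)."""
    return record.get("tunnelCount", 0) >= 0

def migrate_whole_batch(batch):
    """Behavior described in the issue: one bad record discards the entire batch."""
    return list(batch) if all(is_valid(r) for r in batch) else []

def migrate_per_record(batch):
    """One possible alternative: drop only the invalid records."""
    return [r for r in batch if is_valid(r)]

history = [
    {"ts": 1, "tunnelCount": 4},
    {"ts": 2, "tunnelCount": -1},  # invalid sample
    {"ts": 3, "tunnelCount": 5},
]

print(len(migrate_whole_batch(history)))  # all history for the Edge is lost
print(len(migrate_per_record(history)))   # only the invalid sample is lost
```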
- Fixed Issue 90835: For a customer using the VMware Cloud Web Security service, the user cannot configure Office 365 domain rules for web proxy in Cloud Web Security using the VMware SASE Orchestrator.
The user cannot configure Office 365 (recently renamed to Microsoft 365) domain rules for web proxy in Cloud Web Security using the PAC file wizard.
- Fixed Issue 91054: For a customer using VMware Cloud Web Security, a user may encounter multiple usability issues on the VMware SASE Orchestrator UI when attempting to configure Single Sign-On Authentication.
The issues a user could encounter while configuring Single Sign-On in the Cloud Web Security service include:
• Certificate errors showing on the main Authentication page instead of on the Certificate page.
• A user can sometimes save an invalid certificate.
• Changing a certificate can sometimes reset the other values on the Authentication form.
• Individual fields do not show validation messages inline with the field.
• When saving the Authentication page, the Orchestrator UI does not show a progress spinner.
• The Verbose Debugging tooltip shows "t+2hrs" instead of an actual time.
• In some languages, the Single Sign-On toggle label wraps to more than one line.
• The Save Changes footer layout is incorrect on short screens.
All of the listed issues are resolved on an Orchestrator that includes a fix for #91054.
- Fixed Issue 91179: For a VMware SD-WAN Edge which has a WAN link configured as Hot Standby, if the Hot Standby link's status is standby, the VMware SASE Orchestrator's New UI displays the incorrect status for the Hot Standby link (Active).
The Orchestrator's Classic UI shows the correct status for the link (Idle), so this is limited to the New UI only. The issue is caused by the New UI not getting the correct update on the change of status for a Hot Standby WAN link.
- Fixed Issue 91720: For a customer enterprise that uses a Hub/Spoke topology, a user can remove a VMware SD-WAN Hub Edge from the Backhaul Hub configuration even though that Hub is being used with a Business Policy configured to use internet backhaul.
Once a Business Policy for backhauling Spoke Edge traffic through a Hub Edge has been configured, the expected behavior is that the VMware SASE Orchestrator "locks" that Hub Edge and prevents a user from removing it from the Backhaul Hub configuration in the Configure > Device Settings section. However, with this issue the user can remove the Hub Edge and cause significant customer traffic disruption.
- Fixed Issue 92082: For a customer using VMware Cloud Web Security, the customer may observe that the Content Filtering rules do not honor the configured domain.
The Content Filtering rules override the configured domain if the user has also selected ALL for Categories. If the user selects NONE for Categories, the wizard defaults this choice to mean ALL Categories, so the domains are not honored in that case either. This is caused by an issue in the content filtering wizard and API. If the user configures at least one Category, the Domain is honored.
On an Orchestrator without this fix, the user would need to configure specific categories along with domains, and then the Orchestrator would honor domains in content filtering.
Open Issues in Release 5.0.1.
The known issues are grouped as follows.
- Edge/Gateway Known Issues
- Orchestrator Known Issues
- Cloud Web Security Known Issues
- Secure Access Known Issues
- Issue 14655:
Plugging or unplugging an SFP adapter may cause the device to stop responding on the Edge 540, Edge 840, and Edge 1000 and require a physical reboot.
Workaround: The Edge must be physically rebooted. This may be done either on the Orchestrator using Remote Actions > Reboot Edge, or by power-cycling the Edge.
- Issue 25504:
Static route costs greater than 255 may result in unpredictable route ordering.
Workaround: Use a route cost between 0 and 255.
- Issue 25595:
A restart may be required for changes to static SLA on a WAN overlay to work properly.
Workaround: Restart the Edge after adding or removing a static SLA from a WAN overlay.
- Issue 25742:
Underlay accounted traffic is capped at a maximum of the capacity towards the VMware SD-WAN Gateway, even if that is less than the capacity of a private WAN link which is not connected to the Gateway.
- Issue 25758:
USB WAN links may not update properly when switched from one USB port to another until the VMware SD-WAN Edge is rebooted.
Workaround: Reboot the Edge after moving USB WAN links from one port to another.
- Issue 25855:
A large configuration update on the Partner Gateway (e.g. 200 BGP-configured VRFs) may cause latency to increase for approximately 2-3 seconds for some traffic via the VMware SD-WAN Gateway.
Workaround: No workaround available.
- Issue 25921:
VMware SD-WAN Hub High Availability failover takes longer than expected (up to 15 seconds) when there are three thousand branch Edges connected to the Hub.
- Issue 25997:
The VMware SD-WAN Edge may require a reboot to properly pass traffic on a routed interface that has been converted to a switched port.
Workaround: Reboot the Edge after making the configuration change.
- Issue 26421:
The primary Partner Gateway for any branch site must also be assigned to a VMware SD-WAN Hub cluster for tunnels to the cluster to be established.
- Issue 28175:
Business Policy NAT fails when the NAT IP overlaps with the VMware SD-WAN Gateway interface IP.
- Issue 31210:
VRRP: ARP is not resolved in the LAN client for the VRRP virtual IP address when the VMware SD-WAN Edge is primary with a non-global CDE segment running on the LAN interface.
- Issue 32731:
Conditional default routes advertised via OSPF may not be withdrawn properly when the route is deactivated. Reactivating the route and then deactivating it again will retract it successfully.
- Issue 32960:
Interface “Autonegotiation” and “Speed” status might be displayed incorrectly on the Local Web UI for activated VMware SD-WAN Edges.
- Issue 32981:
Hard-coding speed and duplex on a DPDK-configured port may require a VMware SD-WAN Edge reboot for the configurations to take effect as it requires turning DPDK off.
- Issue 34254:
When a Zscaler CSS is created and the Global Segment has FQDN/PSK settings configured, these settings are copied to Non-Global Segments to form IPsec tunnels to a Zscaler CSS.
- Issue 35778:
When there are multiple user-defined WAN links on a single interface, only one of those WAN links can have a GRE tunnel to Zscaler.
Workaround: Use a different interface for each WAN link that needs to build GRE tunnels to Zscaler.
- Issue 36923:
Cluster name may not be updated properly in the NetFlow interface description for a VMware SD-WAN Edge which is connected to that Cluster as its Hub.
- Issue 38682:
A VMware SD-WAN Edge acting as a DHCP server on a DPDK-configured interface may not properly generate “New Client Device” events for all connected clients.
- Issue 38767:
When a WAN overlay that has GRE tunnels to Zscaler configured is changed from auto-detect to user-defined, stale tunnels may remain until the next restart.
Workaround: Restart the Edge to clear the stale tunnel.
- Issue 39134:
The System health statistic “CPU Percentage” may not be reported correctly on Monitor > Edge > System for the VMware SD-WAN Edge, and on Monitor > Gateways for the VMware SD-WAN Gateway.
Workaround: Users should use handoff queue drops, not CPU percentage, for monitoring Edge capacity.
- Issue 39374:
Changing the order of VMware SD-WAN Partner Gateways assigned to a VMware SD-WAN Edge may not properly set Gateway 1 as the local Gateway to be used for bandwidth testing.
- Issue 39608:
The output of the Remote Diagnostic “Ping Test” may display invalid content briefly before showing the correct results.
- Issue 39624:
Ping through a subinterface may fail when the parent interface is configured with PPPoE.
- Issue 39659:
On a site configured for Enhanced High Availability, with one WAN link on each VMware SD-WAN Edge, when the standby Edge has only PPPoE connected and the active has only non-PPPoE connected, a split brain state (active/active) may be possible if the HA cable fails.
- Issue 39753:
Toggling Dynamic Branch-to-Branch VPN to off may cause existing flows currently being sent using Dynamic Branch-to-Branch to stall.
- Issue 40096:
If an activated VMware SD-WAN Edge 840 is rebooted, there is a chance an SFP module plugged into the Edge will stop passing traffic even though the link lights and the VMware SD-WAN Orchestrator will show the port as 'UP'.
Workaround: Unplug the SFP module and then replug it back into the port.
- Issue 40421:
Traceroute is not showing the path when passing through a VMware SD-WAN Edge with an interface configured as a switched port.
- Issue 42278:
For a specific type of peer misconfiguration, the VMware SD-WAN Gateway may continuously send IKE init messages to a Non-SD-WAN peer. This issue does not disrupt user traffic to the Gateway; however, the Gateway logs will be filled with IKE errors and this may obscure useful log entries.
- Issue 42388:
On a VMware SD-WAN Edge 540, an SFP port is not detected after deactivating and then reactivating the interface from the VMware SD-WAN Orchestrator.
- Issue 42872:
Activating Profile Isolation on a Hub profile where a Hub cluster is associated does not revoke the Hub routes from the routing information base (RIB).
- Issue 43373:
When the same BGP route is learnt from multiple VMware SD-WAN Edges, if this route is moved from preferred to eligible exit in the Overlay Flow Control, the Edge is not removed from the advertising list and continues to be advertised.
Workaround: Activate Distributed Cost Calculation (DCC) on the VMware SD-WAN Orchestrator.
- Issue 44995:
OSPF routes are not revoked from VMware SD-WAN Gateways and VMware SD-WAN Spoke Edges when the routes are withdrawn from the Hub Cluster.
- Issue 45189:
When source LAN side NAT is configured, the traffic from a VMware SD-WAN Spoke Edge to a Hub Edge is allowed even without the static route configuration for the NAT subnet.
- Issue 45302:
In a VMware SD-WAN Hub Cluster, if one Hub loses connectivity for more than 5 minutes to all of the VMware SD-WAN Gateways common between itself and its assigned Spoke Edges, the Spokes may in rare conditions be unable to retain the hub routes after 5 minutes. The issue resolves itself when the Hub regains contact with the Gateways.
- Issue 46053:
BGP preference does not get auto-corrected for overlay routes when its neighbor is changed to an uplink neighbor.
Workaround: An Edge Service Restart will correct this issue.
- Issue 46137:
A VMware SD-WAN Edge running 3.4.x software does not initiate a tunnel with AES-GCM encryption even if the Edge is configured for GCM.
- Issue 46216:
For a Non SD-WAN Destination via Gateway or Edge where the peer is an AWS instance, when the peer initiates a Phase-2 re-key, the Phase-1 IKE is also deleted, which forces a re-key. This means the tunnel is torn down and rebuilt, causing packet loss during the tunnel rebuild.
Workaround: To avoid tunnel destruction, configure the Non SD-WAN Destinations via Gateway/Edge or CSS IPsec rekey timer to less than 60 minutes. This prevents AWS from initiating the re-key.
- Issue 46391:
For a VMware SD-WAN Edge 3800, the SFP1 and SFP2 interfaces each have issues with Multi-Rate SFPs (i.e. 1/10G) and should not be used in those ports.
Workaround: Please use single rate SFPs per the KB article VMware SD-WAN Supported SFP Module List (79270). Multi-Rate SFPs may be used with SFP3 and SFP4.
- Issue 46918:
A VMware SD-WAN Spoke Edge using the 3.4.2 Release does not update the private network id of a Cluster Hub node properly.
- Issue 47084:
A VMware SD-WAN Hub Edge cannot establish more than 750 PIM (Protocol-Independent Multicast) neighbors when it has 4000 Spoke Edges attached.
- Issue 47664:
In a Hub and Spoke configuration where Branch-to-Branch via Hub VPN is not configured, trying to U-turn Branch-to-Branch traffic using a summary route on an L3 switch/router will cause routing loops.
Workaround: Configure Cloud VPN to activate Branch-to-Branch VPN and select “Use Hubs for VPN”.
- Issue 47681:
When a host on the LAN side of a VMware SD-WAN Edge uses the same IP as that Edge’s WAN interface, the connection from the LAN host to the WAN does not work.
- Issue 47787:
A VMware SD-WAN Spoke Edge configured with a backhaul business policy incorrectly sends traffic via the VMware SD-WAN Gateway path if that flow is initiated from the Hub Edge to that Spoke Edge.
- Issue 48166:
A VMware SD-WAN Virtual Edge on KVM is not supported when using a Ciena virtualization OS and the Edge will experience recurring Dataplane Service Failures.
- Issue 48175:
A VMware SD-WAN Edge running Release 3.4.2 will form an OSPF adjacency on a non-global segment if the non-global segment has an interface configured in the same IP range as an interface configured on the global segment.
- Issue 48502:
In some scenarios, a VMware SD-WAN Hub Edge being used to backhaul internet traffic may experience a Dataplane Service Failure due to the improper handling of backhaul return packets.
- Issue 48530:
VMware SD-WAN Edge 6x0 models do not perform autonegotiation for triple speed (10/100/1000 Mbps) copper SFPs.
Workaround: Edge 520/540 supports triple speed copper SFPs but this model has been marked for End-of-Sale by Q1 2021.
- Issue 48597: Multihop BGP neighborship does not stay up if one of the two paths to the peer goes down.
If there is a Multihop BGP neighborship with a peer to which there are multiple paths and one of them goes down, the user will notice that the BGP neighborship goes down and does not come up using the other available path(s). This includes the Local IP-loopback neighborship case too.
Workaround: There is no workaround for this issue.
- Issue 48666:
IPsec-fronted Gateway Path MTU calculation does not account for 61 Byte IPsec overhead, resulting in higher MTU advertisement to LAN client and subsequent IPsec packet fragmentation.
Workaround: There is no workaround for this issue.
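The arithmetic behind Issue 48666 can be illustrated with a short Python sketch. It subtracts the 61-byte IPsec overhead quoted above from a path MTU to show the value a LAN client would need in order to avoid fragmentation. The function name and the 1500-byte example value are assumptions made for illustration and do not come from the product.

```python
IPSEC_OVERHEAD_BYTES = 61  # overhead figure quoted in Issue 48666


def lan_mtu_with_ipsec(path_mtu: int, overhead: int = IPSEC_OVERHEAD_BYTES) -> int:
    """Return the largest LAN-side MTU that still fits once IPsec overhead is added."""
    if path_mtu <= overhead:
        raise ValueError("path MTU must be larger than the IPsec overhead")
    return path_mtu - overhead


if __name__ == "__main__":
    # 1500 is a hypothetical path MTU chosen only for this example.
    print(lan_mtu_with_ipsec(1500))
```

Packets larger than the returned value would exceed the path MTU after IPsec encapsulation, which matches the fragmentation described in the issue.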
- Issue 49738:
In some cases, when a VMware SD-WAN Spoke Edge is configured to use multiple Hub Edges, the Spoke Edge may not form tunnels to one of the Hubs configured in the Hub list.
- Issue 50518:
On a VMware SD-WAN Gateway where PKI is configured, if >6000 PKI tunnels attempt to connect to the Gateway, the tunnels may not all come up because inbound SAs do not get deleted.
Note: Tunnels using pre-shared key (PSK) authentication do not have this issue.
- Issue 51436: For a site using an Enhanced High-Availability topology while deploying a VMware SD-WAN Edge using an LTE modem, if the site gets into a "split-brain" state, the HA failover takes ~5-6 minutes.
As part of the recovery from a split-brain state, the LAN ports are brought down on the Active Edge and this impacts LAN traffic during the time the ports are down and until the site can recover.
Workaround: There is no workaround for this issue.
- Issue 52955: DHCP decline is not sent from Edge and DHCP rebinding is not restarted after DAD failure in Stateful DHCP.
If DHCPv6 server allocates an address which is detected as duplicate by the kernel during a DAD check then the DHCPv6 client does not send a decline. This will lead to traffic dropping as the interface address will be marked as DAD check failed and will not be used. This will not lead to any traffic looping in the network but traffic blackholing will be seen.
Workaround: There is no workaround for this issue.
- Issue 53219: After a VMware SD-WAN Hub Cluster rebalances, a few Spoke Edges may not have their RPF interface/IIF set properly.
On the affected Spoke Edges, multicast traffic will be impacted. After a cluster rebalance, some of the Spoke Edges fail to send a PIM join.
Workaround: This issue will persist until the affected Spoke Edge has an Edge Service restart.
- Issue 53337: Packet drops may be observed with an AWS instance of a VMware SD-WAN Gateway when the throughput is above 3200 Mbps.
When traffic exceeds a throughput of 3200 Mbps with a packet size of 1300 bytes, packet drops are observed at RX and at IPv4 BH handoff.
Workaround: There is no workaround for this issue.
- Issue 53359: BGP/BFD session may fail during some DDoS attack scenarios.
If traffic is flooded from the client connected to the routed interface to the LAN client, the BGP/BFD session can fail. Also when real-time high priority traffic is flooded to the overlay destination, the BGP/BFD session can fail.
Workaround: There is no workaround for this issue.
- Issue 53830: On a VMware SD-WAN Edge, some of the routes in BGP view may not have the correct preference and advertise values when DCC flag is configured causing incorrect sorting order in the Edge's FIB.
When Distributed Cost Calculation (DCC) is configured in a scaled scenario with a large number of routes on an Edge, when looking at an Edge diagnostic bundle for the log bgp_view some of the routes may not be correctly updated with the preference and advertise values. This issue, if found at all, would be found in a few Edges as part of a large enterprise (100+ Spoke Edges connected to either Hub Edges or Hub Clusters).
Workaround: This issue can be addressed by either relearning the underlay BGP routes or performing a "Refresh" option on the OFC page of the VMware SD-WAN Orchestrator for the affected routes. Please note that performing a "Refresh" of a route would re-learn the routes from all the Edges in the enterprise.
- Issue 53934: In an enterprise where a VMware SD-WAN Hub Cluster is configured, if the primary Hub has Multihop BGP neighborships on the LAN side, the customer may experience traffic drops on a Spoke Edge when there is a LAN side failure or when BGP is not configured on all segments.
In a Hub cluster, the primary Hub has Multihop BGP neighborship with a peer device to learn routes. If the physical interface on the Hub by which BGP neighborship is established, goes down, then BGP LAN routes may not become zero despite BGP view being empty. This may cause Hub Cluster rebalancing to not happen. The issue may also be observed when BGP is not configured for all segments and when there are one or more Multihop BGP neighborships.
Workaround: Restart the Hub which had the LAN-side failure (or BGP not activated).
- Issue 57210: Even when a VMware SD-WAN Edge is working normally and is able to reach the internet, the LED in the Local UI's Overview page shows as "Red".
The Edge's Local UI determines the Edge's connectivity by whether it can resolve a well-known name via Google's DNS resolver (8.8.8.8). If it cannot do so for any reason, then it thinks it is offline and shows the LED as red.
Workaround: There is no workaround for this issue, except to ensure that DNS traffic to 8.8.8.8 can reach the destination and be resolved successfully.
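One rough way to confirm that DNS traffic to Google Public DNS (8.8.8.8) is reachable and resolving is to send a single query from a host behind the Edge. The Python sketch below uses only the standard library. The function names are hypothetical, and the name to resolve is left to the caller because the well-known name used by the Local UI is not stated in these notes.

```python
import socket
import struct


def build_dns_query(name: str, txid: int = 0x1234) -> bytes:
    """Build a minimal DNS query for an A record with recursion desired."""
    # Header: ID, flags (RD=1), QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN
    return header + question


def resolver_answers(name: str, resolver: str = "8.8.8.8", timeout: float = 2.0) -> bool:
    """Return True if the resolver sends back an error-free answer for `name`."""
    txid = 0x1234
    query = build_dns_query(name, txid)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (resolver, 53))
            reply, _ = sock.recvfrom(512)
        except OSError:
            # Covers timeouts and unreachable-network errors alike.
            return False
    if len(reply) < 12:
        return False
    reply_id, flags, _, ancount, _, _ = struct.unpack("!HHHHHH", reply[:12])
    is_response = bool(flags & 0x8000)
    rcode = flags & 0x000F
    return reply_id == txid and is_response and rcode == 0 and ancount > 0
```

Calling `resolver_answers("example.com")` from a LAN client returns `False` when the query is dropped or times out, which is the condition under which the Local UI would show the LED as red according to the description above.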
- Issue 61543: If more than one 1:1 NAT rule is configured on different interfaces with the same Inside IP, the inbound traffic can be received on one interface and the outbound packets of the same flow can be routed via different interface.
For the NAT flows from Outside to Inside, the 1:1 NAT rules will be matched against the Outside IP and the interface where the packets are received. For the outbound packets of the same flow, the VMware SD-WAN Edge will try to match the NAT rules again by comparing the Inside IP, and the outbound traffic can be routed via the interface configured in the first matching rule with "Outbound Traffic" configured.
Workaround: There is no workaround for this issue outside of ensuring no more than one 1:1 NAT rule is configured with a particular Inside IP address.
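To audit a configuration against the condition in Issue 61543, a small script can list every Inside IP that appears in more than one 1:1 NAT rule. The Python sketch below assumes the rules have already been exported into simple tuples. The tuple layout, the function name, and the sample addresses are hypothetical and do not reflect an Orchestrator export format.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Each rule is (interface, outside_ip, inside_ip); this shape is hypothetical.
NatRule = Tuple[str, str, str]


def find_shared_inside_ips(rules: Iterable[NatRule]) -> Dict[str, List[str]]:
    """Map each Inside IP used by more than one rule to the interfaces that use it."""
    by_inside: Dict[str, List[str]] = defaultdict(list)
    for interface, _outside_ip, inside_ip in rules:
        by_inside[inside_ip].append(interface)
    return {ip: ifaces for ip, ifaces in by_inside.items() if len(ifaces) > 1}


if __name__ == "__main__":
    sample_rules = [
        ("GE3", "192.0.2.10", "10.0.1.10"),
        ("GE4", "198.51.100.10", "10.0.1.10"),  # same Inside IP on another interface
        ("GE3", "192.0.2.11", "10.0.1.11"),
    ]
    print(find_shared_inside_ips(sample_rules))
```

An empty result means no Inside IP is shared between rules, which is the state the issue description recommends.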
- Issue 62701: For a VMware SD-WAN Edge deployed as part of an Edge Hub Cluster, if Cloud VPN is not activated under the Global Segment but is activated under a Non-Global Segment, a control plane update sent by the Orchestrator may cause all the WAN links to flap on the Hub Edge.
The Hub Edge's WAN links going down, then up in rapid succession (flap) will impact real time traffic like voice calls. This issue was observed on a customer deployment where Cloud VPN was not activated on the Hub Edge's Global segment, but the Cluster configuration was configured as on, which means this Hub Edge was part of a Cluster (and a Cluster configuration is applicable to all segments). When a configuration change is pushed to the Hub Edge, the Hub Edge's dataplane will start parsing data and will start with the Global Segment where it will see Cloud VPN not activated and the Hub Edge erroneously thinks clustering has been deactivated on this Global Segment. As a result, the Hub Edge will tear down all tunnels from the Hub's WAN link(s) which will cause link flaps on all of that Edge's WAN links. For any such incident the WAN links only go down and recover a single time per control plane update.
Workaround: The workaround is to activate Cloud VPN on all segments, meaning the Global Segment and all Non-Global Segments.
- Issue 63629: In a network topology where the VMware SD-WAN Hub Edge and Spoke Edge have different IP family preferences (in other words, IPv4/IPv6 dual stack), the customer can see more bandwidth allocated to the peer than expected.
If both IPv4 and IPv6 families are configured, the Edge internally creates two different link objects. The bandwidth values are added for both of them when they should be added only for one.
Workaround: The workaround for this issue is to not have different tunnel preferences if the Hub/Spoke topology has dual stack configured.
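The double counting described in Issue 63629 can be shown with a toy model. The Python sketch below is not the Edge's implementation. It only contrasts summing bandwidth once per link object with summing it once per underlying WAN link. The class, field names, and the 100 Mbps figure are assumptions for illustration.

```python
from typing import Dict, Iterable, NamedTuple


class LinkObject(NamedTuple):
    wan_link: str        # hypothetical identifier of the underlying WAN link
    family: str          # "ipv4" or "ipv6"
    bandwidth_mbps: int


def total_per_object(links: Iterable[LinkObject]) -> int:
    """Sum bandwidth once per link object (counts a dual-stack link twice)."""
    return sum(link.bandwidth_mbps for link in links)


def total_per_wan_link(links: Iterable[LinkObject]) -> int:
    """Sum bandwidth once per underlying WAN link, whatever the IP family."""
    seen: Dict[str, int] = {}
    for link in links:
        seen.setdefault(link.wan_link, link.bandwidth_mbps)
    return sum(seen.values())


if __name__ == "__main__":
    dual_stack = [
        LinkObject("GE3", "ipv4", 100),
        LinkObject("GE3", "ipv6", 100),
    ]
    print(total_per_object(dual_stack), total_per_wan_link(dual_stack))
```

In this model the per-object total is twice the per-link total for a dual-stack link, which corresponds to the peer appearing to have more bandwidth allocated than expected.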
- Issue 65560: Traffic from a customer to PE (Provider Edge) device fails.
BGP neighborship between a Partner Gateway and Provider Edge does not get established when tag-type is selected as "none" on the handoff configuration. This is because ctag, stag values get picked from /etc/config/gatewayd instead of the handoff configuration on the Orchestrator when tag-type is "none".
Workaround: Update the ctag, stag values to 0 each under vrf_vlan->tag_info in /etc/config/gatewayd. Do a vc_procmon restart.
- Issue 67879: A Cloud Security Service (CSS) tunnel is deleted after a user changes a WAN Overlay setting from auto-detect to user-defined on a WAN interface setting.
After saving the changes, the CSS tunnels do not come back up until the customer takes down and then puts back up the tunnel. Changing the WAN configuration will bring down the CSS tunnel and parse the CSS setup again. However, in some corner cases, the nvs_config->num_gre_links is 0 and the CSS tunnel fails to come up.
Workaround: Deactivate the CSS setup, and then reactivate it and this will bring the CSS tunnel up.
- Issue 68057: DHCPv6 release packet is not sent from the VMware SD-WAN Edge on the changing of a WAN interface address mode from DHCP stateful to static IPv6 address and the lease remains active till reaching its valid time.
The DHCPv6 client possesses a lease which it does not release when the configuration change is done. The lease remains valid till its lifetime expires in the DHCPv6 server and is deleted.
Without the fix, there is no way of remediating this issue as the lease would remain active till valid lifetime.
- Issue 68851: If a VMware SD-WAN Edge and VMware SD-WAN Gateway each have the same TCP syslog server configured, the TCP connection is not established from the Edge to the syslog server.
If the Edge and Gateway each have the same TCP server and if the syslog packets from the Edge are routed via the Gateway, the syslog server sends a TCP reset to the Edge.
Workaround: Send the syslog packets directly from the Edge instead of routing via a Gateway, or configure a different syslog server for the Edge and Gateway.
- Issue 69284: For a site using a High-Availability topology where the Edges deploy VNF's in an HA configuration and are using Release 4.x, if these HA Edges are downgraded to a 3.4.x Release where HA VNF's are not supported, and then upgraded to 4.5.0, when the HA VNF's are reactivated, the Standby Edge VNF will not come up.
The VNF state on the Standby Edge is communicated as down via SNMP. If the HA VNF pair is downgraded from a version supporting VNF-HA (release 4.0+) to a release which does not support it while VNF is configured on the Orchestrator, this issue will be seen when the Edge is upgraded back to a version supporting VNF-HA and VNF is configured on the Orchestrator again.
Workaround: VNF should first be deactivated in the case of an HA configuration if the Edge is being downgraded to a version which does not support it.
- Issue 70311: A VMware SD-WAN Edge may experience a Dataplane Service Failure and restart as a result.
During the Edge service restart, customer traffic would be disrupted for ~15-30 seconds. This issue occurs inconsistently, but when it does occur the Edge is tearing down an IKE security association (SA). This typically only occurs when the SA timer (as configured on the VMware SD-WAN Orchestrator) expires or when the user modifies the IPsec configuration on the Orchestrator.
Workaround: There is no workaround for this issue.
- Issue 71719: PPTP Connection is not Established along Edge to Cloud path.
Connection to the PPTP server behind the VMware SD-WAN Edge does not get established.
Workaround: There is no workaround for this issue, not even an Edge restart or reboot.
- Issue 72358: If the IP address of a VMware SD-WAN Orchestrator DNS name changes, the VMware SD-WAN Gateway's management plane process fails to resolve it properly and the Gateway will be unable to connect to the Orchestrator.
The Gateway's management process periodically checks the DNS resolution of the Orchestrator's DNS name, to see if it has changed recently so that the Gateway can connect to the right host. The DNS resolution code has an issue in it so that all of these resolution checks fail, and the Gateway will keep using the old address and thus no longer be able to connect to the Orchestrator.
Workaround: Until this issue is resolved, an Operator User should not change the IP address of the Orchestrator. If the Orchestrator's IP address must be changed, all Gateways connecting to that Orchestrator will have to be reactivated.
- Issue 77541: When a USB modem that supports IPv6 is unplugged and then replugged into a VMware SD-WAN Edge USB interface, an IPv6 address may not be provisioned to the USB interface.
This affects USB modems that are IP-based, versus being managed by the ModemManager application. Most Inseego modems are IP-based and this is important because Inseego is the modem manufacturer VMware SASE recommends. USB modems supporting IPv6 which use ModemManager versus being IP-based will be fine in a plug out and plug in scenario.
Workaround: The Edge needs to be rebooted (or power-cycled) after the USB modem is replugged into the Edge's USB port. Post reboot, the Edge will retrieve the IPv6 address for the modem.
- Issue 81852: For a VMware SD-WAN Edge that is using a Zscaler type Cloud Security Service (CSS) which uses GRE tunnels that has turned on L7 Health Check, when that Edge is upgraded to Release 5.0.0, in some instances the customer may observe L7 Health Check errors.
This is typically seen during software upgrade or during startup time. When L7 Health check for a CSS using GRE tunnels is turned on, error messages related to socket getaddress error may be seen. The observed error is intermittently seen, and not consistent. Because of this, L7 Health Check probe messages are not sent out.
Workaround: Without the fix, to remediate the issue, a user needs to turn off and then turn back on the L7 Health Check configuration, and this feature would then work as expected.
- Issue 82184: On a VMware SD-WAN Edge which is running Edge Release 5.0.0, when a traceroute or traceroute6 is run to the Edge's br-network IPv4/IPv6 address, the traceroute will not properly terminate when a UDP probe is used.
Traceroute or traceroute6 to the Edge's br-network IPv4/IPv6 address will not properly terminate when Default Mode (in other words, UDP probe) is used.
Workaround: Use -I option in traceroute and traceroute6 to use ICMP probe and then traceroute to br-network IPv4/IPv6 address will work as expected.
- Issue 82415: A VMware SD-WAN Gateway deployed with a KVM image using Intel® Ethernet Server Adapter X710, SR-IOV, and Bond0 does not respond if activated to Release 4.5.0 or 5.0.0.
For a Gateway so deployed, paths do not come up and the debug.py commands do not work.
Workaround: There is no workaround. Avoid using these builds for this specific Gateway deployment until new builds with this issue fixed are rolled out.
- Issue 83166: When a VMware SD-WAN Gateway is freshly deployed with an AWS c5.4xlarge instance type from the AWS Portal with the IPv6 option selected, neither IPv6 nor the default routes are configured.
As a result of IPv6 and default routes not being configured, the AWS Gateway IPv6 management tunnels are not forming and the Gateway will not work.
Workaround: There is no workaround for this issue. Avoid deploying a Gateway with the properties mentioned above.
- Issue 83227: For a VMware SD-WAN Edge running Release 5.0.0 which is configured with 128 Segments, the Edge's dnsmasq process will stop and exit.
When IPv6 is activated on 128 segments and DHCPv6 servers are configured in the LAN of each segment, the dnsmasq process will stop as the total number of open file descriptors is exceeded. The dnsmasq process will continue for ~30 minutes before exiting, at which point the Edge's DHCP assignment of IP addresses will fail.
Workaround: Rebooting the Edge restores the dnsmasq process for ~30 minutes but it will fail again. The only real workaround is to reduce the number of segments to less than 128.
- Issue 86098: For a site using an Enhanced High-Availability topology where a PPPoE WAN link is used on the Standby Edge, a user may observe that the default proxy route is not installed in the Active Edge and traffic using that link fails.
When an Enhanced HA Edge pair come up, the PPPoE link synchronizes with the Standby Edge and provides a default route with a next hop of 0.0.0.0. As a result this route is not installed on the Active and traffic using this link is dropped.
Workaround: There is no workaround for this issue.
- Issue 92481: If a WAN interface on a VMware SD-WAN Edge is deactivated on the VMware SASE Orchestrator, the interface will still be reported as 'UP' by SNMP.
The key debug process for interfaces output does not include the physical port details for Edge WAN interfaces (for example, GE3 or GE4 on an Edge 6x0 or 3x00 model). As a result when SNMP polls those interfaces it always returns a result of UP regardless of how these interfaces are configured.
Workaround: There is no workaround for this issue.
- Issue 92676: For a customer deployment where a Non VMware SD-WAN Destination (NSD) via Gateway is configured to use redundant tunnels and redundant Gateways and is also using BGP over IPsec, if the Primary and Secondary Gateways advertise a prefix with an equal AS path to the Primary and Secondary NSD tunnels, the Primary NSD tunnel will prefer a redundant Gateway path over the Primary Gateway.
The impact of the Primary NSD over Gateway tunnel preferring the redundant Gateway path over the Primary Gateway is experienced only for return traffic to the Gateway from the NSD.
Workaround: Configure a higher (3 or more) metric on the redundant Gateway for the interested prefix as this will help the NSD's primary tunnel choose the Primary Gateway for return traffic.
- Issue 93062: When a user runs the Remote Diagnostic "Interface Status" on the VMware Orchestrator, the Orchestrator either returns an error for that test and does not complete or the test does not return results for routed interfaces.
The error message seen is "error reading data for test". If the test does complete, the results for routed interfaces are empty with no information about speed or duplex. Either way the Interface Status is broken. The issue is related to the debug command that underlies Interface Status omitting DPDK activated ports.
Workaround: The user would need to generate a diagnostic bundle for the Edge to see the status for routed interfaces.
- Issue 93141: On a site deployed with a High Availability topology, a customer using an L2 switch upstream of the HA Edge pair may observe in the switch logs evidence of an L2 traffic loop, though there is no actual loop.
The issue is caused by the HA Edge sending the HA interface heartbeat with the Virtual MAC address to the Orchestrator instead of the interface's actual MAC address, which is caused by the HA Edge storing the Virtual MAC address in its MAC file. As a result the connected L2 switch detects traffic from the same source MAC coming from two different Edge interfaces and would log it as an L2 loop. This issue is cosmetic at the log level as there is no actual L2 loop and there is no customer traffic disruption or loss of contact with the Orchestrator arising from this issue.
Workaround: The customer can ignore L2 loop detection events from upstream switches that arise out of the Edge's HA interface (usually GE1).
- Issue 94204: A user may observe that attempts to generate a diagnostic bundle for a VMware SD-WAN Edge fail.
The Edge diagnostic bundles fail to complete because the Edge runs out of disk space. This can happen if the Edge has generated one or more cores and is caused by the Edge sending these cores to the /vnf/tmp folder. Each core is unpacked in the /vnf/tmp folder and due to a core's unpacked size quickly fills this folder which causes the diagnostic bundle to fail.
Workaround: There is no workaround for this issue.
- Issue 95565: On a site using a High Availability topology, the VMware SD-WAN Active Edge may experience a Dataplane Service failure with a core generated and triggering a High Availability failover.
The issue is triggered by the Active Edge's WAN links flapping one or more times (going down and then coming up rapidly) while also using SNMP where there are frequent SNMP queries. There is a timing issue where the interface coming back up and the SNMP query together can trigger a deadlock which causes the Dataplane Service to fail and generate a core. While only a single WAN link flap can cause this issue, the greater the frequency of WAN link flaps, the greater the potential for this issue to occur.
Workaround: On an HA Edge pair that experiences this issue and does not have the fix, the workaround is to disable SNMP as this is a timing issue and this reduces the risk.
- Issue 96441: On a site using a High Availability Topology, the customer may observe frequent HA failovers.
The issue is triggered by the HA interface being marked by the Edge as down and then coming back up within 500-1000ms which can trigger an HA failover. However, these interface down events are spurious and caused by a DPDK-enabled interface using polling with an interval of 500ms to determine interface status. Using this method, the underlying device driver can sometimes report a spurious interface down event and each event causes the Edge to mark the interface as down until the next poll of the interface status (in 500ms) reports that the interface is up.
Workaround: There is no workaround for this issue.
- Issue 96888: In certain load conditions, the routing protocols for either BGP or OSPF may randomly restart, leading to route re-convergence and traffic disruption.
Under higher load conditions the BGP and OSPF routing protocol processes are made to wait longer than expected by the Edge CPU to get scheduled and this leads to a stall and restart of the routing protocol. The routing protocol delay is caused by insufficient CPU bandwidth allocation and can occur on any Edge model.
Workaround: If an Edge is experiencing this issue, a customer may contact VMware Support for assistance.
- Issue 98136: For customer enterprises using a Hub/Spoke topology where Dynamic Branch To Branch VPN is configured, client users behind a SD-WAN Spoke Edge may observe that some traffic has unexpected latency resulting from the traffic using a sub-optimal path.
Spoke Edge traffic that experiences this issue uses a route that was initially a non-uplink route for a Hub Edge not included in the Profile the Spoke Edge was using. A Dynamic Branch To Branch VPN tunnel can be formed from the Spoke Edge to the Hub Edge because of traffic being sent towards some other unrelated prefix and in this instance the non-uplink route is installed in the Spoke Edge.
As a result of this non-uplink route, all traffic towards this prefix starts going through the Hub Edge and the non-uplink route becomes uplink (community change to uplink community) but the non-uplink route installed previously is not revoked and the traffic takes the Hub Edge path as long as the Dynamic Branch To Branch VPN tunnel remains up.
Workaround: Wait for the Dynamic Branch To Branch VPN tunnel to tear down, after which the uplink route will not be installed in the Spoke Edge when a new Dynamic Branch To Branch VPN tunnel is formed towards the Hub Edge.
- Issue 19566:
After High Availability failover, the serial number of the standby VMware SD-WAN Edge may be shown as the active serial number in the Orchestrator.
- Issue 21342:
When assigning Partner Gateways per-segment, the proper list of Gateway Assignments may not show under the Operator option "View" Gateways on the VMware SD-WAN Edge monitoring list.
- Issue 24269:
Monitor > Transport > Loss does not graph observed WAN link loss, while QoE graphs do reflect this loss.
- Issue 25932:
The VMware SD-WAN Orchestrator allows VMware SD-WAN Gateways to be removed from the Gateway Pool even when they are in use.
- Issue 32335:
The ‘End User Service Agreement’ (EUSA) page throws an error when a user is trying to accept the agreement.
Workaround: Ensure no leading or trailing spaces are found in Enterprise Name.
- Issue 32435:
A VMware SD-WAN Edge override for a policy-based NAT configuration is permitted for tuples which are already configured at the profile level and vice versa.
- Issue 32856:
Though a business policy is configured to use the Hub cluster to backhaul internet traffic, the user can unselect the Hub cluster from a profile on a VMware SD-WAN Orchestrator that has been upgraded from Release 3.2.1 to Release 3.3.x.
- Issue 32913:
After activating High Availability, multicast details for the VMware SD-WAN Edge are not displayed on the Monitoring Page. A failover resolves the issue.
- Issue 33026:
The ‘End User Service Agreement’ (EUSA) page does not reload properly after deleting the agreement.
- Issue 34828:
Traffic cannot pass between a VMware SD-WAN Spoke Edge using release 2.x and a Hub Edge using release 3.3.1.
- Issue 35658:
When a VMware SD-WAN Edge is moved from one profile to another which has a different CSS setting (e.g. IPsec in profile1 to GRE in profile2), the Edge level CSS settings will continue to use the previous CSS settings (e.g. IPsec versus GRE).
Workaround: At the Edge level, deactivate GRE, and then reactivate GRE to resolve the issue.
- Issue 35667:
When a VMware SD-WAN Edge is moved from one profile to another profile which has the same CSS setting but a different GRE CSS name (the same endpoints), some GRE tunnels will not show in monitoring.
Workaround: At the Edge level, deactivate GRE and then reactivate GRE to resolve the issue.
- Issue 36665:
If the VMware SD-WAN Orchestrator cannot reach the internet, user interface pages that require accessing the Google Maps API may fail to load entirely.
- Issue 38056:
The Edge-Licensing export.csv file does not show region data.
- Issue 38843:
When pushing an application map, there is no Operator event, and the Edge event is of limited utility.
- Issue 39633:
The Super Gateway hyperlink does not work after a user assigns the Alternate Gateway as the Super Gateway.
- Issue 39790:
The VMware SD-WAN Orchestrator allows a user to configure a VMware SD-WAN Edge’s routed interface with more than the supported 32 subinterfaces. Configuring 33 or more subinterfaces on an interface would cause a Dataplane Service Failure for the Edge.
- Issue 40341:
Though the Skype application is properly categorized on the backend as Real Time traffic, when editing the Skype Business Policy on the VMware SD-WAN Orchestrator, the Service Class may erroneously display “Transactional”.
- Issue 41691:
A user cannot change the 'Number of addresses' field on the Configure > Edge > Device page even though the DHCP pool is not exhausted.
- Issue 43276:
A user cannot change the Segment type when a VMware SD-WAN Edge or Profile has a partner gateway configured.
- Issue 47269:
The VMware SD-WAN 510-LTE interface may appear for Edge models that do not support an LTE interface.
- Issue 47713:
If a Business Policy Rule is configured while Cloud VPN is toggled off, the NAT configuration must be reconfigured upon turning on Cloud VPN.
- Issue 47820:
If a VLAN is configured with DHCP toggled off at the Profile level, while also having an Edge Override for this VLAN on that Edge with DHCP activated, and there is an entry for the DNS server field set to none (no IP configured), the user will be unable to make any changes on the Configure > Edge > Device page and will get an error message of ‘invalid IP address’ that does not explain or point to the actual problem.
- Issue 48085:
The VMware SD-WAN Orchestrator allows a user to delete a VLAN which is associated with an interface.
- Issue 48737:
On a VMware SD-WAN Orchestrator which is using the Release 4.0.0 new user interface, if a user is on a Monitor page and changes the Start & End time interval and then navigates between tabs, the Orchestrator does not update the Start & End time interval to the new values.
- Issue 49225:
VMware SD-WAN Orchestrator does not enforce a limit of 32 total VLANs.
- Issue 49790:
When a VMware SD-WAN Edge is activated to Release 4.0.0, the activation is posted twice in Events.
Workaround: Ignore the duplicate event.
- Issue 50531:
When two Operators of differing privileges use the same browser window when accessing the New UI on a 4.0.0 Release version of the VMware SD-WAN Orchestrator, and the Operator with lesser privileges tries to log in after the Operator with higher privileges, that lesser privileged Operator will observe multiple errors stating that the "user does not have privilege".
Note: There is no escalation in privileges for the Operator with lower privileges, only the display of error messages.
Workaround: The next operator may refresh that page prior to logging in to prevent seeing the errors, or each Operator may use different browser windows to avoid this display issue.
- Issue 51722: On the Release 4.0.0 VMware SD-WAN Orchestrator, the time range selector is no greater than two weeks for any statistic in the Monitor > Edge tabs.
The time range selector does not show options greater than "Past 2 Weeks" in Monitor > Edge tabs even if the retention period for a set of statistics is much longer than 2 weeks. For example, flow and link statistics are retained for 365 days by default (which is configurable), while path statistics are retained only for 2 weeks by default (also configurable). This issue is making all monitor tabs conform to the lowest retained type of statistic versus allowing a user to select a time period that is consistent with the retention period for that statistic.
Workaround: A user may use the "Custom" option in the time range selector to see data for more than 2 weeks.
- Issue 60522: On the VMware SD-WAN Orchestrator UI, the user observes a large number of error messages when they try to remove a segment.
The issue can be observed when adding a segment to a profile and then associating the segment with multiple VMware SD-WAN Edges. When the user attempts to remove the added segment from the profile, they will see a large number of error messages.
Workaround: There is no workaround for this issue.
- Issue 60039: RMA Reactivation does not work when the VMware SD-WAN Edge model is changed.
When performing an RMA Reactivation for a site where the Edge model is also being changed, the VMware SD-WAN Orchestrator does not save the model change, making the reactivation link ineffective. This only affects RMA Reactivations where the Edge model is changed; an RMA Reactivation where the Edge model remains the same will work as expected.
Workaround: If using a different Edge model for a site, the user would need to create a new Edge and manually apply all Edge-specific settings.
- Issue 62624: User sees the Customer name when attempting to uncheck the Partner Gateway checkbox while the Partner Gateway is in use.
When a user unchecks the Partner Gateway checkbox for a particular Gateway on the VMware SD-WAN Orchestrator UI while the Gateway is used by one or more customers and a customer profile as well, the Orchestrator shows only the name of the Profile and Edge, not the names of the customers using the Gateway.
Workaround: There is no workaround for this issue.
- Issue 68463: When using the New UI on the VMware SD-WAN Orchestrator and looking at the QoE section, the wrong graph values are shown.
When viewing QoE in the Old UI, "Latency Fair" is displayed on the graph, whereas the New UI (for the same Edge and time) displays "Jitter Fair". This is caused by QoE being incorrectly mapped on the New UI.
Workaround: There is no workaround for this issue beyond using the Old UI to confirm correct QoE values.
- Issue 82680: For a customer using MT-GRE Tunnel Automation, when a user turns off the Cloud-to-Cloud Interconnect (CCI) flag on a VMware SD-WAN Gateway which is configured to use CCI, the Zscaler MT-GRE entries may not get deleted from the Zscaler portal consistently.
After a CCI site has been deleted from the Gateway, the entries for this site should also be removed. This issue has only been seen during test automation and has not been reproduced manually, but remains a risk.
Workaround: Manually delete the resource from Zscaler before retrying.
- Issue 82681: For a customer using MT-GRE Tunnel Automation, when a user turns off the Cloud-to-Cloud Interconnect (CCI) flag on a VMware SD-WAN Gateway which is configured to use CCI, and the user deactivates the CCI flag from a VMware SD-WAN Edge with CCI configured which is using a Zscaler Cloud Security Service, the Zscaler MT-GRE entries may not get deleted from the Edge or from the Zscaler portal.
After a CCI site has been deleted from the Gateway, the entries for this site should also be removed. This issue has only been seen during test automation and has not been reproduced manually, but remains a risk.
Workaround: Manually delete the resource from Zscaler before retrying.
- Issue 82775: On a VMware SASE Orchestrator using Release 5.0.0, when a Zscaler type Cloud Security Service (CSS) is configured for a customer and associated with a VMware SD-WAN Edge, and then a Business Policy is configured with a CSS backhaul rule, a user is unable to change the CSS hash or encryption parameters for that CSS.
The Orchestrator locks the user out from modifying the Zscaler CSS configuration once it is associated with a CSS Backhaul Business Policy.
Workaround: The user needs to delete the CSS Backhaul Business Policy to modify the Zscaler CSS configuration and then recreate the same Business Policy.
- Issue 82864: On a VMware SASE Orchestrator using Release 5.0.0, when a user is on the Configure > Profiles page and selects 'Modify', the user is redirected to the Profile > Overview page instead of the Profile > Device Settings page.
The Configure > Profiles 'Modify' button is not mapping to the correct page.
Workaround: There is no workaround.
- Issue 94668: When configuring a Business Policy rule where the NAT option is checked, a user can configure invalid symbols for the Source NAT IPv6 or Destination NAT IPv6 Address that are accepted by the VMware SASE Orchestrator.
An invalid symbol is anything that should not be used to configure an IPv6 address (for example, %, *, or +). As a result, the Orchestrator accepts an invalid configuration, and the rule will not work for NAT because the IPv6 address is invalid.
Workaround: There is no workaround beyond the user ensuring that the IPv6 addresses are valid when configured for Source or Destination NAT IPv6.
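As a hedged illustration of the manual check described in the workaround for Issue 94668, the following Python sketch validates a candidate address before it is entered in the Orchestrator. The function name `is_valid_nat_ipv6` is hypothetical and is not part of any VMware API. It uses only the Python standard library `ipaddress` module and assumes the NAT field expects a bare IPv6 address with no prefix length or zone identifier.

```python
import ipaddress


def is_valid_nat_ipv6(text: str) -> bool:
    """Return True if text is a plain IPv6 address (hypothetical helper)."""
    # Newer Python versions accept scoped literals such as "fe80::1%eth0".
    # The release note lists "%" as an invalid symbol, so reject it up front.
    if "%" in text:
        return False
    try:
        ipaddress.IPv6Address(text)
    except ValueError:
        # Raised for characters such as "*" or "+", or malformed groups.
        return False
    return True
```

A user could run this check against each Source NAT IPv6 and Destination NAT IPv6 value before saving the Business Policy rule, since the Orchestrator itself does not reject the invalid input.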
- Issue 62934: For an enterprise using VMware Cloud Web Security, if a client user opens a Chrome browser in Incognito and attempts to download a file, the download may occasionally not be successful.
Incognito requires turning on 3rd party cookies. Turn on 3rd party cookies and retry the operation. On an unsuccessful download, the user would observe a screen which reads either "Error occurred contact your administrator" or, for files from a custom web server, "This page is not working". Occasionally, some web servers or files may have a variance in file signature that the Cloud Web Security service may not be able to recognize, which results in this issue.
Workaround: Turn on allowing 3rd party Cookies and retry. No known workaround for this issue if using an Incognito window.
- Issue 63149: When a customer deployment has overlapping subnets in a profile and configures a subnet for a VMware Cloud Web Security policy, and associates the Cloud Web Security policy to the profile and segment, Edge clients on that subnet will not be able to connect to the internet.
If there are overlapping subnets configured for the LAN segments behind VMware SD-WAN Edges within the same segment, then the resources behind the Edges cannot have Cloud Web Security policies applied for the internet-bound traffic. This is because there is no way to uniquely identify the destination Edge for the return traffic from the internet to Cloud Web Security.
Workaround: Turn on LAN side NAT on the Edge and have a unique subnet represent the traffic originated from resources behind the Edge.
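To decide which Edges would need LAN side NAT under Issue 63149, the overlapping LAN subnets within a segment first have to be identified. The sketch below is one minimal way to do that offline with the Python standard library `ipaddress` module. The function name `find_overlapping_subnets`, the Edge names, and the subnets are made-up illustrations, not data or APIs from the Orchestrator.

```python
import ipaddress
from itertools import combinations


def find_overlapping_subnets(lan_subnets):
    """Return pairs of Edge names whose LAN subnets overlap.

    lan_subnets maps a (hypothetical) Edge name to a CIDR string.
    """
    networks = {
        name: ipaddress.ip_network(cidr) for name, cidr in lan_subnets.items()
    }
    return [
        (name_a, name_b)
        for (name_a, net_a), (name_b, net_b) in combinations(networks.items(), 2)
        if net_a.overlaps(net_b)
    ]


# Sample data for illustration only.
sample = {
    "edge-a": "10.0.1.0/24",
    "edge-b": "10.0.1.128/25",
    "edge-c": "10.0.2.0/24",
}
overlaps = find_overlapping_subnets(sample)
```

Each pair returned is a candidate for LAN side NAT with a unique subnet, as described in the workaround. Edges that appear in no pair can keep their existing addressing.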
- Issue 65001: For a customer using VMware Cloud Web Security, a user cannot configure the Inspection Engine to turn on/off File Hash checks when using the VMware Orchestrator to do so.
When a user is using the Orchestrator to configure the Cloud Web Security Inspection engine's File Hash check parameter for either "Action for Unknown File Download" or "Action for unknown document Download", these changes are not sent to the Inspection Engine and are not applied.
Workaround: There is no workaround for this issue.
- Issue 64541: For a customer using VMware Secure Access, when using the option in Workspace ONE UEM Configuration to configure Tunnel Hostname within the Organization Group, if a user selects 'Yes', the hostname will be created in the UEM console automatically instead of being configured manually.
The user should have the option to configure the hostname manually and not just have it automatically created.
Workaround: Manually set the hostname in the UEM console.