Updated June 8th, 2022
VMware SD-WAN™ Orchestrator Version R421-20211216-GA
Check regularly for additions and updates to these release notes.
Recommended Use
This release is recommended for all customers who require the features and functionality first made available in Release 4.2.0, as well as those customers impacted by the issues listed below which have been resolved since Release 4.2.0.
Compatibility
Release 4.2.1 Orchestrators, Gateways, and Hub Edges support all previous VMware SD-WAN Edge versions greater than or equal to Release 3.0.0.
Note: This means releases prior to 3.0.0 are not supported.
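As a rough illustration of the rule above, the following sketch checks whether an Edge version string meets the 3.0.0 minimum. The function name and the handling of version strings (including dropping a patch suffix such as "P3") are assumptions made for illustration and are not part of any VMware tool.

```python
MIN_SUPPORTED = (3, 0, 0)  # minimum Edge release taken from the compatibility statement


def is_supported_edge_version(version: str) -> bool:
    """Return True if a version such as '3.4.2' or '3.3.2 P3' is >= 3.0.0."""
    numeric = version.split()[0]  # drop an optional patch suffix such as 'P3'
    parts = tuple(int(p) for p in numeric.split("."))
    return parts >= MIN_SUPPORTED
```

For example, `is_supported_edge_version("3.3.2 P3")` evaluates to true, while a 2.x version string does not.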
The following interoperability combinations were explicitly tested:
Orchestrator | Gateway | Edge (Hub) | Edge (Branch/Spoke)
4.2.0 | 4.2.0 | 4.2.0 | 4.2.1
4.2.0 | 4.2.0 | 4.2.1 | 4.2.1
4.2.1 | 3.4.2 | 3.4.2 | 3.4.2
4.2.1 | 4.2.1 | 3.4.2 | 3.4.2
4.2.1 | 4.2.1 | 4.2.1 | 3.4.2
4.2.1 | 4.2.1 | 3.4.2 | 4.2.1
4.2.1 | 4.2.1 | 3.4.5 | 3.4.5
4.2.1 | 4.2.1 | 4.2.1 | 3.4.3, 3.4.4, 3.4.5
4.2.1 | 4.2.1 | 3.4.3, 3.4.4, 3.4.5 | 4.2.1
4.2.1 | 3.3.2 P3 | 3.3.2 P3 | 3.3.2 P3
4.2.1 | 4.2.1 | 3.3.2 P3 | 3.3.2 P3
4.2.1 | 4.2.1 | 4.2.1 | 3.3.2 P2, 3.3.2 P3
4.2.1 | 4.2.1 | 3.3.2 P2 | 4.2.1
4.2.1 | 3.2.2 | 3.2.2 | 3.2.2
4.2.1 | 4.2.1 | 3.2.2 | 3.2.2
4.2.1 | 4.2.1 | 4.2.1 | 3.2.2
4.2.1 | 4.2.1 | 3.2.2 | 4.2.1
4.2.1 | 4.0.0 | 4.0.0 | 4.0.0
4.2.1 | 4.0.0 | 4.0.1 | 4.2.1
4.2.1 | 4.2.1 | 4.0.0 | 4.0.1
Warning: VMware SD-WAN Releases 4.0.x and 4.2.x are approaching End of Support.
- Release 4.0.x will reach End of General Support (EOGS) on September 30, 2022, and End of Technical Guidance (EOTG) December 31, 2022.
- Release 4.2.x Orchestrators and Gateways will reach End of General Support (EOGS) on December 30, 2022, and End of Technical Guidance (EOTG) March 30, 2023.
- Release 4.2.x Edges will reach End of General Support (EOGS) on June 30, 2023, and End of Technical Guidance (EOTG) September 30, 2023.
- For more information please consult the Knowledge Base article: Announcement: End of Support Life for VMware SD-WAN Release 4.x (88319)
Note: Release 3.x did not properly support AES-256-GCM, which meant that customers using AES-256 were always using their Edges with GCM disabled (AES-256-CBC). If a customer is using AES-256, they must explicitly disable GCM from the Orchestrator prior to upgrading their Edges to a 4.x Release. Once all their Edges are running a 4.x release, the customer may choose between AES-256-GCM and AES-256-CBC.
Important Notes
BGPv4 Filter Configuration Delimiter Change for AS-PATH Prepending
Through Release 3.x, the VMware SD-WAN BGPv4 filter configuration for AS-PATH prepending supported both comma-based and space-based delimiters. Beginning with Release 4.0.0, VMware SD-WAN supports only a space-based delimiter in an AS-PATH prepending configuration.
Customers upgrading from 3.x to 4.x must edit their AS-PATH prepending configurations to replace commas with spaces prior to the upgrade to avoid incorrect BGP best route selection.
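The delimiter change can be sketched as a simple string conversion. The helper below is a hypothetical illustration for preparing values before an upgrade. It is not an Orchestrator feature, and the AS numbers in the usage note are made up.

```python
import re


def to_space_delimited(prepend: str) -> str:
    """Convert an AS-PATH prepend value such as '65001,65001' to '65001 65001'."""
    # Split on any run of commas and/or whitespace, then drop empty tokens.
    tokens = [t for t in re.split(r"[,\s]+", prepend.strip()) if t]
    return " ".join(tokens)
```

A value such as `"65001,65001, 65001"` becomes `"65001 65001 65001"`, and a value that is already space-delimited is returned unchanged.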
Extended Upgrade Time for Edge 3x00 Models
Upgrades to this version may take longer than normal (3-5 minutes) on Edge 3x00 models (i.e., 3400, 3800 and 3810). This is due to a firmware upgrade which resolves issue 53676. If an Edge 3400 or 3800 had previously upgraded its firmware when on Release 3.4.5 or 4.0.2, then the Edge would upgrade as expected. For more information, please consult Fixed Issue 53676.
Limitation When Disabling Autonegotiation on VMware SD-WAN Edge Models 520, 540, 620, 640, 680, 3400, 3800, and 3810
When a user disables autonegotiation to hardcode speed and duplex on ports GE1 - GE4 on a VMware SD-WAN Edge model 620, 640 or 680; on ports GE3 or GE4 on an Edge 3400, 3800, or 3810; or on an Edge 520/540 when an SFP with a copper interface is used on ports SFP1 or SFP2, the user may find that even after a reboot the link does not come up.
This is caused by each of the listed Edge models using the Intel Ethernet Controller i350, which has a limitation that when autonegotiation is not used on both sides of the link, it is not able to dynamically detect the appropriate wires to transmit and receive on (auto-MDIX). If both sides of the connection are transmitting and receiving on the same wires, the link will not be detected. If the peer side also does not support auto-MDIX without autonegotiation, and the link does not come up with a straight cable, then a crossover Ethernet cable will be needed to bring the link up.
For more information please see the KB article Limitation When Disabling Autonegotiation on VMware SD-WAN Edge Models 520, 540, 620, 640, 680, 3400, 3800, and 3810 (87208).
Document Revision History
April 9th, 2021. First Edition.
April 13th, 2021. Second Edition.
- Added Fixed Issue 53676 to Edge/Gateway Resolved Issues section. This issue was erroneously omitted from the original Release Notes.
- Added an Important Notes section that addresses the extended upgrade time an Edge 3x00 would experience if its firmware needed to be upgraded as part of the fix for 53676.
April 21st, 2021. Third Edition.
- Added a new Orchestrator build: R421-20210415-GA as the most current build. Added a new section for R421-20210415-GA to the Orchestrator Resolved Issues section.
- Added ticket #61312 to the R421-20210415-GA Resolved Issues section of the Orchestrator.
May 7th, 2021. Fourth Edition.
- Revised the Compatibility table to include two new tested combinations:
  - Orchestrator and Gateway on Release 4.2.0 are tested as compatible with a Hub Edge using 4.2.0 and a Spoke Edge using 4.2.1.
  - Orchestrator and Gateway on Release 4.2.0 are tested as compatible with a Hub Edge using 4.2.1 and a Spoke Edge using 4.2.1.
- Added Fixed Issues 55949 and 56149 to the Resolved Edge/Gateway section. These tickets should have been included in the original GA Release Notes.
June 15th, 2021. Fifth Edition.
- Amended Edge/Gateway Fixed Issue 56876 to account for a second scenario addressed by the fix, which also involves memory management leading to an Edge kernel panic and reboot.
- Added Edge/Gateway Resolved Issue 54001, which was erroneously omitted from previous editions.
August 5th, 2021. Sixth Edition.
- Added six known issues to the Edge/Gateway Open Issues section: #60006, #60225, #61361, #62552, #63359, and #67790.
August 11th, 2021. Seventh Edition.
- Added a new Edge version R421-20210624-GA-57011-60130 and moved existing tickets 57011 and 60130 to the new section created for that Edge build.
September 16th, 2021. Eighth Edition.
- Added to Important Notes the Note: BGPv4 Filter Configuration Delimiter Change for AS-PATH Prepending.
December 21st, 2021. Ninth Edition.
- Added a new Orchestrator build R421-20211216-GA to Orchestrator Resolved Issues. This Orchestrator build remediates CVE-2021-44228, the Apache Log4j vulnerability, by updating to Log4j version 2.16.0. For more information on the Apache Log4j vulnerability, please consult the VMware Security Advisory VMSA-2021-0028.5.
- Added to Important Notes the Note: Limitation When Disabling Autonegotiation on VMware SD-WAN Edge Models 520, 540, 620, 640, 680, 3400, 3800, and 3810. This note covers an issue that may be encountered when configuring a forced speed on some Ethernet ports of the listed Edge models.
March 24th, 2022. Tenth Edition.
- Added Issue #84825 to the Edge/Gateway Known Issues section.
June 7th, 2022. Eleventh Edition.
- Added Fixed Issue #54493 to the Edge/Gateway Resolved Issues section. This issue was omitted in error from the original edition of the 4.2.1 Release Notes.
Resolved Issues
The resolved issues are grouped as follows.
Edge Resolved Issues
Resolved in Version R421-20210624-GA-57011-60130
The below issues have been resolved since Edge version R421-20210407-GA.
- Fixed Issue 57011: For a site configured with a High-Availability topology, whenever segments are added and then deleted on that site, one of the HA Edges may experience a dataplane service failure, and if the service failure is on the Active Edge, the site would also experience an HA failover.
When segments are added, and then deleted from an HA site, there is the potential for stale segments (i.e., the deleted segments might still show up on one of the Edges in the HA pair). Due to this mismatch in segment information between the HA Edges, any event meant for the stale segment might be sent to the other Edge resulting in a dataplane service failure, an HA failover if the service failure is on the Active Edge, and the generation of a core dump that will be found on a diagnostic bundle taken after the failover. There is no workaround for this issue.
- Fixed Issue 60130: A site may experience intermittent periods of high packet loss and connectivity issues.
This is caused by the API that checks for ARP resolution telling the Edge there is a successful ARP resolution for a device while delivering a MAC address of 00:00:00:00. This address is kept in the ARP cache, and any packets intended for a device whose MAC is listed as zero are dropped. In this issue, many such instances of successful ARPs with zero MAC addresses are delivered, causing high packet loss and connectivity issues.
This fix corrects issues with the cached value of MAC addresses in a flow (the most common cause of the problem). However, this fix does not address a rarer scenario where the ARP cache itself returns a zero MAC; that scenario will be addressed in 62552. Other than having an Edge image with the fix, there is no workaround for this issue.
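The symptom described above can be pictured as a scan of cached IP-to-MAC mappings for all-zero addresses. The sketch below assumes the cache is available as a plain dictionary. The function names and sample addresses are hypothetical, and this does not reflect how the Edge stores its ARP cache.

```python
def is_zero_mac(mac: str) -> bool:
    """True if every hex digit in the address is zero (any octet count)."""
    digits = mac.replace(":", "").replace("-", "")
    return bool(digits) and set(digits) == {"0"}


def zero_mac_entries(arp_cache: dict[str, str]) -> list[str]:
    """Return the IP addresses whose cached MAC address is all zeros."""
    return [ip for ip, mac in arp_cache.items() if is_zero_mac(mac)]
```

Calling `zero_mac_entries` on a cache with one zeroed entry returns only that entry's IP address.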
Resolved in Version R421-20210407-GA
The below issues have been resolved since Edge version R420-20201218-GA and Gateway version R420-20210208-GA-53243-54800.
- Fixed Issue 51025: When a WAN link flaps (alternates rapidly between an up and down state) on a VMware SD-WAN Edge, the route table entry for the routed interface’s default gateway may be removed and not reapplied.
When an Edge encounters this issue, a link flap causes the default gateway route entry to be removed for the interface using that link, resulting in an empty route table for the interface. When the route table is left empty, Linux connection tracking (conntrack) routes to the next table by default, causing all packets to egress through the wrong routed interface.
- Fixed Issue 52102: For an enterprise using a Hub/Spoke topology, existing flows are dropped on a VMware SD-WAN Spoke Edge when recovering from a Hub Edge failover for a given tuple.
The following sequence of events leads to this issue when a primary Hub Edge recovers from failover:
1. When the primary Hub Edge goes down, the route is removed from the FIB for that primary Hub Edge while retaining the routes in the RIB.
2. Existing flows will now switch to the secondary Hub Edge.
3. When the primary Hub comes back up, a tunnel is immediately established between the primary Hub and the Spoke Edge.
4. Routes in the RIB which were previously learned from the primary Hub via the Gateway are scanned and routes are installed in the FIB pointing to this primary Hub.
5. Traffic switches back to the primary Hub even though the primary Hub has not yet learned the routes from its BGP neighbor.
6. This causes the route lookup to match on the default route, and return traffic is marked with a backhaul flag.
7. The Spoke Edge is not expecting return traffic with a backhaul flag set, and this leads to traffic being dropped.
Without this fix, the workaround is to navigate to the Hub Edge and run the Remote Diagnostic "Flush Flows" for the given tuple, which restores traffic.
- Fixed Issue 53415: On a customer enterprise where Edge Network Intelligence (ENI) is enabled, if the enterprise's VMware SD-WAN Edge has Wi-Fi enabled, the ENI page may show an incorrect MAC address for the Wi-Fi access point and the access point IP will show as 160.254.3.1.
The issue is the result of a misconfiguration in which the Wi-Fi access point MAC address is set to a value called 'selfMacAddress' and the access point IP address is always configured as 160.254.3.1 in the ENI page. The fix derives the MAC address from the Wi-Fi interface wlan0 and the IP address from the analytics interface.
- Fixed Issue 53477: When VMware SD-WAN Edges configured in a High Availability topology are moved to a different configuration profile, the Edges experience repeated Edge service restarts.
For this issue, one of the HA Edges is configured to have more LAN or WAN interfaces than the other HA Edge (e.g., a WAN port is disabled on one of the Edges). If these Edges are then moved to a different profile, they will experience continuous Edge service restarts.
- Fixed Issue 53651: On a customer site using an Enhanced High-Availability topology, when making a configuration change to a VMware SD-WAN Edge device setting that requires an Edge service restart, two HA failovers in succession may occur.
For a device setting configuration change that requires an Edge service restart, the HA module wrongly updates the LAN/WAN count to the VMware SD-WAN Gateway before the Edge service is restarted during configuration processing. As a result, when the initial HA failover happens and the current Active Edge's service restarts as part of being demoted to Standby, the Gateway misunderstands that the new Standby Edge has a better LAN/WAN count and sends a failover command to the newly promoted Active Edge, leading to the second failover.
Note: for a list of Edge configuration changes that can trigger a service restart, please consult the KB article, VMware SD-WAN Edge configuration changes that can trigger a service restart (60247)
- Fixed Issue 53676: On the VMware SD-WAN Edge 3x00 platform, very brief periods of input voltage instability, as short as 4 milliseconds, can cause the Edge to reboot.
This issue is typically seen when using an Uninterruptible Power Supply (UPS) that experiences slight output voltage instability when switching from line to battery. The fix for this issue upgrades the Edge’s firmware to tolerate 20-30ms of voltage instability prior to the Edge rebooting.
Note: Upgrading the 3x00's firmware will extend the Edge's upgrade time to 3-5 minutes if the Edge did not previously have its firmware upgraded when using Release 3.4.5 or 4.0.2.
For an Edge 3x00 model without this fix, the customer’s only option is to use a more sophisticated UPS that can switch its input without any output voltage instability.
- Fixed Issue 53789: In VMware SD-WAN Virtual Edges running under ESXi, /var/log/messages is filled with a spurious error message every 30 seconds.
The spurious error message will show as GuestInfoGetDiskDevice: Missing disk device name; VMDK mapping unavailable for "/", fsName: "/dev/root" and is always logged into /var/log/messages, filling up /var/log/messages and its saved counterpart /velocloud/log/messages*, causing more important messages to be rotated out and lost when consulting the logs for the affected Edge.
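When reviewing logs from an Edge without this fix, the spurious entries can be filtered out before reading. The sketch below is a minimal example that matches on the message text quoted above. The function name is hypothetical, and the timestamp and process prefix in the usage note are assumed for illustration.

```python
# Leading portion of the spurious message quoted in the issue description.
SPURIOUS = "GuestInfoGetDiskDevice: Missing disk device name"


def drop_spurious(lines: list[str]) -> list[str]:
    """Return the log lines that do not contain the spurious ESXi message."""
    return [line for line in lines if SPURIOUS not in line]
```

Given a list holding one spurious line and one unrelated kernel message, `drop_spurious` returns only the kernel message.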
- Fixed Issue 53929: On a customer site using an Enhanced High-Availability topology, after an HA failover, 'Cloud via Gateway' flows switch to a 'Direct to Cloud' path.
After an HA failover, if the path to the VMware SD-WAN Gateway is not up when the traffic reaches the VMware SD-WAN Edge, the traffic goes 'Direct to Cloud' instead of 'Cloud via Gateway'. This can have a significant impact on flows that rely on Dynamic Multipath Optimizations, such as Realtime traffic (e.g., voice and video), because Direct traffic does not use these optimizations.
- Fixed Issue 54001: A VMware Edge is unable to send traffic after a Tx queue hang on SFP interfaces.
In rare cases, when the Edge sends an invalid sized packet (less than 17 bytes or greater than 1526 bytes) to DPDK, the transmit queue becomes stalled and causes any further traffic to not be forwarded by the Edge. Rebooting the Edge temporarily corrects the issue, but the problem can happen again when an invalid sized packet is sent from the Edge service to DPDK. Only upgrading to a level with the fix avoids this problem.
- Fixed Issue 54493: An Operator or Partner administrator may observe an increasing number of handoff queue drops for Edge traffic on a VMware SD-WAN Gateway.
For this issue the Gateway would not have CPU utilization issues or DPDK drops. The issue is triggered by a control plane event (for example, route recalculation) and the Gateway begins to drop Edge packets during handoff to different threads in the Gateway's pipeline. The cause of the issue is an insufficiently large queue size for packet buffering.
- Fixed Issue 54694: When a customer uses SNMP polling, SNMP monitoring delivers inaccurate measurements for outbound traffic.
The SNMP call for IF-MIB::ifHCOutOctets delivers Tx packets instead of Tx bytes, resulting in inaccurate outbound octet counts, which affects the customer's ability to monitor their enterprise. This issue is the result of the snmpagent process monitoring Tx packets instead of Tx bytes.
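To show why the counter must carry bytes, the following sketch derives an outbound rate from two ifHCOutOctets samples, as a monitoring tool typically would. The function name and sample values are hypothetical. The 64-bit wrap handling follows from ifHCOutOctets being a Counter64 object in IF-MIB.

```python
COUNTER64_MOD = 2 ** 64  # ifHCOutOctets is a 64-bit counter


def out_bits_per_second(prev_octets: int, curr_octets: int, interval_seconds: float) -> float:
    """Outbound rate in bits per second from two octet counter samples."""
    delta = (curr_octets - prev_octets) % COUNTER64_MOD  # tolerate a counter wrap
    return delta * 8 / interval_seconds
```

If the agent reports packets in place of octets, the same arithmetic understates the rate by roughly the average packet size in bytes.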
- Fixed Issue 55949: In some scenarios a Non SD-WAN Destination (NSD) via Gateway tunnel goes down and does not recover for a period of time.
In a situation where a VMware SD-WAN Gateway triggers an IKE rekey with any NSD and the rekey attempt does not succeed due to a networking issue in the middle of the negotiation, the IKE rekey will keep retrying. When the link is re-established, it is possible that a Dead Peer Detection (DPD) event will delete a newly created Phase 1 Security Association (SA). With some peers, most notably Zscaler, this causes the IPsec SA to be deleted as well. When a peer deletes the IPsec SA, the Gateway is not able to detect it and the tunnel remains down until the next rekey time. Without the fix, the only way to force this rekey is to bounce the tunnel by disabling and reenabling the affected NSD through the VMware SD-WAN Orchestrator.
- Fixed Issue 56149: After Dynamic Cost Calculation (DCC) is enabled on a customer enterprise which is using BGP, a VMware SD-WAN Edge may show an incorrect route preference value for auto-corrected routes if the BGP route for the underlay route flaps.
The impact to the customer is asymmetric routing due to the incorrect remote route preference, which results in higher latency and poor performance on all customer applications. After DCC is enabled, the new routing information base (RIB) preference value should be updated on the route, and the route should be re-advertised to the VMware SD-WAN Gateway with the new RIB preference value, which is then communicated to all Edges. The cause of the issue is that when the route is auto-corrected, this RIB preference is not updated in the peer Edge's FIB table, which retains the old, pre-DCC value.
- Fixed Issue 56346: A customer may observe Handoff Queue Drops when looking at a VMware SD-WAN Edge's Monitor > System page.
A VCRP (VeloCloud Route Protocol) route update event leads to handoff queue drops in the VCMP (VeloCloud Multipath Protocol) data thread. This is because when a route update is received, all the routes in the respective segment are invalidated, which leads to new route lookups in the data path. A particular function that is called as part of the route lookup performs a costly hash enumerate operation, leading to 40% increased VCMP data thread utilization. In the instance where this issue was found in the field, the quantity of handoff queue drops was not sufficient to impact network performance.
- Fixed Issue 56483: Packet loss, jitter, and latency values not showing in WAN link live monitoring on a VMware SD-WAN Orchestrator under the Monitor > Transport screen.
A user is unable to get real-time data for packet loss, jitter, or latency for a particular WAN link under Monitor > Transport, with the graph showing as a flat line. In addition, when looking at the Monitor > Edge > Overview screen, all values for loss, jitter, and latency are expressed as '0'. Historical statistics will show correctly in Monitor > Transport; this issue only affects "Live Mode" statistics.
- Fixed Issue 58535: When a customer has configured a Stateful Firewall, and under Network & Flood Protection has also configured a Denylist, the Denylist automatically sets itself to the most aggressive settings for new connections and the Stateful Firewall blocks any new connection.
The issue has a critical impact for customers using a Stateful Firewall as it renders the Denylist feature unusable. Once the Denylist feature is enabled, the Firewall Events are filled with the logs "FLOOD_ATTACK_DETECTED" and "Blacklisting source: xxx.xxx.x.x exceeded CPS limit : 0 per source", where the IP address is the Edge's management IP address and CPS = Connections Per Second. The New Connection Threshold limit is being set to 0%, which effectively means any connection attempt will trigger the Denylist to block all connections. The default value of New Connection Threshold is 25%.
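The effect of a 0% threshold can be illustrated with a small calculation. The sketch below assumes the per-source limit is a percentage of some maximum connections-per-second figure. The source does not state how the Edge derives the limit, so `max_cps`, both function names, and the comparison are hypothetical.

```python
def cps_limit(max_cps: int, threshold_percent: int) -> float:
    """Per-source connections-per-second limit (assumed percentage model)."""
    return max_cps * threshold_percent / 100


def exceeds_limit(observed_cps: int, max_cps: int, threshold_percent: int) -> bool:
    """True if a source's new-connection rate is above the computed limit."""
    return observed_cps > cps_limit(max_cps, threshold_percent)
```

Under this model a threshold of 0% yields a limit of 0, so a single new connection per second already exceeds it, whereas the default of 25% leaves headroom.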
- Fixed Issue 56876: A VMware SD-WAN Edge may encounter an issue related to memory management and trigger a kernel panic, which will result in an Edge reboot.
This resolved issue includes fixes for two different scenarios involving memory management on an Edge that trigger a kernel panic:
- In the first scenario, where an Edge is using Dynamic Branch-to-Branch, the dynamic tunnels are created, and a small amount of memory is reserved for storing per-peer counters. When the dynamic tunnel is torn down, this memory is not cleaned up so as to optimize the bring-up time the next time this same peer connects. On a small Edge (e.g., Edge 500, 510, 520, 610) which connects to a large number of different destinations over time, this can eventually exhaust available memory and trigger a kernel panic and an Edge reboot. Without this fix, a user needs to proactively restart the Edge's service if memory usage exceeds 90% in the health statistics on the Edge's Monitor > System screen on the VMware SD-WAN Orchestrator.
- In the process of fixing the memory leak caused by Dynamic Branch-to-Branch, it was noted that malloc_trim (a glibc function that releases free heap memory back to the system) was not being properly invoked, and its invocation was corrected as part of this fix. Not invoking malloc_trim properly can cause a different issue that can affect any Edge (not just smaller Edges); it does not require the Edge to be using Dynamic Branch-to-Branch, nor does it require Monitor > System to show memory usage exceeding 90%. This scenario is much more likely to occur if the Edge has a high number of flows.
- Fixed Issue 56931: A customer site that has configured a Non SD-WAN Destination (NSD) via Edge may show incorrect Edge Health Statistics on the VMware SD-WAN Orchestrator UI.
When an NSD is configured from the Edge, the SD-WAN service sends health statistics from the Edge to the Orchestrator with a start time of 0 for the first time after reboot. This results in the Orchestrator displaying the wrong data after the Edge reboots.
- Fixed Issue 57063: If the start and end time for an API call overlaps exactly with the time at which a VMware SD-WAN Edge exports data to the VMware SD-WAN Orchestrator, two behaviors will be observed: a) Link metrics API calls issued from the Orchestrator UI or SDK clients will observe a value higher than normal being returned back in the response. b) Link series API calls issued from the Orchestrator UI or SDK clients will observe the last time series value to be higher than usual.
A user could observe this discrepancy when consulting the Monitor > Transport tab on the Orchestrator UI, or when an SDK client invokes the getEdgeLinkMetrics, getEdgeLinkSeries, or getAggregateLinkMetrics API calls. In either case, the discrepancy would rarely be observed given the requirements noted in the Symptom description.
Orchestrator version R421-20211216-GA
Orchestrator version R421-20211216-GA was released on 12-20-2021. This Orchestrator build remediates CVE-2021-44228, the Apache Log4j vulnerability, by updating to Log4j version 2.16.0. For more information on the Apache Log4j vulnerability, please consult the VMware Security Advisory VMSA-2021-0028.5.
___________________________________________________________________
Resolved in Version R421-20210415-GA
The below issues have been resolved since Orchestrator version R421-20210326-GA.
- Fixed Issue 61312: A VMware SD-WAN Orchestrator may encounter an issue where routes are no longer updated and the CPU utilization of the Orchestrator is near 100%, especially after the Orchestrator is upgraded.
This issue manifests when an Edge sends ~2K+ route updates to the Orchestrator's routing API. In those scenarios where the Orchestrator is unable to process the entire set of routes sent on a particular API call within 60 seconds, the call times out, which in turn results in the API call being rejected entirely. The Edge receives this rejection and attempts to push the same 2K+ routes to the Orchestrator again, leading to the same scenario as before, which creates a loop that overloads the Orchestrator's vCPU resources. When present, this issue can prevent route updates from being processed.
To address this issue, two system properties have been added:
edge.learnedRoute.maxRoutePerCall: This property ensures only a limited number of routes are processed from an Edge. If the property value is ‘200’, then 200 routes will be processed per Edge request, which ensures that an acknowledgment is sent to the Edge on time.
vco.learnedRoute.simultaneous.maxQueue: This property ensures only the configured number of Edges may have route requests queued at a time. If the property value is ‘8’, then only 8 Edges would be permitted to send route requests at a time, and those in excess of the configured value would be rejected immediately prior to the routes being processed.
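The intent of the two properties can be sketched as a per-call batch limit plus an admission check on the number of queued Edges. This is an illustrative model only, not the Orchestrator's implementation. The function names, the use of a set to track queued Edges, and the default values (taken from the examples above) are assumptions.

```python
from typing import Iterator, List, Sequence, Set


def batch_routes(routes: Sequence[str], max_per_call: int = 200) -> Iterator[List[str]]:
    """Yield routes in chunks no larger than max_per_call."""
    for start in range(0, len(routes), max_per_call):
        yield list(routes[start:start + max_per_call])


def admit(queued_edges: Set[str], edge_id: str, max_queue: int = 8) -> bool:
    """Accept a route request only while fewer than max_queue Edges are queued."""
    if edge_id in queued_edges:
        return True  # this Edge already holds a slot
    if len(queued_edges) >= max_queue:
        return False  # reject immediately, before any route processing
    queued_edges.add(edge_id)
    return True
```

With a limit of 200, a set of 2,050 routes is handled as eleven bounded chunks and never as one call that can exceed the 60-second window.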
______________________________________
Resolved in Version R421-20210326-GA
The below issues have been resolved since Orchestrator version R420-20210306-GA.
- Fixed Issue 20900: If the MaxMind geolocation service is enabled and cannot reach the MaxMind server, new VMware SD-WAN Edge activations will not work.
The Edge creates an HTTPS connection to the VMware SD-WAN Orchestrator in order to activate. The default timeout for the request is 120 seconds, and for the proxied connection it is 60 seconds. Because the Orchestrator attempts to geolocate the Edge (by its IPv4 remote address), the upload service waits for the response from the MaxMind service before proceeding with the activation. After 60 seconds, NGINX stops waiting for the upload service’s response and closes the connection, so the activation fails with a 504 timeout from NGINX.
With the new system property service.maxmind.timeout.seconds, the MaxMind API call is made with a custom timeout. If the timeout is reached, the call proceeds with the activation workflow and hence the Edge gets successfully activated.
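The behavior described for service.maxmind.timeout.seconds follows a common pattern: bound the lookup with a timeout and continue without a result if it expires. The sketch below illustrates that pattern under stated assumptions. The `lookup` callable and the function name are hypothetical stand-ins, and this is not the Orchestrator's code.

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout
from typing import Callable, Optional


def geolocate_with_timeout(lookup: Callable[[str], dict], ip: str,
                           timeout_seconds: float) -> Optional[dict]:
    """Run lookup(ip); return None if it does not finish within the timeout."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(lookup, ip)
    try:
        return future.result(timeout=timeout_seconds)
    except FutureTimeout:
        return None  # caller proceeds with activation without a location
    finally:
        pool.shutdown(wait=False)  # do not block on a slow lookup
```

A caller would treat a `None` result as "location unknown" and carry on with the activation workflow.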
- Fixed Issue 49997: If VMware Edge Network Intelligence Analytics Mode is enabled on a VMware SD-WAN Orchestrator, when a new operator user is created that operator is not able to connect with the analytics sections of the Orchestrator UI.
Operator users created after activation of analytics mode should be able to access the VMware Edge Network Intelligence UI of all enterprise customers that have enabled support access, and that is not the case with this issue.
- Fixed Issue 52379: The VMware SD-WAN Orchestrator sends out an ‘Edge Down’ alert email even if the VMware SD-WAN Edge recovers within the configured delay interval.
Administrators can be falsely alerted of an Edge being down in their network even though they configured a delay to allow an Edge to be down for a period of time before triggering that alert.
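The expected behavior of the delay can be expressed as a small predicate: alert only when the delay has elapsed and the Edge did not recover before the deadline. This is a sketch of the intended logic as described above. The function name and the parameters (timestamps in seconds) are hypothetical.

```python
from typing import Optional


def should_alert(down_at: float, recovered_at: Optional[float],
                 now: float, delay_seconds: float) -> bool:
    """Decide whether an 'Edge Down' alert is due at time `now`."""
    if now - down_at < delay_seconds:
        return False  # the configured delay has not elapsed yet
    deadline = down_at + delay_seconds
    # Suppress the alert if the Edge came back within the delay interval.
    return recovered_at is None or recovered_at > deadline
```

An Edge that goes down at t=0 and recovers at t=30 with a 60-second delay produces no alert, while one that is still down at t=120 does.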
- Fixed Issue 53525: When using the New UI on a VMware SD-WAN Orchestrator and viewing the Edge overview page, the Links column does not show the state of the link (e.g., Backup, Standby).
This link state information is correctly shown on the Old UI and with this fix will show as expected on the New UI.
- Fixed Issue 53652: When a customer enterprise that is using a custom application map is upgraded from 3.x to 4.x, the customer may observe random names for their custom applications created prior to the upgrade.
Whenever a custom application map is configured with an Application ID (appId) which already exists as part of a default initial application map, the VMware SD-WAN Orchestrator will always show the display name from the default initial application map and override the customer-defined name. This is also true when the Orchestrator is upgraded from a lower version to a higher version and the higher version's default initial application map has an appId which conflicts with an appId of the custom applications created in the lower version. After the Orchestrator upgrade, those custom applications will show an incorrect display name, which is the display name of that appId in the higher version's default initial application map.
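The conflict described above amounts to an appId appearing in both maps. The sketch below finds such overlaps, assuming each map is reduced to a dictionary from appId to display name. The function name, the dictionary shape, and the sample names in the usage note are hypothetical and do not reflect the Orchestrator's application map format.

```python
def conflicting_app_ids(default_map: dict[int, str],
                        custom_map: dict[int, str]) -> dict[int, tuple[str, str]]:
    """Map each shared appId to (custom name, default name)."""
    shared = sorted(custom_map.keys() & default_map.keys())
    return {app_id: (custom_map[app_id], default_map[app_id]) for app_id in shared}
```

For a custom map containing appId 100 as "my-crm" and a default map containing appId 100 as "default-app", the result pairs the two names under key 100.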
- Fixed Issue 53752: A customer enterprise migration fails when it is attempted from a VMware SD-WAN Orchestrator using a 3.4.x release to an Orchestrator using release 4.2.0.
The latest enterprise migration tool did not support Release 4.2.x and this was the cause of migration failures.
- Fixed Issue 53857: A VMware SD-WAN Orchestrator deployment which uses a KVM image based on Release 4.0.0 will fail to deploy.
The reason for the failure is that the KVM image has an incorrect virtual disk size and the volumes will not expand to the required size. On a deployment, the Orchestrator scripts automatically expand Orchestrator volumes to take 80% of the maximum size of the underlying disks (physical volumes). In this case, because of the incorrect virtual size, that expansion is inadequate for Orchestrator database requirements and the deployment fails. It is possible to deploy an Orchestrator using an older build without this fix, but the volumes must be resized manually.
- Fixed Issue 53987: On a VMware SD-WAN Orchestrator, the Edge Event 'Link MTU Detected' is not searchable on the Orchestrator UI.
This issue has been observed on Orchestrators using release 4.0.x and higher. While doing an event search in the Orchestrator UI under the Events page, 'Link MTU Detected' is not available in the Event list for filtering, making it difficult to isolate that event as part of a troubleshooting effort.
- Fixed Issue 54035: If VMware Edge Network Intelligence Analytics Mode is enabled, packets destined for the syslog daemon, aruba daemon and snmptrap daemon are dropped on the VMware SD-WAN Edge and this data will not show on the Edge Network Intelligence viewer.
The packets destined for the Edge Network Intelligence daemons (syslogd, amond and snmptrapd) are dropped in the Edge dataplane process because the corresponding iptables rules are missing. As a result, the corresponding stats are not received in the Edge Network Intelligence backend.
- Fixed Issue 55259: When an administrator creates a new VMware SD-WAN Edge on the VMware SD-WAN Orchestrator UI, the "Set Location" field is missing.
With this issue, the administrator can create the Edge but without location information and the Orchestrator cannot perform automatic geolocation for the Edge and assign the correct Gateways. The administrator has to perform an additional step after the Edge is created to go in and fill out the Edge location information on the Configure > Edge > Edge Overview page.
- Fixed Issue 55871: Some HTTP API calls to REST APIv2 (/sdwan) cause the server to produce HTTP 500 errors.
In some cases where customer data does not conform precisely to the schema that the API expects, the API produces an HTTP 500 error rather than return data which is inconsistent with the documented API schema. This behavior was driven by a design decision that has since been revisited. Calls to "GET /enterprises", "GET /enterprises/{enterpriseLogicalId}/edges", and "GET /enterprises/{enterpriseLogicalId}/clientDevices" are known to be affected.
- Fixed Issue 56763: On a VMware SD-WAN Orchestrator using Release 4.x or later with reports enabled, if a report fails to generate for whatever reason, all subsequent reports for all customers using the Orchestrator will also fail to generate until the Orchestrator's backend service is restarted.
This issue has a significant impact on an affected Orchestrator because all customers using the Orchestrator will be unable to get reports until the Orchestrator backend service is restarted. This issue is caused by a single report failure which puts the reporting service into a bad state from which it cannot recover outside of restarting the backend service on the Orchestrator. This is because new report generation does not occur independent of prior report generation. The fix ensures that the reporting service continues to generate new reports independently of a report generation failure.
- Fixed Issue 56824: On a VMware SD-WAN Orchestrator using Release 4.2.x, delivery of alerts to Webhook recipients fails when the recipient URL includes an explicit port number.
Users that had previously configured Webhook recipient URLs that included explicit port numbers may observe that alert delivery fails indefinitely to those recipients. Without the fix in this build, an administrator would need to configure a reverse proxy to pass requests along to the original Webhook recipient, and update the Webhook recipient URL to point to the reverse proxy.
- Fixed Issue 56896: User could experience API failures and Gateway timeouts.
This issue is the result of a VMware SD-WAN Orchestrator's disk storage becoming full due to an accumulation of files. The accumulation occurs because flow stats processing can be disabled for a VMware SD-WAN Edge or a list of Edges, in a manner that resembles a block-list / deny-list feature. Although flow processing is skipped for these Edges, their files remain on the Orchestrator's disk without being deleted. In the instance of this issue found in the field, there was sufficient monitoring in place to catch the issue and prevent any user experience issues, but on an Orchestrator with less monitoring this could impact customer traffic. Without this fix, an Orchestrator operator would need to manually delete files on the Orchestrator's disk storage with a timestamp more than 24 hours old.
- Fixed Issue 56909: On a VMware SD-WAN Orchestrator using Release 4.x, report generation may fail when a backup link is included.
If a link has no link statistics records, the report generation throws an error. A link set to backup will generate no link statistics if it strictly remains backup during the period selected for the report. Without this fix the only way to generate a report is to unselect the backup link while generating a report so that the link has some statistical data for its record.
- Fixed Issue 57087: When the user tries to switch a VMware SD-WAN Edge's profile from the Configure > Edge screen, the user will observe a validation error which includes a notification box with a generic error message instead of the actual reason.
The generic error seen reads "Error processing item. Please try again". For the actual validation error reason, the user had to check a web browser's debugging console. After the fix, the appropriate validation error/reason for failure is displayed.
- Fixed Issue 58627: Users configured to receive Alerts may receive a Link Up Alert when in reality the link remains down.
Sometimes after a link is marked as 'Down', statistics for that link that were generated before the link went down may not be sent to the VMware SD-WAN Orchestrator for up to a minute after the event. Once the Orchestrator receives these lagging link statistics, it is fooled into thinking the Link is back up and thus triggers a Link Up alert if the Alert settings are aggressive (e.g. 0 minute delay). The fix ensures that the Orchestrator does not interpret delayed link statistics as indicating that the link is now up.
- Fixed Issue 59094: When an operator is attempting to upgrade a VMware SD-WAN Orchestrator, the update script does not provide a proper warning message about the schema update requirements.
If an operator misses the step to apply schema changes on the larger tables, there could be an error on the Orchestrator services. Also, there is no easy way to find out which changes are missing. With this fix, a backend service restart regenerates any missing schema changes required on a large table.
- Fixed Issue 59967: After a VMware SD-WAN Orchestrator is upgraded to a 4.2.x release or higher, when an Operator user attempts to access the Configure > Business Policy or Configure > Firewall policy pages, the page will not load and the user will see an error.
The error reads "An unexpected error has occurred". This affects Operator users, not Partner or Customer Administrators. The pages do not load due to a missing READ: OBJECT_GROUP privilege for Operators, meaning the Orchestrator does not recognize an Operator as having the necessary privileges to access the Business Policy and Firewall pages.
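For Fixed Issue 56896 above, the manual cleanup that operators needed before the fix (deleting files with a timestamp more than 24 hours old) can be sketched as follows. This is only an illustrative sketch: the release notes do not name the directory where the flow stats files accumulate, so the directory argument, the function names, and the use of file modification time as the "timestamp" are all assumptions.

```python
import time
from pathlib import Path

MAX_AGE_SECONDS = 24 * 60 * 60  # "more than 24 hours", per the workaround text


def find_stale_files(directory, now=None, max_age=MAX_AGE_SECONDS):
    """Return files directly under `directory` whose modification time is
    more than `max_age` seconds before `now` (hypothetical helper)."""
    now = time.time() if now is None else now
    stale = []
    for entry in Path(directory).iterdir():
        if entry.is_file() and now - entry.stat().st_mtime > max_age:
            stale.append(entry)
    return sorted(stale)


def delete_files(paths, dry_run=True):
    """Delete the given files. With dry_run=True, only print what would be
    removed, so the list can be reviewed first."""
    for path in paths:
        if dry_run:
            print(f"would delete {path}")
        else:
            path.unlink()
```

Running with `dry_run=True` first is a deliberate choice: it lets an operator review the candidate files before anything is removed.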
Known Issues
Open Issues in Release 4.2.1
The known issues are grouped as follows.
Edge/Gateway Known Issues
- Issue 14655:
Plugging or unplugging an SFP adapter may cause the device to stop responding on the Edge 540, Edge 840, and Edge 1000 and require a physical reboot.
Workaround: The Edge must be physically rebooted. This may be done either on the Orchestrator using Remote Actions > Reboot Edge, or by power-cycling the Edge.
- Issue 25504:
Static route costs greater than 255 may result in unpredictable route ordering.
Workaround: Use a route cost between 0 and 255
- Issue 25595:
A restart may be required for changes to static SLA on a WAN overlay to work properly.
Workaround: Restart Edge after adding and removing Static SLA from WAN overlay
- Issue 25742:
Underlay accounted traffic is capped at a maximum of the capacity towards the VMware SD-WAN Gateway, even if that is less than the capacity of a private WAN link which is not connected to the Gateway.
- Issue 25758:
USB WAN links may not update properly when switched from one USB port to another until the VMware SD-WAN Edge is rebooted.
Workaround: Reboot the Edge after moving USB WAN links from one port to another.
- Issue 25855:
A large configuration update on the Partner Gateway (e.g. 200 BGP-enabled VRFs) may cause latency to increase for approximately 2-3 seconds for some traffic via the VMware SD-WAN Gateway.
Workaround: No workaround available.
- Issue 25921:
VMware SD-WAN Hub High Availability failover takes longer than expected (up to 15 seconds) when there are three thousand branch Edges connected to the Hub.
- Issue 25997:
The VMware SD-WAN Edge may require a reboot to properly pass traffic on a routed interface that has been converted to a switched port.
Workaround: Reboot the Edge after making the configuration change.
- Issue 26421:
The primary Partner Gateway for any branch site must also be assigned to a VMware SD-WAN Hub cluster for tunnels to the cluster to be established.
- Issue 28175:
Business Policy NAT fails when the NAT IP overlaps with the VMware SD-WAN Gateway interface IP.
- Issue 31210:
VRRP: ARP is not resolved in the LAN client for the VRRP virtual IP address when the VMware SD-WAN Edge is primary with a non-global CDE segment running on the LAN interface.
- Issue 32731:
Conditional default routes advertised via OSPF may not be withdrawn properly when the route is turned off. Re-enabling and disabling the route will retract it successfully.
- Issue 32960:
Interface “Autonegotiation” and “Speed” status might be displayed incorrectly on the Local Web UI for activated VMware SD-WAN Edges.
- Issue 32981:
Hard-coding speed and duplex on a DPDK-enabled port may require a VMware SD-WAN Edge reboot for the configurations to take effect as it requires disabling DPDK.
- Issue 34254:
When a Zscaler CSS is created and the Global Segment has FQDN/PSK settings configured, these settings are copied to Non-Global Segments to form IPsec tunnels to a Zscaler CSS.
- Issue 35778:
When there are multiple user-defined WAN links on a single interface, only one of those WAN links can have a GRE tunnel to Zscaler.
Workaround: Use a different interface for each WAN link that needs to build GRE tunnels to Zscaler.
- Issue 35807:
A DPDK routed interface will be disabled completely if the interface is disabled and re-enabled from the VMware SD-WAN Orchestrator.
- Issue 36923:
Cluster name may not be updated properly in the NetFlow interface description for a VMware SD-WAN Edge which is connected to that Cluster as its Hub.
- Issue 38682:
A VMware SD-WAN Edge acting as a DHCP server on a DPDK-enabled interface may not properly generate “New Client Device” events for all connected clients.
- Issue 38767:
When a WAN overlay that has GRE tunnels to Zscaler configured is changed from auto-detect to user-defined, stale tunnels may remain until the next restart.
Workaround: Restart the Edge to clear the stale tunnel.
- Issue 39134:
The System health statistic “CPU Percentage” may not be reported correctly on Monitor > Edge > System for the VMware SD-WAN Edge, and on Monitor > Gateways for the VMware SD-WAN Gateway.
Workaround: Users should monitor Edge capacity using handoff queue drops, not CPU percentage.
- Issue 39374:
Changing the order of VMware SD-WAN Partner Gateways assigned to a VMware SD-WAN Edge may not properly set Gateway 1 as the local Gateway to be used for bandwidth testing.
- Issue 39608:
The output of the Remote Diagnostic “Ping Test” may display invalid content briefly before showing the correct results.
- Issue 39624:
Ping through a subinterface may fail when the parent interface is configured with PPPoE.
- Issue 39659:
On a site configured for Enhanced High Availability, with one WAN link on each VMware SD-WAN Edge, when the standby Edge has only PPPoE connected and the active has only non-PPPoE connected, a split brain state (active/active) may be possible if the HA cable fails.
- Issue 39753:
Disabling Dynamic Branch-to-Branch VPN may cause existing flows currently being sent using Dynamic Branch-to-Branch to stall.
- Issue 40096:
If an activated VMware SD-WAN Edge 840 is rebooted, there is a chance an SFP module plugged into the Edge will stop passing traffic even though the link lights and the VMware SD-WAN Orchestrator will show the port as 'UP'.
Workaround: Unplug the SFP module and then replug it back into the port.
- Issue 40421:
Traceroute is not showing the path when passing through a VMware SD-WAN Edge with an interface configured as a switched port.
- Issue 42278:
For a specific type of peer misconfiguration, the VMware SD-WAN Gateway may continuously send IKE init messages to a Non-SD-WAN peer. This issue does not disrupt user traffic to the Gateway; however, the Gateway logs will be filled with IKE errors and this may obscure useful log entries.
- Issue 42388:
On a VMware SD-WAN Edge 540, an SFP port is not detected after disabling and reenabling the interface from the VMware SD-WAN Orchestrator.
- Issue 42488:
On a VMware SD-WAN Edge where VRRP is enabled for either a switched or routed port, if the cable is disconnected from the port and the Edge Service is restarted, the LAN connected routes are advertised.
Workaround: There is no workaround for this issue.
- Issue 42872:
Enabling Profile Isolation on a Hub profile where a Hub cluster is associated does not revoke the Hub routes from the routing information base (RIB).
- Issue 43373:
When the same BGP route is learnt from multiple VMware SD-WAN Edges, if this route is moved from preferred to eligible exit in the Overlay Flow Control, the Edge is not removed from the advertising list and continues to be advertised.
Workaround: Enable distributed cost calculation on the VMware SD-WAN Orchestrator.
- Issue 44526: For an enterprise where two different sites deploy their VMware SD-WAN Edges as Hubs while also using a high-availability topology, and each site uses the other Hub site as a Hub in its profile, if one of the Hub sites triggers an HA failover, it may take up to 30 minutes for both Hub Edges to reestablish tunnels with each other.
On an HA failover, both Hub Edges try to initiate a tunnel with each other at the same time and neither replies to the peer. The packet exchange between both Hubs occurs, but IKE never succeeds. This leads to a deadlock that has been observed to take up to 30 minutes to resolve on its own. The issue is intermittent and does not occur after every HA failover.
Workaround: To prevent this issue from occurring, the customer should configure only one of the two HA Hub sites to use the other Hub site as a Hub for itself. For example, where there are two HA Hub sites, Hub1 and Hub2, Hub1 could have Hub2 as a Hub for itself in its profile, but Hub2 must not use Hub1 as a Hub in its profile.
- Issue 44832:
Traffic from one Non SD-WAN Destinations via Edge to another Non SD-WAN Destinations via Edge (i.e. 'hairpinning' or 'NAT loopback') is dropped on the VMware SD-WAN Edge.
- Issue 44995:
OSPF routes are not revoked from VMware SD-WAN Gateways and VMware SD-WAN Spoke Edges when the routes are withdrawn from the Hub Cluster.
- Issue 45189:
When source LAN side NAT is configured, the traffic from a VMware SD-WAN Spoke Edge to a Hub Edge is allowed even without the static route configuration for the NAT subnet.
- Issue 45302:
In a VMware SD-WAN Hub Cluster, if one Hub loses connectivity for more than 5 minutes to all of the VMware SD-WAN Gateways common between itself and its assigned Spoke Edges, the Spokes may in rare conditions be unable to retain the hub routes after 5 minutes. The issue resolves itself when the Hub regains contact with the Gateways.
- Issue 46053:
BGP preference does not get auto-corrected for overlay routes when its neighbor is changed to an uplink neighbor.
Workaround: An Edge Service Restart will correct this issue.
- Issue 46137:
A VMware SD-WAN Edge running 3.4.x software does not initiate a tunnel with AES-GCM encryption even if the Edge is configured for GCM.
- Issue 46216:
On a Non SD-WAN Destinations via Gateway or Edge where the peer is an AWS instance, when the peer initiates Phase-2 re-key, the Phase-1 IKE is also deleted and forces a re-key. This means the tunnel is torn down and rebuilt, causing packet loss during the tunnel rebuild.
Workaround: To avoid tunnel destruction, configure the Non SD-WAN Destinations via Gateway/Edge or CSS IPsec rekey timer to less than 60 minutes. This prevents AWS from initiating the re-key.
- Issue 46391:
For a VMware SD-WAN Edge 3800, the SFP1 and SFP2 interfaces each have issues with Multi-Rate SFPs (i.e. 1/10G) and should not be used in those ports.
Workaround: Please use single rate SFP's per the KB article VMware SD-WAN Supported SFP Module List (79270). Multi-Rate SFPs may be used with SFP3 and SFP4.
- Issue 46918:
A VMware SD-WAN Spoke Edge using the 3.4.2 Release does not update the private network id of a Cluster Hub node properly.
- Issue 47084:
A VMware SD-WAN Hub Edge cannot establish more than 750 PIM (Protocol-Independent Multicast) neighbors when it has 4000 Spoke Edges attached.
- Issue 47244:
On an activated VMware SD-WAN Edge 6x0 with DPDK enabled, with some copper SFPs the VMware SD-WAN Orchestrator UI will show the link as 'UP' even when no cable is inserted.
Workaround: Plugging and unplugging a cable removes the false state.
- Issue 47355:
When the same route is learned via local underlay BGP, Hub BGP and/or statically configured on the Partner Gateway, the sorting order of the routes is incorrect with the Hub BGP being preferred over the underlay BGP.
- Issue 47664:
In a Hub and Spoke configuration where Branch-to-Branch via Hub VPN is disabled, trying to U-turn Branch-to-Branch traffic using a summary route on an L3 switch/router will cause routing loops.
Workaround: Configure Cloud VPN to enable Branch-to-Branch VPN and select “Use Hubs for VPN”.
- Issue 47681:
When a host on the LAN side of a VMware SD-WAN Edge uses the same IP as that Edge’s WAN interface, the connection from the LAN host to the WAN does not work.
- Issue 47787:
A VMware SD-WAN Spoke Edge configured with a backhaul business policy incorrectly sends traffic via the VMware SD-WAN Gateway path if that flow is initiated from the Hub Edge to that Spoke Edge.
- Issue 48166:
A VMware SD-WAN Virtual Edge on KVM is not supported when using a Ciena virtualization OS and the Edge will experience recurring Dataplane Service Failures.
- Issue 48175:
A VMware SD-WAN Edge running Release 3.4.2 will form an OSPF adjacency on a non-global segment if the non-global segment has an interface configured in the same IP range as an interface configured on the global segment
- Issue 48530:
VMware SD-WAN Edge 6x0 models do not perform autonegotiation for triple speed (10/100/1000 Mbps) copper SFP's.
Workaround: Edge 520/540 supports triple speed copper SFPs but this model has been marked for End-of-Sale by Q1 2021.
- Issue 48597: Multihop BGP neighborship does not stay up if one of the two paths to the peer goes down
If there is a Multihop BGP neighborship with a peer to which there are multiple paths and one of them goes down, user will notice that the BGP neighborship goes down and does not come up using the other available path(s). This includes the Local IP-loopback neighborship case too.
Workaround: There is no workaround for this issue.
- Issue 48666:
IPsec-fronted Gateway Path MTU calculation does not account for 61 Byte IPsec overhead, resulting in higher MTU advertisement to LAN client and subsequent IPsec packet fragmentation.
Workaround: There is no workaround for this issue.
- Issue 49172:
A Policy Based NAT rule configured with the same NAT subnet for two different VMware SD-WAN Edges does not work.
- Issue 49738:
In some cases, when a VMware SD-WAN Spoke Edge is configured to use multiple Hub Edges, the Spoke Edge may not form tunnels to one of the Hubs configured in the Hub list.
- Issue 50518:
On a VMware SD-WAN Gateway where PKI is enabled, if >6000 PKI tunnels attempt to connect to the Gateway, the tunnels may not all come up because inbound SAs do not get deleted.
Note: Tunnels using pre-shared key (PSK) authentication do not have this issue.
- Issue 51428: Multicast traffic loss may be observed on a site where the VMware SD-WAN Edge has a sub-interface configured with PIM.
When a sub-interface configured with PIM is moved from a segment to another on the fly, pimd (the process that manages PIM) may restart and the site would experience intermittent multicast traffic loss.
Workaround: Disable the sub-interface first, and then move the sub-interface to another segment. Once moved, re-enable the sub-interface.
- Issue 51436: For a site using an Enhanced High-Availability topology while deploying a VMware SD-WAN Edge using an LTE modem, if the site gets into a "split-brain" state, the HA failover takes ~5-6 minutes.
As part of the recovery from a split-brain state, the LAN ports are brought down on the Active Edge and this impacts LAN traffic during the time the ports are down and until the site can recover.
Workaround: There is no workaround for this issue
- Issue 52483: If underlay accounting is enabled for an interface, the VMware SD-WAN Edge wrongly forwards the traffic back to the same interface instead of forwarding to the overlay.
This behavior is caused by an issue with underlay accounting and a recursive route resolution.
Workaround: Disable underlay accounting for the affected interface.
- Issue 53219: After a VMware SD-WAN Hub Cluster rebalances, a few Spoke Edges may not have their RPF interface/IIF set properly.
On the affected Spoke Edges, multicast traffic will be impacted. After a cluster rebalance, some of the Spoke Edges fail to send a PIM join.
Workaround: This issue will persist until the affected Spoke Edge has an Edge Service restart.
- Issue 53337: Packet drops may be observed with an AWS instance of a VMware SD-WAN Gateway when the throughput is above 3200 Mbps.
When traffic exceeds a throughput of 3200 Mbps with a packet size of 1300 bytes, packet drops are observed at RX and at IPv4 BH handoff.
Workaround: There is no workaround for this issue.
- Issue 53359: BGP/BFD session may fail during some DDoS attack scenarios.
If traffic is flooded from the client connected to the routed interface to the LAN client, the BGP/BFD session can fail. Also when real-time high priority traffic is flooded to the overlay destination, the BGP/BFD session can fail.
Workaround: There is no workaround for this issue.
- Issue 53830: On a VMware SD-WAN Edge, some of the routes in BGP view may not have the correct preference and advertise values when DCC flag is enabled causing incorrect sorting order in the Edge's FIB.
When Distributed Cost Calculation (DCC) is enabled in a scaled scenario with a large number of routes on an Edge, the bgp_view log in an Edge diagnostic bundle may show that some of the routes are not correctly updated with the preference and advertise values. This issue, if found at all, would be found in a few Edges as part of a large enterprise (100+ Spoke Edges connected to either Hub Edges or Hub Clusters).
Workaround: This issue can be addressed by either relearning the underlay BGP routes or performing a "Refresh" option on the OFC page of the VMware SD-WAN Orchestrator for the affected routes. Please note that performing a "Refresh" of a route would re-learn the routes from all the Edges in the enterprise.
- Issue 53934: In an enterprise where a VMware SD-WAN Hub Cluster is configured, if the primary Hub has Multihop BGP neighborships on the LAN side, the customer may experience traffic drops on a Spoke Edge when there is a LAN side failure or when BGP is disabled on all segments.
In a Hub cluster, the primary Hub has Multihop BGP neighborship with a peer device to learn routes. If the physical interface on the Hub by which BGP neighborship is established, goes down, then BGP LAN routes may not become zero despite BGP view being empty. This may cause Hub Cluster rebalancing to not happen. The issue may also be observed when BGP is disabled for all segments and when there are one or more Multihop BGP neighborships.
Workaround: Restart the Hub which had the LAN-side failure (or BGP disabled).
- Issue 56218: For a customer site deployed with a High-Availability topology or where HA has just been enabled, when the Edges are upgraded from 3.2.x to 3.4.x, the Standby Edge may go down.
When HA is enabled or the HA Edges are upgraded from 3.2.2 to 3.4.x after a WAN setting is configured using the Local UI, the HA interface (e.g. LAN1 or GE1 depending on the Edge model) will be removed from the Standby Edge and HA status will be set to HA_FAILED on the VMware SD-WAN Orchestrator.
Workaround: Reboot the Standby Edge to recover it
- Issue 60006: When HA is enabled on hardware-based VMware SD-WAN Edges like the 620 and 640, the Standby Edge may reboot.
When HA is enabled on a 620 or 640 (these are the models on which this issue has been observed), the Standby Edge may detect an Active/Active panic and the Standby Edge would reboot to correct the Active/Active state. This issue is caused by the following: during Edge initialization there is a chance of a race condition between the HA interface initialization and the HA State-machine initialization. In other words, the HA state machine starts much earlier than the HA interface driver initialization completes and as a result the HA state machine detects no heartbeat from the peer Edge and moves to an Active state. This issue happens infrequently and should it happen for a particular site, it is unlikely it would happen twice in the same session. In other words the site is not expected to get into some endless cycle of Standby Edge reboots.
Workaround: There is no workaround for this issue, but usually the Standby recovers after the first reboot.
- Issue 60225: When running the Remote Diagnostic "Interface Status" for a VMware SD-WAN Edge, the output on the VMware SD-WAN Orchestrator for SFP interfaces shows the incorrect speed and duplex information.
The data on the Orchestrator is incorrect for SFP interfaces. For example, the Orchestrator may show 0 Mbps / half-duplex, whereas the same interface viewed directly on the Edge shows full duplex at 1000 Mbps, or something similar.
Workaround: There is no workaround.
- Issue 60523: Ping fails to a routed-client IP address if an SLA probe is enabled.
The Edge Dataplane Service fails to process the ICMP response packet if an SLA probe is enabled for the routed client IP address. Without a fix, the only way to resolve this is to disable the ICMP probe.
Workaround: Disable the ICMP probe.
- Issue 61361: When applying a software update to upgrade a VMware SD-WAN Edge 3400, 3800 and 3810 to Edge Release 3.4.5, 4.0.2, or 4.2.1, there is a chance the Edge may not boot back up immediately after the update.
Release 3.4.5, 4.0.2, and 4.2.1 include a particular firmware update for the complex programmable logic device (CPLD), and the update triggers a reboot that can sometimes get "stuck", requiring a manual power cycle to restart the system.
Workaround: Manually power cycle the Edge to complete the update.
- Issue 61543: If more than one 1:1 NAT rule is configured on different interfaces with the same Inside IP, the inbound traffic can be received on one interface and the outbound packets of the same flow can be routed via different interface.
For NAT flows from Outside to Inside, the 1:1 NAT rules are matched against the Outside IP and the interface where the packets are received. For the outbound packets of the same flow, the VMware SD-WAN Edge tries to match the NAT rules again by comparing the Inside IP, so the outbound traffic can be routed via the interface configured in the first matching rule that has "Outbound Traffic" enabled.
Workaround: There is no workaround for this issue outside of ensuring no more than one 1:1 NAT rule is configured with a particular Inside IP address.
- Issue 62552: A site may experience intermittent periods of high packet loss and connectivity issues.
This is caused by the API that checks for ARP resolution telling the Edge there is a successful ARP resolution for a device while delivering a MAC address of 00:00:00:00. This address is kept in the ARP cache and any packets intended for the device where the MAC is listed as zero are dropped. In this issue, many such instances of successful ARP's with zero MAC addresses are delivered causing high packet loss and connectivity issues.
Note: Both issue 60130 and this issue have the same underlying behavior and cause but the expected fixes for each ticket differ. 60130 will have a defensive workaround fix while 62552 will have a complete fix that prevents any recurrence of this issue.
Workaround: There is no workaround for this issue.
- Issue 63359: For a site configured with a High-Availability topology and OSPF and where the VMware SD-WAN Edges are using a MGMT IP Edge build, when these Edges are upgraded from a 3.4.x to a 4.2.x MGMT-IP build, OSPF connectivity may be broken post-upgrade.
When the HA Edges are upgraded to a 4.2.x MGMT IP build, the HA system may define its Router ID as 169.254.2.2. This is not the expected behavior given that the Edge selection of Router ID should not take the HA interface's IP Address into account. This Router ID breaks OSPF connectivity and there is a complete disconnection as route exchange no longer occurs.
Workaround: Restart the Edge service (triggering an HA failover) as this will force a reselection of the Router ID which should be a correct one after the restart.
- Issue 67790: For a customer enterprise which uses either BGP or OSPF and has configured one or more inbound filters to ignore certain routes, when Distributed Cost Calculation (DCC) is enabled on this enterprise, the inbound filters will no longer be in effect and traffic will attempt to use those routes.
Prior to DCC being enabled, the forwarding information base (FIB) will not include the routes that were set to IGNORE on the BGP/OSPF inbound filter. After DCC is enabled the FIB now includes these routes and traffic will attempt to use these routes with the potential for significant traffic disruption for the customer enterprise.
Workaround: OSPF/BGP needs to be restarted for the inbound filter to be properly applied.
- Issue 84825: For a site deployed with a High-Availability topology where BGP is configured, if the site has greater than 512 BGPv4 match and set rules configured, the customer may observe the HA Edge pair continuously failing over without ever recovering.
Greater than 512 BGPv4 match and set rules is understood as a customer configuring more than 256 such rules on the inbound filter and 256 rules on the outbound filter. This issue would be disruptive to the customer as the repeated failover would cause flows for real time traffic like voice calls to be continuously dropped and then recreated. When HA Edges experience this issue, the process that synchronizes Edge CPU threads fails causing the Edge to reboot to recover, but the promoted Edge also experiences the same issue and reboots in turn with no recovery reached at the site.
Workaround: Without a fix for this issue, the customer must ensure that no more than 512 BGPv4 match and set rules are configured for an HA site.
If a site is experiencing this issue and has more than 512 BGPv4 match and set rules configured, the customer must immediately reduce the number of rules to 512 or fewer to recover the site.
Alternatively, if the customer must have more than 512 BGPv4 match and set rules, they can downgrade the HA Edges to Release 3.4.6 where this issue is not encountered, but at the cost of Edge features found in later releases. This can only be done if their Edge model is supported on 3.4.6 and the customer should confirm that is so before downgrading.
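For Issue 84825 above, a configuration audit against the 512-rule limit could be sketched as below. This is a minimal sketch under stated assumptions: the rule lists are a hypothetical in-memory representation (the release notes do not describe an API or export format for BGP filters), and the limit is applied to the combined inbound and outbound count, following the workaround text.

```python
MAX_HA_BGP_RULES = 512  # combined limit stated in the Issue 84825 workaround


def count_bgp_rules(inbound_rules, outbound_rules):
    """Count BGPv4 match and set rules per direction and in total.
    Both arguments are plain lists of rule objects (hypothetical shape)."""
    inbound = len(inbound_rules)
    outbound = len(outbound_rules)
    return {"inbound": inbound, "outbound": outbound, "total": inbound + outbound}


def exceeds_ha_limit(inbound_rules, outbound_rules, limit=MAX_HA_BGP_RULES):
    """Return True when the combined rule count is above the limit."""
    return count_bgp_rules(inbound_rules, outbound_rules)["total"] > limit
```

For example, a site with 256 inbound and 256 outbound rules sits exactly at the limit and is not flagged, while adding one more rule in either direction is.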
Orchestrator Known Issues
- Issue 19566:
After High Availability failover, the serial number of the standby VMware SD-WAN Edge may be shown as the active serial number in the Orchestrator.
- Issue 21342:
When assigning Partner Gateways per-segment, the proper list of Gateway Assignments may not show under the Operator option "View" Gateways on the VMware SD-WAN Edge monitoring list.
- Issue 24269:
Monitor > Transport > Loss does not graph observed WAN link loss, while the QoE graphs do reflect this loss.
- Issue 25932:
The VMware SD-WAN Orchestrator allows VMware SD-WAN Gateways to be removed from the Gateway Pool even when they are in use.
- Issue 32335:
The ‘End User Service Agreement’ (EUSA) page throws an error when a user is trying to accept the agreement.
Workaround: Ensure no leading or trailing spaces are found in Enterprise Name.
- Issue 32435:
A VMware SD-WAN Edge override for a policy-based NAT configuration is permitted for tuples which are already configured at the profile level and vice versa.
- Issue 32856:
Though a business policy is configured to use the Hub cluster to backhaul internet traffic, the user can unselect the Hub cluster from a profile on a VMware SD-WAN Orchestrator that has been upgraded from Release 3.2.1 to Release 3.3.x.
- Issue 32913:
After Enabling High Availability, Multicast details for the VMware SD-WAN Edge are not displayed on the Monitoring Page. A failover resolves the issue.
- Issue 33026:
The ‘End User Service Agreement’ (EUSA) page does not reload properly after deleting the agreement.
- Issue 34828:
Traffic cannot pass between a VMware SD-WAN Spoke Edge using release 2.x and a Hub Edge using release 3.3.1.
- Issue 35658:
When a VMware SD-WAN Edge is moved from one profile to another which has a different CSS setting (e.g. IPsec in profile1 to GRE in profile2), the Edge level CSS settings will continue to use the previous CSS settings (e.g. IPsec versus GRE).
Workaround: Disable and then reenable GRE at the Edge level to resolve the issue.
- Issue 35667:
When a VMware SD-WAN Edge is moved from one profile to another profile which has the same CSS setting but a different GRE CSS name (the same endpoints), some GRE tunnels will not show in monitoring.
Workaround: Disable and then reenable GRE at the Edge level to resolve the issue.
- Issue 36665:
If the VMware SD-WAN Orchestrator cannot reach the internet, user interface pages that require accessing the Google Maps API may fail to load entirely.
- Issue 38056:
The Edge-Licensing export.csv file does not show region data.
- Issue 38843:
When pushing an application map, there is no Operator event, and the Edge event is of limited utility.
- Issue 39633:
The Super Gateway hyperlink does not work after a user assigns the Alternate Gateway as the Super Gateway.
- Issue 39790:
The VMware SD-WAN Orchestrator allows a user to configure more than the supported 32 subinterfaces on a VMware SD-WAN Edge’s routed interface. Configuring 33 or more subinterfaces on an interface would cause a Dataplane Service Failure for the Edge.
- Issue 40341:
Though the Skype application is properly categorized on the backend as Real Time traffic, when editing the Skype Business Policy on the VMware SD-WAN Orchestrator, the Service Class may erroneously display “Transactional”.
- Issue 41691:
A user cannot change the 'Number of addresses' field on the Configure > Edge > Device page even though the DHCP pool is not exhausted.
- Issue 43276:
A user cannot change the Segment type when a VMware SD-WAN Edge or Profile has a partner gateway configured.
- Issue 44153:
The VMware SD-WAN Orchestrator does not consistently send alert emails to the email addresses configured in the 'Alerts and Notifications' section.
- Issue 46254:
During a VMware SD-WAN Edge activation, the VMware SD-WAN Orchestrator does not detect a changed WAN link MTU or the presence of a VLAN ID for DHCP configured interfaces.
- Issue 47269:
The VMware SD-WAN 510-LTE interface may appear for Edge models that do not support an LTE interface.
- Issue 47713:
If a Business Policy Rule is configured while Cloud VPN is disabled, the NAT configuration must be reconfigured upon enabling Cloud VPN.
- Issue 47820:
If a VLAN is configured with DHCP disabled at the Profile level, while also having an Edge Override for this VLAN on that Edge with DHCP enabled, and there is an entry for the DNS server field set to none (no IP configured), the user will be unable to make any changes on the Configure > Edge > Device page and will get an error message of ‘invalid IP address []’ that does not explain or point to the actual problem.
- Issue 48085:
The VMware SD-WAN Orchestrator allows a user to delete a VLAN which is associated with an interface.
- Issue 48737:
On a VMware SD-WAN Orchestrator which is using the Release 4.0.0 new user interface, if a user is on a Monitor page and changes the Start & End time interval and then navigates between tabs, the Orchestrator does not update the Start & End time interval to the new values.
- Issue 49225:
The VMware SD-WAN Orchestrator does not enforce a limit of 32 total VLANs.
- Issue 49790:
When a VMware SD-WAN Edge is activated to Release 4.0.0, the activation is posted twice in Events.
Workaround: Ignore the duplicate event.
- Issue 50531:
When two Operators of differing privileges use the same browser window when accessing the New UI on a 4.0.0 Release version of the VMware SD-WAN Orchestrator, and the Operator with lesser privileges tries to log in after the Operator with higher privileges, that lesser privileged Operator will observe multiple errors stating that the "user does not have privilege".
Note: There is no escalation in privileges for the Operator with lower privileges, only the display of error messages.
Workaround: The next Operator may refresh that page prior to logging in to prevent seeing the errors, or each Operator may use different browser windows to avoid this display issue.
- Issue 51722: On the Release 4.0.0 VMware SD-WAN Orchestrator, the time range selector is no greater than two weeks for any statistic in the Monitor > Edge tabs.
The time range selector does not show options greater than "Past 2 Weeks" in Monitor > Edge tabs even if the retention period for a set of statistics is much longer than 2 weeks. For example, flow and link statistics are retained for 365 days by default (which is configurable), while path statistics are retained only for 2 weeks by default (also configurable). This issue makes all Monitor tabs conform to the statistic with the shortest retention period, rather than allowing a user to select a time period that is consistent with the retention period for that statistic.
Workaround: A user may use the "Custom" option in the time range selector to see data for more than 2 weeks.
- Issue 60039: RMA Reactivation does not work when the VMware SD-WAN Edge model is changed.
When performing an RMA Reactivation for a site where the Edge model is also being changed, the VMware SD-WAN Orchestrator does not save the model change, making the reactivation link ineffective. This only affects RMA Reactivations where the Edge model is changed; an RMA Reactivation where the Edge model remains the same will work as expected.
Workaround: If using a different Edge model for a site, the user would need to create a new Edge and manually apply all Edge-specific settings.
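Several of the issues above (32335, 39790, and 49225) concern limits or input formats that the Orchestrator does not enforce itself. The following is a minimal sketch of a pre-check an administrator might run against planned settings before entering them. The `EdgeConfig` type and `precheck` function are hypothetical and are not part of any VMware SD-WAN API; only the limit of 32 and the whitespace condition come from the issue descriptions.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Limits taken from the issue descriptions above.
MAX_SUBINTERFACES_PER_INTERFACE = 32  # Issue 39790
MAX_TOTAL_VLANS = 32                  # Issue 49225


@dataclass
class EdgeConfig:
    """Hypothetical, simplified view of planned settings (not an Orchestrator type)."""
    enterprise_name: str
    vlan_ids: list[int] = field(default_factory=list)
    # Maps a routed interface name (for example "GE3") to its subinterface IDs.
    subinterfaces: dict[str, list[int]] = field(default_factory=dict)


def precheck(config: EdgeConfig) -> list[str]:
    """Return warnings for settings the Orchestrator accepts but that trigger known issues."""
    warnings: list[str] = []

    # Issue 32335: leading or trailing spaces in the Enterprise Name break the EUSA page.
    if config.enterprise_name != config.enterprise_name.strip():
        warnings.append("Enterprise Name has leading or trailing spaces (Issue 32335)")

    # Issue 49225: the limit of 32 total VLANs is not enforced.
    if len(config.vlan_ids) > MAX_TOTAL_VLANS:
        warnings.append(
            f"{len(config.vlan_ids)} VLANs configured; limit is {MAX_TOTAL_VLANS} (Issue 49225)"
        )

    # Issue 39790: more than 32 subinterfaces on one interface risks a Dataplane Service Failure.
    for name, subs in sorted(config.subinterfaces.items()):
        if len(subs) > MAX_SUBINTERFACES_PER_INTERFACE:
            warnings.append(
                f"{name} has {len(subs)} subinterfaces; "
                f"limit is {MAX_SUBINTERFACES_PER_INTERFACE} (Issue 39790)"
            )

    return warnings
```

An empty list means none of the three conditions were detected. The check covers only these three issues and does not replace validation by the Orchestrator.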