VMware NSX-T Data Center 3.1.0   |  30 October 2020  |  Build 17107167

Check regularly for additions and updates to these release notes.

What's in the Release Notes

The release notes cover the following topics:

  • What's New
  • Compatibility and System Requirements
  • API Deprecations and Behavior Changes
  • API and CLI Resources
  • Available Languages
  • Document Revision History
  • Resolved Issues
  • Known Issues

What's New

NSX-T Data Center 3.1 provides a variety of new features to offer new functionalities for virtualized networking and security for private, public, and multi-clouds. Highlights include new features and enhancements in the following focus areas:

  • Cloud-scale Networking: Federation enhancements, Enhanced Multicast capabilities.
  • Move to Next Gen SDN: Simplified migration from NSX-V to NSX-T.
  • Intrinsic Security: Distributed IPS, FQDN-based Enhancements.
  • Lifecycle and monitoring: NSX-T support with vSphere Lifecycle Manager (vLCM), simplified installation, enhanced monitoring, search and filtering.
  • Inclusive terminology: In NSX-T 3.1, as part of a company-wide effort to remove instances of non-inclusive language in our products, the NSX team has made changes to some of the terms used in the product UI and documentation. APIs, CLIs, and logs still use legacy terms.

In addition to these enhancements, many other capabilities are added in every part of the product. More details on the new features and enhancements in NSX-T Data Center 3.1.0 are listed below.

Federation

  • Support for standby Global Manager Cluster
    • Global Manager can now have an active cluster and a standby cluster in another location. Latency between the active and standby clusters must not exceed 150 ms round-trip time.
  • With support for Federation upgrade and a standby Global Manager, Federation is now considered production ready.

L2 Networking

Change the display name for TCP/IP stack: The netstack keys remain "vxlan" and "hyperbus" but the display name in the UI is now "nsx-overlay" and "nsx-hyperbus".

  • The display name will change in both the list of Netstacks and list of VMKNICs 
  • This change will be visible with vCenter 6.7

Improvements in L2 Bridge Monitoring and Troubleshooting 

  • Consistent terminology across documentation, UI and CLI
  • Addition of new CLI commands to get summary and detailed information on L2 Bridge profiles and stats
  • Log messages to identify the bridge profile, the reason for the state change, as well as the logical switch(es) impacted

Support TEPs in different subnets to fully leverage different physical uplinks

A Transport Node can have multiple host switches attached to several Overlay Transport Zones. Previously, the TEPs for all those host switches had to have IP addresses in the same subnet. This restriction has been lifted, allowing you to pin different host switches to different physical uplinks that belong to different L2 domains.

Improvements in IP Discovery and NS Groups: IP Discovery profiles can now be applied to NS Groups, simplifying usage for Firewall Admins.

L3 Networking

Policy API enhancements

  • Ability to configure BFD peers on gateways and forwarding up timer per VRF through policy API.
  • Ability to retrieve the proxy ARP entries of a gateway through the policy API.

BGP enhancement

  • RFC 5549 support: A Tier-0 gateway can now advertise IPv4 prefixes over IPv6-only BGP peers. This minimizes the number of BGP peers needed to advertise both IPv4 and IPv6 prefixes.

Multicast

NSX-T 3.1 is a major release for Multicast, which extends its feature set and confirms its status as enterprise ready for deployment. 

  • Support for Multicast Replication on Tier-1 gateway. Allows you to turn on multicast for a Tier-1 with a Tier-1 Service Router (mandatory requirement) and have multicast receivers and sources attached to it.
  • Support for IGMPv2 on all downlinks and uplinks from Tier-1
  • Support for PIM-SM on all uplinks (config max supported) between each Tier-0 and all TORs  (protection against TOR failure)
  • Ability to run Multicast in A/S and Unicast ECMP in A/A from Tier-1 → Tier-0 → TOR 
    • Please note that Unicast ECMP is not supported from an ESXi host to a T1 when attached to a T1 that also has Multicast enabled.
  • Support for static RP programming and learning through BS & Support for Multiple Static RPs
  • Distributed Firewall support for Multicast Traffic
  • Improved Troubleshooting: This adds the ability to configure IGMP Local Groups on the uplinks so that the Edge can act as a receiver. This will greatly help in triaging multicast issues by being able to attract multicast traffic of a particular group to Edge.

Edge Platform and Services

  • Inter-TEP communication within the same host: The Edge TEP IP can now be on the same subnet as the local hypervisor TEP. This is supported only if a VLAN segment or an NSX-controlled DVPG is used.
  • Support for redeployment of Edge node: A defunct Edge node, VM or physical server, can be replaced with a new one without requiring it to be deleted.
  • NAT connection limit per Gateway: The maximum number of NAT sessions can be configured per Gateway.
  • Edge Networking Performance increased: Overall networking performance has been greatly improved by adding more queues to a single vNIC. To achieve higher throughput, the ethernetX.pnicFeatures = "4" parameters must be manually added to the edge node virtual machine's .VMX file. The settings must be applied to the VMX file while the Edge virtual machine is powered off. The following parameters must be added to the VMX file:
    • ethernet0.pnicFeatures = "4"
    • ethernet1.pnicFeatures = "4"
    • ethernet2.pnicFeatures = "4"
    • ethernet3.pnicFeatures = "4"
  • The ethernetX.pnicFeatures settings must be added manually for every edge node that was created prior to NSX-T version 3.1.0 (see the sketch after this list). Every edge node virtual machine created on release 3.1.0 will already have the settings applied to the .VMX file. Starting with NSX-T version 3.1.0, the ethernetX.pnicFeatures settings are automatically added when an edge node virtual machine is re-deployed.
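
The exact workflow depends on your environment, but as a rough illustration, the following Python sketch appends the settings above to a powered-off Edge VM's .VMX file. The file path is a placeholder and this helper is not part of the product; edit the file through whatever mechanism your environment supports.

    import pathlib

    # Hypothetical .vmx path; adjust to your datastore and Edge VM name.
    VMX_PATH = pathlib.Path("/vmfs/volumes/datastore1/edge-node-01/edge-node-01.vmx")

    # Settings listed above, one entry per Edge vNIC.
    PNIC_FEATURE_LINES = [
        'ethernet0.pnicFeatures = "4"',
        'ethernet1.pnicFeatures = "4"',
        'ethernet2.pnicFeatures = "4"',
        'ethernet3.pnicFeatures = "4"',
    ]

    def add_pnic_features(vmx_path: pathlib.Path) -> None:
        """Append any pnicFeatures settings that are not already present.

        The Edge virtual machine must be powered off before the file is edited.
        """
        existing = vmx_path.read_text().splitlines()
        missing = [line for line in PNIC_FEATURE_LINES if line not in existing]
        if missing:
            with vmx_path.open("a") as handle:
                handle.write("\n".join(missing) + "\n")

    if __name__ == "__main__":
        add_pnic_features(VMX_PATH)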

Firewall

  • Improvements in FQDN-based Firewall: You can define FQDNs that can be applied to a Distributed Firewall. You can either add individual FQDNs or import a set of FQDNs from CSV files.

Firewall Usability Features

  • Firewall Export & Import: NSX now provides the option for you to export and import firewall rules and policies as CSVs. 
  • Enhanced Search and Filtering: Improved search indexing and filtering options for firewall rules based on IP ranges.

Distributed Intrusion Detection/Prevention System (D-IDPS)

Distributed IPS

  • NSX-T now includes a Distributed Intrusion Prevention System. You can block threats based on signatures configured for inspection.
  • Enhanced dashboard to provide details on threats detected and blocked.
  • IDS/IPS profile creation is enhanced with Attack Types, Attack Targets, and CVSS scores to create more targeted detection.

Load Balancing

  • HTTP server-side Keep-alive: An option to keep a one-to-one mapping between the client-side connection and the server-side connection; the backend connection is kept open until the frontend connection is closed.
  • HTTP cookie security compliance: Support for "httponly" and "secure" options for HTTP cookie.
  • A new diagnostic CLI command: The single command captures various troubleshooting outputs relevant to Load Balancer.

VPN

  • TCP MSS Clamping for L2 VPN: The TCP MSS Clamping feature allows an L2 VPN session to pass traffic when there is an MTU mismatch.

Automation, OpenStack and API

  • NSX-T Terraform Provider support for Federation: The NSX-T Terraform Provider extends its support to NSX-T Federation. This allows you to create complex logical configurations with networking, security (segments, gateways, firewall, etc.) and services in an infrastructure-as-code model. For more details, see the NSX-T Terraform Provider release notes.
  • Conversion to NSX-T Policy Neutron Plugin for OpenStack environment consuming Management API: Allows you to move an OpenStack with NSX-T environment from the Management API to the Policy API. This gives you the ability to move an environment deployed before NSX-T 2.5 to the latest NSX-T Neutron Plugin and take advantage of the latest platform features.
  • Ability to change the order of NAT and Firewall on OpenStack Neutron Router: This gives you the choice in your deployment for the order of operation between NAT and the firewall. At the OpenStack Neutron Router level (mapped to a Tier-1 in NSX-T), the order of operation can be defined to be either NAT then firewall or firewall then NAT. This is a global setting for a given OpenStack Platform.
  • NSX Policy API Enhancements: Ability to filter and retrieve all objects within a subtree of the NSX Policy API hierarchy. In previous versions, filtering was done from the root of the tree (policy/api/v1/infra?filter=Type-); you can now retrieve all objects from a sub-tree instead. For example, this allows a network admin to look at all Tier-0 configurations by simply calling /policy/api/v1/infra/tier-0s?filter=Type- instead of specifying all the Tier-0 related objects from the root. A sketch is shown after this list.
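
As a rough illustration of working against a sub-tree of the Policy API, the following Python sketch lists Tier-0 gateways by querying /policy/api/v1/infra/tier-0s directly. The manager address and credentials are placeholders, certificate verification is disabled only for brevity, and the exact filter expression syntax should be taken from the NSX-T REST API documentation.

    import requests

    NSX_MANAGER = "nsx-mgr.example.com"   # placeholder NSX Manager FQDN or IP
    AUTH = ("admin", "password")          # placeholder credentials

    # Query the tier-0s sub-tree directly instead of filtering from /policy/api/v1/infra.
    url = f"https://{NSX_MANAGER}/policy/api/v1/infra/tier-0s"
    response = requests.get(url, auth=AUTH, verify=False)
    response.raise_for_status()

    for tier0 in response.json().get("results", []):
        print(tier0.get("id"), tier0.get("display_name"))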

Operations

  • NSX-T support with vSphere Lifecycle Manager (vLCM): Starting with vSphere 7.0 Update 1, VMware NSX-T Data Center can be supported on a cluster that is managed with a single vSphere Lifecycle Manager (vLCM) image. As a result, NSX Manager can be used to install, upgrade, or remove NSX components on the ESXi hosts in a cluster that is managed with a single image.
    • Hosts can be added and removed from a cluster that is managed with a single vSphere Lifecycle Manager and enabled with VMware NSX-T Data Center.
    • Both VMware NSX-T Data Center and ESXi can be upgraded in a single vSphere Lifecycle Manager remediation task. The workflow is supported only if you upgrade from VMware NSX-T Data Center version 3.1.
    • Compliance can be checked, a remediation pre-check report can be generated, and a cluster that is managed with a single vSphere Lifecycle Manager image and enabled with VMware NSX-T Data Center can be remediated.
  • Simplification of host/cluster installation with NSX-T: Through the "Getting Started" button in the VMware NSX-T Data Center user interface, simply select the cluster of hosts on which NSX needs to be installed, and the UI automatically prompts you with a network configuration recommended by NSX based on your underlying host configuration. This configuration can then be installed on the cluster of hosts, completing the entire installation in a single click after selecting the clusters. The recommended host network configuration is shown in the wizard with a rich UI, and any changes to the desired network configuration before NSX installation are dynamically updated so users can refer to it as needed.
  • Enhancements to in-place upgrades: Several enhancements have been made to the VMware NSX-T Data Center in-place host upgrade process, like increasing the max limit of virtual NICs supported per host, removing previous limitations, and reducing the downtime in data path during in-place upgrades. Refer to the VMware NSX-T Data Center Upgrade Guide for more details.
  • Reduction of VIB size in NSX-T: VMware NSX-T Data Center 3.1.0 has a smaller VIB footprint in all NSX host installations so that you are able to install ESX and other 3rd-party VIBs along with NSX on your hypervisors.
  • Enhancements to Physical Server installation of NSX-T: To simplify the workflow of installing VMware NSX-T Data Center on Physical Servers, the entire end-to-end physical server installation process is now performed through the NSX Manager. Running Ansible scripts to configure host network connectivity is no longer required.
  • ERSPAN support on a dedicated network stack with ENS: ERSPAN can now be configured on a dedicated network mirror stack (vmk mirror stack) and is supported with the enhanced NSX network switch (ENS), resulting in higher performance and throughput for ERSPAN Port Mirroring.
  • Singleton Manager with vSphere HA: NSX now supports the deployment of a single NSX Manager in production deployments. This can be used in conjunction with vSphere HA to recover a failed NSX Manager. Please note that the recovery time for a single NSX Manager using backup/restore or vSphere HA may be much longer than the availability provided by a cluster of NSX Managers.
  • Log consistency across NSX components: Consistent logging format and documentation across different components of NSX so that logs can be easily parsed for automation and you can efficiently consume the logs for monitoring and troubleshooting.
  • Support for Rich Common Filters: Rich common filters are now supported for operations features such as packet capture, port mirroring, IPFIX, and latency measurements, increasing efficiency when using these features. Previously, these features had either very simple filters, which are not always helpful, or no filters at all.
  • CLI Enhancements: Several CLI related enhancements have been made in this release. See the NSX CLI Guide for more information.
    • CLI "get" commands are now accompanied by timestamps to help with debugging
    • GET / SET / RESET the Virtual IP (VIP) of the NSX Management cluster through the CLI
    • While debugging through the Central CLI, run ping commands for Edge Transport Nodes directly from Manager appliance nodes, eliminating the extra steps needed to log in to the respective Transport Node and do the same
    • View the list of cores on any NSX component through the CLI
    • Use the "*" operator in the CLI
    • Commands for debugging L2 Bridge through the CLI have also been introduced in this release
  • Distributed Load Balancer Traceflow: Traceflow now supports Distributed Load Balancer for troubleshooting communication failures from endpoints deployed in vSphere with Tanzu to a service endpoint via the Distributed Load Balancer.

Monitoring

  • Events and Alarms
    • Capacity Dashboard: Maximum Capacity, Maximum Capacity Threshold, Minimum Capacity Threshold
    • Edge Health: Standby move to different edge node, Datapath thread deadlocked, NSX-T Edge core file has been generated, Logical Router failover event, Storage Error
    • IDS/IPS: NSX-IDPS Engine Up/Down, NSX-IDPS Engine CPU Usage exceeded 75%, NSX-IDPS Engine CPU Usage exceeded 85%, NSX-IDPS Engine CPU Usage exceeded 95%, Max events reached, NSX-IDPS Engine Memory Usage exceeded 75%,
      NSX-IDPS Engine Memory Usage exceeded 85%, NSX-IDPS Engine Memory Usage exceeded 95%
    • IDFW: Connectivity to AD server, Errors during Delta Sync
    • Federation: GM to GM Split Brain
    • Communication: Control Channel to Transport Node Down, Control Channel to Transport Node Down for too Long, Control Channel to Manager Node Down, Control Channel to Manager Node Down for too Long, Management Channel to Transport Node Down, Management Channel to Transport Node Down for too Long, Manager FQDN Lookup Failure, Manager FQDN Reverse Lookup Failure
  • ERSPAN for ENS fast path: Supports port mirroring for the ENS fast path.
  • System Health Plugin Enhancements: System Health plugin enhancements and status monitoring of processes running on different nodes ensure that the system is running properly through timely detection of errors.
  • Live Traffic Analysis & Tracing: A live traffic analysis tool to support bi-directional traceflow between on-prem and VMC data centers.
  • Latency Statistics and Measurement for UA Nodes: Latency measurements between NSX Manager nodes per NSX Manager cluster and between NSX Manager clusters across different sites.
  • Performance Characterization for Network Monitoring using Service Insertion: To provide performance metrics for network monitoring using Service Insertion.

Usability and User Interface

  • Graphical Visualization of VPN: The Network Topology map now visualizes the VPN tunnels and sessions that are configured. This helps you quickly visualize and troubleshoot VPN configuration and settings.
  • Dark Mode: NSX UI now supports dark mode. You can toggle between light and dark mode.
  • Firewall Export & Import: NSX now provides the option for you to export and import firewall rules and policies as CSVs. 
  • Enhanced Search and Filtering: Improved the search indexing and filtering options for firewall rules based on IP ranges.
  • Reducing Number of Clicks: With this UI enhancement, NSX-T now offers a convenient and easy way to edit Network objects.

Licensing

  • Multiple license keys: NSX now has the ability to accept multiple license keys of the same edition and metric. This functionality allows you to maintain all your license keys without having to combine them.
  • License Enforcement: NSX-T now ensures that users are license-compliant by restricting access to features based on license edition. New users will be able to access only those features that are available in the edition that they have purchased. Existing users who have used features that are not in their license edition will be restricted to only viewing the objects; create and edit will be disallowed.
  • New VMware NSX Data Center Licenses: Adds support for new VMware NSX Firewall and NSX Firewall with Advanced Threat Prevention license introduced in October 2020, and continues to support NSX Data Center licenses (Standard, Professional, Advanced, Enterprise Plus, Remote Office Branch Office) introduced in June 2018, and previous VMware NSX for vSphere license keys. See the VMware Product Guide for more information about NSX licenses.

AAA and Platform Security

  • Security Enhancements for Use of Certificates And Key Store Management: With this architectural enhancement, NSX-T offers a convenient and secure way to store and manage a multitude of certificates that are essential for platform operations and be in compliance with industry and government guidelines. This enhancement also simplifies API use to install and manage certificates. 
  • Alerts for Audit Log Failures: Audit logs play a critical role in managing cybersecurity risks within an organization and are often the basis of forensic analysis, security analysis and criminal prosecution, in addition to aiding with diagnosis of system performance issues. Complying with NIST-800-53 and industry-benchmark compliance directives, NSX offers alert notification via alarms in the event of failure to generate or process audit data. 
  • Custom Role Based Access Control: Users desire the ability to configure roles and permissions that are customized to their specific operating environment. The custom RBAC feature allows granular feature-based privilege customization, giving NSX customers the flexibility to enforce authorization based on least-privilege principles. This will benefit users in fulfilling specific operational requirements or meeting compliance guidelines. Please note that in NSX-T 3.1, only Policy-based features are available for role customization.
  • FIPS - Interoperability with vSphere 7.x: Cryptographic modules in use with NSX-T are FIPS 140-2 validated since NSX-T 2.5. This change extends formal certification to incorporate module upgrades and interoperability with vSphere 7.0.

NSX Data Center for vSphere to NSX-T Data Center Migration

  • Migration of NSX for vSphere Environment with vRealize Automation: The Migration Coordinator now interacts with vRealize Automation (vRA) in order to migrate environments where vRealize Automation provides automation capabilities. This will offer a first set of topologies which can be migrated in an environment with vRealize Automation and NSX-T Data Center. Note: This will require support on vRealize Automation.
  • Modular Distributed Firewall Config Migration: The Migration Coordinator is now able to migrate firewall configuration and state from an NSX Data Center for vSphere environment to an NSX-T Data Center environment. This functionality allows a customer to migrate virtual machines (using vMotion) from one environment to the other and keep their firewall rules and state.
  • Migration of Multiple VTEP: The NSX Migration Coordinator now has the ability to migrate environments deployed with multiple VTEPs.
  • Increase Scale in Migration Coordinator to 256 Hosts: The Migration Coordinator can now migrate up to 256 hypervisor hosts from NSX Data Center for vSphere to NSX-T Data Center.
  • Migration Coordinator coverage of Service Insertion and Guest Introspection: The Migration Coordinator can migrate environments with Service Insertion and Guest Introspection. This will allow partners to offer a solution for migration integrated with the complete migration workflow.

Compatibility and System Requirements

For compatibility and system requirements information, see the NSX-T Data Center Installation Guide.

API Deprecations and Behavior Changes

Retention Period of Unassigned Tags: In NSX-T 3.0.x, NSX Tags with 0 Virtual Machines assigned are automatically deleted by the system after five days. In NSX-T 3.1.0, the system task has been modified to run on a daily basis, cleaning up unassigned tags that are older than one day. There is no manual way to force delete unassigned tags.

API and CLI Resources

See code.vmware.com to use the NSX-T Data Center APIs or CLIs for automation.
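
As a minimal illustration of API automation against an NSX Manager, the Python sketch below uses session-based authentication and reads the management cluster status. The manager address and credentials are placeholders, and certificate verification is disabled only for brevity.

    import requests

    NSX_MANAGER = "nsx-mgr.example.com"   # placeholder NSX Manager FQDN or IP

    session = requests.Session()
    session.verify = False  # lab convenience only; use a trusted certificate in production

    # Create an API session; the response carries a JSESSIONID cookie and an X-XSRF-TOKEN header.
    resp = session.post(
        f"https://{NSX_MANAGER}/api/session/create",
        data={"j_username": "admin", "j_password": "password"},  # placeholder credentials
    )
    resp.raise_for_status()
    session.headers["X-XSRF-TOKEN"] = resp.headers.get("X-XSRF-TOKEN", "")

    # Example read-only call: management cluster status.
    status = session.get(f"https://{NSX_MANAGER}/api/v1/cluster/status")
    status.raise_for_status()
    print(status.json().get("mgmt_cluster_status"))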

Available Languages

NSX-T Data Center has been localized into multiple languages: English, German, French, Japanese, Simplified Chinese, Korean, Traditional Chinese, and Spanish. Because NSX-T Data Center localization utilizes the browser language settings, ensure that your settings match the desired language.

Document Revision History

October 29, 2020. First edition.
November 13, 2020. Second edition. Added known issues 2659168, 2655295, 2658950, 2609681, 2562189, 2657943, 2550492, 2653227, 2636771, 2637241, 2658092, 2643610, 2658199, 2555383, 2658713, 2662225, 2610851, 2652154, 2622576, 2587257, 2659234, 2622846, 2656929, 2587513, 2639671, 2645877.
March 11, 2021. Third edition. Added BGP enhancement to L3 Networking in What's New.
April 22, 2021. Fourth edition. Added edge networking performance improvements to Edge Platform and Services in What's New. Added known issues 2738345 and 2740587. Added resolved issue 2628121.
September 17, 2021. Fifth edition. Added known issue 2761589.
October 25, 2022. Sixth edition. Updated the "CLI Enhancements" section.

Resolved Issues

  • Fixed Issue 2462079: Some versions of ESXi hosts reboot during upgrade if there are stale DV filters present on the ESXi host.

    For hosts running ESXi 6.5-U2/U3 and/or 6.7-U1/U2, during maintenance mode upgrade to NSX-T 2.5.1, the host may reboot if stale DV filters are found to be present on the host after VMs are moved out.

  • Fixed Issue 2475963: NSX-T VIBs fail to install due to insufficient space.

    NSX-T VIBs fail to install due to insufficient space in bootbank on ESXi host, returning a BootBankInstaller.pyc: ERROR. Some ESXi images provided by third-party vendors may include VIBs which are not in use and can be relatively large in size. This can result in insufficient space in bootbank/alt-bootbank when installing/upgrading any VIBs.

  • Fixed Issue 2538956: DHCP Profile shows a "NOT SET" message and the Apply button is disabled when configuring Gateway DHCP on a Segment.

    When attempting to configure Gateway DHCP on Segment when there is no DHCP configured on the connected Gateway, the DHCP Profile cannot be applied because there is no valid DHCP to be saved.

  • Fixed Issue 2538041: Groups containing Manager Mode IP Sets can be created from Global Manager.

    Global Manager allows you to create Groups that contain IP Sets that were created in Manager Mode. The configuration is accepted but the groups do not get realized on Local Managers.

  • Fixed Issue 2463947: When preemptive mode HA is configured, and IPSec HA is enabled, upon double failover, packet drops over VPN are seen.

    Traffic over VPN will drop on peer side. IPSec Replay errors will increase.

  • Fixed Issue 2540733: Service Instance is not created after re-adding the same host in the cluster.

    Service Instance in NSX is not created after re-adding the same host in the cluster, even though the service VM is present on the host. The deployment status will be shown as successful, but protection on the given host will be down.

  • Fixed Issue 2530822: Registration of vCenter with NSX manager fails even though NSX-T extension is created on vCenter.

    While registering vCenter as compute manager in NSX, even though the "com.vmware.nsx.management.nsxt" extension is created on vCenter, the compute manager registration status remains "Not Registered" in NSX-T. Operations on vCenter, such as auto install of edge etc., cannot be performed using the vCenter Server compute manager.

  • Fixed Issue 2532755: Inconsistencies between CLI output and policy output for routing-table.

    The routing table downloaded from the UI has extra routes compared to the CLI output: an additional default route is listed in the output downloaded from policy. There is no functional impact.

  • Fixed Issue 2534855: Route maps and redistribution rules of Tier-0 gateways created on the simplified UI or policy API will replace the route maps and redistribution rules created on the advanced UI (or MP API).

    During upgrades, any existing route maps and rules that were created on the simplified UI (or policy API) will replace the configurations that were done directly on the advanced UI (or MP API).

  • Fixed Issue 2535355: Session timer may not take effect after upgrading to NSX-T 3.0 under certain circumstances.

    Session timer setting is not taking effect. The connection session (e.g., tcp established, tcp fin wait) will use its system default session timer instead of the custom session timer. This may cause the connection (tcp/udp/icmp) session to be established longer or shorter than expected.

  • Fixed Issue 2518183: For Manager UI screens, the Alarms column does not always show the latest alarm count.

    Recently generated alarms are not reflected on Manager entity screens.

  • Fixed Issue 2543353: NSX T0 edge calculates incorrect UDP checksum post-ESP encapsulation for IPsec tunneled traffic.

    Traffic is dropped due to bad checksum in UDP packet.

  • Fixed Issue 2556730: When configuring an LDAP identity source, authentication via LDAP Group -> NSX Role Mapping does not work if the LDAP domain name is configured using mixed case.

    Users who attempt to log in are denied access to NSX.

  • Fixed Issue 2572052: Scheduled backups might not get generated.

    In some corner cases, scheduled backups are not generated.

  • Fixed Issue 2557166: Distributed Firewall rules using context-profiles (layer 7) are not working as expected when applied to Kubernetes pods.

    After configuring L7 rules on Kubernetes pods, traffic that is supposed to match L7 rules is hitting the default rule instead.

  • Fixed Issue 2486119: PNICs are migrated from NVDS back to VDS uplinks with mapping that is different from the original mapping in VDS.

    When a Transport Node is created with a Transport Node Profile that has PNIC install and uninstall mappings, PNICs are migrated from VDS to NVDS. Later when NSX-T Data Center is removed from the Transport Node, the PNICs are migrated back to the VDS, but the mapping of PNIC to uplink may be different from the original mapping in VDS.

  • Fixed Issue 2628634: Day2tools will try to migrate TN from NVDS to CVDS even after "vds-migrate disable-migrate" is called.

    The NVDS to CVDS migration will fail and the host will be left in maintenance mode.

  • Fixed Issue 2607196: Service Insertion (SI) and Guest Introspection (GI) that use Host base deployments are not supported for NVDS to CVDS Migration.

    Cannot migrate Transport Nodes with NVDS using NVDS to CVDS tool.

  • Fixed Issue 2586606: Load balancer does not work when Source-IP persistence is configured on a large number of virtual servers. 

    When Source-IP persistence is configured on a large number of virtual servers on a load balancer, it consumes a significant amount of memory and may lead to the NSX Edge running out of memory. However, the issue can reoccur with the addition of more virtual servers.

  • Fixed Issue 2540352: No backups are available in the Restore Backup window for CSM.

    When restoring a CSM appliance from a backup, you enter the details of the backup file server in the Restore wizard but a list of backups does not appear in the UI even though it is available on the server.

  • Fixed Issue 2605420: UI shows general error message instead of specific one indicating Local Manager VIP changes.

    Global Manager to site communication is impacted.

  • Fixed Issue 2638571: Deleting 5000 NAT rules sometimes takes more than 1 hour.

    NAT rules are still visible in the UI but grayed out. You have to wait for their cleanup before creating NAT rules with the same name. There is no impact with a different name.

  • Fixed Issue 2629422: The message shown in the UI is incomplete when the system tries to onboard a site that has a DNS service on a Tier-1 gateway in addition to an LB service.

    Onboarding is blocked for a Tier-1 gateway that offers DNS/DHCP service in addition to one-arm LB service; this blocking is expected.
    The UI shows possible resolution text that references only the DHCP service, but the same resolution applies to the DNS service as well.

  • Fixed Issue 2328126: Bare Metal issue: Linux OS bond interface when used in NSX uplink profile returns error.

    When you create a bond interface in the Linux OS and then use this interface in the NSX uplink profile, you see this error message: "Transport Node creation may fail." This issue occurs because VMware does not support Linux OS bonding. However, VMware does support Open vSwitch (OVS) bonding for Bare Metal Server Transport Nodes.

  • Fixed Issue 2390624: Anti-affinity rule prevents service VM from vMotion when host is in maintenance mode.

    If a service VM is deployed in a cluster with exactly two hosts, the HA pair with anti-affinity rule will prevent the VMs from vMotioning to the other host during any maintenance mode tasks. This may prevent the host from entering Maintenance Mode automatically.

  • Fixed Issue 2389993: Route map removed after redistribution rule is modified using the Policy page or API.

    If there is a route-map added using the management plane UI/API in a Redistribution Rule, it will get removed if you modify the same Redistribution Rule from the Simplified (Policy) UI/API.

  • Fixed Issue 2400379: Context Profile page shows unsupported APP_ID error message.

    The Context Profile page shows the following error message: "This context profile uses an unsupported APP_ID - [<APP_ID>]. Please delete this context profile manually after making sure it is not being used in any rule." This is caused by the post-upgrade presence of six deprecated APP_IDs (AD_BKUP, SKIP, AD_NSP, SAP, SUNRPC, SVN) that no longer work on the data path.

  • Fixed Issue 2628121: For certain configurations where two Security Groups are configured with the same dynamic criteria, there is a rare condition that a compute VM matching the criteria may not be added in the Security group.

    For certain configurations where two Security Groups are configured with the same dynamic criteria, there is a rare condition that a compute VM matching the criteria may not be added in the Security group.

Known Issues

The known issues are grouped as follows.

General Known Issues
  • Issue 2320529: "Storage not accessible for service deployment" error thrown after adding third-party VMs for newly added datastores.

    "Storage not accessible for service deployment" error thrown after adding third-party VMs for newly added datastores even though the storage is accessible from all hosts in the cluster. This error state persists for up to thirty minutes.

    Workaround: Retry after thirty minutes. As an alternative, make the following API call to update the cache entry of the datastore (a sketch of this call follows this issue):

    https://<nsx-manager>/api/v1/fabric/compute-collections/<CC Ext ID>/storage-resources?uniform_cluster_access=true&source=realtime

    where <nsx-manager> is the IP address of the NSX Manager where the service deployment API has failed, and <CC Ext ID> is the identifier in NSX of the cluster where the deployment is being attempted.
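
    A minimal sketch of that call in Python is shown below; the manager address, credentials, and compute-collection external ID are placeholders, and certificate verification is disabled only for brevity.

      import requests

      NSX_MANAGER = "192.0.2.10"       # NSX Manager where the deployment API failed
      CC_EXT_ID = "<CC Ext ID>"        # compute-collection external ID from NSX

      url = (
          f"https://{NSX_MANAGER}/api/v1/fabric/compute-collections/"
          f"{CC_EXT_ID}/storage-resources"
      )
      params = {"uniform_cluster_access": "true", "source": "realtime"}

      # Refreshes the datastore cache entry as described in the workaround above.
      response = requests.get(url, params=params, auth=("admin", "password"), verify=False)
      response.raise_for_status()
      print(response.status_code)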

  • Issue 2329273: No connectivity between VLANs bridged to the same segment by the same edge node.

    Bridging a segment twice on the same edge node is not supported. However, it is possible to bridge two VLANs to the same segment on two different edge nodes.

    Workaround: None 

  • Issue 2355113: Unable to install NSX Tools on RedHat and CentOS Workload VMs with accelerated networking enabled in Microsoft Azure.

    In Microsoft Azure when accelerated networking is enabled on RedHat (7.4 or later) or CentOS (7.4 or later) based OS and with NSX Agent installed, the ethernet interface does not obtain an IP address.

    Workaround: After booting up RedHat or CentOS based VM in Microsoft Azure, install the latest Linux Integration Services driver available at https://www.microsoft.com/en-us/download/details.aspx?id=55106 before installing NSX tools.

  • Issue 2370555: User can delete certain objects in the Advanced interface, but deletions are not reflected in the Simplified interface.

    Specifically, groups added as part of a distributed firewall exclude list can be deleted in the Advanced interface Distributed Firewall Exclusion List settings. This leads to inconsistent behavior in the interface.

    Workaround: Use the following procedure to resolve this issue:

    1. Add an object to an exclusion list in the Simplified interface.
    2. Verify that it appears in the Distributed Firewall exclusion list in the Advanced interface.
    3. Delete the object from the Distributed Firewall exclusion list in the Advanced interface.
    4. Return to the Simplified interface, add a second object to the exclusion list, and apply it.
    5. Verify that the new object appears in the Advanced interface.
  • Issue 2520803: Encoding format for Manual Route Distinguisher and Route Target configuration in EVPN deployments.

    You currently can configure manual route distinguisher in both Type-0 encoding and in Type-1 encoding. However, using the Type-1 encoding scheme for configuring Manual Route Distinguisher in EVPN deployments is highly recommended. Also, only Type-0 encoding for Manual Route Target configuration is allowed.

    Workaround: Configure only Type-1 encoding for Route Distinguisher.

  • Issue 2490064: Attempting to disable VMware Identity Manager with "External LB" toggled on does not work.

    After enabling VMware Identity Manager integration on NSX with "External LB", if you attempt to then disable integration by switching "External LB" off, after about a minute, the initial configuration will reappear and overwrite local changes.

    Workaround: When attempting to disable vIDM, do not toggle the External LB flag off; only toggle off vIDM Integration. This will cause that config to be saved to the database and synced to the other nodes.

  • Issue 2537989: Clearing VIP (Virtual IP) does not clear vIDM integration on all nodes.

    If VMware Identity Manager is configured on a cluster with a Virtual IP, disabling the Virtual IP does not result in the VMware Identity Manager integration being cleared throughout the cluster. You will have to manually fix vIDM integration on each individual node if the VIP is disabled.

    Workaround: Go to each node individually to manually fix the vIDM configuration on each.

  • Issue 2525205: Management plane cluster operations fail under certain circumstances.

    When attempting to join Manager N2 to Manager N1 by issuing a "join" command on Manager N2, the join command fails. You are unable to form a Management plane cluster, which might impact availability.

    Workaround:

    1. To retain Manager N1 in the cluster, issue a "deactivate" CLI command on Manager N1. This will remove all other Managers from the cluster, keeping Manager N1 as the sole member of the cluster.
    2. Ensure that the non-configuration Corfu server is up and running on Manager N1 by issuing the "systemctl start corfu-nonconfig-server" command.
    3. Join other new Managers to the cluster by issuing "join" commands on them.
  • Issue 2526769: Restore fails on multi-node cluster.

    When starting a restore on a multi-node cluster, restore fails and you will have to redeploy the appliance.

    Workaround: Deploy a new setup (one node cluster) and start the restore.

  • Issue 2523212: The nsx-policy-manager becomes unresponsive and restarts.

    API calls to nsx-policy-manager will start failing, with service being unavailable. You will not be able to access policy manager until it restarts and is available.

    Workaround: Invoke API with at most 2000 objects.

  • Issue 2521071: For a Segment created in Global Manager, if it has a BridgeProfile configuration, then the Layer2 bridging configuration is not applied to individual NSX sites.

    The consolidated status of the Segment will remain at "ERROR". This is due to a failure to create the bridge endpoint at a given NSX site. You will not be able to successfully configure a BridgeProfile on Segments created via Global Manager.

    Workaround: Create a Segment at the NSX site and configure it with bridge profile.

  • Issue 2527671: When the DHCP server is not configured, retrieving DHCP statistics/status on a Tier0/Tier1 gateway or segment displays an error message indicating realization is not successful.

    There is no functional impact. The error message is incorrect and should report that the DHCP server is not configured.

    Workaround: None.

  • Issue 2532127: An LDAP user can't log in to NSX if the user's Active Directory entry does not contain the UPN (userPrincipalName) attribute and contains only the samAccountName attribute.

    User authentication fails and the user is unable to log in to the NSX user interface.

    Workaround: None.

  • Issue 2482580: IDFW/IDS configuration is not updated when an IDFW/IDS cluster is deleted from vCenter.

    When a cluster with IDFW/IDS enabled is deleted from vCenter, the NSX management plane is not notified of the necessary updates. This results in inaccurate count of IDFW/IDS enabled clusters. There is no functional impact. Only the count of the enabled clusters is wrong.

    Workaround: None.

  • Issue 2534933: Certificates that have LDAP based CDPs (CRL Distribution Point) fail to apply as tomcat/cluster certs.

    You can't use CA-signed certificates that have LDAP CDPs as cluster/tomcat certificate.

    Workaround: See VMware knowledge base article 78794.

  • Issue 2499819: Maintenance-based NSX for vSphere to NSX-T Data Center host migration for vCenter 6.5 or 6.7 might fail due to vMotion error.

    This error message is shown on the host migration page:
    Pre-migrate stage failed during host migration [Reason: [Vmotion] Can not proceed with migration: Max attempt done to vmotion vm b'3-vm_Client_VM_Ubuntu_1404-shared-1410'].

    Workaround: Retry host migration.

  • Issue 2557287: TNP updates done after backup are not restored.

    You won't see any TNP updates done after backup on a restored appliance.

    Workaround: Take a backup after any updates to TNP.

  • Issue 2549175: Searching in policy fails with the message: "Unable to resolve with start search resync policy."

    Searching in policy fails because search is out of sync when the NSX Manager nodes are provided with new IPs.

    Workaround: Ensure that the DNS PTR records (IP to hostname mappings in the DNS server) for all the NSX Managers are correct. A quick check is sketched after this issue.
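
    The following sketch, using only the Python standard library, is one quick way to confirm that the PTR records resolve; the manager IPs are placeholders.

      import socket

      # Placeholder NSX Manager node IPs; replace with the addresses of your cluster.
      MANAGER_IPS = ["192.0.2.11", "192.0.2.12", "192.0.2.13"]

      for ip in MANAGER_IPS:
          try:
              hostname, _, _ = socket.gethostbyaddr(ip)
              print(f"{ip} -> {hostname}")
          except OSError as err:
              print(f"{ip} has no usable PTR record: {err}")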

  • Issue 2588072: NVDS to CVDS switch migrator doesn't support Stateless ESX with vmks.

    NVDS to CVDS switch migrator cannot be used for migration with Stateless ESX hosts if the NVDS switch has vmks on it.

    Workaround: Either migrate out the vmks from NVDS or remove that host from NSX and perform migration.

  • Issue 2627439: If a transport node profile is applied on a cluster before migration, one extra transport node profile is created by the system in detached state after migration.

    There will be one extra transport node profile generated for each original transport node profile.

    Workaround: None.

  • Issue 2392064: Edge stage migration fails with a "Failed to fetch error list" error.

    Migration fails but the reason for the failure (DHCP plugin exception) is not shown.

    Workaround: Roll back migration and retry.

  • Issue 2468774: When option 'Detect NSX configuration change' is enabled, backups are taken even when there is no configuration change.

    Too many backups are taken because backups are triggered even when there are no configuration changes.

    Workaround: Increase the time associated with this option, thereby reducing the rate at which backups are taken.

  • Issue 2523421: LDAP authentication does not work properly when configured with an external load balancer (configured with round-robin connection persistence).

    The API LDAP authentication won't work reliably and will only work if the load balancer forwards the API request to a particular Manager.

    Workaround: None.

  • Issue 2527344: If selecting the Type as "TIER0_EVPN_TEP_IP" in Route redistribution of Tier 0 VRF LR, the TIER0_EVPN_TEP_IP is not redistributed into BGP as it is present in Parent Tier 0 LR, causing the datapath to break.

    Doing so will not advertise the TIER0_EVPN_TEP_IP to DC gateway (Peer BGP Router). In Tier 0 VRF LR Route Redistribution, you can select the Type as "TIER0_EVPN_TEP_IP", but there are no routes of "TIER0_EVPN_TEP_IP" on Tier 0 VRF LR.

    Workaround: Select "TIER0_EVPN_TEP_IP" in Route Redistribution of Parent Tier 0 LR, instead on VRF Tier 0 LR, and remove the option to select it from VRF Tier 0 LR.

  • Issue 2534921: Not specifying inter_sr_ibgp property in a PATCH API call will prevent other fields from being updated in the BgpRoutingConfig entity.

    The PATCH API call fails to update the BGP routing config entity with the error message "BGP inter SR routing requires global BGP and ECMP flags enabled." The BgpRoutingConfig will not be updated.

    Workaround: Specify inter_sr_ibgp property in the PATCH API call to allow other fields to be changed.

  • Issue 2566121: A UA node stopped accepting any new API calls with the message, "Some appliance components are not functioning properly."

    The UA node stops accepting any new API calls with the message, "Some appliance components are not functioning properly." There are around 200 connections stuck in CLOSE_WAIT state. These connections are not yet closed. New API calls are rejected.

    Workaround: Restart proton service (service proton restart) or restart unified appliance node.

  • Issue 2573975: While configuring an ANY-to-ANY SNAT rule, specifying the source-network/destination-network/translated-network properties as empty strings ("") will result in a realization failure on the management plane.

    The rule will not get realized on the edge and management plane.

    Workaround: For the source-network/destination-network/translated-network properties, either specify the correct address or set the values to null. A sketch of this follows this issue.
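
    The sketch below illustrates the workaround with a Policy API PATCH in Python. The Tier-1 and rule identifiers are placeholders, the resource path is an assumption based on the documented Policy NAT layout (verify it against the NSX-T REST API guide), and the unused address properties are sent as null rather than empty strings.

      import requests

      NSX_MANAGER = "nsx-mgr.example.com"   # placeholder NSX Manager FQDN or IP
      TIER1_ID = "tier1-example"            # placeholder Tier-1 gateway ID
      RULE_ID = "snat-any-any"              # placeholder NAT rule ID

      # Assumed Policy NAT rule path; confirm against your NSX-T API documentation.
      url = (
          f"https://{NSX_MANAGER}/policy/api/v1/infra/tier-1s/"
          f"{TIER1_ID}/nat/USER/nat-rules/{RULE_ID}"
      )

      payload = {
          "action": "SNAT",
          "source_network": None,                 # null instead of "" per the workaround
          "destination_network": None,            # null instead of ""
          "translated_network": "203.0.113.10",   # placeholder translated address
      }

      response = requests.patch(url, json=payload, auth=("admin", "password"), verify=False)
      response.raise_for_status()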

  • Issue 2574281: Policy will only allow a maximum of 500 VPN Sessions.

    NSX claims support for 512 VPN sessions per edge in the large form factor; however, because Policy does auto plumbing of security policies, Policy will only allow a maximum of 500 VPN sessions. Upon configuring the 501st VPN session on Tier-0, the following error message is shown:
    {'httpStatus': 'BAD_REQUEST', 'error_code': 500230, 'module_name': 'Policy', 'error_message': 'GatewayPolicy path=[/infra/domains/default/gateway-policies/VPN_SYSTEM_GATEWAY_POLICY] has more than 1,000 allowed rules per Gateway path=[/infra/tier-0s/inc_1_tier_0_1].'}

    Workaround: Use Management Plane APIs to create additional VPN Sessions.

  • Issue 2596162: Unable to update the nsxaHealthStatus for a switch when the switch name contains a single quote.

     NSX configuration state is at partial success because the health status of a switch could not be updated. 

    Workaround: Change the host switch name so that it does not have any single quotes.

  • Issue 2596696: NsxTRestException observed in policy logs when creating SegmentPort from the API.

    NsxTRestException observed in policy logs. The SegmentPort cannot be created using the API.

    Workaround: Either populate the Id field in PortAttachmentDto or pass it as null in the API input.

  • Issue 2610718: Attempting to wire vIDM to NSX using the nsx-cli fails if lb_enable and vidm_enable flags are not explicitly specified.

    The error, "An error occurred attempting to update the vidm properties" will appear. You will only be able to wire vIDM using the UI or directly through REST API, or only through CLI while explicitly defining lb_enable and vidm_enable flags.

    Workaround: Treat the vidm_enable or lb_enable flags as non-optional when using nsx-cli to wire vIDM.

  • Issue 2628503: DFW rule remains applied even after forcefully deleting the manager nsgroup.

    Traffic may still be blocked when forcefully deleting the nsgroup.

    Workaround: Do not forcefully delete an nsgroup that is still used by a DFW rule. Instead, make the nsgroup empty or delete the DFW rule.

  • Issue 2631703: When doing backup/restore of an appliance with vIDM integration, vIDM configuration will break.

    Typically, when an environment has been both upgraded and/or restored, attempting to restore an appliance where vIDM integration is up and running will cause that integration to break, and you will have to reconfigure it.

    Workaround: After restore, manually reconfigure vIDM.

  • Issue 2362271: NVDS to CVDS migration is not supported when VMKs with pinned pNICs are attached to a logical switch backed by NVDS.

    NVDS to CVDS migration will fail in the migration phase.

    Workaround: None.

  • Issue 2638673: SRIOV vNICs for VMs are not discovered by inventory.

    SRIOV vNICs are not listed in Add new SPAN session dialog. You will not see SRIOV vNICs when adding new SPAN session.

    Workaround: None.

  • Issue 2641824: On UI BFD profile realization status is shown as "uninitialized."

    There is no impact. This realization status can be ignored.

    Workaround: None.

  • Issue 2643313: The realization state for a successfully realized Global T0 on an onboarded site incorrectly shows the "logical router port configuration realization error" for site edge transport nodes.

    You may notice realization failure with the error in realization of a T0 logical router port. If using policy API, this is transient and will resolve when T0-T1 plumbing completes.

    Workaround: If you are using management plane or management plane APIs, delete the logical router port created on T0 for connecting to a T1, or complete the connectivity with a T1.

  • Issue 2646814: Overall consolidated status shows IN_PROGRESS whereas individual consolidated status per enforcement point shows SUCCESS.

    The consolidated status shows IN_PROGRESS but it does not give information on which site the status is IN_PROGRESS.

    Workaround: None. The status is reflected properly once sync completes.

  • Issue 2647620: In an NSX configured environment with a large number of Stateless Hosts (TransportNodes), workload VMs on some Stateless hosts may lose connectivity temporarily when upgrading Management Plane nodes to 3.0.0 and above.

    This is applicable only to Stateless ESX Hosts configured for NSX 3.0.0 and above.

    Workaround: None.

  • Issue 2648349: Repeated ObjectNotFoundException log and exception backtrace when Management Plane is collecting observation for traceflow or trace action result for livetrace.

    You will see repeated ObjectNotFoundException log in nsxapi.log when you create an LTA/Traceflow session.

    Workaround: None.

  • Issue 2654842: A segment port can be created before the Windows physical server transport node creation is successful, causing the segment port to go to fail state.

    There is no visible UI indication that you need to wait until the transport node state is Success before proceeding with the Segment Port creation for Windows physical servers. If you try to create the Segment Port before the transport node is successful, it will fail and the Host will completely disconnect.

    Workaround: While installing a Windows physical server, select the continue later option when you reach step 3 of the wizard.
    When transport node is successfully installed later, create a segment port from the manage segment action.

  • Issue 2658484: NVDS to CVDS migration is not fully supported via vSphere Update Manager "parallel mode."

    NVDS to CVDS migration issues may occur if migration is triggered by upgrading ESX via vSphere Update Manager in Parallel Mode. Switch will not be automatically migrated from NVDS to CVDS. System might be left in some inconsistent state.

    Workaround: 

    Set the number of hosts to be upgraded to a maximum of 4 in Update Manager parallel mode.

    If you have already run the Update Manager upgrade without a limit, based on the state of the system, use one of these options:

    • Trigger migration manually for each Host.

    Or

    • Delete existing TransportNodes with NVDS switches and recreate them with CVDS switches. This will require you to move out workloads first from NVDS switch and then bring them back to CVDS switch based TN.
  • Issue 2658676: While creating security policies containing 1000 rules each, the API fails.

    Failure occurs while creating a section with a large number of rules.

    Workaround: None.

  • Issue 2649228: IPv6 duplicate IP detection (DAD) state machine introduces 3 second delay on T0/T1 SR back plane port (and for T1 SR uplink port), when the IP is moved from one SR to another during failover. 

    N-S connectivity has up to 6 seconds of IPv6 traffic loss during failover with T0 A/A.

    Workaround: Avoid asymmetric traffic flows and rely on BGP 1/3 timers to detect HA failover at T0 uplink.

  • Issue 2623704: If you use the Management Plane API to configure L3 forwarding mode, Policy will overwrite Management Plane change with default mode IPV4_ONLY and will disrupt IPv6 connectivity.

    L3 forwarding mode configuration was introduced in Policy in the NSX-T 3.0.0 release. This issue impacts NSX-T 3.0.1 and 3.0.2 upgrades to NSX-T 3.1.0. Upgrades from NSX-T 2.5.x to NSX-T 3.1.0 and NSX-T 3.0.0 to NSX-T 3.1.0 are not affected. IPv6 forwarding is disabled in the data path, resulting in IPv6 connectivity loss.

    Workaround: See VMware knowledge base article 81349 for details.

  • Issue 2639424: Remediating a Host in a vLCM cluster with Host-based Service VM Deployment will fail after 95% Remediation Progress is completed.

    The remediation progress for a Host will be stuck at 95% and then Fail after 70 minute timeout is completed.

    Workaround: See VMware knowledge base article 81447.

  • Issue 2636855: Maximum capacity alarm raised when System-wide Logical Switch Ports is over 25K.

    A maximum capacity alarm is raised when System-wide Logical Switch Ports exceed 25K. However, for a PKS scale environment the limit for container ports is 60K, so more than 25K Logical Switch Ports in a PKS environment is a normal case.

    Workaround: None.

  • Issue 2609681: DFW jumpto rule action is not supported on rules having Layer 7 APPID or FQDN Context profiles.

    The Traffic will not match the intended rule after vMotion. Rule match will not be correct, allowing the traffic to pass through if it was supposed to be blocked.

    Workaround: None.

  • Issue 2653227: When removing the Physical Server from NSX, connectivity is lost to Physical Server and NSX uninstall fails.

    The attached interface of the segment port on the Physical Server is configured as "Using existing IP." When removing the Physical Server from NSX without removing the segment port first, connectivity to the Physical Server is lost and the NSX uninstall fails.

    Workaround: Remove the Segment Port first before uninstalling Physical Server.

  • Issue 2636771: Search can return a resource when the resource is tagged with multiple tag pairs and the tag and scope match any value of tag and scope.

    This affects search queries with conditions on tag and scope. The filter may return extra data if tag and scope match any pair.

    Workaround: None.

  • Issue 2643610: Load balancer statistics APIs are not returning stats.

    The stats in the API response are not set. You can't see load balancer stats.

    Workaround: Reduce the number of load balancers configured.

  • Issue 2658199:  When adding Windows Bare Metal Server 2016/2019, a "host disconnected" error displays at the Applying NSX Switch Configuration step.

    Regardless of the error message -- which may or may not be an issue -- the installation appears to continue and eventually finish properly.

    Workaround: Do not resolve right away and wait for the installation to finish.

  • Issue 2555383: Internal server error during API execution.

    Internal server error observed during API call execution. API will result in 500 error and not give the desired output.

    Workaround: This error is encountered because the session is invalidated. In this case, re-execute the session creation api to create a new session.

  • Issue 2658713: When workload connected to T0 segment sends an IGMP leave for group G, the entry is not removed from the IGMP snooping table on T0 SR.

    When workload connected to T0 segment sends an IGMP leave for group G, the entry is not removed from the IGMP snooping table on T0 SR.

    Workaround: None.

  • Issue 2662225: When the active edge node becomes a non-active edge node while an S->N traffic stream is flowing, traffic loss is experienced.

    The current S->N stream is running on the multicast active node. The preferred route on the TOR to the source should be through the multicast active edge node only.
    Bringing up another edge can take over the multicast active node role (the lower-rank edge becomes the active multicast node). The current S->N traffic will experience loss for up to four minutes. This will not impact a new stream, or the current stream if it is stopped and started again.

    Workaround: Current S->N traffic will recover automatically within 3.5 to 4 minutes. For faster recovery, disable multicast and enable through configuration.

  • Issue 2610851: Namespaces, Compute Collection, and L2VPN Service grid filtering might return no data for a few combinations of resource type filters.

    Applying multiple filters for a few types at the same time returns no results even though matching data is available. This is not a common scenario; filtering fails only on these grids for the following combinations of filter attributes:

    • For Namespaces grid ==> On Cluster Name and Pods Name filter
    • For Network Topology page  ==> On L2VPN service applying a remote ip filter
    • For Compute Collection ==> On ComputeManager filter

    Workaround: You can apply one filter at a time for these resource types.

  • Issue 2587257: In some cases, PMTU packet sent by NSX-T edge is ignored upon receipt at the destination.

    PMTU discovery fails resulting in fragmentation and reassembly, and packet drop. This results in performance drop or outage in traffic.

    Workaround: None.

  • Issue 2659234: Live Traffic Analysis (LTA) sessions are not freed when you trigger an LTA request in parallel.

    When there is an uncleared LTA session:

    1. Host will continuously report LTA result to Management Plane (MP), so MP always receives the leak LTA session and print log in nsxapi.log.
    2. Packets sent from a host that has an uncleared LTA session will be padded with the LTA INT Geneve header.

    Workaround: Reboot the ESXi host where the LTA session exists.

  • Issue 2622846: IDS with proxy settings enabled cannot access GitHub, which is used to download signatures.

    New updates of signature bundle downloads will fail.

    Workaround: Use the offline download feature to download and upload signatures in case of no network connectivity.

  • Issue 2656929: If the Windows 2016/2019 Physical Server configuration is not completed, segment port creation fails when attempting to create a segment port.

    Segment port will show a failure.

    Workaround: Wait for the Configuration State of transport node to change to Success, then create segment port.

  • Issue 2587513: Policy shows error when multiple VLAN ranges are configured in bridge profile binding.

    You will see an "INVALID VLAN IDs" error message.

    Workaround: Create multiple bridge endpoints with the VLAN ranges on the segment instead of one with all VLAN ranges.

  • Issue 2639671: A REST API client calls the /fabric/nodes API every 2 minutes but breaks the connection before the API response can be returned.

    No functional impact but 'java.io.IOException: Broken pipe' errors are seen in syslog.

    Workaround: None.

  • Issue 2645877: In Bare Metal Edge, traffic going through a VRF gateway may not reach its destination.

    Packet capture on the link partner shows that the packets are received with no checksum, but a fragmentation offset in the IP header. Traffic crossing the VRF gateway is dropped because the packets are corrupted.

    Workaround: To take advantage of the VRF Lite feature, use VM edges.

  • Issue 2651036: Cluster Remediation in vSphere Lifecycle Manager may fail on hosts with lockdown mode enabled.

    During vSphere cluster remediation, the operation may fail with NSX showing the hosts in a "skipped host" state. Log files may show messages, "Host scan task failed", "com.vmware.vcIntegrity.lifecycle.EsxImage.UnknownError<An unknown error occurred while performing the operation.>"

    Workaround: Add the root user to the exception list and retry the operation.

  • Issue 2534089: When the IDS service is enabled on a transport node (host), virtual machine traffic on IDS-enabled hosts will stop flowing unexpectedly.

    When enabling the NSX IDS/IPS (in either detect-only or detect-and-prevent mode) on a vSphere cluster and applying IDS/IPS to workloads, the lockup condition can get triggered just by having the IDPS engine enabled. As a result, all traffic to and from all workloads on the hypervisor subject to IDS/IPS or Deep Packet Inspection Services (L7 App-ID) will be dropped. Traffic not subject to IDS/IPS or Deep Packet Inspection is not impacted, and as soon as IDS/IPS is disabled or no longer applied to traffic, traffic flow is restored.

    Workaround: See VMware knowledge base article 82043.

  • Issue 2738345: BGP extended large community fails when it is configured with regex.

    If the BGP extended large community is configured with a regex, the FRR CLI fails and the configuration does not take effect, so BGP route filtering will not work.

    Workaround: None.

  • Issue 2740587: When the Tier-0 uplink is deleted before disabling BFD or deleting the BGP neighbor, the routing module keeps the BFD configuration as enabled.

    No functional impact, but the Tier-0 HA status goes down because there is no northbound connectivity once the last uplink is deleted.

    Workaround: Disable the BFD or delete the BGP neighbor before deleting the uplink.

  • Issue 2761589: Default layer 3 rule configuration changes from DENY_ALL to ALLOW_ALL on Management Plane after upgrading from NSX-T 2.x to NSX-T 3.x.

    This issue occurs only when rules are not configured via Policy, and the default layer 3 rule on the Management Plane has the DROP action. After upgrade, the default layer 3 rule configuration changes from DENY_ALL to ALLOW_ALL on Management Plane.

    Workaround: Set the action of the default layer 3 rule to DROP from the Policy UI post upgrade.

Installation Known Issues
  • Fixed Issue 2522909: Service VM upgrade does not work after correcting the URL if the upgrade deployment failed with an invalid URL.

    The upgrade would be in a failed state with the wrong URL, blocking the upgrade.

  • Issue 2538492: Some transport nodes are stuck in pending state after restore operation.

    The affected transport nodes show up in pending state after the restore operation. The transport node status goes to pending, and configurations in the restored database may not be pushed to the transport node.

    Workaround: Use the following POST API to reapply the existing transport node configuration.

    POST /api/v1/transport-nodes/<transportnode-id>?action=resync_host_config

    This resyncs the transport node configuration on a host. It is similar to updating the transport node with its existing configuration, but it force-syncs the configuration to the host (no backend optimizations).
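
    For example, using curl with basic authentication (the transport node ID can be looked up with GET /api/v1/transport-nodes; placeholders are in angle brackets):

    # Look up the ID of the transport node that is stuck in pending state.
    curl -k -u admin:<password> "https://<nsx-manager>/api/v1/transport-nodes"

    # Force-sync the existing configuration back to that host.
    curl -k -u admin:<password> -X POST \
      "https://<nsx-manager>/api/v1/transport-nodes/<transportnode-id>?action=resync_host_config"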

  • Issue 2663483: The single-node NSX Manager will disconnect from the rest of the NSX Federation environment if you replace the APH-AR certificate on that NSX Manager.

    This issue is seen only with NSX Federation and a single-node NSX Manager cluster.

    Workaround: A single-node NSX Manager cluster is not a supported deployment option; deploy a three-node NSX Manager cluster.

  • Issue 2562189: Transport node deletion goes on indefinitely when the NSX Manager is powered off during the deletion operation.

    If the NSX Managers are powered off while transport node deletion is in progress, the transport node deletion may go on indefinitely if there is no user intervention.

    Workaround: Once the Managers are back up, prepare the node again and start the deletion process again.

Upgrade Known Issues
  • Issue 2560981: On upgrade, vIDM config may not persist.

    If you are using vIDM, you will have to log in again after a successful upgrade and re-enable vIDM on the cluster.

    Workaround: Disable and enable vIDM config after upgrade.

  • Issue 2550492: During an upgrade, the message, "The credentials were incorrect or the account specified has been locked" is
    displayed temporarily and the system recovers automatically.

    Transient error message during upgrade.

    Workaround: None.

  • Issue 2674146: When upgrading to NSX-T 3.1.0, a "Stage" button is present and active for all clusters/host groups in the host upgrade section. The option applies only to vSphere Lifecycle Manager (vLCM) enabled clusters, which is not applicable in this scenario because NSX-T support with vLCM starts with the NSX-T 3.1.0 and vSphere 7.0 U1 version combination.

    The "Stage" button is active for all clusters even though vSphere Lifecycle Manager is not enabled on any of them. If you click "Stage", a confirmation message appears; if you then click "Yes", the cluster shows as staged for upgrade. If you then click "Start" to begin the upgrade, the cluster undergoes the same upgrade it would have if initiated through the regular NSX host upgrade.

    Workaround: Ignore the "Stage" option and perform NSX host upgrades following the regular NSX host upgrade procedure.

  • Issue 2655295: Post Upgrade, repository sync status is marked as FAILED on the NSX Manager UI.

    Repository sync status is shown as failed on the NSX Manager UI. However, you can perform all installation- and upgrade-related operations.

    Workaround: Perform resolve action from the NSX Manager UI.

  • Issue 2657943: Upgrade of an edge deployed on a bare metal server that has more than one disk might fail.

    When the system reboots midway through the update, bootup fails with errors related to mounting the filesystems. The edge upgrade fails.

    Workaround: Make sure the edge has only one disk on the bare metal server.

NSX Edge Known Issues
  • Issue 2283559: https://<nsx-manager>/api/v1/routing-table and https://<nsx-manager>/api/v1/forwarding-table MP APIs return an error if the edge has 65k+ routes for RIB and 100k+ routes for FIB.

    If the edge has 65k+ routes in the RIB and 100k+ routes in the FIB, the request from MP to Edge takes more than 10 seconds and results in a timeout. This is a read-only API and has an impact only if you need to download the 65k+ RIB routes or 100k+ FIB routes using the API/UI.

    Workaround: There are two options to fetch the RIB/FIB.

    • These APIs support filtering options based on network prefixes or route type. Use these options to download the routes of interest.
    • Use the CLI if the entire RIB/FIB table is needed; there is no timeout in that case (see the sketch after this list).
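
    For example, from the NSX Edge node CLI (a sketch; the VRF ID of the Tier-0 service router depends on your deployment):

    get logical-routers      # note the VRF ID of the Tier-0 service router
    vrf <vrf-id>             # enter that VRF context
    get route                # full RIB, with no API timeout
    get forwarding           # full FIB
    exit
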
  • Issue 2521230: BFD status displayed under ‘get bgp neighbor summary’ may not reflect the latest BFD session status correctly.

    BGP and BFD can set up their sessions independently. As part of ‘get bgp neighbor summary’ BGP also displays the BFD state. If the BGP is down, it will not process any BFD notifications and will continue to show the last known state. This could lead to displaying stale state for the BFD.

    Workaround: Rely on the output of ‘get bfd-sessions’ and check the ‘State’ field to get the most up-to-date BFD status.

  • Issue 2641990: During Edge vMotion, there can be multicast traffic loss of up to 30 seconds (the default PIM hello interval).

    When the edge is vMotioned and IGMP snooping is enabled on the TOR, the TOR needs to learn the new edge location. It learns it when it receives any multicast control or data traffic from the edge. Multicast traffic can be lost for up to 30 seconds on edge vMotion.

    Workaround: None. Traffic recovers once the TOR receives multicast packets from the edge. For faster recovery, disable and re-enable PIM on the uplink interface connected to the TOR.

NSX Cloud Known Issues
  • Issue 2289150: PCM calls to AWS start to fail.

    If a user updates the PCG role for an AWS account on CSM from old-pcg-role to new-pcg-role, CSM updates the role for the PCG instance on AWS to new-pcg-role. However, the PCM does not know that the PCG role has been updated and as a result continues to use the old AWS clients it had created using old-pcg-role. This causes the AWS cloud inventory scan and other AWS cloud calls to fail.

    Workaround: If you encounter this issue, do not modify or delete the old PCG role for at least 6.5 hours after changing to the new role. Restarting the PCG will re-initialize all AWS clients with the new role credentials.

Security Known Issues
  • Issue 2491800: AR channel SSL certificates are not periodically checked for their validity, which could lead to using an expired/revoked certificate for an existing connection.

    The connection would be using an expired/revoked SSL certificate.

    Workaround: Restart the APH on the Manager node to trigger a reconnection.
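
    A minimal sketch of that restart, assuming the APH is exposed as the applianceproxy service in the NSX Manager node CLI (verify the service name on your release):

    get service applianceproxy        # check the current status of the APH
    restart service applianceproxy    # triggers a reconnection with a fresh certificate check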

Federation Known Issues
  • Issue 2658950: Upgrade workflow of the Local Manager is interrupted if the Local Manager was added to the Global Manager using its VIP and the Local Manager's FQDN was published later.

    If a Local Manager was added to the Global Manager using its VIP and you later published the Local Manager's FQDN, the Global Manager fails to access the Local Manager to proceed with the upgrade.

    Workaround: Update the Local Manager's FQDN after publishing it.

  • Issue 2630813: SRM recovery for compute VMs loses all the NSX tags applied to VMs and segment ports.

    If an SRM recovery test or run is initiated, the replicated compute VMs in the disaster recovery location will not have any NSX tags applied.

  • Issue 2601493: Concurrent config onboarding is not supported on Global Manager in order to prevent heavy processing load.

    Although parallel config onboarding operations do not interfere with each other, multiple concurrent onboarding executions would make the Global Manager slow and sluggish for other operations.

    Workaround: Security admins/users must coordinate maintenance windows to avoid initiating config onboarding concurrently.

  • Issue 2613113: If onboarding is in progress and a restore of the Local Manager is performed, the status on the Global Manager does not change from IN_PROGRESS.

    The UI shows IN_PROGRESS on the Global Manager for Local Manager onboarding. You are unable to import the configuration of the restored site.

    Workaround: Use the Local Manager API to start the onboarding of the Local Manager site, if required.

  • Issue 2628428: Global Manager status shows success initially and then changes to IN_PROGRESS.

    In a scale setup, if there are too many sections being modified at frequent intervals, it takes time for the Global Manager to reflect the correct status of the configuration. This causes a delay in seeing the right status on the UI/API for the distributed firewall configuration done.

    Workaround: None.

  • Issue 2625009: Inter-SR iBGP sessions keep flapping when intermediate routers or physical NICs have a lower or equal MTU compared to the inter-SR port.

    This can impact inter-site connectivity in Federation topologies.

    Workaround: Keep the pNIC MTU and the intermediate routers' MTU larger than the global MTU (i.e., the MTU used by the inter-SR port). Otherwise, encapsulation makes the packet size exceed the MTU and the packets do not go through.

  • Issue 2634034, 2656024: When the site role is changed for a stretched T1-LR (logical router), any traffic for that logical router is impacted for about 6-8 minutes.

    The static route takes a long time to get programmed, which affects the datapath. There is traffic loss of about 6-8 minutes when the site role is changed; it could be even longer depending on the scale of the configuration.

    Workaround: None.

  • Issue 2547524: Firewall rules not applied for cross-site vMotion use-case where discovered segment ports are used in global groups.

    Rules are not pushed as expected.

    Workaround: Reconfigure rules.

  • Issue 2606452: Onboarding is blocked when trying to onboard via API.

    Onboarding API fails with the error message, "Default transport zone not found at site". 

    Workaround: Wait for fabric sync between Global Manager and Local Manager to complete.

  • Issue 2643632: No error is displayed while adding Global service as nested service on Local Manager.

    Policy intent creation succeeds, but realization fails. The local service will not be realized on the Management Plane, so it cannot be consumed in distributed firewall rules.

    Workaround: Reconfigure the service to remove the global service entry and add local services as members. Services created from the Local Policy Manager should use only local nested services as members.

  • Issue 2643749: Unable to nest a group from a custom region created on a specific site into a group that belongs to the system-created site-specific region.

    You will not see the group created in the site-specific custom region when selecting a child group as a member of a group in the system-created region with the same location.

  • Issue 2649240: Deletion is slow when a large number of entities are deleted using individual delete APIs.

    It takes significant time to complete the deletion process.

    Workaround: Use hierarchical API to delete in bulk.
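
    A minimal sketch of a bulk delete through the hierarchical Policy API, assuming the entities are groups under the default domain; the IDs are placeholders:

    PATCH https://<nsx-manager>/policy/api/v1/infra
    {
      "resource_type": "Infra",
      "children": [
        {
          "resource_type": "ChildDomain",
          "Domain": {
            "resource_type": "Domain",
            "id": "default",
            "children": [
              {
                "resource_type": "ChildGroup",
                "marked_for_delete": true,
                "Group": { "resource_type": "Group", "id": "group-01" }
              },
              {
                "resource_type": "ChildGroup",
                "marked_for_delete": true,
                "Group": { "resource_type": "Group", "id": "group-02" }
              }
            ]
          }
        }
      ]
    }

    A single PATCH of this form removes many objects in one transaction instead of issuing one DELETE call per object.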

  • Issue 2649499: Firewall rule creation takes a long time when individual rules are created one after the other.

    Creating rules through individual API calls is slow and takes more time.

    Workaround: Use the hierarchical API to create several rules in one request.

  • Issue 2650292: In Federation, if a child group is used in two different groups and the second parent group has a higher span, the child group span will be greater than the span of the first group due to reference object creation. The span of the first group can be expanded but not reverted back to the original span.

    You will not be able to reduce the span of a domain on Global Manager.

    Workaround: Do not add the same child group to two parent groups whose spans are different.

  • Issue 2652418: Slow deletion when large number of entities are deleted.

    Deletion will be slower.

    Workaround: Use the hierarchical API for bulk deletion.

  • Issue 2655539: Host names are not updated on the Location Manager page of the Global Manager UI when updating the host names using the CLI.

    The old host name is shown.

    Workaround: None.

  • Issue 2658656: State for compute manager added to active Global Manager is not replicated on standby Global Manager.

    A compute manager added to the active Global Manager is not visible when the standby Global Manager becomes active. You will not be able to use the auto-deploy feature for deploying new Global Manager nodes if the standby Global Manager becomes active after a failover.

    Workaround: Add vCenter as a compute manager on the Global Manager that is now active.

  • Issue 2658687: Global Manager switchover API reports failure when transaction fails, but the failover happens.

    API fails, but Global Manager switchover completes.

    Workaround: None.

  • Issue 2630819: LM certificates should not be changed after the LM is registered on the GM.

    When Federation and PKS need to be used on the same LM, the PKS tasks to create the external VIP and change the LM certificate should be done before registering the LM on the GM. If done in the reverse order, communication between the LM and GM will not be possible after the LM certificate change, and the LM has to be registered again.

  • Issue 2661502: Unable to make a Global Manager as standby from the UI after an unplanned switchover where this Global Manager was lost but is back online after the switchover.

    If a previously active Global Manager comes back online after you have switched over to the standby Global Manager (now the active Global Manager), the old active Global Manager has the status NONE. When you try to set it as the standby Global Manager from the UI, the operation fails.

    Workaround: Use the following API with a payload that includes connection_info to change the status of the Global Manager from NONE to standby:

    PUT https://<active-GM-IP>/global-manager/api/v1/global-infra/global-managers/<ID of GM with the status NONE>
    Example Request Body:
    {
      "display_name": "<Display Name of GM with status NONE>",
      "mode": "STANDBY",
      "_revision": <Revision-Number>,
      "connection_info": [
        {
          "fqdn": "<FQDN or IP of the GM with status NONE>",
          "username": "<username>",
          "password": "<password>",
          "thumbprint": "<API thumbprint of the GM with status NONE>"
        }
      ]
    }
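
    For example, with curl (a sketch; basic authentication and -k are shown for brevity, and the current _revision is read with a GET on the same URL first):

    # Fetch the current object to obtain its _revision.
    curl -k -u admin:<password> \
      "https://<active-GM-IP>/global-manager/api/v1/global-infra/global-managers/<ID of GM with the status NONE>"

    # Send the update with the request body above saved as standby-gm.json.
    curl -k -u admin:<password> -X PUT \
      -H "Content-Type: application/json" \
      -d @standby-gm.json \
      "https://<active-GM-IP>/global-manager/api/v1/global-infra/global-managers/<ID of GM with the status NONE>"
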
  • Issue 2659168: Async replicator channel sometimes remains down during node restart.

    Sync status is down. Changes on Global Manager won't go to Local Manager.

    Workaround: Restart the async replicator process by issuing the "service async-replicator-service restart" command.

  • Issue 2658092: Onboarding fails when NSX Intelligence is configured on Local Manager.

    Onboarding fails with a principal identity error, and you cannot onboard a system with a principal identity user.

    Workaround: Create a temporary principal identity user with the same principal identity name that is used by NSX Intelligence.

  • Issue 2652154: The Global Manager can experience slowness in processing deletes during scenarios where high numbers of resources are being deleted across sites.

    Error messages are seen that a particular resource cannot be created or deleted because it will affect the span of many resources that are pending cleanup on a site. These error messages indicate that the Global Manager is still processing acknowledgements of cleanup from all of the sites for many resources.

    Workaround: Disconnect the Tier-1 from the Tier-0 to remove the inherited span. If they cannot be disconnected and you want to send the resource back to the other site, wait for several minutes before retrying the operation.

  • Issue 2622576: Failures due to duplicate configuration are not propagated correctly to user.

    While onboarding is in progress, you see an "Onboarding Failure" message.

    Workaround: Restore Local Manager and retry onboarding.

  • Issue 2637241: Global Manager to Local Manager communication is broken (for workflows relying on password/thumbprint) after Local Manager certificate replacement.

    Status updates and some functionality that rely on Global Manager to Local Manager communication are unavailable.

    Workaround: Replace Local Manager thumbprint using API call on Global Manager.
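
    A possible sketch, assuming the Local Manager is registered as a site object on the Global Manager and that the site's connection info accepts an updated thumbprint; verify the exact resource path and payload against the NSX-T API documentation for your release:

    # Get the SHA-256 thumbprint of the new Local Manager API certificate
    # (the exact thumbprint format expected by NSX is an assumption here).
    openssl s_client -connect <local-manager-fqdn>:443 < /dev/null 2>/dev/null \
      | openssl x509 -fingerprint -sha256 -noout

    # Update the registered site on the Global Manager with the new thumbprint
    # (the sites path and site_connection_info body are assumptions based on the
    # Federation site registration API).
    curl -k -u admin:<password> -X PUT \
      -H "Content-Type: application/json" \
      -d '{"display_name": "<site-name>", "site_connection_info": [{"fqdn": "<local-manager-fqdn>", "username": "admin", "password": "<password>", "thumbprint": "<new-thumbprint>"}], "_revision": <revision>}' \
      "https://<global-manager>/global-manager/api/v1/global-infra/sites/<site-id>"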
