VMware NSX-T Data Center 3.1.1   |  27 January 2021  |  Build 17483185

Check regularly for additions and updates to these release notes.

What's New

NSX-T Data Center 3.1.1 provides a variety of new features for virtualized networking and security across private, public, and multi-cloud environments. Highlights include new features and enhancements in the following focus areas.

L3 Networking

  • OSPFv2 Support on Tier-0 Gateways
    • NSX-T Data Center now supports OSPF version 2 as a dynamic routing protocol between Tier-0 gateways and physical routers. OSPF can be enabled only on external interfaces, and all of these interfaces can be in the same OSPF area (standard area or NSSA), even across multiple Edge Nodes. This simplifies migration from an existing NSX for vSphere deployment that already uses OSPF to NSX-T Data Center.

NSX Data Center for vSphere to NSX-T Data Center Migration

  • Support of Universal Objects Migration for a Single Site
    • You can migrate your NSX Data Center for vSphere environment deployed with a single NSX Manager in Primary mode (not secondary). As this is a single NSX deployment, the objects (local and universal) are migrated to local objects on a local NSX-T.  This feature does not support cross-vCenter environments with Primary and Secondary NSX Managers.
  • Migration of NSX-V Environment with vRealize Automation - Phase 2
    • The Migration Coordinator interacts with vRealize Automation (vRA) to migrate environments where vRealize Automation provides automation capabilities. This release adds additional topologies and use cases to those already supported in NSX-T 3.1.0.
  • Modular Migration for Hosts and Distributed Firewall
    • The NSX-T Migration Coordinator adds a new mode to migrate only the distributed firewall configuration and the hosts, leaving the logical topology (L3 topology, services) for you to complete. You can benefit from the in-place migration offered by the Migration Coordinator (hosts moved from NSX-V to NSX-T while going through maintenance mode, firewall states and memberships maintained, layer 2 extended between NSX for vSphere and NSX-T during migration) while you (or a third-party automation) deploy the Tier-0/Tier-1 gateways and related services, which gives greater flexibility in terms of topologies. This feature is available from the UI and the API.
  • Modular Migration for Distributed Firewall available from UI
    • The NSX-T user interface now exposes the Modular Migration of firewall rules. This feature was introduced in 3.1.0 (API only) and allows the migration of firewall configurations, memberships, and state from an NSX Data Center for vSphere environment to an NSX-T Data Center environment. It simplifies lift-and-shift migration, where you vMotion VMs between an environment whose hosts run NSX for vSphere and another environment whose hosts run NSX-T, by migrating firewall rules and keeping states and memberships (hence maintaining security between VMs in the old environment and the new one).
  • Fully Validated Scenario for Lift and Shift Leveraging vMotion, Distributed Firewall Migration and L2 Extension with Bridging
    • This feature supports the complete scenario for migration between two parallel environments (lift and shift), leveraging vMotion, the NSX-T bridge to extend L2 between NSX for vSphere and NSX-T, and the Modular Distributed Firewall migration.

Identity Firewall

  • NSX Policy API support for Identity Firewall configuration - Setup of Active Directory, for use in Identity Firewall rules, can now be configured through NSX Policy API (https://<nsx-mgr>/policy/api/v1/infra/firewall-identity-stores), equivalent to existing NSX Manager API (https://<nsx-mgr>/api/v1/directory/domains).
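
As a rough illustration of how the two endpoints relate, the sketch below builds both URLs for a given NSX Manager host. The helper name `identity_store_urls` and the sample hostname are hypothetical; only the two URL paths come from the note above. The code does not send any request.

```python
# Sketch only: builds the Policy API and Manager API URLs named in this
# release note. The function name and sample hostname are illustrative.

POLICY_PATH = "/policy/api/v1/infra/firewall-identity-stores"
MANAGER_PATH = "/api/v1/directory/domains"


def identity_store_urls(nsx_mgr: str) -> dict:
    """Return the Policy and Manager API URLs for Active Directory setup."""
    host = nsx_mgr.strip().rstrip("/")
    if not host:
        raise ValueError("NSX Manager host must not be empty")
    base = f"https://{host}"
    return {"policy": base + POLICY_PATH, "manager": base + MANAGER_PATH}


if __name__ == "__main__":
    # Hypothetical hostname, used only to show the resulting URLs.
    for name, url in identity_store_urls("nsx-mgr.example.com").items():
        print(f"{name}: {url}")
```

Keeping both paths in one place may help when scripts are moved from the Manager API to the Policy API.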

Advanced Load Balancer Integration

  • Support Policy API for Avi Configuration
    • The NSX Policy API can be used to manage the NSX Advanced Load Balancer configurations of virtual services and their dependent objects. The unique object types are exposed via the https://<nsx-mgr>/policy/api/v1/infra/alb-<objecttype> endpoints.
  • Service Insertion Phase 2
    • This feature supports transparent load balancing in the NSX Advanced Load Balancer (Avi). Avi sends the load-balanced traffic to the servers with the client's IP as the source IP. This feature leverages service insertion to redirect the return traffic back to the service engine, providing transparent load balancing without requiring any server-side modification.

Edge Platform and Services

  • DHCPv4 Relay on Service Interface
    • Tier-0 and Tier-1 Gateways support DHCPv4 Relay on Service Interfaces, enabling a third-party DHCP server to be located on a physical network.

AAA and Platform Security

  • Guest Users - Local User accounts: NSX customers integrate their existing corporate identity store to onboard users for normal operations of NSX-T. However, a limited set of local users is still essential to aid identity and access management in several scenarios: (1) the ability to bootstrap and operate NSX during early stages of deployment before identity sources are configured in non-administrative mode, or (2) a failure of communication with or access to the corporate identity repository. In such cases, local users are effective in bringing NSX-T to normal operational status. Additionally, in scenarios such as (3) managing NSX in a specific compliant state that caters to industry or federal regulations, the use of local guest users is beneficial. To enable these use cases and ease operations, two guest local users have been introduced in 3.1.1, in addition to the existing admin and audit local users. With this feature, the NSX admin has extended privileges to manage the lifecycle of the users (for example, password rotation), including the ability to customize and assign appropriate RBAC permissions. Note that the local user capability is available on both NSX-T Local Managers (LM) and Global Managers (GM) but is unavailable on edge nodes in 3.1.1 via API and UI. The guest users are disabled by default, must be explicitly activated before use, and can be disabled at any time.
  • FIPS Compliant Bouncy Castle Upgrade: NSX-T 3.1.1 contains an updated version of FIPS-compliant Bouncy Castle (v1.0.2.1). The Bouncy Castle module is a collection of Java-based cryptographic libraries, functions, and APIs, and is used extensively on NSX-T Manager. The upgraded version resolves critical security bugs and facilitates compliant and secure operation of NSX-T.

NSX Cloud

  • NSX Marketplace Appliance in Azure: Starting with NSX-T 3.1.1, you have the option to deploy the NSX management plane and control plane fully in the public cloud (Azure only for NSX-T 3.1.1; AWS will be supported in a future release). The NSX management/control plane components and the NSX Cloud Public Cloud Gateway (PCG) are packaged as VHDs and made available in the Azure Marketplace. For a greenfield deployment in the public cloud, you also have the option to use a 'one-click' Terraform script to perform the complete installation of NSX in Azure.
  • NSX Cloud Service Manager HA: In the event that you deploy NSX management/control plane in the public cloud, NSX Cloud Service Manager (CSM) also has HA. PCG is already deployed in Active-Standby mode thereby enabling HA. 
  • NSX-Cloud for Horizon Cloud VDI enhancements: Starting with NSX-T 3.1.1, when using NSX Cloud to protect Horizon VDIs in Azure, you can install the NSX agent as part of the Horizon Agent installation in the VDIs. This feature also addresses one of the challenges with having multiple components (VDIs, PCG, etc.) and their respective OS versions. Any version of the PCG can work with any version of the agent on the VM. In the event that there is an incompatibility, the incompatibility is displayed in the NSX Cloud Service Manager (CSM), leveraging the existing framework.

Operations

  • UI-based Upgrade Readiness Tool for migration from NVDS to VDS with NSX-T Data Center
    • To migrate Transport Nodes from NVDS to VDS with NSX-T, you can use the Upgrade Readiness Tool present in the Getting Started wizard in the NSX Manager user interface. Use the tool to get recommended VDS with NSX configurations, create or edit the recommended VDS with NSX, and then automatically migrate the switch from NVDS to VDS with NSX while upgrading the ESX hosts to vSphere Hypervisor (ESXi) 7.0 U2.

Licensing

  • Enable VDS in all vSphere Editions for NSX-T Data Center Users: Starting with NSX-T 3.1.1, you can utilize VDS in all versions of vSphere. You are entitled to use an equivalent number of CPU licenses to use VDS. This feature ensures that you can instantiate VDS.

Container Networking and Security

  • This release supports a maximum scale of 50 clusters (ESXi clusters) per vCenter enabled with vLCM, on clusters enabled for vSphere with Tanzu, as documented at configmax.vmware.com.

Federation

Compatibility and System Requirements

For compatibility and system requirements information, see the NSX-T Data Center Installation Guide.

API Deprecations and Behavior Changes

Retention Period of Unassigned Tags: In NSX-T 3.0.x, NSX Tags with 0 Virtual Machines assigned are automatically deleted by the system after five days. In NSX-T 3.1.0, the system task has been modified to run on a daily basis, cleaning up unassigned tags that are older than one day. There is no manual way to force delete unassigned tags.
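
The cleanup rule can be pictured with the following sketch. It is not NSX code: the `Tag` record and the `tags_eligible_for_cleanup` function are hypothetical, and "older than one day" is assumed here to be measured from the time the tag became unassigned, which the note does not specify.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Tag:
    name: str
    vm_count: int
    # Assumption: time at which the tag last lost its final VM, if ever.
    unassigned_since: Optional[datetime] = None


def tags_eligible_for_cleanup(tags, now, retention=timedelta(days=1)):
    """Return names of tags with no VMs that have been unassigned longer than retention."""
    return [
        t.name
        for t in tags
        if t.vm_count == 0
        and t.unassigned_since is not None
        and now - t.unassigned_since > retention
    ]


if __name__ == "__main__":
    now = datetime(2021, 1, 27, 12, 0)
    sample = [
        Tag("old-tag", 0, now - timedelta(days=2)),
        Tag("fresh-tag", 0, now - timedelta(hours=3)),
        Tag("in-use", 4),
    ]
    print(tags_eligible_for_cleanup(sample, now))
```

Under the 3.0.x behavior described above, the same sketch would use `retention=timedelta(days=5)`.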

Duplicate certificate extensions not allowed: Starting with NSX-T 3.1.1, NSX-T rejects X.509 certificates with duplicate extensions (or fields), following RFC guidelines and industry best practices for secure certificate management. This does not impact certificates that are already in use before upgrading to 3.1.1. The checks are enforced when NSX administrators attempt to replace existing certificates or install new certificates after NSX-T 3.1.1 has been deployed.
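
A pre-check along the following lines could be run before uploading a certificate. The sketch assumes the extension OIDs have already been extracted from the certificate by some other tool; the function name `duplicate_extensions` is hypothetical, and this is not how NSX-T itself performs the check.

```python
from collections import Counter


def duplicate_extensions(extension_oids):
    """Return the OIDs that appear more than once, in sorted order."""
    counts = Counter(extension_oids)
    return sorted(oid for oid, n in counts.items() if n > 1)


if __name__ == "__main__":
    # 2.5.29.15 = keyUsage, 2.5.29.17 = subjectAltName (listed twice here).
    oids = ["2.5.29.15", "2.5.29.17", "2.5.29.17"]
    dupes = duplicate_extensions(oids)
    if dupes:
        print("Certificate would likely be rejected; duplicated OIDs:", dupes)
```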

API and CLI Resources

See code.vmware.com to use the NSX-T Data Center APIs or CLIs for automation.

Available Languages

NSX-T Data Center has been localized into multiple languages: English, German, French, Japanese, Simplified Chinese, Korean, Traditional Chinese, and Spanish. Because NSX-T Data Center localization utilizes the browser language settings, ensure that your settings match the desired language.

Document Revision History

January 27, 2021. First edition.
February 5, 2021. Second edition. Added known issues 2663483, 2693576, 2697537, 2711497.
March 4, 2021. Third edition. Added "Federation" to What's New section.

Resolved Issues

  • Fixed Issue 2641824: On UI BFD profile realization status is shown as "uninitialized."

    There is no impact. This realization status can be ignored.

  • Fixed Issue 2643313: The realization state for a successfully realized Global T0 on an onboarded site incorrectly shows the "logical router port configuration realization error" for site edge transport nodes.

    You may notice realization failure with the error in realization of a T0 logical router port. If using policy API, this is transient and will resolve when T0-T1 plumbing completes.

  • Fixed Issue 2643632: No error is displayed while adding Global service as nested service on Local Manager.

    Policy intent creation is successful but realization fails. Local Service will not be realized on Management Plane, so it cannot be consumed in distributed firewall rules.

  • Fixed Issue 2646814: Overall consolidated status shows IN_PROGRESS whereas individual consolidated status per enforcement point shows SUCCESS.

    The consolidated status shows IN_PROGRESS but it does not give information on which site the status is IN_PROGRESS.

  • Fixed Issue 2648349: Repeated ObjectNotFoundException log and exception backtrace when Management Plane is collecting observation for traceflow or trace action result for livetrace.

    You will see repeated ObjectNotFoundException log in nsxapi.log when you create an LTA/Traceflow session.

  • Fixed Issue 2650292: In Federation, if a child group is used in two different groups and the second parent group has a higher span, the child group span will be greater than the span of the first group due to reference object creation. The span of the first group can be expanded but not reverted back to the original span.

    You will not be able to reduce the span of a domain on Global Manager.

  • Fixed Issue 2654842: A segment port can be created before the Windows physical server transport node creation is successful, causing the segment port to go to fail state.

    There is no visible UI indication that you need to wait until the transport node state is Success before proceeding with the Segment Port creation for Windows physical servers. If you try to create the Segment Port before the transport node is successful, it will fail and the Host will completely disconnect.

  • Fixed Issue 2658484: NVDS to CVDS Migration via vSphere Update Manager is not fully supported via vSphere Update Manager "parallel mode."

    NVDS to CVDS migration issues may occur if migration is triggered by upgrading ESX via vSphere Update Manager in Parallel Mode. Switch will not be automatically migrated from NVDS to CVDS. System might be left in some inconsistent state.

  • Fixed Issue 2649228: The IPv6 duplicate address detection (DAD) state machine introduces a 3-second delay on the T0/T1 SR backplane port (and on the T1 SR uplink port) when the IP is moved from one SR to another during failover.

    N-S connectivity has up to 6 seconds of IPv6 traffic loss during failover with T0 A/A.

  • Fixed Issue 2661502: Unable to make a Global Manager as standby from the UI after an unplanned switchover where this Global Manager was lost but is back online after the switchover.

    If a previously active Global Manager is back online after you have switched to the standby Global Manager which is the new active Global Manager, the old active Global Manager has the status of NONE. When you try to set this as the standby Global Manager from the UI, the operation fails.

  • Fixed Issue 2623704: If you use the Management Plane API to configure L3 forwarding mode, Policy will overwrite Management Plane change with default mode IPV4_ONLY and will disrupt IPv6 connectivity.

    L3 forwarding mode configuration was introduced in Policy in the NSX-T 3.0.0 release. This issue impacts NSX-T 3.0.1 and 3.0.2 upgrades to NSX-T 3.1.0. Upgrades from NSX-T 2.5.x to NSX-T 3.1.0 and NSX-T 3.0.0 to NSX-T 3.1.0 are not affected. IPv6 forwarding is disabled in the data path, resulting in IPv6 connectivity loss.

  • Fixed Issue 2674146: When upgrading to NSX-T 3.1.0, in the host upgrade section, a "Stage" button is present and active for all clusters/host groups. The option is applicable to vSphere Lifecycle Management (vLCM) enabled clusters only, which is not applicable in this scenario because NSX-T support with vLCM is enabled from the NSX-T 3.1.0 and vSphere 7.0 U1 version combination.

    The "Stage" button is active for all clusters although vSphere Lifecycle Management is not enabled on any of the clusters. If you click "Stage", a confirmation message appears; if you then click "Yes", the cluster appears as if it has been staged for upgrade. If you then click "Start" to begin upgrades, the cluster undergoes the same upgrades as it would have if initiated through regular NSX host upgrades.

  • Fixed Issue 2659168: Async replicator channel sometimes remains down during node restart.

    Sync status is down. Changes on Global Manager won't go to Local Manager.

  • Fixed Issue 2655295: Post Upgrade, repository sync status is marked as FAILED on the NSX Manager UI.

    Repository sync status is shown as failed on the NSX Manager UI. However, you can perform all installation- and upgrade-related operations.

  • Fixed Issue 2658950: Upgrade workflow of Local Manager interrupted if Local Manager was added to the Global Manager using its VIP but later FQDN of the Local Manager was published.

    If a Local Manager was added to the Global Manager using its VIP and later you have published the Local Manager's FQDN, the Global Manager fails to access the Local Manager to proceed with the upgrade.

  • Fixed Issue 2609681: DFW jumpto rule action is not supported on rules having Layer 7 APPID or FQDN Context profiles.

    Traffic will not match the intended rule after vMotion. The rule match will be incorrect, allowing traffic to pass through that was supposed to be blocked.

  • Fixed Issue 2657943: Upgrade of edge deployed on bare metal server having more than one disk might fail.

    When the system is rebooting midway through the update, the bootup fails with errors related to mounting the filesystems. The edge upgrade will fail.

  • Fixed Issue 2653227: When removing the Physical Server from NSX, connectivity is lost to Physical Server and NSX uninstall fails.

    The attached interface of segment port on Physical Server is configured as "Using existing IP."  When removing the Physical Server from NSX without removing segment port first, connectivity is lost to Physical Server and NSX uninstall fails.

  • Fixed Issue 2658199: When adding Windows Bare Metal Server 2016/2019, a "host disconnected" error displays at the Applying NSX Switch Configuration step.

    Regardless of the error message -- which may or may not be an issue -- the installation appears to continue and eventually finish properly.

  • Fixed Issue 2658713: When workload connected to T0 segment sends an IGMP leave for group G, the entry is not removed from the IGMP snooping table on T0 SR.

    When workload connected to T0 segment sends an IGMP leave for group G, the entry is not removed from the IGMP snooping table on T0 SR.

  • Fixed Issue 2652154: The Global Manager can experience slowness in processing deletes during scenarios where high numbers of resources are being deleted across sites.

    Error messages are seen that a particular resource cannot be created or deleted because it will affect the span of many resources that are pending cleanup on a site. These error messages indicate that the Global Manager is still processing acknowledgements of cleanup from all of the sites for many resources.

  • Fixed Issue 2659234: Live Traffic Analysis (LTA) sessions are not freed when you trigger an LTA request in parallel.

    When there is an uncleared LTA session:

    1. The host continuously reports the LTA result to the Management Plane (MP), so the MP keeps receiving the leaked LTA session and prints log entries in nsxapi.log.
    2. Packets sent from a host that has an uncleared LTA session are padded with an LTA INT Geneve header.
  • Fixed Issue 2622846: IDS with proxy settings enabled cannot access GitHub, which is used to download signatures.

    New updates of signature bundle downloads will fail.

  • Fixed Issue 2656929: If Windows 2016/2019 Physical Server is not completed, segment port fails when attempting to create segment port.

    Segment port will show a failure.

  • Fixed Issue 2645877: In Bare Metal Edge, traffic going through a VRF gateway may not reach its destination.

    Packet capture on the link partner shows that the packets are received with no checksum, but a fragmentation offset in the IP header. Traffic crossing the VRF gateway is dropped because the packets are corrupted.

  • Fixed Issue 2637241: Global Manager to Local Manager communication broken (workflows relying on password/thumbprint) after Local Manager cert replacement.

    Absence of status updates and some functionality for Global Manager to Local Manager communication.

  • Fixed Issue 2522909: Service VM upgrade is not working after correcting URL if upgrade deployment failed with Invalid url.

    Upgrade would be in failed state, with wrong URL, blocking upgrade.

  • Fixed Issue 2527671: When the DHCP server is not configured, retrieving DHCP statistics/status on a Tier0/Tier1 gateway or segment displays an error message indicating realization is not successful.

    There is no functional impact. The error message is incorrect and should report that the DHCP server is not configured.

  • Fixed Issue 2549175: Searching in policy fails with the message: "Unable to resolve with start search resync policy."

    Searching in policy fails because search is out of sync when the NSX Manager nodes are provided with new IPs.

  • Fixed Issue 2588072: NVDS to CVDS switch migrator doesn't support Stateless ESX with vmks.

    NVDS to CVDS switch migrator cannot be used for migration with Stateless ESX hosts if the NVDS switch has vmks on it.

  • Fixed Issue 2627439: If a transport node profile is applied on a cluster before migration, one extra transport node profile is created by the system in detached state after migration.

    There will be one extra transport node profile generated for each original transport node profile.

  • Fixed Issue 2628428: Global Manager status shows success initially and then changes to IN_PROGRESS.

    In a scale setup, if there are too many sections being modified at frequent intervals, it takes time for the Global Manager to reflect the correct status of the configuration. This causes a delay in seeing the right status on the UI/API for the distributed firewall configuration done.

  • Fixed Issue 2634034, 2656024: When the site role is changed for a stretched T1-LR (logical router), any traffic for that logical router is impacted for about 6-8 minutes.

    The static route takes a long time to get programmed, which affects the datapath. There will be traffic loss of about 6-8 minutes when the site role is changed. This could be even longer depending on the scale of the configuration.

  • Fixed Issue 2527344: If selecting the Type as "TIER0_EVPN_TEP_IP" in Route redistribution of Tier 0 VRF LR, the TIER0_EVPN_TEP_IP is not redistributed into BGP as it is present in Parent Tier 0 LR, causing the datapath to break.

    Doing so will not advertise the TIER0_EVPN_TEP_IP to DC gateway (Peer BGP Router). In Tier 0 VRF LR Route Redistribution, you can select the Type as "TIER0_EVPN_TEP_IP", but there are no routes of "TIER0_EVPN_TEP_IP" on Tier 0 VRF LR.

  • Fixed Issue 2547524: Firewall rules not applied for cross-site vMotion use-case where discovered segment ports are used in global groups.

    Rules are not pushed as expected.

  • Fixed Issue 2573975: While configuring ANY to ANY SNAT rule, the addresses specified for source-network/destination-network/translated-network properties as empty string ("") will result in realization failure on management plane.

    The rule will not get realized on the edge and management plane.

Known Issues

The known issues are grouped as follows.

General Known Issues
  • Issue 2329273: No connectivity between VLANs bridged to the same segment by the same edge node.

    Bridging a segment twice on the same edge node is not supported. However, it is possible to bridge two VLANs to the same segment on two different edge nodes.

    Workaround: None.

  • Issue 2355113: Unable to install NSX Tools on RedHat and CentOS Workload VMs with accelerated networking enabled in Microsoft Azure.

    In Microsoft Azure when accelerated networking is enabled on RedHat (7.4 or later) or CentOS (7.4 or later) based OS and with NSX Agent installed, the ethernet interface does not obtain an IP address.

    Workaround: After booting up RedHat or CentOS based VM in Microsoft Azure, install the latest Linux Integration Services driver available at https://www.microsoft.com/en-us/download/details.aspx?id=55106 before installing NSX tools.

  • Issue 2520803: Encoding format for Manual Route Distinguisher and Route Target configuration in EVPN deployments.

    You currently can configure manual route distinguisher in both Type-0 encoding and in Type-1 encoding. However, using the Type-1 encoding scheme for configuring Manual Route Distinguisher in EVPN deployments is highly recommended. Also, only Type-0 encoding for Manual Route Target configuration is allowed.

    Workaround: Configure only Type-1 encoding for Route Distinguisher.
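
Per RFC 4364, a Type-0 route distinguisher pairs a 2-byte AS number with a 4-byte assigned number, while a Type-1 route distinguisher pairs an IPv4 address with a 2-byte assigned number. The sketch below classifies a value written as `admin:number`; that textual form and the function name `rd_encoding_type` are assumptions for illustration, not something defined in these notes.

```python
import ipaddress


def rd_encoding_type(rd: str) -> int:
    """Classify an 'admin:number' route distinguisher as Type-0 or Type-1."""
    admin, sep, number = rd.partition(":")
    if not sep or not number.isdigit():
        raise ValueError(f"not in admin:number form: {rd!r}")
    assigned = int(number)

    # Type-1: IPv4 address administrator, 2-byte assigned number.
    try:
        ipaddress.IPv4Address(admin)
        is_ipv4 = True
    except ValueError:
        is_ipv4 = False
    if is_ipv4:
        if assigned > 0xFFFF:
            raise ValueError("Type-1 assigned number must fit in 2 bytes")
        return 1

    # Type-0: 2-byte AS number administrator, 4-byte assigned number.
    if admin.isdigit() and int(admin) <= 0xFFFF and assigned <= 0xFFFFFFFF:
        return 0
    raise ValueError(f"unrecognized route distinguisher: {rd!r}")


if __name__ == "__main__":
    for value in ("10.1.1.1:100", "65000:100"):
        print(value, "-> Type", rd_encoding_type(value))
```

A check like this could flag Type-0 route distinguishers in a planned configuration so they can be rewritten in the recommended Type-1 form.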

  • Issue 2490064: Attempting to disable VMware Identity Manager with "External LB" toggled on does not work.

    After enabling VMware Identity Manager integration on NSX with "External LB", if you attempt to then disable integration by switching "External LB" off, after about a minute, the initial configuration will reappear and overwrite local changes.

    Workaround: When attempting to disable vIDM, do not toggle the External LB flag off; only toggle off vIDM Integration. This will cause that config to be saved to the database and synced to the other nodes.

  • Issue 2537989: Clearing VIP (Virtual IP) does not clear vIDM integration on all nodes.

    If VMware Identity Manager is configured on a cluster with a Virtual IP, disabling the Virtual IP does not result in the VMware Identity Manager integration being cleared throughout the cluster. You will have to manually fix vIDM integration on each individual node if the VIP is disabled.

    Workaround: Go to each node individually to manually fix the vIDM configuration on each.

  • Issue 2526769: Restore fails on multi-node cluster.

    When starting a restore on a multi-node cluster, restore fails and you will have to redeploy the appliance.

    Workaround: Deploy a new setup (one node cluster) and start the restore.

  • Issue 2523212: The nsx-policy-manager becomes unresponsive and restarts.

    API calls to nsx-policy-manager will start failing, with service being unavailable. You will not be able to access policy manager until it restarts and is available.

    Workaround: Invoke API with at most 2000 objects.
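
One way to follow this workaround from a script is to split a large object list into batches before calling the API. The sketch below shows only the batching step; the function name `batches` is hypothetical and the actual API call is left out.

```python
def batches(objects, max_size=2000):
    """Yield consecutive slices of at most max_size objects."""
    if max_size < 1:
        raise ValueError("max_size must be at least 1")
    for start in range(0, len(objects), max_size):
        yield objects[start:start + max_size]


if __name__ == "__main__":
    pending = [f"object-{i}" for i in range(4500)]
    for batch in batches(pending):
        # A real script would issue one API call per batch here.
        print(len(batch))
```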

  • Issue 2521071: For a Segment created in Global Manager, if it has a BridgeProfile configuration, then the Layer2 bridging configuration is not applied to individual NSX sites.

    The consolidated status of the Segment will remain at "ERROR". This is due to a failure to create the bridge endpoint at a given NSX site. You will not be able to successfully configure a BridgeProfile on Segments created via Global Manager.

    Workaround: Create a Segment at the NSX site and configure it with bridge profile.

  • Issue 2532127: An LDAP user cannot log in to NSX if the user's Active Directory entry does not contain the UPN (userPrincipalName) attribute and contains only the samAccountName attribute.

    User authentication fails and the user is unable to log in to the NSX user interface.

    Workaround: None.

  • Issue 2482580: IDFW/IDS configuration is not updated when an IDFW/IDS cluster is deleted from vCenter.

    When a cluster with IDFW/IDS enabled is deleted from vCenter, the NSX management plane is not notified of the necessary updates. This results in inaccurate count of IDFW/IDS enabled clusters. There is no functional impact. Only the count of the enabled clusters is wrong.

    Workaround: None.

  • Issue 2534933: Certificates that have LDAP based CDPs (CRL Distribution Point) fail to apply as tomcat/cluster certs.

    You can't use CA-signed certificates that have LDAP CDPs as cluster/tomcat certificate.

    Workaround: See VMware knowledge base article 78794.

  • Issue 2499819: Maintenance-based NSX for vSphere to NSX-T Data Center host migration for vCenter 6.5 or 6.7 might fail due to vMotion error.

    This error message is shown on the host migration page:
    Pre-migrate stage failed during host migration [Reason: [Vmotion] Can not proceed with migration: Max attempt done to vmotion vm b'3-vm_Client_VM_Ubuntu_1404-shared-1410'].

    Workaround: Retry host migration.

  • Issue 2557287: TNP updates done after backup are not restored.

    You won't see any TNP updates done after backup on a restored appliance.

    Workaround: Take a backup after any updates to TNP.

  • Issue 2392064: Edge stage migration fails with, "Failed to fetch error list" error.

    Migration fails but the reason for the failure (DHCP plugin exception) is not shown.

    Workaround: Roll back migration and retry.

  • Issue 2468774: When option 'Detect NSX configuration change' is enabled, backups are taken even when there is no configuration change.

    Too many backups are being taken because backups are being taken even when there are no configuration changes.

    Workaround: Increase the time associated with this option, thereby reducing the rate at which backups are taken.

  • Issue 2523421: LDAP authentication does not work properly when configured with an external load balancer (configured with round-robin connection persistence).

    The API LDAP authentication won't work reliably and will only work if the load balancer forwards the API request to a particular Manager.

    Workaround: None.

  • Issue 2534921: Not specifying inter_sr_ibgp property in a PATCH API call will prevent other fields from being updated in the BgpRoutingConfig entity.

    The PATCH API call fails to update the BGP routing config entity, with the error message "BGP inter SR routing requires global BGP and ECMP flags enabled." BgpRoutingConfig will not be updated.

    Workaround: Specify inter_sr_ibgp property in the PATCH API call to allow other fields to be changed.
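
A client script can apply this workaround by always copying the current inter_sr_ibgp value into the PATCH body. The sketch below only assembles the body; the function name `bgp_patch_body` is hypothetical, and `local_as_num` is used as an assumed example of another BgpRoutingConfig field.

```python
def bgp_patch_body(current_config: dict, changes: dict) -> dict:
    """Build a PATCH body that always carries inter_sr_ibgp."""
    body = dict(changes)
    if "inter_sr_ibgp" not in body:
        if "inter_sr_ibgp" not in current_config:
            raise KeyError("inter_sr_ibgp missing from current config")
        # Re-send the existing value so the other fields can be updated.
        body["inter_sr_ibgp"] = current_config["inter_sr_ibgp"]
    return body


if __name__ == "__main__":
    # Assumed shape of a previously fetched BgpRoutingConfig.
    current = {"inter_sr_ibgp": True, "local_as_num": "65000"}
    print(bgp_patch_body(current, {"local_as_num": "65001"}))
```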

  • Issue 2566121: A UA node stopped accepting any New API calls with the message, "Some appliance components are not functioning properly."

    The UA node stops accepting any New API calls with the message, "Some appliance components are not functioning properly." There are around 200 connections stuck in CLOSE_WAIT state. These connections are not yet closed. New API call is rejected.

    Workaround: Restart proton service (service proton restart) or restart unified appliance node.

  • Issue 2574281: Policy will only allow a maximum of 500 VPN Sessions.

    NSX claims support of 512 VPN Sessions per edge in the large form factor, however, due to Policy doing auto plumbing of security policies, Policy will only allow a maximum of 500 VPN Sessions. Upon configuring the 501st VPN session on Tier0, the following error message is shown:
    {'httpStatus': 'BAD_REQUEST', 'error_code': 500230, 'module_name': 'Policy', 'error_message': 'GatewayPolicy path=[/infra/domains/default/gateway-policies/VPN_SYSTEM_GATEWAY_POLICY] has more than 1,000 allowed rules per Gateway path=[/infra/tier-0s/inc_1_tier_0_1].'}

    Workaround: Use Management Plane APIs to create additional VPN Sessions.

  • Issue 2596162: Unable to update the nsxaHealthStatus for a switch when the switch name contains a single quote.

    The NSX configuration state is at partial success because the health status of a switch could not be updated.

    Workaround: Change the host switch name so that it does not have any single quotes.
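
To find affected switches ahead of time, a list of host switch names can be scanned for single quotes. This is a minimal sketch; the function name `switches_to_rename` and the sample names are hypothetical, and obtaining the names from NSX is out of scope here.

```python
def switches_to_rename(names):
    """Return the host switch names that contain a single quote."""
    return [name for name in names if "'" in name]


if __name__ == "__main__":
    sample = ["nvds-1", "lab's-switch", "vds-prod"]
    for name in switches_to_rename(sample):
        print("rename needed:", name)
```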

  • Issue 2596696: NsxTRestException observed in policy logs when creating SegmentPort from the API.

    NsxTRestException observed in policy logs. The SegmentPort cannot be created using the API.

    Workaround: Either populate the Id field in PortAttachmentDto or pass it as null in the API input.

  • Issue 2610718: Attempting to wire vIDM to NSX using the nsx-cli fails if lb_enable and vidm_enable flags are not explicitly specified.

    The error, "An error occurred attempting to update the vidm properties" will appear. You can wire vIDM only through the UI, directly through the REST API, or through the CLI with the lb_enable and vidm_enable flags explicitly defined.

    Workaround: Treat the vidm_enable and lb_enable flags as non-optional when using nsx-cli to wire vIDM.

  • Issue 2628503: DFW rule remains applied even after forcefully deleting the manager nsgroup.

    Traffic may still be blocked when forcefully deleting the nsgroup.

    Workaround: Do not forcefully delete an nsgroup that is still used by a DFW rule. Instead, make the nsgroup empty or delete the DFW rule.

  • Issue 2631703: When doing backup/restore of an appliance with vIDM integration, vIDM configuration will break.

    Typically, when an environment has been upgraded and/or restored, attempting to restore an appliance where vIDM integration is up and running causes that integration to break, and you have to reconfigure it.

    Workaround: After restore, manually reconfigure vIDM.

  • Issue 2638673: SRIOV vNICs for VMs are not discovered by inventory.

    SRIOV vNICs are not listed in the Add new SPAN session dialog, so you cannot select them when adding a new SPAN session.

    Workaround: None.

  • Issue 2647620: In an NSX configured environment with a large number of Stateless Hosts (TransportNodes), workload VMs on some Stateless hosts may lose connectivity temporarily when upgrading Management Plane nodes to 3.0.0 and above.

    This is applicable only to Stateless ESX Hosts configured for NSX 3.0.0 and above.

    Workaround: None.

  • Issue 2639424: Remediating a Host in a vLCM cluster with Host-based Service VM Deployment will fail after 95% Remediation Progress is completed.

    The remediation progress for a host gets stuck at 95% and then fails after the 70-minute timeout elapses.

    Workaround: See VMware knowledge base article 81447.

  • Issue 2636855: Maximum capacity alarm raised when System-wide Logical Switch Ports is over 25K.

    The maximum capacity alarm is raised when the number of system-wide Logical Switch Ports exceeds 25K. However, in a PKS scale environment the limit for container ports is 60K, so having more than 25K Logical Switch Ports is normal there.

    Workaround: None.

  • Issue 2636771: Search can return a resource that is tagged with multiple tag pairs when the tag and scope match any value of tag and scope.

    This affects search queries with a condition on tag and scope. The filter may return extra data if the tag and scope match with any pair.

    Workaround: None.
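
    The difference between the two matching behaviors can be illustrated as follows. This is a sketch of one reading of the issue, not of the NSX search implementation: the tag representation (a list of dicts with "scope" and "tag" keys) and both function names are assumptions made for illustration.

```python
def matches_any_combination(tags, scope, tag):
    """Loose match: scope and tag may come from different tag pairs."""
    scopes = {t["scope"] for t in tags}
    values = {t["tag"] for t in tags}
    return scope in scopes and tag in values


def matches_same_pair(tags, scope, tag):
    """Strict match: scope and tag must belong to the same tag pair."""
    return any(t["scope"] == scope and t["tag"] == tag for t in tags)
```

    With the pairs (env, prod) and (tier, web) on one resource, a query for scope "env" and tag "web" matches loosely but not strictly. A client that needs exact pair matching could apply a strict check like the second function to the returned results.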

  • Issue 2643610: Load balancer statistics APIs are not returning stats.

    The load balancer statistics APIs return no stats, so you cannot see load balancer stats.

    Workaround: Reduce the number of load balancers configured.

  • Issue 2555383: Internal server error during API execution.

    An internal server error is observed during API call execution. The API returns a 500 error and does not give the desired output.

    Workaround: This error is encountered because the session is invalidated. In this case, re-execute the session creation API to create a new session.
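
    The retry pattern described in the workaround can be sketched as below. It is a generic sketch under stated assumptions: api_call and create_session are caller-supplied callables (hypothetical names), api_call returns a (status, body) tuple, and not every 500 response is necessarily caused by an invalidated session.

```python
def call_with_session_retry(api_call, create_session, max_retries=1):
    """Call the API and create a new session if a 500 is returned.

    api_call(session) must return a (status, body) tuple.
    create_session() must return a fresh session object.
    """
    session = create_session()
    for attempt in range(max_retries + 1):
        status, body = api_call(session)
        # Return on success, on any non-500 error, or when out of retries.
        if status != 500 or attempt == max_retries:
            return status, body
        # The session may have been invalidated; create a new one and retry.
        session = create_session()
```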

  • Issue 2662225: When the active edge node becomes a non-active edge node while an S-N traffic stream is flowing, traffic loss is experienced.

    The current S->N stream is running on the multicast active node. The preferred route on the TOR to the source should be through the multicast active edge node only.
    Bringing up another edge can take over the multicast active node (the lower rank edge is the active multicast node). The current S->N traffic will experience loss for up to four minutes. This will not impact a new stream, or the current stream if it is stopped and started again.

    Workaround: Current S->N traffic will recover automatically within 3.5 to 4 minutes. For faster recovery, disable multicast and then enable it again through configuration.

  • Issue 2610851: Namespaces, Compute Collection, L2VPN Service grid filtering might return no data for few combinations of resource type filters.

    Applying multiple filters for a few types at the same time returns no results even though data matching the criteria is available. This is not a common scenario, and the filter fails only on these grids for the following combinations of filter attributes:

    • For Namespaces grid ==> On Cluster Name and Pods Name filter
    • For Network Topology page  ==> On L2VPN service applying a remote ip filter
    • For Compute Collection ==> On ComputeManager filter

    Workaround: You can apply one filter at a time for these resource types.

  • Issue 2587257: In some cases, PMTU packet sent by NSX-T edge is ignored upon receipt at the destination.

    PMTU discovery fails resulting in fragmentation and reassembly, and packet drop. This results in performance drop or outage in traffic.

    Workaround: None.

  • Issue 2587513: Policy shows error when multiple VLAN ranges are configured in bridge profile binding.

    You will see an "INVALID VLAN IDs" error message.

    Workaround: Create multiple bridge endpoints with the VLAN ranges on the segment instead of one with all VLAN ranges.

  • Issue 2682480: Possible false alarm for NCP health status.

    The NCP health status alarm may be unreliable: it can be raised when the NCP system is healthy.

    Workaround: None.

  • Issue 2690457: When joining an MP to an MP cluster where publish_fqdns is set on the MP cluster and where the external DNS server is not configured properly, the proton service may not restart properly on the joining node.

    The joining manager will not work and the UI will not be available.

    Workaround: Configure the external DNS server with forward and reverse DNS entries for all Manager nodes.
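
    Before joining a node, the forward and reverse entries can be checked with a small script. The sketch below uses only the Python standard library; the function name, the sample FQDN, and the sample address are hypothetical, and the resolver functions are injectable so the check can be exercised without a live DNS server.

```python
import socket


def dns_entries_consistent(fqdn, expected_ip,
                           forward=socket.gethostbyname,
                           reverse=socket.gethostbyaddr):
    """Return True if fqdn resolves to expected_ip and back again."""
    try:
        forward_ok = forward(fqdn) == expected_ip
        # gethostbyaddr returns (hostname, aliases, addresses).
        reverse_name = reverse(expected_ip)[0]
    except OSError:
        # Covers socket.gaierror and socket.herror (missing records).
        return False
    reverse_ok = reverse_name.rstrip(".").lower() == fqdn.rstrip(".").lower()
    return forward_ok and reverse_ok
```

    Running dns_entries_consistent("nsx-mgr-1.example.com", "192.0.2.11") for each Manager node against the external DNS server is one way to confirm the entries exist in both directions.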

  • Issue 2691432: Restore may fail.

    Restore may not work in some cases.

    Workaround: Click the Retry button in the UI to try the restore again.

  • Issue 2685550: FW Rule realization status is always shown as "In Progress" when applied to bridged segments.

    When applying FW Rules to an NSGroup that contains bridged segments as one of its members, realization status will always be shown as in progress. You won't be able to check the realization status of FW Rules applied to bridged segments.

    Workaround: Manually remove bridged segment from the NSGroup members list.

  • Issue 2690996: Cross-site packet forwarding may fail on KVM nodes if system assigned l2 forwarder vtep group id conflicts with VTEP label assigned to transport nodes.

    VM attached to stretched segment may lose cross-location connectivity. Cross-site traffic would not work for conflicting segments for KVM deployments.

    Workaround: Define new segment and switch workload to new segment.

  • Issue 2694496: Accessing VDI through Webclient/UAGs throws an error.

    When trying to access VDI from the Horizon portal, it times out with an error on port "22443".

    Workaround: Reboot the VDI.

  • Issue 2694707: The operational status of firewall rules on cloud VMs may show unknown for some rules in case an HA failover of public cloud gateways happens.

    The operational status of firewall rules on NSX Policy UI may show unknown. There is no functional impact. All rules are successfully realized. The status should clear itself and become healthy when both public cloud gateways are online.

    Workaround: None.

  • Issue 2684574: If the edge has 6K+ routes for Database and Routes, the Policy API times out.

    These Policy APIs for the OSPF database and OSPF routes return an error if the edge has 6K+ routes:
    /tier-0s/<tier-0s-id>/locale-services/<locale-service-id>/ospf/routes
    /tier-0s/<tier-0s-id>/locale-services/<locale-service-id>/ospf/routes?format=csv
    /tier-0s/<tier-0s-id>/locale-services/<locale-service-id>/ospf/database
    /tier-0s/<tier-0s-id>/locale-services/<locale-service-id>/ospf/database?format=csv

    If the edge has 6K+ routes for Database and Routes, the Policy API times out. This is a read-only API and has an impact only if the API/UI is used to download 6k+ routes for OSPF routes and database.

    Workaround: Use the CLI commands to retrieve the information from the edge.

  • Issue 2674689: If the transport node is updated between URT and the start of migration, it loses the extra config profile.

    Migration of transport node fails in TN_Validate Stage.

    Workaround: Run URT and trigger MigrateToCvdsTask again. You can use this API for cleanup: "POST https://<manager-ip>/api/v1/nvds-urt?action=cleanup"
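
    The cleanup call above can be prepared from a script as sketched below. The sketch only builds the request object from the URL given in the workaround; the function name is hypothetical, and authentication and certificate handling are deliberately left out because they depend on the environment.

```python
import urllib.request


def build_urt_cleanup_request(manager_ip):
    """Build (but do not send) the URT cleanup POST request."""
    url = f"https://{manager_ip}/api/v1/nvds-urt?action=cleanup"
    return urllib.request.Request(url, method="POST")
```

    The returned object can be passed to urllib.request.urlopen once credentials and TLS settings for the NSX Manager have been added.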

  • Issue 2697549: If there is GI service deployed on the cluster, URT ApplyTopology will fail as URT is unable to make change to the transport node deployed GI service.

    URT ApplyTopology returns overall status of APPLY_TOPOLOGY_FAILED.

    Workaround: Remove the deployment of GI and rerun URT precheck and ApplyTopology.

  • Issue 2687948: LR does not work after a switchover from IP address to FQDN.

    The "Fetching LR status timed out" error is observed in the UI, and GM log replication stops.

    Workaround: Unset FQDN and restart all LR nodes on Active and Standby sites.

  • Issue 2603550: Some VMs are vMotioned and lose network connectivity during UA nodes upgrade.

    During the NSX UA nodes upgrade, some VMs may be migrated by DRS and lose network connectivity after the migration.

    Workaround: Change the DRS automation mode to manual before performing UA upgrade.

  • Issue 2622240: NVDS to CVDS Migration is triggered only for ESX upgrades that cross the 7.0.2 (X.Y.Z-U.P) release.

    Migration will not be triggered for any "U.P" (update-patch) upgrades. ESX version is specified as X.Y.Z-U.P where, X = Major, Y = Minor, Z = Maintenance, U = Update, P = Patch

    Workaround: NVDS to CVDS migration needs to be started using API/UI.
    POST https://{{nsxmanager-ip}}/api/v1/transport-nodes/{{transportnode-id}}?action=migrate_to_vds
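
    The trigger condition can be sketched as a version comparison. This is one interpretation of "cross the 7.0.2 release", namely that the upgrade starts below X.Y.Z = 7.0.2 and ends at or above it, with the U.P part ignored; the function names and the sample version strings are hypothetical.

```python
def parse_release(version):
    """Return (X, Y, Z) from a version string of the form 'X.Y.Z-U.P'."""
    release, _, _update_patch = version.partition("-")
    return tuple(int(part) for part in release.split("."))


def upgrade_triggers_migration(old_version, new_version, threshold=(7, 0, 2)):
    """True if the upgrade moves from below the threshold release to it or above.

    Update/patch-only upgrades leave X.Y.Z unchanged, so they never qualify.
    """
    return parse_release(old_version) < threshold <= parse_release(new_version)
```

    Under this reading, an upgrade from 7.0.1-0.0 to 7.0.2-0.0 qualifies, while an upgrade from 7.0.2-0.0 to 7.0.2-1.0 does not and needs the migration started through the API or UI.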

  • Issue 2702168: After upgrading from NSX-T 3.0 to NSX-T 3.1, you cannot make any changes to VRF LR.

    If TIER0_EVPN_TEP_IP was added in VRF LR redistribution rule, you are unable to make any changes to VRF LR. A validation error states that "TIER0_EVPN_TEP_IP" is not supported for VRF LR.

    Workaround:

    1. Use the policy API to remove TIER0_EVPN_TEP_IP in the VRF locale service.
    2. Change the name as needed.
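
    Step 1 can be sketched as a transformation of the locale service body before it is sent back through the policy API. The field names "route_redistribution_config", "redistribution_rules", and "route_redistribution_types" are assumptions about the locale service schema, and the helper name is hypothetical; verify them against the body returned by your environment.

```python
import copy


def strip_redistribution_type(locale_service, unwanted="TIER0_EVPN_TEP_IP"):
    """Return a copy of the locale service without the unwanted type."""
    cleaned = copy.deepcopy(locale_service)
    config = cleaned.get("route_redistribution_config") or {}
    for rule in config.get("redistribution_rules", []):
        if "route_redistribution_types" in rule:
            rule["route_redistribution_types"] = [
                t for t in rule["route_redistribution_types"] if t != unwanted
            ]
    return cleaned
```

    The input is left unmodified, so the original body can be kept for comparison before the cleaned copy is submitted.
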
  • Issue 2534089: When the IDS service is enabled on a transport node (host), virtual machine traffic on IDS-enabled hosts will stop flowing unexpectedly.

    When enabling the NSX IDS/IPS (in either detect-only or detect-and-prevent mode) on a vSphere cluster and applying IDS/IPS to workloads, the lockup condition can get triggered just by having the IDPS engine enabled. As a result, all traffic to and from all workloads on the hypervisor subject to IDS/IPS or Deep Packet Inspection Services (L7 App-ID) will be dropped. Traffic not subject to IDS/IPS or Deep Packet Inspection is not impacted and as soon as IDS/IPS is disabled or no longer applied to traffic, traffic flow is restored.

    Workaround: See VMware knowledge base article 82043.

  • Issue 2688584: Fetching LR sync status timed out because one LR node hit TransactionAbortedException and shut down its thread pool.

    You will not be able to switch over and LR will stop.

    Workaround: Restart all LR nodes on Active and Standby sites.

  • Issue 2692344: Deleting the Avi Enforcement point deletes all the realized objects from the policy, including the realized entities of all default objects. Adding a new enforcement point fails to re-sync the default objects from the Avi Controller.

    You will not be able to use the system-default objects after deletion and recreation of the Enforcement point of AVIConnectionInfo.

    Workaround: The enforcement point should not be deleted. If there are any changes it can be updated but should not be deleted.

  • Issue 2636420: Host will go to "NSX install skipped" state and cluster in "Failed" state post restore if "Remove NSX" is run on cluster post backup.

    "NSX Install Skipped" will be shown for host.

    Workaround: Following restore, run "Remove NSX" on the cluster again to return to the state that was present following backup (not configured state).

  • Issue 2679344: Logging in to NSX-T Manager node as an LDAP user in a scaled Active Directory configuration may take a long time or fail.

    Logging in takes a long time or times out and may fail.

    Workaround: See VMware knowledge base article 82331.

  • Issue 2711497: NSX Cloud Upgrade from an older version to NSX-T 3.1.1 may temporarily move the agented VMs to error state.

    You will lose access to the VMs and there could be application downtime until the PCG is upgraded.

    Workaround:

    1. Navigate to the CSM upgrade page.
    2. Wait for the upgrade page to load. (It may take up to 5 minutes to load.)
    3. Click Begin Upgrade. After the sync is done, vpc/instance listing will be shown.
    4. Click Next.
    5. Click Skip NSX Tools Upgrade.
    6. Select PCGs for Upgrade.
    7. Click Next.
    8. Restart upgrade coordinator on the CSM CLI: restart service install-upgrade
    9. Navigate to the CSM upgrade page. Wait for the upgrade page to load. (It may take up to 5 minutes to load.)
    10. Click Begin Upgrade. After the sync is done, vpc/instance listing will be shown.
    11. Select agents to upgrade.
    12. Click Next. Agents upgrade status will be shown.
  • Issue 2697537: There is up to a 4 minute delay in creating first logical switch after enabling lockdown mode.

    Creation of first logical switch is delayed by 4 minutes after enabling lockdown mode.

    Workaround: None.

Installation Known Issues
  • Issue 2562189: Transport node deletion goes on indefinitely when the NSX Manager is powered off during the deletion operation.

    If the NSX Managers are powered off while transport node deletion is in progress, the transport node deletion may go on indefinitely if there is no user intervention.

    Workaround: Once the Managers are back up, prepare the node again and start the deletion process again.

Upgrade Known Issues
  • Issue 2693576: Transport Node shows "NSX Install Failed" after KVM RHEL 7.9 upgrade to RHEL 8.2.

    After RHEL 7.9 upgrade to 8.2, dependencies nsx-opsagent and nsx-cli are missing. Host is marked as install failed. Resolving the failure from the UI doesn't work: Failed to install software on host. Unresolved dependencies: [PyYAML, python-mako, python-netaddr, python3]

    Workaround: Manually install the NSX RHEL 8.2 vibs after the host OS upgrade and resolve it from the UI.

  • Issue 2560981: On upgrade, vIDM config may not persist.

    If you use vIDM, you have to log in again after a successful upgrade and re-enable vIDM on the cluster.

    Workaround: Disable and enable vIDM config after upgrade.

  • Issue 2550492: During an upgrade, the message "The credentials were incorrect or the account specified has been locked" is displayed temporarily and the system recovers automatically.

    Transient error message during upgrade.

    Workaround: None.

NSX Edge Known Issues
  • Issue 2283559: https://<nsx-manager>/api/v1/routing-table and https://<nsx-manager>/api/v1/forwarding-table MP APIs return an error if the edge has 65k+ routes for RIB and 100k+ routes for FIB.

    If the edge has 65k+ routes for RIB and 100k+ routes for FIB, the request from MP to Edge takes more than 10 seconds and results in a timeout. This is a read-only API and has an impact only if you need to download the 65k+ routes for RIB and 100k+ routes for FIB using the API/UI.

    Workaround: There are two options to fetch the RIB/FIB.

    • These APIs support filtering options based on network prefixes or type of route. Use these options to download the routes of interest.
    • Use the CLI if the entire RIB/FIB table is needed; the CLI has no such timeout.
  • Issue 2521230: BFD status displayed under ‘get bgp neighbor summary’ may not reflect the latest BFD session status correctly.

    BGP and BFD can set up their sessions independently. As part of ‘get bgp neighbor summary’ BGP also displays the BFD state. If the BGP is down, it will not process any BFD notifications and will continue to show the last known state. This could lead to displaying stale state for the BFD.

    Workaround: Rely on the output of ‘get bfd-sessions’ and check the ‘State’ field to get the most up-to-date BFD status.

  • Issue 2641990: During Edge vMotion, there can be multicast traffic loss up to 30 seconds (default pim hello interval).

    When an edge is vMotioned and IGMP snooping is enabled on the TOR, the TOR needs to learn the new edge location. The TOR learns it when it receives any multicast control or data traffic from the edge. Multicast traffic is lost for up to 30 seconds on edge vMotion.

    Workaround: None. Traffic will recover upon TOR receiving multicast packets, or for faster recovery, disable/enable pim on the uplink interface connected to TOR.

Security Known Issues
  • Issue 2491800: AR channel SSL certificates are not periodically checked for their validity, which could lead to using an expired/revoked certificate for an existing connection.

    The connection would be using an expired/revoked SSL.

    Workaround: Restart the APH on the Manager node to trigger a reconnection.

  • Issue 2689449: Incorrect inventory may be seen if the Public Cloud Gateway (PCG) is rebooting.

    The managed state of managed instances is shown as unknown. Some inventory information, such as managed state, errors and quarantine status will not be available to the Cloud Service Manager.

    Workaround: Wait for PCG to be up, and either wait for periodic sync or trigger account sync.

Federation Known Issues
  • Issue 2630813: SRM recovery for compute VMs will lose all the NSX tags applied to VM and Segment ports.

    If an SRM recovery test or run is initiated, the replicated compute VMs in the disaster recovery location will not have any NSX tags applied.

  • Issue 2601493: Concurrent config onboarding is not supported on Global Manager in order to prevent heavy processing load.

    Although parallel config onboarding does not interfere with each other, multiple such config onboarding executions on GM would make GM slow and sluggish for other operations in general.

    Workaround: Security Admin / Users must sync up maintenance windows to avoid initiating config onboarding concurrently.

  • Issue 2613113: If onboarding is in progress, and restore of Local Manager is done, the status on Global Manager does not change from IN_PROGRESS.

    UI shows IN_PROGRESS in Global Manager for Local Manager onboarding. Unable to import the configuration of the restored site.

    Workaround: Use the Local Manager API to start the onboarding of the Local Manager site, if required.

  • Issue 2625009: Inter-SR iBGP sessions keep flapping, when intermediate routers or physical NICs have lower or equal MTU as the inter-SR port.

    This can impact inter-site connectivity in Federation topologies.

    Workaround: Keep the pNIC MTU and the intermediate routers' MTU larger than the global MTU (that is, the MTU used by the inter-SR port). Encapsulation makes the packets larger than the global MTU, so they do not get through otherwise.
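
    The condition in the workaround can be expressed as a simple check. This is a minimal sketch: the function name and the sample MTU values are hypothetical, and it only encodes the "strictly larger than the global MTU" rule stated above, not a specific encapsulation overhead.

```python
def path_mtu_is_sufficient(global_mtu, path_mtus):
    """True if every pNIC/router MTU on the path exceeds the global MTU."""
    return all(mtu > global_mtu for mtu in path_mtus)
```

    For instance, with a global MTU of 1500, a path with MTUs of 1700 and 9000 passes, while a path that includes a 1500-byte hop does not.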

  • Issue 2606452: Onboarding is blocked when trying to onboard via API.

    Onboarding API fails with the error message, "Default transport zone not found at site". 

    Workaround: Wait for fabric sync between Global Manager and Local Manager to complete.

  • Issue 2643749: Unable to nest group from custom region created on specific site into group that belongs to system created site specific region.

    You will not see the group created in site specific custom region while selecting child group as a member for the group in the system created region with the same location.

  • Issue 2649240: Deletion is slow when a large number of entities are deleted using individual delete APIs.

    It takes significant time to complete the deletion process.

    Workaround: Use hierarchical API to delete in bulk.
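
    A bulk deletion through the hierarchical API might be assembled as in the sketch below. The body shape (Infra, ChildDomain, ChildGroup, and the marked_for_delete flag) reflects a general understanding of the hierarchical API and should be treated as an assumption; groups are used only as an example entity type, and the helper name is hypothetical. Check the API reference for the entity types and endpoint that apply to your case.

```python
def build_bulk_group_delete(domain_id, group_ids):
    """Build one hierarchical body that marks several groups for deletion."""
    group_children = [
        {
            "resource_type": "ChildGroup",
            "marked_for_delete": True,  # delete instead of create/update
            "Group": {"resource_type": "Group", "id": group_id},
        }
        for group_id in group_ids
    ]
    return {
        "resource_type": "Infra",
        "children": [
            {
                "resource_type": "ChildDomain",
                "Domain": {
                    "resource_type": "Domain",
                    "id": domain_id,
                    "children": group_children,
                },
            }
        ],
    }
```

    Sending one such body replaces many individual delete calls with a single request.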

  • Issue 2649499: Firewall rule creation takes a long time when individual rules are created one after the other.

    Slow API takes more time to create rules.

    Workaround: Use Hierarchical API to create several rules.
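
    Several rules can be wrapped into one hierarchical request as sketched below. The nesting (Infra, ChildDomain, ChildSecurityPolicy, ChildRule) is an assumption based on the general shape of the hierarchical API, the helper name is hypothetical, and the rule fields are passed through unchanged from the caller.

```python
def build_bulk_rule_create(domain_id, policy_id, rules):
    """Wrap a list of rule dicts into a single hierarchical body."""
    rule_children = [
        {"resource_type": "ChildRule",
         "Rule": {"resource_type": "Rule", **rule}}
        for rule in rules
    ]
    policy_child = {
        "resource_type": "ChildSecurityPolicy",
        "SecurityPolicy": {
            "resource_type": "SecurityPolicy",
            "id": policy_id,
            "children": rule_children,
        },
    }
    return {
        "resource_type": "Infra",
        "children": [
            {
                "resource_type": "ChildDomain",
                "Domain": {
                    "resource_type": "Domain",
                    "id": domain_id,
                    "children": [policy_child],
                },
            }
        ],
    }
```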

  • Issue 2652418: Slow deletion when large number of entities are deleted.

    Deletion will be slower.

    Workaround: Use the hierarchical API for bulk deletion.

  • Issue 2655539: Host names are not updated on the Location Manager page of the Global Manager UI when updating the host names using the CLI.

    The old host name is shown.

    Workaround: None.

  • Issue 2658656: State for compute manager added to active Global Manager is not replicated on standby Global Manager.

    Compute manager added to active Global Manager is not visible on Global Manager when standby Global Manager becomes active. You will not be able to use the auto-deploy feature for deploying new Global Manager nodes if standby Global Manager becomes active in case of failover.

    Workaround: Add vCenter as compute manager on Global Manager that is active now.

  • Issue 2658687: Global Manager switchover API reports failure when transaction fails, but the failover happens.

    API fails, but Global Manager switchover completes.

    Workaround: None.

  • Issue 2630819: Changing LM certificates should not be done after LM register on GM.

    When Federation and PKS need to be used on the same LM, PKS tasks to create external VIP & change LM certificate should be done before registering the LM on GM. If done in the reverse order, communications between LM and GM will not be possible after change of LM certificates and LM has to be registered again.

  • Issue 2658092: Onboarding fails when NSX Intelligence is configured on Local Manager.

    Onboarding fails with a principal identity error, and you cannot onboard a system with a principal identity user.

    Workaround: Create a temporary principal identity user with the same principal identity name that is used by NSX Intelligence.

  • Issue 2622576: Failures due to duplicate configuration are not propagated correctly to user.

    While onboarding is in progress, you see an "Onboarding Failure" message.

    Workaround: Restore Local Manager and retry onboarding.

  • Issue 2679614: When the API certificate is replaced on the Local Manager, the Global Manager's UI will display the message, "General Error has occurred."

    When the API certificate is replaced on the Local Manager, the Global Manager's UI will display the message, "General Error has occurred."

    Workaround:

    1. Open the "Location Manager" of the Global Manager UI.
    2. Click the "ACTION" tab under the affected Local Manager and then enter the new thumbprint.
    3. If this does not work, off-board the Local Manager and then re-onboard the Local Manager.
  • Issue 2681092: You can switch from the active Global Manager to the stand-by Global Manager even when the certificate of the latter has expired.

    The expired certificate on the standby Global Manager continues to allow communication when it shouldn't.

    Workaround: Ensure that certificates have not expired. Alarms are raised when certificates are about to expire.
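
    In addition to the alarms, expiry can be checked from a script. The sketch below uses the Python standard library function ssl.cert_time_to_seconds, which parses the "notAfter" text format found in certificate details; the function name and the sample dates are hypothetical.

```python
import ssl
import time


def days_until_expiry(not_after, now=None):
    """Return the number of days until the certificate expires.

    not_after uses the OpenSSL text format, for example
    'Jan 31 00:00:00 2021 GMT'. A negative result means expired.
    """
    current = time.time() if now is None else now
    return (ssl.cert_time_to_seconds(not_after) - current) / 86400
```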

  • Issue 2697111: Unable to use "import CRL" functionality using Global Manager UI.

    When trying to Import CRL, the operation fails due to the wrong URL hit from the UI. You will not be able to use the Import CRL option on the Global Manager.

    Workaround: Use the API:

    PUT https://{{gm-ip}}/global-manager/api/v1/global-infra/crls/cr11
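
    The PUT call can be prepared from a script as sketched below. The sketch builds the request from the URL pattern shown above without sending it; the function name is hypothetical, the CRL identifier and the JSON body are supplied by the caller, and the body schema is not specified here, so consult the API reference for the required fields. Authentication is omitted.

```python
import json
import urllib.request


def build_crl_put_request(gm_ip, crl_id, body):
    """Build (but do not send) the PUT request that imports a CRL."""
    url = f"https://{gm_ip}/global-manager/api/v1/global-infra/crls/{crl_id}"
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )
```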

  • Issue 2680854: Second attempt for Config Onboarding for a site is failing after rollback is successful on Global Manager.

    The Config Onboarding status is stuck in "In progress" indefinitely. You cannot config onboard a site a second time after the first attempt ends in rollback.

    Workaround: Off-board and re-onboard the site. This operation will trigger clearing up site-specific config onboarding statuses on the Global Manager and help start fresh for config onboarding.

  • Issue 2663483: The single-node NSX Manager will disconnect from the rest of the NSX Federation environment if you replace the APH-AR certificate on that NSX Manager.

    This issue is seen only with NSX Federation and with the single node NSX Manager Cluster. The single-node NSX Manager will disconnect from the rest of the NSX Federation environment if you replace the APH-AR certificate on that NSX Manager.

    Workaround: A single-node NSX Manager cluster is not a supported deployment option, so deploy a three-node NSX Manager cluster.
