VMware NSX Container Plugin 3.0.2 | 10 September, 2020 | Build 16863080

Check regularly for additions and updates to this document.

What's in the Release Notes

The release notes cover the following topics:

What's New
Compatibility Requirements
Resolved Issues
Known Issues

What's New

NSX Container Plugin 3.0.2 has the following new features:

Support for OpenShift 4.4 (with NSX-T Data Center 3.0.1.1 or later)
Support for JWT client authentication for Ingress (with NSX-T Data Center 3.0.0 or later)
Support for Kubernetes service of type LoadBalancer without selector
Ability to specify a SNAT IP per namespace and per service via annotation
New annotation to enable logging traffic for distributed firewall rules
Support for Photon OS
Support for pod security policy
Support for the pathType attribute for Ingress

NSX Container Plugin 3.0.2.2 is a patch release. It resolves an OVS kernel module compilation issue on Pivotal Stemcells 621.85 and above and 456.121 and above.

Compatibility Requirements

Product	Version
NCP/NSX-T Tile for Tanzu Application Service (PCF)	3.0.2
NSX-T	2.5.2, 2.5.2.1, 2.5.2.2, 2.5.3, 3.0.0, 3.0.1, 3.0.2, 3.0.3
vSphere	6.7, 7.0
Kubernetes	1.17, 1.18
OpenShift 3	3.11
OpenShift 4	RHCOS 4.3, 4.4
Kubernetes Host VM OS	Ubuntu 16.04, Ubuntu 18.04, CentOS 7.7, CentOS 7.8, CentOS 8.1, RHEL 7.8, RHEL 8.1. Note: On RHEL/CentOS 7.8 and 8.1, nsx-ovs is not supported; NCP is compatible only with the upstream OVS.
OpenShift Host VM OS	RHEL 7.7, RHEL 7.8
OpenShift BMC (Deprecated in this release)
Tanzu Application Service (Pivotal Cloud Foundry)	Ops Manager 2.8 + PAS 2.8, Ops Manager 2.9 + PAS 2.9, Ops Manager 2.10 + PAS 2.10

Support for upgrading to this release:

All previous 3.0.x releases and all NCP 2.5.x releases

Resolved Issues

Issue 2518312: NCP bootstrap container fails to install nsx-ovs kernel module on Ubuntu 18.04.4, kernel 4.15.0-88
The NCP bootstrap container (nsx-ncp-bootstrap) fails to install nsx-ovs kernel module on Ubuntu 18.04.4, kernel 4.15.0-88.

Workaround: Do not install NSX-OVS on this kernel. Set use_nsx_ovs_kernel_module = False in nsx-node-agent-config and use the upstream OVS kernel module on the host instead (Ubuntu ships with an OVS kernel module by default). If there is no OVS kernel module on the host, either install one manually and set use_nsx_ovs_kernel_module = False in nsx-node-agent-config, or downgrade the kernel to 4.15.0-76 so that NSX-OVS can be installed.
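The setting named above can be applied to an INI-style configuration text programmatically. The following is a minimal sketch using Python's standard configparser module. The section name nsx_node_agent and the sample text are assumptions made for illustration, so verify them against your own nsx-node-agent-config before relying on it.

```python
import configparser
import io


def disable_nsx_ovs_kernel_module(ini_text: str, section: str = "nsx_node_agent") -> str:
    """Return ini_text with use_nsx_ovs_kernel_module set to False.

    The default section name is an assumption, not taken from the release notes.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names exactly as written
    parser.read_string(ini_text)
    if not parser.has_section(section):
        parser.add_section(section)
    parser.set(section, "use_nsx_ovs_kernel_module", "False")
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


# Hypothetical sample input, for illustration only.
sample = "[nsx_node_agent]\nuse_nsx_ovs_kernel_module = True\n"
updated = disable_nsx_ovs_kernel_module(sample)
print(updated)
```

Note that configparser discards comments when it rewrites the text, so for a heavily commented config file a manual edit may be preferable.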
Issue 2548815: In an NCP cluster imported from Manager to Policy, NCP fails to delete an automatically scaled tier-1 router
An automatically scaled tier-1 router cannot be deleted by NCP running in Policy mode after Manager to Policy import because it is still being referenced by its LocaleService.

Workaround: Manually delete the tier-1 router using the NSX Manager UI.
Issue 2549433: OpenShift node using a single interface configured as the ovs_uplink_port loses name server information when DHCP lease expires
An OpenShift node with a single interface, which is configured as the ovs_uplink_port in the nsx-node-agent config, loses name server information when the DHCP lease of the ovs_uplink_port expires.

Workaround: Use a static IP address.
Issue 2550625: After migrating a cluster from Manager to Policy, the IP addresses in a shared IP pool are not released
After a cluster is migrated from Manager to Policy, deleting a namespace does not release the IP addresses that were allocated to that namespace.

Workaround: None.
Issue 2549765: Importing Manager objects to Policy fails if there is a NAT rule with multiple destination ports
The Manager to Policy import process will fail if there is a NAT rule with multiple destination ports on the top tier router. One such scenario is when the Kubernetes parameter ingress_mode is 'nat' in NCP config and there exists a pod with the annotation 'ncp/ingress-controller' in Kubernetes.

Workaround: While NCP is not running and before initiating the import, edit the NAT Rule and remove the '80' and '443' destination ports.
Issue 2550474: In an OpenShift environment, changing an HTTPS route to an HTTP can cause the HTTP route to not work as expected
If you edit an HTTPS route and delete the TLS-related data to convert it to an HTTP route, the HTTP route might not work as expected.

Workaround: Delete the HTTPS route and create a new HTTP route.
NCP fails at startup when no container IP block is configured
For completely routed deployments with only “no SNAT IP blocks” configured but no "container IP block" configured, NCP 3.0.1 fails at startup with “IndexError: list index out of range”.

Known Issues

Issue 2131494: NGINX Kubernetes Ingress still works after changing the Ingress class from nginx to nsx
When you create an NGINX Kubernetes Ingress, NGINX creates traffic forwarding rules. If you change the Ingress class to any other value, NGINX does not delete the rules and continues to apply them, even if you delete the Kubernetes Ingress after changing the class. This is a limitation of NGINX.

Workaround: To delete the rules created by NGINX, delete the Kubernetes Ingress while the class value is nginx. Then re-create the Kubernetes Ingress.
For a Kubernetes service of type ClusterIP, Client-IP based session affinity is not supported
NCP does not support Client-IP based session affinity for a Kubernetes service of type ClusterIP.

Workaround: None
For a Kubernetes service of type ClusterIP, the hairpin-mode flag is not supported
NCP does not support the hairpin-mode flag for a Kubernetes service of type ClusterIP.

Workaround: None
Issue 2192489: After disabling 'BOSH DNS server' in PAS director config, the BOSH DNS server (169.254.0.2) still appears in the container's resolv.conf file.
In a PAS environment running PAS 2.2, after you disable 'BOSH DNS server' in PAS director config, the BOSH DNS server (169.254.0.2) still appears in the container's resolv.conf file. This causes a ping command with a fully qualified domain name to take a long time. This issue does not exist with PAS 2.1.

Workaround: None. This is a PAS issue.
Issue 2224218: After a service or app is deleted, it takes 2 minutes to release the SNAT IP back to the IP pool
If you delete a service or app and recreate it within 2 minutes, it will get a new SNAT IP from the IP pool.

Workaround: After deleting a service or app, wait 2 minutes before recreating it if you want to reuse the same IP.
Issue 2404302: If multiple load balancer application profiles for the same resource type (for example, HTTP) exist on NSX-T, NCP will choose any one of them to attach to the Virtual Servers.
If multiple HTTP load balancer application profiles exist on NSX-T, NCP will choose any one of them with the appropriate x_forwarded_for configuration to attach to the HTTP and HTTPS Virtual Server. If multiple FastTCP and UDP application profiles exist on NSX-T, NCP will choose any one of them to attach to the TCP and UDP Virtual Servers, respectively. The load balancer application profiles might have been created by different applications with different settings. If NCP chooses to attach one of these load balancer application profiles to the NCP-created Virtual Servers, it might break the workflow of other applications.

Workaround: None
Issue 2397621: OpenShift 3 installation fails
OpenShift 3 installation expects each node's status to be Ready, which is possible only after the CNI plugin is installed. In this release there is no separate CNI plugin file, causing OpenShift installation to fail.

Workaround: Create the /etc/cni/net.d directory on each node before starting the installation.
Issue 2413383: OpenShift 3 upgrade fails because not all nodes are ready
By default the NCP bootstrap pod is not scheduled on the master node. As a result, the master node status is always Not Ready.

Workaround: Assign the role "compute" to the master node to allow the nsx-ncp-bootstrap and nsx-node-agent DaemonSets to create pods. The node status will change to "Ready" once nsx-ncp-bootstrap installs the NSX-CNI.
Issue 2460219: HTTP redirect does not work without a default server pool
If the HTTP virtual server is not bound to a server pool, HTTP redirect fails. This issue occurs in NSX-T 2.5.0 and earlier releases.

Workaround: Create a default server pool or upgrade to NSX-T 2.5.1.
Issue 2518111: NCP fails to delete NSX-T resources that have been updated from NSX-T
NCP creates NSX-T resources based on the configurations that you specify. If you make any updates to those NSX-T resources through NSX Manager or the NSX-T API, NCP might fail to delete those resources and re-create them when it is necessary to do so.

Workaround: Do not update NSX-T resources created by NCP through NSX Manager or the NSX-T API.
Issue 2524778: NSX Manager shows NCP as down or unhealthy after the NCP master node is deleted
After an NCP master node is deleted, for example, after a successful switch-over to a backup node, the health status of NCP is still shown as down when it should be up.

Workaround: Use the Manager API DELETE /api/v1/systemhealth/container-cluster/<cluster-id>/ncp/status to clear the stale status manually.
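As an illustration of the call above, the sketch below builds (but does not send) the DELETE request with Python's standard library. The host name and cluster ID are placeholders, and authentication and TLS settings are deliberately left out because they depend on your environment.

```python
import urllib.parse
import urllib.request


def build_clear_ncp_status_request(manager_host: str, cluster_id: str) -> urllib.request.Request:
    """Build the DELETE request that clears the stale NCP status for a cluster."""
    quoted_id = urllib.parse.quote(cluster_id, safe="")
    url = (
        f"https://{manager_host}"
        f"/api/v1/systemhealth/container-cluster/{quoted_id}/ncp/status"
    )
    return urllib.request.Request(url, method="DELETE")


# Placeholder values; replace with your NSX Manager address and cluster ID.
req = build_clear_ncp_status_request("nsx-mgr.example.com", "my-cluster-id")
print(req.get_method(), req.full_url)
```

To send the request, add credentials (for example, an Authorization header) and pass it to urllib.request.urlopen with a suitable TLS context.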
Issue 2416376: NCP fails to process a PAS ASG (App Security Group) that binds to more than 128 Spaces
Because of a limit in NSX-T distributed firewall, NCP cannot process a PAS ASG that binds to more than 128 Spaces.

Workaround: Create multiple ASGs and bind each of them to no more than 128 Spaces.
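The split described in the workaround amounts to chunking the list of Spaces into groups of at most 128. A minimal sketch follows; the space names are made up, and creating and binding the ASGs themselves is left to your usual tooling.

```python
from typing import Iterator, List, Sequence

MAX_SPACES_PER_ASG = 128  # limit stated in the issue description


def split_spaces(space_ids: Sequence[str], limit: int = MAX_SPACES_PER_ASG) -> Iterator[List[str]]:
    """Yield consecutive groups of at most `limit` Spaces, one group per ASG."""
    for start in range(0, len(space_ids), limit):
        yield list(space_ids[start:start + limit])


# Hypothetical input: 300 Spaces that need the same security rules.
spaces = [f"space-{i}" for i in range(300)]
groups = list(split_spaces(spaces))
print([len(g) for g in groups])
```

Each resulting group would be bound to its own ASG carrying the same rules.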
Issue 2534726: If upgrading to NCP 3.0.1 via NSX-T Tile fails, using the BOSH command line to redo the upgrade causes performance problems
When upgrading to NCP 3.0.1 via NSX-T Tile on OpsMgr, the upgrade process will mark HA switching profiles in NSX Manager used by NCP as inactive. The switching profiles will be deleted when NCP restarts. If the upgrade fails and you use a BOSH command such as “bosh deploy -d <deployment-id> -n <deployment>.yml” to redo the upgrade, the HA switching profiles will not be deleted. NCP will still run but performance will be degraded.

Workaround: Always upgrade NCP via OpsMgr and not the BOSH command line.
Issue 2537221: After upgrading NSX-T to 3.0, the networking status of container-related objects in the NSX Manager UI is shown as Unknown
In NSX Manager UI, the tab Inventory > Containers shows container-related objects and their status. In a PKS environment, after upgrading NSX-T to 3.0, the networking status of the container-related objects is shown as Unknown. The issue is caused by the fact that PKS does not detect the version change of NSX-T. This issue does not occur if NCP is running as a pod and the liveness probe is active.

Workaround: After the NSX-T upgrade, restart the NCP instances gradually (no more than 10 at the same time) so as not to overload NSX Manager.
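A gradual restart of this kind can be scripted as batches of at most 10 instances. The sketch below assumes a caller-supplied restart function (hypothetical; how an NCP instance is restarted depends on the deployment). The pause between batches is also an assumption, since the release notes give only the batch limit.

```python
import time
from typing import Callable, Sequence


def restart_in_batches(
    instances: Sequence[str],
    restart: Callable[[str], None],
    batch_size: int = 10,
    pause_seconds: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Restart instances in groups of at most batch_size; return the batch count."""
    batches = 0
    for start in range(0, len(instances), batch_size):
        if batches:
            sleep(pause_seconds)  # let NSX Manager settle before the next batch
        for instance in instances[start:start + batch_size]:
            restart(instance)
        batches += 1
    return batches


# Dry run: record what would happen instead of restarting or sleeping.
restarted = []
pauses = []
count = restart_in_batches(
    [f"ncp-{i}" for i in range(25)], restarted.append, sleep=pauses.append
)
print(count, len(restarted), pauses)
```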
Issue 2552918: Rollback for Manager to Policy import is unsuccessful for distributed firewall which causes cluster rollback to fail
On rare occasions, the Manager to Policy import process must perform a rollback, which is unsuccessful for distributed firewall sections and rules. This causes the cluster rollback to fail and leaves stale resources in NSX Manager.

Workaround: Use the backup and restore feature to restore the NSX Manager to a healthy state.
Issue 2552573: In an OpenShift 4.3 environment, cluster installation might fail if DHCP is configured using Policy UI
In an OpenShift 4.3 environment, cluster installation requires that a DHCP server is available to provide IP addresses and DNS information. If you use the DHCP server that is configured in NSX-T using the Policy UI, the cluster installation might fail.

Workaround: Configure a DHCP server using the Manager UI, delete the cluster that failed to install and recreate the cluster.
Issue 2552564: In an OpenShift 4.3 environment, DNS forwarder might stop working if an overlapping address is found
In an OpenShift 4.3 environment, cluster installation requires that a DNS server be configured. If you use NSX-T to configure a DNS forwarder and there is IP address overlap with the DNS service, the DNS forwarder will stop working and cluster installation will fail.

Workaround: Configure an external DNS service, delete the cluster that failed to install and recreate the cluster.
Issue 2483242: IPv6 traffic from containers is blocked by NSX-T SpoofGuard
With SpoofGuard enabled, IPv6 link-local addresses are not automatically whitelisted.

Workaround: Disable SpoofGuard by setting nsx_v3.enable_spoofguard = False in the NCP configuration.
Issue 2552609: Incorrect X-Forwarded-For (XFF) and X-Forwarded-Port data
If you configure XFF with either INSERT or REPLACE for HTTPS Ingress rules (Kubernetes) or HTTPS routes (OpenShift), you might see incorrect X-Forwarded-For and X-Forwarded-Port values in XFF headers.

Workaround: None.
Issue 2555336: Pod traffic not working due to duplicate logical ports created in Manager mode
This issue is more likely to occur when there are many pods in several clusters. When you create a pod, traffic to the pod does not work. NSX-T shows multiple logical ports created for the same container. In the NCP log only the ID of one of the logical ports can be found.

Workaround: Delete the pod and recreate it. The stale ports on NSX-T will be removed when NCP restarts.
Issue 2554357: Load balancer auto scaling does not work for IPv6
In an IPv6 environment, a Kubernetes service of type LoadBalancer will not be active when the existing load balancer scale is reached.

Workaround: Set nsx_v3.lb_segment_subnet = FE80::/10 in /var/vcap/jobs/ncp/config/ncp.ini for PKS deployments and in nsx-ncp-configmap for others. Then restart NCP.
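The subnet value in this workaround is the IPv6 link-local prefix. As a quick sanity check before editing the configuration, Python's standard ipaddress module can confirm that a candidate value parses as a valid network and is link-local. The helper name below is made up for illustration.

```python
import ipaddress


def is_link_local_subnet(value: str) -> bool:
    """Return True if value parses as an IP network that is entirely link-local."""
    try:
        network = ipaddress.ip_network(value)
    except ValueError:
        return False
    return network.is_link_local


subnet = ipaddress.ip_network("FE80::/10")
print(subnet, is_link_local_subnet("FE80::/10"))
```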
Issue 2597423: When importing Manager objects to Policy, a rollback will cause the tags of some resources to be lost
When importing Manager objects to Policy, if a rollback is necessary, the tags of the following objects will not be restored:
- SpoofGuard profiles (part of shared and cluster resources)
- BgpneighbourConfig (part of shared resources)
- BgpRoutingConfig (part of shared resources)
- StaticRoute BfdPeer (part of shared resources)
Workaround: For resources that are part of the shared resources, manually restore the tags. Use the backup and restore feature to restore resources that are part of cluster resources.
Issue 2579968: When changes are made to Kubernetes services of type LoadBalancer at a high frequency, some virtual servers and server pools are not deleted as expected
When changes are made to Kubernetes services of type LoadBalancer at a high frequency, some virtual servers and server pools might remain in the NSX-T environment when they should be deleted.

Workaround: Restart NCP. Alternatively, manually remove stale virtual servers and their associated resources. A virtual server is stale if no Kubernetes service of type LoadBalancer has the virtual server's identifier in the external_id tag.
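The staleness rule above can be expressed as a simple filter. The sketch below works on plain dictionaries and makes several assumptions that you should check against your environment: each virtual server carries NSX-style tags of the form {"scope": ..., "tag": ...}, the relevant tag scope is literally external_id, and its value identifies the owning Kubernetes service. It only reports candidates and deletes nothing.

```python
from typing import Dict, Iterable, List, Set


def find_stale_virtual_servers(
    virtual_servers: Iterable[Dict],
    lb_service_ids: Set[str],
    scope: str = "external_id",  # assumed tag scope name
) -> List[str]:
    """Return IDs of virtual servers whose external_id tag matches no LoadBalancer service."""
    stale = []
    for vs in virtual_servers:
        external_ids = {
            t.get("tag") for t in vs.get("tags", []) if t.get("scope") == scope
        }
        if not external_ids & lb_service_ids:
            stale.append(vs["id"])
    return stale


# Made-up sample data for illustration.
servers = [
    {"id": "vs-1", "tags": [{"scope": "external_id", "tag": "svc-a"}]},
    {"id": "vs-2", "tags": [{"scope": "external_id", "tag": "svc-gone"}]},
    {"id": "vs-3", "tags": []},
]
stale_ids = find_stale_virtual_servers(servers, {"svc-a", "svc-b"})
print(stale_ids)
```

Review the reported list manually before removing anything, since virtual servers not created by NCP would also lack a matching tag.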
Issue 2536383: After upgrading NSX-T to 3.0 or later, the NSX-T UI does not show NCP-related information correctly
After upgrading NSX-T to 3.0 or later, the Inventory > Containers tab in the NSX-T UI shows the networking status of container-related objects as Unknown. Also, NCP clusters do not appear in the System > Fabric > Nodes > NCP Clusters tab. This issue is typically seen in a PKS environment.

Workaround: After the NSX-T upgrade, restart the NCP instances gradually (no more than 10 at the same time).
Issue 2622099: Kubernetes service of type LoadBalancer initialization fails with error code NCP00113 and error message "The object was modified by somebody else"
In a single-tier deployment with policy API, if you use an existing tier-1 gateway as the top tier gateway and the pool allocation size of the gateway is ROUTING, a Kubernetes service of type LoadBalancer might fail to initialize with the error code NCP00113 and error message "The object was modified by somebody else. Please retry."

Workaround: When the problem appears, wait 5 minutes. Then restart NCP. The problem will be resolved.
Issue 2633679: NCP operator does not support OpenShift nodes attached to a tier-1 segment created using API /policy/api/v1/infra/tier-1s/<tier1-id>/segments/<segment-id>
NCP operator does not support OpenShift nodes attached to a tier-1 segment created using API /policy/api/v1/infra/tier-1s/<tier1-id>/segments/<segment-id>.

Workaround: Use API /policy/api/v1/infra/segments/<segment-id> to create the segment.
NCP fails to start when "logging to file" is enabled during Kubernetes installation
This issue happens when uid:gid=1000:1000 on the container host does not have permission to access the log folder.

Workaround: Do one of the following:
- Change the mode of the log folder to 777 on the container hosts.
- Grant “rwx” permission of the log folder to uid:gid=1000:1000 on the container hosts.
- Disable the “logging to file” feature.
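To check whether a log folder already satisfies the first two options, the classic owner/group/other permission bits can be evaluated as sketched below. This is a simplified check that ignores ACLs, supplementary groups, and root privileges; the function name is made up for illustration.

```python
import stat


def has_rwx(mode: int, owner_uid: int, owner_gid: int, uid: int = 1000, gid: int = 1000) -> bool:
    """Return True if uid:gid gets read, write, and execute from the mode bits."""
    if uid == owner_uid:
        bits = stat.S_IRWXU  # owner permissions apply
    elif gid == owner_gid:
        bits = stat.S_IRWXG  # group permissions apply
    else:
        bits = stat.S_IRWXO  # "other" permissions apply
    return (mode & bits) == bits


# A root-owned folder with mode 777 versus the more common 755.
print(has_rwx(0o777, 0, 0), has_rwx(0o755, 0, 0))
```

On a real host, the inputs can be taken from os.stat(path) as st_mode, st_uid, and st_gid.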
Issue 3033821: After manager-to-policy migration, distributed firewall rules not enforced correctly
After a manager-to-policy migration, newly created network policy-related distributed firewall (DFW) rules will have higher priority than the migrated DFW rules.

Workaround: Use the policy API to change the sequence of DFW rules as needed.
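When resequencing, it can help to compute the intended numbering first and apply it afterward through the policy API. The sketch below only produces a plan from a desired order. It assumes that a lower sequence number is evaluated first, and the rule IDs and spacing are hypothetical.

```python
from typing import Dict, Sequence


def plan_sequence_numbers(ordered_ids: Sequence[str], start: int = 10, step: int = 10) -> Dict[str, int]:
    """Map each ID to a sequence number that preserves the given order.

    Gaps of `step` leave room to insert objects later without renumbering.
    """
    return {rule_id: start + index * step for index, rule_id in enumerate(ordered_ids)}


# Hypothetical IDs: migrated rules listed ahead of newly created ones.
desired_order = ["migrated-rule-1", "migrated-rule-2", "new-np-rule-1"]
plan = plan_sequence_numbers(desired_order)
print(plan)
```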