VMware NSX Container Plugin 2.5.1 | 19 December, 2019 | Build 15287458

Check regularly for additions and updates to this document.
What's in the Release Notes
The release notes cover the following topics:
What's New
- Kubernetes error CRD (CustomResourceDefinition) enhancements.
- Support for regular expressions when specifying the path in Ingress rules (if you want to capture groups in the path's regular expression and use them to rewrite the URI, NSX-T 2.5.1 or later is required).
- A new DaemonSet to uninstall NSX-CNI, NSX-OVS, and optionally installed OVS packages on Kubernetes nodes.
- Support for redirecting HTTP traffic to HTTPS on a per Ingress basis.
- Automatic removal of duplicate resources by the Network Policy Controller.
- Support for IPSet for load balancer VIPs.
- Support for custom labels in PAS.
- Support for LoadBalancer and NSXLoadBalancerMonitor CRDs for Ingress scaling.
- Support for allocating a desired IP address from an IP pool for L4 and L7 virtual servers.
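As an illustrative sketch of the regular-expression Ingress path support above: in NCP this behavior is enabled per Ingress through an annotation. The annotation name (ncp/use-regex), host name, service name, and path below are assumptions for illustration, not taken from these release notes; check the NCP configuration documentation for the exact annotation.

```yaml
# Hypothetical Ingress using a regular-expression path with the NSX-T load balancer.
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: example-ingress
  annotations:
    kubernetes.io/ingress.class: nsx
    ncp/use-regex: "true"   # assumed annotation; enables regex path matching
spec:
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /api/.*   # matched as a regular expression
            backend:
              serviceName: api-svc
              servicePort: 80
```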
Compatibility Requirements
Product | Version |
---|---|
NCP / NSX-T Tile for PAS | 2.5.1 |
NSX-T | 2.4.1, 2.4.2, 2.4.3, 2.5.0, 2.5.1, 2.5.2, 2.5.2.2 |
Kubernetes | 1.15, 1.16, 1.17 |
OpenShift | 3.10, 3.11 |
Kubernetes Host VM OS | Ubuntu 16.04, Ubuntu 18.04, CentOS 7.5, CentOS 7.6, CentOS 7.7 |
OpenShift Host VM OS | RHEL 7.5, RHEL 7.6, RHEL 7.7 |
OpenShift BMC | RHEL 7.5, RHEL 7.6 |
PAS (PCF) | Ops Manager 2.6 + PAS 2.6; Ops Manager 2.7 + PAS 2.7; Ops Manager 2.8 + PAS 2.8; Ops Manager 2.10 + PAS 2.10. Note: PAS 2.7.0 + NCP 2.5.1 is not supported. |
Note: For RHEL and CentOS host VMs, only kernel versions 3.10.0-862.x and 3.10.0-957.x are supported.
If the RHEL nodes have a kernel version lower than 3.10.0-957.27.2, OpenShift installation will fail. Upgrading the kernel version is not recommended on a bare-metal container node because OVS will fail to run. To deploy OpenShift 3.11 with a lower kernel version, the openshift-ansible repository should use commit e0499023ea91741ab4afd29391e420a26b8859b5 as the top commit.
Support for upgrading to this release:
- NCP 2.5 and all NCP 2.4.x releases
Resolved Issues
- Issue 2397438: After a restart, NCP reports MultipleObjects error
Before the restart, NCP failed to create distributed firewall sections because of a ServerOverload error, and retried until the maximum number of attempts was reached. However, the firewall sections were actually created on NSX-T. After the restart, NCP found the duplicate firewall sections and reported the MultipleObjects error.
Workaround: Manually delete the stale and duplicate distributed firewall sections and then restart NCP.
- Issue 2397684: NCP found the correct transport zone but then failed with the error "Default transport-zone is not configured"
When you create Kubernetes namespaces with policy API-based NCP, infra segment creation might fail if NSX-T has multiple overlay transport zones and none of them is marked as the default.
Workaround: Update the overlay transport zone that is configured in the NCP ConfigMap and set its "is_default" field to "True".
- Issue 2412421: Docker fails to restart a container
If (1) ConfigMap is updated, (2) the container uses subPath to mount the ConfigMap, and (3) the container is restarted, then Docker fails to start the container.
Workaround: Delete the Pod so that the DaemonSet will re-create the Pod.
- Issue 2410909: After a restart, NCP may take a long time (up to about half an hour) to initialize its cache in a large-scale environment, especially if there are many network policies
After a restart, NCP can take a long time to come up. The processing of resources such as pods, namespaces and network policies might take an additional amount of time depending on the quantity of resources involved.
Workaround: None
- Issue 2423240: The nsx-ncp-bootstrap container fails if any IP route has a link-down status
The nsx-ncp-bootstrap container assumes that all IP routes have a link status of "up" and will fail if that is not the case.
- Issue 2425050: The nsx-ncp-bootstrap container fails to compile the OVS package on Linux version 4.15.0-59-generic or later
The compilation fails because of a missing header file during the compilation of the OVS kernel module.
- Issue 2398430: Connectivity to a node is lost after a restart
If OVS is configured to run on a node at startup and the node is restarted when the NSX node agent DaemonSet is running and IP is persisted on the ovs_uplink_port, then the connectivity to the node will be lost.
Known Issues
- Issue 2131494: NGINX Kubernetes Ingress still works after changing the Ingress class from nginx to nsx
When you create an NGINX Kubernetes Ingress, NGINX creates traffic forwarding rules. If you change the Ingress class to any other value, NGINX does not delete the rules and continues to apply them, even if you delete the Kubernetes Ingress after changing the class. This is a limitation of NGINX.
Workaround: To delete the rules created by NGINX, delete the Kubernetes Ingress while the class value is nginx. Then re-create the Kubernetes Ingress.
- For a Kubernetes service of type ClusterIP, Client-IP based session affinity is not supported
NCP does not support Client-IP based session affinity for a Kubernetes service of type ClusterIP.
Workaround: None
- For a Kubernetes service of type ClusterIP, the hairpin-mode flag is not supported
NCP does not support the hairpin-mode flag for a Kubernetes service of type ClusterIP.
Workaround: None
- Issue 2192489: After disabling 'BOSH DNS server' in the PAS director config, the BOSH DNS server (169.254.0.2) still appears in the container's resolv.conf file.
In a PAS environment running PAS 2.2, after you disable 'BOSH DNS server' in the PAS director config, the BOSH DNS server (169.254.0.2) still appears in the container's resolv.conf file. This causes a ping command with a fully qualified domain name to take a long time. This issue does not exist with PAS 2.1.
Workaround: None. This is a PAS issue.
- Issue 2224218: After a service or app is deleted, it takes 2 minutes to release the SNAT IP back to the IP pool
If you delete a service or app and recreate it within 2 minutes, it will get a new SNAT IP from the IP pool.
Workaround: After deleting a service or app, wait 2 minutes before recreating it if you want to reuse the same IP.
- Issue 2330811: When creating Kubernetes services of type LoadBalancer while NCP is down, the services might not get created when NCP is restarted
When NSX-T resources are exhausted for Kubernetes services of type LoadBalancer, you can create new services after deleting some of the existing services. However, if you delete and create the services while NCP is down, NCP will fail to create the new services.
Workaround: When NSX-T resources are exhausted for Kubernetes services of type LoadBalancer, do not perform both the delete and the create operations while NCP is down.
- Issue 2370137: The nsx-ovs and nsx-node-agent containers fail to run because the OVS database files are not in /etc/openvswitch
When the nsx-ovs and nsx-node-agent containers start, they look for the OVS database files in /etc/openvswitch. If there are symlinks in the directory that link to the actual OVS files (for example, conf.db), the nsx-ovs and nsx-node-agent containers will not run.
Workaround: Move the OVS database files to /etc/openvswitch and remove the symlinks.
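The workaround above can be sketched as a shell snippet. The symlink target path is hypothetical and environment-specific; this is a minimal illustration, assuming conf.db is the only symlinked database file:

```shell
# Replace a conf.db symlink in /etc/openvswitch with the real database file,
# so the nsx-ovs and nsx-node-agent containers find it where they expect it.
DB=/etc/openvswitch/conf.db
if [ -L "$DB" ]; then
  real="$(readlink -f "$DB")"   # resolve the symlink to the actual file
  rm "$DB"                      # remove the symlink
  mv "$real" "$DB"              # move the real file into place
fi
```

Repeat for any other symlinked OVS database files in the directory.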
- Issue 2404302: If multiple load balancer application profiles for the same resource type (for example, HTTP) exist on NSX-T, NCP will choose any one of them to attach to the Virtual Servers.
If multiple HTTP load balancer application profiles exist on NSX-T, NCP will choose any one of them with the appropriate x_forwarded_for configuration to attach to the HTTP and HTTPS Virtual Server. If multiple FastTCP and UDP application profiles exist on NSX-T, NCP will choose any one of them to attach to the TCP and UDP Virtual Servers, respectively. The load balancer application profiles might have been created by different applications with different settings. If NCP chooses to attach one of these load balancer application profiles to the NCP-created Virtual Servers, it might break the workflow of other applications.
Workaround: None
- Issue 2397621: OpenShift installation fails
OpenShift installation expects a node's status to be Ready, which is possible only after the CNI plugin is installed. In this release there is no separate CNI plugin file, causing OpenShift installation to fail.
Workaround: Create the /etc/cni/net.d directory on each node before starting the installation.
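The workaround amounts to pre-creating the CNI configuration directory that kubelet checks when deciding whether the node is Ready:

```shell
# Run on each node as root before starting the OpenShift installation.
mkdir -p /etc/cni/net.d
```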
- Issue 2408100: In a large Kubernetes cluster with multiple NCP instances in active-standby mode or liveness probe enabled, NCP frequently restarts
In a large Kubernetes cluster (about 25,000 pods, 2,500 namespaces and 2,500 network policies), if multiple NCP instances are running in active-standby mode, or if liveness probe is enabled, NCP processes might be killed and restarted frequently due to "Acquiring lock conflicted" or liveness probe failure.
Workaround: Perform the following steps:
- Set "replicas" of the NCP deployment to 1, or increase the configuration option "ha.master_timeout" in ncp.ini from the default value 18 to 30.
- Increase the liveness probe arguments as follows:

```
containers:
  - name: nsx-ncp
    livenessProbe:
      exec:
        command:
          - /bin/sh
          - -c
          - timeout 20 check_pod_liveness nsx-ncp
      initialDelaySeconds: 20
      timeoutSeconds: 20
      periodSeconds: 20
      failureThreshold: 5
```
- Issue 2413383: OpenShift upgrade fails because not all nodes are ready
By default the NCP bootstrap pod is not scheduled on the master node. As a result, the master node status is always Not Ready.
Workaround: Assign the "compute" role to the master node to allow the nsx-ncp-bootstrap and nsx-node-agent DaemonSets to create pods on it. The node status will change to "Ready" once nsx-ncp-bootstrap installs NSX-CNI.
- Issue 2451442: After repeatedly restarting NCP and recreating a namespace, NCP might fail to allocate IP addresses to Pods
If you repeatedly delete and recreate the same namespace while restarting NCP, NCP might fail to allocate IP addresses to Pods in that namespace.
Workaround: Delete all stale NSX resources (logical routers, logical switches, and logical ports) associated with the namespace, and then recreate them.
- Issue 2447127: When upgrading NCP from 2.4.1 to 2.5.0 or 2.5.1, it might take NCP extra time to be up and running
During the upgrade of NCP from 2.4.1 to 2.5.x, NSX-T 2.4.1 might have an issue of slow response when NCP calls the switching profile API for leader election. This causes NCP to take several extra minutes to be up and running.
Workaround: None.
- Issue 2460219: HTTP redirect does not work without a default server pool
If the HTTP virtual server is not bound to a server pool, HTTP redirect fails. This issue occurs in NSX-T 2.5.0 and earlier releases.
Workaround: Create a default server pool or upgrade to NSX-T 2.5.1.
- Issue 2518312: NCP bootstrap container fails to install nsx-ovs kernel module on Ubuntu 18.04.4, kernel 4.15.0-88
The NCP bootstrap container (nsx-ncp-bootstrap) fails to install nsx-ovs kernel module on Ubuntu 18.04.4, kernel 4.15.0-88.
Workaround: Downgrade the kernel version to 4.15.0-76 so that NSX-OVS can be installed.
- Issue 2520402: Hyperbus vmk50 is missing if the N-VDS host switch is in ENS mode
Hyperbus vmk50 is not created on the ESXi host if the N-VDS host switch is in ENS mode.
Workaround: Use standard mode for the N-VDS host switch.