VMware NSX Container Plugin 4.1.2 | 19 October 2023 | Build 22596735

Check for additions and updates to these release notes.

What's New

  • Support for creating new clusters in TAS in Manager mode. Note that this feature will not be supported in the next release.

  • Support for migrating TAS foundations from Manager mode to Policy mode.

  • A new runbook to generate a debugging report for NCP configuration steps.

Deprecation Notice

  • NAT support for third-party Ingress controllers is deprecated and will be removed in a future release.

    This feature is controlled by the k8s.ingress_mode parameter and is enabled on Ingress controller pods using the ncp/ingress_controller annotation. With this feature the Ingress controller pod will be exposed with a DNAT rule. The preferred way of exposing these pods is to use a service of type LoadBalancer.
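
    The preferred replacement can be sketched as a standard Service of type LoadBalancer in front of the Ingress controller pods; all names, namespaces, and selector labels below are placeholders:

    apiVersion: v1
    kind: Service
    metadata:
      name: ingress-controller-lb
      namespace: ingress-ns
    spec:
      type: LoadBalancer
      selector:
        app: ingress-controller
      ports:
      - name: http
        port: 80
        targetPort: 80
      - name: https
        port: 443
        targetPort: 443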

  • The NSX OVS kernel module is deprecated and will be removed in the next release.

  • Multus is no longer supported. It will be removed in the next release.

Compatibility Requirements

  • NCP/NSX Tile for Tanzu Application Service (TAS): 4.1.2

  • NSX: 3.2.3, 4.0.1, 4.1.1, 4.1.2

  • Kubernetes: 1.25, 1.26, 1.27

  • OpenShift 4: 4.11, 4.12

  • Kubernetes Host VM OS (see notes below):

    • Ubuntu 20.04

    • Ubuntu 22.04 with kernel 5.15 (both nsx-ovs kernel module and upstream OVS kernel module supported)

    • Ubuntu 22.04 with kernel later than 5.15 (only upstream OVS kernel module supported)

    • RHEL 8.6, 8.8, 9.2

  • Tanzu Application Service (TAS):

    • Ops Manager 2.10 + TAS 2.13

    • Ops Manager 3.0 + TAS 2.13

    • Ops Manager 2.10 + TAS 4.0

    • Ops Manager 3.0 + TAS 4.0

    • Ops Manager 2.10 + TAS 5.0

    • Ops Manager 3.0 + TAS 5.0

  • Tanzu Kubernetes Grid Integrated (TKGI): 1.18.0

Notes:

For all supported integrations, use the Red Hat Universal Base Image (UBI). For more information, see https://www.redhat.com/en/blog/introducing-red-hat-universal-base-image.

Support for upgrading to this release:

  • 4.1.0, 4.1.1, and all 4.0.x releases.

Limitations

  • The "baseline policy" feature for NCP creates a dynamic group which selects all members in the cluster. NSX-T has a limit of 8,000 effective members of a dynamic group (for details, see Configuration Maximums). Therefore, this feature should not be enabled for clusters that are expected to grow beyond 8,000 pods. Exceeding this limit can cause delays in the creation of resources for the pods.
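
    The feature above is controlled by the k8s.baseline_policy_type option in the NCP ConfigMap. As a sketch, the setting to leave unset for clusters expected to exceed the limit looks like this:

    [k8s]
    baseline_policy_type = allow_cluster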

  • Transparent mode load balancer

    • Only north-south traffic for a Kubernetes cluster is supported. Intra-cluster traffic is not supported.

    • Not supported for services attached to a LoadBalancer CRD. Auto scaling must be disabled for this feature to work.

    • It is recommended to use this feature only on newly deployed clusters.

  • Manager-to-policy migration

    • It is not possible to migrate a Kubernetes cluster if a previous migration failed and the cluster is rolled back. This is a limitation with NSX 4.0.0.1 or earlier releases only.

  • There is a risk of significant performance degradation in the actual group member calculation, with impact on network traffic, when implementing Network Policies that use multi-selector criteria in Ingress/Egress rules. To address this limitation, there is a new configuration option, enable_mixed_expression_groups, which affects Kubernetes Network Policies using multi-selectors in Policy mode. Clusters in Manager mode are not affected. The default value of this option is False. We recommend the following values for your cluster:

    • TKGi

      • New clusters (Policy-based): False

      • Existing clusters (Policy-based): True

      • After Manager-to-Policy migration: True

    • OpenShift: Set to True to ensure Kubernetes Network Policy conformance

    • DIY Kubernetes

      • New clusters (Policy-based): False

      • Existing clusters (Policy-based): True

      • After Manager-to-Policy migration: True

    This limitation applies when enable_mixed_expression_groups is set to True. It affects installations that use NCP version 3.2.0 and later with NSX-T version 3.2.0 and later. There is no limit on the number of namespaces that the Network Policy affects. If this option is set to True and NCP is restarted, NCP will sync all Network Policies again to implement this behavior.

    When enable_mixed_expression_groups is set to False, Network Policies that use multi-selector criteria in Ingress/Egress rules are realized with dynamic NSX groups that are not affected by any performance degradation in calculating the actual members. However, the rules can be enforced on only up to 5 namespaces, depending on the other criteria defined in the Network Policy. If the Network Policy affects more than 5 namespaces at any point in time, it will be annotated with "ncp/error: NETWORK_POLICY_VALIDATION_FAILED" and not enforced in NSX. Note that this can happen when a new namespace is created that satisfies the multi-selector conditions or when an existing namespace is updated. If this option is set to False and NCP is restarted, NCP will sync all Network Policies again to implement this behavior.
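
    As a sketch, the option would be set in the NCP ConfigMap as follows (the [k8s] section name is an assumption based on the other k8s.* options in these notes; verify it for your NCP version):

    [k8s]
    enable_mixed_expression_groups = True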

Resolved Issues

  • Issue 3239352: In a TAS environment, when a Task cannot be allocated, retry may not work

    In an NCP TAS environment, when a Task cannot be allocated the Auctioneer rejects the task and the BBS retries placement of the task up to the number of times specified by the setting task.max_retries. When task.max_retries is reached, the BBS updates the Task from the PENDING state to the COMPLETED state, marking it as Failed and including a FailureReason that explains that the cluster has no capacity for the task.

    During retry, the task may be scheduled to a new cell which notifies NCP with a task_changed event. Since NCP does not handle the task_changed event the task cannot be assigned a new port in the new cell. The task cannot run properly.

    Workaround: Disable the retry and set the task.max_retries value to 0.

  • Issue 3043496: NCP stops running if Manager-to-Policy migration fails

    NCP provides the migrate-mp2p job to migrate NSX resources used by NCP and TKGI. If migration fails, all migrated resources are rolled back but NCP is not restarted in Manager mode.

    Workaround:

    1. Make sure that all resources were rolled back. This can be done by checking the logs of the migrate-mp2p job. The logs must end with the line "All imported MP resources to Policy completely rolled back."

    2. If all resources were rolled back, ssh into each master node and run the command "sudo /var/vcap/bosh/bin/monit start ncp".

  • Issue 2939886: Migrating objects from Manager Mode to Policy Mode fails

    Migrating objects from Manager Mode to Policy Mode fails if, in the network policy specification, egress and ingress have the same selector.

    Workaround: None

Known Issues

  • Issue 3396034: Manager-to-policy migration fails because the migration process cannot infer the Service UUID needed to migrate NCP-created load balancer pools

    During a Manager-to-Policy migration, tags on certain NSX resources created by NCP need to be updated. This operation may require specific Kubernetes resources to exist. If these resources do not exist, migration fails and all NSX resources are rolled back to Manager mode. In this case, migration fails because an Ingress has rules that use a Service that does not exist in Kubernetes.

    Workaround: Remove the Ingress rules that are using Kubernetes Services that no longer exist.

  • Issue 3327390: In an OCP environment, nsx-node-agent has high memory usage

    In some situations, the nsx-ovs container inside an nsx-node-agent pod may have high memory usage, and the memory usage keeps increasing. This is caused by the multicast snooping check in the nsx-ovs container.

    Workaround:

    For OpenShift 4.11 or later:

    Step 1. Set enable_ovs_mcast_snooping to False in nsx-ncp-operator-config ConfigMap:

    [nsx_node_agent]
    enable_ovs_mcast_snooping = False

    Step 2. Disable the OVS liveness probe in the nsx-node-agent DaemonSet. Note that you must disable it again every time the operator restarts, because the NCP operator will revert to the default nsx-node-agent DaemonSet manifest.

    For OpenShift versions earlier than 4.11:

    Step 1. Run the following command to clear the cache.

    $ echo 2 > /proc/sys/vm/drop_caches

    Step 2. Disable the OVS liveness probe in the nsx-node-agent DaemonSet. Note that you must disable it again every time the operator restarts, because the NCP operator will revert to the default nsx-node-agent DaemonSet manifest.

  • Issue 3293981: Creating two Ingresses with defaultBackend in a short period of time causes DEFAULT_BACKEND_IN_USE error

    If two Ingresses with defaultBackend specified are created within a short period of time, both Ingresses might get the DEFAULT_BACKEND_IN_USE error.

    Workaround: Create one Ingress at a time. To resolve the DEFAULT_BACKEND_IN_USE error, delete defaultBackend from both Ingresses and add it back to one Ingress at a time.

  • Issue 3292003: In an OCP environment, node goes down after applying taint with NoExecute effect

    In an OCP environment, if you remove {"effect":"NoExecute","operator":"Exists"} toleration for the node agent DaemonSet, and add a NoExecute effect taint on a node, for example, test=abc:NoExecute, the node may go down. In this scenario, the node agent pod will be evicted from the node with taint. During node agent pod eviction, it is possible that the node will lose connectivity because the node agent pod does not exit gracefully and the node uplink interface is not configured correctly from br-int back to ens192.

    Workaround: Reboot the node VM from vCenter Server.

  • Issue 3293969: In an OCP environment, during NCP upgrade, a node becomes not ready

    In an OCP environment, during NCP upgrade, node agent will be restarted. It is possible that a node will lose connectivity because the node agent pod does not exit gracefully and the node uplink interface is not configured correctly from br-int back to ens192. The node status will become NotReady.

    Workaround: Reboot the node VM from vCenter Server.

  • Issue 3252571: Manager-to-Policy migration never completes if NSX Manager becomes unavailable

    If NSX Manager becomes unavailable during Manager-to-Policy migration, the migration may never complete. One indication is that the logs will have no updates about the migration.

    Workaround: Re-establish the connection to NSX Manager and restart the migration.

  • Issue 3248662: Worker node fails to access a service. The OVS flow for the service is not created on the node.

    The nsx-kube-proxy log has the error message "greenlet.error: cannot switch to a different thread."

    Workaround: Restart nsx-kube-proxy on the node.

  • Issue 3241693: Layer-7 routes take more than 10 minutes to start working when the number of routes created exceeds some limits

    In an OpenShift environment, you can deploy more than 1000 routes by setting the flags 'relax_scale_validation' to True and 'l4_lb_auto_scaling' to False in the ConfigMap. However, routes will take more than 10 minutes to start working when the number of routes created exceeds these limits: 500 HTTPS routes and 2000 HTTP routes.

    Workaround: Do not exceed the limits for the number of routes. If you create 500 HTTPS plus 2000 HTTP routes, you must deploy the routes using a large-size edge VM.

  • Issue 3158230: nsx-ncp-bootstrap container fails to initialize while loading AppArmor profiles on Ubuntu 20.04

    The nsx-ncp-bootstrap container in nsx-ncp-bootstrap DaemonSet fails to initialize because of different package versions of AppArmor on the host OS and the container image. The logs of the container show messages such as "Failed to load policy-features from '/etc/apparmor.d/abi/2.13': No such file or directory".

    Workaround: Update AppArmor to version 2.13.3-7ubuntu5.2 or the latest available from focal-updates on the host OS.

  • Issue 3179549: Changing the NAT mode for an existing namespace is not supported

    For a namespace with existing pods, if you change the NAT mode from SNAT to NO_SNAT, the pods will still use IP addresses allocated from the IP blocks specified in container_ip_blocks. If the segment subnet in the namespace still has available IP addresses, newly created pods will still use IP addresses from the existing segment subnet. For a newly created segment, the subnet is allocated from no_snat_ip_block. However, the SNAT rule on the namespace will be deleted.

    Workaround: None.
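
    For reference, the two address pools mentioned above are defined in the NCP ConfigMap. A sketch with placeholder CIDRs (the [nsx_v3] section name is an assumption; verify it for your NCP version):

    [nsx_v3]
    container_ip_blocks = 172.52.0.0/16
    no_snat_ip_block = 172.53.0.0/16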

  • Issue 3218243: Security Policy in NSX created for Kubernetes Network Policy that uses multi-selector criteria gets removed after upgrading NCP to version 4.1.1 or when user creates/updates namespace

    This occurs when the option enable_mixed_expression_groups is set to False in NCP (the default value is False). In that case, the Network Policy leads to the creation of more than 5 group criteria on NSX, which is not supported.

    Workaround: Set enable_mixed_expression_groups to True in NCP config map and restart NCP. Note that there is a risk of significant performance degradation in the actual group member calculation with impact on network traffic in this case.

  • Issue 3235394: The baseline policy with namespace setting does not work in a TKGI setup

    In a TKGI environment, if you set baseline_policy_type to allow_namespace or allow_namespace_strict, NCP will create an explicit baseline policy to allow only pods within the same namespace to communicate with each other and deny ingress from other namespaces. This baseline policy will also block a system namespace, such as kube-system, from accessing pods in different namespaces.

    Workaround: None. NCP does not support this feature in a TKGI setup.

  • Issue 3179960: Application instance not reachable after vMotion and has the same IP address as another application instance

    When bulk vMotion happens, for example, during NSX host upgrade, hosts go into maintenance mode one by one and Diego Cells migrate between hosts. After the vMotion, some segment ports might be missing, some application instances might be unreachable, and two application instances might have the same IP address. This issue is more likely to happen with TAS 2.13.18.

    Workaround: Re-create the application instances affected by this issue.

  • Issue 3108579: Deleting LB CRD and recreating it immediately with the same secret fails

    In Manager mode, if you delete Ingress on an LB CRD, delete the LB CRD, and immediately recreate the Ingress and LB CRD with the same certificate, you may see the error "Attempted to import a certificate which has already been imported." This is caused by a timing issue because the deletion of LB CRD must wait for the deletion of Ingress to be completed.

    Workaround: Do one of the following:

    - Run the following command to wait for the deletion of the Ingress to be completed, and then delete the LB CRD:

      kubectl exec -ti <pod name> -n nsx-system -- nsxcli -c get ingress-caches | grep 'name: <Ingress name>'

    - Wait at least 2 minutes before recreating the Ingress and LB CRD.

  • Issue 3221191: Creation of domain group fails when cluster has more than 4000 pods

    If the NCP option k8s.baseline_policy_type is set to allow_cluster, allow_namespace, or allow_namespace_strict, and the cluster has more than 4000 pods, the domain group (with a name such as dg-k8sclustername), which contains all the IP addresses of the pods, will fail to be created. This is caused by a limitation on NSX.

    Workaround: Do not set the option k8s.baseline_policy_type or ensure that there are fewer than 4000 pods in the cluster.

  • Issue 2131494: NGINX Kubernetes Ingress still works after changing the Ingress class from nginx to nsx

    When you create an NGINX Kubernetes Ingress, NGINX creates traffic forwarding rules. If you change the Ingress class to any other value, NGINX does not delete the rules and continues to apply them, even if you delete the Kubernetes Ingress after changing the class. This is a limitation of NGINX.

    Workaround: To delete the rules created by NGINX, delete the Kubernetes Ingress while the class value is nginx. Then re-create the Kubernetes Ingress.

  • Issue 2999131: ClusterIP services not reachable from the pods

    In a large-scale TKGi environment, ClusterIP services are not reachable from the pods. Other related symptoms are: (1) nsx-kube-proxy stops writing logs; and (2) The OVS flows are not created on the node.

    Workaround: Restart nsx-kube-proxy.

  • Issue 2984240: The "NotIn" operator in matchExpressions does not work in namespaceSelector for a network policy's rule

    When specifying a rule for a network policy, if you specify namespaceSelector, matchExpressions and the "NotIn" operator, the rule does not work. The NCP log has the error message "NotIn operator is not supported in NS selectors."

    Workaround: Rewrite matchExpressions to avoid using the "NotIn" operator.
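
    As an illustration, a rule of the unsupported form and one possible rewrite; the label key and values are hypothetical:

    # Not supported: "NotIn" in a namespaceSelector
    ingress:
    - from:
      - namespaceSelector:
          matchExpressions:
          - key: env
            operator: NotIn
            values: ["prod"]

    # Supported rewrite: enumerate the complementary values with "In"
    ingress:
    - from:
      - namespaceSelector:
          matchExpressions:
          - key: env
            operator: In
            values: ["dev", "test"]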

  • Issue 3033821: After manager-to-policy migration, distributed firewall rules not enforced correctly

    After a manager-to-policy migration, newly created network policy-related distributed firewall (DFW) rules will have higher priority than the migrated DFW rules.

    Workaround: Use the policy API to change the sequence of DFW rules as needed.

  • For a Kubernetes service of type ClusterIP, the hairpin-mode flag is not supported

    NCP does not support the hairpin-mode flag for a Kubernetes service of type ClusterIP.

    Workaround: None

  • Issue 2224218: After a service or app is deleted, it takes 2 minutes to release the SNAT IP back to the IP pool

    If you delete a service or app and recreate it within 2 minutes, it will get a new SNAT IP from the IP pool.

    Workaround: After deleting a service or app, wait 2 minutes before recreating it if you want to reuse the same IP.

  • Issue 2404302: If multiple load balancer application profiles for the same resource type (for example, HTTP) exist on NSX-T, NCP will choose any one of them to attach to the Virtual Servers.

    If multiple HTTP load balancer application profiles exist on NSX-T, NCP will choose any one of them with the appropriate x_forwarded_for configuration to attach to the HTTP and HTTPS Virtual Server. If multiple FastTCP and UDP application profiles exist on NSX-T, NCP will choose any one of them to attach to the TCP and UDP Virtual Servers, respectively. The load balancer application profiles might have been created by different applications with different settings. If NCP chooses to attach one of these load balancer application profiles to the NCP-created Virtual Servers, it might break the workflow of other applications.

    Workaround: None

  • Issue 2518111: NCP fails to delete NSX-T resources that have been updated from NSX-T

    NCP creates NSX-T resources based on the configurations that you specify. If you make any updates to those NSX-T resources through NSX Manager or the NSX-T API, NCP might fail to delete those resources and re-create them when it is necessary to do so.

    Workaround: Do not update NSX-T resources created by NCP through NSX Manager or the NSX-T API.

  • Issue 2416376: NCP fails to process a TAS ASG (App Security Group) that binds to more than 128 Spaces

    Because of a limit in NSX-T distributed firewall, NCP cannot process a TAS ASG that binds to more than 128 Spaces.

    Workaround: Create multiple ASGs and bind each of them to no more than 128 Spaces.

  • NCP fails to start when "logging to file" is enabled during Kubernetes installation

    This issue happens when uid:gid=1000:1000 on the container host does not have permission to the log folder.

    Workaround: Do one of the following:

    • Change the mode of the log folder to 777 on the container hosts.

    • Grant "rwx" permission of the log folder to uid:gid=1000:1000 on the container hosts.

    • Disable the "logging to file" feature.

  • Issue 2653214: Error while searching the segment port for a node after the node's IP address was changed

    After changing a node's IP address, if you upgrade NCP or if the NCP operator pod is restarted, checking the NCP operator status with the command "oc describe co nsx-ncp" will show the error message "Error while searching segment port for node ..."

    Workaround: None. Adding a static IP address on a node interface which also has DHCP configuration is not supported.

  • Issue 2672677: In a highly stressed OpenShift 4 environment, a node can become unresponsive

    In an OpenShift 4 environment with a high level of pod density per node and a high frequency of pods getting deleted and created, a RHCOS node might go into a "Not Ready" state. Pods running on the affected node, with the exception of daemonset members, will be evicted and recreated on other nodes in the environment.

    Workaround: Reboot the impacted node.

  • Issue 2707174: A Pod that is deleted and recreated with the same namespace and name has no network connectivity

    If a Pod is deleted and recreated with the same namespace and name when NCP is not running and nsx-ncp-agents are running, the Pod might get wrong network configurations and not be able to access the network.

    Workaround: Delete the Pod and recreate it when NCP is running.

  • Issue 2745907: "monit" commands return incorrect status information for nsx-node-agent

    On a diego_cell VM, when monit restarts nsx-node-agent, if it takes more than 30 seconds for nsx-node-agent to fully start, monit will show the status of nsx-node-agent as "Execution failed" and will not update its status to "running" even when nsx-node-agent is fully functional later.

    Workaround: None.

  • Issue 2735244: nsx-node-agent and nsx-kube-proxy crash because of liveness probe failure

    nsx-node-agent and nsx-kube-proxy use sudo to run some commands. If there are many DNS server and search domain entries in /etc/resolv.conf, sudo can take a long time to resolve hostnames. This will block nsx-node-agent and nsx-kube-proxy on the sudo command for a long time, and the liveness probe will fail.

    Workaround: Perform one of the two following actions:

    • Add hostname entries to /etc/hosts. For example, if hostname is 'host1', add the entry '127.0.0.1 host1'.

    • Set a larger value for the nsx-node-agent liveness probe timeout. Run the command 'kubectl edit ds nsx-node-agent -n nsx-system' to update the timeout value for both the nsx-node-agent and nsx-kube-proxy containers.
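
    In the DaemonSet spec, the value to increase is timeoutSeconds under each container's livenessProbe. A sketch of the relevant fragment, with illustrative numbers:

    livenessProbe:
      exec:
        command: [...]        # keep the existing probe command unchanged
      initialDelaySeconds: 5
      timeoutSeconds: 30      # increase this value
      periodSeconds: 10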

  • Issue 2736412: Parameter members_per_small_lbs is ignored if max_allowed_virtual_servers is set

    If both max_allowed_virtual_servers and members_per_small_lbs are set, virtual servers may fail to attach to an available load balancer because only max_allowed_virtual_servers is taken into account.

    Workaround: Relax the scale constraints instead of enabling auto scaling.

  • Issue 2740552: When deleting a static pod using api-server, nsx-node-agent does not remove the pod's OVS bridge port, and the network of the static pod which is re-created automatically by Kubernetes is unavailable

    Kubernetes does not allow removing a static pod through the api-server. Kubernetes creates a mirror pod for each static pod so that the static pod can be found through the api-server. When you delete the pod through the api-server, only the mirror pod is deleted, and NCP receives and handles the delete request to remove all NSX resources allocated for the pod. However, the static pod still exists, and nsx-node-agent does not get a delete request from CNI to remove the static pod's OVS bridge port.

    Workaround: Remove the static pod by deleting the manifest file instead of removing the static pod by api-server.

  • Issue 2824129: A node has the status network-unavailable equal to true for more than 3 minutes after a restart

    If you use NCP operator to manage NCP's lifecycle, when an nsx-node-agent daemonset recovers from a non-running state, its node will have the status network-unavailable equal to true until it has been running for 3 minutes. This is expected behavior.

    Workaround: Wait for at least 3 minutes after nsx-node-agent restarts.

  • Issue 2832480: For a Kubernetes service of type ClusterIP, sessionAffinityConfig.clientIP.timeoutSeconds cannot exceed 65535

    For a Kubernetes service of type ClusterIP, if you set sessionAffinityConfig.clientIP.timeoutSeconds to a value greater than 65535, the actual value will be 65535.

    Workaround: None

  • Issue 2940772: Migrating NCP resources from Manager to Policy results in failure with NSX-T 3.2.0

    Migrating NCP resources from Manager to Policy is supported with NSX-T 3.1.3 and NSX-T 3.2.1, but not NSX-T 3.2.0.

    Workaround: None

  • Issue 2934195: Some types of NSX groups are not supported for distributed firewall rules

    An NSX group of type "IP Addresses Only" is not supported for distributed firewall (DFW) rules. An NSX group of type "Generic" with manually added IP addresses as members is also not supported.

    Workaround: None

  • Issue 3066449: Namespace subnets are not always allocated from the first available IP block when use_ip_blocks_in_order is set to True

    When creating multiple namespaces with use_ip_blocks_in_order set to True, the first namespace's subnet is sometimes not allocated from the first available IP block. For example, assume that container_ip_blocks = '172.52.0.0/28,172.53.0.0/28', the subnet prefix length is 29, and subnet 172.52.0.0/29 is already allocated. If you create 2 namespaces ns-1 and ns-2, the subnet allocation could be (1) ns-1: 172.52.0.8/29, ns-2: 172.53.0.0/29, or (2) ns-1: 172.53.0.0/29, ns-2: 172.52.0.8/29.

    The use_ip_blocks_in_order parameter only ensures that different IP blocks are used in the order they appear in the container_ip_blocks parameter. When creating multiple namespaces at the same time, any namespace may request a subnet through an API call before another namespace. Therefore, there is no guarantee that a specific namespace will be allocated a subnet from a specific IP block.

    Workaround: Create the namespaces separately, that is, create the first namespace, make sure its subnet has been allocated, and then create the next namespace.
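
    For the example above, the relevant ConfigMap settings would look like this (the [nsx_v3] section name is an assumption; verify it for your NCP version):

    [nsx_v3]
    container_ip_blocks = 172.52.0.0/28,172.53.0.0/28
    use_ip_blocks_in_order = True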
