Because NSX-OVS is not supported on the latest kernel version, you can switch from the NSX-OVS kernel module to the upstream OVS kernel module before upgrading the kernel to the latest version. If NCP does not work with the latest kernel after the upgrade, you can roll back by switching back to NSX-OVS and downgrading the kernel.

The first procedure below describes how to switch the NSX-OVS kernel module to the upstream OVS kernel module when you upgrade the kernel. The second procedure describes how to switch back to the NSX-OVS kernel module when you downgrade the kernel.

Both procedures involve the Kubernetes concepts of taints and tolerations. For more information about these concepts, see the Kubernetes documentation on taints and tolerations.

Switch to the upstream OVS kernel module

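  1. Modify the tolerations of both daemonset.apps/nsx-ncp-bootstrap and daemonset.apps/nsx-node-agent. Change the following:
          - effect: NoExecute
            operator: Exists
     to:
          - effect: NoExecute
            key: evict-user-pods
     With this change, the two DaemonSets tolerate only the evict-user-pods taint, so the evict-ncp-pods taint applied in step 4 evicts their pods.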
  2. Modify the nsx-node-agent configmap. Change use_nsx_ovs_kernel_module to False.
  3. Taint worker-node1 with evict-user-pods:NoExecute to evict all user pods from this node to other nodes:
    kubectl taint nodes worker-node1 evict-user-pods:NoExecute
  4. Taint worker-node1 with evict-ncp-pods:NoExecute to evict the nsx-node-agent and nsx-ncp-bootstrap pods from this node:
    kubectl taint nodes worker-node1 evict-ncp-pods:NoExecute
  5. Uninstall the NSX-OVS kernel module and restore the upstream OVS kernel module on worker-node1.
    1. Delete the kmod files vport-geneve.ko, vport-gre.ko, vport-lisp.ko, vport-stt.ko, vport-vxlan.ko, and openvswitch.ko from the directory /lib/modules/$(uname -r)/weak-updates/openvswitch.
    2. If the directory /lib/modules/$(uname -r)/nsx/usr-ovs-kmod-backup contains any of the files vport-geneve.ko, vport-gre.ko, vport-lisp.ko, vport-stt.ko, vport-vxlan.ko, or openvswitch.ko, move them to the directory /lib/modules/$(uname -r)/weak-updates/openvswitch.
    3. Delete the directory /lib/modules/$(uname -r)/nsx.
  6. Upgrade the kernel of worker-node1 to the latest version and reboot it.

    Note: If containerd and kubelet fail to run after the reboot, set SELinux to Permissive mode on worker-node1.

  7. Restart kubelet.
  8. Remove the taint evict-ncp-pods:NoExecute from worker-node1. Verify that the nsx-ncp-bootstrap and nsx-node-agent pods start.
  9. Remove the taint evict-user-pods:NoExecute from worker-node1. Verify that all pods on this node are running.
  10. Repeat steps 3-9 for the other nodes.
  11. Restore the tolerations of both the nsx-ncp-bootstrap and nsx-node-agent DaemonSets to the values they had before step 1.
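
The file operations in step 5 of this procedure can be sketched as a small shell function. This is a minimal sketch rather than an official script: the function name restore_upstream_ovs and the MOD_ROOT variable are hypothetical, and it assumes that the weak-updates/openvswitch directory already exists under the module root. It only covers the three sub-steps of step 5 and does nothing unless MOD_ROOT is set.

```shell
#!/bin/sh
# Sketch of step 5: remove the NSX-OVS kmod files and restore any
# backed-up upstream OVS kmod files. Names here are illustrative.
set -eu

KMODS="vport-geneve.ko vport-gre.ko vport-lisp.ko vport-stt.ko vport-vxlan.ko openvswitch.ko"

restore_upstream_ovs() {
  mod_root="$1"   # for example /lib/modules/$(uname -r)
  weak="$mod_root/weak-updates/openvswitch"
  backup="$mod_root/nsx/usr-ovs-kmod-backup"

  # Sub-step 1: delete the kmod files from weak-updates/openvswitch.
  for f in $KMODS; do
    rm -f "$weak/$f"
  done

  # Sub-step 2: move backed-up kmod files into place, if any exist.
  if [ -d "$backup" ]; then
    for f in $KMODS; do
      if [ -f "$backup/$f" ]; then
        mv "$backup/$f" "$weak/$f"
      fi
    done
  fi

  # Sub-step 3: delete the nsx directory.
  rm -rf "$mod_root/nsx"
}

# Runs only when MOD_ROOT is set explicitly (hypothetical variable).
if [ -n "${MOD_ROOT:-}" ]; then
  restore_upstream_ovs "$MOD_ROOT"
fi
```

For example, running the script as root on worker-node1 with MOD_ROOT="/lib/modules/$(uname -r)" would apply the three sub-steps to the module directory of the running kernel. Because the function deletes files, it is worth checking the paths on a test node first.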

Switch back to the NSX-OVS kernel module

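  1. Modify the tolerations of both daemonset.apps/nsx-ncp-bootstrap and daemonset.apps/nsx-node-agent. Change the following:
          - effect: NoExecute
            operator: Exists
     to:
          - effect: NoExecute
            key: evict-user-pods
     With this change, the two DaemonSets tolerate only the evict-user-pods taint, so the evict-ncp-pods taint applied in step 4 evicts their pods.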
  2. Modify the nsx-node-agent configmap. Change use_nsx_ovs_kernel_module to True.
  3. Taint worker-node1 with evict-user-pods:NoExecute to evict all user pods from this node to other nodes:
    kubectl taint nodes worker-node1 evict-user-pods:NoExecute
  4. Taint worker-node1 with evict-ncp-pods:NoExecute to evict the nsx-node-agent and nsx-ncp-bootstrap pods from this node:
    kubectl taint nodes worker-node1 evict-ncp-pods:NoExecute
  5. Downgrade the kernel of worker-node1 to a supported version and reboot it.

    Note: If containerd and kubelet fail to run after the reboot, set SELinux to Permissive mode on worker-node1.

  6. Restart kubelet.
  7. Remove the taint evict-ncp-pods:NoExecute from worker-node1. Verify that the nsx-ncp-bootstrap and nsx-node-agent pods start.
  8. Remove the taint evict-user-pods:NoExecute from worker-node1. Verify that all pods on this node are running.
  9. Repeat steps 3-8 for the other nodes.
  10. Restore the tolerations of both the nsx-ncp-bootstrap and nsx-node-agent DaemonSets to the values they had before step 1.
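
The taint steps that both procedures share (adding the two taints, then removing them one at a time after the kernel change) can be sketched as follows. This is an illustrative sketch under stated assumptions: the helper names run, add_taint, remove_taint, and evict_node, and the DRY_RUN, ACTION, and NODE variables, are hypothetical. Removing a taint uses the standard kubectl form of appending a hyphen to the taint. With DRY_RUN=1 (the default here) the commands are only printed, not executed.

```shell
#!/bin/sh
# Sketch of the per-node taint sequence. Names are illustrative.
set -eu

DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# add_taint NODE KEY: add KEY:NoExecute to NODE.
add_taint() {
  run kubectl taint nodes "$1" "$2:NoExecute"
}

# remove_taint NODE KEY: remove KEY:NoExecute from NODE (trailing hyphen).
remove_taint() {
  run kubectl taint nodes "$1" "$2:NoExecute-"
}

# Evict user pods first, then the NCP pods, in the documented order.
evict_node() {
  add_taint "$1" evict-user-pods
  add_taint "$1" evict-ncp-pods
}

# The taints are removed in separate actions so that the pods can be
# verified between the two removals.
case "${ACTION:-}" in
  evict)        evict_node "${NODE:-worker-node1}" ;;
  readmit-ncp)  remove_taint "${NODE:-worker-node1}" evict-ncp-pods ;;
  readmit-user) remove_taint "${NODE:-worker-node1}" evict-user-pods ;;
esac
```

The two removals are deliberately kept as separate actions: the procedures call for verifying that the nsx-ncp-bootstrap and nsx-node-agent pods start before the user pods are allowed back onto the node.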