Because NSX-OVS is not supported on the latest kernel versions, you can switch from the NSX-OVS kernel module to the upstream OVS kernel module before upgrading the kernel to the latest version. If NCP does not work with the new kernel after the upgrade, you can roll back (switch back to NSX-OVS and downgrade the kernel).
The first procedure below describes how to switch from the NSX-OVS kernel module to the upstream OVS kernel module when you upgrade the kernel. The second procedure describes how to switch back to the NSX-OVS kernel module when you downgrade the kernel.
Both procedures involve the Kubernetes concepts of taints and tolerations. For more information about these concepts, see https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration.
Switch to the upstream OVS kernel module
1. Modify the tolerations of both daemonset.apps/nsx-ncp-bootstrap and daemonset.apps/nsx-node-agent. Change the following:
   - effect: NoExecute
     operator: Exists
   to:
   - effect: NoExecute
     key: evict-user-pods
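You can make these edits with kubectl, for example (the nsx-system namespace is an assumption; adjust it to your deployment):
kubectl edit daemonset nsx-ncp-bootstrap -n nsx-system
kubectl edit daemonset nsx-node-agent -n nsx-system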
2. Modify the nsx-node-agent configmap. Change use_nsx_ovs_kernel_module to False.
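For example (the ConfigMap name nsx-node-agent-config is an assumption; check the actual name in your deployment):
kubectl edit configmap nsx-node-agent-config -n nsx-system
# set: use_nsx_ovs_kernel_module = False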
3. Taint worker-node1 "evict-user-pods:NoExecute" to evict all user pods on this node to other nodes:
kubectl taint nodes worker-node1 evict-user-pods:NoExecute
4. Taint worker-node1 "evict-ncp-pods:NoExecute" to evict the nsx-node-agent and nsx-ncp-bootstrap pods from this node:
kubectl taint nodes worker-node1 evict-ncp-pods:NoExecute
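To confirm that the pods were evicted, you can list the pods still scheduled on the node:
kubectl get pods --all-namespaces -o wide --field-selector spec.nodeName=worker-node1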
5. Uninstall the NSX-OVS kernel module and restore the upstream OVS kernel module on worker-node1.
   - Delete the kmod files vport-geneve.ko, vport-gre.ko, vport-lisp.ko, vport-stt.ko, vport-vxlan.ko, and openvswitch.ko from the directory /lib/modules/$(uname -r)/weak-updates/openvswitch.
   - If vport-geneve.ko, vport-gre.ko, vport-lisp.ko, vport-stt.ko, vport-vxlan.ko, or openvswitch.ko files exist in the directory /lib/modules/$(uname -r)/nsx/usr-ovs-kmod-backup, move them to the directory /lib/modules/$(uname -r)/weak-updates/openvswitch.
   - Delete the directory /lib/modules/$(uname -r)/nsx.
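These file operations can be scripted. A minimal sketch, run as root on worker-node1 and assuming the default paths above:
cd /lib/modules/$(uname -r)
# remove the NSX-OVS kmod files
rm -f weak-updates/openvswitch/{vport-geneve,vport-gre,vport-lisp,vport-stt,vport-vxlan,openvswitch}.ko
# restore the backed-up upstream modules, if any
[ -d nsx/usr-ovs-kmod-backup ] && mv nsx/usr-ovs-kmod-backup/*.ko weak-updates/openvswitch/
rm -rf nsx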
6. Upgrade the kernel of worker-node1 to the latest version and reboot it.
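For example, on a RHEL-style system (the package manager and commands depend on your distribution):
yum update kernel
reboot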
Note: Set SELinux to Permissive mode on worker-node1 if containerd and kubelet fail to start.
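For example:
setenforce 0
To keep the setting across reboots, set SELINUX=permissive in /etc/selinux/config.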
7. Restart kubelet.
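Assuming kubelet runs as a systemd service:
systemctl restart kubelet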
8. Remove the taint "evict-ncp-pods:NoExecute" from worker-node1. Verify that the nsx-ncp-bootstrap and nsx-node-agent pods can start.
9. Remove the taint "evict-user-pods:NoExecute" from worker-node1. Verify that all pods on this node are running.
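Taints are removed with kubectl's trailing "-" syntax:
kubectl taint nodes worker-node1 evict-ncp-pods:NoExecute-
kubectl taint nodes worker-node1 evict-user-pods:NoExecute-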
10. Repeat steps 3-9 for the other nodes.
11. Restore the tolerations of the nsx-ncp-bootstrap and nsx-node-agent DaemonSets that you changed in step 1.
Switch back to the NSX-OVS kernel module
1. Modify the tolerations of both daemonset.apps/nsx-ncp-bootstrap and daemonset.apps/nsx-node-agent. Change the following:
   - effect: NoExecute
     operator: Exists
   to:
   - effect: NoExecute
     key: evict-user-pods
2. Modify the nsx-node-agent configmap. Change use_nsx_ovs_kernel_module to True.
3. Taint worker-node1 "evict-user-pods:NoExecute" to evict all user pods on this node to other nodes:
kubectl taint nodes worker-node1 evict-user-pods:NoExecute
4. Taint worker-node1 "evict-ncp-pods:NoExecute" to evict the nsx-node-agent and nsx-ncp-bootstrap pods from this node:
kubectl taint nodes worker-node1 evict-ncp-pods:NoExecute
5. Downgrade the kernel of worker-node1 to a supported version and reboot it.
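For example, on a RHEL-style system you can boot an already-installed supported kernel by making it the default boot entry (the exact mechanism depends on your distribution and boot loader):
grubby --set-default /boot/vmlinuz-<supported-version>
reboot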
Note: Set SELinux to Permissive mode on worker-node1 if containerd and kubelet fail to start.
6. Restart kubelet.
7. Remove the taint "evict-ncp-pods:NoExecute" from worker-node1. Verify that the nsx-ncp-bootstrap and nsx-node-agent pods can start.
8. Remove the taint "evict-user-pods:NoExecute" from worker-node1. Verify that all pods on this node are running.
9. Repeat steps 3-8 for the other nodes.
10. Restore the tolerations of the nsx-ncp-bootstrap and nsx-node-agent DaemonSets that you changed in step 1.