Upgrade ESXi That Runs Tanzu Kubernetes Worker Node VMs Configured with SR-IOV

If a VM with an SR-IOV network adapter is a Tanzu Kubernetes worker node in a Tanzu Kubernetes cluster, follow these steps to upgrade the ESXi host that runs it.

Prerequisites

Identify the vSphere cluster that has the ESXi hosts you want to upgrade:

Create a host baseline for the ESXi upgrade in vSphere Lifecycle Manager. For instructions, see Create a Host Upgrade Baseline.
Navigate to the Hosts and Clusters view in the compute vCenter Server web UI.
Identify the vSphere cluster that has SR-IOV enabled on its hosts and runs Tanzu Kubernetes worker node VMs with an SR-IOV network adapter.

Procedure

For each ESXi host in the vSphere cluster, identify the Tanzu Kubernetes worker node VMs that have an SR-IOV network adapter.

Note:
Make a note of the names of the worker node VMs on the ESXi host.
Log in to the TCA-CP VM as the 'admin' user and switch to the 'root' user.
For each worker node VM with an SR-IOV network adapter, identify the Tanzu Kubernetes workload cluster that the worker node VM is part of and then drain the pods of the worker node VM:
1. List all the Tanzu Kubernetes management clusters in the bootstrapper.
```
ccli list mc
```
  Note:
  Make a note of the index number of the management cluster (for example, MC1), where you want to look for the worker node VM.
2. Go to the Tanzu Kubernetes management cluster MC1.
```
ccli go <index number of the Tanzu Kubernetes management cluster>
```
3. List the Tanzu Kubernetes workload clusters managed by the management cluster MC1.
```
ccli list wc
```
  Note:
  Make a note of the index number of the workload cluster (for example, WC1), where you want to look for the worker node VM.
4. List the names and namespaces of all workload clusters managed by the management cluster MC1.
```
kubectl get clusters -A
```
  Note:
  Make a note of the name and namespace of the workload cluster WC1, where you want to look for the worker node VM.
5. Go to the workload cluster WC1 managed by the management cluster MC1.
```
ccli go <index number of Tanzu Kubernetes workload cluster>
```
6. List all the worker node VMs in the workload cluster WC1 and identify whether the list shows the worker node VM that you are looking for.
```
ccli list nodes
```
  - If the worker node VM that you are looking for appears in the worker node list of the workload cluster WC1, skip to substep 7.
  - If the worker node VM that you are looking for does not appear in the worker node list of the workload cluster WC1, repeat substeps 5 and 6 for the remaining workload clusters managed by the management cluster MC1.
  - If the worker node VM that you are looking for does not appear in the worker node list of any workload cluster managed by the management cluster MC1, repeat substeps 2 to 6 for the remaining management clusters in the bootstrapper.
7. After identifying the workload cluster that the worker node VM is part of, pause the health check of the worker node VM by running the following commands in sequence:
```
ccli list mc
ccli go <index number of the management cluster that the worker node VM belongs to>
kubectl patch cluster <cluster_name> --type merge -p '{"spec": {"paused": true}}' -n <cluster_namespace>
```
  cluster_name specifies the name of the workload cluster that the worker node VM belongs to.
  
  cluster_namespace specifies the namespace of the workload cluster that the worker node VM belongs to.
8. Drain the pods of the worker node VM by running the following commands in sequence:
```
ccli list wc
ccli go <index number of the Tanzu Kubernetes workload cluster that the worker node VM belongs to>
kubectl drain <worker_node_name> --ignore-daemonsets --delete-local-data --force
```
  worker_node_name specifies the name of the worker node VM.
9. Repeat substeps 1 to 8 for the remaining worker node VMs that use the SR-IOV network adapter on the ESXi host.
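Before powering off the VMs, it can help to confirm that each drained node is cordoned and to see which pods are still scheduled on it. The following is a minimal sketch, not part of the documented procedure. The function names `node_is_cordoned` and `pods_on_node` are hypothetical helpers, and the sketch assumes that kubectl already targets the workload cluster that the node belongs to (for example, after the `ccli go` command for that workload cluster).

```shell
# Hypothetical helpers (not part of ccli or of the documented procedure).
# Assumption: kubectl already targets the workload cluster that owns the node.

# Succeeds (exit status 0) if the node is marked unschedulable, that is, cordoned.
node_is_cordoned() {
  node="$1"
  [ "$(kubectl get node "$node" -o jsonpath='{.spec.unschedulable}')" = "true" ]
}

# Lists every pod still scheduled on the node, across all namespaces.
# After a drain with --ignore-daemonsets, DaemonSet-managed pods are
# expected to remain in this list.
pods_on_node() {
  node="$1"
  kubectl get pods --all-namespaces --field-selector "spec.nodeName=$node" -o wide
}
```

For example, `node_is_cordoned <worker_node_name> && pods_on_node <worker_node_name>` prints the remaining pods only when the node is cordoned.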
Power off all the worker node VMs that use the SR-IOV network adapter on the ESXi host.
In Lifecycle Manager, apply the baseline to the ESXi host and remediate. For more information, see Remediating ESXi Hosts Against vSphere Lifecycle Manager Baselines.
After the ESXi host remediation is complete, power on all the worker node VMs that use the SR-IOV network adapter on the ESXi host.

Uncordon each worker node VM by running the following commands in sequence:

```
ccli list mc
ccli go <index number of the management cluster>
ccli list wc
ccli go <index number of the workload cluster>
kubectl get nodes
kubectl uncordon <worker node name>
```
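
A worker node can take some time to report Ready after its VM is powered on. As a hedged sketch, the following wraps `kubectl wait` to block until the node is Ready. The function name `wait_node_ready` and the 300-second default timeout are assumptions made for illustration, and the sketch assumes that kubectl targets the workload cluster that the node belongs to.

```shell
# Hypothetical helper (not part of ccli or of the documented procedure).
# Blocks until the node reports the Ready condition, or fails on timeout.
# Assumption: kubectl targets the workload cluster that owns the node.
wait_node_ready() {
  node="$1"
  timeout="${2:-300s}"   # arbitrary default; adjust for your environment
  kubectl wait --for=condition=Ready "node/$node" --timeout="$timeout"
}
```

For example, `wait_node_ready <worker node name> 600s` waits up to ten minutes and returns a nonzero exit status if the node is still not Ready.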

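Resume the health check of each worker node VM by running the following command in the context of the management cluster that the worker node VM belongs to: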
```
kubectl patch cluster <cluster_name> --type merge -p '{"spec": {"paused": false}}' -n <cluster_namespace>
```
cluster_name specifies the name of the workload cluster that the worker node VM belongs to.

cluster_namespace specifies the namespace of the workload cluster that the worker node VM belongs to.
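
To confirm that the health check has resumed, you can read the `spec.paused` field back from the cluster object. This is a minimal sketch under stated assumptions: the function name `cluster_paused_state` is hypothetical, and kubectl is assumed to target the management cluster, as it does for the patch command.

```shell
# Hypothetical helper (not part of ccli or of the documented procedure).
# Prints the current spec.paused value of a workload cluster object.
# Assumption: kubectl targets the management cluster that owns the workload cluster.
cluster_paused_state() {
  name="$1"
  namespace="$2"
  # Propagate a kubectl failure instead of reporting a misleading "false".
  state="$(kubectl get cluster "$name" -n "$namespace" -o jsonpath='{.spec.paused}')" || return 1
  # An unset or empty field means that the cluster is not paused.
  echo "${state:-false}"
}
```

For example, `cluster_paused_state <cluster_name> <cluster_namespace>` is expected to print `false` once the health check is resumed.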
Upgrade all the drivers and firmware on the ESXi host based on the Telco Cloud Platform RAN 3.1 BOM. For more information, see KB87936.

Note:
For instructions to upgrade ACC 100, see ACC 100 Support for ESXi Upgrade.
Repeat the preceding steps for each remaining ESXi host in the vSphere cluster.

Results

The ESXi host is successfully upgraded and rebooted.