Data plane intensive CNFs require CPU pinning to avoid CPU scheduling delays and to ensure that a deterministic CPU capacity is consistently available to the CNF. This deterministic CPU capacity in turn delivers the deterministic network performance that data plane intensive workloads require.

You must apply CPU pinning for data plane CNFs at two distinct levels in the CaaS layer:

  • The Worker Node CPUs are pinned (with exclusive affinity) to the CPU cores (Physical or Hyperthreaded) in the ESXi Hypervisor.

  • The data plane containers are pinned to the Worker Node CPUs.

This two-level pinning ensures that the data plane containers have exclusive access to the CPU cores and can meet their low latency and high throughput requirements.

Note:

Depending on the characteristics of the data plane intensive CNF, you can pin the Worker Node CPUs to either Physical or Hyperthreaded CPU cores in the ESXi Hypervisor. Do not assign all cores in a NUMA node, particularly NUMA node 0, to the Worker Node: some compute capacity must remain available for ESXi and other system services.

Pin all Worker Node CPUs to Physical CPU cores

For data plane intensive CNFs that are extremely sensitive to latency, set LatencySensitivity=High on the Worker Nodes. This setting enables CPU pinning for all Worker Node CPUs by applying exclusive affinity. It also enables complete core isolation for the pinned cores by leaving the hyperthread siblings idle in ESXi. Together, CPU pinning and core isolation reduce interference and improve the performance of the workloads running on the Worker Node.

When the LatencySensitivity=High setting is applied on the Worker Node, its CPU and memory resources are NUMA aligned. Because the hyperthread siblings remain idle in ESXi, the Worker Nodes in a single NUMA node can use only half of the logical cores available in the CPU.
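For example, on a socket with 24 physical cores (48 hyperthreaded logical cores), the Worker Nodes in that NUMA node can together be allocated at most 24 vCPUs, minus the cores kept free for ESXi system services, because each pinned vCPU consumes a full physical core while its hyperthread sibling stays idle. The core counts in this example are illustrative.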

Note:

The following node customization is required to apply the LatencySensitivity=High configuration on the Worker Nodes:

  • Add the node component latency_sensitivity with the value high

For more details, see Node Customization in the Telco Cloud Automation documentation.
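The component name latency_sensitivity and the value high come from the note above; the fragment below is only an illustrative sketch of how the customization might appear in a CNF descriptor, assuming it sits under an infra_requirements > node_components section. The exact nesting is defined in the Node Customization documentation referenced above.

  # Illustrative sketch only; consult Node Customization in the Telco Cloud
  # Automation documentation for the authoritative schema and placement.
  infra_requirements:
    node_components:
      latency_sensitivity: high   # enables LatencySensitivity=High on the Worker Node VM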

Pin Worker Node CPUs to Hyperthreaded CPU cores

For CPU-bound data plane intensive CNFs that are not extremely sensitive to latency, pin the Worker Node CPUs to the Hyperthreaded CPU cores in ESXi. This ensures that the Worker Nodes in a single NUMA node can use the maximum number of logical cores available in the CPU.

Do not set LatencySensitivity=High on the Worker Node when pinning its CPUs to the Hyperthreaded cores in ESXi. CPU pinning is enabled by applying exclusive affinity in ESXi. Core isolation is not completely applied because hyperthread siblings must share hardware resources.

Note:

Pinning all the CPUs of the Worker Node to Hyperthreaded CPU cores in ESXi is neither mandatory nor recommended. Leave some CPUs in the Worker Node for its OS and Kubernetes tasks. These unpinned CPUs use shared resources allocated by ESXi.

Node Customization in Telco Cloud Automation ensures that the CPU and memory resources of the Worker Node are fully reserved to apply exclusive affinity, and that they are NUMA aligned on a multi-socket server.

Note:

The following node customizations are required for pinning Worker Node CPUs to the Hyperthreaded CPU cores in ESXi:

  • Define the kernel argument isolcpus under kernel: kernel_args

  • Add the node component isNumaConfigNeeded with the value true

For more details, see Node Customization in the Telco Cloud Automation documentation.
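The names isolcpus, kernel: kernel_args, and isNumaConfigNeeded come from the note above; the fragment below is only a sketch of how they might be combined in a CNF descriptor. The infra_requirements nesting, the key/value list form, and the isolcpus range are assumptions; the authoritative schema is in the Node Customization documentation referenced above.

  # Illustrative sketch only; field nesting and the isolcpus range are assumptions.
  infra_requirements:
    kernel:
      kernel_args:
        - key: isolcpus
          value: 2-15            # example range: CPUs isolated from the Linux scheduler for data plane use
    node_components:
      isNumaConfigNeeded: true   # fully reserve and NUMA-align the Worker Node CPU and memory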

Pin Data Plane Container to Worker Node CPUs

When you deploy the Workload Cluster with a dedicated Node Pool for the data plane intensive CNF, set the CPU Manager Policy to 'Static'. This setting enables the 'static' CPU management policy in the Kubernetes CPU Manager, which is required to allow exclusive CPU affinity for the data plane containers.
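You select this policy in the Node Pool configuration of the Workload Cluster; at the kubelet level it corresponds to the standard Kubernetes static CPU Manager policy, roughly as in the sketch below. The reservation values are placeholders, and editing the kubelet directly on managed nodes is not implied; the sketch only shows what the policy resolves to.

  # Standard KubeletConfiguration equivalent of CPU Manager Policy = 'Static'.
  apiVersion: kubelet.config.k8s.io/v1beta1
  kind: KubeletConfiguration
  cpuManagerPolicy: static
  kubeReserved:
    cpu: "1"        # example reservation; the static policy needs CPU reserved for system daemons
  systemReserved:
    cpu: "1"        # example reservation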

For deterministic performance, the data plane containers must be exclusively pinned to the Worker Node CPUs. To achieve this, define the container in a Pod of the Guaranteed QoS class (CPU and memory requests equal to limits) with integer CPU requests.
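For illustration, the following Pod manifest satisfies both conditions: requests equal limits for every resource, so the Pod is in the Guaranteed QoS class, and the CPU value is an integer, so the static CPU Manager assigns the container exclusive Worker Node CPUs. The Pod name, image, and resource values are placeholders.

  apiVersion: v1
  kind: Pod
  metadata:
    name: dataplane-cnf                             # placeholder name
  spec:
    containers:
    - name: dataplane
      image: registry.example.com/dataplane:1.0     # placeholder image
      resources:
        requests:
          cpu: "4"          # integer CPU request
          memory: 8Gi
        limits:
          cpu: "4"          # equal to the request, so the Pod is Guaranteed QoS
          memory: 8Gi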