The following tables summarize the data plane performance best practices described in this guide.

Infrastructure Layer

| Performance Decision | Performance Justification | Performance Implication |
| --- | --- | --- |
| Set the BIOS power profile to Maximum Performance. | The CPUs assigned to the data plane CNFs must not be throttled down. | CPU speeds are not throttled, so high and consistent data plane performance is achieved. |
| Enable Hyperthreading. | CPU-bound data plane CNFs need more CPU cores to deliver high throughput. | The number of available logical CPU cores is increased, improving the performance of the data plane CNFs. Note: You can keep Hyperthreading enabled so that ESXi and system tasks can share resources. |
| Enable Turbo Boost. | The data plane CNFs must be able to use the maximum CPU speeds available. | CPU cores can run above their base clock speed, accelerating the data plane CNFs. Note: Ensure that the processor operates within the power and temperature limits of its thermal design power (TDP). |
| Ensure that NUMA Node Interleaving is disabled. | Remote memory access imposes a latency and throughput penalty on the data plane. | Memory pages are placed local to the CPU, avoiding remote memory access and improving data plane latency and throughput. |

CaaS Layer

| Performance Decision | Performance Justification | Performance Implication |
| --- | --- | --- |
| Assign the maximum compute in the host to the Worker Nodes. | Data plane CNFs require large compute capacity. | Data plane CNFs use the maximum compute available in the host and achieve high throughput. |
| Pin the Worker Node CPUs to the physical CPU cores using LatencySensitivity=High. | CPU core pinning and complete CPU core isolation must be available to achieve low data plane latency. | Hyperthread siblings are deactivated on the pinned CPU cores. Data plane CNFs use the isolated physical CPU cores exclusively to achieve low latency and high throughput. |
| Pin the Worker Node CPUs to the Hyperthreaded CPU cores. | CPU-bound data plane CNFs can trade a slight increase in latency for the additional cores that Hyperthreading provides. | The number of CPU cores available to the data plane CNFs increases, improving packet throughput. |
| Set the CPU Manager policy to Static (see the kubelet sketch after this table). | Deterministic performance is required. CPU affinity and exclusivity must be available. | Grants the requested CPU affinity and exclusivity to the data plane CNFs. |
| Use integer CPU requests for the data plane CNF Pods (see the Pod sketch after this table). | Deterministic performance is required. CPU affinity and exclusivity must be available. | Data plane CNFs get the requested CPU affinity and exclusivity. CPUs are not shared, and deterministic performance is achieved. |
| Use 1 GB huge pages (included in the Pod sketch after this table). | Data plane CNFs require high performance with a large memory footprint. | The number of memory pages required by the data plane CNF is reduced, so the MMU operates more efficiently and performance improves. |
| Use the DPDK kernel module igb_uio for DPDK acceleration (see the device plugin sketch after this table). | The userspace DPDK framework must be used for fast packet processing. | Unlike vfio-pci, igb_uio can operate without IOMMU emulation overhead, so any performance impact due to IOMMU emulation is avoided. |
| Use the Secondary Network on the data plane Pod. | The data plane interface must support high throughput, low latency, and advanced technologies such as SR-IOV with DPDK. | The Primary Network Pod traffic does not interfere with the data plane traffic on the Secondary Network. The data plane is accelerated using DPDK and SR-IOV. |
| Use SR-IOV for the Secondary Network on the data plane Pod (see the network attachment sketch after this table). | The data plane interface must support high throughput and low latency. | SR-IOV eliminates all emulation layers in the data path, delivering high throughput and low latency for the data plane traffic. |
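The sketches below illustrate how several of the CaaS-layer decisions are typically expressed in Kubernetes configuration. First, the Static CPU Manager policy: a minimal KubeletConfiguration sketch, assuming the upstream kubelet.config.k8s.io/v1beta1 schema. How this file is delivered to the kubelet, and which CPUs to reserve, depend on your CaaS distribution; the values here are placeholders.

```yaml
# Minimal KubeletConfiguration sketch enabling the static CPU Manager policy.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# The static policy grants exclusive cores to Guaranteed pods with integer CPU requests.
cpuManagerPolicy: static
# The static policy requires CPUs reserved for system daemons and the kubelet;
# "0,1" is an example only.
reservedSystemCPUs: "0,1"
```

The static policy assigns exclusive cores only to Pods in the Guaranteed QoS class with whole-number CPU requests, which is what the next sketch provides.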
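Next, a Pod sketch combining the integer CPU request and 1 GB huge page rows. The Pod name, image, and resource sizes are illustrative, and the sketch assumes the Worker Node has 1 GB huge pages pre-allocated at boot.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: dataplane-cnf            # hypothetical name
spec:
  containers:
  - name: dpdk-app
    image: registry.example.com/dpdk-app:latest   # placeholder image
    resources:
      requests:
        cpu: "4"                 # integer CPUs; requests == limits, so QoS is Guaranteed
        memory: 4Gi
        hugepages-1Gi: 2Gi       # served from the node's pre-allocated 1 GB pages
      limits:
        cpu: "4"
        memory: 4Gi
        hugepages-1Gi: 2Gi
    volumeMounts:
    - name: hugepages
      mountPath: /dev/hugepages  # where DPDK applications typically map huge pages
  volumes:
  - name: hugepages
    emptyDir:
      medium: HugePages
```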
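For the igb_uio row: once the VFs on the Worker Node are bound to igb_uio, they can be exposed to Pods through the SR-IOV network device plugin. A ConfigMap sketch follows; the resource name, driver selector, and PF name are assumptions, not values from this guide.

```yaml
# ConfigMap sketch for the SR-IOV network device plugin,
# selecting VFs that are bound to the igb_uio driver.
apiVersion: v1
kind: ConfigMap
metadata:
  name: sriovdp-config
  namespace: kube-system
data:
  config.json: |
    {
      "resourceList": [
        {
          "resourceName": "sriov_dpdk",
          "selectors": {
            "drivers": ["igb_uio"],
            "pfNames": ["ens1f0"]
          }
        }
      ]
    }
```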
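Finally, a sketch of attaching the SR-IOV Secondary Network to a data plane Pod through a Multus NetworkAttachmentDefinition. The network name and the intel.com/sriov_dpdk resource (which must match the device plugin sketch above) are assumptions.

```yaml
# NetworkAttachmentDefinition sketch for an SR-IOV secondary network.
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: sriov-dataplane
  annotations:
    # Ties this network to the VF pool published by the device plugin.
    k8s.v1.cni.cncf.io/resourceName: intel.com/sriov_dpdk
spec:
  config: |
    {
      "cniVersion": "0.3.1",
      "type": "sriov",
      "name": "sriov-dataplane"
    }
---
# Pod fragment: the annotation attaches the secondary network, and the
# resource request allocates one VF from the pool.
apiVersion: v1
kind: Pod
metadata:
  name: dataplane-cnf
  annotations:
    k8s.v1.cni.cncf.io/networks: sriov-dataplane
spec:
  containers:
  - name: dpdk-app
    image: registry.example.com/dpdk-app:latest   # placeholder image
    resources:
      requests:
        intel.com/sriov_dpdk: "1"
      limits:
        intel.com/sriov_dpdk: "1"
```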