The following tables summarize the data plane performance best practices described in this guide.
Infrastructure Layer
Performance Decision | Performance Justification | Performance Implication
---|---|---
Set the BIOS power profile to Maximum Performance. | The CPUs assigned to the data plane CNFs must not throttle down their speed. | CPU speeds are not throttled. High and consistent data plane performance is achieved.
Enable Hyperthreading. | CPU-bound data plane CNFs need more CPU cores to deliver high throughput. | The number of available logical CPU cores is increased, improving the performance of the data plane CNFs. Note: Hyperthreading can remain enabled so that ESXi and system tasks can share resources.
Enable Turbo Boost. | The data plane CNFs must be able to use the maximum CPU speeds available. | CPUs are clocked above their base speed, accelerating the data plane CNFs. Note: Ensure that the processor operates within the power and temperature limits of its thermal design power (TDP).
Ensure that NUMA Node Interleaving is disabled. | Remote memory access incurs a penalty on data plane latency and throughput. | Memory pages are placed local to the CPU, avoiding remote memory access. This improves data plane latency and throughput.
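The Hyperthreading and NUMA decisions above can be verified from a Linux worker node, which exposes CPU topology through sysfs. The following Python sketch is illustrative only (it assumes standard Linux sysfs paths; it is not part of the guide's tooling): it parses the kernel's CPU-list format and reports whether hyperthread siblings and multiple NUMA nodes are visible.

```python
from pathlib import Path


def parse_cpu_list(text: str) -> list[int]:
    """Parse the kernel CPU-list format, e.g. '0-3,8,10-11' -> [0, 1, 2, 3, 8, 10, 11]."""
    cpus: list[int] = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))
        else:
            cpus.append(int(part))
    return cpus


def hyperthreading_active(sysfs: str = "/sys/devices/system/cpu") -> bool:
    """Hyperthreading is active when any core lists more than one thread sibling."""
    for sib in Path(sysfs).glob("cpu[0-9]*/topology/thread_siblings_list"):
        if len(parse_cpu_list(sib.read_text())) > 1:
            return True
    return False


def numa_nodes(sysfs: str = "/sys/devices/system/node") -> list[int]:
    """List the NUMA node IDs exposed by the kernel."""
    return sorted(int(p.name[4:]) for p in Path(sysfs).glob("node[0-9]*"))


if __name__ == "__main__":
    print(f"Hyperthreading active: {hyperthreading_active()}")
    print(f"NUMA nodes: {numa_nodes()}")
```

With NUMA Node Interleaving disabled in the BIOS, each physical socket appears as its own NUMA node in the `numa_nodes()` output, which is what allows local memory placement.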
CaaS Layer
Performance Decision | Performance Justification | Performance Implication
---|---|---
Assign the maximum compute in the host to Worker Nodes. | Data plane CNFs require large compute capacity. | Data plane CNFs use the maximum compute available in the host and achieve high throughput.
Pin the Worker Node CPUs to physical CPU cores by setting LatencySensitivity=High. | CPU core pinning and complete CPU core isolation must be available to achieve low data plane latency. | Hyperthread siblings are deactivated on the pinned CPU cores. Data plane CNFs use the isolated physical CPU cores exclusively to achieve low latency and high throughput.
Pin the Worker Node CPUs to hyperthreaded CPU cores. | CPU-bound data plane CNFs can accept a slight latency compromise in exchange for the extra cores that Hyperthreading provides. | The number of CPU cores available to data plane CNFs increases, improving packet throughput.
Set the CPU Manager policy to Static. | Deterministic performance is required. CPU affinity and exclusivity must be available. | Allows the requested CPU affinity and exclusivity for the data plane CNFs.
Use integer CPU requests for the data plane CNF Pods. | Deterministic performance is required. CPU affinity and exclusivity must be available. | Data plane CNFs get the requested CPU affinity and exclusivity. CPUs are not shared, and deterministic performance is achieved.
Use 1 GB huge pages. | Data plane CNFs require high performance with a large memory footprint. | The number of memory pages required by the data plane CNF is reduced. The MMU operates efficiently and performance improves.
Use the DPDK kernel module igb_uio for DPDK acceleration. | A userspace DPDK framework must be used for fast packet processing. | Unlike vfio-pci, igb_uio can operate without IOMMU emulation overhead, so any performance impact due to IOMMU emulation is avoided.
Use a Secondary Network on the data plane Pod. | The data plane interface must support high throughput, low latency, and advanced technologies such as SR-IOV with DPDK. | Primary Network Pod traffic does not interfere with the data plane traffic on the Secondary Network. The data plane is accelerated using DPDK and SR-IOV.
Use SR-IOV for the Secondary Network on the data plane Pod. | The data plane interface must support high throughput and low latency. | SR-IOV eliminates all emulation layers in the data path, delivering high throughput and low latency for data plane traffic.
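The benefit of the 1 GB huge-page row can be made concrete with simple arithmetic: fewer, larger pages mean fewer page-table and TLB entries to cover the same footprint. The sketch below is illustrative only; the 16 GiB footprint is a hypothetical figure, not a sizing recommendation from this guide.

```python
def pages_needed(footprint_bytes: int, page_size_bytes: int) -> int:
    """Number of pages (and, roughly, TLB entries) needed to map a memory footprint."""
    return -(-footprint_bytes // page_size_bytes)  # ceiling division


GiB = 1024 ** 3
footprint = 16 * GiB  # hypothetical data plane CNF memory footprint

default_4k = pages_needed(footprint, 4 * 1024)        # standard x86 page size
huge_2m = pages_needed(footprint, 2 * 1024 * 1024)    # 2 MiB huge pages
huge_1g = pages_needed(footprint, 1 * GiB)            # 1 GiB huge pages

print(f"4 KiB pages: {default_4k}")  # 4194304
print(f"2 MiB pages: {huge_2m}")     # 8192
print(f"1 GiB pages: {huge_1g}")     # 16
```

In Kubernetes, a Pod consumes these through the `hugepages-1Gi` resource in its requests and limits, alongside the integer `cpu` requests recommended above.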