Depending on the workload, you can tune the NSX Advanced Load Balancer Service Engine (SE) settings at the SE group level.
The following are the guidelines to follow while planning capacity for SEs:
General Guidelines
The following are the general guidelines:
CPU and memory reservations are recommended for NSX Advanced Load Balancer SE virtual machines for consistent and deterministic performance.
Use compact mode in the NSX Advanced Load Balancer SE group settings for virtual service placement on SEs. This ensures that NSX Advanced Load Balancer uses the minimum number of SEs required for virtual service placement, which helps reduce cost in public cloud deployments.
Dispatcher Configurations
The following are the dispatcher configurations:
The dedicated_dispatcher setting is False by default at the SE group level. This configuration is optimal for SEs with smaller compute capacities, such as one and two cores. NSX Advanced Load Balancer recommends setting dedicated_dispatcher to True for SE sizes greater than two cores.
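The sizing rule above can be sketched as a small helper. This is only an illustration of the guideline, not product code: the function name recommend_dedicated_dispatcher and its se_cores parameter are hypothetical.

```python
def recommend_dedicated_dispatcher(se_cores: int) -> bool:
    """Return the suggested dedicated_dispatcher value for an SE size.

    Hypothetical helper that mirrors the guideline in this section:
    keep the default (False) for one- and two-core SEs, and use True
    for anything larger.
    """
    if se_cores < 1:
        raise ValueError("an SE needs at least one core")
    return se_cores > 2


if __name__ == "__main__":
    for cores in (1, 2, 4, 6):
        print(f"{cores} core(s): dedicated_dispatcher = "
              f"{recommend_dedicated_dispatcher(cores)}")
```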
GRO and TSO Configurations
The following are the GRO and TSO configurations:
By default, GRO is deactivated and TSO is enabled. This configuration works well for most workloads.
GRO can be enabled whenever there are enough dispatchers (greater than or equal to 4) and their utilization is low.
Starting with NSX Advanced Load Balancer version 22.1.2, if the SE group has SEs with greater than or equal to 8 vCPUs, GRO will be enabled.
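As a rough sketch, the GRO rules above can be written as two checks. The function names are hypothetical, and the 50 percent utilization threshold is an assumption made only for illustration, because this guide does not define what counts as low dispatcher utilization.

```python
def gro_enabled_by_default(version: tuple, se_vcpus: int) -> bool:
    """True when GRO is expected to be on without manual configuration.

    Mirrors the rule in this section: from version 22.1.2 onward,
    SEs with 8 or more vCPUs have GRO enabled.
    """
    return version >= (22, 1, 2) and se_vcpus >= 8


def gro_worth_enabling(num_dispatchers: int,
                       dispatcher_util_pct: float,
                       low_util_pct: float = 50.0) -> bool:
    """True when manually enabling GRO fits the guideline.

    low_util_pct is an assumed threshold, not a documented value.
    """
    return num_dispatchers >= 4 and dispatcher_util_pct < low_util_pct


if __name__ == "__main__":
    print(gro_enabled_by_default((22, 1, 2), 8))
    print(gro_worth_enabling(num_dispatchers=4, dispatcher_util_pct=20.0))
```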
LRO Configurations
Starting with NSX Advanced Load Balancer version 22.1.3, LRO is enabled on serviceenginegroup by default for supported environments, such as VMware/NSX-T Cloud.
For more details on LRO configuration, see the Configuring TSO, LRO, GRO, and RSS section in this guide.
Receive Side Scaling Configurations
The following are the Receive Side Scaling (RSS) configurations:
You can enable RSS for better performance. RSS can achieve higher packets per second (PPS) with more dispatchers and queues per NIC.
The number of dispatchers can only be set to a power of two, that is, one, two, four, eight, and so on.
The default value of max_queues_per_vnic is one. Setting the value to zero automatically decides the number of queues based on the configured dispatcher count. You can set this value as per the requirements. If the number of queues available per NIC is less than the number of dispatchers, the number of dispatchers is floored to the number of queues. So, it is recommended to have the number of dispatchers per NIC greater than the number of available queues.
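The two dispatcher constraints above (power-of-two counts, and flooring to the number of available queues) can be sketched as follows. The helper names are hypothetical, and the sketch does not model how a max_queues_per_vnic value of zero selects the queue count, since that mapping is not described here.

```python
def is_power_of_two(n: int) -> bool:
    """True for 1, 2, 4, 8, ... using the single-set-bit test."""
    return n > 0 and (n & (n - 1)) == 0


def effective_dispatchers(num_dispatchers: int,
                          queues_available_per_nic: int) -> int:
    """Return the dispatcher count that can actually be used.

    The configured count must be a power of two. If the NIC exposes
    fewer queues than dispatchers, the count is floored to the number
    of queues.
    """
    if not is_power_of_two(num_dispatchers):
        raise ValueError("number of dispatchers must be a power of two")
    if queues_available_per_nic < 1:
        raise ValueError("a NIC needs at least one queue")
    return min(num_dispatchers, queues_available_per_nic)


if __name__ == "__main__":
    # Four dispatchers requested, but the NIC offers only two queues.
    print(effective_dispatchers(4, 2))
```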
Datapath Isolation
You can enable SE datapath isolation for latency and jitter-sensitive applications. The feature creates two independent CPU sets for datapath and control plane SE functions.
Recommendations for Auto RSS on Public Clouds
All NSX Advanced Load Balancer public cloud instances can enable auto RSS by setting appropriate values for the max_queues_per_vnic and num_dispatcher_cores knobs in the SE group. For more details on configuring auto RSS, see Configuring TSO, LRO, GRO, and RSS.
It is recommended to set a dedicated dispatcher through the dedicated_dispatcher_core knob in the SE group for all instances where RSS is enabled and the instance has eight or more vCPUs. Dedicated dispatcher is a runtime property and does not require a reboot.
Recommendations for Different Workloads
The following are the recommendations for different workloads:
High-PPS loads, such as a high rate of connections per second with small-file GETs, need more dispatchers to sustain the higher packet rate.
Workloads with high SSL transactions are proxy heavy and benefit from a high count of proxy cores.
Default settings are recommended for one-core and two-core SEs.
The following examples explain the configuration recommendation for a six-core SE running on the vCenter full access cloud.
1 – PPS heavy traffic profile
Assume 100 Layer 4 TCP virtual services doing an average of 1000 new TCP connections per second, with each connection lasting three seconds and downloading a single small file over a single GET request.
Considering 18 to 20 packets for each TCP transaction for both the front end and the back end, this requirement translates to nearly one million packets per second for new TCP connections. Given the volume of packets, the NSX Advanced Load Balancer SE must be configured as follows:
Dedicated dispatcher: True
Number of dispatchers: 2
Number of proxy cores: 4
Number of queues per NIC: 2
2 – SSL throughput and TPS heavy traffic profile
Assume multiple SSL applications doing a total of 2000 ECC transactions per second and 2 Gbps of SSL throughput for GET requests.
For these requirements, the dispatcher cores will not be busy because the packet rate is not very high, while SSL processing consumes the proxy cores for ECC transactions and throughput. In this use case, RSS is ineffective, and the following configuration is recommended:
Dedicated dispatcher: True
Number of dispatchers: 1
Number of proxy cores: 5
Number of queues per NIC: 1
3 – HTTP workloads with 50% of IP routing traffic
Assume multiple L7 applications doing nearly 5 to 6 Gbps at 1.5 million packets per second on a single SE, with 50% of the traffic being IP routing traffic. The applications are latency and jitter sensitive.
To meet these requirements, NSX Advanced Load Balancer recommends dedicating one of the SE cores to non-data-path tasks. This can be achieved with the following configuration:
Dedicated dispatcher: True
se_dp isolation mode: True
Number of non-dp cores: 1
Number of dispatcher cores: 2
Number of queues per NIC: 2
Number of proxy cores: 3
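The three example configurations can be cross-checked against the six-core SE with a short sketch. It assumes that, with a dedicated dispatcher, the dispatcher, proxy, and non-datapath cores are disjoint and together account for every core on the SE. The class and field names are hypothetical and are not SE group properties.

```python
from dataclasses import dataclass

SE_CORES = 6  # size of the SE used in the examples above


@dataclass(frozen=True)
class SeCoreProfile:
    """Core split for one example profile (illustrative names only)."""
    name: str
    dispatcher_cores: int
    proxy_cores: int
    non_dp_cores: int = 0

    def total_cores(self) -> int:
        # Assumes the three roles never share a core.
        return self.dispatcher_cores + self.proxy_cores + self.non_dp_cores


PROFILES = [
    SeCoreProfile("PPS heavy", dispatcher_cores=2, proxy_cores=4),
    SeCoreProfile("SSL throughput and TPS heavy",
                  dispatcher_cores=1, proxy_cores=5),
    SeCoreProfile("HTTP with 50% IP routing",
                  dispatcher_cores=2, proxy_cores=3, non_dp_cores=1),
]

if __name__ == "__main__":
    for profile in PROFILES:
        fits = profile.total_cores() == SE_CORES
        print(f"{profile.name}: {profile.total_cores()} cores, "
              f"matches SE size: {fits}")
```

Under that assumption, each profile uses exactly the six cores available, which is a quick sanity check to repeat when adapting these splits to a different SE size.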