This feature is introduced in SE DP DPDK mode to achieve higher PPS in environments where the size of the NIC queue is limited.
The following are the supported NICs:
- VIRTIO: KVM, OpenStack
- ENA: AWS
You achieve higher PPS with shallow rings by utilizing more than one queue per dispatcher, which provides increased packet burst ability.
The following are the two modes of operation based on the operating environment (a brief mapping sketch follows this list):
- Compact Mode: A single dispatcher manages all the queues of the VNIC.
- Distributed Mode: Multiple dispatchers manage a subset of the queues of the VNIC.
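The difference between the two modes can be pictured as a queue-to-dispatcher assignment. The sketch below is illustrative only; the helper function and values are hypothetical and not SE source code.

```python
# Illustrative sketch: how vNIC queues could map to dispatcher cores
# in the two modes. Hypothetical helper, not SE source code.

def assign_queues(num_queues, num_dispatchers, mode):
    """Return a {dispatcher_core: [queue_ids]} mapping for the given mode."""
    if mode == "compact":
        # Compact mode: a single dispatcher owns every queue of the vNIC.
        return {0: list(range(num_queues))}
    # Distributed mode: each dispatcher owns a subset of the queues.
    mapping = {d: [] for d in range(num_dispatchers)}
    for q in range(num_queues):
        mapping[q % num_dispatchers].append(q)
    return mapping

print(assign_queues(8, 4, "compact"))      # {0: [0, 1, 2, 3, 4, 5, 6, 7]}
print(assign_queues(8, 4, "distributed"))  # {0: [0, 4], 1: [1, 5], 2: [2, 6], 3: [3, 7]}
```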
You can configure the maximum number of queues per VNIC using the max_queues_per_vnic parameter in SE-group properties.
The max_queues_per_vnic parameter supports the following values:
- Zero (Reserved): Auto (deduces the optimal number of queues per dispatcher based on the NIC and operating environment)
- One (Reserved): One queue per NIC (default)
- Integer value: Power of 2; the maximum limit is 16
You must set max_queues_per_vnic to 0 (auto) for the non-DPDK mode of operation to utilize multiple dispatchers.
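As a quick illustration of these rules, the following hypothetical helper (not part of the product) mirrors the accepted values:

```python
# Hypothetical illustration of the accepted max_queues_per_vnic values.

def interpret_max_queues_per_vnic(value):
    if value == 0:
        return "auto: deduce the optimal number of queues"
    if value == 1:
        return "one queue per vNIC (default)"
    # Any other value must be a power of 2, capped at 16.
    if value > 1 and (value & (value - 1)) == 0 and value <= 16:
        return f"{value} queues per vNIC"
    raise ValueError("max_queues_per_vnic must be 0, 1, or a power of 2 up to 16")

for v in (0, 1, 4, 16):
    print(v, "->", interpret_max_queues_per_vnic(v))
```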
The migration routine ensures that the max_queues_per_vnic parameter is set to num_dispatcher_cores if distribute_queues is enabled; otherwise, max_queues_per_vnic is set to 1. In other words, if both distribute_queues and num_dispatcher_cores are set, max_queues_per_vnic is set to num_dispatcher_cores. If distribute_queues is set but num_dispatcher_cores is not, the number of queues follows the number of dispatcher cores.
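The migration behavior described above can be summarized in a short sketch (hypothetical code, not the actual migration routine):

```python
# Hypothetical sketch of the migration rule for max_queues_per_vnic.

def migrated_max_queues_per_vnic(distribute_queues, num_dispatcher_cores=0):
    if not distribute_queues:
        return 1  # legacy behavior: one queue per vNIC
    if num_dispatcher_cores > 0:
        return num_dispatcher_cores
    # distribute_queues set but num_dispatcher_cores unset:
    # the queue count follows the number of dispatcher cores at run time.
    return "num_dispatcher_cores (run-time)"

print(migrated_max_queues_per_vnic(False))    # 1
print(migrated_max_queues_per_vnic(True, 4))  # 4
print(migrated_max_queues_per_vnic(True))     # follows dispatcher cores
```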
The following is the environment-specific behavior when the max_queues_per_vnic value is set to 0 (auto):
| Mode | Description |
|---|---|
| OpenStack, AWS, KVM (DPDK mode) | The number of queues can be more than the number of dispatchers, to utilize more than one queue per dispatcher. |
| Baremetal (DPDK mode) | The number of queues is the same as that of dispatchers. Utilizes one queue per dispatcher. |
| Azure, AWS (Non-DPDK mode) | The number of queues is the same as that of dispatchers. Utilizes one queue per dispatcher. |
You need to enable the se_image_property and hw_vif_multiqueue_enabled parameters in OpenStack to utilize max_queues_per_vnic. This ensures that the number of queues equals the number of vCPUs.
| Code | Description |
|---|---|
| se_dispatcher_cores | Total number of SE cores handled by the dispatcher. |
| g_num_queues_per_dispatcher | Total number of queues handled by the dispatcher. |
| max_queues_per_vnic | Total number of queues per vNIC. |
The max_queues_per_vnic value is derived from the Service Engine group property. In environments that use VIRTIO (OpenStack, KVM, excluding GCP), the queue size is 256, and one dispatcher core can handle all the queues of the VNIC. This variable is known as se_dispatcher_cores.
The vnic_owner on the interface of the Service Engine shows the Service Engine core that is assigned as the dispatcher core. The num_dispatcher_cores in the SE group property only shows the configured value; the real run-time value (num_dispatcher_cpu) depends on many other factors, such as the number of cores on the SE, the type of cloud, and other configuration values such as the RSS setting.
The num_dispatcher_cpu value is calculated as follows:

num_dispatcher_cpu = max_num_se_dps (defaults to the number of cores on the SE) - num_flow_cpu

Where max_num_se_dps is an optional field in the SE group property. It takes the number of cores on the SE as the default value.
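For example, on an SE where 6 of 8 cores are used for flow processing (example values only, not defaults):

```python
# Worked example of the num_dispatcher_cpu formula above (example values).

se_cores = 8                  # cores on the SE
max_num_se_dps = se_cores     # optional SE group field; defaults to the SE core count
num_flow_cpu = 6              # example value for flow-processing cores

num_dispatcher_cpu = max_num_se_dps - num_flow_cpu
print(num_dispatcher_cpu)     # 2 dispatcher cores in this example
```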
In AWS, the ring size is 1024, and the number of queues equals the number of cores. Hence the queues are distributed across dispatcher cores.
The se_dispatcher_cores value is derived automatically based on the environment:

se_dispatcher_cores = max_queues_per_vnic / g_num_queues_per_dispatcher

In VIRTIO (excluding GCP):

g_num_queues_per_dispatcher = max_queues_per_vnic

In AWS:

g_num_queues_per_dispatcher = max_queues_per_vnic / num_dispatcher_cores_available
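A worked example of these derivations, using example values:

```python
# Worked example of the se_dispatcher_cores derivation (example values).

max_queues_per_vnic = 8

# VIRTIO (excluding GCP): one dispatcher handles all queues of the vNIC.
g_num_queues_per_dispatcher = max_queues_per_vnic
print("VIRTIO:", max_queues_per_vnic // g_num_queues_per_dispatcher)  # 1 dispatcher (compact)

# AWS (ENA): queues are spread across the available dispatcher cores.
num_dispatcher_cores_available = 4
g_num_queues_per_dispatcher = max_queues_per_vnic // num_dispatcher_cores_available
print("AWS:", max_queues_per_vnic // g_num_queues_per_dispatcher)     # 4 dispatchers (distributed)
```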
If se_dispatcher_cores is more than 1, in ipstk_drv_send you will receive the queue number. However, you need to determine the core handling this queue.
The g_rss_queue_to_core_table value is populated during the init of the se_dp process. Indexing this array with the queue number gives the corresponding core, and the queue number is assigned to the m_rsshash field in the mbuf. Since the dispatcher handles multiple queues, it must know the queue number to send the packet out in rte_eth_tx_burst.
If the number of dispatchers is equal to 1, in ipstk_drv_send you get the queue number and assign it to the m_rsshash field, even though the packet will be sent out by only one core, that is, the owner core. This ensures that packets belonging to a particular flow use the same queue, indirectly ensuring that all the queues handled by the owner core are uniformly loaded.
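A minimal sketch of this queue-to-core lookup, assuming an example table (the real path is in the se_dp C code; the table contents here are hypothetical):

```python
# Illustrative queue-to-core lookup. In the SE, g_rss_queue_to_core_table is
# populated at se_dp init; indexing it with the queue number yields the
# dispatcher core, and the queue number is carried in the mbuf (m_rsshash)
# so the TX path knows which queue to use in rte_eth_tx_burst.

g_rss_queue_to_core_table = [0, 0, 1, 1, 2, 2, 3, 3]  # example: 8 queues, 4 dispatcher cores

def owner_core_for_queue(queue_id):
    return g_rss_queue_to_core_table[queue_id]

print(owner_core_for_queue(5))  # core 2 in this example
```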
| max_queues_per_vnic | num_dispatcher_cores | ENA | LSC | VIRTIO | Comments |
|---|---|---|---|---|---|
| Auto (0) | Auto (0) / N | Distributed | Distributed | Compact | If |
| N | Auto (0) / N | Distributed | Distributed | Compact | |
The number of queues per dispatcher will be indicated in show serviceengine <se> se_agent.
With the migration routine, max_queues_per_vnic will be equal to num_dispatcher_cores (if num_dispatcher_cores > 0).
When max_queues_per_vnic is auto, you need to limit the maximum number of queues per dispatcher by first determining the least common maximum ring size and then capping the maximum number of queues per dispatcher accordingly. The intention is to use enough queues to realize an aggregate ring size of 4096 whenever possible. This ensures that CPU resources are preserved, and optimal PPS/burst ability is achieved.
"ring-size-max-qs-per-disp": { "128": 32, "256": 16, "512": 8, "1024": 4, "2048": 2, "4096": 1, "8192": 1, "default": 1 }