The SD-WAN Gateway runs on a standard hypervisor (KVM or VMware ESXi).
Minimum Server Requirements
To run the hypervisor:
- CPU: Intel Xeon with a minimum clock speed of 2.0 GHz; at least 10 cores are required to run a single 8-core Gateway VM at maximum performance.
- ESXi vmxnet3 network scheduling functions must have 2 cores reserved per Gateway virtual machine (VM), regardless of the number of cores assigned to the Gateway.
- Example: Assume a 24-core server running ESXi with vmxnet3. Two 8-core Gateways can be deployed: 2 Gateways × 8 cores = 16 cores reserved for the Gateway application, leaving 8 cores free. Applying the rule above, running these two Gateways at peak performance requires an additional 4 cores (2 cores for each Gateway), for a total of 20 of the 24 cores.
Note: When using SR-IOV, the network scheduling function is offloaded to the pNIC to achieve higher performance. However, the hypervisor must still perform other scheduling functions such as CPU, memory, and NUMA allocation management. Always keep two free cores for hypervisor usage.
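The core-count arithmetic above can be captured in a small helper. This is a minimal sketch of the stated rules (two scheduling cores per Gateway on vmxnet3; two free hypervisor cores in total with SR-IOV), not an official sizing tool:

```python
def cores_required(num_gateways, cores_per_gateway=8, sriov=False):
    """Physical cores needed on the host, per the sizing rules above.

    vmxnet3: reserve 2 extra cores per Gateway VM for network scheduling.
    SR-IOV:  scheduling is offloaded to the pNIC, but 2 cores must stay
             free for other hypervisor functions (CPU, memory, NUMA).
    """
    app_cores = num_gateways * cores_per_gateway
    overhead = 2 if sriov else 2 * num_gateways
    return app_cores + overhead

print(cores_required(2))              # 20 cores for two 8-core Gateways on vmxnet3
print(cores_required(2, sriov=True))  # 18 cores when SR-IOV offloads scheduling
```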
- The CPU must support and enable the following instruction sets: AES-NI, SSSE3, SSE4, RDTSC, RDSEED, RDRAND, AVX/AVX2/AVX512.
- A minimum of 4GB free RAM must be available to the server system aside from the memory assigned to the PGW VMs. One Gateway VM requires 16GB RAM, or 32GB RAM if certificate-based authentication is enabled.
- Minimum of 150GB magnetic or SSD-based persistent disk volume (one Gateway VM requires a 64GB disk volume, or 96GB if certificate-based authentication is enabled).
- Minimum required IOPS performance: 200 IOPS.
- Minimum of one 10GbE network interface port; two ports are preferred when enabling the Gateway partner hand-off interface (1GbE NICs are supported, but will bottleneck performance). SR-IOV is supported on physical NICs based on the Intel 82599/82599ES and Intel X710/XL710 chipsets (see the ‘Enable SR-IOV’ guide).
Note: SR-IOV does not support NIC bonding. For redundant uplinks, use ESXi vSwitch.
- VMware SD-WAN Gateway is a data-plane intensive workload that requires dedicated CPU cycles to ensure optimal performance and reliability. Meeting these settings is required to ensure the Gateway VM does not oversubscribe the underlying hardware, which can destabilize the Gateway service (e.g., NUMA boundary crossing, memory, and/or vCPU oversubscription).
- Ensure that the SD-WAN Partner Gateway VM and its supporting resources (network interfaces, memory, and physical CPUs) fit within a single NUMA node.
Note: Configure the host BIOS settings as follows:
- Hyper-threading - Turned off
- Power Savings - Turned off
- CPU Turbo - Enabled
- AES-NI - Enabled
- NUMA Node Interleaving - Turned off
- Use ESXi host version: ESXi-6.7.0-14320388-standard or above
- Upgrade the VM compatibility before starting the SD-WAN Gateway instance
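The instruction-set requirement can be sanity-checked on a Linux host by parsing /proc/cpuinfo. This is a minimal sketch; the flag names below follow Linux /proc/cpuinfo conventions (e.g., AES-NI appears as aes) and cover a representative subset of the required sets (AVX512 and RDTSC map to several kernel-specific flag names):

```python
# Required flags as reported in /proc/cpuinfo (representative subset).
REQUIRED_FLAGS = {"aes", "ssse3", "sse4_1", "sse4_2", "rdseed", "rdrand", "avx", "avx2"}

def missing_cpu_flags(cpuinfo_text, required=REQUIRED_FLAGS):
    """Return the required flags absent from a /proc/cpuinfo dump."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            flags.update(line.split(":", 1)[1].split())
    return set(required) - flags

# On the target host: missing_cpu_flags(open("/proc/cpuinfo").read())
sample = "flags\t\t: fpu aes ssse3 sse4_1 sse4_2 rdseed rdrand avx avx2"
print(missing_cpu_flags(sample))  # empty set when all required flags are present
```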
Example Server Specifications
NIC Chipset | Hardware | Specification |
---|---|---|
Intel 82599/82599ES | HP DL380G9 | http://www.hp.com/hpinfo/newsroom/press_kits/2014/ComputeEra/HP_ProLiantDL380_DataSheet.pdf |
Intel X710/XL710 | Dell PowerEdge R640 | https://www.dell.com/en-us/work/shop/povw/poweredge-r640 |
Intel X710/XL710 | Supermicro SYS-6018U-TRTP+ | https://www.supermicro.com/en/products/system/1U/6018/SYS-6018U-TRTP_.cfm |
Required NIC Specifications for SR-IOV Support
Hardware Manufacturer | Firmware Version | Host Driver for Ubuntu 18.04 | Host Driver for ESXi 6.7 |
---|---|---|---|
Dual Port Intel Corporation Ethernet Controller XL710 for 40GbE QSFP+ | 7.0 | 2.10.19.30 | 1.8.6 and 1.10.9.0 |
Dual Port Intel Corporation Ethernet Controller X710 for 10GbE SFP+ | 7.0 | 2.10.19.30 | 1.8.6 and 1.10.9.0 |
Quad Port Intel Corporation Ethernet Controller X710 for 10GbE SFP+ | 7.0 | 2.10.19.30 | 1.8.6 and 1.10.9.0 |
Dell rNDC X710/350 card | nvm 7.10 and FW 19.0.12 | 2.10.19.30 | 1.8.6 and 1.10.9.0 |
Supported Hypervisor Versions
Hypervisor | Supported Versions |
---|---|
VMware | |
KVM | |
SD-WAN Gateway Virtual Machine (VM) Specification
- If using VMware ESXi:
- Latency Sensitivity must be set to 'High'.
- Procedure (Adjust Latency Sensitivity)
- Browse to the virtual machine in the vSphere Client.
- To find a virtual machine, select a data center, folder, cluster, resource pool, or host.
- Click the VMs tab.
- Right-click the virtual machine, and click Edit Settings.
- Click VM Options and click Advanced.
- Select a setting from the Latency Sensitivity drop-down menu.
- Click OK.
- CPU reservation set to 100%.
- CPU shares set to high.
- CPU Limit must be set to Unlimited.
- 8 vCPUs (4vCPUs are supported but expect lower performance).
Important: All vCPU cores should be mapped to the same socket, with the Cores per Socket parameter set to 8 when using 8 vCPUs, or 4 when using 4 vCPUs.
Note: Hyper-threading must be deactivated to achieve maximum performance.
- Procedure for Allocate CPU Resources:
- Click Virtual Machines in the VMware Host Client inventory.
- Right-click a virtual machine from the list and select Edit settings from the pop-up menu.
- On the Virtual Hardware tab, expand CPU, and allocate CPU capacity for the virtual machine.
Option | Description |
---|---|
Reservation | Guaranteed CPU allocation for this virtual machine. |
Limit | Upper limit for this virtual machine’s CPU allocation. Select Unlimited to specify no upper limit. |
Shares | CPU shares for this virtual machine in relation to the parent’s total. Sibling virtual machines share resources according to their relative share values, bounded by the reservation and limit. Select Low, Normal, or High, which specify share values in a 1:2:4 ratio. Select Custom to give each virtual machine a specific number of shares, which express a proportional weight. |
- CPU affinity must be enabled. Follow the steps below.
- In the vSphere Web Client go to the VM Settings tab.
- Choose the Options tab and click Advanced General > Configuration Parameters.
- Add entries for numa.nodeAffinity=0, 1, ..., where 0 and 1 are the processor socket numbers.
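The same affinity setting can also be written as an advanced-configuration entry in the VM's .vmx file. The helper below is an illustrative sketch; the exact key/value syntax should be confirmed against your vSphere version:

```python
def numa_affinity_entry(socket_numbers):
    """Format a numa.nodeAffinity advanced-configuration entry for the
    given processor socket numbers (illustrative .vmx-style syntax)."""
    return 'numa.nodeAffinity = "{}"'.format(",".join(str(s) for s in socket_numbers))

print(numa_affinity_entry([0]))  # pin the Gateway VM to processor socket 0
```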
- vNIC must be of type 'vmxnet3' (or SR-IOV, see SR-IOV section for support details).
- At least one of the following vNICs is required:
- The First vNIC is the public (outside) interface, which must be an untagged interface.
- The Second vNIC is optional and acts as the private (inside) interface that can support VLAN tagging dot1q and Q-in-Q. This interface typically faces the PE router or L3 switch.
- Optional vNIC (if a separate management/OAM interface is required).
- Memory reservation must be set to ‘maximum’.
- 16GB of memory (32GB RAM is required when enabling certificate-based authentication).
- 64GB of virtual disk (96GB disk is required when enabling certificate-based authentication).
Note: VMware uses the above defined settings to obtain scale and performance numbers. Settings that do not align to the above requirements are not tested by VMware and can yield unpredictable performance and scale results.
- If using KVM:
- vNIC must be of 'Linux Bridge' type. (SR-IOV is required for high performance, see SR-IOV section for support details).
- 8 vCPUs (4vCPUs are supported but expect lower performance).
Important: All vCPU cores should be mapped to the same socket, with the Cores per Socket parameter set to 8 when using 8 vCPUs, or 4 when using 4 vCPUs.
Note: Hyper-threading must be deactivated to achieve maximum performance.
- 16GB of memory (32GB RAM is required when enabling certificate-based authentication).
- At least one of the following vNICs is required:
- The First vNIC is the public (outside) interface, which must be an untagged interface.
- The Second vNIC is optional and acts as the private (inside) interface that can support VLAN tagging dot1q and Q-in-Q. This interface typically faces the PE router or L3 switch.
- Optional vNIC (if a separate management/OAM interface is required).
- 64GB of virtual disk (96GB disk is required when enabling certificate-based authentication).
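The per-VM figures above, which are the same for the ESXi and KVM specifications, can be summarized in a small sizing helper (a minimal sketch, not an official tool):

```python
def gateway_vm_spec(cert_auth=False, vcpus=8):
    """Per-VM resources from the specification above. Certificate-based
    authentication doubles RAM (16 -> 32 GB) and raises disk (64 -> 96 GB)."""
    if vcpus not in (8, 4):
        raise ValueError("8 vCPUs recommended; 4 supported at lower performance")
    return {
        "vcpus": vcpus,
        "ram_gb": 32 if cert_auth else 16,
        "disk_gb": 96 if cert_auth else 64,
    }

print(gateway_vm_spec(cert_auth=True))  # {'vcpus': 8, 'ram_gb': 32, 'disk_gb': 96}
```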
Firewall/NAT Requirements
- The firewall needs to allow outbound traffic from the SD-WAN Gateway to TCP/443 (for communication with SD-WAN Orchestrator).
- The firewall needs to allow inbound traffic from the Internet to UDP/2426 (VCMP), UDP/4500, and UDP/500. If NAT is not used, the firewall must also allow IP protocol 50 (ESP).
- If NAT is used, the above ports must be translated to an externally reachable IP address. Both 1:1 NAT and port translation are supported.
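For reference, the openings above can be kept as data, for example to drive an automated audit of the firewall configuration (a minimal sketch; the rule structure is illustrative):

```python
# (direction, protocol, port); port None means the whole IP protocol.
REQUIRED_RULES = [
    ("outbound", "tcp", 443),   # Gateway -> SD-WAN Orchestrator
    ("inbound",  "udp", 2426),  # VCMP
    ("inbound",  "udp", 4500),  # IPsec NAT traversal
    ("inbound",  "udp", 500),   # IKE
]

def required_rules(nat=True):
    """Return the firewall openings needed for a Gateway deployment.
    Without NAT, ESP (IP protocol 50) must also be allowed inbound."""
    extra = [] if nat else [("inbound", "esp", None)]
    return REQUIRED_RULES + extra

for rule in required_rules(nat=False):
    print(rule)
```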
Git Repository with Templates and Samples
The following Git repository contains templates and samples.
git clone https://gitlab.eng.vmware.com/velocloud/velocloud.src.git
Use of DPDK on SD-WAN Gateways
To improve packet throughput performance, SD-WAN Gateways take advantage of Data Plane Development Kit (DPDK) technology. DPDK is a set of data plane libraries and drivers provided by Intel for offloading packet processing from the operating system kernel to processes running in user space, which results in higher packet throughput. For more details, see https://www.dpdk.org/.
On VMware hosted Gateways and Partner Gateways, DPDK is used on interfaces that manage data plane traffic and is not used on interfaces reserved for management plane traffic. For example, on a typical VMware hosted Gateway, eth0 is used for management plane traffic and would not use DPDK. In contrast, eth1, eth2, and eth3 are used for data plane traffic and use DPDK.
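That interface split can be expressed as a simple predicate. This is a minimal sketch; the assumption that eth0 is the management interface matches the typical layout described above but is deployment-specific:

```python
def uses_dpdk(interface, management_interfaces=("eth0",)):
    """True if the interface carries data-plane traffic and is therefore
    DPDK-managed; management-plane interfaces stay on the kernel stack."""
    return interface not in management_interfaces

print([i for i in ("eth0", "eth1", "eth2", "eth3") if uses_dpdk(i)])
# ['eth1', 'eth2', 'eth3']
```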