VMware Cloud Foundation uses vSphere Distributed Switch for virtual networking.

Logical vSphere Networking Design for VMware Cloud Foundation

When you design vSphere networking, consider the configuration of the vSphere Distributed Switches, distributed port groups, and VMkernel adapters in the VMware Cloud Foundation environment.

vSphere Distributed Switch Design

The default cluster in a workload domain uses a single vSphere Distributed Switch with a configuration for system traffic types, NIC teaming, and MTU size.

VMware Cloud Foundation supports NSX Overlay traffic over a single vSphere Distributed Switch per cluster. Additional distributed switches are supported for other traffic types.

When using vSAN ReadyNodes, you must define the number of vSphere Distributed Switches at workload domain deployment time. You cannot add more vSphere Distributed Switches after deployment.

Table 1. Configuration Options for vSphere Distributed Switch for VMware Cloud Foundation

Single vSphere Distributed Switch for hosts with two physical NICs
  Management Domain Options: One vSphere Distributed Switch for each cluster with all traffic using two uplinks.
  VI Workload Domain Options: One vSphere Distributed Switch for each cluster with all traffic using two uplinks.
  Benefits: Requires the least number of physical NICs and switch ports.
  Drawbacks: All traffic shares the same two uplinks.

Single vSphere Distributed Switch for hosts with four or six physical NICs
  Management Domain Options:
    • One vSphere Distributed Switch for each cluster with four uplinks by using the predefined profiles in the Deployment Parameters Workbook in VMware Cloud Builder to deploy the default management cluster.
    • One vSphere Distributed Switch for each cluster with four or six uplinks by using the VMware Cloud Builder API to deploy the default management cluster.
  VI Workload Domain Options: One vSphere Distributed Switch for each cluster with four or six uplinks.
  Benefits: Provides support for traffic separation across different uplinks.
  Drawbacks: You must provide additional physical NICs and switch ports.

Multiple vSphere Distributed Switches
  Management Domain Options:
    • Maximum two vSphere Distributed Switches by using the predefined profiles in the Deployment Parameters Workbook in VMware Cloud Builder to deploy the default management cluster.
    • Maximum 16 vSphere Distributed Switches per cluster. You use the VMware Cloud Builder API to deploy the default management cluster using combinations of vSphere Distributed Switches and physical NIC configurations that are not available as predefined profiles in the Deployment Parameters Workbook.
    • You can use only one of the vSphere Distributed Switches for NSX overlay traffic.
  VI Workload Domain Options:
    • Maximum 16 vSphere Distributed Switches per cluster.
    • You can use only one of the vSphere Distributed Switches for NSX overlay traffic.
  Benefits:
    • Provides support for traffic separation across different uplinks or vSphere Distributed Switches.
    • Provides support for traffic separation onto different physical network fabrics.
  Drawbacks:
    • You must provide additional physical NICs and switch ports.
    • More complex with additional configuration and management overhead.
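
VMware Cloud Builder and SDDC Manager create the distributed switches for you during bring-up and workload domain deployment. As an illustration of the underlying vCenter Server API only, the following minimal pyVmomi sketch creates a single distributed switch with two uplinks and an MTU of 9000 for a cluster. The vCenter Server address, credentials, datacenter name, and switch name are hypothetical placeholders.

```python
# Sketch: create a vSphere Distributed Switch with two uplinks and an MTU of
# 9000 by using pyVmomi. All names and credentials are placeholders.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVim.task import WaitForTask
from pyVmomi import vim

context = ssl._create_unverified_context()   # lab only; validate certificates in production
si = SmartConnect(host="vcenter.example.com",
                  user="administrator@vsphere.local",
                  pwd="***", sslContext=context)
content = si.RetrieveContent()

# Locate the target datacenter (hypothetical name).
datacenter = next(dc for dc in content.rootFolder.childEntity
                  if isinstance(dc, vim.Datacenter) and dc.name == "sfo-m01-dc01")

# Switch configuration: two uplinks and jumbo frames for system traffic.
config = vim.dvs.VmwareDistributedVirtualSwitch.ConfigSpec()
config.name = "sfo-m01-cl01-vds01"
config.maxMtu = 9000
config.uplinkPortPolicy = vim.DistributedVirtualSwitch.NameArrayUplinkPortPolicy(
    uplinkPortName=["uplink1", "uplink2"])

create_spec = vim.DistributedVirtualSwitch.CreateSpec(configSpec=config)
WaitForTask(datacenter.networkFolder.CreateDVS_Task(spec=create_spec), si=si)
Disconnect(si)
```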

Distributed Port Group Design

VMware Cloud Foundation requires several port groups on the vSphere Distributed Switch for a workload domain. The VMkernel adapters for the NSX host TEPs are connected to the host overlay network, but do not require a dedicated port group on the distributed switch. The VMkernel network adapter for NSX host TEP is automatically created when VMware Cloud Foundation configures the ESXi host as an NSX transport node.
Table 2. Distributed Port Group Configuration for VMware Cloud Foundation

VM management, Host management, vSphere vMotion, vSAN, and NFS (NFS is not applicable for the default cluster of the management domain)
  Teaming Policy: Route based on physical NIC load (recommended).
  Configuration (recommended):
    • Failover Detection: Link status only
    • Failback: Yes. Occurs only on saturation of the active uplink.
    • Notify Switches: Yes

Host overlay
  Teaming Policy: Not applicable.
  Configuration: Not applicable.

Edge uplinks and overlay
  Teaming Policy: Use explicit failover order (required).

Edge RTEP (NSX Federation only)
  Teaming Policy: Not applicable.
  Configuration: Not applicable.
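
SDDC Manager creates these port groups automatically. To show how the teaming and failover settings in Table 2 map to the vSphere API, the following pyVmomi sketch adds a distributed port group with load-based teaming. The dvs object, port group name, and VLAN ID are placeholders, and static (early) binding is used, as recommended for non-management port groups.

```python
# Sketch: add a distributed port group with the Table 2 teaming and failover
# settings by using pyVmomi. Assumes an active SmartConnect session; the dvs
# object, port group name, and VLAN ID are placeholders.
from pyVim.task import WaitForTask
from pyVmomi import vim

def add_port_group(dvs, name, vlan_id, binding="earlyBinding"):
    pg = vim.dvs.DistributedVirtualPortgroup.ConfigSpec()
    pg.name = name
    pg.numPorts = 8
    pg.type = binding            # "earlyBinding" (static); use "ephemeral" for VM management

    port_config = vim.dvs.VmwareDistributedVirtualSwitch.VmwarePortConfigPolicy()
    port_config.vlan = vim.dvs.VmwareDistributedVirtualSwitch.VlanIdSpec(
        vlanId=vlan_id, inherited=False)

    teaming = vim.dvs.VmwareDistributedVirtualSwitch.UplinkPortTeamingPolicy()
    teaming.policy = vim.StringPolicy(value="loadbalance_loadbased")  # Route based on physical NIC load
    teaming.notifySwitches = vim.BoolPolicy(value=True)               # Notify Switches: Yes
    teaming.rollingOrder = vim.BoolPolicy(value=False)                # Failback: Yes
    # Failover detection "Link status only" corresponds to beacon probing
    # being disabled, which is the default failure criteria.
    port_config.uplinkTeamingPolicy = teaming

    pg.defaultPortConfig = port_config
    WaitForTask(dvs.AddDVPortgroup_Task(spec=[pg]))

# Example usage with a hypothetical port group name and VLAN ID:
# add_port_group(dvs, "sfo-m01-cl01-vds01-pg-vmotion", 1612)
```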

VMkernel Network Adapter Design

The VMkernel networking layer provides connectivity to hosts and handles the system traffic for management, vSphere vMotion, vSphere HA, vSAN, NFS, and other services.

Table 3. VMkernel Adapters for Workload Domain Hosts

Management
  Connected Port Group: Management Port Group
  Activated Services: Management Traffic
  Recommended MTU Size (Bytes): 1500 (Default)

vMotion
  Connected Port Group: vMotion Port Group
  Activated Services: vMotion Traffic
  Recommended MTU Size (Bytes): 9000

vSAN
  Connected Port Group: vSAN Port Group
  Activated Services: vSAN
  Recommended MTU Size (Bytes): 9000

NFS
  Connected Port Group: NFS Port Group
  Activated Services: NFS
  Recommended MTU Size (Bytes): 9000

Host TEPs
  Connected Port Group: Not applicable
  Activated Services: Not applicable
  Recommended MTU Size (Bytes): 9000
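
In a VMware Cloud Foundation deployment, SDDC Manager creates these VMkernel adapters for you. As an illustration of the settings in Table 3 only, the following pyVmomi sketch adds a vMotion VMkernel adapter to a distributed port group with an MTU of 9000 and activates the vMotion service on it. The host, switch, port group, and IP values are placeholders.

```python
# Sketch: create a vMotion VMkernel adapter on a distributed port group with
# MTU 9000 and tag it for vMotion traffic. All inputs are placeholders.
from pyVmomi import vim

def add_vmotion_vmk(host, dvs, portgroup, ip, netmask):
    spec = vim.host.VirtualNic.Specification()
    spec.mtu = 9000                                    # jumbo frames, per Table 3
    spec.ip = vim.host.IpConfig(dhcp=False, ipAddress=ip, subnetMask=netmask)
    spec.distributedVirtualPort = vim.dvs.PortConnection(
        switchUuid=dvs.uuid, portgroupKey=portgroup.key)

    # The port group argument is empty because the connection is defined by
    # distributedVirtualPort. The call returns the new device name (vmkN).
    device = host.configManager.networkSystem.AddVirtualNic("", spec)

    # Activate the vMotion service on the new adapter.
    host.configManager.virtualNicManager.SelectVnicForNicType("vmotion", device)
    return device

# Example usage with hypothetical objects and addresses:
# add_vmotion_vmk(host, dvs, vmotion_pg, "172.16.12.101", "255.255.255.0")
```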

vSphere Distributed Switch Data Path Modes

vSphere Distributed Switch supports three data path modes: Standard Datapath, Enhanced Datapath Interrupt, and Enhanced Datapath. A data path is a networking stack mode that is configured on a vSphere Distributed Switch when an NSX Transport Node Profile is applied during the installation of NSX on an ESXi cluster. Each data path mode has performance characteristics that are suitable for particular workloads running on the cluster. The following table describes the modes available in VMware Cloud Foundation and the recommended cluster workload types for each mode.

Table 4. Data Path Modes for VMware Cloud Foundation

Standard
  Description:
    • Standard data path is installed by default. It is designed to address applications with large flows.
    • CPU usage in Standard Datapath is on demand. For applications with high packet processing requirements, such as NSX Edge, Standard Datapath requires significant tuning.
  Use Cases: Compute workload domains or clusters.
  Requirements: The driver-firmware combination must be on the VMware Compatibility Guide for I/O Devices and must support the following features:
    • Geneve Offload
    • Geneve Rx/Tx Filters or RSS

Enhanced Datapath Interrupt (referred to as Enhanced Datapath - Standard in the NSX Manager UI)
  Description:
    • Enhanced Datapath Interrupt is a performance-oriented data path that combines the flexibility of on-demand CPU usage of the existing standard data path with Data Plane Development Kit (DPDK)-like features for performance.
    • This mode auto-scales core usage up and down for packet processing, based on need.
    • Enhanced Datapath Interrupt has proven performance characteristics, especially for smaller flows that primarily focus on packet processing, such as NSX Edge, with no additional tuning required.
  Use Cases: vSphere clusters running NSX Edge nodes.
  Requirements: The driver-firmware combination must be on the VMware Compatibility Guide for I/O Devices with Enhanced Data Path - Interrupt mode support.

Enhanced Datapath (referred to as Enhanced Datapath - Performance in the NSX Manager UI)
  Description:
    • Enhanced Datapath mode is a performance-oriented data path that leverages DPDK-like performance features, including dedicated CPU cores for networking data path processing.
    • This mode is best suited for workloads where the traffic patterns and performance requirements are well defined.
    • This mode is fixed from a core allocation point of view and does not auto-scale up or down based on need. The cores assigned to the data path are not available for workloads, even when there is no network traffic.
    • Packet processing might run out of cores unless the cores are pre-assigned based on accurate sizing of the workload.
  Use Cases: Telco or NFV workloads.
  Requirements: The driver-firmware combination must be on the VMware Compatibility Guide for I/O Devices with Enhanced Data Path - Poll mode support.
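
Because the data path mode is carried in the transport node profile that NSX applies to the cluster, you can inspect or change it there. The following Python sketch is an illustration only: it reads and updates the host_switch_mode property of a transport node profile through the NSX Manager REST API. The manager address, credentials, profile ID, and the availability of the /api/v1/transport-node-profiles endpoint depend on your NSX version and are assumptions here.

```python
# Sketch: read and change the data path mode (host_switch_mode) on an NSX
# transport node profile through the NSX Manager REST API. Manager address,
# credentials, and the profile ID are placeholders; verify the endpoint and
# field names against the API reference for your NSX version.
import requests

NSX_MANAGER = "https://nsx-manager.example.com"
AUTH = ("admin", "***")
PROFILE_ID = "replace-with-transport-node-profile-id"

url = f"{NSX_MANAGER}/api/v1/transport-node-profiles/{PROFILE_ID}"
profile = requests.get(url, auth=AUTH, verify=False).json()

for host_switch in profile["host_switch_spec"]["host_switches"]:
    # Expected values: STANDARD, ENS (Enhanced Datapath), ENS_INTERRUPT
    # (Enhanced Datapath Interrupt).
    print(host_switch.get("host_switch_name"), host_switch.get("host_switch_mode"))
    host_switch["host_switch_mode"] = "ENS_INTERRUPT"  # e.g., for an NSX Edge cluster

# The PUT body must include the _revision value returned by the GET call,
# which is preserved here because the full profile is sent back.
requests.put(url, auth=AUTH, json=profile, verify=False)
```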

vSphere Networking Design Requirements and Recommendations for VMware Cloud Foundation

Consider the requirements and recommendations for vSphere networking in VMware Cloud Foundation, such as distributed port group configuration, MTU size, port binding, teaming policy, and traffic-specific network shares.

vSphere Networking Design Requirements for VMware Cloud Foundation

You must meet the following design requirements in your vSphere networking design for VMware Cloud Foundation.

Table 5. vSphere Networking Design Requirements for a Multi-Rack Compute VI Workload Domain Cluster for VMware Cloud Foundation

VCF-VDS-L3MR-REQD-CFG-001
  Design Requirement: For each rack, create a port group on the vSphere Distributed Switch for the cluster for the following traffic types:
    • Host management
    • vSAN
    • vSphere vMotion
  Justification: Required for using separate VLANs per rack.
  Implication: None.
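
To illustrate this requirement, the following sketch reuses the add_port_group() helper from the distributed port group sketch earlier in this section to create rack-specific port groups with separate VLAN IDs. The rack names, VLAN IDs, and port group naming convention are hypothetical.

```python
# Sketch: per-rack host management, vSphere vMotion, and vSAN port groups with
# rack-specific VLAN IDs, reusing the add_port_group() helper defined earlier.
# Rack names, VLAN IDs, and the naming convention are hypothetical.
racks = {
    "rack1": {"management": 1611, "vmotion": 1612, "vsan": 1613},
    "rack2": {"management": 1621, "vmotion": 1622, "vsan": 1623},
}

for rack, vlans in racks.items():
    for function, vlan_id in vlans.items():
        add_port_group(dvs, f"wld01-cl01-vds01-pg-{function}-{rack}", vlan_id)
```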

vSphere Networking Design Recommendations for VMware Cloud Foundation

In your vSphere networking design for VMware Cloud Foundation, you can apply certain best practices for vSphere Distributed Switch and distributed port groups.

Table 6. vSphere Networking Design Recommendations for VMware Cloud Foundation

VCF-VDS-RCMD-CFG-001
  Design Recommendation: Use a single vSphere Distributed Switch per cluster.
  Justification: Reduces the complexity of the network design.
  Implication: Reduces the number of vSphere Distributed Switches that must be managed per cluster.

VCF-VDS-RCMD-CFG-002
  Design Recommendation: Do not share a vSphere Distributed Switch across clusters.
  Justification:
    • Enables independent lifecycle management of the vSphere Distributed Switch for each cluster.
    • Reduces the size of the fault domain.
  Implication: For multiple clusters, you manage more vSphere Distributed Switches.

VCF-VDS-RCMD-CFG-003
  Design Recommendation: Configure the MTU size of the vSphere Distributed Switch to 9000 for jumbo frames.
  Justification:
    • Supports the MTU size required by system traffic types.
    • Improves traffic throughput.
  Implication: When adjusting the MTU packet size, you must also configure the entire network path (VMkernel ports, virtual switches, physical switches, and routers) to support the same MTU packet size.

VCF-VDS-RCMD-DPG-001
  Design Recommendation: Use ephemeral port binding for the VM management port group. (The VM management network is not required for a multi-rack compute-only cluster in a VI workload domain.)
  Justification: Using ephemeral port binding provides the option for recovery of the vCenter Server instance that is managing the distributed switch.
  Implication: Port-level permissions and controls are lost across power cycles, and no historical context is saved.

VCF-VDS-RCMD-DPG-002
  Design Recommendation: Use static port binding for all non-management port groups.
  Justification: Static binding ensures that a virtual machine connects to the same port on the vSphere Distributed Switch. This configuration provides support for historical data and port-level monitoring.
  Implication: None.

VCF-VDS-RCMD-DPG-003
  Design Recommendation: Use the Route based on physical NIC load teaming algorithm for the VM management port group. (The VM management network is not required for a compute-only L3 multi-rack deployment.)
  Justification: Reduces the complexity of the network design, increases resiliency, and can adjust to fluctuating workloads.
  Implication: None.

VCF-VDS-RCMD-DPG-004
  Design Recommendation: Use the Route based on physical NIC load teaming algorithm for the ESXi management port group.
  Justification: Reduces the complexity of the network design, increases resiliency, and can adjust to fluctuating workloads.
  Implication: None.

VCF-VDS-RCMD-DPG-005
  Design Recommendation: Use the Route based on physical NIC load teaming algorithm for the vSphere vMotion port group.
  Justification: Reduces the complexity of the network design, increases resiliency, and can adjust to fluctuating workloads.
  Implication: None.

VCF-VDS-RCMD-DPG-006
  Design Recommendation: Use the Route based on physical NIC load teaming algorithm for the vSAN port group.
  Justification: Reduces the complexity of the network design, increases resiliency, and can adjust to fluctuating workloads.
  Implication: None.

VCF-VDS-RCMD-NIO-001
  Design Recommendation: Enable Network I/O Control on the vSphere Distributed Switch of the management domain cluster. Do not enable Network I/O Control on dedicated vSphere clusters for NSX Edge nodes.
  Justification: Increases resiliency and performance of the network.
  Implication: Network I/O Control might impact network performance for critical traffic types if misconfigured.

VCF-VDS-RCMD-NIO-002
  Design Recommendation: Set the share value for management traffic to Normal.
  Justification: By keeping the default setting of Normal, management traffic is prioritized higher than vSphere vMotion but lower than vSAN traffic. Management traffic is important because it ensures that the hosts can still be managed during times of network contention.
  Implication: None.

VCF-VDS-RCMD-NIO-003
  Design Recommendation: Set the share value for vSphere vMotion traffic to Low.
  Justification: During times of network contention, vSphere vMotion traffic is not as important as virtual machine or storage traffic.
  Implication: During times of network contention, vMotion takes longer than usual to complete.

VCF-VDS-RCMD-NIO-004
  Design Recommendation: Set the share value for virtual machines to High.
  Justification: Virtual machines are the most important asset in the SDDC. Leaving the default setting of High ensures that they always have access to the network resources they need.
  Implication: None.

VCF-VDS-RCMD-NIO-005
  Design Recommendation: Set the share value for vSAN traffic to High.
  Justification: During times of network contention, vSAN traffic needs guaranteed bandwidth to support virtual machine performance.
  Implication: None.

VCF-VDS-RCMD-NIO-006
  Design Recommendation: Set the share value for other traffic types to Low.
  Justification: By default, VMware Cloud Foundation does not use other traffic types, like vSphere FT traffic. Hence, these traffic types can be set to the lowest priority.
  Implication: None.
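
As an illustration of recommendations VCF-VDS-RCMD-NIO-001 through VCF-VDS-RCMD-NIO-006, the following pyVmomi sketch enables Network I/O Control on the distributed switch of the management domain cluster and applies the recommended share values. The dvs object is a placeholder, and the vim.DistributedVirtualSwitch.HostInfrastructureTrafficResource type path is an assumed mapping of the DvsHostInfrastructureTrafficResource data object into pyVmomi; verify it against your pyVmomi version.

```python
# Sketch: enable Network I/O Control and set the recommended share values on a
# vSphere Distributed Switch. Assumes an active SmartConnect session and a dvs
# managed object reference; the traffic-resource type paths are assumptions to
# verify against your pyVmomi version.
from pyVim.task import WaitForTask
from pyVmomi import vim

# Traffic class -> (share level, share value). Classes not listed here, such
# as NFS or vSphere Replication, would also be set to Low under
# VCF-VDS-RCMD-NIO-006 if they are unused.
SHARES = {
    "management": ("normal", 50),      # VCF-VDS-RCMD-NIO-002
    "vmotion": ("low", 25),            # VCF-VDS-RCMD-NIO-003
    "virtualMachine": ("high", 100),   # VCF-VDS-RCMD-NIO-004
    "vsan": ("high", 100),             # VCF-VDS-RCMD-NIO-005
    "faultTolerance": ("low", 25),     # VCF-VDS-RCMD-NIO-006 (other traffic)
}

def configure_nioc(dvs):
    # Enable Network I/O Control on the switch (VCF-VDS-RCMD-NIO-001).
    dvs.EnableNetworkResourceManagement(enable=True)

    spec = vim.dvs.VmwareDistributedVirtualSwitch.ConfigSpec()
    spec.configVersion = dvs.config.configVersion
    resources = []
    for key, (level, shares) in SHARES.items():
        resource = vim.DistributedVirtualSwitch.HostInfrastructureTrafficResource()
        resource.key = key
        allocation = vim.DistributedVirtualSwitch.HostInfrastructureTrafficResource.ResourceAllocation()
        allocation.shares = vim.SharesInfo(level=level, shares=shares)
        resource.allocationInfo = allocation
        resources.append(resource)
    spec.infrastructureTrafficResourceConfig = resources
    WaitForTask(dvs.ReconfigureDvs_Task(spec=spec))
```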