Design of the physical data center network includes defining the network topology for connecting the physical switches and the ESXi hosts, determining switch port settings for VLANs and link aggregation, and designing routing.

A software-defined network (SDN) both integrates with and uses components of the physical data center. SDN integrates with your physical network to support east-west transit in the data center and north-south transit to and from the SDDC networks.

Several typical data center network deployment topologies exist:

  • Core-Aggregation-Access
  • Leaf-Spine
  • Hardware SDN

VMware Validated Design uses the leaf-spine networking topology because, in a single data center deployment, it provides predictable performance, scalability, and applicability across multiple vendors.

In an environment with multiple availability zones, Layer 2 networks must be stretched between the availability zones by the physical infrastructure. You also must provide a Layer 3 gateway that is highly available between availability zones. The method for stretching these Layer 2 networks and providing a highly available Layer 3 gateway is vendor-specific.

In an environment with multiple availability zones or regions, dynamic routing is required so that network ingress and egress traffic can fail over from availability zone to availability zone, or from region to region. This design uses BGP as the dynamic routing protocol. As such, BGP must be present in the customer environment to facilitate the failover of networks from site to site. Because of its complexity, a local-ingress, local-egress configuration is not generally used. In this design, network traffic flows in and out of a primary site.

Switch Types and Network Connectivity

Follow the best practices for physical switches, switch connectivity, VLANs and subnets, and access port settings.

Figure 1. Host-to-ToR Connectivity


Table 1. Design Components for Physical Switches in the SDDC

Design Component: Top of rack (ToR) physical switches
Configuration Best Practices:

  • Configure redundant physical switches to enhance availability.

  • Configure switch ports that connect to ESXi hosts manually as trunk ports.

  • Modify the Spanning Tree Protocol (STP) on any port that is connected to an ESXi NIC to reduce the time to transition ports over to the forwarding state, for example, by using the Trunk PortFast feature on a Cisco physical switch.

  • Provide DHCP or DHCP Helper capabilities on all VLANs used by TEP VMkernel ports. This setup simplifies the configuration by using DHCP to assign IP addresses based on the IP subnet in use.

  • Configure jumbo frames on all switch ports, inter-switch links (ISLs), and switched virtual interfaces (SVIs).

Design Component: Top of rack connectivity and network settings
Configuration Best Practices: Each ESXi host is connected redundantly to the SDDC network fabric through two 25 GbE ports on the top of rack switches. Configure the top of rack switches to provide all necessary VLANs using an 802.1Q trunk. These redundant connections use features in vSphere Distributed Switch and NSX-T Data Center to guarantee that no physical interface is overrun and that the available redundant paths are used.

VLANs and Subnets for Single Region and Single Availability Zone

Each ESXi host uses VLANs and corresponding subnets.

Follow these guidelines:

  • Use only /24 subnets to reduce confusion and mistakes when handling IPv4 subnetting.

  • Use the IP address .254 as the (floating) gateway interface, with .252 and .253 for the Virtual Router Redundancy Protocol (VRRP) or Hot Standby Router Protocol (HSRP) peers; see the sketch after this list.

  • Use the RFC1918 IPv4 address space for these subnets and allocate one octet by region and another octet by function.
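
The following minimal sketch, using Python's standard ipaddress module, illustrates the addressing convention described in these guidelines. The function name and the example subnet are illustrative only and are not part of the design.

```python
# Minimal sketch, assuming the /24 and .254/.253/.252 conventions above.
import ipaddress

def gateway_addresses(subnet_cidr: str) -> dict:
    """Return the floating gateway (.254) and the two router interface
    addresses (.252 and .253) used for VRRP or HSRP on a /24 subnet."""
    net = ipaddress.ip_network(subnet_cidr, strict=True)
    if net.prefixlen != 24:
        raise ValueError("This design uses only /24 subnets")
    hosts = list(net.hosts())            # .1 through .254 for a /24
    return {
        "floating_gateway": hosts[-1],   # .254 (VRRP/HSRP virtual IP)
        "router_b": hosts[-2],           # .253 (physical router interface)
        "router_a": hosts[-3],           # .252 (physical router interface)
    }

# Example with the management subnet from Table 2:
print(gateway_addresses("172.16.11.0/24"))
```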

Note: Implement VLAN and IP subnet configuration according to the requirements of your organization.

Table 2. VLANs and IP Ranges in This Design

Function           VLAN ID   IP Range
Management         1611      172.16.11.0/24
vSphere vMotion    1612      172.16.12.0/24
vSAN               1613      172.16.13.0/24
Host Overlay       1614      172.16.14.0/24
NFS                1615      172.16.15.0/24
Uplink01           2711      172.27.11.0/24
Uplink02           2712      172.27.12.0/24
Edge Overlay       2713      172.27.13.0/24

VLANs and Subnets for Multiple Availability Zones

In an environment with multiple availability zones, apply the following configuration. The management, Uplink01, Uplink02, and Edge Overlay networks must be stretched between the availability zones to facilitate failover of the NSX-T Edge appliances between availability zones. The Layer 3 gateway for the management and Edge Overlay networks must be highly available across the availability zones.

Function           Availability Zone 1   Availability Zone 2   VLAN ID            IP Range         HA Layer 3 Gateway
Management         X                     X                     1611 (Stretched)   172.16.11.0/24   X
vSphere vMotion    X                                           1612               172.16.12.0/24
vSAN               X                                           1613               172.16.13.0/24
Host Overlay       X                                           1614               172.16.14.0/24
Uplink01           X                     X                     2711 (Stretched)   172.27.11.0/24
Uplink02           X                     X                     2712 (Stretched)   172.27.12.0/24
Edge Overlay       X                     X                     2713 (Stretched)   172.27.13.0/24   X
Management                               X                     1621               172.16.21.0/24
vSphere vMotion                          X                     1622               172.16.22.0/24
vSAN                                     X                     1623               172.16.23.0/24
Host Overlay                             X                     1624               172.16.24.0/24

Physical Network Requirements

Physical requirements determine the MTU size for networks that carry overlay traffic, dynamic routing support, time synchronization through an NTP server, and forward and reverse DNS resolution.

Requirement: Use one 25 GbE (10 GbE minimum) port on each ToR switch for ESXi host uplinks. Connect each host to two ToR switches.
Comment: 25 GbE provides the required bandwidth for hyperconverged networking traffic. Connection to two ToR switches provides redundant physical network paths to each host.

Requirement: Provide an MTU size of 1,700 bytes or greater on any network that carries Geneve overlay traffic.
Comment: Geneve packets cannot be fragmented. The MTU size must be large enough to support the extra encapsulation overhead. Geneve is an extensible protocol, so the MTU size might increase with future capabilities. While 1,600 bytes is sufficient, an MTU size of 1,700 bytes provides more room for increasing the Geneve MTU without the need to change the MTU of the physical infrastructure. This design uses an MTU size of 9,000 bytes for Geneve traffic.

Requirement: Enable BGP dynamic routing support on the upstream Layer 3 devices.
Comment: You use BGP on the upstream Layer 3 devices to establish routing adjacency with the Tier-0 service routers (SRs). NSX-T Data Center supports only the BGP routing protocol. Dynamic routing enables ECMP failover for upstream connectivity.

Requirement: BGP Autonomous System Number (ASN) allocation.
Comment: A BGP ASN must be allocated for the NSX-T SDN. Use a private ASN according to RFC1930.
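
As an illustration of the ASN guideline above, the following sketch checks whether a candidate ASN falls in the private-use space: 64512 through 65534 for 16-bit ASNs (65535 is reserved), plus the 32-bit private range from RFC 6996, which is included here for completeness. The helper name and the example value 65000 are assumptions for illustration only.

```python
# Minimal sketch: check that a candidate ASN for the NSX-T SDN is in the
# private-use space (64512-65534 for 16-bit ASNs; 65535 is reserved).
PRIVATE_ASN_16BIT = range(64512, 65535)            # 64512-65534 inclusive
PRIVATE_ASN_32BIT = range(4200000000, 4294967295)  # 4200000000-4294967294

def is_private_asn(asn: int) -> bool:
    return asn in PRIVATE_ASN_16BIT or asn in PRIVATE_ASN_32BIT

print(is_private_asn(65000))   # True  - example private ASN (assumed value)
print(is_private_asn(64511))   # False - public ASN space
```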

Physical Network Design Decisions

The physical network design decisions determine the physical layout and use of VLANs. They also include decisions on jumbo frames, and on network-related requirements such as DNS and NTP.

Table 3. Design Decisions on the Physical Network Infrastructure for NSX-T Data Center

Decision ID: SDDC-MGMT-VI-SDN-001
Design Decision: Use two ToR switches for each rack.
Design Justification: Supports the use of two 10 GbE (25 GbE or greater recommended) links to each server, provides redundancy, and reduces the overall design complexity.
Design Implication: Requires two ToR switches per rack, which can increase costs.

Decision ID: SDDC-MGMT-VI-SDN-002
Design Decision: Implement the following physical network architecture:
  • One 25 GbE (10 GbE minimum) port on each ToR switch for ESXi host uplinks.
  • A Layer 3 device that supports BGP.
Design Justification:
  • Provides availability during a switch failure.
  • Provides support for BGP as the only dynamic routing protocol that is supported by NSX-T Data Center.
Design Implication:
  • Might limit the hardware choices.
  • Requires dynamic routing protocol configuration in the physical network.

Decision ID: SDDC-MGMT-VI-SDN-003
Design Decision: Do not use EtherChannel (LAG, LACP, or vPC) configuration for ESXi host uplinks.
Design Justification:
  • Simplifies configuration of top of rack switches.
  • Teaming options available with vSphere Distributed Switch and N-VDS provide load balancing and failover.
  • EtherChannel implementations might have vendor-specific limitations.
Design Implication: None.

Decision ID: SDDC-MGMT-VI-SDN-004
Design Decision: Use a physical network that is configured for BGP routing adjacency.
Design Justification:
  • Supports flexibility in network design for routing multi-site and multi-tenancy workloads.
  • BGP is the only dynamic routing protocol that is supported by NSX-T Data Center.
  • Supports failover between ECMP Edge uplinks.
Design Implication: Requires BGP configuration in the physical network.

Access Port Network Settings

Configure additional network settings on the access ports that connect the ToR switches to the corresponding servers.

Table 4. Access Port Network Configuration

Setting: Spanning Tree Protocol (STP)
Value: Although this design does not use the Spanning Tree Protocol, switches usually have STP configured by default. Designate the access ports as trunk PortFast.

Setting: Trunking
Value: Configure the VLANs as members of an 802.1Q trunk. Optionally, the management VLAN can act as the native VLAN.

Setting: MTU
Value:
  • Set the MTU for management VLANs and SVIs to 1,500 bytes.
  • Set the MTU for vSphere vMotion, vSAN, NFS, uplink, host overlay, and edge overlay VLANs and SVIs to 9,000 bytes.

Setting: DHCP Helper
Value: Configure a DHCP helper (sometimes called a DHCP relay) on all TEP VLANs. Set the DHCP helper (relay) to point to a DHCP server by IPv4 address.

Table 5. Design Decisions on Access Ports for NSX-T Data Center

Decision ID: SDDC-MGMT-VI-SDN-005
Design Decision: Assign persistent IP configurations to each management component in the SDDC, with the exception of NSX-T tunnel endpoints (TEPs) that use dynamic IP allocation.
Design Justification: Ensures that endpoints have a persistent management IP address. In VMware Cloud Foundation, you assign storage (vSAN and NFS) and vSphere vMotion IP configurations by using user-defined network pools.
Design Implication: Requires precise IP address management.

Decision ID: SDDC-MGMT-VI-SDN-006
Design Decision: Set the lease duration for the TEP DHCP scope to at least 7 days.
Design Justification:
  • NSX-T TEPs are assigned IP addresses by using a DHCP server.
  • NSX-T TEPs do not have an administrative endpoint. As a result, they can use DHCP for automatic IP address assignment. If you must change or expand the subnet, changing the DHCP scope is simpler than creating an IP pool and assigning it to the ESXi hosts.
  • DHCP simplifies the configuration of a default gateway for TEPs if hosts within the same cluster are on separate Layer 2 domains.
Design Implication: Requires configuration and management of a DHCP server.

Decision ID: SDDC-MGMT-VI-SDN-007
Design Decision: Use VLANs to separate physical network functions.
Design Justification:
  • Supports physical network connectivity without requiring many NICs.
  • Isolates the different network functions of the SDDC so that you can have differentiated services and prioritized traffic as needed.
Design Implication: Requires uniform configuration and presentation on all the trunks that are made available to the ESXi hosts.

Jumbo Frames

IP storage throughput can benefit from the configuration of jumbo frames. Increasing the per-frame payload from 1500 bytes to the jumbo frame setting improves the efficiency of data transfer. You must configure jumbo frames end-to-end. Select an MTU that matches the MTU of the physical switch ports.

  • According to the purpose of the workload, determine whether to configure jumbo frames on a virtual machine. If the workload consistently transfers large amounts of network data, configure jumbo frames, if possible. In that case, confirm that both the virtual machine operating system and the virtual machine NICs support jumbo frames.

  • Using jumbo frames also improves the performance of vSphere vMotion.

  • The Geneve overlay requires an MTU value of 1600 bytes or greater.
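
As a back-of-the-envelope illustration of the Geneve requirement, the following sketch adds up the encapsulation overhead that a 1,500-byte guest payload incurs. The header sizes are standard protocol values; the Geneve option length used here is an assumed value, because the actual option size depends on the NSX-T configuration.

```python
# Rough arithmetic for the Geneve MTU requirement.
INNER_FRAME    = 1500 + 14   # guest payload plus inner Ethernet header
OUTER_IPV4     = 20          # outer IPv4 header
OUTER_UDP      = 8           # outer UDP header (Geneve uses port 6081)
GENEVE_BASE    = 8           # fixed Geneve header
GENEVE_OPTIONS = 32          # variable-length options (assumed value)

required_mtu = INNER_FRAME + OUTER_IPV4 + OUTER_UDP + GENEVE_BASE + GENEVE_OPTIONS
print(required_mtu)  # 1582 with these assumptions; hence 1,600 bytes minimum,
                     # with 1,700 or 9,000 bytes providing headroom
```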

Table 6. Design Decisions on the Jumbo Frames for NSX-T Data Center

Decision ID: SDDC-MGMT-VI-SDN-008
Design Decision: Set the MTU size to at least 1,700 bytes (recommended 9,000 bytes for jumbo frames) on the physical switch ports, vSphere Distributed Switches, vSphere Distributed Switch port groups, and N-VDS switches that support the following traffic types:
  • Overlay (Geneve)
  • vSAN
  • vSphere vMotion
  • NFS
Design Justification:
  • Improves traffic throughput.
  • Supports Geneve by increasing the MTU size to a minimum of 1,600 bytes.
  • Geneve is an extensible protocol. The MTU size might increase with future capabilities. While 1,600 bytes is sufficient, an MTU size of 1,700 bytes provides more room for increasing the Geneve MTU size without the need to change the MTU size of the physical infrastructure.
Design Implication: When adjusting the MTU packet size, you must also configure the entire network path (VMkernel network adapters, virtual switches, physical switches, and routers) to support the same MTU packet size.

Networking for Multiple Availability Zones

Specific requirements for the physical data center network exist for a topology with multiple availability zones. These requirements extend those for an environment with a single availability zone.

Table 7. Physical Network Requirements for Multiple Availability Zones

Component: MTU
Requirement:
  • VLANs that are stretched between availability zones must meet the same requirements as the VLANs for intra-zone connection, including the MTU size.
  • The MTU value must be consistent end-to-end, including components on the inter-zone networking path.
  • Set the MTU for management VLANs and SVIs to 1,500 bytes.
  • Set the MTU for vSphere vMotion, vSAN, NFS, uplink, host overlay, and edge overlay VLANs and SVIs to 9,000 bytes.

Component: Layer 3 gateway availability
Requirement: For VLANs that are stretched between availability zones, configure a data center-provided method, for example, VRRP or HSRP, to fail over the Layer 3 gateway between availability zones.

Component: DHCP availability
Requirement: For VLANs that are stretched between availability zones, provide high availability for the DHCP server so that a failover operation of a single availability zone does not impact DHCP availability.

Component: BGP routing
Requirement: Each availability zone data center must have its own Autonomous System Number (ASN).

Component: Ingress and egress traffic
Requirement:
  • For VLANs that are stretched between availability zones, traffic flows in and out of a single zone. Local egress is not supported.
  • For VLANs that are not stretched between availability zones, traffic flows in and out of the zone where the VLAN is located.
  • For NSX-T virtual network segments that are stretched between regions, traffic flows in and out of a single availability zone. Local egress is not supported.

Component: Latency
Requirement:
  • The maximum network latency between NSX-T Managers is 10 ms.
  • The maximum network latency between the NSX-T Manager cluster and transport nodes is 150 ms.
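
The following sketch is one hedged way to spot-check the latency maximums above from a management host. It times a TCP connection to port 443 as a rough stand-in for round-trip latency; the host names are placeholders and are not part of this design.

```python
# Rough latency spot check (assumed host names; not part of this design).
import socket
import time

def tcp_rtt_ms(host: str, port: int = 443, timeout: float = 2.0) -> float:
    """Time a TCP connection setup and return the elapsed time in milliseconds."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass
    return (time.perf_counter() - start) * 1000

# Example usage (placeholder host names):
# print(tcp_rtt_ms("nsx-manager-az2.example.local"))      # compare against 10 ms
# print(tcp_rtt_ms("esxi-transport-node.example.local"))  # compare against 150 ms
```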

Table 8. Design Decisions on the Physical Network for Multiple Availability Zones for NSX-T Data Center

Decision ID: SDDC-MGMT-VI-SDN-009
Design Decision: Set the MTU size to at least 1,700 bytes (recommended 9,000 bytes for jumbo frames) on the physical inter-availability zone networking components that are part of the networking path between availability zones for the following traffic types:
  • Overlay (Geneve)
  • vSAN
  • vSphere vMotion
  • NFS
Design Justification:
  • Improves traffic throughput.
  • Geneve packets are tagged as do not fragment.
  • For optimal performance, provides a consistent MTU size across the environment.
  • Geneve is an extensible protocol. The MTU size might increase with future capabilities. While 1,600 bytes is sufficient, an MTU size of 1,700 bytes provides more room for increasing the Geneve MTU size without the need to change the MTU size of the physical infrastructure.
Design Implication:
  • When adjusting the MTU packet size, you must also configure the entire network path (VMkernel ports, virtual switches, physical switches, and routers) to support the same MTU packet size.
  • In multi-AZ deployments, the MTU must be configured on the entire network path between availability zones.

Decision ID: SDDC-MGMT-VI-SDN-010
Design Decision: Configure VRRP, HSRP, or another Layer 3 gateway availability method for these networks:
  • Management
  • Edge Overlay
Design Justification: Ensures that the VLANs that are stretched between availability zones are connected to a highly available gateway if a failure of an availability zone occurs. Otherwise, a failure in the Layer 3 gateway causes disruption in the traffic in the SDN setup.
Design Implication: Requires configuration of a high availability technology for the Layer 3 gateways in the data center.