NSX-T Edge clusters are pools of capacity for NSX-T service router and load balancing functions.

North-South Routing

The routing design considers different levels of routing in the environment, such as number and type of NSX-T gateways, dynamic routing protocol, and so on. At each level, you apply a set of principles for designing a scalable routing solution.

Routing can be defined in the following directions:

  • North-south traffic is traffic leaving or entering the NSX-T domain, for example, a virtual machine on an overlay network communicating with an end-user device on the corporate network.

  • East-west traffic is traffic that remains in the NSX-T domain, for example, two virtual machines on the same or different segments communicating with each other.

As traffic flows north-south, Edge nodes can be configured to pass traffic in an active-standby or an active-active model, where active-active can scale up to 8 active nodes. NSX-T service routers (SRs) for north-south routing are configured in an active-active equal-cost multi-path (ECMP) mode that supports route failover of Tier-0 gateways in seconds.
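The way ECMP spreads flows across active Edge nodes can be illustrated with a minimal sketch. It assumes a hash over the flow 5-tuple; the node names, addresses, and the choice of SHA-256 are hypothetical stand-ins, because the hash that the upstream routers or NSX-T actually use is not described here.

```python
import hashlib

MAX_ACTIVE_NODES = 8  # active-active limit stated in this design


def pick_edge_node(flow, active_nodes):
    """Map a flow 5-tuple to one active Edge node by using a stable hash."""
    if not 1 <= len(active_nodes) <= MAX_ACTIVE_NODES:
        raise ValueError("expected between 1 and 8 active Edge nodes")
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(active_nodes)
    return active_nodes[index]


nodes = ["edge-01", "edge-02"]  # hypothetical node names
# Hypothetical flow: source IP, destination IP, protocol, source port, destination port
flow = ("10.0.1.10", "192.168.50.20", 6, 49152, 443)

chosen = pick_edge_node(flow, nodes)

# If the chosen node fails, the same flow is hashed over the remaining nodes.
survivors = [node for node in nodes if node != chosen]
rerouted = pick_edge_node(flow, survivors)
```

Because the hash is stable, packets of one flow keep using the same Edge node while the set of active nodes is unchanged, and only a change in that set moves the flow.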

Table 1. Features of Active-Active and Active-Standby SRs

Design Component

Active-Active

Active-Standby

Comment

Bandwidth per node

0

0

Bandwidth per node is the same because it is independent of the Tier-0 gateway failover model.

Total aggregate bandwidth

↑↑↑↑

0

  • The active-active mode can support up to 8 NSX-T Edge nodes per northbound SR.

  • The active-standby mode is limited to a single node.

Availability

0

With up to 8 active-active NSX-T Edge nodes, availability can be as high as N+7, while for the active-standby mode it is N+1.

Failover Time

0

0

Both are capable of sub-second failover with use of BFD.

Routing Protocol Support

0

The active-active mode requires BGP for ECMP failover.

Figure 1. Dynamic Routing in a Single Availability Zone

The NSX-T Edge two-node cluster manages a Tier-0 gateway and Tier-1 gateway. The gateway functionality is distributed between distributed routers on the ESXi transport nodes and the distributed and service routers on the NSX-T Edge nodes. Routing between the Tier-0 gateway and the upstream Layer 3 devices uses BGP with ECMP.

Table 2. Design Decisions on the High Availability Mode of Tier-0 Gateways

Decision ID

Design Decision

Design Justification

Design Implication

SDDC-MGMT-VI-SDN-037

Deploy an active-active Tier-0 gateway.

Supports ECMP north-south routing on all Edge nodes in the NSX-T Edge cluster.

Active-active Tier-0 gateways cannot provide stateful services such as NAT.

Table 3. Design Decisions on Edge Uplink Configuration for North-South Routing

Decision ID

Design Decision

Design Justification

Design Implication

SDDC-MGMT-VI-SDN-038

To enable ECMP between the Tier-0 gateway and the Layer 3 devices (ToR switches or upstream devices), create two VLANs.

The ToR switches or upstream Layer 3 devices have an SVI on one of the two VLANs and each Edge node in the cluster has an interface on each VLAN.

Supports multiple equal-cost routes on the Tier-0 gateway and provides more resiliency and better bandwidth use in the network.

Additional VLANs are required.

SDDC-MGMT-VI-SDN-039

Assign a named teaming policy to the VLAN segments that connect to the Layer 3 device pair.

Pins the VLAN traffic on each segment to its target Edge node interface. From there the traffic is directed to the host physical NIC that is connected to the target top of rack switch.

None.

SDDC-MGMT-VI-SDN-040

Create a VLAN transport zone for Edge uplink traffic.

Enables the configuration of VLAN segments on the N-VDS in the Edge nodes.

Additional VLAN transport zones are required if the edge nodes are not connected to the same top of rack switch pair.
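The uplink layout from these decisions can be sketched as a small data model: two uplink VLANs, one SVI per top of rack switch, and one interface per Edge node on each VLAN. All names below are hypothetical and serve only to show how the layout yields two equal-cost paths per Edge node.

```python
from itertools import product

# Hypothetical names, for illustration only.
UPLINK_VLANS = ["uplink01", "uplink02"]
TOR_SVI_VLAN = {"tor-a": "uplink01", "tor-b": "uplink02"}  # one SVI per ToR
EDGE_NODES = ["edge-01", "edge-02"]


def bgp_peerings(edge_nodes, tor_svi_vlan, uplink_vlans):
    """Return (edge, vlan, tor) tuples, one per Edge uplink interface."""
    for tor, vlan in tor_svi_vlan.items():
        if vlan not in uplink_vlans:
            raise ValueError(f"{tor} has an SVI on unknown VLAN {vlan}")
    # Each Edge node has an interface on every VLAN and peers with the
    # ToR switch whose SVI is on that VLAN.
    return [
        (edge, vlan, tor)
        for edge, (tor, vlan) in product(edge_nodes, tor_svi_vlan.items())
    ]


peerings = bgp_peerings(EDGE_NODES, TOR_SVI_VLAN, UPLINK_VLANS)
paths_per_edge = {
    edge: sum(1 for peering in peerings if peering[0] == edge)
    for edge in EDGE_NODES
}
```

With two Edge nodes and two top of rack switches, the sketch produces four peerings, two per Edge node, which is the precondition for ECMP on the Tier-0 gateway.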
Table 4. Design Decisions on Dynamic Routing

Decision ID

Design Decision

Design Justification

Design Implication

SDDC-MGMT-VI-SDN-041

Use BGP as the dynamic routing protocol.

  • Enables dynamic routing by using NSX-T. NSX-T supports only BGP.

  • SDDC architectures with multiple availability zones or multiple regions require BGP.

In environments where BGP cannot be used, you must configure and manage static routes.

SDDC-MGMT-VI-SDN-042

Configure the BGP Keep Alive Timer to 4 and Hold Down Timer to 12 between the top of rack switches and the Tier-0 gateway.

Provides a balance between failure detection between the top of rack switches and the Tier-0 gateway, and overburdening the top of rack switches with keep-alive traffic.

By using longer timers to detect if a router is not responding, the data about such a router remains in the routing table longer. As a result, the active router continues to send traffic to a router that is down.

SDDC-MGMT-VI-SDN-043

Do not enable Graceful Restart between BGP neighbors.

Avoids loss of traffic.

On the Tier-0 gateway, BGP peers from all the gateways are always active. On a failover, the Graceful Restart capability increases the time a remote neighbor takes to select an alternate Tier-0 gateway. As a result, BFD-based convergence is delayed.

None.

SDDC-MGMT-VI-SDN-044

Enable helper mode for Graceful Restart mode between BGP neighbors.

Avoids loss of traffic.

During a router restart, helper mode works with the graceful restart capability of upstream routers to maintain the forwarding table. The forwarding table in turn continues to forward packets to a down neighbor even after the BGP timers have expired, which causes loss of traffic.

None.

SDDC-MGMT-VI-SDN-045

Enable Inter-SR iBGP routing.

If all northbound eBGP sessions of an Edge node are down, north-south traffic continues to flow because it is routed to a different Edge node.

None.
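The trade-off behind the timer values in SDDC-MGMT-VI-SDN-042 can be worked through with a short sketch. It assumes that timers are expressed in seconds and that a session is declared down when no message arrives within the hold time. The 60/180 pair is an assumed comparison value and is not taken from this design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BgpTimers:
    keepalive_s: int
    hold_s: int

    def __post_init__(self):
        # Limits chosen for this sketch only.
        if self.keepalive_s < 1 or self.hold_s < 3:
            raise ValueError("sketch expects keepalive >= 1 s and hold >= 3 s")

    @property
    def keepalives_per_hold(self):
        """Keepalive intervals that fit into one hold interval."""
        return self.hold_s // self.keepalive_s

    @property
    def keepalives_per_minute(self):
        """Keepalive messages sent to one neighbor per minute."""
        return 60 / self.keepalive_s


design = BgpTimers(keepalive_s=4, hold_s=12)    # values from this design
longer = BgpTimers(keepalive_s=60, hold_s=180)  # assumed comparison values

# Extra time a dead neighbor can stay in the routing table with the longer timers.
extra_detection_delay_s = longer.hold_s - design.hold_s
```

With the design values, a neighbor is declared down after three missed keepalive intervals, at the cost of 15 keepalive messages per minute per neighbor instead of one.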

Intra-SDN Routing

Gateways are needed to provide routing between logical segments created in the NSX-T based SDN. Logical segments can be connected directly to a Tier-0 or Tier-1 gateway.

Table 5. Design Decisions on Tier-1 Gateway Configuration

Decision ID

Design Decision

Design Justification

Design Implication

SDDC-MGMT-VI-SDN-046

Deploy a Tier-1 gateway and connect it to the Tier-0 gateway.

Creates a two-tier routing architecture.

A Tier-1 gateway can only be connected to a single Tier-0 gateway.

In cases where multiple Tier-0 gateways are required, you must create multiple Tier-1 gateways.

SDDC-MGMT-VI-SDN-047

Deploy a Tier-1 gateway to the NSX-T Edge cluster.

Enables stateful services, such as load balancers and NAT, for SDDC management components.

Because a Tier-1 gateway always works in active-standby mode, the gateway supports stateful services.

None.

SDDC-MGMT-VI-SDN-048

Deploy a Tier-1 gateway in non-preemptive failover mode.

Ensures that when a failed NSX-T Edge transport node comes back online, it does not take over the gateway services, which would cause a short service outage.

None.
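The effect of non-preemptive failover in SDDC-MGMT-VI-SDN-048 can be illustrated with a minimal simulation of an active-standby pair. The node names and the event model are hypothetical. The sketch only counts how often the active role moves, because each move corresponds to a short service interruption.

```python
def simulate(events, preemptive, preferred="edge-01", peer="edge-02"):
    """Return the active node after each event for an active-standby pair."""
    up = {preferred: True, peer: True}
    active = preferred
    history = []
    for kind, node in events:
        up[node] = kind == "recover"
        if not up[active]:
            # The active node is down: fail over if the other node is up.
            other = peer if active == preferred else preferred
            if up[other]:
                active = other
        elif preemptive and up[preferred] and active != preferred:
            # Preemptive mode: the preferred node takes the role back.
            active = preferred
        history.append(active)
    return history


def count_switchovers(history, initial="edge-01"):
    """Count how many times the active role moved between nodes."""
    states = [initial] + history
    return sum(1 for before, after in zip(states, states[1:]) if before != after)


events = [("fail", "edge-01"), ("recover", "edge-01")]
non_preemptive = simulate(events, preemptive=False)
preemptive = simulate(events, preemptive=True)
```

For a single failure followed by recovery, the non-preemptive run moves the active role once, while the preemptive run moves it twice.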

Dynamic Routing in Multiple Availability Zones

In an environment with multiple availability zones, plan for failover of the NSX-T Edge nodes and configure BGP so that traffic from the top of rack switches is directed to Availability Zone 1 unless a failure in Availability Zone 1 occurs.

Figure 2. Dynamic Routing in Multiple Availability Zones

When using two availability zones, the NSX-T Edge two-node cluster manages a Tier-0 gateway and Tier-1 gateway. The gateway functionality is distributed between distributed routers on the ESXi transport nodes and the distributed and service routers on the NSX-T Edge nodes. Routing between the Tier-0 gateway and the upstream Layer 3 devices uses BGP with ECMP.

Table 6. Design Decisions on North-South Routing for Multiple Availability Zones

Decision ID

Design Decision

Design Justification

Design Implication

SDDC-MGMT-VI-SDN-049

When you have two availability zones, extend the uplink VLANs to the top of rack switches so that the VLANs are stretched between both availability zones.

Because the NSX-T Edge nodes fail over between the availability zones, this configuration ensures uplink connectivity to the top of rack switches in both availability zones, regardless of the zone the NSX-T Edge nodes are presently in.

You must configure a stretched Layer 2 network between the availability zones by using physical network infrastructure.

SDDC-MGMT-VI-SDN-050

When you have two availability zones, provide this SVI configuration on the top of rack switches or upstream Layer 3 devices.

  • In Availability Zone 2, configure the top of rack switches or upstream Layer 3 devices with an SVI on each of the two uplink VLANs.

  • Make the top of rack switch SVI in both availability zones part of a common stretched Layer 2 network between the availability zones.

Enables the communication of the NSX-T Edge nodes with the top of rack switches in both availability zones over the same uplink VLANs.

You must configure a stretched Layer 2 network between the availability zones by using the physical network infrastructure.

SDDC-MGMT-VI-SDN-051

When you have two availability zones, provide this VLAN configuration.

  • Use two VLANs to enable ECMP between the Tier-0 gateway and the Layer 3 devices (top of rack switches or upstream devices).

  • The ToR switches or upstream Layer 3 devices have an SVI on one of the two VLANs and each NSX-T Edge node has an interface on each VLAN.

Supports multiple equal-cost routes on the Tier-0 gateway, and provides more resiliency and better bandwidth use in the network.

Extra VLANs are required.

Requires stretching the uplink VLANs between the availability zones.

SDDC-MGMT-VI-SDN-052

Create an IP prefix list that permits access to route advertisement by any network instead of using the default IP prefix list.

Used in a route map to prepend one or more autonomous system numbers to the AS path (AS-path prepend) for BGP neighbors in Availability Zone 2.

You must manually create an IP prefix list that is identical to the default one.

SDDC-MGMT-VI-SDN-053

Create a route map-out that contains the custom IP prefix list and an AS-path prepend value set to the Tier-0 local AS added twice.

  • Used for configuring neighbor relationships with the Layer 3 devices in Availability Zone 2.

  • Ensures that all ingress traffic passes through Availability Zone 1.

You must manually create the route map.

The two NSX-T Edge nodes route north-south traffic through Availability Zone 2 only if the connection to their BGP neighbors in Availability Zone 1 is lost, for example, if the top of rack switch pair or the entire availability zone fails.

SDDC-MGMT-VI-SDN-054

Create an IP prefix list that permits access to route advertisement by network 0.0.0.0/0 instead of using the default IP prefix list.

Used in a route map to configure local-preference on the learned default route for BGP neighbors in Availability Zone 2.

You must manually create an IP prefix list that is identical to the default one.

SDDC-MGMT-VI-SDN-055

Apply a route map-in that contains the IP prefix list for the default route 0.0.0.0/0, and assign a lower local-preference, for example, 80, to the learned default route and a lower local-preference, for example, 90, to any other learned routes.

  • Used for configuring neighbor relationships with the Layer 3 devices in Availability Zone 2.

  • Ensures that all egress traffic passes through Availability Zone 1.

You must manually create the route map.

The two NSX-T Edge nodes route north-south traffic through Availability Zone 2 only if the connection to their BGP neighbors in Availability Zone 1 is lost, for example, if the top of rack switch pair or the entire availability zone fails.

SDDC-MGMT-VI-SDN-056

Configure Availability Zone 2 neighbors to use the route maps as In and Out filters respectively.

Makes the path in and out of Availability Zone 2 less preferred because the advertised AS path is longer and the local preference of the learned routes is lower. As a result, all traffic passes through Availability Zone 1.

The two NSX-T Edge nodes route north-south traffic through Availability Zone 2 only if the connection to their BGP neighbors in Availability Zone 1 is lost, for example, if the top of rack switch pair or the entire availability zone fails.

Load Balancers

The logical load balancer in NSX-T Data Center offers high-availability service for applications and distributes the network traffic load among multiple servers.

Because it is a stateful service, the load balancer is instantiated in a Tier-1 gateway.

Table 7. Design Decisions on Load Balancer Configuration

Decision ID

Design Decision

Design Justification

Design Implication

SDDC-MGMT-VI-SDN-057

Deploy a standalone Tier-1 gateway to support advanced stateful services such as load balancing for other management components.

Provides independence between north-south Tier-1 gateways to support advanced deployment scenarios.

You must add a separate Tier-1 gateway.

SDDC-MGMT-VI-SDN-058

Connect the standalone Tier-1 gateway to the cross-region virtual network.

You connect the Tier-1 gateway manually to the networks it provides services on.

For information on the virtual network segment configuration, see Virtual Network Segment Design for NSX-T for the Management Domain.

You must connect the gateway to each network that requires load balancing.

SDDC-MGMT-VI-SDN-059

Configure the standalone Tier-1 gateway with static routes to the gateways of the networks it is connected to.

Because the Tier-1 gateway is standalone, it does not autoconfigure its routes and you must configure it manually with static routes.

You must configure the gateway for each network that requires load balancing.
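A static route for the standalone Tier-1 gateway can be sanity-checked before it is entered. The sketch below assumes that the next hop must be an address inside the segment the gateway is connected to. The addresses and the dictionary layout are hypothetical and do not represent an NSX-T API payload.

```python
import ipaddress


def static_route(destination, next_hop, connected_segment):
    """Build a static route after checking that the next hop is reachable."""
    segment = ipaddress.ip_network(connected_segment)
    hop = ipaddress.ip_address(next_hop)
    if hop not in segment:
        raise ValueError(f"next hop {hop} is not on connected segment {segment}")
    return {
        "network": str(ipaddress.ip_network(destination)),
        "next_hop": str(hop),
    }


# Hypothetical segment and gateway address.
route = static_route("0.0.0.0/0", "192.168.11.1", "192.168.11.0/24")
```

The same check can be repeated for each network that requires load balancing, because each connection needs its own manually configured route.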